Creating Datasets for YOLOv4 in Darknet

Guide to Labeling Images and Training YOLOv4 / YOLOv4-Tiny in Darknet

Step 1: Install Required Software

Before starting, you will need:

- Windows or Linux system
- Darknet (Download from )
- Python (Recommended: Python 3.8 or newer)
- OpenCV (Optional, but improves image processing)
- Labeling Tool (We recommend LabelImg: )

Step 2: Collect and Label Images

Gather Images
- Collect roughly 500-1000 images or more per object class.
- The more diverse the images, the better the model will perform.
- Ensure images are clear, high-quality, and cover different angles, lighting conditions, and environments.
- Save images in .jpg or .png format.

Install and Use LabelImg for Labeling
- Install LabelImg:
  pip install labelImg
- Run LabelImg:
  labelImg
- In LabelImg, click Open Dir and select the folder containing your images.
- Switch the save format to YOLO (the format button in the left toolbar toggles between PascalVOC and YOLO) so that labels are saved in the format Darknet expects.
- Use the bounding box tool to draw boxes around objects.
- Each bounding box should tightly fit the object with minimal extra space.
- Assign each object the correct class label.
- Save annotations (.txt files) in the same directory as images.

Understanding YOLO Labeling Format
- Each .txt annotation file corresponds to an image and contains:
  class_id x_center y_center width height
- class_id: The ID number of the object class (starts from 0).
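- x_center, y_center: The center of the bounding box, divided by the image width and height respectively (values between 0 and 1).
- width, height: The width and height of the bounding box, divided by the image width and height respectively (values between 0 and 1).
- Example annotation for a car (class 0) centered in the image and covering 40% of its width and height: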
  0 0.5 0.5 0.4 0.4
- If an image contains multiple objects, each object has its own line.
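
The normalization described above can be sketched in a few lines of Python. This is only an illustration, not part of Darknet or LabelImg: the function name to_yolo_line and the assumption that the box is given as pixel corner coordinates are hypothetical.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box (corner coordinates) into one YOLO label line."""
    # Center of the box, normalized by the image size.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    # Box size, normalized by the image size.
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 256x192-pixel box centered in a 640x480 image, class 0.
print(to_yolo_line(0, 192, 144, 448, 336, 640, 480))
# prints: 0 0.500000 0.500000 0.400000 0.400000
```

The printed line matches the car example above: the box is centered and spans 40% of the image in each dimension.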

Organize Data Properly

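Create a structured dataset folder. Note that Step 2 saves each .txt label next to its image, while this layout uses a separate labels/ folder; if Darknet reports missing label files during training, place each .txt file in the same folder as its image.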

dataset/
├── images/
│   ├── train/
│   │   ├── img1.jpg
│   │   ├── img2.jpg
│   └── val/
│       ├── img3.jpg
│       ├── img4.jpg
├── labels/
│   ├── train/
│   │   ├── img1.txt
│   │   ├── img2.txt
│   └── val/
│       ├── img3.txt
│       ├── img4.txt
├── obj.names
├── train.txt
├── val.txt
├── obj.data
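
Before training, it can help to confirm that every image has a matching label file. The following is a minimal sketch that assumes the images/<split>/ and labels/<split>/ layout shown in the tree; the function name find_unlabeled and the list of accepted extensions are illustrative choices.

```python
from pathlib import Path

# Image extensions treated as dataset images (an assumption for this sketch).
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_unlabeled(dataset_dir, split):
    """Return names of images in images/<split>/ that lack labels/<split>/<stem>.txt."""
    root = Path(dataset_dir)
    images = sorted(
        p for p in (root / "images" / split).iterdir()
        if p.suffix.lower() in IMAGE_EXTS
    )
    return [
        p.name for p in images
        if not (root / "labels" / split / (p.stem + ".txt")).is_file()
    ]
```

Calling find_unlabeled("dataset", "train") returns a list of image file names with no annotation; an empty list means every training image is paired with a label file.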

Create Required Files
- obj.names: List of object classes, one per line.
- train.txt and val.txt: Contain the full paths to the training and validation images, one path per line.
- obj.data: Configuration file with dataset details:
  classes = <number_of_classes>
  train = dataset/train.txt
  valid = dataset/val.txt
  names = dataset/obj.names
  backup = backup/
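
These files can be written by hand or generated. Below is a minimal sketch of generating them, assuming the images/train and images/val folders from the layout above; the function name write_dataset_files and its parameters are hypothetical and not part of Darknet. It writes absolute image paths so the lists do not depend on the directory Darknet is started from.

```python
from pathlib import Path

# Image extensions treated as dataset images (an assumption for this sketch).
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def write_dataset_files(dataset_dir, class_names, backup_dir="backup/"):
    """Write train.txt, val.txt, obj.names and obj.data inside dataset_dir."""
    root = Path(dataset_dir)
    for split in ("train", "val"):
        images = sorted(
            p for p in (root / "images" / split).iterdir()
            if p.suffix.lower() in IMAGE_EXTS
        )
        # One absolute image path per line.
        lines = [p.resolve().as_posix() for p in images]
        (root / f"{split}.txt").write_text("".join(line + "\n" for line in lines))

    # Class names, one per line; the line index is the class_id used in labels.
    (root / "obj.names").write_text("".join(name + "\n" for name in class_names))

    obj_data = [
        f"classes = {len(class_names)}",
        f"train = {(root / 'train.txt').as_posix()}",
        f"valid = {(root / 'val.txt').as_posix()}",
        f"names = {(root / 'obj.names').as_posix()}",
        f"backup = {backup_dir}",
    ]
    (root / "obj.data").write_text("\n".join(obj_data) + "\n")
```

For example, write_dataset_files("dataset", ["car", "person"]) would produce an obj.data with classes = 2; the class names here are placeholders for your own.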

Step 3: Train YOLOv4 / YOLOv4-Tiny

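Download Pretrained Weights
- Training starts from pretrained convolutional weights: yolov4.conv.137 for YOLOv4 (used in the training command below) or yolov4-tiny.conv.29 for YOLOv4-Tiny.

Modify Configuration File (.cfg)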
- Use yolov4.cfg or yolov4-tiny.cfg.
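- Set width and height to 416, or to a smaller multiple of 32 such as 256 or 128 for faster but less accurate detection.
- Set max_batches to number_of_classes * 2000, but at least 4000.
- Set steps to 80% and 90% of max_batches (for example, steps=3200,3600 when max_batches=4000).
- Set classes to your number of classes in each [yolo] layer, and set filters in the [convolutional] layer immediately before each [yolo] layer to (number_of_classes + 5) * 3.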
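
The arithmetic in these rules is easy to get wrong by hand, so here is a small sketch that computes the values for a given number of classes. The function name cfg_values and the min_batches parameter are illustrative; the default of 4000 is simply the floor stated in this guide.

```python
def cfg_values(num_classes, min_batches=4000):
    """Compute max_batches, steps and filters for a Darknet .cfg file."""
    max_batches = max(num_classes * 2000, min_batches)
    # Learning-rate steps at 80% and 90% of max_batches (integer arithmetic).
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    # Filters for the [convolutional] layer before each [yolo] layer.
    filters = (num_classes + 5) * 3
    return {
        "max_batches": max_batches,
        "steps": f"{steps[0]},{steps[1]}",
        "filters": filters,
    }


print(cfg_values(1))  # {'max_batches': 4000, 'steps': '3200,3600', 'filters': 18}
print(cfg_values(3))  # {'max_batches': 6000, 'steps': '4800,5400', 'filters': 24}
```

The returned values still have to be entered into the .cfg file manually; this sketch does not edit the file.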

Start Training

./darknet detector train dataset/obj.data yolov4.cfg yolov4.conv.137 -map

(For the Tiny model, use yolov4-tiny.cfg and yolov4-tiny.conv.29 in place of yolov4.cfg and yolov4.conv.137.)

Monitor Training
- Training loss should gradually decrease.
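- Model weights are saved in backup/ (the backup path set in obj.data); with the -map flag, the checkpoint with the highest validation mAP is kept as yolov4_best.weights.
- Check chart.png to see the loss (and mAP, when -map is used) plotted over training iterations.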

Step 4: Test Your Model

Run Detection on Test Images

./darknet detector test dataset/obj.data yolov4.cfg backup/yolov4_best.weights test.jpg

Interpret Results
- Each detection is drawn as a bounding box with a class label and a confidence score; the boxes should enclose the correct objects.
- If false or missed detections occur, refine the dataset with more diverse images and retrain.

Step 5: Convert Model for Deployment

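- If you need faster inference, convert the trained model to ONNX or TensorRT.
- Trained YOLO models can be run with OpenCV (its DNN module loads the .cfg and .weights files directly) or TensorFlow, and deployed on Jetson devices.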

Best Practices

- Use roughly 80% of the images for training and 20% for validation.
- Label images as accurately as possible.
- Train at multiple input sizes (416x416, 256x256, 128x128) to compare speed-accuracy trade-offs.
- Augment the dataset with flipped, rotated, and brightened images for robustness.
- Increase dataset size if overfitting occurs.
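
When images are augmented offline, their labels have to be transformed as well. As a minimal sketch for the horizontal-flip case only: mirroring an image left-to-right changes x_center to 1 - x_center and leaves the other values untouched. The function name flip_labels_horizontally is hypothetical, and flipping the image itself is left to whichever image library you use.

```python
def flip_labels_horizontally(label_text):
    """Return YOLO label lines adjusted for an image mirrored left-to-right."""
    flipped = []
    for line in label_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        class_id, x_c, y_c, w, h = line.split()
        # Only the horizontal center moves; size and vertical center stay the same.
        new_x = 1.0 - float(x_c)
        flipped.append(
            f"{class_id} {new_x:.6f} {float(y_c):.6f} {float(w):.6f} {float(h):.6f}"
        )
    return "\n".join(flipped)


print(flip_labels_horizontally("0 0.25 0.5 0.2 0.4"))
# prints: 0 0.750000 0.500000 0.200000 0.400000
```

Rotation needs a different transformation of the box coordinates, while brightness changes leave the labels unchanged.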

Summary

- Collect and label images properly.
- Organize dataset and create annotation files.
- Train the model with YOLOv4 or YOLOv4-Tiny.
- Test and optimize the model for accuracy.
- Convert model for deployment in real-world applications.
