Programming Comments - Training a neural network

This is post #4 of a 5-part series on Darknet. If you've not yet read the summary, I suggest you start there.

Summary

We now have thousands of image files we can use for training, and Darknet has been installed and built, possibly with GPU support.

This post will show you how to configure Darknet to train a neural network.

Configuring Darknet

This is the most tedious part of the process. There are several configuration files, some require paths, and it is easy to make a mistake, especially when trying out different settings.

Cfg

Run the following commands:

cd ~/darknet/cfg
cp yolov3-tiny.cfg stone_barcodes-yolov3-tiny.cfg

Edit stone_barcodes-yolov3-tiny.cfg and make the following changes:

batch=64
subdivisions=8
max_batches = 2000
- This determines the number of iterations to perform. A starting rule of thumb is 2000 iterations per class, but once you train the network a few times, looking at the average loss will help determine if this can be lowered or needs to be increased.
steps=1600,1800
- This should be 80% and 90% of max_batches.
classes=1
- This line appears several times in the file. Make sure to change all instances of it.
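
The two rules of thumb above (2000 iterations per class, steps at 80% and 90% of max_batches) are simple enough to compute in the shell. The following is only a sketch in bash; suggest_schedule is a hypothetical helper name and is not part of Darknet:

```bash
#!/usr/bin/env bash
# suggest_schedule: hypothetical helper, not part of Darknet.
# Prints starting values for max_batches and steps for a given number of
# classes, using the rules of thumb from this post.
suggest_schedule() {
    local classes=$1
    local max_batches=$(( classes * 2000 ))        # 2000 iterations per class
    local step1=$(( max_batches * 80 / 100 ))      # 80% of max_batches
    local step2=$(( max_batches * 90 / 100 ))      # 90% of max_batches
    echo "max_batches = ${max_batches}"
    echo "steps=${step1},${step2}"
}

suggest_schedule 1
```

For a single class this prints the same max_batches = 2000 and steps=1600,1800 used above. Treat the result as a starting point and adjust it once you have seen the average loss from a few training runs.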

The next edit in the .cfg file is to the [convolutional] section that appears before each [yolo] section.

The lines that say filters=255 need to be changed to (number of classes + 5) x 3. Since we have a single class ("barcode"), we want (1 + 5) x 3 = 18. There are 2 places in yolov3-tiny where this shows up, and once modified they should look similar to this:

...
pad=1
filters=18
activation=linear
[yolo]
...
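
If you would rather not make these edits by hand, the filters and classes changes can be scripted. This is a rough bash sketch, not an official Darknet tool. Both function names are hypothetical, it assumes GNU sed (for the -i option), and it assumes the copied file still has the stock values filters=255 and classes=80:

```bash
# yolo_filters: hypothetical helper computing (number of classes + 5) x 3.
yolo_filters() {
    local classes=$1
    echo $(( (classes + 5) * 3 ))
}

# patch_cfg: hypothetical helper that edits a .cfg file in place.
# Assumption: the file still contains the stock lines filters=255 and
# classes=80. Other filters= lines (256, 512, ...) are left untouched.
patch_cfg() {
    local cfg=$1 classes=$2
    local filters
    filters=$(yolo_filters "$classes")
    sed -i -e "s/^filters=255[[:space:]]*\$/filters=${filters}/" \
           -e "s/^classes=80[[:space:]]*\$/classes=${classes}/" "$cfg"
}
```

With those functions loaded, patch_cfg ~/darknet/cfg/stone_barcodes-yolov3-tiny.cfg 1 would apply both changes. It is still worth opening the file afterwards to confirm that filters=18 appears directly above each [yolo] section.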

Save the .cfg file and exit. Note that this barely touches the surface of what can be done. For more ideas and customizations, read through AlexeyAB's "readme".

Data

We need to create a .data file. Run the following commands:

cd ~/darknet/cfg
cp voc.data stone_barcodes.data

Edit the file so it is similar to this:

classes = 1
train = /home/stephane/darknet/data/stone_barcodes_train.txt
valid = /home/stephane/darknet/data/stone_barcodes_test.txt
names = /home/stephane/darknet/data/stone_barcodes.names
backup = /home/stephane/darknet/backup/

Obviously fix up the paths. You can use relative paths if you prefer. If so, the paths must be relative to the directory from which you'll be running Darknet.

Some of these files don't exist yet; we're going to be creating them shortly.
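
Since every path in the .data file follows the same pattern, one way to avoid typos is to generate the file rather than edit it. The bash sketch below assumes the directory layout used in this post; write_data_file is a hypothetical helper name:

```bash
# write_data_file: hypothetical helper that writes a Darknet .data file.
#   $1 = Darknet directory (e.g. "$HOME/darknet")
#   $2 = dataset name (e.g. stone_barcodes)
#   $3 = number of classes
#   $4 = output file
# Assumes the data/ and backup/ layout used in this post.
write_data_file() {
    local root=$1 name=$2 classes=$3 out=$4
    cat > "$out" <<EOF
classes = ${classes}
train = ${root}/data/${name}_train.txt
valid = ${root}/data/${name}_test.txt
names = ${root}/data/${name}.names
backup = ${root}/backup/
EOF
}
```

For example, write_data_file "$HOME/darknet" stone_barcodes 1 "$HOME/darknet/cfg/stone_barcodes.data" should produce a file equivalent to the one shown above, with absolute paths for your own home directory.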

Names

The .names file contains one class name per line, naming every class we'll be recognizing during training. These names correspond to the zero-based class indexes we inserted into the labels during synthetic image generation. In our case, we have a single class, and we're going to call it "barcode". Run the following commands:

cd ~/darknet/data
echo "barcode" > stone_barcodes.names
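
The number of names has to agree with the classes value in the .data file, which is easy to get wrong once you add more classes. A quick sanity check could look like this bash sketch; check_class_count is a hypothetical helper, and it assumes the classes = N format shown in the Data section:

```bash
# check_class_count: hypothetical helper; succeeds when the number of
# non-empty lines in the .names file equals the "classes" value in the
# .data file, and returns 1 otherwise.
check_class_count() {
    local data=$1 names=$2
    local declared actual
    # Pull the number out of the first "classes = N" line.
    declared=$(sed -n 's/^classes[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$data" | head -n 1)
    # Count lines that contain at least one non-whitespace character.
    actual=$(grep -c '[^[:space:]]' "$names")
    if [ "$declared" = "$actual" ]; then
        echo "OK: ${actual} class name(s)"
    else
        echo "Mismatch: classes = ${declared:-?} but ${actual} name(s) found" >&2
        return 1
    fi
}
```

Running check_class_count ~/darknet/cfg/stone_barcodes.data ~/darknet/data/stone_barcodes.names should report a single class name for the setup in this post.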

Images & labels

We need to copy the images and labels, and then create a text file that lists all the images Darknet should be reading to train the network. Run the following commands:

cd ~/darknet/data
mkdir stone_barcodes
cp ~/barcode_images/barcode_* ~/darknet/data/stone_barcodes/
find (pwd)/stone_barcodes -type f -name "*.jpg" | sort -R > stone_barcodes_train.txt
find (pwd)/stone_barcodes -type f -name "*.jpg" | sort -R > stone_barcodes_test.txt

If using bash instead of fish, use this instead: find $(pwd)/stone_barcodes -type f -name "*.jpg" | sort -R > stone_barcodes_...etc.
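
Before training, it can be worth confirming that every image in the list has a label file. The bash sketch below assumes each label sits next to its image with the same base name and a .txt extension, which matches how the images and labels were copied above; missing_labels is a hypothetical helper name:

```bash
# missing_labels: hypothetical helper; prints every image listed in the
# given text file that has no .txt label file next to it.
# Assumption: labels share the image's directory and base name.
missing_labels() {
    local list=$1 img
    while IFS= read -r img; do
        [ -n "$img" ] || continue                 # skip blank lines
        [ -f "${img%.*}.txt" ] || echo "$img"     # barcode_1.jpg -> barcode_1.txt
    done < "$list"
}
```

If missing_labels stone_barcodes_train.txt prints nothing, every listed image has a matching label. Any path it does print points at an image whose label was not copied.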

Start training

Everything should now be in place to start training your first network. Run the following command:

cd ~/darknet
mkdir -p ~/darknet/backup
./darknet detector train cfg/stone_barcodes.data cfg/stone_barcodes-yolov3-tiny.cfg

This will start the training, and display a plot of the average loss during training. The loss will be in the hundreds at the beginning, and gradually come down. By the time training is done, you should see numbers like 0.5, 0.3, or even < 0.1. The lower the number, the better the results.

[Figure: chart showing average loss during training]

The weight files created by Darknet during training are saved in the ~/darknet/backup directory every 1000 iterations, and again once the maximum number of iterations has been reached. These weights, combined with the .cfg file, are the neural network. You'll need both of those files to detect and locate objects within an image.
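
To see which weights file was written most recently, something like the following bash sketch may help. It makes no assumption about how Darknet names the files beyond the .weights extension, and newest_weights is a hypothetical helper name:

```bash
# newest_weights: hypothetical helper; prints the most recently modified
# .weights file in the given directory, or returns 1 if there is none.
newest_weights() {
    local dir=$1 newest
    # ls -t sorts by modification time, newest first.
    newest=$(ls -t "$dir"/*.weights 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    echo "$newest"
}
```

For example, newest_weights ~/darknet/backup prints nothing and fails until the first weights file has been saved, then prints the path of the latest one.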

At this point, all you can do is sit back and let the training happen. Depending on the size of the network you are training, and whether Darknet has a GPU it can use, the training can take hours or days.

Go to post #5 of the series to see how to access the neural networks from a C++ application.