Auto-Labeling Tool for Object Detection

Stop wasting all of your time labeling datasets

Alvaro Leandro Cavalcante Carneiro
Towards Data Science


Anyone who has worked with object detection knows that the labeling/annotation process is the hardest part. It's not hard because it's complex, like training a model, but because it's so tedious and time-consuming.

In my previous work with this technology, I had to deal with datasets of thousands of images, or a few hundred images with dozens of objects each. In both situations, my only options were to spend days creating labels or to throw a lot of human resources at the task.

Faced with this annoying bottleneck, I've created a simple (yet effective) auto annotation tool to make this process easier. Although it doesn't completely replace manual annotation, it will save you a lot of time. This article explains how the tool works and how you can use it to simplify your next object detection project!

How it works

The auto annotation tool is based on the idea of a semi-supervised architecture: a model trained with a small amount of labeled data is used to produce the labels for the rest of the dataset. In short, the library takes an initial, simplified object detection model and uses it to generate XML files with the image annotations (in the PASCAL VOC format). The following image illustrates the process:

Auto annotation process. (Image by author)
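
Each generated annotation is a plain XML file in the PASCAL VOC layout, with one object entry per detection. For reference, a minimal example (the file name and coordinates here are hypothetical) looks like this:

<annotation>
    <folder>dataset_images</folder>
    <filename>dog_001.jpg</filename>
    <size>
        <width>640</width>
        <height>480</height>
        <depth>3</depth>
    </size>
    <object>
        <name>dog</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>120</xmin>
            <ymin>85</ymin>
            <xmax>310</xmax>
            <ymax>420</ymax>
        </bndbox>
    </object>
</annotation>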

As a semi-supervised solution, it unfortunately can't avoid manual annotation entirely, but you'll only need to label a small portion of your data.

It's hard to determine how many images to label manually, as it depends on the complexity of your problem. If you want to detect dogs and cats and have 2,000 images in the dataset, for example, 200 images (100 per class) are probably enough. On the other hand, if you have dozens of classes, or objects that are hard to detect, you'll need more manual annotations to see the benefits of the semi-supervised approach.

Still, there are some real advantages to spending at least some time annotating images manually. First of all, you'll take a closer look at the data, which can help you discover problems (e.g. objects that are too close to each other, or lighting conditions different from what you expected) and determine the model's constraints.

Besides that, a reduced version of the dataset is often used for hyperparameter tuning and neural architecture search, so this is also a great moment to look for the model configuration that best fits your problem.

That said, once you have labeled some images and trained an initial model, you’ll be able to use the auto annotation tool to speed up this process for the rest of your dataset!

Using the auto annotation tool

This project is completely open-source and available on GitHub. The code is written in Python and currently supports only TensorFlow models (PyTorch support is coming soon).

You can install the library using pip, as shown below:

$ pip install auto-annotate

It's recommended to use a Python virtual environment to avoid compatibility issues with your TensorFlow version. After the installation, you can use the library either from the command line or directly in your Python code. Both take the same set of parameters:

  • saved_model_path: The path of the saved_model folder with the initial model.
  • label_map_path: The path of the label_map.pbtxt file.
  • imgs_path: The path of the folder with the dataset images to label.
  • xml_path (optional): Path to save the resulting XML files. The default behavior is to save them in the same folder as the dataset images.
  • threshold (optional): Confidence threshold to accept the detections made by the model. The default is 0.5.

Command line

The easiest way to use the library is to call it from the command line. To do so, execute the following command in your terminal, with your own parameters:

python -m auto_annotate --label_map_path /example/label_map.pbtxt \
--saved_model_path /example/saved_model \
--imgs_path /example/dataset_images \
--xml_path /example/dataset_labels \
--threshold 0.65

Python code

If, for some reason, you'd prefer to use the library directly in your Python code, that works too.
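
Below is a minimal sketch based on the parameters listed above. The class and method names are my assumptions, so check the project's README on GitHub for the exact, up-to-date API:

from auto_annotate import AutoAnnotate  # assumed entry point; see the GitHub README

# The keyword arguments mirror the command-line flags described above.
ann_tool = AutoAnnotate(
    saved_model_path='/example/saved_model',
    label_map_path='/example/label_map.pbtxt',
    imgs_path='/example/dataset_images',
    xml_path='/example/dataset_labels',  # optional
    threshold=0.65)

ann_tool.generate_annotations()  # assumed method name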

If everything works correctly, you'll see a progress bar as the dataset is annotated. During the execution, you can use an annotation tool like LabelImg to open your images and the auto-generated labels and verify that they look as expected.

Post-action: Review the labels

As we already know, every machine learning model makes mistakes, and this labeling process is no different. If you retrain your initial model with the brand-new labels generated by this tool, you'll effectively be doing weakly supervised learning, as the labels/annotations will contain some noise.

Weak supervision has its own problems, and you'll want to avoid them if possible. That said, it's recommended to review the labels after the auto annotation process to find and fix the wrong predictions. Again, this is a manual process, but reviewing and improving the quality of the labels is considerably faster than drawing the bounding boxes from scratch.

Furthermore, the quality of the generated predictions will depend on the accuracy of the initial model and on the confidence threshold. If the confidence threshold is high (close to 1), the model will generate fewer incorrect predictions (false positives), but you'll have to draw the boxes for the objects it misses (false negatives).

On the other hand, a threshold near 0 will generate more incorrect predictions, but you'll mostly just need to erase or adjust skewed bounding boxes. The best confidence value is a parameter to tune based on your problem and your model's performance.
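
To prioritize the review, a few lines of Python can summarize what the tool produced. The snippet below is a hypothetical helper (not part of the library) that parses the generated XML files and counts bounding boxes per class, making unusual class distributions easy to spot:

import glob
import xml.etree.ElementTree as ET
from collections import Counter

# Count the bounding boxes per class across all generated annotations.
box_counts = Counter()
for xml_file in glob.glob('/example/dataset_labels/*.xml'):
    root = ET.parse(xml_file).getroot()
    for obj in root.findall('object'):
        box_counts[obj.find('name').text] += 1

print(box_counts)  # e.g. Counter({'dog': 912, 'cat': 877})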

After the review, you’ll be ready to retrain the initial model with the whole dataset.

Project extension: MLOps use case

After auto-annotating my whole dataset and training an awesome model, I've finished my project, right? Well, maybe not…

Machine learning engineering and model productization are very important areas nowadays: a model in production needs to be monitored and improved over time, and deep learning models, like object detectors, are no exception.

When you release an image-based product, users will probably send images that are not so common in the training dataset. Consider the cat and dog detector, for example: you may not have many images of dogs on the beach in your dataset, but you may receive a lot of them during the summer vacations!

That said, a great use case for this project is to build an auto annotation pipeline that constantly generates new labels from the images users send in production. You can then feed the new images and the generated annotations into an automatic training pipeline that retrains your object detection model every month, as sketched below.
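
As a rough illustration, such a pipeline could simply wrap the command-line interface shown earlier. All the paths below are hypothetical, and the scheduling (e.g. a monthly cron job or an Airflow DAG) is left out:

import subprocess

# Auto-label the images collected from users since the last run,
# reusing the current production model as the annotator.
subprocess.run([
    'python', '-m', 'auto_annotate',
    '--label_map_path', '/example/label_map.pbtxt',
    '--saved_model_path', '/example/production_model/saved_model',
    '--imgs_path', '/example/new_user_images',
    '--xml_path', '/example/new_user_labels',
    '--threshold', '0.65',
], check=True)

# After a human review of the generated labels, the new image/XML pairs
# can be merged into the training set and a retraining job kicked off.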

With this approach, you help ensure that your model stays up to date with user behavior and keeps performing well. Also, as your model becomes more robust, less manual validation will be required to annotate the new images.

Conclusion and final thoughts

As deep learning keeps advancing with new state-of-the-art architectures and ever more computing power, data remains a huge bottleneck in artificial intelligence.

To train an object detection model that is good enough to go to production, a lot of annotated data is necessary, and this can easily become the most expensive part of the project.

Given that fact, an auto annotation tool can be a great helper in this process, as it speeds up the human task by inferring the location and class of objects in the image.

Among other options, the auto annotation tool introduced in this article has the advantage of being free, open-source, and easy to use. Although some manual work is still necessary, this library has helped me in a lot of different projects so far, and I think it can help other people too!
