
Train your own object detector with Faster-RCNN & PyTorch

A guide to object detection with Faster-RCNN and PyTorch

Johannes Schmidt
18 min read · Feb 23, 2021
Creating a human head detector

After working with CNNs for 2D/3D image segmentation and writing a beginner’s guide about it, I decided to try another important field in Computer Vision (CV): object detection. There are several popular architectures like RetinaNet, YOLO and SSD, and even powerful libraries like detectron2 that make object detection incredibly easy. In this tutorial, however, I want to share my approach to creating a custom dataset and using it to train an object detector with PyTorch and the Faster-RCNN architecture. I will show you how images downloaded from the internet can be annotated with bounding boxes using the open-source, multi-dimensional image viewer napari. The provided code is written specifically for Faster-RCNN models, but parts of it might work with other model architectures (e.g. YOLO), because the general principles apply to all common object detection models that are based on anchor/default boxes. Thanks to transfer learning, you will see that training an object detector sometimes requires very few images! You can find the code and a Jupyter notebook in this GitHub repository:

github.com/johschmidt42/PyTorch-Object-Detection-Faster-RCNN-Tutorial
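To make the anchor/default box idea a bit more concrete: detectors of this family typically decide which anchors count as matches for an annotated box by measuring their overlap, usually as intersection over union (IoU). The snippet below is only a minimal sketch of that measure in plain Python and is not taken from the repository. The `(x_min, y_min, x_max, y_max)` box convention and the names `Box`, `box_area` and `iou` are assumptions made for illustration.

```python
from typing import Sequence

# Assumed convention for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Sequence[float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box; inverted or empty boxes count as zero."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    # The intersection is itself a box; it is empty if the boxes do not overlap.
    intersection = box_area((
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    ))
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical values: one annotated head and one candidate anchor box.
annotated_head = (10.0, 10.0, 50.0, 50.0)
anchor = (30.0, 30.0, 70.0, 70.0)
overlap = iou(annotated_head, anchor)
```

An anchor with a high IoU against an annotated head would be treated as a positive example during training, one with a low IoU as background. The exact thresholds depend on the model configuration and are not shown here.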

For this tutorial, I am going to train a human head detector. This is a common task for phone camera applications: detecting human faces or heads within an image. If you want to train your own object detector, e.g. for raccoon detection, car detection or whatever comes to mind, you’re in the right place, and the same steps should carry over to your own data.

For training and experiment management, I will use PyTorch Lightning and neptune. If you’re not familiar with these packages, do not worry: you will still be able to implement your own training logic and choose your own experiment tracker. Here’s the table of contents:

  1. Getting images
  2. Annotating
  3. Dataset building
  4. Faster R-CNN in PyTorch
  5. Training
  6. Inference

Getting images


Written by Johannes Schmidt

Software & Data Engineer at Datamesh GmbH. Sharing knowledge and code around software (cloud) development, data engineering & data science!
