Train your own object detector with Faster-RCNN & PyTorch
A guide to object detection with Faster-RCNN and PyTorch
After working with CNNs for the purpose of 2D/3D image segmentation and writing a beginner’s guide about it, I decided to try another important field in Computer Vision (CV): object detection. There are several popular architectures like RetinaNet, YOLO, SSD, and even powerful libraries like detectron2 that make object detection incredibly easy. In this tutorial, however, I want to share my approach to creating a custom dataset and using it to train an object detector with PyTorch and the Faster-RCNN architecture. I will show you how images downloaded from the internet can be used to generate annotations (bounding boxes) with the open-source, multi-dimensional image viewer napari. The provided code is written specifically for Faster-RCNN models, but parts of it might work with other architectures (e.g. YOLO), because the general principles apply to all common object detection models based on anchor/default boxes. Thanks to transfer learning, you will see that training an object detector sometimes requires very few images! You can find the code and a Jupyter notebook in this GitHub repository:
github.com/johschmidt42/PyTorch-Object-Detection-Faster-RCNN-Tutorial
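To give a flavour of what transfer learning looks like in practice, here is a minimal sketch (not taken from the repository above) that uses torchvision to load a Faster-RCNN model pre-trained on COCO and replaces its box predictor head with one sized for your own classes. The class count of 2 (one object class plus background) is just an illustrative assumption; adjust it to your dataset, and note that newer torchvision releases prefer the `weights=` argument over `pretrained=True`.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a Faster-RCNN model with a ResNet-50 FPN backbone, pre-trained on COCO.
# On recent torchvision versions, use weights="DEFAULT" instead of pretrained=True.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# Example only: 1 object class + background. Change this for your own dataset.
num_classes = 2

# Swap the pre-trained box predictor head for a new one with num_classes outputs;
# the backbone and region proposal network keep their pre-trained weights.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
```

Because only the small head is re-initialised while the rest of the network starts from COCO weights, the model already "knows" a lot about objects in natural images, which is exactly why a handful of annotated images can be enough.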