Last updated July 15, 2024

A Hands-On Guide to IceVision Framework for Object Detection


Published on November 30, 2021

by Yugesh Verma

IceVision is an object detection framework that offers a range of pre-trained models and several ways to train them. It also provides data curation features and a dashboard for exploratory data analysis. Its main strength is an end-to-end deep learning workflow that lets practitioners train networks with robust, easy-to-use, high-performance libraries such as PyTorch Lightning and fastai. In this article, we discuss the IceVision framework for object detection and walk through a hands-on implementation. The major points are listed below.

Table of Contents

What is IceVision?
Installing IceVision
Data Preparation
1. Importing Libraries
2. Download and Prepare a Dataset
3. Parse the data
4. Creating Datasets with Augmentations and Transform
Model Building
1. Pre-Modelling Procedures
2. Training
  1. Training using FastAI
  2. Training using PyTorch Lightning

Let’s begin the discussion by understanding what IceVision is.

What is IceVision?

IceVision is a framework that lets us preprocess data for object detection, train a model on that data, and then use the trained model for inference. It acts as a connecting layer between deep learning engines, libraries and models. It also ships with datasets that are useful for learning the basics, and its models are built on libraries such as TorchVision and Ultralytics YOLO.

We can choose from many models supported by the framework and switch between them easily. With IceVision, we can train a model on one dataset and later swap the dataset or the model as requirements change. According to its official GitHub repository, some of the features of IceVision are listed below.

Using auto_fix from the framework, we can automate the data curation and cleaning procedure.
The framework gives access to a dashboard that is helpful for exploratory data analysis.
The framework includes various models for object detection, segmentation, and classification.
The framework is compatible with several libraries that cover different aspects of computer vision programming.
The framework includes various transformation modules that help train the model more accurately.

In the next part of the article, we work through a basic example of using the IceVision framework.

Installing IceVision

Let’s start with the installation. First, we download the installation script using the following line of code.

!wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh

The script installs Torch, TorchVision, the IceVision framework, IceData, MMDetection, YOLOv5 and EfficientDet. We can run it using the following line of code.

!bash icevision_install.sh cuda11 master

Output

Since we are using Google Colab, some requirements such as torch and TorchVision are already installed in the environment. We can also change the installation target from cuda11 to cuda10 or cpu. After the installation, we restart the kernel using the restart option in the Runtime menu of the notebook, or with the keyboard shortcut Ctrl + M followed by the period key.
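After the restart, it can help to confirm that the packages are importable before going further. The following is a minimal sketch that uses only the standard library. The function name check_installed and the list of import names are assumptions made for illustration, so adjust them to the packages you actually installed.

```python
import importlib.util

# Import names assumed from the packages mentioned above; adjust as needed.
PACKAGES = ["torch", "torchvision", "icevision", "icedata"]


def check_installed(names):
    """Return a dict mapping each top-level import name to True if it is found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report which packages the current environment can import.
for name, found in check_installed(PACKAGES).items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If any entry is reported as missing, rerunning the installation script and restarting the kernel is usually the first thing to try.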

Data Preparation

Before moving on to modelling, we need records from which a model can be built. In this section, we discuss how to prepare data for modelling using the IceVision framework.

Importing Libraries

We can import all the components of the IceVision framework using the following line of code.

from icevision.all import *

Download and Prepare a Dataset

Before modelling, we need a dataset. Here we use the Fridge Objects dataset, which contains 134 images belonging to four classes:

Can
Carton
Milk bottle
Water bottle

Using the IceData module, we can download the dataset as follows.

Import the Data

import icedata

path = icedata.fridge.load_data()

Output:

Parse the Data

Using the parser module of the framework, we can load the annotation files and split the data into training and validation sets. During parsing, the framework also helps detect and fix common errors in the annotations.

# Create the parser

parser = parsers.VOCBBoxParser(annotations_dir=path / "odFridgeObjects/annotations", images_dir=path / "odFridgeObjects/images")
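To make the annotation format concrete, here is a minimal sketch of reading one Pascal VOC style XML file with the standard library. It is not how IceVision implements its parser. The sample XML, its file name and box values, and the helper name read_voc_annotation are all hypothetical and only illustrate the kind of information a VOC bounding box parser extracts.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (file name and values invented).
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>499</width><height>666</height><depth>3</depth></size>
  <object>
    <name>carton</name>
    <bndbox><xmin>100</xmin><ymin>173</ymin><xmax>233</xmax><ymax>521</ymax></bndbox>
  </object>
  <object>
    <name>milk_bottle</name>
    <bndbox><xmin>247</xmin><ymin>192</ymin><xmax>403</xmax><ymax>494</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_annotation(xml_text):
    """Extract the file name, image size and labelled boxes from VOC XML text."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(bndbox.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), coords))
    return {
        "filename": root.findtext("filename"),
        "width": int(root.findtext("size/width")),
        "height": int(root.findtext("size/height")),
        "boxes": boxes,
    }


record = read_voc_annotation(SAMPLE_XML)
print(record["filename"], record["width"], record["height"])
for label, box in record["boxes"]:
    print(label, box)
```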

The following lines of code parse the annotations and split the resulting records into training and validation sets.

# Parse annotations to create records

train, valid = parser.parse()

parser.class_map

Output:
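To picture the split performed by parser.parse(), here is a minimal sketch of a random train and validation split over record ids. It assumes an 80/20 proportion. The fraction, the seed, and the function name random_split are illustrative assumptions rather than confirmed IceVision defaults.

```python
import random


def random_split(record_ids, valid_fraction=0.2, seed=42):
    """Shuffle the ids and hold out a fraction of them for validation."""
    ids = list(record_ids)
    random.Random(seed).shuffle(ids)
    n_valid = int(round(len(ids) * valid_fraction))
    # The first n_valid shuffled ids become validation, the rest training.
    return ids[n_valid:], ids[:n_valid]


# 134 matches the number of images in the Fridge Objects dataset.
train_ids, valid_ids = random_split(range(134))
print(len(train_ids), len(valid_ids))
```

Fixing the seed keeps the split reproducible, which makes metrics comparable across runs.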

Creating Datasets with Augmentations and Transform

Data augmentation and transformation help a model train well and perform accurately on the data. IceVision supports this through the Albumentations library, which is used to define and execute transformations. The framework provides various transformations. In this article, we use aug_tfms, which applies augmentations such as rotation, cropping, horizontal flips, and more.

Let’s define the transformations for the training and validation data.

train_trans = tfms.A.Adapter([*tfms.A.aug_tfms(size=384, presize=512), tfms.A.Normalize()])

valid_trans = tfms.A.Adapter([*tfms.A.resize_and_pad(384), tfms.A.Normalize()])
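The validation transform resizes each image to fit a 384 pixel square and pads the remainder. The sketch below shows the geometry of a typical resize-and-pad step, assuming the longer side is scaled to the target size and the padding is split evenly on both sides. That is an assumption about the general technique, not a description of IceVision or Albumentations internals, and the function names are hypothetical.

```python
def resize_and_pad_geometry(width, height, size):
    """Return the scale, resized (w, h) and (left, top) padding for a square target."""
    scale = size / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Split the leftover space evenly; any odd pixel goes to the right or bottom.
    left, top = (size - new_w) // 2, (size - new_h) // 2
    return scale, (new_w, new_h), (left, top)


def map_box(box, scale, offset):
    """Map an (xmin, ymin, xmax, ymax) box into the resized and padded image."""
    xmin, ymin, xmax, ymax = box
    left, top = offset
    return (xmin * scale + left, ymin * scale + top,
            xmax * scale + left, ymax * scale + top)


# Example with a hypothetical 499 x 666 image and one bounding box.
scale, new_size, offset = resize_and_pad_geometry(499, 666, 384)
print(new_size, offset)
print(map_box((100, 173, 233, 521), scale, offset))
```

The key point is that bounding boxes must be scaled and shifted together with the image, which is why transformations are applied to whole records rather than to images alone.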

Applying the transformations to the data

train_data = Dataset(train, train_trans)

valid_data = Dataset(valid, valid_trans)

Let’s visualize the data after augmentation is performed.

vis = [train_data[1] for _ in range(8)]

print("training  data")

show_samples(vis, ncols=4)

Output:

training data

vis = [valid_data[1] for _ in range(8)]

print("validation data")

show_samples(vis, ncols=4)

Output:

validation data

Model Building

Before training, we need to instantiate the model and prepare the data in the form the model expects. Let’s start with these pre-modelling procedures.

Pre-Modelling Procedures

To build a model using the IceVision framework, we need to select a library, a model, and a backbone. These must be chosen from the options the framework supports.

Here we use the RetinaNet model from MMDetection with the resnet50_fpn_1x backbone, which can be specified using the following lines of code.

model_type = models.mmdet.retinanet

backbone = model_type.backbones.resnet50_fpn_1x

Now we can instantiate the model using the following lines of code.

# No extra model arguments are needed for this model, so we pass an empty dict
extra_args = {}

model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(parser.class_map), **extra_args)

Since there are many model and backbone options, the data must be prepared in the form the chosen model expects. So far we have seen how to load and transform the data. To feed it to the model, the framework provides model-specific data loaders that batch the datasets accordingly.

# Data Loaders

train_load = model_type.train_dl(train_data, batch_size=8, num_workers=4, shuffle=True)

valid_load = model_type.valid_dl(valid_data, batch_size=8, num_workers=4, shuffle=False)
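The batch_size and shuffle arguments control how records are grouped in each epoch. Here is a minimal sketch of that grouping in plain Python. The function name iter_batches and the seed parameter are hypothetical, and a real data loader also collates images and targets into tensors, which this sketch leaves out.

```python
import random


def iter_batches(items, batch_size, shuffle=False, seed=None):
    """Yield consecutive batches, optionally shuffling the order first."""
    order = list(items)
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


# With 27 validation records and batch_size=8, the last batch is smaller.
for batch in iter_batches(range(27), batch_size=8):
    print(len(batch), batch)
```

Shuffling is useful for training so that each epoch sees the records in a different order, while validation keeps a fixed order so results stay comparable.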

Let’s visualize the batch for validation in the loader.

model_type.show_batch(first(valid_load), ncols=4)

Output:

Now we can track training progress with either fastai or PyTorch Lightning using the metric classes the framework provides. We only need to create a list that holds the metrics.

metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
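COCO style bounding box metrics are built on intersection over union (IoU), which measures how much a predicted box overlaps a ground truth box. The following is a minimal sketch of IoU for two boxes in (xmin, ymin, xmax, ymax) form. The function name box_iou is chosen for illustration and this is not the code IceVision uses internally.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 10 x 10 boxes that overlap on half of their width.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

A prediction is counted as a match only when its IoU with a ground truth box exceeds a chosen threshold, and the COCO metric averages precision over a range of such thresholds.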

Training

The metrics defined above can now be used when training the model with either fastai or PyTorch Lightning. Both support the same metrics.

Training using fastai

training = model_type.fastai.learner(dls=[train_load, valid_load], model=model, metrics=metrics)

Output:

Tuning the Model

training.fine_tune(20, 0.00158, freeze_epochs=1)

Output:

The output above shows some of the results from tuning the model, with the best result highlighted. The table reports the training and validation losses along with the metrics we chose to track.
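To see what a call such as fine_tune(20, 0.00158, freeze_epochs=1) roughly does, the sketch below lays out the two training phases as plain data. It follows my understanding of fastai's fine_tune: train with the body frozen first, then halve the base learning rate and train all layers with a range of learning rates. The lr_mult default of 100 and the function name fine_tune_plan are assumptions, so check them against your fastai version.

```python
def fine_tune_plan(epochs, base_lr, freeze_epochs=1, lr_mult=100):
    """Describe the frozen and unfrozen phases of a fine-tuning run."""
    frozen = {"phase": "frozen", "epochs": freeze_epochs, "max_lr": base_lr}
    # After the frozen phase the base learning rate is halved, and earlier
    # layers get a smaller rate than later ones (discriminative rates).
    half_lr = base_lr / 2
    unfrozen = {
        "phase": "unfrozen",
        "epochs": epochs,
        "lr_range": (half_lr / lr_mult, half_lr),
    }
    return [frozen, unfrozen]


for phase in fine_tune_plan(20, 0.00158, freeze_epochs=1):
    print(phase)
```

This is why the results table contains one more epoch than the number passed as the first argument: the frozen epoch runs before the 20 unfrozen ones.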

Training using PyTorch Lightning

We can also train the model using PyTorch Lightning. The procedure is almost the same, but the code is different. We can use the following lines of code to define the model for PyTorch Lightning:

class LightModel(model_type.lightning.ModelAdapter):

    def configure_optimizers(self):

        return Adam(self.parameters(), lr=1e-4)

light_model = LightModel(model, metrics=metrics)

We can create the trainer and fit the model using the following lines of code:

trainer = pl.Trainer(max_epochs=5, gpus=1)

trainer.fit(light_model, train_load, valid_load)

We can also check the results on the validation dataset using the following line of code:

model_type.show_results(model, valid_data, detection_threshold=.5)

Output:

The output above is the final result of the object detection process built with the IceVision framework, and the model appears to work well on the validation images. Because IceVision is open source, we can use it in our own projects.
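The detection_threshold argument decides which predictions are displayed. The sketch below illustrates the idea on hand-written predictions. The scores, labels, dictionary layout and function name filter_detections are hypothetical, and whether the real comparison is inclusive at the threshold is an assumption.

```python
def filter_detections(detections, threshold=0.5):
    """Keep only detections whose confidence score reaches the threshold."""
    return [det for det in detections if det["score"] >= threshold]


# Hypothetical predictions for one image.
predictions = [
    {"label": "can", "score": 0.91, "box": (40, 60, 120, 220)},
    {"label": "carton", "score": 0.48, "box": (150, 30, 260, 300)},
    {"label": "milk_bottle", "score": 0.73, "box": (280, 50, 360, 310)},
]

for det in filter_detections(predictions, threshold=0.5):
    print(det["label"], det["score"])
```

Raising the threshold shows fewer but more confident boxes, while lowering it reveals weaker predictions that may be false positives.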

Final Words

In this article, we have seen an overview of the IceVision framework for object detection. We have also seen how to use the models and data it provides and how to put together a complete workflow for an object detection task. I encourage readers to explore the framework further and try it on other computer vision problems.


Yugesh Verma

Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.