Object detection is a computer vision task that has recently been reshaped by advances in machine learning.
Which algorithm do you employ for object detection tasks? R-CNN has provided a useful structure for the problem, and it is the family we will focus on here.
First, let's clarify what object detection is.
Object detection is the process of finding real-world instances of objects, such as cars, bicycles, televisions, flowers, and people, in still images or videos. It allows us to recognize, localize, and detect multiple objects within an image, giving us a much better understanding of the image as a whole. It is commonly used in applications such as image retrieval, security, surveillance, and advanced driver assistance systems (ADAS).
Object detection can be done in several ways. Each object detection algorithm works differently, but they all operate on the same underlying principle.
In general, the object detection task is performed in three steps:

1. Generate regions of interest (candidate bounding boxes) in the image.
2. Extract visual features for each region and classify the object it contains.
3. In a post-processing step, merge overlapping candidate boxes for the same object into a single detection (non-maximum suppression).
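As an illustration of the third step, here is a minimal sketch using TensorFlow's built-in non-max suppression; the boxes and scores are made-up values:

```
import tensorflow as tf

# Two of these three candidate boxes overlap heavily and describe the same object.
boxes = tf.constant([[0.0, 0.0, 1.0, 1.0],
                     [0.05, 0.05, 1.0, 1.0],   # near-duplicate of the first box
                     [0.5, 0.5, 0.9, 0.9]])
scores = tf.constant([0.9, 0.8, 0.7])

# Non-max suppression keeps the highest-scoring box and drops overlapping duplicates.
keep = tf.image.non_max_suppression(boxes, scores, max_output_size=3, iou_threshold=0.5)
print(keep.numpy())  # [0 2]: the near-duplicate second box is suppressed
```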
Now that you have understood the basic object detection workflow, let's move forward in this object detection tutorial and understand what TensorFlow is.
TensorFlow is Google's open-source machine learning framework for dataflow programming across a range of tasks. Computations are expressed as graphs: the nodes represent mathematical operations, while the edges represent the multidimensional data arrays (tensors) that flow between them.
Tensors are just multidimensional arrays: extensions of two-dimensional arrays to higher-dimensional data. TensorFlow has various features that make it well suited to deep learning. So, without wasting time, let's see how we can implement object detection using TensorFlow.
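For instance, here is a quick sketch of tensors of different ranks:

```
import tensorflow as tf

scalar = tf.constant(3.0)                 # rank 0: a single number
vector = tf.constant([1.0, 2.0, 3.0])     # rank 1: a 1-D array
matrix = tf.constant([[1, 2], [3, 4]])    # rank 2: a 2-D array
cube   = tf.zeros([2, 3, 4])              # rank 3: higher-dimensional data

print(matrix.shape)  # (2, 2)
print(cube.shape)    # (2, 3, 4)
```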
Let's quickly review the three R-CNN family algorithms (R-CNN, Fast R-CNN, and Faster R-CNN) that we looked at in the first post. This will make it straightforward to implement them when we predict bounding boxes on previously unseen images (new data).
R-CNN
R-CNN uses selective search to extract several regions from a given image and then determines whether those regions contain objects. After the regions are extracted, a CNN is applied to each one to extract its features, and object detection is finally performed using these features. Unfortunately, because every one of the (roughly 2,000) proposed regions must be passed through the CNN separately, R-CNN is comparatively slow.
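To make the flow concrete, here is a schematic sketch of the R-CNN pipeline just described; the three callables are hypothetical stand-ins for selective search, the CNN feature extractor, and the per-class classifiers, not part of any real library:

```
def rcnn_detect(image, propose_regions, cnn_features, classify):
    detections = []
    # Step 1: selective search proposes ~2,000 candidate boxes (y0, x0, y1, x1).
    for (y0, x0, y1, x1) in propose_regions(image):
        # Step 2: each region is cropped and passed through the CNN separately;
        # this per-region forward pass is what makes R-CNN slow.
        features = cnn_features(image[y0:y1, x0:x1])
        # Step 3: a classifier scores the extracted features for each class.
        label, score = classify(features)
        detections.append(((y0, x0, y1, x1), label, score))
    return detections
```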
Fast R-CNN
Fast R-CNN, on the other hand, feeds the entire image to a ConvNet once, producing a shared feature map from which the regions of interest are taken (instead of cropping regions from the image itself). Additionally, rather than the three independent models we saw with R-CNN, it employs a single model that extracts features from the regions, classifies them into the various classes, and returns the bounding boxes.
All these steps are performed simultaneously, making it faster than R-CNN. However, Fast R-CNN is still not fast enough on large datasets, because it also relies on selective search to extract regions.
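For contrast, here is the same kind of schematic sketch for Fast R-CNN; again, the callables are hypothetical stand-ins rather than real library functions:

```
def fast_rcnn_detect(image, convnet, propose_regions, roi_pool, detector_head):
    # The convolutional feature map is computed ONCE for the whole image
    # and shared by every region -- the key speed-up over R-CNN.
    feature_map = convnet(image)
    detections = []
    # Region proposals still come from selective search, the remaining bottleneck.
    for region in propose_regions(image):
        # RoI pooling extracts a fixed-size feature vector from the shared map.
        roi_features = roi_pool(feature_map, region)
        # A single head classifies the region and regresses its bounding box.
        label, score, box = detector_head(roi_features)
        detections.append((box, label, score))
    return detections
```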
Faster R-CNN
Faster R-CNN is an extension of Fast R-CNN. As the name suggests, Faster R-CNN is faster than Fast R-CNN, thanks to the region proposal network (RPN).
The main contributions of this work are:
The Region Proposal Network (RPN) is a fully convolutional network that generates proposals at different scales and aspect ratios. The RPN applies the neural-network idea of attention, telling the object detector (Fast R-CNN) where to look.
Rather than using image pyramids (i.e., multiple copies of an image at different scales) or filter pyramids (i.e., multiple filters of different sizes), the paper introduced the concept of anchor boxes. An anchor box is a reference box with a specific scale and aspect ratio. With several reference boxes, a single position is covered at various scales and aspect ratios; one can think of this as a pyramid of reference anchor boxes. Each region is then mapped to a reference anchor box, enabling the detection of objects at different scales and aspect ratios (see the sketch after this list).
The RPN and Fast R-CNN share their convolutional computations, which shortens the overall computation time.
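Here is a minimal NumPy sketch of how such reference anchors can be generated at one location; the three scales and three aspect ratios match the defaults used in the Faster R-CNN paper, giving nine anchors per position:

```
import numpy as np

def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    # One anchor per (scale, ratio) combination, centered at (cx, cy).
    anchors = []
    for scale in scales:
        for ratio in ratios:
            # Keep the area fixed at scale**2 while varying the width/height ratio.
            w = scale * np.sqrt(ratio)
            h = scale / np.sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(anchors)

print(make_anchors(0, 0).shape)  # (9, 4): 3 scales x 3 aspect ratios
```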
The following figure depicts the Faster R-CNN architecture. It consists of two modules: the RPN, which generates region proposals, and the Fast R-CNN detector. By applying the attention principle from neural networks, the RPN module directs the Fast R-CNN detection module where to look for objects in the image.
The next step is to navigate to the object_detection directory inside the research subfolder, create a new Python file, and paste the following code.
```
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import pathlib

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from IPython.display import display

from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# If the script is started from inside the models directory, move up to its parent.
while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')

def load_model(model_name):
    # Download the pre-trained model from the TensorFlow model zoo
    # and load it as a SavedModel.
    base_url = 'http://download.tensorflow.org/models/object_detection/'
    model_file = model_name + '.tar.gz'
    model_dir = tf.keras.utils.get_file(
        fname=model_name,
        origin=base_url + model_file,
        untar=True)
    model_dir = pathlib.Path(model_dir) / "saved_model"
    model = tf.saved_model.load(str(model_dir))
    return model

# Label map that assigns the correct COCO class name to each detection.
PATH_TO_LABELS = 'models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(
    PATH_TO_LABELS, use_display_name=True)

model_name = 'ssd_inception_v2_coco_2017_11_17'
detection_model = load_model(model_name)
```
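As an optional sanity check once the model has downloaded, you can inspect its serving signature to see which input it expects and which outputs it returns:

```
infer = detection_model.signatures['serving_default']
print(infer.inputs)              # the expected input tensor(s)
print(infer.structured_outputs)  # detection_boxes, detection_classes, detection_scores, ...
```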
```
def run_inference_for_single_image(model, image):
    image = np.asarray(image)
    # The input needs to be a tensor; convert it using 'tf.convert_to_tensor'.
    input_tensor = tf.convert_to_tensor(image)
    # The model expects a batch of images, so add an axis with 'tf.newaxis'.
    input_tensor = input_tensor[tf.newaxis, ...]

    # Run inference.
    model_fn = model.signatures['serving_default']
    output_dict = model_fn(input_tensor)

    # All outputs are batch tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(output_dict.pop('num_detections'))
    output_dict = {key: value[0, :num_detections].numpy()
                   for key, value in output_dict.items()}
    output_dict['num_detections'] = num_detections

    # detection_classes should be ints.
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

    # Handle models with masks:
    if 'detection_masks' in output_dict:
        # Reframe the bounding-box masks to fit the image.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            output_dict['detection_masks'], output_dict['detection_boxes'],
            image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                           tf.uint8)
        output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()

    return output_dict

def show_inference(model, image_path):
    # The array-based representation of the image will be used later to
    # generate the output image with boxes and labels on it.
    image_np = np.array(Image.open(image_path))
    # Actual detection.
    output_dict = run_inference_for_single_image(model, image_np)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        line_thickness=8)
    display(Image.fromarray(image_np))
```
There is a test_images folder inside the object_detection directory. It already contains two photos that we will use to test the model. Run the cells below to obtain the results, adding any images of your own in which you wish to locate objects.
```
PATH_TO_TEST_IMAGES_DIR = pathlib.Path('models/research/object_detection/test_images')
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))
```
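If you want to test a photo of your own, copy it into the same folder, or append its path to the list (the file name below is just a placeholder):

```
TEST_IMAGE_PATHS.append(PATH_TO_TEST_IMAGES_DIR / 'my_photo.jpg')
```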
```
for image_path in TEST_IMAGE_PATHS:
    print(image_path)
    show_inference(detection_model, image_path)
```
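The same model can also be pointed at a live webcam feed. Here is a minimal sketch, assuming OpenCV (cv2) is installed; it reuses the run_inference_for_single_image helper defined above:

```
import cv2

cap = cv2.VideoCapture(0)  # open the default webcam
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # OpenCV captures frames as BGR; the model expects RGB.
    image_np = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    output_dict = run_inference_for_single_image(detection_model, image_np)
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        line_thickness=8)
    cv2.imshow('Object Detection', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press 'q' to quit
        break

cap.release()
cv2.destroyAllWindows()
```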
We have now reached the end of this blog, in which we learned how to use the TensorFlow Object Detection API to detect objects in both photos and webcam feeds.