TensorFlow Hub for Object Detection using Faster RCNN


  • Published on November 17th, 2022


 

Introduction

 

Object detection is a computer vision task that has advanced rapidly thanks to recent progress in machine learning.

Which algorithm do you use for object detection tasks? The R-CNN family provides a useful framework.

First, let's clarify what object detection is.

 

 

What is Object Detection?

 

Object Detection is the process of finding real instances of objects, such as a car, bicycle, television, flowers, and people, in still images or videos. It allows recognition, localization, and detection of multiple objects within an image, giving us a much better understanding of the image. It is commonly used in image retrieval, security, surveillance, and advanced driver assistance systems (ADAS) applications.

Object detection can be done in several ways:

 

  • Object detection function
  • Viola Jones Object Detection
  • SVM classification with HOG features
  • Deep Learning object detection

 

 

Object Detection Workflow

 

Each object detection algorithm has a different way of working, but they all work on the same principle.

In general, the object detection task is performed in three steps:

  • The algorithm generates small segments (candidate rectangular regions) in the input image.
  • For each candidate region, feature extraction is performed to predict whether the rectangle contains a valid object.
  • Overlapping boxes for the same object are reduced to a single bounding rectangle (non-maximum suppression).
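
The last step above can be sketched in a few lines of plain Python. This is a minimal illustration of the common greedy form of non-maximum suppression, which keeps the highest-scoring box and discards boxes that overlap it too much; it is not taken from any particular library. The function names, the (x1, y1, x2, y2) box format, and the 0.5 threshold are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes that survive, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hand-made example: the first two boxes cover the same object.
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)  # [0, 2]
```

The second box is dropped because its overlap with the first (IoU of about 0.68) exceeds the threshold, while the third box does not overlap anything and is kept.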

 

Now that you understand the basic object detection workflow, let's move on in this object detection tutorial and look at what TensorFlow is.

 

 


 

 

What is TensorFlow?

 

TensorFlow is Google's open-source machine learning framework for dataflow programming across a range of tasks. It represents a computation as a graph: the nodes represent mathematical operations, while the edges represent the multidimensional data arrays (tensors) that flow between them.
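
To make the graph idea concrete, here is a toy sketch in plain Python. It is not TensorFlow's API; the `graph` dictionary, the `evaluate` helper, and the node names are all hypothetical and exist only to show operations as nodes and tensors (here, nested lists) travelling along edges.

```python
def matmul(a, b):
    """Multiply two 2-D tensors stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Add two 2-D tensors of the same shape element by element."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Each node maps to (operation, names of the nodes feeding into it).
graph = {
    "product": (matmul, ["x", "w"]),
    "output": (add, ["product", "b"]),
}


def evaluate(node, graph, feeds, cache=None):
    """Compute a node by first computing everything on its incoming edges."""
    cache = {} if cache is None else cache
    if node in feeds:
        return feeds[node]
    if node not in cache:
        op, inputs = graph[node]
        cache[node] = op(*(evaluate(name, graph, feeds, cache) for name in inputs))
    return cache[node]


feeds = {"x": [[1, 2]], "w": [[3], [4]], "b": [[5]]}
result = evaluate("output", graph, feeds)
print(result)  # [[16]]
```

Here `x` (shape 1×2) and `w` (shape 2×1) flow into the `product` node, whose result flows together with `b` into the `output` node.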

 

Tensors are simply multidimensional arrays: an extension of two-dimensional arrays to higher-dimensional data. Several features of TensorFlow make it well suited to deep learning. With that background, let's see how we can implement object detection using TensorFlow.

 

 

A Brief Overview of Different R-CNN Algorithms for Object Detection

 

Let's quickly review the three R-CNN family algorithms (R-CNN, Fast R-CNN, and Faster R-CNN) that we looked at in the first post. This will make the implementation easier to follow when we predict bounding boxes on previously unseen images (new data).

 

R-CNN

R-CNN uses selective search to extract several regions from a given image and then determines whether those regions contain objects. The regions are extracted first, and a CNN is then applied to each region to extract its features. Finally, object detection is performed using these features. Unfortunately, because the process involves so many separate steps, R-CNN is comparatively slow.

 

Fast R-CNN

Fast R-CNN, on the other hand, sends the entire image through the ConvNet once, and the regions of interest are taken from the resulting feature map (instead of passing each extracted region through the network separately). Additionally, rather than three independent models (as we saw with R-CNN), it employs a single model that extracts features from the regions, classifies them into various classes, and returns bounding boxes.

 

All these steps are performed simultaneously, making it faster than R-CNN. However, Fast R-CNN is still not fast enough when applied to a large dataset because it also uses selective search to extract regions.

 

Faster R-CNN

Faster R-CNN is an extension of Fast R-CNN. As the name suggests, Faster R-CNN is faster than Fast R-CNN, thanks to the region proposal network (RPN).

 

The main contributions of Faster R-CNN are:

  • Region Proposal Network (RPN): a fully convolutional network that generates proposals with different scales and aspect ratios. The RPN applies the neural-network idea of attention: it tells the object detector (Fast R-CNN) where to look.
  • Anchor boxes: rather than using image pyramids (i.e., multiple instances of an image at different scales) or filter pyramids (i.e., multiple filters of different sizes), the Faster R-CNN paper introduced the concept of anchor boxes. An anchor box is a reference box with a specific scale and aspect ratio. With multiple reference anchor boxes, a single region is covered at several scales and aspect ratios. One can think of this as a pyramid of reference anchor boxes. Each region is then mapped to each reference anchor box, which enables the detection of objects at various scales and aspect ratios.
  • Shared computation: the RPN and Fast R-CNN share their convolutional computations, which shortens the computation time.
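
The anchor-box idea described above can be sketched as follows. This is an illustrative plain-Python example rather than the code of any detector; the `generate_anchors` helper is hypothetical, and the scales (128, 256, 512) and aspect ratios (0.5, 1.0, 2.0) are assumed example values that give nine anchors per location.

```python
import math


def generate_anchors(center_x, center_y, scales, aspect_ratios):
    """Return (x1, y1, x2, y2) anchors centered on one point.

    Each anchor has an area of roughly scale**2 and a width/height
    ratio equal to the given aspect ratio.
    """
    anchors = []
    for scale in scales:
        for ratio in aspect_ratios:
            width = scale * math.sqrt(ratio)
            height = scale / math.sqrt(ratio)
            anchors.append((
                center_x - width / 2,
                center_y - height / 2,
                center_x + width / 2,
                center_y + height / 2,
            ))
    return anchors


anchors = generate_anchors(100, 100, scales=(128, 256, 512), aspect_ratios=(0.5, 1.0, 2.0))
print(len(anchors))  # 9
```

In a real detector the same set of anchors would be repeated at every position of the feature map, and the RPN would score and refine each one.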

 

The following figure depicts the Faster R-CNN architecture. It has two modules:

  • RPN: generates region proposals.
  • Fast R-CNN: detects objects in the proposed regions.

 

The RPN module is responsible for generating region proposals. Applying the idea of attention in neural networks, it directs the Fast R-CNN detector where to look for objects in the image.

 

 

Object Detection Code

 

The next step is to navigate to the "object_detection" directory inside the research subfolder, create a new Python file, and paste in the following code.

 

```
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import pathlib

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from IPython.display import display

from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Move up until we are outside the "models" repository folder.
while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')

def load_model(model_name):
  # Download and extract the model archive, then load its SavedModel.
  base_url = 'http://download.tensorflow.org/models/object_detection/'
  model_file = model_name + '.tar.gz'
  model_dir = tf.keras.utils.get_file(
    fname=model_name,
    origin=base_url + model_file,
    untar=True)

  model_dir = pathlib.Path(model_dir)/"saved_model"

  model = tf.saved_model.load(str(model_dir))

  return model

# Label map that translates class IDs into human-readable names.
PATH_TO_LABELS = 'models/research/object_detection/data/mscoco_label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

model_name = 'ssd_inception_v2_coco_2017_11_17'
detection_model = load_model(model_name)
```

 

```
def run_inference_for_single_image(model, image):
  image = np.asarray(image)
  # The input must be a tensor; convert it with `tf.convert_to_tensor`.
  input_tensor = tf.convert_to_tensor(image)
  # The model expects a batch of images, so add an axis with `tf.newaxis`.
  input_tensor = input_tensor[tf.newaxis, ...]

  # Run inference
  model_fn = model.signatures['serving_default']
  output_dict = model_fn(input_tensor)

  # All outputs are batched tensors.
  # Convert to numpy arrays and remove the batch dimension using index [0].
  # We're only interested in the first num_detections.
  num_detections = int(output_dict.pop('num_detections'))
  output_dict = {key: value[0, :num_detections].numpy()
                 for key, value in output_dict.items()}
  output_dict['num_detections'] = num_detections

  # detection_classes should be ints.
  output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

  # Handle models with masks:
  if 'detection_masks' in output_dict:
    # Resize the bbox mask to fit the image.
    detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
        output_dict['detection_masks'], output_dict['detection_boxes'],
        image.shape[0], image.shape[1])
    detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                       tf.uint8)
    output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()

  return output_dict

def show_inference(model, image_path):
  # The array-based representation of the image is used later
  # to draw the output image with boxes and labels on it.
  image_np = np.array(Image.open(image_path))

  # Actual detection.
  output_dict = run_inference_for_single_image(model, image_np)

  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks_reframed', None),
      use_normalized_coordinates=True,
      line_thickness=8)

  display(Image.fromarray(image_np))
```
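
If you want the detections as data rather than as a drawn image, a small post-processing step like the one below may help. It is a sketch that assumes a dictionary with the same keys as `output_dict` above and boxes in normalized [ymin, xmin, ymax, xmax] order; the `filter_detections` helper, the 0.5 score threshold, and the sample values are hypothetical and hand-made for illustration.

```python
def filter_detections(output_dict, image_height, image_width, min_score=0.5):
    """Keep confident detections and convert their boxes to pixel coordinates."""
    results = []
    for box, class_id, score in zip(output_dict["detection_boxes"],
                                    output_dict["detection_classes"],
                                    output_dict["detection_scores"]):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = (float(v) for v in box)
        results.append({
            "class_id": int(class_id),
            "score": float(score),
            # Pixel box as (left, top, right, bottom).
            "box_pixels": (
                int(round(xmin * image_width)),
                int(round(ymin * image_height)),
                int(round(xmax * image_width)),
                int(round(ymax * image_height)),
            ),
        })
    return results


# Hand-made sample shaped like the dictionary returned above.
sample = {
    "detection_boxes": [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]],
    "detection_classes": [18, 1],
    "detection_scores": [0.92, 0.30],
}
detections = filter_detections(sample, image_height=400, image_width=600)
print(detections[0]["box_pixels"])  # (120, 40, 360, 200)
```

The class ID can then be looked up in `category_index` to get a display name.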

 

The object detection folder contains a test images folder, which already holds two photos that will be used to test the model. Run the cells below to get the results; to detect objects in your own images, add them to the same folder.

 

```
PATH_TO_TEST_IMAGES_DIR = pathlib.Path('models/research/object_detection/test_images')
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))
```

 

```
for image_path in TEST_IMAGE_PATHS:
    print(image_path)
    show_inference(detection_model, image_path)
```

 

 

Conclusion

 

We have now reached the end of this blog, in which we reviewed the R-CNN family of detectors and learned how to use the TensorFlow Object Detection API to detect objects in photos.

 

 

 
