How to build a real-time object detection app using React

Jul 12, 2024


As cameras have steadily improved, real-time object detection has become an increasingly popular feature. From self-driving cars and smart surveillance systems to augmented reality apps, the technology is used in a wide variety of situations.

Computer vision, the field that uses cameras and computers to carry out operations like those mentioned above, is a vast and complex area. What you may not realize, though, is that you can get started with real-time object detection right from your web browser.

Prerequisites

Here are the main technologies used in this article:

  • TensorFlow.js: TensorFlow.js is a JavaScript library that brings the power of machine learning to the browser. It lets you load pre-trained object detection models and run them directly in the browser, eliminating the need for complex server-side processing.
  • Coco SSD: The app uses a pre-trained object detection model called Coco SSD, a lightweight model capable of recognizing most everyday objects in real time. Although Coco SSD is a powerful tool, keep in mind that it's trained on a general dataset of objects. If you have specific detection needs, you can train a custom model using TensorFlow.js by following this tutorial. The short sketch after this list shows how the two libraries fit together in the browser.
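
As a quick preview, here's a minimal sketch of loading and running these two libraries in the browser (it assumes the packages installed later in this article and an existing image element on the page):

// Minimal sketch: load the pre-trained Coco SSD model in the browser and run it on an image element
import * as cocoSsd from '@tensorflow-models/coco-ssd';
import '@tensorflow/tfjs';

const detectOnce = async () => {
  const model = await cocoSsd.load();           // downloads the pre-trained model weights
  const image = document.querySelector('img');  // any visible image (or video) element
  const predictions = await model.detect(image); // inference runs entirely in the browser
  console.log(predictions);                      // array of { class, score, bbox } objects
};

detectOnce();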

Start a new React project

  1. Create a new React project by running the following command:
npm create vite@latest object-detection -- --template react

This creates a starter React project that you can build using Vite.

  2. Next, install the TensorFlow and Coco SSD libraries by running the following command inside the project:
npm i @tensorflow-models/coco-ssd @tensorflow/tfjs

You're now ready to start developing your app.

Configuring the app

Before writing the code that implements the object detection logic, let's understand what you'll build in this tutorial. The UI of the app looks like this:

A screenshot of the completed app with the header and a button to enable webcam access.
UI layout of the app.

When a user clicks the Start Webcam button, they're prompted to grant the app permission to access the webcam feed. Once permission is granted, the app starts displaying the live webcam feed and detects objects present in it. It then renders a box around each detected object in the live feed and adds a label to it.

To start, create the user interface for the app by pasting the following code into the App.jsx file:

import ObjectDetection from './ObjectDetection';

function App() {
  return (
    <div className="app">
      <h1>Image Object Detection</h1>
      <ObjectDetection />
    </div>
  );
}

export default App;

This code snippet specifies the page header and imports a custom component called ObjectDetection, which is responsible for capturing the webcam feed and detecting objects in real time.

To create this component, make a new file named ObjectDetection.jsx in your src directory and paste the following code into it:

import { useEffect, useRef, useState } from 'react';

const ObjectDetection = () => {
  const videoRef = useRef(null);
  const [isWebcamStarted, setIsWebcamStarted] = useState(false);

  const startWebcam = async () => {
    // TODO
  };

  const stopWebcam = () => {
    // TODO
  };

  return (
    <div className="object-detection">
      <div className="buttons">
        <button onClick={isWebcamStarted ? stopWebcam : startWebcam}>
          {isWebcamStarted ? "Stop" : "Start"} Webcam
        </button>
      </div>
      <div className="feed">
        {isWebcamStarted ? <video ref={videoRef} autoPlay muted /> : <div />}
      </div>
    </div>
  );
};

export default ObjectDetection;

Here's the code to implement the startWebcam function:

const startWebcam = async () => {
  try {
    setIsWebcamStarted(true);
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });

    if (videoRef.current) {
      videoRef.current.srcObject = stream;
    }
  } catch (error) {
    setIsWebcamStarted(false);
    console.error('Error accessing webcam:', error);
  }
};

This function asks the user to grant webcam access and, once permission is granted, sets the video element to display the live webcam feed to the user.

If the code fails to access the webcam feed (possibly because there is no webcam on the current device or the user denied access), it prints an error message to the console. You could extend the error block to display the reason for the failure to the user.
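
If you want to surface the failure in the UI instead of only the console, one possible approach is an extra state container for the error message; the webcamError name and the messages below are assumptions for illustration, not part of the original code:

// Hypothetical addition: store a human-readable webcam error so the UI can render it
const [webcamError, setWebcamError] = useState(null);

const startWebcam = async () => {
  try {
    setIsWebcamStarted(true);
    setWebcamError(null);
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });

    if (videoRef.current) {
      videoRef.current.srcObject = stream;
    }
  } catch (error) {
    setIsWebcamStarted(false);
    // "NotAllowedError" means the user denied permission; other errors usually mean no usable camera
    setWebcamError(
      error.name === 'NotAllowedError'
        ? 'Webcam access was denied.'
        : 'No webcam could be accessed on this device.'
    );
    console.error('Error accessing webcam:', error);
  }
};

// In the JSX, render the message when present:
// {webcamError && <p>{webcamError}</p>}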

Next, replace the stopWebcam function with the following code:

const stopWebcam = () => {
  const video = videoRef.current;

  if (video) {
    const stream = video.srcObject;
    const tracks = stream.getTracks();

    tracks.forEach((track) => {
      track.stop();
    });

    video.srcObject = null;
    setPredictions([]);
    setIsWebcamStarted(false);
  }
};

This code looks up any running video streams accessible through the video object and stops each of them. Finally, it sets the isWebcamStarted state to false.

At this point, you can try running the app to check whether you can access and view the webcam feed.

Make sure to paste the following code into your index.css file so the app looks the same as the version you saw earlier:

#root {
  font-family: Inter, system-ui, Avenir, Helvetica, Arial, sans-serif;
  line-height: 1.5;
  font-weight: 400;
  color-scheme: light dark;
  color: rgba(255, 255, 255, 0.87);
  background-color: #242424;
  min-width: 100vw;
  min-height: 100vh;
  font-synthesis: none;
  text-rendering: optimizeLegibility;
  -webkit-font-smoothing: antialiased;
  -moz-osx-font-smoothing: grayscale;
}

a {
  font-weight: 500;
  color: #646cff;
  text-decoration: inherit;
}

a:hover {
  color: #535bf2;
}

body {
  margin: 0;
  display: flex;
  place-items: center;
  min-width: 100vw;
  min-height: 100vh;
}

h1 {
  font-size: 3.2em;
  line-height: 1.1;
}

button {
  border-radius: 8px;
  border: 1px solid transparent;
  padding: 0.6em 1.2em;
  font-size: 1em;
  font-weight: 500;
  font-family: inherit;
  background-color: #1a1a1a;
  cursor: pointer;
  transition: border-color 0.25s;
}

button:hover {
  border-color: #646cff;
}

button:focus,
button:focus-visible {
  outline: 4px auto -webkit-focus-ring-color;
}

@media (prefers-color-scheme: light) {
  :root {
    color: #213547;
    background-color: #ffffff;
  }

  a:hover {
    color: #747bff;
  }

  button {
    background-color: #f9f9f9;
  }
}

.app {
  width: 100%;
  display: flex;
  justify-content: center;
  align-items: center;
  flex-direction: column;
}

.object-detection {
  width: 100%;
  display: flex;
  flex-direction: column;
  align-items: center;
  justify-content: center;

  .buttons {
    width: 100%;
    display: flex;
    justify-content: center;
    align-items: center;
    flex-direction: row;

    button {
      margin: 2px;
    }
  }

  div {
    margin: 4px;
  }
}

Also, delete the App.css file to avoid messing up the styles of your components. You're now ready to write the logic for real-time object detection in your app.

Set up real-time object detection

  1. The first step is to import TensorFlow and Coco SSD at the top of ObjectDetection.jsx:
import * as cocoSsd from '@tensorflow-models/coco-ssd';
 
 import '@tensorflow/tfjs';
  2. Create a new state in the ObjectDetection component to store the array of predictions generated by the Coco SSD model:
const [predictions, setPredictions] = useState([]);
  3. Then, create a function that loads the Coco SSD model, collects the video feed, and generates the predictions:
const predictObject = async () => {
  const model = await cocoSsd.load();

  model.detect(videoRef.current)
    .then((predictions) => {
      setPredictions(predictions);
    })
    .catch((err) => {
      console.error(err);
    });
};

This function uses the video feed to generate predictions for the objects present in it. It returns an array of predicted objects, each containing a label, a confidence score, and a set of coordinates indicating the object's location in the video frame.
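
For reference, a single prediction returned by Coco SSD's detect call has roughly this shape (the values shown here are illustrative):

// Example shape of one Coco SSD prediction (illustrative values)
{
  class: 'person',           // label of the detected object
  score: 0.92,               // confidence between 0 and 1
  bbox: [23, 45, 210, 380]   // [x, y, width, height] in pixels of the video frame
}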

You need to call this function repeatedly to process video frames as they arrive, and then use the predictions stored in the predictions state to display boxes and labels for each identified object on the live video feed.

  4. Next, use the setInterval function to call this function continuously. You also need to stop calling it after the user has turned off the webcam feed; for that, use JavaScript's clearInterval function. Add the following state container and useEffect hook to the ObjectDetection component so that the predictObject function runs continuously while the webcam is enabled and is cleared when it's turned off:
const [detectionInterval, setDetectionInterval] = useState();

useEffect(() => {
  if (isWebcamStarted) {
    setDetectionInterval(setInterval(predictObject, 500));
  } else {
    if (detectionInterval) {
      clearInterval(detectionInterval);
      setDetectionInterval(null);
    }
  }
}, [isWebcamStarted]);

This sets up the app to detect the objects in front of the camera every 500 milliseconds. You can change this value depending on how fast you want object detection to run, but keep in mind that running it too frequently can cause your app to use a lot of memory in the browser.
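
One optional refinement, which isn't part of the original code, is to load the Coco SSD model once and reuse it on every interval run instead of calling cocoSsd.load() each time; here's a minimal sketch using a hypothetical modelRef:

// Sketch of an optional optimization: keep the loaded model in a ref and reuse it
const modelRef = useRef(null);

const predictObject = async () => {
  try {
    if (!modelRef.current) {
      modelRef.current = await cocoSsd.load(); // load the model only once
    }

    const predictions = await modelRef.current.detect(videoRef.current);
    setPredictions(predictions);
  } catch (err) {
    console.error(err);
  }
};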

  5. Now that you have the prediction data in the predictions state container, you can use it to show a label and a box on the live video feed. To do that, change the return statement of the ObjectDetection component to the following:
return (
  <div className="object-detection">
    <div className="buttons">
      <button onClick={isWebcamStarted ? stopWebcam : startWebcam}>
        {isWebcamStarted ? "Stop" : "Start"} Webcam
      </button>
    </div>
    <div className="feed">
      {isWebcamStarted ? <video ref={videoRef} autoPlay muted /> : <div />}
      {/* Add the tags below to show a label using the p element and a box using the div element */}
      {predictions.length > 0 && (
        predictions.map(prediction => {
          return <>
            <p style={{
              left: `${prediction.bbox[0]}px`,
              top: `${prediction.bbox[1]}px`,
              width: `${prediction.bbox[2]}px`
            }}>
              {prediction.class + ' - with '
                + Math.round(parseFloat(prediction.score) * 100)
                + '% confidence. '}
            </p>
            <div className="marker" style={{
              left: `${prediction.bbox[0]}px`,
              top: `${prediction.bbox[1]}px`,
              width: `${prediction.bbox[2]}px`,
              height: `${prediction.bbox[3]}px`
            }} />
          </>;
        })
      )}
    </div>
    {/* Add the tags below to show a list of predictions to user */}
    {predictions.length > 0 && (
      <div>
        <h3>Predictions:</h3>
        <ul>
          {predictions.map((prediction, index) => (
            <li key={index}>
              {`${prediction.class} (${(prediction.score * 100).toFixed(2)}%)`}
            </li>
          ))}
        </ul>
      </div>
    )}
  </div>
);

This renders a list of predictions beneath the webcam feed and draws a box around each predicted object using the coordinates from Coco SSD, with a label at the top of each box.
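
If you prefer, the mapping from a Coco SSD bounding box to the inline styles of the label and marker can be pulled into a small helper; bboxToStyle below is a hypothetical name used only for illustration:

// Hypothetical helper: convert a Coco SSD bbox ([x, y, width, height] in pixels)
// into the inline style object used by the absolutely positioned label and marker
const bboxToStyle = ([x, y, width, height]) => ({
  left: `${x}px`,
  top: `${y}px`,
  width: `${width}px`,
  height: `${height}px`,
});

// Usage inside the map: <div className="marker" style={bboxToStyle(prediction.bbox)} />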

  6. To style the boxes and labels correctly, add this code to the index.css file:
.feed {
  position: relative;

  p {
    position: absolute;
    padding: 5px;
    background-color: rgba(255, 111, 0, 0.85);
    color: #FFF;
    border: 1px dashed rgba(255, 255, 255, 0.7);
    z-index: 2;
    font-size: 12px;
    margin: 0;
  }

  .marker {
    background: rgba(0, 255, 0, 0.25);
    border: 1px dashed #fff;
    z-index: 1;
    position: absolute;
  }
}

This completes the development of the app. You can now restart the development server to test it. Here's what it should look like once complete:

A GIF showing the user running the app, allowing camera access to it, and then the app showing boxes and labels around detected objects in the feed.
A demonstration of real-time object detection through the webcam.

You can find the complete code in this GitHub repository.

Deploy the completed app

Once your Git repository is up and running, follow these steps to deploy your app:

  1. Log in or create an account to access your dashboard.
  2. Authorize with your Git provider.
  3. Select Static Sites in the left sidebar, then click Add Site.
  4. Select the repository and the branch you'd like to deploy from.
  5. Assign a unique name to your site.
  6. Add the build settings in the following format:
  • Build command: yarn build or npm run build
  • Node version: 20.2.0
  • Publish directory: dist
  7. Finally, click Create site.

Once the app is deployed, you can click Visit Site from the dashboard to open it. Try using the app on different devices with cameras to see how it performs.

Summary

You've successfully built a real-time object detection app with React and TensorFlow.js. This lets you explore the exciting possibilities of computer vision and build interactive experiences right in the user's browser.

Keep in mind that the Coco SSD model we used is just a starting point. As you continue exploring, you can look into custom object detection with TensorFlow.js, allowing you to tailor the app to identify the objects most relevant to your business needs.

The possibilities are endless! This app can serve as the foundation for more complex applications like augmented reality experiences and smart surveillance tools. By deploying your app, you can share your work with the world and watch the power of computer vision come to life.

   What's a problem you've come across that you think real-time object detection could solve? Tell us about it in the comments section below!

Kumar Harsh

Kumar is a software developer and technical author based in India. He specializes in JavaScript and DevOps. You can learn more about his work on his website.