Article | March 15, 2021
A few weeks ago, I created a TensorFlow model that takes an image of a street, breaks it down into regions and, using a convolutional neural network, highlights the areas of the image that contain vehicles such as cars and trucks.
I called it an image classification AI, but what I had really created was an object detection and localisation program, following a typical convolutional method that researchers have used for decades.
By cutting up my input image into regions and passing each one into the network to get class predictions, I had created an algorithm that roughly locates classified objects. Other people have created programs that perform this operation frame by frame on real time video, allowing computer programs to draw boxes around recognised objects, understand what they are and track their motion through space.
In this article, I’ll give an introduction to object detection in real-time video. I’ll explain why this kind of artificial intelligence matters, how it works, and how you can implement your own system with Yolo V3. From there, you can build a huge variety of real-time object tracking programs, and I’ve included links to further material too.
Importance of real-time object detection
Object detection, localisation and classification have become massive fields in machine learning over the past decade, as GPUs and faster computers have made it practical to run computationally expensive deep learning networks. Real-time object detection in video is one such application, and it has been put to a wide variety of uses in recent years.
In surveillance, convolutional models have been trained on human facial data to recognise and identify faces. An AI can then analyse each frame of a video and locate recognised faces, classifying them with remarkable precision.
Real-time object detection has also been used to measure traffic levels on heavily frequented streets. AIs can identify cars and count the number of vehicles in a scene, and then track that number over time, providing crucial information about congested roads.
In wildlife monitoring, with enough training data a model can learn to spot and classify types of animals; one nice example tracked raccoons through a webcam. All you need is enough training images to build your own custom model, and such artificial intelligence programs are actively being used all around the world.
Background to Yolo V3
Until about ten years ago, the technology required to perform real-time object tracking was not available to the general public. Fortunately for us, in 2021 there are many machine learning libraries available and practically anyone can get started with these amazing programs.
Arguably the best object detection algorithm for amateurs - and often even professionals - is You Only Look Once, or YOLO. This family of algorithms was first published by Joseph Redmon and colleagues in 2015 and became incredibly popular thanks to its impressive accuracy and speed, which lends it easily to live computer vision.
The method for object detection and recognition I mentioned at the start of this article happens to be a fairly established technique. Traditional object recognition would split each frame of a video into “regions”, flatten them into strings of pixel values, and pass them through a deep neural network one by one. The network would then output a value between 0 and 1 indicating the chance that the specific region contains a recognized object - or rather, a part of a recognized object - within its bounds.
Finally, the algorithm would keep all the regions above a particular “certainty” threshold and compile adjacent regions into bounding boxes around recognized objects. Fairly straightforward, but when it comes down to the details, this approach has a big drawback: the network has to run once per region, for every frame.
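To make the idea concrete, here is a minimal sketch of that region-by-region approach, with a stand-in scoring function in place of a trained network (the brightness-based classifier and the threshold are invented purely for illustration):

```python
import numpy as np

def classify_region(region):
    # Stand-in for a trained CNN: score by mean brightness.
    # A real network would return the probability of an object part.
    return float(region.mean()) / 255.0

def detect_by_regions(frame, region_size=32, threshold=0.5):
    """Split the frame into regions, score each, keep those above threshold."""
    hits = []
    h, w = frame.shape[:2]
    for y in range(0, h - region_size + 1, region_size):
        for x in range(0, w - region_size + 1, region_size):
            region = frame[y:y + region_size, x:x + region_size]
            if classify_region(region) >= threshold:
                # Record the region as (x1, y1, x2, y2)
                hits.append((x, y, x + region_size, y + region_size))
    return hits

# Toy 64x64 "frame": a bright square in the top-left corner.
frame = np.zeros((64, 64), dtype=np.uint8)
frame[:32, :32] = 255
print(detect_by_regions(frame))  # → [(0, 0, 32, 32)]
```

Adjacent hits would then be merged into a single bounding box; the key inefficiency is that `classify_region` runs once per region, every frame.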
Yolo V3 uses a different method to identify objects in real-time video, and it’s this algorithm that gives it its desirable balance between speed and accuracy - allowing it to detect objects fairly accurately and draw bounding boxes around them at about thirty frames per second on a decent GPU.
Darknet-53 is the backbone behind Yolo V3: a Fully Convolutional Network, or FCN, with 53 convolutional layers, which the full detection model extends to 106 layers. While traditional methods pass one region at a time through the algorithm, Yolo V3 takes the entire frame in a single forward pass. The network divides the image into a grid of cells, predicting bounding boxes, confidence scores and class probabilities for each one, before keeping the most confident predictions as “bounding boxes” around recognized objects.
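To give a feel for the single-pass idea, here is an illustrative sketch of decoding a YOLO-style grid of predictions. The shapes and the one-box-per-cell layout are simplifications I've made up for clarity - real YOLOv3 predicts at three scales, with three anchor boxes per cell plus per-class scores:

```python
import numpy as np

# Illustrative output tensor: a 13x13 grid where each cell predicts one
# box as (x, y, w, h, confidence), all values in [0, 1).
S = 13
np.random.seed(0)
predictions = np.random.rand(S, S, 5)

def decode(predictions, conf_threshold=0.9):
    """Keep the boxes from cells whose confidence exceeds the threshold."""
    boxes = []
    for row in range(S):
        for col in range(S):
            x, y, w, h, conf = predictions[row, col]
            # Box centres are predicted relative to the cell, so add the
            # cell offset and normalise to image coordinates.
            boxes.append(((col + x) / S, (row + y) / S, w, h, conf))
    return [b for b in boxes if b[4] >= conf_threshold]

boxes = decode(predictions)
print(f"{len(boxes)} boxes above threshold out of {S * S} cells")
```

The important point is that one forward pass yields predictions for every cell at once; no region-by-region re-runs are needed.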
Luckily for us there’s a really easy way we can implement YoloV3 in real time video simply with our webcams; effectively this program can be run on pretty much any computer with a webcam. You should note however that the library does prefer a fast computer to run at a good framerate. If you have a GPU it’s definitely worth using it!
The way we’ll use YoloV3 is through a library called ImageAI. This library provides a ton of machine learning resources for image and video recognition, including YoloV3, meaning all we have to do is download the pre-trained weights for the standard YoloV3 model and set it to work with ImageAI. The pre-trained yolo.h5 weights file is linked from the ImageAI documentation; place it in your working directory.
We’ll start with our imports as follows:
import cv2
import numpy as np
from imageai import Detection
Of course, if you don’t have ImageAI, you can get it using “pip install imageai” on your command line or Python console, like normal; cv2 comes from the “opencv-python” package. CV2 will be used to access your webcam and grab frames from it, so make sure your device allows webcam access.
Next, we need to load the deep learning model. This is a pre-trained, pre-weighted Keras model that can classify objects into 80 different categories (the COCO dataset classes) and draw accurate bounding boxes around them. As mentioned before, it uses the Darknet architecture. Let’s load it in:
modelpath = "path/yolo.h5"
yolo = Detection.ObjectDetection()
yolo.setModelTypeAsYOLOv3()
yolo.setModelPath(modelpath)
yolo.loadModel()
All we’re doing here is creating a model and loading in the Keras h5 file to get it started with the pre-built network - fairly self-explanatory.
Then, we’ll use CV2 to access the webcam as a camera object and define its parameters so we can get those frames that are needed for object detection:
cam = cv2.VideoCapture(0)
You’ll need to change the 0 in cv2.VideoCapture(0) to 1 if your machine has more than one camera, or if your webcam isn’t showing up at index 0. Great, so we have imported everything, loaded in our model and set up a camera object with CV2. We now need to create a run loop:
while True:
    ret, img = cam.read()
This will allow us to get the next immediate frame from the webcam as an image. Our program doesn’t run at a set framerate; it’ll go as fast as your processor/camera will allow.
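If you’re curious how fast your setup actually runs, a simple frame counter - not part of ImageAI, just a few lines of plain Python you could drop into the loop - can measure the effective framerate:

```python
import time

class FPSCounter:
    """Track average frames per second since the counter was created."""
    def __init__(self):
        self.start = time.time()
        self.frames = 0

    def tick(self):
        # Call once per frame, e.g. right after cam.read().
        self.frames += 1

    def fps(self):
        elapsed = time.time() - self.start
        return self.frames / elapsed if elapsed > 0 else 0.0

# Standalone demonstration: ten simulated frames.
counter = FPSCounter()
for _ in range(10):
    counter.tick()
print(f"counted {counter.frames} frames")
```

In the real loop you would call counter.tick() after each cam.read() and occasionally print counter.fps().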
Next, we need to get an output image with bounding boxes drawn around the detected and classified objects, and it’ll also be handy to get some print-out lines of what the model is seeing:
    img, preds = yolo.detectCustomObjectsFromImage(input_image=img,
                                                   custom_objects=None,
                                                   input_type="array",
                                                   output_type="array",
                                                   minimum_percentage_probability=70,
                                                   display_percentage_probability=False,
                                                   display_object_name=True)
As you can see, we’re just using the model to predict the objects and output an annotated image. You can play around with the minimum_percentage_probability to see what margin of confidence you want the model to classify objects with, and if you want to see the confidence percentages on the screen, set display_percentage_probability to True.
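Each entry in preds is a dictionary holding the object’s name, confidence and box corners, so a simple loop gives you those print-out lines. The values below are made up by hand to show the shape of the data, in place of real model output:

```python
# Hand-made sample of what ImageAI returns per detection: a dict with
# the object name, its confidence, and the bounding-box corner points.
preds = [
    {"name": "car", "percentage_probability": 92.4,
     "box_points": [34, 60, 310, 240]},
    {"name": "person", "percentage_probability": 81.7,
     "box_points": [400, 50, 520, 300]},
]

for p in preds:
    # e.g. "car: 92.4% at [34, 60, 310, 240]"
    print(f'{p["name"]}: {p["percentage_probability"]:.1f}% at {p["box_points"]}')
```

Dropping a loop like this inside the run loop prints what the model sees on every frame.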
To wrap the loop up, we’ll just show the annotated images, and close the program if the user wants to exit:
    cv2.imshow("Real-time object detection", img)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q") or key == 27:  # "q" or the Esc key quits
        break
Last thing we need to do, outside the loop, is to release the camera object and close the display window:
cam.release()
cv2.destroyAllWindows()
And that’s it! It’s really that simple to use real time object detection in video. If you run the program, you’ll see a window open that displays annotated frames from your webcam, with bounding boxes displayed around classified objects.
Obviously we’re using a pre-built model, but many applications make use of YoloV3’s standard classification network, and there are plenty of options with ImageAI to train the model on custom datasets so it can recognize objects outside of the standard categories. Thus, you’re not sacrificing much by using ImageAI.
Good luck with your projects if you choose to use this code!
Yolo V3 is a great algorithm for object detection that can detect a multitude of objects with impressive speed and accuracy, making it ideal for video feeds, as the example above shows.
Yolo V3 is powerful on its own, but its true power comes when it is combined with other algorithms that help it process information faster or even increase the number of detected objects. Similar algorithms are used in industry today and have been refined over the years.
Self-driving cars, for example, use techniques similar to those described in this article, together with lane recognition and bird’s-eye-view mapping, to model the surroundings of the car and pass that information to the driving system, which then decides the best course of action.
Article | April 12, 2021
Digital Transformation is not a magic wand; it is a complex yet essential enterprise commitment to change. Companies that have succeeded have reaped significant benefits. The Deloitte Digital Transformation Survey 2020 found that greater digital maturity is associated with better financial performance. The higher-maturity companies in this year’s sample were about 3X more likely than lower-maturity companies to report annual net revenue growth and net profit margins — a pattern that was consistent across industries.
Unfortunately, most enterprises do not fully appreciate what it entails. Some see it as a technology or a budget problem; others believe it is an optional strategy - both views are wrong. To truly succeed, transformation needs to be led from the top by setting the strategy and allocating resources. Antonio Neri of HPE puts it well when he says, “Digital transformation is no longer an option for enterprises, but a strategic imperative.”
For me, one of the most significant examples of top-driven organisational change is Jeff Bezos’ call to “Rearchitecting the Firm” in 2002. It is a seminal work. The principles of this mandate went on to form the backbone of Amazon in the modern cloud world. It was clear, direct, and backed up by management action.
More than 75% of CEOs agreed that the pandemic sped up their companies’ transformation plans
COVID is a catalyst for change
The flurry of digital technology solutions spurred by COVID-19 presents a unique opportunity for enterprises to rethink how technology decisions are made and apply them in new and meaningful ways. Covid-19 dramatically accelerated technology adoption across all industries. According to a Fortune-Deloitte CEO survey and the KPMG 2020 CEO Outlook Survey, more than 75% of CEOs agreed that the pandemic sped up their companies’ transformation plans. As Microsoft CEO Satya Nadella noted, “We’ve seen two years’ worth of digital transformation in two months.”
BCG research finds that 80% of companies plan to accelerate their digital transformation plans, primarily spurred by the implications of the global pandemic. The same study also concludes, troublingly, that only 30% of digital transformations have achieved their objectives.
80% of companies plan to accelerate their companies’ digital transformation plans, however only 30% of digital transformations have achieved their objectives - BCG Research
Most people forget that digital transformation is less about technology and much more about the organization’s culture and business shift. Key stakeholders need to fundamentally rethink customer experience, business models, and operations. It is all about finding new ways to deliver value, generate revenue, improve efficiency and, most importantly, drive sustainable innovation. Bear in mind, just moving to the cloud is not Digital Transformation.
Crises Breed Innovation
I am of the firm belief that uncertainty drives creativity. Crises are the breeding ground for innovation. You must make decisions quickly, and you never have enough time or information to weigh difficult choices thoroughly.
McKinsey’s analysis shows that bold innovators emerge from crises substantially ahead of peers — and maintain this advantage for years to come. Innovators not only outperformed the market during the financial crisis but continued to widen the gap during and after the recovery. Analysis of the performance of approximately 2,000 companies between 2007 and 2017 against the S&P 500 reinforces those conclusions: staying focused on growth and innovation through a downturn helped the top-performing companies to generate higher returns to shareholders.
Staying focused on growth and innovation through a downturn helped the top-performing companies to generate higher returns to shareholders
Antonio Neri and other leaders confirm that as the pace of technology disruption continues to accelerate, digital-native and digitally transformed companies are outpacing their competitors.
The McKinsey study shows that roughly one in ten companies in their sample achieved higher revenue growth, innovation, digital adoption, and profitability than the others over the entire 2007–17 economic cycle and during the downturn years. The outperformers also delivered excess returns of roughly 8%, while the rest hovered around zero throughout the period.
So, what does it take to succeed?
Do existing leadership teams have the skills to undertake true digital transformation? I thought it would be a good idea to look at how companies are hiring critical resources. A study by professors from Harvard and Darden and executives from Spencer Stuart published in the Harvard Business Review addressed this specific question. The team looked at more than 100 search criteria for C-suite positions in Fortune 1000 companies across a broad range of industries, and the results were very suggestive.
There has been a rise in tech and digital expertise search even before the pandemic: 59% of executive searches included technological or digital knowledge. Company boards were asking for these skills across a wide variety of roles. This fact also suggests that people with the right skill sets are already in leadership positions. Not surprisingly, 100% of the specs for CIOs, CMOs, and CTOs sought technical or digital skills. However, the functions that got neglected in the search for technological and digital expertise were more revealing. Less than a third of the job specs for CHROs and chief accounting officers mentioned these required skills. Even more worrying — only 40–60% of searches for roles such as CEO, board director, and CFOs required digital know-how.
At the very minimum, we need all leaders to understand how to build digital businesses. This shift alone could be the difference between success and failure.
But is that enough for now?
Almost every organization has stepped up its digital transformation efforts in 2020–21. Success is as much about the right technology platform choice as it is about leadership, agility, talent, and a clear vision. A new and emerging factor is consumers wanting the brands they use to focus on sustainability issues - as do employees and prospective employees. This shift largely springs from the realization that humanity’s ecological footprint is a probable cause of the crises we face today.
While we keep talking about the usual causes of climate change - utilities, transportation, agriculture - some lesser-discussed and more surprising facts make the issue more relatable.
Did you know that, in processing 3.5 billion searches a day, Google accounts by some estimates for about 40% of the internet’s carbon footprint? The company has been carbon neutral since 2007, but its infrastructure still emits a considerable volume of CO2.
Did you know that Bitcoin currently consumes enough electricity (about 121 terawatt-hours a year) to run Cambridge University for almost 700 years?
To address sustainability in a meaningful manner, we need to take a holistic view of the players and their impact, and then push for a mutually beneficial solution. Otherwise, it is bound to fail.
As a first step, 26 CEOs of Europe-based companies have signed a Declaration to support the Green and Digital Transformation of the EU. They formed the European Green Digital Coalition, committing on behalf of their companies not only to make the tech sector more sustainable, circular and zero-polluting, but also to support the sustainability goals of other priority sectors such as energy, transport, agriculture, and construction, while contributing to an innovative, inclusive and resilient society.
Like these CEOs, Accenture also believes that there is great value at the intersection of digital technologies and sustainability — they call it Twin Transformers. Companies leveraging both are 2.5X more likely to be among tomorrow’s strongest-performing businesses than others.
BigTech is conscious of its responsibilities to the climate; almost all major players have pledged to cut their CO2 emissions. Since they are all profit-driven, I am sure they have also figured out that this means good business by the numbers too (a counter-intuitive rationalisation, but better than getting caught in the justification game).
In the future, a company’s commitment to ESG-related programs will drive the ability to attract investors and retain talent. Companies also realize that ESG factors, when integrated into strategic digital transformation decisions, may offer potential long-term performance advantages. One of the critical levers for moving to sustainable systems will be technology, a lot of technology, and a lot of investment. But how do we make it accessible to all and profitable to the providers at the same time?
HPE is one company that has made significant strides in this regard by embracing the twin doctrine of digital transformation and sustainability. Their customers can reduce their energy costs by more than 30% by eliminating overprovisioning through HPE GreenLake. In fact, their consumption-based offerings have reduced customer carbon footprint by 50% in one case. Minimizing e-waste is another area of focus for them too.
So what have we learned from all this? As an ancient Chinese proverb states, “When the winds of change blow, some people build walls, others build windmills.”
What will you build?
Article | March 15, 2020
The world is facing a global public health crisis due to the COVID-19 pandemic caused by the novel coronavirus named SARS-CoV-2. This is a new pathogen, and scientists around the world are researching the SARS-CoV-2 coronavirus in order to develop therapeutics to treat, cure or prevent the COVID-19 disease. In early March, artificial intelligence (AI) developer DeepMind announced the use of its AlphaFold deep learning system to predict protein structures associated with the COVID-19 disease in efforts to help scientists understand how to fight the SARS-CoV-2 coronavirus.
Article | July 6, 2020
A significant number of workers across the globe have been forced to work from home due to the COVID-19 pandemic. Enterprises saw a temporary dip in workforce productivity; however, with time, employee productivity has surged. A survey of 42 Indian CXOs by Deloitte says that 60% of the companies have reported an increase in individual employee productivity. Many organizations that were earlier not in favor of remote working have been forced to try it, and have realized that with certain policy changes, a remote working model can be beneficial for their organizations.