Automated Object Detection from Imagery using Deep Learning

Arpit Shah
Nov 21, 2020

Updated: Mar 8

Workflows demonstrated in this post (Section Hyperlinks)

Extracting Building Footprint from Aerial/Satellite Optical Imagery
Extracting Swimming Pools from Aerial/Satellite Optical Imagery (with Video Demonstration)

In my previous post, I highlighted the use of Random Forest-based Machine Learning algorithms on Geospatial Datasets to map Deforestation, classify Agricultural Land Use and predict Voter Turnout. In each case, the methodology began by feeding supervised training data for the algorithm to learn from, after which it was able to sift through the entire dataset using a decision tree-based approach and make predictions quickly and with a high level of accuracy.

Deep Learning, another subset of Machine Learning, uses Neural Networks - a method loosely similar to how a human brain assesses a situation by factoring in inputs and weighing choices.

A characteristic which makes this class of algorithms so powerful is that their decision-making improves automatically as they are exposed to more data, i.e. without the need for explicit programming.
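To make the idea of learning from data rather than from explicit rules a little more concrete, here is a minimal sketch of a single artificial neuron (a perceptron) that learns the logical OR function from four labelled examples. It is a toy illustration only: the function names, learning rate and number of epochs are my own assumptions, and real Deep Learning models stack many such units and train them with far more sophisticated methods.

```python
# Toy perceptron: one artificial neuron that learns logical OR from examples.
# All names and settings are illustrative assumptions, not a production recipe.

def predict(weights, bias, inputs):
    """Weighted sum of the inputs followed by a simple step activation."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(samples, epochs=10, learning_rate=0.1):
    """Nudge the weights towards the right answer whenever a prediction is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Labelled training data: (inputs, expected output) for logical OR.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train(OR_SAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in OR_SAMPLES]
print(predictions)
```

Nowhere in the code is the OR rule written down; the neuron arrives at it purely by correcting its own mistakes on the examples it is shown.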

Consider the recommendations made by YouTube and Spotify - two platforms with humongous repositories of content. The incredible aspect is not that their curation of videos and songs is highly compatible with the preferences one implicitly conveys through historical usage, but that they are increasingly able to suggest content of a type you've never consumed before but are likely to appreciate (and at that precise time of the day!) - a tell-tale sign of powerful Deep Learning algorithms at work. Refer to the related literature from the hyperlinks above.

Learn about some of the other workflows which make good use of Deep Learning here.

Video 1: How Artificial Neural Networks, the fundamental operating mechanism of Deep Learning Algorithms, function on quantitative data and images. Source: Esri's Spatial Data Science MOOC

Geospatial Data Processing Software now integrates ready-to-use Deep Learning Models which assist in performing Imagery Analytics at scale, broadly through two methods:

Object Detection - identifying all the objects in a geographic frame by drawing a bounding box around each one and labelling it
Semantic Segmentation - identifying and labelling what each pixel in a geographic frame represents (in a subsequent post, I have demonstrated the use of a Deep Learning Model to identify pixels that are Power Lines)
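The difference between the two kinds of output can be sketched in a few lines of Python. The small grid below stands in for a per-pixel labelling (what Semantic Segmentation returns), and the function groups touching pixels to derive one bounding box per object (the kind of output Object Detection returns). This is an illustration under my own assumptions: the grid is made up, and an actual Object Detection model predicts its boxes directly from the image rather than deriving them from a mask.

```python
# A made-up 6x6 "image": 1 marks a pixel labelled as building, 0 is background.
# A per-pixel grid like this is the kind of output Semantic Segmentation gives.
MASK = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]


def bounding_boxes(mask):
    """Return one (min_row, min_col, max_row, max_col) box per group of touching pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or (r, c) in seen:
                continue
            # Flood fill over 4-connected neighbours to collect one object.
            stack = [(r, c)]
            seen.add((r, c))
            min_r = max_r = r
            min_c = max_c = c
            while stack:
                cr, cc = stack.pop()
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] == 1 and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            boxes.append((min_r, min_c, max_r, max_c))
    return boxes


boxes = bounding_boxes(MASK)
print(boxes)  # one box per object, in pixel (row, column) coordinates
```

The mask tells you what every pixel is, while the boxes tell you how many objects there are and roughly where each one sits.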

At a granular level, there are variants of these Computer Vision methods, such as Instance Segmentation, Panoptic Segmentation and Image Classification.

In this post, I will demonstrate the use of Deep Learning for Object Detection through two workflows - the extraction of Buildings and the extraction of Swimming Pools. Let's begin...

Workflow 1: Extracting Building Footprint from Aerial/Satellite Optical Imagery

A Building Footprint layer represents the outlines of all the buildings within a geographic frame. From an outline, one can estimate dimensional characteristics of the object, such as the area occupied by the building and, when combined with elevation data, its height. A Building Footprint layer serves as a valuable input for several geospatial workflows such as Urban Planning, Property Insurance, Public Transit Expansion and Surveillance Planning. For example, in a previous post of mine, I demonstrated the estimation of Rooftop Solar Power Generation potential in a locality by running geoprocessing tools on a Building Footprint layer to derive how much power solar panels installed on the rooftops could generate.
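As a small illustration of how dimensional attributes follow from an outline, the sketch below computes the area (using the shoelace formula) and the perimeter of a footprint polygon. The L-shaped footprint is hypothetical, and I am assuming its vertices are already in a projected coordinate system measured in metres; GIS software handles projections and units for you.

```python
import math


def footprint_area(vertices):
    """Area of a simple polygon via the shoelace formula (vertices listed in order)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def footprint_perimeter(vertices):
    """Sum of the edge lengths, including the closing edge."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))


# Hypothetical L-shaped building outline, coordinates in metres.
L_SHAPED = [(0, 0), (10, 0), (10, 4), (4, 4), (4, 8), (0, 8)]

area = footprint_area(L_SHAPED)            # square metres
perimeter = footprint_perimeter(L_SHAPED)  # metres
print(area, perimeter)
```

Multiplying such an area by an assumed usable-roof fraction is, in spirit, the first step of a rooftop solar estimate.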

Figure 1: Geospatial Layer - Building Footprint of a neighborhood - utilized for the Rooftop Solar Power Potential study

As you can imagine, digitizing Building outlines manually, or even in a semi-automated way, is painstaking and prone to human error. Deep Learning assists in automating the entire process with a high level of accuracy and speed, aided by the availability of high-resolution Aerial and Satellite Imagery, advances in Computing Hardware, and Geospatial Software that can integrate and deploy these algorithms. I must emphasize that the level of accuracy depends on aspects such as the quality of the training data and the parameters used to run the algorithm.

Esri - the world's leading GIS software developer - has developed a ready-to-use Deep Learning model for extracting Building Footprints. The algorithm was trained using labelled buildings from a large quantity of high-resolution (10-40 cm) Aerial and Satellite Imagery datasets sourced from multiple regions across the USA. While the model naturally works best at detecting Buildings within the USA, it also fares reasonably well in other developed countries where building exteriors look similar to those commonly found in the USA.

Figure 2 below is hyperlinked to a map-based presentation where you'll be able to observe the Deep Learning algorithm's rendered output across multiple locations around the world.

Figure 2: Image from the hyperlinked ArcGIS StoryMap depicting Building Footprints at a location in Sweden, extracted using Esri's ready-to-use Deep Learning Model

I decided to test the Deep Learning model myself on high-resolution (30 cm) Optical Satellite Imagery of a location near the Barajas Airport in Madrid, Spain, acquired in 2009 by European Space Imaging. Upon running the Model with conservative parameters, this is the output that was generated:

Slider 1: Building Footprint Extraction output near the Barajas Airport in Madrid, Spain using Esri's ready-to-use Deep Learning Model

Slider 2: Another Building Footprint Extraction output near the Barajas Airport in Madrid, Spain using Esri's ready-to-use Deep Learning Model

Figure 3: A total of 114 buildings were detected and demarcated from the Imagery near Barajas Airport in Madrid, Spain by Esri's ready-to-use Deep Learning model. Dimensional attributes of these features were added using ArcGIS Pro

While the output appears reasonably accurate, it could have been much better had I iterated with more stringent parameters. Moreover, the GIS software also has automated, in-built geoprocessing tools such as Regularize Building Footprint which would have allowed me to fine-tune the Building outlines (refer to the related video demonstration from another post of mine). That being said, as the developers feed more, and more diverse, training data to the algorithm and incorporate feedback on previous extractions (false positives and false negatives), the Deep Learning model will continue to evolve and become better at extracting Building Footprints.
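For readers wondering how false positives and false negatives are usually counted, here is a rough sketch. It matches each detected box to a reference (manually verified) box using Intersection over Union (IoU) and then reports precision and recall. The boxes and the 0.5 IoU cut-off are assumptions of mine for illustration; this is a common convention, not a description of how Esri evaluates its model.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    ix_min, iy_min = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix_max, iy_max = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def evaluate(detections, reference, min_iou=0.5):
    """Greedily match detections to reference boxes and count hits and misses."""
    unmatched = list(reference)
    true_pos = 0
    for det in detections:
        best = max(unmatched, key=lambda ref: iou(det, ref), default=None)
        if best is not None and iou(det, best) >= min_iou:
            true_pos += 1
            unmatched.remove(best)  # each reference box can be matched only once
    false_pos = len(detections) - true_pos  # detected, but no real building there
    false_neg = len(unmatched)              # real building that was missed
    precision = true_pos / len(detections) if detections else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return true_pos, false_pos, false_neg, precision, recall


# Hypothetical boxes in pixel coordinates.
REFERENCE = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
DETECTIONS = [(1, 1, 10, 10), (21, 19, 31, 29), (60, 60, 70, 70)]

true_pos, false_pos, false_neg, precision, recall = evaluate(DETECTIONS, REFERENCE)
print(true_pos, false_pos, false_neg, round(precision, 2), round(recall, 2))
```

More stringent parameters typically trade one kind of error for the other: fewer false positives at the cost of more missed buildings, or vice versa.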

Workflow 2: Extracting Swimming Pools from Aerial/Satellite Optical Imagery

In this workflow, I will demonstrate the process to extract Swimming Pools from Aerial Imagery captured over the city of Redlands in California, USA (Credit: Esri Learn ArcGIS). The Swimming Pool Footprint, if you can call it so, will be useful for Tax Assessors, who can raise the property tax for those who have built new swimming pools over the last year, since the presence of a swimming pool increases the valuation of a property.

You can gauge the utility of this automated method of Object Detection - tax assessors currently rely on manual surveys, conducted infrequently, to obtain this information and update their records.

Methodology-wise, this workflow is very similar to that of detecting Building Footprints - one begins by feeding in supervised observations, i.e. sample images of Swimming Pools located in this area, to train the Deep Learning Model. Thereafter, the processing parameters are set, which the algorithm ingests and validates. Upon successful validation, the algorithm proceeds to extract the Swimming Pools from the entire geographic extent over Redlands, California. Refer to the video below which demonstrates the entire process:
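To give a feel for what running a detector over an entire extent involves, here is a simplified sketch. Large imagery is commonly processed as a grid of overlapping tiles, each tile's detections are shifted back into whole-image coordinates, and low-confidence detections are discarded. Everything named here (tile size, overlap, threshold, and the sample detections) is a hypothetical stand-in of mine; the actual tool's parameters and internals are not covered in this post.

```python
def tile_offsets(length, tile_size, stride):
    """Start positions of tiles along one axis so that the whole length is covered."""
    offsets, pos = [], 0
    while True:
        offsets.append(pos)
        if pos + tile_size >= length:
            break
        pos += stride
    return offsets


def tile_origins(width, height, tile_size=256, overlap=32):
    """Top-left (col, row) pixel of every tile in an overlapping grid."""
    stride = tile_size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile_size")
    return [
        (col, row)
        for row in tile_offsets(height, tile_size, stride)
        for col in tile_offsets(width, tile_size, stride)
    ]


def to_image_pixels(origin, detection):
    """Shift a tile-relative (xmin, ymin, xmax, ymax, score) box into image coordinates."""
    col, row = origin
    xmin, ymin, xmax, ymax, score = detection
    return (xmin + col, ymin + row, xmax + col, ymax + row, score)


def keep_confident(detections, threshold=0.5):
    """Drop detections whose confidence score falls below the threshold."""
    return [d for d in detections if d[4] >= threshold]


# Hypothetical 600 x 400 pixel image and made-up detections for one of its tiles.
origins = tile_origins(width=600, height=400)
per_tile = {(224, 0): [(10, 20, 40, 50, 0.92), (100, 100, 130, 125, 0.35)]}

pools = keep_confident(
    [to_image_pixels(origin, det) for origin, dets in per_tile.items() for det in dets]
)
print(len(origins), pools)
```

Raising the threshold gives a more conservative output with fewer false alarms, while the overlap helps ensure that a pool straddling a tile boundary is seen whole in at least one tile.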

Video 2: Extracting Swimming Pools from Aerial imagery over the city of Redlands in California, USA using Esri's ready-to-use Deep Learning Model

Isn't the utility of Deep Learning fascinating? I wish I could show you how the model improves its accuracy over time - perhaps I'll rerun the model over the same imagery at a later date.

The potential of such algorithms is limitless, and I wouldn't blame you if you were to find it scary - even at this preliminary juncture, some of these algorithms have already surpassed the best that human ingenuity has to offer. Watch the brilliant documentary AlphaGo, which demonstrates the power of Google DeepMind's AI in a Computer v/s Human contest against the world champion Lee Sedol at the popular strategy board game Go.

Credits: Esri, European Space Imaging