Arpit Shah

Extracting Infra Objects from Satellite Imagery using Deep Learning

Updated: Jun 13, 2023

Case 1: Extracting Buildings Footprint

A Building Footprint, as the name suggests, is the imprint of a constructed object on a surface. As with any object's imprint, one can determine the object's outline, features and the area it occupies from it. A building footprint data layer (comprising multiple building footprints across a geographic extent) acts as a foundation for several workflows - urban planning, insurance, transportation, security etc. In my previous blog post and video, we witnessed the workflow to calculate rooftop solar potential in a locality using building footprint data. Essentially, if one knows what a building looks like from an aerial view, one can use Geoprocessing tools to estimate how much sunlight the roof will receive over a period of time.

Figure 1: Building Footprint of a Locality
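To make the 'area occupied' point concrete, here is a minimal sketch of how the area of a single footprint could be computed from its outline using the shoelace formula. The vertex coordinates are made-up values in metres (assuming a projected coordinate system), and `footprint_area` is a hypothetical helper written for illustration, not a function from any GIS package.

```python
def footprint_area(vertices):
    """Return the planar area of a simple polygon via the shoelace formula.

    `vertices` is a list of (x, y) tuples in a projected coordinate system
    (e.g. metres), listed in order around the outline.
    """
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical L-shaped building outline, coordinates in metres
l_shaped = [(0, 0), (10, 0), (10, 4), (4, 4), (4, 8), (0, 8)]
area_m2 = footprint_area(l_shaped)
print(f"Footprint area: {area_m2:.1f} sq. m")
```

A GIS package would typically store this value as an attribute of each footprint, which is what makes downstream calculations such as rooftop solar potential possible.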

Digitizing building footprints manually is a tedious and error-prone process. Imagine plotting the outline of a hundred buildings in your neighborhood manually on a computer! Thanks to advances in technology (higher resolution imagery, faster computing / processing, object and pattern detection algorithms), performing this task is now more convenient, much faster and can be stunningly accurate for those who have the wherewithal.

Modern Mapping technology now integrates ready-to-use deep learning models to aid multiple workflows.


What is deep learning? It is a subset of Machine Learning - algorithms which improve automatically through experience. Recollect how, in my previous blog post, the Random Forest (ML) algorithm was used to identify deforestation, map crop types and predict voter turnout by feeding 'training data' to the model, on the basis of which it classified and predicted the output with a high level of accuracy.

A Deep Learning (ML) algorithm arrives at a decision in a way loosely similar to how our brain does when faced with choices - through layers of interconnected 'neurons' (Neural networks). Amazon, Google and Netflix use Deep Learning to show you recommended content based on your historical searches and set preferences (which act as 'training data' for the algorithm). How often is the next song in your playlist, as recommended by Spotify or YouTube, so close to what you'd love to listen to at that very moment?!

Video Source: Esri's Spatial Data Science MOOC
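For readers curious about what a 'neural network' boils down to, below is a minimal sketch of a tiny network's forward pass in plain Python. All weights and inputs are made-up numbers chosen for illustration; a real Deep Learning model has millions of such weights, and they are learnt from training data rather than typed in by hand.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through a sigmoid."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def tiny_network(inputs):
    """Two hidden neurons feeding one output neuron; weights are made up."""
    h1 = neuron(inputs, weights=[0.8, -0.4], bias=0.1)
    h2 = neuron(inputs, weights=[-0.3, 0.9], bias=-0.2)
    return neuron([h1, h2], weights=[1.2, -0.7], bias=0.05)


score = tiny_network([1.0, 0.5])
print(f"Output score: {score:.3f}")
```

Training amounts to nudging the weights and biases so that the output score moves closer to the correct answer for each training sample.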

You may read this article which outlines 10+ fascinating applications of Deep Learning. In geospatial applications, particularly those related to Imagery Analytics, Deep Learning is used for Object Detection, Instance Segmentation (identifying the boundary of each object detected), Image Classification (assigning an image or object to one of a set of predefined classes - X or Y or Z) and Pixel Classification (Semantic Segmentation - identifying whether a pixel is from a desert, an ocean, a forested area and so on). Please refer to another article of mine where a Deep Learning framework has been applied to classify Power Lines.
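The difference between these output types can be sketched with a toy example. No Deep Learning model is involved here: the mask below is hand-written and merely stands in for the output of a pixel classifier. Grouping connected 'building' pixels gives separate instances, and the extent of each instance gives a detection-style bounding box. The function names are hypothetical.

```python
from collections import deque

# Hand-written toy output of a pixel classifier: 1 = "building", 0 = "other"
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]


def label_instances(mask):
    """Group 4-connected building pixels into separate instances."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    instances = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited building pixel
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for nr, nc in ((pr + 1, pc), (pr - 1, pc), (pr, pc + 1), (pr, pc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] == 1 and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            instances.append(pixels)
    return instances


def bounding_box(pixels):
    """Return (min_row, min_col, max_row, max_col) enclosing an instance."""
    rs = [p[0] for p in pixels]
    cs = [p[1] for p in pixels]
    return (min(rs), min(cs), max(rs), max(cs))


instances = label_instances(mask)
boxes = [bounding_box(p) for p in instances]
print(len(instances), "instances; boxes:", boxes)
```

Multiplying the pixel count of an instance by the ground area of one pixel (e.g. 0.3 m x 0.3 m for 30 cm imagery) would give a rough building area.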


As with other Machine Learning algorithms, the more quality data a 'Deep' Learning model is trained with, the more accurate its output tends to become.

One of the two topics covered in this blog is a ready-to-use Deep Learning model to extract building footprints (i.e. Object Detection) from a geospatial dataset (satellite imagery). The model was trained on a large quantity of imagery of the USA (30-60 cm resolution). Naturally, the model works best for building footprint detection and extraction in the USA; however, it works reasonably well for other countries with similar housing patterns too.

Clicking on the image below will lead you to an engaging storymap on this topic where you can explore the DL model's results for yourself.


Figure 2: Esri's ArcGIS Storymap containing samples of building footprints extracted. Image shows footprints extracted from an area of interest in Sweden


I would have preferred to try this Deep Learning model on an urban region in India. However, at the time of writing this post, I could only manage to access a 30 cm resolution imagery sample of Madrid's Barajas Airport region from 2009 (from European Space Imaging), and hence have used it to test the model.

Upon running the model with conservative parameters, I obtained the output depicted below -

(The sliders below are best viewed on a PC.)

Output 1: Building Footprint Extraction using Esri's ready-to-use Deep Learning model

Output 2: Building Footprint Extraction using Esri's ready-to-use Deep Learning model


The outputs are appealing on two counts - Firstly, without me specifying what buildings look like in the imagery from Europe, the model was able to use its internal, trained-in knowledge about building types in the USA to detect similar-looking objects here (green in the figure below), differentiating them from the rest of the imagery by masking them.

Figure 3: A total of 114 buildings were detected in Output 2. Building Area is also captured in the Attribute table in Esri's ArcGIS Pro

It would be fair to assume that additional footprints would have been detected had I opted for less conservative parameters. Also, I didn't have access to the 'regularize footprint' geoprocessing tool, which would have lent more precision to the building outlines. I have done so in another article with an elaborate video walkthrough.
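How the parameter choice affects the number of footprints detected can be sketched as follows, assuming the setting in question behaves like a confidence threshold (I am not reproducing the tool's actual parameters here). The detections and their scores are made-up values, and `keep_confident` is a hypothetical helper.

```python
# Hypothetical detections: (building_id, confidence score between 0 and 1)
detections = [
    ("b1", 0.97), ("b2", 0.91), ("b3", 0.84),
    ("b4", 0.72), ("b5", 0.55), ("b6", 0.38),
]


def keep_confident(detections, threshold):
    """Keep only detections whose confidence is at or above the threshold."""
    return [d for d in detections if d[1] >= threshold]


high_bar = keep_confident(detections, threshold=0.9)  # fewer, surer hits
low_bar = keep_confident(detections, threshold=0.5)   # more hits, more risk
print(len(high_bar), "kept at 0.9;", len(low_bar), "kept at 0.5")
```

A high threshold misses some genuine buildings, while a low threshold picks up more of them at the cost of more false detections.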

That being said, from the model's perspective, as researchers feed it more training data and give feedback on the output generated (in terms of false positives and false negatives), it should evolve and become consistently better at what it does (refer to the video here).
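False positives and false negatives are usually summarised as precision and recall. Here is a minimal sketch of that arithmetic; the counts are made-up and do not come from the outputs above, and `detection_scores` is a hypothetical helper.

```python
def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) from raw detection counts."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up review of one output: 90 correct footprints, 10 spurious, 30 missed
precision, recall, f1 = detection_scores(90, 10, 30)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision drops when the model marks non-buildings as buildings (false positives), and recall drops when it misses real buildings (false negatives).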


Case 2: Detecting Swimming Pools

The second case which we will cover in this article is fairly simple - we want to detect the number of newly constructed swimming pools in the city of Redlands, California. This information will be useful for tax assessors, who typically rely on 'infrequent' surveys for it (a swimming pool within a property enhances its value and, as a result, the property tax).
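Once pools have been detected, the bookkeeping for an assessor could look like the sketch below. It assumes that each detected pool has already been matched to a parcel and that the assessor's existing records are available as a list of parcel IDs; both sets here are made-up and not real Redlands data.

```python
# Hypothetical parcel IDs; neither set comes from real Redlands data
parcels_with_detected_pool = {"P-101", "P-102", "P-205", "P-310"}
parcels_with_recorded_pool = {"P-101", "P-205"}

# Pools seen in the imagery but missing from the assessor's records
unrecorded = sorted(parcels_with_detected_pool - parcels_with_recorded_pool)
print(f"{len(unrecorded)} parcels to review:", unrecorded)
```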

The methodology is largely similar to that for detecting building footprints - we identify training samples, load them into the model and validate it, and finally use the DL model to detect new swimming pools. A video of the workflow can be seen below -

(The video is best viewed in YouTube's HD setting, either on a PC or in landscape mode on a mobile phone. You can reduce / increase the play speed as per your viewing preferences.)

Video: Workflow for detecting Swimming Pools in the city of Redlands, California using Esri's Deep Learning model

Isn't this fascinating? Explore another interesting use case involving Deep Learning in GIS - 'Classifying Power Lines from LiDAR Dataset' here.


'How strong can a Deep Learning Model get?', one may wonder.

A: Enough to beat the best humans in the business and go even further. I'd encourage you to watch this brilliant documentary - AlphaGo.


Intelloc Mapping Services | Mapmyops is engaged in providing mapping solutions to organizations which facilitate operations improvement, planning & monitoring workflows. These include but are not limited to Supply Chain Design Consulting, Drone Solutions, Location Analytics & GIS Applications, Site Characterization, Remote Sensing, Security & Intelligence Infrastructure, & Polluted Water Treatment. Projects can be conducted pan-India and overseas.

Several demonstrations for these workflows are documented on our website. For your business requirements, reach out to us via email - or book a paid consultation (video meet) from the hyperlink placed at the footer of the website's landing page.


Many thanks to Esri & European Space Imaging for the training material.

