Arpit Shah

Extracting Water Bodies using Remote Sensing

Updated: Jun 17, 2023

Those who have read my pre-2022 articles on Imagery Analytics, or any similar articles online, often see only the final output and the writer's interpretation of it. The procedure involved in deriving that output from raw imagery is generally hidden from view. After all, it makes more sense to look at a piece of art and admire its beauty or spot its flaws than to watch an artist painstakingly prepare it. However, if you have seen those soothing, fast-forwarded art videos on social media, you'll know that there is a lot of joy to be derived from watching the process of creating art as well.

There is much to like about analyzing satellite imagery. Much of the appeal lies in its 'explorative' nature - like an archaeological expedition, one keeps digging and uncovers layers of artefacts along the way.

In this article, we will walk through the process involved in satellite imagery analytics using a case of 'water body mapping'. We will analyze two types of satellite imagery - optical and radar. Along this exhaustive journey, you will get an idea of what geotechnology has to offer in this space and the various methods to extract meaningful information from satellite imagery.

 



 
Figure 1: Electromagnetic Spectrum Infographic. Source: EO College

Optical satellite imagery looks much like a normal photograph. Information is captured by the satellite's receiver using visible light emitted by passive light sources (mainly sunlight), which hits and is reflected from the earth's surface. Radar satellite imagery, on the other hand, is formed from the reflectance of radio waves (not visible to the human eye) which are emitted by an active source - the transmitter onboard the satellite itself.


Notice from the infographic in Figure 1 above that radio waves can penetrate the earth's atmosphere just like visible light.


Both optical and radar imagery have their own set of features, advantages and disadvantages.


In this exercise, we will map the water bodies in Masuria, northern Poland. The area is famous for its Lake District (more than 2,000 lakes). Mapping water bodies is necessary for multiple purposes and is part of numerous geo-workflows such as flood mapping, water level monitoring, tidal studies, river course change detection, glacial movement tracking, etc.


Just imagine how difficult it would be to manually plot the contours of a water body with coordinate-level precision from just an image or via a cartographic survey. Satellite imagery analytics, in comparison, helps save a lot of time and effort.


Before we analyze satellite images, we need to acquire them. Imagery acquisition requires a little know-how. Without getting into technicalities (see the first few sections of this video to know more), one needs to decide on the desired imagery features, resolution, timeline of the image(s), and where and how to acquire them - and, of course, the cost of acquisition plays an important role too. In this exercise I have used optical and radar imagery acquired by the EU's Copernicus Earth Observation satellites. These are free and openly accessible to registered users.

 

Mapping Water Bodies using Optical (Satellite) Imagery

Figure 2.01: Cloud-corrected Sentinel-2 optical imagery over the Masurian Lake District, acquired in March 2018 and visualized in natural color mode.
 
Figure 2.02: Here, we have visualized the optical imagery in False Color Infrared (FCI) mode. Visualizing in different color modes (using the bands in the imagery) helps us to understand and interpret the data better. For example, FCI helps to distinguish water bodies (blue) from vegetation (red) better, as seen above.

 
Figure 2.03: Satellite imagery can be acquired raw or pre-processed. Acquiring pre-processed imagery saves a lot of computing resources and time. The three pre-processed image masks above (excluding the top left) denote the pixels containing snow, clouds and water as estimated by the preprocessing algorithms deployed by the satellite service provider (ESA). This helps us to interpret the imagery better and filter away areas which may not be relevant for our study (such as clouds and snow).

 
Figure 2.04: Just as we do in spreadsheet-based analysis, we need to clean and organize / structure the dataset before proceeding to the actual analysis. Our optical imagery contains 13 bands of information captured in 3 different resolutions (10m, 20m & 60m). To analyze the image, however, we need to make each pixel comparable. Therefore, we 'resample' the image to 10m resolution so that the processing operations function properly later on. The image above is resampled and visualized in Land / Water mode. Water is captured in the B2, B3 & B8 bands, which are acquired at 10m resolution by the Sentinel-2 optical satellite.
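As a rough illustration of what resampling does, here is a minimal Python sketch (using rasterio, with hypothetical file names) that upsamples a 20m band onto the 10m grid so it can be compared pixel-for-pixel with the 10m bands. The actual exercise performs this step with the resampling operator in the SNAP toolbox; this is just the same idea expressed in code.

```python
import rasterio
from rasterio.enums import Resampling

# Hypothetical input: a 20 m Sentinel-2 band that we want on the 10 m grid
with rasterio.open("B11_20m.tif") as src:
    upscale = 2  # 20 m -> 10 m doubles the number of rows and columns
    data = src.read(
        out_shape=(src.count, src.height * upscale, src.width * upscale),
        resampling=Resampling.nearest,   # nearest-neighbour keeps the original values
    )
    # Adjust the affine transform so each pixel now represents 10 m x 10 m
    transform = src.transform * src.transform.scale(1 / upscale, 1 / upscale)
    profile = src.profile
    profile.update(height=data.shape[1], width=data.shape[2], transform=transform)

with rasterio.open("B11_resampled_10m.tif", "w", **profile) as dst:
    dst.write(data)
```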

 
Figure 2.05: To continue cleaning and structuring the imagery, we need to remove components which are not useful for our analysis. After all, satellite imagery is heavy in size - the one we are using is actually 1 GB - and running the analysis on the entire image wastes computing resources and time. Therefore, we proceed to 'subset' the image. Subsetting can involve merely cropping the geographic extent of the image (as depicted above), removing certain bands of information not useful for our study, or both. Doing so reduces the imagery size considerably and makes processing quicker.
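For readers who prefer code, here is a minimal sketch of the cropping flavour of subsetting using rasterio. The bounding box values and file names are hypothetical, and the box must be expressed in the raster's own coordinate reference system (Sentinel-2 tiles are usually in UTM metres).

```python
import rasterio
from rasterio.windows import from_bounds

# Hypothetical bounding box around the Masurian Lake District, in the raster's CRS
minx, miny, maxx, maxy = 630000.0, 5940000.0, 690000.0, 5990000.0

with rasterio.open("sentinel2_resampled_10m.tif") as src:
    window = from_bounds(minx, miny, maxx, maxy, transform=src.transform)
    data = src.read(window=window)          # crop to the geographic extent
    profile = src.profile
    profile.update(
        height=data.shape[1],
        width=data.shape[2],
        transform=src.window_transform(window),
    )

# Write the much smaller subset to disk for faster processing downstream
with rasterio.open("sentinel2_subset_10m.tif", "w", **profile) as dst:
    dst.write(data)
```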

 

Normalized Difference Water Index (NDWI) is the analysis method (developed by McFeeters in 1996) which we will use to detect, amplify and demarcate water bodies.

Figure 2.06: The NDWI formula as defined by McFeeters: NDWI = (Green - NIR) / (Green + NIR), where Green is the reflectance in the visible green band and NIR is the reflectance in the near-infrared band.


The crux of the formula above is simple to understand although the technical aspect - how different types of electromagnetic waves interact with different types of objects - is indeed, complex.


Sunlight, the primary source of illumination in optical imagery, is composed of Visible light (which we can see) as well as Near Infrared (NIR) and Short-wave Infrared (SWIR) waves (which we can't see).


Water appears relatively brighter in visible light than it does in NIR and SWIR, where it appears very dark. Soil and vegetation, in contrast, appear darker in visible light but brighten up significantly when exposed to NIR and SWIR light.


This is because different objects have different reflectivity values when exposed to different types of electromagnetic waves (see image below). In our analysis, we use the NDWI formula because it exploits this principle to good effect and helps us to ascertain whether the pixel under review is a Water pixel or not.

Figure 2.07: Reflectance properties of Water, Soil & Green Vegetation. Source: RUS Copernicus


The NDWI formula determines whether the near-infrared value of a pixel exceeds its visible (green) light value. If it does, both the numerator and the resulting output turn negative, implying a land pixel. On the other hand, if the visible light value exceeds the near-infrared value, both the numerator and the output remain positive, implying a water pixel. The fascinating aspect is that if a pixel is low on visible light it will invariably be high on NIR, because water has contrasting reflectance characteristics compared to soil and vegetation. Thus, this ratio formula amplifies the dominating property of the surface captured in the pixel.
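To make the calculation concrete, here is a minimal Python sketch (using numpy and rasterio, with hypothetical band file names) of how NDWI could be computed from the green (B3) and near-infrared (B8) bands and turned into a water mask. The article itself performs this step with the Band Maths tool in the SNAP toolbox; the code below is only an equivalent expression of the formula.

```python
import numpy as np
import rasterio

# Hypothetical paths to the subsetted, 10 m resampled Sentinel-2 bands
GREEN_PATH = "subset_B3_10m.tif"   # visible green band
NIR_PATH = "subset_B8_10m.tif"     # near-infrared band

with rasterio.open(GREEN_PATH) as g, rasterio.open(NIR_PATH) as n:
    green = g.read(1).astype("float32")
    nir = n.read(1).astype("float32")
    profile = g.profile

# McFeeters (1996): NDWI = (Green - NIR) / (Green + NIR)
# A tiny epsilon avoids division by zero over no-data pixels.
ndwi = (green - nir) / (green + nir + 1e-10)

# Positive NDWI -> water, negative NDWI -> land (the 'blanket change' in Figure 2.08)
water_mask = (ndwi > 0).astype("uint8")

profile.update(dtype="uint8", count=1)
with rasterio.open("optical_water_mask.tif", "w", **profile) as dst:
    dst.write(water_mask, 1)
```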

Figure 2.08: The image on the left is the NDWI output. White pixels have positive NDWI values and can be classified as water, whereas dark pixels have negative NDWI values and can be classified as land. In the right image, we've applied a blanket change - all the dark pixels have been converted to black (because the land cover type doesn't matter to us, be it vegetation, urban or desert: we are only interested in the water pixels).


NDWI has its flaws - urban infrastructure, for example, can be misclassified as water. More sophisticated techniques are available to detect water bodies more precisely.


We will save this mask / layer of information and use it towards the end of this article to visualize and compare against the radar imagery output.

 

Mapping Water Bodies using Radar (Satellite) Imagery


One of the most important aspects of radar imagery is that it is neither impacted by the earth's atmosphere nor does it require an external source of light (sunlight) to capture information. The radar illumination generated by the satellite's transmitter can penetrate clouds and capture surface property information with precision even at night. This makes it more useful in certain situations where optical imagery is insufficient or ineffective. The way radar imagery is processed is also significantly different. Let's begin.


While we were able to extract the water mask from just one optical imagery set, we will use five radar imagery sets (acquired between October and December 2020) to extract the water mask. We will delve into the reason behind this later in the article.


This is how radar imagery over the same area (Masurian Lake District) looks -

Figure 3.01: Depicted above is radar imagery captured in October 2020. We've already done the first preprocessing step, i.e. 'subsetted' the image to match the extent we had used in the optical image. In terms of visualization, all we see is a sprinkling of salt and pepper. Nonetheless, you'll be able to discern that large water bodies are clearly black in color. Why is that? Because the radio waves emitted from the satellite hit the water and reflect away from it, i.e. they do not return to the satellite. Hence, open water bodies are recorded with minimal data values - resulting in the black spots. In contrast, radio waves do return to the satellite in varying magnitudes from land surfaces - be it vegetation, barren land or urban infrastructure. The greater the energy return (reflectance), the whiter the pixels look. This concept of how radio waves interact with objects is called backscatter, and knowledge of it assists the analytics process immensely.

 
Figure 3.02: Our next processing step is to apply the latest orbit file metadata to the image. A satellite captures data as it navigates its orbit. Knowing the accurate location of the satellite while the data is being captured helps in precisely attributing the data values to pixels (geocoding).

Precise orbit files take some time to prepare, i.e. they are not available at the time of imagery acquisition, which is why this step is common in satellite imagery workflows.

 
Figure 3.03: The next step is to apply Thermal Noise Removal. The satellite receiver which captures the reflected radio waves generates its own background energy, which gets incorrectly added to the pixel data values (making pixels whiter) and would lead to incorrect interpretation of the surface properties. The image on the right is the thermal noise-corrected image. Notice that it is visibly darker in certain sections than the original image on the left.

 
Figure 3.04: The next pre-processing step is to apply Radiometric Calibration to the image (the right image has been calibrated). To put it simply, the reflectance values of the radio waves at the earth's surface get distorted by the time they reach the receiver on the satellite (remember the ionosphere?) and again while they are pre-processed and stored. To correct this, calibration is applied at all the levels where distortions take place, so that the data value for each pixel denotes the actual reflectance generated at the earth's surface, thereby making the datasets geo-precise and comparable.


Each pixel in our Synthetic Aperture Radar (SAR) imagery captures surface information in 5m by 20m resolution.

 
Figure 3.05: Next, we've applied the preprocessing step known as 'Terrain Correction'. Data captured by the satellite is in radar geometry, not in a geographic projection, so the orientation of the image and the distances between objects are distorted. Map projections are an entire topic in themselves, so we'll not delve into that here. However, if you observe the terrain-corrected image closely, you'll see that it is inverted and slightly narrower than the original image in Figure 3.01. Geographically speaking, the terrain-corrected imagery above is now comparable to Figure 2.08.


The terrain-corrected image now resembles what we see on a regular 2D map in terms of a) distances between objects as well as b) north being oriented towards the top. Originally, the image was stored the way the satellite captured the information in its orbit, i.e. in radar geometry.

 

Each of the processing steps above is time-consuming. Doing the same steps on all five images would take considerable time and increase the chances of human error. To automate this process and reduce the time taken, the imagery analytics software offers two highly useful features known as Graph Builder and Batch Processing. Essentially, we plot the processing steps in chronological order, set the parameters, and run them on all 5 imagery datasets together with the click of a button. (For this to work, all the images need to share common processing parameters, which in our case happens to be so.)
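Outside the Graph Builder GUI, the same chained processing can also be scripted. Below is a hedged sketch that loops over the five radar scenes and hands each one to SNAP's command-line graph processing tool (gpt) together with a saved graph file; the graph file name, scene file names and parameter names are assumptions for illustration, not the exact ones used in this exercise.

```python
import subprocess
from pathlib import Path

# Hypothetical graph exported from Graph Builder containing the chain:
# Apply-Orbit-File -> ThermalNoiseRemoval -> Calibration -> Terrain-Correction -> Subset
GRAPH_FILE = "water_mapping_graph.xml"

# Hypothetical folder holding the five Sentinel-1 scenes (Oct - Dec 2020)
scenes = sorted(Path("radar_scenes").glob("S1A_*.zip"))
Path("processed").mkdir(exist_ok=True)

for scene in scenes:
    output = Path("processed") / f"{scene.stem}_processed.dim"
    # Run the same graph on each scene - the batch-processing idea in script form
    subprocess.run(
        ["gpt", GRAPH_FILE, f"-Pinput={scene}", f"-Poutput={output}"],
        check=True,
    )
    print(f"Finished {scene.name}")
```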

Figure 3.06: Processing Graph for Waterbody Mapping


Figure 3.07: Batch Processing in Progress


Figure 3.08: All five images, captured at different times (Oct - Dec 2020), processed at once and laid out side by side in the image above.

 

The last preprocessing step is Coregistration. Essentially, to the software each image is a separate entity, and it cannot compare and analyze pixel values at the same coordinates across different images. Therefore, we coregister the images, i.e. form a stack of the five images in a single product, so that the software knows that all the images are identical in geographic extent and the same pixel location across images can be analyzed in an 'apples to apples' manner.

Figure 3.09: Coregistered stack product view. Notice that each image is stored as a band within the same product. This step is necessary for the actual analysis, i.e. the Band Maths processing step, which comes next.

 
Albert Einstein is said to have once remarked: 'If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about the solution.'

I find this applicable to satellite imagery workflows as well. The actual analysis takes little time, but the processing methodology and steps involved are time-consuming and need to be carried out in an error-free manner to ensure that the data is set up correctly for the analysis and interpretation to be effective.

Figure 3.10: The image above is a multi-temporal RGB composite. We have merged 3 images into one. Please note, the colors are not natural colors. What we are seeing is an image which captures the changes that have occurred on the same surface over a period of time. Bright pixels and black pixels are those which are unchanged across the 3 image timelines, green pixels are those which have changed from the 2nd image timeline onward, and blue pixels are those which have changed only in the 3rd image timeline used in this temporal merge.


Isn't this useful?


These kinds of temporal images are useful to see where, for example, agricultural harvesting has happened, as the water bodies tend to remain the same across the timelines (although important changes such as flooding, course changes and water level changes can be captured using this method as well).

 
Figure 3.11: This is our first analysis step. We have done Mean Averaging here, i.e. averaged the pixel data values at the same location across the 5 image timelines and transferred the result to a 6th image - the one on the bottom right.


Why do we do that? It is akin to balancing out the sour taste of a dish by adding sugar to it. The base images have inherent noise which leads to variation in data values. By mean averaging, we balance out these variations so that the noise is tempered and we get a smoother image. You may not be able to discern the smoothness above, but in full-screen mode in the software this effect can be observed very clearly. This is why we opted for multitemporal analysis using five radar images - it helps us balance out noise and outliers in the pixel data values, which aids correct analysis and interpretation.
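Conceptually, mean averaging over the coregistered stack is just a per-pixel average along the time dimension. Here is a minimal numpy/rasterio sketch of that idea, with hypothetical file names (and dates) for the five terrain-corrected, coregistered scenes; in the exercise itself this is done with Band Maths on the stack product.

```python
import numpy as np
import rasterio

# Hypothetical coregistered, terrain-corrected intensity bands (one per acquisition date)
scene_paths = [
    "sigma0_2020_10_05.tif",
    "sigma0_2020_10_17.tif",
    "sigma0_2020_11_10.tif",
    "sigma0_2020_11_22.tif",
    "sigma0_2020_12_04.tif",
]

# Stack the five images along a new time axis: shape (time, rows, cols)
stack = []
for path in scene_paths:
    with rasterio.open(path) as src:
        stack.append(src.read(1).astype("float32"))
        profile = src.profile
stack = np.stack(stack, axis=0)

# Per-pixel mean across the five dates tempers speckle noise and outliers
mean_intensity = stack.mean(axis=0)

profile.update(dtype="float32", count=1)
with rasterio.open("sigma0_mean_2020.tif", "w", **profile) as dst:
    dst.write(mean_intensity, 1)
```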

 
Figure 3.12: Having smoothed the image, all we need to do now is create a water mask similar to the one we made from the optical image. For this we need to separate land pixel values from water pixel values, i.e. apply 'Thresholding'. However, that is not an easy thing to do here.

Just look at the histogram of the pixel intensities (image on the left) - there is very little difference between the data value of a black pixel and that of a white pixel in our mean-averaged image: the values are tiny and clustered together.
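To inspect this yourself, the distribution of pixel values can be plotted as a histogram. The sketch below (using matplotlib and the hypothetical mean-averaged file from the earlier snippet) reproduces the kind of view shown on the left.

```python
import matplotlib.pyplot as plt
import rasterio

with rasterio.open("sigma0_mean_2020.tif") as src:
    mean_intensity = src.read(1)

# Histogram of the linear intensity values - they sit in a narrow, clustered range
plt.hist(mean_intensity.ravel(), bins=256)
plt.xlabel("Mean intensity (linear scale)")
plt.ylabel("Pixel count")
plt.title("Histogram of the mean-averaged radar image")
plt.show()
```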


How can we threshold / classify the data, then?

 
Figure 3.13: Observe the image on the right. The black pixels appear blacker whereas the white and gray pixels appear whiter / brighter. What has happened here? We have log-scaled the intensity values, i.e. converted them from a linear scale to a logarithmic (decibel) scale: dB = 10 * log10(linear intensity).


Essentially, we have stretched the contrast between dark and bright pixels by converting the linear pixel data values into logarithmic data.
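In code, this log-scaling step is a one-liner applied to the mean-averaged intensity image. The sketch below assumes the hypothetical file produced in the previous sketch and guards against taking the log of zero-valued pixels.

```python
import numpy as np
import rasterio

with rasterio.open("sigma0_mean_2020.tif") as src:
    mean_intensity = src.read(1).astype("float32")
    profile = src.profile

# dB = 10 * log10(linear intensity); clip tiny values so log10 never sees zero
mean_db = 10.0 * np.log10(np.clip(mean_intensity, 1e-10, None))

profile.update(dtype="float32", count=1)
with rasterio.open("sigma0_mean_2020_db.tif", "w", **profile) as dst:
    dst.write(mean_db, 1)
```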

 
Figure 3.14: Thresholding works well when the histogram of an image is bi-modal, i.e. has twin peaks. The image on the left - the histogram of the log-scaled image - shows just that. When we converted the intensity band to a logarithmic scale, we created a clear gap between the values of the dark and the bright pixels, which resulted in this formation. For our exercise, we can reasonably conclude that pixels with an intensity value below 23.54 correspond to water and those above 23.54 correspond to non-water.

Logarithms are useful after all!

 
Figure 3.15: Using 23.54 as the threshold, we've used Band Maths to remove all the land pixels, as depicted in the image on the right. This image becomes our water mask of the Masurian Lake District derived from radar imagery.
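The corresponding Band Maths expression is a simple comparison against the chosen threshold. A minimal sketch, reusing the hypothetical log-scaled file from above and the 23.54 value read off the bimodal histogram, would look like this (an automated method such as Otsu's thresholding could also pick the cut-off, but that is not what this exercise does):

```python
import numpy as np
import rasterio

THRESHOLD = 23.54  # value read manually from the bimodal histogram in Figure 3.14

with rasterio.open("sigma0_mean_2020_db.tif") as src:
    mean_db = src.read(1)
    profile = src.profile

# Pixels below the threshold are water (1), all others are land (0)
radar_water_mask = (mean_db < THRESHOLD).astype("uint8")

profile.update(dtype="uint8", count=1)
with rasterio.open("radar_water_mask.tif", "w", **profile) as dst:
    dst.write(radar_water_mask, 1)
```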

 

The final step is to compare both the water masks in a Geographic Information System.

Figure 3.16: The red pixels are water pixels identified using optical imagery, whereas the blue pixels are those identified using radar imagery. Common pixels (which most are) are colored as a mix, i.e. violet.

The optical imagery over our study area was acquired in spring, whereas the radar imagery was acquired in the winter months. Therefore, there are bound to be some changes in the landscape which account for the occasional singular classification. Also, since the methods of data acquisition and processing differ for optical and radar, small deviations in the outputs are expected as well.
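If both masks are available as rasters on the same grid, the comparison can also be quantified rather than only visualized. Here is a small sketch that overlays the two hypothetical mask files from the earlier snippets (assumed to share the same extent and pixel grid) and reports how much of the detected water is common to both:

```python
import numpy as np
import rasterio

# Hypothetical mask files produced earlier, assumed to share the same grid and extent
with rasterio.open("optical_water_mask.tif") as opt, \
     rasterio.open("radar_water_mask.tif") as rad:
    optical = opt.read(1).astype(bool)
    radar = rad.read(1).astype(bool)

both = np.logical_and(optical, radar)      # 'violet' pixels: water in both masks
either = np.logical_or(optical, radar)     # water in at least one mask

print(f"Optical-only water pixels: {int((optical & ~radar).sum())}")
print(f"Radar-only water pixels:   {int((radar & ~optical).sum())}")
print(f"Agreement (IoU):           {both.sum() / either.sum():.2%}")
```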


A PowerPoint presentation comparing both the imagery types (optical and radar) can be found here.

 

Final output: In the slider above, you can compare our water mask output overlaid on Google basemap imagery.


Isn't our water body detection accurate? As you review this output, I hope you had fun getting to know the process involved in satellite imagery analytics, as I suggested you would in the first paragraph of this article.


Thank you for reading!

 

ABOUT US


Intelloc Mapping Services | Mapmyops is engaged in providing mapping solutions to organizations which facilitate operations improvement, planning & monitoring workflows. These include but are not limited to Supply Chain Design Consulting, Drone Solutions, Location Analytics & GIS Applications, Site Characterization, Remote Sensing, Security & Intelligence Infrastructure, & Polluted Water Treatment. Projects can be conducted pan-India and overseas.


Several demonstrations for these workflows are documented on our website. For your business requirements, reach out to us via email - projects@mapmyops.com or book a paid consultation (video meet) from the hyperlink placed at the footer of the website's landing page.

Regards,

Many thanks to EU Copernicus, RUS Copernicus & QGIS for the training material and software. A video output using older imagery can be viewed here.
