Mapping Water Bodies from Satellite Imagery
Updated: Aug 22, 2021
Those who have read my previous articles on imagery analytics or any such articles online tend to see the final output and the writer's interpretation of it. Deriving the output from base imagery and what it entails is generally, hidden from view. After all, it makes more sense to watch a piece of art and admire its beauty or spot its flaws rather than watching an artist prepare it. However, if you have seen those soothing, fast-forwarded artsy videos on social media, you'll know that there is a lot of joy to be derived from watching the process also.
There is much to like about analysing satellite imagery. Much of it is because it is 'explorative' in nature - like an archaeological expedition, one keeps on digging to uncover layers of artefacts along the way.
In this article, we will see the process of satellite imagery analytics using a case of 'water body mapping'. We will analyse both the imagery types - optical and radar. Along this exhaustive journey, you would get an idea of what geo-technology has to offer in this space and the various methods to extract meaningful information from satellite imagery.
Figure 1: Electromagnetic Spectrum Infographic. Source: EO College
Optical satellite imagery looks much like a normal photograph. Information is captured by the satellite's receiver using the visible light emitted by passive light sources (mainly sunlight) which hits and is reflected from the earth's surface. Radar imagery, on the other hand, is formed from the reflectance of radio waves (not visible to human eye) which are emitted by an active light source - the transmitter onboard the satellite itself.
Notice in the infographic above that radio waves can penetrate the earth's atmosphere just like visible light. Both optical and radar imagery have their own set of features, advantages and disadvantages.
In this exercise, we will map the waterbodies in Masuria, North Poland. This area is famous for its Lake District (> 2000 lakes within). Mapping water bodies is necessary for multiple purposes and is part of numerous geo-workflows such as flood mapping, water level mapping, tidal effects, changes in river course, glacial movement etc.
Just imagine how difficult would it be to manually plot the contours of a water body with coordinate level precision from just an image or via a cartographic survey. Satellite Imagery Analytics, in comparison, helps save a lot of time and effort.
Before we analyse satellite images, we need to acquire the images. Imagery acquisition requires a little knowhow. Without getting into technicalities, one needs to know the desired imagery features, resolution, timeline of image(s), where and how to acquire it and, of course, costing plays an important role too. In this exercise I have used optical and radar imagery acquired by EUs Copernicus Earth Observation satellites. These are free and openly accessible to users.
Analysing Optical Imagery to Map Water bodies
Figure 2.01: Cloud Corrected Sentinel 2 Optical Imagery over Masurian Lake District acquired in March, 2018 and visualized in natural color mode.
Figure 2.02: Here, we have visualized the optical imagery in False color Infrared mode. Visualizing in different color modes (using the bands in the imagery) helps us to understand and interpret the data better. For example, FCI helps to distinguish water bodies (blue) better from vegetation (red) as seen above.
Figure 2.03: Satellite Imagery can be acquired raw or pre-processed. Acquiring pre-processed imagery saves a lot of computing resources and time. The three pre-processed image masks above (excluding the top left) denote the pixels containing snow, clouds and water as estimated by the analytics model of the satellite service provider. This helps us to interpret the imagery better and discount for areas (clouds, snow) which may not be relevant for our study.
Figure 2.04: Just as we do in spreadsheet based analysis, we need to clean the imagery data and organize / structure it first before proceeding to do the actual analysis on it. Our optical imagery contains 13 bands of information captured in 3 different resolutions (10m, 20m & 60m). To analyse an image, however, we need to make each pixel comparable. Therefore, we 'resample' our image to 10m resolution for the analysis operations to function properly later on. The image above is resampled and visualized in Land / Water mode.
(Water pixels are captured in B2, B3 & B8 bands which are acquired in 10m resolution by the Sentinel 2 optical satellite).
Figure 2.05: To continue cleaning & structuring the imagery further, we need to remove components which are not useful for our analysis. After all, satellite imagery is heavy in size - the one we are using is actually 1 GB in size and to run analysis on the entire image is a waste of computing resources and time. Therefore we proceed to 'subset' the image. Subsetting can involve mere cropping of the geographic extent of the image (as depicted in the image above) OR can involve removing certain bands of information not useful for our study OR both. Doing so reduces the imagery size considerably and makes processing quicker.
Normalized Difference Water Index (NDWI) is the analysis method (developed by McFeeters in 1996) which we will use to detect, amplify and demarcate water bodies.
Figure 2.06: NDWI Formula as defined by McFeeters
The fundamental behind the formula above is simple to understand although the technical aspect (how different types of electromagnetic waves interact with different types of objects) is indeed, complex.
Sunlight, the light source in optical imagery, is composed of visible light (which we can see) as well as near infra-red (NIR) and short-wave infra-red (SWIR) waves (which we can't see).
Water appears relatively brighter in Visible Light than it does in NIR and SWIR light where it appears nearly pitch dark. Soil & Vegetation, in contrast, appears darker in visible light but brighten up significantly when exposed to NIR and SWIR light.
This is because different objects have different reflectivity values when exposed to different types of electromagnetic waves (see image below). In our analysis, we use the NDWI formula because it exploits this principle to good effect and helps us to ascertain whether the pixel is water or not.
Figure 2.07: Reflectance Properties of Water, Soil & Green Vegetation. Source RUS Copernicus
Using a simple mathematical ratio, the NDWI formula determines whether the NIR value of a pixel exceeds the Visible Light (Green) value of that pixel or not. If it does, then both the numerator and resulting output will turn negative, implying land. On the other hand, if Visible Light value exceeds the NIR value then both the numerator and output will remain positive, implying open water body. The fascinating aspect is that, if a pixel is low on Visible Light value then invariably it will be high on NIR value as water has contrasting reflectance property compared to soil and vegetation. Thus, this ratio formula amplifies the dominating property of the surface captured in the pixel.
Figure 2.08: The image on the left is the NDWI output. White pixels have positive NDWI values and can be classified as water whereas dark pixels have negative NDWI values and can be classified as different types of land. In the right image, we've applied a blanket change - all the dark pixels have been converted to black (because it doesn't matter to us if a pixel classified as land is due to vegetation, urban or desert: none of those are useful for our study).
NDWI has its flaws - For example, bright urban infrastructure can be misclassified as water. There are more sophisticated detection techniques available to detect water bodies more precisely.
We will save this mask / layer of information and will use it to visualize and compare with radar imagery output towards the end of this article.
Analysing Radar Imagery to Map Water bodies
One of the most important aspects of radar imagery is that neither is it impacted by the earth's atmosphere, nor does it require an external source of light (sunlight) to capture information. Therefore, it can penetrate through clouds as well as capture surface property information with precision even at night. This renders it to be more useful in certain situations where optical imagery is either insufficient or ineffective. The way to process a radar imagery is also significantly different.
While we were able to extract the water mask from just one optical image set, we will use five radar image sets (acquired between Oct and Dec 2020) to extract the water mask. We will delve into the reason behind this later in this article.
This is how a radar imagery over the same area (Masurian Lake District) looks like -
Figure 3.01: Above is Radar Imagery captured in Oct 2020. We've already done the first pre-processing step i.e. 'subsetted' the image to match the extent we had used in the Optical image. In terms of visualization, all we see is a sprinkling of salt and pepper. Nonetheless, you'll be able to discern that large water bodies are clearly black in color. Why is it so? Because the radio-waves emitted from the satellite hit the water and reflect away from it i.e. do not return to the satellite. Hence, open water bodies are classified with minimal data values - resulting in the black spots. In contrast, radio waves do return to the satellite in varying magnitudes from land surfaces - be it vegetation, barren land or urban infrastructure. The more the energy return, the whiter the pixels will look. This concept about how radio waves interact with objects is called backscatter, the knowledge of which assists in the analytics process immensely.
Figure 3.02: Our next processing step is to apply the latest orbit files metadata to the image. A satellite captures data as it navigates in its orbit. Knowing the accurate location of the satellite while the data is being captured helps in precisely attributing the data values to pixels (geo-coding).
Precise orbit files take some time to be prepared i.e. they are not immediately available during the time of imagery acquisition and thus, this step is mandatory in multiple geo-workflows.
Figure 3.03: The next step is to apply Thermal Noise Removal. The satellite receiver which captures reflected radio waves generates its own background energy which is incorrectly added to the pixel data value (making pixels whiter) and which would lead to incorrect interpretation of the surface properties. The image on the right is the thermal noise corrected image. Notice that it is visibly darker in certain sections than the original image on the left.
Figure 3.04: The next pre-processing step is to apply Radiometric Calibration to the image (the right image has been calibrated). To put it simply, the reflectance values captured by the radio waves on the earth's surface get distorted by the time it reaches the receiver on the satellite (remember ionosphere?) and even during the time it is processed and stored as data. To correct this, calibration is done at all the levels where distortions take place so that the data values for each pixel denote its earth's surface level reflectance values thereby making the data geo-precise and comparable.
(Each pixel in our Synthetic Aperture Radar (SAR) imagery captures surface information in 5m by 20m resolution).
Figure 3.05: Next, we've applied the pre-processing step known as 'Terrain Correction'. Data captured by the satellite is in the radar geometry form and not in geographic projection form. The orientation of image and distances between objects are distorted as a result. Map Projections is an entire topic within itself so we'll not delve into that. However, if you observe the terrain corrected image closely, you'll see that the image is inverted and is slightly narrower than the base image in Figure 3.01. The terrain corrected imagery above is, geographically speaking, similar to Figure 2.08 now.
The TC image resembles what we see on a regular 2D map in terms of distance between objects as well as Arctic Circle being the true north. Originally, the image was captured the way the satellite captured the information in its orbit i.e. in radar geometry.
Each of the processing steps above is time consuming. To do these same steps on all the five images will take considerable time and increases chances of manual error. To automate this process and reduce the time taken, the software has high utility features known as Graph Builder and Batch Processing. Essentially, we plot the processing steps in chronological order, set the parameters and run it at once on all the 5 images. (Obviously, for this to work, all the images need to have common processing parameters which in our case happens to be just so.)
Figure 3.06: Graph Builder
Figure 3.07: Batch Processing in Progress
Figure 3.08: All five images captured in different timelines (Oct - Dec 2020) processed at once and laid out side by side in the image above.
The last pre-processing step is Coregistration. Essentially, for the software, each image is a separate entity and it can't compare and analyse pixel values from different images. Therefore, we coregister the images i.e. form a stack so that the software knows that all the images are identical in terms of geographic extent and same pixel location across images can be analysed in an 'apple to apple' manner.
Figure 3.09: Coregistered stack. Notice that each image is stored as a band as part of the same mother image. This helps us to do the actual analysis i.e. Band Maths in our next step.
Einstein had once famously remarked: 'If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about solutions.'
I find this to be applicable to satellite imagery problem statements as well. The actual analysis portion takes little time but the pre-processing steps are time consuming and need to be done in an error-free manner for the ensuing analysis and interpretation to be effective.
Figure: 3.10: The image above is a multi-temporal RGB image. We have merged 3 images into one. Please note, the colors are not natural colors. What we are seeing is an image which captures the changes which have occurred on the same surface over a period of time. Bright pixels and Black pixels are those which are unchanged over the 3 image timelines, green pixels are those which have changed from the 2nd image timeline onward whereas the blue pixels are those which have changed in the 3rd image timeline used in this temporal merge.
Isn't this useful?
These kind of temporal images are useful to see where, for example, agricultural harvesting has happened. The water bodies remain the same across the timelines, although important changes like flooding, course change and water level changes can be captured using this method.
Figure 3.11: This is our first analysis step. We have done a Mean Averaging here i.e. averaged the pixel data values in the same location across the 5 image timelines and transferred it to a 6th image (the one on the bottom right).
Why do we do that? It is akin to balancing out the sour taste in a food dish by adding sugar to it. The base images have inherent noises which lead to variation in data values. By mean averaging, we balance out these variations so that the noise is tempered and we get a more smoother image. (You may not be able to discern the smoothness above, but in full screen mode in the software, the change can be observed very clearly). This is why we had opted for multi-temporal analysis using five radar images - it helps us to balance out any outliers in the pixel data values aiding us in correct analysis and interpretation.
Figure 3.12: Having smoothened the image, all we need to do now is to create a similar water mask as we did in the optical image. For this we need to separate land pixel values from water pixel values i.e. apply 'Thresholding'. However, that is not an easy thing to do here.
Just see the histogram output of the pixel intensity (left) - there is very little difference in the pixel data values of a black pixel to a white pixel in our mean averaged image - the values are so tiny.
How can we threshold the data, then?
Figure 3.13: Observe the image on the right. The black pixels appear blacker whereas the white and gray pixels appear whiter/brighter. What has happened here? We have log-scaled the intensity values i.e. converted it from being in linear scale to being in logarithmic scale - (dB=10*log10()).
Essentially, we have exponentially amplified the pixel colors by viewing the pixel data values in logarithmic scale.
Figure 3.14: Thresholding works well when a histogram of an image is bi-modal i.e. has twin peaks. The image on the left - histogram of the log-scaled image - shows just that. When we converted the intensity band to a logarithmic scale, we created a clear gap between the values of the dark and the bright pixels which resulted in this formation. For our exercise, we can reasonably conclude that pixels with an intensity value <23.54 corresponds to water pixels and >23.54 corresponds to non-water pixels.
Logarithms are useful after all!
Figure 3.15: Using 23.54 as the threshold, we've used band maths to remove all the land pixels as depicted in the image on the right. This image becomes our water mask of the Masurian Lake District using Radar imagery.
The final step is to compare both the water masks in a Geographic Information System.
Figure 3.16: The red pixels are water pixels identified using optical imagery whereas the blue pixels represent the pixels identified using radar imagery. Common pixels (which most are) are colored as a mix i.e. violet.
The optical imagery over the area of our study has been acquired in spring, whereas the radar imagery has been acquired in winter months. Therefore, there is bound to be certain changes in landscape which account for the occasional singular classification. Also, since the method of data acquisition and processing is different for optical and radar, small deviations in output is expected. A comparison slide between both the the imagery types (optical and radar) can be found here.
Final output: In the slider above, you can compare the detected water masks with basemap imagery from Google Maps.
Aren't the water body detections accurate? As you review this satellite imagery analytics output, I hope you had fun knowing the process involved as I had indicated in the very first paragraph of this article.
Thank you for reading!
Intelloc Mapping Services | Mapmyops.com is engaged in selling products which capture geo-data (Drones), process geo-data (Geographic Information System) as well as services (PoI Datasets & Satellite Imagery). Together, these help organizations to benefit from Geo-Intelligence for purposes such as operations improvement, project management and digital enabled growth.
Write to us on firstname.lastname@example.org. Download our one-page profile here. Request a demo.
Much Thanks to EU Copernicus, RUS Copernicus & QGIS for the training material and software. A video output using older imagery can be viewed from here.