Date of Defense
14-4-2026 5:00 PM
Location
Microsoft Teams
Document Type
Thesis Defense
Degree Name
Master of Science in Water Resources
College
COE
Department
Civil and Environmental Engineering
First Advisor
Prof. Mohamed Hamouda
Keywords
Satellite-based soil moisture; SMAP; SAR4SM; arid regions; International Soil Moisture Network (ISMN); machine learning; leak detection; water distribution systems.
Abstract
This thesis evaluates satellite-based soil moisture (SSM) datasets in arid regions and their potential application to leak detection in water distribution systems. The work focuses on the SMAP Level-4 soil moisture product and a Sentinel-1 SAR-based (SAR4SM) retrieval framework, used in conjunction with in situ observations from the International Soil Moisture Network (ISMN) and historical leak records from the Al Ain Distribution Network (AADC), currently part of TAQA Distribution. The main objectives are (i) to quantify the performance of SMAP and SAR4SM surface soil moisture estimates over arid environments against ground-based measurements, and (ii) to develop and assess a machine learning framework that attributes changes in SSM to possible leak occurrences in an operational distribution network.
In Phase 1, SMAP Level-4 9 km products and 250 m SAR4SM soil moisture maps were collocated with 0-5 cm ISMN observations from selected arid-region stations. Because the collocation and preprocessing workflow is lengthy and involves large volumes of satellite data, applying the full processing chain to all candidate stations was not feasible. Instead, an initial subset of 10 arid-region stations was processed and screened, and 2 representative stations were retained on the basis of their metadata and preliminary performance statistics. Performance was quantified using Mean Deviation, Mean Absolute Error, Root Mean Square Deviation, unbiased Root Mean Square Error (ubRMSE), correlation coefficient (R), and coefficient of determination (R²). In Phase 2, 2724 historical leak records were combined with SMAP-derived soil moisture indices. A Gradient Boosting classifier based on pipe diameter, age, material, district, and month of occurrence was complemented by a Random Forest classifier trained on SMAP trend metrics within 11-day windows, and the two models were fused at the decision level.
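As an illustration of the Phase 1 performance metrics, the following minimal Python sketch computes them for a pair of collocated satellite and in situ series. The function name `validation_metrics` and its argument names are hypothetical, and two conventions are assumed rather than taken from the thesis: ubRMSE is derived as the square root of RMSD² minus Mean Deviation², and R² is taken as the square of the Pearson correlation.

```python
import numpy as np


def validation_metrics(sat, insitu):
    """Compare a satellite soil moisture series with collocated in situ data.

    Hypothetical helper: both inputs are 1-D sequences of volumetric soil
    moisture (m3/m3) already matched in time. Pairs containing NaN are dropped.
    """
    sat = np.asarray(sat, dtype=float)
    insitu = np.asarray(insitu, dtype=float)

    # Keep only time steps where both series have a valid value.
    valid = np.isfinite(sat) & np.isfinite(insitu)
    sat, insitu = sat[valid], insitu[valid]

    diff = sat - insitu
    md = diff.mean()                      # Mean Deviation (bias)
    mae = np.abs(diff).mean()             # Mean Absolute Error
    rmsd = np.sqrt((diff ** 2).mean())    # Root Mean Square Deviation

    # Assumed convention: ubRMSE^2 = RMSD^2 - MD^2 (bias removed).
    ubrmse = np.sqrt(max(rmsd ** 2 - md ** 2, 0.0))

    r = np.corrcoef(sat, insitu)[0, 1]    # Pearson correlation coefficient

    # Assumed convention: R2 reported as the squared Pearson correlation.
    return {"MD": md, "MAE": mae, "RMSD": rmsd,
            "ubRMSE": ubrmse, "R": r, "R2": r ** 2}


if __name__ == "__main__":
    # Synthetic values for illustration only, not thesis data.
    station = [0.05, 0.07, 0.12, 0.09, 0.06]
    satellite = [0.07, 0.09, 0.14, 0.11, 0.08]
    print(validation_metrics(satellite, station))
```

Separating Mean Deviation from ubRMSE in this way distinguishes a constant offset between the satellite product and the station from the random error that remains once the offset is removed.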
Results show that at representative ISMN stations (USCRN-Mercury-3-SSW and DAHRA-DAHRA), SMAP systematically outperforms SAR4SM, with lower ubRMSE and higher R and R², indicating superior skill in reproducing in situ soil moisture dynamics. Over the full arid station ensemble, SMAP performance is strongly modulated by Köppen-Geiger climate class, elevation, and USDA soil texture: hot desert (BWh) and hot steppe (BSh) climates, moderate elevations (1080-1420 m), and loam and sand textures provide the most favorable combination of low random error and robust temporal correlation, whereas cold steppe (BSk), high elevations, and sandy loam show degraded performance. In the leak detection application, the infrastructure-only Gradient Boosting model correctly detected 104 of 139 independent leak events (74.8%), while fusion with the SMAP-based Random Forest increased correct detections to 109 of 139 events (78.4%), indicating the added value of satellite-derived soil moisture dynamics for leak identification. Overall, the study provides an assessment of advanced SSM products in arid regions and demonstrates their practical potential to support proactive leak detection and water-loss management in water-stressed environments.
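The abstract does not state which rule was used for the decision-level fusion, so the sketch below shows only one common option: a weighted average of the leak probabilities produced by the two classifiers, followed by a threshold. The names `fuse_decisions`, `w_infra`, and `threshold` are hypothetical, and the probabilities in the example are made up. The helper `detection_rate` simply reproduces the arithmetic behind the reported percentages.

```python
import numpy as np


def fuse_decisions(p_infra, p_smap, w_infra=0.5, threshold=0.5):
    """Fuse two classifiers at the decision level (illustrative rule only).

    p_infra: leak probabilities from the infrastructure-based model.
    p_smap:  leak probabilities from the SMAP-trend model.
    w_infra: assumed weight of the infrastructure model (0..1).
    Returns the fused probabilities and the boolean leak flags.
    """
    p_infra = np.asarray(p_infra, dtype=float)
    p_smap = np.asarray(p_smap, dtype=float)

    # Weighted average of the two model outputs.
    fused = w_infra * p_infra + (1.0 - w_infra) * p_smap
    return fused, fused >= threshold


def detection_rate(detected, total):
    """Percentage of leak events correctly detected."""
    return 100.0 * detected / total


if __name__ == "__main__":
    # Made-up probabilities for three events, not thesis data.
    fused, flags = fuse_decisions([0.4, 0.9, 0.2], [0.8, 0.2, 0.3])
    print(fused, flags)

    # Counts taken from the abstract.
    print(round(detection_rate(104, 139), 1))  # infrastructure-only model
    print(round(detection_rate(109, 139), 1))  # fused model
```

Under this rule an event that the infrastructure model scores just below the threshold can still be flagged when the soil moisture model is confident, which is one way a fused model could recover additional events such as the 5 extra detections reported above.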
Included in
EVALUATION OF REMOTE SENSING SOIL MOISTURE DATASETS IN ARID REGIONS AND THEIR PROSPECT FOR LEAK DETECTION IN WATER DISTRIBUTION SYSTEMS