Authors: Mihirkumar B. Suthar
Journal Name: Environmental Reports; an International Journal
DOI: https://doi.org/10.51470/ER.2021.3.1.10
Keywords: energy, machine learning, transferability, techniques, vegetation fraction, built-up density, albedo, surface moisture
Abstract
The Urban Heat Island (UHI) effect, characterized by elevated urban temperatures relative to the surrounding areas, poses serious challenges to human health, energy demand, and urban sustainability. Remote sensing has become a key tool to map and monitor UHI via land surface temperature (LST) and land cover/land use variables. In recent years, machine learning (ML) techniques, including regression, random forests, neural networks, and support vector machines, have been increasingly integrated with remotely sensed data to improve prediction of UHI dynamics. This review synthesizes studies from 2015 to 2020 on methods, data sources, ML models, and predictors used in forecasting or modeling UHI. Key findings include: (i) remote sensing sensors such as Landsat, MODIS, and Sentinel, together with derived indices (NDVI, NDBI, albedo, sky view factor), are widely used; (ii) ML models often outperform classical linear regression when sufficient data are available; (iii) spatial and temporal resolution matter critically for predictive accuracy; (iv) the major predictors are impervious surface, vegetation fraction, built-up density, albedo, and surface moisture. Gaps include a lack of consistent cross-city or cross-climatic comparative studies, limited use of deep learning (especially convolutional neural networks) for spatial-temporal forecasting of UHI, and issues of generalization across sensors or regions. The review concludes with recommendations: more multi-sensor data fusion; standardization in ground-truthing; integration of climatic, morphological, and socio-economic features; development of ML models with transferability; and a focus on predictive forecasting (not just retrospective modeling) for urban planning and mitigation.
1. Introduction
As urbanization accelerates globally, cities experience enhanced local temperatures compared to their rural surroundings—a phenomenon known as the Urban Heat Island (UHI). Elevated temperatures in urban zones degrade air quality, increase energy consumption (especially for cooling), exacerbate health risks (heat stress, morbidity), and amplify climate change effects. Mapping, monitoring, and predicting UHI dynamics are thus essential for informed urban planning and mitigation strategies.
Remote sensing has emerged as a powerful tool for capturing spatial and temporal LST, vegetation cover, surface albedo, built-up area, and other surface characteristics relevant to UHI (Rasul, Balzter, Smith, Remedios, Adamu, Sobrino, Srivanit, & Weng, 2017) [1]. Initially, many studies employed simpler statistical models (e.g. linear regression) relating LST to land cover indices. However, the increasing availability of high-resolution sensors (such as Landsat 8, Sentinel-2), improvements in computational capacity, and growth in availability of ancillary data (topography, urban morphology, socio-economic data) have opened space for more sophisticated predictive modeling using machine learning (ML).
Machine learning allows the modeling of complex, nonlinear relationships and interactions among variables that govern UHI, such as surface cover, material thermal properties, sky view factor, surface moisture, and built-environment geometry. Studies from 2015 to 2020 increasingly employ ML techniques such as Random Forest, Support Vector Machines (SVM), Artificial Neural Networks (ANN), and others to predict UHI metrics or their spatial-temporal variation (e.g., Taheri Otaghsara & Arefi, 2019) [2]. A case study of Cartago, Colombia, which applied ML regressions and Naïve Bayes (NB) classification to identify zones prone to extreme heat, also illustrates this trend; it was published in Remote Sensing in 2021, just outside the 2020 cutoff, but its methods were developed in the years immediately before [3].
This review aims to (i) map out the state of knowledge in integrating remote sensing with ML for UHI prediction; (ii) compare data sources, predictors, model types, spatial/temporal scales, and validation approaches; (iii) identify limitations and gaps; and (iv) suggest directions for future research to enhance predictive capacity and policy relevance. We focus on studies published roughly between 2015 and 2020, but also refer to seminal earlier works where relevant.
2. Data Sources and Predictors Used
Remote sensing data have become indispensable for studying Urban Heat Island (UHI) dynamics, providing spatially explicit measurements of land surface temperature (LST) and surface characteristics. Among the most widely used sensors are the Landsat series, particularly Landsat 7 and Landsat 8, which provide thermal and optical imagery at spatial resolutions suitable for neighborhood- to city-scale analysis. Landsat 8 offers ~30 m resolution for optical bands and ~100 m for thermal bands, allowing detailed mapping of built-up areas, vegetation, and surface temperatures [1][2]. MODIS, while coarser in spatial resolution, provides high temporal resolution, making it ideal for regional-scale monitoring and long-term trend analysis [3]. Sentinel-2 and other recent optical sensors are increasingly used for vegetation and built-up mapping, often in combination with thermal datasets from other sources, as Sentinel-2 lacks a thermal infrared band [4]. Key derived indices, such as the Normalized Difference Vegetation Index (NDVI), Normalized Difference Built-up Index (NDBI), albedo, Sky View Factor (SVF), fractional vegetation cover, and the Normalized Difference Water Index (NDWI), are frequently employed as strong predictors of UHI intensity [2][5].
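The derived indices listed above are simple band ratios, and a short sketch may make their construction concrete. The example below assumes surface reflectance arrays for the green, red, near-infrared, and first shortwave-infrared bands (for Landsat 8 OLI these would be bands 3, 4, 5, and 6). It uses the common normalized-difference definitions of NDVI, NDBI, and NDWI (the green/NIR form); the function names and the toy reflectance values are illustrative and are not taken from any reviewed study.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def spectral_indices(green, red, nir, swir1):
    """Compute three indices commonly used as UHI predictors."""
    return {
        "NDVI": normalized_difference(nir, red),    # vegetation
        "NDBI": normalized_difference(swir1, nir),  # built-up surfaces
        "NDWI": normalized_difference(green, nir),  # open water (green/NIR form)
    }

# Toy reflectances (hypothetical): vegetated, built-up, water, and no-data pixels.
green = np.array([0.08, 0.12, 0.06, 0.0])
red = np.array([0.05, 0.15, 0.04, 0.0])
nir = np.array([0.45, 0.20, 0.02, 0.0])
swir1 = np.array([0.20, 0.30, 0.01, 0.0])

idx = spectral_indices(green, red, nir, swir1)
for name, values in idx.items():
    print(name, np.round(values, 3))
```

In this toy case the vegetated pixel has the highest NDVI, the built-up pixel the highest NDBI, and the water pixel the highest NDWI, which is the contrast that makes these indices useful as predictors.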
In addition to remote sensing-derived metrics, urban morphological and land cover characteristics play a significant role in determining local thermal patterns. Factors such as built-up density, impervious surface percentage, and roof material properties directly influence surface heat retention and radiation balance [1][5]. The geometry of the built environment, often expressed as SVF or the height-to-width ratio of street canyons, affects radiation trapping and ventilation, thereby modulating local UHI intensity [2][6]. Surface and soil moisture further influence heat dissipation through evapotranspiration, while vegetation cover or fractional vegetation mitigates urban warming by providing shading and cooling effects [3][5]. Surface albedo and emissivity also determine the amount of absorbed solar energy and radiative cooling capacity, making them important predictors in predictive UHI models [2][4]. Some studies incorporate socio-economic factors, such as population density and human activity, to account for anthropogenic heat sources [6][7]. Temporal variables, including seasonality, time of day, and diurnal versus nocturnal measurements, can influence UHI patterns but are less frequently included in machine learning-based predictive studies [3][5].
3. Machine Learning Models and Methodologies
The study of Urban Heat Island (UHI) dynamics has traditionally relied on statistical regression models, such as Multiple Linear Regression (MLR), to examine relationships between land surface temperature (LST) and potential predictors derived from remote sensing or urban morphological data [1][2]. These models provide a baseline for understanding linear associations and allow straightforward interpretation of coefficients. However, the urban thermal environment is inherently complex, shaped by nonlinear interactions among land cover, urban geometry, anthropogenic heat sources, and meteorological variables. As such, classical regression models often struggle to capture these intricate relationships, particularly when datasets include high-dimensional predictors or spatially heterogeneous information. To overcome these limitations, machine learning (ML) approaches have gained increasing popularity, demonstrating superior predictive performance, especially when rich datasets from multiple sensors, high-resolution imagery, and ancillary urban data are available [2][3].
Among ML approaches, Random Forest Regression (RFR) has emerged as one of the most widely adopted techniques for UHI modeling. Its robustness to multicollinearity, capacity to handle large numbers of predictor variables, and inherent ability to assess variable importance make it particularly suitable for urban thermal studies [3][4]. Support Vector Regression (SVR) and Artificial Neural Networks (ANNs) have also been applied successfully, offering flexible frameworks to model nonlinear relationships and complex interactions between predictors and surface temperature. For example, Taheri Otaghsara and Arefi (2019) employed Partial Least Squares (PLS) regression integrating remote sensing-derived indices and urban morphological parameters, demonstrating that features such as the Sky View Factor (SVF), vegetation fraction, and impervious surface coverage strongly influence the spatial distribution of UHI intensity [2]. Such findings underscore the necessity of using techniques capable of capturing the multidimensional nature of urban thermal patterns.
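To illustrate how Random Forest Regression yields variable importance, the following minimal sketch fits scikit-learn's `RandomForestRegressor` to synthetic data. Everything about the data is an assumption made for illustration: the predictor names, the functional form linking them to LST, and the noise level are invented and do not reproduce any study cited here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600

# Synthetic predictors (hypothetical): vegetation index, impervious fraction,
# and one predictor that has no relationship to temperature.
ndvi = rng.uniform(0.0, 0.8, n)
impervious = rng.uniform(0.0, 1.0, n)
unrelated = rng.uniform(0.0, 1.0, n)

# Assumed relationship: cooling by vegetation, nonlinear warming by
# impervious cover, plus measurement noise (degrees Celsius).
lst = 30.0 - 10.0 * ndvi + 8.0 * impervious**2 + rng.normal(0.0, 0.5, n)

X = np.column_stack([ndvi, impervious, unrelated])
names = ["ndvi", "impervious", "unrelated"]

X_train, X_test, y_train, y_test = train_test_split(
    X, lst, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)  # R^2 on the held-out test set
importances = dict(zip(names, model.feature_importances_))

print(f"held-out R^2: {r2:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:>10s}: {value:.3f}")
```

The impurity-based importances used here are only one option; they can be biased toward high-cardinality or correlated predictors, so permutation importance on held-out data is often reported alongside them.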
Neural network models, including shallow ANNs, have been increasingly used to model UHI, though challenges such as parameter tuning, overfitting, and limited data availability often constrain their broader application [3][5]. While deep learning architectures, such as convolutional neural networks (CNNs), offer potential advantages for spatially explicit UHI prediction, their adoption remained limited through 2020 because of the scarcity of large, high-resolution labeled datasets and their computational requirements. In addition to regression-based approaches, classification algorithms, such as Naïve Bayes and decision tree classifiers, have been employed to categorize urban zones according to heat risk levels rather than to predict exact LST values [4]. For instance, in the Cartago, Colombia case, Naïve Bayes classifiers successfully identified extreme temperature-prone areas using a combination of Landsat- and Sentinel-derived indices, vegetation metrics, and urban morphology, demonstrating the utility of probabilistic classification in UHI risk mapping [4].
Spatial and temporal scales are critical considerations in the application of ML models to UHI studies. Spatial resolution varies substantially among remote sensing datasets, with thermal sensors such as Landsat 8 providing ~100 m resolution for thermal bands, while optical bands offer 30 m resolution [1][2]. Finer-resolution built-up indices, SVF maps, and morphological datasets improve the spatial fidelity of UHI predictions, allowing neighborhood-level assessment and targeted mitigation strategies. Temporal resolution also influences model performance; while MODIS and other sensors provide frequent observations suitable for regional trend analysis, many ML studies are retrospective, analyzing past trends in LST and land cover changes rather than forecasting future UHI dynamics under anticipated urban expansion or climate change scenarios [3-15]. As predictive modeling becomes increasingly relevant for urban planning, there is a growing need for ML frameworks capable of integrating historical data with projected land-use and climatic scenarios.
Model validation is another critical aspect of ML-based UHI studies. Robust validation approaches, including k-fold cross-validation, hold-out test sets, and comparisons with independent ground-based temperature measurements, are essential to assess predictive accuracy and generalizability [2][3]. Common error metrics such as Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and coefficient of determination (R²) are routinely reported, providing standardized benchmarks for model performance. Despite these advances, challenges remain in ensuring model transferability across cities and climatic zones, as ML models trained in one urban environment often exhibit reduced performance when applied elsewhere due to differences in urban form, vegetation characteristics, and socio-economic conditions [5][6]. Addressing these limitations requires the development of scalable, adaptable ML frameworks capable of learning spatial patterns from diverse datasets and generalizing to previously unseen urban contexts.
In summary, the integration of machine learning with remote sensing has significantly advanced the predictive modeling of UHI dynamics. Techniques such as Random Forest, SVR, and ANN allow researchers to capture complex, nonlinear interactions among multiple predictors, outperforming classical regression approaches in both accuracy and explanatory power. However, challenges remain in terms of temporal prediction, spatial generalizability, deep learning adoption, and robust validation. Future research should focus on incorporating multi-sensor datasets, urban morphological and socio-economic variables, and high-resolution temporal observations to develop predictive ML models that are both accurate and transferable across urban environments [3][6][7]. Such advancements will provide essential tools for city planners and policymakers to mitigate UHI impacts and improve urban resilience in the face of rapid urbanization and climate change.
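The three error metrics named in the validation discussion of this section (RMSE, MAE, and R²) can be written in a few lines. The sketch below uses their standard definitions; the observed and predicted LST values are made up purely to show the arithmetic.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large errors more strongly."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))

def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the errors."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(diff)))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical observed and predicted LST values in degrees Celsius.
observed = [30.0, 32.0, 35.0, 31.0]
predicted = [31.0, 32.0, 33.0, 31.0]

print("RMSE:", round(rmse(observed, predicted), 3))       # sqrt(5/4)
print("MAE: ", round(mae(observed, predicted), 3))        # 3/4
print("R^2: ", round(r_squared(observed, predicted), 3))  # 1 - 5/14
```

Because RMSE squares the errors, the single 2 °C miss dominates it, while MAE weights all errors equally; reporting both gives a fuller picture of model error than either alone.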
5. Advantages, Limitations, and Gaps
The integration of machine learning (ML) with remote sensing for Urban Heat Island (UHI) analysis offers several notable advantages. ML models are particularly effective at handling complex nonlinear relationships and interactions among multiple predictors, which are common in urban environments where land cover, morphology, vegetation, and anthropogenic factors jointly influence surface temperature [1][2]. Unlike classical regression approaches, ML can incorporate large datasets with diverse predictors, including spectral indices, urban morphology metrics, and land cover variables, improving both predictive performance and the interpretability of variable importance [2][3]. When combined with remote sensing, which provides spatially explicit, repeatable, and wide-scale coverage, ML enhances the ability to model UHI across heterogeneous urban landscapes, capturing fine-scale variations that may otherwise be missed [1][4].
Several limitations constrain the current application of ML in UHI studies. A major challenge is the spatial resolution of thermal remote sensing data; thermal bands from sensors such as Landsat 8 (~100 m) often result in mixed-pixel issues, particularly in dense urban areas, reducing prediction accuracy at finer scales [1][5]. Temporal resolution is another constraint, as many satellites have revisit intervals of days to weeks, limiting the capacity to capture rapid changes, diurnal cycles, or short-term heat events [3][6]. Ground-truth and validation data are often scarce, with limited in situ temperature measurements to calibrate or validate models, and differences between land surface temperature (LST) and ambient air temperature can further complicate model interpretation [2][4]. Transferability of ML models is also an issue: models trained in one city or region frequently exhibit reduced performance when applied elsewhere due to differences in climate, urban morphology, land cover, and building materials [5][6]. In addition, data pre-processing steps—such as atmospheric correction, cloud masking, and emissivity estimation—can introduce errors, as demonstrated in the Cartago study, where different emissivity models led to considerable variation in model performance metrics [4].
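The sensitivity of LST to the emissivity estimate can be shown with a widely used single-channel correction, LST = BT / (1 + (λ · BT / ρ) · ln ε), where BT is the at-sensor brightness temperature in kelvin, λ is the band's effective wavelength, and ρ = hc/k_B ≈ 14388 µm·K. The sketch below assumes the approximate centre wavelength of Landsat 8 TIRS band 10 (10.895 µm) and arbitrary emissivity values; it is not the retrieval procedure of the Cartago study or of any other work reviewed here.

```python
import math

WAVELENGTH_UM = 10.895  # approx. centre wavelength of Landsat 8 TIRS band 10
RHO_UM_K = 14388.0      # h * c / k_B expressed in micrometre-kelvin

def lst_from_brightness_temp(bt_kelvin, emissivity):
    """Emissivity-corrected LST (K) from brightness temperature (K)."""
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must be in (0, 1]")
    correction = (WAVELENGTH_UM * bt_kelvin / RHO_UM_K) * math.log(emissivity)
    return bt_kelvin / (1.0 + correction)

bt = 300.0  # hypothetical brightness temperature in kelvin
lst_low_emis = lst_from_brightness_temp(bt, 0.97)   # illustrative emissivity
lst_high_emis = lst_from_brightness_temp(bt, 0.99)  # illustrative emissivity

print(f"LST at emissivity 0.97: {lst_low_emis:.2f} K")
print(f"LST at emissivity 0.99: {lst_high_emis:.2f} K")
print(f"difference: {lst_low_emis - lst_high_emis:.2f} K")
```

Under these assumptions, a 0.02 change in emissivity shifts the retrieved LST by more than 1 K, which is comparable to the error levels ML models are often evaluated against.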
Several research gaps have been identified in the literature between 2015 and 2020. Few studies have explored deep learning architectures, such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs), for spatial-temporal forecasting of UHI, despite their potential to capture complex spatial patterns and temporal dynamics [2][3]. Comparative studies across diverse climates (arid, humid, tropical, temperate) are limited, as most analyses focus on single cities, reducing the generalizability of findings [1][6]. Similarly, integration of socio-economic, demographic, and anthropogenic activity data into predictive ML models remains scarce, even though such factors are critical drivers of urban heat [5][6]. Night-time UHI, which can have distinct dynamics compared to day-time UHI, is also underrepresented, with most studies focusing on daytime LST [3][5]. Finally, scenario modeling and predictive forecasting of UHI under future urban expansion, land-use change, or greening interventions are still emerging areas, highlighting the need for models that can not only explain historical patterns but also anticipate future heat risks [2][6]. Addressing these limitations will enhance the utility of ML-remote sensing frameworks for urban planners and policymakers seeking to mitigate UHI effects effectively.
6. Future Directions and Recommendations
Based on the recent literature, several avenues emerge for advancing the predictive modeling of Urban Heat Island (UHI) dynamics using remote sensing and machine learning. First, multi-sensor and data fusion approaches should be prioritized. Integrating thermal imagery from sensors such as Landsat TIRS and ECOSTRESS with optical datasets (e.g., Sentinel series), LiDAR or DEM for urban morphology, and microwave or hyperspectral data can improve emissivity estimation and provide a more comprehensive representation of the urban environment [1][2]. Second, deep learning techniques hold considerable promise for capturing complex spatial-temporal patterns in UHI. Convolutional neural networks (CNNs) can extract fine-scale spatial features, while recurrent or temporal convolution networks can model UHI variations over time; combining these with graph-based representations of urban networks may further enhance predictive capability [3][4].
Third, transfer learning and model generalization should be a focus to improve the applicability of ML models across diverse urban contexts. Developing models that can be adapted to new cities or climates with minimal retraining, supported by global datasets, would enhance both scalability and policy relevance [5][6]. Fourth, the inclusion of additional predictors is necessary to better capture the drivers of UHI. Detailed urban morphology metrics (e.g., building height, street canyon geometry, roof materials), human activity indicators (traffic, energy consumption), socio-economic factors, and green infrastructure should be integrated into predictive frameworks to improve accuracy and interpretability [1][5].
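One simple way to measure the cross-city generalization discussed above is leave-one-city-out evaluation: train on all cities but one, test on the held-out city, and repeat. The sketch below only generates the index splits; the city labels are hypothetical, and any regression model could be trained and scored inside the loop.

```python
def leave_one_city_out(city_labels):
    """Yield (held_out_city, train_indices, test_indices) for each city.

    Samples from the held-out city never appear in the training indices,
    so the test score reflects transfer to an unseen city.
    """
    for city in sorted(set(city_labels)):
        test_idx = [i for i, c in enumerate(city_labels) if c == city]
        train_idx = [i for i, c in enumerate(city_labels) if c != city]
        yield city, train_idx, test_idx

# Hypothetical city label for each sample (e.g. each pixel or grid cell).
labels = ["A", "A", "B", "C", "B", "C"]

splits = list(leave_one_city_out(labels))
for city, train_idx, test_idx in splits:
    print(f"hold out {city}: train on {train_idx}, test on {test_idx}")
```

Comparing these held-out-city scores with ordinary random k-fold scores from the same data gives a direct estimate of how much performance is lost when a model is moved to a new urban context.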
Fifth, improved validation and ground-truthing are critical for reliable predictions. Combining remote sensing LST with in situ air temperature measurements, instrumented urban sensors, and frequent temporal sampling can strengthen model calibration and performance assessment [2][6]. Sixth, predictive scenario modeling should be expanded to anticipate how UHI may evolve under urban expansion, land-use change, climate change, or greening interventions, integrating ML models with urban planning tools to inform mitigation strategies [3][4]. Finally, ensuring policy relevance and accessibility is essential. Models and maps must be interpretable and actionable for urban planners, incorporating clear UHI intensity metrics, thresholds for heat risk, and guidelines for mitigation, so that ML outputs can directly support evidence-based decision-making and urban resilience planning [5][6].
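As an example of a clear UHI intensity metric, surface UHI intensity is often summarized as the difference between mean urban and mean rural LST. The sketch below assumes an LST grid and a boolean urban mask are already available; how the urban and rural reference areas are delineated is itself a methodological choice that this example does not address, and the grid values are hypothetical.

```python
import numpy as np

def suhi_intensity(lst, urban_mask):
    """Mean urban LST minus mean rural LST, ignoring NaN (e.g. cloudy) pixels."""
    lst = np.asarray(lst, dtype=float)
    urban_mask = np.asarray(urban_mask, dtype=bool)
    valid = ~np.isnan(lst)
    urban = lst[urban_mask & valid]
    rural = lst[~urban_mask & valid]
    if urban.size == 0 or rural.size == 0:
        raise ValueError("need at least one valid urban and one valid rural pixel")
    return float(urban.mean() - rural.mean())

# Hypothetical 2 x 3 LST grid in degrees Celsius; NaN marks a masked pixel.
lst_grid = np.array([[36.0, 35.0, np.nan],
                     [30.0, 31.0, 29.0]])
urban = np.array([[True, True, True],
                  [False, False, False]])

intensity = suhi_intensity(lst_grid, urban)
print(f"SUHI intensity: {intensity:.1f} degrees C")
```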
7. Conclusion
The integration of remote sensing and machine learning represents a significant advancement in understanding, monitoring, and predicting Urban Heat Island (UHI) dynamics. Between 2015 and 2020, research has demonstrated notable progress in the utilization of diverse remote sensing platforms, including Landsat, MODIS, and Sentinel series, combined with derived indices such as NDVI, NDBI, albedo, and Sky View Factor, to capture urban thermal variability. Concurrently, machine learning approaches, particularly Random Forest Regression, Support Vector Regression, Artificial Neural Networks, and classification algorithms, have shown superior ability to model complex, nonlinear relationships between urban morphology, land cover, anthropogenic factors, and surface temperature. These methods have enabled both finer spatial resolution analyses and predictive modeling of UHI intensity, providing actionable insights for urban planning and mitigation strategies [1][2][3].
Nevertheless, several challenges persist. Limitations in spatial and temporal resolution of remote sensing data, scarcity of high-quality ground-truth measurements, and the difficulty of generalizing models across diverse cities and climates constrain predictive accuracy. Furthermore, aspects such as night-time UHI, human activity patterns, and scenario-based forecasting remain underrepresented in current studies. Addressing these gaps through multi-sensor data fusion, deep learning, improved validation protocols, and inclusion of socio-economic and morphological predictors will enhance model reliability. Ultimately, integrating these advances will support evidence-based interventions, enabling cities to implement effective heat mitigation strategies and improve urban resilience in the face of rapid urbanization and climate change.
References
- Yao, Y., Chang, C., Ndayisaba, F., & Wang, S. (2020). A new approach for surface urban heat island monitoring based on machine learning algorithm and spatiotemporal fusion model. IEEE Access, 8, 164268-164281.
- Yoo, S. (2018). Investigating important urban characteristics in the formation of urban heat islands: A machine learning approach. Journal of Big Data, 5(1), 2.
- Nadizadeh Shorabeh, S., Hamzeh, S., Zanganeh Shahraki, S., Firozjaei, M. K., & Jokar Arsanjani, J. (2020). Modelling the intensity of surface urban heat island and predicting the emerging patterns: Landsat multi-temporal images and Tehran as case study. International Journal of Remote Sensing, 41(19), 7400-7426.
- Voelkel, J., & Shandas, V. (2017). Towards systematic prediction of urban heat islands: Grounding measurements, assessing modeling techniques. Climate, 5(2), 41.
- Oh, J. W., Ngarambe, J., Duhirwe, P. N., Yun, G. Y., & Santamouris, M. (2020). Using deep-learning to forecast the magnitude and characteristics of urban heat island in Seoul Korea. Scientific Reports, 10(1), 3559.
- Peng, F., Wong, M. S., Nichol, J. E., & Chan, P. W. (2016). Historical GIS data and changes in urban morphological parameters for the analysis of urban heat islands in Hong Kong. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 41, 55-62.
- Agathangelidis, I., Cartalis, C., & Santamouris, M. (2019). Integrating urban form, function, and energy fluxes in a heat exposure indicator in view of intra-urban Heat Island assessment and climate change adaptation. Climate, 7(6), 75.
- Xu, J., Zhang, F., Jiang, H., Hu, H., Zhong, K., Jing, W., … & Jia, B. (2020). Downscaling ASTER land surface temperature over urban areas with machine learning-based area-to-point regression Kriging. Remote Sensing, 12(7), 1082.
- Sun, Y., Gao, C., Li, J., Wang, R., & Liu, J. (2019). Evaluating urban heat island intensity and its associated determinants of towns and cities continuum in the Yangtze River Delta Urban Agglomerations. Sustainable Cities and Society, 50, 101659.
- Zhao, C., Jensen, J., Weng, Q., & Weaver, R. (2018). A geographically weighted regression analysis of the underlying factors related to the surface urban heat island phenomenon. Remote Sensing, 10(9), 1428.
- Lu, Y., Wu, P., Ma, X., Yang, H., & Wu, Y. (2020). Monitoring seasonal and diurnal surface urban heat islands variations using Landsat-scale data in Hefei, China, 2000–2017. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 13, 6410-6423.
- El Mendili, L., Puissant, A., Chougrad, M., & Sebari, I. (2020). Towards a multi-temporal deep learning approach for mapping urban fabric using Sentinel 2 images. Remote Sensing, 12(3), 423.
- Mostofi, N., Aghamohammadi Zanjiirabad, H., Vafaeinezhad, A., Ramezani, M., & Hemmasi, A. H. (2020). A novel method for optimal selection of land cover indices and urban heat islands determination using remote sensing data. Scientific-Research Quarterly of Geographical Data (SEPEHR), 29(113), 57-72.
- Wang, C., & Chang, H. T. (2020). Hotspots, heat vulnerability and urban heat islands: An interdisciplinary review of research methodologies. Canadian Journal of Remote Sensing, 46(5), 532-551.
- Verma, R., & Garg, P. K. (2019). Remote sensing based building indexing approach in relation to urban heat island. In Proceedings of the 53rd International Conference of the Architectural Science Association, Roorkee, India (pp. 666-674).
- Venter, Z. S., Brousse, O., Esau, I., & Meier, F. (2020). Hyperlocal mapping of urban air temperature using remote sensing and crowdsourced weather data. Remote Sensing of Environment, 242, 111791.