Mapping fire hazard potential in Kazakhstan: a machine learning and remote sensing perspective

Daniker Chepashev; Serik Nurakynov; Divyansh Sharma; Nurmakhambet Sydyk; Gulzhiyan Kabdulova

doi:10.1071/WF24232

RESEARCH ARTICLE (Open Access)

Vol 34(9)

Daniker Chepashev ^A , Serik Nurakynov ^A ^* , Divyansh Sharma ^B , Nurmakhambet Sydyk ^A and Gulzhiyan Kabdulova ^A

^A Laboratory of Space Monitoring of Emergencies, Institute of Ionosphere, Almaty 050000, Kazakhstan.

^B Department of Sustainable Engineering, TERI School of Advanced Studies, New Delhi 110 070, India.

^* Correspondence to: snurakynov@ionos.kz

International Journal of Wildland Fire 34, WF24232 https://doi.org/10.1071/WF24232

Submitted: 2 January 2025 Accepted: 22 July 2025 Published: 20 August 2025

© 2025 The Author(s) (or their employer(s)). Published by CSIRO Publishing on behalf of IAWF. This is an open access article distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND).

Abstract

Background

Wildfires pose significant environmental challenges in Kazakhstan, exacerbated by climate change and human activities. With vast landscapes ranging from grasslands to dense forests, the country is highly vulnerable to wildfires, yet lacks comprehensive predictive tools for fire risk assessment.

Aims

This study aims to develop a wildfire hazard map for Kazakhstan using the Maximum Entropy (MaxEnt) model, integrating environmental variables processed via Google Earth Engine.

Methods

The MaxEnt model was applied using historical fire occurrence data from MODIS (Moderate Resolution Imaging Spectroradiometer) and VIIRS (Visible Infrared Imaging Radiometer Suite), combined with environmental predictors like climate, topography and vegetation. Key predictors were statistically analyzed for relevance, ensuring the model’s robustness. The output was validated using independent fire data.

Key results

The model achieved an area under the curve score of 0.79, an accuracy of 72% and recall of 71%. The resulting map delineates fire risk zones, identifying high-risk areas, predominantly in forested and steppe regions.

Conclusions

The study highlights the efficacy of the MaxEnt model in wildfire risk prediction, underscoring its potential for application in other regions.

Implications

The map provides critical insights for resource allocation, fire management strategies and policymaking. Continuous model refinement and integration of real-time data are recommended to enhance predictive accuracy and adaptability.

Keywords: Geographic Information System (GIS), Google Earth Engine (GEE), Kazakhstan, machine learning, maximum entropy, MaxEnt, remote sensing, wildfire risk.

Introduction

Forest fires are escalating into a critical global environmental challenge, garnering increasing concern owing to their profound impacts on ecosystems, human settlements and the global climate system (Jones et al. 2022; Sagar et al. 2024). The advent of Earth observation data and geospatial techniques has revolutionized the way wildfires are studied, providing a global perspective and enabling the characterization of fire regimes through active fire hot spots and burn area products (Reddy and Sarika 2022; Babu et al. 2023; Surbhi Singh and Jeganathan 2024). These technological advancements allow researchers to derive crucial attributes needed to characterize fire regimes, such as the size of the fire patch, extent of burned area, recurrence, and fire intensity and severity, thus facilitating better land use planning and mitigation strategies. Modern remote sensing tools, such as NASA’s Moderate Resolution Imaging Spectroradiometer (MODIS) and the Visible Infrared Imaging Radiometer Suite (VIIRS) on NOAA-20 satellites, have revolutionized our ability to monitor wildfires in real time (Guo et al. 2017; Reddy and Sarika 2022).

Alongside the revolutionary capabilities of remote sensing, substantial progress has been made in the field of data analytics through the implementation of machine learning and deep learning techniques. These techniques are essential for improving the precision of forest fire detection systems by enabling the early detection and management of wildfires, which reduces their detrimental effects on infrastructure, ecosystems and human lives (Quintano et al. 2018; Babu et al. 2023). Machine learning algorithms offer a sophisticated approach to analyze data from a variety of sources, which enhances our ability to understand and manage the complex dynamics of wildfires (Tariq et al. 2021; Surbhi Singh and Jeganathan 2024). Various studies have utilized and continuously refined machine learning methodologies to predict, manage and detect wildfires, demonstrating the increasing utility of these techniques in environmental monitoring and disaster management (Sayad et al. 2019; Guede-Fernández et al. 2021; Carta et al. 2023). These techniques excel in processing complex, multi-dimensional data, enabling researchers to identify subtle patterns that indicate susceptibility to wildfires. Despite their powerful capabilities, these models face specific challenges that can affect their performance. Traditional statistical models, though foundational, often struggle with the non-linear nature of environmental data, as evidenced in studies like that of Oliveira et al. (2012), where machine-learning approaches, particularly Random Forests (RFs), proved superior in predicting fire occurrences in Mediterranean Europe owing to their ability to handle complex interactions between variables. Machine learning models such as RFs and Support Vector Machines (SVMs) improve on the limitations of statistical models by managing non-linear relationships more effectively (Chicas et al. 2022). 
They integrate diverse data types, from climate variables to topographical features, to predict wildfire occurrences with greater accuracy. A notable application is Rodrigues and De la Riva (2014), who used these models to assess human-caused wildfires in Spain, revealing the potential of machine learning to inform fire management strategies. Although deep learning offers further advancements through multi-layered neural networks that can learn from extensive datasets, its application requires considerable computational resources, which may not be feasible in all research settings (Jiao et al. 2019). The work of Zhang et al. (2019), who applied Deep Convolutional Neural Networks to analyze wildfire characteristics, exemplifies the potential of deep learning but also highlights its demands for extensive data and computing power. Thus, the practical application of these advanced techniques must consider both the availability of technological resources and local conditions to ensure accessibility and feasibility (Mishra et al. 2023).

Among the various data-driven models used for assessing wildfire risks, the Maximum Entropy (MaxEnt) model stands out for its unique capabilities and advantages (Paudel et al. 2024). Ebrahimy et al. (2017) successfully utilized the MaxEnt model to analyze the contribution and importance of various environmental factors such as meteorological conditions and vegetation types in predicting forest fire occurrences. Their study highlighted the model’s ability to discern non-linear interactions between these factors, providing insights that are crucial for effective fire management. Moreover, the MaxEnt model is capable of integrating diverse types of data, from climatic conditions to topographic and human factors, thereby creating a comprehensive model of fire susceptibility (Thomason 2015). This flexibility makes it an invaluable tool in regions where fire occurrence data are sparse or in new or changing landscapes. The model’s predictive accuracy and reliability have been validated in numerous studies, demonstrating its superiority over other models in scenarios where traditional predictive modeling techniques may fall short (Mishra et al. 2023; Paudel et al. 2024). The strength of the MaxEnt model also lies in its predictive precision, which is essential for crafting effective fire management strategies that can mitigate potential damage and protect ecosystems. By providing detailed probability distributions of fire occurrences, MaxEnt helps policymakers and environmental managers to allocate resources more efficiently and plan more effective interventions (Yang et al. 2021). In summary, the MaxEnt model’s ability to handle incomplete data, along with its flexibility and predictive accuracy, makes it an excellent choice for modeling wildfire risks. Its application extends beyond simple risk assessments, offering the potential for development into comprehensive early warning systems and enhancing our overall understanding of wildfire dynamics (Javidan et al. 2021).

The issue of forest fires in Kazakhstan is particularly pressing owing to its distinct geographic and climatic conditions (Babu et al. 2019). Despite forests covering only 5% of its land mass (123,345 km²), the country faces severe wildfire challenges owing to its expansive grasslands and steppes, which constitute 70% of the landscape. Annually, over 18,000 fire incidents are reported, predominantly in rural regions such as Karaganda, Kostanay, Aktobe and North Kazakhstan (Merekeyev and Nurakynov 2022; Shinkarenko et al. 2023). These areas are characterized by flat or gently sloping terrains with sparse river systems, enabling rapid fire spread during dry seasons. Furthermore, recent trends indicate a worrying escalation in fire incidents in Kazakhstan. According to Global Forest Watch (2024), Kazakhstan lost approximately 32.7 kha of tree cover owing to fires between 2001 and 2023, with 2023 marking the highest loss of 14.0 kha – accounting for 70% of all tree cover loss that year. These data underscore the increasing intensity of fire seasons and the pressing need for robust predictive tools and management strategies.

Moreover, the increased fire activity is particularly concerning given the country’s limited resources and infrastructural readiness to manage and mitigate the effects of such fires (Lednev et al. 2021). Historical and ongoing management practices, or the lack thereof, contribute to the high vulnerability of these regions. The existing strategies for forest and fire management in Kazakhstan have been inadequate in addressing the complex dynamics of fire risk, further necessitating the development of more sophisticated and localized approaches to forest fire management. In Kazakhstan and the broader Central Asian region, climatic factors play a significant role in shaping the wildfire regime. The region is experiencing a faster rate of warming compared with the global average, with temperature increases significantly affecting the aridity and moisture levels of vegetation, thus altering fire dynamics. These climatic conditions, coupled with human factors such as land use changes, agricultural practices and urban expansion, create a complex matrix of risk factors that influence the frequency, intensity and geographical spread of forest fires (Zong et al. 2020).

Despite Kazakhstan’s significant exposure to wildfires, there is a noticeable gap in comprehensive wildfire risk mapping using advanced predictive models. Previous studies, such as the development of the Forest Fire Danger Index using geospatial techniques by Babu et al. (2019), primarily utilized MODIS data to create dynamic and static forest fire probability indices. Although these efforts represent valuable steps forward, they primarily focus on forested areas and often rely on indices that may not fully capture the nuanced interactions between various fire influencing factors. The study of the history of forest fires in the Burabai Region of Kazakhstan by Mazarzhanova et al. (2017) highlights the historical context and recurrence of fires using dendrochronological data. This study presents a deeper, historically rooted understanding of fire regimes but remains limited to localized tree-ring data, which although insightful, does not provide the immediate, spatially extensive risk assessment required for timely fire management. Further, Spivak et al. (2012) discuss the operational fire space monitoring system in Kazakhstan, which leverages remote sensing for real-time monitoring of fire areas and burnt areas. Whereas these systems are crucial for operational management, they do not necessarily provide predictive capabilities that can anticipate fire occurrences based on risk assessments derived from environmental and climatic data.

The current study aims to address these gaps by developing a static fire danger assessment map for Kazakhstan using the MaxEnt model. This model has been chosen for its robust capability to handle incomplete data and its ability to integrate various types of input variables (Javidan et al. 2021; Yang et al. 2021). Unlike previous efforts, this study utilizes a presence–absence testing framework based on historical hot spot data, which allows for a more detailed and nuanced understanding of fire risk factors. The use of MaxEnt in a cloud-computing environment like Google Earth Engine is a further innovation providing scalable, efficient processing capabilities, enhancing the predictive accuracy and applicability of the resulting fire danger map. In addition to technological advancements, this study contributes a novel approach by incorporating a web-based Geographic Information System (GIS) for disseminating wildfire risk data, making the information readily accessible to forest managers and policymakers. This integration of predictive modeling with real-time data dissemination represents a significant step forward in wildfire management in Kazakhstan. This research not only fills a critical gap by providing a scientifically robust tool for wildfire risk assessment but also enhances the existing fire management infrastructure in Kazakhstan. By addressing both the spatial and temporal dimensions of fire risk and integrating them into an operational tool, this study paves the way for more proactive, strategic responses to wildfire threats in the region.

Study area

Kazakhstan, the largest country in Central Asia and the ninth largest in the world, spans an expansive area of 2,727,900 km². It is bordered by Russia to the north, China to the east, and Kyrgyzstan, Uzbekistan and Turkmenistan to the south. The country’s vast territory includes a wide range of geographic features, from the Caspian lowlands in the west to the Altai Mountains in the east (Fig. 1). Approximately one-third of Kazakhstan is covered by the Kazakh steppe, a vast plain that extends from the northern shores of the Caspian Sea to the foothills of the Altai Mountains. This region primarily consists of steppes, semi-deserts and deserts, transitioning into forest steppes in the north and northeast – areas that serve as a transitional belt to the dense forests of Siberia. Despite its extensive land area, only ~5% of the country, totaling 123,345 km², is covered by forests, which represent the main forest reserve of Kazakhstan. This varied landscape contributes to significant biomass diversity, which must be accounted for when assessing wildfire risks and managing fire-prone areas (Spivak et al. 2012; Babu et al. 2019; Merekeyev and Nurakynov 2022).

Fig. 1.

Kazakhstan national boundary overlaid with all MODIS (Terra, MOD14) and VIIRS (VFIRE 375) fire data collections recorded between 2015 and 2022.

Climatically, Kazakhstan experiences sharp continental influences. In the northern parts of the country, annual precipitation ranges between 250 and 350 mm, whereas in the southern regions, it drops to a scant 100–120 mm (Oladejo et al. 2023). Winter temperatures in January can average −15°C, with extreme lows reaching −40°C, illustrating the region’s harsh winter conditions. Conversely, summer months see temperatures soaring to 40°C, especially in the lowland steppes and desert areas, creating conditions that can exacerbate the frequency and intensity of wildfires (Spivak et al. 2012). According to the Köppen–Geiger climate classification system (Fig. 2), the northern and eastern regions of Kazakhstan are categorized under the Dfa (cold climate with hot summers) and Dfb (cold climate with warm summers) zones. In contrast, the southern and western parts of the country, characterized by their arid conditions, fall into the BSk (dry steppe) and BWk (desert) climate zones (Peel et al. 2007). These climatic zones play a crucial role in determining the patterns and behaviors of wildfires, influencing both the natural vegetation types and the seasonal risk factors associated with fire events.

Fig. 2.

Köppen–Geiger climate classification of Kazakhstan.

Along with natural factors, human activities significantly influence wildfire dynamics in Kazakhstan. In the northern forest–steppe zones, agricultural practices such as crop cultivation and seasonal burning for land clearance are common, often inadvertently escalating fire risk during dry periods (Khaidarov and Arkhipov 2001; Dara et al. 2020). The vast steppes and semi-arid regions are heavily utilized for pastoralism, with overgrazing reducing vegetation cover and increasing the accumulation of dry biomass, thereby enhancing flammability (Venkatesh et al. 2022). In southern and western Kazakhstan, industrial activities – particularly oil and gas extraction – introduce ignition sources through infrastructure such as pipelines and machinery, while urban expansion into wildland–urban interfaces further amplifies anthropogenic fire hazards (Merekeyev and Nurakynov 2022). These activities, combined with traditional practices like uncontrolled stubble burning, underscore the critical role of human behavior in shaping regional fire regimes.

Methodology

The methodology of this study (Fig. 3) comprises four principal steps for preparing, analyzing and modeling wildfire risk across Kazakhstan. First, fire points are prepared for modeling: historical fire occurrence data are compiled, checked for accuracy and formatted for the model. Second, all environmental layers, including vegetation, topography and climate data, are standardized in ArcGIS to the same geographic projection and cell size, so that the layers are spatially compatible. Table 1 provides an overview of the datasets employed in this analysis, specifying the source and spatial resolution for each. Third, a statistical analysis of the correlations among environmental variables identifies and removes redundant (multicollinear) predictors, which improves the stability of the model. Finally, the predictive model is constructed using the MaxEnt method, combining the prepared fire data with the screened environmental layers to predict areas at risk of wildfire.

Fig. 3.

Methodological workflow illustrating the principal stages of the study: (i) acquisition and preprocessing of various datasets required; (ii) compilation and harmonization of the predictor variables; (iii) multi-collinearity screening; (iv) MaxEnt modeling; and (v) generation of fire risk map.
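The multicollinearity screening stage of the workflow can be illustrated with a minimal sketch. It assumes a pairwise Pearson correlation test with a cut-off of |r| > 0.8 and a simple keep-the-first rule; the statistic, the threshold, the function names and the sample values are all hypothetical and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined for a constant layer (zero variance)

def drop_collinear(layers, threshold=0.8):
    """Keep a layer only if it is not strongly correlated with any layer already kept.

    `threshold` is an assumed cut-off, not a value reported by the study.
    """
    kept = []
    for name in layers:
        if all(abs(pearson(layers[name], layers[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical predictor values sampled at five locations.
samples = {
    "annual_mean_temp": [2.0, 4.0, 6.0, 8.0, 10.0],
    "max_temp_warmest_month": [21.0, 25.1, 28.9, 33.0, 37.2],
    "annual_precip": [300.0, 340.0, 250.0, 180.0, 280.0],
}
selected = drop_collinear(samples)
print(selected)
```

In this toy input the two temperature layers are almost perfectly correlated, so only the first is retained, while precipitation (r of roughly -0.53 with annual mean temperature) is kept. The result depends on the order in which layers are visited, which is one reason a real screening step would usually be guided by expert judgment as well.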

Table 1. Dataset overview: fire occurrence and environmental variables.

Category	Variable	Source	Spatial resolution (m)
Fire occurrence data	MODIS	MODIS (MOD14)	1000
Fire occurrence data	VIIRS	VFIRE 375	375
Climatic variables	Annual mean temperature	WorldClim	928
	Maximum temperature of warmest month
	Annual mean precipitation
	Precipitation of driest month
	Quarterly mean precipitation
	Wind exposure index
	Diurnal Anisotropic Heating
	Köppen climate classification
Topography	Elevation	SRTM DEM (Shuttle Radar Topography Mission Digital Elevation Model)	30
	Slope
	Aspect
	Topographic Wetness Index
Land cover and vegetation	Percentage tree cover layer	MODIS (MOD44B)	250
	Normalized Difference Vegetation Index	MODIS (MOD13Q1)	250
	Land cover (%)	ESA	10
Anthropogenic conditions	Distance to major roads	OpenStreetMap	–

Fire occurrence data

For this study, fire occurrence data play a pivotal role. We used MODIS (Terra, MOD14) and VIIRS (VFIRE 375) to acquire extensive historical fire data for the Kazakhstan region. These tools have revolutionized our ability to monitor wildfires in real time (Guo et al. 2017; Reddy and Sarika 2022). These technologies provide critical data on active fire locations, fire behavior and post-fire effects, aiding in immediate response and long-term recovery planning.

MODIS provides temporally extensive and consistent active fire data, offering daily imagery with a spatial resolution of 1 km. In addition, VIIRS is available at a higher spatial resolution of 375 m. The integration of these products leverages MODIS’ multi-decadal temporal depth alongside the finer spatial granularity provided by VIIRS, thereby enhancing the comprehensiveness and reliability of fire detections employed for model training. These datasets deliver information on thermal anomalies recorded up to eight times daily as vector points with precise coordinates and associated attributes. This high temporal frequency ensures detection of most natural fires occurring under cloudless conditions across Kazakhstan. However, thermal anomalies hidden beneath cloud cover remain undetected, thus creating potential data gaps.

To address inherent incompleteness due to cloud-obscured fires, the MaxEnt model was specifically chosen. MaxEnt is adept at handling incomplete datasets, operating on the principle that, given incomplete information, the best prediction is obtained by selecting a probability distribution with maximum entropy (Thomason 2015). Additionally, data integrity was further ensured by excluding thermal anomalies associated with anthropogenic sources, such as factories, oil flares and other infrastructure-related hotspots, to prevent these from being classified as natural fires. Furthermore, to enhance data quality, only fire points with a confidence level of 90% or above, recorded from 2015 to 2022, were selected. Duplicate records were removed, resulting in 60,283 unique fire occurrence points that served as presence points for training the MaxEnt model. This dataset is essential not only for modeling current fire risk but also for understanding historical fire patterns, which can inform future fire management strategies (Reddy and Sarika 2022).
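For reference, the maximum-entropy principle mentioned above can be written in its general textbook form. This is a sketch of the standard formulation, not the specific configuration used in this study; the symbols π (distribution over grid cells), f_j (environmental features), λ_j (feature weights) and m (number of presence points) are introduced here for illustration.

```latex
% Maximum-entropy estimate of a distribution \pi over the grid cells x of the study area X
\hat{\pi} = \arg\max_{\pi} H(\pi), \qquad H(\pi) = -\sum_{x \in X} \pi(x) \ln \pi(x)

% subject to matching the empirical mean of each feature f_j at the m presence points x_1, ..., x_m
\sum_{x \in X} \pi(x) f_j(x) = \frac{1}{m} \sum_{i=1}^{m} f_j(x_i) \quad (j = 1, \dots, n), \qquad \sum_{x \in X} \pi(x) = 1

% the solution has exponential (Gibbs) form with one weight \lambda_j per feature
\pi_{\lambda}(x) = \frac{\exp\left(\sum_{j=1}^{n} \lambda_j f_j(x)\right)}{\sum_{x' \in X} \exp\left(\sum_{j=1}^{n} \lambda_j f_j(x')\right)}
```

Practical implementations typically relax the equality constraints with a regularization term; the settings used in this study are not given in this section.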

By focusing on fire occurrences with a confidence level of 90% or above and removing any duplicative or anthropogenic disturbances, we ensure that our data are optimal for the subsequent modeling processes. This data-centric approach lays the groundwork for accurate and effective wildfire risk assessment, essential for developing targeted management strategies in the region.
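The selection rules described above (confidence of 90% or more, detections from 2015 to 2022, exclusion of industrial heat sources, removal of duplicates) could be sketched as follows. The record layout, the `industrial` flag and the definition of a duplicate (same rounded coordinates on the same date) are assumptions made for illustration; the study does not specify them here.

```python
from datetime import date

def select_presence_points(records, min_confidence=90,
                           start=date(2015, 1, 1), end=date(2022, 12, 31)):
    """Return high-confidence, in-period, non-industrial detections without duplicates."""
    seen = set()
    kept = []
    for rec in records:
        if rec["confidence"] < min_confidence:
            continue  # below the 90% confidence threshold
        if not (start <= rec["date"] <= end):
            continue  # outside the 2015-2022 study period
        if rec.get("industrial", False):
            continue  # hypothetical flag for factories, oil flares and similar sources
        # Assumed duplicate rule: same location (4 decimal places) on the same day.
        key = (round(rec["lat"], 4), round(rec["lon"], 4), rec["date"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept

# Hypothetical hotspot records.
hotspots = [
    {"lat": 50.1, "lon": 70.2, "date": date(2019, 7, 14), "confidence": 95},
    {"lat": 50.1, "lon": 70.2, "date": date(2019, 7, 14), "confidence": 97},  # duplicate
    {"lat": 47.5, "lon": 52.9, "date": date(2020, 8, 2), "confidence": 99, "industrial": True},
    {"lat": 51.3, "lon": 68.4, "date": date(2014, 6, 1), "confidence": 92},   # before 2015
    {"lat": 49.8, "lon": 73.1, "date": date(2021, 5, 9), "confidence": 60},   # low confidence
    {"lat": 52.2, "lon": 76.9, "date": date(2022, 9, 30), "confidence": 90},
]
presence = select_presence_points(hotspots)
print(len(presence))
```

Of the six sample records, only the first and the last pass every filter.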

Driving factors in wildfire risk assessment

The assessment of wildfire risk hinges on the ‘fire environment triangle’, which encapsulates the primary elements influencing fire ignition and spread: climatic conditions, topographic conditions and vegetation cover (Roy et al. 2012). Additionally, considering the significant impact of human activities on fire occurrence, our methodology also incorporates a layer indicating the proximity to major roads, which serves as a proxy for anthropogenic influences on fire risk.

Climatic conditions

Climatic factors are paramount in influencing the initiation, persistence and spread of wildfires. Kazakhstan has experienced marked climatic changes, particularly noticeable in the central region, with pronounced spring warming trends. Seasonal precipitation patterns have also shifted, with increased rainfall in the northern and northwestern parts during spring and summer, while central and southern areas have become drier. Notably, the western and northwestern regions endure drier summers, contrasting with wetter conditions in the central and northern areas, which has led to heightened aridity throughout most of the country since the early 21st century, apart from in the northern regions (Zheleznova et al. 2022).

The interplay between temperature variations, precipitation fluctuation and periods of drought creates a dynamic feedback loop that critically affects fuel availability and flammability. Annual and interannual fluctuations in temperature and precipitation directly affect evapotranspiration rates and soil moisture levels, which in turn modulate the moisture content of vegetation. For example, sustained periods of high temperatures combined with reduced precipitation accelerate the drying of fine fuels – such as grasses and shrubs – thereby increasing their susceptibility to ignition. Moreover, drought conditions not only lower fuel moisture but also promote the accumulation of dead biomass, further enhancing the flammability of the landscape (Abedi Gheshlaghi et al. 2021). Conversely, episodes of high precipitation coupled with cooler temperatures can stimulate vigorous vegetation growth, leading to an increase in biomass. Although this growth initially creates denser fuel loads, it is the transition from moist to dry conditions that ultimately renders the vegetation highly combustible (Babu et al. 2023). Importantly, the relationship between these climatic variables and fuel properties is non-linear; for instance, the drying of fuel under a brief but intense heatwave can be far more impactful than a gradual increase in temperature, owing to rapid changes in moisture content and chemical composition of the vegetation.

This study incorporates comprehensive climatic data (Fig. 4), including the Wind Exposure Index (WEI), Diurnal Anisotropic Heating (DAH), Köppen Climate Classification, mean annual temperature, maximum temperature of the warmest month, mean annual precipitation, precipitation of the driest month and quarterly precipitation. The temperature and precipitation layers and the Köppen Climate Classification were obtained from the WorldClim and Köppen datasets available in Google Earth Engine, at a consistent spatial resolution of 928 m. In contrast, the WEI and DAH layers were derived from digital elevation models (DEMs) using specialized tools within the SAGA-GIS software. This dual-source approach ensures that both the climatic variables and the terrain-derived exposure and heating indices are represented in the analysis.

Fig. 4.

Climatic predictors resampled to 1 km resolution for model input: (a) Wind Exposure Index (WEI); (b) Diurnal Anisotropic Heating (DAH); (c) mean annual temperature; (d) maximum temperature of warmest month; (e) mean annual precipitation; and (f) precipitation of driest month.

Topographic conditions

Topography plays a crucial role in wildfire dynamics, affecting the distribution and combustibility of fuel as well as local climatic conditions. Factors such as slope can significantly accelerate fire spread, with combustible materials more likely to move downhill rapidly. Surface features, including elevation, slope exposure and terrain complexity, directly influence fire behavior, with steeper slopes facilitating faster spread due to efficient convective preheating and contact point ignition (Guo et al. 2017). Vegetation on varied terrains reacts differently; for example, vegetation on smooth slopes is more prone to complete combustion compared with slopes with natural barriers like streambeds or rocks that can halt the progression of fires (Chuvieco and Congalton 1989). Additionally, altitude plays a notable role, with higher elevations typically experiencing more precipitation and thus fewer fires, as observed in areas above 2500 m above sea level. Consequently, topographic factors (Fig. 5) – slope, aspect and elevation – were included as essential components of this study. The Topographic Wetness Index (TWI), calculated to quantify topographic control on hydrological processes, was also included to assess potential soil moisture impacts on fuel availability and fire spread. This index and other topographic features were obtained from a dataset in Google Earth Engine with a spatial resolution of 30 m.

\mathrm{TWI} = \ln\left(\frac{\alpha}{\tan\beta}\right) \qquad (1)

where α is the upslope contributing area and β is the slope angle.
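As a small worked example of the TWI formula, the sketch below evaluates it for two hypothetical cells. The function name, the input values and the minimum-slope clamp (added so that perfectly flat cells do not cause a division by zero) are assumptions for illustration, not part of the study's workflow.

```python
import math

def topographic_wetness_index(contributing_area, slope_deg, min_slope_deg=0.1):
    """TWI = ln(alpha / tan(beta)) for one cell.

    contributing_area: upslope contributing area (alpha).
    slope_deg: slope angle (beta) in degrees.
    min_slope_deg: assumed lower bound on slope, avoiding tan(0) = 0.
    """
    slope = math.radians(max(slope_deg, min_slope_deg))
    return math.log(contributing_area / math.tan(slope))

# Hypothetical cells: a gently sloping valley floor and a steep ridge.
flat_valley = topographic_wetness_index(5000.0, 1.0)
steep_ridge = topographic_wetness_index(50.0, 30.0)
print(round(flat_valley, 2), round(steep_ridge, 2))
```

The valley cell, with a large contributing area and a gentle slope, receives a much higher index than the ridge cell, consistent with TWI being a proxy for where water, and therefore fuel moisture, tends to accumulate.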

Fig. 5.

Topographic predictors resampled to 1 km resolution for model input: (a) slope; (b) aspect; (c) elevation; and (d) Topographic Wetness Index (TWI).

Land cover and vegetation conditions

Vegetation characteristics, including the state of phytomass, are pivotal in determining fire behavior (Baeza et al. 2006). Factors such as the ratio of green-to-dead phytomass and the density of grass cover vary according to current and preceding weather conditions. During hot and dry periods, these indices typically fall below average, which raises fire risk; the exception is an area already cleared by previous fire activity, where the lack of fuel prevents new fires regardless of current conditions. Different vegetation types exhibit varied flammability; for instance, coniferous and deciduous trees generally present higher fire risk compared with wetlands or recently burned areas (Calviño-Cancela et al. 2016). The presence and density of vegetation significantly affect fire intensity and spread direction, influencing not only surface fires but also potential crown fires in forested areas.

Several remote sensing products are used in this study to capture these dynamics. A percentage tree cover layer, derived from MODIS data with a spatial resolution of 250 m, is included to quantify forest density and distribution. Additionally, the Normalized Difference Vegetation Index (NDVI), also retrieved from MODIS (250 m resolution), was employed to assess the health and vigor of vegetation across different regions. NDVI is particularly valuable for identifying live green vegetation, providing a consistent metric to gauge vegetative cover, which can influence fire susceptibility. The higher the NDVI value, the more potential fuel, which can contribute to higher fire intensity and faster spread, particularly in densely forested areas (Fig. 6).

\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} \qquad (2)

where NIR is the near-infrared band and Red is the red band.
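A minimal sketch of the NDVI calculation for single pixels is given below. The band values are hypothetical, and returning `None` when both bands are zero is an assumed convention for handling no-data pixels.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    Returns None when NIR + Red is zero (assumed no-data convention).
    """
    denom = nir + red
    if denom == 0:
        return None
    return (nir - red) / denom

# Hypothetical band values for two surface types.
dense_canopy = ndvi(0.50, 0.05)  # strong near-infrared response, low red
bare_soil = ndvi(0.30, 0.25)     # similar response in both bands
print(dense_canopy, bare_soil)
```

Green vegetation reflects strongly in the near infrared and absorbs red light, so the canopy pixel yields a value near 0.82, whereas the bare-soil pixel stays near 0.09.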

Moreover, a comprehensive land cover map from the European Space Agency (ESA) WorldCover project complements the NDVI and tree cover data by providing detailed classifications of land cover types across Kazakhstan. This map categorizes the landscape into various land cover types, such as trees, crops, rangeland, flooded vegetation, snow, bare ground, water and built areas, each with distinct fire behavior characteristics. Understanding these types helps in predicting how fires might ignite, spread and be managed considering the specific properties of each land cover type, such as moisture content, fuel load and accessibility (Oliveira et al. 2012; Dastour and Hassan 2024).

Fig. 6.

Vegetation and land cover layers: (a) percentage tree cover; (b) Normalized Difference Vegetation Index; and (c) land use – land cover.

Together, these layers – percentage tree cover, NDVI and the ESA WorldCover land map – form a robust framework for analyzing and modeling the vegetative factors that influence wildfire risks. This comprehensive approach allows a more nuanced understanding of how vegetation interacts with climatic and topographic conditions to affect fire dynamics, ultimately aiding in the creation of a detailed and accurate fire risk assessment for Kazakhstan.

Anthropogenic conditions

Human activity is a predominant factor in the ignition and propagation of wildfires. Analysis of fire incidents in areas like the Borovoye Reserve indicates that 99.5% of forest fires are human-caused (Mazarzhanova et al. 2017). Human influences extend to land cover changes such as deforestation and agricultural practices, which can directly trigger fires. The increasing popularity of forested areas due to recreational activities and enhanced access via roads has also increased fire risks. Consequently, a layer representing the distance to major roads (Fig. 7) was included to reflect this human dimension, highlighting areas where increased accessibility may elevate fire occurrence probabilities.

Fig. 7.

Euclidean distance (m) from roads derived from OpenStreetMap.

Data fusion and standardization

To effectively integrate environmental variables from various data sources, it was essential to standardize their spatial resolution and projection systems. As datasets such as climatic variables (928 m), topographic data (30 m), vegetation indices from MODIS (250 m) and anthropogenic data (variable resolutions) differed significantly in spatial resolutions and projections, ArcGIS tools were used to unify these variables. All layers were projected into a common geographic coordinate system (WGS84) and subsequently resampled to match a unified spatial resolution (1 km) suitable for the analysis. During the resampling process, the bilinear interpolation method was chosen. Bilinear interpolation calculates the value for each output pixel based on a weighted average derived from the four nearest input pixels. This approach was selected owing to its effectiveness in maintaining data continuity and accuracy when working with continuous data such as climate and vegetation indices.
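The weighted average behind bilinear interpolation can be illustrated with a minimal sketch. This is not the ArcGIS implementation used in the study; the function name `bilinear_sample` and the 2 × 2 patch of values are hypothetical and serve only to show how the four nearest input pixels contribute to one output value.

```python
import numpy as np

def bilinear_sample(grid, row, col):
    """Interpolate `grid` at a fractional (row, col) from its four nearest cells."""
    grid = np.asarray(grid, dtype=float)
    # Clamp the upper-left corner so the 2x2 neighbourhood stays inside the array.
    r0 = int(np.clip(np.floor(row), 0, grid.shape[0] - 2))
    c0 = int(np.clip(np.floor(col), 0, grid.shape[1] - 2))
    dr, dc = row - r0, col - c0
    # Interpolate along columns first, then along rows.
    top = (1 - dc) * grid[r0, c0] + dc * grid[r0, c0 + 1]
    bottom = (1 - dc) * grid[r0 + 1, c0] + dc * grid[r0 + 1, c0 + 1]
    return (1 - dr) * top + dr * bottom

# Hypothetical 2x2 patch of a continuous layer (e.g. temperature in degrees C).
patch = [[10.0, 20.0],
         [30.0, 40.0]]
centre_value = bilinear_sample(patch, 0.5, 0.5)  # equal weight on all four cells
```

At the exact centre of the patch, each of the four cells receives a weight of 0.25, so the result is their plain mean.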

Data quality control measures

Ensuring the accuracy and reliability of environmental data used in wildfire modeling is crucial. Therefore, comprehensive quality control measures were implemented across all datasets. Climatic data obtained from Google Earth Engine were cross-validated using established global climate datasets (e.g. ERA5 reanalysis) and checked against available ground-based weather station data from meteorological archives in Kazakhstan to verify accuracy in terms of temperature and precipitation values. These cross-checks revealed strong consistency, ensuring climatic predictors were suitable for modeling wildfire dynamics.

For land cover classification, the ESA WorldCover dataset was used owing to its global validation standards, with an overall accuracy typically exceeding 80% for Central Asian regions. This classification accuracy was explicitly assessed through confusion matrices provided by ESA, including producer’s and user’s accuracy for key classes like forest, grassland and cropland, critical to fire risk assessment. Any discrepancies identified during visual inspections were minor and deemed acceptable for analysis.

Topographic variables sourced from Google Earth Engine’s DEM underwent manual visual inspection in ArcGIS to detect and correct any anomalies or artifacts, particularly in slope and aspect calculations. Vegetation data (e.g. MODIS-derived NDVI and tree cover) were validated by visual cross-referencing with high-resolution Sentinel-2 imagery to confirm vegetation patterns and distribution accuracy. Collectively, these explicit quality control steps ensured the robustness of input data, minimizing uncertainty in the wildfire risk modeling outcomes.

Variable selection and multicollinearity analysis

Before building predictive models, it is essential to perform a thorough statistical analysis to examine the relationships between various environmental variables that influence wildfire occurrence. This process helps in a systematic variable selection that ensures reduced redundancy among predictor variables and retains only those contributing independent and meaningful information about wildfire occurrence (Oliveira et al. 2012).

Initially, Pearson’s correlation coefficient (|r|) was used to examine pairwise relationships between all candidate variables. A conservative threshold of |r| > 0.75 was applied to identify and flag strongly correlated pairs, helping eliminate variables with overlapping information. To further assess multicollinearity in the dataset, we computed the Variance Inflation Factor (VIF) for each variable. A VIF threshold of 5 was used as the cut-off, beyond which variables were considered to exhibit unacceptable levels of multicollinearity and were excluded.

(3)

VIF = \frac{1}{1 - R^{2}}

where R² is the coefficient of determination from regressing a predictor against others. This two-step filtering method – based on well-established statistical techniques – ensures that only variables with low interdependence and strong individual relevance are retained for final modeling (Zhang et al. 2019). In total, 10 variables were selected based on their statistical independence and thematic significance to wildfire risk dynamics (e.g. elevation, precipitation, NDVI, distance to roads).
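The two-step filter can be sketched in a few lines of NumPy. This is an illustrative reconstruction under stated assumptions, not the authors' workflow: the helper names `correlated_pairs` and `vif_scores` are hypothetical, the three synthetic predictors stand in for the real raster layers, and the thresholds (|r| > 0.75, VIF > 5) are the ones quoted in the text.

```python
import numpy as np

def correlated_pairs(X, threshold=0.75):
    """Return index pairs (i, j) whose absolute Pearson correlation exceeds threshold."""
    r = np.corrcoef(X, rowvar=False)
    p = r.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(r[i, j]) > threshold]

def vif_scores(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    scores = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept plus other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        scores[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return scores

# Synthetic stand-ins for predictor layers sampled at 500 locations (assumption).
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.05 * rng.normal(size=500)  # nearly a copy of `a`, so redundant
X = np.column_stack([a, b, c])

pairs = correlated_pairs(X)   # flags the (a, c) pair
vif = vif_scores(X)           # large for a and c, close to 1 for b
```

In practice, one variable from each flagged pair would be dropped and the VIF recomputed, because removing a redundant predictor lowers the VIF of those that remain.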

Although stepwise regression and other algorithmic selection techniques (such as recursive feature elimination) are valuable in regression-based frameworks, they are less suited for presence-only models like MaxEnt, which relies on ecological relevance and entropy-based constraints rather than linear model fit statistics (Paudel et al. 2024). Therefore, Pearson’s correlation and VIF analysis were deemed more appropriate and interpretable within the MaxEnt modeling context.

This selection process not only enhanced model stability and performance but also ensured that the retained predictors contributed distinct insights into the climatic, topographic, vegetative and anthropogenic factors influencing wildfire risk across Kazakhstan.

MaxEnt modeling for fire occurrences

Following the correlation analysis, the study applied MaxEnt modeling, a machine-learning technique used extensively in ecological modeling and bioinformatics for predicting the distribution and presence of species based on limited data (Elith et al. 2011). In our study, MaxEnt is adapted to model the probability distribution of fire occurrences across Kazakhstan. The model operates under the principle of maximum entropy: among all distributions that satisfy the given constraints – in this case, the environmental conditions present at known fire locations – it selects the most uniform one (Frongillo and Reid 2014). This approach is especially advantageous in environmental sciences, where data scarcity and uncertainty are common.

To ensure the rational application of the MaxEnt model in this study, we took deliberate steps to address the key assumptions underlying the method. First, MaxEnt is explicitly designed for presence-only data scenarios, which aligns with our use of high-confidence (≥90%) MODIS and VIIRS fire occurrence points as presence data. Second, to ensure that these presence points were representative of the full range of environmental conditions across Kazakhstan, we included fire occurrences recorded over an 8-year period (2015–2022) across diverse climatic and geographic zones. This broad spatial and temporal coverage minimized potential sampling bias. Third, the predictor variables used in the model were carefully filtered through Pearson correlation and VIF analysis to eliminate redundancy and reduce multicollinearity – thereby satisfying the assumption of independence among environmental inputs. Finally, the model generated pseudo-absence (background) points from areas with contrasting environmental characteristics to ensure a comprehensive representation of the available environmental space. By addressing these key assumptions, we ensured the robustness and validity of applying the MaxEnt model for wildfire risk prediction in Kazakhstan.

Regarding model parameters, a regularization multiplier of 1 was selected to control model complexity, balancing the model’s ability to fit the training data against the risk of overfitting. Additionally, the model was run through 10 iterations using random seed selections to ensure robustness and consistency in prediction outcomes. Apart from explicitly stated adjustments, other model parameters were retained at their default settings, given their demonstrated reliability and common use in similar environmental modeling studies. The modeling process involves assigning a probability value ranging from 0 to 1 to each pixel on the map, indicating the degree of fire hazard. Essentially, the MaxEnt model predicts these values based on the patterns observed among the predictors at the presence points.
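For orientation, the role of the regularization multiplier can be seen in the standard MaxEnt formulation from the species-distribution literature. The notation below is introduced here for illustration and is not taken from this article: x denotes a pixel, f_j the features derived from the predictors, λ_j the fitted weights, x_1, ..., x_m the presence points and β_j the per-feature penalty weights, which the regularization multiplier scales.

```latex
% Gibbs form of the MaxEnt distribution; the sum in Z runs over background pixels x'
q_{\lambda}(x) = \frac{\exp\left(\sum_{j} \lambda_{j} f_{j}(x)\right)}{Z_{\lambda}},
\qquad
Z_{\lambda} = \sum_{x'} \exp\left(\sum_{j} \lambda_{j} f_{j}(x')\right)

% L1-regularized fit to the m presence points
\max_{\lambda} \; \frac{1}{m} \sum_{i=1}^{m} \ln q_{\lambda}(x_{i}) - \sum_{j} \beta_{j} \lvert \lambda_{j} \rvert
```

A larger multiplier increases every β_j, shrinking more weights towards zero and producing a smoother, less closely fitted surface. MaxEnt software typically reports a rescaled transform of q on the 0 to 1 range rather than q itself; which output format was used here is not stated in the text.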

Model validation

To evaluate the predictive performance of the MaxEnt model, an independent validation dataset was employed. Fire occurrence data from NASA’s Fire Information for Resource Management System (FIRMS) for the year 2023 served as the test dataset. This ensured temporal and spatial independence between training (2015–2022) and validation data, minimizing bias and overfitting. The FIRMS hotspots were preprocessed to align with the model’s spatial resolution (1 km) and coordinate system (WGS84). Fire points with a confidence level ≥90% were retained, and duplicates were removed to mirror the stringent quality control applied to the training data. This yielded 2927 hotspots (out of the original 125,226 detections) for validation. Each fire point was overlaid on the modeled probability surface, allowing us to determine the proportion of fires captured within the five hazard classes.

The model’s discriminative performance was assessed using the AUC-ROC (area under the receiver operating characteristic curve) metric, which evaluates the ability of a model to distinguish between presence (fire) and absence (no fire) classes under varying probability thresholds (Thomason 2015; Javidan et al. 2021). The ROC curve plots the True Positive Rate (TPR, or sensitivity) against the False Positive Rate (FPR, or 1 − specificity) across all classification thresholds. The AUC is defined in Eqn 4, with TPR and FPR given by Eqns 5 and 6:

(4)

\mathrm{AUC} = \int_{0}^{1} \mathrm{TPR}(\mathrm{FPR}) \, d\mathrm{FPR}

(5)

\mathrm{TPR} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}

(6)

\mathrm{FPR} = \frac{\text{false positives}}{\text{false positives} + \text{true negatives}}

The AUC-ROC integrates sensitivity and specificity across all thresholds, providing a robust evaluation even under imbalanced data conditions. Its value ranges from 0 to 1, where an AUC of about 0.50 or lower indicates performance no better than random classification, 0.50–0.70 indicates poor predictive capability, 0.70–0.90 denotes moderate to acceptable performance, and >0.90 implies high model accuracy (Paudel et al. 2024).
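Eqns 4–6 can be evaluated numerically as in the following sketch. The function names `rates` and `auc_trapezoid` and the eight toy scores are hypothetical; the integral in Eqn 4 is approximated with the trapezoidal rule over the empirical ROC points, which is one common choice and not necessarily the one used in the study.

```python
import numpy as np

def rates(tp, fn, fp, tn):
    """TPR and FPR from confusion-matrix counts (Eqns 5 and 6)."""
    return tp / (tp + fn), fp / (fp + tn)

def auc_trapezoid(scores, labels):
    """Area under the empirical ROC curve (Eqn 4) by the trapezoidal rule."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    # Sweep thresholds from strictest (nothing predicted as fire) to loosest.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    tpr = np.array([np.sum((scores >= t) & (labels == 1)) / n_pos for t in thresholds])
    fpr = np.array([np.sum((scores >= t) & (labels == 0)) / n_neg for t in thresholds])
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy data (assumption): modeled probabilities at 4 fire and 4 non-fire locations.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
auc = auc_trapezoid(scores, labels)
# 13 of the 16 fire/non-fire pairs are ranked correctly, so auc == 0.8125
```

The pairwise reading in the last comment is the same interpretation the AUC has in general: the probability that a randomly chosen fire location scores higher than a randomly chosen non-fire location.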

Continuous probability scores from MaxEnt were converted to binary fire/non‑fire predictions using the maximum‑sensitivity‑plus‑specificity (MS + SP) criterion, which balances omission and commission errors. From the resulting confusion matrix, we computed overall accuracy, recall (sensitivity) and specificity, with 95% confidence intervals obtained via bootstrap resampling. This structured validation framework ensured a rigorous, unbiased assessment of the model’s operational utility.
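The thresholding and bootstrap steps can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the toy scores, the number of bootstrap replicates and the percentile method for the interval are all choices made here, since the text does not specify them. Ties in the sensitivity-plus-specificity sum are resolved in favour of the lowest threshold.

```python
import numpy as np

def confusion(scores, labels, threshold):
    """Return (tp, fn, fp, tn) when scores >= threshold are predicted as fire."""
    pred = scores >= threshold
    tp = int(np.sum(pred & (labels == 1)))
    fn = int(np.sum(~pred & (labels == 1)))
    fp = int(np.sum(pred & (labels == 0)))
    tn = int(np.sum(~pred & (labels == 0)))
    return tp, fn, fp, tn

def max_sss_threshold(scores, labels):
    """Threshold that maximizes sensitivity + specificity (first maximum kept)."""
    best_t, best_sum = None, -1.0
    for t in np.unique(scores):
        tp, fn, fp, tn = confusion(scores, labels, t)
        total = tp / (tp + fn) + tn / (tn + fp)
        if total > best_sum:
            best_t, best_sum = float(t), total
    return best_t

def bootstrap_accuracy_ci(scores, labels, threshold, n_boot=1000, seed=0):
    """Percentile 95% interval for overall accuracy via resampling with replacement."""
    rng = np.random.default_rng(seed)
    n = labels.size
    accs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        tp, fn, fp, tn = confusion(scores[idx], labels[idx], threshold)
        accs.append((tp + tn) / n)
    lo, hi = np.percentile(accs, [2.5, 97.5])
    return float(lo), float(hi)

# Toy data (assumption): modeled probabilities with fire (1) / non-fire (0) labels.
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2])
labels = np.array([1, 1, 0, 1, 0, 1, 0, 0])

threshold = max_sss_threshold(scores, labels)
tp, fn, fp, tn = confusion(scores, labels, threshold)
accuracy = (tp + tn) / labels.size
ci_low, ci_high = bootstrap_accuracy_ci(scores, labels, threshold)
```

The same resampling loop yields intervals for recall and specificity if the corresponding ratio is computed inside it, provided each resample contains at least one fire and one non-fire point.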

Results

Analysis of environmental variables influencing wildfire occurrences

Our study conducted a thorough analysis of the relationships between various environmental variables to determine their influence on wildfire occurrences in Kazakhstan. This analysis utilized Spearman’s rank correlation to assess the monotonic relationships between variables, and the VIF to effectively reduce multicollinearity (Fig. 8). These statistical methods ensured that the predictors included in our model provided unique and significant insights into wildfire dynamics.

Fig. 8.

Spearman rank correlation matrix between the variables considered for the study.

Initially, 16 environmental variables were considered based on their relevance to factors affecting wildfire behavior. Following the correlation and multicollinearity analysis, 10 variables were identified as the most significant and retained for final modeling. These include elevation, which impacts microclimatic conditions and vegetation types; annual mean temperature, crucial for understanding the climatic conditions conducive to fire; land use and land cover (LULC), reflecting human impacts and natural vegetation distributions; DAH, affecting daily temperature variations; and slope, which influences the speed of fire spread owing to gravitational effects.

Additionally, the Köppen climate classification broadly categorizes climate zones affecting vegetation patterns and potential fire behavior. Aspect affects sunlight exposure and thus vegetation dryness; precipitation in the driest month indicates moisture availability during critical periods; percentage tree cover measures forest density that can fuel fires, and distance to roads serves as a proxy for human access and potential ignition sources. These variables were chosen for their ability to provide a comprehensive and nuanced understanding of the factors that contribute to wildfire risk, facilitating the development of a robust model to predict wildfire occurrences across Kazakhstan.

Model predictions and risk mapping

Building on the refined selection of environmental variables, the MaxEnt model was used to map the spatial distribution of wildfire risk across the diverse landscapes of Kazakhstan (Fig. 9). The map has a spatial resolution of 1 km, and each pixel is assigned a probability value ranging from 0 to 1, with higher values indicating greater susceptibility to wildfires. This gradation distinguishes regions with varying levels of fire risk and allows policymakers, environmental managers and disaster response teams to focus their efforts on the areas most at risk. By facilitating targeted management and mitigation strategies, the map serves as a tool for both planning for and responding to wildfire events, ultimately aiming to minimize ecological damage and protect human life and property.

Fig. 9.

Wildfire occurrence probability surface generated by the MaxEnt model.

The MaxEnt model delineates the landscape into distinct fire potential zones based on the probability score computed for each pixel. The five wildfire hazard classes are: very low (0–0.20), low (0.20–0.40), moderate (0.40–0.60), high (0.60–0.80) and very high (0.80–1.00). Fig. 9 depicts the spatial pattern of these classes, and Table 2 quantifies the extent of each zone across Kazakhstan. This categorization is pivotal for understanding and managing wildfire risks more efficiently. Areas in the very low fire potential range (0–0.20) typically exhibit a very low risk of wildfires. This class covers 416,928 km² (15.5%) of the national territory, chiefly in urban agglomerations, irrigated floodplains and high-altitude belts where sparse or moist vegetation limits fuel continuity. The low availability of combustible vegetation contributes significantly to the reduced susceptibility of these areas to wildfires. Urban planning and landscape management often reduce vegetation density, further diminishing fire risks.

Table 2. Distribution of areas under different fire risk zones.

Fire risk	Area (ha)	Area (km²)	Percentage (%)
Very low	41,692,774.50	416,927.74	15.51
Low	37,320,010.78	373,200.10	13.89
Moderate	89,073,731.84	890,737.31	33.15
High	99,874,828.75	998,748.28	37.16
Very high	766,619.45	7666.19	0.29
Total	268,727,965.3	2,687,279.62	100

As the probability scores increase, the fire potential shifts to low (0.20–0.40): these areas (approximately 373,000 km² (13.9%)) often display a diverse mix of vegetation patterns, ranging from agricultural fields to sparsely wooded areas, which can act as fuel under certain conditions. They are also likely to be transitional zones where human activities such as farming and grazing reduce vegetation density but may also inadvertently increase fire risks due to the presence of residual dry plant material and the occasional use of fire for land clearing.

Moderate risk (0.40–0.60) zones, spanning an area of 890,737 km² (33.2%), are usually dominated by extensive grasslands, certain forest types, or agricultural fields that tend to dry out, particularly during prolonged periods of dry weather. These areas are inherently more susceptible to wildfires owing to the higher loads of combustible materials, including dry grasses and leaves, which can easily ignite under the right environmental conditions.

The high-risk zones, with probability scores of 0.60–0.80, and very high-risk zones with probability scores 0.80 or above, include regions predominantly covered by dense forests, woodlands, extensive farmlands, steppes and reedbeds. Significant areas include the forest zones in the Kostanay region, where diverse forest types like birch, aspen–birch and pine create a mosaic of highly combustible materials. The Burabay State National Natural Park and Semey Ormani Reserve in northeastern Kazakhstan are primarily composed of dense pine and birch forests, known for their rapid fire spread capabilities. The Ile-Balkhash State Natural Reserve features fire-prone forest–steppes, where the balance of herbaceous to woody vegetation increases fire risk. Additionally, the grassy steppes in Central Kazakhstan and riparian zones along the Syrdarya River, characterized by dense reed beds, have been identified as areas with recurrent fires. The frequency of fires in these regions and the density of thermal points detected by remote sensing significantly influence their representation in the machine learning model, showcasing a pattern of fire occurrence that is strongly correlated with specific vegetation types and exacerbated by regional climatic conditions.

Altogether, 70.6% of Kazakhstan’s land surface falls into the moderate-to-very high hazard bracket (the sum of the moderate, high and very high shares in Table 2), underscoring the urgency of region-specific mitigation measures. The quantitative breakdown in Table 2 provides a baseline for allocating surveillance assets, prioritizing fuel-reduction programs and refining response time benchmarks for each administrative region. By categorizing these fire potential zones, the model not only highlights regions at greatest risk but also aids in the strategic allocation of resources for fire prevention and control measures tailored to the specific characteristics and needs of each zone.
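The combined share can be recomputed directly from the areas in Table 2. The sketch below uses only the km² values printed in the table; the variable names are illustrative.

```python
# Areas (km²) per hazard class, copied from Table 2.
areas_km2 = {
    "very low": 416_927.74,
    "low": 373_200.10,
    "moderate": 890_737.31,
    "high": 998_748.28,
    "very high": 7_666.19,
}

total_km2 = sum(areas_km2.values())

# Percentage of the mapped territory in each class.
shares = {name: 100.0 * area / total_km2 for name, area in areas_km2.items()}

# Combined share of the moderate, high and very high classes.
elevated_share = sum(shares[name] for name in ("moderate", "high", "very high"))
print(f"Total: {total_km2:,.2f} km²; moderate or higher: {elevated_share:.1f}%")
```

Recomputing shares from the areas, instead of adding the rounded percentages, avoids accumulating rounding error when classes are combined.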

Validation of MaxEnt model

The predictive performance of the MaxEnt model was evaluated with an independent validation set comprising 2023 fire season hotspots extracted from the NASA FIRMS product. Two complementary approaches were employed: (i) a threshold-independent ROC analysis, and (ii) a spatial overlay of the validation points onto the probabilistic hazard surface (Fig. 10).

Fig. 10.

Receiver Operating Characteristic (ROC) curve depicting the diagnostic performance of the MaxEnt model against an independent 2023 FIRMS hotspot dataset.

The ROC curve yielded an AUC of 0.79, indicating good discriminatory capacity: the model correctly ranks a randomly chosen burned location above an unburned location 79% of the time. In wildfire susceptibility studies, AUC values between 0.7 and 0.9 are generally interpreted as evidence of reliable prediction, whereas values below 0.7 denote weak discrimination and values above 0.9 approach near‑perfect performance (Paudel et al. 2024). Hence, the present score confirms that the environmental covariates and fitted response functions capture the principal controls on ignition probability across Kazakhstan’s heterogeneous biomes.

Threshold-dependent metrics derived from the confusion matrix corroborate this result. Using the commonly adopted maximum sum of sensitivity-and-specificity threshold, the model achieved an overall accuracy of 72% and a recall (sensitivity) of 71%. Accuracy indicates the proportion of true results, both true positives and true negatives, in the dataset. Recall in this context measures the model’s ability to identify all relevant instances of actual fire occurrences. Recall is slightly lower than accuracy, and a recall of 71% means that about 29% of observed fire events fall outside the area classified as fire-prone at this threshold. These omissions may be attributed to the model’s conservative threshold setting or to limitations in the environmental input data, which may under-represent less frequent but significant fire events.

Fig. 11 spatially illustrates this pattern, and the quantitative overlay confirms it. Of the 2927 independent FIRMS hotspots used for validation, 1645 (56.2%) fall in the high risk class and 477 (16.3%) in the very high class, meaning that 72.5% of the 2023 validation hotspots occur where the model predicts probabilities above 0.60. A further 729 points (24.9%) lie in the moderate class (0.40–0.60). In contrast, only 75 fires (2.5%) are located within the low or very low classes (<0.40). This concordance between the independent fire record and the MaxEnt probability surface affirms the model’s practical utility for decision support. Therefore, early-warning bulletins and resource pre-positioning can be prioritized for the high and very high classes, which exhibit both the highest predicted likelihoods and the densest concentration of observed ignitions.
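The overlay tally amounts to binning the probability sampled at each validation hotspot into the five hazard classes. The sketch below assumes the probabilities have already been extracted from the raster at the hotspot locations; the ten values shown are made up for illustration, and the treatment of boundary values (a score of exactly 0.60 is counted as high) is an assumption, since the text does not say how class edges are handled.

```python
import numpy as np

CLASS_EDGES = [0.20, 0.40, 0.60, 0.80]  # boundaries between the five classes
CLASS_NAMES = ["very low", "low", "moderate", "high", "very high"]

def class_shares(probabilities):
    """Count hotspots per hazard class and express each count as a percentage."""
    p = np.asarray(probabilities, dtype=float)
    # digitize returns 0..4; a value equal to an edge falls in the upper class.
    idx = np.digitize(p, CLASS_EDGES)
    counts = np.bincount(idx, minlength=len(CLASS_NAMES))
    return {name: (int(c), 100.0 * c / p.size)
            for name, c in zip(CLASS_NAMES, counts)}

# Hypothetical probabilities sampled at ten validation hotspots.
sampled = [0.05, 0.35, 0.45, 0.55, 0.65, 0.70, 0.75, 0.85, 0.62, 0.90]
result = class_shares(sampled)
captured = result["high"][1] + result["very high"][1]  # share above 0.60
```

Applied to the real hotspot sample, the `captured` value corresponds to the share of fires falling in the high and very high classes reported above.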

Fig. 11.

Spatial overlay of 2023 FIRMS validation points (red dots) on the Fire Risk Map.

This analysis reveals that although the MaxEnt model effectively identifies regions at risk of wildfires, there is still room for improvement. The slight gap between the model’s accuracy and its recall indicates a need for ongoing enhancements, possibly by refining its threshold settings or incorporating more comprehensive environmental data. These adjustments are essential to increase the model’s sensitivity and to ensure it captures a broader spectrum of fire events, thereby enhancing its utility in fire management strategies. This process of continual model refinement will enable more precise and effective responses to the complex dynamics of wildfire risk.

Discussion

This research significantly advances the application of the MaxEnt model to assess wildfire risks in Kazakhstan, introducing a nuanced approach by integrating a diverse array of environmental data into a robust, predictive framework. By incorporating region-specific variables such as vegetation dynamics, anthropogenic sources and microclimatic indices, this framework addresses previously overlooked drivers of fire susceptibility in arid landscapes. Our methodology harnesses rigorous statistical analysis and comprehensive data fusion techniques to capture the complex interactions among climatic, topographic, vegetative and anthropogenic factors. This study sets a new standard for predictive accuracy and operational utility in wildfire risk assessments, demonstrating how sophisticated modeling techniques can be harnessed to address complex environmental challenges.

Initially, our study focused on examining the relationships between 16 environmental variables, utilizing Spearman’s rank correlation to assess monotonic relationships and VIF to effectively address multicollinearity. This rigorous statistical analysis ensured that the predictors included in the model provided unique and substantial insights into wildfire dynamics. Through this process, 10 variables were selected for their significant roles in influencing wildfire occurrences: elevation, annual mean temperature, LULC, DAH, slope, Köppen climate classification, aspect, precipitation in the driest month, percentage tree cover and distance to roads. The integration of these variables into the MaxEnt model facilitated a comprehensive understanding of the factors contributing to wildfire risk, allowing the development of a robust model to predict wildfire occurrences across Kazakhstan.

The MaxEnt model’s detailed validation process reveals its capability to produce a granular and highly detailed map of wildfire risk across Kazakhstan, effectively distinguishing between various levels of fire susceptibility. This map, with its high resolution of 1 km per pixel, assigns probability values from 0 to 1 to each pixel to indicate the likelihood of wildfire occurrence. By classifying the landscape into distinct zones of fire potential, the model not only facilitates immediate firefighting efforts but also aids in strategic planning for land management and fire prevention measures (O’Mara et al. 2024). These insights are critical for deploying resources efficiently and for implementing proactive strategies that can significantly mitigate potential damage.

The MaxEnt model’s validation involved an independent test dataset consisting of fire hotspot data sourced from NASA’s FIRMS for the year 2023. An AUC score of 0.79 from the ROC analysis indicated the reliability of the model in distinguishing between fire-prone and non-fire-prone areas. When benchmarked against recent wildfire susceptibility studies (Table 3), this value is comparable with other single-algorithm applications – for example, MaxEnt in Nepal (AUC 0.861; Paudel et al. 2024) – and slightly below more complex or ensemble approaches such as the PSO-optimized neural-fuzzy model in Vietnam (AUC 0.916; Tien Bui et al. 2017) and the alternating decision-tree model in Iran (AUC 0.903; Jaafari et al. 2018). It nevertheless exceeds the minimum threshold of 0.70 considered acceptable for operational fire-risk mapping and underscores the model’s practical utility for Kazakhstan. This validation not only affirmed the model’s effectiveness in predicting fire occurrence but also demonstrated its practical utility in operational settings. Despite its strengths, the validation process also highlights areas for further refinement. Enhancing model precision may involve integrating more detailed or diverse environmental data, adjusting model parameters, or adopting more advanced machine-learning techniques that can capture complex patterns more effectively. Additionally, the model requires regular updates with new fire occurrence data to improve its sensitivity and adapt to changes in fire-prone environments (Giannakidou et al. 2024). This ongoing refinement is crucial to maintain the model’s relevance and effectiveness under varying conditions.

Table 3. Comparative AUC-ROC performance of recent wildfire susceptibility studies.

Study (year)	Area	Modeling approach	AUC-ROC
Present study	Kazakhstan	MaxEnt	0.790
Marquez Torres et al. (2023)	Sicily, Italy	Integrated Bayesian network fire risk model	0.915
Tien Bui et al. (2017)	Lam Dong Province, Vietnam	PSO (Particle Swarm Optimization)-optimized neural fuzzy	0.916
Jaafari et al. (2018)	Zagros Mountains, Iran	Alternating Decision Trees (ADT)	0.903
Paudel et al. (2024)	Gorkha district, Nepal	MaxEnt	0.861
Ngoc Thach et al. (2018)	Thuan Chau, Vietnam	Multilayer perceptron neural net	0.894

The practical implications of this study are profound, particularly given the escalating fire risks in regions like Kazakhstan. The predictive model serves as an invaluable tool for disaster preparedness, enhancing the capabilities of authorities to forecast and strategize effectively for potential wildfire outbreaks (Yang et al. 2021; Wasserman and Mueller 2023). It supports the optimization of resource deployment for firefighting and informs the development of land use policies that could mitigate fire risks (O’Mara et al. 2024). Moreover, the model’s insights extend to environmental management, guiding decisions regarding the conservation of vulnerable ecosystems and the sustainable management of land (Marquez Torres et al. 2023). This guidance is instrumental in promoting long-term ecological balance and sustainability, contributing to the strategic decisions that underpin environmental stewardship.

One of the primary limitations of this study is the dependence of the MaxEnt model’s predictive accuracy on the quality and scope of the input data. Although a stringent confidence threshold (≥90%), removal of anthropogenic thermal anomalies and maintaining temporal consistency with MODIS detections mitigated most errors in the VIIRS record, a residual risk of misclassified hotspots persists because thermal channels alone cannot confirm post‑fire scarring. The absence of an explicit ex-post verification step, for instance, burn‑scar mapping from Sentinel‑2 imagery using spectral indices such as NBR, NBR2, the Char Soil Index or the Mid‑Infrared Burn Index, may introduce uncertainties in distinguishing fire scars. The discrepancies between the model’s accuracy and its recall highlight the need for continuous model enhancement, which could include the integration of additional predictive variables that capture the dynamic nature of wildfire risks more effectively (Marquez Torres et al. 2023). To overcome the static nature of the model, future iterations could benefit from the integration of real-time data feeds from advanced remote sensing technologies and on-ground sensors. This would enhance the model’s applicability in real-time scenarios, allowing rapid adjustments to changes in environmental conditions or unexpected anthropogenic impacts.

A further constraint is the absence of a formal sensitivity analysis: the study did not quantify how variations in input variables (e.g. NDVI, climatic factors) influence model outputs, which limits our understanding of variable-specific contributions to fire risk predictions. A systematic jack-knife or permutation importance experiment would clarify the marginal contribution and stability of each covariate, illuminate potential multicollinearity effects and reveal whether the model is overly reliant on any single climatic or topographic driver (Paudel et al. 2024).

Future research should aim to enhance the dynamic capabilities of the MaxEnt model to better adapt to real-time changes in environmental conditions. Integrating live satellite imagery and data from environmental sensors could refine the model’s predictions, improving its reliability and accuracy. Moreover, expanding the application of this modeling approach to other regions with similar climatic and geographic characteristics could validate its effectiveness more broadly, encouraging its adoption in global wildfire risk management practices.

Expanding the variety of data inputs, such as incorporating more detailed real-time weather data, human activity metrics and comprehensive vegetation profiles, would not only improve the model’s applicability and accuracy but also increase its relevance in the context of changing global climate patterns and increasing wildfire frequencies.

Conclusions

This study significantly advances wildfire risk assessment in Kazakhstan using the MaxEnt model, integrated with environmental variables processed through Google Earth Engine. The model effectively delineated areas of varying wildfire susceptibility, assigning probability values from 0 to 1 across the landscape with a resolution of 1 km. Key findings include identifying high- and very high-risk zones (probability scores above 0.60), predominantly in regions with dense forest cover such as the Kostanay region and the Burabay State National Natural Park.

The validation of the MaxEnt model yielded an AUC score of 0.79, indicating an acceptable to good capability in distinguishing between fire-prone and non-fire-prone areas. At the maximum sensitivity-plus-specificity threshold, the model achieved an overall accuracy of 72% and a recall of 71%, supporting its use in operational wildfire management.

Future enhancements should focus on refining the model by integrating real-time environmental data and expanding its application to similar regions globally. The goal is to develop a dynamic modeling tool that adapts to changing conditions and improves the strategic allocation of firefighting resources.

In summary, the study establishes a new benchmark for wildfire risk modeling in Kazakhstan and underscores the need for continuous model optimization to enhance predictive accuracy and adaptability in the face of evolving wildfire threats.

Data availability

All necessary data are included within the article.

Conflicts of interest

The authors declare they have no conflicts of interest.

Declaration of funding

This research was funded by the Science Committee of the Ministry of Science and Higher Education of the Republic of Kazakhstan, grant no. BR24992865.

Declaration of use of AI

We acknowledge the use of ChatGPT (OpenAI, San Francisco, USA) to assist with language editing. The AI was employed solely to improve clarity, grammar, and structure. All substantive content, ideas, and interpretations are entirely those of the authors.

Acknowledgements

We are grateful to all the providers of free data and the authors of the articles discussed in this paper.

References

Abedi Gheshlaghi H, Feizizadeh B, Blaschke T, Lakes T, Tajbar S (2021) Forest fire susceptibility modeling using hybrid approaches. Transactions in GIS 25, 311-333.
| Crossref | Google Scholar |

Babu KN, Gour R, Ayushi K, Ayyappan N, Parthasarathy N (2023) Environmental drivers and spatial prediction of forest fires in the Western Ghats biodiversity hotspot, India: an ensemble machine learning approach. Forest Ecology and Management 540, 121057.
| Crossref | Google Scholar |

Babu KVS, Kabdulova G, Kabzhanova G (2019) Developing the Forest Fire Danger Index for the country Kazakhstan by using geospatial techniques. Journal of Environmental Informatics Letters 1, 48-59.
| Crossref | Google Scholar |

Baeza MJ, Raventós J, Escarré A, Vallejo VR (2006) Fire risk and vegetation structural dynamics in Mediterranean shrubland. Plant Ecology 187, 189-201.
| Crossref | Google Scholar |

Calviño-Cancela M, Chas-Amil ML, García-Martínez ED, Touza J (2016) Wildfire risk associated with different vegetation types within and outside wildland–urban interfaces. Forest Ecology and Management 372, 1-9.
| Crossref | Google Scholar |

Carta F, Zidda C, Putzu M, Loru D, Anedda M, Giusto D (2023) Advancements in forest fire prevention: a comprehensive survey. Sensors 23(14), 6635.
| Crossref | Google Scholar |

Chicas SD, Østergaard Nielsen J, Valdez MC, Chen CF (2022) Modelling wildfire susceptibility in Belize’s ecosystems and protected areas using machine learning and knowledge-based methods. Geocarto International 37, 15823-15846.
| Crossref | Google Scholar |

Chuvieco E, Congalton RG (1989) Application of remote sensing and geographic information systems to forest fire hazard mapping. Remote Sensing of Environment 29, 147-159.
| Crossref | Google Scholar |

Dara A, Baumann M, Hölzel N, Hostert P, Kamp J, Müller D, Ullrich B, Kuemmerle T (2020) Post-Soviet land-use change affected fire regimes on the Eurasian steppes. Ecosystems 23(5), 943-956.
| Crossref | Google Scholar |

Dastour H, Hassan QK (2024) A multidimensional machine learning framework for LST reconstruction and climate variable analysis in forest fire occurrence. Ecological Informatics 83, 102849.
| Crossref | Google Scholar |

Ebrahimy H, Rasuly A, Mokhtari D (2017) Development of a web GIS system based on the MaxEnt approach for wildfire management: a case study of East Azerbaijan. Ecopersia 5, 1859-1873.
| Google Scholar |

Elith J, Phillips SJ, Hastie T, Dudík M, Chee YE, Yates CJ (2011) A statistical explanation of MaxEnt for ecologists. Diversity and Distributions 17, 43-57.
| Crossref | Google Scholar |

Frongillo R, Reid MD (2014) Convex foundations for generalized MaxEnt models. AIP Conference Proceedings 1636, 11-16.
| Crossref | Google Scholar |

Giannakidou S, Radoglou-Grammatikis P, Lagkas T, Argyriou V, Goudos S, Markakis EK, Sarigiannidis P (2024) Leveraging the power of internet of things and artificial intelligence in forest fire prevention, detection, and restoration: a comprehensive survey. Internet of Things 26, 101171.
| Crossref | Google Scholar |

Global Forest Watch (2024) Kazakhstan deforestation Rates & Statistics | GFW. https://www.globalforestwatch.org/dashboards/country/KAZ/

Guede-Fernández F, Martins L, de Almeida RV, Gambôa H, Vieira P (2021) A deep learning based object identification system for forest fire detection. Fire 4(4), 75.
| Crossref | Google Scholar |

Guo F, Su Z, Wang G, Sun L, Tigabu M, Yang X, Hu H (2017) Understanding fire drivers and relative impacts in different Chinese forest ecosystems. Science of The Total Environment 605-606, 411-425.
| Crossref | Google Scholar | PubMed |

Jaafari A, Zenner EK, Pham BT (2018) Wildfire spatial pattern analysis in the Zagros Mountains, Iran: a comparative study of decision tree based classifiers. Ecological Informatics 43, 200-211.
| Crossref | Google Scholar |

Javidan N, Kavian A, Pourghasemi HR, Conoscenti C, Jafarian Z, Rodrigo-Comino J (2021) Evaluation of multi-hazard map produced using MaxEnt machine learning technique. Scientific Reports 11, 6496.
| Crossref | Google Scholar | PubMed |

Jiao Z, Zhang Y, Xin J, Mu L, Yi Y, Liu H, Liu D (2019) A Deep Learning Based Forest Fire Detection Approach using UAV and YOLOV3. In ‘1st International Conference on Industrial Artificial Intelligence, IAI’. Shenyang, China 2019. IEEE Xplore pp. 1–5. 10.1109/ICIAI.2019.8850815

Jones MW, Abatzoglou JT, Veraverbeke S, Andela N, Lasslop G, Forkel M, Smith AJP, Burton C, Betts R, van der Werf GR, Sitch S, Canadell JG, Santín C, Kolden CA, Doerr SH, Quéré CL (2022) Global and regional trends and drivers of fire under climate change. Reviews of Geophysics 60(3), e2020RG000726.
| Crossref | Google Scholar |

Khaidarov K, Arkhipov V (2001) Fire situation in Kazakhstan. In ‘Global Forest Fire Assessment 1990-2000’. (FAO: Rome)

Lednev S, Semenkov I, Sharapova A, Koroleva T (2021) The impact of fire on plant biodiversity in the semideserts of Central Kazakhstan. E3S Web Conf. Actual Problems of Ecology and Environmental Management (APEEM 2021) Volume 265, 2021. (EDP Sciences) 10.1051/e3sconf/202126501020

Marquez Torres A, Signorello G, Kumar S, Adamo G, Villa F, Balbi S (2023) Fire risk modeling: an integrated and data-driven approach applied to Sicily. Natural Hazards and Earth System Sciences 23, 2937-2959.
| Crossref | Google Scholar |

Mazarzhanova K, Kopabayeva A, Köse N, Akkemik Ü (2017) The first forest fire history of the Burabai Region (Kazakhstan) from tree rings of Pinus sylvestris. Turkish Journal of Agriculture and Forestry 41, 165-174.
| Crossref | Google Scholar |

Merekeyev A, Nurakynov S (2022) Assessment of wildfire hazard on the territory of Kazakhstan using remote sensing data. Journal of Geography and Environmental Management 65, 34-41.
| Crossref | Google Scholar |

Mishra B, Panthi S, Poudel S, Ghimire BR (2023) Forest fire pattern and vulnerability mapping using deep learning in Nepal. Fire Ecology 19, 3.
| Crossref | Google Scholar |

Ngoc Thach N, Bao-Toan Ngo D, Xuan-Canh P, Hong-Thi N, Hang Thi B, Nhat-Duc H, Dieu TB (2018) Spatial pattern assessment of tropical forest fire danger at Thuan Chau area (Vietnam) using GIS-based advanced machine learning algorithms: a comparative study. Ecological Informatics 46, 74-85.
| Crossref | Google Scholar |

Oladejo TO, Balogun FO, Haruna UA, Alaka HO, Almazan J, Shuaibu MS, Adedayo IS, Ermakhan Z, Sarria-Santamerra A, Eliseo DL-P (2023) Climate change in Kazakhstan: implications to population health. Bulletin of the National Research Centre 47, 144.
| Crossref | Google Scholar |

Oliveira S, Oehler F, San-Miguel-Ayanz J, Camia A, Pereira JMC (2012) Modeling spatial patterns of fire occurrence in Mediterranean Europe using multiple regression and random forest. Forest Ecology and Management 275, 117-129.
| Crossref | Google Scholar |

O’Mara T, Meador AS, Colavito M, Waltz A, Barton E (2024) Navigating the evolving landscape of wildfire management: a systematic review of decision support tools. Trees, Forests and People 16, 100575.
| Crossref | Google Scholar |

Paudel G, Pandey K, Lamsal P, Bhattarai A, Bhattarai A, Tripathi S (2024) Geospatial forest fire risk assessment and zoning by integrating MaxEnt in Gorkha District, Nepal. Heliyon 10, e31305.
| Crossref | Google Scholar | PubMed |

Peel MC, Finlayson BL, McMahon TA (2007) Updated world map of the Köppen–Geiger climate classification. Hydrology and Earth System Sciences 11, 1633-1644.
| Crossref | Google Scholar |

Quintano C, Fernández-Manso A, Fernández-Manso O (2018) Combination of Landsat and Sentinel-2 MSI data for initial assessing of burn severity. International Journal of Applied Earth Observation and Geoinformation 64, 221-225.
| Crossref | Google Scholar |

Reddy CS, Sarika N (2022) Monitoring trends in global vegetation fire hot spots using MODIS data. Spatial Information Research 30, 617-632.
| Crossref | Google Scholar |

Rodrigues M, De la Riva J (2014) An insight into machine-learning algorithms to model human-caused wildfire occurrence. Environmental Modelling and Software 57, 192-201.
| Crossref | Google Scholar |

Roy PS, Kushwaha SPS, Murthy MSR, Roy A, Kushwaha D, Chintala SR, Behera MD, Padalia H, Saran S, Singh S, Jha CS, Porwal MC (2012) ‘Biodiversity Characterisation at Landscape Level: National Assessment.’ (Indian Institute of Remote Sensing: Dehradun, India)

Sagar N, Suresh KP, Naveesh YB, Archana CA, Hemadri D, Patil SS, Archana VP, Raaga R, Nandan AS, Chethan AJ (2024) Forest fire dynamics in India (2005–2022): unveiling climatic impacts, spatial patterns, and interface with anthrax incidence. Ecological Indicators 166, 112454.
| Crossref | Google Scholar |

Sayad YO, Mousannif H, Moatassime HA (2019) Predictive modeling of wildfires: a new dataset and machine learning approach. Fire Safety Journal 104, 130.
| Crossref | Google Scholar |

Shinkarenko SS, Berdengalieva AN, Doroshenko VV, et al. (2023) An analysis of the dynamics of areas affected by steppe fires in Western Kazakhstan on the basis of earth remote sensing data. Arid Ecosystem 13, 29-38.
| Crossref | Google Scholar |

Spivak L, Arkhipkin O, Sagatdinova G (2012) Development and prospects of the fire space monitoring system in Kazakhstan. Frontiers of Earth Science 6, 276-282.
| Crossref | Google Scholar |

Surbhi Singh S, Jeganathan C (2024) Using ensemble machine learning algorithm to predict forest fire occurrence probability in Madhya Pradesh and Chhattisgarh, India. Advances in Space Research 73, 2969-2987.
| Crossref | Google Scholar |

Tariq A, Shu H, Li Q, Altan O, Khan MR, Baqa MF, Lu L (2021) Quantitative analysis of forest fires in southeastern Australia using SAR data. Remote Sensing 13, 2386.
| Crossref | Google Scholar |

Thomason A (2015) ‘Modeling Burn Probability: A Maxent Approach to Estimating California’s Wildfire Potental.’ (University of Southern California: Los Angeles, CA, USA)

Tien Bui D, Bui QT, Nguyen QP, Pradhan B, Nampak H, Trinh PT (2017) A hybrid artificial intelligence approach using GIS-based neural-fuzzy inference system and particle swarm optimization for forest fire susceptibility modeling at a tropical area. Agricultural and Forest Meteorology 233, 32-44.
| Crossref | Google Scholar |

Venkatesh K, John R, Chen J, Xiao J, Amirkhiz RG, Giannico V, Kussainova M (2022) Optimal ranges of social-environmental drivers and their impacts on vegetation dynamics in Kazakhstan. Science of The Total Environment 847, 157562.
| Crossref | Google Scholar | PubMed |

Wasserman TN, Mueller SE (2023) Climate influences on future fire severity: a synthesis of climate–fire interactions and impacts on fire regimes, high-severity fire, and forests in the western United States. Fire Ecology 19, 43.
| Crossref | Google Scholar |

Yang X, Jin X, Zhou Y (2021) Wildfire risk assessment and zoning by integrating maxent and GIS in Hunan Province, China. Forests 12, 1299.
| Crossref | Google Scholar |

Zhang G, Wang M, Liu K (2019) Forest fire susceptibility modeling using a convolutional neural network for Yunnan Province of China. International Journal of Disaster Risk Science 10, 386-403.
| Crossref | Google Scholar |

Zheleznova I, Gushchina D, Meiramov Z, Olchev A (2022) Temporal and spatial variability of dryness conditions in Kazakhstan during 1979–2021 based on reanalysis data. Climate 10, 144.
| Crossref | Google Scholar |

Zong X, Tian X, Yin Y (2020) Impacts of climate change on wildfires in Central Asia. Forests 11, 802.
| Crossref | Google Scholar |