Input data

Climate data

The climatic data required for simulations using the AquaCrop model include daily maximum and minimum temperatures at 2 meters, precipitation, and reference evapotranspiration (ETo) (Raes et al., 2023). For simulating drought impacts under the historical scenario, climate data were sourced from the ERA5 reanalysis dataset (spatial resolution of 0.25 degrees) of the European Centre for Medium-Range Weather Forecasts (ECMWF). These essential climate variables were accessed using the Open-Meteo API for the period spanning 1960 to 2022. 
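As an illustration of how such a request might look, the sketch below builds a query URL for the Open-Meteo historical weather endpoint. The endpoint address, the daily variable names, and the `models=era5` selector reflect the public Open-Meteo documentation as understood here and should be treated as assumptions; the function name and example coordinates are hypothetical and not part of d-iap.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Open-Meteo historical weather API.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"

# Assumed names of the daily variables matching the AquaCrop inputs.
DAILY_VARIABLES = [
    "temperature_2m_max",
    "temperature_2m_min",
    "precipitation_sum",
    "et0_fao_evapotranspiration",
]


def build_archive_url(lat, lon, start="1960-01-01", end="2022-12-31"):
    """Return a request URL for one grid cell (hypothetical helper)."""
    params = {
        "latitude": lat,
        "longitude": lon,
        "start_date": start,
        "end_date": end,
        "daily": ",".join(DAILY_VARIABLES),
        "models": "era5",   # assumption: restricts the source to ERA5
        "timezone": "UTC",
    }
    # Keep commas unescaped so the variable list stays readable.
    return f"{ARCHIVE_URL}?{urlencode(params, safe=',')}"


url = build_archive_url(-33.45, -70.66)
print(url)
```

The URL can then be fetched with any HTTP client; the response is expected to be JSON with one array per requested daily variable.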

The projected climate data for the future scenarios were obtained from the database of the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP). This cross-sectoral, cross-scale project is designed to complement intra-sectoral research efforts and ultimately provide a comprehensive picture of climate change risks at different levels of global warming. ISIMIP employs multi-model ensembles so that uncertainties at the different modelling stages can be assessed quantitatively. The ISIMIP framework includes bias-adjusted climate-input datasets on a global grid of 0.5° x 0.5° resolution, with daily time steps, alongside socioeconomic scenarios. Specifically, for d-iap, the latest phase of the protocol (ISIMIP3b) was selected, which is based on results from phase 6 of the Coupled Model Intercomparison Project (CMIP6). Daily projections of six variables (Near-Surface Relative Humidity, Precipitation, Surface Downwelling Shortwave Radiation, Near-Surface Wind Speed, Daily Maximum Near-Surface Air Temperature, and Daily Minimum Near-Surface Air Temperature) were extracted from five CMIP6 general circulation models. These models include three with low climate sensitivity (GFDL-ESM4, MPI-ESM1-2-HR, MRI-ESM2-0) and two with high climate sensitivity (IPSL-CM6A-LR, UKESM1-0-LL). The periods 2041-2060 and 2081-2100 were chosen to represent mid-century and end-century intervals, respectively. Three Shared Socioeconomic Pathways (SSP1-2.6, SSP3-7.0 and SSP5-8.5) were used, encompassing a range of plausible climate futures. The second number in each label corresponds to the approximate radiative forcing, in watts per square meter (W m⁻²), projected for 2100. SSP1-2.6 represents a low greenhouse gas (GHG) emissions scenario, in which CO₂ emissions decline to net zero around or after 2050, while SSP3-7.0 and SSP5-8.5 reflect high and very high GHG emissions, with CO₂ emissions roughly doubling from current levels by 2100 and 2050, respectively. Bulk download of climate data files was facilitated through the ISIMIP repository's API.
Subsequently, daily reference evapotranspiration (ETo) was estimated using a publicly available Python library.
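For orientation, the following is a minimal sketch of a daily FAO-56 Penman-Monteith calculation driven by the six ISIMIP variables. It is not the library used in d-iap, which is not named here. It assumes that wind speed is given at 10 m, that radiation is in W m⁻², that temperatures have already been converted to °C, that soil heat flux is negligible at the daily scale, and that albedo is 0.23. All function and argument names are hypothetical.

```python
import math


def svp(t_c):
    """Saturation vapour pressure e0(T) in kPa for a temperature in deg C."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def eto_fao56(tmax_c, tmin_c, rh_pct, rsds_wm2, wind10_ms, lat_deg, doy, elev_m=0.0):
    """Daily reference evapotranspiration (mm/day), FAO-56 Penman-Monteith."""
    t_mean = (tmax_c + tmin_c) / 2.0
    es = (svp(tmax_c) + svp(tmin_c)) / 2.0          # mean saturation pressure
    ea = rh_pct / 100.0 * es                        # actual pressure from mean RH
    delta = 4098.0 * svp(t_mean) / (t_mean + 237.3) ** 2
    pressure = 101.3 * ((293.0 - 0.0065 * elev_m) / 293.0) ** 5.26
    gamma = 0.000665 * pressure                     # psychrometric constant
    u2 = wind10_ms * 4.87 / math.log(67.8 * 10.0 - 5.42)  # 10 m -> 2 m wind
    rs = rsds_wm2 * 0.0864                          # W m-2 -> MJ m-2 day-1

    # Extraterrestrial and clear-sky radiation.
    phi = math.radians(lat_deg)
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * doy / 365.0)
    decl = 0.409 * math.sin(2.0 * math.pi * doy / 365.0 - 1.39)
    ws = math.acos(max(-1.0, min(1.0, -math.tan(phi) * math.tan(decl))))
    ra = (24.0 * 60.0 / math.pi) * 0.0820 * dr * (
        ws * math.sin(phi) * math.sin(decl)
        + math.cos(phi) * math.cos(decl) * math.sin(ws)
    )
    rso = (0.75 + 2e-5 * elev_m) * ra
    # Assumption: fall back to a clear-sky ratio of 1 during polar night.
    ratio = min(rs / rso, 1.0) if rso > 0 else 1.0

    # Net radiation (albedo 0.23), with soil heat flux G taken as 0.
    sigma = 4.903e-9
    tk4 = ((tmax_c + 273.16) ** 4 + (tmin_c + 273.16) ** 4) / 2.0
    rnl = sigma * tk4 * (0.34 - 0.14 * math.sqrt(ea)) * (1.35 * ratio - 0.35)
    rn = 0.77 * rs - rnl

    num = 0.408 * delta * rn + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return max(num / (delta + gamma * (1.0 + 0.34 * u2)), 0.0)


# Mid-latitude summer day (inputs loosely based on a FAO-56 worked example).
eto = eto_fao56(21.5, 12.3, 73.5, 255.4, 2.778, 50.8, 187, elev_m=100.0)
print(round(eto, 2))
```

Using mean relative humidity rather than separate daily maximum and minimum values is a simplification forced by the available variables, so results can differ slightly from calculations based on RHmax and RHmin.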

Soil data

The soil characteristics required by AquaCrop include soil water content at permanent wilting point (PWP), field capacity (FC), and saturation (SAT), as well as the hydraulic conductivity at saturation (Ksat). These parameters were estimated using pedotransfer functions based on soil texture (granulometric content), gravel content, and organic matter, following the methodology of Saxton and Rawls (2006). Soil texture, organic carbon content (from which organic matter was derived), gravel content, and rootable soil depth were sourced from the Harmonized World Soil Database version 2.0 (HWSD v2.0). This database is a comprehensive global soil inventory providing detailed information on the morphological, chemical, and physical properties of soils on a 30 arc-second global grid. HWSD v2.0 was jointly developed by the International Institute for Applied Systems Analysis (IIASA) and FAO, in partnership with the International Soil Reference and Information Centre (ISRIC), the European Soil Bureau Network (ESBN), and the Institute for Soil Sciences, Chinese Academy of Sciences (CAS).

The HWSD comprises a raster image file linked to an attribute database in Microsoft Access format, providing detailed information on soil composition for each of the 29,538 soil association mapping units (SMUs). Each SMU can include up to 12 soil unit/soil phase combination records with standardized soil parameter values for seven depth layers. To integrate these data with the simulation grid, the HWSD raster was clipped in QGIS (open-source Geographic Information System software), and the predominant SMU value for each grid cell was determined via zonal statistics. Special handling was applied to coastal areas, and pixels corresponding to water bodies were excluded. Since an SMU can consist of multiple soil units, each with its proportion given in a 'SHARE' (%) field, the weighted mean values of the target soil variables were calculated for each SMU across the seven depth layers. After this process, some areas had empty or null soil parameter values, corresponding to soils classified as sand dunes, salt flats, urban areas, and mining zones, among others. Specifically, 303 SMUs containing null values were identified, with 125 having a single soil unit (SHARE = 100%) and 178 having multiple soil units (SHARE < 100%). For the latter, the 'SHARE' percentages of soil units with non-null values were reweighted, excluding the percentage of soil units without information, using the formula:
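A renormalization consistent with this description can be sketched as follows. The exact form is an assumption: each non-null share is divided by the sum of the non-null shares within the SMU, so that the adjusted shares again total 100%, and the weighted mean is then taken over the remaining soil units. The function name and the example values are hypothetical.

```python
def weighted_mean_by_share(units):
    """Weighted mean of a soil variable over the soil units of one SMU.

    `units` is a list of (share_pct, value) pairs; value is None where the
    soil unit carries no information. Assumed reweighting:
        share_adj_i = share_i / sum(share_j for non-null j)
    """
    valid = [(share, value) for share, value in units if value is not None]
    total_share = sum(share for share, _ in valid)
    if total_share == 0:
        return None  # no usable soil unit in this SMU
    return sum(share / total_share * value for share, value in valid)


# SMU with three soil units; the third (10%) has no data for this variable.
clay_pct = weighted_mean_by_share([(60, 20.0), (30, 40.0), (10, None)])
print(clay_pct)
```

In this example the two informative units are reweighted from 60% and 30% to two thirds and one third, which is what dropping the 10% without information amounts to.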

For the group of 125 SMUs with SHARE = 100% (i.e., not associated with any other soil unit), a new grid was generated in QGIS using the r.reclass algorithm, in which these SMUs were assigned a NULL value. Subsequently, a new zonal statistics analysis was conducted to assign new SMU values to the affected cells (Figure 1). Finally, the weighted mean values of soil variables were calculated for each SMU across all available horizons.

Figure 1. Raster reclassification of the soil layer under the cropland layer. The blank areas correspond to SMU = NULL.
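The two-pass selection of a predominant SMU per grid cell can be illustrated in a few lines. This is only a sketch of the majority logic, not the QGIS workflow itself; the function name, the SMU identifiers, and the use of 0 as a water-body code are hypothetical.

```python
from collections import Counter


def predominant_smu(pixels, excluded=frozenset()):
    """Most frequent SMU id among the raster pixels of one grid cell.

    Ids listed in `excluded` (e.g. water bodies, or SMUs reclassified to
    NULL) are ignored. Returns None if no pixel remains.
    """
    kept = [p for p in pixels if p not in excluded]
    if not kept:
        return None
    return Counter(kept).most_common(1)[0][0]


cell_pixels = [11, 11, 7, 7, 11, 0]        # 0 stands for a water pixel here

first_pass = predominant_smu(cell_pixels, excluded={0})
# Suppose SMU 11 has no soil information: treat it as NULL and redo the pass.
second_pass = predominant_smu(cell_pixels, excluded={0, 11})
print(first_pass, second_pass)
```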

Regarding the soil variables, organic matter content was estimated from organic carbon using the 'van Bemmelen' conversion factor (1.724), which assumes that organic matter contains 58% organic carbon. For estimating the hydraulic soil properties (PWP, FC, SAT, Ksat), the Saxton and Rawls (2006) pedotransfer functions (PTFs) were employed via an R package (primarily providing wrapper functions for running and analysing the outputs of DSSAT CSM). This R package was integrated into the Python code using the 'Robjects' library. Minimum and maximum limits were established for each soil parameter to ensure that the PTFs operate within a valid range:

Table: minimum and maximum limits applied to each soil parameter.
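The organic matter conversion and the bounding of PTF inputs can be sketched as below. The bounds in `LIMITS` are hypothetical placeholders chosen only for illustration and are not the values applied in d-iap; the function and key names are likewise assumptions.

```python
# van Bemmelen factor: organic matter is assumed to contain 58% organic carbon.
VAN_BEMMELEN = 1.0 / 0.58   # about 1.724

# Hypothetical (min, max) bounds per PTF input, for illustration only.
LIMITS = {
    "organic_matter_pct": (0.1, 8.0),
    "clay_pct": (1.0, 60.0),
}


def organic_matter_pct(organic_carbon_pct):
    """Convert organic carbon (%) to organic matter (%)."""
    return organic_carbon_pct * VAN_BEMMELEN


def clamp(name, value):
    """Restrict a soil parameter to its allowed [min, max] range."""
    low, high = LIMITS[name]
    return max(low, min(high, value))


om = organic_matter_pct(1.16)                  # 1.16% OC -> about 2.0% OM
om_bounded = clamp("organic_matter_pct", 12.0)  # capped at the upper bound
print(round(om, 3), om_bounded)
```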

Crop data

d-iap assesses drought impacts on the crops that have been calibrated and included in the AquaCrop database: barley, cassava, cotton, dry bean, maize, paddy rice, potato, quinoa, sorghum, soybean, sugar beet, sugarcane, sunflower, teff, tomato, and wheat. The initial step involved the spatial allocation of these crops globally, based on national-level “area harvested” data for the year 2000 from the FAOSTAT statistical database. Although a generic cultivar was considered for each crop, some crop parameters were adjusted among different cultivable zones, taking into account information on local cultivar types and management practices. Specifically, the following parameters were adjusted for each area based on expert knowledge and a global literature review: reference harvest index (HIo), plant density, and the maximum canopy cover (CCx) associated with that plant density (Raes et al., 2023). Thus, a global crop database was generated for d-iap, incorporating these parameters at national level, and at subnational levels where applicable.

A key variable for conducting AquaCrop simulations is the sowing date, which varies annually and is primarily determined by precipitation patterns. The sowing date was therefore determined dynamically for each year, based on the occurrence of precipitation within a potential sowing window. Defining this window was crucial for the process. Various databases currently provide information on sowing and harvesting windows, but they have limitations due to the significant temporal and spatial variability of crop calendars. The FAO offers a valuable tool called the Crop Calendar, but it is primarily limited to agricultural areas in developing countries of Africa and Asia. To address this, the FAO Crop Calendar database was supplemented with other sowing date calendars. One such database is from the University of Wisconsin-Madison (USA), which digitizes and georeferences existing observations of crop planting and harvesting dates. This database provides gridded maps of planting dates for 19 crops at two different resolutions (5 arc-minutes and 0.5 degrees) and in two formats (netCDF and ArcINFO ASCII). Additionally, the database was enhanced with data from the Global Information and Early Warning System on Food and Agriculture (FAO-GIEWS), several sources from the United States Department of Agriculture (USDA) (including the USDA Crop Production Maps database and United States - Crop Production Maps), scientific publications, and expert knowledge from the development team in various cropping regions. While most data were specified at the national level, subnational data were included for large countries. This regionalization effort extended to 35 countries (20% of total countries), including Australia, Brazil, Canada, DR Congo, China, Ghana, and India. Additionally, in regions where a crop is grown more than once per year, different growing seasons with distinct planting windows were introduced for that crop.
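One way a precipitation-triggered sowing date could be derived from a sowing window is sketched below. The criterion used, namely the first day in the window on which rainfall accumulated over the preceding days reaches a threshold, is an assumption for illustration, as are the threshold of 20 mm, the 5-day accumulation period, and all names. The actual rule applied in d-iap is not specified here.

```python
from datetime import date, timedelta


def find_sowing_date(rain_by_day, window_start, window_end,
                     threshold_mm=20.0, n_days=5):
    """First day in the window whose n-day rainfall total meets the threshold.

    `rain_by_day` maps datetime.date to daily precipitation in mm; missing
    days count as 0. Returns None if the criterion is never met.
    """
    day = window_start
    while day <= window_end:
        total = sum(rain_by_day.get(day - timedelta(days=k), 0.0)
                    for k in range(n_days))
        if total >= threshold_mm:
            return day
        day += timedelta(days=1)
    return None


rain = {date(2020, 10, 3): 8.0, date(2020, 10, 5): 15.0}
sowing = find_sowing_date(rain, date(2020, 10, 1), date(2020, 11, 15))
print(sowing)
```

A simulation would also need a fallback for years in which the criterion is never met, for example sowing on the last day of the window; that choice is left open in this sketch.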
The start and end dates of the sowing period were integrated into the crop database (3053 entries). Each country's ISO code (ISO 3166-1 alpha-2) was incorporated to facilitate efficient data processing. For countries with regional or subnational information, ISO codes were included at a lower level (ISO 3166-2), associating crop values with specific states, regions, provinces, or districts as per available data. Integration of geographical coordinates from the simulation grid with ISO codes was achieved using the Nominatim API through a Python script, linking each cell's geographical coordinates with its corresponding country or region. This approach ensured that the simulation grid and crop database are geospatially linked.
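The link between grid coordinates and ISO codes might be sketched as follows. The reverse-geocoding endpoint, the `zoom` parameter, and the `country_code` and `ISO3166-2-lvl*` address fields follow the public Nominatim documentation as understood here and should be treated as assumptions. The helper names and the sample response are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Nominatim reverse-geocoding service.
NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"


def reverse_url(lat, lon, zoom=5):
    """Build a reverse-geocoding URL; zoom=5 is assumed to mean state level."""
    query = urlencode({"lat": lat, "lon": lon, "format": "jsonv2", "zoom": zoom})
    return f"{NOMINATIM_REVERSE}?{query}"


def iso_codes(response):
    """Extract (ISO 3166-1 alpha-2, ISO 3166-2) codes from a parsed response."""
    address = response.get("address", {})
    country = address.get("country_code")
    country = country.upper() if country else None
    # Subdivision codes are assumed to sit under keys such as "ISO3166-2-lvl4".
    subdivision = next(
        (v for k, v in sorted(address.items()) if k.startswith("ISO3166-2-")),
        None,
    )
    return country, subdivision


# Hypothetical response fragment for a grid cell in Bahia, Brazil.
sample = {"address": {"state": "Bahia", "ISO3166-2-lvl4": "BR-BA",
                      "country": "Brasil", "country_code": "br"}}
codes = iso_codes(sample)
print(reverse_url(-12.5, -41.7), codes)
```

When querying a public geocoding service for a whole simulation grid, requests should be throttled and cached in line with the service's usage policy.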

Economic data

For assessing the economic impact of drought through various indicators (see 'What can you get from d-iap' section), the producer price is required. The producer price represents the compensation received by producers for each unit of goods or services sold, determined 'at the farm gate' or initial point of sale. This price excludes any VAT or similar deductible taxes invoiced to the purchaser and any separately invoiced transport charges by the producer. The data were sourced from FAOSTAT and primarily originated from national sources collected through the FAO questionnaire, which gathers information on annual and monthly producer prices for primary crops. The data were downloaded and analyzed to determine a central value that represents the price distribution. These prices, measured in dollars per tonne, span from 1991 to 2022 and encompass annual values for all crops. For cross-country price comparisons, prices are directly converted into dollars. However, to conduct year-on-year comparisons accurately, inflation must be considered. The World Bank defines six measures of inflation: Headline Consumer Price Index (CPI) inflation, Food CPI inflation, Energy CPI inflation, Core CPI inflation, Producer Price Index inflation, and the Gross Domestic Product (GDP) deflator. The GDP deflator, used in d-iap assessments, represents the change in prices over time for a product or a basket of products by comparing a reference period to a base period. It is calculated by dividing the current price value of a given aggregate by its real counterpart. To obtain real prices, the producer prices were divided by the decimal form of the GDP deflator (sourced from FAOSTAT), expressed in dollars using 2015 as the base year. This allows for the calculation of real values that account for country inflation, thus providing a more accurate representation of purchasing power over time. 
Although the FAO provides the Producer Price Index, which records annual changes in the selling prices received by farmers, this index was not used due to its limited data compared to the GDP deflator.
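The conversion from nominal to real prices amounts to a single division, sketched here under the assumption that the deflator is published as an index with 2015 = 100. The function name and the example figures are hypothetical.

```python
def real_price(nominal_usd_per_tonne, gdp_deflator_index, base=100.0):
    """Deflate a nominal producer price to base-year (2015) dollars.

    The index is first converted to its decimal form (index / base).
    """
    return nominal_usd_per_tonne / (gdp_deflator_index / base)


# A nominal price of 250 USD/t in a year whose deflator is 125 (2015 = 100)
# corresponds to 200 USD/t in 2015 dollars.
price_2015 = real_price(250.0, 125.0)
print(price_2015)
```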

FAO producer price data are incomplete or missing for certain countries. To address this issue, countries were grouped according to socioeconomic and geographic factors. Central values were calculated as the median, which is more robust and less sensitive to outliers than the mean. For countries with inadequate data or fewer than five prices available, the median of the group was used instead. Where neither the country nor its group provided enough data for a reliable value, 'NO DATA' was recorded in the generated database.
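The fallback logic can be summarized in a short sketch. The minimum of five prices comes from the text; applying the same minimum to the group, and pooling the prices of all countries in a group before taking the median, are assumptions, as are the names used.

```python
from statistics import median

MIN_PRICES = 5  # minimum number of annual prices for a country-level median


def central_price(country_prices, group_prices):
    """Median producer price with group fallback.

    `group_prices` holds the pooled prices of the country's socioeconomic
    and geographic group (assumed pooling).
    """
    if len(country_prices) >= MIN_PRICES:
        return median(country_prices)
    if len(group_prices) >= MIN_PRICES:   # assumed threshold for the group
        return median(group_prices)
    return "NO DATA"


own = central_price([210, 190, 205, 400, 198], [150, 160])
fallback = central_price([210, 190], [150, 160, 170, 180, 500])
missing = central_price([210], [150])
print(own, fallback, missing)
```

In the first case the outlier of 400 leaves the median at 205, which illustrates why the median was preferred as the central value.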
