A Big Earth Data Platform for Three Poles

This dataset was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19070100), directed by Tao Che of the Key Laboratory of Remote Sensing of Gansu Province, Northwest Institute of Eco-Environment and Resources, CAS. The team combined machine learning methods with multi-source gridded snow depth products to derive a long time series of snow depth over the Northern Hemisphere. First, the applicability of the artificial neural network (ANN), support vector machine (SVM) and random forest (RF) methods to snow depth fusion was compared, and the random forest method showed clear advantages. Second, the random forest method was used to fuse remote sensing snow depth products (AMSR-E, AMSR-2, NHSD and GlobSnow) with reanalysis data (ERA-Interim and MERRA-2). These gridded snow depth products, together with environmental factor variables, serve as the input independent variables of the model. In situ observations from Chinese meteorological stations (945), Russian meteorological stations (620), Russian snow survey data (514) and the global historical meteorological network (41261) were used as reference truth to train and verify the model. The daily gridded snow depth dataset for the snow hydrological years 1980 to 2019 (September 1 of the previous year to May 31 of the current year) was prepared on the cloud platform provided by CASEarth. Because the passive microwave brightness temperature data from 1980 to 1987 are available only every other day, a small number of gaps remain in the data for this period. Verification against ESM-SnowMIP and independent ground observations shows that the fused dataset is of higher quality than the input products.
Compared against ground observations, the coefficient of determination (R²) rises from 0.23 for the GlobSnow snow depth product before fusion to 0.81 for the fused data, and the corresponding root mean square error (RMSE) and mean absolute error (MAE) fall to 7.7 cm and 2.7 cm, respectively.
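The three verification scores quoted above can be computed from paired station and grid values. The following is a minimal sketch, not the authors' evaluation code: the function name `fusion_metrics` is hypothetical, and R² is assumed here to be the coefficient of determination 1 - SS_res/SS_tot (some studies report the squared Pearson correlation instead).

```python
import math


def fusion_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired snow depth values in cm.

    Assumption: R2 is 1 - SS_res / SS_tot, computed against the observations.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    ss_res = sum(e * e for e in errors)        # residual sum of squares
    rmse = math.sqrt(ss_res / n)               # root mean square error

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observations are constant; R2 is undefined")
    r2 = 1.0 - ss_res / ss_tot

    return r2, rmse, mae


# Illustrative values only, not taken from the dataset.
station_depth_cm = [10.0, 20.0, 30.0, 40.0]
fused_depth_cm = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = fusion_metrics(station_depth_cm, fused_depth_cm)
print(f"R2={r2:.3f} RMSE={rmse:.2f} cm MAE={mae:.2f} cm")
```

Running the same function on each input product and on the fused data gives a like-for-like comparison of the kind reported for GlobSnow and the fused dataset.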

This dataset describes the energy supply resilience of the countries along the Belt and Road; the higher the value, the stronger the energy supply resilience. The data were prepared with reference to the International Energy Agency (IEA) national energy statistics (https://www.iea.org/data-and-statistics), using year-by-year data on coal, oil and natural gas supply in countries along the Belt and Road for 2000-2019. The energy supply resilience product was derived through sensitivity and adaptability analysis, taking into account the year-by-year changes of each energy source.

The dataset includes ASTER GDEM data and its mosaic. ASTER Global DEM (ASTER GDEM) is a global digital elevation data product jointly released by NASA and Japan's Ministry of Economy, Trade and Industry (METI) on June 29, 2009. The DEM is based on observations from NASA's Earth observation satellite Terra. It is produced from the ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer) sensor, which collected 1.3 million stereo images covering more than 99% of the Earth's land surface. The data have a horizontal accuracy of 30 m (95% confidence) and an elevation accuracy of 7-14 m (95% confidence). This is the third global elevation dataset, with significantly higher resolution than the earlier SRTM3 DEM and GTOPO30 data. We downloaded the data for the Heihe River Basin from NASA's website (http://wist.echo.nasa.gov/api) and distribute them through the data center. The distributed data retain the original form of the data without any modification. For details about the ASTER GDEM preparation process, please refer to the data documents linked in the metadata, visit http://www.ersdac.or.jp/GDEM/E/3.html, or read the ASTER Global DEM documents at https://lpdaac.usgs.gov/. ASTER GDEM is distributed in tiles of 1×1 degree, each as a zip archive containing three files, named as follows: ASTGTM_NxxEyyy_dem.tif, ASTGTM_NxxEyyy_num.tif and reademe.pdf, where xx is the starting latitude and yyy is the starting longitude. The _dem.tif file is the DEM data file, the _num.tif file is the data quality file, and reademe is the data description file. To make the data easier to use, we mosaicked the individual tiles into an ASTER GDEM mosaic of the Heihe River Basin, which retains all the original features of ASTER GDEM without any resampling.
The mosaic includes two files: heihe_aster_gdem_mosaic_dem.img and Heihe_Aster_GDEM_Mosaic_num.img. The data are stored in ERDAS IMAGINE format, where the _dem.img file is the DEM data file and the _num.img file is the data quality file.
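The tile naming rule described above can be sketched as a small helper that maps a point to the tile files covering it. This is only an illustration: the function name is hypothetical, the "starting" latitude and longitude are assumed to be the integer degrees of the tile's south-west corner, and the S and W prefixes for the southern and western hemispheres are an assumption, since the text only shows the N/E pattern.

```python
import math


def aster_gdem_tile_files(lat, lon):
    """Return the (_dem.tif, _num.tif) file names of the 1x1 degree tile
    containing the point (lat, lon), given in decimal degrees.

    Assumptions: the tile is named after its south-west corner, and
    S / W prefixes are used outside the N / E hemispheres.
    """
    lat0 = math.floor(lat)   # starting latitude of the tile
    lon0 = math.floor(lon)   # starting longitude of the tile
    ns = "N" if lat0 >= 0 else "S"
    ew = "E" if lon0 >= 0 else "W"
    stem = f"ASTGTM_{ns}{abs(lat0):02d}{ew}{abs(lon0):03d}"
    return stem + "_dem.tif", stem + "_num.tif"


# A point inside the Heihe River Basin (coordinates chosen for illustration).
dem_file, quality_file = aster_gdem_tile_files(38.9, 100.4)
print(dem_file, quality_file)
```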

The SRTM sensor has two bands, C-band and X-band; the SRTM data used here come from the C-band. The publicly released SRTM digital elevation products include DEM data at three resolutions:
* SRTM1 covers only the continental United States, with a spatial resolution of 1 arc-second.
* SRTM3 covers the world with a spatial resolution of 3 arc-seconds and is the most widely used dataset. Its elevation reference is the EGM96 geoid and its horizontal reference is WGS84. The nominal absolute elevation accuracy is ±16 m, and the absolute horizontal accuracy is ±20 m.
* SRTM30 also covers the world, with a resolution of 30 arc-seconds.
There are multiple versions of the SRTM data. The early SRTM data were processed by the ground data processing system (GDPS) of NASA's Jet Propulsion Laboratory (JPL) and are called SRTM3-1. The National Geospatial-Intelligence Agency processed the data further, significantly reducing the data voids; this version is called SRTM3-2. This dataset is mainly version 4 of the SRTM terrain data, produced by CIAT (International Center for Tropical Agriculture) using a new interpolation algorithm that better fills the voids in the SRTM 90 m data. The interpolation algorithm comes from Reuter et al. (2007). The SRTM data are organized as follows: each file covers a tile of 5 × 5 degrees, and the tiles are arranged in 24 rows (-60 to 60 degrees latitude) and 72 columns (-180 to 180 degrees longitude). The file naming rule is srtm_XX_YY.zip, where XX is the column number (01-72) and YY is the row number (01-24). The resolution of the data is 90 m. Data use: SRTM data use a 16-bit value to represent elevation (-32767 to +32767 m); the maximum positive elevation is 9000 m and the maximum negative elevation is 12,000 m below sea level. The value -32767 stands for no data.
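The tile layout and no-data value described above can be turned into a short lookup. This is a sketch under stated assumptions rather than an official tool: the function names are hypothetical, columns are assumed to be counted eastward from 180°W, and rows are assumed to be counted southward from 60°N (the text gives the ranges but not the direction of counting).

```python
NODATA = -32767  # no-data value as stated in the dataset description


def cgiar_srtm_tile(lat, lon):
    """Return the srtm_XX_YY.zip name of the 5x5 degree tile containing
    the point (lat, lon), given in decimal degrees.

    Assumptions: column 01 starts at 180W and increases eastward;
    row 01 starts at 60N and increases southward.
    """
    if not (-60 <= lat <= 60 and -180 <= lon <= 180):
        raise ValueError("point lies outside the SRTM tile grid")
    # min() keeps points on the eastern / southern edge in the last tile.
    col = min(int((lon + 180) // 5) + 1, 72)
    row = min(int((60 - lat) // 5) + 1, 24)
    return f"srtm_{col:02d}_{row:02d}.zip"


def mask_nodata(elevations):
    """Replace the no-data value with None so it is not read as a height."""
    return [None if v == NODATA else v for v in elevations]


print(cgiar_srtm_tile(38.9, 100.4))
print(mask_nodata([1500, NODATA, 0]))
```

Masking the no-data value before computing statistics avoids treating -32767 as a real elevation far below sea level.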

This dataset includes daily averages of temperature, pressure, relative humidity, wind speed, precipitation, global radiation, PM2.5 concentration and other meteorological elements observed by the Qomolangma Station for Atmospheric and Environmental Observation and Research from 2005 to 2016. The data are intended to serve students and researchers engaged in meteorological research on the Tibetan Plateau. Precipitation is observed manually with a rain gauge, evaporation is observed with a Φ20 mm evaporation pan, and all other variables are daily averages and ten-day means computed from half-hourly observations. All data are observed and collected in strict accordance with the Equipment Operating Specifications, and obviously erroneous values are removed during processing.

Population age structure resilience reflects the level of resilience of the population age structure in the countries along the Belt and Road. The data were prepared from the World Bank's statistical database. The population age structure resilience product was derived through a comprehensive diagnosis based on sensitivity and adaptability analysis, taking into account the year-on-year change of each indicator.

Population growth resilience reflects the level of resilience of population growth in the countries along the Belt and Road; the higher the value, the stronger the resilience of population growth. The data were prepared with reference to the World Bank's statistical database, using the year-on-year changes in the population of countries along the Belt and Road from 2000 to 2019. The population growth resilience product was derived through a comprehensive diagnosis based on sensitivity and adaptability analysis, taking into account the year-on-year changes in each indicator.

This dataset is based on an evaluation of existing land cover data and on evidence theory. A 1:100,000 land use map for the year 2000, a 1:1,000,000 vegetation map, a 1:1,000,000 swamp-wetland map, a glacier map and a Moderate-Resolution Imaging Spectroradiometer land cover map for China in 2001 (MODIS2001) were merged. The final decision was made on the principle of maximum trust, producing a new 1 km land cover dataset of China for 2000 in the IGBP classification system. The new land cover data not only maintain the overall accuracy of China's land use data, but also add the vegetation type and vegetation season information from China's vegetation map, update China's wetland map, incorporate the latest information from China's glacier map, and make the classification system more general.

There are many lakes on the Qinghai-Tibet Plateau. The ice phenology and ice duration of lakes in this region are very sensitive to regional and global climate change, so they are used as key indicators in climate change research, especially in comparative studies of environmental change at the Earth's three poles. However, because of the harsh natural environment and sparse population, conventional field measurements of lake ice phenology are lacking. Lake ice was therefore monitored at a resolution of 500 m using MODIS normalized difference snow index (NDSI) data. The traditional SNOWMAP algorithm is used to detect daily lake ice extent and coverage under clear-sky conditions, and the daily ice extent and coverage under cloud cover are re-determined through a series of steps based on the spatiotemporal continuity of lake surface conditions. Through time series analysis, 308 lakes larger than 3 km² were identified as having valid records of lake ice extent and coverage, forming a daily lake ice extent and coverage dataset that includes 216 lakes.
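The clear-sky step of such an NDSI-based detection can be sketched as follows. This is a simplified illustration, not the dataset's processing chain: the function names are hypothetical, the NDSI is taken as (green - SWIR) / (green + SWIR), and the threshold of 0.4 is the value commonly cited for SNOWMAP, assumed here rather than confirmed for this dataset. Cloud-covered pixels are simply skipped; the spatiotemporal gap filling is not reproduced.

```python
def ndsi(green, swir):
    """Normalized difference snow index from green and shortwave-infrared
    reflectance. Returns 0.0 when both reflectances are zero, to avoid
    dividing by zero."""
    total = green + swir
    if total == 0:
        return 0.0
    return (green - swir) / total


def clear_sky_ice_fraction(pixels, threshold=0.4):
    """Fraction of clear-sky lake pixels classified as ice.

    `pixels` is a list of (green, swir) reflectance pairs, with None
    marking a cloud-covered pixel. The threshold is an assumption.
    Returns None when no clear-sky pixel is available.
    """
    clear = [p for p in pixels if p is not None]
    if not clear:
        return None
    ice = sum(1 for green, swir in clear if ndsi(green, swir) >= threshold)
    return ice / len(clear)


# Illustrative reflectances: two ice pixels, one open-water pixel, one cloud.
lake_pixels = [(0.8, 0.1), (0.7, 0.1), (0.10, 0.08), None]
print(clear_sky_ice_fraction(lake_pixels))
```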

Near-surface atmospheric forcing data were produced with the Weather Research and Forecasting (WRF) model over the Heihe River Basin at hourly, 0.05° × 0.05° resolution. The variables are 2 m temperature, surface pressure, water vapor mixing ratio, downward shortwave and upward longwave radiation, 10 m wind field and accumulated precipitation. The forcing data were validated at different time scales against daily observations from 15 conventional automatic weather stations of the China Meteorological Administration (CMA) and hourly observations from several sites of the Heihe River Basin eco-hydrological comprehensive remote sensing observation experiments (WATER and HiWATER). The validation leads to the following conclusions: 2 m temperature, surface pressure and 2 m relative humidity are reliable, especially 2 m temperature and surface pressure, for which the average errors are very small and the correlation coefficients exceed 0.96; the correlation between downward shortwave radiation and the WATER site observations exceeds 0.9; and precipitation, verified separately for rain and snow at yearly, monthly and daily time scales, agrees well with the observations. The correlation coefficients between rainfall and the observations reach 0.94 and 0.84 at the monthly and yearly time scales, respectively; the correlation between snowfall and the observations reaches 0.78 at the monthly scale, and the spatial distribution of snowfall agrees well with the MODIS fractional snow cover product. The verification of liquid and solid precipitation shows that the WRF model can be used for downscaling analysis in the complex, arid terrain of the Heihe River Basin, and that the simulated data meet the requirements of watershed-scale hydrological modeling and water balance analysis. The data for 2000-2012 were provided in 2013, and the dataset was extended with the data for 2013-2015 in 2016, for 2016-2018 in 2019, and for 2019-2021 in 2021.

The Antarctic ice sheet is one of the largest potential sources of global sea level rise. Accurately determining the mass budget of the ice sheet is the key to understanding its dynamic changes, its evolution, and the accurate prediction of future global sea level rise. Based on the MEaSUREs Antarctic grounding line and basin boundaries, we discretize the grounding line, combine the MEaSUREs and RAMP annual ice velocity data from 1985 to 2015 with the BedMachine ice thickness data, and calculate, in vector form, the ice discharge at each flux gate along the grounding line. We use the surface mass balance data of the RACMO2.3p2 model to calculate the surface mass balance of each basin spatially, and combine it with the ice discharge results to obtain the Antarctic ice sheet mass balance dataset (1985-2015). The dataset includes the mass balance results of each basin of the Antarctic ice sheet for 1985, 2000 and 2015, together with the annual ice velocity, ice thickness and annual ice discharge at the location of each flux gate. The dataset provides a fine-scale evaluation of ice flux at the grounding line and reflects the changes and spatial distribution of the mass balance of each basin of the Antarctic ice sheet over the past 30 years. It provides basic data for subsequent fine-scale evaluation and prediction of changes in the mass balance of the Antarctic ice sheet and for exploring the mechanisms of ice sheet loss.
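The flux-gate calculation described above can be illustrated with a minimal input-output sketch. It is not the authors' implementation: the class and function names are hypothetical, the ice density of 917 kg/m³ is an assumed constant, and discharge through a gate is taken as density × thickness × gate width × the velocity component normal to the gate, with basin mass balance equal to surface mass balance minus discharge.

```python
from dataclasses import dataclass

ICE_DENSITY = 917.0   # kg/m^3, assumed value
KG_PER_GT = 1.0e12    # kilograms in one gigatonne


@dataclass
class FluxGate:
    thickness_m: float   # ice thickness at the gate
    width_m: float       # length of the grounding line segment
    vx_m_yr: float       # ice velocity components at the gate
    vy_m_yr: float
    nx: float            # unit normal of the gate, pointing seaward
    ny: float


def gate_discharge_gt_yr(gate):
    """Ice discharge through one gate in Gt/yr, using only the velocity
    component perpendicular to the gate (the dot product v . n)."""
    v_normal = gate.vx_m_yr * gate.nx + gate.vy_m_yr * gate.ny
    return ICE_DENSITY * gate.thickness_m * gate.width_m * v_normal / KG_PER_GT


def basin_mass_balance_gt_yr(smb_gt_yr, gates):
    """Basin mass balance: surface mass balance minus total discharge."""
    discharge = sum(gate_discharge_gt_yr(g) for g in gates)
    return smb_gt_yr - discharge


# Two illustrative gates; the numbers are not taken from the dataset.
gates = [
    FluxGate(1000.0, 1000.0, 100.0, 0.0, 1.0, 0.0),
    FluxGate(500.0, 2000.0, 0.0, 200.0, 0.0, 1.0),
]
print(basin_mass_balance_gt_yr(0.5, gates))
```

Projecting the velocity onto the gate normal is what makes the calculation a vector one: ice flowing parallel to the grounding line contributes no discharge.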

The mass loss of the Greenland ice sheet has been the main contributor to global sea level rise in recent decades. Under global warming, the Greenland ice sheet is melting faster. It is of great scientific significance to explore the causes of this mass loss and its response to climate change. Based on the MEaSUREs Greenland grounding line and basin boundaries, we discretize the grounding line, combine the MEaSUREs annual ice velocity data from 1985 to 2015 with the BedMachine v3 ice thickness data, and calculate, in vector form, the ice discharge at each flux gate along the grounding line. We use the surface mass balance data of the RACMO2.3p2 model to calculate the surface mass balance of each basin spatially, and combine it with the ice discharge results to obtain the Greenland ice sheet mass balance dataset (1985-2015). The dataset includes the mass balance results of each basin of the Greenland ice sheet for 1985, 2000 and 2015, together with the annual ice velocity, ice thickness and annual ice discharge at the location of each flux gate. The dataset provides a fine-scale evaluation of ice flux at the grounding line and reflects the changes and spatial distribution of the mass balance of each basin of the Greenland ice sheet over the past 30 years. It provides basic data for subsequent fine-scale evaluation and prediction of changes in the mass balance of the Greenland ice sheet and for exploring the mechanisms of ice sheet loss.