Dataset of ground truth land surface evapotranspiration at the satellite pixel scale in the Heihe River Basin (from multi-station observations to satellite pixel scale) Version 1.0

Surface evapotranspiration (ET) is a key link in the water cycle and energy exchange of the Earth system. Accurate ET estimates support studies of global climate change, crop yield estimation, and drought monitoring, and they provide important guidance for regional and global water resources planning and management. With the development of remote sensing technology, remote sensing estimation has become an effective way to obtain regional and global ET, and a variety of low- and medium-resolution ET products have been produced and released operationally. However, the model mechanisms, input data, and parameterization schemes of remote sensing ET models still carry many uncertainties, so the accuracy of remotely sensed ET products must be quantitatively evaluated through validation. Validation in turn faces a spatial scale mismatch between the remotely sensed ET estimates and the site observations, so the key step is to obtain the relative ground truth of ET at the satellite pixel scale.

This dataset is built from two groups of station observations (automatic meteorological stations, eddy covariance systems, large aperture scintillometers, etc.). The first group is the flux observation matrix of the "multi-scale observation experiment on evapotranspiration over heterogeneous land surfaces" in the middle reaches of the Heihe River Basin from June to September 2012: stations 4 (village), 5 (corn), 6 (corn), 7 (corn), 8 (corn), 11 (corn), 12 (corn), 13 (corn), 14 (corn), 15 (corn), and 17 (orchard). The second group, in the lower reaches oasis from January to December 2014, comprises the Populus euphratica forest station (Populus euphratica forest), the mixed forest station (Tamarix / Populus euphratica), the bare land station (bare land), the farmland station (melon), and the Sidaoqiao station (Tamarix). These observations are used together with high-resolution remote sensing data (surface temperature, vegetation index, net radiation, etc.), which serve as auxiliary data. See Fig. 1 for the distribution map.

Taking the heterogeneity of the underlying surface into account, six upscaling methods (area-weighted method, upscaling method based on the Priestley-Taylor formula, unequal-weight area-to-area regression kriging, artificial neural network, random forest, and deep belief network) were compared and analyzed through direct validation and cross-validation, and a combined method was finally selected: the area-weighted method when the underlying surface is homogeneous, unequal-weight area-to-area regression kriging when it is moderately heterogeneous, and the random forest method when it is highly heterogeneous. This combined method was used to obtain the relative ground truth of ET at the MODIS pixel scale (spatial resolution of 1 km), both instantaneous at satellite overpass and daily, for the flux observation matrix area in the middle reaches and for the lower reaches, and the results were validated against large aperture scintillometer observations. The overall accuracy of the dataset is good: the mean absolute percentage error (MAPE) of the instantaneous and daily pixel-scale relative ground truth is 2.6% and 4.5% in the middle reaches, and 9.7% and 12.7% in the lower reaches, respectively. The dataset can be used to validate other remote sensing products. Pixel-scale ET ground truth not only resolves the spatial mismatch between the remotely sensed estimates and the station observations, but also allows the uncertainty of the validation process to be characterized. For all site information and upscaling methods, please refer to Li et al. (2018) and Liu et al. (2016); for observation data processing, please refer to Liu et al. (2016).
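
As an illustration of two ideas named in the description above, the following minimal Python sketch shows an area-weighted average of station ET within one pixel and the MAPE used to report accuracy. It is not the dataset authors' implementation: the function names, the weights, and the sample numbers are hypothetical, and the weights are simply assumed to be the fraction of the pixel that each station represents.

```python
def area_weighted_et(station_et, area_fractions):
    """Upscale station ET to one pixel with area weights.

    station_et     -- ET observed at each station (any consistent unit)
    area_fractions -- assumed share of the pixel each station represents
    """
    if len(station_et) != len(area_fractions) or not station_et:
        raise ValueError("need one weight per station")
    total = sum(area_fractions)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(et * w for et, w in zip(station_et, area_fractions))
    return weighted / total  # normalise in case weights do not sum to 1


def mape(estimates, references):
    """Mean absolute percentage error, in percent, of estimates vs. references."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    terms = [abs(e - r) / abs(r) for e, r in zip(estimates, references)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical numbers, for illustration only.
pixel_et = area_weighted_et([4.0, 2.0], [0.75, 0.25])
error_percent = mape([110.0, 90.0], [100.0, 100.0])
```

The weights are normalised inside the function so that partial coverage of a pixel does not bias the average; the regression kriging and random forest methods mentioned above need auxiliary remote sensing inputs and are outside the scope of this sketch.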

Passive microwave SSM/I brightness temperature dataset for China (1987-2007)

This dataset contains the microwave brightness temperatures measured by the spaceborne microwave radiometer SSM/I carried on the satellites of the US Defense Meteorological Satellite Program (DMSP). It provides twice-daily (ascending and descending) brightness temperatures for seven channels: 19H, 19V, 22V, 37H, 37V, 85H, and 85V.

The Special Sensor Microwave/Imager (SSM/I) was developed by the Hughes Corporation of the United States. It first flew in 1987 on the Block 5D-2/F8 satellite of the DMSP. In the 10 years between the launch of that DMSP satellite in 1987 and the launch of TRMM in 1997, the SSM/I was the world's most advanced spaceborne passive microwave remote sensing instrument, with the highest spatial resolution of any such instrument. The DMSP satellite is in a near-polar, circular, sun-synchronous orbit with an altitude of approximately 833 km, an inclination of 98.8 degrees, and an orbital period of 102.2 minutes. It crosses the equator at approximately 6:00 local time and covers the whole globe once every 24 hours.

The SSM/I has seven channels at four frequencies, centered at 19.35, 22.24, 37.05, and 85.50 GHz. The instrument actually comprises seven independent, total-power, balanced-mixing, superheterodyne passive microwave radiometer systems, which simultaneously measure microwave radiation from the Earth's surface and the atmosphere. Except for 22.24 GHz, which is measured in vertical polarization only, every frequency is measured in both horizontal and vertical polarization.

Selected characteristics of the SSM/I channels:

   Channel   Frequency (GHz)   Polarization (V/H)   Spatial resolution (km × km)   Footprint size (km)
   19V       19.35             V                    25 × 25                        56
   19H       19.35             H                    25 × 25                        56
   22V       22.24             V                    25 × 25                        45
   37V       37.05             V                    25 × 25                        33
   37H       37.05             H                    25 × 25                        33
   85V       85.50             V                    12.5 × 12.5                    14
   85H       85.50             H                    12.5 × 12.5                    14

1. File Format and Naming: Each group of data consists of a remote sensing data file, a .JPG image file, and a .met auxiliary information file, as well as a .TIM time information file and its corresponding .met auxiliary file. The file names in the SSMI_Grid_China directory follow these rules:

   China-EASE-Fnn-ML/HaaaabbbA/D.ccH/V (remote sensing data)
   China-EASE-Fnn-ML/HaaaabbbA/D.ccH/V.jpg (image file)
   China-EASE-Fnn-ML/HaaaabbbA/D.ccH/V.met (auxiliary information file)
   China-EASE-Fnn-ML/HaaaabbbA/D.TIM (time information file)
   China-EASE-Fnn-ML/HaaaabbbA/D.TIM.met (auxiliary file for the time information)

Here, EASE stands for the EASE-Grid projection; Fnn is the carrier satellite number (F08, F11, or F13); ML/H is multichannel low resolution (ML) or multichannel high resolution (MH); aaaa is the year; bbb is the Julian day of the year; A/D is ascending (A) or descending (D); cc is the channel frequency (19, 22, 37, or 85); and H/V is horizontal (H) or vertical (V) polarization.

2. Coordinate System and Projection: The projection is an equal-area secant cylindrical projection with standard parallels at 30 degrees north and south. For more information on EASE-Grid, please refer to http://www.ncgia.ucsb.edu/globalgrids-book/ease_grid/. To convert the EASE-Grid projection into geographic coordinates, please refer to the ease2geo.prj file, which reads as follows:

   Input
   Projection cylindrical
   Units meters
   Parameters 6371228 6371228
   1 /* Enter projection type (1, 2, or 3)
   0 00 00 /* Longitude of central meridian
   30 00 00 /* Latitude of standard parallel
   Output
   Projection GEOGRAPHIC
   Spheroid KRASovsky
   Units dd
   Parameters
   End

3. Data Format: Stored as binary integers on a 308 × 166 grid; each value occupies 2 bytes. The stored values are the brightness temperatures multiplied by 10, so after reading the data, divide by 10 to obtain the true brightness temperature.

4. Data Resolution: Spatial resolution: 25 km; 12.5 km for the SSM/I 85 GHz channels. Temporal resolution: daily, from 1987 to 2007.

5. Spatial Coverage: Longitude: 60°-140° E; latitude: 15°-55° N.

6. Data Reading: Each group of data includes remote sensing image data files, .JPG image files, and .met auxiliary information files. The .JPG files can be opened with any standard image viewer, such as Windows Picture and Fax Viewer. The .met auxiliary information files can be opened with a text editor such as Notepad, and the remote sensing image data files can be opened in ENVI and ERDAS software.
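
For readers who prefer a script to ENVI or ERDAS, the storage rule in the Data Format item above (2-byte integers, brightness temperature × 10, 308 × 166 values) can be sketched in Python as follows. Several details are assumptions that the description does not state and that should be checked against a real file: that the values are unsigned, that the byte order is little-endian, and that the grid is 308 columns by 166 rows stored row by row. The function name `read_brightness_temperature` is hypothetical.

```python
import struct

N_COLS = 308   # assumption: 308 is the number of columns
N_ROWS = 166   # assumption: 166 is the number of rows
SCALE = 10.0   # stored value = brightness temperature * 10 (from the description)


def read_brightness_temperature(path, byte_order="<"):
    """Read one 2-byte-integer grid and return rows of brightness temperatures.

    byte_order -- "<" (little-endian, assumed default) or ">" (big-endian)
    """
    n_values = N_COLS * N_ROWS
    with open(path, "rb") as handle:
        raw = handle.read()
    if len(raw) != 2 * n_values:
        raise ValueError(
            f"expected {2 * n_values} bytes for a {N_COLS} x {N_ROWS} grid, "
            f"got {len(raw)}"
        )
    # "H" = unsigned 16-bit integer (assumption; use "h" if values are signed).
    counts = struct.unpack(f"{byte_order}{n_values}H", raw)
    # Rescale and reshape into a list of rows (row-major order assumed).
    return [
        [counts[row * N_COLS + col] / SCALE for col in range(N_COLS)]
        for row in range(N_ROWS)
    ]
```

The size check is deliberate: a file whose length does not match 308 × 166 × 2 bytes (for example a 12.5 km 85 GHz grid) fails loudly instead of being silently misread.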

Dataset of passive microwave SSM/I and SSMIS brightness temperature in China (1987-2015)

This dataset mainly contains the twice-daily (ascending and descending orbit) brightness temperatures (K) measured by the spaceborne microwave radiometers SSM/I and SSMIS carried on the US Defense Meteorological Satellite Program satellites DMSP-F08, DMSP-F11, DMSP-F13, and DMSP-F17. It covers September 15, 1987 to December 31, 2015. The SSM/I brightness temperatures from DMSP-F08, DMSP-F11, and DMSP-F13 include seven channels: 19.35H, 19.35V, 22.24V, 37.05H, 37.05V, 85.50H, and 85.50V. The SSMIS brightness temperatures from DMSP-F17 also include seven channels: 19.35H, 19.35V, 22.24V, 37.05H, 37.05V, 91.66H, and 91.66V. The coverage period of each satellite is as follows:

   DMSP-F08: September 15, 1987 to December 31, 1991
   DMSP-F11: January 1, 1992 to December 31, 1995
   DMSP-F13: January 1, 1996 to April 29, 2009
   DMSP-F17: January 1, 2009 to December 31, 2015

1. File format and naming: The brightness temperatures are stored in one directory per year. Each directory contains the remote sensing data files for each frequency, and the SSMIS data also include .TIM time information files. The file names follow these rules:

   EASE-Fnn-ML/HyyyydddA/D.subset.ccH/V (remote sensing data)
   EASE-Fnn-ML/HyyyydddA/D.subset.TIM (time information file)

Here, EASE stands for the EASE-Grid projection; Fnn is the satellite number (F08, F11, F13, or F17); ML/H is multichannel low resolution (ML) or multichannel high resolution (MH); yyyy is the year; ddd is the Julian day of the year (1-365/366); A/D is ascending (A) or descending (D); subset indicates that the file holds the brightness temperature data for China; cc is the frequency (19.35 GHz, 22.24 GHz, 37.05 GHz, and 85.50 GHz or 91.66 GHz); and H/V is horizontal (H) or vertical (V) polarization.

2. Coordinate system and projection: The projection of this dataset is EASE-Grid, an equal-area secant cylindrical projection with standard parallels at 30° north and south. For more information about EASE-Grid, please refer to http://www.ncgia.ucsb.edu/globalgrids-book/ease_grid/. To convert the EASE-Grid projection to geographic coordinates, please refer to the ease2geo.prj file, whose content is as follows:

   Input
   Projection cylindrical
   Units meters
   Parameters 6371228 6371228
   1 /* Enter projection type (1, 2, or 3)
   0 00 00 /* Longitude of central meridian
   30 00 00 /* Latitude of standard parallel
   Output
   Projection GEOGRAPHIC
   Spheroid KRASovsky
   Units dd
   Parameters
   End

3. Data format: Stored as binary integers on a 308 × 166 grid; each value occupies 2 bytes. The stored values are the brightness temperatures multiplied by 10, so after reading the data, divide by 10 to obtain the real brightness temperature.

4. Data resolution: Spatial resolution: 25.067525 km; 12.5 km for SSM/I 85 GHz and SSMIS 91 GHz. Temporal resolution: daily, from 1987 to 2015.

5. Spatial range: Longitude: 60.1°-140.0° E; latitude: 14.9°-55.0° N.

6. Data reading: The remote sensing image data files in each set of data can be opened in ArcMap, ENVI, and ERDAS software.
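
The naming rule in item 1 above packs satellite, resolution, date, orbit direction, frequency, and polarization into one file name. The Python sketch below shows one way to unpack it. It assumes that real file names contain no spaces and that cc appears as a two-digit frequency code such as 19 or 91; neither is confirmed by the description. The function name `parse_subset_name` and the sample file name are hypothetical.

```python
import re
from datetime import date, timedelta

# Pattern for EASE-Fnn-ML/HyyyydddA/D.subset.ccH/V
# (assumption: cc is a two-digit frequency code, e.g. "19" or "91").
NAME_PATTERN = re.compile(
    r"^EASE-(?P<satellite>F\d{2})-M(?P<resolution>[LH])"
    r"(?P<year>\d{4})(?P<day>\d{3})(?P<orbit>[AD])"
    r"\.subset\.(?P<frequency>\d{2})(?P<polarization>[HV])$"
)


def parse_subset_name(name):
    """Split a data file name into its fields and add the calendar date."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised file name: {name!r}")
    fields = match.groupdict()
    year = int(fields["year"])
    day_of_year = int(fields["day"])
    # Day 1 is January 1, so add (day_of_year - 1) days.
    calendar_date = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if day_of_year < 1 or calendar_date.year != year:
        raise ValueError(f"day of year {day_of_year} is invalid for {year}")
    fields["date"] = calendar_date
    return fields


# Hypothetical file name, for illustration only.
info = parse_subset_name("EASE-F13-ML2005032A.subset.19H")
```

Converting the day of year to a calendar date up front makes it easier to match each file to the satellite coverage periods listed above.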

The spatial dataset of climate on the Tibetan Plateau (1961-2020)

The distribution maps of meteorological elements on the plateau are based on data from the national weather stations on the Tibetan Plateau and were generated by PRISM model interpolation. They cover temperature and precipitation.

Monthly average temperature distribution maps of the Tibetan Plateau from 1961 to 1990 (30-year average values):
t1960-90_1.e00, t1960-90_2.e00, t1960-90_3.e00, t1960-90_4.e00, t1960-90_5.e00, t1960-90_6.e00, t1960-90_7.e00, t1960-90_8.e00, t1960-90_9.e00, t1960-90_10.e00, t1960-90_11.e00, t1960-90_12.e00

Monthly average temperature distribution maps of the Tibetan Plateau from 1991 to 2020 (30-year average values):
t1991-20_1.e00, t1991-20_2.e00, t1991-20_3.e00, t1991-20_4.e00, t1991-20_5.e00, t1991-20_6.e00, t1991-20_7.e00, t1991-20_8.e00, t1991-20_9.e00, t1991-20_10.e00, t1991-20_11.e00, t1991-20_12.e00

Precipitation distribution maps of the Tibetan Plateau from 1961 to 1990 (30-year average values):
p1960-90_1.e00, p1960-90_2.e00, p1960-90_3.e00, p1960-90_4.e00, p1960-90_5.e00, p1960-90_6.e00, p1960-90_7.e00, p1960-90_8.e00, p1960-90_9.e00, p1960-90_10.e00, p1960-90_11.e00, p1960-90_12.e00

Precipitation distribution maps of the Tibetan Plateau from 1991 to 2020 (30-year average values):
p1991-20_1.e00, p1991-20_2.e00, p1991-20_3.e00, p1991-20_4.e00, p1991-20_5.e00, p1991-20_6.e00, p1991-20_7.e00, p1991-20_8.e00, p1991-20_9.e00, p1991-20_10.e00, p1991-20_11.e00, p1991-20_12.e00

The temporal coverage of the data is 1961 to 1990 and 1991 to 2020. The spatial coverage is 73°-104.95° E and 26.5°-44.95° N, the spatial resolution is 0.05° × 0.05° (longitude × latitude), and the data use geographic (longitude/latitude) coordinates.

Name interpretation:
Monthly average temperature: the average of the daily average temperatures in a month.
Monthly precipitation: the total precipitation in a month.
Dimensions: The data are in E00 format. The DN value is the 30-year average of the monthly average temperature (×0.01 °C) or of the monthly precipitation (×0.01 mm) for each month from January to December. Data type: integer. Data accuracy: 0.05° × 0.05° (longitude × latitude).

The data come from two original sources: 1) monthly mean temperature and monthly precipitation observations from 128 stations on the Tibetan Plateau and in the surrounding areas, from the establishment of each station to 2000; and 2) HadRM3 regional climate scenario simulation data on a 50 × 50 km grid over the Tibetan Plateau, that is, simulated monthly average temperature and monthly precipitation from 1991 to 2020. For 1961 to 1990, the PRISM (Parameter-elevation Regressions on Independent Slopes Model) interpolation method was used to generate the gridded data, and the interpolation model was adjusted and verified against the station data. For 1991 to 2020, the regional climate scenario simulation data were downscaled to gridded data with the terrain trend surface interpolation method.

Part of the source data came from GCM simulation results; the GCM used was the Hadley Centre climate model HadCM2-SUL:
a) Mitchell JFB, Johns TC, Gregory JM, Tett SFB (1995) Climate response to increasing levels of greenhouse gases and sulphate aerosols. Nature, 376, 501-504.
b) Johns TC, Carnell RE, Crossley JF et al. (1997) The second Hadley Centre coupled ocean-atmosphere GCM: model description, spinup and validation. Climate Dynamics, 13, 103-134.

The spatial interpolation of the meteorological data adopted the PRISM method:
Daly, C., R.P. Neilson, and D.L. Phillips, 1994: A statistical-topographic model for mapping climatological precipitation over mountainous terrain. J. Appl. Meteor., 33, 140-158.
Because observation conditions on the plateau are difficult and basic research data are scarce, meteorological data were missing in some areas. After adjustment and verification, the accuracy of the data is sufficient only for use as a reference in macroscale climate research. The average relative error of the monthly average temperature distribution of the Tibetan Plateau was 8.9% for 1961 to 1990 and 9.7% for 1991 to 2020. The average relative error of the precipitation data was 20.9% for 1961 to 1990 and 22.7% for 1991 to 2020. Areas with missing data were filled by interpolation, and obviously erroneous values were corrected.
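
The DN scaling and grid geometry given in the description above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the dataset: it assumes that a physical value is DN × 0.01, that the stated coverage bounds are the coordinates of the first and last grid points (which gives 640 columns by 370 rows), and that row 0 is the northernmost row. The function names are hypothetical, and reading the E00 files themselves requires GIS software and is not shown.

```python
# Coverage and resolution as stated in the description.
LON_MIN, LON_MAX = 73.0, 104.95
LAT_MIN, LAT_MAX = 26.5, 44.95
RESOLUTION = 0.05   # degrees
DN_SCALE = 0.01     # assumption: value = DN * 0.01 (degrees C or mm)


def grid_shape():
    """Return (rows, columns), assuming the bounds are inclusive grid points."""
    n_cols = round((LON_MAX - LON_MIN) / RESOLUTION) + 1
    n_rows = round((LAT_MAX - LAT_MIN) / RESOLUTION) + 1
    return n_rows, n_cols


def dn_to_value(dn):
    """Convert an integer DN to degrees C (temperature) or mm (precipitation)."""
    return dn * DN_SCALE


def cell_coordinate(row, col):
    """Return (longitude, latitude) of a cell, assuming row 0 is the north edge."""
    n_rows, n_cols = grid_shape()
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise IndexError("row or column outside the grid")
    return LON_MIN + col * RESOLUTION, LAT_MAX - row * RESOLUTION
```

For example, under these assumptions a temperature DN of -1234 corresponds to -12.34 °C.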

A long-term and high-resolution global gridded photosynthetically active radiation product (1984-2018)

Photosynthetically active radiation (PAR) is a fundamental physiological variable driving the exchange of material and energy, and it is indispensable for research in ecology and agriculture. In this study, we produced a 35-year (1984-2018), high-resolution (3 h, 10 km) global gridded PAR dataset with an effective physically based PAR model. The main inputs were cloud optical depth from the latest International Satellite Cloud Climatology Project (ISCCP) H-series cloud products; routine variables (water vapor, surface pressure, and ozone) from the ERA5 reanalysis data; aerosol from the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) products; and albedo from the Moderate Resolution Imaging Spectroradiometer (MODIS) product after 2000 and the CLARA-A2 product before 2000. The gridded PAR products were evaluated against surface observations from seven experimental stations of the SURFace RADiation budget network (SURFRAD), 42 experimental stations of the National Ecological Observatory Network (NEON), and 38 experimental stations of the Chinese Ecosystem Research Network (CERN). The instantaneous PAR was validated at SURFRAD and NEON: at the 10 km scale, the mean bias errors (MBEs) and root mean square errors (RMSEs) were 5.6 W m-2 and 44.3 W m-2 at SURFRAD and 5.9 W m-2 and 45.5 W m-2 at NEON, and the correlation coefficients (R) were both 0.94. When averaged to 30 km, the errors were clearly reduced, with the RMSEs decreasing to 36.3 W m-2 and 36.3 W m-2 and R increasing to 0.96 for both networks. The daily PAR was validated at SURFRAD, NEON, and CERN, where the RMSEs at the 10 km scale were 13.2 W m-2, 13.1 W m-2, and 19.6 W m-2, respectively. These RMSEs were slightly reduced to 11.2 W m-2, 11.6 W m-2, and 18.6 W m-2 when upscaled to 30 km.
Comparison with another well-known global satellite-based PAR product, that of the Clouds and the Earth's Radiant Energy System (CERES), shows that our PAR product is more accurate and has a higher resolution than the CERES product. Our gridded PAR dataset should contribute to future ecological simulation and food yield assessment.
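
The three validation statistics reported above (MBE, RMSE, and R) can be computed as in the following minimal Python sketch. It uses the standard textbook definitions, with bias taken as estimate minus observation; the study's own processing chain is not described here, and the function name `validation_metrics` and the sample numbers are hypothetical.

```python
import math


def validation_metrics(estimates, observations):
    """Return (MBE, RMSE, R) for paired estimates and observations.

    MBE  -- mean of (estimate - observation)
    RMSE -- square root of the mean squared difference
    R    -- Pearson correlation coefficient
    """
    n = len(estimates)
    if n != len(observations) or n < 2:
        raise ValueError("need at least two paired values")
    diffs = [e - o for e, o in zip(estimates, observations)]
    mbe = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observations)
    if var_e == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    r = cov / math.sqrt(var_e * var_o)
    return mbe, rmse, r


# Hypothetical PAR values in W m-2, for illustration only.
mbe, rmse, r = validation_metrics([102.0, 198.0, 305.0], [100.0, 200.0, 300.0])
```

A positive MBE means the product overestimates the station measurements on average, which is the sign convention assumed in this sketch.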