Spatio-temporal downscaling of gridded crop model yield estimates based on machine learning
Global gridded crop models (GGCMs) are essential tools for estimating agricultural crop yields and externalities at large scales, typically at coarse spatial resolutions. Higher-resolution estimates are required for robust agricultural assessments at regional and local scales, where the applicability of GGCMs is often limited by low data availability and high computational demand. One approach to bridging this gap is to train meta-models on GGCM output data and apply them to covariates of high spatial resolution. In this study, we explore two machine learning approaches, extreme gradient boosting and random forests, to develop meta-models for the prediction of crop model outputs at fine spatial resolutions. The algorithms are trained on global-scale maize simulations from a GGCM and applied, as an example, to the extent of Mexico at a finer spatial resolution. Results show very high accuracy, with R² > 0.96 for predictions of maize yields and of the hydrologic externalities evapotranspiration and crop available water, and low mean bias in all cases. While limited sets of covariates, such as annual climate data alone, already provide satisfactory results, a comprehensive set of predictors covering annual, growing season, and monthly climate data is required to achieve high performance in reproducing climate-driven inter-annual crop yield variability. These findings provide a first proof of concept that machine learning methods are highly suitable for building crop meta-models for spatio-temporal downscaling, and they indicate potential for further development towards scalable crop model emulators.
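To make the two reported accuracy measures concrete, the following is a minimal Python sketch of how R² and mean bias could be computed when meta-model predictions are compared against crop model simulations. It assumes that R² denotes the coefficient of determination and that mean bias is the average of predicted minus simulated values; the abstract does not define either measure. The function names and the yield values are hypothetical illustrations, not data or code from the study.

```python
from statistics import fmean


def r_squared(simulated, predicted):
    """Coefficient of determination of predictions against simulations.

    Assumption: R^2 = 1 - SS_res / SS_tot; the study may define it differently.
    """
    mean_sim = fmean(simulated)
    # Residual sum of squares between simulated and predicted values.
    ss_res = sum((s - p) ** 2 for s, p in zip(simulated, predicted))
    # Total sum of squares of the simulated values around their mean.
    ss_tot = sum((s - mean_sim) ** 2 for s in simulated)
    if ss_tot == 0:
        raise ValueError("simulated values have zero variance")
    return 1.0 - ss_res / ss_tot


def mean_bias(simulated, predicted):
    """Average of (predicted - simulated); positive means overestimation."""
    return fmean(p - s for s, p in zip(simulated, predicted))


# Hypothetical maize yields in t/ha, for illustration only.
simulated_yield = [2.0, 4.0, 6.0, 8.0]
predicted_yield = [2.5, 3.5, 6.5, 7.5]

r2 = r_squared(simulated_yield, predicted_yield)
bias = mean_bias(simulated_yield, predicted_yield)
print(f"R2 = {r2:.2f}, mean bias = {bias:.2f} t/ha")
```

In this toy case the errors cancel in the mean bias but not in R², which illustrates why the two measures are reported together.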