e-infrastructure Roadmap for Open Science in Agriculture

A bibliometric study

The e-ROSA project seeks to build a shared vision of a future sustainable e-infrastructure for research and education in agriculture, in order to promote Open Science in this field and thereby contribute to addressing related societal challenges. To achieve this goal, e-ROSA’s first objective is to bring together the relevant scientific communities and stakeholders and engage them in the co-elaboration of an ambitious, practical roadmap that provides the basis for the design and implementation of such an e-infrastructure in the years to come.

This website highlights the results of a bibliometric analysis conducted at a global scale in order to identify key scientists and associated research performing organisations (e.g. public research institutes, universities, Research & Development departments of private companies) that work in the field of agricultural data sources and services. If you have any comment or feedback on the bibliometric study, please use the online form.



Predictive ability of machine learning methods for massive crop yield prediction


Accurate yield estimation for the many crops involved is an important issue in agricultural planning. Machine learning (ML) is an essential approach for achieving practical and effective solutions to this problem. Many comparisons of ML methods for yield prediction have been made in search of the most accurate technique. Generally, however, the number of evaluated crops and techniques is too low to provide enough information for agricultural planning purposes. This paper compares the predictive accuracy of ML and linear regression techniques for crop yield prediction on ten crop datasets. Multiple linear regression, M5-Prime regression trees, multilayer perceptron neural networks, support vector regression and k-nearest neighbor methods were ranked. Four accuracy metrics were used to validate the models: the root mean square error (RMSE), root relative square error (RRSE), normalized mean absolute error (MAE), and correlation factor (R). Real data from an irrigation zone of Mexico were used to build the models. Models were tested with samples from two consecutive years. The results show that the M5-Prime and k-nearest neighbor techniques obtain the lowest average RMSE (5.14 and 4.91), the lowest RRSE (79.46% and 79.78%), the lowest average MAE (18.12% and 19.42%), and the highest average correlation factors (0.41 and 0.42). Since M5-Prime achieves the largest number of crop yield models with the lowest errors, it is a very suitable tool for massive crop yield prediction in agricultural planning.
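
The four accuracy metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The function name yield_metrics and the sample values are hypothetical. The abstract does not say how the MAE is normalized, so dividing by the mean observed yield is an assumption here. RRSE follows the common definition relative to a predictor that always outputs the mean of the observed values.

```python
import math


def yield_metrics(actual, predicted):
    """Return RMSE, RRSE (%), normalized MAE (%) and Pearson R.

    Assumes the observed values are not all equal, the predictions are
    not all equal, and the mean observed value is non-zero; otherwise a
    division by zero occurs.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))

    return {
        # Root mean square error, in the units of the yield itself.
        "rmse": math.sqrt(sq_err / n),
        # Error relative to always predicting the observed mean, in percent.
        "rrse_pct": 100.0 * math.sqrt(sq_err / var_a),
        # ASSUMPTION: MAE normalized by the mean observed yield, in percent.
        "nmae_pct": 100.0 * (abs_err / n) / mean_a,
        # Pearson correlation between observed and predicted yields.
        "r": cov / math.sqrt(var_a * var_p),
    }


# Hypothetical observed and predicted yields, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
estimated = [3.0, 3.0, 7.0, 7.0]
metrics = yield_metrics(observed, estimated)
```

An RRSE below 100% means the model does better than always predicting the mean observed yield, which is one way to read the RRSE values of roughly 79% reported above.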

  • MX
    Data keywords
    • machine learning
    Agriculture keywords
    • agriculture
    Data topic
    • big data
    • modeling
    • sensors

    Institutions 10 co-publis
      e-ROSA - e-infrastructure Roadmap for Open Science in Agriculture has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 730988.
      Disclaimer: The sole responsibility of the material published in this website lies with the authors. The European Union is not responsible for any use that may be made of the information contained therein.