e-infrastructure Roadmap for Open Science in Agriculture

A bibliometric study

The e-ROSA project seeks to build a shared vision of a future sustainable e-infrastructure for research and education in agriculture, in order to promote Open Science in this field and thereby help address related societal challenges. To achieve this goal, e-ROSA’s first objective is to bring together the relevant scientific communities and stakeholders and engage them in co-elaborating an ambitious, practical roadmap that provides the basis for the design and implementation of such an e-infrastructure in the years to come.

This website highlights the results of a bibliometric analysis conducted at a global scale to identify key scientists and associated research performing organisations (e.g. public research institutes, universities, Research & Development departments of private companies) that work in the field of agricultural data sources and services. If you have any comments or feedback on the bibliometric study, please use the online form.



An ecoinformatics tool for microbial community studies: Supervised classification of Amplicon Length Heterogeneity (ALH) profiles of 16S rRNA


Support vector machines (SVM) and K-nearest neighbors (KNN) are two computational machine learning tools that perform supervised classification. This paper presents a novel application of such supervised analytical tools for microbial community profiling and to distinguish patterning among ecosystems. Amplicon length heterogeneity (ALH) profiles from several hypervariable regions of the 16S rRNA gene of eubacterial communities from Idaho agricultural soil samples and from Chesapeake Bay marsh sediments were separately analyzed. The profiles from all available hypervariable regions were concatenated to obtain a combined profile, which was then provided to the SVM and KNN classifiers. Each profile was labeled with information about the location or time of its sampling. We hypothesized that after a learning phase using feature vectors from labeled ALH profiles, both these classifiers would have the capacity to predict the labels of previously unseen samples. The resulting classifiers were able to predict the labels of the Idaho soil samples with high accuracy. The classifiers were less accurate for the classification of the Chesapeake Bay sediments, suggesting greater similarity within the Bay's microbial community patterns in the sampled sites. The profiles obtained from the V1+V2 region were more informative than those obtained from any other single region. However, combining them with profiles from the V1 region (with or without the profiles from the V3 region) resulted in the most accurate classification of the samples. The addition of profiles from the V9 region appeared to confound the classifiers. Our results show that SVM and KNN classifiers can be effectively applied to distinguish between eubacterial community patterns from different ecosystems based only on their ALH profiles. (c) 2005 Elsevier B.V. All rights reserved.
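The workflow described in the abstract (concatenate per-region ALH profiles into one feature vector, learn from labeled samples, then predict the label of an unseen sample) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python K-nearest-neighbors classifier using Euclidean distance, the SVM classifier is left out for brevity, and the region names, abundance values, and site labels are invented for illustration only.

```python
import math
from collections import Counter

# Hypothetical order in which per-region profiles are concatenated.
REGIONS = ["V1+V2", "V3"]


def concatenate_profiles(region_profiles, regions):
    """Join per-region ALH profiles into one combined feature vector."""
    combined = []
    for region in regions:
        combined.extend(region_profiles[region])
    return combined


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy data: relative peak abundances per amplicon-length bin.
training = [
    ({"V1+V2": [0.6, 0.3, 0.1], "V3": [0.7, 0.3]}, "site_A"),
    ({"V1+V2": [0.5, 0.4, 0.1], "V3": [0.8, 0.2]}, "site_A"),
    ({"V1+V2": [0.1, 0.3, 0.6], "V3": [0.2, 0.8]}, "site_B"),
    ({"V1+V2": [0.2, 0.2, 0.6], "V3": [0.3, 0.7]}, "site_B"),
]
train_vectors = [concatenate_profiles(p, REGIONS) for p, _ in training]
train_labels = [label for _, label in training]

# A previously unseen sample whose label the classifier should predict.
unseen = {"V1+V2": [0.55, 0.35, 0.10], "V3": [0.75, 0.25]}
unseen_vector = concatenate_profiles(unseen, REGIONS)
predicted = knn_predict(train_vectors, train_labels, unseen_vector, k=3)
print(predicted)
```

Changing the REGIONS list is a simple way to mimic the comparison in the abstract, where different combinations of hypervariable regions gave different classification accuracy.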

  • US
  • Florida_Int_Univ (US)
  • George_Mason_Univ (US)
  • USDA_ARS_Agr_Res_Serv (US)
Data keywords
  • machine learning
Agriculture keywords
  • agriculture
Data topic
  • big data
  • modeling

Institutions 10 co-publis
  • USDA_ARS_Agr_Res_Serv (US)
e-ROSA - e-infrastructure Roadmap for Open Science in Agriculture has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 730988.
Disclaimer: The sole responsibility of the material published in this website lies with the authors. The European Union is not responsible for any use that may be made of the information contained therein.