Species determination using AI machine-learning algorithms: Hebeloma as a case study

Bibliographic Details
Main authors: Bartlett, Peter; Eberhardt, Ursula; Schütz, Nicole; Beker, Henry J.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9245212/
https://www.ncbi.nlm.nih.gov/pubmed/35773719
http://dx.doi.org/10.1186/s43008-022-00099-x
Description
Summary: The genus Hebeloma is renowned as difficult when it comes to species determination. Historically, many dichotomous keys have been published and used with varying rates of success. Over the last 20 years the authors have built a database of Hebeloma collections containing not only metadata but also parametrized morphological descriptions, where for about a third of the cases micromorphological characters have been analysed and are included, as well as DNA sequences for almost every collection. The database now holds about 9000 collections, including nearly every type collection worldwide, and represents over 120 different taxa. Almost every collection has been analysed and identified to species using a combination of the available molecular and morphological data in addition to locality and habitat information. Based on these data, an Artificial Intelligence (AI) machine-learning species identifier has been developed that takes as input locality data and a small number of the morphological parameters. On a random test set of more than 600 collections from the database, none of which were used to train the identifier, the species identifier identified 77% of collections correctly with its highest probabilistic match, 96% within its three most likely determinations and over 99% within its five most likely determinations.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s43008-022-00099-x.
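
The percentages reported in the summary (correct with the highest probabilistic match, within the three most likely, within the five most likely determinations) are what is usually called top-k accuracy. The following is a minimal sketch of how such a figure could be computed from a classifier's per-collection probabilities. It is not the authors' code: the function name `top_k_accuracy`, the placeholder labels `sp_A`, `sp_B`, `sp_C` and the toy probabilities are all assumptions made for illustration and are unrelated to the Hebeloma database.

```python
from typing import Mapping, Sequence


def top_k_accuracy(
    predictions: Sequence[Mapping[str, float]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of collections whose true species is among the k most probable."""
    if len(predictions) != len(true_labels):
        raise ValueError("predictions and true_labels must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = 0
    for probs, truth in zip(predictions, true_labels):
        # Rank candidate species from most to least probable, keep the first k.
        ranked = sorted(probs, key=probs.get, reverse=True)[:k]
        if truth in ranked:
            hits += 1
    return hits / len(predictions)


# Hypothetical toy data: four collections, three placeholder species.
toy_predictions = [
    {"sp_A": 0.70, "sp_B": 0.20, "sp_C": 0.10},
    {"sp_A": 0.50, "sp_B": 0.30, "sp_C": 0.20},
    {"sp_A": 0.10, "sp_B": 0.30, "sp_C": 0.60},
    {"sp_A": 0.40, "sp_B": 0.35, "sp_C": 0.25},
]
toy_truth = ["sp_A", "sp_B", "sp_C", "sp_C"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(toy_predictions, toy_truth, k):.2f}")
```

Top-k accuracy can only stay the same or rise as k grows, which matches the progression from 77% to 96% to over 99% reported for the identifier.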