Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions

Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabol...

Bibliographic Details
Main Authors: Hjörleifsson Eldjárn, Grímur, Ramsay, Andrew, van der Hooft, Justin J. J., Duncan, Katherine R., Soldatou, Sylvia, Rousu, Juho, Daly, Rónán, Wandy, Joe, Rogers, Simon
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8130963/
https://www.ncbi.nlm.nih.gov/pubmed/33945539
http://dx.doi.org/10.1371/journal.pcbi.1008920
author Hjörleifsson Eldjárn, Grímur
Ramsay, Andrew
van der Hooft, Justin J. J.
Duncan, Katherine R.
Soldatou, Sylvia
Rousu, Juho
Daly, Rónán
Wandy, Joe
Rogers, Simon
author_sort Hjörleifsson Eldjárn, Grímur
collection PubMed
description Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. Based on standardising a commonly used score, we introduce a new, more effective score, and introduce a novel score using an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.
format Online
Article
Text
id pubmed-8130963
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8130963 2021-05-27 Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions Hjörleifsson Eldjárn, Grímur Ramsay, Andrew van der Hooft, Justin J. J. Duncan, Katherine R. Soldatou, Sylvia Rousu, Juho Daly, Rónán Wandy, Joe Rogers, Simon PLoS Comput Biol Research Article Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. Based on standardising a commonly used score, we introduce a new, more effective score, and introduce a novel score using an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.
Public Library of Science 2021-05-04 /pmc/articles/PMC8130963/ /pubmed/33945539 http://dx.doi.org/10.1371/journal.pcbi.1008920 Text en © 2021 Hjörleifsson Eldjárn et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions
topic Research Article