Assessment of transfer methods for comparative genomics of regulatory networks in bacteria

Bibliographic Details
Main Authors: Kılıç, Sefa; Erill, Ivan
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009822/
https://www.ncbi.nlm.nih.gov/pubmed/27586594
http://dx.doi.org/10.1186/s12859-016-1113-7
_version_ 1782451583131320320
author Kılıç, Sefa
Erill, Ivan
author_facet Kılıç, Sefa
Erill, Ivan
author_sort Kılıç, Sefa
collection PubMed
description BACKGROUND: Comparative genomics can leverage the vast amount of available genomic sequences to reconstruct and analyze transcriptional regulatory networks in Bacteria, but the efficacy of this approach hinges on the ability to transfer regulatory network information from reference species to the genomes under analysis. Several methods have been proposed to transfer regulatory information between bacterial species, but the paucity and distributed nature of experimental information on bacterial transcriptional networks have prevented their systematic evaluation.
RESULTS: We report the compilation of a large catalog of transcription factor-binding sites across Bacteria and its use to systematically benchmark proposed transfer methods across pairs of bacterial species. We evaluate motif- and accuracy-based metrics to assess the results of regulatory network transfer and we identify the precision-recall area-under-the-curve as the best metric for this purpose due to the large class-imbalanced nature of the problem. Methods assuming conservation of the transcription factor-binding motif (motif-based) are shown to substantially outperform those assuming conservation of regulon composition (network-based), even though their efficiency can decrease sharply with increasing phylogenetic distance. Variations of the basic motif-based transfer method do not yield significant improvements in transfer accuracy. Our results indicate that detection of a large enough number of regulated orthologs is critical for network-based transfer methods, but that relaxing orthology requirements does not improve results. Using the transcriptional regulators LexA and Fur as case examples, we also show how DNA-binding domain sequence similarity can yield confounding results as an indicator of transfer efficiency for motif-based methods.
CONCLUSIONS: Counter to standard practice, our evaluation of metrics to assess the efficiency of methods for regulatory network information transfer reveals that the area under precision-recall (PR) curves is a more precise and informative metric than that of receiver-operating-characteristic (ROC) curves, confirming similar findings in other class-imbalanced settings. Our systematic assessment of transfer methods reveals that simple approaches to both motif- and network-based transfer of regulatory information provide equal or better results than more elaborate methods. We also show that there are not effective predictors of transfer efficacy, substantiating the long-standing practice of manual curation in comparative genomics analyses.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1113-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5009822
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5009822 2016-09-09 Assessment of transfer methods for comparative genomics of regulatory networks in bacteria Kılıç, Sefa Erill, Ivan BMC Bioinformatics Research BioMed Central 2016-08-31 /pmc/articles/PMC5009822/ /pubmed/27586594 http://dx.doi.org/10.1186/s12859-016-1113-7 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Kılıç, Sefa
Erill, Ivan
Assessment of transfer methods for comparative genomics of regulatory networks in bacteria
title Assessment of transfer methods for comparative genomics of regulatory networks in bacteria
title_full Assessment of transfer methods for comparative genomics of regulatory networks in bacteria
title_fullStr Assessment of transfer methods for comparative genomics of regulatory networks in bacteria
title_full_unstemmed Assessment of transfer methods for comparative genomics of regulatory networks in bacteria
title_short Assessment of transfer methods for comparative genomics of regulatory networks in bacteria
title_sort assessment of transfer methods for comparative genomics of regulatory networks in bacteria
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009822/
https://www.ncbi.nlm.nih.gov/pubmed/27586594
http://dx.doi.org/10.1186/s12859-016-1113-7
work_keys_str_mv AT kılıcsefa assessmentoftransfermethodsforcomparativegenomicsofregulatorynetworksinbacteria
AT erillivan assessmentoftransfermethodsforcomparativegenomicsofregulatorynetworksinbacteria