
Computational enhancer prediction: evaluation and improvements

BACKGROUND: Identifying transcriptional enhancers and other cis-regulatory modules (CRMs) is an important goal of post-sequencing genome annotation. Computational approaches provide a useful complement to empirical methods for CRM discovery, but it is critical that we develop effective means to eval...


Bibliographic Details
Main Authors:	Asma, Hasiba; Halfon, Marc S.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6451241/ https://www.ncbi.nlm.nih.gov/pubmed/30953451 http://dx.doi.org/10.1186/s12859-019-2781-x

author	Asma, Hasiba; Halfon, Marc S.
collection	PubMed
description	BACKGROUND: Identifying transcriptional enhancers and other cis-regulatory modules (CRMs) is an important goal of post-sequencing genome annotation. Computational approaches provide a useful complement to empirical methods for CRM discovery, but it is critical that we develop effective means to evaluate their performance in terms of estimating their sensitivity and specificity. RESULTS: We introduce here pCRMeval, a pipeline for in silico evaluation of any enhancer prediction tools that are flexible enough to be applied to the Drosophila melanogaster genome. pCRMeval compares the result of predictions with the extensive existing knowledge of experimentally-validated Drosophila CRMs in order to estimate the precision and relative sensitivity of the prediction method. In the case of supervised prediction methods—when training data composed of validated CRMs are used—pCRMeval can also assess the sensitivity of specific training sets. We demonstrate the utility of pCRMeval through evaluation of our SCRMshaw CRM prediction method and training data. By measuring the impact of different parameters on SCRMshaw performance, as assessed by pCRMeval, we develop a more robust version of SCRMshaw, SCRMshaw_HD, that improves the number of predictions while maintaining sensitivity and specificity. Our analysis also demonstrates that SCRMshaw_HD, when applied to increasingly less well-assembled genomes, maintains its strong predictive power with only a minor drop-off in performance. CONCLUSION: Our pCRMeval pipeline provides a general framework for evaluation that can be applied to any CRM prediction method, particularly a supervised method. While we make use of it here primarily to test and improve a particular method for CRM prediction, SCRMshaw, pCRMeval should provide a valuable platform to the research community not only for evaluating individual methods, but also for comparing between competing methods. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2781-x) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-6451241
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6451241 2019-04-16 Computational enhancer prediction: evaluation and improvements Asma, Hasiba; Halfon, Marc S. BMC Bioinformatics Methodology Article BioMed Central 2019-04-05 /pmc/articles/PMC6451241/ /pubmed/30953451 http://dx.doi.org/10.1186/s12859-019-2781-x Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Computational enhancer prediction: evaluation and improvements
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6451241/ https://www.ncbi.nlm.nih.gov/pubmed/30953451 http://dx.doi.org/10.1186/s12859-019-2781-x
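
The description above says that pCRMeval compares predictions against experimentally validated CRMs to estimate precision and relative sensitivity. The record does not give the pipeline's actual definitions or code, so the following is only a minimal sketch of the general idea, assuming a simple any-overlap criterion between predicted and known intervals. The interval format, the function names `overlaps` and `overlap_metrics`, and the toy coordinates are all hypothetical and are not taken from pCRMeval or SCRMshaw.

```python
from typing import List, Tuple

# Hypothetical interval format: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlap_metrics(predicted: List[Interval], known: List[Interval]) -> Tuple[float, float]:
    """Return (precision, sensitivity) under a simple any-overlap criterion.

    precision   = fraction of predictions that overlap at least one known CRM
    sensitivity = fraction of known CRMs overlapped by at least one prediction
    This is an illustrative assumption, not pCRMeval's published definition.
    """
    hit_predictions = sum(any(overlaps(p, k) for k in known) for p in predicted)
    recovered_known = sum(any(overlaps(k, p) for p in predicted) for k in known)
    precision = hit_predictions / len(predicted) if predicted else 0.0
    sensitivity = recovered_known / len(known) if known else 0.0
    return precision, sensitivity


# Toy, made-up coordinates (not real Drosophila CRMs).
known_crms = [("chr2L", 100, 600), ("chr2L", 2000, 2500), ("chr3R", 50, 400)]
predictions = [
    ("chr2L", 500, 1000),
    ("chr2L", 5000, 5500),
    ("chr3R", 300, 800),
    ("chrX", 10, 510),
]

precision, sensitivity = overlap_metrics(predictions, known_crms)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
# prints "precision=0.50 sensitivity=0.67"
```

In this toy case two of the four predictions overlap a known CRM, and two of the three known CRMs are recovered. A real evaluation would also have to account for validated CRM sets being incomplete, which is presumably why the description speaks of "relative" sensitivity.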