Evidence That Inconsistent Gene Prediction Can Mislead Analysis of Dinoflagellate Genomes

Comparative algal genomics often relies on predicted genes from de novo assembled genomes. However, the artifacts introduced by different gene‐prediction approaches, and their impact on comparative genomic analysis, remain poorly understood. Here, using available genome data from six dinoflagellate s...

Bibliographic Details
Main authors: Chen, Yibi; González‐Pech, Raúl A.; Stephens, Timothy G.; Bhattacharya, Debashish; Chan, Cheong Xin
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7065002/
https://www.ncbi.nlm.nih.gov/pubmed/31713873
http://dx.doi.org/10.1111/jpy.12947
author Chen, Yibi
González‐Pech, Raúl A.
Stephens, Timothy G.
Bhattacharya, Debashish
Chan, Cheong Xin
collection PubMed
description Comparative algal genomics often relies on predicted genes from de novo assembled genomes. However, the artifacts introduced by different gene‐prediction approaches, and their impact on comparative genomic analysis, remain poorly understood. Here, using available genome data from six dinoflagellate species in the Symbiodiniaceae, we identified methodological biases in the published genes that were predicted using different approaches and putative contaminant sequences in the published genome assemblies. We developed and applied a comprehensive customized workflow to predict genes from these genomes. The observed variation among predicted genes resulting from our workflow agreed with current understanding of phylogenetic relationships among these taxa, whereas the variation among the previously published genes was largely biased by the distinct approaches used in each instance. Importantly, these biases affect the inference of homologous gene families and synteny among genomes, thus impacting biological interpretation of these data. Our results demonstrate that a consistent gene‐prediction approach is critical for comparative analysis of dinoflagellate genomes.
format Online
Article
Text
id pubmed-7065002
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7065002 2020-03-16 J Phycol Letter John Wiley and Sons Inc. 2020-02-14 2020-02 /pmc/articles/PMC7065002/ /pubmed/31713873 http://dx.doi.org/10.1111/jpy.12947 Text en © 2020 The Authors. Journal of Phycology published by Wiley Periodicals, Inc. on behalf of Phycological Society of America. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Evidence That Inconsistent Gene Prediction Can Mislead Analysis of Dinoflagellate Genomes
topic Letter