Proteomics as a Metrological Tool to Evaluate Genome Annotation Accuracy Following De Novo Genome Assembly: A Case Study Using the Atlantic Bottlenose Dolphin (Tursiops truncatus)

The last decade has witnessed dramatic improvements in whole-genome sequencing capabilities coupled to drastically decreased costs, leading to an inundation of high-quality de novo genomes. For this reason, the continued development of genome quality metrics is imperative. Using the 2016 Atlantic bo...

Bibliographic Details
Main Authors: Neely, Benjamin A., Ellisor, Debra L., Davis, W. Clay
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10531373/
https://www.ncbi.nlm.nih.gov/pubmed/37761836
http://dx.doi.org/10.3390/genes14091696
author Neely, Benjamin A.
Ellisor, Debra L.
Davis, W. Clay
collection PubMed
description The last decade has witnessed dramatic improvements in whole-genome sequencing capabilities coupled with drastically decreased costs, leading to an inundation of high-quality de novo genomes. For this reason, the continued development of genome quality metrics is imperative. Using the 2016 Atlantic bottlenose dolphin NCBI RefSeq annotation and mass spectrometry-based proteomic analysis of six tissues, we confirmed 10,402 proteins from 4711 protein groups, constituting nearly one-third of the possible predicted proteins. Since the identification of larger proteins with more identified peptides implies reduced database fragmentation and improved gene annotation accuracy, we propose the metric NP(10), which attempts to capture this quality improvement. The NP(10) metric is calculated by first identifying the top decile (or 10th 10-quantile) of identified proteins based on the number of peptides per protein and then taking the median molecular weight of those proteins. When using the 2016 versus 2012 Tursiops truncatus genome annotation to search this proteomic data set, there was a 21% improvement in NP(10). This metric was further demonstrated by using a publicly available proteomic data set to compare human genome annotations from 2004, 2013 and 2016, which showed a 33% improvement in NP(10). These results demonstrate that proteomics may be a useful metrological tool to benchmark genome accuracy, though there is a need for reference proteomic datasets across species to facilitate the evaluation of new de novo and existing genomes.
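The NP(10) calculation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `np10` and `percent_improvement`, the input layout (one `(peptide_count, molecular_weight)` pair per identified protein), and the handling of ties and of the decile cut-off (the top `len // 10` proteins by rank, at least one) are all assumptions made for illustration.

```python
from statistics import median


def np10(proteins):
    """Median molecular weight of the top decile of proteins by peptide count.

    `proteins` is a list of (peptide_count, molecular_weight) pairs, one per
    identified protein. This input layout is hypothetical.
    """
    if not proteins:
        raise ValueError("no identified proteins")
    # Rank proteins from most to fewest identified peptides.
    ranked = sorted(proteins, key=lambda p: p[0], reverse=True)
    # Assumed cut-off: the top 10% by rank, keeping at least one protein.
    # Ties at the boundary are broken by sort order, which is an assumption.
    n_top = max(1, len(ranked) // 10)
    top_decile = ranked[:n_top]
    # NP(10) is the median molecular weight of that top decile.
    return median(mw for _, mw in top_decile)


def percent_improvement(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) * 100 / old


# Made-up data: 20 proteins with 1..20 peptides and weight 10 * peptides.
example = [(n, 10 * n) for n in range(1, 21)]
print(np10(example))
```

With this made-up input the top decile holds the two proteins with 20 and 19 peptides, so the sketch returns the median of their weights. Comparing the values obtained from two annotations of the same proteomic data set with `percent_improvement` mirrors how the abstract reports the change between annotation releases.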
format Online
Article
Text
id pubmed-10531373
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10531373 2023-09-28 Genes (Basel) Article MDPI 2023-08-25 /pmc/articles/PMC10531373/ /pubmed/37761836 http://dx.doi.org/10.3390/genes14091696 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Proteomics as a Metrological Tool to Evaluate Genome Annotation Accuracy Following De Novo Genome Assembly: A Case Study Using the Atlantic Bottlenose Dolphin (Tursiops truncatus)
topic Article