Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors?
BACKGROUND: In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the organism or may result from misannotation based on sequencing errors. The...
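Main authors: | Deshayes, Caroline; Perrodou, Emmanuel; Gallien, Sebastien; Euphrasie, Daniel; Schaeffer, Christine; Van-Dorsselaer, Alain; Poch, Olivier; Lecompte, Odile; Reyrat, Jean-Marc |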
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central, 2007 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852416/ https://www.ncbi.nlm.nih.gov/pubmed/17295914 http://dx.doi.org/10.1186/gb-2007-8-2-r20 |
_version_ | 1782133053379837952 |
---|---|
author | Deshayes, Caroline; Perrodou, Emmanuel; Gallien, Sebastien; Euphrasie, Daniel; Schaeffer, Christine; Van-Dorsselaer, Alain; Poch, Olivier; Lecompte, Odile; Reyrat, Jean-Marc |
author_facet | Deshayes, Caroline; Perrodou, Emmanuel; Gallien, Sebastien; Euphrasie, Daniel; Schaeffer, Christine; Van-Dorsselaer, Alain; Poch, Olivier; Lecompte, Odile; Reyrat, Jean-Marc |
author_sort | Deshayes, Caroline |
collection | PubMed |
description | BACKGROUND: In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the organism or may result from misannotation based on sequencing errors. The reality or otherwise of these sequences has major implications for all subsequent functional characterization steps, including module prediction, comparative genomics and high-throughput proteomic projects. RESULTS: We show here, using Mycobacterium smegmatis as a model species, that a significant proportion of these ICDSs result from sequencing errors. We used a resequencing procedure and mass spectrometry analysis to determine the nature of a number of ICDSs in this organism. We found that 28 of the 73 ICDSs investigated correspond to sequencing errors. CONCLUSION: The correction of these errors results in modification of the predicted amino acid sequences of the corresponding proteins and changes in annotation. We suggest that each bacterial ICDS should be investigated individually, to determine its true status and to ensure that the genome sequence is appropriate for comparative genomics analyses. |
format | Text |
id | pubmed-1852416 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-18524162007-04-18 Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors? Deshayes, Caroline Perrodou, Emmanuel Gallien, Sebastien Euphrasie, Daniel Schaeffer, Christine Van-Dorsselaer, Alain Poch, Olivier Lecompte, Odile Reyrat, Jean-Marc Genome Biol Research BACKGROUND: In silico analysis has shown that all bacterial genomes contain a low percentage of ORFs with undetected frameshifts and in-frame stop codons. These interrupted coding sequences (ICDSs) may really be present in the organism or may result from misannotation based on sequencing errors. The reality or otherwise of these sequences has major implications for all subsequent functional characterization steps, including module prediction, comparative genomics and high-throughput proteomic projects. RESULTS: We show here, using Mycobacterium smegmatis as a model species, that a significant proportion of these ICDSs result from sequencing errors. We used a resequencing procedure and mass spectrometry analysis to determine the nature of a number of ICDSs in this organism. We found that 28 of the 73 ICDSs investigated correspond to sequencing errors. CONCLUSION: The correction of these errors results in modification of the predicted amino acid sequences of the corresponding proteins and changes in annotation. We suggest that each bacterial ICDS should be investigated individually, to determine its true status and to ensure that the genome sequence is appropriate for comparative genomics analyses. BioMed Central 2007 2007-02-12 /pmc/articles/PMC1852416/ /pubmed/17295914 http://dx.doi.org/10.1186/gb-2007-8-2-r20 Text en Copyright © 2007 Deshayes et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Deshayes, Caroline Perrodou, Emmanuel Gallien, Sebastien Euphrasie, Daniel Schaeffer, Christine Van-Dorsselaer, Alain Poch, Olivier Lecompte, Odile Reyrat, Jean-Marc Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors? |
title | Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors? |
title_full | Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors? |
title_fullStr | Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors? |
title_full_unstemmed | Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors? |
title_short | Interrupted coding sequences in Mycobacterium smegmatis: authentic mutations or sequencing errors? |
title_sort | interrupted coding sequences in mycobacterium smegmatis: authentic mutations or sequencing errors? |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852416/ https://www.ncbi.nlm.nih.gov/pubmed/17295914 http://dx.doi.org/10.1186/gb-2007-8-2-r20 |