Base-Calling Algorithm with Vocabulary (BCV) Method for Analyzing Population Sequencing Chromatograms

Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is importa...

Bibliographic Details
Main authors: Fantin, Yuri S., Neverov, Alexey D., Favorov, Alexander V., Alvarez-Figueroa, Maria V., Braslavskaya, Svetlana I., Gordukova, Maria A., Karandashova, Inga V., Kuleshov, Konstantin V., Myznikova, Anna I., Polishchuk, Maya S., Reshetov, Denis A., Voiciehovskaya, Yana A., Mironov, Andrei A., Chulanov, Vladimir P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3557274/
https://www.ncbi.nlm.nih.gov/pubmed/23382983
http://dx.doi.org/10.1371/journal.pone.0054835
author Fantin, Yuri S.
Neverov, Alexey D.
Favorov, Alexander V.
Alvarez-Figueroa, Maria V.
Braslavskaya, Svetlana I.
Gordukova, Maria A.
Karandashova, Inga V.
Kuleshov, Konstantin V.
Myznikova, Anna I.
Polishchuk, Maya S.
Reshetov, Denis A.
Voiciehovskaya, Yana A.
Mironov, Andrei A.
Chulanov, Vladimir P.
author_sort Fantin, Yuri S.
collection PubMed
description Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing.
format Online
Article
Text
id pubmed-3557274
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3557274 2013-02-04 PLoS One Research Article
Public Library of Science 2013-01-28 /pmc/articles/PMC3557274/ /pubmed/23382983 http://dx.doi.org/10.1371/journal.pone.0054835 Text en © 2013 Fantin et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Base-Calling Algorithm with Vocabulary (BCV) Method for Analyzing Population Sequencing Chromatograms
topic Research Article