
DNA Barcoding through Quaternary LDPC Codes

For many parallel applications of Next-Generation Sequencing (NGS) technologies short barcodes able to accurately multiplex a large number of samples are demanded. To address these competitive requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from...


Bibliographic Details
Main authors: Tapia, Elizabeth, Spetale, Flavio, Krsticevic, Flavia, Angelone, Laura, Bulacio, Pilar
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619643/
https://www.ncbi.nlm.nih.gov/pubmed/26492348
http://dx.doi.org/10.1371/journal.pone.0140459
author Tapia, Elizabeth
Spetale, Flavio
Krsticevic, Flavia
Angelone, Laura
Bulacio, Pilar
collection PubMed
description For many parallel applications of Next-Generation Sequencing (NGS) technologies short barcodes able to accurately multiplex a large number of samples are demanded. To address these competitive requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, the alternative use of binary BCH and pseudo-quaternary Hamming codes has been proposed. However, these codes either fail to provide a fine-scale with regard to size of barcodes (BCH) or have intrinsic poor error correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, using quaternary LDPC codes may be particularly advantageous due to the lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10^-2 per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate in the order of 10^-9 at the expense of a rate of read losses just in the order of 10^-6.
format Online
Article
Text
id pubmed-4619643
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4619643 2015-10-29 DNA Barcoding through Quaternary LDPC Codes Tapia, Elizabeth Spetale, Flavio Krsticevic, Flavia Angelone, Laura Bulacio, Pilar PLoS One Research Article
Public Library of Science 2015-10-22 /pmc/articles/PMC4619643/ /pubmed/26492348 http://dx.doi.org/10.1371/journal.pone.0140459 Text en © 2015 Tapia et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title DNA Barcoding through Quaternary LDPC Codes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619643/
https://www.ncbi.nlm.nih.gov/pubmed/26492348
http://dx.doi.org/10.1371/journal.pone.0140459