
Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical ste...


Bibliographic Details
Main Authors: Matochko, Wadim L., Derda, Ratmir
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3876701/
https://www.ncbi.nlm.nih.gov/pubmed/24416071
http://dx.doi.org/10.1155/2013/491612
_version_ 1782297537938456576
author Matochko, Wadim L.
Derda, Ratmir
author_facet Matochko, Wadim L.
Derda, Ratmir
author_sort Matochko, Wadim L.
collection PubMed
description Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N × 1 frequency vector n = ||n(i)||, where n(i) is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation of the library is an operator acting on n. Selection, amplification, or sequencing can be described as a product of an N × N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias or errors is Seq = Sa I(N), where I(N) is an N × N identity matrix. Any bias in sequencing changes I(N) to a non-identity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process.
format Online
Article
Text
id pubmed-3876701
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-3876701 2014-01-12 Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing Matochko, Wadim L. Derda, Ratmir Comput Math Methods Med Research Article Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N × 1 frequency vector n = ||n(i)||, where n(i) is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation of the library is an operator acting on n. Selection, amplification, or sequencing can be described as a product of an N × N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias or errors is Seq = Sa I(N), where I(N) is an N × N identity matrix. Any bias in sequencing changes I(N) to a non-identity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process.
Hindawi Publishing Corporation 2013 2013-12-12 /pmc/articles/PMC3876701/ /pubmed/24416071 http://dx.doi.org/10.1155/2013/491612 Text en Copyright © 2013 W. L. Matochko and R. Derda. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Matochko, Wadim L.
Derda, Ratmir
Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing
title Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing
title_full Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing
title_fullStr Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing
title_full_unstemmed Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing
title_short Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing
title_sort error analysis of deep sequencing of phage libraries: peptides censored in sequencing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3876701/
https://www.ncbi.nlm.nih.gov/pubmed/24416071
http://dx.doi.org/10.1155/2013/491612
work_keys_str_mv AT matochkowadiml erroranalysisofdeepsequencingofphagelibrariespeptidescensoredinsequencing
AT derdaratmir erroranalysisofdeepsequencingofphagelibrariespeptidescensoredinsequencing
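
Illustrative note on the abstract's operator framework: the description above treats a library as a frequency vector n and sequencing as Seq = Sa I(N), with a diagonal censorship matrix (CEN) taking the place of I(N) when specific reads are lost. The following is only a minimal sketch of that idea, not the authors' implementation. The toy library size, the binomial model used for Sa, the sampling fraction, and the function names `sampling_diagonal` and `apply_diagonal` are all assumptions made here for illustration.

```python
import random


def sampling_diagonal(n, fraction, rng):
    """Draw the diagonal of a hypothetical sampling operator Sa.

    Assumption (not from the record): every copy of every clone is kept
    independently with probability `fraction`. Entry i is the fraction of
    clone i's copies that survived, so Sa applied to n gives sampled counts.
    """
    diag = []
    for copies in n:
        kept = sum(1 for _ in range(copies) if rng.random() < fraction)
        diag.append(kept / copies if copies else 0.0)
    return diag


def apply_diagonal(diag, vector):
    """Multiply a diagonal matrix (stored as its diagonal) by a vector."""
    return [d * x for d, x in zip(diag, vector)]


rng = random.Random(0)

# Toy library with N = 4 possible sequences; n[i] is the copy number of sequence i.
n = [1000, 1000, 10, 0]

identity = [1.0] * len(n)      # diagonal of I(N): no bias
cen = [1.0, 0.0, 1.0, 1.0]     # hypothetical CEN: sequence 1 is fully censored

sa = sampling_diagonal(n, 0.5, rng)

# Seq = Sa I(N): sequencing without bias
unbiased = apply_diagonal(sa, apply_diagonal(identity, n))
# Seq = Sa CEN: the same sampling draw, with censorship replacing I(N)
censored = apply_diagonal(sa, apply_diagonal(cen, n))

print("unbiased reads:", unbiased)
print("censored reads:", censored)
```

Because all operators in this sketch are diagonal, each is stored as a plain list and applied elementwise. A censored sequence yields zero reads however abundant it is in n, while uncensored sequences see the same sampling draw in both cases.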