An integrative probabilistic model for identification of structural variation in sequencing data

Paired-end sequencing is a common approach for identifying structural variation (SV) in genomes. Discrepancies between the observed and expected alignments indicate potential SVs. Most SV detection algorithms use only one of the possible signals and ignore reads with multiple alignments. This result...

Bibliographic Details
Main authors: Sindi, Suzanne S, Önal, Selim, Peng, Luke C, Wu, Hsin-Ta, Raphael, Benjamin J
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439973/
https://www.ncbi.nlm.nih.gov/pubmed/22452995
http://dx.doi.org/10.1186/gb-2012-13-3-r22
author Sindi, Suzanne S
Önal, Selim
Peng, Luke C
Wu, Hsin-Ta
Raphael, Benjamin J
collection PubMed
description Paired-end sequencing is a common approach for identifying structural variation (SV) in genomes. Discrepancies between the observed and expected alignments indicate potential SVs. Most SV detection algorithms use only one of the possible signals and ignore reads with multiple alignments. This results in reduced sensitivity to detect SVs, especially in repetitive regions. We introduce GASVPro, an algorithm combining both paired read and read depth signals into a probabilistic model that can analyze multiple alignments of reads. GASVPro outperforms existing methods with a 50 to 90% improvement in specificity on deletions and a 50% improvement on inversions. GASVPro is available at http://compbio.cs.brown.edu/software.
format Online
Article
Text
id pubmed-3439973
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3439973 2012-09-15 An integrative probabilistic model for identification of structural variation in sequencing data Sindi, Suzanne S Önal, Selim Peng, Luke C Wu, Hsin-Ta Raphael, Benjamin J Genome Biol Method Paired-end sequencing is a common approach for identifying structural variation (SV) in genomes. Discrepancies between the observed and expected alignments indicate potential SVs. Most SV detection algorithms use only one of the possible signals and ignore reads with multiple alignments. This results in reduced sensitivity to detect SVs, especially in repetitive regions. We introduce GASVPro, an algorithm combining both paired read and read depth signals into a probabilistic model that can analyze multiple alignments of reads. GASVPro outperforms existing methods with a 50 to 90% improvement in specificity on deletions and a 50% improvement on inversions. GASVPro is available at http://compbio.cs.brown.edu/software. BioMed Central 2012-03-27 /pmc/articles/PMC3439973/ /pubmed/22452995 http://dx.doi.org/10.1186/gb-2012-13-3-r22 Text en Copyright © 2012 Sindi et al.; licensee BioMed Central Ltd. https://creativecommons.org/licenses/by/2.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An integrative probabilistic model for identification of structural variation in sequencing data
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439973/
https://www.ncbi.nlm.nih.gov/pubmed/22452995
http://dx.doi.org/10.1186/gb-2012-13-3-r22