
Unraveling overlapping deletions by agglomerative clustering

BACKGROUND: Structural variations in human genomes, such as deletions, play an important role in cancer development. Next-generation sequencing technologies have been central in providing ways to detect such variations. Methods like paired-end mapping make it possible to simultaneously analyze data from severa...


Bibliographic Details
Main author: Wittler, Roland
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects: Proceedings
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3549816/
https://www.ncbi.nlm.nih.gov/pubmed/23369161
http://dx.doi.org/10.1186/1471-2164-14-S1-S12
author Wittler, Roland
collection PubMed
description BACKGROUND: Structural variations in human genomes, such as deletions, play an important role in cancer development. Next-generation sequencing technologies have been central in providing ways to detect such variations. Methods like paired-end mapping make it possible to analyze data from several samples simultaneously in order to, e.g., distinguish tumor-specific from patient-specific variations. However, it has been shown that, especially in this setting, overlapping deletions need to be taken into consideration explicitly. Existing tools have only minor capabilities to call overlapping deletions and are unable to unravel complex signals to obtain consistent predictions. RESULT: We present a first approach specifically designed to cluster short-read paired-end data into possibly overlapping deletion predictions. The method does not make any assumptions about the composition of the data, such as the number of samples, heterogeneity, polyploidy, etc. Taking paired ends mapped to a reference genome as input, it iteratively merges mappings into clusters based on a similarity score that takes both the putative location and the size of a deletion into account. CONCLUSION: We demonstrate that agglomerative clustering is suitable for predicting deletions. Analyzing real data from three samples of a cancer patient, we found putatively overlapping deletions and observed that, as a side effect, erroneous mappings are mostly identified as singleton clusters. An evaluation on simulated data shows high accuracy in separating overlapping from single deletions, compared to other methods that can output overlapping clusters.
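The iterative merging mentioned in the description can be illustrated with a minimal sketch. This is not the author's implementation: the `Mapping` fields, the `similarity` score (location overlap multiplied by size agreement), the average linkage, and the `threshold` value are all assumptions chosen only to show how a score that combines location and size can keep overlapping deletions apart.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mapping:
    """A discordant paired-end mapping (hypothetical minimal representation)."""
    left_end: int     # last reference position covered by the left read
    right_start: int  # first reference position covered by the right read
    insert_mean: int  # expected distance between the two reads

    @property
    def size(self) -> int:
        # Putative deletion size: observed distance minus expected distance.
        return (self.right_start - self.left_end) - self.insert_mean


def similarity(a: Mapping, b: Mapping) -> float:
    """Assumed score in [0, 1]: location overlap times size agreement."""
    overlap = min(a.right_start, b.right_start) - max(a.left_end, b.left_end)
    union = max(a.right_start, b.right_start) - min(a.left_end, b.left_end)
    smaller, larger = sorted((a.size, b.size))
    if overlap <= 0 or union <= 0 or smaller <= 0:
        return 0.0
    return (overlap / union) * (smaller / larger)


def cluster_similarity(c1, c2) -> float:
    # Average linkage: mean pairwise similarity between two clusters.
    total = sum(similarity(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(mappings, threshold=0.5):
    """Merge the most similar pair of clusters until none reaches threshold."""
    clusters = [[m] for m in mappings]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), cluster_similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index j stays valid
        del clusters[j]
    return clusters


# Made-up data: a large deletion, a small deletion nested inside it,
# and one unrelated (erroneous) mapping.
mappings = [
    Mapping(9900, 12200, 300), Mapping(9950, 12240, 300),    # large deletion
    Mapping(10400, 11200, 300), Mapping(10450, 11260, 300),  # small deletion
    Mapping(50000, 58000, 300),                              # erroneous mapping
]
clusters = agglomerate(mappings)
```

With these made-up coordinates, the two nested deletions overlap in location but differ strongly in size, so the assumed score keeps them in separate clusters, and the unrelated mapping is left as a singleton.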
format Online
Article
Text
id pubmed-3549816
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3549816 2013-01-23 Unraveling overlapping deletions by agglomerative clustering. Wittler, Roland. BMC Genomics. Proceedings. BioMed Central 2013-01-21. Text en. Copyright ©2013 Wittler; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Unraveling overlapping deletions by agglomerative clustering
topic Proceedings