Unraveling overlapping deletions by agglomerative clustering
BACKGROUND: Structural variations in human genomes, such as deletions, play an important role in cancer development. Next-Generation Sequencing technologies have been central in providing ways to detect such variations. Methods like paired-end mapping allow to simultaneously analyze data from severa...
Main Author: | Wittler, Roland |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3549816/ https://www.ncbi.nlm.nih.gov/pubmed/23369161 http://dx.doi.org/10.1186/1471-2164-14-S1-S12 |
_version_ | 1782256476654403584 |
---|---|
author | Wittler, Roland |
author_facet | Wittler, Roland |
author_sort | Wittler, Roland |
collection | PubMed |
description | BACKGROUND: Structural variations in human genomes, such as deletions, play an important role in cancer development. Next-Generation Sequencing technologies have been central in providing ways to detect such variations. Methods like paired-end mapping allow data from several samples to be analyzed simultaneously in order to, e.g., distinguish tumor-specific from patient-specific variations. However, it has been shown that, especially in this setting, there is a need to explicitly take overlapping deletions into consideration. Existing tools have only limited capabilities to call overlapping deletions and are unable to unravel complex signals to obtain consistent predictions. RESULT: We present a first approach specifically designed to cluster short-read paired-end data into possibly overlapping deletion predictions. The method does not make any assumptions on the composition of the data, such as the number of samples, heterogeneity, polyploidy, etc. Taking paired ends mapped to a reference genome as input, it iteratively merges mappings into clusters based on a similarity score that takes both the putative location and size of a deletion into account. CONCLUSION: We demonstrate that agglomerative clustering is suitable to predict deletions. Analyzing real data from three samples of a cancer patient, we found putatively overlapping deletions and observed that, as a side-effect, erroneous mappings are mostly identified as singleton clusters. An evaluation on simulated data shows high accuracy in separating overlapping from single deletions, compared to other methods that can output overlapping clusters. |
format | Online Article Text |
id | pubmed-3549816 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-35498162013-01-23 Unraveling overlapping deletions by agglomerative clustering Wittler, Roland BMC Genomics Proceedings BACKGROUND: Structural variations in human genomes, such as deletions, play an important role in cancer development. Next-Generation Sequencing technologies have been central in providing ways to detect such variations. Methods like paired-end mapping allow to simultaneously analyze data from several samples in order to, e.g., distinguish tumor from patient specific variations. However, it has been shown that, especially in this setting, there is a need to explicitly take overlapping deletions into consideration. Existing tools have only minor capabilities to call overlapping deletions, unable to unravel complex signals to obtain consistent predictions. RESULT: We present a first approach specifically designed to cluster short-read paired-end data into possibly overlapping deletion predictions. The method does not make any assumptions on the composition of the data, such as the number of samples, heterogeneity, polyploidy, etc. Taking paired ends mapped to a reference genome as input, it iteratively merges mappings to clusters based on a similarity score that takes both the putative location and size of a deletion into account. CONCLUSION: We demonstrate that agglomerative clustering is suitable to predict deletions. Analyzing real data from three samples of a cancer patient, we found putatively overlapping deletions and observed that, as a side-effect, erroneous mappings are mostly identified as singleton clusters. An evaluation on simulated data shows, compared to other methods which can output overlapping clusters, high accuracy in separating overlapping from single deletions. BioMed Central 2013-01-21 /pmc/articles/PMC3549816/ /pubmed/23369161 http://dx.doi.org/10.1186/1471-2164-14-S1-S12 Text en Copyright ©2013 Wittler; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Wittler, Roland Unraveling overlapping deletions by agglomerative clustering |
title | Unraveling overlapping deletions by agglomerative clustering |
title_full | Unraveling overlapping deletions by agglomerative clustering |
title_fullStr | Unraveling overlapping deletions by agglomerative clustering |
title_full_unstemmed | Unraveling overlapping deletions by agglomerative clustering |
title_short | Unraveling overlapping deletions by agglomerative clustering |
title_sort | unraveling overlapping deletions by agglomerative clustering |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3549816/ https://www.ncbi.nlm.nih.gov/pubmed/23369161 http://dx.doi.org/10.1186/1471-2164-14-S1-S12 |
work_keys_str_mv | AT wittlerroland unravelingoverlappingdeletionsbyagglomerativeclustering |
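
The abstract describes iteratively merging paired-end mappings into clusters using a similarity score that accounts for both the putative location and the size of a deletion. The following is a minimal illustrative sketch of that general idea, not the author's implementation: the tuple layout `(start, end, implied_size)`, the `similarity` formula, the average-linkage rule, and the `threshold` value are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical representation: each mapping is (start, end, implied_size), where
# start/end bound the region between the two mates and implied_size is the
# deletion length suggested by the excess insert size.

def similarity(a, b):
    """Toy score in [0, 1]: location overlap multiplied by size agreement."""
    (s1, e1, d1), (s2, e2, d2) = a, b
    overlap = max(0, min(e1, e2) - max(s1, s2))
    span = max(e1, e2) - min(s1, s2)
    location = overlap / span if span > 0 else 0.0
    size = min(d1, d2) / max(d1, d2) if max(d1, d2) > 0 else 0.0
    return location * size

def cluster_similarity(c1, c2):
    """Average linkage: mean pairwise similarity between two clusters."""
    total = sum(similarity(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))

def agglomerate(mappings, threshold=0.5):
    """Merge the most similar pair of clusters until none reaches the threshold."""
    clusters = [[m] for m in mappings]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), cluster_similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

# Made-up example data: a ~300 bp deletion, an overlapping ~1000 bp deletion,
# and one unrelated mapping that should remain a singleton.
mappings = [
    (1000, 1500, 300), (1010, 1520, 310), (990, 1490, 295),
    (900, 2100, 1000), (920, 2120, 1010),
    (5000, 5300, 100),
]
clusters = agglomerate(mappings)
```

In this toy data, the two overlapping deletions stay separate because their implied sizes disagree, and the isolated mapping is left as a singleton cluster, mirroring the behavior the abstract reports for erroneous mappings.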