
Reconstructing DNA copy number by joint segmentation of multiple sequences

BACKGROUND: Variations in DNA copy number carry information on the modalities of genome evolution and mis-regulation of DNA replication in cancer cells. Their study can help localize tumor suppressor genes, distinguish different populations of cancerous cells, and identify genomic variations respons...


Bibliographic Details
Main Authors: Zhang, Zhongyang; Lange, Kenneth; Sabatti, Chiara
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3534631/
https://www.ncbi.nlm.nih.gov/pubmed/22897923
http://dx.doi.org/10.1186/1471-2105-13-205
author Zhang, Zhongyang
Lange, Kenneth
Sabatti, Chiara
collection PubMed
description BACKGROUND: Variations in DNA copy number carry information on the modalities of genome evolution and mis-regulation of DNA replication in cancer cells. Their study can help localize tumor suppressor genes, distinguish different populations of cancerous cells, and identify genomic variations responsible for disease phenotypes. A number of different high throughput technologies can be used to identify copy number variable sites, and the literature documents multiple effective algorithms. We focus here on the specific problem of detecting regions where variation in copy number is relatively common in the sample at hand. This problem encompasses the cases of copy number polymorphisms, related samples, technical replicates, and cancerous sub-populations from the same individual. RESULTS: We present a segmentation method named generalized fused lasso (GFL) to reconstruct copy number variant regions. GFL is based on penalized estimation and is capable of processing multiple signals jointly. Our approach is computationally very attractive and leads to sensitivity and specificity levels comparable to those of state-of-the-art specialized methodologies. We illustrate its applicability with simulated and real data sets. CONCLUSIONS: The flexibility of our framework makes it applicable to data obtained with a wide range of technology. Its versatility and speed make GFL particularly useful in the initial screening stages of large data sets.
format Online
Article
Text
id pubmed-3534631
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3534631 2013-01-03 Reconstructing DNA copy number by joint segmentation of multiple sequences Zhang, Zhongyang Lange, Kenneth Sabatti, Chiara BMC Bioinformatics Methodology Article BioMed Central 2012-08-16 /pmc/articles/PMC3534631/ /pubmed/22897923 http://dx.doi.org/10.1186/1471-2105-13-205 Text en Copyright ©2012 Zhang et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Reconstructing DNA copy number by joint segmentation of multiple sequences
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3534631/
https://www.ncbi.nlm.nih.gov/pubmed/22897923
http://dx.doi.org/10.1186/1471-2105-13-205