
Reconstructing cancer genomes from paired-end sequencing data

Bibliographic Details
Main authors:	Oesper, Layla; Ritz, Anna; Aerni, Sarah J; Drebin, Ryan; Raphael, Benjamin J
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Proceedings
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3358655/ https://www.ncbi.nlm.nih.gov/pubmed/22537039 http://dx.doi.org/10.1186/1471-2105-13-S6-S10

author	Oesper, Layla; Ritz, Anna; Aerni, Sarah J; Drebin, Ryan; Raphael, Benjamin J
collection	PubMed
description	BACKGROUND: A cancer genome is derived from the germline genome through a series of somatic mutations. Somatic structural variants - including duplications, deletions, inversions, translocations, and other rearrangements - result in a cancer genome that is a scrambling of intervals, or "blocks" of the germline genome sequence. We present an efficient algorithm for reconstructing the block organization of a cancer genome from paired-end DNA sequencing data. RESULTS: By aligning paired reads from a cancer genome - and a matched germline genome, if available - to the human reference genome, we derive: (i) a partition of the reference genome into intervals; (ii) adjacencies between these intervals in the cancer genome; (iii) an estimated copy number for each interval. We formulate the Copy Number and Adjacency Genome Reconstruction Problem of determining the cancer genome as a sequence of the derived intervals that is consistent with the measured adjacencies and copy numbers. We design an efficient algorithm, called Paired-end Reconstruction of Genome Organization (PREGO), to solve this problem by reducing it to an optimization problem on an interval-adjacency graph constructed from the data. The solution to the optimization problem results in an Eulerian graph, containing an alternating Eulerian tour that corresponds to a cancer genome that is consistent with the sequencing data. We apply our algorithm to five ovarian cancer genomes that were sequenced as part of The Cancer Genome Atlas. We identify numerous rearrangements, or structural variants, in these genomes, analyze reciprocal vs. non-reciprocal rearrangements, and identify rearrangements consistent with known mechanisms of duplication such as tandem duplications and breakage/fusion/bridge (B/F/B) cycles. CONCLUSIONS: We demonstrate that PREGO efficiently identifies complex and biologically relevant rearrangements in cancer genome sequencing data. An implementation of the PREGO algorithm is available at http://compbio.cs.brown.edu/software/.
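The idea of reading a genome off an alternating tour can be illustrated with a toy sketch. This is not the PREGO implementation and does not perform the optimization step described in the abstract; it assumes the copy numbers and adjacency multiplicities are already consistent, models a single linear chromosome, and uses a plain backtracking search that is only practical for tiny inputs. All names (`reconstruct`, `other_end`, the `"t"`/`"h"` end labels) are hypothetical and chosen for illustration.

```python
from collections import Counter

def other_end(end):
    """Return the opposite end of the same interval ('t' = tail, 'h' = head)."""
    name, side = end
    return (name, "h" if side == "t" else "t")

def reconstruct(copy_number, adjacencies, start, stop):
    """Search for a walk that alternates interval edges and adjacency edges.

    copy_number: {interval name: number of times the interval must be traversed}
    adjacencies: {(end_u, end_v): multiplicity}, undirected edges between interval ends
    start, stop: interval ends where the linear chromosome begins and finishes
    Returns a list of signed intervals, or None if no consistent walk exists.
    """
    intervals_left = Counter(copy_number)
    adj_left = Counter()
    for (u, v), mult in adjacencies.items():
        adj_left[tuple(sorted((u, v)))] += mult  # canonical key for an undirected edge
    genome = []

    def extend(end):
        # Enter an interval at `end` and cross its interval edge.
        name, side = end
        if intervals_left[name] == 0:
            return False
        intervals_left[name] -= 1
        genome.append(("+" if side == "t" else "-") + name)
        exit_end = other_end(end)
        all_used = (sum(intervals_left.values()) == 0
                    and sum(adj_left.values()) == 0)
        if exit_end == stop and all_used:
            return True
        # Otherwise leave through an unused adjacency edge, backtracking on failure.
        for key in list(adj_left):
            if adj_left[key] == 0 or exit_end not in key:
                continue
            nxt = key[1] if key[0] == exit_end else key[0]
            adj_left[key] -= 1
            if extend(nxt):
                return True
            adj_left[key] += 1
        intervals_left[name] += 1
        genome.pop()
        return False

    return genome if extend(start) else None

# Toy input: reference order A, B, C with a tandem duplication of B.
copies = {"A": 1, "B": 2, "C": 1}
adjs = {
    (("A", "h"), ("B", "t")): 1,  # reference adjacency A -> B
    (("B", "h"), ("C", "t")): 1,  # reference adjacency B -> C
    (("B", "h"), ("B", "t")): 1,  # variant adjacency created by the duplication
}
print(reconstruct(copies, adjs, ("A", "t"), ("C", "h")))
```

Entering an interval at its tail emits it in forward orientation (`+`), and entering at its head emits it inverted (`-`), which is how inversions would appear in the output of this sketch.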
format	Online Article Text
id	pubmed-3358655
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3358655 2012-06-07 Reconstructing cancer genomes from paired-end sequencing data Oesper, Layla; Ritz, Anna; Aerni, Sarah J; Drebin, Ryan; Raphael, Benjamin J BMC Bioinformatics Proceedings BioMed Central 2012-04-19 /pmc/articles/PMC3358655/ /pubmed/22537039 http://dx.doi.org/10.1186/1471-2105-13-S6-S10 Text en Copyright ©2012 Oesper et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Reconstructing cancer genomes from paired-end sequencing data
topic	Proceedings
