
Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data

BACKGROUND: The potential utility of the Burrows-Wheeler transform (BWT) of a large amount of short-read data ("reads") has not been fully studied. The BWT basically serves as a lossless dictionary of reads, unlike the heuristic and lossy reads-to-genome mapping results conventionally obta...

Bibliographic Details
Main authors: Kimura, Kouichi; Koike, Asako
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4708002/
https://www.ncbi.nlm.nih.gov/pubmed/26678411
http://dx.doi.org/10.1186/1471-2105-16-S18-S5
_version_ 1782409386124115968
author Kimura, Kouichi
Koike, Asako
author_facet Kimura, Kouichi
Koike, Asako
author_sort Kimura, Kouichi
collection PubMed
description BACKGROUND: The potential utility of the Burrows-Wheeler transform (BWT) of a large amount of short-read data ("reads") has not been fully studied. The BWT basically serves as a lossless dictionary of reads, unlike the heuristic and lossy reads-to-genome mapping results conventionally obtained in the first step of sequence analysis. Thus, it is naturally expected to lead to development of sensitive methods for analysis of short-read data. Recently, one of the most active areas of research in sequence analysis is sensitive detection of rare genomic rearrangements from whole-genome sequencing (WGS) data of heterogeneous cancer samples. The application of the BWT of reads to the analysis of genomic rearrangements is addressed in this study. RESULTS: A new method for sensitive detection of genomic rearrangements by using the BWT of reads in the following three steps is proposed: first, breakpoint regions, which contain breakpoints and are joined together by rearrangement, are predicted from the distribution of so-called discordant pairs by using a kind of conjugate gradient method; second, reads partially matching the breakpoint regions are collected from the BWT of reads; and third, breakpoints are detected as branching points among the collected reads, and their precise positions are determined. The method was experimentally implemented, and its performance (i.e., sensitivity and specificity) was evaluated by using simulated data with known artificial rearrangements. It was applied to publicly available real biological WGS data of cancer patients, and the detection results were compared with published results. CONCLUSIONS: Serving as a lossless dictionary of reads, the BWT of short reads enables sensitive analysis of genomic rearrangements in heterogeneous cancer-genome samples when used in conjunction with breakpoint-region predictions based on a conjugate gradient method.
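The description above treats the BWT of reads as a lossless dictionary that can be queried directly, without first mapping reads to a genome. A minimal sketch of that idea follows: it builds the BWT of a small read collection and counts how often a query substring occurs in the reads by FM-index backward search. This is an illustration only, not the authors' implementation. The function names (`build_bwt`, `build_fm_index`, `count_occurrences`) and the example reads are hypothetical, each read is assumed to end with a `$` terminator, and the naive suffix sort used here would not scale to real WGS data.

```python
def build_bwt(reads):
    """Return the BWT of all reads, each terminated by '$' (naive suffix sort)."""
    text = "".join(read + "$" for read in reads)
    # Sort suffix start positions by the suffix they begin (fine for toy input only).
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character of a suffix is the character preceding it in the text;
    # for position 0 this wraps around to the final '$'.
    return "".join(text[i - 1] for i in suffix_array)


def build_fm_index(bwt):
    """Return (c_table, occ) tables used for backward search over the BWT."""
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1
    # c_table[ch]: number of characters in the BWT strictly smaller than ch.
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return c_table, occ


def count_occurrences(pattern, c_table, occ, n):
    """Count occurrences of pattern (must not contain '$') in the indexed reads."""
    lo, hi = 0, n  # current half-open range of matching suffixes
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGG", "TTACGT"]  # hypothetical toy reads
    bwt = build_bwt(reads)
    c_table, occ = build_fm_index(bwt)
    for query in ["ACG", "TAC", "GGA"]:
        print(query, count_occurrences(query, c_table, occ, len(bwt)))
```

Because the query contains no `$`, a match can never span two reads, so the count reflects occurrences inside individual reads. Collecting the matching reads themselves, as the second step of the method requires, would additionally need a sampled suffix array or a similar locate structure, which this sketch omits.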
format Online
Article
Text
id pubmed-4708002
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-47080022016-01-20 Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data Kimura, Kouichi Koike, Asako BMC Bioinformatics Research BACKGROUND: The potential utility of the Burrows-Wheeler transform (BWT) of a large amount of short-read data ("reads") has not been fully studied. The BWT basically serves as a lossless dictionary of reads, unlike the heuristic and lossy reads-to-genome mapping results conventionally obtained in the first step of sequence analysis. Thus, it is naturally expected to lead to development of sensitive methods for analysis of short-read data. Recently, one of the most active areas of research in sequence analysis is sensitive detection of rare genomic rearrangements from whole-genome sequencing (WGS) data of heterogeneous cancer samples. The application of the BWT of reads to the analysis of genomic rearrangements is addressed in this study. RESULTS: A new method for sensitive detection of genomic rearrangements by using the BWT of reads in the following three steps is proposed: first, breakpoint regions, which contain breakpoints and are joined together by rearrangement, are predicted from the distribution of so-called discordant pairs by using a kind of conjugate gradient method; second, reads partially matching the breakpoint regions are collected from the BWT of reads; and third, breakpoints are detected as branching points among the collected reads, and their precise positions are determined. The method was experimentally implemented, and its performance (i.e., sensitivity and specificity) was evaluated by using simulated data with known artificial rearrangements. It was applied to publicly available real biological WGS data of cancer patients, and the detection results were compared with published results.
CONCLUSIONS: Serving as a lossless dictionary of reads, the BWT of short reads enables sensitive analysis of genomic rearrangements in heterogeneous cancer-genome samples when used in conjunction with breakpoint-region predictions based on a conjugate gradient method. BioMed Central 2015-12-09 /pmc/articles/PMC4708002/ /pubmed/26678411 http://dx.doi.org/10.1186/1471-2105-16-S18-S5 Text en Copyright © 2015 Kimura et al.; http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Kimura, Kouichi
Koike, Asako
Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data
title Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data
title_full Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data
title_fullStr Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data
title_full_unstemmed Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data
title_short Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data
title_sort analysis of genomic rearrangements by using the burrows-wheeler transform of short-read data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4708002/
https://www.ncbi.nlm.nih.gov/pubmed/26678411
http://dx.doi.org/10.1186/1471-2105-16-S18-S5
work_keys_str_mv AT kimurakouichi analysisofgenomicrearrangementsbyusingtheburrowswheelertransformofshortreaddata
AT koikeasako analysisofgenomicrearrangementsbyusingtheburrowswheelertransformofshortreaddata