
G-CNV: A GPU-Based Tool for Preparing Data to Detect CNVs with Read-Depth Methods

Copy number variations (CNVs) are the most prevalent types of structural variations (SVs) in the human genome and are involved in a wide range of common human diseases. Different computational methods have been devised to detect this type of SVs and to study how they are implicated in human diseases...


Bibliographic Details
Main authors: Manconi, Andrea; Manca, Emanuele; Moscatelli, Marco; Gnocchi, Matteo; Orro, Alessandro; Armano, Giuliano; Milanesi, Luciano
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects: Bioengineering and Biotechnology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4354384/
https://www.ncbi.nlm.nih.gov/pubmed/25806367
http://dx.doi.org/10.3389/fbioe.2015.00028
_version_ 1782360754132877312
author Manconi, Andrea
Manca, Emanuele
Moscatelli, Marco
Gnocchi, Matteo
Orro, Alessandro
Armano, Giuliano
Milanesi, Luciano
author_sort Manconi, Andrea
collection PubMed
description Copy number variations (CNVs) are the most prevalent types of structural variations (SVs) in the human genome and are involved in a wide range of common human diseases. Different computational methods have been devised to detect this type of SVs and to study how they are implicated in human diseases. Recently, computational methods based on high-throughput sequencing (HTS) have been increasingly used. The majority of these methods focus on mapping short-read sequences generated from a donor against a reference genome to detect signatures distinctive of CNVs. In particular, read-depth based methods detect CNVs by analyzing genomic regions whose read depth differs significantly from that of other regions. The analysis pipeline of these methods consists of four main stages: (i) data preparation, (ii) data normalization, (iii) CNV region identification, and (iv) copy number estimation. However, available tools do not support most of the operations required at the first two stages of this pipeline. Typically, they start the analysis by building the read-depth signal from pre-processed alignments. Therefore, third-party tools must be used to perform most of the preliminary operations required to build the read-depth signal. These data-intensive operations can be efficiently parallelized on graphics processing units (GPUs). In this article, we present G-CNV, a GPU-based tool devised to perform the common operations required at the first two stages of the analysis pipeline. G-CNV is able to filter low-quality read sequences, to mask low-quality nucleotides, to remove adapter sequences, to remove duplicated read sequences, to map the short reads, to resolve multiple mapping ambiguities, to build the read-depth signal, and to normalize it. G-CNV can be efficiently used as a third-party tool able to prepare data for the subsequent read-depth signal generation and analysis. Moreover, it can also be integrated in CNV detection tools to generate read-depth signals.
format Online
Article
Text
id pubmed-4354384
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4354384 2015-03-24
G-CNV: A GPU-Based Tool for Preparing Data to Detect CNVs with Read-Depth Methods
Manconi, Andrea; Manca, Emanuele; Moscatelli, Marco; Gnocchi, Matteo; Orro, Alessandro; Armano, Giuliano; Milanesi, Luciano
Front Bioeng Biotechnol
Bioengineering and Biotechnology
Frontiers Media S.A. 2015-03-10
/pmc/articles/PMC4354384/ /pubmed/25806367 http://dx.doi.org/10.3389/fbioe.2015.00028
Text en
Copyright © 2015 Manconi, Manca, Moscatelli, Gnocchi, Orro, Armano and Milanesi. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title G-CNV: A GPU-Based Tool for Preparing Data to Detect CNVs with Read-Depth Methods
topic Bioengineering and Biotechnology