Altools: a user friendly NGS data analyser
BACKGROUND: Genotyping by re-sequencing has become a standard approach to estimate single nucleotide polymorphism (SNP) diversity, haplotype structure and the biodiversity and has been defined as an efficient approach to address geographical population genomics of several model species. To access co...
Main authors: | Camiolo, Salvatore; Sablok, Gaurav; Porceddu, Andrea |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4756442/ https://www.ncbi.nlm.nih.gov/pubmed/26883204 http://dx.doi.org/10.1186/s13062-016-0110-0 |
_version_ | 1782416333456015360 |
---|---|
author | Camiolo, Salvatore Sablok, Gaurav Porceddu, Andrea |
author_facet | Camiolo, Salvatore Sablok, Gaurav Porceddu, Andrea |
author_sort | Camiolo, Salvatore |
collection | PubMed |
description | BACKGROUND: Genotyping by re-sequencing has become a standard approach to estimate single nucleotide polymorphism (SNP) diversity, haplotype structure and the biodiversity and has been defined as an efficient approach to address geographical population genomics of several model species. To access core SNPs and insertion/deletion polymorphisms (indels), and to infer the phyletic patterns of speciation, most such approaches map short reads to the reference genome. Variant calling is important to establish patterns of genome-wide association studies (GWAS) for quantitative trait loci (QTLs), and to determine the population and haplotype structure based on SNPs, thus allowing content-dependent trait and evolutionary analysis. Several tools have been developed to investigate such polymorphisms as well as more complex genomic rearrangements such as copy number variations, presence/absence variations and large deletions. The programs available for this purpose have different strengths (e.g. accuracy, sensitivity and specificity) and weaknesses (e.g. low computation speed, complex installation procedure and absence of a user-friendly interface). Here we introduce Altools, a software package that is easy to install and use, which allows the precise detection of polymorphisms and structural variations. RESULTS: Altools uses the BWA/SAMtools/VarScan pipeline to call SNPs and indels, and the dnaCopy algorithm to achieve genome segmentation according to local coverage differences in order to identify copy number variations. It also uses insert size information from the alignment of paired-end reads and detects potential large deletions. A double mapping approach (BWA/BLASTn) identifies precise breakpoints while ensuring rapid elaboration. Finally, Altools implements several processes that yield deeper insight into the genes affected by the detected polymorphisms. Altools was used to analyse both simulated and real next-generation sequencing (NGS) data and performed satisfactorily in terms of positive predictive values, sensitivity, the identification of large deletion breakpoints and copy number detection. CONCLUSIONS: Altools is fast, reliable and easy to use for the mining of NGS data. The software package also attempts to link identified polymorphisms and structural variants to their biological functions thus providing more valuable information than similar tools. REVIEWERS: This article was reviewed by Prof. Lee and Prof. Raghava. OPEN PEER REVIEW: Reviewed by Prof. Lee and Prof. Raghava. For the full reviews, please go to the Reviewers’ comments section. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13062-016-0110-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4756442 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-47564422016-02-18 Altools: a user friendly NGS data analyser Camiolo, Salvatore Sablok, Gaurav Porceddu, Andrea Biol Direct Application Note BACKGROUND: Genotyping by re-sequencing has become a standard approach to estimate single nucleotide polymorphism (SNP) diversity, haplotype structure and the biodiversity and has been defined as an efficient approach to address geographical population genomics of several model species. To access core SNPs and insertion/deletion polymorphisms (indels), and to infer the phyletic patterns of speciation, most such approaches map short reads to the reference genome. Variant calling is important to establish patterns of genome-wide association studies (GWAS) for quantitative trait loci (QTLs), and to determine the population and haplotype structure based on SNPs, thus allowing content-dependent trait and evolutionary analysis. Several tools have been developed to investigate such polymorphisms as well as more complex genomic rearrangements such as copy number variations, presence/absence variations and large deletions. The programs available for this purpose have different strengths (e.g. accuracy, sensitivity and specificity) and weaknesses (e.g. low computation speed, complex installation procedure and absence of a user-friendly interface). Here we introduce Altools, a software package that is easy to install and use, which allows the precise detection of polymorphisms and structural variations. RESULTS: Altools uses the BWA/SAMtools/VarScan pipeline to call SNPs and indels, and the dnaCopy algorithm to achieve genome segmentation according to local coverage differences in order to identify copy number variations. It also uses insert size information from the alignment of paired-end reads and detects potential large deletions. A double mapping approach (BWA/BLASTn) identifies precise breakpoints while ensuring rapid elaboration. Finally, Altools implements several processes that yield deeper insight into the genes affected by the detected polymorphisms. Altools was used to analyse both simulated and real next-generation sequencing (NGS) data and performed satisfactorily in terms of positive predictive values, sensitivity, the identification of large deletion breakpoints and copy number detection. CONCLUSIONS: Altools is fast, reliable and easy to use for the mining of NGS data. The software package also attempts to link identified polymorphisms and structural variants to their biological functions thus providing more valuable information than similar tools. REVIEWERS: This article was reviewed by Prof. Lee and Prof. Raghava. OPEN PEER REVIEW: Reviewed by Prof. Lee and Prof. Raghava. For the full reviews, please go to the Reviewers’ comments section. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13062-016-0110-0) contains supplementary material, which is available to authorized users. BioMed Central 2016-02-17 /pmc/articles/PMC4756442/ /pubmed/26883204 http://dx.doi.org/10.1186/s13062-016-0110-0 Text en © Camiolo et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Application Note Camiolo, Salvatore Sablok, Gaurav Porceddu, Andrea Altools: a user friendly NGS data analyser |
title | Altools: a user friendly NGS data analyser |
title_full | Altools: a user friendly NGS data analyser |
title_fullStr | Altools: a user friendly NGS data analyser |
title_full_unstemmed | Altools: a user friendly NGS data analyser |
title_short | Altools: a user friendly NGS data analyser |
title_sort | altools: a user friendly ngs data analyser |
topic | Application Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4756442/ https://www.ncbi.nlm.nih.gov/pubmed/26883204 http://dx.doi.org/10.1186/s13062-016-0110-0 |
work_keys_str_mv | AT camiolosalvatore altoolsauserfriendlyngsdataanalyser AT sablokgaurav altoolsauserfriendlyngsdataanalyser AT porcedduandrea altoolsauserfriendlyngsdataanalyser |