ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets
BACKGROUND: We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi...
Main authors: | Pope, Bernard J; Nguyen-Dumont, Tú; Hammet, Fleur; Park, Daniel J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Software Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3904415/ https://www.ncbi.nlm.nih.gov/pubmed/24461215 http://dx.doi.org/10.1186/1751-0473-9-3 |
_version_ | 1782301210649296896 |
---|---|
author | Pope, Bernard J Nguyen-Dumont, Tú Hammet, Fleur Park, Daniel J |
author_facet | Pope, Bernard J Nguyen-Dumont, Tú Hammet, Fleur Park, Daniel J |
author_sort | Pope, Bernard J |
collection | PubMed |
description | BACKGROUND: We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can thus rely on the identification of variants appearing in both reads of read-pairs, permitting stringent filtering of sequencing chemistry-induced errors. These principles underlie ROVER software (derived from Read Overlap PCR-MPS variant caller), which we have recently used to report the screening for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage. RESULTS: ROVER enables users to quickly and accurately identify genetic variants from PCR-targeted, overlapping paired-end MPS datasets. The open-source availability of the software and threshold tailorability enable broad access for a range of PCR-MPS users. METHODS: ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). The software accepts a tab-delimited text file listing the coordinates of the target-specific primers used for targeted enrichment based on a specified genome-build. It also accepts aligned sequence files resulting from mapping to the same genome-build. ROVER identifies the amplicon a given read-pair represents and removes the primer sequences by using the mapping coordinates and primer coordinates. It considers overlapping read-pairs with respect to primer-intervening sequence. Only when a variant is observed in both reads of a read-pair does the signal contribute to a tally of read-pairs containing or not containing the variant. A user-defined threshold sets the minimum number and proportion of read-pairs in which a variant must be observed for a ‘call’ to be made. ROVER also reports the depth of coverage across amplicons to facilitate the identification of any regions that may require further screening. CONCLUSIONS: ROVER can facilitate rapid and accurate genetic variant calling for a broad range of PCR-MPS users. |
format | Online Article Text |
id | pubmed-3904415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3904415 2014-01-29 ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets Pope, Bernard J Nguyen-Dumont, Tú Hammet, Fleur Park, Daniel J Source Code Biol Med Software Review BACKGROUND: We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can thus rely on the identification of variants appearing in both reads of read-pairs, permitting stringent filtering of sequencing chemistry-induced errors. These principles underlie ROVER software (derived from Read Overlap PCR-MPS variant caller), which we have recently used to report the screening for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage. RESULTS: ROVER enables users to quickly and accurately identify genetic variants from PCR-targeted, overlapping paired-end MPS datasets. The open-source availability of the software and threshold tailorability enable broad access for a range of PCR-MPS users. METHODS: ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). The software accepts a tab-delimited text file listing the coordinates of the target-specific primers used for targeted enrichment based on a specified genome-build. It also accepts aligned sequence files resulting from mapping to the same genome-build. ROVER identifies the amplicon a given read-pair represents and removes the primer sequences by using the mapping coordinates and primer coordinates. It considers overlapping read-pairs with respect to primer-intervening sequence. Only when a variant is observed in both reads of a read-pair does the signal contribute to a tally of read-pairs containing or not containing the variant. A user-defined threshold sets the minimum number and proportion of read-pairs in which a variant must be observed for a ‘call’ to be made. ROVER also reports the depth of coverage across amplicons to facilitate the identification of any regions that may require further screening. CONCLUSIONS: ROVER can facilitate rapid and accurate genetic variant calling for a broad range of PCR-MPS users. BioMed Central 2014-01-24 /pmc/articles/PMC3904415/ /pubmed/24461215 http://dx.doi.org/10.1186/1751-0473-9-3 Text en Copyright © 2014 Pope et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Review Pope, Bernard J Nguyen-Dumont, Tú Hammet, Fleur Park, Daniel J ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets |
title | ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets |
title_full | ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets |
title_fullStr | ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets |
title_full_unstemmed | ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets |
title_short | ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets |
title_sort | rover variant caller: read-pair overlap considerate variant-calling software applied to pcr-based massively parallel sequencing datasets |
topic | Software Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3904415/ https://www.ncbi.nlm.nih.gov/pubmed/24461215 http://dx.doi.org/10.1186/1751-0473-9-3 |
work_keys_str_mv | AT popebernardj rovervariantcallerreadpairoverlapconsideratevariantcallingsoftwareappliedtopcrbasedmassivelyparallelsequencingdatasets AT nguyendumonttu rovervariantcallerreadpairoverlapconsideratevariantcallingsoftwareappliedtopcrbasedmassivelyparallelsequencingdatasets AT hammetfleur rovervariantcallerreadpairoverlapconsideratevariantcallingsoftwareappliedtopcrbasedmassivelyparallelsequencingdatasets AT parkdanielj rovervariantcallerreadpairoverlapconsideratevariantcallingsoftwareappliedtopcrbasedmassivelyparallelsequencingdatasets |
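
The calling rule summarised in the METHODS part of the description (a variant counts only when both reads of a pair show it, and a call needs a minimum number and proportion of read-pairs) can be illustrated with a minimal Python sketch. This is not ROVER's own code: the function name `call_variants`, the parameters `min_pairs` and `min_fraction`, their default values, and the tuple representation of a variant are all assumptions made for illustration, and the sketch assumes every read-pair passed in belongs to the same amplicon and has already had its primers removed.

```python
from collections import Counter


def call_variants(read_pairs, min_pairs=2, min_fraction=0.15):
    """Hypothetical sketch of a both-reads-agree calling rule.

    read_pairs: iterable of (variants_in_read1, variants_in_read2), where each
    element is a set of hashable variant descriptions, for example
    (chromosome, position, ref, alt). All pairs are assumed to cover the
    same amplicon. The threshold defaults are illustrative only.
    """
    total_pairs = 0
    tally = Counter()
    for variants_read1, variants_read2 in read_pairs:
        total_pairs += 1
        # A variant contributes only if it is seen in both reads of the pair.
        for variant in set(variants_read1) & set(variants_read2):
            tally[variant] += 1

    calls = {}
    for variant, count in tally.items():
        fraction = count / total_pairs
        # Both the absolute count and the proportion must meet the thresholds.
        if count >= min_pairs and fraction >= min_fraction:
            calls[variant] = (count, fraction)
    return calls


# Toy data: four read-pairs over one amplicon (positions are made up).
snv_a = ("chr16", 100, "C", "T")
snv_b = ("chr16", 120, "G", "A")
pairs = [
    ({snv_a}, {snv_a}),          # snv_a confirmed by both reads
    ({snv_a, snv_b}, {snv_a}),   # snv_b seen in one read only, so ignored
    (set(), set()),              # pair with no variants
    ({snv_b}, set()),            # snv_b again unconfirmed
]
result = call_variants(pairs, min_pairs=2, min_fraction=0.5)
```

In this toy input `snv_a` is confirmed in two of four pairs and passes both thresholds, while `snv_b` never appears in both reads of any pair and so never enters the tally, which is the filtering effect the description attributes to complete read-pair overlap.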