PostSV: A Post–Processing Approach for Filtering Structural Variations
Genomic structural variations are significant causes of genome diversity and complex diseases. With advances in sequencing technologies, many algorithms have been designed to identify structural differences using next-generation sequencing (NGS) data. Due to repetitions in the human genome and the s...
Main authors: | Alzaid, Eman; Allali, Achraf El |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6974750/ https://www.ncbi.nlm.nih.gov/pubmed/32009779 http://dx.doi.org/10.1177/1177932219892957 |
_version_ | 1783490159877029888 |
---|---|
author | Alzaid, Eman Allali, Achraf El |
author_facet | Alzaid, Eman Allali, Achraf El |
author_sort | Alzaid, Eman |
collection | PubMed |
description | Genomic structural variations are significant causes of genome diversity and complex diseases. With advances in sequencing technologies, many algorithms have been designed to identify structural differences using next-generation sequencing (NGS) data. Due to repetitions in the human genome and the short reads produced by NGS, the discovery of structural variants (SVs) by state-of-the-art SV callers is not always accurate. To improve performance, multiple SV callers are often used to detect variants. However, most SV callers suffer from high false-positive rates, which diminishes the overall performance, especially in low-coverage genomes. In this article, we propose a post-processing classification–based algorithm that can be used to filter structural variation predictions produced by SV callers. Novel features are defined from putative SV predictions using reads at the local regions around the breakpoints. Several classifiers are employed to classify the candidate predictions and remove false positives. We test our classifier models on simulated and real genomes and show that the proposed approach improves the performance of state-of-the-art algorithms. |
format | Online Article Text |
id | pubmed-6974750 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-69747502020-01-31 PostSV: A Post–Processing Approach for Filtering Structural Variations Alzaid, Eman Allali, Achraf El Bioinform Biol Insights Original Research Genomic structural variations are significant causes of genome diversity and complex diseases. With advances in sequencing technologies, many algorithms have been designed to identify structural differences using next-generation sequencing (NGS) data. Due to repetitions in the human genome and the short reads produced by NGS, the discovery of structural variants (SVs) by state-of-the-art SV callers is not always accurate. To improve performance, multiple SV callers are often used to detect variants. However, most SV callers suffer from high false-positive rates, which diminishes the overall performance, especially in low-coverage genomes. In this article, we propose a post-processing classification–based algorithm that can be used to filter structural variation predictions produced by SV callers. Novel features are defined from putative SV predictions using reads at the local regions around the breakpoints. Several classifiers are employed to classify the candidate predictions and remove false positives. We test our classifier models on simulated and real genomes and show that the proposed approach improves the performance of state-of-the-art algorithms. SAGE Publications 2020-01-20 /pmc/articles/PMC6974750/ /pubmed/32009779 http://dx.doi.org/10.1177/1177932219892957 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Original Research Alzaid, Eman Allali, Achraf El PostSV: A Post–Processing Approach for Filtering Structural Variations |
title | PostSV: A Post–Processing Approach for Filtering Structural Variations |
title_full | PostSV: A Post–Processing Approach for Filtering Structural Variations |
title_fullStr | PostSV: A Post–Processing Approach for Filtering Structural Variations |
title_full_unstemmed | PostSV: A Post–Processing Approach for Filtering Structural Variations |
title_short | PostSV: A Post–Processing Approach for Filtering Structural Variations |
title_sort | postsv: a post–processing approach for filtering structural variations |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6974750/ https://www.ncbi.nlm.nih.gov/pubmed/32009779 http://dx.doi.org/10.1177/1177932219892957 |
work_keys_str_mv | AT alzaideman postsvapostprocessingapproachforfilteringstructuralvariations AT allaliachrafel postsvapostprocessingapproachforfilteringstructuralvariations |