PostSV: A Post–Processing Approach for Filtering Structural Variations

Genomic structural variations are significant causes of genome diversity and complex diseases. With advances in sequencing technologies, many algorithms have been designed to identify structural differences using next-generation sequencing (NGS) data. Due to repetitions in the human genome and the s...


Bibliographic Details
Main authors: Alzaid, Eman; Allali, Achraf El
Format: Online Article Text
Language: English
Published: SAGE Publications 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6974750/
https://www.ncbi.nlm.nih.gov/pubmed/32009779
http://dx.doi.org/10.1177/1177932219892957
author Alzaid, Eman
Allali, Achraf El
collection PubMed
description Genomic structural variations are significant causes of genome diversity and complex diseases. With advances in sequencing technologies, many algorithms have been designed to identify structural differences using next-generation sequencing (NGS) data. Due to repetitions in the human genome and the short reads produced by NGS, the discovery of structural variants (SVs) by state-of-the-art SV callers is not always accurate. To improve performance, multiple SV callers are often used to detect variants. However, most SV callers suffer from high false-positive rates, which diminishes the overall performance, especially in low-coverage genomes. In this article, we propose a post-processing classification–based algorithm that can be used to filter structural variation predictions produced by SV callers. Novel features are defined from putative SV predictions using reads at the local regions around the breakpoints. Several classifiers are employed to classify the candidate predictions and remove false positives. We test our classifier models on simulated and real genomes and show that the proposed approach improves the performance of state-of-the-art algorithms.
format Online
Article
Text
id pubmed-6974750
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-6974750
2020-01-31
PostSV: A Post–Processing Approach for Filtering Structural Variations
Alzaid, Eman
Allali, Achraf El
Bioinform Biol Insights
Original Research
SAGE Publications 2020-01-20
/pmc/articles/PMC6974750/
/pubmed/32009779
http://dx.doi.org/10.1177/1177932219892957
Text en
© The Author(s) 2020 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title PostSV: A Post–Processing Approach for Filtering Structural Variations
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6974750/
https://www.ncbi.nlm.nih.gov/pubmed/32009779
http://dx.doi.org/10.1177/1177932219892957