
SVhound: detection of regions that harbor yet undetected structural variation

BACKGROUND: Recent population studies include ever-growing numbers of samples to investigate the diversity of a population or species. These studies reveal new polymorphisms that lead to important insights into the mechanisms of evolution, but are also important for the interpretation of these variatio...


Bibliographic Details
Main authors: Paulin, Luis F., Raveendran, Muthuswamy, Harris, R. Alan, Rogers, Jeffrey, von Haeseler, Arndt, Sedlazeck, Fritz J.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9854228/
https://www.ncbi.nlm.nih.gov/pubmed/36670361
http://dx.doi.org/10.1186/s12859-022-05046-6
_version_ 1784873072246915072
author Paulin, Luis F.
Raveendran, Muthuswamy
Harris, R. Alan
Rogers, Jeffrey
von Haeseler, Arndt
Sedlazeck, Fritz J.
author_facet Paulin, Luis F.
Raveendran, Muthuswamy
Harris, R. Alan
Rogers, Jeffrey
von Haeseler, Arndt
Sedlazeck, Fritz J.
author_sort Paulin, Luis F.
collection PubMed
description BACKGROUND: Recent population studies include ever-growing numbers of samples to investigate the diversity of a population or species. These studies reveal new polymorphisms that lead to important insights into the mechanisms of evolution, but are also important for the interpretation of these variations. Nevertheless, while the full catalog of variations across entire species remains unknown, we can predict which regions harbor additional, not yet detected variations and investigate their properties, thereby enhancing the analysis for potentially missed variants. RESULTS: To achieve this, we developed SVhound (https://github.com/lfpaulin/SVhound), which, based on a population-level SV dataset, can predict regions that harbor unseen SV alleles. We tested SVhound using subsets of the 1000 Genomes Project data and showed that its predictions correlate highly with the full data set (average correlation of 2800 tests, r = 0.7136). Next, we used SVhound to investigate potentially missed or understudied regions across 1KGP and CCDG. Lastly, we applied SVhound to a small and novel SV call set for rhesus macaque (Macaca mulatta) and discuss the impact and choice of parameters for SVhound. CONCLUSIONS: SVhound is a unique method to identify potential regions that harbor hidden diversity in model and non-model organisms and can also potentially be used to ensure high quality of SV call sets. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05046-6.
format Online
Article
Text
id pubmed-9854228
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-98542282023-01-21 SVhound: detection of regions that harbor yet undetected structural variation Paulin, Luis F. Raveendran, Muthuswamy Harris, R. Alan Rogers, Jeffrey von Haeseler, Arndt Sedlazeck, Fritz J. BMC Bioinformatics Research Article BACKGROUND: Recent population studies include ever-growing numbers of samples to investigate the diversity of a population or species. These studies reveal new polymorphisms that lead to important insights into the mechanisms of evolution, but are also important for the interpretation of these variations. Nevertheless, while the full catalog of variations across entire species remains unknown, we can predict which regions harbor additional, not yet detected variations and investigate their properties, thereby enhancing the analysis for potentially missed variants. RESULTS: To achieve this, we developed SVhound (https://github.com/lfpaulin/SVhound), which, based on a population-level SV dataset, can predict regions that harbor unseen SV alleles. We tested SVhound using subsets of the 1000 Genomes Project data and showed that its predictions correlate highly with the full data set (average correlation of 2800 tests, r = 0.7136). Next, we used SVhound to investigate potentially missed or understudied regions across 1KGP and CCDG. Lastly, we applied SVhound to a small and novel SV call set for rhesus macaque (Macaca mulatta) and discuss the impact and choice of parameters for SVhound. CONCLUSIONS: SVhound is a unique method to identify potential regions that harbor hidden diversity in model and non-model organisms and can also potentially be used to ensure high quality of SV call sets. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05046-6.
BioMed Central 2023-01-20 /pmc/articles/PMC9854228/ /pubmed/36670361 http://dx.doi.org/10.1186/s12859-022-05046-6 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Paulin, Luis F.
Raveendran, Muthuswamy
Harris, R. Alan
Rogers, Jeffrey
von Haeseler, Arndt
Sedlazeck, Fritz J.
SVhound: detection of regions that harbor yet undetected structural variation
title SVhound: detection of regions that harbor yet undetected structural variation
title_full SVhound: detection of regions that harbor yet undetected structural variation
title_fullStr SVhound: detection of regions that harbor yet undetected structural variation
title_full_unstemmed SVhound: detection of regions that harbor yet undetected structural variation
title_short SVhound: detection of regions that harbor yet undetected structural variation
title_sort svhound: detection of regions that harbor yet undetected structural variation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9854228/
https://www.ncbi.nlm.nih.gov/pubmed/36670361
http://dx.doi.org/10.1186/s12859-022-05046-6
work_keys_str_mv AT paulinluisf svhounddetectionofregionsthatharboryetundetectedstructuralvariation
AT raveendranmuthuswamy svhounddetectionofregionsthatharboryetundetectedstructuralvariation
AT harrisralan svhounddetectionofregionsthatharboryetundetectedstructuralvariation
AT rogersjeffrey svhounddetectionofregionsthatharboryetundetectedstructuralvariation
AT vonhaeselerarndt svhounddetectionofregionsthatharboryetundetectedstructuralvariation
AT sedlazeckfritzj svhounddetectionofregionsthatharboryetundetectedstructuralvariation