Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
The accurate characterization of structural variation is crucial for our understanding of how large chromosomal alterations affect phenotypic differences and contribute to genome evolution. Whole-genome sequencing is a popular approach for identifying structural variants, but the accuracy of popular...
Main authors: | Lesack, Kyle; Mariene, Grace M.; Andersen, Erik C.; Wasmuth, James D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9803319/ https://www.ncbi.nlm.nih.gov/pubmed/36584177 http://dx.doi.org/10.1371/journal.pone.0278424 |
_version_ | 1784861858830745600 |
---|---|
author | Lesack, Kyle Mariene, Grace M. Andersen, Erik C. Wasmuth, James D. |
author_facet | Lesack, Kyle Mariene, Grace M. Andersen, Erik C. Wasmuth, James D. |
author_sort | Lesack, Kyle |
collection | PubMed |
description | The accurate characterization of structural variation is crucial for our understanding of how large chromosomal alterations affect phenotypic differences and contribute to genome evolution. Whole-genome sequencing is a popular approach for identifying structural variants, but the accuracy of popular tools remains unclear due to the limitations of existing benchmarks. Moreover, the performance of these tools for predicting variants in non-human genomes is less certain, as most tools were developed and benchmarked using data from the human genome. To evaluate the use of long-read data for the validation of short-read structural variant calls, the agreement between predictions from a short-read ensemble learning method and long-read tools was compared using real and simulated data from Caenorhabditis elegans. The results obtained from simulated data indicate that the best performing tool is contingent on the type and size of the variant, as well as the sequencing depth of coverage. These results also highlight the need for reference datasets generated from real data that can be used as ‘ground truth’ in benchmarks. |
format | Online Article Text |
id | pubmed-9803319 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9803319 2022-12-31 Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans Lesack, Kyle Mariene, Grace M. Andersen, Erik C. Wasmuth, James D. PLoS One Research Article The accurate characterization of structural variation is crucial for our understanding of how large chromosomal alterations affect phenotypic differences and contribute to genome evolution. Whole-genome sequencing is a popular approach for identifying structural variants, but the accuracy of popular tools remains unclear due to the limitations of existing benchmarks. Moreover, the performance of these tools for predicting variants in non-human genomes is less certain, as most tools were developed and benchmarked using data from the human genome. To evaluate the use of long-read data for the validation of short-read structural variant calls, the agreement between predictions from a short-read ensemble learning method and long-read tools was compared using real and simulated data from Caenorhabditis elegans. The results obtained from simulated data indicate that the best performing tool is contingent on the type and size of the variant, as well as the sequencing depth of coverage. These results also highlight the need for reference datasets generated from real data that can be used as ‘ground truth’ in benchmarks. Public Library of Science 2022-12-30 /pmc/articles/PMC9803319/ /pubmed/36584177 http://dx.doi.org/10.1371/journal.pone.0278424 Text en © 2022 Lesack et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Lesack, Kyle Mariene, Grace M. Andersen, Erik C. Wasmuth, James D. Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans |
title | Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans |
title_full | Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans |
title_fullStr | Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans |
title_full_unstemmed | Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans |
title_short | Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans |
title_sort | different structural variant prediction tools yield considerably different results in caenorhabditis elegans |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9803319/ https://www.ncbi.nlm.nih.gov/pubmed/36584177 http://dx.doi.org/10.1371/journal.pone.0278424 |
work_keys_str_mv | AT lesackkyle differentstructuralvariantpredictiontoolsyieldconsiderablydifferentresultsincaenorhabditiselegans AT marienegracem differentstructuralvariantpredictiontoolsyieldconsiderablydifferentresultsincaenorhabditiselegans AT andersenerikc differentstructuralvariantpredictiontoolsyieldconsiderablydifferentresultsincaenorhabditiselegans AT wasmuthjamesd differentstructuralvariantpredictiontoolsyieldconsiderablydifferentresultsincaenorhabditiselegans |