Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans

The accurate characterization of structural variation is crucial for our understanding of how large chromosomal alterations affect phenotypic differences and contribute to genome evolution. Whole-genome sequencing is a popular approach for identifying structural variants, but the accuracy of popular...

Bibliographic Details
Main Authors: Lesack, Kyle; Mariene, Grace M.; Andersen, Erik C.; Wasmuth, James D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9803319/
https://www.ncbi.nlm.nih.gov/pubmed/36584177
http://dx.doi.org/10.1371/journal.pone.0278424
_version_ 1784861858830745600
author Lesack, Kyle
Mariene, Grace M.
Andersen, Erik C.
Wasmuth, James D.
author_facet Lesack, Kyle
Mariene, Grace M.
Andersen, Erik C.
Wasmuth, James D.
author_sort Lesack, Kyle
collection PubMed
description The accurate characterization of structural variation is crucial for our understanding of how large chromosomal alterations affect phenotypic differences and contribute to genome evolution. Whole-genome sequencing is a popular approach for identifying structural variants, but the accuracy of popular tools remains unclear due to the limitations of existing benchmarks. Moreover, the performance of these tools for predicting variants in non-human genomes is less certain, as most tools were developed and benchmarked using data from the human genome. To evaluate the use of long-read data for the validation of short-read structural variant calls, the agreement between predictions from a short-read ensemble learning method and long-read tools was compared using real and simulated data from Caenorhabditis elegans. The results obtained from simulated data indicate that the best performing tool is contingent on the type and size of the variant, as well as the sequencing depth of coverage. These results also highlight the need for reference datasets generated from real data that can be used as ‘ground truth’ in benchmarks.
format Online
Article
Text
id pubmed-9803319
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9803319 2022-12-31 Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans Lesack, Kyle Mariene, Grace M. Andersen, Erik C. Wasmuth, James D. PLoS One Research Article The accurate characterization of structural variation is crucial for our understanding of how large chromosomal alterations affect phenotypic differences and contribute to genome evolution. Whole-genome sequencing is a popular approach for identifying structural variants, but the accuracy of popular tools remains unclear due to the limitations of existing benchmarks. Moreover, the performance of these tools for predicting variants in non-human genomes is less certain, as most tools were developed and benchmarked using data from the human genome. To evaluate the use of long-read data for the validation of short-read structural variant calls, the agreement between predictions from a short-read ensemble learning method and long-read tools was compared using real and simulated data from Caenorhabditis elegans. The results obtained from simulated data indicate that the best performing tool is contingent on the type and size of the variant, as well as the sequencing depth of coverage. These results also highlight the need for reference datasets generated from real data that can be used as ‘ground truth’ in benchmarks. Public Library of Science 2022-12-30 /pmc/articles/PMC9803319/ /pubmed/36584177 http://dx.doi.org/10.1371/journal.pone.0278424 Text en © 2022 Lesack et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Lesack, Kyle
Mariene, Grace M.
Andersen, Erik C.
Wasmuth, James D.
Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
title Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
title_full Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
title_fullStr Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
title_full_unstemmed Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
title_short Different structural variant prediction tools yield considerably different results in Caenorhabditis elegans
title_sort different structural variant prediction tools yield considerably different results in caenorhabditis elegans
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9803319/
https://www.ncbi.nlm.nih.gov/pubmed/36584177
http://dx.doi.org/10.1371/journal.pone.0278424