Measuring, visualizing and diagnosing reference bias with biastools

A goal of recent alignment methods is to reduce reference bias, which occurs when reads containing non-reference alleles fail to align to their true point of origin. However, there is a lack of methods for systematically measuring, categorizing, and diagnosing reference bias. We present biastools, w...

Bibliographic Details
Main authors: Lin, Mao-Jan; Iyer, Sheila; Chen, Nae-Chyun; Langmead, Ben
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10515925/
https://www.ncbi.nlm.nih.gov/pubmed/37745608
http://dx.doi.org/10.1101/2023.09.13.557552
collection PubMed
description A goal of recent alignment methods is to reduce reference bias, which occurs when reads containing non-reference alleles fail to align to their true point of origin. However, there is a lack of methods for systematically measuring, categorizing, and diagnosing reference bias. We present biastools, which analyzes and categorizes instances of reference bias. Biastools has different sets of functionality tailored to different scenarios, i.e. (a) when the donor genome is well-characterized and input reads are simulated, (b) when the donor is well-characterized and reads are real, and (c) when the donor is not well-characterized and reads are real. When possible, biastools divides instances of reference bias into categories according to their cause: bias due to loss, flux, or local misalignment. Biastools’s scan mode detects large-scale mapping artifacts due to structural variation and flaws in the reference representation. Our findings confirm that including more variants in a graph genome alignment method results in fewer reference biases. We also find that end-to-end alignment modes are effective in reducing bias at insertions and deletions, compared to local aligners that allow soft clipping. Finally, we use biastools to characterize the ways in which using the new telomere-to-telomere human reference can improve bias at a large scale. In short, biastools is a tool uniquely focused on reference bias, making it a valuable resource as the field continues to develop new aligners and pangenome representations to reduce bias.
format Online
Article
Text
id pubmed-10515925
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10515925 2023-09-23 Measuring, visualizing and diagnosing reference bias with biastools. Lin, Mao-Jan; Iyer, Sheila; Chen, Nae-Chyun; Langmead, Ben. bioRxiv. Article.
Cold Spring Harbor Laboratory 2023-09-16 /pmc/articles/PMC10515925/ /pubmed/37745608 http://dx.doi.org/10.1101/2023.09.13.557552 Text en. This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Measuring, visualizing and diagnosing reference bias with biastools
topic Article