Measuring, visualizing and diagnosing reference bias with biastools
A goal of recent alignment methods is to reduce reference bias, which occurs when reads containing non-reference alleles fail to align to their true point of origin. However, there is a lack of methods for systematically measuring, categorizing, and diagnosing reference bias. We present biastools, w...
Main authors: | Lin, Mao-Jan; Iyer, Sheila; Chen, Nae-Chyun; Langmead, Ben |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10515925/ https://www.ncbi.nlm.nih.gov/pubmed/37745608 http://dx.doi.org/10.1101/2023.09.13.557552 |
_version_ | 1785109045218115584 |
---|---|
author | Lin, Mao-Jan Iyer, Sheila Chen, Nae-Chyun Langmead, Ben |
author_facet | Lin, Mao-Jan Iyer, Sheila Chen, Nae-Chyun Langmead, Ben |
author_sort | Lin, Mao-Jan |
collection | PubMed |
description | A goal of recent alignment methods is to reduce reference bias, which occurs when reads containing non-reference alleles fail to align to their true point of origin. However, there is a lack of methods for systematically measuring, categorizing, and diagnosing reference bias. We present biastools, which analyzes and categorizes instances of reference bias. Biastools has different sets of functionality tailored to different scenarios, i.e. (a) when the donor genome is well-characterized and input reads are simulated, (b) when the donor is well-characterized and reads are real, and (c) when the donor is not well-characterized and reads are real. When possible, biastools divides instances of reference bias into categories according to their cause: bias due to loss, flux, or local misalignment. Biastools’s scan mode detects large-scale mapping artifacts due to structural variation and flaws in the reference representation. Our findings confirm that including more variants in a graph genome alignment method results in fewer reference biases. We also find that end-to-end alignment modes are effective in reducing bias at insertions and deletions, compared to local aligners that allow soft clipping. Finally, we use biastools to characterize the ways in which using the new telomere-to-telomere human reference can improve bias at a large scale. In short, biastools is a tool uniquely focused on reference bias, making it a valuable resource as the field continues to develop new aligners and pangenome representations to reduce bias. |
format | Online Article Text |
id | pubmed-10515925 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-105159252023-09-23 Measuring, visualizing and diagnosing reference bias with biastools Lin, Mao-Jan Iyer, Sheila Chen, Nae-Chyun Langmead, Ben bioRxiv Article A goal of recent alignment methods is to reduce reference bias, which occurs when reads containing non-reference alleles fail to align to their true point of origin. However, there is a lack of methods for systematically measuring, categorizing, and diagnosing reference bias. We present biastools, which analyzes and categorizes instances of reference bias. Biastools has different sets of functionality tailored to different scenarios, i.e. (a) when the donor genome is well-characterized and input reads are simulated, (b) when the donor is well-characterized and reads are real, and (c) when the donor is not well-characterized and reads are real. When possible, biastools divides instances of reference bias into categories according to their cause: bias due to loss, flux, or local misalignment. Biastools’s scan mode detects large-scale mapping artifacts due to structural variation and flaws in the reference representation. Our findings confirm that including more variants in a graph genome alignment method results in fewer reference biases. We also find that end-to-end alignment modes are effective in reducing bias at insertions and deletions, compared to local aligners that allow soft clipping. Finally, we use biastools to characterize the ways in which using the new telomere-to-telomere human reference can improve bias at a large scale. In short, biastools is a tool uniquely focused on reference bias, making it a valuable resource as the field continues to develop new aligners and pangenome representations to reduce bias. 
Cold Spring Harbor Laboratory 2023-09-16 /pmc/articles/PMC10515925/ /pubmed/37745608 http://dx.doi.org/10.1101/2023.09.13.557552 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Lin, Mao-Jan Iyer, Sheila Chen, Nae-Chyun Langmead, Ben Measuring, visualizing and diagnosing reference bias with biastools |
title | Measuring, visualizing and diagnosing reference bias with biastools |
title_full | Measuring, visualizing and diagnosing reference bias with biastools |
title_fullStr | Measuring, visualizing and diagnosing reference bias with biastools |
title_full_unstemmed | Measuring, visualizing and diagnosing reference bias with biastools |
title_short | Measuring, visualizing and diagnosing reference bias with biastools |
title_sort | measuring, visualizing and diagnosing reference bias with biastools |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10515925/ https://www.ncbi.nlm.nih.gov/pubmed/37745608 http://dx.doi.org/10.1101/2023.09.13.557552 |
work_keys_str_mv | AT linmaojan measuringvisualizinganddiagnosingreferencebiaswithbiastools AT iyersheila measuringvisualizinganddiagnosingreferencebiaswithbiastools AT chennaechyun measuringvisualizinganddiagnosingreferencebiaswithbiastools AT langmeadben measuringvisualizinganddiagnosingreferencebiaswithbiastools |