stLFRsv: A Germline Structural Variant Analysis Pipeline Using Co-barcoded Reads
Co-barcoded reads originating from long DNA fragments (mean length >30 kbp) maintain both single base level accuracy and long-range genomic information. We propose a pipeline, stLFRsv, to detect structural variation using co-barcoded reads. stLFRsv identifies abnormal large gaps between co-barcod...
Main authors: | Guo, Junfu; Shi, Chang; Chen, Xi; Wang, Ou; Liu, Ping; Yang, Huanming; Xu, Xun; Zhang, Wenwei; Zhu, Hongmei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8012683/ https://www.ncbi.nlm.nih.gov/pubmed/33815469 http://dx.doi.org/10.3389/fgene.2021.636239 |
_version_ | 1783673416634597376 |
---|---|
author | Guo, Junfu Shi, Chang Chen, Xi Wang, Ou Liu, Ping Yang, Huanming Xu, Xun Zhang, Wenwei Zhu, Hongmei |
author_sort | Guo, Junfu |
collection | PubMed |
description | Co-barcoded reads originating from long DNA fragments (mean length >30 kbp) maintain both single base level accuracy and long-range genomic information. We propose a pipeline, stLFRsv, to detect structural variation using co-barcoded reads. stLFRsv identifies abnormal large gaps between co-barcoded reads to detect potential breakpoints and reconstruct complex structural variants (SVs). Haplotype phasing by co-barcoded reads increases the signal to noise ratio, and barcode sharing profiles are used to filter out false positives. We integrate the short read SV caller smoove for smaller variants with stLFRsv. The integrated pipeline was evaluated on the well-characterized genome HG002/NA24385, and 74.5% precision and a 22.4% recall rate were obtained for deletions. stLFRsv revealed some large variants not included in the benchmark set that were verified by long reads or assembly. For the HG001/NA12878 genome, stLFRsv also achieved the best performance for both resource usage and the detection of large variants. Our work indicates that co-barcoded read technology has the potential to improve genome completeness. |
format | Online Article Text |
id | pubmed-8012683 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8012683 2021-04-02 Front Genet Genetics Frontiers Media S.A. 2021-03-18 /pmc/articles/PMC8012683/ /pubmed/33815469 http://dx.doi.org/10.3389/fgene.2021.636239 Text en Copyright © 2021 Guo, Shi, Chen, Wang, Liu, Yang, Xu, Zhang and Zhu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | stLFRsv: A Germline Structural Variant Analysis Pipeline Using Co-barcoded Reads |
title_sort | stlfrsv: a germline structural variant analysis pipeline using co-barcoded reads |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8012683/ https://www.ncbi.nlm.nih.gov/pubmed/33815469 http://dx.doi.org/10.3389/fgene.2021.636239 |
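
The description in this record says that stLFRsv looks for abnormally large gaps between co-barcoded reads to find potential breakpoints. The following is a minimal sketch of that general idea only, not the stLFRsv implementation. The input layout (`(barcode, chrom, pos)` tuples), the function names, and the thresholds `max_gap`, `bin_size`, and `min_barcodes` are all hypothetical choices made for illustration. Reads sharing a barcode are assumed to come from the same long fragment, which real data only approximates.

```python
from collections import defaultdict


def find_large_gaps(reads, max_gap=50_000):
    """Return (barcode, chrom, left_pos, right_pos) for each pair of
    consecutive same-barcode reads separated by more than max_gap bases.

    reads: iterable of (barcode, chrom, pos) tuples (hypothetical layout).
    max_gap: illustrative threshold, not a value taken from the paper.
    """
    positions_by_key = defaultdict(list)
    for barcode, chrom, pos in reads:
        positions_by_key[(barcode, chrom)].append(pos)

    gaps = []
    for (barcode, chrom), positions in sorted(positions_by_key.items()):
        positions.sort()
        # Compare each read with the next read carrying the same barcode.
        for left, right in zip(positions, positions[1:]):
            if right - left > max_gap:
                gaps.append((barcode, chrom, left, right))
    return gaps


def candidate_breakpoints(gaps, bin_size=10_000, min_barcodes=2):
    """Group gaps into coarse bins and keep bins supported by at least
    min_barcodes distinct barcodes.

    Returns sorted (chrom, left_bin_start, right_bin_start, n_barcodes).
    """
    support = defaultdict(set)
    for barcode, chrom, left, right in gaps:
        support[(chrom, left // bin_size, right // bin_size)].add(barcode)

    return sorted(
        (chrom, left_bin * bin_size, right_bin * bin_size, len(barcodes))
        for (chrom, left_bin, right_bin), barcodes in support.items()
        if len(barcodes) >= min_barcodes
    )


# Toy input: two barcodes span the same ~117 kbp gap, a third does not.
example_reads = [
    ("bc1", "chr1", 1_000), ("bc1", "chr1", 3_000), ("bc1", "chr1", 120_500),
    ("bc2", "chr1", 2_500), ("bc2", "chr1", 121_000),
    ("bc3", "chr1", 5_000), ("bc3", "chr1", 9_000),
]

if __name__ == "__main__":
    print(candidate_breakpoints(find_large_gaps(example_reads)))
```

Requiring support from several distinct barcodes is one simple way to suppress gaps caused by two unrelated fragments that happen to share a barcode. The record's description mentions barcode sharing profiles and haplotype phasing for that purpose, and neither is modeled here.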