bamSliceR: cross-cohort variant and allelic bias analysis for rare variants and rare diseases
Rare diseases and conditions create unique challenges for genetic epidemiologists precisely because cases and samples are scarce. In recent years, whole-genome and whole-transcriptome sequencing (WGS/WTS) have eased the study of rare genetic variants. Paired WGS and WTS data are ideal, but logistica...
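Main authors: | Huang, Yizhou Peter; Harmon, Lauren; Gardner, Eve; Ma, Xiaotu; Harsh, Josiah; Xue, Zhaoyu; Wen, Hong; Ramos, Marcel; Davis, Sean; Triche, Timothy J. |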
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516001/ https://www.ncbi.nlm.nih.gov/pubmed/37745420 http://dx.doi.org/10.1101/2023.09.15.558026 |
_version_ | 1785109053469360128 |
---|---|
author | Huang, Yizhou Peter; Harmon, Lauren; Gardner, Eve; Ma, Xiaotu; Harsh, Josiah; Xue, Zhaoyu; Wen, Hong; Ramos, Marcel; Davis, Sean; Triche, Timothy J. |
author_facet | Huang, Yizhou Peter; Harmon, Lauren; Gardner, Eve; Ma, Xiaotu; Harsh, Josiah; Xue, Zhaoyu; Wen, Hong; Ramos, Marcel; Davis, Sean; Triche, Timothy J. |
author_sort | Huang, Yizhou Peter |
collection | PubMed |
description | Rare diseases and conditions create unique challenges for genetic epidemiologists precisely because cases and samples are scarce. In recent years, whole-genome and whole-transcriptome sequencing (WGS/WTS) have eased the study of rare genetic variants. Paired WGS and WTS data are ideal, but logistical and financial constraints often preclude generating paired WGS and WTS data. Thus, many databases contain a patchwork of specimens with either WGS or WTS data, but only a minority of samples have both. The NCI Genomic Data Commons facilitates controlled access to genomic and transcriptomic data for thousands of subjects, many with unpaired sequencing results. Local reanalysis of expressed variants across whole transcriptomes requires significant data storage, compute, and expertise. We developed the bamSliceR package to facilitate swift transition from aligned sequence reads to expressed variant characterization. bamSliceR leverages the NCI Genomic Data Commons API to query genomic sub-regions of aligned sequence reads from specimens identified through the robust Bioconductor ecosystem. We demonstrate how population-scale targeted genomic analysis can be completed using orders of magnitude fewer resources in this fashion, with minimal compute burden. We demonstrate pilot results from bamSliceR for the TARGET pediatric AML and BEAT-AML projects, where identification of rare but recurrent somatic variants directly yields biologically testable hypotheses. bamSliceR and its documentation are freely available on GitHub at https://github.com/trichelab/bamSliceR. |
format | Online Article Text |
id | pubmed-10516001 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10516001 2023-09-23 bamSliceR: cross-cohort variant and allelic bias analysis for rare variants and rare diseases. Huang, Yizhou Peter; Harmon, Lauren; Gardner, Eve; Ma, Xiaotu; Harsh, Josiah; Xue, Zhaoyu; Wen, Hong; Ramos, Marcel; Davis, Sean; Triche, Timothy J. bioRxiv. Article. Cold Spring Harbor Laboratory 2023-09-17 /pmc/articles/PMC10516001/ /pubmed/37745420 http://dx.doi.org/10.1101/2023.09.15.558026 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article. Huang, Yizhou Peter; Harmon, Lauren; Gardner, Eve; Ma, Xiaotu; Harsh, Josiah; Xue, Zhaoyu; Wen, Hong; Ramos, Marcel; Davis, Sean; Triche, Timothy J. bamSliceR: cross-cohort variant and allelic bias analysis for rare variants and rare diseases |
title | bamSliceR: cross-cohort variant and allelic bias analysis for rare variants and rare diseases |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516001/ https://www.ncbi.nlm.nih.gov/pubmed/37745420 http://dx.doi.org/10.1101/2023.09.15.558026 |