FILER: a framework for harmonizing and querying large-scale functional genomics knowledge
Querying massive functional genomic and annotation data collections, linking and summarizing the query results across data sources/data types are important steps in high-throughput genomic and genetic analytical workflows. However, these steps are made difficult by the heterogeneity and breadth of d...
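Main authors: | Kuksa, Pavel P; Leung, Yuk Yee; Gangadharan, Prabhakaran; Katanic, Zivadin; Kleidermacher, Lauren; Amlie-Wolf, Alexandre; Lee, Chien-Yueh; Qu, Liming; Greenfest-Allen, Emily; Valladares, Otto; Wang, Li-San |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | Standard Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8759563/ https://www.ncbi.nlm.nih.gov/pubmed/35047815 http://dx.doi.org/10.1093/nargab/lqab123 |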
_version_ | 1784633127362101248 |
---|---|
author | Kuksa, Pavel P Leung, Yuk Yee Gangadharan, Prabhakaran Katanic, Zivadin Kleidermacher, Lauren Amlie-Wolf, Alexandre Lee, Chien-Yueh Qu, Liming Greenfest-Allen, Emily Valladares, Otto Wang, Li-San |
author_sort | Kuksa, Pavel P |
collection | PubMed |
description | Querying massive functional genomic and annotation data collections, linking and summarizing the query results across data sources/data types are important steps in high-throughput genomic and genetic analytical workflows. However, these steps are made difficult by the heterogeneity and breadth of data sources, experimental assays, biological conditions/tissues/cell types and file formats. FILER (FunctIonaL gEnomics Repository) is a framework for querying large-scale genomics knowledge with a large, curated integrated catalog of harmonized functional genomic and annotation data coupled with a scalable genomic search and querying interface. FILER uniquely provides: (i) streamlined access to >50 000 harmonized, annotated genomic datasets across >20 integrated data sources, >1100 tissues/cell types and >20 experimental assays; (ii) a scalable genomic querying interface; and (iii) ability to analyze and annotate user’s experimental data. This rich resource spans >17 billion GRCh37/hg19 and GRCh38/hg38 genomic records. Our benchmark querying 7 × 10^9 hg19 FILER records shows FILER is highly scalable, with a sub-linear 32-fold increase in querying time when increasing the number of queries 1000-fold from 1000 to 1 000 000 intervals. Together, these features facilitate reproducible research and streamline integrating/querying large-scale genomic data within analyses/workflows. FILER can be deployed on cloud or local servers (https://bitbucket.org/wanglab-upenn/FILER) for integration with custom pipelines and is freely available (https://lisanwanglab.org/FILER). |
format | Online Article Text |
id | pubmed-8759563 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8759563 2022-01-18 NAR Genom Bioinform Standard Article Oxford University Press 2022-01-14 /pmc/articles/PMC8759563/ /pubmed/35047815 http://dx.doi.org/10.1093/nargab/lqab123 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | FILER: a framework for harmonizing and querying large-scale functional genomics knowledge |
topic | Standard Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8759563/ https://www.ncbi.nlm.nih.gov/pubmed/35047815 http://dx.doi.org/10.1093/nargab/lqab123 |
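
The "sub-linear" scaling quoted in the description (a 32-fold increase in querying time for a 1000-fold increase in the number of query intervals) can be checked with a short calculation. The following is a minimal sketch, assuming query time follows a power law t ∝ n^k; the power-law model and the function name `scaling_exponent` are illustrative assumptions and do not come from the record or from FILER itself.

```python
import math


def scaling_exponent(query_ratio: float, time_ratio: float) -> float:
    """Return k such that time_ratio == query_ratio ** k.

    Hypothetical helper: assumes a power-law relation between the
    number of queries and total query time (not stated in the record).
    """
    if query_ratio <= 1 or time_ratio <= 0:
        raise ValueError("query_ratio must be > 1 and time_ratio must be > 0")
    return math.log(time_ratio) / math.log(query_ratio)


# Figures quoted in the description: 1000 -> 1 000 000 intervals, 32-fold time.
query_ratio = 1_000_000 / 1_000
time_ratio = 32.0

k = scaling_exponent(query_ratio, time_ratio)
# Relative time per query at the larger workload compared with the smaller one.
per_query_ratio = time_ratio / query_ratio

print(f"scaling exponent k = {k:.2f}")
print(f"per-query time ratio = {per_query_ratio:.3f}")
```

Under this assumed model, an exponent below 1 corresponds to sub-linear growth, and the per-query ratio shows how much cheaper each query becomes on average at the larger workload.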