TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions
MOTIVATION: Small insertions and deletions (sindels) in the human genome have important implications for human disease. One important mechanism by which non-coding sindels (nc-sindels) affect human diseases and phenotypes is the regulation of gene expression. Nevertheless, current sequencin...
Main authors: | Agarwal, Aman; Zhao, Fengdi; Jiang, Yuchao; Chen, Li |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9900211/ https://www.ncbi.nlm.nih.gov/pubmed/36707993 http://dx.doi.org/10.1093/bioinformatics/btad060 |
_version_ | 1784882799971401728 |
---|---|
author | Agarwal, Aman; Zhao, Fengdi; Jiang, Yuchao; Chen, Li |
author_facet | Agarwal, Aman; Zhao, Fengdi; Jiang, Yuchao; Chen, Li |
author_sort | Agarwal, Aman |
collection | PubMed |
description | MOTIVATION: Small insertions and deletions (sindels) in the human genome have important implications for human disease. One important mechanism by which non-coding sindels (nc-sindels) affect human diseases and phenotypes is the regulation of gene expression. Nevertheless, current sequencing experiments may lack the statistical power and resolution to pinpoint functional sindels because of their lower minor allele frequency or small effect size. As an alternative strategy, a supervised machine learning method can identify otherwise masked functional sindels by predicting their regulatory potential directly. However, computational methods for annotating and predicting regulatory sindels, especially in non-coding regions, are underdeveloped. RESULTS: By leveraging labeled nc-sindels identified by cis-expression quantitative trait loci analyses across 44 tissues in Genotype-Tissue Expression (GTEx), together with a compilation of generic functional annotations and large-scale epigenomic profiles, we develop TIssue-specific Variant Annotation for Non-coding indel (TIVAN-indel), a supervised computational framework for predicting non-coding regulatory sindels. We demonstrate that TIVAN-indel achieves the best prediction performance in both within-tissue and cross-tissue prediction. As an independent evaluation, we train TIVAN-indel on the ‘Whole Blood’ tissue in GTEx and test the model using 15 immune cell types from an independent study, the Database of Immune Cell Expression. Lastly, we perform an enrichment analysis for both true and predicted sindels in key regulatory regions such as chromatin interactions, open chromatin regions and histone modification sites, and find biologically meaningful enrichment patterns. AVAILABILITY AND IMPLEMENTATION: https://github.com/lichen-lab/TIVAN-indel SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9900211 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9900211 2023-02-07 TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions Agarwal, Aman; Zhao, Fengdi; Jiang, Yuchao; Chen, Li Bioinformatics Original Paper MOTIVATION: Small insertions and deletions (sindels) in the human genome have important implications for human disease. One important mechanism by which non-coding sindels (nc-sindels) affect human diseases and phenotypes is the regulation of gene expression. Nevertheless, current sequencing experiments may lack the statistical power and resolution to pinpoint functional sindels because of their lower minor allele frequency or small effect size. As an alternative strategy, a supervised machine learning method can identify otherwise masked functional sindels by predicting their regulatory potential directly. However, computational methods for annotating and predicting regulatory sindels, especially in non-coding regions, are underdeveloped. RESULTS: By leveraging labeled nc-sindels identified by cis-expression quantitative trait loci analyses across 44 tissues in Genotype-Tissue Expression (GTEx), together with a compilation of generic functional annotations and large-scale epigenomic profiles, we develop TIssue-specific Variant Annotation for Non-coding indel (TIVAN-indel), a supervised computational framework for predicting non-coding regulatory sindels. We demonstrate that TIVAN-indel achieves the best prediction performance in both within-tissue and cross-tissue prediction. As an independent evaluation, we train TIVAN-indel on the ‘Whole Blood’ tissue in GTEx and test the model using 15 immune cell types from an independent study, the Database of Immune Cell Expression. Lastly, we perform an enrichment analysis for both true and predicted sindels in key regulatory regions such as chromatin interactions, open chromatin regions and histone modification sites, and find biologically meaningful enrichment patterns. AVAILABILITY AND IMPLEMENTATION: https://github.com/lichen-lab/TIVAN-indel SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2023-01-27 /pmc/articles/PMC9900211/ /pubmed/36707993 http://dx.doi.org/10.1093/bioinformatics/btad060 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Agarwal, Aman Zhao, Fengdi Jiang, Yuchao Chen, Li TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions |
title | TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions |
title_full | TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions |
title_fullStr | TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions |
title_full_unstemmed | TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions |
title_short | TIVAN-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions |
title_sort | tivan-indel: a computational framework for annotating and predicting non-coding regulatory small insertions and deletions |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9900211/ https://www.ncbi.nlm.nih.gov/pubmed/36707993 http://dx.doi.org/10.1093/bioinformatics/btad060 |
work_keys_str_mv | AT agarwalaman tivanindelacomputationalframeworkforannotatingandpredictingnoncodingregulatorysmallinsertionsanddeletions AT zhaofengdi tivanindelacomputationalframeworkforannotatingandpredictingnoncodingregulatorysmallinsertionsanddeletions AT jiangyuchao tivanindelacomputationalframeworkforannotatingandpredictingnoncodingregulatorysmallinsertionsanddeletions AT chenli tivanindelacomputationalframeworkforannotatingandpredictingnoncodingregulatorysmallinsertionsanddeletions |