A Framework for Automated Gene Selection in Genomic Applications
PURPOSE: An efficient framework to identify disease-associated genes is needed to evaluate genomic data for both individuals with an unknown disease etiology and those undergoing genomic screening. Here, we propose a framework for gene selection used in genomic analyses, including applications limit...
Main Authors: | Lazo de la Vega, L; Yu, W; Machini, K; Austin-Tse, CA; Hao, L; Blout Zawatsky, CL; Mason-Suares, H; Green, RC; Rehm, HL; Lebo, MS |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8487927/ https://www.ncbi.nlm.nih.gov/pubmed/34113001 http://dx.doi.org/10.1038/s41436-021-01213-x |
_version_ | 1784578053728370688 |
---|---|
author | Lazo de la Vega, L Yu, W Machini, K Austin-Tse, CA Hao, L Blout Zawatsky, CL Mason-Suares, H Green, RC Rehm, HL Lebo, MS |
author_sort | Lazo de la Vega, L |
collection | PubMed |
description | PURPOSE: An efficient framework to identify disease-associated genes is needed to evaluate genomic data for both individuals with an unknown disease etiology and those undergoing genomic screening. Here, we propose a framework for gene selection used in genomic analyses, including applications limited to genes with strong or established evidence levels and applications including genes with less or emerging evidence of disease association. METHODS: We extracted genes with evidence for gene-disease association from the Human Gene Mutation Database, Online Mendelian Inheritance in Man, and ClinVar to build a comprehensive gene list of 6,145 genes. Next, we applied stringent filters in conjunction with computationally curated evidence (DisGeNET) to create a restrictive list limited to 3,929 genes with stronger disease associations. RESULTS: When compared to manual gene curation efforts, including the Clinical Genome Resource, genes with strong or definitive disease associations are included in both gene lists at high percentages, while genes with limited evidence are largely removed. We further confirmed the utility of this approach in identifying pathogenic and likely pathogenic variants in 45 genomes. CONCLUSION: Our approach efficiently creates highly sensitive gene lists for genomic applications, while remaining dynamic and updatable, enabling time savings in genomic applications. |
format | Online Article Text |
id | pubmed-8487927 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-8487927 2021-12-10 Genet Med Article 2021-06-10 2021-10 /pmc/articles/PMC8487927/ /pubmed/34113001 http://dx.doi.org/10.1038/s41436-021-01213-x Text en http://www.nature.com/authors/editorial_policies/license.html#terms Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms |
title | A Framework for Automated Gene Selection in Genomic Applications |
topic | Article |
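
The two-stage selection described in the abstract (a comprehensive list merged from several sources, then a restrictive list filtered on additional evidence) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the function names, the toy gene sets, the `GENE_X`/`GENE_Y` placeholders, the per-gene evidence scores, and the `min_score` cutoff are hypothetical, and the abstract does not state which filters the authors actually applied.

```python
def build_comprehensive_list(*sources):
    """Union of gene symbols reported by any of the input sources."""
    genes = set()
    for source in sources:
        genes |= set(source)
    return genes


def build_restrictive_list(comprehensive, evidence_scores, min_score=0.3):
    """Keep genes whose evidence score meets the cutoff.

    The score lookup and the cutoff are placeholders standing in for the
    'stringent filters' mentioned in the abstract; genes without a score
    are treated as having no supporting evidence.
    """
    return {
        gene for gene in comprehensive
        if evidence_scores.get(gene, 0.0) >= min_score
    }


# Toy inputs standing in for gene sets extracted from three sources.
hgmd = {"BRCA1", "MYH7", "GENE_X"}
omim = {"BRCA1", "CFTR"}
clinvar = {"CFTR", "MYH7", "GENE_Y"}

# Toy evidence scores (invented values, not real DisGeNET data).
scores = {"BRCA1": 0.9, "CFTR": 0.8, "MYH7": 0.7, "GENE_X": 0.1}

comprehensive = build_comprehensive_list(hgmd, omim, clinvar)
restrictive = build_restrictive_list(comprehensive, scores)

print(sorted(comprehensive))
print(sorted(restrictive))
```

By construction the restrictive list is always a subset of the comprehensive one, which mirrors the relationship between the 3,929-gene and 6,145-gene lists reported in the abstract.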