A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals
Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation d...
Main Authors: | Turakhia, Yatish; Chen, Heidi I; Marcovitz, Amir; Bejerano, Gill |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7498332/ https://www.ncbi.nlm.nih.gov/pubmed/32614390 http://dx.doi.org/10.1093/nar/gkaa550 |
_version_ | 1783583487477940224 |
---|---|
author | Turakhia, Yatish Chen, Heidi I Marcovitz, Amir Bejerano, Gill |
author_facet | Turakhia, Yatish Chen, Heidi I Marcovitz, Amir Bejerano, Gill |
author_sort | Turakhia, Yatish |
collection | PubMed |
description | Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation due to sequencing and assembly artifacts. Previous attempts to annotate gene losses have required significant manual curation, which hampers their scalability for the ever-increasing deluge of newly sequenced genomes. Using extreme sequence erosion (amino acid deletions and substitutions) and sister species support as an unambiguous signature of loss, we developed an automated approach for detecting high-confidence gene loss events across a species tree. Our approach relies solely on gene annotation in a single reference genome, raw assemblies for the remaining species to analyze, and the associated phylogenetic tree for all organisms involved. Using human as reference, we discovered over 400 unique human ortholog erosion events across 58 mammals. This includes dozens of clade-specific losses of genes that result in early mouse lethality or are associated with severe human congenital diseases. Our discoveries yield intriguing potential for translational medical genetics and evolutionary biology, and our approach is readily applicable to large-scale genome sequencing efforts across the tree of life. |
format | Online Article Text |
id | pubmed-7498332 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-74983322020-09-23 A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals Turakhia, Yatish Chen, Heidi I Marcovitz, Amir Bejerano, Gill Nucleic Acids Res Methods Online Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation due to sequencing and assembly artifacts. Previous attempts to annotate gene losses have required significant manual curation, which hampers their scalability for the ever-increasing deluge of newly sequenced genomes. Using extreme sequence erosion (amino acid deletions and substitutions) and sister species support as an unambiguous signature of loss, we developed an automated approach for detecting high-confidence gene loss events across a species tree. Our approach relies solely on gene annotation in a single reference genome, raw assemblies for the remaining species to analyze, and the associated phylogenetic tree for all organisms involved. Using human as reference, we discovered over 400 unique human ortholog erosion events across 58 mammals. This includes dozens of clade-specific losses of genes that result in early mouse lethality or are associated with severe human congenital diseases. Our discoveries yield intriguing potential for translational medical genetics and evolutionary biology, and our approach is readily applicable to large-scale genome sequencing efforts across the tree of life. Oxford University Press 2020-07-02 /pmc/articles/PMC7498332/ /pubmed/32614390 http://dx.doi.org/10.1093/nar/gkaa550 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research. 
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Turakhia, Yatish Chen, Heidi I Marcovitz, Amir Bejerano, Gill A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals |
title | A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals |
title_full | A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals |
title_fullStr | A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals |
title_full_unstemmed | A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals |
title_short | A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals |
title_sort | fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7498332/ https://www.ncbi.nlm.nih.gov/pubmed/32614390 http://dx.doi.org/10.1093/nar/gkaa550 |
work_keys_str_mv | AT turakhiayatish afullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals AT chenheidii afullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals AT marcovitzamir afullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals AT bejeranogill afullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals AT turakhiayatish fullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals AT chenheidii fullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals AT marcovitzamir fullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals AT bejeranogill fullyautomatedmethoddiscoverslossofmouselethalandhumanmonogenicdiseasegenesin58mammals |
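
The description field names two signals for calling a loss: extreme sequence erosion (amino acid deletions and substitutions) and sister species support. The following is a minimal illustrative sketch of how those two signals might be combined on a toy tree. It is not the authors' implementation. The tree, the species names, the sequences, the 0.6 threshold, and the reading of "sister species support" as "at least one sibling in the tree is also eroded" are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Hypothetical toy species tree, stored as child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "cladeA": "root",
    "cladeB": "root",
    "sp1": "cladeA",
    "sp2": "cladeA",
    "sp3": "cladeB",
    "sp4": "cladeB",
}


def siblings(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the other children of this node's parent."""
    p = parent[node]
    if p is None:
        return []
    return [n for n, q in parent.items() if q == p and n != node]


def eroded_fraction(ref: str, aligned: str) -> float:
    """Fraction of reference residues that are deleted ('-') or substituted.

    Both strings are assumed to be the same length, i.e. the query is
    already aligned to the reference protein, column by column.
    """
    if not ref or len(ref) != len(aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    # A gap character never equals a residue, so deletions count as changes.
    changed = sum(1 for r, q in zip(ref, aligned) if q != r)
    return changed / len(ref)


def call_losses(
    ref: str,
    aligned_by_species: Dict[str, str],
    parent: Dict[str, Optional[str]],
    threshold: float = 0.6,  # assumed cutoff, chosen only for illustration
) -> List[str]:
    """Report species that are eroded AND have at least one eroded sibling."""
    eroded = {
        sp
        for sp, seq in aligned_by_species.items()
        if eroded_fraction(ref, seq) >= threshold
    }
    return sorted(
        sp for sp in eroded if any(s in eroded for s in siblings(sp, parent))
    )


# Made-up sequences: sp1 and sp2 are both heavily eroded; sp4 is fully
# missing but its sibling sp3 is intact, so it is not reported.
REF = "MKTAYIAKQR"
ALIGNED = {
    "sp1": "M---------",
    "sp2": "MK--W-LL-G",
    "sp3": "MKTAYIAKQR",
    "sp4": "----------",
}

LOSSES = call_losses(REF, ALIGNED, PARENT)
print(LOSSES)
```

In this toy setup, requiring an eroded sibling is what keeps sp4 out of the result: an isolated missing gene is just as consistent with a sequencing or assembly artifact as with a real loss, which is the ambiguity the description says the approach is designed to avoid.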