A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals

Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation d...

Bibliographic Details
Main authors: Turakhia, Yatish; Chen, Heidi I; Marcovitz, Amir; Bejerano, Gill
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7498332/
https://www.ncbi.nlm.nih.gov/pubmed/32614390
http://dx.doi.org/10.1093/nar/gkaa550
author Turakhia, Yatish
Chen, Heidi I
Marcovitz, Amir
Bejerano, Gill
collection PubMed
description Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation due to sequencing and assembly artifacts. Previous attempts to annotate gene losses have required significant manual curation, which hampers their scalability for the ever-increasing deluge of newly sequenced genomes. Using extreme sequence erosion (amino acid deletions and substitutions) and sister species support as an unambiguous signature of loss, we developed an automated approach for detecting high-confidence gene loss events across a species tree. Our approach relies solely on gene annotation in a single reference genome, raw assemblies for the remaining species to analyze, and the associated phylogenetic tree for all organisms involved. Using human as reference, we discovered over 400 unique human ortholog erosion events across 58 mammals. This includes dozens of clade-specific losses of genes that result in early mouse lethality or are associated with severe human congenital diseases. Our discoveries yield intriguing potential for translational medical genetics and evolutionary biology, and our approach is readily applicable to large-scale genome sequencing efforts across the tree of life.
format Online
Article
Text
id pubmed-7498332
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7498332 2020-09-23 Nucleic Acids Res Methods Online Oxford University Press 2020-07-02 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7498332/
https://www.ncbi.nlm.nih.gov/pubmed/32614390
http://dx.doi.org/10.1093/nar/gkaa550