Transcriptional fates of human-specific segmental duplications in brain

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript informatio...

Bibliographic Details
Main authors: Dougherty, Max L., Underwood, Jason G., Nelson, Bradley J., Tseng, Elizabeth, Munson, Katherine M., Penn, Osnat, Nowakowski, Tomasz J., Pollen, Alex A., Eichler, Evan E.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6169893/
https://www.ncbi.nlm.nih.gov/pubmed/30228200
http://dx.doi.org/10.1101/gr.237610.118
author Dougherty, Max L.
Underwood, Jason G.
Nelson, Bradley J.
Tseng, Elizabeth
Munson, Katherine M.
Penn, Osnat
Nowakowski, Tomasz J.
Pollen, Alex A.
Eichler, Evan E.
author_sort Dougherty, Max L.
collection PubMed
description Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth–death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits.
format Online
Article
Text
id pubmed-6169893
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6169893 2019-04-01 Transcriptional fates of human-specific segmental duplications in brain Dougherty, Max L. Underwood, Jason G. Nelson, Bradley J. Tseng, Elizabeth Munson, Katherine M. Penn, Osnat Nowakowski, Tomasz J. Pollen, Alex A. Eichler, Evan E. Genome Res Method Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth–death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits. Cold Spring Harbor Laboratory Press 2018-10 /pmc/articles/PMC6169893/ /pubmed/30228200 http://dx.doi.org/10.1101/gr.237610.118 Text en © 2018 Dougherty et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Transcriptional fates of human-specific segmental duplications in brain
topic Method