Identification and Analysis of Genes and Pseudogenes within Duplicated Regions in the Human and Mouse Genomes

The identification and classification of genes and pseudogenes in duplicated regions still constitutes a challenge for standard automated genome annotation procedures. Using an integrated homology and orthology analysis independent of current gene annotation, we have identified 9,484 and 9,017 gene...

Bibliographic Details
Main authors: Suyama, Mikita; Harrington, Eoghan; Bork, Peer; Torrents, David
Format: Text
Language: English
Published: Public Library of Science 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1484586/
https://www.ncbi.nlm.nih.gov/pubmed/16846249
http://dx.doi.org/10.1371/journal.pcbi.0020076
_version_ 1782128336710926336
author Suyama, Mikita
Harrington, Eoghan
Bork, Peer
Torrents, David
author_facet Suyama, Mikita
Harrington, Eoghan
Bork, Peer
Torrents, David
author_sort Suyama, Mikita
collection PubMed
description The identification and classification of genes and pseudogenes in duplicated regions still constitutes a challenge for standard automated genome annotation procedures. Using an integrated homology and orthology analysis independent of current gene annotation, we have identified 9,484 and 9,017 gene duplicates in human and mouse, respectively. On the basis of the integrity of their coding regions, we have classified them into functional and inactive duplicates, allowing us to define the first consistent and comprehensive collection of 1,811 human and 1,581 mouse unprocessed pseudogenes. Furthermore, of the total of 14,172 human and mouse duplicates predicted to be functional genes, as many as 420 are not included in current reference gene databases and therefore correspond to likely novel mammalian genes. Some of these correspond to partial duplicates with less than half of the length of the original source genes, yet they are conserved and syntenic among different mammalian lineages. The genes and unprocessed pseudogenes obtained here will enable further studies on the mechanisms involved in gene duplication as well as of the fate of duplicated genes.
format Text
id pubmed-1484586
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-14845862006-07-04 Identification and Analysis of Genes and Pseudogenes within Duplicated Regions in the Human and Mouse Genomes Suyama, Mikita Harrington, Eoghan Bork, Peer Torrents, David PLoS Comput Biol Research Article The identification and classification of genes and pseudogenes in duplicated regions still constitutes a challenge for standard automated genome annotation procedures. Using an integrated homology and orthology analysis independent of current gene annotation, we have identified 9,484 and 9,017 gene duplicates in human and mouse, respectively. On the basis of the integrity of their coding regions, we have classified them into functional and inactive duplicates, allowing us to define the first consistent and comprehensive collection of 1,811 human and 1,581 mouse unprocessed pseudogenes. Furthermore, of the total of 14,172 human and mouse duplicates predicted to be functional genes, as many as 420 are not included in current reference gene databases and therefore correspond to likely novel mammalian genes. Some of these correspond to partial duplicates with less than half of the length of the original source genes, yet they are conserved and syntenic among different mammalian lineages. The genes and unprocessed pseudogenes obtained here will enable further studies on the mechanisms involved in gene duplication as well as of the fate of duplicated genes. Public Library of Science 2006-06 2006-06-30 /pmc/articles/PMC1484586/ /pubmed/16846249 http://dx.doi.org/10.1371/journal.pcbi.0020076 Text en © 2006 Suyama et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Suyama, Mikita
Harrington, Eoghan
Bork, Peer
Torrents, David
Identification and Analysis of Genes and Pseudogenes within Duplicated Regions in the Human and Mouse Genomes
title Identification and Analysis of Genes and Pseudogenes within Duplicated Regions in the Human and Mouse Genomes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1484586/
https://www.ncbi.nlm.nih.gov/pubmed/16846249
http://dx.doi.org/10.1371/journal.pcbi.0020076