
Phylogenomic approaches to common problems encountered in the analysis of low copy repeats: The sulfotransferase 1A gene family example

BACKGROUND: Blocks of duplicated genomic DNA sequence longer than 1000 base pairs are known as low copy repeats (LCRs). Identified by their sequence similarity, LCRs are abundant in the human genome, and are interesting because they may represent recent adaptive events, or potential future adaptive...


Bibliographic Details
Main authors: Bradley, Michael E; Benner, Steven A
Format: Text
Language: English
Published: BioMed Central 2005
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC555591/
https://www.ncbi.nlm.nih.gov/pubmed/15752422
http://dx.doi.org/10.1186/1471-2148-5-22
author Bradley, Michael E
Benner, Steven A
collection PubMed
description BACKGROUND: Blocks of duplicated genomic DNA sequence longer than 1000 base pairs are known as low copy repeats (LCRs). Identified by their sequence similarity, LCRs are abundant in the human genome, and are interesting because they may represent recent adaptive events, or potential future adaptive opportunities within the human lineage. Sequence analysis tools are needed, however, to decide whether these interpretations are likely, whether a particular set of LCRs represents nearly neutral drift creating junk DNA, or whether the appearance of LCRs reflects assembly error. Here we investigate an LCR family containing the sulfotransferase (SULT) 1A genes involved in drug metabolism, cancer, hormone regulation, and neurotransmitter biology as a first step for defining the problems that those tools must manage.
RESULTS: Sequence analysis here identified a fourth sulfotransferase gene, which may be transcriptionally active, located on human chromosome 16. Four regions of genomic sequence containing the four human SULT1A paralogs defined a new LCR family. The stem hominoid SULT1A progenitor locus was identified by comparative genomics involving complete human and rodent genomes, and a draft chimpanzee genome. SULT1A expansion in hominoid genomes was followed by positive selection acting on specific protein sites. This episode of adaptive evolution appears to be responsible for the dopamine sulfonation function of some SULT enzymes. Each of the conclusions that this bioinformatic analysis generated using data that has uncertain reliability (such as that from the chimpanzee genome sequencing project) has been confirmed experimentally or by a "finished" chromosome 16 assembly, both of which were published after the submission of this manuscript.
CONCLUSION: SULT1A genes expanded from one to four copies in hominoids during intra-chromosomal LCR duplications, including (apparently) one after the divergence of chimpanzees and humans. Thus, LCRs may provide a means for amplifying genes (and other genetic elements) that are adaptively useful. Being located on and among LCRs, however, could make the human SULT1A genes susceptible to further duplications or deletions resulting in 'genomic diseases' for some individuals. Pharmacogenomic studies of SULT1A single nucleotide polymorphisms, therefore, should also consider examining SULT1A copy number variability when searching for genotype-phenotype associations. The latest duplication is, however, only a substantiated hypothesis; an alternative explanation, disfavored by the majority of evidence, is that the duplication is an artifact of incorrect genome assembly.
format Text
id pubmed-555591
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-555591 2005-03-28 BMC Evol Biol Research Article BioMed Central 2005-03-07 /pmc/articles/PMC555591/ /pubmed/15752422 http://dx.doi.org/10.1186/1471-2148-5-22 Text en Copyright © 2005 Bradley and Benner; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Phylogenomic approaches to common problems encountered in the analysis of low copy repeats: The sulfotransferase 1A gene family example
topic Research Article