Identification and characterization of short tandem repeats in the Tibetan macaque genome based on resequencing data

The Tibetan macaque, which is endemic to China, is currently listed as a Near Threatened primate species by the International Union for Conservation of Nature (IUCN) (2017). Short tandem repeats (STRs) are repetitive elements of genome sequence whose repeat units range in length from 1–6 bp. They are found in...

Bibliographic Details
Main Authors: Liu, San-Xu, Hou, Wei, Zhang, Xue-Yan, Peng, Chang-Jun, Yue, Bi-Song, Fan, Zhen-Xin, Li, Jing
Format: Online Article Text
Language: English
Published: Science Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5968858/
https://www.ncbi.nlm.nih.gov/pubmed/29643326
http://dx.doi.org/10.24272/j.issn.2095-8137.2018.047
_version_ 1783325856392806400
author Liu, San-Xu
Hou, Wei
Zhang, Xue-Yan
Peng, Chang-Jun
Yue, Bi-Song
Fan, Zhen-Xin
Li, Jing
collection PubMed
description The Tibetan macaque, which is endemic to China, is currently listed as a Near Threatened primate species by the International Union for Conservation of Nature (IUCN) (2017). Short tandem repeats (STRs) are repetitive elements of genome sequence whose repeat units range in length from 1–6 bp. They are found in many organisms and are widely applied in population genetic studies. To clarify the distribution characteristics of genome-wide STRs and understand their variation among Tibetan macaques, we conducted a genome-wide survey of STRs with next-generation sequencing of five macaque samples. A total of 1 077 790 perfect STRs were mined from our assembly, which had an N50 of 4 966 bp. Mono-nucleotide repeats were the most abundant, followed by tetra- and di-nucleotide repeats. Analysis of GC content and repeats gave results consistent with those of other macaques. Furthermore, using STR analysis software (lobSTR), we found that the proportion of base pair deletions in the STRs was greater than that of insertions in the five Tibetan macaque individuals (P<0.05, t-test). We also found a greater number of homozygous STRs than heterozygous STRs (P<0.05, t-test), with the Emei and Jianyang Tibetan macaques showing more heterozygous loci than the Huangshan Tibetan macaques. The proportion of insertions and the mean variation of alleles in the Emei and Jianyang individuals were slightly higher than those in the Huangshan individuals, revealing differences in STR allele size between the two populations. The polymorphic STR loci identified based on the reference genome showed good amplification efficiency and could be used to study population genetics in Tibetan macaques. The neighbor-joining tree classified the five macaques into two different branches according to their geographical origin, indicating high genetic differentiation between the Huangshan and Sichuan populations. We elucidated the distribution characteristics of STRs in the Tibetan macaque genome and provided an effective method for screening polymorphic STRs. Our results also lay a foundation for future genetic variation studies of macaques.
format Online
Article
Text
id pubmed-5968858
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Science Press
record_format MEDLINE/PubMed
spelling pubmed-5968858 2018-07-18 Identification and characterization of short tandem repeats in the Tibetan macaque genome based on resequencing data Liu, San-Xu Hou, Wei Zhang, Xue-Yan Peng, Chang-Jun Yue, Bi-Song Fan, Zhen-Xin Li, Jing Zool Res Report Science Press 2018-05-12 2018-07-18 /pmc/articles/PMC5968858/ /pubmed/29643326 http://dx.doi.org/10.24272/j.issn.2095-8137.2018.047 Text en © 2018. Editorial Office of Zoological Research, Kunming Institute of Zoology, Chinese Academy of Sciences http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Identification and characterization of short tandem repeats in the Tibetan macaque genome based on resequencing data
topic Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5968858/
https://www.ncbi.nlm.nih.gov/pubmed/29643326
http://dx.doi.org/10.24272/j.issn.2095-8137.2018.047