Recovery of non-reference sequences missing from the human reference genome
BACKGROUND: The non-reference sequences (NRS) represent structure variations in human genome with potential functional significance. However, besides the known insertions, it is currently unknown whether other types of structure variations with NRS exist. RESULTS: Here, we compared 31 human de novo...
Main authors: | Li, Ran; Tian, Xiaomeng; Yang, Peng; Fan, Yingzhi; Li, Ming; Zheng, Hongxiang; Wang, Xihong; Jiang, Yu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6796347/ https://www.ncbi.nlm.nih.gov/pubmed/31619167 http://dx.doi.org/10.1186/s12864-019-6107-1 |
_version_ | 1783459565222756352 |
---|---|
author | Li, Ran Tian, Xiaomeng Yang, Peng Fan, Yingzhi Li, Ming Zheng, Hongxiang Wang, Xihong Jiang, Yu |
author_facet | Li, Ran Tian, Xiaomeng Yang, Peng Fan, Yingzhi Li, Ming Zheng, Hongxiang Wang, Xihong Jiang, Yu |
author_sort | Li, Ran |
collection | PubMed |
description | BACKGROUND: The non-reference sequences (NRS) represent structure variations in human genome with potential functional significance. However, besides the known insertions, it is currently unknown whether other types of structure variations with NRS exist. RESULTS: Here, we compared 31 human de novo assemblies with the current reference genome to identify the NRS and their location. We resolved the precise location of 6113 NRS adding up to 12.8 Mb. Besides 1571 insertions, we detected 3041 alternate alleles, which were defined as having less than 90% (or none) identity with the reference alleles. These alternate alleles overlapped with 1143 protein-coding genes including a putative novel MHC haplotype. Further, we demonstrated that the alternate alleles and their flanking regions had high content of tandem repeats, indicating that their origin was associated with tandem repeats. CONCLUSIONS: Our study detected a large number of NRS including many alternate alleles which are previously uncharacterized. We suggested that the origin of alternate alleles was associated with tandem repeats. Our results enriched the spectrum of genetic variations in human genome. |
format | Online Article Text |
id | pubmed-6796347 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6796347 2019-10-21 Recovery of non-reference sequences missing from the human reference genome Li, Ran Tian, Xiaomeng Yang, Peng Fan, Yingzhi Li, Ming Zheng, Hongxiang Wang, Xihong Jiang, Yu BMC Genomics Research Article BACKGROUND: The non-reference sequences (NRS) represent structure variations in human genome with potential functional significance. However, besides the known insertions, it is currently unknown whether other types of structure variations with NRS exist. RESULTS: Here, we compared 31 human de novo assemblies with the current reference genome to identify the NRS and their location. We resolved the precise location of 6113 NRS adding up to 12.8 Mb. Besides 1571 insertions, we detected 3041 alternate alleles, which were defined as having less than 90% (or none) identity with the reference alleles. These alternate alleles overlapped with 1143 protein-coding genes including a putative novel MHC haplotype. Further, we demonstrated that the alternate alleles and their flanking regions had high content of tandem repeats, indicating that their origin was associated with tandem repeats. CONCLUSIONS: Our study detected a large number of NRS including many alternate alleles which are previously uncharacterized. We suggested that the origin of alternate alleles was associated with tandem repeats. Our results enriched the spectrum of genetic variations in human genome. BioMed Central 2019-10-16 /pmc/articles/PMC6796347/ /pubmed/31619167 http://dx.doi.org/10.1186/s12864-019-6107-1 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Li, Ran Tian, Xiaomeng Yang, Peng Fan, Yingzhi Li, Ming Zheng, Hongxiang Wang, Xihong Jiang, Yu Recovery of non-reference sequences missing from the human reference genome |
title | Recovery of non-reference sequences missing from the human reference genome |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6796347/ https://www.ncbi.nlm.nih.gov/pubmed/31619167 http://dx.doi.org/10.1186/s12864-019-6107-1 |