Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy
With the broad application of high-throughput sequencing, more whole-genome resequencing data and de novo assemblies of natural populations are becoming available. For a particular species, in general, only the reference genome is well established and annotated. Computational tools based on sequence...
Main authors: | Song, Baoxing, Sang, Qing, Wang, Hai, Pei, Huimin, Gan, XiangChao, Wang, Fen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902276/ https://www.ncbi.nlm.nih.gov/pubmed/31850053 http://dx.doi.org/10.3389/fgene.2019.01046 |
_version_ | 1783477632531169280 |
---|---|
author | Song, Baoxing Sang, Qing Wang, Hai Pei, Huimin Gan, XiangChao Wang, Fen |
author_facet | Song, Baoxing Sang, Qing Wang, Hai Pei, Huimin Gan, XiangChao Wang, Fen |
author_sort | Song, Baoxing |
collection | PubMed |
description | With the broad application of high-throughput sequencing, more whole-genome resequencing data and de novo assemblies of natural populations are becoming available. For a particular species, in general, only the reference genome is well established and annotated. Computational tools based on sequence alignment have been developed to investigate the gene models of individuals belonging to the same or closely related species. During this process, inconsistent alignment often obscures genome annotation lift over and leads to improper functional impact prediction for a genomic variant, especially in plant species. Here, we proposed the zebraic striped dynamic programming algorithm, which provides different weights to genetic features to refine genome annotation lift over. Testing of our zebraic striped dynamic programming algorithm on both plant and animal genomic data showed complementation to standard sequence approach for highly diverse individuals. Using the lift over genome annotation as anchors, a base-pair resolution genome-wide sequence alignment and variant calling pipeline for de novo assembly has been implemented in the GEAN software. GEAN could be used to compare haplotype diversity, refine the genetic variant functional annotation, annotate de novo assembly genome sequence, detect homologous syntenic blocks, improve the quantification of gene expression levels using RNA-seq data, and unify genomic variants for population genetic analysis. We expect that GEAN will be a standard tool for the coming of age of de novo assembly population genetics. |
format | Online Article Text |
id | pubmed-6902276 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6902276 2019-12-17 Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy Song, Baoxing Sang, Qing Wang, Hai Pei, Huimin Gan, XiangChao Wang, Fen Front Genet Genetics Frontiers Media S.A. 2019-11-13 /pmc/articles/PMC6902276/ /pubmed/31850053 http://dx.doi.org/10.3389/fgene.2019.01046 Text en Copyright © 2019 Song, Sang, Wang, Pei, Gan and Wang http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Song, Baoxing Sang, Qing Wang, Hai Pei, Huimin Gan, XiangChao Wang, Fen Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy |
title | Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy |
title_full | Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy |
title_fullStr | Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy |
title_full_unstemmed | Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy |
title_short | Complement Genome Annotation Lift Over Using a Weighted Sequence Alignment Strategy |
title_sort | complement genome annotation lift over using a weighted sequence alignment strategy |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6902276/ https://www.ncbi.nlm.nih.gov/pubmed/31850053 http://dx.doi.org/10.3389/fgene.2019.01046 |