GAAP: A Genome Assembly + Annotation Pipeline

Bibliographic Details
Main authors:	Kong, Jinhwa; Huh, Sun; Won, Jung-Im; Yoon, Jeehee; Kim, Baeksop; Kim, Kiyong
Format:	Online Article Text
Language:	English
Published:	Hindawi 2019
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6617929/ https://www.ncbi.nlm.nih.gov/pubmed/31346518 http://dx.doi.org/10.1155/2019/4767354

collection	PubMed
description	Genomic analysis begins with de novo assembly of short-read fragments in order to reconstruct full-length base sequences without exploiting a reference genome sequence. Then, in the annotation step, gene locations are identified within the base sequences, and the structures and functions of these genes are determined. Recently, a wide range of powerful tools have been developed and published for whole-genome analysis, enabling even individual researchers in small laboratories to perform whole-genome analyses on their objects of interest. However, these analytical tools are generally complex and use diverse algorithms, parameter setting methods, and input formats; thus, it remains difficult for individual researchers to select, utilize, and combine these tools to obtain their final results. To resolve these issues, we have developed a genome analysis pipeline (GAAP) for semiautomated, iterative, and high-throughput analysis of whole-genome data. This pipeline is designed to perform read correction, de novo genome (transcriptome) assembly, gene prediction, and functional annotation using a range of proven tools and databases. We aim to assist non-IT researchers by describing each stage of analysis in detail and discussing current approaches. We also provide practical advice on how to access and use the bioinformatics tools and databases and how to implement the provided suggestions. Whole-genome analysis of Toxocara canis is used as a case study to show intermediate results at each stage, demonstrating the practicality of the proposed method.
id	pubmed-6617929
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	Biomed Res Int
published	2019-06-26
rights	Copyright © 2019 Jinhwa Kong et al. This is an open access article distributed under the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.