
misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny

As next-generation sequencing technologies have advanced, enormous amounts of whole-genome sequence information in various species have been released. However, it is still difficult to assemble the whole genome precisely, due to inherent limitations of short-read sequencing technologies. In particul...

Bibliographic Details
Main Authors: Ko, Young-Joon, Kim, Jung Sun, Kim, Sangsoo
Format: Online Article Text
Language: English
Published: Society of Gastrointestinal Intervention 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5769862/
https://www.ncbi.nlm.nih.gov/pubmed/29307138
http://dx.doi.org/10.5808/GI.2017.15.4.128
_version_ 1783292978261917696
author Ko, Young-Joon
Kim, Jung Sun
Kim, Sangsoo
author_facet Ko, Young-Joon
Kim, Jung Sun
Kim, Sangsoo
author_sort Ko, Young-Joon
collection PubMed
description As next-generation sequencing technologies have advanced, enormous amounts of whole-genome sequence information in various species have been released. However, it is still difficult to assemble the whole genome precisely, due to inherent limitations of short-read sequencing technologies. In particular, the complexities of plants are incomparable to those of microorganisms or animals because of whole-genome duplications, repeat insertions, and Numt insertions, etc. In this study, we describe a new method for detecting misassembly sequence regions of Brassica rapa with genotyping-by-sequencing, followed by MadMapper clustering. The misassembly candidate regions were cross-checked with BAC clone paired-ends library sequences that have been mapped to the reference genome. The results were further verified with gene synteny relations between Brassica rapa and Arabidopsis thaliana. We conclude that this method will help detect misassembly regions and be applicable to incompletely assembled reference genomes from a variety of species.
format Online
Article
Text
id pubmed-5769862
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Society of Gastrointestinal Intervention
record_format MEDLINE/PubMed
spelling pubmed-57698622018-01-19 misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny Ko, Young-Joon Kim, Jung Sun Kim, Sangsoo Genomics Inform Original Article As next-generation sequencing technologies have advanced, enormous amounts of whole-genome sequence information in various species have been released. However, it is still difficult to assemble the whole genome precisely, due to inherent limitations of short-read sequencing technologies. In particular, the complexities of plants are incomparable to those of microorganisms or animals because of whole-genome duplications, repeat insertions, and Numt insertions, etc. In this study, we describe a new method for detecting misassembly sequence regions of Brassica rapa with genotyping-by-sequencing, followed by MadMapper clustering. The misassembly candidate regions were cross-checked with BAC clone paired-ends library sequences that have been mapped to the reference genome. The results were further verified with gene synteny relations between Brassica rapa and Arabidopsis thaliana. We conclude that this method will help detect misassembly regions and be applicable to incompletely assembled reference genomes from a variety of species. Society of Gastrointestinal Intervention 2017-12 2017-12-29 /pmc/articles/PMC5769862/ /pubmed/29307138 http://dx.doi.org/10.5808/GI.2017.15.4.128 Text en Copyright © 2017 by the Korea Genome Organization It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/).
spellingShingle Original Article
Ko, Young-Joon
Kim, Jung Sun
Kim, Sangsoo
misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny
title misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny
title_full misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny
title_fullStr misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny
title_full_unstemmed misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny
title_short misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny
title_sort mismm: an integrated pipeline for misassembly detection using genotyping-by-sequencing and its validation with bac end library sequences and gene synteny
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5769862/
https://www.ncbi.nlm.nih.gov/pubmed/29307138
http://dx.doi.org/10.5808/GI.2017.15.4.128
work_keys_str_mv AT koyoungjoon mismmanintegratedpipelineformisassemblydetectionusinggenotypingbysequencinganditsvalidationwithbacendlibrarysequencesandgenesynteny
AT kimjungsun mismmanintegratedpipelineformisassemblydetectionusinggenotypingbysequencinganditsvalidationwithbacendlibrarysequencesandgenesynteny
AT kimsangsoo mismmanintegratedpipelineformisassemblydetectionusinggenotypingbysequencinganditsvalidationwithbacendlibrarysequencesandgenesynteny