
Benchmark study for evaluating the quality of reference genomes and gene annotations in 114 species

INTRODUCTION: Reference genomes and gene annotations are key materials that can determine the limits of molecular biology research on a species; however, systematic research on their quality assessment remains insufficient. METHODS: We collected reference assemblies, gene annotations, and 3,...


Bibliographic Details
Main authors: Park, Sinwoo, Lee, Jinbaek, Kim, Jaeryeong, Kim, Dohyeon, Lee, Jin Hyup, Pack, Seung Pil, Seo, Minseok
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Veterinary Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9988948/
https://www.ncbi.nlm.nih.gov/pubmed/36896291
http://dx.doi.org/10.3389/fvets.2023.1128570
_version_ 1784901678797946880
author Park, Sinwoo
Lee, Jinbaek
Kim, Jaeryeong
Kim, Dohyeon
Lee, Jin Hyup
Pack, Seung Pil
Seo, Minseok
author_sort Park, Sinwoo
collection PubMed
description INTRODUCTION: Reference genomes and gene annotations are key materials that can determine the limits of molecular biology research on a species; however, systematic research on their quality assessment remains insufficient. METHODS: We collected reference assemblies, gene annotations, and 3,420 RNA-sequencing (RNA-seq) datasets from 114 species and selected effective indicators to evaluate the reference genome quality of various species simultaneously, including statistics that can be obtained empirically while mapping short reads. We also introduced and applied two measures, transcript diversity and quantification success rate, that allow relative evaluation of gene annotation quality across species. Finally, we proposed a next-generation sequencing (NGS) applicability index that integrates a total of 10 effective indicators for evaluating the genome and gene annotation of a given species. RESULTS AND DISCUSSION: Using these evaluation indicators, we evaluated and demonstrated the relative accessibility of NGS applications in all species, which will directly contribute to determining the technological boundaries in each species. We also expect the index to serve as a key indicator of the direction of future development, through relative quality evaluation of genomes and gene annotations in each species, including the many organisms whose genomes and gene annotations will be constructed in the future.
format Online
Article
Text
id pubmed-9988948
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9988948 2023-03-08 Front Vet Sci Veterinary Science Frontiers Media S.A. 2023-02-21 /pmc/articles/PMC9988948/ /pubmed/36896291 http://dx.doi.org/10.3389/fvets.2023.1128570 Text en Copyright © 2023 Park, Lee, Kim, Kim, Lee, Pack and Seo.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Benchmark study for evaluating the quality of reference genomes and gene annotations in 114 species
topic Veterinary Science