Manual annotation of Drosophila genes: a Genomics Education Partnership protocol

Annotating the genomes of multiple species allows us to analyze the evolution of their genes. While many eukaryotic genome assemblies already include computational gene predictions, these predictions can benefit from review and refinement through manual gene annotation. The Genomics Education Partne...

Bibliographic Details
Main Authors: Rele, Chinmay P., Sandlin, Katie M., Leung, Wilson, Reed, Laura K.
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10579860/
https://www.ncbi.nlm.nih.gov/pubmed/37854289
http://dx.doi.org/10.12688/f1000research.126839.3
_version_ 1785121819972337664
author Rele, Chinmay P.
Sandlin, Katie M.
Leung, Wilson
Reed, Laura K.
author_facet Rele, Chinmay P.
Sandlin, Katie M.
Leung, Wilson
Reed, Laura K.
author_sort Rele, Chinmay P.
collection PubMed
description Annotating the genomes of multiple species allows us to analyze the evolution of their genes. While many eukaryotic genome assemblies already include computational gene predictions, these predictions can benefit from review and refinement through manual gene annotation. The Genomics Education Partnership (GEP; https://thegep.org/) developed a structural annotation protocol for protein-coding genes that enables undergraduate student and faculty researchers to create high-quality gene annotations that can be utilized in subsequent scientific investigations. For example, this protocol has been utilized by the GEP faculty to engage undergraduate students in the comparative annotation of genes involved in the insulin signaling pathway in 27 Drosophila species, using D. melanogaster as the reference genome. Students construct gene models using multiple lines of computational and empirical evidence including expression data (e.g., RNA-Seq), sequence similarity (e.g., BLAST and multiple sequence alignment), and computational gene predictions. Quality control measures require each gene be annotated by at least two students working independently, followed by reconciliation of the submitted gene models by a more experienced student. This article provides an overview of the annotation protocol and describes how discrepancies in student submitted gene models are resolved to produce a final, high-quality gene set suitable for subsequent analyses. The protocol can be adapted to other scientific questions (e.g., expansion of the Drosophila Muller F element) and species (e.g., parasitoid wasps) to provide additional opportunities for undergraduate students to participate in genomics research. These student annotation efforts can substantially improve the quality of gene annotations in publicly available genomic databases.
format Online
Article
Text
id pubmed-10579860
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-105798602023-10-18 Manual annotation of Drosophila genes: a Genomics Education Partnership protocol Rele, Chinmay P. Sandlin, Katie M. Leung, Wilson Reed, Laura K. F1000Res Method Article Annotating the genomes of multiple species allows us to analyze the evolution of their genes. While many eukaryotic genome assemblies already include computational gene predictions, these predictions can benefit from review and refinement through manual gene annotation. The Genomics Education Partnership (GEP; https://thegep.org/) developed a structural annotation protocol for protein-coding genes that enables undergraduate student and faculty researchers to create high-quality gene annotations that can be utilized in subsequent scientific investigations. For example, this protocol has been utilized by the GEP faculty to engage undergraduate students in the comparative annotation of genes involved in the insulin signaling pathway in 27 Drosophila species, using D. melanogaster as the reference genome. Students construct gene models using multiple lines of computational and empirical evidence including expression data (e.g., RNA-Seq), sequence similarity (e.g., BLAST and multiple sequence alignment), and computational gene predictions. Quality control measures require each gene be annotated by at least two students working independently, followed by reconciliation of the submitted gene models by a more experienced student. This article provides an overview of the annotation protocol and describes how discrepancies in student submitted gene models are resolved to produce a final, high-quality gene set suitable for subsequent analyses. The protocol can be adapted to other scientific questions (e.g., expansion of the Drosophila Muller F element) and species (e.g., parasitoid wasps) to provide additional opportunities for undergraduate students to participate in genomics research. 
These student annotation efforts can substantially improve the quality of gene annotations in publicly available genomic databases. F1000 Research Limited 2023-10-13 /pmc/articles/PMC10579860/ /pubmed/37854289 http://dx.doi.org/10.12688/f1000research.126839.3 Text en Copyright: © 2023 Rele CP et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Method Article
Rele, Chinmay P.
Sandlin, Katie M.
Leung, Wilson
Reed, Laura K.
Manual annotation of Drosophila genes: a Genomics Education Partnership protocol
title Manual annotation of Drosophila genes: a Genomics Education Partnership protocol
title_full Manual annotation of Drosophila genes: a Genomics Education Partnership protocol
title_fullStr Manual annotation of Drosophila genes: a Genomics Education Partnership protocol
title_full_unstemmed Manual annotation of Drosophila genes: a Genomics Education Partnership protocol
title_short Manual annotation of Drosophila genes: a Genomics Education Partnership protocol
title_sort manual annotation of drosophila genes: a genomics education partnership protocol
topic Method Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10579860/
https://www.ncbi.nlm.nih.gov/pubmed/37854289
http://dx.doi.org/10.12688/f1000research.126839.3
work_keys_str_mv AT relechinmayp manualannotationofdrosophilagenesagenomicseducationpartnershipprotocol
AT sandlinkatiem manualannotationofdrosophilagenesagenomicseducationpartnershipprotocol
AT leungwilson manualannotationofdrosophilagenesagenomicseducationpartnershipprotocol
AT reedlaurak manualannotationofdrosophilagenesagenomicseducationpartnershipprotocol