
Automatic identification and annotation of MYB gene family members in plants

BACKGROUND: MYBs are among the largest transcription factor families in plants. Consequently, members of this family are involved in a plethora of processes including development and specialized metabolism. The MYB families of many plant species were investigated in the last two decades since the fi...


Bibliographic Details
Main Author: Pucker, Boas
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8933966/
https://www.ncbi.nlm.nih.gov/pubmed/35305581
http://dx.doi.org/10.1186/s12864-022-08452-5
author Pucker, Boas
collection PubMed
description BACKGROUND: MYBs are among the largest transcription factor families in plants. Consequently, members of this family are involved in a plethora of processes, including development and specialized metabolism. The MYB families of many plant species have been investigated over the last two decades, beginning with the first investigation in Arabidopsis thaliana. This body of knowledge and the characterized sequences provide the basis for the identification, classification, and functional annotation of candidate sequences in new genome and transcriptome assemblies.
RESULTS: A pipeline for the automatic identification and functional annotation of MYBs in a given sequence data set was implemented in Python. MYB candidates are identified, screened for the presence of a MYB domain and other motifs, and finally placed in a phylogenetic context with well-characterized sequences. In addition to technical benchmarking based on existing annotation, the transcriptome assembly of Croton tiglium and the annotated genome sequence of Castanea crenata were screened for MYBs. Results of both analyses are presented in this study to illustrate the potential of this application. The analysis of one species takes only a few minutes, depending on the number of predicted sequences and the size of the MYB gene family. The pipeline, the required bait sequences, and the reference sequences for classification are freely available on GitHub: https://github.com/bpucker/MYB_annotator.
CONCLUSIONS: This automatic annotation of the MYB gene family in novel assemblies makes genome-wide investigations consistent and paves the way for comparative studies in the future. Candidate genes for in-depth analyses are presented based on their orthology to previously characterized sequences, which allows the functional annotation of the newly identified MYBs with high confidence. The identification of orthologs can also be harnessed to detect duplication and deletion events.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08452-5.
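The screening step described in the RESULTS section (checking candidates for a MYB domain) can be illustrated with a minimal Python sketch. This is not the MYB_annotator implementation: the function names, the FASTA handling, and in particular the regular expression are hypothetical simplifications. The pattern assumes only that a MYB repeat can be roughly approximated by three tryptophan residues spaced about 18 to 20 residues apart, and it will miss repeats in which a tryptophan is substituted.

```python
import re
from typing import Dict, Iterator, Tuple

# Hypothetical, deliberately crude pattern: three tryptophans (W) separated
# by 18-20 arbitrary residues. This is an illustrative assumption, not the
# motif definition used by MYB_annotator.
REPEAT_PATTERN = re.compile(r"W.{18,20}W.{18,20}W")


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            header, chunks = (parts[0] if parts else ""), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def count_repeats(seq: str) -> int:
    """Count non-overlapping matches of the simplified repeat pattern."""
    return len(REPEAT_PATTERN.findall(seq))


def screen_candidates(fasta_text: str) -> Dict[str, int]:
    """Return only the sequences with at least one repeat-like match."""
    hits = {}
    for name, seq in parse_fasta(fasta_text):
        n = count_repeats(seq)
        if n > 0:
            hits[name] = n
    return hits


if __name__ == "__main__":
    # Synthetic sequences, not real proteins.
    repeat = "W" + "A" * 19 + "W" + "A" * 19 + "W"
    demo = f">cand1 synthetic\n{repeat}GGGGG{repeat}\n>cand2 synthetic\nMAAAAAAAAA\n"
    print(screen_candidates(demo))
```

In a real workflow, a profile-based domain search and a phylogenetic placement against reference sequences would replace this regular expression; the sketch only shows the shape of a "filter candidates by motif presence" step.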
format Online
Article
Text
id pubmed-8933966
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8933966 2022-03-23 Automatic identification and annotation of MYB gene family members in plants Pucker, Boas BMC Genomics Software BioMed Central 2022-03-19 /pmc/articles/PMC8933966/ /pubmed/35305581 http://dx.doi.org/10.1186/s12864-022-08452-5 Text en © The Author(s) 2022
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Automatic identification and annotation of MYB gene family members in plants
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8933966/
https://www.ncbi.nlm.nih.gov/pubmed/35305581
http://dx.doi.org/10.1186/s12864-022-08452-5