Automatic identification and annotation of MYB gene family members in plants
BACKGROUND: MYBs are among the largest transcription factor families in plants. Consequently, members of this family are involved in a plethora of processes including development and specialized metabolism. The MYB families of many plant species were investigated in the last two decades since the fi...
Main author: | Pucker, Boas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8933966/ https://www.ncbi.nlm.nih.gov/pubmed/35305581 http://dx.doi.org/10.1186/s12864-022-08452-5 |
_version_ | 1784671771362852864 |
---|---|
author | Pucker, Boas |
author_facet | Pucker, Boas |
author_sort | Pucker, Boas |
collection | PubMed |
description | BACKGROUND: MYBs are among the largest transcription factor families in plants. Consequently, members of this family are involved in a plethora of processes including development and specialized metabolism. The MYB families of many plant species were investigated in the last two decades since the first investigation looked at Arabidopsis thaliana. This body of knowledge and characterized sequences provide the basis for the identification, classification, and functional annotation of candidate sequences in new genome and transcriptome assemblies. RESULTS: A pipeline for the automatic identification and functional annotation of MYBs in a given sequence data set was implemented in Python. MYB candidates are identified, screened for the presence of a MYB domain and other motifs, and finally placed in a phylogenetic context with well characterized sequences. In addition to technical benchmarking based on existing annotation, the transcriptome assembly of Croton tiglium and the annotated genome sequence of Castanea crenata were screened for MYBs. Results of both analyses are presented in this study to illustrate the potential of this application. The analysis of one species takes only a few minutes depending on the number of predicted sequences and the size of the MYB gene family. This pipeline, the required bait sequences, and reference sequences for a classification are freely available on github: https://github.com/bpucker/MYB_annotator. CONCLUSIONS: This automatic annotation of the MYB gene family in novel assemblies makes genome-wide investigations consistent and paves the way for comparative studies in the future. Candidate genes for in-depth analyses are presented based on their orthology to previously characterized sequences which allows the functional annotation of the newly identified MYBs with high confidence. The identification of orthologs can also be harnessed to detect duplication and deletion events. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08452-5. |
format | Online Article Text |
id | pubmed-8933966 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8933966 2022-03-23 Automatic identification and annotation of MYB gene family members in plants Pucker, Boas BMC Genomics Software BACKGROUND: MYBs are among the largest transcription factor families in plants. Consequently, members of this family are involved in a plethora of processes including development and specialized metabolism. The MYB families of many plant species were investigated in the last two decades since the first investigation looked at Arabidopsis thaliana. This body of knowledge and characterized sequences provide the basis for the identification, classification, and functional annotation of candidate sequences in new genome and transcriptome assemblies. RESULTS: A pipeline for the automatic identification and functional annotation of MYBs in a given sequence data set was implemented in Python. MYB candidates are identified, screened for the presence of a MYB domain and other motifs, and finally placed in a phylogenetic context with well characterized sequences. In addition to technical benchmarking based on existing annotation, the transcriptome assembly of Croton tiglium and the annotated genome sequence of Castanea crenata were screened for MYBs. Results of both analyses are presented in this study to illustrate the potential of this application. The analysis of one species takes only a few minutes depending on the number of predicted sequences and the size of the MYB gene family. This pipeline, the required bait sequences, and reference sequences for a classification are freely available on github: https://github.com/bpucker/MYB_annotator. CONCLUSIONS: This automatic annotation of the MYB gene family in novel assemblies makes genome-wide investigations consistent and paves the way for comparative studies in the future. Candidate genes for in-depth analyses are presented based on their orthology to previously characterized sequences which allows the functional annotation of the newly identified MYBs with high confidence. The identification of orthologs can also be harnessed to detect duplication and deletion events. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-022-08452-5. BioMed Central 2022-03-19 /pmc/articles/PMC8933966/ /pubmed/35305581 http://dx.doi.org/10.1186/s12864-022-08452-5 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Pucker, Boas Automatic identification and annotation of MYB gene family members in plants |
title | Automatic identification and annotation of MYB gene family members in plants |
title_full | Automatic identification and annotation of MYB gene family members in plants |
title_fullStr | Automatic identification and annotation of MYB gene family members in plants |
title_full_unstemmed | Automatic identification and annotation of MYB gene family members in plants |
title_short | Automatic identification and annotation of MYB gene family members in plants |
title_sort | automatic identification and annotation of myb gene family members in plants |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8933966/ https://www.ncbi.nlm.nih.gov/pubmed/35305581 http://dx.doi.org/10.1186/s12864-022-08452-5 |
work_keys_str_mv | AT puckerboas automaticidentificationandannotationofmybgenefamilymembersinplants |