Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes
Phylogenomic approaches, employing next-generation sequencing (NGS) techniques, have revolutionized systematic and evolutionary biology. Target enrichment is an efficient and cost-effective method in phylogenomics and is becoming increasingly popular. Depending on availability and quality of referen...
Main authors: | Li, Xi; Hao, Baohai; Pan, Da; Schneeweiss, Gerald M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2017 |
Subjects: | Plant Science |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5704539/ https://www.ncbi.nlm.nih.gov/pubmed/29218053 http://dx.doi.org/10.3389/fpls.2017.01973 |
_version_ | 1783281918101422080 |
---|---|
author | Li, Xi Hao, Baohai Pan, Da Schneeweiss, Gerald M. |
author_facet | Li, Xi Hao, Baohai Pan, Da Schneeweiss, Gerald M. |
author_sort | Li, Xi |
collection | PubMed |
description | Phylogenomic approaches, employing next-generation sequencing (NGS) techniques, have revolutionized systematic and evolutionary biology. Target enrichment is an efficient and cost-effective method in phylogenomics and is becoming increasingly popular. Depending on availability and quality of reference data as well as on biological features of the study system, (semi-)automated identification of suitable markers will require specific bioinformatic pipelines. Here, we established a highly flexible bioinformatic pipeline, BaitsFinder, to identify putative orthologous single copy genes (SCGs) and to construct bait sequences in a single workflow. Additionally, this pipeline has been constructed to be able to cope with challenging data sets, such as the nutritionally heterogeneous plant family Orobanchaceae. To this end, we used transcriptome data of differing quality available for four Orobanchaceae species and, as reference, SCG data from monkeyflower (Erythranthe guttata, syn. Mimulus g.; 1,915 genes) and tomato (Solanum lycopersicum; 391 genes). Depending on whether gaps were permitted in initial blast searches of the four Orobanchaceae species against the reference, our pipeline identified 1,307 and 981 SCGs with average length of 994 bp and 775 bp, respectively. Automated bait sequence construction (using 2× tiling) resulted in 38,170 and 21,856 bait sequences, respectively. In comparison to the recently published MarkerMiner 1.0 pipeline BaitsFinder identified about 1.6 times as many SCGs (of at least 900 bp length). Skipping steps specific to analyses of Orobanchaceae, BaitsFinder was successfully used in a group of non-parasitic plants (three Asteraceae species and, as reference, SCG data from Arabidopsis thaliana based on previously compiled SCGs). Thus, BaitsFinder is expected to be broadly applicable in groups, where only transcriptomes or partial genome data of differing quality are available. |
format | Online Article Text |
id | pubmed-5704539 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5704539 2017-12-07 Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes Li, Xi Hao, Baohai Pan, Da Schneeweiss, Gerald M. Front Plant Sci Plant Science Phylogenomic approaches, employing next-generation sequencing (NGS) techniques, have revolutionized systematic and evolutionary biology. Target enrichment is an efficient and cost-effective method in phylogenomics and is becoming increasingly popular. Depending on availability and quality of reference data as well as on biological features of the study system, (semi-)automated identification of suitable markers will require specific bioinformatic pipelines. Here, we established a highly flexible bioinformatic pipeline, BaitsFinder, to identify putative orthologous single copy genes (SCGs) and to construct bait sequences in a single workflow. Additionally, this pipeline has been constructed to be able to cope with challenging data sets, such as the nutritionally heterogeneous plant family Orobanchaceae. To this end, we used transcriptome data of differing quality available for four Orobanchaceae species and, as reference, SCG data from monkeyflower (Erythranthe guttata, syn. Mimulus g.; 1,915 genes) and tomato (Solanum lycopersicum; 391 genes). Depending on whether gaps were permitted in initial blast searches of the four Orobanchaceae species against the reference, our pipeline identified 1,307 and 981 SCGs with average length of 994 bp and 775 bp, respectively. Automated bait sequence construction (using 2× tiling) resulted in 38,170 and 21,856 bait sequences, respectively. In comparison to the recently published MarkerMiner 1.0 pipeline BaitsFinder identified about 1.6 times as many SCGs (of at least 900 bp length). Skipping steps specific to analyses of Orobanchaceae, BaitsFinder was successfully used in a group of non-parasitic plants (three Asteraceae species and, as reference, SCG data from Arabidopsis thaliana based on previously compiled SCGs). Thus, BaitsFinder is expected to be broadly applicable in groups, where only transcriptomes or partial genome data of differing quality are available. Frontiers Media S.A. 2017-11-21 /pmc/articles/PMC5704539/ /pubmed/29218053 http://dx.doi.org/10.3389/fpls.2017.01973 Text en Copyright © 2017 Li, Hao, Pan and Schneeweiss. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Plant Science Li, Xi Hao, Baohai Pan, Da Schneeweiss, Gerald M. Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes |
title | Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes |
title_full | Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes |
title_fullStr | Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes |
title_full_unstemmed | Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes |
title_short | Marker Development for Phylogenomics: The Case of Orobanchaceae, a Plant Family with Contrasting Nutritional Modes |
title_sort | marker development for phylogenomics: the case of orobanchaceae, a plant family with contrasting nutritional modes |
topic | Plant Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5704539/ https://www.ncbi.nlm.nih.gov/pubmed/29218053 http://dx.doi.org/10.3389/fpls.2017.01973 |
work_keys_str_mv | AT lixi markerdevelopmentforphylogenomicsthecaseoforobanchaceaeaplantfamilywithcontrastingnutritionalmodes AT haobaohai markerdevelopmentforphylogenomicsthecaseoforobanchaceaeaplantfamilywithcontrastingnutritionalmodes AT panda markerdevelopmentforphylogenomicsthecaseoforobanchaceaeaplantfamilywithcontrastingnutritionalmodes AT schneeweissgeraldm markerdevelopmentforphylogenomicsthecaseoforobanchaceaeaplantfamilywithcontrastingnutritionalmodes |
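
The description above mentions automated bait construction "using 2× tiling" without spelling out what that means. The following is a minimal sketch of the general idea, not the BaitsFinder implementation: overlapping windows are cut from a gene sequence so that most bases are covered by two baits. The function name `tile_baits`, the 120 bp bait length, and the handling of the sequence tail are all assumptions made for illustration; the record does not state any of them.

```python
def tile_baits(seq, bait_len=120, tiling=2):
    """Cut overlapping baits from one sequence (illustrative sketch only).

    bait_len=120 is an assumed value, not taken from the record.
    With tiling=2 the window advances by half a bait length, so interior
    bases are covered by two baits.
    """
    if tiling < 1 or bait_len % tiling != 0:
        raise ValueError("bait_len must be a positive multiple of tiling")
    if len(seq) < bait_len:
        # Sequence too short to yield a full-length bait.
        return []

    step = bait_len // tiling
    last_start = len(seq) - bait_len
    starts = list(range(0, last_start + 1, step))

    # Assumed tail handling: add one bait anchored at the sequence end
    # so the final bases are not left uncovered.
    if starts[-1] != last_start:
        starts.append(last_start)

    return [seq[s:s + bait_len] for s in starts]


if __name__ == "__main__":
    gene = "ACGT" * 75  # hypothetical 300 bp sequence
    baits = tile_baits(gene)
    print(len(baits), "baits of", len(baits[0]), "bp")
```

Under these assumptions a 300 bp sequence yields baits starting at positions 0, 60, 120 and 180. The bait counts reported in the description depend on parameters the record does not give, so this sketch is not meant to reproduce them.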