PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data
PREMISE: The application of high‐throughput sequencing, especially to herbarium specimens, is rapidly accelerating biodiversity research. Low‐coverage sequencing of total genomic DNA (genome skimming) is particularly promising and can simultaneously recover the plastid, mitochondrial, and nuclear ri...
Main Authors: | Cai, Liming, Zhang, Hongrui, Davis, Charles C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
John Wiley and Sons Inc.
2022
|
Subjects: | Software Note |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9215275/ https://www.ncbi.nlm.nih.gov/pubmed/35774988 http://dx.doi.org/10.1002/aps3.11475 |
_version_ | 1784731174651822080 |
---|---|
author | Cai, Liming Zhang, Hongrui Davis, Charles C. |
author_facet | Cai, Liming Zhang, Hongrui Davis, Charles C. |
author_sort | Cai, Liming |
collection | PubMed |
description | PREMISE: The application of high‐throughput sequencing, especially to herbarium specimens, is rapidly accelerating biodiversity research. Low‐coverage sequencing of total genomic DNA (genome skimming) is particularly promising and can simultaneously recover the plastid, mitochondrial, and nuclear ribosomal regions across hundreds of species. Here, we introduce PhyloHerb, a bioinformatic pipeline to efficiently assemble phylogenomic data sets derived from genome skimming. METHODS AND RESULTS: PhyloHerb uses either a built‐in database or user‐specified references to extract orthologous sequences from all three genomes using a BLAST search. It outputs FASTA files and offers a suite of utility functions to assist with alignment, partitioning, concatenation, and phylogeny inference. The program is freely available at https://github.com/lmcai/PhyloHerb/. CONCLUSIONS: We demonstrate that PhyloHerb can accurately identify genes using a published data set from Clusiaceae. We also show via simulations that our approach is effective for highly fragmented assemblies from herbarium specimens and is scalable to thousands of species. |
format | Online Article Text |
id | pubmed-9215275 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9215275 2022-06-29 PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data Cai, Liming Zhang, Hongrui Davis, Charles C. Appl Plant Sci Software Note PREMISE: The application of high‐throughput sequencing, especially to herbarium specimens, is rapidly accelerating biodiversity research. Low‐coverage sequencing of total genomic DNA (genome skimming) is particularly promising and can simultaneously recover the plastid, mitochondrial, and nuclear ribosomal regions across hundreds of species. Here, we introduce PhyloHerb, a bioinformatic pipeline to efficiently assemble phylogenomic data sets derived from genome skimming. METHODS AND RESULTS: PhyloHerb uses either a built‐in database or user‐specified references to extract orthologous sequences from all three genomes using a BLAST search. It outputs FASTA files and offers a suite of utility functions to assist with alignment, partitioning, concatenation, and phylogeny inference. The program is freely available at https://github.com/lmcai/PhyloHerb/. CONCLUSIONS: We demonstrate that PhyloHerb can accurately identify genes using a published data set from Clusiaceae. We also show via simulations that our approach is effective for highly fragmented assemblies from herbarium specimens and is scalable to thousands of species. John Wiley and Sons Inc. 2022-06-02 /pmc/articles/PMC9215275/ /pubmed/35774988 http://dx.doi.org/10.1002/aps3.11475 Text en © 2022 The Authors. Applications in Plant Sciences published by Wiley Periodicals LLC on behalf of Botanical Society of America. https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ (https://creativecommons.org/licenses/by-nc/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes. |
spellingShingle | Software Note Cai, Liming Zhang, Hongrui Davis, Charles C. PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data |
title | PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data |
title_full | PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data |
title_fullStr | PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data |
title_full_unstemmed | PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data |
title_short | PhyloHerb: A high‐throughput phylogenomic pipeline for processing genome skimming data |
title_sort | phyloherb: a high‐throughput phylogenomic pipeline for processing genome skimming data |
topic | Software Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9215275/ https://www.ncbi.nlm.nih.gov/pubmed/35774988 http://dx.doi.org/10.1002/aps3.11475 |
work_keys_str_mv | AT cailiming phyloherbahighthroughputphylogenomicpipelineforprocessinggenomeskimmingdata AT zhanghongrui phyloherbahighthroughputphylogenomicpipelineforprocessinggenomeskimmingdata AT davischarlesc phyloherbahighthroughputphylogenomicpipelineforprocessinggenomeskimmingdata |
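
The description above mentions utility functions for concatenation and partitioning of per-gene FASTA alignments. The sketch below illustrates that general idea only; it is not PhyloHerb's code or command-line interface. The function names `parse_fasta` and `concatenate`, the taxon labels, and the toy sequences are all hypothetical, and the two gene names are used purely as example labels.

```python
from collections import OrderedDict


def parse_fasta(text):
    """Parse FASTA-formatted text into an ordered {name: sequence} mapping."""
    records = OrderedDict()
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first whitespace-delimited token as the taxon name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def concatenate(alignments):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: list of (gene_name, {taxon: aligned_sequence}) pairs.
    Returns (supermatrix, partitions), where partitions holds
    (gene_name, start, end) with 1-based inclusive coordinates.
    Taxa missing from a gene are padded with gap characters.
    """
    taxa = sorted({t for _, aln in alignments for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for t in taxa:
            pieces[t].append(aln.get(t, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {t: "".join(parts) for t, parts in pieces.items()}
    return supermatrix, partitions


# Toy, made-up alignments (hypothetical taxa sp1..sp3).
gene_a = parse_fasta(">sp1\nATGC\n>sp2\nATGA\n")
gene_b = parse_fasta(">sp1\nGG-T\nAA\n>sp3\nGGCTAA\n")

supermatrix, partitions = concatenate([("rbcL", gene_a), ("matK", gene_b)])

for taxon, seq in supermatrix.items():
    print(f">{taxon}\n{seq}")
for gene, begin, end in partitions:
    print(f"{gene} = {begin}-{end}")
```

Padding absent taxa with gaps keeps every row of the supermatrix the same length, which is what downstream phylogeny inference tools generally expect, and the coordinate list can be written out in whatever partition-file syntax the chosen tool requires.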