THAPBI PICT—a fast, cautious, and accurate metabarcoding analysis pipeline

THAPBI PICT is an open source software pipeline for metabarcoding analysis of Illumina paired-end reads, including cases of multiplexing where more than one amplicon is amplified per DNA sample. Initially a Phytophthora ITS1 Classification Tool (PICT), we demonstrate using worked examples with our o...

Bibliographic Details
Main authors: Cock, Peter J. A., Cooke, David E. L., Thorpe, Peter, Pritchard, Leighton
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Biodiversity
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10441533/
https://www.ncbi.nlm.nih.gov/pubmed/37609440
http://dx.doi.org/10.7717/peerj.15648
author Cock, Peter J. A.
Cooke, David E. L.
Thorpe, Peter
Pritchard, Leighton
collection PubMed
description THAPBI PICT is an open source software pipeline for metabarcoding analysis of Illumina paired-end reads, including cases of multiplexing where more than one amplicon is amplified per DNA sample. Initially a Phytophthora ITS1 Classification Tool (PICT), we demonstrate using worked examples with our own and public data sets how, with appropriate primer settings and a custom database, it can be applied to other amplicons and organisms, and used for reanalysis of existing datasets. The core dataflow of the implementation is (i) data reduction to unique marker sequences, often called amplicon sequence variants (ASVs), (ii) dynamic thresholds for discarding low abundance sequences to remove noise and artifacts (rather than error correction by default), before (iii) classification using a curated reference database. The default classifier assigns a label to each query sequence based on a database match that is either perfect, or a single base pair edit away (substitution, deletion or insertion). Abundance thresholds for inclusion can be set by the user or automatically using per-batch negative or synthetic control samples. Output is designed for practical interpretation by non-specialists and includes a read report (ASVs with classification and counts per sample), sample report (samples with counts per species classification), and a topological graph of ASVs as nodes with short edit distances as edges. Source code available from https://github.com/peterjc/thapbi-pict/ with documentation including installation instructions.
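The three-step dataflow in the description above (reduction to unique sequences, abundance filtering, then classification by a perfect or one-edit database match) can be illustrated with a minimal Python sketch. This is not the THAPBI PICT code or its API: the function names, the toy sequences, the species labels, and the fixed `min_abundance` cutoff are all assumptions made for illustration, and the fixed cutoff is a simplification of the dynamic, control-driven thresholds the description mentions.

```python
from collections import Counter


def within_one_edit(a: str, b: str) -> bool:
    """Return True if a and b are identical or differ by exactly one
    substitution, insertion, or deletion."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    # Advance past the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # One substitution: the rest after the mismatch must agree.
        return a[i + 1:] == b[i + 1:]
    # One insertion in b: skipping b[i] must leave the same suffix.
    return a[i:] == b[i + 1:]


def classify(reads, reference, min_abundance=2):
    """Toy version of the dataflow; all names and defaults are hypothetical.

    reads: iterable of marker sequences (one per read).
    reference: dict mapping reference sequence -> label.
    Returns {unique sequence: (count, sorted list of matching labels)}.
    """
    counts = Counter(reads)  # (i) reduce reads to unique sequences
    results = {}
    for seq, n in counts.items():
        if n < min_abundance:
            continue  # (ii) discard low-abundance sequences
        # (iii) label by perfect or one-edit match against the reference
        labels = sorted(
            {label for ref, label in reference.items()
             if within_one_edit(seq, ref)}
        )
        results[seq] = (n, labels)
    return results


# Made-up example data, for illustration only.
reference = {"ACGTACGT": "Species A", "TTGGCCAA": "Species B"}
reads = (
    ["ACGTACGT"] * 5    # perfect match
    + ["ACGTACGA"] * 3  # one substitution away
    + ["TTGGCCA"] * 2   # one deletion away
    + ["GGGGGGGG"] * 4  # abundant but unmatched
    + ["ACGTTCGT"]      # single read, below the cutoff
)
result = classify(reads, reference)
```

In this sketch an abundant sequence with no close reference match is kept with an empty label list rather than dropped, so unknowns stay visible in the output; that is a design choice of the example, not a statement about the pipeline's behaviour.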
format Online
Article
Text
id pubmed-10441533
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10441533 2023-08-22 THAPBI PICT—a fast, cautious, and accurate metabarcoding analysis pipeline Cock, Peter J. A. Cooke, David E. L. Thorpe, Peter Pritchard, Leighton PeerJ Biodiversity THAPBI PICT is an open source software pipeline for metabarcoding analysis of Illumina paired-end reads, including cases of multiplexing where more than one amplicon is amplified per DNA sample. Initially a Phytophthora ITS1 Classification Tool (PICT), we demonstrate using worked examples with our own and public data sets how, with appropriate primer settings and a custom database, it can be applied to other amplicons and organisms, and used for reanalysis of existing datasets. The core dataflow of the implementation is (i) data reduction to unique marker sequences, often called amplicon sequence variants (ASVs), (ii) dynamic thresholds for discarding low abundance sequences to remove noise and artifacts (rather than error correction by default), before (iii) classification using a curated reference database. The default classifier assigns a label to each query sequence based on a database match that is either perfect, or a single base pair edit away (substitution, deletion or insertion). Abundance thresholds for inclusion can be set by the user or automatically using per-batch negative or synthetic control samples. Output is designed for practical interpretation by non-specialists and includes a read report (ASVs with classification and counts per sample), sample report (samples with counts per species classification), and a topological graph of ASVs as nodes with short edit distances as edges. Source code available from https://github.com/peterjc/thapbi-pict/ with documentation including installation instructions. PeerJ Inc. 2023-08-18 /pmc/articles/PMC10441533/ /pubmed/37609440 http://dx.doi.org/10.7717/peerj.15648 Text en © 2023 Cock et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title THAPBI PICT—a fast, cautious, and accurate metabarcoding analysis pipeline
topic Biodiversity
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10441533/
https://www.ncbi.nlm.nih.gov/pubmed/37609440
http://dx.doi.org/10.7717/peerj.15648