pTrimmer: An efficient tool to trim primers of multiplex deep sequencing data

BACKGROUND: With the widespread use of multiple amplicon-sequencing (MAS) in genetic variation detection, an efficient tool is required to remove primer sequences from short reads to ensure the reliability of downstream analysis. Although some tools are currently available, their efficiency and accu...

Bibliographic Details
Main authors: Zhang, Xiaolong, Shao, Yanyan, Tian, Jichao, Liao, Yuwei, Li, Peiying, Zhang, Yu, Chen, Jun, Li, Zhiguang
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6511130/
https://www.ncbi.nlm.nih.gov/pubmed/31077131
http://dx.doi.org/10.1186/s12859-019-2854-x
author Zhang, Xiaolong
Shao, Yanyan
Tian, Jichao
Liao, Yuwei
Li, Peiying
Zhang, Yu
Chen, Jun
Li, Zhiguang
collection PubMed
description BACKGROUND: With the widespread use of multiple amplicon sequencing (MAS) in genetic variation detection, an efficient tool is required to remove primer sequences from short reads and so ensure the reliability of downstream analysis. Although some tools are currently available, their efficiency and accuracy need improvement when trimming large numbers of primers in high-throughput target genome sequencing. This issue is becoming more urgent given the potential clinical use of MAS for processing patient samples. We developed pTrimmer, which can handle thousands of primers simultaneously with greatly improved accuracy and performance. RESULTS: pTrimmer combines a k-mer algorithm with the Needleman-Wunsch algorithm, which keeps it accurate even in the presence of sequencing errors. pTrimmer improves sensitivity by 28.59% and accuracy by 11.87% compared with similar tools. In simulation, pTrimmer reached a sensitivity of 99.96% and an accuracy of 97.38%, compared with 70.85% sensitivity and 58.73% accuracy for cutPrimers. pTrimmer is also notably faster: about 370 times faster than cutPrimers and 17,000 times faster than cutadapt per thread. Trimming 2158 pairs of primers from 11 million reads (Illumina PE 150 bp) takes only 37 s and uses no more than 100 MB of memory. CONCLUSIONS: pTrimmer is designed to trim primer sequences from multiplex amplicon sequencing and target sequencing data. It is highly sensitive and specific compared with three other similar tools, which could help users obtain more reliable mutational information for downstream analysis. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2854-x) contains supplementary material, which is available to authorized users.
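The abstract's pairing of k-mers with Needleman-Wunsch alignment can be illustrated with a minimal Python sketch. This is not pTrimmer's implementation: the function names, the k-mer length of 8, the scoring scheme (+1 match, -1 mismatch, -1 gap), the scan window, and the error threshold are all assumptions chosen for illustration. The idea shown is that an exact k-mer lookup cheaply nominates a candidate primer, and a global alignment then confirms the candidate while tolerating a few sequencing errors.

```python
from collections import defaultdict

K = 8  # assumed k-mer length, for illustration only


def build_kmer_index(primers, k=K):
    """Map each k-mer of each primer to (primer_name, offset_in_primer)."""
    index = defaultdict(list)
    for name, seq in primers.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global Needleman-Wunsch alignment score, keeping two DP rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def trim_primer(read, primers, index, k=K, scan=20, max_errors=2):
    """Return (primer_name, trimmed_read), or (None, read) if no primer fits.

    Scanning several read positions lets a later k-mer seed the match
    when an error falls inside the first k-mer of the primer.
    """
    for pos in range(min(scan, len(read) - k + 1)):
        for name, offset in index.get(read[pos:pos + k], []):
            start = pos - offset  # where the primer would begin in the read
            if start < 0:
                continue
            primer = primers[name]
            window = read[start:start + len(primer)]
            # Under this scoring, each mismatch lowers the score by 2.
            if nw_score(primer, window) >= len(primer) - 2 * max_errors:
                return name, read[start + len(primer):]
    return None, read
```

A read that starts with a primer carrying one substitution is still trimmed, because a downstream k-mer seeds the lookup and the alignment score stays above the threshold. A production tool would also need to handle paired reads, reverse-complemented primers, and indels near the primer boundary, none of which this sketch attempts.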
format Online
Article
Text
id pubmed-6511130
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6511130 2019-05-20 pTrimmer: An efficient tool to trim primers of multiplex deep sequencing data Zhang, Xiaolong Shao, Yanyan Tian, Jichao Liao, Yuwei Li, Peiying Zhang, Yu Chen, Jun Li, Zhiguang BMC Bioinformatics Software BioMed Central 2019-05-10 /pmc/articles/PMC6511130/ /pubmed/31077131 http://dx.doi.org/10.1186/s12859-019-2854-x Text en © The Author(s). 2019 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title pTrimmer: An efficient tool to trim primers of multiplex deep sequencing data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6511130/
https://www.ncbi.nlm.nih.gov/pubmed/31077131
http://dx.doi.org/10.1186/s12859-019-2854-x