PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data

BACKGROUND: We recently developed the Paired End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from...

Bibliographic Details
Main Authors: Chiu, Kuo Ping, Wong, Chee-Hong, Chen, Qiongyu, Ariyaratne, Pramila, Ooi, Hong Sain, Wei, Chia-Lin, Sung, Wing-Kin Ken, Ruan, Yijun
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1564156/
https://www.ncbi.nlm.nih.gov/pubmed/16934139
http://dx.doi.org/10.1186/1471-2105-7-390
author Chiu, Kuo Ping
Wong, Chee-Hong
Chen, Qiongyu
Ariyaratne, Pramila
Ooi, Hong Sain
Wei, Chia-Lin
Sung, Wing-Kin Ken
Ruan, Yijun
collection PubMed
description BACKGROUND: We recently developed the Paired-End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired-end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from raw sequence reads and how to map PETs to reference genome sequences correctly yet efficiently. To accommodate and streamline analysis of the large volume of PET sequences generated by each PET experiment, an automated PET data-processing pipeline is desirable. RESULTS: We designed an integrated software package, PET-Tool, to automatically process PET sequences and map them to genome sequences. The tool was implemented as a web-based application composed of four modules: the Extractor module for PET extraction; the Examiner module for analytic evaluation of PET sequence quality; the Mapper module for locating PET sequences in the genome sequences; and the ProjectManager module for data organization. The performance of PET-Tool was evaluated by analyzing 2.7 million PET sequences. PET-Tool proved accurate and efficient in extracting PET sequences and removing artifacts from large-volume datasets. Using optimized mapping criteria, over 70% of quality PET sequences were mapped specifically to the genome sequences. On a 2.4 GHz Linux machine, it takes approximately six hours to process one million PETs from extraction to mapping. CONCLUSION: Its speed, accuracy, and comprehensiveness show that PET-Tool is an important and useful component of PET experiments and that it can be extended to accommodate other related analyses of paired-end sequences. The tool also provides user-friendly functions for data quality checks and a system for multi-layer data management.
format Text
id pubmed-1564156
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1564156 2006-09-13 BMC Bioinformatics Software BioMed Central 2006-08-25 /pmc/articles/PMC1564156/ /pubmed/16934139 http://dx.doi.org/10.1186/1471-2105-7-390 Text en Copyright © 2006 Chiu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data
topic Software