PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data
BACKGROUND: We recently developed the Paired End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from...
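Main authors: | Chiu, Kuo Ping; Wong, Chee-Hong; Chen, Qiongyu; Ariyaratne, Pramila; Ooi, Hong Sain; Wei, Chia-Lin; Sung, Wing-Kin Ken; Ruan, Yijun |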
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1564156/ https://www.ncbi.nlm.nih.gov/pubmed/16934139 http://dx.doi.org/10.1186/1471-2105-7-390 |
_version_ | 1782129551163260928 |
---|---|
author | Chiu, Kuo Ping; Wong, Chee-Hong; Chen, Qiongyu; Ariyaratne, Pramila; Ooi, Hong Sain; Wei, Chia-Lin; Sung, Wing-Kin Ken; Ruan, Yijun |
author_facet | Chiu, Kuo Ping; Wong, Chee-Hong; Chen, Qiongyu; Ariyaratne, Pramila; Ooi, Hong Sain; Wei, Chia-Lin; Sung, Wing-Kin Ken; Ruan, Yijun |
author_sort | Chiu, Kuo Ping |
collection | PubMed |
description | BACKGROUND: We recently developed the Paired End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from raw sequence reads, and correctly yet efficiently map PETs to reference genome sequences. To accommodate and streamline data analysis of the large volume PET sequences generated from each PET experiment, an automated PET data process pipeline is desirable. RESULTS: We designed an integrated computation program package, PET-Tool, to automatically process PET sequences and map them to the genome sequences. The Tool was implemented as a web-based application composed of four modules: the Extractor module for PET extraction; the Examiner module for analytic evaluation of PET sequence quality; the Mapper module for locating PET sequences in the genome sequences; and the ProjectManager module for data organization. The performance of PET-Tool was evaluated through the analyses of 2.7 million PET sequences. It was demonstrated that PET-Tool is accurate and efficient in extracting PET sequences and removing artifacts from large volume dataset. Using optimized mapping criteria, over 70% of quality PET sequences were mapped specifically to the genome sequences. With a 2.4 GHz LINUX machine, it takes approximately six hours to process one million PETs from extraction to mapping. CONCLUSION: The speed, accuracy, and comprehensiveness have proved that PET-Tool is an important and useful component in PET experiments, and can be extended to accommodate other related analyses of paired-end sequences. The Tool also provides user-friendly functions for data quality check and system for multi-layer data management. |
format | Text |
id | pubmed-1564156 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-15641562006-09-13 PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data Chiu, Kuo Ping Wong, Chee-Hong Chen, Qiongyu Ariyaratne, Pramila Ooi, Hong Sain Wei, Chia-Lin Sung, Wing-Kin Ken Ruan, Yijun BMC Bioinformatics Software BACKGROUND: We recently developed the Paired End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from raw sequence reads, and correctly yet efficiently map PETs to reference genome sequences. To accommodate and streamline data analysis of the large volume PET sequences generated from each PET experiment, an automated PET data process pipeline is desirable. RESULTS: We designed an integrated computation program package, PET-Tool, to automatically process PET sequences and map them to the genome sequences. The Tool was implemented as a web-based application composed of four modules: the Extractor module for PET extraction; the Examiner module for analytic evaluation of PET sequence quality; the Mapper module for locating PET sequences in the genome sequences; and the ProjectManager module for data organization. The performance of PET-Tool was evaluated through the analyses of 2.7 million PET sequences. It was demonstrated that PET-Tool is accurate and efficient in extracting PET sequences and removing artifacts from large volume dataset. Using optimized mapping criteria, over 70% of quality PET sequences were mapped specifically to the genome sequences. With a 2.4 GHz LINUX machine, it takes approximately six hours to process one million PETs from extraction to mapping. CONCLUSION: The speed, accuracy, and comprehensiveness have proved that PET-Tool is an important and useful component in PET experiments, and can be extended to accommodate other related analyses of paired-end sequences. The Tool also provides user-friendly functions for data quality check and system for multi-layer data management. BioMed Central 2006-08-25 /pmc/articles/PMC1564156/ /pubmed/16934139 http://dx.doi.org/10.1186/1471-2105-7-390 Text en Copyright © 2006 Chiu et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Chiu, Kuo Ping Wong, Chee-Hong Chen, Qiongyu Ariyaratne, Pramila Ooi, Hong Sain Wei, Chia-Lin Sung, Wing-Kin Ken Ruan, Yijun PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data |
title | PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data |
title_full | PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data |
title_fullStr | PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data |
title_full_unstemmed | PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data |
title_short | PET-Tool: a software suite for comprehensive processing and managing of Paired-End diTag (PET) sequence data |
title_sort | pet-tool: a software suite for comprehensive processing and managing of paired-end ditag (pet) sequence data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1564156/ https://www.ncbi.nlm.nih.gov/pubmed/16934139 http://dx.doi.org/10.1186/1471-2105-7-390 |
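
The description reports that processing one million PETs from extraction to mapping takes approximately six hours on a 2.4 GHz Linux machine, and that the evaluation covered 2.7 million PET sequences. As a rough illustration of what those figures imply, the sketch below converts them into an estimated total runtime and a throughput. The function names are hypothetical, and the assumption that runtime scales linearly with the number of PETs is ours; the record does not state it.

```python
# Figures quoted in the record's description.
HOURS_PER_MILLION_PETS = 6.0   # "approximately six hours" per one million PETs
EVALUATED_PETS = 2_700_000     # size of the evaluation dataset


def estimated_hours(n_pets, hours_per_million=HOURS_PER_MILLION_PETS):
    """Estimate end-to-end processing time in hours.

    Assumption (not from the source): runtime grows linearly with PET count.
    """
    return n_pets / 1_000_000 * hours_per_million


def pets_per_second(hours_per_million=HOURS_PER_MILLION_PETS):
    """Average throughput implied by the quoted per-million runtime."""
    return 1_000_000 / (hours_per_million * 3600)


if __name__ == "__main__":
    print(f"Estimated runtime: {estimated_hours(EVALUATED_PETS):.1f} hours")
    print(f"Implied throughput: {pets_per_second():.0f} PETs per second")
```

Under the linear-scaling assumption, the full 2.7 million PET dataset would take roughly 16 hours, which corresponds to about 46 PETs per second. Both numbers are only as precise as the "approximately six hours" figure they derive from.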