UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences

With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is...

Bibliographic Details
Main authors: Du, Pu-Feng, Zhao, Wei, Miao, Yang-Yang, Wei, Le-Yi, Wang, Likun
Format: Online Article Text
Language: English
Published: MDPI 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5713368/
https://www.ncbi.nlm.nih.gov/pubmed/29135934
http://dx.doi.org/10.3390/ijms18112400
_version_ 1783283408830464000
author Du, Pu-Feng
Zhao, Wei
Miao, Yang-Yang
Wei, Le-Yi
Wang, Likun
author_facet Du, Pu-Feng
Zhao, Wei
Miao, Yang-Yang
Wei, Le-Yi
Wang, Likun
author_sort Du, Pu-Feng
collection PubMed
description With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming works in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
format Online
Article
Text
id pubmed-5713368
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-57133682017-12-07 UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences Du, Pu-Feng Zhao, Wei Miao, Yang-Yang Wei, Le-Yi Wang, Likun Int J Mol Sci Article With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming works in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository. MDPI 2017-11-14 /pmc/articles/PMC5713368/ /pubmed/29135934 http://dx.doi.org/10.3390/ijms18112400 Text en © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Du, Pu-Feng
Zhao, Wei
Miao, Yang-Yang
Wei, Le-Yi
Wang, Likun
UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
title UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
title_full UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
title_fullStr UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
title_full_unstemmed UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
title_short UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
title_sort ultrapse: a universal and extensible software platform for representing biological sequences
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5713368/
https://www.ncbi.nlm.nih.gov/pubmed/29135934
http://dx.doi.org/10.3390/ijms18112400