
ProDaMa: an open source Python library to generate protein structure datasets

BACKGROUND: The huge difference between the number of known sequences and known tertiary structures has justified the use of automated methods for protein analysis. Although a general methodology to solve these problems has not yet been devised, researchers are engaged in developing more accurate te...


Bibliographic Details
Main authors: Armano, Giuliano, Manconi, Andrea
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761932/
https://www.ncbi.nlm.nih.gov/pubmed/19799773
http://dx.doi.org/10.1186/1756-0500-2-202
_version_ 1782172876513738752
author Armano, Giuliano
Manconi, Andrea
author_facet Armano, Giuliano
Manconi, Andrea
author_sort Armano, Giuliano
collection PubMed
description BACKGROUND: The huge difference between the number of known sequences and known tertiary structures has justified the use of automated methods for protein analysis. Although a general methodology to solve these problems has not yet been devised, researchers are engaged in developing more accurate techniques and algorithms whose training plays a relevant role in determining their performance. From this perspective, particular importance is given to the training data used in experiments, and researchers are often engaged in the generation of specialized datasets that meet their requirements. FINDINGS: To facilitate the task of generating specialized datasets, we devised and implemented ProDaMa, an open source Python library that provides classes for retrieving, organizing, updating, analyzing, and filtering protein data. CONCLUSION: ProDaMa has been used to generate specialized datasets useful for secondary structure prediction and to develop a collaborative web application aimed at generating and sharing protein structure datasets. The library, the related database, and the documentation are freely available at the URL .
format Text
id pubmed-2761932
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2761932 2009-10-15 ProDaMa: an open source Python library to generate protein structure datasets Armano, Giuliano Manconi, Andrea BMC Res Notes Technical Note BACKGROUND: The huge difference between the number of known sequences and known tertiary structures has justified the use of automated methods for protein analysis. Although a general methodology to solve these problems has not yet been devised, researchers are engaged in developing more accurate techniques and algorithms whose training plays a relevant role in determining their performance. From this perspective, particular importance is given to the training data used in experiments, and researchers are often engaged in the generation of specialized datasets that meet their requirements. FINDINGS: To facilitate the task of generating specialized datasets, we devised and implemented ProDaMa, an open source Python library that provides classes for retrieving, organizing, updating, analyzing, and filtering protein data. CONCLUSION: ProDaMa has been used to generate specialized datasets useful for secondary structure prediction and to develop a collaborative web application aimed at generating and sharing protein structure datasets. The library, the related database, and the documentation are freely available at the URL . BioMed Central 2009-10-02 /pmc/articles/PMC2761932/ /pubmed/19799773 http://dx.doi.org/10.1186/1756-0500-2-202 Text en Copyright © 2009 Manconi et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Technical Note
Armano, Giuliano
Manconi, Andrea
ProDaMa: an open source Python library to generate protein structure datasets
title ProDaMa: an open source Python library to generate protein structure datasets
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761932/
https://www.ncbi.nlm.nih.gov/pubmed/19799773
http://dx.doi.org/10.1186/1756-0500-2-202