CFam: a chemical families database based on iterative selection of functional seeds and seed-directed compound clustering
Similarity-based clustering and classification of compounds enable the search of drug leads and the structural and chemogenomic studies for facilitating chemical, biomedical, agricultural, material and other industrial applications. A database that organizes compounds into similarity-based as well a...
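Main Authors: | Zhang, Cheng; Tao, Lin; Qin, Chu; Zhang, Peng; Chen, Shangying; Zeng, Xian; Xu, Feng; Chen, Zhe; Yang, Sheng Yong; Chen, Yu Zong |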
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2014 |
Subjects: | Database Issue |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383987/ https://www.ncbi.nlm.nih.gov/pubmed/25414339 http://dx.doi.org/10.1093/nar/gku1212 |
_version_ | 1782364828135849984 |
---|---|
author | Zhang, Cheng Tao, Lin Qin, Chu Zhang, Peng Chen, Shangying Zeng, Xian Xu, Feng Chen, Zhe Yang, Sheng Yong Chen, Yu Zong |
author_sort | Zhang, Cheng |
collection | PubMed |
description | Similarity-based clustering and classification of compounds enable the search for drug leads and support structural and chemogenomic studies that facilitate chemical, biomedical, agricultural, material and other industrial applications. A database that organizes compounds into similarity-based as well as scaffold-based and property-based families is useful for these tasks. The CFam Chemical Family database (http://bidd2.cse.nus.edu.sg/cfam) was developed to hierarchically cluster drugs, bioactive molecules, human metabolites, natural products, patented agents and other molecules into functional families, superfamilies and classes of structurally similar compounds based on literature-reported high, intermediate and remote similarity measures. The compounds were represented by molecular fingerprints, and molecular similarity was measured by the Tanimoto coefficient. The functional seeds of CFam families were drawn from hierarchically clustered drugs, bioactive molecules, human metabolites, natural products and patented agents, respectively, and were used to characterize families and to cluster compounds into families, superfamilies and classes. CFam currently contains 11 643 classes, 34 880 superfamilies and 87 136 families of 490 279 compounds (1691 approved drugs, 1228 clinical trial drugs, 12 386 investigative drugs, 262 881 highly active molecules, 15 055 human metabolites, 80 255 ZINC-processed natural products and 116 783 patented agents). Efforts will be made to further expand the CFam database and to add more functional categories and families based on other types of molecular representations. |
format | Online Article Text |
id | pubmed-4383987 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-43839872015-04-08 CFam: a chemical families database based on iterative selection of functional seeds and seed-directed compound clustering Zhang, Cheng Tao, Lin Qin, Chu Zhang, Peng Chen, Shangying Zeng, Xian Xu, Feng Chen, Zhe Yang, Sheng Yong Chen, Yu Zong Nucleic Acids Res Database Issue Similarity-based clustering and classification of compounds enable the search of drug leads and the structural and chemogenomic studies for facilitating chemical, biomedical, agricultural, material and other industrial applications. A database that organizes compounds into similarity-based as well as scaffold-based and property-based families is useful for facilitating these tasks. CFam Chemical Family database http://bidd2.cse.nus.edu.sg/cfam was developed to hierarchically cluster drugs, bioactive molecules, human metabolites, natural products, patented agents and other molecules into functional families, superfamilies and classes of structurally similar compounds based on the literature-reported high, intermediate and remote similarity measures. The compounds were represented by molecular fingerprint and molecular similarity was measured by Tanimoto coefficient. The functional seeds of CFam families were from hierarchically clustered drugs, bioactive molecules, human metabolites, natural products, patented agents, respectively, which were used to characterize families and cluster compounds into families, superfamilies and classes. CFam currently contains 11 643 classes, 34 880 superfamilies and 87 136 families of 490 279 compounds (1691 approved drugs, 1228 clinical trial drugs, 12 386 investigative drugs, 262 881 highly active molecules, 15 055 human metabolites, 80 255 ZINC-processed natural products and 116 783 patented agents). Efforts will be made to further expand CFam database and add more functional categories and families based on other types of molecular representations. 
Oxford University Press 2014-11-20 2015-01-28 /pmc/articles/PMC4383987/ /pubmed/25414339 http://dx.doi.org/10.1093/nar/gku1212 Text en © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | CFam: a chemical families database based on iterative selection of functional seeds and seed-directed compound clustering |
topic | Database Issue |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383987/ https://www.ncbi.nlm.nih.gov/pubmed/25414339 http://dx.doi.org/10.1093/nar/gku1212 |
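
The description field states that compounds are represented by molecular fingerprints, compared with the Tanimoto coefficient, and clustered around functional seeds. The sketch below illustrates that idea only and is not the CFam implementation. It assumes a fingerprint can be modelled as the set of its on-bit indices. The function names, the seed labels and the `threshold` values are hypothetical; the record does not give the actual similarity cut-offs.

```python
from typing import Dict, FrozenSet, Optional

# Assumption: a binary fingerprint is modelled as the set of its on-bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def assign_to_seed(
    compound: Fingerprint,
    seeds: Dict[str, Fingerprint],
    threshold: float,
) -> Optional[str]:
    """Return the most similar seed, or None if no seed reaches the threshold.

    `threshold` is a hypothetical parameter; the record does not state the
    cut-offs used for families, superfamilies or classes.
    """
    best_name, best_score = None, 0.0
    for name, seed_fp in seeds.items():
        score = tanimoto(compound, seed_fp)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Toy data: bit indices are made up for illustration.
seeds = {
    "seed_a": frozenset({1, 2, 3, 4}),
    "seed_b": frozenset({10, 11, 12}),
}
query = frozenset({1, 2, 3, 5})

print(tanimoto(query, seeds["seed_a"]))
print(assign_to_seed(query, seeds, threshold=0.5))
```

In this toy case the query shares three of five distinct bits with `seed_a`, so it is assigned there at a cut-off of 0.5 and left unassigned at a stricter cut-off. Repeating the assignment with progressively looser cut-offs is one way a hierarchy of families, superfamilies and classes could be approximated.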