
Nonlinear dimensionality reduction methods for synthetic biology biobricks’ visualization

BACKGROUND: Visualizing data by dimensionality reduction is an important strategy in Bioinformatics, which could help to discover hidden data properties and detect data quality issues, e.g. data noise, inappropriately labeled data, etc. As crowdsourcing-based synthetic biology databases face similar...


Bibliographic Details
Main Authors: Yang, Jiaoyun; Wang, Haipeng; Ding, Huitong; An, Ning; Alterovitz, Gil
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5248484/
https://www.ncbi.nlm.nih.gov/pubmed/28103789
http://dx.doi.org/10.1186/s12859-017-1484-4
author Yang, Jiaoyun
Wang, Haipeng
Ding, Huitong
An, Ning
Alterovitz, Gil
collection PubMed
description BACKGROUND: Visualizing data through dimensionality reduction is an important strategy in bioinformatics: it can help reveal hidden data properties and detect data quality issues such as noise and inappropriately labeled data. Because crowdsourcing-based synthetic biology databases face similar data quality issues, we propose visualizing biobricks to tackle them. However, existing dimensionality reduction methods cannot be applied directly to biobrick datasets. We therefore use normalized edit distance to enhance dimensionality reduction methods, including Isomap and Laplacian Eigenmaps. RESULTS: Biobricks were extracted from the synthetic biology database Registry of Standard Biological Parts, and six combinations of various biobrick types were tested. The visualization graphs show both well-discriminated biobricks and inappropriately labeled biobricks. The K-means clustering algorithm is used to quantify the reduction results. The average clustering accuracies for Isomap and Laplacian Eigenmaps are 0.857 and 0.844, respectively. In addition, Laplacian Eigenmaps is 5 times faster than Isomap, and its visualization graph is more concentrated, which helps discriminate biobricks. CONCLUSIONS: By combining normalized edit distance with Isomap and Laplacian Eigenmaps, synthetic biology biobricks are successfully visualized in two-dimensional space. Various types of biobricks can be discriminated and inappropriately labeled biobricks can be identified, which could help assess the quality of crowdsourcing-based synthetic biology databases and guide biobrick selection. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1484-4) contains supplementary material, which is available to authorized users.
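The description above says that sequence data cannot be fed directly to methods such as Isomap, and that a normalized edit distance is used to bridge the gap. The following is a minimal sketch of that idea, not the authors' implementation. It assumes one common normalization (Levenshtein distance divided by the length of the longer sequence); the record does not state which normalization the paper uses. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """ASSUMPTION: normalize by the longer length so values lie in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return edit_distance(a, b) / longest


def distance_matrix(seqs: list[str]) -> list[list[float]]:
    """Symmetric pairwise distance matrix with a zero diagonal."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = normalized_edit_distance(seqs[i], seqs[j])
    return d


# Hypothetical toy sequences standing in for biobrick DNA sequences.
toy_parts = ["ATGCGT", "ATGCGA", "TTTTTT"]
matrix = distance_matrix(toy_parts)
```

A matrix like this can then be passed to an embedding implementation that accepts precomputed distances, for example scikit-learn's `Isomap` with `metric="precomputed"`. That pairing is an illustration only, not something the record attributes to the authors.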
format Online
Article
Text
id pubmed-5248484
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5248484 2017-01-25 Nonlinear dimensionality reduction methods for synthetic biology biobricks’ visualization Yang, Jiaoyun Wang, Haipeng Ding, Huitong An, Ning Alterovitz, Gil BMC Bioinformatics Research Article
BioMed Central 2017-01-19 /pmc/articles/PMC5248484/ /pubmed/28103789 http://dx.doi.org/10.1186/s12859-017-1484-4 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Nonlinear dimensionality reduction methods for synthetic biology biobricks’ visualization
topic Research Article