
Orthogonal outlier detection and dimension estimation for improved MDS embedding of biological datasets

Conventional dimensionality reduction methods like Multidimensional Scaling (MDS) are sensitive to the presence of orthogonal outliers, leading to significant defects in the embedding. We introduce a robust MDS method, called DeCOr-MDS (Detection and Correction of Orthogonal outliers using MDS), bas...


Bibliographic Details
Main Authors: Li, Wanxin; Mirone, Jules; Prasad, Ashok; Miolane, Nina; Legrand, Carine; Dao Duc, Khanh
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10448701/
https://www.ncbi.nlm.nih.gov/pubmed/37637212
http://dx.doi.org/10.3389/fbinf.2023.1211819
author Li, Wanxin
Mirone, Jules
Prasad, Ashok
Miolane, Nina
Legrand, Carine
Dao Duc, Khanh
collection PubMed
description Conventional dimensionality reduction methods like Multidimensional Scaling (MDS) are sensitive to the presence of orthogonal outliers, leading to significant defects in the embedding. We introduce a robust MDS method, called DeCOr-MDS (Detection and Correction of Orthogonal outliers using MDS), based on the geometry and statistics of simplices formed by data points, which detects orthogonal outliers and subsequently reduces dimensionality. We validate our method using synthetic datasets, and further show how it can be applied to a variety of large real biological datasets, including cancer image cell data, human microbiome project data and single cell RNA sequencing data, to address the task of data cleaning and visualization.
format Online
Article
Text
id pubmed-10448701
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10448701 2023-08-25 Orthogonal outlier detection and dimension estimation for improved MDS embedding of biological datasets Li, Wanxin Mirone, Jules Prasad, Ashok Miolane, Nina Legrand, Carine Dao Duc, Khanh Front Bioinform Bioinformatics Conventional dimensionality reduction methods like Multidimensional Scaling (MDS) are sensitive to the presence of orthogonal outliers, leading to significant defects in the embedding. We introduce a robust MDS method, called DeCOr-MDS (Detection and Correction of Orthogonal outliers using MDS), based on the geometry and statistics of simplices formed by data points, which detects orthogonal outliers and subsequently reduces dimensionality. We validate our method using synthetic datasets, and further show how it can be applied to a variety of large real biological datasets, including cancer image cell data, human microbiome project data and single cell RNA sequencing data, to address the task of data cleaning and visualization. Frontiers Media S.A. 2023-08-10 /pmc/articles/PMC10448701/ /pubmed/37637212 http://dx.doi.org/10.3389/fbinf.2023.1211819 Text en Copyright © 2023 Li, Mirone, Prasad, Miolane, Legrand and Dao Duc. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Orthogonal outlier detection and dimension estimation for improved MDS embedding of biological datasets
topic Bioinformatics