pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction

MOTIVATION: The emergence of single-cell RNA-sequencing has enabled analyses that leverage transitioning cell states to reconstruct pseudotemporal trajectories. Multidimensional data sparsity, zero inflation and technical variation necessitate the selection of high-quality features that feed downstr...

Bibliographic Details
Main authors: Chen, Bob, Herring, Charles A, Lau, Ken S
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6596893/
https://www.ncbi.nlm.nih.gov/pubmed/30445607
http://dx.doi.org/10.1093/bioinformatics/bty950
_version_ 1783430514342887424
author Chen, Bob
Herring, Charles A
Lau, Ken S
author_facet Chen, Bob
Herring, Charles A
Lau, Ken S
author_sort Chen, Bob
collection PubMed
description MOTIVATION: The emergence of single-cell RNA-sequencing has enabled analyses that leverage transitioning cell states to reconstruct pseudotemporal trajectories. Multidimensional data sparsity, zero inflation and technical variation necessitate the selection of high-quality features that feed downstream analyses. Despite the development of numerous algorithms for the unsupervised selection of biologically relevant features, their differential performance remains largely unaddressed. RESULTS: We implemented the neighborhood variance ratio (NVR) feature selection approach as a Python package with substantial improvements in performance. In comparing NVR with multiple unsupervised algorithms such as dpFeature, we observed striking differences in features selected. We present evidence that quantifiable dataset properties have observable and predictable effects on the performance of these algorithms. AVAILABILITY AND IMPLEMENTATION: pyNVR is freely available at https://github.com/KenLauLab/NVR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6596893
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-65968932019-10-21 pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction Chen, Bob Herring, Charles A Lau, Ken S Bioinformatics Applications Notes MOTIVATION: The emergence of single-cell RNA-sequencing has enabled analyses that leverage transitioning cell states to reconstruct pseudotemporal trajectories. Multidimensional data sparsity, zero inflation and technical variation necessitate the selection of high-quality features that feed downstream analyses. Despite the development of numerous algorithms for the unsupervised selection of biologically relevant features, their differential performance remains largely unaddressed. RESULTS: We implemented the neighborhood variance ratio (NVR) feature selection approach as a Python package with substantial improvements in performance. In comparing NVR with multiple unsupervised algorithms such as dpFeature, we observed striking differences in features selected. We present evidence that quantifiable dataset properties have observable and predictable effects on the performance of these algorithms. AVAILABILITY AND IMPLEMENTATION: pyNVR is freely available at https://github.com/KenLauLab/NVR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2019-07-01 2018-11-16 /pmc/articles/PMC6596893/ /pubmed/30445607 http://dx.doi.org/10.1093/bioinformatics/bty950 Text en © The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Applications Notes
Chen, Bob
Herring, Charles A
Lau, Ken S
pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction
title pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction
title_full pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction
title_fullStr pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction
title_full_unstemmed pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction
title_short pyNVR: investigating factors affecting feature selection from scRNA-seq data for lineage reconstruction
title_sort pynvr: investigating factors affecting feature selection from scrna-seq data for lineage reconstruction
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6596893/
https://www.ncbi.nlm.nih.gov/pubmed/30445607
http://dx.doi.org/10.1093/bioinformatics/bty950