
A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples

Finding similarities and differences between metagenomic samples in large repositories has become a significant issue for researchers. In recent years, various studies have proposed content-based retrieval from different perspectives. In this study, a content-based retrieval f...


Bibliographic Details
Main Authors: Şener, Duygu Dede; Santoni, Daniele; Felici, Giovanni; Oğul, Hasan
Format: Online Article Text
Language: English
Published: De Gruyter 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6348744/
https://www.ncbi.nlm.nih.gov/pubmed/30367805
http://dx.doi.org/10.1515/jib-2017-0067
author Şener, Duygu Dede
Santoni, Daniele
Felici, Giovanni
Oğul, Hasan
collection PubMed
description Finding similarities and differences between metagenomic samples in large repositories has become a significant issue for researchers. In recent years, various studies have proposed content-based retrieval from different perspectives. In this study, a content-based retrieval framework for identifying relevant metagenomic samples is developed. The framework consists of feature extraction, feature selection methods, and similarity measures for whole metagenome sequencing samples. The framework's performance was evaluated on a given set of samples against a ground truth: retrieved samples from patients with the same disease (called positive samples) are labeled relevant, and all others irrelevant. The experimental results show that relevant experiments can be detected using different fingerprinting approaches. We observed that Latent Semantic Analysis (LSA) is a promising fingerprinting approach for representing metagenomic samples and finding relevance among them. Source code and executable files are available at www.baskent.edu.tr/~hogul/WMS_retrieval.rar.
format Online
Article
Text
id pubmed-6348744
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher De Gruyter
record_format MEDLINE/PubMed
spelling pubmed-6348744 2019-01-28 A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples. Şener, Duygu Dede; Santoni, Daniele; Felici, Giovanni; Oğul, Hasan. J Integr Bioinform. Research Article. De Gruyter 2018-10-26. /pmc/articles/PMC6348744/ /pubmed/30367805 http://dx.doi.org/10.1515/jib-2017-0067 Text en. ©2018, Duygu Dede Şener et al., published by De Gruyter, Berlin/Boston. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (http://creativecommons.org/licenses/by-nc-nd/4.0).
title A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples
topic Research Article