A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples
Finding similarities and differences between metagenomic samples within large repositories has been a significant issue for researchers. In recent years, content-based retrieval has been suggested by various studies from different perspectives. In this study, a content-based retrieval f...
Main authors: | Şener, Duygu Dede; Santoni, Daniele; Felici, Giovanni; Oğul, Hasan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | De Gruyter 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6348744/ https://www.ncbi.nlm.nih.gov/pubmed/30367805 http://dx.doi.org/10.1515/jib-2017-0067 |
_version_ | 1783390156397477888 |
---|---|
author | Şener, Duygu Dede Santoni, Daniele Felici, Giovanni Oğul, Hasan |
author_facet | Şener, Duygu Dede Santoni, Daniele Felici, Giovanni Oğul, Hasan |
author_sort | Şener, Duygu Dede |
collection | PubMed |
description | Finding similarities and differences between metagenomic samples within large repositories has been a significant issue for researchers. In recent years, content-based retrieval has been suggested by various studies from different perspectives. In this study, a content-based retrieval framework for identifying relevant metagenomic samples is developed. The framework consists of feature extraction, selection methods and similarity measures for whole metagenome sequencing samples. Performance of the developed framework was evaluated on given samples. A ground truth was used to evaluate system performance: if the system retrieves patients with the same disease (called positive samples), they are labeled as relevant samples; otherwise they are labeled as irrelevant. The experimental results show that relevant experiments can be detected by using different fingerprinting approaches. We observed that the Latent Semantic Analysis (LSA) method is a promising fingerprinting approach for representing metagenomic samples and finding relevance among them. Source code and executable files are available at www.baskent.edu.tr/~hogul/WMS_retrieval.rar. |
format | Online Article Text |
id | pubmed-6348744 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | De Gruyter |
record_format | MEDLINE/PubMed |
spelling | pubmed-63487442019-01-28 A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples Şener, Duygu Dede Santoni, Daniele Felici, Giovanni Oğul, Hasan J Integr Bioinform Research Article Finding similarities and differences between metagenomic samples within large repositories has been rather a significant issue for researchers. Over the recent years, content-based retrieval has been suggested by various studies from different perspectives. In this study, a content-based retrieval framework for identifying relevant metagenomic samples is developed. The framework consists of feature extraction, selection methods and similarity measures for whole metagenome sequencing samples. Performance of the developed framework was evaluated on given samples. A ground truth was used to evaluate the system performance such that if the system retrieves patients with the same disease, -called positive samples-, they are labeled as relevant samples otherwise irrelevant. The experimental results show that relevant experiments can be detected by using different fingerprinting approaches. We observed that Latent Semantic Analysis (LSA) Method is a promising fingerprinting approach for representing metagenomic samples and finding relevance among them. Source codes and executable files are available at www.baskent.edu.tr/∼hogul/WMS_retrieval.rar. De Gruyter 2018-10-26 /pmc/articles/PMC6348744/ /pubmed/30367805 http://dx.doi.org/10.1515/jib-2017-0067 Text en ©2018, Duygu Dede Şener et al., published by De Gruyter, Berlin/Boston http://creativecommons.org/licenses/by-nc-nd/4.0 This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. |
spellingShingle | Research Article Şener, Duygu Dede Santoni, Daniele Felici, Giovanni Oğul, Hasan A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples |
title | A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples |
title_full | A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples |
title_fullStr | A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples |
title_full_unstemmed | A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples |
title_short | A Content-Based Retrieval Framework for Whole Metagenome Sequencing Samples |
title_sort | content-based retrieval framework for whole metagenome sequencing samples |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6348744/ https://www.ncbi.nlm.nih.gov/pubmed/30367805 http://dx.doi.org/10.1515/jib-2017-0067 |
work_keys_str_mv | AT senerduygudede acontentbasedretrievalframeworkforwholemetagenomesequencingsamples AT santonidaniele acontentbasedretrievalframeworkforwholemetagenomesequencingsamples AT felicigiovanni acontentbasedretrievalframeworkforwholemetagenomesequencingsamples AT ogulhasan acontentbasedretrievalframeworkforwholemetagenomesequencingsamples AT senerduygudede contentbasedretrievalframeworkforwholemetagenomesequencingsamples AT santonidaniele contentbasedretrievalframeworkforwholemetagenomesequencingsamples AT felicigiovanni contentbasedretrievalframeworkforwholemetagenomesequencingsamples AT ogulhasan contentbasedretrievalframeworkforwholemetagenomesequencingsamples |
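
The description above names Latent Semantic Analysis (LSA) fingerprints and similarity measures but gives no implementation details. The following is a minimal sketch of that general idea, not the authors' method. It assumes that each sample is summarized by k-mer frequencies, that LSA is done with a plain truncated SVD, and that cosine similarity is used for ranking; the record confirms none of these choices. The function names (`kmer_counts`, `lsa_fingerprints`, `rank_by_cosine`) and the toy reads are hypothetical.

```python
from itertools import product

import numpy as np


def kmer_counts(reads, k=2):
    """Count overlapping k-mers over A/C/G/T for one sample.

    Assumption: k-mer counts stand in for the framework's feature extraction.
    """
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = np.zeros(len(index))
    for read in reads:
        for i in range(len(read) - k + 1):
            j = index.get(read[i:i + k])
            if j is not None:  # skip k-mers containing N or other symbols
                counts[j] += 1
    return counts


def lsa_fingerprints(matrix, n_components=2):
    """Project each row (sample) onto the top singular directions."""
    # Convert counts to frequencies so sequencing depth does not dominate.
    freqs = matrix / matrix.sum(axis=1, keepdims=True)
    u, s, _ = np.linalg.svd(freqs, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def rank_by_cosine(fingerprints, query):
    """Return the other sample indices, most similar to `query` first."""
    norms = np.linalg.norm(fingerprints, axis=1, keepdims=True)
    unit = fingerprints / norms
    sims = unit @ unit[query]
    order = np.argsort(-sims)
    return [int(i) for i in order if i != query]


# Hypothetical toy data: samples 0 and 1 are AT-rich, 2 and 3 are GC-rich.
samples = [
    ["ATATATAT", "TATATATA"],
    ["ATATTATA", "TATAATAT"],
    ["GCGCGCGC", "CGCGCGCG"],
    ["GCGGCGCC", "CCGCGGCG"],
]
counts = np.vstack([kmer_counts(reads) for reads in samples])
fps = lsa_fingerprints(counts, n_components=2)
print(rank_by_cosine(fps, query=0))
print(rank_by_cosine(fps, query=2))
```

With a ground truth of disease labels, a retrieved sample would count as relevant when it shares the query's label, matching the evaluation scheme the description outlines.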