What we can see from very small size sample of metagenomic sequences
BACKGROUND: Since the analysis of a large number of metagenomic sequences costs heavy computing resources and takes long time, we examined a selected small part of metagenomic sequences as “sample”s of the entire full sequences, both for a mock community and for 10 different existing metagenomics ca...
Main Authors: | Kwak, Jaesik, Park, Joonhong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6215618/ https://www.ncbi.nlm.nih.gov/pubmed/30390617 http://dx.doi.org/10.1186/s12859-018-2431-8 |
_version_ | 1783368177012441088 |
---|---|
author | Kwak, Jaesik Park, Joonhong |
author_facet | Kwak, Jaesik Park, Joonhong |
author_sort | Kwak, Jaesik |
collection | PubMed |
description | BACKGROUND: Since the analysis of a large number of metagenomic sequences costs heavy computing resources and takes long time, we examined a selected small part of metagenomic sequences as “sample”s of the entire full sequences, both for a mock community and for 10 different existing metagenomics case studies. A mock community with 10 bacterial strains was prepared, and their mixed genome were sequenced by Hiseq. The hits of BLAST search for reference genome of each strain were counted. Each of 176 different small parts selected from these sequences were also searched by BLAST and their hits were also counted, in order to compare them to the original search results from the full sequences. We also prepared small parts of sequences which were selected from 10 publicly downloadable research data of MG-RAST service, and analyzed these samples with MG-RAST. RESULTS: Both the BLAST search tests of the mock community and the results from the publicly downloadable researches of MG-RAST show that sampling an extremely small part from sequence data is useful to estimate brief taxonomic information of the original metagenomic sequences. For 9 cases out of 10, the most annotated classes from the MG-RAST analyses of the selected partial sample sequences are the same as the ones from the originals. CONCLUSIONS: When a researcher wants to estimate brief information of a metagenome’s taxonomic distribution with less computing resources and within shorter time, the researcher can analyze a selected small part of metagenomic sequences. With this approach, we can also build a strategy to monitor metagenome samples of wider geographic area, more frequently. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2431-8) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6215618 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-62156182018-11-08 What we can see from very small size sample of metagenomic sequences Kwak, Jaesik Park, Joonhong BMC Bioinformatics Research Article BACKGROUND: Since the analysis of a large number of metagenomic sequences costs heavy computing resources and takes long time, we examined a selected small part of metagenomic sequences as “sample”s of the entire full sequences, both for a mock community and for 10 different existing metagenomics case studies. A mock community with 10 bacterial strains was prepared, and their mixed genome were sequenced by Hiseq. The hits of BLAST search for reference genome of each strain were counted. Each of 176 different small parts selected from these sequences were also searched by BLAST and their hits were also counted, in order to compare them to the original search results from the full sequences. We also prepared small parts of sequences which were selected from 10 publicly downloadable research data of MG-RAST service, and analyzed these samples with MG-RAST. RESULTS: Both the BLAST search tests of the mock community and the results from the publicly downloadable researches of MG-RAST show that sampling an extremely small part from sequence data is useful to estimate brief taxonomic information of the original metagenomic sequences. For 9 cases out of 10, the most annotated classes from the MG-RAST analyses of the selected partial sample sequences are the same as the ones from the originals. CONCLUSIONS: When a researcher wants to estimate brief information of a metagenome’s taxonomic distribution with less computing resources and within shorter time, the researcher can analyze a selected small part of metagenomic sequences. With this approach, we can also build a strategy to monitor metagenome samples of wider geographic area, more frequently. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2431-8) contains supplementary material, which is available to authorized users. BioMed Central 2018-11-03 /pmc/articles/PMC6215618/ /pubmed/30390617 http://dx.doi.org/10.1186/s12859-018-2431-8 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Kwak, Jaesik Park, Joonhong What we can see from very small size sample of metagenomic sequences |
title | What we can see from very small size sample of metagenomic sequences |
title_full | What we can see from very small size sample of metagenomic sequences |
title_fullStr | What we can see from very small size sample of metagenomic sequences |
title_full_unstemmed | What we can see from very small size sample of metagenomic sequences |
title_short | What we can see from very small size sample of metagenomic sequences |
title_sort | what we can see from very small size sample of metagenomic sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6215618/ https://www.ncbi.nlm.nih.gov/pubmed/30390617 http://dx.doi.org/10.1186/s12859-018-2431-8 |
work_keys_str_mv | AT kwakjaesik whatwecanseefromverysmallsizesampleofmetagenomicsequences AT parkjoonhong whatwecanseefromverysmallsizesampleofmetagenomicsequences |
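
The abstract in this record describes taking a very small part of a read set and checking whether its most annotated class matches that of the full data. The following is a minimal Python sketch of that general idea, not the authors' pipeline: it assumes each read has already been given a taxonomic label (for example from a BLAST or MG-RAST run), and the function names `subsample`, `top_label`, and `relative_abundance`, the simple random sampling scheme, and the synthetic labels are all illustrative assumptions rather than details from the article.

```python
import random
from collections import Counter


def subsample(items, fraction, seed=0):
    """Draw a simple random sample (without replacement) of `items`.

    `fraction` is the share of items to keep, e.g. 0.01 for 1 %.
    The sampling scheme is an assumption; the article may select
    its small parts differently.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(items) * fraction))
    return random.Random(seed).sample(items, k)


def top_label(labels):
    """Return the most frequent label among per-read annotations."""
    return Counter(labels).most_common(1)[0][0]


def relative_abundance(labels):
    """Map each label to its share of all annotated reads."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Synthetic per-read labels standing in for real annotation output.
full = ["ClassA"] * 7000 + ["ClassB"] * 2000 + ["ClassC"] * 1000
small = subsample(full, fraction=0.01)

# Compare the dominant class and the abundance profile of both sets.
print(top_label(full), top_label(small))
print(relative_abundance(small))
```

With real data, the labels would come from annotating only the sampled reads, which is where the savings in computing time arise; the comparison against the full set is needed only when validating the approach.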