
Sc-ncDNAPred: A Sequence-Based Predictor for Identifying Non-coding DNA in Saccharomyces cerevisiae

With the rapid development of high-speed sequencing technologies and the implementation of many whole-genome sequencing projects, genomics research is advancing from genome sequencing to genome synthesis. Synthetic biology technologies such as DNA-based molecular assemblies, genome editing tec...


Bibliographic Details
Main authors:	He, Wenying; Ju, Ying; Zeng, Xiangxiang; Liu, Xiangrong; Zou, Quan
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2018
Subjects:	Microbiology
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6144933/ https://www.ncbi.nlm.nih.gov/pubmed/30258427 http://dx.doi.org/10.3389/fmicb.2018.02174

author	He, Wenying; Ju, Ying; Zeng, Xiangxiang; Liu, Xiangrong; Zou, Quan
collection	PubMed
description	With the rapid development of high-speed sequencing technologies and the implementation of many whole-genome sequencing projects, genomics research is advancing from genome sequencing to genome synthesis. Synthetic biology technologies such as DNA-based molecular assembly, genome editing, directed evolution, and DNA storage, along with other cutting-edge technologies, are emerging in succession. In particular, the rapid growth and development of DNA assembly technology may greatly advance the success of artificial life. Meanwhile, DNA assembly technology requires a large number of well-characterized target sequences as supporting data. Non-coding DNA (ncDNA) sequences make up most of an organism's genome, so recognizing them accurately is necessary. Although experimental methods have been proposed to detect ncDNA sequences, they are expensive for genome-wide detection. Thus, it is necessary to develop machine-learning methods for predicting ncDNA sequences. In this study, we collected an ncDNA benchmark dataset for Saccharomyces cerevisiae and developed a support vector machine-based predictor, called Sc-ncDNAPred, for predicting ncDNA sequences. The optimal feature extraction strategy was selected from a group that included mononucleotide, dimer, trimer, tetramer, pentamer, and hexamer features, using the support vector machine learning method. Sc-ncDNAPred achieved an overall accuracy of 0.98. For the convenience of users, an online web server has been built at: http://server.malab.cn/Sc_ncDNAPred/index.jsp.
format	Online Article Text
id	pubmed-6144933
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-6144933 2018-09-26 Sc-ncDNAPred: A Sequence-Based Predictor for Identifying Non-coding DNA in Saccharomyces cerevisiae He, Wenying; Ju, Ying; Zeng, Xiangxiang; Liu, Xiangrong; Zou, Quan Front Microbiol Microbiology Frontiers Media S.A. 2018-09-12 /pmc/articles/PMC6144933/ /pubmed/30258427 http://dx.doi.org/10.3389/fmicb.2018.02174 Text en Copyright © 2018 He, Ju, Zeng, Liu and Zou. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Sc-ncDNAPred: A Sequence-Based Predictor for Identifying Non-coding DNA in Saccharomyces cerevisiae
topic	Microbiology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6144933/ https://www.ncbi.nlm.nih.gov/pubmed/30258427 http://dx.doi.org/10.3389/fmicb.2018.02174
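
The description in this record says the feature extraction strategy was chosen from mononucleotide through hexamer features. As an illustration only, the following is a minimal sketch of how such k-mer frequency features could be computed from a DNA sequence. The function names (kmer_frequencies, feature_vector), the normalization, and the handling of ambiguous bases are assumptions made for this sketch; the record does not describe the authors' actual implementation, and the SVM training step is not shown.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: only the four canonical bases are counted


def kmer_frequencies(sequence, k):
    """Return normalized frequencies of all 4**k k-mers in lexicographic order."""
    if k < 1:
        raise ValueError("k must be >= 1")
    seq = sequence.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k across the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


def feature_vector(sequence, ks=(1, 2, 3)):
    """Concatenate k-mer frequency vectors for each k in ks (hypothetical helper)."""
    features = []
    for k in ks:
        features.extend(kmer_frequencies(sequence, k))
    return features
```

With ks=(1, 2, 3) the vector has 4 + 16 + 64 = 84 entries; extending ks up to 6 would cover the hexamer case named in the description. Such vectors could then be passed to any SVM implementation, which is a design choice left open here.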
