SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets

MOTIVATION: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid diversification in sequences. RES...

Bibliographic Details
Main Authors: Mao, Hongliang; Wang, Hao
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408816/
https://www.ncbi.nlm.nih.gov/pubmed/28062442
http://dx.doi.org/10.1093/bioinformatics/btw718
_version_ 1783232371183583232
author Mao, Hongliang
Wang, Hao
author_facet Mao, Hongliang
Wang, Hao
author_sort Mao, Hongliang
collection PubMed
description MOTIVATION: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid diversification in sequences. RESULTS: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, whose sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimation of the abundance of SINEs in these genomes. AVAILABILITY AND IMPLEMENTATION: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan, implemented in PERL and supported on Linux. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-5408816
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5408816 2017-05-03 SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets Mao, Hongliang Wang, Hao Bioinformatics Applications Notes MOTIVATION: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid diversification in sequences. RESULTS: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, whose sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimation of the abundance of SINEs in these genomes. AVAILABILITY AND IMPLEMENTATION: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan, implemented in PERL and supported on Linux. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2017-03-01 2016-12-13 /pmc/articles/PMC5408816/ /pubmed/28062442 http://dx.doi.org/10.1093/bioinformatics/btw718 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Applications Notes
Mao, Hongliang
Wang, Hao
SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets
title SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408816/
https://www.ncbi.nlm.nih.gov/pubmed/28062442
http://dx.doi.org/10.1093/bioinformatics/btw718