SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets
MOTIVATION: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid sequence diversification. RES...
Main authors: | Mao, Hongliang; Wang, Hao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | Applications Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408816/ https://www.ncbi.nlm.nih.gov/pubmed/28062442 http://dx.doi.org/10.1093/bioinformatics/btw718 |
_version_ | 1783232371183583232 |
---|---|
author | Mao, Hongliang Wang, Hao |
author_facet | Mao, Hongliang Wang, Hao |
author_sort | Mao, Hongliang |
collection | PubMed |
description | MOTIVATION: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid sequence diversification. RESULTS: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, whose sizes range from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimated abundance of SINEs in these genomes. AVAILABILITY AND IMPLEMENTATION: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan, implemented in Perl and supported on Linux. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5408816 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-54088162017-05-03 SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets Mao, Hongliang Wang, Hao Bioinformatics Applications Notes MOTIVATION: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid sequence diversification. RESULTS: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, whose sizes range from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimated abundance of SINEs in these genomes. AVAILABILITY AND IMPLEMENTATION: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan, implemented in Perl and supported on Linux. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2017-03-01 2016-12-13 /pmc/articles/PMC5408816/ /pubmed/28062442 http://dx.doi.org/10.1093/bioinformatics/btw718 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Applications Notes Mao, Hongliang Wang, Hao SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets |
title | SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets |
title_full | SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets |
title_fullStr | SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets |
title_full_unstemmed | SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets |
title_short | SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets |
title_sort | sine_scan: an efficient tool to discover short interspersed nuclear elements (sines) in large-scale genomic datasets |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408816/ https://www.ncbi.nlm.nih.gov/pubmed/28062442 http://dx.doi.org/10.1093/bioinformatics/btw718 |
work_keys_str_mv | AT maohongliang sinescananefficienttooltodiscovershortinterspersednuclearelementssinesinlargescalegenomicdatasets AT wanghao sinescananefficienttooltodiscovershortinterspersednuclearelementssinesinlargescalegenomicdatasets |