
A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data

We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statist...


Bibliographic Details
Main authors: Gog, Julia R., Lever, Andrew M. L., Skittrall, Jordan P.
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5898753/
https://www.ncbi.nlm.nih.gov/pubmed/29652903
http://dx.doi.org/10.1371/journal.pone.0195763
author Gog, Julia R.
Lever, Andrew M. L.
Skittrall, Jordan P.
collection PubMed
description We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statistical methods without using complex models or making many assumptions are surprisingly lacking. We resolve this by developing a method that detects regions of low score within sequences of real numbers. The method makes no assumptions a priori about the length of such a region; it gives the explicit location of the region and scores it statistically. It does not use detailed mechanistic models so the method is fast and will be useful in a wide range of applications. We present our approach in detail, and test it on simulated sequences. We show that it is robust to a wide range of signal morphologies, and that it is able to capture multiple signals in the same sequence. Finally we apply it to viral genomic data to identify regions of evolutionary conservation within influenza and rotavirus.
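The description above does not spell out the algorithm, so the following is only a generic illustration of the problem it states: locating a contiguous region of low scores without fixing its length in advance, then attaching a statistical score to it. It is not the authors' method. As a stand-in technique it uses a minimum-sum subarray scan (Kadane's algorithm) on mean-centred scores, followed by a permutation test. The function names `lowest_scoring_region` and `permutation_p_value`, and the parameters `n_perm` and `seed`, are hypothetical.

```python
import random


def lowest_scoring_region(scores):
    """Return (start, end, total) for the contiguous run whose mean-centred
    scores have the smallest sum; `end` is exclusive.

    Generic minimum-sum subarray scan, NOT the method of the paper.
    """
    mean = sum(scores) / len(scores)
    best_total = float("inf")
    best_start = best_end = 0
    run_total = 0.0
    run_start = 0
    for i, score in enumerate(scores):
        centred = score - mean
        if run_total > 0:
            # A positive running sum can only hurt: restart the run here.
            run_total = centred
            run_start = i
        else:
            run_total += centred
        if run_total < best_total:
            best_total = run_total
            best_start, best_end = run_start, i + 1
    return best_start, best_end, best_total


def permutation_p_value(scores, n_perm=999, seed=0):
    """Estimate how often a shuffled sequence contains a region at least as
    low as the observed one (assumption: scores are exchangeable under the
    null hypothesis of no signal)."""
    rng = random.Random(seed)
    observed = lowest_scoring_region(scores)[2]
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if lowest_scoring_region(shuffled)[2] <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    example = [5, 5, 5, 5, 1, 1, 1, 5, 5, 5]
    start, end, total = lowest_scoring_region(example)
    print("region:", start, end, "centred sum:", total)
    print("permutation p-value:", permutation_p_value(example))
```

The scan needs no window length as input, which mirrors the stated goal of making no a priori assumption about region length; the shuffle-based null is a simplifying assumption and ignores any correlation between neighbouring positions.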
format Online
Article
Text
id pubmed-5898753
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5898753 2018-04-27
PLoS One
Research Article
Public Library of Science 2018-04-13
/pmc/articles/PMC5898753/
/pubmed/29652903
http://dx.doi.org/10.1371/journal.pone.0195763
Text en
© 2018 Gog et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data
topic Research Article