
SAVI: a statistical algorithm for variant frequency identification

BACKGROUND: Many problems in biomedical research can be posed as a comparison between related samples (healthy vs. disease, subtypes of the same disease, longitudinal data representing the progression of a disease, etc.). In the cases in which the distinction has a genetic or epigenetic basis, next-g...


Bibliographic Details
Main authors: Trifonov, Vladimir; Pasqualucci, Laura; Tiacci, Enrico; Falini, Brunangelo; Rabadan, Raul
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3851977/
https://www.ncbi.nlm.nih.gov/pubmed/24564980
http://dx.doi.org/10.1186/1752-0509-7-S2-S2
author Trifonov, Vladimir
Pasqualucci, Laura
Tiacci, Enrico
Falini, Brunangelo
Rabadan, Raul
collection PubMed
description BACKGROUND: Many problems in biomedical research can be posed as a comparison between related samples (healthy vs. disease, subtypes of the same disease, longitudinal data representing the progression of a disease, etc.). In the cases in which the distinction has a genetic or epigenetic basis, next-generation sequencing technologies have become a major tool for obtaining the difference between the samples. A commonly occurring application is the identification of somatic mutations occurring in tumor tissue samples driving a single cell to expand clonally. In this case, the progression of the disease can be traced through the trajectory of the frequency of the oncogenic alleles. Thus, obtaining precise estimates of the frequency of abnormal alleles at various stages of the disease is paramount to understanding the processes driving it. Although the procedure is conceptually simple, technical difficulties arise due to inhomogeneous samples, existence of competing subclonal populations, and systematic and non-systematic errors introduced by the sequencing technologies. RESULTS: We present a method, Statistical Algorithm for Variant Frequency Identification (SAVI), to estimate the frequency of alleles in a set of samples. The method employs Bayesian analysis and uses an iterative procedure to derive empirical priors. The approach allows for the comparison of allele frequencies across several samples, e.g., normal/tumor pairs and more complex experimental designs comparing multiple samples in tumor progression, as well as analyzing sequencing data from RNA sequencing experiments. CONCLUSIONS: Analyzing sequencing data through estimating allele frequencies using empirical Bayes methods is a powerful complement to the ever-increasing throughput of the sequencing technologies.
format Online
Article
Text
id pubmed-3851977
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3851977 2013-12-20 SAVI: a statistical algorithm for variant frequency identification Trifonov, Vladimir Pasqualucci, Laura Tiacci, Enrico Falini, Brunangelo Rabadan, Raul BMC Syst Biol Research BioMed Central 2013-10-14 /pmc/articles/PMC3851977/ /pubmed/24564980 http://dx.doi.org/10.1186/1752-0509-7-S2-S2 Text en Copyright © 2013 Trifonov et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SAVI: a statistical algorithm for variant frequency identification
topic Research
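
The RESULTS summary in the abstract describes Bayesian estimation of allele frequencies with priors derived empirically by an iterative procedure. The record gives no further detail, so the following is only a generic illustration of that idea and not the SAVI implementation. It assumes a binomial read-count model, a discrete grid of candidate frequencies, and a simple update in which the prior is repeatedly replaced by the average of the per-site posteriors. All function names, the grid resolution, and the toy read counts are hypothetical.

```python
import math


def binom_likelihood(k, n, f):
    """Probability of seeing k variant reads among n reads at true frequency f."""
    return math.comb(n, k) * (f ** k) * ((1.0 - f) ** (n - k))


def posterior(k, n, grid, prior):
    """Posterior over the frequency grid for one site, by Bayes' rule."""
    weights = [p * binom_likelihood(k, n, f) for f, p in zip(grid, prior)]
    total = sum(weights)
    return [w / total for w in weights]


def empirical_prior(counts, grid, iterations=50):
    """Iteratively re-estimate the prior from the data (assumed update rule).

    Starts from a uniform prior and, at each step, replaces the prior with
    the average of the posteriors of all sites.
    """
    prior = [1.0 / len(grid)] * len(grid)
    for _ in range(iterations):
        posts = [posterior(k, n, grid, prior) for k, n in counts]
        prior = [sum(column) / len(posts) for column in zip(*posts)]
    return prior


def posterior_mean(k, n, grid, prior):
    """Point estimate of the allele frequency: mean of the posterior."""
    post = posterior(k, n, grid, prior)
    return sum(f * p for f, p in zip(grid, post))


# Hypothetical toy data: (variant reads, total depth) per site.
counts = [(0, 50)] * 20 + [(25, 50)] * 2 + [(1, 50)]
grid = [j / 100 for j in range(101)]  # candidate frequencies 0.00 .. 1.00

prior = empirical_prior(counts, grid)
estimate_het = posterior_mean(25, 50, grid, prior)
estimate_ref = posterior_mean(0, 50, grid, prior)
```

In this sketch the learned prior pools information across sites, so the estimate for any single site is informed by how frequencies are distributed in the whole sample. Comparing two samples, as the abstract mentions for normal/tumor pairs, would require a model of the frequency difference, which is not attempted here.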