Real-Time Definition of Non-Randomness in the Distribution of Genomic Events

Bibliographic Details
Main authors: Abel, Ulrich; Deichmann, Annette; Bartholomae, Cynthia; Schwarzwaelder, Kerstin; Glimm, Hanno; Howe, Steven; Thrasher, Adrian; Garrigue, Alexandrine; Hacein-Bey-Abina, Salima; Cavazzana-Calvo, Marina; Fischer, Alain; Jaeger, Dirk; von Kalle, Christof; Schmidt, Manfred
Format: Text
Language: English
Published: Public Library of Science, 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1892803/
https://www.ncbi.nlm.nih.gov/pubmed/17593969
http://dx.doi.org/10.1371/journal.pone.0000570
Description
Summary: Features such as mutations or structural characteristics can be non-randomly or non-uniformly distributed within a genome. Until now, statistical inference on the distribution of sequence motifs has required computer simulations. Here, we show that these analyses are possible using an analytical, mathematical approach. To assess non-randomness, our calculations require only information such as genome size, the number of (sampled) sequence motifs, and distance parameters. We have developed computer programs that evaluate our analytical formulas to determine expected values and p-values in real time. This approach permits a flexible cluster definition that can be applied to identify non-random or non-uniform sequence motif distributions most effectively. As an example, we show the effectiveness and reliability of our mathematical approach as applied to the distribution of clinical retroviral vector integration sites.