Searching for fat tails in CRISPR-Cas systems: Data analysis and mathematical modeling
Understanding CRISPR-Cas systems—the adaptive defence mechanism that about half of bacterial species and most of archaea use to neutralise viral attacks—is important for explaining the biodiversity observed in the microbial world as well as for editing animal and plant genomes effectively. The CRISP...
Main authors: | Pavlova, Yekaterina S.; Paez-Espino, David; Morozov, Andrew Yu.; Belalov, Ilya S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026048/ https://www.ncbi.nlm.nih.gov/pubmed/33770071 http://dx.doi.org/10.1371/journal.pcbi.1008841 |
_version_ | 1783675601342693376 |
---|---|
author | Pavlova, Yekaterina S.; Paez-Espino, David; Morozov, Andrew Yu.; Belalov, Ilya S. |
author_facet | Pavlova, Yekaterina S.; Paez-Espino, David; Morozov, Andrew Yu.; Belalov, Ilya S. |
author_sort | Pavlova, Yekaterina S. |
collection | PubMed |
description | Understanding CRISPR-Cas systems—the adaptive defence mechanism that about half of bacterial species and most of archaea use to neutralise viral attacks—is important for explaining the biodiversity observed in the microbial world as well as for editing animal and plant genomes effectively. The CRISPR-Cas system learns from previous viral infections and integrates small pieces from phage genomes called spacers into the microbial genome. The resulting library of spacers collected in CRISPR arrays is then compared with the DNA of potential invaders. One of the most intriguing and least well understood questions about CRISPR-Cas systems is the distribution of spacers across the microbial population. Here, using empirical data, we show that the global distribution of spacer numbers in CRISPR arrays across multiple biomes worldwide typically exhibits scale-invariant power law behaviour, and the standard deviation is greater than the sample mean. We develop a mathematical model of spacer loss and acquisition dynamics which fits observed data from almost four thousand metagenomes well. In analogy to the classical ‘rich-get-richer’ mechanism of power law emergence, the rate of spacer acquisition is proportional to the CRISPR array size, which allows a small proportion of CRISPRs within the population to possess a significant number of spacers. Our study provides an alternative explanation for the rarity of all-resistant super microbes in nature and why proliferation of phages can be highly successful despite the effectiveness of CRISPR-Cas systems. |
format | Online Article Text |
id | pubmed-8026048 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8026048 2021-04-15 Searching for fat tails in CRISPR-Cas systems: Data analysis and mathematical modeling Pavlova, Yekaterina S.; Paez-Espino, David; Morozov, Andrew Yu.; Belalov, Ilya S. PLoS Comput Biol Research Article Public Library of Science 2021-03-26 /pmc/articles/PMC8026048/ /pubmed/33770071 http://dx.doi.org/10.1371/journal.pcbi.1008841 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication. |
title | Searching for fat tails in CRISPR-Cas systems: Data analysis and mathematical modeling |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026048/ https://www.ncbi.nlm.nih.gov/pubmed/33770071 http://dx.doi.org/10.1371/journal.pcbi.1008841 |
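
The description above attributes the heavy tail to a 'rich-get-richer' mechanism in which the rate of spacer acquisition is proportional to CRISPR array size. The following is a minimal toy sketch of that idea only, not the authors' model: it uses a Simon-type preferential attachment process, ignores spacer loss entirely, and the function name `simulate_array_sizes` and the parameters `steps`, `p_new`, and `seed` are hypothetical choices made for illustration. It shows how size-proportional acquisition alone can produce a distribution whose standard deviation exceeds its mean.

```python
import random
import statistics


def simulate_array_sizes(steps=20000, p_new=0.1, seed=0):
    """Toy rich-get-richer process (assumption: no spacer loss).

    At each step one spacer is added. With probability p_new it founds a
    new array of size 1; otherwise it joins an existing array chosen with
    probability proportional to that array's current size.
    """
    rng = random.Random(seed)
    sizes = [1]   # sizes[i] = number of spacers in array i
    owners = [0]  # owners[k] = index of the array that holds spacer k
    for _ in range(steps):
        if rng.random() < p_new:
            sizes.append(1)
            owners.append(len(sizes) - 1)
        else:
            # Picking a uniformly random spacer selects its array with
            # probability proportional to the array's size.
            i = rng.choice(owners)
            sizes[i] += 1
            owners.append(i)
    return sizes


sizes = simulate_array_sizes()
mean = statistics.mean(sizes)
std = statistics.pstdev(sizes)
print(f"arrays={len(sizes)} mean={mean:.2f} std={std:.2f} max={max(sizes)}")
```

Because a few early arrays accumulate a large share of all spacers, the spread of sizes is dominated by those outliers. Adding a loss term, as the description says the published model does, would change the shape of the tail and is deliberately left out here.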