Cargando…

Comprehensive detection of structural variation and transposable element differences between wild type laboratory lineages of C. elegans

Genomic structural variations (SVs) and transposable elements (TEs) can be significant contributors to genome evolution, altered gene expression, and risk of genetic diseases. Recent advancements in long-read sequencing have greatly improved the quality of de novo genome assemblies and enhanced the...

Descripción completa

Detalles Bibliográficos
Autores principales: Bush, Zachary D., Naftaly, Alice F. S., Dinwiddie, Devin, Albers, Cora, Hillers, Kenneth J., Libuda, Diana E.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Cold Spring Harbor Laboratory 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10634987/
https://www.ncbi.nlm.nih.gov/pubmed/37961628
http://dx.doi.org/10.1101/2023.01.13.523974
Descripción
Sumario:Genomic structural variations (SVs) and transposable elements (TEs) can be significant contributors to genome evolution, altered gene expression, and risk of genetic diseases. Recent advancements in long-read sequencing have greatly improved the quality of de novo genome assemblies and enhanced the detection of sequence variants at the scale of hundreds or thousands of bases. Comparisons between two diverged wild isolates of Caenorhabditis elegans, the Bristol and Hawaiian strains, have been widely utilized in the analysis of small genetic variations. Genetic drift, including SVs and rearrangements of repeated sequences such as TEs, can occur over time from long-term maintenance of wild type isolates within the laboratory. To comprehensively detect both large and small structural variations as well as TEs due to genetic drift, we generated de novo genome assemblies and annotations for each strain from our lab collection using both long- and short-read sequencing and compared our assemblies and annotations with that of other lab wild type strains. Within our lab assemblies, we annotate over 3.1Mb of sequence divergence between the Bristol and Hawaiian isolates: 337,584 SNPs, 94,503 small insertion-deletions (<50bp), and 4,334 structural variations (>50bp). Further, we define the location and movement of specific DNA TEs between N2 Bristol and CB4856 Hawaiian wild type isolates. Specifically, we find the N2 Bristol genome has 20.6% more TEs from the Tc1/mariner family than the CB4856 Hawaiian genome. Moreover, we identified Zator elements as the most abundant and mobile TE family in the genome. Using specific TE sequences with unique SNPs, we also identify 38 TEs that moved intrachromosomally and 9 TEs that moved interchromosomally between the N2 Bristol and CB4856 Hawaiian genomes. By comparing the de novo genome assembly of our lab collection Bristol isolate to the VC2010 Bristol assembly, we also reveal that lab lineages display over 2 Mb of total variation: 1,162 SNPs, 1,528 indels, and 897 SVs with 95% of the variation due to SVs. Overall, our work demonstrates the unique contribution of SVs and TEs to variation and genetic drift between wild type laboratory strains assumed to be isogenic despite growing evidence of genetic drift and phenotypic variation.