Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains

BACKGROUND: Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These l...

Bibliographic Details
Main authors: Nelson, William; Luo, Meizhong; Ma, Jianxin; Estep, Matt; Estill, James; He, Ruifeng; Talag, Jayson; Sisneros, Nicholas; Kudrna, David; Kim, HyeRan; Ammiraju, Jetty SS; Collura, Kristi; Bharti, Arvind K; Messing, Joachim; Wing, Rod A; SanMiguel, Phillip; Bennetzen, Jeffrey L; Soderlund, Carol
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628917/
https://www.ncbi.nlm.nih.gov/pubmed/19099592
http://dx.doi.org/10.1186/1471-2164-9-621
_version_ 1782163751294730240
author Nelson, William
Luo, Meizhong
Ma, Jianxin
Estep, Matt
Estill, James
He, Ruifeng
Talag, Jayson
Sisneros, Nicholas
Kudrna, David
Kim, HyeRan
Ammiraju, Jetty SS
Collura, Kristi
Bharti, Arvind K
Messing, Joachim
Wing, Rod A
SanMiguel, Phillip
Bennetzen, Jeffrey L
Soderlund, Carol
author_sort Nelson, William
collection PubMed
description BACKGROUND: Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends. RESULTS: A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to ~11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary. CONCLUSION: MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. 
Although the types and natures of epigenetic boundaries are barely understood at this time, MSLL technology flags both approximate boundaries and methylated genes that deserve additional investigation. MSLL and HMPR sequences provide a valuable resource for maize genome annotation, and are a uniquely valuable complement to any plant genome sequencing project. In order to make these results fully accessible to the community, a web display was developed that shows the alignment of MSLL, HMPR, and other gene-rich sequences to the BACs; this display is continually updated with the latest ESTs and BAC sequences.
format Text
id pubmed-2628917
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2628917 2009-01-21 Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains BMC Genomics Research Article BioMed Central 2008-12-19 /pmc/articles/PMC2628917/ /pubmed/19099592 http://dx.doi.org/10.1186/1471-2164-9-621 Text en Copyright © 2008 Nelson et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628917/
https://www.ncbi.nlm.nih.gov/pubmed/19099592
http://dx.doi.org/10.1186/1471-2164-9-621