High-throughput whole-genome sequencing of E14 mouse embryonic stem cells

Mouse E14 embryonic stem cells (ESCs) are the most widely used ESC line and are often employed in genome-wide studies involving next-generation sequencing analysis [1], [2], [3], [4], [5]. More than 2 × 10^9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our...

Bibliographic Details
Main Authors: Incarnato, Danny; Neri, Francesco
Format: Online Article Text
Language: English
Published: Elsevier 2014
Subjects: Data in Brief
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4535964/
https://www.ncbi.nlm.nih.gov/pubmed/26484140
http://dx.doi.org/10.1016/j.gdata.2014.10.023
_version_ 1782385679090581504
author Incarnato, Danny
Neri, Francesco
author_facet Incarnato, Danny
Neri, Francesco
author_sort Incarnato, Danny
collection PubMed
description Mouse E14 embryonic stem cells (ESCs) are the most widely used ESC line and are often employed in genome-wide studies involving next-generation sequencing analysis [1], [2], [3], [4], [5]. More than 2 × 10^9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 × 10^6 single nucleotide variants [6]. The database was validated against two other sequencing datasets generated outside our laboratory, and a high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. Data were deposited in GEO datasets under reference GSM1283021 and here: http://epigenetics.hugef-research.org/data.php.
format Online
Article
Text
id pubmed-4535964
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-4535964 2015-10-19 High-throughput whole-genome sequencing of E14 mouse embryonic stem cells Incarnato, Danny Neri, Francesco Genom Data Data in Brief Mouse E14 embryonic stem cells (ESCs) are the most widely used ESC line and are often employed in genome-wide studies involving next-generation sequencing analysis [1], [2], [3], [4], [5]. More than 2 × 10^9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 × 10^6 single nucleotide variants [6]. The database was validated against two other sequencing datasets generated outside our laboratory, and a high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. Data were deposited in GEO datasets under reference GSM1283021 and here: http://epigenetics.hugef-research.org/data.php. Elsevier 2014-11-07 /pmc/articles/PMC4535964/ /pubmed/26484140 http://dx.doi.org/10.1016/j.gdata.2014.10.023 Text en © 2014 The Authors http://creativecommons.org/licenses/by-nc-nd/3.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
spellingShingle Data in Brief
Incarnato, Danny
Neri, Francesco
High-throughput whole-genome sequencing of E14 mouse embryonic stem cells
title High-throughput whole-genome sequencing of E14 mouse embryonic stem cells
title_full High-throughput whole-genome sequencing of E14 mouse embryonic stem cells
title_fullStr High-throughput whole-genome sequencing of E14 mouse embryonic stem cells
title_full_unstemmed High-throughput whole-genome sequencing of E14 mouse embryonic stem cells
title_short High-throughput whole-genome sequencing of E14 mouse embryonic stem cells
title_sort high-throughput whole-genome sequencing of e14 mouse embryonic stem cells
topic Data in Brief
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4535964/
https://www.ncbi.nlm.nih.gov/pubmed/26484140
http://dx.doi.org/10.1016/j.gdata.2014.10.023
work_keys_str_mv AT incarnatodanny highthroughputwholegenomesequencingofe14mouseembryonicstemcells
AT nerifrancesco highthroughputwholegenomesequencingofe14mouseembryonicstemcells