Tools and best practices for retrotransposon analysis using high-throughput sequencing data
BACKGROUND: Sequencing technologies give access to a precise picture of the molecular mechanisms acting upon genome regulation. One of the biggest technical challenges with sequencing data is to map millions of reads to a reference genome. This problem is exacerbated when dealing with repetitive seq...
Main Authors: | Teissandier, Aurélie; Servant, Nicolas; Barillot, Emmanuel; Bourc’his, Deborah |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6935493/ https://www.ncbi.nlm.nih.gov/pubmed/31890048 http://dx.doi.org/10.1186/s13100-019-0192-1 |
_version_ | 1783483586712698880 |
---|---|
author | Teissandier, Aurélie Servant, Nicolas Barillot, Emmanuel Bourc’his, Deborah |
author_facet | Teissandier, Aurélie Servant, Nicolas Barillot, Emmanuel Bourc’his, Deborah |
author_sort | Teissandier, Aurélie |
collection | PubMed |
description | BACKGROUND: Sequencing technologies give access to a precise picture of the molecular mechanisms acting upon genome regulation. One of the biggest technical challenges with sequencing data is to map millions of reads to a reference genome. This problem is exacerbated when dealing with repetitive sequences such as transposable elements that occupy half of the mammalian genome mass. Sequenced reads coming from these regions introduce ambiguities in the mapping step. Therefore, applying dedicated parameters and algorithms has to be taken into consideration when transposable elements regulation is investigated with sequencing datasets. RESULTS: Here, we used simulated reads on the mouse and human genomes to define the best parameters for aligning transposable element-derived reads on a reference genome. The efficiency of the most commonly used aligners was compared and we further evaluated how transposable element representation should be estimated using available methods. The mappability of the different transposon families in the mouse and the human genomes was calculated giving an overview into their evolution. CONCLUSIONS: Based on simulated data, we provided recommendations on the alignment and the quantification steps to be performed when transposon expression or regulation is studied, and identified the limits in detecting specific young transposon families of the mouse and human genomes. These principles may help the community to adopt standard procedures and raise awareness of the difficulties encountered in the study of transposable elements. |
format | Online Article Text |
id | pubmed-6935493 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-69354932019-12-30 Tools and best practices for retrotransposon analysis using high-throughput sequencing data Teissandier, Aurélie Servant, Nicolas Barillot, Emmanuel Bourc’his, Deborah Mob DNA Methodology BACKGROUND: Sequencing technologies give access to a precise picture of the molecular mechanisms acting upon genome regulation. One of the biggest technical challenges with sequencing data is to map millions of reads to a reference genome. This problem is exacerbated when dealing with repetitive sequences such as transposable elements that occupy half of the mammalian genome mass. Sequenced reads coming from these regions introduce ambiguities in the mapping step. Therefore, applying dedicated parameters and algorithms has to be taken into consideration when transposable elements regulation is investigated with sequencing datasets. RESULTS: Here, we used simulated reads on the mouse and human genomes to define the best parameters for aligning transposable element-derived reads on a reference genome. The efficiency of the most commonly used aligners was compared and we further evaluated how transposable element representation should be estimated using available methods. The mappability of the different transposon families in the mouse and the human genomes was calculated giving an overview into their evolution. CONCLUSIONS: Based on simulated data, we provided recommendations on the alignment and the quantification steps to be performed when transposon expression or regulation is studied, and identified the limits in detecting specific young transposon families of the mouse and human genomes. These principles may help the community to adopt standard procedures and raise awareness of the difficulties encountered in the study of transposable elements. BioMed Central 2019-12-29 /pmc/articles/PMC6935493/ /pubmed/31890048 http://dx.doi.org/10.1186/s13100-019-0192-1 Text en © The Author(s). 
2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Teissandier, Aurélie Servant, Nicolas Barillot, Emmanuel Bourc’his, Deborah Tools and best practices for retrotransposon analysis using high-throughput sequencing data |
title | Tools and best practices for retrotransposon analysis using high-throughput sequencing data |
title_full | Tools and best practices for retrotransposon analysis using high-throughput sequencing data |
title_fullStr | Tools and best practices for retrotransposon analysis using high-throughput sequencing data |
title_full_unstemmed | Tools and best practices for retrotransposon analysis using high-throughput sequencing data |
title_short | Tools and best practices for retrotransposon analysis using high-throughput sequencing data |
title_sort | tools and best practices for retrotransposon analysis using high-throughput sequencing data |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6935493/ https://www.ncbi.nlm.nih.gov/pubmed/31890048 http://dx.doi.org/10.1186/s13100-019-0192-1 |
work_keys_str_mv | AT teissandieraurelie toolsandbestpracticesforretrotransposonanalysisusinghighthroughputsequencingdata AT servantnicolas toolsandbestpracticesforretrotransposonanalysisusinghighthroughputsequencingdata AT barillotemmanuel toolsandbestpracticesforretrotransposonanalysisusinghighthroughputsequencingdata AT bourchisdeborah toolsandbestpracticesforretrotransposonanalysisusinghighthroughputsequencingdata |