
The Challenge of Small-Scale Repeats for Indel Discovery

Repetitive sequences are abundant in the human genome. Different classes of repetitive DNA sequences, including simple repeats, tandem repeats, segmental duplications, interspersed repeats, and other elements, collectively span more than 50% of the genome. Because repeat sequences occur in the genom...


Bibliographic Details
Main Authors: Narzisi, Giuseppe, Schatz, Michael C.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4306302/
https://www.ncbi.nlm.nih.gov/pubmed/25674564
http://dx.doi.org/10.3389/fbioe.2015.00008
_version_ 1782354308001431552
author Narzisi, Giuseppe
Schatz, Michael C.
author_facet Narzisi, Giuseppe
Schatz, Michael C.
author_sort Narzisi, Giuseppe
collection PubMed
description Repetitive sequences are abundant in the human genome. Different classes of repetitive DNA sequences, including simple repeats, tandem repeats, segmental duplications, interspersed repeats, and other elements, collectively span more than 50% of the genome. Because repeat sequences occur in the genome at different scales, they can cause various types of sequence analysis errors, including in alignment, de novo assembly, and annotation, among others. This mini-review highlights the challenges introduced by small-scale repeat sequences, especially near-identical tandem or closely located repeats and short tandem repeats, for discovering DNA insertion and deletion (indel) mutations from next-generation sequencing data. We also discuss the de Bruijn graph sequence assembly paradigm that is emerging as the most popular and promising approach for detecting indels. The human exome is taken as an example to highlight how these repetitive elements can obscure mutations of this type or introduce errors when detecting them.
format Online
Article
Text
id pubmed-4306302
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4306302 2015-02-11 The Challenge of Small-Scale Repeats for Indel Discovery Narzisi, Giuseppe Schatz, Michael C. Front Bioeng Biotechnol Bioengineering and Biotechnology Frontiers Media S.A. 2015-01-26 /pmc/articles/PMC4306302/ /pubmed/25674564 http://dx.doi.org/10.3389/fbioe.2015.00008 Text en Copyright © 2015 Narzisi and Schatz. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title The Challenge of Small-Scale Repeats for Indel Discovery
topic Bioengineering and Biotechnology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4306302/
https://www.ncbi.nlm.nih.gov/pubmed/25674564
http://dx.doi.org/10.3389/fbioe.2015.00008