The Challenge of Small-Scale Repeats for Indel Discovery
Repetitive sequences are abundant in the human genome. Different classes of repetitive DNA sequences, including simple repeats, tandem repeats, segmental duplications, interspersed repeats, and other elements, collectively span more than 50% of the genome. Because repeat sequences occur in the genom...
Main Authors: | Narzisi, Giuseppe; Schatz, Michael C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2015 |
Subjects: | Bioengineering and Biotechnology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4306302/ https://www.ncbi.nlm.nih.gov/pubmed/25674564 http://dx.doi.org/10.3389/fbioe.2015.00008 |
_version_ | 1782354308001431552 |
---|---|
author | Narzisi, Giuseppe Schatz, Michael C. |
author_facet | Narzisi, Giuseppe Schatz, Michael C. |
author_sort | Narzisi, Giuseppe |
collection | PubMed |
description | Repetitive sequences are abundant in the human genome. Different classes of repetitive DNA sequences, including simple repeats, tandem repeats, segmental duplications, interspersed repeats, and other elements, collectively span more than 50% of the genome. Because repeat sequences occur in the genome at different scales they can cause various types of sequence analysis errors, including in alignment, de novo assembly, and annotation, among others. This mini-review highlights the challenges introduced by small-scale repeat sequences, especially near-identical tandem or closely located repeats and short tandem repeats, for discovering DNA insertion and deletion (indel) mutations from next-generation sequencing data. We also discuss the de Bruijn graph sequence assembly paradigm that is emerging as the most popular and promising approach for detecting indels. The human exome is taken as an example and highlights how these repetitive elements can obscure or introduce errors while detecting these types of mutations. |
format | Online Article Text |
id | pubmed-4306302 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-43063022015-02-11 The Challenge of Small-Scale Repeats for Indel Discovery Narzisi, Giuseppe Schatz, Michael C. Front Bioeng Biotechnol Bioengineering and Biotechnology Repetitive sequences are abundant in the human genome. Different classes of repetitive DNA sequences, including simple repeats, tandem repeats, segmental duplications, interspersed repeats, and other elements, collectively span more than 50% of the genome. Because repeat sequences occur in the genome at different scales they can cause various types of sequence analysis errors, including in alignment, de novo assembly, and annotation, among others. This mini-review highlights the challenges introduced by small-scale repeat sequences, especially near-identical tandem or closely located repeats and short tandem repeats, for discovering DNA insertion and deletion (indel) mutations from next-generation sequencing data. We also discuss the de Bruijn graph sequence assembly paradigm that is emerging as the most popular and promising approach for detecting indels. The human exome is taken as an example and highlights how these repetitive elements can obscure or introduce errors while detecting these types of mutations. Frontiers Media S.A. 2015-01-26 /pmc/articles/PMC4306302/ /pubmed/25674564 http://dx.doi.org/10.3389/fbioe.2015.00008 Text en Copyright © 2015 Narzisi and Schatz. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Bioengineering and Biotechnology Narzisi, Giuseppe Schatz, Michael C. The Challenge of Small-Scale Repeats for Indel Discovery |
title | The Challenge of Small-Scale Repeats for Indel Discovery |
title_full | The Challenge of Small-Scale Repeats for Indel Discovery |
title_fullStr | The Challenge of Small-Scale Repeats for Indel Discovery |
title_full_unstemmed | The Challenge of Small-Scale Repeats for Indel Discovery |
title_short | The Challenge of Small-Scale Repeats for Indel Discovery |
title_sort | challenge of small-scale repeats for indel discovery |
topic | Bioengineering and Biotechnology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4306302/ https://www.ncbi.nlm.nih.gov/pubmed/25674564 http://dx.doi.org/10.3389/fbioe.2015.00008 |