Identifying micro-inversions using high-throughput sequencing reads

BACKGROUND: The identification of inversions of DNA segments shorter than read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. It is acknowledged that MIs are an important type of genomic variation and may play roles in causing genetic disease...

Bibliographic Details
Main Authors:	He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Proceedings
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895285/ https://www.ncbi.nlm.nih.gov/pubmed/26818118 http://dx.doi.org/10.1186/s12864-015-2305-7

_version_	1782435818745364480
author	He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu
author_facet	He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu
author_sort	He, Feifei
collection	PubMed
description	BACKGROUND: The identification of inversions of DNA segments shorter than read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. It is acknowledged that MIs are an important type of genomic variation and may play roles in causing genetic disease. However, current alignment methods are generally insensitive to MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads. RESULTS: The algorithm of MID is based on a dynamic programming path-finding approach. What makes MID different from other variant detection tools is that MID can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability in low-coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp. CONCLUSIONS: To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low-coverage data, which makes it suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE). Therefore, our tool is expected to be useful for the study of MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-2305-7) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4895285
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4895285 2016-06-10 Identifying micro-inversions using high-throughput sequencing reads He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu BMC Genomics Proceedings BioMed Central 2016-01-11 /pmc/articles/PMC4895285/ /pubmed/26818118 http://dx.doi.org/10.1186/s12864-015-2305-7 Text en © He et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Proceedings He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu Identifying micro-inversions using high-throughput sequencing reads
title	Identifying micro-inversions using high-throughput sequencing reads
title_full	Identifying micro-inversions using high-throughput sequencing reads
title_fullStr	Identifying micro-inversions using high-throughput sequencing reads
title_full_unstemmed	Identifying micro-inversions using high-throughput sequencing reads
title_short	Identifying micro-inversions using high-throughput sequencing reads
title_sort	identifying micro-inversions using high-throughput sequencing reads
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4895285/ https://www.ncbi.nlm.nih.gov/pubmed/26818118 http://dx.doi.org/10.1186/s12864-015-2305-7
work_keys_str_mv	AT hefeifei identifyingmicroinversionsusinghighthroughputsequencingreads AT liyang identifyingmicroinversionsusinghighthroughputsequencingreads AT tangyuhang identifyingmicroinversionsusinghighthroughputsequencingreads AT majian identifyingmicroinversionsusinghighthroughputsequencingreads AT zhuhuaiqiu identifyingmicroinversionsusinghighthroughputsequencingreads
