Local Similarity Search to Find Gene Indicators in Mitochondrial Genomes

Given a set of nucleotide sequences we consider the problem of identifying conserved substrings occurring in homologous genes in a large number of sequences. The problem is solved by identifying certain nodes in a suffix tree containing all substrings occurring in the given nucleotide sequences. Due...

Bibliographic Details
Main Authors: Moritz, Ruby L. V., Bernt, Matthias, Middendorf, Martin
Format: Online Article Text
Language: English
Published: MDPI 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4009762/
https://www.ncbi.nlm.nih.gov/pubmed/24833343
http://dx.doi.org/10.3390/biology3010220
_version_ 1782479799952867328
author Moritz, Ruby L. V.
Bernt, Matthias
Middendorf, Martin
collection PubMed
description Given a set of nucleotide sequences we consider the problem of identifying conserved substrings occurring in homologous genes in a large number of sequences. The problem is solved by identifying certain nodes in a suffix tree containing all substrings occurring in the given nucleotide sequences. Due to the large size of the targeted data set, our approach employs a truncated version of suffix trees. Two methods for this task are introduced: (1) The annotation guided marker detection method uses gene annotations which might contain a moderate number of errors; (2) The probability based marker detection method determines sequences that appear significantly more often than expected. The approach is successfully applied to the mitochondrial nucleotide sequences, and the corresponding annotations that are available in RefSeq for 2989 metazoan species. We demonstrate that the approach finds appropriate substrings.
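The idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a depth-limited suffix trie as a stand-in for the truncated suffix tree, and a simple i.i.d. base-composition model as a stand-in for the probability based test. All names (`Node`, `build_truncated_trie`, `conserved_substrings`, `expected_count`, `observed_count`) and the toy sequences are hypothetical.

```python
from collections import Counter


class Node:
    """One node of a depth-limited (truncated) suffix trie."""

    def __init__(self):
        self.children = {}    # nucleotide -> Node
        self.seq_ids = set()  # ids of the sequences containing this substring


def build_truncated_trie(sequences, max_depth):
    """Insert every suffix of every sequence, cut off at max_depth characters."""
    root = Node()
    for seq_id, seq in enumerate(sequences):
        for start in range(len(seq)):
            node = root
            for ch in seq[start:start + max_depth]:
                node = node.children.setdefault(ch, Node())
                node.seq_ids.add(seq_id)
    return root


def conserved_substrings(root, min_support, min_length):
    """Yield (substring, support) for nodes found in at least min_support sequences."""
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node.children.items():
            word = prefix + ch
            support = len(child.seq_ids)
            if support < min_support:
                continue  # a longer substring cannot occur in more sequences
            if len(word) >= min_length:
                yield word, support
            stack.append((child, word))


def expected_count(word, sequences):
    """Expected occurrences under an i.i.d. base-composition model (an assumption)."""
    total = sum(len(s) for s in sequences)
    freq = Counter("".join(sequences))
    p = 1.0
    for ch in word:
        p *= freq[ch] / total
    positions = sum(max(len(s) - len(word) + 1, 0) for s in sequences)
    return p * positions


def observed_count(word, sequences):
    """Count all (possibly overlapping) occurrences of word."""
    return sum(
        1
        for s in sequences
        for i in range(len(s) - len(word) + 1)
        if s.startswith(word, i)
    )


# Toy input, invented for illustration only.
seqs = ["ACGTTGCA", "TTACGTTA", "GGACGTTC"]
root = build_truncated_trie(seqs, max_depth=5)
hits = dict(conserved_substrings(root, min_support=3, min_length=5))
```

On the toy input, `hits` contains the single 5-mer shared by all three sequences. Comparing `observed_count` with `expected_count` for that substring gives a rough sense of how a "more often than expected" criterion could flag it. A real analysis would need a proper significance test and a far more memory-efficient data structure.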
format Online
Article
Text
id pubmed-4009762
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4009762 2014-05-07 Local Similarity Search to Find Gene Indicators in Mitochondrial Genomes Moritz, Ruby L. V. Bernt, Matthias Middendorf, Martin Biology (Basel) Article MDPI 2014-03-11 /pmc/articles/PMC4009762/ /pubmed/24833343 http://dx.doi.org/10.3390/biology3010220 Text en © 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title Local Similarity Search to Find Gene Indicators in Mitochondrial Genomes
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4009762/
https://www.ncbi.nlm.nih.gov/pubmed/24833343
http://dx.doi.org/10.3390/biology3010220