An Automated Method To Predict Mouse Gene and Protein Sequences Using Variant Data

With recent advances in sequencing technologies, the scientific community has begun to probe the potential genetic bases behind complex phenotypes in humans and model organisms. In many cases, the genomes of genetically distinct strains of model organisms, such as the mouse (Mus musculus), have not...

Bibliographic Details
Main authors: Dornbos, Peter; Arkatkar, Anooj A.; LaPres, John J.
Format: Online Article Text
Language: English
Published: Genetics Society of America 2020
Subjects: Software and Data Resources
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7056971/
https://www.ncbi.nlm.nih.gov/pubmed/31911484
http://dx.doi.org/10.1534/g3.119.400983
author Dornbos, Peter
Arkatkar, Anooj A.
LaPres, John J.
author_facet Dornbos, Peter
Arkatkar, Anooj A.
LaPres, John J.
author_sort Dornbos, Peter
collection PubMed
description With recent advances in sequencing technologies, the scientific community has begun to probe the potential genetic bases of complex phenotypes in humans and model organisms. In many cases, however, the genomes of genetically distinct strains of model organisms, such as the mouse (Mus musculus), have not been fully sequenced. Here, we report a tool that uses single-nucleotide polymorphism (SNP) and insertion-deletion (indel) data to predict gene, mRNA, and protein sequences for up to 36 genetically distinct mouse strains. Because it automatically queries freely accessible databases through a graphical interface, the software requires no user-supplied data and little computational experience. As a proof of concept, we predicted the gene and amino acid sequences of the aryl hydrocarbon receptor (Ahr) for all inbred mouse strains for which variant data were available through the Mouse Genome Project at the time. Predicted sequences were compared with fully sequenced genomes to show that the tool is effective in predicting gene and protein sequences.
format Online
Article
Text
id pubmed-7056971
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-7056971
2020-03-12
An Automated Method To Predict Mouse Gene and Protein Sequences Using Variant Data
Dornbos, Peter
Arkatkar, Anooj A.
LaPres, John J.
G3 (Bethesda)
Software and Data Resources
Genetics Society of America 2020-01-07
/pmc/articles/PMC7056971/
/pubmed/31911484
http://dx.doi.org/10.1534/g3.119.400983
Text en
Copyright © 2020 Dornbos et al.
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An Automated Method To Predict Mouse Gene and Protein Sequences Using Variant Data
topic Software and Data Resources
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7056971/
https://www.ncbi.nlm.nih.gov/pubmed/31911484
http://dx.doi.org/10.1534/g3.119.400983
work_keys_str_mv AT dornbospeter anautomatedmethodtopredictmousegeneandproteinsequencesusingvariantdata
AT arkatkaranooja anautomatedmethodtopredictmousegeneandproteinsequencesusingvariantdata
AT lapresjohnj anautomatedmethodtopredictmousegeneandproteinsequencesusingvariantdata
AT dornbospeter automatedmethodtopredictmousegeneandproteinsequencesusingvariantdata
AT arkatkaranooja automatedmethodtopredictmousegeneandproteinsequencesusingvariantdata
AT lapresjohnj automatedmethodtopredictmousegeneandproteinsequencesusingvariantdata
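
The description in this record says the tool uses SNP and indel data to predict gene and protein sequences for individual strains. As a rough illustration of that general idea, and not the authors' implementation, the Python sketch below applies a list of variants to a reference coding sequence and translates the result. Every name here (`Variant`, `apply_variants`, `translate`) is hypothetical, and the toy sequence is invented for the example. The sketch assumes 0-based positions, VCF-like `ref`/`alt` alleles, non-overlapping variants, and a forward-strand sequence that is already spliced; a real pipeline would also need exon coordinates and strand handling to go from gene to mRNA.

```python
from dataclasses import dataclass

# Standard genetic code, built from the conventional TCAG base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record (assumed layout, not the tool's format)."""
    pos: int  # 0-based offset into the reference sequence
    ref: str  # allele in the reference strain
    alt: str  # allele in the strain being predicted


def apply_variants(reference: str, variants: list[Variant]) -> str:
    """Return the reference with every variant substituted in.

    Variants are applied right to left so that an indel never shifts the
    coordinates of a variant that has not been applied yet.
    Assumes the variants do not overlap.
    """
    seq = reference
    for v in sorted(variants, key=lambda v: v.pos, reverse=True):
        found = seq[v.pos:v.pos + len(v.ref)]
        if found != v.ref:
            raise ValueError(
                f"reference mismatch at {v.pos}: expected {v.ref!r}, found {found!r}"
            )
        seq = seq[:v.pos] + v.alt + seq[v.pos + len(v.ref):]
    return seq


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy data, invented for illustration only.
reference_cds = "ATGGCCAAATGA"
strain_variants = [
    Variant(pos=4, ref="C", alt="T"),     # SNP: GCC (Ala) becomes GTC (Val)
    Variant(pos=5, ref="C", alt="CGGA"),  # in-frame insertion of GGA (Gly)
]

strain_cds = apply_variants(reference_cds, strain_variants)
print(translate(reference_cds))  # MAK
print(strain_cds)                # ATGGTCGGAAAATGA
print(translate(strain_cds))     # MVGK
```

Applying variants from the highest position to the lowest is one simple way to keep coordinates valid when indels change the sequence length; an alternative is to walk left to right while tracking a running offset.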