SNP-PHAGE – High throughput SNP discovery pipeline

BACKGROUND: Single nucleotide polymorphisms (SNPs), as defined here, are single-base sequence changes or short insertions/deletions between or within individuals of a given species. As a result of their abundance and the availability of high-throughput analysis technologies, SNP markers have begun to re...

Bibliographic Details
Main authors:	Matukumalli, Lakshmi K; Grefenstette, John J; Hyten, David L; Choi, Ik-Young; Cregan, Perry B; Van Tassell, Curtis P
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Software
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1626092/ https://www.ncbi.nlm.nih.gov/pubmed/17059604 http://dx.doi.org/10.1186/1471-2105-7-468

author	Matukumalli, Lakshmi K; Grefenstette, John J; Hyten, David L; Choi, Ik-Young; Cregan, Perry B; Van Tassell, Curtis P
collection	PubMed
description	BACKGROUND: Single nucleotide polymorphisms (SNPs), as defined here, are single-base sequence changes or short insertions/deletions between or within individuals of a given species. As a result of their abundance and the availability of high-throughput analysis technologies, SNP markers have begun to replace traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), and simple sequence repeat (SSR, or microsatellite) markers for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined into an analysis pipeline. Results have to be stored in a relational database so that they can be interrogated through queries or used to generate data for further analyses, such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable.
RESULTS: We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied to sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. The package was developed on a UNIX/Linux platform, is written in Perl, and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided, with common queries for preliminary data analysis. A machine learning tool developed by this group to increase the efficiency of SNP discovery is integrated into the package as an optional feature. The SNP-PHAGE package is being made available open source at .
CONCLUSION: SNP-PHAGE provides a bioinformatics solution for high-throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP) submissions. SNP selection and visualization are aided by a user-friendly web interface. The tool is useful for analyzing sequence tagged sites (STSs) of genomic sequences, and the software can serve as a starting point for groups interested in developing SNP markers.
format	Text
id	pubmed-1626092
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1626092 2006-10-27 SNP-PHAGE – High throughput SNP discovery pipeline Matukumalli, Lakshmi K; Grefenstette, John J; Hyten, David L; Choi, Ik-Young; Cregan, Perry B; Van Tassell, Curtis P BMC Bioinformatics Software BioMed Central 2006-10-23 /pmc/articles/PMC1626092/ /pubmed/17059604 http://dx.doi.org/10.1186/1471-2105-7-468 Text en Copyright © 2006 Matukumalli et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	SNP-PHAGE – High throughput SNP discovery pipeline
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1626092/ https://www.ncbi.nlm.nih.gov/pubmed/17059604 http://dx.doi.org/10.1186/1471-2105-7-468
