A Strategy for Genome-Wide Identification of Gene Based Polymorphisms in Rice Reveals Non-Synonymous Variation and Functional Genotypic Markers

Bibliographic Details
Main authors: Srivastava, Subodh K., Wolinski, Pawel, Pereira, Andy
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4169549/
https://www.ncbi.nlm.nih.gov/pubmed/25237817
http://dx.doi.org/10.1371/journal.pone.0105335
Description
Summary: The genetic diversity of plants has traditionally been used to improve crops to suit human needs, and in the future it will be needed to feed a growing population and protect crops from environmental stresses and climate change. Genome-wide sequencing is now a reality and can be used to associate sequence variation with crop traits for use in high-throughput marker-based selection. This study describes a strategy that uses next generation sequencing (NGS) data from the rice genome, compares it to the high-quality reference genome, and identifies functional polymorphisms within genes that might alter gene function; these polymorphisms can be used to study correlations with traits and in genetic mapping. We analyzed the NGS data of Oryza sativa ssp. indica cv. G4, covering 241 Mb at ∼20X coverage, and compared it to the reference genome of Oryza sativa ssp. japonica to describe the genome-wide distribution of gene-based single nucleotide polymorphisms (SNPs). The analysis shows that the covered 63% of the genome contains 1.6 million SNPs, at 6.9 SNPs/Kb, along with 80,146 insertions and 92,655 deletions (INDELs) genome-wide. There are a total of 1,139,801 intergenic SNPs, 295,136 SNPs in intronic/non-coding regions, 195,098 in coding regions, 23,242 SNPs in the five-prime (5′) UTR regions and 22,686 SNPs in the three-prime (3′) UTR regions. SNP variation was found in 40,761 gene loci, which include 75,262 synonymous and 119,836 non-synonymous changes, and functional reading frame changes through 3,886 STOP-codon-inducing (isSNP) and 729 STOP-codon-preventing (psSNP) variants. There are 194 rapidly evolving SNP hotspot genes (>100 SNPs/gene), and 1,513 out of 2,458 transcription factors display 2,294 non-synonymous SNPs that can be a major source of phenotypic diversity within the species. All data are searchable at https://plantstress-pereira.uark.edu/oryza2. We envision that this strategy will be useful for the identification of genes for crop traits and molecular breeding of rice cultivars.
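
The coding-SNP categories named in the summary (synonymous, non-synonymous, isSNP, psSNP) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `classify_coding_snp`, its arguments, and the choice to report stop-codon changes as their own labels are assumptions made for illustration, and the record does not say whether the published non-synonymous count includes the isSNP and psSNP variants. The sketch assumes the standard nuclear genetic code and a codon that is already in frame on the coding strand.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon: str, pos: int, alt_base: str) -> str:
    """Classify a single-base change inside one in-frame codon.

    ref_codon: reference codon on the coding strand (hypothetical input).
    pos:       0-based position of the SNP within the codon (0, 1 or 2).
    alt_base:  alternative base observed in the resequenced cultivar.
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:pos] + alt_base.upper() + ref_codon[pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "synonymous"        # same amino acid (or stop retained)
    if alt_aa == "*":
        return "isSNP"             # change introduces a STOP codon
    if ref_aa == "*":
        return "psSNP"             # change removes a STOP codon
    return "non-synonymous"        # amino acid substitution


if __name__ == "__main__":
    # Made-up example codons, not data from the study.
    for codon, pos, alt in [("CTT", 2, "C"), ("GAA", 1, "T"),
                            ("TGG", 2, "A"), ("TAA", 0, "C")]:
        print(codon, pos, alt, classify_coding_snp(codon, pos, alt))
```

A genome-wide run would additionally need gene models to locate each SNP within a reading frame and to handle genes on the reverse strand; those steps are outside this sketch.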