Population Substructure Has Implications in Validating Next-Generation Cancer Genomics Studies with TCGA

In the era of large genetic and genomic datasets, it has become crucially important to validate results of individual studies using data from publicly available sources, such as The Cancer Genome Atlas (TCGA). However, how generalizable are results from either an independent or a large public datase...

Bibliographic Details
Main Authors: Miller, Marina D., Devor, Eric J., Salinas, Erin A., Newtson, Andreea M., Goodheart, Michael J., Leslie, Kimberly K., Gonzalez-Bosquet, Jesus
Format: Online Article Text
Language: English
Published: MDPI 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6429328/
https://www.ncbi.nlm.nih.gov/pubmed/30857229
http://dx.doi.org/10.3390/ijms20051192
author Miller, Marina D.
Devor, Eric J.
Salinas, Erin A.
Newtson, Andreea M.
Goodheart, Michael J.
Leslie, Kimberly K.
Gonzalez-Bosquet, Jesus
collection PubMed
description In the era of large genetic and genomic datasets, it has become crucially important to validate results of individual studies using data from publicly available sources, such as The Cancer Genome Atlas (TCGA). However, how generalizable are results from either an independent or a large public dataset to the remainder of the population? The study presented here aims to answer that question. Utilizing next generation sequencing data from endometrial and ovarian cancer patients from both the University of Iowa and TCGA, genomic admixture of each population was analyzed using STRUCTURE and ADMIXTURE software. In our independent data set, one subpopulation was identified, whereas in TCGA 4–6 subpopulations were identified. Data presented here demonstrate how different the genetic substructures of the TCGA and University of Iowa populations are. Validation of genomic studies between two different population samples must be aware of, account for and be corrected for background genetic substructure.
format Online
Article
Text
id pubmed-6429328
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6429328 2019-04-10
Int J Mol Sci
Communication
MDPI 2019-03-08
/pmc/articles/PMC6429328/
/pubmed/30857229
http://dx.doi.org/10.3390/ijms20051192
Text en
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Population Substructure Has Implications in Validating Next-Generation Cancer Genomics Studies with TCGA
topic Communication
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6429328/
https://www.ncbi.nlm.nih.gov/pubmed/30857229
http://dx.doi.org/10.3390/ijms20051192