Evaluation of MC1R high-throughput nucleotide sequencing data generated by the 1000 Genomes Project

Bibliographic Details
Main authors: Marano, Leonardo Arduino; Marcorin, Letícia; Castelli, Erick da Cruz; Mendes-Junior, Celso Teixeira
Format: Online Article Text
Language: English
Published: Sociedade Brasileira de Genética 2017
Subjects: Genomics and Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5488459/
https://www.ncbi.nlm.nih.gov/pubmed/28486572
http://dx.doi.org/10.1590/1678-4685-GMB-2016-0180
author Marano, Leonardo Arduino
Marcorin, Letícia
Castelli, Erick da Cruz
Mendes-Junior, Celso Teixeira
collection PubMed
description The advent of next-generation sequencing allows simultaneous processing of several genomic regions/individuals, increasing the availability and accuracy of whole-genome data. However, these new approaches may present some errors and bias due to alignment, genotype calling, and imputation methods. Despite these flaws, data obtained by next-generation sequencing can be valuable for population and evolutionary studies of specific genes, such as genes related to how pigmentation evolved among populations, one of the main topics in human evolutionary biology. Melanocortin-1 receptor (MC1R) is one of the most studied genes involved in pigmentation variation. As MC1R has already been suggested to affect melanogenesis and increase risk of developing melanoma, it constitutes one of the best models to understand how natural selection acts on pigmentation. Here we employed a locally developed pipeline to obtain genotype and haplotype data for MC1R from the raw sequencing data provided by the 1000 Genomes FTP site. We also compared such genotype data to Phase 3 VCF to evaluate its quality and discover any polymorphic sites that may have been overlooked. In conclusion, either the VCF file or one of the presently described pipelines could be used to obtain reliable and accurate genotype calling from the 1000 Genomes Phase 3 data.
format Online
Article
Text
id pubmed-5488459
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Sociedade Brasileira de Genética
record_format MEDLINE/PubMed
spelling pubmed-5488459 2017-07-11 Genet Mol Biol Genomics and Bioinformatics Sociedade Brasileira de Genética 2017-05-08 2017 /pmc/articles/PMC5488459/ /pubmed/28486572 http://dx.doi.org/10.1590/1678-4685-GMB-2016-0180 Text en Copyright © 2017, Sociedade Brasileira de Genética.
http://creativecommons.org/licenses/by/4.0/ License information: This is an open-access article distributed under the terms of the Creative Commons Attribution License (type CC-BY), which permits unrestricted use, distribution and reproduction in any medium, provided the original article is properly cited.
title Evaluation of MC1R high-throughput nucleotide sequencing data generated by the 1000 Genomes Project
topic Genomics and Bioinformatics