Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool

Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atheros...

Bibliographic Details
Main authors: Robiou-du-Pont, Sébastien, Li, Aihua, Christie, Shanice, Sohani, Zahra N., Meyre, David
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351168/
https://www.ncbi.nlm.nih.gov/pubmed/25742008
http://dx.doi.org/10.1371/journal.pone.0118925
_version_ 1782360296471396352
author Robiou-du-Pont, Sébastien
Li, Aihua
Christie, Shanice
Sohani, Zahra N.
Meyre, David
author_facet Robiou-du-Pont, Sébastien
Li, Aihua
Christie, Shanice
Sohani, Zahra N.
Meyre, David
author_sort Robiou-du-Pont, Sébastien
collection PubMed
description Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation.
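The HapMap 3 figures quoted in the description can be checked with a short calculation. The sketch below is illustrative only and is not the authors' code: it assumes the 2x2 agreement table implied by the abstract (415 SNPs in total, 201 reported missing by SNAP, 152 of those present in the Illumina file, no false positives), and the function and variable names are hypothetical. Under these assumptions, the quoted 36.6% corresponds to 152 out of all 415 SNPs, and Cohen's kappa falls in the range conventionally labelled "fair".

```python
def cohen_kappa(both_present, snap_only_missing, snap_only_present, both_missing):
    """Cohen's kappa for a 2x2 table comparing SNAP with the Illumina file.

    both_present      : SNAP says present, file says present
    snap_only_missing : SNAP says missing, file says present (false negative)
    snap_only_present : SNAP says present, file says missing (false positive)
    both_missing      : SNAP says missing, file says missing
    """
    n = both_present + snap_only_missing + snap_only_present + both_missing
    observed = (both_present + both_missing) / n

    # Marginal proportions of "present" calls for each method.
    snap_present = (both_present + snap_only_present) / n
    file_present = (both_present + snap_only_missing) / n

    # Agreement expected by chance if the two methods were independent.
    expected = (snap_present * file_present
                + (1 - snap_present) * (1 - file_present))
    return (observed - expected) / (1 - expected)


# Counts derived from the abstract (HapMap 3 release 2 reference); assumed, not
# taken from the paper's tables.
TOTAL = 415
SNAP_MISSING = 201
FALSE_NEGATIVES = 152                            # reported missing, actually on the chip
BOTH_MISSING = SNAP_MISSING - FALSE_NEGATIVES    # 49
BOTH_PRESENT = TOTAL - SNAP_MISSING              # 214, since no false positives

share_of_all_snps = FALSE_NEGATIVES / TOTAL      # the 36.6% quoted in the abstract
kappa = cohen_kappa(BOTH_PRESENT, FALSE_NEGATIVES, 0, BOTH_MISSING)

print(f"false negatives / all SNPs: {share_of_all_snps:.1%}")
print(f"Cohen's kappa: {kappa:.2f}")
```

Note that kappa corrects the raw agreement (263 of 415 calls) for the agreement expected by chance, which is why it is much lower than the raw percentage.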
format Online
Article
Text
id pubmed-4351168
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-43511682015-03-17 Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool Robiou-du-Pont, Sébastien Li, Aihua Christie, Shanice Sohani, Zahra N. Meyre, David PLoS One Research Article Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. 
This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. Public Library of Science 2015-03-05 /pmc/articles/PMC4351168/ /pubmed/25742008 http://dx.doi.org/10.1371/journal.pone.0118925 Text en © 2015 Robiou-du-Pont et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Robiou-du-Pont, Sébastien
Li, Aihua
Christie, Shanice
Sohani, Zahra N.
Meyre, David
Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool
title Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool
title_full Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool
title_fullStr Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool
title_full_unstemmed Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool
title_short Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool
title_sort should we have blind faith in bioinformatics software? illustrations from the snap web-based tool
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351168/
https://www.ncbi.nlm.nih.gov/pubmed/25742008
http://dx.doi.org/10.1371/journal.pone.0118925
work_keys_str_mv AT robioudupontsebastien shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool
AT liaihua shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool
AT christieshanice shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool
AT sohanizahran shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool
AT meyredavid shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool