Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool
Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atheros...
Main authors: | Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N.; Meyre, David |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351168/ https://www.ncbi.nlm.nih.gov/pubmed/25742008 http://dx.doi.org/10.1371/journal.pone.0118925 |
_version_ | 1782360296471396352 |
---|---|
author | Robiou-du-Pont, Sébastien Li, Aihua Christie, Shanice Sohani, Zahra N. Meyre, David |
author_facet | Robiou-du-Pont, Sébastien Li, Aihua Christie, Shanice Sohani, Zahra N. Meyre, David |
author_sort | Robiou-du-Pont, Sébastien |
collection | PubMed |
description | Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. |
format | Online Article Text |
id | pubmed-4351168 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4351168 2015-03-17 Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool Robiou-du-Pont, Sébastien Li, Aihua Christie, Shanice Sohani, Zahra N. Meyre, David PLoS One Research Article Public Library of Science 2015-03-05 /pmc/articles/PMC4351168/ /pubmed/25742008 http://dx.doi.org/10.1371/journal.pone.0118925 Text en © 2015 Robiou-du-Pont et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Robiou-du-Pont, Sébastien Li, Aihua Christie, Shanice Sohani, Zahra N. Meyre, David Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool |
title | Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool |
title_full | Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool |
title_fullStr | Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool |
title_full_unstemmed | Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool |
title_short | Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool |
title_sort | should we have blind faith in bioinformatics software? illustrations from the snap web-based tool |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351168/ https://www.ncbi.nlm.nih.gov/pubmed/25742008 http://dx.doi.org/10.1371/journal.pone.0118925 |
work_keys_str_mv | AT robioudupontsebastien shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool AT liaihua shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool AT christieshanice shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool AT sohanizahran shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool AT meyredavid shouldwehaveblindfaithinbioinformaticssoftwareillustrationsfromthesnapwebbasedtool |
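
The agreement figures quoted in the description can be checked with a short calculation. The sketch below is not the authors' code. It rebuilds a 2x2 table for the HapMap 3 comparison from the counts in the abstract (415 SNPs in total, 201 reported missing by SNAP, 152 of those actually present, no false positives), which implies 214 SNPs that both sources agree are on the chip and 49 that both agree are absent. The class and function names are illustrative assumptions. Note that the reported 36.6% matches 152/415, that is, false negatives as a share of all SNPs tested. The kappa value printed here is computed from these derived counts and is not a figure quoted in this record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agreement:
    """2x2 table: SNAP call versus the Illumina product file (hypothetical helper)."""
    both_present: int    # SNAP: on chip, Illumina file: on chip
    snap_only: int       # SNAP: on chip, Illumina file: absent ("false positive")
    illumina_only: int   # SNAP: missing, Illumina file: on chip ("false negative")
    both_absent: int     # SNAP: missing, Illumina file: absent

    @property
    def total(self) -> int:
        return (self.both_present + self.snap_only
                + self.illumina_only + self.both_absent)


def false_negative_share(t: Agreement) -> float:
    """False negatives as a share of all SNPs tested (matches the 36.6% in the abstract)."""
    return t.illumina_only / t.total


def cohen_kappa(t: Agreement) -> float:
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = t.total
    observed = (t.both_present + t.both_absent) / n
    snap_present = t.both_present + t.snap_only
    illumina_present = t.both_present + t.illumina_only
    expected = (snap_present * illumina_present
                + (n - snap_present) * (n - illumina_present)) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Counts derived from the abstract for the HapMap 3 release 2 reference.
hapmap3 = Agreement(both_present=214, snap_only=0, illumina_only=152, both_absent=49)

print(f"False-negative share: {false_negative_share(hapmap3):.1%}")  # 36.6%
print(f"Cohen's kappa: {cohen_kappa(hapmap3):.2f}")                  # 0.25
```

A kappa near 0.25 falls in the range conventionally labelled "fair", which is in line with the abstract's "fair to moderate" wording. The 1000 Genomes comparison is not reproduced because the abstract gives only its false-negative rate, not the underlying counts.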