Can Anonymous Posters on Medical Forums be Reidentified?

BACKGROUND: Participants in medical forums often reveal personal health information about themselves in their online postings. To feel comfortable revealing sensitive personal health information, some participants may hide their identity by posting anonymously. They can do this by using fake identit...

Bibliographic Details
Main Authors: Bobicev, Victoria, Sokolova, Marina, El Emam, Khaled, Jafer, Yasser, Dewar, Brian, Jonker, Elizabeth, Matwin, Stan
Format: Online Article Text
Language: English
Published: JMIR Publications Inc. 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3806358/
https://www.ncbi.nlm.nih.gov/pubmed/24091380
http://dx.doi.org/10.2196/jmir.2514
author Bobicev, Victoria
Sokolova, Marina
El Emam, Khaled
Jafer, Yasser
Dewar, Brian
Jonker, Elizabeth
Matwin, Stan
collection PubMed
description BACKGROUND: Participants in medical forums often reveal personal health information about themselves in their online postings. To feel comfortable revealing sensitive personal health information, some participants may hide their identity by posting anonymously. They can do this by using fake identities, nicknames, or pseudonyms that cannot readily be traced back to them. However, individual writing styles have unique features and it may be possible to determine the true identity of an anonymous user through author attribution analysis. Although there has been previous work on the authorship attribution problem, there has been a dearth of research on automated authorship attribution on medical forums. The focus of the paper is to demonstrate that character-based author attribution works better than word-based methods in medical forums.
OBJECTIVE: The goal was to build a system that accurately attributes authorship of messages posted on medical forums. The Authorship Attributor system uses text analysis techniques to crawl medical forums and automatically correlate messages written by the same authors. Authorship Attributor processes unstructured texts regardless of the document type, context, and content.
METHODS: The messages were labeled by nicknames of the forum participants. We evaluated the system’s performance through its accuracy on 6000 messages gathered from 2 medical forums on an in vitro fertilization (IVF) support website.
RESULTS: Given 2 lists of candidate authors (30 and 50 candidates, respectively), we obtained an F score accuracy in detecting authors of 75% to 80% on messages containing 100 to 150 words on average, and 97.9% on longer messages containing at least 300 words.
CONCLUSIONS: Authorship can be successfully detected in short free-form messages posted on medical forums. This raises a concern about the meaningfulness of anonymous posting on such medical forums. Authorship attribution tools can be used to warn consumers wishing to post anonymously about the likelihood of their identity being determined.
format Online
Article
Text
id pubmed-3806358
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher JMIR Publications Inc.
record_format MEDLINE/PubMed
spelling pubmed-3806358 2013-10-24 J Med Internet Res Original Paper JMIR Publications Inc. 2013-10-03 /pmc/articles/PMC3806358/ /pubmed/24091380 http://dx.doi.org/10.2196/jmir.2514 Text en
©Victoria Bobicev, Marina Sokolova, Khaled El Emam, Yasser Jafer, Brian Dewar, Elizabeth Jonker, Stan Matwin. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 03.10.2013. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title Can Anonymous Posters on Medical Forums be Reidentified?
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3806358/
https://www.ncbi.nlm.nih.gov/pubmed/24091380
http://dx.doi.org/10.2196/jmir.2514