Can Anonymous Posters on Medical Forums be Reidentified?
BACKGROUND: Participants in medical forums often reveal personal health information about themselves in their online postings. To feel comfortable revealing sensitive personal health information, some participants may hide their identity by posting anonymously. They can do this by using fake identit...
Main authors: | Bobicev, Victoria; Sokolova, Marina; El Emam, Khaled; Jafer, Yasser; Dewar, Brian; Jonker, Elizabeth; Matwin, Stan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications Inc. 2013 |
Subjects: | Original Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3806358/ https://www.ncbi.nlm.nih.gov/pubmed/24091380 http://dx.doi.org/10.2196/jmir.2514 |
_version_ | 1782288367350710272 |
---|---|
author | Bobicev, Victoria; Sokolova, Marina; El Emam, Khaled; Jafer, Yasser; Dewar, Brian; Jonker, Elizabeth; Matwin, Stan |
author_facet | Bobicev, Victoria; Sokolova, Marina; El Emam, Khaled; Jafer, Yasser; Dewar, Brian; Jonker, Elizabeth; Matwin, Stan |
author_sort | Bobicev, Victoria |
collection | PubMed |
description | BACKGROUND: Participants in medical forums often reveal personal health information about themselves in their online postings. To feel comfortable revealing sensitive personal health information, some participants may hide their identity by posting anonymously. They can do this by using fake identities, nicknames, or pseudonyms that cannot readily be traced back to them. However, individual writing styles have unique features and it may be possible to determine the true identity of an anonymous user through author attribution analysis. Although there has been previous work on the authorship attribution problem, there has been a dearth of research on automated authorship attribution on medical forums. The focus of the paper is to demonstrate that character-based author attribution works better than word-based methods in medical forums. OBJECTIVE: The goal was to build a system that accurately attributes authorship of messages posted on medical forums. The Authorship Attributor system uses text analysis techniques to crawl medical forums and automatically correlate messages written by the same authors. Authorship Attributor processes unstructured texts regardless of the document type, context, and content. METHODS: The messages were labeled by nicknames of the forum participants. We evaluated the system’s performance through its accuracy on 6000 messages gathered from 2 medical forums on an in vitro fertilization (IVF) support website. RESULTS: Given 2 lists of candidate authors (30 and 50 candidates, respectively), we obtained an F score accuracy in detecting authors of 75% to 80% on messages containing 100 to 150 words on average, and 97.9% on longer messages containing at least 300 words. CONCLUSIONS: Authorship can be successfully detected in short free-form messages posted on medical forums. This raises a concern about the meaningfulness of anonymous posting on such medical forums. Authorship attribution tools can be used to warn consumers wishing to post anonymously about the likelihood of their identity being determined. |
format | Online Article Text |
id | pubmed-3806358 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | JMIR Publications Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-3806358 2013-10-24 Can Anonymous Posters on Medical Forums be Reidentified? Bobicev, Victoria Sokolova, Marina El Emam, Khaled Jafer, Yasser Dewar, Brian Jonker, Elizabeth Matwin, Stan J Med Internet Res Original Paper BACKGROUND: Participants in medical forums often reveal personal health information about themselves in their online postings. To feel comfortable revealing sensitive personal health information, some participants may hide their identity by posting anonymously. They can do this by using fake identities, nicknames, or pseudonyms that cannot readily be traced back to them. However, individual writing styles have unique features and it may be possible to determine the true identity of an anonymous user through author attribution analysis. Although there has been previous work on the authorship attribution problem, there has been a dearth of research on automated authorship attribution on medical forums. The focus of the paper is to demonstrate that character-based author attribution works better than word-based methods in medical forums. OBJECTIVE: The goal was to build a system that accurately attributes authorship of messages posted on medical forums. The Authorship Attributor system uses text analysis techniques to crawl medical forums and automatically correlate messages written by the same authors. Authorship Attributor processes unstructured texts regardless of the document type, context, and content. METHODS: The messages were labeled by nicknames of the forum participants. We evaluated the system’s performance through its accuracy on 6000 messages gathered from 2 medical forums on an in vitro fertilization (IVF) support website. RESULTS: Given 2 lists of candidate authors (30 and 50 candidates, respectively), we obtained an F score accuracy in detecting authors of 75% to 80% on messages containing 100 to 150 words on average, and 97.9% on longer messages containing at least 300 words. CONCLUSIONS: Authorship can be successfully detected in short free-form messages posted on medical forums. This raises a concern about the meaningfulness of anonymous posting on such medical forums. Authorship attribution tools can be used to warn consumers wishing to post anonymously about the likelihood of their identity being determined. JMIR Publications Inc. 2013-10-03 /pmc/articles/PMC3806358/ /pubmed/24091380 http://dx.doi.org/10.2196/jmir.2514 Text en ©Victoria Bobicev, Marina Sokolova, Khaled El Emam, Yasser Jafer, Brian Dewar, Elizabeth Jonker, Stan Matwin. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 03.10.2013. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Bobicev, Victoria Sokolova, Marina El Emam, Khaled Jafer, Yasser Dewar, Brian Jonker, Elizabeth Matwin, Stan Can Anonymous Posters on Medical Forums be Reidentified? |
title | Can Anonymous Posters on Medical Forums be Reidentified? |
title_full | Can Anonymous Posters on Medical Forums be Reidentified? |
title_fullStr | Can Anonymous Posters on Medical Forums be Reidentified? |
title_full_unstemmed | Can Anonymous Posters on Medical Forums be Reidentified? |
title_short | Can Anonymous Posters on Medical Forums be Reidentified? |
title_sort | can anonymous posters on medical forums be reidentified? |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3806358/ https://www.ncbi.nlm.nih.gov/pubmed/24091380 http://dx.doi.org/10.2196/jmir.2514 |
work_keys_str_mv | AT bobicevvictoria cananonymouspostersonmedicalforumsbereidentified AT sokolovamarina cananonymouspostersonmedicalforumsbereidentified AT elemamkhaled cananonymouspostersonmedicalforumsbereidentified AT jaferyasser cananonymouspostersonmedicalforumsbereidentified AT dewarbrian cananonymouspostersonmedicalforumsbereidentified AT jonkerelizabeth cananonymouspostersonmedicalforumsbereidentified AT matwinstan cananonymouspostersonmedicalforumsbereidentified |
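
Note on the abstract: the description above contrasts character-based and word-based author attribution but does not spell out either method. The following is a minimal sketch of one common character-based approach (character trigram profiles compared by cosine similarity), given only to illustrate the general idea. It is not the Authorship Attributor system from the article; the function names, nicknames, and toy messages are all hypothetical.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(labeled_messages, n=3):
    """Merge each nickname's messages into one n-gram profile."""
    profiles = {}
    for nickname, text in labeled_messages:
        profiles.setdefault(nickname, Counter()).update(char_ngrams(text, n))
    return profiles


def attribute(message, profiles, n=3):
    """Return the nickname whose profile is most similar to the message."""
    grams = char_ngrams(message, n)
    return max(profiles, key=lambda nick: cosine(grams, profiles[nick]))


# Hypothetical toy data: (nickname, message) pairs, invented for illustration.
training = [
    ("user_a", "omg!!! sooo excited, cant wait 4 the results!!!"),
    ("user_a", "sooo nervous!!! fingers crossed 4 tmrw!!!"),
    ("user_b", "I have an appointment on Monday; the clinic will confirm the protocol."),
    ("user_b", "The doctor explained the procedure; I will update everyone afterwards."),
]
profiles = build_profiles(training)
print(attribute("sooo happy!!! cant wait 4 the scan!!!", profiles))
```

Character n-grams pick up punctuation habits, spelling variants, and abbreviations that word-level tokenization tends to discard, which is one plausible reason a character-based method could do well on informal forum posts.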