A two level learning model for authorship authentication

Nowadays, forensic authorship authentication plays a vital role in identifying the number of unknown authors as a result of the world’s rapidly rising internet use. This paper presents two-level learning techniques for authorship authentication. The learning technique is supplied with linguistic kno...

Bibliographic Details
Main Authors: Taha, Ahmed, Khalil, Heba M., El-shishtawy, Tarek
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341647/
https://www.ncbi.nlm.nih.gov/pubmed/34352003
http://dx.doi.org/10.1371/journal.pone.0255661
author Taha, Ahmed
Khalil, Heba M.
El-shishtawy, Tarek
collection PubMed
description Forensic authorship authentication now plays a vital role in identifying unknown authors as a result of the world’s rapidly rising internet use. This paper presents a two-level learning technique for authorship authentication. Rather than relying on learning alone, the technique is supplied with linguistic knowledge, statistical features, and vocabulary features to enhance its efficiency. The linguistic knowledge is represented through lexical analysis features such as parts of speech. A two-level classifier is presented to capture the best predictive performance for identifying authorship. The first classifier is based on vocabulary features that detect the frequency with which each author uses certain words. Its results are fed to the second classifier, which is based on a learning technique and depends on lexical, statistical, and linguistic features. All three sets of features describe the author’s writing style in numerical form. Many new features are proposed in this work for identifying the author’s writing style. Although the proposed methodology is tested on Arabic writings, it is general and can be applied to any language. Depending on the machine learning model used, the experiment shows that the trained two-level classifier achieves an accuracy ranging from 94% to 96.16%.
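The two-level arrangement described in the abstract can be illustrated with a minimal sketch. This record does not specify the paper's actual features, models, or data, so everything below is an assumption made for illustration: the class names `VocabularyLevel` and `TwoLevelClassifier`, the smoothed word-frequency scoring at level one, the three style statistics, the nearest-centroid learner at level two, and the English toy texts (the paper itself was tested on Arabic).

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    # Lowercased whitespace tokens with surrounding punctuation stripped.
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def style_features(text):
    # Hypothetical statistical features; the paper's own feature set is not given here.
    tokens = tokenize(text)
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    n = max(len(tokens), 1)
    return [
        sum(len(t) for t in tokens) / n,       # average word length
        len(tokens) / max(len(sentences), 1),  # average sentence length in words
        len(set(tokens)) / n,                  # type-token ratio
    ]

class VocabularyLevel:
    """Level 1 (assumed form): Laplace-smoothed word frequencies per author."""

    def fit(self, texts, authors):
        self.counts = {}
        for text, author in zip(texts, authors):
            self.counts.setdefault(author, Counter()).update(tokenize(text))
        self.authors = sorted(self.counts)
        self.vocab = set().union(*self.counts.values())
        return self

    def scores(self, text):
        # Mean log-probability of the text's words under each author's frequencies.
        tokens = tokenize(text)
        result = []
        for author in self.authors:
            counts = self.counts[author]
            total = sum(counts.values()) + len(self.vocab) + 1
            logp = sum(math.log((counts[t] + 1) / total) for t in tokens)
            result.append(logp / max(len(tokens), 1))
        return result

class TwoLevelClassifier:
    """Level 2 (assumed form): nearest centroid over level-1 scores plus style features."""

    def fit(self, texts, authors):
        self.level1 = VocabularyLevel().fit(texts, authors)
        sums, seen = {}, Counter()
        for text, author in zip(texts, authors):
            vec = self._features(text)
            acc = sums.setdefault(author, [0.0] * len(vec))
            for i, value in enumerate(vec):
                acc[i] += value
            seen[author] += 1
        self.centroids = {a: [v / seen[a] for v in s] for a, s in sums.items()}
        return self

    def _features(self, text):
        # Level-1 output is fed forward and joined with the style features.
        return self.level1.scores(text) + style_features(text)

    def predict(self, text):
        vec = self._features(text)
        return min(self.centroids, key=lambda a: math.dist(vec, self.centroids[a]))

# Toy training data, invented for this sketch.
train_texts = [
    "The cat sat. The cat ran. The cat slept.",
    "A cat sat here. The cat ate.",
    "Nevertheless, considerable methodological sophistication characterizes contemporary computational investigations.",
    "Furthermore, comprehensive statistical evaluation demonstrates considerable methodological improvements.",
]
train_authors = ["author_a", "author_a", "author_b", "author_b"]

clf = TwoLevelClassifier().fit(train_texts, train_authors)
print(clf.predict("The cat sat. The cat ate."))
```

The sketch scores the training texts with a level-one model fitted on those same texts, which is acceptable for a toy example but leaks information; a more careful stacked setup would compute the level-one scores out of fold. The features are also left unscaled, so the larger-valued ones dominate the distance.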
format Online
Article
Text
id pubmed-8341647
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8341647 2021-08-06 A two level learning model for authorship authentication Taha, Ahmed Khalil, Heba M. El-shishtawy, Tarek PLoS One Research Article
Public Library of Science 2021-08-05 /pmc/articles/PMC8341647/ /pubmed/34352003 http://dx.doi.org/10.1371/journal.pone.0255661 Text en © 2021 Taha et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A two level learning model for authorship authentication
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341647/
https://www.ncbi.nlm.nih.gov/pubmed/34352003
http://dx.doi.org/10.1371/journal.pone.0255661