
Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study

Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT...

Bibliographic Details
Main Authors: Graham, Shawn; Yates, Donna; El-Roby, Ahmed
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445804/
https://www.ncbi.nlm.nih.gov/pubmed/37645491
http://dx.doi.org/10.12688/openreseurope.16003.1
author Graham, Shawn
Yates, Donna
El-Roby, Ahmed
collection PubMed
description Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT-3, to semi-automate the creation of a knowledge graph of a body of scholarship concerning the antiquities trade. Methods: We give GPT-3 a prompt guiding it to identify knowledge statements around the trade. Given GPT-3’s understanding of the statistical properties of language, our prompt teaches GPT-3 to append text to each article we feed it where the appended text summarizes the knowledge in the article. The summary is in the form of a list of subject, predicate, and object relationships, representing a knowledge graph. Previously we created such lists by manually annotating the source articles. We compare the result of this automatic process with a knowledge graph created from the same sources via hand. When such knowledge graphs are projected into a multi-dimensional embedding model using a neural network (via the Ampligraph open-source Python library), the relative positioning of entities implies the probability of a connection; the direction of the positioning implies the kind of connection. Thus, we can interrogate the embedding model to discover new probable relationships. The results can generate new insight about the antiquity trade, suggesting possible avenues of research. Results: We find that our semi-automatic approach to generating the knowledge graph in the first place produces comparable results to our hand-made version, but at an enormous savings of time and a possible expansion of the amount of materials we can consider. Conclusions: These results have implications for working with other kinds of archaeological knowledge in grey literature, reports, articles, and other venues via computational means.
format Online
Article
Text
id pubmed-10445804
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-10445804 2023-08-29 Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study Graham, Shawn Yates, Donna El-Roby, Ahmed Open Res Eur Research Article F1000 Research Limited 2023-06-20 /pmc/articles/PMC10445804/ /pubmed/37645491 http://dx.doi.org/10.12688/openreseurope.16003.1 Text en Copyright: © 2023 Graham S et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study
topic Research Article
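
The description field above outlines a pipeline: a language model appends subject, predicate, object statements to each article, and the resulting graph is embedded so that the relative positions of entities suggest probable new connections. The following is a minimal sketch of that idea, not the authors' code. The `subject | predicate | object` line format, the function names, the placeholder entities, and the hand-set vectors are all assumptions made for illustration. The score used here is a TransE-style distance (head plus relation compared with tail), chosen because it matches the "position and direction" wording; the record says only that the Ampligraph library was used and does not state which embedding model was trained.

```python
import math


def parse_triples(appended_text):
    """Parse lines of the assumed form 'subject | predicate | object' into tuples."""
    triples = []
    for line in appended_text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        # Keep only lines with exactly three non-empty fields.
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples


def transe_score(head_vec, relation_vec, tail_vec):
    """TransE-style plausibility: negative distance between head + relation and tail."""
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head_vec, relation_vec, tail_vec))
    )


def rank_unseen_tails(head, relation, entity_vecs, relation_vecs, known_triples):
    """Rank entities not already linked to `head` by `relation`, most plausible first."""
    known = set(known_triples)
    candidates = [
        (tail, transe_score(entity_vecs[head], relation_vecs[relation], vec))
        for tail, vec in entity_vecs.items()
        if tail != head and (head, relation, tail) not in known
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


# Hypothetical model output appended to an article (placeholder entities only).
sample = """dealer_a | sold_to | collector_b
collector_b | donated_to | museum_c
this line is not a triple"""
triples = parse_triples(sample)

# Toy 2-D embeddings set by hand; a real model would learn these from the triples.
entity_vecs = {
    "dealer_a": (0.0, 0.0),
    "collector_b": (1.0, 0.0),
    "museum_c": (1.0, 1.0),
    "collector_d": (0.9, 0.1),
}
relation_vecs = {"sold_to": (1.0, 0.0), "donated_to": (0.0, 1.0)}

ranking = rank_unseen_tails("dealer_a", "sold_to", entity_vecs, relation_vecs, triples)
for tail, score in ranking:
    print(f"dealer_a sold_to {tail}: {score:.3f}")
```

In this toy setup the unseen entity closest to `dealer_a + sold_to` ranks first, which is the sense in which an embedding can be "interrogated" for candidate relationships. Any such candidate is a lead for further research, not a finding.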