Cargando…
Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study
Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
F1000 Research Limited
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445804/ https://www.ncbi.nlm.nih.gov/pubmed/37645491 http://dx.doi.org/10.12688/openreseurope.16003.1 |
_version_ | 1785094258392301568 |
---|---|
author | Graham, Shawn Yates, Donna El-Roby, Ahmed |
author_facet | Graham, Shawn Yates, Donna El-Roby, Ahmed |
author_sort | Graham, Shawn |
collection | PubMed |
description | Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT-3, to semi-automate the creation of a knowledge graph of a body of scholarship concerning the antiquities trade. Methods: We give GPT-3 a prompt guiding it to identify knowledge statements around the trade. Given GPT-3’s understanding of the statistical properties of language, our prompt teaches GPT-3 to append text to each article we feed it where the appended text summarizes the knowledge in the article. The summary is in the form of a list of subject, predicate, and object relationships, representing a knowledge graph. Previously we created such lists by manually annotating the source articles. We compare the result of this automatic process with a knowledge graph created from the same sources via hand. When such knowledge graphs are projected into a multi-dimensional embedding model using a neural network (via the Ampligraph open-source Python library), the relative positioning of entities implies the probability of a connection; the direction of the positioning implies the kind of connection. Thus, we can interrogate the embedding model to discover new probable relationships. The results can generate new insight about the antiquity trade, suggesting possible avenues of research. Results: We find that our semi-automatic approach to generating the knowledge graph in the first place produces comparable results to our hand-made version, but at an enormous savings of time and a possible expansion of the amount of materials we can consider. Conclusions: These results have implications for working with other kinds of archaeological knowledge in grey literature, reports, articles, and other venues via computational means. |
format | Online Article Text |
id | pubmed-10445804 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-104458042023-08-29 Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study Graham, Shawn Yates, Donna El-Roby, Ahmed Open Res Eur Research Article Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT-3, to semi-automate the creation of a knowledge graph of a body of scholarship concerning the antiquities trade. Methods: We give GPT-3 a prompt guiding it to identify knowledge statements around the trade. Given GPT-3’s understanding of the statistical properties of language, our prompt teaches GPT-3 to append text to each article we feed it where the appended text summarizes the knowledge in the article. The summary is in the form of a list of subject, predicate, and object relationships, representing a knowledge graph. Previously we created such lists by manually annotating the source articles. We compare the result of this automatic process with a knowledge graph created from the same sources via hand. When such knowledge graphs are projected into a multi-dimensional embedding model using a neural network (via the Ampligraph open-source Python library), the relative positioning of entities implies the probability of a connection; the direction of the positioning implies the kind of connection. Thus, we can interrogate the embedding model to discover new probable relationships. The results can generate new insight about the antiquity trade, suggesting possible avenues of research. Results: We find that our semi-automatic approach to generating the knowledge graph in the first place produces comparable results to our hand-made version, but at an enormous savings of time and a possible expansion of the amount of materials we can consider. Conclusions: These results have implications for working with other kinds of archaeological knowledge in grey literature, reports, articles, and other venues via computational means. F1000 Research Limited 2023-06-20 /pmc/articles/PMC10445804/ /pubmed/37645491 http://dx.doi.org/10.12688/openreseurope.16003.1 Text en Copyright: © 2023 Graham S et al. https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Graham, Shawn Yates, Donna El-Roby, Ahmed Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study |
title | Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study |
title_full | Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study |
title_fullStr | Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study |
title_full_unstemmed | Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study |
title_short | Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study |
title_sort | investigating antiquities trafficking with generative pre-trained transformer (gpt)-3 enabled knowledge graphs: a case study |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10445804/ https://www.ncbi.nlm.nih.gov/pubmed/37645491 http://dx.doi.org/10.12688/openreseurope.16003.1 |
work_keys_str_mv | AT grahamshawn investigatingantiquitiestraffickingwithgenerativepretrainedtransformergpt3enabledknowledgegraphsacasestudy AT yatesdonna investigatingantiquitiestraffickingwithgenerativepretrainedtransformergpt3enabledknowledgegraphsacasestudy AT elrobyahmed investigatingantiquitiestraffickingwithgenerativepretrainedtransformergpt3enabledknowledgegraphsacasestudy |