
Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies

The construction industry is the backbone of a nation’s economy. It is a matter of great concern that such an industry suffers from time and cost overruns, especially in these challenging times. Coupled with the overrun issues, the sector is often criticized for lacking adequate quality and quantity...


Bibliographic Details
Main Authors: Jagannathan, Murali; Roy, Debopam; Delhi, Venkata Santosh Kumar
Format: Online Article Text
Language: English
Published: Springer India 2022
Subjects: Original Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9103605/
http://dx.doi.org/10.1007/s40012-022-00355-w
author Jagannathan, Murali
Roy, Debopam
Delhi, Venkata Santosh Kumar
collection PubMed
description The construction industry is the backbone of a nation’s economy. It is a matter of great concern that such an industry suffers from time and cost overruns, especially in these challenging times. Coupled with the overrun issues, the sector is often criticized for lacking adequate quality and quantity of structured secondary data. The emerging technologies in data science and machine intelligence present a unique opportunity to understand the sector better and aid in effective decision-making. To better understand the utility of such technologies, the Management Discussion and Analysis sections of the annual reports of publicly listed top Indian construction contracting firms are analyzed to identify the presence of ‘strategy themes’ and further map them to the organizations considered. Natural Language Processing (NLP)-based topic modeling algorithms, namely Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), are used in this study to perform a qualitative content analysis to identify the latent themes. From a methodological standpoint, considering the context of this study, the NMF results are better in accuracy, precision, and recall compared with the LDA. The results show that while most construction contracting firms prioritized a ‘revenue-focused’ strategy to expand their order books, a smaller set of large-sized firms seem to prioritize process improvement to improve their execution productivity and therefore are ‘profit margin improvement focused’ or ‘lean-focussed’ in their approach. Although a proof-of-concept, this study unlocks the immense potential of unsupervised NLP-based topic-modeling tools to understand and infer from unstructured and freely available text data in the public domain to aid sectoral analysis and policymaking.
format Online
Article
Text
id pubmed-9103605
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer India
record_format MEDLINE/PubMed
spelling pubmed-9103605 2022-05-16 CSIT Original Research Springer India 2022-05-13 2022 /pmc/articles/PMC9103605/ http://dx.doi.org/10.1007/s40012-022-00355-w Text en © CSI Publications 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies
topic Original Research