Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies
The construction industry is the backbone of a nation’s economy. It is a matter of great concern that such an industry suffers from time and cost overruns, especially in these challenging times. Coupled with the overrun issues, the sector is often criticized for lacking adequate quality and quantity...
Main authors: | Jagannathan, Murali; Roy, Debopam; Delhi, Venkata Santosh Kumar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer India, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9103605/ http://dx.doi.org/10.1007/s40012-022-00355-w |
_version_ | 1784707595557142528 |
---|---|
author | Jagannathan, Murali Roy, Debopam Delhi, Venkata Santosh Kumar |
author_facet | Jagannathan, Murali Roy, Debopam Delhi, Venkata Santosh Kumar |
author_sort | Jagannathan, Murali |
collection | PubMed |
description | The construction industry is the backbone of a nation’s economy. It is a matter of great concern that such an industry suffers from time and cost overruns, especially in these challenging times. Coupled with the overrun issues, the sector is often criticized for lacking adequate quality and quantity of structured secondary data. The emerging technologies in data science and machine intelligence present a unique opportunity to understand the sector better and aid in effective decision-making. To better understand the utility of such technologies, the Management Discussion and Analysis sections of the annual reports of publicly listed top Indian construction contracting firms are analyzed to identify the presence of ‘strategy themes’ and further map them to the organizations considered. Natural Language Processing (NLP)-based topic modeling algorithms, namely Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), are used in this study to perform a qualitative content analysis to identify the latent themes. From a methodological standpoint, considering the context of this study, the NMF results are better in accuracy, precision, and recall compared with the LDA. The results show that while most construction contracting firms prioritized a ‘revenue-focused’ strategy to expand their order books, a smaller set of large-sized firms seem to prioritize process improvement to improve their execution productivity and therefore are ‘profit margin improvement focused’ or ‘lean-focussed’ in their approach. Although a proof-of-concept, this study unlocks the immense potential of unsupervised NLP-based topic-modeling tools to understand and infer from unstructured and freely available text data in the public domain to aid sectoral analysis and policymaking. |
format | Online Article Text |
id | pubmed-9103605 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer India |
record_format | MEDLINE/PubMed |
spelling | pubmed-9103605 2022-05-16 Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies Jagannathan, Murali Roy, Debopam Delhi, Venkata Santosh Kumar CSIT Original Research The construction industry is the backbone of a nation’s economy. It is a matter of great concern that such an industry suffers from time and cost overruns, especially in these challenging times. Coupled with the overrun issues, the sector is often criticized for lacking adequate quality and quantity of structured secondary data. The emerging technologies in data science and machine intelligence present a unique opportunity to understand the sector better and aid in effective decision-making. To better understand the utility of such technologies, the Management Discussion and Analysis sections of the annual reports of publicly listed top Indian construction contracting firms are analyzed to identify the presence of ‘strategy themes’ and further map them to the organizations considered. Natural Language Processing (NLP)-based topic modeling algorithms, namely Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), are used in this study to perform a qualitative content analysis to identify the latent themes. From a methodological standpoint, considering the context of this study, the NMF results are better in accuracy, precision, and recall compared with the LDA. The results show that while most construction contracting firms prioritized a ‘revenue-focused’ strategy to expand their order books, a smaller set of large-sized firms seem to prioritize process improvement to improve their execution productivity and therefore are ‘profit margin improvement focused’ or ‘lean-focussed’ in their approach. Although a proof-of-concept, this study unlocks the immense potential of unsupervised NLP-based topic-modeling tools to understand and infer from unstructured and freely available text data in the public domain to aid sectoral analysis and policymaking. Springer India 2022-05-13 2022 /pmc/articles/PMC9103605/ http://dx.doi.org/10.1007/s40012-022-00355-w Text en © CSI Publications 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Research Jagannathan, Murali Roy, Debopam Delhi, Venkata Santosh Kumar Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies |
title | Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies |
title_full | Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies |
title_fullStr | Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies |
title_full_unstemmed | Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies |
title_short | Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies |
title_sort | application of nlp-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9103605/ http://dx.doi.org/10.1007/s40012-022-00355-w |
work_keys_str_mv | AT jagannathanmurali applicationofnlpbasedtopicmodelingtoanalyseunstructuredtextdatainannualreportsofconstructioncontractingcompanies AT roydebopam applicationofnlpbasedtopicmodelingtoanalyseunstructuredtextdatainannualreportsofconstructioncontractingcompanies AT delhivenkatasantoshkumar applicationofnlpbasedtopicmodelingtoanalyseunstructuredtextdatainannualreportsofconstructioncontractingcompanies |
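
The description above names Non-negative Matrix Factorization (NMF) as one of the two topic-modeling algorithms used. As a rough illustration only, the following sketch factorizes a tiny document-term count matrix V into non-negative factors W (documents by topics) and H (topics by terms) using the standard multiplicative update rules. Everything in it is an assumption made for illustration: the four toy sentences, the function names, the choice of two topics, and the use of raw term counts are hypothetical and are not taken from the article, whose preprocessing, weighting, and tooling are not given in this record.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H, the quantity the updates try to reduce."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorize V ~ W H with multiplicative updates (illustrative sketch)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    # Strictly positive random initialisation keeps every entry non-negative.
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


# Hypothetical toy corpus; NOT text from any annual report.
docs = [
    "order book revenue growth order inflow",
    "revenue growth new order book expansion",
    "lean process productivity improvement execution",
    "execution productivity process improvement margin",
]
vocab = sorted({term for doc in docs for term in doc.split()})
v = [[float(doc.split().count(term)) for term in vocab] for doc in docs]

w, h = nmf(v, n_topics=2)

# Highest-weighted terms per topic, and the dominant topic of each document.
top_terms = [
    [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -h[k][j])[:3]]
    for k in range(2)
]
doc_topics = [max(range(2), key=lambda k: row[k]) for row in w]
print(top_terms, doc_topics)
```

Reading the rows of H gives the terms that characterize each latent theme, and reading the rows of W gives each document's weight on those themes, which is the kind of theme-to-organization mapping the description refers to. A real analysis would more likely rely on an established library and a weighted term matrix than on this hand-rolled loop.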