Cargando…
Application of NLP-based topic modeling to analyse unstructured text data in annual reports of construction contracting companies
The construction industry is the backbone of a nation’s economy. It is a matter of great concern that such an industry suffers from time and cost overruns, especially in these challenging times. Coupled with the overrun issues, the sector is often criticized for lacking adequate quality and quantity...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Springer India
2022
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9103605/ http://dx.doi.org/10.1007/s40012-022-00355-w |
Sumario: | The construction industry is the backbone of a nation’s economy. It is a matter of great concern that such an industry suffers from time and cost overruns, especially in these challenging times. Coupled with the overrun issues, the sector is often criticized for lacking adequate quality and quantity of structured secondary data. The emerging technologies in data science and machine intelligence present a unique opportunity to understand the sector better and aid in effective decision-making. To better understand the utility of such technologies, the Management Discussion and Analysis ssections of the annual reports of publicly listed top Indian construction contracting firms are analyzed to identify the presence of ‘strategy themes’ and further map them to the organizations considered. Natural Language Processing (NLP)-based topic modeling algorithms, namely Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), are used in this study to perform a qualitative content analysis to identify the latent themes. From a methodological standpoint, considering the context of this study, the NMF results are better in accuracy, precision, and recall compared with the LDA. The results show that while most construction contracting firms prioritized a ‘revenue-focused’ strategy to expand their order books, a smaller set of large-sized firms seem to prioritize process improvement to improve their execution productivity and therefore are ‘profit margin improvement focused’ or ‘lean-focussed’ in their approach. Although a proof-of-concept, this study unlocks the immense potential of unsupervised NLP-based topic-modeling tools to understand and infer from unstructured and freely available text data in the public domain to aid sectoral analysis and policymaking. |
---|