This article was published as part of the CCXXXVII International Scientific and Practical Conference "Scientific Community of Students: INTERDISCIPLINARY RESEARCH" (Russia, Novosibirsk, 28 May 2026)
Field: Information Technology
MACHINE LEARNING APPROACHES FOR CONSTRUCTION DELAY PREDICTION: A COMPARATIVE REVIEW
ABSTRACT
Construction delays remain a major problem in the construction industry, causing cost overruns and reduced project efficiency. Traditional forecasting methods such as CPM/PERT often fail to capture nonlinear relationships among delay factors. Recent advances in machine learning (ML) provide new opportunities for improving delay prediction accuracy. This paper reviews modern ML approaches for construction delay prediction, including Random Forest (RF), LightGBM, Long Short-Term Memory (LSTM), Graph Neural Networks (GNN), and regression-based models. The analysis focuses on their strengths, limitations, and applicability in different construction contexts. The review shows that ensemble and deep learning models improve prediction accuracy when sufficient data are available, while Random Forest remains one of the most practical and interpretable solutions for industry implementation.
Keywords: construction delay prediction, machine learning, Random Forest, LightGBM, LSTM, construction scheduling, project risk management.
1. Introduction
Construction delays often cause cost overruns, productivity losses, and contractual disputes. According to the McKinsey Global Institute, many large construction projects experience serious schedule overruns because of fragmented workflows and limited digitalization [1]. Delay factors include design changes, labor shortages, financial issues, supply-chain disruptions, equipment failures, and ineffective project management. Interactions among these factors reduce the effectiveness of traditional forecasting methods. Recent ML developments make it possible to predict delays from historical project data. This paper reviews recent ML applications for construction delay prediction and compares the strengths and limitations of different algorithms.
2. Literature Review and Methodology
Machine learning methods are increasingly used in construction scheduling and delay management. Existing studies focus on classification and regression tasks.
Deep learning approaches have shown strong potential for modeling sequential project data. Farzad et al. demonstrated that bidirectional LSTM networks effectively capture temporal dependencies in construction scheduling data [2]. Demiss and Elsaigh also reported that hybrid CNN–RNN architectures improve forecasting performance under uncertainty [6].
Ensemble tree-based algorithms remain popular because they balance accuracy and interpretability. Liben et al. found that Random Forest achieved an R² value of 0.97 with an RMSE of approximately 74 days in Addis Ababa public-sector projects [3]. Jiang et al. emphasized that regression-based forecasting models strongly depend on dataset quality and feature engineering [4]. Gradient boosting methods such as XGBoost and LightGBM provide efficient and accurate forecasting [5].
Graph-based approaches are also gaining attention: Jia et al. highlighted the ability of graph neural networks to model task interdependencies and project relationships [7]. Russian researchers have also contributed to this field. Konkov et al. proposed a machine learning system based on historical project data for predicting construction delays and improving planning processes [8].
This study reviews peer-reviewed publications indexed in ScienceDirect, Springer, ASCE Library, MDPI, and Engineering Research Express. The methodology included selecting studies published between 2020 and 2025, extracting reported performance metrics, and comparing the strengths and limitations of different ML approaches.
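The R² and RMSE values quoted from the reviewed studies are easier to compare when their definitions are kept in view. The following is a minimal sketch of both metrics in plain Python; the duration values are hypothetical and serve only as an illustration, not as data from any cited study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the inputs (here: days)."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical project durations in days (illustrative values only).
actual_days = [300, 450, 600, 750, 900]
predicted_days = [320, 430, 610, 720, 930]

print(f"RMSE = {rmse(actual_days, predicted_days):.1f} days")
print(f"R^2  = {r_squared(actual_days, predicted_days):.3f}")
```

Because R² is measured relative to the variance of the observed durations, a high R² may coexist with an RMSE that is large in absolute terms when project durations vary widely, so the two metrics are best read together.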
3. Results
Table 1. Key findings of the reviewed studies
| Model | Application context | Key findings | Source |
|---|---|---|---|
| Bidirectional LSTM | Construction project scheduling | Effective modeling of temporal dependencies | [2] |
| CNN–RNN hybrid model | Construction duration prediction under uncertainty | Improved forecasting for nonlinear project variables | [6] |
| Random Forest | Ethiopian public building projects | R² = 0.97; RMSE ≈ 74 days | [3] |
| Regression-based ML models | Construction duration forecasting | Strong dependence on feature engineering and dataset quality | [4] |
| LightGBM / XGBoost | Construction schedule delay prediction | Accurate and computationally efficient forecasting | [5] |
| Graph Neural Networks | Construction engineering and management | Improved modeling of task dependencies and project relationships | [7] |
No ML model consistently outperforms others across all construction contexts. Model effectiveness depends on data quality and project characteristics. Deep learning models such as LSTM demonstrate strong capabilities for capturing temporal patterns, while ensemble methods such as Random Forest and LightGBM provide robust and interpretable forecasting results.
4. Discussion
The reviewed studies indicate that machine learning can substantially improve construction delay prediction compared with traditional forecasting approaches. Deep learning methods such as LSTM-based architectures are effective for projects with large historical datasets and dynamic scheduling data. Random Forest remains one of the most practical algorithms because it combines predictive capability with interpretability, allowing project managers to identify influential delay factors. LightGBM is effective for large and imbalanced datasets. A key future direction is ML integration with BIM and digital twins. Combining scheduling data with real-time project information may enable predictive scheduling and automated risk analysis. However, practical implementation remains limited because of inconsistent data collection practices, limited standardized datasets, and difficulties in explaining complex model predictions.
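One common way to identify influential delay factors is permutation importance: shuffle one input column and measure how much the prediction error grows. The sketch below is model-agnostic and is not the procedure used in any of the reviewed studies. The feature names, the data, and `toy_model` (a hand-written stand-in for a trained Random Forest) are all hypothetical.

```python
import random

def mean_abs_error(model, rows, targets):
    """Average absolute gap between predicted and observed delay (days)."""
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Mean increase in error when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_abs_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical predictor: delay depends on two of the three features."""
    design_changes, shortage_days, _site_area = row
    return 4.0 * design_changes + 1.5 * shortage_days

FEATURES = ("design_changes", "labor_shortage_days", "site_area_m2")
rows = [(2, 10, 500), (5, 3, 1200), (1, 20, 800), (8, 0, 650),
        (4, 12, 2000), (6, 7, 950), (0, 15, 1500), (3, 5, 700)]
targets = [toy_model(r) for r in rows]  # illustrative, noise-free targets

for name, score in zip(FEATURES, permutation_importance(toy_model, rows, targets)):
    print(f"{name}: {score:.2f}")
```

A feature the model ignores, such as the site area here, receives an importance of zero, while features the model relies on receive positive scores. With a real trained model, the same procedure would normally be run on held-out data.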
5. Limitations
Several limitations should be acknowledged. First, many reviewed studies rely on region-specific datasets, limiting the generalizability of findings. Second, many ML models are trained using relatively small or imbalanced datasets, increasing the risk of overfitting. Third, the reviewed studies use different prediction objectives and evaluation metrics, making direct comparison difficult. Future research should focus on larger datasets and real-time BIM/IoT integration.
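The overfitting risk on small datasets is usually assessed by estimating error on projects the model has not seen. The following is a minimal k-fold cross-validation sketch in plain Python. A one-variable least-squares line stands in for a real ML model, and the planned and actual durations are hypothetical values chosen for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_mae(xs, ys, k=5, seed=0):
    """Mean absolute error on each held-out fold of a k-fold split."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [abs(slope * xs[i] + intercept - ys[i]) for i in held_out]
        scores.append(sum(errors) / len(errors))
    return scores

# Hypothetical planned vs. actual durations in days for ten projects.
planned = [200, 260, 310, 380, 420, 500, 560, 640, 700, 790]
actual = [230, 270, 360, 400, 470, 540, 650, 690, 800, 860]

fold_errors = k_fold_mae(planned, actual, k=5)
print("Held-out MAE per fold:", [round(e, 1) for e in fold_errors])
print("Mean held-out MAE:", round(sum(fold_errors) / len(fold_errors), 1))
```

If the held-out error is much larger than the error on the training projects, or varies strongly between folds, the model is likely overfitting or the dataset is too small to support it.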
6. Conclusion
Machine learning offers significant opportunities for improving construction delay prediction and project risk management. Different ML models provide advantages depending on project characteristics and data availability. Among the reviewed approaches, Random Forest and LightGBM appear especially promising for practical implementation because they combine predictive performance, robustness, and interpretability. LSTM and CNN–RNN architectures also demonstrate strong potential for complex forecasting tasks but generally require larger datasets. Construction delay prediction is closely linked to industry digitalization and the integration of ML with BIM, IoT systems, and digital twins.
References:
1. McKinsey Global Institute. Reinventing Construction: A Route to Higher Productivity [Electronic resource]. – Moscow : McKinsey & Company, 2017. – URL: https://www.mckinsey.com/capabilities/operations/our-insights/reinventing-construction-through-a-productivity-revolution (accessed: 18.05.2026).
2. Farzad E., Dehghan Manshadi H., Dashti Rahmatabadi M. A. Predicting construction project scheduling issues using LSTM neural network // Amirkabir Journal of Civil Engineering. – 2023. – Vol. 55, No. 9. – P. 1753–1764. – DOI: 10.22060/ceej.2023.21383.7701.
3. Liben S. M., Belachew D. A., Elsaigh W. A. Comparing advanced and traditional machine learning algorithms for construction duration prediction: a case study of Addis Ababa's public sector // Engineering Research Express. – 2024. – Vol. 6, No. 4. – Art. 045119. – DOI: 10.1088/2631-8695/ad979f.
4. Jiang X., Li B., Zhao G., Zhang L. A comparative study of machine learning regression models for predicting construction duration // Journal of Asian Architecture and Building Engineering. – 2024. – Vol. 23, No. 2. – P. 1245–1258. – DOI: 10.1080/13467581.2023.2278887.
5. Jia S., Luo C., Wang R. [et al.] Data-driven multi-mode time–cost trade-off optimization for construction project scheduling using LightGBM // Processes. – 2026. – Vol. 14, No. 8. – Art. 1311. – DOI: 10.3390/pr14081311.
6. Demiss B. A., Elsaigh W. A. Application of novel hybrid deep learning architectures combining CNN and RNN for construction duration prediction under uncertainty // Engineering Research Express. – 2024. – Vol. 6, No. 3. – Art. 032102. – DOI: 10.1088/2631-8695/ad6ca7.
7. Jia Y., Wang J., Shou W., Hosseini M. R., Bai Y. Graph neural networks for construction applications // Automation in Construction. – 2023. – Vol. 154. – Art. 104984. – DOI: 10.1016/j.autcon.2023.104984.
8. Konkov V. V., Shirokov V. I., Zhabitsky M. G. Predicting construction delays using machine learning based on historical data on the actual duration of completed projects // International Journal of Open Information Technologies. – 2024. – Vol. 12, No. 8. – P. 35–47.