Exploratory Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models
An open-source, reproducible framework demonstrating how Large Language Models (LLMs) can be used for complex semantic analysis of unstructured wind turbine maintenance logs. This project moves beyond simple classification to generate deeper reliability insights, such as identifying failure modes, inferring causal chains, and uncovering site-specific operational patterns to enhance data-driven O&M in the wind energy sector.
This repository contains the full workflow as supplementary materials for the paper:
Malyi, M., Shek, J., Biscaya, A. (2025). Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models. (To be published).
A wealth of operational intelligence is locked within the unstructured free-text of wind turbine maintenance logs, a resource largely inaccessible to traditional quantitative analysis. This project introduces an exploratory framework that uses state-of-the-art LLMs as an analytical tool, or "reliability co-pilot", to synthesise this textual data and generate actionable, expert-level hypotheses.
The framework is designed to be:
- Reproducible: All analytical workflows are contained within Jupyter notebooks with clear instructions and structured prompts.
- Flexible: The prompt engineering methodology can be easily adapted for different analytical tasks and datasets.
- Insight-Oriented: The analysis moves beyond data structuring to focus on hypothesis generation, root cause analysis, and uncovering nuanced operational patterns.
The entire workflow is documented within the two primary Jupyter notebooks. The analysis is designed to be run sequentially.
- Data Preparation: Review the cells in 1_data_cleaning.ipynb to understand the pre-processing steps and how the analytical cohorts were created. Note that the original raw data is not provided due to commercial sensitivity; this notebook is for methodological transparency.
- Semantic Analysis: Open and run the cells in 2_processing_log.ipynb. This notebook contains the prompts, LLM outputs, and Python code for all four analytical workflows:
  - Failure Mode Identification
  - Causal Chain Inference
  - Comparative Analysis
  - Data Quality Audit
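To give a feel for what a structured prompt for one of these four workflows might look like, here is a minimal sketch. It is not the code from the notebooks: the names `WORKFLOW_TASKS` and `build_prompt`, the task wording, and the prompt layout are all assumptions made for illustration. The actual prompts are in 2_processing_log.ipynb.

```python
from typing import Sequence

# Hypothetical one-line task instructions, one per workflow listed above.
# The wording is illustrative only; the real prompts are in the notebook.
WORKFLOW_TASKS = {
    "failure_mode_identification": "Identify the distinct failure modes described in the logs.",
    "causal_chain_inference": "Infer plausible causal chains linking the reported events.",
    "comparative_analysis": "Compare the cohorts and highlight site-specific patterns.",
    "data_quality_audit": "Flag entries that are ambiguous, incomplete, or inconsistent.",
}


def build_prompt(workflow: str, log_entries: Sequence[str]) -> str:
    """Assemble a structured prompt for one workflow (hypothetical helper)."""
    if workflow not in WORKFLOW_TASKS:
        raise KeyError(f"Unknown workflow: {workflow!r}")

    # Number the entries so the model can cite them as evidence.
    numbered = "\n".join(
        f"{i}. {entry.strip()}" for i, entry in enumerate(log_entries, start=1)
    )
    return (
        "You are assisting a wind turbine reliability engineer.\n"
        f"Task: {WORKFLOW_TASKS[workflow]}\n"
        "Maintenance log entries:\n"
        f"{numbered}\n"
        "Present findings as hypotheses and cite entry numbers as evidence."
    )


if __name__ == "__main__":
    # Placeholder strings stand in for real log text, which is not distributed.
    print(build_prompt("causal_chain_inference", ["<log entry 1>", "<log entry 2>"]))
```

Keeping the task instruction separate from the log text is one way to reuse the same prompt skeleton across all four workflows, and to adapt it to other datasets by editing only the task dictionary. The resulting string would then be passed to whichever LLM client is in use.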
Distributed under the MIT License. See LICENSE for more information.
Max Malyi - [email protected]
Project Link: Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models