Exploratory Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models

An open-source, reproducible framework demonstrating how Large Language Models (LLMs) can be used for complex semantic analysis of unstructured wind turbine maintenance logs. This project moves beyond simple classification to generate deeper reliability insights, such as identifying failure modes, inferring causal chains, and uncovering site-specific operational patterns to enhance data-driven O&M in the wind energy sector.

This repository contains the full workflow as supplementary materials for the paper:

Malyi, M., Shek, J., Biscaya, A. (2025). Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models. (To be published).

About The Project

A wealth of operational intelligence is locked within the unstructured free-text of wind turbine maintenance logs, a resource largely inaccessible to traditional quantitative analysis. This project introduces an exploratory framework that uses state-of-the-art LLMs as an analytical tool, or "reliability co-pilot", to synthesise this textual data and generate actionable, expert-level hypotheses.

The framework is designed to be:

Reproducible: All analytical workflows are contained within Jupyter notebooks with clear instructions and structured prompts.
Flexible: The prompt engineering methodology can be easily adapted for different analytical tasks and datasets.
Insight-Oriented: The analysis moves beyond data structuring to focus on hypothesis generation, root cause analysis, and uncovering nuanced operational patterns.
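
To illustrate what an adaptable, structured prompt might look like, here is a minimal sketch. It is not taken from the notebooks: the `TASKS` dictionary, the `build_prompt` function, the section labels, and the example log entries are all hypothetical, and the prompts actually used in the paper may be organised differently.

```python
# Hypothetical task instructions; the wording used in the notebooks may differ.
TASKS = {
    "failure_modes": "Identify the distinct failure modes described in the logs.",
    "causal_chains": "Infer plausible cause-and-effect chains linking the logged events.",
}


def build_prompt(task: str, logs: list[str]) -> str:
    """Assemble a structured prompt: role, task, numbered log entries, output format."""
    if task not in TASKS:
        raise ValueError(f"Unknown task: {task!r}")
    # Number the entries so the model can cite them as evidence.
    numbered = "\n".join(f"{i}. {entry}" for i, entry in enumerate(logs, start=1))
    return (
        "ROLE: You are a wind turbine reliability engineer.\n"
        f"TASK: {TASKS[task]}\n"
        "LOG ENTRIES:\n"
        f"{numbered}\n"
        "OUTPUT: Return a concise list, citing log entry numbers as evidence."
    )


# Invented example entries, for illustration only.
example_logs = ["Gearbox oil temp high, turbine stopped", "Replaced oil cooler fan"]
prompt = build_prompt("failure_modes", example_logs)
print(prompt)
```

Under this assumed layout, adapting the prompt to another analytical task only requires adding an entry to `TASKS`; the rest of the template stays unchanged.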

Structure

The entire workflow is documented within the two primary Jupyter notebooks. The analysis is designed to be run sequentially.

Data Preparation: Review the cells in 1_data_cleaning.ipynb to understand the pre-processing steps and how the analytical cohorts were created. Note that the original raw data is not provided due to commercial sensitivity; this notebook is for methodological transparency.
Semantic Analysis: Open and run the cells in 2_processing_log.ipynb. This notebook contains the prompts, LLM outputs, and Python code for all four analytical workflows:
- Failure Mode Identification
- Causal Chain Inference
- Comparative Analysis
- Data Quality Audit
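
Because the raw data is not distributed, the following minimal sketch uses invented records to show the general shape of the data preparation step: cleaning free-text entries and grouping them into cohorts. The field names `farm` and `text`, the cleaning rules, and the per-site grouping are assumptions for illustration only; the actual steps are those documented in 1_data_cleaning.ipynb.

```python
import re
from collections import defaultdict

# Hypothetical records; the real data is not published and its fields may differ.
records = [
    {"farm": "A", "text": "  Gearbox OIL leak   found "},
    {"farm": "B", "text": "Pitch system fault\treset"},
    {"farm": "A", "text": ""},
]


def clean_text(text: str) -> str:
    """Collapse runs of whitespace, trim the ends, and lower-case the entry."""
    return re.sub(r"\s+", " ", text).strip().lower()


def build_cohorts(rows: list[dict]) -> dict[str, list[str]]:
    """Group non-empty cleaned entries by the (assumed) 'farm' field."""
    cohorts = defaultdict(list)
    for row in rows:
        cleaned = clean_text(row["text"])
        if cleaned:  # skip entries that are empty after cleaning
            cohorts[row["farm"]].append(cleaned)
    return dict(cohorts)


cohorts = build_cohorts(records)
print(cohorts)
```

Each resulting cohort could then be passed to an LLM prompt for one of the four workflows, for example comparing cohorts across sites in the comparative analysis.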

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Max Malyi - [email protected]

Project Link: Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models

mvmalyi/llm-semantic-maintenance-logs-analysis
