This project implements a log classification system, combining three complementary approaches to handle varying levels of complexity in log patterns.
- Regular Expression (Regex):
  - Handles the simplest, most predictable patterns
  - Useful for patterns that are easily captured by predefined rules
- Sentence Transformer + Logistic Regression:
  - Handles complex patterns when sufficient labeled training data is available
  - Uses embeddings generated by Sentence Transformers, with Logistic Regression as the classification layer
- Large Language Model (LLM):
  - Handles complex patterns when sufficient labeled training data is not available
  - Serves as a fallback or complement to the other two methods
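
One way the three approaches could fit together is as a cascade: try the cheap regex rules first, then the embedding-based classifier, then the LLM. The sketch below is illustrative only. The ordering, the rule patterns, the label names, and the function names (`classify_with_regex`, `classify`) are assumptions, not taken from this project. The embedding and LLM stages are passed in as plain callables, so the example runs without any model; in practice they might wrap a Sentence Transformer encoder with a scikit-learn `LogisticRegression`, and a call to an LLM API.

```python
import re
from typing import Callable, Optional

# Hypothetical regex rules: (compiled pattern, label). Real rules would be
# derived from the log formats actually seen in the data.
REGEX_RULES = [
    (re.compile(r"User \w+ logged (in|out)", re.IGNORECASE), "User Action"),
    (re.compile(r"Backup (started|completed|ended)", re.IGNORECASE), "System Notification"),
]

# A stage takes a log message and returns a label, or None if it cannot decide
# (assumed convention, e.g. the classifier's confidence is below a threshold).
Stage = Callable[[str], Optional[str]]


def classify_with_regex(message: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if none match."""
    for pattern, label in REGEX_RULES:
        if pattern.search(message):
            return label
    return None


def classify(
    message: str,
    embedding_classifier: Optional[Stage] = None,
    llm_classifier: Optional[Stage] = None,
) -> str:
    """Run the stages in order and return the first label produced."""
    stages = (classify_with_regex, embedding_classifier, llm_classifier)
    for stage in stages:
        if stage is None:
            continue  # stage not configured, skip it
        label = stage(message)
        if label is not None:
            return label
    return "Unclassified"


if __name__ == "__main__":
    # Stand-in for the Sentence Transformer + Logistic Regression stage.
    def toy_embedding_stage(msg: str) -> Optional[str]:
        return "Error" if "failed" in msg.lower() else None

    print(classify("User alice logged in"))
    print(classify("Disk write failed", embedding_classifier=toy_embedding_stage))
    print(classify("Unrecognized message"))
```

Passing the later stages as callables keeps the regex path free of heavy dependencies and makes each stage easy to swap or test in isolation.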