CELESTA is a hybrid Entity Disambiguation (ED) framework designed for low-resource languages. In a case study on Indonesian, CELESTA performs parallel mention expansion using both a multilingual and a monolingual Large Language Model (LLM). It then applies a similarity-based selection mechanism to choose the expansion that is most semantically aligned with the original context. Finally, the selected expansion is linked to a knowledge-base entity using an off-the-shelf ED model, without requiring any fine-tuning.
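To make the selection step concrete, the snippet below is a minimal sketch of how the choice between the two candidate expansions could be made with sentence embeddings and cosine similarity. The embedding model name, the scoring, and the example sentence are illustrative assumptions, not necessarily what CELESTA itself uses.

```python
# Minimal sketch of a similarity-based selection step (illustrative only).
# Assumption: a multilingual sentence-embedding model scores semantic alignment;
# the actual model and scoring used by CELESTA may differ.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def select_expansion(context: str, expansions: dict[str, str]) -> str:
    """Pick the mention expansion most semantically aligned with the context.

    `expansions` maps an LLM name (e.g. "llama-3", "komodo") to the expansion
    it produced for the same mention in the same sentence.
    """
    context_emb = embedder.encode(context, convert_to_tensor=True)
    candidate_embs = embedder.encode(list(expansions.values()), convert_to_tensor=True)
    scores = util.cos_sim(context_emb, candidate_embs)[0]  # one cosine score per candidate
    best = int(scores.argmax())
    return list(expansions.values())[best]

# Example: one expansion from a multilingual LLM, one from a monolingual Indonesian LLM.
context = "Jokowi meresmikan bendungan baru di Jawa Tengah."
expansions = {
    "llama-3": "Joko Widodo, Presiden Republik Indonesia",
    "komodo": "Jokowi, tokoh masyarakat",
}
print(select_expansion(context, expansions))
```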
The following is the architecture of CELESTA:

![CELESTA architecture](images/celesta_architecture.jpg)

The repository is organized as follows:

```
CELESTA/
├── datasets/                    # Input datasets (IndGEL, IndQEL, IndEL-WIKI)
├── entity_disambiguation/       # CELESTA-mGENRE installation and run scripts
├── images/                      # Architecture visualizations
│   └── celesta_architecture.jpg
├── src/                         # Source code for CELESTA modules
│   └── mention_expansion/       # Mention expansion scripts
├── requirements.txt             # Python dependencies
├── README.md                    # Project overview
└── LICENSE                      # License file
```
- Clone the repository

  ```bash
  git clone https://github.com/dice-group/CELESTA.git
  cd CELESTA
  ```

- Create the environment

  ```bash
  conda create -n celesta python=3.10
  conda activate celesta
  pip install -r requirements.txt
  ```

- Install CELESTA-mGENRE

  ```bash
  # Change to the entity_disambiguation directory
  cd entity_disambiguation

  # Run the installation script for CELESTA-mGENRE
  bash INSTALL-CELESTA-mGENRE.sh
  ```
CELESTA is evaluated on three Indonesian Entity Disambiguation (ED) datasets: IndGEL, IndQEL, and IndEL-WIKI.
- IndGEL (general domain) and IndQEL (domain-specific) are taken from the existing IndEL dataset.
- IndEL-WIKI is a new dataset we created to provide additional evaluation data for CELESTA.
Dataset Property | IndGEL | IndQEL | IndEL-WIKI |
---|---|---|---|
Sentences | 2,114 | 2,621 | 24,678 |
Total entities | 4,765 | 2,453 | 24,678 |
Unique entities | 55 | 16 | 24,678 |
Entities / sentence | 2.4 | 1.6 | 1.0 |
Train set sentences | 1,674 | 2,076 | 17,172 |
Validation set sentences | 230 | 284 | 4,958 |
Test set sentences | 230 | 284 | 4,958 |
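For orientation, the snippet below shows one way to iterate over the dataset splits and report their sizes. The directory layout (`datasets/<name>/<split>.json`) and the assumption that each split file is a JSON list are hypothetical placeholders; the authoritative format is whatever ships in `datasets/`.

```python
# Hypothetical walk over the dataset splits; the directory layout and file format
# below are placeholders, not the guaranteed structure of the shipped files.
import json
from pathlib import Path

DATASETS = ["IndGEL", "IndQEL", "IndEL-WIKI"]
SPLITS = ["train", "validation", "test"]

for dataset in DATASETS:
    for split in SPLITS:
        path = Path("datasets") / dataset / f"{split}.json"  # placeholder layout
        if not path.exists():
            continue
        with path.open(encoding="utf-8") as f:
            examples = json.load(f)  # assumed to be a list of annotated sentences
        print(dataset, split, len(examples), "sentences")
```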
CELESTA pairs two LLMs for mention expansion: a multilingual LLM (LLaMA-3 or Mistral) and an Indonesian monolingual LLM (Komodo or Merak).
- Run mention expansion

  ```bash
  # Change to the src directory
  cd src

  # Usage:
  #   mention_expansion.py [-h] [--model_name MODEL_NAME] [--prompt_type PROMPT_TYPE] [--dataset DATASET] [--split SPLIT]
  #                        [--llm_name LLM_NAME] [--input_dir INPUT_DIR] [--output_dir OUTPUT_DIR] [--batch_size BATCH_SIZE]
  #                        [--save_every SAVE_EVERY] [--save_interval SAVE_INTERVAL]

  # Example: few-shot mention expansion on IndGEL with LLaMA-3
  python mention_expansion.py --model_name meta-llama/Meta-Llama-3-70B-Instruct --prompt_type few-shot --dataset IndGEL --llm_name llama-3
  ```
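Because the expansion step is run once per LLM, a small driver can launch the multilingual and monolingual runs back to back. The sketch below is hypothetical: it is meant to be run from the `src/` directory, uses only the flags shown in the usage string above, and the Komodo model identifier is a placeholder that must be replaced with the actual Hugging Face model ID.

```python
# Hypothetical driver for running mention expansion with both LLMs.
# Run from the src/ directory, next to mention_expansion.py.
import subprocess

RUNS = [
    # (--model_name, --llm_name)
    ("meta-llama/Meta-Llama-3-70B-Instruct", "llama-3"),  # multilingual LLM
    ("<komodo-hf-model-id>", "komodo"),                   # monolingual Indonesian LLM (placeholder ID)
]

for model_name, llm_name in RUNS:
    subprocess.run(
        [
            "python", "mention_expansion.py",
            "--model_name", model_name,
            "--prompt_type", "few-shot",
            "--dataset", "IndGEL",
            "--llm_name", llm_name,
        ],
        check=True,  # stop if a run fails
    )
```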
- Run entity disambiguation

  ```bash
  # Change to the CELESTA-mGENRE directory
  cd entity_disambiguation/GENRE/CELESTA-mGENRE

  # Run the CELESTA-mGENRE script on a mention-expansion output file
  bash run-CELESTA-mGENRE.sh ../../../../results/mension_expansion/celesta/IndGEL/few-shot_llama-3_komodo/test_set.json
  ```
The table below compares CELESTA with two baseline ED models (ReFinED and mGENRE) across the three evaluation datasets. Bold values indicate the highest score for each metric within a dataset.
| Dataset | Model | Precision | Recall | F1 |
|---|---|---|---|---|
| IndGEL | ReFinED | **0.749** | 0.547 | 0.633 |
| | mGENRE | 0.742 | 0.718 | 0.730 |
| | CELESTA (ours) | 0.748 | **0.722** | **0.735** |
| IndQEL | ReFinED | 0.208 | 0.160 | 0.181 |
| | mGENRE | **0.298** | **0.298** | **0.298** |
| | CELESTA (ours) | **0.298** | **0.298** | **0.298** |
| IndEL-WIKI | ReFinED | **0.627** | 0.327 | 0.430 |
| | mGENRE | 0.601 | 0.489 | 0.539 |
| | CELESTA (ours) | 0.595 | **0.495** | **0.540** |
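As a reference for reading these scores, the sketch below shows one way micro-averaged Precision, Recall, and F1 could be computed from predicted versus gold Wikidata URIs. Whether CELESTA's reported scores are micro- or macro-averaged, and how unlinked mentions are counted, are assumptions here rather than something stated in this README.

```python
# Minimal sketch of micro-averaged precision/recall/F1 over Wikidata URIs.
# Assumption: a prediction of None means the system produced no link for that mention.
def micro_prf(gold: list[str], pred: list[str | None]) -> tuple[float, float, float]:
    tp = sum(1 for g, p in zip(gold, pred) if p is not None and p == g)
    n_pred = sum(1 for p in pred if p is not None)  # mentions the system attempted to link
    n_gold = len(gold)                              # all annotated mentions
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: 3 gold mentions, 2 linked correctly, 1 left unlinked.
gold = ["Q38043", "Q252", "Q3630"]
pred = ["Q38043", "Q252", None]
print(micro_prf(gold, pred))  # precision = 1.0, recall ≈ 0.667, F1 ≈ 0.8
```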
The table below reports Precision (P), Recall (R), and F1 for CELESTA and individual LLM configurations across the three datasets, under zero-shot and few-shot prompting. Bold values indicate the highest F1 score within each dataset and prompting setting. The following results are obtained when CELESTA uses ReFinED to generate candidate entities and retrieve the corresponding Wikidata URIs.
| Dataset | Model | Zero-shot P | Zero-shot R | Zero-shot F1 | Few-shot P | Few-shot R | Few-shot F1 |
|---|---|---|---|---|---|---|---|
| IndGEL | LLaMA-3 | 0.727 | 0.499 | **0.592** | 0.777 | 0.531 | 0.631 |
| | Mistral | 0.699 | 0.411 | 0.517 | 0.806 | 0.310 | 0.448 |
| | Komodo | 0.709 | 0.447 | 0.548 | 0.704 | 0.527 | 0.603 |
| | Merak | 0.654 | 0.441 | 0.526 | 0.749 | 0.547 | 0.633 |
| | CELESTA with ReFinED | | | | | | |
| | LLaMA-3 & Komodo | 0.731 | 0.437 | 0.547 | 0.757 | 0.513 | 0.612 |
| | LLaMA-3 & Merak | 0.688 | 0.431 | 0.530 | 0.802 | 0.586 | **0.677** |
| | Mistral & Komodo | 0.719 | 0.390 | 0.506 | 0.781 | 0.344 | 0.478 |
| | Mistral & Merak | 0.678 | 0.402 | 0.505 | 0.779 | 0.503 | 0.611 |
| IndQEL | LLaMA-3 | 0.154 | 0.051 | 0.077 | 0.327 | 0.058 | 0.099 |
| | Mistral | 0.179 | 0.131 | 0.151 | 0.072 | 0.029 | 0.042 |
| | Komodo | 0.158 | 0.116 | 0.134 | 0.208 | 0.160 | **0.181** |
| | Merak | 0.203 | 0.149 | **0.172** | 0.142 | 0.106 | 0.121 |
| | CELESTA with ReFinED | | | | | | |
| | LLaMA-3 & Komodo | 0.138 | 0.047 | 0.071 | 0.282 | 0.073 | 0.116 |
| | LLaMA-3 & Merak | 0.160 | 0.113 | 0.132 | 0.130 | 0.098 | 0.112 |
| | Mistral & Komodo | 0.138 | 0.095 | 0.112 | 0.107 | 0.047 | 0.066 |
| | Mistral & Merak | 0.196 | 0.146 | 0.167 | 0.128 | 0.095 | 0.109 |
| IndEL-WIKI | LLaMA-3 | 0.581 | 0.234 | 0.332 | 0.639 | 0.322 | 0.428 |
| | Mistral | 0.565 | 0.232 | 0.329 | 0.552 | 0.201 | 0.294 |
| | Komodo | 0.592 | 0.256 | 0.357 | 0.591 | 0.270 | 0.370 |
| | Merak | 0.591 | 0.285 | **0.385** | 0.548 | 0.293 | 0.382 |
| | CELESTA with ReFinED | | | | | | |
| | LLaMA-3 & Komodo | 0.577 | 0.234 | 0.332 | 0.639 | 0.322 | 0.428 |
| | LLaMA-3 & Merak | 0.596 | 0.273 | 0.374 | 0.641 | 0.355 | **0.457** |
| | Mistral & Komodo | 0.576 | 0.231 | 0.330 | 0.575 | 0.219 | 0.317 |
| | Mistral & Merak | 0.564 | 0.248 | 0.345 | 0.581 | 0.270 | 0.369 |
The following results are obtained when CELESTA uses mGENRE to generate candidate entities and retrieve the corresponding Wikidata URIs.
| Dataset | Model | Zero-shot P | Zero-shot R | Zero-shot F1 | Few-shot P | Few-shot R | Few-shot F1 |
|---|---|---|---|---|---|---|---|
| IndGEL | LLaMA-3 | 0.720 | 0.694 | **0.707** | 0.742 | 0.718 | 0.730 |
| | Mistral | 0.667 | 0.640 | 0.653 | 0.607 | 0.584 | 0.595 |
| | Komodo | 0.702 | 0.668 | 0.685 | 0.740 | 0.698 | 0.718 |
| | Merak | 0.611 | 0.576 | 0.594 | 0.696 | 0.672 | 0.684 |
| | CELESTA with mGENRE | | | | | | |
| | LLaMA-3 & Komodo | 0.695 | 0.660 | 0.677 | 0.741 | 0.708 | 0.724 |
| | LLaMA-3 & Merak | 0.631 | 0.596 | 0.613 | 0.748 | 0.722 | **0.735** |
| | Mistral & Komodo | 0.657 | 0.632 | 0.644 | 0.623 | 0.602 | 0.612 |
| | Mistral & Merak | 0.620 | 0.588 | 0.603 | 0.702 | 0.676 | 0.686 |
| IndQEL | LLaMA-3 | 0.298 | 0.298 | **0.298** | 0.274 | 0.273 | **0.273** |
| | Mistral | 0.258 | 0.258 | 0.258 | 0.185 | 0.182 | 0.183 |
| | Komodo | 0.252 | 0.251 | 0.251 | 0.269 | 0.269 | 0.269 |
| | Merak | 0.233 | 0.233 | 0.233 | 0.255 | 0.255 | 0.255 |
| | CELESTA with mGENRE | | | | | | |
| | LLaMA-3 & Komodo | 0.298 | 0.298 | **0.298** | 0.266 | 0.266 | 0.266 |
| | LLaMA-3 & Merak | 0.276 | 0.276 | 0.276 | 0.256 | 0.255 | 0.255 |
| | Mistral & Komodo | 0.262 | 0.262 | 0.262 | 0.185 | 0.182 | 0.183 |
| | Mistral & Merak | 0.236 | 0.236 | 0.236 | 0.202 | 0.200 | 0.201 |
| IndEL-WIKI | LLaMA-3 | 0.516 | 0.415 | 0.460 | 0.601 | 0.489 | 0.539 |
| | Mistral | 0.457 | 0.360 | 0.403 | 0.447 | 0.363 | 0.401 |
| | Komodo | 0.542 | 0.401 | 0.461 | 0.547 | 0.422 | 0.476 |
| | Merak | 0.474 | 0.371 | 0.417 | 0.428 | 0.353 | 0.387 |
| | CELESTA with mGENRE | | | | | | |
| | LLaMA-3 & Komodo | 0.548 | 0.411 | **0.470** | 0.618 | 0.481 | 0.537 |
| | LLaMA-3 & Merak | 0.521 | 0.412 | 0.460 | 0.595 | 0.495 | **0.540** |
| | Mistral & Komodo | 0.500 | 0.368 | 0.424 | 0.484 | 0.382 | 0.427 |
| | Mistral & Merak | 0.447 | 0.349 | 0.392 | 0.507 | 0.413 | 0.455 |