2024-09-20: 🍻 Accepted to EMNLP 2024 Findings!
🗒️ arXiv link: https://arxiv.org/abs/2406.08101
Conversational Explanation Query Language (CoXQL): A text-to-SQL-like benchmark.
We recommend using Python 3.8+.
python -m pip install --upgrade pip
pip install -r requirements.txt

The `dataset` folder's structure is listed below:
dataset
|- data
|- filters
|- global_prediction
|- includes
|- local_explanation
|- local_prediction
|- meta
|- modification
|- perturbation
In the `dataset` folder, you can find the dataset in JSON format: `coxql_train.json` and `coxql_test.json`. More details about the number of pairs in each operation category can be found in `dataset/README.md`.
Both JSON files share the same structure:
{
"idx": ...,
"text": ...,
"sql": ...
}

Parsing accuracy results can be found in `parsing/guided_decoding/results`, `parsing/multi_prompt/results`, and `parsing/multi_prompt_plus/results`.
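As a quick sanity check, the JSON files can be loaded and inspected with a short snippet. This assumes each file is a list of records with the fields shown above (`idx`, `text`, `sql`); the helper name is illustrative, not part of the repository:

```python
import json

def load_coxql(path):
    """Load a CoXQL split and return its (text, sql) pairs.

    Assumes the file is a JSON list of records with the fields
    "idx", "text", and "sql", as shown in the structure above.
    """
    with open(path, encoding="utf-8") as f:
        examples = json.load(f)
    return [(ex["text"], ex["sql"]) for ex in examples]

# Example usage (paths assume the repository root as working directory):
# pairs = load_coxql("dataset/coxql_train.json")
# print(len(pairs), pairs[0])
```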
You can run `calculate_parsing_accuracy.py` to get an overview of parsing accuracy:
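Conceptually, parsing accuracy here boils down to exact-match comparison between predicted and gold queries. A minimal sketch, assuming whitespace-normalized, case-insensitive string equality (the repository's script may normalize differently):

```python
def parsing_accuracy(predictions, references):
    """Fraction of predictions that exactly match the gold query.

    Normalization (collapsing whitespace, lowercasing) is an assumption
    made for this sketch, not necessarily the script's exact logic.
    """
    assert len(predictions) == len(references)

    def normalize(query):
        return " ".join(query.split()).lower()

    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)
```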
python calculate_parsing_accuracy.py {guided_decoding, multi_prompt, multi_prompt_plus}

🤗 We evaluate seven state-of-the-art LMs in total:
| Model | Size | Huggingface Link |
|---|---|---|
| Falcon | 1B | https://huggingface.co/tiiuae/falcon-rw-1b |
| Pythia | 2.8B | https://huggingface.co/EleutherAI/pythia-2.8b-v0 |
| Mistral | 7B | https://huggingface.co/mistralai/Mistral-7B-v0.1 |
| CodeQWen1.5 | 7B | https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat |
| sqlcoder | 7B | https://huggingface.co/defog/sqlcoder-7b-2 |
| Llama3 | 8B | https://huggingface.co/meta-llama/Meta-Llama-3-8B |
| Llama3 | 70B | https://huggingface.co/meta-llama/Meta-Llama-3-70B |
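Any of these models can be prompted to map a user request to an explanation query. The sketch below shows an illustrative prompt template (not the paper's exact prompt) together with a commented-out call via the `transformers` text-generation pipeline:

```python
def build_parsing_prompt(question, operations):
    """Build an illustrative prompt asking an LM to translate a user
    request into a CoXQL-style explanation query.

    The template is a hypothetical example, not the prompt used in
    the paper's experiments.
    """
    ops = ", ".join(operations)
    return (
        "Translate the user request into an explanation query.\n"
        f"Supported operations: {ops}\n"
        f"Request: {question}\n"
        "Query:"
    )

# The prompt can then be fed to one of the models in the table, e.g.:
# from transformers import pipeline
# generator = pipeline("text-generation", model="mistralai/Mistral-7B-v0.1")
# prompt = build_parsing_prompt("Why this label?", ["nlpattribute", "predict"])
# print(generator(prompt, max_new_tokens=32)[0]["generated_text"])
```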
@misc{wang2024coxqldatasetparsingexplanation,
title={CoXQL: A Dataset for Parsing Explanation Requests in Conversational XAI Systems},
author={Qianli Wang and Tatiana Anikina and Nils Feldhus and Simon Ostermann and Sebastian Möller},
year={2024},
eprint={2406.08101},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2406.08101},
}