McGill NLP

50 repositories

mcgill-nlp.github.io
Public
Python
•23•0•15•2•Updated Sep 5, 2025
mSTEB
Public
Jupyter Notebook
•0•0•0•0•Updated Aug 25, 2025
bias-bench
Public
ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models.
Python
•41•145•0•0•Updated Aug 18, 2025
AdversarialTriggers
Public
TACL 2025: Investigating Adversarial Trigger Transfer in Large Language Models
Python
•
MIT License
•2•18•0•0•Updated Aug 17, 2025
agent-reward-bench
Public
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Python
•1•37•0•0•Updated Aug 7, 2025
meaning-change
Public
R
•
MIT License
•0•1•0•0•Updated Jul 30, 2025
AfroBench
Public
Large Scale Benchmark of Large Language Models on African Languages
Python
•2•9•0•0•Updated Jul 28, 2025
unequal-unlearning
Public
Python
•0•5•2•0•Updated Jul 15, 2025
nano-aha-moment
Public
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
Jupyter Notebook
•
MIT License
•47•523•4•1•Updated Jul 7, 2025
AURORA
Public
Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
Python
•
MIT License
•2•30•0•0•Updated Jun 30, 2025
ast-lrl-speech
Public
Python
•0•0•0•0•Updated Jun 22, 2025
VinePPO
Public
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
Python
•
MIT License
•20•169•4•0•Updated May 25, 2025
Naija-representation-in-LLMs
Public
Evaluation dataset for our NAACL 2025 paper on "Does Generative AI speak Nigerian-Pidgin?: Issues about Representativeness and Bias for Multilingualism in LLMs"
Apache License 2.0
•0•0•0•0•Updated May 14, 2025
thoughtology
Public
MIT License
•2•10•0•0•Updated May 12, 2025
constituent-movement
Public
Repo for "Language Models Largely Exhibit Human-like Constituent Ordering Preferences"
Python
•1•2•0•0•Updated Apr 26, 2025
weblinx-browsergym
Public
Python
•3•1•0•1•Updated Apr 25, 2025
safearena
Public
SafeArena is a benchmark for assessing the harmful capabilities of web agents
Python
•3•17•2•0•Updated Apr 23, 2025
malicious-ir
Public
Code for `Exploiting Instruction-Following Retrievers for Malicious Information Retrieval`
retrieval safety instruction-following-retrieval malicious-retrieval
Python
•
MIT License
•1•6•0•0•Updated Apr 1, 2025
project-page-template
Public template
Template for creating project webpages based on jekyll/minimal-mistakes
•1•1•0•0•Updated Mar 13, 2025
tiny-aha-moment-length-budget
Public
Python
•47•0•0•0•Updated Mar 11, 2025
CHASE
Public
Synthetic Data Generation for Evaluation
Python
•
MIT License
•4•14•0•0•Updated Feb 21, 2025
Injongo
Public
A multicultural, open-source benchmark dataset for 16 African languages with utterances generated by native speakers across diverse domains.
Jupyter Notebook
•
GNU General Public License v3.0
•0•0•0•0•Updated Feb 12, 2025
weblinx
Public
WebLINX is a benchmark for building web navigation agents with conversational capabilities
nlp agent web computer-vision navigation agents multimodal llm
Python
•
Apache License 2.0
•17•157•0•0•Updated Feb 11, 2025
llm2vec
Public
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
Python
•
MIT License
•128•1.6k•36•4•Updated Jan 24, 2025
webllama
Public
Llama-3 agents that can browse the web by following instructions and talking to you
Python
•
MIT License
•108•1.4k•2•0•Updated Dec 10, 2024
statcan-dialogue-dataset
Public
The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents
Python
•2•9•0•0•Updated Nov 6, 2024
incontext-code-generation
Public
NAACL 2024: Evaluating In-Context Learning of Libraries for Code Generation
Python
•2•9•0•0•Updated Oct 23, 2024
instruct-qa
Public
Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"
Python
•
Apache License 2.0
•5•86•5•1•Updated Aug 12, 2024
scope-ambiguity
Public
Code and data for the paper 'Scope Ambiguities in Large Language Models'.
Python
•
MIT License
•2•5•0•0•Updated Jun 25, 2024
length-generalization
Public
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
Python
•
MIT License
•7•137•3•0•Updated Apr 30, 2024