This repository collects materials related to my Master's thesis for the MSc in Computer Science (Artificial Intelligence Curriculum) at the University of Pisa.
🚀 Final Grade: 110/110 cum laude
This thesis explores behavior-aware transfer learning in reinforcement learning through a novel framework called π-PACT (π-vectors-based Policy Adaptation by Characterization and Transfer). π-PACT uses policy supervectors (π-vectors) as compact representations of policy behavior, which makes it possible to characterize and compare skills across tasks.
Key components:
- Adaptation of a Universal Background Model (UBM) to observed policy state features.
- Dynamic monitoring of learning progress via π-vector distances.
- Selective transfer of knowledge from source policies to a target agent when relevant.
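
As a rough illustration of the first component, the sketch below adapts the means of a diagonal-covariance UBM to a batch of observed state features using the classic MAP update from Reynolds et al. (2000), then flattens the adapted means into a supervector. It is a minimal sketch, not the thesis code: the function names (`map_adapt_means`, `pi_vector`), the diagonal-covariance assumption, and the relevance factor of 16 are illustrative assumptions.

```python
import numpy as np

def log_gaussian_diag(x, means, variances):
    """Log density of every row of x under every diagonal Gaussian.

    x: (N, D), means: (K, D), variances: (K, D) -> result: (N, K)
    """
    diff = x[:, None, :] - means[None, :, :]
    mahalanobis = np.sum(diff ** 2 / variances, axis=2)
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    return -0.5 * (mahalanobis + log_norm)

def map_adapt_means(x, weights, means, variances, relevance=16.0):
    """MAP-adapt only the UBM means to the observed features x.

    `relevance` is an assumed hyperparameter; larger values keep the
    adapted means closer to the UBM.
    """
    # Posterior responsibility of each component for each sample.
    log_p = np.log(weights) + log_gaussian_diag(x, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_p)
    resp /= resp.sum(axis=1, keepdims=True)

    # Zeroth- and first-order sufficient statistics.
    n_k = resp.sum(axis=0)
    e_k = (resp.T @ x) / np.maximum(n_k, 1e-12)[:, None]

    # Data-dependent interpolation between data mean and UBM mean.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * e_k + (1.0 - alpha) * means

def pi_vector(adapted_means):
    """Concatenate the adapted means into one supervector of size K * D."""
    return adapted_means.reshape(-1)
```

Components that receive little data stay close to the UBM, so π-vectors computed from short rollouts remain comparable to each other.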
The approach is evaluated on the highway-env RL suite, with experiments analyzing:
- Representation quality of π-vectors,
- Transfer learning performance,
- Challenges such as negative transfer and the choice of matching thresholds.
Below is a high-level overview of the π-PACT framework:
✅ Highlights:
- Policy behavior is periodically summarized as π-vectors.
- KL-based distance matching identifies similar source policies.
- A gatekeeper threshold controls when and how to transfer useful knowledge.
- The Universal Background Model (UBM) enables efficient feature adaptation.
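
The matching and gatekeeping steps can be sketched as follows. The distance used here is a common upper bound on the KL divergence between two GMMs that share UBM weights and diagonal covariances and differ only in their adapted means; whether the thesis uses exactly this form is an assumption. The names `kl_bound_distance`, `select_source`, and `source_library` are hypothetical.

```python
import numpy as np

def kl_bound_distance(means_a, means_b, weights, variances):
    """Weighted squared distance between two sets of adapted means.

    means_a, means_b, variances: (K, D); weights: (K,).
    Both models are assumed to share the UBM weights and variances.
    """
    diff = means_a - means_b
    return 0.5 * float(np.sum(weights[:, None] * diff ** 2 / variances))

def select_source(target_means, source_library, weights, variances, threshold):
    """Return (name, distance) of the closest source policy.

    The gatekeeper rejects the match by returning name=None when even the
    closest source is farther away than `threshold`.
    """
    best_name, best_dist = None, float("inf")
    for name, source_means in source_library.items():
        dist = kl_bound_distance(target_means, source_means, weights, variances)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist <= threshold:
        return best_name, best_dist
    return None, best_dist
```

A stricter threshold reduces the risk of negative transfer but also triggers fewer transfers, which is the trade-off examined in the experiments.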
Main components include:
- Feature extraction tools (lane, road, temporal buffers).
- Modified SAC agent with π-vector integration.
- Utilities for GMM training, MAP adaptation, and π-vector computation.
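
To give a sense of what a temporal buffer might look like, here is a minimal sketch that keeps the most recent feature frames and stacks them into a single flat vector. The class name `TemporalFeatureBuffer` and its interface are hypothetical and are not taken from the repository.

```python
from collections import deque

class TemporalFeatureBuffer:
    """Fixed-size window over the most recent per-step feature frames."""

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque silently drops the oldest frame when full.
        self._frames = deque(maxlen=window)

    def push(self, features):
        """Store a copy of the features observed at the current step."""
        self._frames.append(list(features))

    def is_full(self):
        """True once `window` frames have been collected."""
        return len(self._frames) == self._frames.maxlen

    def stacked(self):
        """Concatenate buffered frames, oldest first, into one flat list."""
        return [value for frame in self._frames for value in frame]
```

Stacked frames like these could then serve as the feature vectors fed to the GMM utilities.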
- 📄 Thesis manuscript: thesis_petix.pdf
- 📊 Presentation slides: presentazione_tesi.pdf
- Update codebase to align with final thesis version.
- Package π-PACT as a modular framework.
- Provide reproducible experiments and pretrained agents.
- Explore broader benchmarks beyond highway-env.
- Kanervisto, A., Kinnunen, T., & Hautamäki, V. (2020). General Characterization of Agents by States They Visit.
- Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker Verification Using Adapted Gaussian Mixture Models.

