π-PACT: Behavior Characterization via π-vectors in Multi-Task Reinforcement Learning

This repository collects materials related to my Master's thesis for the MSc in Computer Science (Artificial Intelligence Curriculum) at the University of Pisa.

🚀 Final Grade: 110/110 cum laude


📜 Thesis Abstract

This thesis explores behavior-aware transfer learning in reinforcement learning, focusing on a novel framework called π-PACT (π-vectors-based Policy Adaptation by Characterization and Transfer). π-PACT leverages policy supervectors (π-vectors) as compact representations of policy behavior to characterize and compare skills across tasks.

Key components:

  • Adaptation of a Universal Background Model (UBM) to the state features observed under a policy (a minimal sketch of this step follows the list).
  • Dynamic monitoring of learning progress via π-vector distances.
  • Selective transfer of knowledge from source policies to a target agent when relevant.
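
To make the first component concrete, the sketch below trains a diagonal-covariance GMM as the UBM on pooled state features and then applies mean-only relevance-MAP adaptation (Reynolds et al., 2000) to the states visited by a single policy, stacking the adapted means into a π-vector. This is a minimal sketch, not the repository's implementation: the function names, the component count, and the relevance factor are illustrative.

```python
# Minimal sketch: UBM training and mean-only MAP adaptation (Reynolds et al., 2000).
# Function names, component count, and relevance factor are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_ubm(pooled_features: np.ndarray, n_components: int = 64) -> GaussianMixture:
    """Fit a diagonal-covariance GMM on state features pooled across policies/tasks."""
    ubm = GaussianMixture(n_components=n_components, covariance_type="diag", max_iter=200)
    ubm.fit(pooled_features)
    return ubm

def pi_vector(ubm: GaussianMixture, policy_features: np.ndarray, relevance: float = 16.0) -> np.ndarray:
    """MAP-adapt the UBM means to one policy's visited states and stack them into a pi-vector."""
    resp = ubm.predict_proba(policy_features)        # (T, K) posterior responsibilities
    n_k = resp.sum(axis=0)                           # soft count of frames per component
    f_k = resp.T @ policy_features                   # (K, D) first-order statistics
    e_k = f_k / np.maximum(n_k[:, None], 1e-8)       # posterior mean per component
    alpha = (n_k / (n_k + relevance))[:, None]       # data-dependent adaptation coefficient
    adapted_means = alpha * e_k + (1.0 - alpha) * ubm.means_
    return adapted_means.ravel()                     # pi-vector of length K * D
```

Because every π-vector is adapted from the same UBM, vectors computed for different policies live in a common space and can be compared directly.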

The approach is evaluated on the highway-env RL suite, with experiments analyzing:

  • Representation quality of π-vectors,
  • Transfer learning performance,
  • Challenges such as negative transfer and the choice of matching thresholds.
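
For a sense of how visited-state features could be gathered from this suite, the snippet below rolls a policy out in a highway-env task and collects flattened kinematics observations, ready to be fed to the UBM and π-vector utilities sketched above. The random-action policy and the plain list buffer are placeholders for the thesis's actual agent and feature-extraction tools.

```python
# Illustrative rollout in highway-env, collecting visited-state features for the
# UBM / pi-vector utilities sketched earlier. The random policy and the plain list
# buffer stand in for the thesis's modified SAC agent and feature-extraction tools.
import gymnasium as gym
import highway_env  # noqa: F401  (importing highway_env makes the highway-v0 tasks available)
import numpy as np

def collect_state_features(env_id: str = "highway-v0", episodes: int = 5) -> np.ndarray:
    env = gym.make(env_id)
    features = []
    for _ in range(episodes):
        obs, _ = env.reset()
        terminated = truncated = False
        while not (terminated or truncated):
            features.append(np.asarray(obs, dtype=np.float32).ravel())  # flatten the kinematics grid
            action = env.action_space.sample()  # placeholder for the learned policy
            obs, _, terminated, truncated, _ = env.step(action)
    env.close()
    return np.stack(features)  # (T, D) matrix of visited-state features
```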

📊 System Flowchart

Below is a high-level overview of the π-PACT framework:

[Figure: π-PACT system flowchart]

Highlights:

  • Policy behavior is periodically summarized as π-vectors.
  • KL-based distance matching identifies similar source policies (sketched below, together with the gatekeeper check).
  • A gatekeeper threshold controls when and how to transfer useful knowledge.
  • The Universal Background Model (UBM) enables efficient feature adaptation.
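
To make the matching and gatekeeping steps concrete: when two GMMs are mean-adapted from the same UBM (and therefore share its weights and covariances), a standard approximation from the speaker-verification literature bounds their symmetric KL divergence by a weighted Mahalanobis-style distance between the adapted means. The sketch below assumes that approximation and an arbitrary illustrative threshold; it is not the repository's exact matching rule.

```python
# Distance between two pi-vectors whose GMMs share the UBM's weights and diagonal
# covariances: 0.5 * sum_k w_k * (m_a_k - m_b_k)^T Sigma_k^-1 (m_a_k - m_b_k),
# a standard upper-bound approximation of the symmetric KL divergence.
# The gatekeeper threshold value below is purely illustrative.
import numpy as np

def kl_distance(pi_a: np.ndarray, pi_b: np.ndarray,
                weights: np.ndarray, diag_covs: np.ndarray) -> float:
    K, D = diag_covs.shape                                   # UBM components x feature dimension
    diff = (pi_a - pi_b).reshape(K, D)
    per_component = np.sum(diff * diff / diag_covs, axis=1)  # Mahalanobis term per component
    return 0.5 * float(np.dot(weights, per_component))

def should_transfer(target_pi, source_pis, weights, diag_covs, threshold: float = 5.0):
    """Return the index of the closest source policy if it passes the gatekeeper, else None."""
    distances = [kl_distance(target_pi, s, weights, diag_covs) for s in source_pis]
    best = int(np.argmin(distances))
    return best if distances[best] < threshold else None
```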

⚙️ About the Code

⚠️ Note: The code in this repository represents an earlier prototype version developed during the research phase. The final experimental version used for thesis evaluation may include refinements not reflected here.

Main components include:

  • Feature extraction tools (lane, road, temporal buffers).
  • Modified SAC agent with π-vector integration.
  • Utilities for GMM training, MAP adaptation, and π-vector computation (see the combined training-loop sketch below).
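
The following loop is a hypothetical outline of how these pieces could interact during training, reusing the `pi_vector` and `should_transfer` sketches above. The agent interface (`agent.act`, `agent.update`, `agent.absorb_knowledge_from`) and the summary interval are invented for illustration and do not mirror the repository's modified SAC agent.

```python
# Hypothetical training loop showing where pi-vector monitoring and selective transfer
# could plug in. The agent interface (act / update / absorb_knowledge_from) and the
# summary interval are invented for illustration; they are not the repository's API.
import numpy as np

PI_VECTOR_INTERVAL = 10_000  # environment steps between behaviour summaries (illustrative)

def train_with_pi_pact(env, agent, ubm, source_pis, source_policies,
                       weights, diag_covs, total_steps=200_000):
    feature_buffer = []
    obs, _ = env.reset()
    for step in range(1, total_steps + 1):
        action = agent.act(obs)                               # hypothetical agent API
        obs, reward, terminated, truncated, _ = env.step(action)
        feature_buffer.append(np.asarray(obs, dtype=np.float32).ravel())
        agent.update(obs, reward, terminated or truncated)    # hypothetical SAC update step
        if terminated or truncated:
            obs, _ = env.reset()
        if step % PI_VECTOR_INTERVAL == 0:
            target_pi = pi_vector(ubm, np.stack(feature_buffer))          # earlier sketch
            match = should_transfer(target_pi, source_pis, weights, diag_covs)
            if match is not None:                                         # gatekeeper passed
                agent.absorb_knowledge_from(source_policies[match])       # hypothetical transfer hook
            feature_buffer.clear()
```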

📖 Resources


⚡ Potential Future Work

  • Update codebase to align with final thesis version.
  • Package π-PACT as a modular framework.
  • Provide reproducible experiments and pretrained agents.
  • Explore broader benchmarks beyond highway-env.

🧠 References

  • Kanervisto, A., Kinnunen, T., & Hautamäki, V. (2020). General Characterization of Agents by the States They Visit.
  • Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker Verification Using Adapted Gaussian Mixture Models.
