
Commit bf47044

Add OSS NA '25 demo to examples (#44)
* Add OSS NA '25 demo
  Signed-off-by: Marcela Melara <[email protected]>
* Add support for MockAttestations
  Signed-off-by: Marcela Melara <[email protected]>
* Disable trivy scan
  Signed-off-by: Marcela Melara <[email protected]>
* Debug mock attestation tests
  Signed-off-by: Marcela Melara <[email protected]>
* Add demo README
  Signed-off-by: Marcela Melara <[email protected]>
* Remove duplicate example files
  Signed-off-by: Marcela Melara <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Marcin Spoczynski <[email protected]>

---------

Signed-off-by: Marcela Melara <[email protected]>
Co-authored-by: Marcin Spoczynski <[email protected]>
1 parent fd26f3e commit bf47044

File tree: 8 files changed (+351 −15 lines)


.gitignore

Lines changed: 3 additions & 1 deletion
```diff
@@ -1,2 +1,4 @@
 /target
-*.*~
+*.*~
+*.parquet
+*.pem
```

docs/EXAMPLES.md

Lines changed: 6 additions & 1 deletion
````diff
@@ -2,6 +2,11 @@
 
 This document provides examples and workflow patterns for using the Atlas CLI tool.
 
+## End-to-End Examples
+
+We provide a number of end-to-end Atlas workflow examples in the
+[/examples](../examples) directory.
+
 ## Basic Usage Examples
 
 ### Creating a Single Model Manifest
@@ -336,4 +341,4 @@ make example-full-workflow
 
 # Run example for filesystem storage
 make example-filesystem-storage
-```
+```
````

examples/oss-na-25-demo/README.md

Lines changed: 80 additions & 0 deletions
# Open Source Summit North America (OSS NA) '25 Provenance Demo

## Introduction

This example demonstrates how to generate and verify provenance data for a
two-stage machine learning lifecycle using the Atlas CLI tool. Although the demo
contains scripts for training and evaluation pipelines, it does not run them; it
still tracks these software components as part of the lifecycle, from dataset
download to evaluation, showing how to establish an end-to-end audit trail of
all artifacts and their relationships.

To show the added integrity properties gained from running the Atlas CLI inside
a confidential computing environment such as Intel TDX, the demo collects a
hardware-based platform attestation and includes it in each artifact manifest
generated by the CLI. On platforms that do not support Intel TDX, the demo
generates a mock hardware attestation instead.

This example creates C2PA-compliant manifests for:
- Datasets (raw MNIST training and test datasets downloaded from HuggingFace)
- Software components (training and evaluation scripts)
- Models (dummy trained classifier model)
- Evaluation results (dummy results)

All components are linked to their direct parents during creation to form a
complete provenance graph that can be exported and audited.
27+
For a more comprehensive example that does run the training and evaluation
28+
pipelines, see the
29+
[MNIST training provenance collection example](../mnist/README.md).
30+
31+
## Prerequisites
32+
33+
### System Requirements
34+
- Rust toolchain (1.85 or above)
35+
- Docker and Docker-compose
36+
37+
### Setting up Atlas CLI
38+
39+
Ensure Atlas CLI is built and available in your PATH:
40+
41+
```bash
42+
# Build Atlas CLI (from the root directory)
43+
cargo build --release
44+
# Add to PATH or use full path
45+
export PATH=$PATH:./target/release
46+
```
47+
48+
### Setting up the Database Backend
49+
50+
Start the database backend (if not already running):
51+
52+
```bash
53+
# Start the database service
54+
cd storage_service && docker-compose build && docker-compose up -d && cd ..
55+
```
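
Before launching the demo, you may want to confirm that the storage service is
reachable at the URL the demo script uses (`http://localhost:8080`). This check
is an optional convenience, not part of the example; the exact HTTP status
returned depends on the storage service's API.

```bash
# Quick reachability check for the database backend
curl -s -o /dev/null -w "storage service responded with HTTP %{http_code}\n" http://localhost:8080
```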

## Running the Example

Launch the demo using the provided bash script, and step through it by pressing
any key at each prompt:

```bash
./collect_mnist_provenance.sh
```

The demo script also displays the generated and linked manifests after select
operations.

### Demo Steps

1. Generate the provenance signing key pair (deleted at the end of the demo).
2. Download the MNIST dataset (training and test data) from HuggingFace.
3. Generate and link the C2PA manifests for the training script, training data,
   and dummy model. Each manifest includes a (mock) hardware-based attestation.
4. Generate and link the C2PA manifests for the evaluation script, dummy model,
   test data, and evaluation results. Each manifest includes a (mock)
   hardware-based attestation.
5. Export the collected provenance graph.
6. Verify the provenance graph: for each manifest, the manifest format, the
   hashes of every tracked artifact, and the expected manifest links are
   validated (a condensed sketch of these commands follows this list).
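
A minimal sketch of the inspection and verification commands behind the last
two steps, assuming the manifest IDs captured by the script and the exported
`mnist_provenance.json`:

```bash
# Validate individual manifests and one of the links created earlier
atlas-cli manifest validate --id="$MODEL_ID" \
    --storage-type=database --storage-url="$STORAGE_URL"
atlas-cli manifest verify-link --source="$MODEL_ID" --target="$TRAIN_DATASET_ID" \
    --storage-type=database --storage-url="$STORAGE_URL"

# Pretty-print the exported provenance graph
jq '.' mnist_provenance.json
```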
examples/oss-na-25-demo/collect_mnist_provenance.sh

Lines changed: 230 additions & 0 deletions
```bash
#!/bin/bash
# MNIST Demo Provenance Collection Script
# This script runs a demo MNIST workflow (no data prep or training) and collects
# provenance data

# Configuration
STORAGE_URL="http://localhost:8080"

# Helper function to extract ID from output
extract_id() {
    grep -o "ID: [^ ]*" "$1" | cut -d' ' -f2
}

TRAIN_DATASET="train-00000-of-00001.parquet"
TEST_DATASET="test-00000-of-00001.parquet"

if [ ! -e "$TRAIN_DATASET" ]; then
    echo "Warning: Training dataset not found. Downloading..."
    wget -q https://huggingface.co/datasets/ylecun/mnist/resolve/main/mnist/$TRAIN_DATASET
fi

if [ ! -e "$TEST_DATASET" ]; then
    echo "Warning: Test dataset not found. Downloading..."
    wget -q https://huggingface.co/datasets/ylecun/mnist/resolve/main/mnist/$TEST_DATASET
fi

echo -e "=== STEP 0: Setup Provenance Signing/Verification Key Pair ==="
openssl genpkey -algorithm RSA -out private.pem -pkeyopt rsa_keygen_bits:4096 2>/dev/null
openssl rsa -pubout -in private.pem -out public.pem 2>/dev/null

read -s -r -p "Press any key to continue"

echo -e "\n=== STEP 1: Generate Provenance for MNIST Training Data ==="

read -s -r -p "Create training dataset manifest..."
atlas-cli dataset create \
    --paths="$TRAIN_DATASET" \
    --ingredient-names="MNIST Training Dataset" \
    --name="MNIST Training Data" \
    --author-org="https://huggingface.co/datasets/ylecun/mnist/tree/main/mnist/blob/main/mnist/$TRAIN_DATASET" \
    --author-name="ylecun" \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    --key=private.pem \
    > train_dataset_output.txt
TRAIN_DATASET_ID=$(extract_id train_dataset_output.txt)
echo "Dataset ID: $TRAIN_DATASET_ID"

read -s -r -p "Display training data manifest"
atlas-cli manifest export \
    --id=$TRAIN_DATASET_ID \
    --format=json \
    | jq '.'

read -s -r -p "Press any key to continue"

echo -e "\n=== STEP 2: Generate Provenance for Model Training Artifacts ==="

read -s -r -p "Create training script manifest..."
atlas-cli software create \
    --paths=../mnist/train.py \
    --ingredient-names="MNIST Training Script" \
    --name="MNIST CNN Training Implementation" \
    --software-type="script" \
    --version="1.0.0" \
    --author-org="Your Organization" \
    --author-name="Your Name" \
    --description="PyTorch training script for MNIST CNN model" \
    --with-tdx \
    --key=private.pem \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    > training_script_output.txt
TRAINING_SCRIPT_ID=$(extract_id training_script_output.txt)
echo "Training Script ID: $TRAINING_SCRIPT_ID"

touch classifier.onnx
read -s -r -p "Create model manifest..."
atlas-cli model create \
    --paths=classifier.onnx \
    --ingredient-names="MNIST CNN Model" \
    --name="Trained MNIST Classifier" \
    --author-org="Your Organization" \
    --author-name="Your Name" \
    --key=private.pem \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    > model_output.txt
MODEL_ID=$(extract_id model_output.txt)
echo "Model ID: $MODEL_ID"

read -s -r -p "Display model's manifest"
atlas-cli manifest export \
    --id=$MODEL_ID \
    --format=json \
    | jq '.'

read -s -r -p "Press any key to continue"

echo -e "\n=== STEP 3: Link Model Training Manifests ==="

read -s -r -p "Link MNIST training dataset to model..."
atlas-cli manifest link \
    --source=$MODEL_ID \
    --target=$TRAIN_DATASET_ID \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    > model_train_dataset_link_output.txt
MODEL_ID=$(extract_id model_train_dataset_link_output.txt)
echo "Updated Model ID: $MODEL_ID"

read -s -r -p "Link training script to model..."
atlas-cli manifest link \
    --source=$MODEL_ID \
    --target=$TRAINING_SCRIPT_ID \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    > model_train_script_link_output.txt
MODEL_ID=$(extract_id model_train_script_link_output.txt)
echo "Updated Model ID: $MODEL_ID"

read -s -r -p "Display model's manifest"
atlas-cli manifest export \
    --id=$MODEL_ID \
    --format=json \
    | jq '.'

read -s -r -p "Press any key to continue"

echo -e "\n=== STEP 4: Generate & Link Provenance for Model Evaluation Artifacts ==="

atlas-cli dataset create \
    --paths="$TEST_DATASET" \
    --ingredient-names="MNIST Test Dataset" \
    --name="MNIST Test Data" \
    --author-org="https://huggingface.co/datasets/ylecun/mnist/tree/main/mnist/blob/main/mnist/$TEST_DATASET" \
    --author-name="ylecun" \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    --key=private.pem \
    > test_dataset_output.txt
TEST_DATASET_ID=$(extract_id test_dataset_output.txt)
echo "Test Dataset ID: $TEST_DATASET_ID"

atlas-cli software create \
    --paths=../mnist/eval.py \
    --ingredient-names="MNIST Evaluation Script" \
    --name="MNIST Model Evaluation Implementation" \
    --software-type="script" \
    --version="1.0.0" \
    --author-org="Your Organization" \
    --author-name="Your Name" \
    --description="PyTorch evaluation script for MNIST CNN model" \
    --with-tdx \
    --key=private.pem \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    > eval_script_output.txt
EVAL_SCRIPT_ID=$(extract_id eval_script_output.txt)
echo "Evaluation Script ID: $EVAL_SCRIPT_ID"

touch eval_results.json
echo "Creating evaluation results manifest linked to model..."
atlas-cli evaluation create \
    --path=eval_results.json \
    --name="MNIST Model Evaluation Results" \
    --author-org="Your Organization" \
    --author-name="Your Name" \
    --model-id=$MODEL_ID \
    --dataset-id=$TEST_DATASET_ID \
    --hash-alg=sha384 \
    --key=private.pem \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    > eval_results_output.txt
EVAL_RESULTS_ID=$(extract_id eval_results_output.txt)
echo "Evaluation Results ID: $EVAL_RESULTS_ID"

atlas-cli manifest link \
    --source=$EVAL_RESULTS_ID \
    --target=$EVAL_SCRIPT_ID \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    > eval_script_link_output.txt
EVAL_RESULTS_ID=$(extract_id eval_script_link_output.txt)
echo "Updated Eval Results ID: $EVAL_RESULTS_ID"

read -s -r -p "Press any key to continue"

echo -e "\n=== STEP 5: Export Provenance Graph ==="
atlas-cli manifest export \
    --id=$EVAL_RESULTS_ID \
    --storage-type=database \
    --storage-url=$STORAGE_URL \
    --format=json \
    --max-depth=10 \
    --output=mnist_provenance.json

read -s -r -p "Press any key to continue"

echo -e "\n=== STEP 6: Validate Provenance ==="

read -s -r -p "Validate model manifest..."
atlas-cli manifest validate \
    --id=$MODEL_ID \
    --storage-type=database \
    --storage-url=$STORAGE_URL

read -s -r -p "Validate evaluation results manifest..."
atlas-cli manifest validate \
    --id=$EVAL_RESULTS_ID \
    --storage-type=database \
    --storage-url=$STORAGE_URL

INVALID_LINKED_MANIFEST_ID="urn:c2pa:123e4567-e89b-12d3-a456-426614174000"

read -s -r -p "Validate bad manifest link (should fail)..."
atlas-cli manifest verify-link \
    --source=$MODEL_ID \
    --target=$INVALID_LINKED_MANIFEST_ID \
    --storage-type=database \
    --storage-url=$STORAGE_URL

read -s -r -p "Display exported evaluation results provenance"
echo -e "\n"
jq '.' mnist_provenance.json

read -s -r -p "Finish demo"
echo -e "\n"
rm -f *_output.txt *.pem classifier.onnx eval_results.json mnist_provenance.json
```
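
The script cleans up its own temporary files but leaves the database backend
running. A minimal teardown sketch, assuming the docker-compose setup from the
demo README:

```bash
# Stop and remove the storage backend containers started for the demo
cd storage_service && docker-compose down && cd ..
```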

src/cc_attestation/mock.rs

Lines changed: 15 additions & 6 deletions
```diff
@@ -1,3 +1,4 @@
+use serde::{Deserialize, Serialize};
 use serde_json::json;
 
 use tdx_workload_attestation::error::Result;
@@ -7,6 +8,16 @@ pub struct MockAttestationProvider {
     platform: String,
 }
 
+#[derive(Serialize, Deserialize, Debug)]
+pub struct MockReport {
+    report_type: String,
+    platform: String,
+    timestamp: String,
+    status: String,
+    version: String,
+    message: String,
+}
+
 impl MockAttestationProvider {
     pub fn new(platform: &str) -> Self {
         Self {
@@ -19,14 +30,12 @@ impl AttestationProvider for MockAttestationProvider {
     fn get_attestation_report(&self) -> Result<String> {
         // Create a mock attestation report with platform info
         let mock_report = json!({
-            "type": "mock_attestation",
+            "report_type": "mock_attestation",
             "platform": self.platform,
             "timestamp": chrono::Utc::now().to_rfc3339(),
-            "mock_data": {
-                "version": "1.0",
-                "status": "simulated",
-                "message": "This is a mock attestation report for non-Linux or unsupported platforms"
-            }
+            "version": "1.0",
+            "status": "simulated",
+            "message": "This is a mock attestation report for non-Linux or unsupported platforms"
         });
 
         // Serialize to JSON string
```

src/manifest/mod.rs

Lines changed: 9 additions & 0 deletions
```diff
@@ -1,3 +1,4 @@
+use crate::cc_attestation::mock::MockReport;
 use crate::error::{Error, Result};
 use crate::hash;
 use crate::storage::traits::StorageBackend;
@@ -991,6 +992,14 @@ fn extract_assertion_details(
                 "enforced": do_not_train.enforced,
             })
         }
+        atlas_c2pa_lib::assertion::Assertion::CustomAssertion(custom) => {
+            let r_str = custom.data.as_str().unwrap();
+            let r: MockReport = serde_json::from_str(r_str).unwrap();
+            serde_json::json!({
+                "label": custom.label,
+                "data": r,
+            })
+        }
         _ => serde_json::json!({"type": "Unknown"}),
     }
 }
```
