Add MNIST training provenance collection example #3
marcelamelara
left a comment
Thanks @sandlbn! LGTM, I'll approve once I run the example.
    --storage-url=http://localhost:8080
    ```

    ## Troubleshooting
this is awesome, I wish more docs had such a section :)
    1. Extend the Pipeline: Add data preprocessing steps, hyperparameter tuning, or model optimization
    2. Track Experiments: Create manifests for different training runs with varying parameters
    3. Build CI/CD Integration: Automatically collect provenance in your ML pipeline
    4. Create Visualizations: Use the provenance graph to create visual representations of your ML workflow
    5. Implement Governance: Use provenance data for model approval and deployment decisions
Have you tested this example in TDX? If so, we should add a line about running inside TDX, either down here or higher up in the doc, to explain the use of the with-tdx feature.
It would be cool to add it in the second iteration. No, I haven't tested it with TDX. I think the best way to enable TDX is to add additional parameters to the script, but for now we can track this as an issue.
tracking this in #7
Co-authored-by: Marcela Melara <[email protected]>
marcelamelara
left a comment
Thanks for the fixes, LGTM
This PR adds an example demonstrating how to collect provenance data from an ML workflow using Atlas CLI. The example tracks a full MNIST training pipeline, including dataset download, model training, and evaluation.
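
For readers new to the idea, here is a minimal sketch of what "collecting provenance" for such a pipeline can mean in principle: each step records content hashes of its inputs and outputs, so the dataset, model, and evaluation results can be linked afterwards. This is not the Atlas CLI and does not reflect its manifest format or commands; the function names, record fields, and placeholder file names (`make_record`, `build_demo_records`, `mnist.bin`, `model.bin`, `metrics.json`) are all hypothetical and exist only for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_record(step: str, inputs: list[Path], outputs: list[Path]) -> dict:
    """Build a hypothetical provenance record linking a step's inputs to its outputs."""
    return {
        "step": step,
        "inputs": {p.name: sha256_of(p) for p in inputs},
        "outputs": {p.name: sha256_of(p) for p in outputs},
    }


def build_demo_records(workdir: Path) -> list[dict]:
    """Write placeholder artifacts (not real MNIST data) and record three pipeline steps."""
    dataset = workdir / "mnist.bin"
    model = workdir / "model.bin"
    metrics = workdir / "metrics.json"
    dataset.write_bytes(b"placeholder dataset bytes")
    model.write_bytes(b"placeholder model weights")
    metrics.write_text(json.dumps({"accuracy": None}))
    return [
        make_record("download", [], [dataset]),
        make_record("train", [dataset], [model]),
        make_record("evaluate", [dataset, model], [metrics]),
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(json.dumps(build_demo_records(Path(tmp)), indent=2))
```

Because the hash of the dataset written by the download step reappears as an input of the training step, the records form a chain that can be followed from evaluation results back to the original data. The actual example in this PR relies on Atlas CLI for that linkage.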