Closed
Description
Hello everyone, thank you for the great work on Kubeflow.
My name is Dejan Golubovic, I am a software engineer at CERN, where we run Kubeflow in production.
My experience with Kubeflow includes:
- Deploying a cluster on a private cloud, using a custom ArgoCD repository
- Customizing Kubeflow components to fit our use cases
- Maintaining a fork of the Kubeflow repo
- Custom centraldashboard, jupyter-web-app, kale and kfp setup
- Adding envoyfilters, roles and rolebindings when necessary
- Integrating CERN network file system EOS and Kerberos authentication
- Maintaining the private cloud instance, fixing ongoing issues
- Providing examples and documentation for CERN users
- Onboarding users and helping them run workloads (pipelines, Katib, TFJobs, PyTorchJobs...)
- Running scalable model training on GCP and Azure instances (up to 128 GPUs)
Our previous KubeCon talks can be seen here:
- KubeCon 2021 NA - https://www.youtube.com/watch?v=sBbfUPaTCVA&ab_channel=CNCF%5BCloudNativeComputingFoundation%5D
- KubeCon 2021 Europe - https://www.youtube.com/watch?v=HuWt1N8NFzU&ab_channel=CNCF%5BCloudNativeComputingFoundation%5D
While running the production instance, we are gathering feedback from CERN users.
I would like to help implement some of the features for which we see a need.
One such feature is showing Katib logs on the UI: #971
I could help with this in the coming weeks.
Informing WG leaders - @kubeflow/wg-automl-leads
Please let me know if you need more information.