feat: add rancher integration pipeline #1033

Open · wants to merge 4 commits into base: master
Conversation

FrankYang0529 (Member)
Problem:

Solution:

Related Issue(s):

harvester/harvester#7889

Test plan:

Additional documentation or context

@Copilot Copilot AI review requested due to automatic review settings April 17, 2025 06:20
@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces a new GitHub Actions workflow for Rancher integration that automates version patching, building Harvester artifacts, and running integration tests.

  • Adds a workflow to patch RKE2 and Rancher versions, build artifacts, and deploy/test clusters.
  • Includes steps for cloning ipxe-examples, performing cluster setup, running tests, and collecting logs.

@FrankYang0529 FrankYang0529 force-pushed the HARV-7889 branch 5 times, most recently from 0835b17 to 072422c Compare April 23, 2025 08:48
@innobead innobead requested a review from Vicente-Cheng April 23, 2025 09:39
working-directory: ${{ steps.ipxe.outputs.VAGRANT_HOME }}
run: |
vagrant destroy -f
- name: Remove OVMF.fd line if needed
Contributor

What's the purpose of this?

Member Author

I copied this from:

- name: Remove OVMF.fd line if needed
working-directory: ${{ steps.ipxe.outputs.VAGRANT_HOME }}
run: |
if [ ! -f /usr/share/qemu/OVMF.fd ]; then
echo "Remove libvirt loader: can't find UEFI firmware"
sed -i 's/libvirt.loader.*/#libvirt.loader = /' Vagrantfile
fi
@bk201 do you know why we need this in CI? Does it avoid a "UEFI firmware not found" error?

Contributor

If the OVMF file is present on the system, the VM will run in UEFI mode; otherwise, it will run in legacy BIOS mode. Not every runner has the file present.
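In other words, the step comments out the `libvirt.loader` line so the VM falls back to legacy BIOS when the firmware file is missing. A self-contained sketch of that logic, with `sed -i` so the edit actually persists (the firmware path and Vagrantfile location are taken from the snippet above, not verified against every runner):

```shell
# Comment out the libvirt UEFI loader when the firmware file is absent,
# so the VM boots in legacy BIOS mode instead of failing.
disable_uefi_if_missing() {
  ovmf="$1"         # e.g. /usr/share/qemu/OVMF.fd
  vagrantfile="$2"  # path to the Vagrantfile to patch
  if [ ! -f "$ovmf" ]; then
    echo "Remove libvirt loader: can't find UEFI firmware"
    # -i is required; without it sed only prints to stdout and nothing changes
    sed -i 's/libvirt.loader.*/#libvirt.loader = /' "$vagrantfile"
  fi
}
```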

Contributor

If some snippets are reusable, I suggest creating util scripts or workflows and sharing them between pipelines. In addition, please also add comments to clarify the purpose, or prefer making the step name self-explanatory. It will be good for understanding.

@innobead (Contributor) left a comment

Can we make a reusable workflow that can be used in both workflows to reduce the duplicated implementation here? Basically, have a job run the reusable workflow, then start the differing steps in another job.
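A sketch of that shape, assuming a hypothetical shared workflow filename and inputs (the real file layout and input names would come from this repo):

```yaml
# Hypothetical shared workflow: .github/workflows/harvester-build-test.yml
name: harvester-build-test
on:
  workflow_call:
    inputs:
      rke2_version:
        type: string
        required: false
      rancher_version:
        type: string
        required: false
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ... shared patch/build/deploy/test steps go here ...

# Caller side (e.g. the Rancher integration workflow) would then be roughly:
# jobs:
#   shared:
#     uses: ./.github/workflows/harvester-build-test.yml
#     with:
#       rancher_version: v2.11.1
```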

working-directory: ./ci/terraform
run: |
./cleanup_test_files.sh
- name: Testing existing files
Contributor

Check files exist on provisioned hosts

nit: The name is quite general; can we make it more specific? Which files should exist? check_files.sh is also too generic a name.

Member Author

Since check_files.sh is used in another workflow, I will update the naming in a follow-up PR.

./test_terraform_vm.sh ${{ steps.ipxe.outputs.VAGRANT_HOME }}/settings.yml

- name: Check Rancher System Agent
run: |
Contributor

Should we move this just after Setup cluster?

Member Author

I prefer to keep it after the terraform check. If we check rancher-system-agent right after Setup cluster, the cluster may not be ready yet.

Contributor

I see, then it can be part of tests.

@innobead (Contributor) commented Apr 25, 2025

The ultimate goal is to automatically detect the versions of Rancher and RKE2, and then run these workflows.

There are a few things we need to improve later.

  • Create a matrix strategy outlining which versions to run in the workflow, for instance Harvester 1.5 / Rancher 2.11 / RKE2 1.32. We only need to execute pipelines for supported branches and the head; currently, these are head/1.6, 1.5, 1.4, and 1.3. To conserve resources, we will only run head, n, and n-1, i.e. head, 1.5, and 1.4.
  • Schedule a daily cron job to check for any new releases (just released that day) of Rancher or RKE2, and run the testing pipeline as needed. You may need an additional step/job at the start to determine them. If no new releases are available, there's no need to execute anything.
  • With the above improvements, still ensure the capability to run the workflow manually by entering upstream versions.
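The three points above could be sketched as a workflow skeleton like the following; the version pairings in the matrix are illustrative placeholders, not a confirmed support matrix, and the job/input names are hypothetical:

```yaml
# Hypothetical skeleton combining a daily release check, a version matrix,
# and a manual entry point.
on:
  schedule:
    - cron: '0 2 * * *'    # daily check for new Rancher/RKE2 releases
  workflow_dispatch:       # keep the ability to run manually
    inputs:
      rancher_version:
        description: Override the detected Rancher version
        required: false
jobs:
  detect:
    runs-on: ubuntu-latest
    steps:
      # compare today's Rancher/RKE2 releases against the last tested ones
      # and skip the test job entirely when nothing is new
      - run: echo "detect new releases here"
  test:
    needs: detect
    runs-on: ubuntu-latest
    strategy:
      matrix:
        include:
          - { harvester: head,  rancher: '2.11', rke2: '1.32' }
          - { harvester: '1.5', rancher: '2.11', rke2: '1.32' }
          - { harvester: '1.4', rancher: '2.10', rke2: '1.31' }  # placeholder pairing
    steps:
      - run: echo "run pipeline for Harvester ${{ matrix.harvester }}"
```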

@FrankYang0529 FrankYang0529 force-pushed the HARV-7889 branch 2 times, most recently from 12f7b4a to 851bacf Compare April 25, 2025 05:32
@innobead (Contributor) left a comment

Just a few minor pieces of feedback. After resolving them, we can merge this first, so you can verify and fine-tune it later.

- name: Create new-version.yaml
working-directory: ./ci/upgrade
run: |
./create_upgrade_file.sh
Contributor

Should this be create_version_file.sh instead?

BTW, can we simplify the upgrade operation as follows? When creating a new version or an upgrade, apply it directly.

- name: Start file server
  run: |
    python3 -m http.server 8000 --bind 0.0.0.0 &

- name: Create the upgrade version
  working-directory: ./ci/upgrade
  run: |
    ./create_version.sh

- name: Create the upgrade
  working-directory: ./ci/upgrade
  run: |
    ./create_upgrade.sh
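If the upgrade flow is driven by Harvester's Version custom resource (an assumption here; the actual manifests live in the ci/upgrade scripts), create_version.sh might reduce to a kubectl apply of a small manifest such as:

```yaml
# Hypothetical Version manifest pointing at the local file server started above;
# field names follow the harvesterhci.io/v1beta1 API, the name and URL are
# placeholders.
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
  name: new-version
  namespace: harvester-system
spec:
  isoURL: http://<runner-ip>:8000/harvester-master-amd64.iso
```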


- name: Check Rancher System Agent
run: |
for i in 0 1 2; do
Contributor

Can we move this to check_rancher_system_agent.sh? I would like to move all checks and tests under the terraform folder to make them consistent with the existing test scripts like test_terraform_vm.sh.
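A hypothetical shape for such a check_rancher_system_agent.sh, with the retry loop from the step factored into a helper; the ssh target, user, and service name in the comment are assumptions based on the step being discussed:

```shell
# retry <tries> <delay_seconds> <command...>: run the command until it
# succeeds or the attempts are exhausted; return the final status.
retry() {
  tries="$1"; delay="$2"; shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# check_rancher_system_agent.sh would then be roughly (host/user hypothetical):
# retry 3 10 ssh "rancher@$node_ip" "sudo systemctl is-active rancher-system-agent"
```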


- name: Test Login
run: |
node0_ip=$(yq e ".harvester_network_config.cluster[0].ip" ${{ steps.ipxe.outputs.VAGRANT_HOME }}/settings.yml)
Contributor

Ditto; move this to check_login.sh.
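A hypothetical check_login.sh along those lines; the login endpoint (Rancher's local-provider login path) and the expected 201 response are assumptions, not confirmed from this PR:

```shell
# Return success only when the login endpoint accepts the credentials
# (HTTP 201 Created).
check_login() {
  ip="$1"; user="$2"; pass="$3"
  code=$(curl -sk -o /dev/null -w '%{http_code}' \
    -X POST "https://$ip/v3-public/localProviders/local?action=login" \
    -H 'Content-Type: application/json' \
    -d "{\"username\":\"$user\",\"password\":\"$pass\"}")
  [ "$code" = "201" ]
}
```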

exit 1
fi

- name: Test RKE2 cert rotation
Contributor

check_rke2_cert_rotation.sh
