-
-
Notifications
You must be signed in to change notification settings - Fork 491
feat: replace microOS with leapMicro #1936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: leapmicro-init
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @pat-s, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces significant changes by making openSUSE LeapMicro the default operating system for Kube-Hetzner clusters, moving away from openSUSE MicroOS. The primary motivation is to leverage LeapMicro's stability and more controlled update mechanisms. The update includes new Packer templates for LeapMicro, a robust SELinux policy specifically designed for Kubernetes components on LeapMicro, and flexible OS selection options for users. The changes also involve refactoring Terraform configurations to support multi-OS environments and improving cloud-init scripts for better system hardening and management. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a significant and well-executed effort to add support for LeapMicro as the default OS. The changes are comprehensive, covering Terraform configurations, Packer templates, and scripts, making the module more robust and flexible with multi-OS support. I've identified a few issues, including a critical one in locals.tf that would cause Terraform to fail, and a couple of high-severity bugs related to hardcoded values that would prevent certain features from working as intended. There are also some minor issues in the README that should be addressed to improve clarity for users. Overall, this is excellent work on a complex feature.
locals.tf
Outdated
| placement_group : nodepool_obj.placement_group | ||
| os : nodepool_obj.os | ||
| placement_group : nodepool_obj.placement_group, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
autoscaler-agents.tf
Outdated
| cloudinit_write_files_common = local.cloudinit_write_files_common_by_os["leapmicro"] | ||
| cloudinit_runcmd_common = local.cloudinit_runcmd_common_by_os["leapmicro"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cloud-init configuration for autoscaler nodepools is hardcoding leapmicro as the OS for OS-specific scripts. This will cause issues if a user defines an autoscaler nodepool with os = "microos". You should use the os value from the nodepool definition to select the correct scripts.
cloudinit_write_files_common = local.cloudinit_write_files_common_by_os[var.autoscaler_nodepools[count.index].os]
cloudinit_runcmd_common = local.cloudinit_runcmd_common_by_os[var.autoscaler_nodepools[count.index].os]
autoscaler-agents.tf
Outdated
| cloudinit_write_files_common = local.cloudinit_write_files_common_by_os["leapmicro"] | ||
| cloudinit_runcmd_common = local.cloudinit_runcmd_common_by_os["leapmicro"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The legacy cloud-init configuration also hardcodes leapmicro. This configuration applies to all autoscaler nodepools when legacy options are used. To support different OS choices, it should use the OS from the first defined nodepool, similar to how first_nodepool_snapshot_id is determined.
cloudinit_write_files_common = local.cloudinit_write_files_common_by_os[var.autoscaler_nodepools[0].os]
cloudinit_runcmd_common = local.cloudinit_runcmd_common_by_os[var.autoscaler_nodepools[0].os],
| zypper --non-interactive --no-gpg-checks install -y k3s-selinux || { | ||
| echo "Failed to install k3s-selinux, trying alternative method..." | ||
| # Alternative: Download and install the RPM directly | ||
| curl -fsSL -o /tmp/k3s-selinux.rpm https://github.com/k3s-io/k3s-selinux/releases/download/v1.6.stable.1/k3s-selinux-1.6-1.sle.noarch.rpm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback URL for downloading k3s-selinux has a hardcoded version (v1.6.stable.1). This should use the k3s_selinux_version variable to ensure the correct version is downloaded, similar to how it's done in the hcloud-microos-snapshots.pkr.hcl file. This prevents installing an outdated version if the variable is updated.
curl -fsSL -o /tmp/k3s-selinux.rpm https://github.com/k3s-io/k3s-selinux/releases/download/${var.k3s_selinux_version}/k3s-selinux-${replace(var.k3s_selinux_version, "v", "")}-1.sle.noarch.rpm
| **Why OpenSUSE MicroOS (and not Ubuntu)?** | ||
| **Migration Notes (MicroOS → LeapMicro)** | ||
|
|
||
| As of version **X.Y.Z**, **openSUSE LeapMicro** replaced **MicroOS** as the default operating system for Kube-Hetzner installations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - **Stability & Reliability:** LeapMicro provides a stable, fixed-point release with fewer updates, reducing the risk of disruptions that MicroOS's rolling-release model occasionally introduced. | ||
| - **Controlled Updates:** LeapMicro’s snapshot-based updates allow pinning specific kernel and package versions, enabling more predictable upgrades compared to MicroOS. | ||
|
|
||
| 🚨 If you already have an existing deployment, please ensure you read the [Upgrade & Migration Guide](UPGRADE_AND_MIGRATION_GUIDE.md) before proceeding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-microos-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_microos_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-microos-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-microos-snapshots.pkr.hcl). | ||
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-leapmicro-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a typo in the file extension. It should be .hcl instead of .hlc.
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). | |
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hcl](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-microos-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_microos_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-microos-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-microos-snapshots.pkr.hcl). | ||
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-leapmicro-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This instruction for upgrading from v1.x to v2.x is a bit confusing. It advises removing opensuse_leapmicro_mirror_link, but users upgrading from v1.x would have opensuse_microos_mirror_link in their configuration. This seems to be a find-and-replace error and should be corrected to avoid confusion.
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). | |
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_microos_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
|
@pat-s Appreciate it. But please let's remove microos support, otherwise let me know why you think it's needed? My understanding is that new nodes will just require leapmicro, which is good enough. Also please make sure that gemini is happy. You can use "/gemini review" to re-review. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is a significant and well-executed pull request that adds support for LeapMicro as the new default OS. The changes are extensive and robust, covering documentation, Terraform configurations, Packer templates, and shell scripts. The implementation includes thoughtful features like conditional resource creation, more resilient cloud-init commands, and a detailed Packer template for the new LeapMicro images. My review focuses on enhancing consistency, robustness, and maintainability. I've identified a few areas for improvement, including minor documentation updates, refinements in the new Packer template to improve package management and error handling, and a suggestion to refactor the SELinux policy application for better consistency between operating systems. Overall, this is a high-quality contribution that greatly enhances the project.
| zypper --non-interactive --no-gpg-checks install -y k3s-selinux || { | ||
| echo "Failed to install k3s-selinux via zypper, trying direct RPM download..." | ||
| # Alternative: Download and install the RPM directly using the version variable | ||
| # Strip the 'v' prefix from version for the RPM filename (e.g., v1.6.stable.1 -> 1.6.stable.1) | ||
| K3S_SELINUX_RPM_VERSION=$(echo "${var.k3s_selinux_version}" | sed 's/^v//') | ||
| curl -fsSL -o /tmp/k3s-selinux.rpm "https://github.com/k3s-io/k3s-selinux/releases/download/${var.k3s_selinux_version}/k3s-selinux-$${K3S_SELINUX_RPM_VERSION}-1.sle.noarch.rpm" | ||
| rpm -i --nosignature /tmp/k3s-selinux.rpm || true | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using || true after the rpm -i command will suppress any installation errors, potentially hiding a failed installation of k3s-selinux. This could lead to an image being built without the necessary SELinux policies, causing issues later. It's better to let the build fail if the RPM installation fails, so the issue can be addressed immediately.
zypper --non-interactive --no-gpg-checks install -y k3s-selinux || {
echo "Failed to install k3s-selinux via zypper, trying direct RPM download..."
# Alternative: Download and install the RPM directly using the version variable
# Strip the 'v' prefix from version for the RPM filename (e.g., v1.6.stable.1 -> 1.6.stable.1)
K3S_SELINUX_RPM_VERSION=$(echo "${var.k3s_selinux_version}" | sed 's/^v//')
curl -fsSL -o /tmp/k3s-selinux.rpm "https://github.com/k3s-io/k3s-selinux/releases/download/${var.k3s_selinux_version}/k3s-selinux-$${K3S_SELINUX_RPM_VERSION}-1.sle.noarch.rpm"
rpm -i --nosignature /tmp/k3s-selinux.rpm
}
| **Why OpenSUSE MicroOS (and not Ubuntu)?** | ||
| **Migration Notes (MicroOS → LeapMicro)** | ||
|
|
||
| As of version **X.Y.Z**, **openSUSE LeapMicro** replaced **MicroOS** as the default operating system for Kube-Hetzner installations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-microos-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_microos_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-microos-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-microos-snapshots.pkr.hcl). | ||
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-leapmicro-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a small typo in the file extension for the packer template file in the link text. It should be .hcl instead of .hlc.
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). | |
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hcl](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
| %{ if os == "microos" ~} | ||
| # Apply SELinux policies for k8s on MicroOS (keeping old approach for now) | ||
| - | | ||
| if command -v checkmodule >/dev/null 2>&1 && command -v semodule_package >/dev/null 2>&1 && command -v semodule >/dev/null 2>&1; then | ||
| echo "Setting up SELinux policies for Kubernetes..." | ||
|
|
||
| # Create the policy file | ||
| cat > /tmp/k8s_custom_policies.te <<'SELINUX_POLICY' | ||
| module k8s_custom_policies 1.0; | ||
|
|
||
| require { | ||
| type container_t; | ||
| type cert_t; | ||
| type proc_t; | ||
| type sysfs_t; | ||
| type unreserved_port_t; | ||
| type kubernetes_port_t; | ||
| type http_port_t; | ||
| type hplip_port_t; | ||
| type node_t; | ||
| class dir { read search open getattr }; | ||
| class file { read open getattr }; | ||
| class lnk_file { read getattr }; | ||
| class tcp_socket { name_bind name_connect accept listen }; | ||
| class node { tcp_recv tcp_send }; | ||
| class peer recv; | ||
| class filesystem getattr; | ||
| } | ||
|
|
||
| # Allow containers to read certificate directories and files | ||
| allow container_t cert_t:dir { read search open getattr }; | ||
| allow container_t cert_t:file { read open getattr }; | ||
|
|
||
| # Allow containers to read proc filesystem (needed for metrics-server filesystem collector) | ||
| allow container_t proc_t:file { read open getattr }; | ||
| allow container_t proc_t:dir { read search open getattr }; | ||
| allow container_t proc_t:lnk_file { read getattr }; | ||
| allow container_t proc_t:filesystem getattr; | ||
|
|
||
| # Also allow sysfs access which is often needed alongside proc | ||
| allow container_t sysfs_t:file { read open getattr }; | ||
| allow container_t sysfs_t:dir { read search open getattr }; | ||
| allow container_t sysfs_t:lnk_file { read getattr }; | ||
| allow container_t sysfs_t:filesystem getattr; | ||
|
|
||
| # Allow containers to bind to kubernetes ports (including 10250 for metrics-server) | ||
| allow container_t kubernetes_port_t:tcp_socket { name_bind name_connect accept listen }; | ||
|
|
||
| # Allow containers to bind to hplip ports (including 9100 for node-exporter) | ||
| allow container_t hplip_port_t:tcp_socket { name_bind name_connect accept listen }; | ||
|
|
||
| # Allow containers to bind to unreserved high ports | ||
| allow container_t unreserved_port_t:tcp_socket { name_bind name_connect accept listen }; | ||
|
|
||
| # Allow containers to bind to http ports (some exporters may use these) | ||
| allow container_t http_port_t:tcp_socket { name_bind name_connect accept listen }; | ||
|
|
||
| # Allow container-to-container communication (needed for readiness probes) | ||
| allow container_t container_t:tcp_socket { name_connect accept }; | ||
| allow container_t container_t:peer recv; | ||
|
|
||
| # Allow containers to use network nodes | ||
| allow container_t node_t:node { tcp_recv tcp_send }; | ||
| SELINUX_POLICY | ||
|
|
||
| echo "Compiling and applying SELinux policies..." | ||
| checkmodule -M -m -o /tmp/k8s_custom_policies.mod /tmp/k8s_custom_policies.te && \ | ||
| semodule_package -o /tmp/k8s_custom_policies.pp -m /tmp/k8s_custom_policies.mod && \ | ||
| semodule -i /tmp/k8s_custom_policies.pp && \ | ||
| rm -f /tmp/k8s_custom_policies.{te,mod,pp} && \ | ||
| echo "Custom SELinux policies applied successfully" || \ | ||
| echo "Warning: Could not apply custom SELinux policies" | ||
| else | ||
| echo "SELinux policy tools not available, skipping custom policies" | ||
| fi | ||
| %{ endif ~} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SELinux policy application for microos is implemented as a large inline script within runcmd. This is inconsistent with the more robust approach used for leapmicro, which uses a separate script file and a systemd service. For better readability, maintainability, and consistency, consider refactoring the microos SELinux logic to follow the same pattern as leapmicro (using write_files for a script and service, then enabling it in runcmd).
|
|
||
| # Keep the package list minimal and working | ||
| # SELinux tools will be installed via transactional-update | ||
| needed_packages = join(" ", concat(["restorecond policycoreutils policycoreutils-python-utils audit open-iscsi nfs-client git selinux-policy xfsprogs cryptsetup lvm2 git bash-completion udica qemu-guest-agent bash-completion"], var.packages_to_install)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The git and bash-completion packages are listed twice in the needed_packages list. Removing the duplicates will make the list cleaner and slightly more efficient.
needed_packages = join(" ", concat(["restorecond policycoreutils policycoreutils-python-utils audit open-iscsi nfs-client selinux-policy xfsprogs cryptsetup lvm2 git bash-completion udica qemu-guest-agent"], var.packages_to_install))
| } | ||
| # Disable dontaudit rules to make SELinux less restrictive and show all denials | ||
| echo "Disabling SELinux dontaudit rules for better visibility..." | ||
| semodule -DB || true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using || true after semodule -DB suppresses potential errors. While less critical than an installation failure, it's generally better to let the build fail on unexpected errors from system commands to ensure a fully valid build. Please consider removing || true to make the build process more strict.
semodule -DB
| provisioner "shell" { | ||
| pause_before = "5s" | ||
| inline = [local.clean_up] | ||
| expect_disconnect = true | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The final provisioner for the x86 build has expect_disconnect = true, but the clean_up script it runs does not perform a reboot. This is inconsistent with the ARM build's final provisioner, which correctly omits this setting. Please remove expect_disconnect = true for consistency and to avoid potential issues with Packer waiting for a disconnect that never happens.
provisioner "shell" {
pause_before = "5s"
inline = [local.clean_up]
}
|
@pat-s I merged Stani's branch into a local branch, the reason is that I want him in the commit history, as he started all this and spent a lot of time. If you can work with this it would be amazing please. For SELinux, we support Udica, don't forget. Please address the merge conflicts, and all the reviews and then I will merge into this local branch and proceed to merging into master myself after testing it all. Thanks |
|
/gemini review |
|
@codex review please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is a significant and well-executed pull request that adds support for LeapMicro as the default OS, enhancing stability. The changes are comprehensive, touching everything from the README and packer templates to the core Terraform logic. The introduction of OS selection via the os variable is a great feature for flexibility. The robust SELinux policy handling for LeapMicro, using a dedicated systemd service, is a solid approach to address the stricter security environment. I also appreciate the improvements for idempotency and robustness in the cloud-init scripts, like the more resilient semanage and ip route commands, and the conditional logic for configuration files. The updated create.sh script also improves the user experience for setting up new clusters. I have a few minor suggestions for documentation typos and code cleanup, and one more important point about error handling in the new packer template. Overall, this is excellent work.
| # Strip the 'v' prefix from version for the RPM filename (e.g., v1.6.stable.1 -> 1.6.stable.1) | ||
| K3S_SELINUX_RPM_VERSION=$(echo "${var.k3s_selinux_version}" | sed 's/^v//') | ||
| curl -fsSL -o /tmp/k3s-selinux.rpm "https://github.com/k3s-io/k3s-selinux/releases/download/${var.k3s_selinux_version}/k3s-selinux-$${K3S_SELINUX_RPM_VERSION}-1.sle.noarch.rpm" | ||
| rpm -i --nosignature /tmp/k3s-selinux.rpm || true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using || true after the rpm -i command will suppress any errors if the fallback installation of k3s-selinux fails. This could lead to a broken image without any indication of failure during the packer build. If this fallback mechanism is triggered, it should probably fail the build if it doesn't succeed.
rpm -i --nosignature /tmp/k3s-selinux.rpm
| **Why OpenSUSE MicroOS (and not Ubuntu)?** | ||
| **Migration Notes (MicroOS → LeapMicro)** | ||
|
|
||
| As of version **X.Y.Z**, **openSUSE LeapMicro** replaced **MicroOS** as the default operating system for Kube-Hetzner installations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ### Turning Off Automatic Upgrades | ||
|
|
||
| _If you wish to turn off automatic MicroOS upgrades (Important if you are not launching an HA setup that requires at least 3 control-plane nodes), you need to set:_ | ||
| _If you wish to turn off automatic LeapMico upgrades (Important if you are not launching an HA setup that requires at least 3 control-plane nodes), you need to set:_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a typo here as well. It should be LeapMicro.
| _If you wish to turn off automatic LeapMico upgrades (Important if you are not launching an HA setup that requires at least 3 control-plane nodes), you need to set:_ | |
| _If you wish to turn off automatic LeapMicro upgrades (Important if you are not launching an HA setup that requires at least 3 control-plane nodes), you need to set:_ |
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-microos-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_microos_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-microos-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-microos-snapshots.pkr.hcl). | ||
| - Within your project folder, run the `createkh` installation command, see [Do Not Skip](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner#-do-not-skip-creating-your-kubetf-file-and-the-opensuse-leapmicro-snapshot) section above. This will create the snapshot for you. Don't worry, it's non-destructive and will leave your kube.tf and terraform state alone, but will download the required other packer file. | ||
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a typo in the file extension. It should be .pkr.hcl instead of .pkr.hlc.
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hlc](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). | |
| - Then modify your kube.tf to use version >= 2.0, and remove `extra_packages_to_install` and `opensuse_leapmicro_mirror_link` variables if used. This functionality has been moved to the packer snapshot definition, see [packer-template/hcloud-leapmicro-snapshots.pkr.hcl](https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner/blob/master/packer-template/hcloud-leapmicro-snapshots.pkr.hcl). |
| [Unit] | ||
| Description=Apply K8s SELinux Policy for Leap Micro | ||
| DefaultDependencies=no | ||
| After=local-fs.target | ||
| Before=k3s.service network-pre.target | ||
|
|
||
| [Service] | ||
| Type=oneshot | ||
| RemainAfterExit=yes | ||
| ExecStart=/usr/local/bin/apply-k8s-selinux-policy.sh | ||
|
|
||
| [Install] | ||
| WantedBy=sysinit.target No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
|
||
| # Keep the package list minimal and working | ||
| # SELinux tools will be installed via transactional-update | ||
| needed_packages = join(" ", concat(["restorecond policycoreutils policycoreutils-python-utils audit open-iscsi nfs-client git selinux-policy xfsprogs cryptsetup lvm2 git bash-completion udica qemu-guest-agent bash-completion"], var.packages_to_install)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The packages git and bash-completion are listed twice in the needed_packages local variable. While this is harmless, removing the duplicates would make the list cleaner.
needed_packages = join(" ", concat(["restorecond policycoreutils policycoreutils-python-utils audit open-iscsi nfs-client selinux-policy xfsprogs cryptsetup lvm2 git bash-completion udica qemu-guest-agent"], var.packages_to_install))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| dynamic "network" { | ||
| for_each = [] | ||
| content { | ||
| network_id = var.network_id | ||
| ip = var.private_ipv4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attach private network during server creation
The inline network block on the server resource is now generated with for_each = [], so the server is created without any private network attached. When a nodepool disables both public IPv4 and IPv6 (the standard setup for private-only clusters), Hetzner requires a network interface at create time and the API rejects the server. Even if creation succeeded, subsequent provisioners that connect via self.network would run before hcloud_server_network attaches the subnet. The previous version attached the network inline, so private-only nodes worked. Please attach the private network during server creation (or gate the dynamic block on var.network_id when public IPs are disabled) to avoid breaking private clusters.
Useful? React with 👍 / 👎.
|
Most importantly @pat-s , we move all of it to leapmicro, but microos nodes should keep on working, there's no reason they shouldn't work. Selinux, again we support Udica, please explore that. |
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Hmmm. Not a fan of this change right now (and in general). Why not let the user decide which OS they want to use? Both are totally fine and can be changed through a simple variable which in the end just uses a different packer template. Besides I am still skeptical that a one-off migration will work: as said, nodes must be replaced one-by-one. Somewhat unrelated side-note: I am just partly happy having to deal with two AIs now in the review. While I am not against AI and use it myself for coding assistance, I think there's a lot of value for human review > AI review. Even if there's a mix, it would be great to not have a 95%/5% in favor of AI.
This branch was branched off from Stanis initial branch. I think rebasing will again will just create a huge mess.
Hm, this is new to me (did I overlook something from previous discussions?). I gotta say I don't have resources left for another refactoring which is not straight forward and guaranteed to work. Thinking about this reply for some minutes, I have to conclude that I was hoping to not get into a "large rebase" state again but rather having found a good spot when placing this PR without any conflicts. Coming back after creation, I am somewhat overwhelmed with 20+ AI comments + an (unexpected) large refactor request. I think at this point I have to draw a line and not continue here. Please feel free to continue and get this merged in the way you envision it. I don't care much about contribution history of mine. At this point right now I think that's the most effective (and maybe only) way to get this merged. |
Preface
This effort was initially started in #1628 by @stani-o and for quite some time continued in a hybrid way by me.
After intense local testing, it became clear that #1628 needs a lot more changes to work realiably, especially in relation to SELinux.
This PR addresses most of these and for the sake of autonomy in regards to merging the default branch and branching, I now decided to create a new PR which effectively supersedes #1628.
To get an idea of the "robustness" of this branch: I am between 20-30 packer rebuilds meanwhile and around ~ 50h of work for this implementation.
I didn't yet focus on the documentation and README, but maybe this can also be done by somebody else once the technical points are agreed on.
I'll try to keep this branch in sync with the default branch, so testing is easier.
SELinux
LeapMicro is much more strict about SELinux in general and pretty much denies anything by default.
To make essential cluster components work (CSI pods, metrics server, node exporter, etc.), several allowances are required. These must be applied by a tailored SELinux policy which is applied during the cloud-init phase.
The policy itself is implemented as a systemd service. This way,
cloud-initcan implement an order to first apply the policy and then startk3son the node.General
Support for
microosis preserved and users can now select the OS with theosvariable within the node blocks:{ name = "control-plane-fsn1", server_type = "cax11", location = "fsn1", labels = [], taints = [], count = 1 keep_disk_size = true os = "leapmicro" },The var will default to
leapmicrobut can also be set tomicroos.Note
It is highly recommended to do so and migrate nodes one at a time - otherwise your whole cluster will be recreated at the same time, which will very likely result in data corruption if you're using longhorn.
Packer templates
New packer templates were added which can be built in the same way as the previous MicroOS ones.
Testing
Even though I have a mixed-arch cluster with 5 nodes and 3 control planes running now with this branch, it would be great to get some more testing in.
Note that this should only be conducted if you're confident to understand this module and you're able to conduct ad-hoc fixes on nodes and other backup/restore related task. I can't guarantee that it's not breaking your existing cluster!
Testing on a new cluster is surely easiest, but that should also be done. If this works, one can try to migrate this new test cluster to this branch.
While I haven't tried it yet, kubernetes autoscaler for Hetzner support for multiple OS is supported since some time now. This means one can use different image IDs for the nodepools, which is needed to allow multiple OS choices in the first place.