Migration guide from k8s-workload-registrar #40
Conversation
migration/config/spire.spiffe.io_clusterfederatedtrustdomains.yaml
Do you think it's confusing to have 2 sets of configs in the project? One in the migration directory and one in demo? Should migration reference demo configs instead?
```
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.8.0
```
Should this annotation be present?
It's auto-generated, so I'd rather not remove it.
@keeganwitt Thanks for catching this. I'll update the FAQ to reflect that part is missing.
@keeganwitt A lot of the config in here is copied from the demo/ and docs/ directories. The demo directory is a more complex setup with 2 clusters and I wanted this one to be simpler. I'm thinking to push this forward as is so we can deprecate k8s-workload-registrar, and open a follow-up PR to reduce duplication and clean up the docs. What do you think?
I'm in favor of merging it soon. Incomplete/imperfect documentation is better than none at all. We won't move off Registrar until it supports downstream/parentId in the CRD. Otherwise, we have to manually set up downstream agents rather than leveraging our Helm deployment. Do you disagree that that level of feature parity should be present before official deprecation?
I think the downstream PR is a good one to merge prior to deprecating. In general I don't think we need full feature parity as long as we have alternatives/workarounds to at least move things forward. I try to cover this in the FAQ section of this PR. For example I don't think we need to support annotation-based registration going forward. Is there a specific feature aside from downstream you think we need to support prior to deprecating k8s-workload-registrar?
We don't need parity for features that are replaced with other concepts. We weren't using the annotation-based registration before, but the guide should probably address that also (and any others you're aware of). The two features we'd like are the ability to specify the downstream flag and the ability to specify a parent ID.
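For context on the feature gap being discussed: until the CRD supports these fields, downstream entries must be registered by hand against the SPIRE server. A minimal sketch of that manual step using SPIRE's `entry create` flags (the pod name, container name, and SPIFFE IDs below are illustrative placeholders):

```
# Sketch: manually registering a downstream entry via the SPIRE
# server CLI -- the step Helm users would like the CRD to replace.
kubectl exec -n spire spire-server-0 -c spire-server -- \
  /opt/spire/bin/spire-server entry create \
    -spiffeID spiffe://example.org/downstream-ca \
    -parentID spiffe://example.org/spire/agent/k8s_psat/demo-cluster/node \
    -selector k8s:ns:spire \
    -downstream
```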
```
bindAddress: 127.0.0.1:8083
leaderElection:
  leaderElect: true
  resourceName: 98c9c988.spiffe.io
```
What is this name referring to?
Good question, it's the name of the ConfigMap that's created to do leader election. It's defined here: https://pkg.go.dev/k8s.io/component-base/config/v1alpha1#LeaderElectionConfiguration
Will it always have this name?
You can use a different name if you like.
Oh I see, wasn't actually sure who generated that ConfigMap. Good to know that we control the name.
I did the following in my Helm chart:
```
resourceName: {{ .Release.Name | sha256sum | trunc 8 }}.spiffe.io
```
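The `sha256sum | trunc 8` pipeline just derives a stable, per-release lock name; any unique string works. A minimal non-Helm sketch (the name below is an illustrative placeholder):

```
# Sketch: the leader-election lock name is user-chosen; it only
# needs to be unique per controller-manager deployment.
leaderElection:
  leaderElect: true
  resourceName: my-cluster.spiffe.io
```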
```
2022-12-13T00:41:21.844Z INFO webhook-manager Webhook configuration patched with CABundle
```
### I'm using CRD mode Kubernetes Workload Registrar and it gets stuck deleting the SpiffeId CRD. What do I do? |
Since the new CRDs also appear to have finalizers, could we run into the same issue with those as we did with the SpiffeID CRDs?
I see the finalizer mentioned in the repo but I don't see it being attached to any resources as it was in the k8s-workload-registrar. So I don't think you'll run into that issue. @azdagron can you confirm?
I wasn't sure, because I saw it in the RBAC permissions.
Those were added by default with kubebuilder. We don't currently make use of finalizers.
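A quick way to double-check this in a live cluster (a sketch; it assumes the `clusterspiffeids` resource name from the new CRDs):

```
# Sketch: print any finalizers attached to ClusterSPIFFEID
# resources; empty output means nothing will block deletion.
kubectl get clusterspiffeids \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.metadata.finalizers}{"\n"}{end}'
```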
migration/README.md
If you require these DNS Names to be automatically populated, please open a [new issue](https://github.com/spiffe/spire-controller-manager/issues/new) with your use case. |
Issue created for this: #48
Updated the README with this ticket. Is this a showstopper for migrating nginx service mesh?
Yes, due to the unpredictability of how a user may choose to label their Pods. I also created a second blocking issue that was discovered while testing, #49, though it is unrelated to this guide.
Bumped into the same and commented on the issue to give my two cents. #48 (comment)
FYI: Using this guide I was able to build the spire-controller-manager into this Helm chart.
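Until #48 lands, DNS names can still be set per ClusterSPIFFEID via the documented `dnsNameTemplates` field. A sketch with illustrative template values (this does not replicate the registrar's label-driven behavior, which is why #48 remains blocking):

```
# Sketch: statically templated DNS names on a ClusterSPIFFEID.
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: example-with-dns
spec:
  spiffeIDTemplate: "spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"
  dnsNameTemplates:
    - "{{ .PodMeta.Name }}.{{ .PodMeta.Namespace }}.svc.cluster.local"
```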
migration/README.md
## Introduction

This guide will walk you through how to migrate an existing Kubernetes Workload Registrar deployment to SPIRE Controller Manager. Existing entries created by the Kubernetes Workload Registrar aren't compatible with SPIRE Controller Manager, so they'll be deleted and replaced with new entries. Workloads will continue to function with the old entries until their certificates expire, after which they'll get new certificates based on the new entries.
> Workloads will continue to function with the old entries until their certificates expire
This isn't quite accurate. Workloads should continue to function, but the SVIDs will be actively rotated and pushed down to workloads as soon as the Agent receives the new entries.
Updated.
> **Note**
> As we'll be deleting and creating entries, it's important to do this migration during a downtime window.
## Clean up Kubernetes Workload Registrar Resources |
When the CRDs and the accompanying entries are deleted, workloads will see PermissionDenied returned from the Workload API. Spec-conformant workloads will interpret this as "I don't have any SVIDs" and SHOULD proactively unload/stop using previously received materials. Doing this step before the controller manager has created new entries will cause a service interruption for these workloads.
See the last paragraph under https://github.com/spiffe/spiffe/blob/main/standards/SPIFFE_Workload_API.md#521-fetchx509svid.
We might need a way to stop the registrar, deploy the controller manager, and then somehow remove the CRDs. The controller manager will remove the entries.
If we need to add something to the registrar to help facilitate this (like a "don't create new entries" mode) that still allows the finalizer and cleanup to run, we can add that.
That's going to be tricky, having them running side by side. Perhaps we can just strip the finalizers by patching the CRDs after the registrar is removed?
That could work too. If there is a straightforward way to do that, it seems easier than adding a flag to the k8s-workload-registrar.
This script does the trick:

```
# Strip the finalizers from all SpiffeID resources in every
# namespace so their deletion can complete.
for ns in $(kubectl get ns | awk '{print $1}' | tail -n +2)
do
  if [ $(kubectl get spiffeids -n $ns 2>/dev/null | wc -l) -ne 0 ]
  then
    kubectl patch spiffeid $(kubectl get spiffeids -n $ns | awk '{print $1}' | tail -n +2) --type='merge' -p '{"metadata":{"finalizers":null}}' -n $ns
  fi
done
```
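Once the finalizers are stripped, the stuck deletion should complete on its own. As a sanity check (the CRD name below assumes the registrar's default `spiffeid.spiffe.io` group):

```
# Sketch: verify no SpiffeID resources remain, then make sure the
# CRD itself is gone (deleting it if it still lingers).
kubectl get spiffeids --all-namespaces
kubectl delete crd spiffeids.spiffeid.spiffe.io --ignore-not-found
```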
migration/README.md
> **Note**
> This will create entries for every Pod in the system. It's better to restrict it with a label like in the main example in `config/clusterspiffeid.yaml`. Also see the [ClusterSPIFFEID definition](https://github.com/spiffe/spire-controller-manager/blob/main/docs/clusterspiffeid-crd.md) for more information.
> It's better to restrict it
I think possibly a better approach would be to just explain the pros and cons and let folks decide for themselves which is better according to their circumstances. There are valid scenarios where you don't need to restrict by label.
Updated note.
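For reference, a label-restricted ClusterSPIFFEID might look like the sketch below (the label key and value follow the demo config's convention and are an assumption here):

```
# Sketch: only Pods carrying the opt-in label get entries,
# instead of every Pod in the cluster.
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: example
spec:
  spiffeIDTemplate: "spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"
  podSelector:
    matchLabels:
      spiffe.io/spiffe-id: "true"
```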
```
ignoreNamespaces:
  - kube-system
  - kube-public
  - istio-system
```
Potentially remove Istio-specific references to avoid confusion
I think it's ok to have in this case, as we don't want to generate certs for core Istio services.
My thought was that since Istio isn't mentioned anywhere else in the migration guide, and I don't think it's a requirement for it, it might be confusing to include. I think it is totally fine to leave in, just a suggestion :).
Looks great. Thanks @faisal-memon!
Just need DCO fixed up and we can merge this.
DCO fixed.
Provides steps for users of Kubernetes Workload Registrar to transition to SPIRE Controller Manager.
The guide uses the `spire` namespace because that's what Kubernetes Workload Registrar users are likely using. Newer examples use `spire-system`. Should we steer users there?

Addresses: spiffe/spire#3614