
Conversation

@Callisto13 (Contributor) commented May 4, 2022

This fixes an issue where the clustersNamespaces and the userNamespaces
caches were not updated when Clusters were deleted.

This was most noticeable for clustersNamespaces, since the GetAll there
would return everything in the cache, stale or not.

For the userNamespaces cache this was hidden, since the GetAll in this
case would receive an up-to-date list of the cached clusters and then
use that to look up individual keys, returning them in a collated list.
This meant it looked like the cache was updated, but it was only
showing us what we wanted to see; the cache was still quietly building
up.
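
Roughly the shape of the problem, as a sketch with illustrative types (not the real cache code):

```go
package cache

import "sync"

// userNamespaces is a stand-in for the real per-cluster namespaces cache.
type userNamespaces struct {
	mu    sync.Mutex
	cache map[string][]string // cluster name -> namespaces
}

// GetAll only looks up the clusters it is handed, so entries for
// deleted clusters are never returned, but they are never evicted either.
func (u *userNamespaces) GetAll(clusters []string) map[string][]string {
	u.mu.Lock()
	defer u.mu.Unlock()

	out := map[string][]string{}
	for _, name := range clusters {
		if ns, ok := u.cache[name]; ok {
			out[name] = ns
		}
	}

	return out // stale keys stay behind in u.cache, invisible to callers
}
```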

The solution here is to generate a hash based on an ordered list of the
names of the current clusters. When updating the clustersNamespaces cache, we
compare the saved hash with the current one, and clear the clustersNamespaces if
they differ.
We also clear the userNamespaces at this point; since the userNamespaces
cache is dependent on both the clusters and the clustersNamespaces
caches, it is simpler to clear everything together. (I can change this if
wanted, no big deal.)
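
To make that concrete, here is a minimal sketch of the update path, with illustrative names rather than the real factory types:

```go
package cache

import "sync"

type clearable interface{ Clear() }

// factory shows only the pieces involved in the hash check.
type factory struct {
	mu                 sync.Mutex
	savedHash          string
	clusters           interface{ GetHash() string } // hash of the ordered cluster names
	clustersNamespaces clearable
	userNamespaces     clearable
}

// updateNamespaces runs on the periodic update: if the cluster set
// changed since last time, clear both dependent caches before refreshing.
func (f *factory) updateNamespaces() {
	f.mu.Lock()
	defer f.mu.Unlock()

	if h := f.clusters.GetHash(); h != f.savedHash {
		f.clustersNamespaces.Clear()
		f.userNamespaces.Clear() // depends on the other two caches, so clear together
		f.savedHash = h
	}
	// ...then refresh clustersNamespaces for the current clusters...
}
```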

The userNsList func has been pulled apart for testing purposes, but is
otherwise the same. Testing that the userNamespaces cache has been
cleared was harder since, as I said above, the GetAll does not really
get all, but rather "gets all based on this cluster list". I can expose new
methods and change this into a genuine List if that is preferred, but
have not done that for now.

Closes #2083

@Callisto13 Callisto13 added the bug Something isn't working label May 4, 2022
@Callisto13 Callisto13 requested review from foot and luizbafilho May 4, 2022 09:47
@luizbafilho (Contributor)

This looks good, but it's not enough, since we need to clean up the cf.usersNamespaces too. But unlike the clustersNamespaces, we cannot do that every 30 seconds, otherwise we would have to ask for the user's permissions on the cluster too frequently, slowing things down.

So I propose that we create a hash for the list of clusters, and only reset everything if that hash changes. So basically you'd do:

  • List the cluster names ordered and save a concatenated string in the factory; we could add a GetHash to the Clusters cache, for instance.
  • When updating the namespaces, check whether the current hash differs from the saved one; if so, clear the clustersNamespaces and the usersNamespaces (see the sketch below), if not, leave them as is.

With that, we should expect things to be eventually consistent, after 30 seconds or so.
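
For illustration, a GetHash on the Clusters cache could look something like this (a sketch only; the field names are assumptions):

```go
package cache

import (
	"sort"
	"strings"
	"sync"
)

// Clusters is a stand-in for the clusters cache; only what GetHash needs is shown.
type Clusters struct {
	mu       sync.RWMutex
	clusters map[string]struct{} // keyed by cluster name
}

// GetHash lists the cluster names in order and returns them as one
// concatenated string; callers only compare it against the value they
// saved on the previous update, so the exact encoding doesn't matter.
func (c *Clusters) GetHash() string {
	c.mu.RLock()
	defer c.mu.RUnlock()

	names := make([]string, 0, len(c.clusters))
	for name := range c.clusters {
		names = append(names, name)
	}
	sort.Strings(names)

	return strings.Join(names, "-")
}
```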

@foot (Contributor) commented May 4, 2022

I get some very strange behaviour…

  1. https://localhost:8000/v1/helmreleases? returns a "cannot find cluster=" (empty) kind of thing (sorry, I'll try to get the correct error message again)
  2. https://localhost:8000/v1/helmreleases?clusterName=Default complains it cannot find a secret that only exists in the leaf (ewq2)
  3. https://localhost:8000/v1/helmreleases?clusterName=ewq2 complains it cannot find a secret in mgmt (Default)

@Callisto13 (Contributor, Author)

> we need to clean up the cf.usersNamespaces too

I had deliberately left that to handle separately, but you are right that they have to be considered together 👍

@luizbafilho (Contributor)

> I get some very strange behaviour…
>
>   1. https://localhost:8000/v1/helmreleases? returns a "cannot find cluster=" (empty) kind of thing
>   2. https://localhost:8000/v1/helmreleases?clusterName=Default complains it cannot find a secret that only exists in the leaf (ewq2)
>   3. https://localhost:8000/v1/helmreleases?clusterName=ewq2 complains it cannot find a secret in mgmt (Default)

This is a separate issue, related to that handler specifically; looking at the code, it does indeed try to do some secret querying.

I also noticed you are trying to query by ClusterName; I don't think all list handlers support that. Feel free to open another issue to add that too.

@Callisto13 Callisto13 force-pushed the clear-cluster-ns-cache branch from 01d7772 to 8e071b5 Compare May 5, 2022 13:30
@Callisto13 Callisto13 changed the title from "fix: Clear the clustersNamespaces cache on Update" to "fix: Clear related caches on clusters update" May 5, 2022
@Callisto13 (Contributor, Author)

Updated commit and description

@Callisto13 Callisto13 force-pushed the clear-cluster-ns-cache branch from 8e071b5 to dafb4bf Compare May 5, 2022 14:11
@luizbafilho (Contributor) left a comment


This looks great besides a couple of small nits.

@foot would it be possible for you to test it from this branch? Or does it need to be merged?

@foot (Contributor) commented May 5, 2022 via email

@foot (Contributor) commented May 6, 2022

Still not quite working unfortunately.

/v1/flux_runtime_objects returns:

{
  "code": 2,
  "message": "could not list deployments in namespace flux-system: cluster=ewq2 not found",
  "details": []
}

We could add some debug logging? 🤔 I can put together a cluster to play with if it helps too.

@Callisto13 Callisto13 force-pushed the clear-cluster-ns-cache branch from dafb4bf to 3b47eaa Compare May 6, 2022 09:05
@Callisto13 (Contributor, Author)

@foot we could jump on a call together, might be faster?

I only have a couple of hours today before I go on holiday, so someone may need to take this off me if it drags on.

@foot (Contributor) commented May 6, 2022

After some testing, and being more careful about observing the 30s window etc., it seems to be working great! 💯

LGTM ⭐

@Callisto13 Callisto13 merged commit f4e42a0 into main May 6, 2022
@Callisto13 Callisto13 deleted the clear-cluster-ns-cache branch May 6, 2022 10:57
foot added a commit to weaveworks/weave-gitops-enterprise that referenced this pull request May 11, 2022
Includes important MC fixes:
- weaveworks/weave-gitops#2137
- weaveworks/weave-gitops#2085

Fixes needed to address core changes:
- `wego-admin` is the new user to log in as (no longer `admin`)
- Core can now be configured with a fake-client so we can remove some hacks
- Adds a little bit of debugging around MC querying
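
For context, "fake-client" here refers to the standard controller-runtime fake client; a generic sketch of seeding one in a test (not the exact Core wiring) might look like:

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	"sigs.k8s.io/controller-runtime/pkg/client/fake"
)

func main() {
	scheme := runtime.NewScheme()
	_ = clientgoscheme.AddToScheme(scheme)

	// Seed the fake client with the objects the code under test expects to find.
	c := fake.NewClientBuilder().
		WithScheme(scheme).
		WithObjects(&corev1.Namespace{
			ObjectMeta: metav1.ObjectMeta{Name: "flux-system"},
		}).
		Build()

	var namespaces corev1.NamespaceList
	_ = c.List(context.Background(), &namespaces)
	fmt.Println(len(namespaces.Items)) // 1
}
```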

Successfully merging this pull request may close these issues:

flux-runtime endpoint tries to list deployments in a cluster after the cluster is removed