Skip to content

Conversation

@tehbooom
Copy link
Member

This PR allows for a cluster to have multiple StackConfigPolicies (SCPs). Avoids conflicts of the same resources based on the weights of the policy with the lowest weight having the highest priority. If two policies have the same weight and the same resources it will fail and the phase will be changed to CONFLICT.

Here is a simple table showing what should pass and what should fail

Weight Resource Result
Same Different
Same Same
Different Same
Different Different

The following was tested and passed interchangeably.

---
# different weights same policy should pass with policy-1 winning
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-1
spec:
  weight: -1
  elasticsearch:
    snapshotRepositories:
      policy-1-backups:
        type: s3
        settings:
          bucket: "policy-1-backups"
          region: "us-west-2"
---
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-2
spec:
  weight: 11
  elasticsearch:
    snapshotRepositories:
      policy-1-backups:
        type: s3
        settings:
          bucket: "policy-2-backups"
          region: "us-east-1"
---
## Different weights different policies should pass
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-1
spec:
  weight: 10
  elasticsearch:
    snapshotRepositories:
      policy-1-backups:
        type: s3
        settings:
          bucket: "policy-1-backups"
          region: "us-west-2"
---
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-2
spec:
  weight: 11
  elasticsearch:
    snapshotRepositories:
      policy-2-backups:
        type: s3
        settings:
          bucket: "policy-2-backups"
          region: "us-east-1"
---
## Same weight different repositories should pass
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-1
spec:
  weight: 10
  elasticsearch:
    snapshotRepositories:
      policy-1-backups:
        type: s3
        settings:
          bucket: "policy-1-backups"
          region: "us-west-2"
---
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-2
spec:
  weight: 10
  elasticsearch:
    snapshotRepositories:
      policy-2-backups:
        type: s3
        settings:
          bucket: "policy-3-backups"
          region: "us-east-1"
---
# Policy 2 wins with telemetry.optIn set to true
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: kibana-policy-1
spec:
  weight: 1
  kibana:
    config:
      telemetry.optIn: false
---
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: kibana-policy-2
spec:
  weight: 0
  kibana:
    config:
      telemetry.optIn: true

The following will fail

---
# Same weight same policy must fail
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-1
spec:
  weight: 10
  elasticsearch:
    snapshotRepositories:
      policy-1-backups:
        type: s3
        settings:
          bucket: "policy-1-backups"
          region: "us-west-2"
---
apiVersion: stackconfigpolicy.k8s.elastic.co/v1alpha1
kind: StackConfigPolicy
metadata:
  name: cluster-policy-2
spec:
  weight: 10
  elasticsearch:
    snapshotRepositories:
      policy-1-backups:
        type: s3
        settings:
          bucket: "policy-2-backups"
          region: "us-east-1"

This was also tested with a ECK 3.1.0 then upgraded. Previous SCPs will have a default value of 0.

@tehbooom tehbooom added the >enhancement Enhancement of existing functionality label Aug 18, 2025
@prodsecmachine
Copy link
Collaborator

prodsecmachine commented Aug 18, 2025

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Licenses 0 0 0 0 0 issues
Open Source Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@github-actions
Copy link

github-actions bot commented Aug 18, 2025

🔍 Preview links for changed docs

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements support for multiple StackConfigPolicies (SCPs) per cluster by introducing a weight-based priority system. Policies with lower weights take precedence over those with higher weights, and conflicts are detected when policies with the same weight attempt to configure the same resources.

  • Adds a weight field to StackConfigPolicy specs with default value 0
  • Implements conflict detection for policies with same weights targeting overlapping resources
  • Merges configurations from multiple policies based on weight priorities

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pkg/apis/stackconfigpolicy/v1alpha1/stackconfigpolicy_types.go Adds weight field to StackConfigPolicySpec
pkg/controller/stackconfigpolicy/weight_conflict_test.go Comprehensive test suite for weight conflict detection logic
pkg/controller/stackconfigpolicy/controller.go Core multi-policy support with conflict detection and policy merging
pkg/controller/stackconfigpolicy/elasticsearch_config_settings.go Multi-policy versions of Elasticsearch configuration functions
pkg/controller/stackconfigpolicy/kibana_config_settings.go Multi-policy versions of Kibana configuration functions
pkg/controller/elasticsearch/filesettings/secret.go Multi-policy secret management with policy reference tracking
pkg/controller/elasticsearch/filesettings/file_settings.go Policy merging logic for Elasticsearch file settings
pkg/controller/stackconfigpolicy/controller_test.go Updates test expectations for multi-policy behavior
config/crds/v1/*.yaml CRD updates to include weight field
docs/reference/api-reference/main.md Documentation for new weight field

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

// For map-type values (like snapshot repositories), individual entries are merged rather than replaced
func (s *Settings) mergeConfig(target, source *commonv1.Config) {
if source == nil || source.Data == nil {
return
Copy link

Copilot AI Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mergeConfig function modifies the target parameter but then creates a new Config object that is not assigned back to the original parameter. The line target = &commonv1.Config{Data: make(map[string]interface{})} only modifies the local variable, not the original pointer. This will cause the merge to fail silently.

Suggested change
return
func (s *Settings) mergeConfig(target, source *commonv1.Config) *commonv1.Config {
if source == nil || source.Data == nil {
return target

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a valid copilot observation. Also I would expect unit tests for this kind of functionality. We do have more advanced merging through the CanonicalConfig abstraction in the code base. But there might be some gotchas with the more complex configuration objects that stack config policies support that might not be thinking of right now.

Copy link
Contributor

@thbkrkr thbkrkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you! I like how this extends the original implementation to multiple SCPs without adding complexity. I didn’t do any testing on my side, but code-wise this looks good. 👍

@pebrc
Copy link
Collaborator

pebrc commented Sep 19, 2025

buildkite test this -f p=gke,t=TestStackConfigPolicy*

Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am generally positive about the approach but I feel like this needs more work, and test coverage. I did a quick smoke test and it does not work as advertised.

},
ClusterSettings: &commonv1.Config{
Data: map[string]interface{}{
"persistent": map[string]interface{}{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this leads to an illegal setting "persistent"/"transient" are not supported via stack config policies


// mergeConfig merges source config into target config, with source taking precedence
// For map-type values (like snapshot repositories), individual entries are merged rather than replaced
func (s *Settings) mergeConfig(target, source *commonv1.Config) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a receiver method of Settings?

Comment on lines 167 to 177
for key, value := range source.Data {
if targetValue, exists := target.Data[key]; exists {
if targetMap, targetIsMap := targetValue.(map[string]interface{}); targetIsMap {
if sourceMap, sourceIsMap := value.(map[string]interface{}); sourceIsMap {
for subKey, subValue := range sourceMap {
targetMap[subKey] = subValue
}
continue
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems too simplistic given that for example cluster settings can both be in nested and flat syntax. Also it only merges one level deep?

// For map-type values (like snapshot repositories), individual entries are merged rather than replaced
func (s *Settings) mergeConfig(target, source *commonv1.Config) {
if source == nil || source.Data == nil {
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a valid copilot observation. Also I would expect unit tests for this kind of functionality. We do have more advanced merging through the CanonicalConfig abstraction in the code base. But there might be some gotchas with the more complex configuration objects that stack config policies support that might not be thinking of right now.

@tehbooom
Copy link
Member Author

buildkite test this -f p=gke,t=TestStackConfigPolicy*

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement Enhancement of existing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants