Skip to content

Conversation

john-h-kastner-aws
Copy link
Contributor

@john-h-kastner-aws john-h-kastner-aws commented Jun 16, 2023

Copy link
Contributor

@mwhicks1 mwhicks1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Got delayed, so pushing a couple of comments but not a complete review.)

@john-h-kastner-aws john-h-kastner-aws changed the title Partial schema validation RFC Partial schema validation Jun 26, 2023
@cdisselkoen cdisselkoen added the pending This RFC is pending; for definitions see the README label Jul 3, 2023
@max2me
Copy link

max2me commented Jul 7, 2023

I just want to point out that ability to declare additionalAttributes on any Record type (as this RFC proposed) provides a fine-grained control over validation and makes it possible to accomodate scenarios that AWS Verified Access has. There service knows all top level keys of the context but might not have information about all attributes of nested Records.


# Summary

This RFC proposes extending the Cedar validator to work with partial or even complete empty schemas,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

complete -> completely

When a policy mentions an undefined entity type, or attempts to access an undefined attribute of some entity type,
the validator will not signal an error, unless it recognizes that there is an inevitable type error.
For example, even if the type of `access_level` is not declared, we know `principal.access_level > "5"` will inevitably error due to the `>` comparison against a string.
Partial schema validation does not attempt to detect all type errors, and is therefore unsound.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per offline discussion, it would be interesting to emit warnings in cases where the current proposal silently accepts the policy (e.g., in cases where the policy mentions an undefined entity type, or attempts to access an undefined attribute).

I like this idea because it allows us to prove some kind of soundness property (namely that in the absence of validation warnings or errors, the policy will not produce an error at authorization time). The other benefit of this design (as pointed out in offline discussion) is that it allows for tooling to help users improve their schemas (e.g., offering to add a definition for a new entity type).

# Motivation

We want to provide some of the benefits of policy validation to Cedar users who have not written or may not be able to write a schema.
Detecting any amount potential errors during development is preferable to discovering the errors later,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amount -> amount of

Undeclared entity types and actions are treated as incomplete according to the second two kinds of incompleteness.
At the extreme, it is possible to have a schema that does not declare any entity types or actions.

This policy will validate with an empty schema that does not declares either of the entity types or the action.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either of the entity types or the action -> any entity types or actions

The bottom type functions like the [TypeScript `any` type](https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#any).
It is used to ensure that the type is allowed to appear in any expression without causing additional type errors.
The `>` operator expects a value of type `Long` as its operand.
It will in fact accept any subtype of `Long`, and the bottom type is a subtype of every type, so it accepted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accepted -> is accepted

so an expression `principal.role == "Admin"` would still be accepted by the validator.

Alternatively, we might have complete information about the attributes present for `User` entities.
When we know that `access_level` does not exist, the `shape` for the principal entity type can include `"additionalAttributes": false` to specify that there are no undeclared attributes for this entity type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that "additionalAttributes": false is the default, so it is unnecessary to say this explicitly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now that it's noted in a later section, but still good to include here to head off confusion.

We extend the schema format to support the above example schema.
Specifically, we add the following properties to the JSON:

* Any record type, including the record types for entity attributes and action context attributes, may include `"additionalAttributes": true` to specify that values having that record type may have attribute other than those declared.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may have attribute -> may have attributes

Comment on lines +184 to +185
This change does not necessarily require any change to the behavior of the standard validator,
but the additions to the schema format could be soundly supported.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the implementation make these changes, or is this noting that the changes are possible?

Comment on lines +198 to +199
A sound, type inference based, approach to partial schema validation would report _more_ type error than this proposal,
so it could not be used as a drop-in replacement.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type error -> type errors

Also: why would an inference-based approach result in more errors?


* Policy validation is not required to use Cedar,
so we could continue to support validation only for users who are able to provide a complete schema.
* We could design and implement a type inference algorithm for Cedar.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be useful to have more details about the schema inference option to determine whether it is a viable alternative.

john-h-kastner-aws added a commit to cedar-policy/cedar that referenced this pull request Nov 27, 2023
Experimental implementation of partial schema validation as described in cedar-policy/rfcs#8.
@john-h-kastner-aws john-h-kastner-aws added pre-acceptance-experimental This RFC is pre-acceptance experimental; for definitions see the README and removed pending This RFC is pending; for definitions see the README labels Nov 27, 2023
@luxas
Copy link

luxas commented May 13, 2025

Hi! 👋 What is the status on this RFC? I see that the partial-validate feature is at least partially implemented. Did the design change as part of implementation? I just noticed the additional_attributes bool on the RecordType, which led me here. One of the things I noticed (which is probably expected/desired today, is that only the JSON syntax has this field, additional_attributes does not round-trip through the Cedarschema encoding.

One use-case I have (which is related, but slightly different) is to be able to say for some struct type that "any property that is not known, should have this type"; e.g. string, just like OpenAPI v3's additionalProperties. This would allow serializing a string-string map with arbitrary values into Cedar as-is, with something like additionalProperties { ty: TypeVariant::String }. You can do this today with a has check for a type that it empty, e.g.

type LabelsMap = {}
type Metadata = {
   labels: LabelsMap,
}
entity Resource = {
  metadata: Metadata,
}

permit(principal, action, resource == Resource) when {
  resource.metadata.labels has foolabel && resource.metadata.labels.foolabel == "bla"
};

The VS Code schema validator (Cedar 4.4.0) allows this, even in the default validation mode, although the type of resource.metadata.labels.foolabel is not known (which could result in a schema typing error not being caught and thus a runtime error). However, cedar validate in strict mode correctly tells this is (as per the schema) an impossible policy. At runtime, using cedar authorize, this policy works well, but indeed requires the has guard.

If additionalProperties was part of the schema, the has guard maybe wouldn't be needed, and it would be possible to typecheck the value. However, I realize this would probably get dangerously close to maps, which are not part of Cedar today (except in tags), and thus might make this harder to analyze.

I'm happy to open a dedicated issue on this topic, as it's only tangientally related to additional_attributes as used in this RFC, but wanted to mention this here first.

@john-h-kastner-aws
Copy link
Contributor Author

Thanks for the interest @luxas

Hi! 👋 What is the status on this RFC? I see that the partial-validate feature is at least partially implemented. Did the design change as part of implementation?

This RFC is currently inactive with no plans to move forward with it. It is implemented as an experimental feature under the partial-validate feature which IIRC does implement the feature as described here. You can experiment with it, and if it solves an important use case we'd be really interested in hearing about it.

I just noticed the additional_attributes bool on the RecordType, which led me here. One of the things I noticed (which is probably expected/desired today, is that only the JSON syntax has this field, additional_attributes does not round-trip through the Cedarschema encoding.

Right, this is expected. When we added the Cedar schema syntax we did not incorporate the partial schema validation features.

If additionalProperties was part of the schema, the has guard maybe wouldn't be needed, and it would be possible to typecheck the value. However, I realize this would probably get dangerously close to maps, which are not part of Cedar today (except in tags), and thus might make this harder to analyze.

Based on my understanding of this use case, it sounds like you would want RFC #66 (open records for simple maps), but after extensive discussion we decided that we could not implement it while maintaining analyzability. Entity tags are the best we can offer here. I feel like tags should are a pretty natural solution for representing a string:string map, but let me know where it fails short.

@luxas
Copy link

luxas commented May 15, 2025

Thanks for the response!

I think that for my case, I'll work around it by either:

a) "bottom-up": dynamically updating the schema of the "open" Cedar record (which in JSON/OpenAPI is an unstructured string-string map) to include the record attributes that become well-known by actually referencing them in a policy (e.g. by using #74), or
b) "top-down": walking objects in the database to discover which labels (record attribute keys) "exist" in the system, and thus populating the schema, or
c) (what I have now) model the map as a Set<{key: String, value: String}> which works well, but is not as expressive and intuitive to use.

So I don't think I need this feature per se, if I do any of the above, as I anyways need the soundness guarantees that the normal schema validator provides.

@john-h-kastner-aws
Copy link
Contributor Author

Workarounds seem reasonable

a) Sounds tricky since you'll need to infer the type of the attributes based on how they're used. Doing it soundly in the general case will be difficult.

b) Probably a good approach if it's workable. Depends a lot on what the database looks like.

c) A good workaround if the limited expressiveness isn't an issue, but I'm curious why you have this instead of using entity tags. Entity tags were largely motivated by supporting this pattern while allowing direct access to the values by hasTag and getTag operations.

@luxas
Copy link

luxas commented May 16, 2025

a) Sounds tricky since you'll need to infer the type of the attributes based on how they're used. Doing it soundly in the general case will be difficult.

In this case, I know that key is always a string (as it's modelled from a JSON object), and I know the Cedar type of value too, thanks to the OpenAPI schema (and in 90%+ of cases just a string). So what I infer from policies is just the attribute names/keys, basically, which I think should work.

A good workaround if the limited expressiveness isn't an issue, but I'm curious why you have this instead of using entity tags. Entity tags were largely motivated by supporting this pattern while allowing direct access to the values by hasTag and getTag operations.

The reason was (until I read your message and realized how to do it, thanks!) that I didn't realize that I don't have to put tags on the "parent" object, but can declare an intermediate type. So the simplified practical example:

namespace before {
    entity Pod = {
        metadata: ObjectMeta,
    };
    type ObjectMeta = {
        labels: Set<StringKeyValue>,
        annotations: Set<StringKeyValue>,
    };
    type StringKeyValue = {
        key: String,
        value: String
    };
    action update appliesTo {
        principal: User,
        resource: Pod
    };
    entity User;
}

This is the existing case for https://github.com/awslabs/cedar-access-control-for-k8s/

When I was thinking about tags, I somehow thought about putting the tags on ObjectMeta (the "parent"), which didn't really fit nicely, especially as there were two of these open records, labels and annotations.

With this discussion, I realized that nothing stops me from declaring an intermediate entity, not sure why I didn't realize this before, when reading the tags RFC.

namespace after {
    entity Pod = {
        metadata: ObjectMeta,
    };
    type ObjectMeta = {
        labels: StringMap,
        annotations: StringMap
    };
    entity StringMap = {
        
    } tags String;
    action update appliesTo {
        principal: User,
        resource: Pod
    };
    entity User;
}

Now I can write:

permit(
    principal,
    action,
    resource is after::Pod
) when {
    resource.metadata.labels.hasTag("owner") && resource.metadata.labels.getTag("owner") == "lucas" 
};

And to support the case of figuring out all of the available labels (when not caring about the values), to make sure there aren't other tags/labels I don't want to be there, but also don't know the name of, I can add:

entity StringMap = {
    keys: Set<String>
} tags String;

such that I can make sure the object doesn't contain other, unauthorized labels, only the two allowlisted ones here (owner, other-ok-label):

permit(
    principal,
    action,
    resource is after::Pod
) when {
    resource.metadata.labels.hasTag("owner") &&
    resource.metadata.labels.getTag("owner") == "lucas" &&
    ["owner", "other-ok-label"].containsAll(resource.metadata.labels.keys) 
};

For posterity, this is how it would have looked like if we had open records, not very different:

permit(
    principal,
    action,
    resource is after::Pod
) when {
    resource.metadata.labels has "owner" &&
    resource.metadata.labels.owner == "lucas" &&
    ["owner", "other-ok-label"].containsAll(resource.metadata.labelKeys) 
};

Thanks for helping with the thinking!
It might be good to document this somewhere in the docs as "so, you think you need an open record" 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pre-acceptance-experimental This RFC is pre-acceptance experimental; for definitions see the README

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants