-
Notifications
You must be signed in to change notification settings - Fork 107
Speed Improvement: Make incoming entities Arc<Entities> like schema entities #1296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed Improvement: Make incoming entities Arc<Entities> like schema entities #1296
Conversation
cedar-policy-core/src/entities.rs
Outdated
fn create_entity_map(es: impl Iterator<Item = Entity>) -> Result<HashMap<EntityUID, Entity>> { | ||
fn create_entity_map(es: impl Iterator<Item = Entity>) -> Result<HashMap<EntityUID, Arc<Entity>>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also have the input es
here be an iterator of Arc<Entity>
, and force callers to map Arc::new
if needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same concern also applies to add_entities
, right? I'd be in favor of making our internal APIs more explicit by taking Arc
on these functions. Public APIs shouldn't change ofc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I wasn't sure how interested you'd be in propagating this Arc expectation (at least internally). I can make change that reflect this for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to this request in 1da976e, let me know what you think.
cedar-policy-core/src/entities.rs
Outdated
self.entities | ||
.into_values() | ||
.map(Arc::unwrap_or_clone) | ||
.collect::<Vec<Entity>>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having to .collect()
here seems like a potential performance problem as well ... is it possible to avoid it with a different choice of type IntoIter
above, so that the implementation here can just be self.entities.into_values().map(Arc::unwrap_or_clone)
? or alternately, should we change the type Item
to be Arc<Entity>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely, I'll check that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 11e70bd. I changed the IntoIter type to reflect the .map call as the return type. This gave another speed boost as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Craig's comments are probably worth addressing.
cedar-policy-core/src/entities.rs
Outdated
fn create_entity_map(es: impl Iterator<Item = Entity>) -> Result<HashMap<EntityUID, Entity>> { | ||
fn create_entity_map(es: impl Iterator<Item = Entity>) -> Result<HashMap<EntityUID, Arc<Entity>>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same concern also applies to add_entities
, right? I'd be in favor of making our internal APIs more explicit by taking Arc
on these functions. Public APIs shouldn't change ofc.
schema | ||
.action_entities() | ||
.into_iter() | ||
.map(|e| (e.uid().clone(), Arc::unwrap_or_clone(e))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a second to puzzle out why this actually gives a speed up, so I'll leave my reasoning here. Maybe worth elaborating in a comment.
We remove this unwrap_or_clone
, but add a make_mut
(which I assume must clone if there's another reference to the data) call inside TC computation, so at first glance it feels like that should be a wash. But, the TC was already computed for the entity map (which we just constructed, so one ref to each Arc
), and we separately computed TC for the actions inside the schema constructor (while it had unique ownership). Put together these should mean we never trigger a clone on the make_mut
during TC computation.
12fbc39
to
2776826
Compare
…rc the incoming entities, update function parameters to receive Arc entities, implement TCNode for Arc<Entity> Signed-off-by: Thomas Hill <[email protected]>
…ty> and proto::Entity Signed-off-by: Thomas Hill <[email protected]>
2776826
to
98cc078
Compare
… receive Arc<Entities> instead, update all tests to take this into account, update call sites of these functions to map to Arc<Entity> as well Signed-off-by: Thomas Hill <[email protected]>
…llecting by updating the IntoIter type, and remove .collect(). Signed-off-by: Thomas Hill <[email protected]>
Thanks for this work! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks the build error is coming from the protobuf feature
…ped entities into account. Signed-off-by: Thomas Hill <[email protected]>
Protobuf tests updated to use Arc wrapped entities in 857509b. |
Signed-off-by: Thomas Hill <[email protected]>
…ntities (#1296) Signed-off-by: Thomas Hill <[email protected]>
…ntities (#1296) Signed-off-by: Thomas Hill <[email protected]> Signed-off-by: Craig Disselkoen <[email protected]>
Description of changes
This provides a 3x speedup based on local benchmarking with a large schema that has big action entity hierarchy relationships. Today Entities.entities is a HashMap of
<EntityUID, Entity>
. When from_entities brings in the entities and actions from a given schema, it currently relies onArc::unwrap_or_clone
, which frequently ends up being clone. For a large set of very interrelated Actions, this repeated clone becomes expensive.Changing Entities.entities to be a HashMap of
<EntityUID, Arc<Entity>>
allows us to keep the clone cheap, and Arc'ing the incoming Entities becomes cheap as well.Changing the interface of the HashMap means altering what create_entities_map returns, adding an implementation of TCNode for
Arc<Entity>
, and udpating how IntoIterator works to return values from the Arc's.Issue #, if available
This contributes to 1285.
Checklist for requesting a review
The change in this PR is (choose one, and delete the other options):
cedar-policy-core
,cedar-validator
, etc.)I confirm that this PR (choose one, and delete the other options):
I confirm that
cedar-spec
(choose one, and delete the other options):I confirm that
docs.cedarpolicy.com
(choose one, and delete the other options):