Fix query performance by reordering property names #2588
@microsoft-github-policy-service agree company="Microsoft"
@XiaoningLiu, @EmmaZhu, @jrahman
Good catch! Looks good to me.
@blueww I have updated the change log with a note about the performance improvements here.
Thanks for the contribution! The PR is approved now.
Hi @blueww any updates on merging this change? We have active issues with our testing infrastructure because Azurite performance with metadata operations is a bit slow. We're looking to get this change merged to improve our test stability.
The PR merge is currently blocked by the pipeline issue.
Hi @EmmaZhu any updates on the pipeline fixes? Looking at the repo, the last merged PR was several months ago. Also, this PR would greatly improve our local testing setup that uses Azurite.
Significantly improve Azurite performance under heavy metadata lookup workloads. The issue is that the LokiJS library only supports indexed access to data on the first predicate of a $and or $or clause. This is documented in the LokiJS code here.
The existing Azurite LokiMetadata store code used properties like accountName as the first property in the filter. As a result, the find() and findOne() calls only performed an indexed lookup on accountName. There is usually only a single account, so that lookup returns all blobs/queues/messages, and the remaining filter properties are applied by a linear scan over the entire result set. You can see the performance impact in the following CPU profile, taken from a copy of Azurite running without this optimization:
The fix moves the property names with higher selectivity to the first position in the filter object. When LokiJS evaluates the filter object, the first (high-selectivity) property is then used for the initial indexed (fast) lookup, which quickly prunes nearly all of the entries in the collection. Only a few entries are left to scan to compute the final result.
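To illustrate why property order matters, here is a minimal, self-contained sketch. It is a toy model, not LokiJS's actual implementation and not the Azurite code: `ToyCollection`, `lastScanned`, and the `name` and `containerName` fields are hypothetical names chosen for the example. The only behaviour it borrows from the description above is the assumption that the first property of the filter is the only one that can use an index.

```typescript
// Toy model only: this is NOT LokiJS. It mimics the behaviour described above,
// where only the first property of a filter object can use an index.
type Doc = { [field: string]: string };

class ToyCollection {
  private docs: Doc[] = [];
  // One lookup table per indexed field: value -> documents with that value.
  private indexes: { [field: string]: { [value: string]: Doc[] } } = {};
  // Number of candidate documents scanned by the most recent find().
  public lastScanned = 0;

  constructor(indexedFields: string[]) {
    for (const field of indexedFields) {
      this.indexes[field] = Object.create(null);
    }
  }

  insert(doc: Doc): void {
    this.docs.push(doc);
    for (const field of Object.keys(this.indexes)) {
      const index = this.indexes[field];
      const value = doc[field];
      if (!index[value]) {
        index[value] = [];
      }
      index[value].push(doc);
    }
  }

  find(query: Doc): Doc[] {
    const fields = Object.keys(query); // keeps the order the caller wrote
    const first = fields[0];
    const index = this.indexes[first];
    // Indexed lookup on the first property only; fall back to every document.
    const candidates = index ? index[query[first]] || [] : this.docs;
    this.lastScanned = candidates.length;
    // All properties are then checked by a linear scan over the candidates.
    return candidates.filter((doc) => fields.every((f) => doc[f] === query[f]));
  }
}

// Hypothetical data set: one account, many blobs.
const blobs = new ToyCollection(["accountName", "name"]);
for (let i = 0; i < 10000; i++) {
  blobs.insert({
    accountName: "devstoreaccount1",
    containerName: "c1",
    name: "blob" + i,
  });
}

// Low-selectivity property first: the index returns every document.
const slow = blobs.find({ accountName: "devstoreaccount1", name: "blob42" });
const slowScanned = blobs.lastScanned; // 10000 candidates scanned

// High-selectivity property first: the index returns a single candidate.
const fast = blobs.find({ name: "blob42", accountName: "devstoreaccount1" });
const fastScanned = blobs.lastScanned; // 1 candidate scanned
```

Both queries return the same single document; only the number of candidates left for the linear scan differs, which is the effect the reordering in this PR relies on.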
The impact is substantial: in a test framework using Azurite, an initial load process that previously took 12.5 seconds took only about 9 seconds after this change, a reduction of roughly 28%. This improvement only required reordering some fields in the find()/findOne() API calls.