-
Notifications
You must be signed in to change notification settings - Fork 180
Support nested aggregation when calcite enabled #4979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support nested aggregation when calcite enabled #4979
Conversation
Signed-off-by: Lantao Jin <[email protected]>
📝 WalkthroughWalkthroughAdds nested-aggregation support and hint-based aggregation metadata across Calcite planning and OpenSearch pushdown layers, including nested-path resolution utilities, new hint utilities, a converter rule and physical nested aggregate, pushdown adaptations, response parsing updates, and integration tests/resources. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant CalciteRelNodeVisitor
participant PPLHintUtils
participant CalciteLogicalIndexScan
participant EnumerableNestedAggregateRule
participant AggregateAnalyzer
participant OpenSearch
Client->>CalciteRelNodeVisitor: visitAggregation(rel)
CalciteRelNodeVisitor->>CalciteRelNodeVisitor: collect aggCallRefs
CalciteRelNodeVisitor->>CalciteRelNodeVisitor: compute hintIgnoreNullBucket
alt nested aggregator detected
CalciteRelNodeVisitor->>PPLHintUtils: addNestedAggCallHintToAggregate(relBuilder)
end
alt hintIgnoreNullBucket true
CalciteRelNodeVisitor->>PPLHintUtils: addIgnoreNullBucketHintToAggregate(relBuilder)
end
CalciteRelNodeVisitor->>CalciteLogicalIndexScan: pushDownAggregate(aggregate, hints)
CalciteLogicalIndexScan->>CalciteLogicalIndexScan: buildSchema() / resolveNestedPath()
CalciteLogicalIndexScan->>EnumerableNestedAggregateRule: rule matching (hints)
EnumerableNestedAggregateRule->>EnumerableNestedAggregateRule: convert -> CalciteEnumerableNestedAggregate
CalciteLogicalIndexScan->>AggregateAnalyzer: analyze(agg, bucketNames, nested info)
AggregateAnalyzer->>AggregateAnalyzer: resolveNestedPath & build aggregations (may wrap NestedAggregationBuilder)
CalciteLogicalIndexScan->>OpenSearch: execute pushdown with nested aggregation
Estimated code review effort🎯 5 (Critical) | ⏱️ ~90+ minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
Comment |
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
| protected Function0<CalcitePrepare> createPrepareFactory() { | ||
| return OpenSearchPrepareImpl::new; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After upgraded to 1.41. this method was not called any more, change it to createPrepare()
|
|
||
| @Override | ||
| protected PreparedResult implement(RelRoot root) { | ||
| Hook.PLAN_BEFORE_IMPLEMENTATION.run(root); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hook was called twice for non-full-scannable plan
core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
Outdated
Show resolved
Hide resolved
| curl -sS -H 'Content-Type: application/json' \ | ||
| -X POST localhost:9200/_plugins/_ppl/_explain?format=extended \ | ||
| -d '{"query" : "source=state_country | where age>30"}' | ||
| -d '{"query" : "source=state_country | head 10 | where age>30"}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid whole-plan-pushdown
| if (result.getCause() == null) { | ||
| if (result.getSuppressed().length > 0) { | ||
| return result.getSuppressed()[0]; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some exception without cause (root exception), the actual root cause may be attached in its suppressed cause.
| return new OpenSearchErrorMessage(exception, exception.status().getStatus()); | ||
| } | ||
| return new ErrorMessage(e, status); | ||
| return new ErrorMessage(cause, status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer to display the root cause instead of wrapping exception.
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Lantao Jin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (11)
docs/user/ppl/interfaces/endpoint.md (1)
114-127: Clarify the rationale for the example query change and execution strategy.Line 114 changes the example query from
where age>30tohead 10 | where age>30, which fundamentally alters the execution plan and demonstrates a different execution strategy (enumerable-based with runtime filtering instead of pure pushdown). While the previous review comment "avoid whole-plan-pushdown" suggests this is intentional, the documentation doesn't explain:
- Why this query pattern was chosen as the extended format example
- How the
head 10operation affects plan structure (introducingLogicalSort)- Why the physical plan shifts to enumerable execution instead of full pushdown
- What the trade-offs are
Additionally, the YAML format example (line 139) still uses the original query without
head 10, creating an inconsistency. Clarify whether both examples should use the same query or if they're intentionally demonstrating different scenarios.core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
1076-1086: JavaDoc example is slightly inconsistent with the implementation.The JavaDoc example shows
aggCalls: [count(), count(a.b), avg(a.c)]but the method takesaggCallRefs(which areRexInputRefindices), not the originalAggCallobjects. Consider clarifying that the input is the list ofRexInputRefextracted from aggCalls.Also, the example shows
aggCallRefs [1, 2]but doesn't explain why index 0 is excluded (likely becausecount()has no field reference). Adding this clarification would improve understanding.core/src/main/java/org/opensearch/sql/utils/Utils.java (1)
18-27: Consider usingArrayListinstead ofLinkedListfor better performance.
LinkedListhas poor cache locality and higher memory overhead compared toArrayList. Since you're only appending to the list and not inserting/removing from the middle,ArrayListwould be more efficient:🔎 Proposed refactor
static <I> List<Pair<I, Integer>> zipWithIndex(List<I> input) { - LinkedList<Pair<I, Integer>> result = new LinkedList<>(); - Iterator<I> iter = input.iterator(); - int index = 0; - while (iter.hasNext()) { - result.add(Pair.of(iter.next(), index++)); + List<Pair<I, Integer>> result = new ArrayList<>(input.size()); + for (int i = 0; i < input.size(); i++) { + result.add(Pair.of(input.get(i), i)); } return result; }docs/user/ppl/cmd/explain.md (1)
13-13: Minor grammar improvement suggested.The description could be clearer with improved grammar:
🔎 Proposed fix
- * extended: Display the standard information plus generated code, if the whole plan is able to pushdown, it equals to standard mode. + * extended: Display the standard information plus generated code. If the whole plan can be pushed down, it is equivalent to standard mode.opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (1)
108-115: Good defensive check for nested field paths.This additional guard using
Utils.resolveNestedPathensures that pushdown is also prevented when dedup columns reference nested sub-fields (e.g.,nested.subfield), even if the immediate column type isn't ARRAY. This complements the type-based check above.Consider adding a debug log message here similar to line 74 for consistency and debugging visibility:
🔎 Proposed enhancement
if (targetProjections.stream() .anyMatch( pair -> Utils.resolveNestedPath(pair.getValue(), scan.getOsIndex().getFieldTypes()) != null)) { + LOG.debug("Cannot pushdown the dedup since the dedup fields contain nested sub-fields"); // fallback to non-pushdown if the dedup columns contain nested fields. return; }opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.java (1)
53-66: Add defensive check for empty inner aggregations.When the first aggregation is
InternalNested, the code assumes there's at least one inner aggregation. If the nested aggregation has no inner aggregations,iterator().next()will throwNoSuchElementException.🔎 Proposed defensive check
@Override public List<Map<String, Object>> parse(Aggregations aggregations) { Aggregation agg; if (aggregations.asList().getFirst() instanceof InternalNested) { - agg = ((InternalNested) aggregations.asList().getFirst()).getAggregations().iterator().next(); + Aggregations innerAggs = ((InternalNested) aggregations.asList().getFirst()).getAggregations(); + if (innerAggs.asList().isEmpty()) { + return List.of(); + } + agg = innerAggs.iterator().next(); } else { agg = aggregations.asList().getFirst(); }opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.java (1)
54-56: Add defensive check for empty nested aggregations.Similar to
BucketAggregationParser, this code assumes the nested aggregation has at least one inner aggregation.getFirst()will throwNoSuchElementExceptionif the list is empty.🔎 Proposed defensive check
for (Aggregation aggregation : aggregations) { if (aggregation instanceof Nested) { - aggregation = ((Nested) aggregation).getAggregations().asList().getFirst(); + List<Aggregation> innerAggs = ((Nested) aggregation).getAggregations().asList(); + if (innerAggs.isEmpty()) { + continue; // Skip empty nested aggregations + } + aggregation = innerAggs.getFirst(); }core/src/test/java/org/opensearch/sql/utils/UtilsTest.java (1)
17-92: Good test coverage forUtils.resolveNestedPath.The tests comprehensively cover null/empty fieldTypes, ARRAY type at various nesting levels, STRUCT-only paths, and edge cases. Test naming follows the expected pattern.
Consider adding a test for null
pathparameter to verify the method handles it gracefully:Suggested additional test
@Test void testResolveNestedPathWithNullPath() { Map<String, ExprType> fieldTypes = new HashMap<>(); fieldTypes.put("a", ExprCoreType.ARRAY); assertNull(Utils.resolveNestedPath(null, fieldTypes)); }opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (1)
1204-1212: Correct nested query wrapping inSimpleQueryExpression.builder().The logic properly wraps the query in a
nestedQuerywhennestedPathis present. UsingScoreMode.Noneis appropriate since these are filter predicates that don't need scoring.Minor: Consider using
rel.getNestedPath()instead of direct field accessrel.nestedPathfor consistency with the accessor pattern:- if (rel != null && rel.nestedPath != null) { - return nestedQuery(rel.nestedPath, builder, ScoreMode.None); + if (rel != null && rel.getNestedPath() != null) { + return nestedQuery(rel.getNestedPath(), builder, ScoreMode.None); }core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java (1)
623-640: Verify assertion condition inaddNestedHintToAggregate.The assertion at line 632 expects
indicesWithArgList.size() == nestedList.size(). This assumes callers always pass anestedListwith exactly as many elements as there are aggregation calls with non-empty arg lists.If this invariant is violated (e.g., by a bug in calling code), the assertion will fail in development but could cause silent issues in production if assertions are disabled.
Consider adding a defensive check or clearer exception:
if (indicesWithArgList.size() != nestedList.size()) { throw new IllegalArgumentException( String.format("Nested list size %d does not match aggregate calls with args %d", nestedList.size(), indicesWithArgList.size())); }opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java (1)
76-90: Improve error message clarity for users.The
implementmethod correctly extracts nested aggregation hints and throws an appropriate exception when pushdown cannot be applied. However, the error message could be more user-friendly.🔎 Consider enhancing the error message
The current error message includes a technical list of
AggregateCallobjects which may not be clear to users. Consider a more descriptive format:throw new UnsupportedOperationException( String.format( - "Nested aggregate is unsupported when pushdown cannot be applied for %s.", - nestedAggIndices.stream().map(aggCalls::get).toList())); + "Nested aggregate is unsupported when pushdown cannot be applied. " + + "Nested aggregations at indices %s require pushdown to be enabled and applicable.", + nestedAggIndices));This avoids printing the entire
AggregateCalltoString representation and provides clearer guidance.Note: This is a TODO placeholder implementation per line 78 comment.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (42)
core/src/main/java/org/opensearch/sql/ast/expression/Argument.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/utils/Utils.javacore/src/test/java/org/opensearch/sql/utils/UtilsTest.javadocs/user/ppl/cmd/explain.mddocs/user/ppl/interfaces/endpoint.mdinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javainteg-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.jsoninteg-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/indexDefinitions/nested_simple_index_mapping.jsoninteg-test/src/test/resources/nested_simple.jsoninteg-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4949.ymlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateFilterAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/error/ErrorMessageFactory.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.javaopensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.java
🧰 Additional context used
📓 Path-based instructions (11)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/error/ErrorMessageFactory.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javacore/src/test/java/org/opensearch/sql/utils/UtilsTest.javacore/src/main/java/org/opensearch/sql/utils/Utils.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateFilterAnalyzer.javacore/src/main/java/org/opensearch/sql/ast/expression/Argument.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaopensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/response/error/ErrorMessageFactory.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javacore/src/test/java/org/opensearch/sql/utils/UtilsTest.javacore/src/main/java/org/opensearch/sql/utils/Utils.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.javaopensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateFilterAnalyzer.javacore/src/main/java/org/opensearch/sql/ast/expression/Argument.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaopensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javacore/src/test/java/org/opensearch/sql/utils/UtilsTest.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.javaopensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yamlinteg-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yamlinteg-test/src/test/resources/indexDefinitions/nested_simple_index_mapping.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yamlinteg-test/src/test/resources/nested_simple.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/utils/Utils.javacore/src/main/java/org/opensearch/sql/ast/expression/Argument.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
core/src/main/java/org/opensearch/sql/ast/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/ast/**/*.java: - AST nodes must be immutable where possible
- Follow visitor pattern for AST traversal
- Ensure proper toString() implementation for debugging
Files:
core/src/main/java/org/opensearch/sql/ast/expression/Argument.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4949.ymlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yamlinteg-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javainteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javacore/src/test/java/org/opensearch/sql/utils/UtilsTest.javainteg-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.javainteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yamlcore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.javainteg-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javacore/src/test/java/org/opensearch/sql/utils/UtilsTest.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javacore/src/test/java/org/opensearch/sql/utils/UtilsTest.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yamlinteg-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yamlinteg-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: In opensearch-project/sql, when creating custom Calcite RelNode classes that extend EnumerableLimitSort or other Calcite RelNode types, always override the `copy` method. Without overriding copy, the class will downgrade to its parent class type during copy operations, losing the custom implementation.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.jsoninteg-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yamlopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.javainteg-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.jsonopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.javainteg-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Prefer `Optional<T>` for nullable returns in Java
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧬 Code graph analysis (5)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java (5)
integ-test/src/test/java/org/opensearch/sql/legacy/TestsConstants.java (1)
TestsConstants(9-101)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteInformationSchemaCommandIT.java (1)
Ignore(11-18)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteJsonFunctionsIT.java (1)
Ignore(11-18)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePrometheusDataSourceCommandsIT.java (1)
Ignore(11-18)integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteShowDataSourcesCommandIT.java (1)
Ignore(11-18)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (1)
MatcherUtils(43-502)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java (3)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableIndexScanRule.java (1)
EnumerableIndexScanRule(17-54)opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableSystemIndexScanRule.java (1)
EnumerableSystemIndexScanRule(20-50)opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)
EnumerableNestedAggregateRule(22-67)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java (1)
OpenSearchIndexRules(12-81)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java (1)
EnumerableNestedAggregate(31-96)
🪛 Biome (2.1.2)
integ-test/src/test/resources/nested_simple.json
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 4-5: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 5-6: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 6-7: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 7-8: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 8-9: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 9-10: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
🪛 LanguageTool
docs/user/ppl/cmd/explain.md
[style] ~13-~13: As a shorter alternative for ‘able to’, consider using “can”.
Context: ... plus generated code, if the whole plan is able to pushdown, it equals to standard mode. *...
(BE_ABLE_TO)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Lantao Jin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
1076-1085: Add defensive field existence check beforerelBuilder.field(root).The
relBuilder.field(root)call on line 1084 throwsIllegalArgumentExceptionif the field doesn't exist. Extracting the root portion viasubstringBefore(".")may produce field names that don't exist as standalone fields in the schema—for example, extracting "a" from "a.b" when only the nested form "a.b" exists in the row type.🔎 Proposed fix with defensive check
private boolean containsNestedAggregator(RelBuilder relBuilder, List<RexInputRef> aggCallRefs) { + List<String> fieldNames = relBuilder.peek().getRowType().getFieldNames(); return aggCallRefs.stream() - .map(r -> relBuilder.peek().getRowType().getFieldNames().get(r.getIndex())) + .map(r -> fieldNames.get(r.getIndex())) .map(name -> org.apache.commons.lang3.StringUtils.substringBefore(name, ".")) - .anyMatch(root -> relBuilder.field(root).getType().getSqlTypeName() == SqlTypeName.ARRAY); + .filter(fieldNames::contains) + .anyMatch(root -> relBuilder.field(root).getType().getSqlTypeName() == SqlTypeName.ARRAY); }
🧹 Nitpick comments (3)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (1)
175-187: Consider more specific error message verification.The error message verification on line 186 checks for a generic phrase. If the actual exception message contains more specific details about why pushdown cannot be applied (e.g., specific field names or conditions), consider verifying those details to make the test more robust and catch potential regressions in error reporting.
core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java (2)
39-47: Consider replacingassertwith explicit validation.The
assertstatement (line 40-41) is disabled by default in production JVMs unless the-eaflag is set. If this precondition is critical, consider throwingIllegalArgumentExceptionorIllegalStateExceptioninstead for reliable runtime validation.🔎 Proposed fix
public static void addIgnoreNullBucketHintToAggregate(RelBuilder relBuilder) { - assert relBuilder.peek() instanceof LogicalAggregate - : "Hint HINT_AGG_ARGUMENTS can be added to LogicalAggregate only"; + if (!(relBuilder.peek() instanceof LogicalAggregate)) { + throw new IllegalStateException("Hint AGG_ARGS can be added to LogicalAggregate only"); + } final RelHint statHint =
45-47: Consider extracting common hint strategy table setup.Both
addIgnoreNullBucketHintToAggregateandaddNestedAggCallHintToAggregatehave identical logic for ensuring the hint strategy table is set. This could be extracted to a private helper method to reduce duplication.🔎 Proposed refactor
+ private static void ensureHintStrategyTableInstalled(RelBuilder relBuilder) { + if (relBuilder.getCluster().getHintStrategies() == HintStrategyTable.EMPTY) { + relBuilder.getCluster().setHintStrategies(HINT_STRATEGY_TABLE.get()); + } + } + public static void addIgnoreNullBucketHintToAggregate(RelBuilder relBuilder) { // ... validation and hint creation ... relBuilder.hints(statHint); - if (relBuilder.getCluster().getHintStrategies() == HintStrategyTable.EMPTY) { - relBuilder.getCluster().setHintStrategies(HINT_STRATEGY_TABLE.get()); - } + ensureHintStrategyTableInstalled(relBuilder); }Also applies to: 60-62
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javadocs/user/ppl/limitations/limitations.mdinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
💤 Files with no reviewable changes (1)
- core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
🚧 Files skipped from review as they are similar to previous changes (2)
- integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
- opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
🧰 Additional context used
📓 Path-based instructions (7)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (9)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javacore/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javacore/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: In opensearch-project/sql, when creating custom Calcite RelNode classes that extend EnumerableLimitSort or other Calcite RelNode types, always override the `copy` method. Without overriding copy, the class will downgrade to its parent class type during copy operations, losing the custom implementation.
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
🧬 Code graph analysis (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (1)
MatcherUtils(43-502)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java (1)
CalciteEnumerableNestedAggregate(31-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: WhiteSource Security Check
🔇 Additional comments (14)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2)
22-22: LGTM!The import is correctly used in the error handling test at line 2308.
2220-2309: Nested aggregation pushdown tests are properly structured with all expected plan files in place.All 7 referenced YAML files exist in
integ-test/src/test/resources/expectedOutput/calcite/. The tests follow consistent patterns withenabledOnlyWhenPushdownIsEnabled()guards, proper YAML plan comparisons viaassertYamlEqualsIgnoreId(), and are backed by test data loaded in the classinit()method (Index.NESTED_SIMPLE).The test coverage appropriately exercises nested aggregation pushdown scenarios:
- Multiple metrics with grouping (stats count, min, max, avg)
- Single field aggregations (count on nested.area)
- Aggregation with sorting
- Grouping with both bucket_nullable states
- Top command on nested fields
- Correct handling of dedup (verified as NOT pushed)
- Error handling when aggregation cannot be applied
Consider expanding coverage to also test:
- Multiple nested paths in a single query (e.g., stats on both
address.areaand another nested path)- Nested aggregation with filter predicates (e.g., where conditions before stats)
- Nested fields combined with eval expressions
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java (1)
646-646: LGTM!Clean refactor that delegates hint inspection to the new
PPLHintUtilsutility, consolidating hint-related logic in one place.core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java (1)
65-83: LGTM!The accessor methods follow Calcite's hint inspection patterns correctly. The string-based boolean handling ("true"/"false") is consistent with how Calcite's
RelHint.kvOptionsstores values.opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)
22-65: LGTM!The converter rule follows Calcite's
ConverterRulepatterns correctly:
- Uses hint-based detection via
PPLHintUtils.hasNestedAggCallto identify nested aggregation scenarios.- Returns
nullwhen the rule shouldn't apply (no hint) or on validation failure, which is the expected Calcite rule behavior.- Debug logging for
InvalidRelExceptionis appropriate for troubleshooting without cluttering production logs.opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (3)
181-198: LGTM!The
buildSchema()method correctly combines base fields with resolved nested paths usingUtils.resolveNestedPath. The use ofdistinct()prevents duplicate nested paths, and the method is appropriately scoped as private.
379-388: LGTM - Core fix for nested aggregation handling.This guard correctly prevents aggregation pushdown when bucket fields are of ARRAY (nested) type, which addresses the root cause of bucket-dependent aggregations failing on nested fields. The debug log provides useful diagnostic information.
390-390: LGTM!The inversion logic is correct: when
ignoreNullBuckethint istrue, bucket values should exclude nulls, hencebucketNullable = false.core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (3)
1066-1067: LGTM!The hint application logic correctly applies
ignoreNullBucketandnestedAggCallhints after the aggregate is built, following the patterns established inPPLHintUtils.
1192-1222: LGTM!The refactored aggregation flow:
- Renamed variable to
hintIgnoreNullBucketimproves clarity by matching the hint semantics.- Conditional filter application (lines 1213-1219) is a good optimization that avoids building unnecessary filter nodes when
nonNullFieldsis empty.- The hint flag is correctly propagated to
aggregateWithTrimming.
2350-2362: LGTM!The
visitRareTopNchanges are consistent with the refactoring invisitAggregation, correctly using the renamedhintIgnoreNullBucketvariable.opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java (3)
60-75: LGTM -copymethod correctly overridden.Based on learnings, custom Calcite RelNode classes must override the
copymethod to prevent downgrading to the parent class type during copy operations. This implementation correctly returns a newCalciteEnumerableNestedAggregateinstance, preserving the custom behavior.
77-83: Intentional placeholder for pushdown-only execution.The
implementmethod throwingUnsupportedOperationExceptionis intentional per the PR design. This class exists to represent nested aggregations in the Calcite plan; if the pushdown rules don't convert it to an OpenSearch-native execution, this exception clearly communicates that nested aggregation requires pushdown support. The error message includesaggCallsfor debugging context.
85-88: LGTM!The 0.9 cost multiplier is a standard Calcite pattern to make this RelNode slightly preferred by the optimizer when nested aggregation is applicable, guiding plan selection toward the pushdown-capable implementation.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
Show resolved
Hide resolved
Signed-off-by: Lantao Jin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (1)
84-84: Typo in variable name:dedupNumer→dedupNumber.Minor naming inconsistency that could cause confusion during maintenance.
🔎 Proposed fix
- Integer dedupNumer = literal.getValueAs(Integer.class); + Integer dedupNumber = literal.getValueAs(Integer.class);Also update the usage at line 136:
- relBuilder.aggregate( - relBuilder.groupKey(relBuilder.fields(newGroupByList)), relBuilder.literalAgg(dedupNumer)); + relBuilder.aggregate( + relBuilder.groupKey(relBuilder.fields(newGroupByList)), relBuilder.literalAgg(dedupNumber));
♻️ Duplicate comments (3)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (3)
22-29: JavaDoc is missing for public class and methods.Past review comments already flag this issue. Please add JavaDoc for the public class and the
init()method as previously requested.
150-157: Remove debug print statement.Past review comments already flag the
System.out.printlnon line 155. Please remove it as previously requested.
159-171: Remove debug print statement.Past review comments already flag the
System.out.printlnon line 165. Please remove it as previously requested.
🧹 Nitpick comments (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (1)
110-117: Consider adding a debug log for consistency.The MAP/ARRAY fallback (lines 75-77) logs a debug message explaining why pushdown was skipped, but this nested-field fallback silently returns. For debugging/troubleshooting parity, consider adding a similar log statement.
🔎 Suggested improvement
if (targetProjections.stream() .anyMatch( pair -> Utils.resolveNestedPath(pair.getValue(), scan.getOsIndex().getFieldTypes()) != null)) { - // fallback to non-pushdown if the dedup columns contain nested fields. + LOG.debug("Cannot pushdown the dedup since the dedup fields contain nested types"); return; }integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (1)
78-93: Consider adding schema verification for consistency.While the test correctly verifies data rows, it lacks schema verification unlike other similar tests in this class. Adding schema verification would make the test more comprehensive and consistent.
🔎 Suggested enhancement
@Test public void testNestedAggregationBySpan() throws IOException { enabledOnlyWhenPushdownIsEnabled(); JSONObject actual = executeQuery( String.format( "source=%s | stats count(address.area) as count_area, min(address.area) as" + " min_area, max(address.area) as max_area, avg(address.area) as avg_area," + " avg(age) as avg_age by span(age, 10)", TEST_INDEX_NESTED_SIMPLE)); + verifySchemaInOrder( + actual, + isCalciteEnabled() ? schema("count_area", "bigint") : schema("count_area", "int"), + schema("min_area", "double"), + schema("max_area", "double"), + schema("avg_area", "double"), + schema("avg_age", "double"), + schema("span(age,10)", "int")); verifyDataRows( actual, rows(0, null, null, null, 19, 10), rows(7, 10.24, 429.79, 241.43714285714285, 25, 20), rows(2, 9.99, 1000.99, 505.49, 32, 30)); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/user/ppl/limitations/limitations.mdinteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.javaopensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/user/ppl/limitations/limitations.md
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.javaopensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (1)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
PPLIntegTestCase(36-415)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (21, doc)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (13)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (5)
9-9: LGTM!The new imports are appropriately added and all are actively used within the file.
Also applies to: 44-44, 66-66
128-138: LGTM!The refactoring to use centralized rule lists from
OpenSearchIndexRulesimproves maintainability. TheNESTED_AGGREGATE_RULEis now properly included in the non-pushdown rules, which aligns with the PR objective of supporting nested aggregations.
140-150: LGTM!Using
buildSchema()instead of directly accessing field names ensures that nested paths are properly included in the schema for filter analysis.
373-407: LGTM with observation.The changes correctly implement:
- Schema building with nested path resolution
- Guard against ARRAY (nested) types in bucket fields with appropriate TODO reference
- Cleaner hint-based null bucket handling via
PPLHintUtils.ignoreNullBucket()- Proper bucket names extraction for
AggPushDownActionThe nested type guard (lines 380-389) properly filters out null types before checking, which handles computed fields not in the schema. As per the coding guidelines, the Calcite-specific workaround is documented via the TODO comment.
181-198: No issues with this code. ThebuildSchema()method is well-documented and correctly implements nested path resolution. The project targets Java 21, which fully supports.toList()(introduced in Java 16), and this pattern is used throughout the codebase.Likely an incorrect or invalid review comment.
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (2)
69-79: LGTM!The extended type check correctly guards against both MAP and ARRAY types, with a clear TODO reference for future enhancement and an appropriately updated log message.
139-139: Method signature confirmed in PPLHintUtils. The migration fromPlanUtilstoPPLHintUtilsis correctly implemented. The method signatureaddIgnoreNullBucketHintToAggregate(RelBuilder relBuilder)matches the usage at line 139, and consistent usage is found across the codebase (e.g., CalciteRelNodeVisitor.java line 1066).integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (6)
31-49: LGTM! Comprehensive nested aggregation test.The test correctly guards with
enabledOnlyWhenPushdownIsEnabled()and uses conditional schema verification to account for Calcite type differences. The assertions cover both schema and data validation comprehensively.
51-76: LGTM! Test coverage for grouped aggregation is thorough.The test validates nested aggregation with grouping and verifies multiple expected result rows, ensuring the "by" clause works correctly with nested fields.
95-104: LGTM! Focused test for count aggregation.The test appropriately validates count operations on nested fields with a clear, focused assertion.
106-133: LGTM! Comprehensive test for nested path grouping.The test thoroughly validates aggregation grouped by a nested path with extensive data row verification, ensuring correct behavior across multiple result rows.
135-148: LGTM! Test validates nested root grouping.The test correctly verifies that aggregation grouped by the nested root object works, with appropriate schema and row count validation.
173-185: LGTM! Important test for error handling.The test correctly validates that an
UnsupportedOperationExceptionis thrown when nested aggregation pushdown cannot be applied, using the appropriateassertThrowsWithReplaceutility and verifying the error message content.
| // Used to track the current sub-builder as analysis progresses | ||
| Builder subBuilder = newMetricBuilder; | ||
| // Push auto date span & case in group-by list into nested aggregations | ||
| // Push auto date span & case in group-by list into structured aggregations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid ambiguous, we call the multiple sub-aggregations structured aggregations instead of nested
|
|
||
| private static Pair<Builder, List<MetricParser>> processAggregateCalls( | ||
| List<String> aggFieldNames, | ||
| List<String> aggNames, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the value could be min(a.b), count(c), rename aggFieldNames to aggNames.
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4979-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 77633ef589d839c57c4c00fdc8a86d70a02d74d8
# Push it to GitHub
git push --set-upstream origin backport/backport-4979-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
) * refactor: throw exception if pushdown cannot be applied Signed-off-by: Lantao Jin <[email protected]> * fix tests Signed-off-by: Lantao Jin <[email protected]> * fix IT Signed-off-by: Lantao Jin <[email protected]> * Support top/dedup/aggregate by nested sub-fields Signed-off-by: Lantao Jin <[email protected]> * fix typo Signed-off-by: Lantao Jin <[email protected]> * address comments Signed-off-by: Lantao Jin <[email protected]> * minor fixing Signed-off-by: Lantao Jin <[email protected]> --------- Signed-off-by: Lantao Jin <[email protected]> (cherry picked from commit 77633ef)
Description
Refactor implementation for PPL: #3696 (closed)
Deprecated implementation for SQL: #2814 (closed)
Support nested aggregation in PPL only when calcite enabled.
With this PR, follow PPL query is able to execute a nested aggregation query.
And it equals the DSL
Follow queries (group-by nested path) are supported with this PR as well:
Limitation:
Related Issues
Resolves #4949, #4564, #2813 and #2529, and #2739
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.