Skip to content

Conversation

@LantaoJin
Copy link
Member

@LantaoJin LantaoJin commented Dec 19, 2025

Description

Refactor implementation for PPL: #3696 (closed)
Deprecated implementation for SQL: #2814 (closed)

Support nested aggregation in PPL only when calcite enabled.

With this PR, follow PPL query is able to execute a nested aggregation query.

source=logs | head 10000 | stats min(pages.load_time)

And it equals the DSL

GET logs/_search
{
  "aggs": {
    "pages": {
      "nested": {
        "path": "pages"
      },
      "aggs": {
        "min_load_time": { "min": { "field": "pages.load_time" } }
      }
    }
  }
}

Follow queries (group-by nested path) are supported with this PR as well:

source=test | top pages.load_time
source=test | stats count() by pages.load_time
source=test | dedup pages.load_time

Limitation:

  • PPL only
  • Calcite should be enabled
  • Throw UnsupportedOperationException if pushdown cannot be applied.
  • Follow queries (group-by nested root path) are supported with this PR without pushdown enahncement:
source=test | top pages
source=test | stats count() by pages
source=test | dedup pages

Related Issues

Resolves #4949, #4564, #2813 and #2529, and #2739

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 19, 2025

📝 Walkthrough

Walkthrough

Adds nested-aggregation support and hint-based aggregation metadata across Calcite planning and OpenSearch pushdown layers, including nested-path resolution utilities, new hint utilities, a converter rule and physical nested aggregate, pushdown adaptations, response parsing updates, and integration tests/resources.

Changes

Cohort / File(s) Summary
Calcite hint utils & visitor
core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java, core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
New PPLHintUtils to attach/query ignoreNullBucket and hasNestedAggCall hints; CalciteRelNodeVisitor now computes and propagates hintIgnoreNullBucket, detects nested aggregators, and applies both hints during aggregation planning.
Calcite prepare & removed strategy
core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java, core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
Replaced protected factory API with public createPrepare(); removed standalone PPLHintStrategyTable (memoized table moved/handled in PPLHintUtils).
Plan utilities & schema helpers
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java, core/src/main/java/org/opensearch/sql/utils/Utils.java, core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
Removed legacy add-ignore-null hint helper; added Utils.resolveNestedPath and zipWithIndex plus unit tests for nested-path resolution.
Index scan & rule registry
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java, opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java
buildSchema() added to include nested-paths; pushdown rule registration updated to separate non-pushdown/pushdown rule sets; bucket nullability now derived via PPLHintUtils.
New nested-agg conversion & physical rel
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java, opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
Added ConverterRule to convert hinted LogicalAggregate to a new CalciteEnumerableNestedAggregate physical rel (validation, cost tweak, implement() unimplemented).
Aggregate analysis & pushdown actions
opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java, opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateFilterAnalyzer.java, opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
Renamed aggFieldName→aggName; integrated Utils.resolveNestedPath to detect nested paths; AggPushDownAction unwraps/wraps NestedAggregationBuilder and adjusts composite/nested handling across push/re-push logic.
Predicate & response parsing
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java, opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.java, opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.java
NamedFieldExpression gains nestedPath (via Utils.resolveNestedPath); SimpleQueryExpression emits nestedQuery when present; parsers unwrap Nested/InternalNested aggregations before parsing.
Dedup & other planner rule tweaks
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
Dedup eligibility broadened to ARRAY/MAP; prevents pushdown when dedup columns resolve to nested paths; hint handling switched to PPLHintUtils.
Response/error handling & test support
opensearch/src/main/java/org/opensearch/sql/opensearch/response/error/ErrorMessageFactory.java, opensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.java
Improved cause unwrapping and wrapping for non-OpenSearch exceptions; AggregationResponseUtils registers Nested parsing support.
Docs, tests & resources
docs/...; integ-test/src/... (many explain YAML/JSON, nested_simple mapping & data), integ-test/src/test/java/.../CalcitePPLNestedAggregationIT.java, integ-test/src/test/java/.../CalciteExplainIT.java, others
Added/updated integration tests and expected explain outputs for nested aggregation scenarios; adjusted various integration tests and fixtures; documentation updates to explain examples and limitations.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant CalciteRelNodeVisitor
    participant PPLHintUtils
    participant CalciteLogicalIndexScan
    participant EnumerableNestedAggregateRule
    participant AggregateAnalyzer
    participant OpenSearch

    Client->>CalciteRelNodeVisitor: visitAggregation(rel)
    CalciteRelNodeVisitor->>CalciteRelNodeVisitor: collect aggCallRefs
    CalciteRelNodeVisitor->>CalciteRelNodeVisitor: compute hintIgnoreNullBucket
    alt nested aggregator detected
        CalciteRelNodeVisitor->>PPLHintUtils: addNestedAggCallHintToAggregate(relBuilder)
    end
    alt hintIgnoreNullBucket true
        CalciteRelNodeVisitor->>PPLHintUtils: addIgnoreNullBucketHintToAggregate(relBuilder)
    end
    CalciteRelNodeVisitor->>CalciteLogicalIndexScan: pushDownAggregate(aggregate, hints)
    CalciteLogicalIndexScan->>CalciteLogicalIndexScan: buildSchema() / resolveNestedPath()
    CalciteLogicalIndexScan->>EnumerableNestedAggregateRule: rule matching (hints)
    EnumerableNestedAggregateRule->>EnumerableNestedAggregateRule: convert -> CalciteEnumerableNestedAggregate
    CalciteLogicalIndexScan->>AggregateAnalyzer: analyze(agg, bucketNames, nested info)
    AggregateAnalyzer->>AggregateAnalyzer: resolveNestedPath & build aggregations (may wrap NestedAggregationBuilder)
    CalciteLogicalIndexScan->>OpenSearch: execute pushdown with nested aggregation
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Possibly related PRs

Suggested labels

PPL, pushdown

Suggested reviewers

  • penghuo
  • anirudha
  • derek-ho
  • kavithacm
  • joshuali925

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 19.42% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Support nested aggregation when calcite enabled' accurately describes the main change: adding nested aggregation support for PPL when Calcite is enabled.
Linked Issues check ✅ Passed The PR code changes implement nested aggregation support for PPL when Calcite is enabled, directly addressing issue #4949 which requires bucket-dependent aggregations (top, stats by, dedup) to work correctly on nested fields.
Out of Scope Changes check ✅ Passed All changes are scoped to implementing nested aggregation support: hint utilities, rule additions, aggregation analyzers, schema handling, and comprehensive tests aligned with the PR objectives.
Description check ✅ Passed The pull request description clearly explains the feature (nested aggregation support in PPL), provides examples of supported queries, documents known limitations, and references related issues.
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: Lantao Jin <[email protected]>
@LantaoJin LantaoJin added the enhancement New feature or request label Dec 20, 2025
Signed-off-by: Lantao Jin <[email protected]>
Comment on lines -178 to -179
protected Function0<CalcitePrepare> createPrepareFactory() {
return OpenSearchPrepareImpl::new;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After upgraded to 1.41. this method was not called any more, change it to createPrepare()


@Override
protected PreparedResult implement(RelRoot root) {
Hook.PLAN_BEFORE_IMPLEMENTATION.run(root);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hook was called twice for non-full-scannable plan

curl -sS -H 'Content-Type: application/json' \
-X POST localhost:9200/_plugins/_ppl/_explain?format=extended \
-d '{"query" : "source=state_country | where age>30"}'
-d '{"query" : "source=state_country | head 10 | where age>30"}'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid whole-plan-pushdown

Comment on lines 36 to +39
if (result.getCause() == null) {
if (result.getSuppressed().length > 0) {
return result.getSuppressed()[0];
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some exception without cause (root exception), the actual root cause may be attached in its suppressed cause.

return new OpenSearchErrorMessage(exception, exception.status().getStatus());
}
return new ErrorMessage(e, status);
return new ErrorMessage(cause, status);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to display the root cause instead of wrapping exception.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (11)
docs/user/ppl/interfaces/endpoint.md (1)

114-127: Clarify the rationale for the example query change and execution strategy.

Line 114 changes the example query from where age>30 to head 10 | where age>30, which fundamentally alters the execution plan and demonstrates a different execution strategy (enumerable-based with runtime filtering instead of pure pushdown). While the previous review comment "avoid whole-plan-pushdown" suggests this is intentional, the documentation doesn't explain:

  1. Why this query pattern was chosen as the extended format example
  2. How the head 10 operation affects plan structure (introducing LogicalSort)
  3. Why the physical plan shifts to enumerable execution instead of full pushdown
  4. What the trade-offs are

Additionally, the YAML format example (line 139) still uses the original query without head 10, creating an inconsistency. Clarify whether both examples should use the same query or if they're intentionally demonstrating different scenarios.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)

1076-1086: JavaDoc example is slightly inconsistent with the implementation.

The JavaDoc example shows aggCalls: [count(), count(a.b), avg(a.c)] but the method takes aggCallRefs (which are RexInputRef indices), not the original AggCall objects. Consider clarifying that the input is the list of RexInputRef extracted from aggCalls.

Also, the example shows aggCallRefs [1, 2] but doesn't explain why index 0 is excluded (likely because count() has no field reference). Adding this clarification would improve understanding.

core/src/main/java/org/opensearch/sql/utils/Utils.java (1)

18-27: Consider using ArrayList instead of LinkedList for better performance.

LinkedList has poor cache locality and higher memory overhead compared to ArrayList. Since you're only appending to the list and not inserting/removing from the middle, ArrayList would be more efficient:

🔎 Proposed refactor
 static <I> List<Pair<I, Integer>> zipWithIndex(List<I> input) {
-  LinkedList<Pair<I, Integer>> result = new LinkedList<>();
-  Iterator<I> iter = input.iterator();
-  int index = 0;
-  while (iter.hasNext()) {
-    result.add(Pair.of(iter.next(), index++));
+  List<Pair<I, Integer>> result = new ArrayList<>(input.size());
+  for (int i = 0; i < input.size(); i++) {
+    result.add(Pair.of(input.get(i), i));
   }
   return result;
 }
docs/user/ppl/cmd/explain.md (1)

13-13: Minor grammar improvement suggested.

The description could be clearer with improved grammar:

🔎 Proposed fix
-  * extended: Display the standard information plus generated code, if the whole plan is able to pushdown, it equals to standard mode.
+  * extended: Display the standard information plus generated code. If the whole plan can be pushed down, it is equivalent to standard mode.
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (1)

108-115: Good defensive check for nested field paths.

This additional guard using Utils.resolveNestedPath ensures that pushdown is also prevented when dedup columns reference nested sub-fields (e.g., nested.subfield), even if the immediate column type isn't ARRAY. This complements the type-based check above.

Consider adding a debug log message here similar to line 74 for consistency and debugging visibility:

🔎 Proposed enhancement
     if (targetProjections.stream()
         .anyMatch(
             pair ->
                 Utils.resolveNestedPath(pair.getValue(), scan.getOsIndex().getFieldTypes())
                     != null)) {
+      LOG.debug("Cannot pushdown the dedup since the dedup fields contain nested sub-fields");
       // fallback to non-pushdown if the dedup columns contain nested fields.
       return;
     }
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.java (1)

53-66: Add defensive check for empty inner aggregations.

When the first aggregation is InternalNested, the code assumes there's at least one inner aggregation. If the nested aggregation has no inner aggregations, iterator().next() will throw NoSuchElementException.

🔎 Proposed defensive check
   @Override
   public List<Map<String, Object>> parse(Aggregations aggregations) {
     Aggregation agg;
     if (aggregations.asList().getFirst() instanceof InternalNested) {
-      agg = ((InternalNested) aggregations.asList().getFirst()).getAggregations().iterator().next();
+      Aggregations innerAggs = ((InternalNested) aggregations.asList().getFirst()).getAggregations();
+      if (innerAggs.asList().isEmpty()) {
+        return List.of();
+      }
+      agg = innerAggs.iterator().next();
     } else {
       agg = aggregations.asList().getFirst();
     }
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.java (1)

54-56: Add defensive check for empty nested aggregations.

Similar to BucketAggregationParser, this code assumes the nested aggregation has at least one inner aggregation. getFirst() will throw NoSuchElementException if the list is empty.

🔎 Proposed defensive check
     for (Aggregation aggregation : aggregations) {
       if (aggregation instanceof Nested) {
-        aggregation = ((Nested) aggregation).getAggregations().asList().getFirst();
+        List<Aggregation> innerAggs = ((Nested) aggregation).getAggregations().asList();
+        if (innerAggs.isEmpty()) {
+          continue; // Skip empty nested aggregations
+        }
+        aggregation = innerAggs.getFirst();
       }
core/src/test/java/org/opensearch/sql/utils/UtilsTest.java (1)

17-92: Good test coverage for Utils.resolveNestedPath.

The tests comprehensively cover null/empty fieldTypes, ARRAY type at various nesting levels, STRUCT-only paths, and edge cases. Test naming follows the expected pattern.

Consider adding a test for null path parameter to verify the method handles it gracefully:

Suggested additional test
@Test
void testResolveNestedPathWithNullPath() {
    Map<String, ExprType> fieldTypes = new HashMap<>();
    fieldTypes.put("a", ExprCoreType.ARRAY);
    assertNull(Utils.resolveNestedPath(null, fieldTypes));
}
opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (1)

1204-1212: Correct nested query wrapping in SimpleQueryExpression.builder().

The logic properly wraps the query in a nestedQuery when nestedPath is present. Using ScoreMode.None is appropriate since these are filter predicates that don't need scoring.

Minor: Consider using rel.getNestedPath() instead of direct field access rel.nestedPath for consistency with the accessor pattern:

-      if (rel != null && rel.nestedPath != null) {
-        return nestedQuery(rel.nestedPath, builder, ScoreMode.None);
+      if (rel != null && rel.getNestedPath() != null) {
+        return nestedQuery(rel.getNestedPath(), builder, ScoreMode.None);
       }
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java (1)

623-640: Verify assertion condition in addNestedHintToAggregate.

The assertion at line 632 expects indicesWithArgList.size() == nestedList.size(). This assumes callers always pass a nestedList with exactly as many elements as there are aggregation calls with non-empty arg lists.

If this invariant is violated (e.g., by a bug in calling code), the assertion will fail in development but could cause silent issues in production if assertions are disabled.

Consider adding a defensive check or clearer exception:

if (indicesWithArgList.size() != nestedList.size()) {
    throw new IllegalArgumentException(
        String.format("Nested list size %d does not match aggregate calls with args %d",
            nestedList.size(), indicesWithArgList.size()));
}
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java (1)

76-90: Improve error message clarity for users.

The implement method correctly extracts nested aggregation hints and throws an appropriate exception when pushdown cannot be applied. However, the error message could be more user-friendly.

🔎 Consider enhancing the error message

The current error message includes a technical list of AggregateCall objects which may not be clear to users. Consider a more descriptive format:

 throw new UnsupportedOperationException(
     String.format(
-        "Nested aggregate is unsupported when pushdown cannot be applied for %s.",
-        nestedAggIndices.stream().map(aggCalls::get).toList()));
+        "Nested aggregate is unsupported when pushdown cannot be applied. "
+            + "Nested aggregations at indices %s require pushdown to be enabled and applicable.",
+        nestedAggIndices));

This avoids printing the entire AggregateCall toString representation and provides clearer guidance.

Note: This is a TODO placeholder implementation per line 78 comment.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08be6f9 and 95ddffb.

📒 Files selected for processing (42)
  • core/src/main/java/org/opensearch/sql/ast/expression/Argument.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/utils/Utils.java
  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
  • docs/user/ppl/cmd/explain.md
  • docs/user/ppl/interfaces/endpoint.md
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json
  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/indexDefinitions/nested_simple_index_mapping.json
  • integ-test/src/test/resources/nested_simple.json
  • integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4949.yml
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateFilterAnalyzer.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/error/ErrorMessageFactory.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.java
🧰 Additional context used
📓 Path-based instructions (11)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/error/ErrorMessageFactory.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
  • core/src/main/java/org/opensearch/sql/utils/Utils.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateFilterAnalyzer.java
  • core/src/main/java/org/opensearch/sql/ast/expression/Argument.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/BucketAggregationParser.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/MetricParserHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/response/error/ErrorMessageFactory.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
  • core/src/main/java/org/opensearch/sql/utils/Utils.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateFilterAnalyzer.java
  • core/src/main/java/org/opensearch/sql/ast/expression/Argument.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • opensearch/src/test/java/org/opensearch/sql/opensearch/response/AggregationResponseUtils.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
integ-test/src/test/resources/**/*

⚙️ CodeRabbit configuration file

integ-test/src/test/resources/**/*: - Verify test data is realistic and representative

  • Check data format matches expected schema
  • Ensure test data covers edge cases and boundary conditions

Files:

  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yaml
  • integ-test/src/test/resources/indexDefinitions/nested_simple_index_mapping.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yaml
  • integ-test/src/test/resources/nested_simple.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json
core/src/main/java/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit

  • Public methods MUST have JavaDoc with @param, @return, and @throws
  • Follow existing function implementation patterns in the same package
  • New expression functions should follow ExpressionFunction interface patterns
  • Validate function naming follows project conventions (camelCase)

Files:

  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/utils/Utils.java
  • core/src/main/java/org/opensearch/sql/ast/expression/Argument.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
core/src/main/java/org/opensearch/sql/ast/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/ast/**/*.java: - AST nodes must be immutable where possible

  • Follow visitor pattern for AST traversal
  • Ensure proper toString() implementation for debugging

Files:

  • core/src/main/java/org/opensearch/sql/ast/expression/Argument.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read

  • Suggest extracting complex logic into helper methods
  • Check for code organization and logical grouping
  • Validate all RelNode types are handled

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4949.yml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yaml
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/CastFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • core/src/test/java/org/opensearch/sql/utils/UtilsTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLStringBuiltinFunctionIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteArrayFunctionIT.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/CalciteToolsHelper.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_dedup_not_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_top_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_single_count_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_by_bucket_nullable_push.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_skip_script_encoding.json
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: In opensearch-project/sql, when creating custom Calcite RelNode classes that extend EnumerableLimitSort or other Calcite RelNode types, always override the `copy` method. Without overriding copy, the class will downgrade to its parent class type during copy operations, losing the custom implementation.

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_partial_filter_isnull.json
  • integ-test/src/test/resources/expectedOutput/calcite/explain_nested_agg_sort_push.yaml
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_partial_filter_isnull.json
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_extended_for_standardization.json
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Prefer `Optional<T>` for nullable returns in Java

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧬 Code graph analysis (5)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStatsCommandIT.java (5)
integ-test/src/test/java/org/opensearch/sql/legacy/TestsConstants.java (1)
  • TestsConstants (9-101)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteInformationSchemaCommandIT.java (1)
  • Ignore (11-18)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteJsonFunctionsIT.java (1)
  • Ignore (11-18)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePrometheusDataSourceCommandsIT.java (1)
  • Ignore (11-18)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteShowDataSourcesCommandIT.java (1)
  • Ignore (11-18)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (1)
  • MatcherUtils (43-502)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java (3)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableIndexScanRule.java (1)
  • EnumerableIndexScanRule (17-54)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableSystemIndexScanRule.java (1)
  • EnumerableSystemIndexScanRule (20-50)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)
  • EnumerableNestedAggregateRule (22-67)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/OpenSearchIndexRules.java (1)
  • OpenSearchIndexRules (12-81)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/EnumerableNestedAggregate.java (1)
  • EnumerableNestedAggregate (31-96)
🪛 Biome (2.1.2)
integ-test/src/test/resources/nested_simple.json

[error] 2-3: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 3-4: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 4-5: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 5-6: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 6-7: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 7-8: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 8-9: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 9-10: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

🪛 LanguageTool
docs/user/ppl/cmd/explain.md

[style] ~13-~13: As a shorter alternative for ‘able to’, consider using “can”.
Context: ... plus generated code, if the whole plan is able to pushdown, it equals to standard mode. *...

(BE_ABLE_TO)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: test-sql-cli-integration (21)

Signed-off-by: Lantao Jin <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)

1076-1085: Add defensive field existence check before relBuilder.field(root).

The relBuilder.field(root) call on line 1084 throws IllegalArgumentException if the field doesn't exist. Extracting the root portion via substringBefore(".") may produce field names that don't exist as standalone fields in the schema—for example, extracting "a" from "a.b" when only the nested form "a.b" exists in the row type.

🔎 Proposed fix with defensive check
  private boolean containsNestedAggregator(RelBuilder relBuilder, List<RexInputRef> aggCallRefs) {
+   List<String> fieldNames = relBuilder.peek().getRowType().getFieldNames();
    return aggCallRefs.stream()
-       .map(r -> relBuilder.peek().getRowType().getFieldNames().get(r.getIndex()))
+       .map(r -> fieldNames.get(r.getIndex()))
        .map(name -> org.apache.commons.lang3.StringUtils.substringBefore(name, "."))
-       .anyMatch(root -> relBuilder.field(root).getType().getSqlTypeName() == SqlTypeName.ARRAY);
+       .filter(fieldNames::contains)
+       .anyMatch(root -> relBuilder.field(root).getType().getSqlTypeName() == SqlTypeName.ARRAY);
  }
🧹 Nitpick comments (3)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (1)

175-187: Consider more specific error message verification.

The error message verification on line 186 checks for a generic phrase. If the actual exception message contains more specific details about why pushdown cannot be applied (e.g., specific field names or conditions), consider verifying those details to make the test more robust and catch potential regressions in error reporting.

core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java (2)

39-47: Consider replacing assert with explicit validation.

The assert statement (line 40-41) is disabled by default in production JVMs unless the -ea flag is set. If this precondition is critical, consider throwing IllegalArgumentException or IllegalStateException instead for reliable runtime validation.

🔎 Proposed fix
 public static void addIgnoreNullBucketHintToAggregate(RelBuilder relBuilder) {
-   assert relBuilder.peek() instanceof LogicalAggregate
-       : "Hint HINT_AGG_ARGUMENTS can be added to LogicalAggregate only";
+   if (!(relBuilder.peek() instanceof LogicalAggregate)) {
+     throw new IllegalStateException("Hint AGG_ARGS can be added to LogicalAggregate only");
+   }
    final RelHint statHint =

45-47: Consider extracting common hint strategy table setup.

Both addIgnoreNullBucketHintToAggregate and addNestedAggCallHintToAggregate have identical logic for ensuring the hint strategy table is set. This could be extracted to a private helper method to reduce duplication.

🔎 Proposed refactor
+  private static void ensureHintStrategyTableInstalled(RelBuilder relBuilder) {
+    if (relBuilder.getCluster().getHintStrategies() == HintStrategyTable.EMPTY) {
+      relBuilder.getCluster().setHintStrategies(HINT_STRATEGY_TABLE.get());
+    }
+  }
+
   public static void addIgnoreNullBucketHintToAggregate(RelBuilder relBuilder) {
     // ... validation and hint creation ...
     relBuilder.hints(statHint);
-    if (relBuilder.getCluster().getHintStrategies() == HintStrategyTable.EMPTY) {
-      relBuilder.getCluster().setHintStrategies(HINT_STRATEGY_TABLE.get());
-    }
+    ensureHintStrategyTableInstalled(relBuilder);
   }

Also applies to: 60-62

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95ddffb and 085a584.

📒 Files selected for processing (12)
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • docs/user/ppl/limitations/limitations.md
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
💤 Files with no reviewable changes (1)
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintStrategyTable.java
🚧 Files skipped from review as they are similar to previous changes (2)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
🧰 Additional context used
📓 Path-based instructions (7)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
core/src/main/java/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit

  • Public methods MUST have JavaDoc with @param, @return, and @throws
  • Follow existing function implementation patterns in the same package
  • New expression functions should follow ExpressionFunction interface patterns
  • Validate function naming follows project conventions (camelCase)

Files:

  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read

  • Suggest extracting complex logic into helper methods
  • Check for code organization and logical grouping
  • Validate all RelNode types are handled

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (9)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: In opensearch-project/sql, when creating custom Calcite RelNode classes that extend EnumerableLimitSort or other Calcite RelNode types, always override the `copy` method. Without overriding copy, the class will downgrade to its parent class type during copy operations, losing the custom implementation.

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java
🧬 Code graph analysis (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (1)
  • MatcherUtils (43-502)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java (1)
  • CalciteEnumerableNestedAggregate (31-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: test-sql-cli-integration (21)
  • GitHub Check: WhiteSource Security Check
🔇 Additional comments (14)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2)

22-22: LGTM!

The import is correctly used in the error handling test at line 2308.


2220-2309: Nested aggregation pushdown tests are properly structured with all expected plan files in place.

All 7 referenced YAML files exist in integ-test/src/test/resources/expectedOutput/calcite/. The tests follow consistent patterns with enabledOnlyWhenPushdownIsEnabled() guards, proper YAML plan comparisons via assertYamlEqualsIgnoreId(), and are backed by test data loaded in the class init() method (Index.NESTED_SIMPLE).

The test coverage appropriately exercises nested aggregation pushdown scenarios:

  • Multiple metrics with grouping (stats count, min, max, avg)
  • Single field aggregations (count on nested.area)
  • Aggregation with sorting
  • Grouping with both bucket_nullable states
  • Top command on nested fields
  • Correct handling of dedup (verified as NOT pushed)
  • Error handling when aggregation cannot be applied

Consider expanding coverage to also test:

  • Multiple nested paths in a single query (e.g., stats on both address.area and another nested path)
  • Nested aggregation with filter predicates (e.g., where conditions before stats)
  • Nested fields combined with eval expressions
core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java (1)

646-646: LGTM!

Clean refactor that delegates hint inspection to the new PPLHintUtils utility, consolidating hint-related logic in one place.

core/src/main/java/org/opensearch/sql/calcite/utils/PPLHintUtils.java (1)

65-83: LGTM!

The accessor methods follow Calcite's hint inspection patterns correctly. The string-based boolean handling ("true"/"false") is consistent with how Calcite's RelHint.kvOptions stores values.

opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/EnumerableNestedAggregateRule.java (1)

22-65: LGTM!

The converter rule follows Calcite's ConverterRule patterns correctly:

  • Uses hint-based detection via PPLHintUtils.hasNestedAggCall to identify nested aggregation scenarios.
  • Returns null when the rule shouldn't apply (no hint) or on validation failure, which is the expected Calcite rule behavior.
  • Debug logging for InvalidRelException is appropriate for troubleshooting without cluttering production logs.
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (3)

181-198: LGTM!

The buildSchema() method correctly combines base fields with resolved nested paths using Utils.resolveNestedPath. The use of distinct() prevents duplicate nested paths, and the method is appropriately scoped as private.


379-388: LGTM - Core fix for nested aggregation handling.

This guard correctly prevents aggregation pushdown when bucket fields are of ARRAY (nested) type, which addresses the root cause of bucket-dependent aggregations failing on nested fields. The debug log provides useful diagnostic information.


390-390: LGTM!

The inversion logic is correct: when ignoreNullBucket hint is true, bucket values should exclude nulls, hence bucketNullable = false.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (3)

1066-1067: LGTM!

The hint application logic correctly applies ignoreNullBucket and nestedAggCall hints after the aggregate is built, following the patterns established in PPLHintUtils.


1192-1222: LGTM!

The refactored aggregation flow:

  1. Renamed variable to hintIgnoreNullBucket improves clarity by matching the hint semantics.
  2. Conditional filter application (lines 1213-1219) is a good optimization that avoids building unnecessary filter nodes when nonNullFields is empty.
  3. The hint flag is correctly propagated to aggregateWithTrimming.

2350-2362: LGTM!

The visitRareTopN changes are consistent with the refactoring in visitAggregation, correctly using the renamed hintIgnoreNullBucket variable.

opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableNestedAggregate.java (3)

60-75: LGTM - copy method correctly overridden.

Based on learnings, custom Calcite RelNode classes must override the copy method to prevent downgrading to the parent class type during copy operations. This implementation correctly returns a new CalciteEnumerableNestedAggregate instance, preserving the custom behavior.


77-83: Intentional placeholder for pushdown-only execution.

The implement method throwing UnsupportedOperationException is intentional per the PR design. This class exists to represent nested aggregations in the Calcite plan; if the pushdown rules don't convert it to an OpenSearch-native execution, this exception clearly communicates that nested aggregation requires pushdown support. The error message includes aggCalls for debugging context.


85-88: LGTM!

The 0.9 cost multiplier is a standard Calcite pattern to make this RelNode slightly preferred by the optimizer when nested aggregation is applicable, guiding plan selection toward the pushdown-capable implementation.

Signed-off-by: Lantao Jin <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (1)

84-84: Typo in variable name: dedupNumerdedupNumber.

Minor naming inconsistency that could cause confusion during maintenance.

🔎 Proposed fix
-    Integer dedupNumer = literal.getValueAs(Integer.class);
+    Integer dedupNumber = literal.getValueAs(Integer.class);

Also update the usage at line 136:

-    relBuilder.aggregate(
-        relBuilder.groupKey(relBuilder.fields(newGroupByList)), relBuilder.literalAgg(dedupNumer));
+    relBuilder.aggregate(
+        relBuilder.groupKey(relBuilder.fields(newGroupByList)), relBuilder.literalAgg(dedupNumber));
♻️ Duplicate comments (3)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (3)

22-29: JavaDoc is missing for public class and methods.

Past review comments already flag this issue. Please add JavaDoc for the public class and the init() method as previously requested.


150-157: Remove debug print statement.

Past review comments already flag the System.out.println on line 155. Please remove it as previously requested.


159-171: Remove debug print statement.

Past review comments already flag the System.out.println on line 165. Please remove it as previously requested.

🧹 Nitpick comments (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (1)

110-117: Consider adding a debug log for consistency.

The MAP/ARRAY fallback (lines 75-77) logs a debug message explaining why pushdown was skipped, but this nested-field fallback silently returns. For debugging/troubleshooting parity, consider adding a similar log statement.

🔎 Suggested improvement
     if (targetProjections.stream()
         .anyMatch(
             pair ->
                 Utils.resolveNestedPath(pair.getValue(), scan.getOsIndex().getFieldTypes())
                     != null)) {
-      // fallback to non-pushdown if the dedup columns contain nested fields.
+      LOG.debug("Cannot pushdown the dedup since the dedup fields contain nested types");
       return;
     }
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (1)

78-93: Consider adding schema verification for consistency.

While the test correctly verifies data rows, it lacks schema verification unlike other similar tests in this class. Adding schema verification would make the test more comprehensive and consistent.

🔎 Suggested enhancement
   @Test
   public void testNestedAggregationBySpan() throws IOException {
     enabledOnlyWhenPushdownIsEnabled();
     JSONObject actual =
         executeQuery(
             String.format(
                 "source=%s | stats count(address.area) as count_area, min(address.area) as"
                     + " min_area, max(address.area) as max_area, avg(address.area) as avg_area,"
                     + " avg(age) as avg_age by span(age, 10)",
                 TEST_INDEX_NESTED_SIMPLE));
+    verifySchemaInOrder(
+        actual,
+        isCalciteEnabled() ? schema("count_area", "bigint") : schema("count_area", "int"),
+        schema("min_area", "double"),
+        schema("max_area", "double"),
+        schema("avg_area", "double"),
+        schema("avg_age", "double"),
+        schema("span(age,10)", "int"));
     verifyDataRows(
         actual,
         rows(0, null, null, null, 19, 10),
         rows(7, 10.24, 429.79, 241.43714285714285, 25, 20),
         rows(2, 9.99, 1000.99, 505.49, 32, 30));
   }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 085a584 and ed94f0a.

📒 Files selected for processing (4)
  • docs/user/ppl/limitations/limitations.md
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/user/ppl/limitations/limitations.md
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (1)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
  • PPLIntegTestCase (36-415)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: test-sql-cli-integration (21)
  • GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (13)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (5)

9-9: LGTM!

The new imports are appropriately added and all are actively used within the file.

Also applies to: 44-44, 66-66


128-138: LGTM!

The refactoring to use centralized rule lists from OpenSearchIndexRules improves maintainability. The NESTED_AGGREGATE_RULE is now properly included in the non-pushdown rules, which aligns with the PR objective of supporting nested aggregations.


140-150: LGTM!

Using buildSchema() instead of directly accessing field names ensures that nested paths are properly included in the schema for filter analysis.


373-407: LGTM with observation.

The changes correctly implement:

  1. Schema building with nested path resolution
  2. Guard against ARRAY (nested) types in bucket fields with appropriate TODO reference
  3. Cleaner hint-based null bucket handling via PPLHintUtils.ignoreNullBucket()
  4. Proper bucket names extraction for AggPushDownAction

The nested type guard (lines 380-389) properly filters out null types before checking, which handles computed fields not in the schema. As per the coding guidelines, the Calcite-specific workaround is documented via the TODO comment.


181-198: No issues with this code. The buildSchema() method is well-documented and correctly implements nested path resolution. The project targets Java 21, which fully supports .toList() (introduced in Java 16), and this pattern is used throughout the codebase.

Likely an incorrect or invalid review comment.

opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/DedupPushdownRule.java (2)

69-79: LGTM!

The extended type check correctly guards against both MAP and ARRAY types, with a clear TODO reference for future enhancement and an appropriately updated log message.


139-139: Method signature confirmed in PPLHintUtils. The migration from PlanUtils to PPLHintUtils is correctly implemented. The method signature addIgnoreNullBucketHintToAggregate(RelBuilder relBuilder) matches the usage at line 139, and consistent usage is found across the codebase (e.g., CalciteRelNodeVisitor.java line 1066).

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLNestedAggregationIT.java (6)

31-49: LGTM! Comprehensive nested aggregation test.

The test correctly guards with enabledOnlyWhenPushdownIsEnabled() and uses conditional schema verification to account for Calcite type differences. The assertions cover both schema and data validation comprehensively.


51-76: LGTM! Test coverage for grouped aggregation is thorough.

The test validates nested aggregation with grouping and verifies multiple expected result rows, ensuring the "by" clause works correctly with nested fields.


95-104: LGTM! Focused test for count aggregation.

The test appropriately validates count operations on nested fields with a clear, focused assertion.


106-133: LGTM! Comprehensive test for nested path grouping.

The test thoroughly validates aggregation grouped by a nested path with extensive data row verification, ensuring correct behavior across multiple result rows.


135-148: LGTM! Test validates nested root grouping.

The test correctly verifies that aggregation grouped by the nested root object works, with appropriate schema and row count validation.


173-185: LGTM! Important test for error handling.

The test correctly validates that an UnsupportedOperationException is thrown when nested aggregation pushdown cannot be applied, using the appropriate assertThrowsWithReplace utility and verifying the error message content.

// Used to track the current sub-builder as analysis progresses
Builder subBuilder = newMetricBuilder;
// Push auto date span & case in group-by list into nested aggregations
// Push auto date span & case in group-by list into structured aggregations
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid ambiguous, we call the multiple sub-aggregations structured aggregations instead of nested


private static Pair<Builder, List<MetricParser>> processAggregateCalls(
List<String> aggFieldNames,
List<String> aggNames,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the value could be min(a.b), count(c), rename aggFieldNames to aggNames.

@qianheng-aws qianheng-aws merged commit 77633ef into opensearch-project:main Jan 5, 2026
44 of 45 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4979-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 77633ef589d839c57c4c00fdc8a86d70a02d74d8
# Push it to GitHub
git push --set-upstream origin backport/backport-4979-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4979-to-2.19-dev.

LantaoJin added a commit to LantaoJin/search-plugins-sql that referenced this pull request Jan 5, 2026
)

* refactor: throw exception if pushdown cannot be applied

Signed-off-by: Lantao Jin <[email protected]>

* fix tests

Signed-off-by: Lantao Jin <[email protected]>

* fix IT

Signed-off-by: Lantao Jin <[email protected]>

* Support top/dedup/aggregate by nested sub-fields

Signed-off-by: Lantao Jin <[email protected]>

* fix typo

Signed-off-by: Lantao Jin <[email protected]>

* address comments

Signed-off-by: Lantao Jin <[email protected]>

* minor fixing

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
(cherry picked from commit 77633ef)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] top and other bucket-dependent aggregations don't work on nested fields

3 participants