Conversation

@cristian-zlai
Contributor

@cristian-zlai cristian-zlai commented Dec 4, 2025

Summary

Docs for eval in the test/dev loop, plus a dedicated section for CI.

Checklist

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested
  • Documentation update

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive Eval guide with configuration validation, quick schema checks, sample-data testing, local test environment setup, Docker usage, and CI/CD workflow examples.
    • Expanded testing docs with pre-run validation, example commands/outputs, schema visualization, lineage, and testing workflows.
  • Chores
    • Updated an environment version reference in compiled test artifacts to "latest."


@coderabbitai
Contributor

coderabbitai bot commented Dec 4, 2025

Walkthrough

Added two documentation pages for Eval configuration validation and CI/local workflows for Zipline Hub; updated a compiled test model artifact by changing its common environment version string from "0.1.0+dev.piyush" to "latest".

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Eval documentation<br>`docs/source/running_on_zipline_hub/Eval.md` | New doc describing Eval configuration validation: quick schema checks, testing with sample data, local-eval via Docker, local Iceberg warehouse setup, and end-to-end workflows for development, PR validation, and backfill, plus CI examples (GitHub Actions, GitLab CI). |
| Test documentation update<br>`docs/source/running_on_zipline_hub/Test.md` | Added Eval section for pre-run validation: quick schema validation, sample-data tests, example commands/outputs, output schema and lineage visualization. |
| Compiled test artifact<br>`python/test/canary/compiled/models/gcp/listing.v1__2` | Single-line change: updated common env version from `0.1.0+dev.piyush` to `latest`. No behavior changes. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Focus review on examples/commands in Eval.md and Test.md for accuracy and copy/paste readiness.
  • Verify the env version change in python/test/canary/compiled/models/gcp/listing.v1__2 is intentional and won’t affect test expectations.

Poem

Docs arrive with careful light,
Schemas checked through day and night,
Local tests and CI cheer,
One small version moved from here,
Eval steps set — the path is bright. ✨

Pre-merge checks

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'docs: eval usage on site' clearly and concisely describes the main change: adding documentation about eval usage. It accurately reflects the core purpose of the pull request. |
| Description check | ✅ Passed | The description includes the required Summary section with relevant details and a Checklist section matching the template. However, the summary is brief and the Documentation update checkbox is unchecked despite being a documentation PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@cristian-zlai cristian-zlai marked this pull request as ready for review December 4, 2025 20:37

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/source/running_on_zipline_hub/Test.md (2)

38-58: Add language specifier to output code block.

The fenced code block lacks a language identifier. While this is a CLI output example, specify a language (e.g., text or plaintext) to comply with markdown linting rules.

-```
+```text
 🟢 Eval job finished successfully
 Join Configuration: gcp.demo.user_features__1
 ...

19-78: Content organization: Eval section duplicates Eval.md.

The new Eval section here mirrors the "Quick Schema Validation" and "Testing with Sample Data" sections in the companion Eval.md file. Consider whether Test.md should link to Eval.md instead, or summarize briefly with a reference, to avoid maintaining duplicate content.

docs/source/running_on_zipline_hub/Eval.md (2)

23-43: Add language specifier to CLI output code block.

The fenced code block for example CLI output lacks a language identifier. Add text or plaintext to comply with markdown linting rules.

-```
+```text
 🟢 Eval job finished successfully
 Join Configuration: gcp.demo.user_features__1
 ...

179-183: Document /ping health check endpoint.

Line 189 uses a /ping endpoint to verify service readiness, but this endpoint isn't documented in the setup section. Add a note that the local-eval service exposes this health check endpoint, or clarify the expected behavior.
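
In practice a CI step that relies on this endpoint would poll it until the container answers. A minimal sketch is below; the URL and timeout are placeholders, since the exposed port isn't stated here:

```python
import time
import urllib.error
import urllib.request


def wait_for_ping(url="http://localhost:8080/ping", timeout_s=60):
    """Poll the /ping health endpoint until it answers or the timeout expires.

    The URL is a hypothetical placeholder; substitute the host/port that the
    local-eval container actually publishes.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet; retry until the deadline
        time.sleep(2)
    raise TimeoutError(f"local-eval did not respond on {url} within {timeout_s}s")
```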

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1531426 and e3b8b2f.

📒 Files selected for processing (2)
  • docs/source/running_on_zipline_hub/Eval.md (1 hunks)
  • docs/source/running_on_zipline_hub/Test.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/source/running_on_zipline_hub/Test.md

23-23: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)
docs/source/running_on_zipline_hub/Eval.md (2)

167-177: Verify Docker image ziplineai/local-eval:latest availability.

The documentation references a Docker image that should be validated to exist in the project's registry or Docker Hub. Confirm the image is published and accessible in the intended CI environment.
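
One way to confirm this in CI without pulling the image is to resolve its manifest from the registry; the sketch below assumes the Docker CLI is installed and authenticated on the runner:

```python
import subprocess


def image_available(image="ziplineai/local-eval:latest"):
    """Return True if the image manifest resolves in the registry.

    `docker manifest inspect` queries the registry without downloading layers;
    it assumes a working, authenticated Docker CLI on the runner.
    """
    result = subprocess.run(
        ["docker", "manifest", "inspect", image],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```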


1-311: Well-structured and comprehensive documentation.

The Eval feature documentation is clear, well-organized, and includes practical examples for development and CI/CD workflows. The progression from quick validation to advanced local testing is logical and helpful for users.

Comment on lines +83 to +155
```python
#!/usr/bin/env python3
"""Build a local Iceberg warehouse with test data for Chronon Eval testing."""

import os
from datetime import datetime
from pyspark.sql import SparkSession


def epoch_millis(iso_timestamp):
    """Convert ISO timestamp to epoch milliseconds"""
    dt = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    return int(dt.timestamp() * 1000)


def build_warehouse(warehouse_path, catalog_name="ci_catalog"):
    """Create Iceberg warehouse with test data"""

    print(f"Creating test warehouse at: {warehouse_path}")
    os.makedirs(warehouse_path, exist_ok=True)

    # Initialize Spark with Iceberg support
    spark = (
        SparkSession.builder
        .appName("chronon-test-warehouse-builder")
        .master("local[*]")
        .config("spark.jars.packages",
                "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3")
        .config("spark.sql.extensions",
                "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
        .config(f"spark.sql.catalog.{catalog_name}",
                "org.apache.iceberg.spark.SparkCatalog")
        .config(f"spark.sql.catalog.{catalog_name}.type", "hadoop")
        .config(f"spark.sql.catalog.{catalog_name}.warehouse", warehouse_path)
        .getOrCreate()
    )

    # Create namespace
    print("Creating namespace 'data'...")
    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
    spark.sql(f"USE {catalog_name}")

    # Create table with schema
    print("Creating user_activities table...")
    spark.sql("""
        CREATE TABLE IF NOT EXISTS data.user_activities__0 (
            user_id STRING,
            event_time_ms BIGINT,
            session_id STRING,
            event_type STRING,
            ds STRING
        ) USING iceberg
        PARTITIONED BY (ds)
    """)

    # Insert test data
    user_activities_data = [
        ("user_1", epoch_millis("2025-01-01T00:01:00Z"), "session_1", "view", "2025-01-01"),
        ("user_2", epoch_millis("2025-01-01T00:02:00Z"), "session_2", "click", "2025-01-01"),
    ]

    df = spark.createDataFrame(
        user_activities_data,
        ["user_id", "event_time_ms", "session_id", "event_type", "ds"]
    )

    df.writeTo(f"{catalog_name}.data.user_activities__0").createOrReplace()
    print(f"✓ Inserted {df.count()} rows into user_activities__0")

    spark.stop()
    print(f"\n✓ Warehouse created successfully at: {warehouse_path}")


if __name__ == "__main__":
    build_warehouse("/tmp/chronon-test-warehouse")
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

PySpark 3.5.0 Iceberg 1.4.3 compatibility production ready

💡 Result:

Short answer: Yes — Iceberg 1.4.3 provides a Spark 3.5 runtime and is intended to be used with Spark/PySpark 3.5.x in production, provided you use the matching runtime JAR (iceberg-spark-runtime-3.5_* with the correct Scala binary) and validate your Spark build/Scala version and any storage/SDK bundles you need. [1][2]

Practical checklist before production deploy:

  • Use the iceberg-spark-runtime-3.5_2.12 or _2.13 JAR that matches your Spark/PySpark Scala build. [2]
  • Don’t include other Iceberg modules (iceberg-core, etc.) on the Spark classpath — only the runtime and required storage bundles (aws/gcp/azure) to avoid dependency conflicts. [1][2]
  • Test your workload (reads, writes, partitioning, concurrency/transactions, compaction/manifest operations) end-to-end in a staging environment with your Spark distribution and JVM/Scala version. [1][2]
  • Keep an eye on vendor compatibility (if using a distribution like Cloudera, Databricks, etc.), since some distros bundle specific Iceberg versions. [3]

Sources:
[1] Apache Iceberg releases (1.4.3 includes Spark 3.5 runtime).
[2] Apache Iceberg multi-engine support / runtime guidance (Spark 3.5 integration and runtime-jar guidance).
[3] Example vendor compatibility notes (Cloudera) — verify distro-specific pinned Iceberg/Spark combinations.


Verify Scala binary version and add documentation for Iceberg JAR dependency management.

Iceberg 1.4.3 with Spark 3.5.x is production-ready, but the example should clarify that iceberg-spark-runtime-3.5_2.12 assumes Scala 2.12. Verify your Spark build uses matching Scala 2.12 (or use _2.13 variant if needed). Additionally, document that only the runtime JAR should be on the classpath—avoid including other Iceberg modules to prevent dependency conflicts. Recommend end-to-end testing in a staging environment with your actual Spark distribution and storage backend (S3/GCS/Azure) before production use.
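
One way to make the Scala dependency explicit is to derive the Maven coordinate from the Scala binary version rather than hard-coding it. The helper below is a sketch with illustrative defaults matching the doc's example:

```python
def iceberg_runtime_coordinate(spark_minor="3.5", scala_binary="2.12",
                               iceberg_version="1.4.3"):
    """Build the Maven coordinate for the Iceberg Spark runtime JAR.

    The PyPI pyspark 3.5.x wheels are built against Scala 2.12; switch
    scala_binary to "2.13" only if your Spark distribution uses Scala 2.13.
    """
    return (f"org.apache.iceberg:iceberg-spark-runtime-"
            f"{spark_minor}_{scala_binary}:{iceberg_version}")


# Pass the result to .config("spark.jars.packages", ...) when building the
# SparkSession instead of hard-coding the coordinate string.
```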


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
docs/source/running_on_zipline_hub/Test.md (1)

21-80: Consider generalizing the --conf path to cover all config types.

Since this page is about testing GroupBys, Joins and StagingQuerys, you might want the eval examples to mirror the backfill path pattern, e.g. compiled/{group_bys|staging_queries|joins}/{team}/{your_conf}, to make it clearer that eval works for more than joins.

docs/source/running_on_zipline_hub/Eval.md (1)

169-277: Tighten Docker usage in examples (image pinning and GitLab dind setup).

Two small robustness tweaks to consider:

  • Pin ziplineai/local-eval to a specific tag (and mention updating it over time) instead of :latest, so CI runs are reproducible and don’t silently change behavior on image updates.
  • In the GitLab CI example, image: python:3.11 plus services: docker:dind typically also requires installing the Docker CLI in the job image and setting DOCKER_HOST (per GitLab’s dind docs); calling docker run may otherwise fail.

These are minor, but making them explicit will help users copy the examples into real pipelines.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3b8b2f and 20f7e10.

⛔ Files ignored due to path filters (1)
  • docs/images/eval_sample.gif is excluded by !**/*.gif
📒 Files selected for processing (3)
  • docs/source/running_on_zipline_hub/Eval.md (1 hunks)
  • docs/source/running_on_zipline_hub/Test.md (1 hunks)
  • python/test/canary/compiled/models/gcp/listing.v1__2 (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/source/running_on_zipline_hub/Test.md

23-23: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (39)
  • GitHub Check: Test Spark (Scala 2.13.17) / streaming_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / kv_store_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / udafs_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / analyzer_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / groupby_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / stats_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / stats_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / udafs_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / analyzer_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / join_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / streaming_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / kv_store_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / groupby_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / batch_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / fetcher_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / spark_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / batch_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / join_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / spark_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / fetcher_tests
  • GitHub Check: python_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / cloud_aws_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / cloud_aws_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / cloud_gcp_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / api_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / online_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / service_commons_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / flink_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / service_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / online_tests
  • GitHub Check: python_lint
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / cloud_gcp_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / service_commons_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / service_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / aggregator_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / api_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / flink_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / aggregator_tests
  • GitHub Check: enforce_triggered_workflows
🔇 Additional comments (1)
python/test/canary/compiled/models/gcp/listing.v1__2 (1)

78-78: Env VERSION change looks fine.

Updating the canary env VERSION to "latest" in this compiled test artifact is safe and does not affect model logic or schema.

Comment on lines +120 to +137
    # Create namespace
    print("Creating namespace 'data'...")
    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
    spark.sql(f"USE {catalog_name}")

    # Create table with schema
    print("Creating user_activities table...")
    spark.sql("""
        CREATE TABLE IF NOT EXISTS data.user_activities__0 (
            user_id STRING,
            event_time_ms BIGINT,
            session_id STRING,
            event_type STRING,
            ds STRING
        ) USING iceberg
        PARTITIONED BY (ds)
    """)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Spark Iceberg CREATE NAMESPACE current catalog behavior

💡 Result:

Short answer

  • Spark tracks a current catalog + namespace; unqualified CREATE NAMESPACE uses the current catalog and namespace. [1]
  • Iceberg’s SparkCatalog / SparkSessionCatalog implement namespace methods, and Spark passes the currentCatalog/currentNamespace to those methods, so CREATE NAMESPACE on an Iceberg catalog creates the namespace inside that catalog. (Iceberg’s SparkCatalog delegates namespace operations to the underlying Iceberg Catalog.) [2][3]

Notes / gotchas

  • SparkSessionCatalog wraps/creates a SparkCatalog for Iceberg tables, so behavior is the same but the catalog is backed by the configured Iceberg implementation. [3]
  • There are edge reports where people expected a new Spark catalog entry to appear in the Hive metastore but it didn’t — catalog registration vs. namespace creation are different concerns. Check your catalog config (spark.sql.catalog..*) and whether you’re using a Hive vs. session catalog. [4][1]

Sources

  • Apache Iceberg Spark configuration / using catalogs. [1]
  • Iceberg SparkCatalog javadoc (namespace/table/view API; currentCatalog/currentNamespace parameters). [2]
  • Iceberg SparkSessionCatalog javadoc (wraps/creates SparkCatalog). [3]
  • GitHub issue discussing Hive metastore vs. Spark catalog expectations. [4]

🏁 Script executed:

# Search the documentation file to understand the full context
cd docs/source/running_on_zipline_hub || true
if [ -f "Eval.md" ]; then
  head -150 Eval.md | tail -40
else
  find . -name "Eval.md" -type f 2>/dev/null | head -5
fi

Repository: zipline-ai/chronon

Length of output: 1523


Ensure the namespace is created in the Iceberg catalog, not the default catalog.

The CREATE NAMESPACE IF NOT EXISTS data statement runs before USE {catalog_name}, so the namespace is created in the default spark_catalog instead of the target Iceberg catalog. This causes a mismatch: the namespace data exists in spark_catalog, but the table and write operations target {catalog_name}.data. Reorder to switch catalogs first:

-    print("Creating namespace 'data'...")
-    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
-    spark.sql(f"USE {catalog_name}")
+    print(f"Creating namespace 'data' in catalog '{catalog_name}'...")
+    spark.sql(f"USE {catalog_name}")
+    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
🤖 Prompt for AI Agents
In docs/source/running_on_zipline_hub/Eval.md around lines 120 to 137, the
CREATE NAMESPACE runs before switching to the Iceberg catalog so the namespace
is created in the default spark_catalog; move the catalog switch so
spark.sql(f"USE {catalog_name}") is executed before creating the namespace (or
alternatively create the namespace fully qualified as {catalog_name}.data), then
proceed to create the table—this ensures the namespace is created in the
intended Iceberg catalog.
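
As an alternative to reordering, the namespace can be created fully qualified so it lands in the target catalog regardless of the session's current catalog; a sketch using the same catalog_name variable as the doc's script:

```python
    # Alternative: qualify the namespace with the target catalog explicitly.
    spark.sql(f"CREATE NAMESPACE IF NOT EXISTS {catalog_name}.data")
    spark.sql(f"USE {catalog_name}")
```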

Comment on lines +38 to +58
```
🟢 Eval job finished successfully
Join Configuration: gcp.demo.user_features__1
- Left table: data.user_activity_7d__0
- Join parts: 2
- Conf dependencies: 3
- External tables: 2
- Output Schema:
[left] user_id: string
[left] event_timestamp: long
[left] ds: string
[joinPart: gcp.user_demographics__0] user_id_age: integer
[derivation] is_adult: boolean
Lineage:
[Join] gcp.demo.user_features__1
├── ✅ [GroupBy] gcp.user_activity_7d__0
│   └── External: project.events.user_clicks
└── ✅ [GroupBy] gcp.user_demographics__0
    └── ✅ [StagingQuery] gcp.raw_demographics__0
```

⚠️ Potential issue | 🟡 Minor

Add a language to the example-output code fence (MD040).

The example output block is missing a language spec; consider marking it as plain text to satisfy markdownlint:

-```
+```text
🤖 Prompt for AI Agents
In docs/source/running_on_zipline_hub/Test.md around lines 38 to 58 the
example-output code fence is missing a language spec which triggers MD040;
update the opening triple-backtick to include a language (e.g., change ``` to
```text) so the block is explicitly marked as plain text and save the file.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/source/running_on_zipline_hub/Eval.md (1)

120-123: Fix namespace creation order to target the Iceberg catalog.

The namespace is created before switching catalogs, so it's created in the default spark_catalog instead of ci_catalog. Subsequent operations target ci_catalog.data, causing a mismatch. Move the USE statement first:

-    # Create namespace
-    print("Creating namespace 'data'...")
-    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
-    spark.sql(f"USE {catalog_name}")
+    # Create namespace
+    print(f"Creating namespace 'data' in catalog '{catalog_name}'...")
+    spark.sql(f"USE {catalog_name}")
+    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
🧹 Nitpick comments (1)
docs/source/running_on_zipline_hub/Eval.md (1)

109-117: Clarify Scala binary version and add guidance on Iceberg JAR compatibility.

The iceberg-spark-runtime-3.5_2.12 JAR assumes Scala 2.12. If your Spark build uses Scala 2.13, you'll need the _2.13 variant. Add a note clarifying this dependency and recommending verification before production use:

         .config("spark.jars.packages",
-                "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3")
+                "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3")  # Use _2.13 if your Spark uses Scala 2.13
         .config("spark.sql.extensions",
                 "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
         .config(f"spark.sql.catalog.{catalog_name}",
                 "org.apache.iceberg.spark.SparkCatalog")
         .config(f"spark.sql.catalog.{catalog_name}.type", "hadoop")
         .config(f"spark.sql.catalog.{catalog_name}.warehouse", warehouse_path)
+    )
+
+    # Note: Verify that iceberg-spark-runtime JAR matches your Spark/Scala version.
+    # Use `spark.jars.packages` matching your Scala build (_2.12 vs _2.13).
+    # For production, test end-to-end with your actual Spark distribution.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20f7e10 and 74d8dde.

📒 Files selected for processing (1)
  • docs/source/running_on_zipline_hub/Eval.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/source/running_on_zipline_hub/Eval.md

23-23: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (39)
  • GitHub Check: Test Spark (Scala 2.13.17) / spark_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / udafs_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / stats_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / groupby_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / streaming_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / analyzer_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / join_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / kv_store_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / batch_tests
  • GitHub Check: Test Spark (Scala 2.13.17) / fetcher_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / kv_store_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / streaming_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / analyzer_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / groupby_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / join_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / batch_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / stats_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / udafs_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / spark_tests
  • GitHub Check: Test Spark (Scala 2.12.18) / fetcher_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / cloud_aws_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / api_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / aggregator_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / cloud_gcp_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / service_commons_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / service_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / online_tests
  • GitHub Check: Test Non-Spark (Scala 2.12.18) / flink_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / api_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / service_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / cloud_aws_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / cloud_gcp_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / service_commons_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / online_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / flink_tests
  • GitHub Check: Test Non-Spark (Scala 2.13.17) / aggregator_tests
  • GitHub Check: python_lint
  • GitHub Check: python_tests
  • GitHub Check: enforce_triggered_workflows

Comment on lines +23 to +43
```
🟢 Eval job finished successfully
Join Configuration: gcp.demo.user_features__1
- Left table: data.user_activity_7d__0
- Join parts: 2
- Conf dependencies: 3
- External tables: 2
- Output Schema:
[left] user_id: string
[left] event_timestamp: long
[left] ds: string
[joinPart: gcp.user_demographics__0] user_id_age: integer
[derivation] is_adult: boolean
Lineage:
[Join] gcp.demo.user_features__1
├── ✅ [GroupBy] gcp.user_activity_7d__0
│   └── External: project.events.user_clicks
└── ✅ [GroupBy] gcp.user_demographics__0
    └── ✅ [StagingQuery] gcp.raw_demographics__0
```

⚠️ Potential issue | 🟡 Minor

Specify language for code fence showing example output.

The example output block is missing a language identifier, which triggers a linter warning. Use text or plaintext:

-```
+```text
 🟢 Eval job finished successfully
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

23-23: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In docs/source/running_on_zipline_hub/Eval.md around lines 23 to 43, the fenced
code block showing example output lacks a language identifier which triggers a
linter warning; update the opening triple backticks to include a language such
as text or plaintext (e.g. ```text) so the block is explicitly marked as plain
text and the linter warning is resolved.

@cristian-zlai cristian-zlai added this pull request to the merge queue Dec 4, 2025
Merged via the queue into main with commit 39c4792 Dec 4, 2025
61 of 63 checks passed
@cristian-zlai cristian-zlai deleted the crf-eval-docs branch December 4, 2025 22:17