docs: eval usage on site #1321
Conversation
Walkthrough
Added two documentation pages for Eval configuration validation and CI/local workflows for Zipline Hub; updated a compiled test model artifact by changing its common environment version string from "0.1.0+dev.piyush" to "latest".

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (4)
docs/source/running_on_zipline_hub/Test.md (2)
38-58: Add language specifier to output code block. The fenced code block lacks a language identifier. While this is a CLI output example, specify a language (e.g., `text` or `plaintext`) to comply with markdown linting rules.

````diff
-```
+```text
 🟢 Eval job finished successfully
 Join Configuration: gcp.demo.user_features__1
 ...
````
19-78: Content organization: Eval section duplicates Eval.md. The new Eval section here mirrors the "Quick Schema Validation" and "Testing with Sample Data" sections in the companion Eval.md file. Consider whether Test.md should link to Eval.md instead, or summarize briefly with a reference, to avoid maintaining duplicate content.
docs/source/running_on_zipline_hub/Eval.md (2)
23-43: Add language specifier to CLI output code block. The fenced code block for example CLI output lacks a language identifier. Add `text` or `plaintext` to comply with markdown linting rules.

````diff
-```
+```text
 🟢 Eval job finished successfully
 Join Configuration: gcp.demo.user_features__1
 ...
````
179-183: Document the `/ping` health check endpoint. Line 189 uses a `/ping` endpoint to verify service readiness, but this endpoint isn't documented in the setup section. Add a note that the local-eval service exposes this health check endpoint, or clarify the expected behavior.
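To make the readiness check concrete, here is a minimal sketch of a `/ping` poller. It is hypothetical: the endpoint path comes from the doc under review, while the URL, port, and timeouts are placeholder assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_ping(url, timeout_s=60.0, interval_s=2.0, _get=None):
    """Poll a health endpoint until it returns HTTP 200 or the deadline passes."""
    # _get is injectable for testing; by default it issues a real HTTP GET.
    get = _get or (lambda u: urllib.request.urlopen(u, timeout=5).status)
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if get(url) == 200:
                return True
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; keep polling
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Usage would be something like `wait_for_ping("http://localhost:3903/ping", timeout_s=30)` before running eval against the service; the hostname and port here are placeholders, not documented values.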
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- docs/source/running_on_zipline_hub/Eval.md (1 hunks)
- docs/source/running_on_zipline_hub/Test.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/source/running_on_zipline_hub/Test.md
23-23: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (2)
docs/source/running_on_zipline_hub/Eval.md (2)
167-177: Verify Docker image `ziplineai/local-eval:latest` availability. The documentation references a Docker image that should be validated to exist in the project's registry or Docker Hub. Confirm the image is published and accessible in the intended CI environment.
1-311: Well-structured and comprehensive documentation. The Eval feature documentation is clear, well-organized, and includes practical examples for development and CI/CD workflows. The progression from quick validation to advanced local testing is logical and helpful for users.
```python
#!/usr/bin/env python3
"""Build a local Iceberg warehouse with test data for Chronon Eval testing."""

import os
from datetime import datetime
from pyspark.sql import SparkSession


def epoch_millis(iso_timestamp):
    """Convert ISO timestamp to epoch milliseconds"""
    dt = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    return int(dt.timestamp() * 1000)


def build_warehouse(warehouse_path, catalog_name="ci_catalog"):
    """Create Iceberg warehouse with test data"""

    print(f"Creating test warehouse at: {warehouse_path}")
    os.makedirs(warehouse_path, exist_ok=True)

    # Initialize Spark with Iceberg support
    spark = (
        SparkSession.builder
        .appName("chronon-test-warehouse-builder")
        .master("local[*]")
        .config("spark.jars.packages",
                "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3")
        .config("spark.sql.extensions",
                "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
        .config(f"spark.sql.catalog.{catalog_name}",
                "org.apache.iceberg.spark.SparkCatalog")
        .config(f"spark.sql.catalog.{catalog_name}.type", "hadoop")
        .config(f"spark.sql.catalog.{catalog_name}.warehouse", warehouse_path)
        .getOrCreate()
    )

    # Create namespace
    print("Creating namespace 'data'...")
    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
    spark.sql(f"USE {catalog_name}")

    # Create table with schema
    print("Creating user_activities table...")
    spark.sql("""
        CREATE TABLE IF NOT EXISTS data.user_activities__0 (
            user_id STRING,
            event_time_ms BIGINT,
            session_id STRING,
            event_type STRING,
            ds STRING
        ) USING iceberg
        PARTITIONED BY (ds)
    """)

    # Insert test data
    user_activities_data = [
        ("user_1", epoch_millis("2025-01-01T00:01:00Z"), "session_1", "view", "2025-01-01"),
        ("user_2", epoch_millis("2025-01-01T00:02:00Z"), "session_2", "click", "2025-01-01"),
    ]

    df = spark.createDataFrame(
        user_activities_data,
        ["user_id", "event_time_ms", "session_id", "event_type", "ds"]
    )

    df.writeTo(f"{catalog_name}.data.user_activities__0").createOrReplace()
    print(f"✓ Inserted {df.count()} rows into user_activities__0")

    spark.stop()
    print(f"\n✓ Warehouse created successfully at: {warehouse_path}")


if __name__ == "__main__":
    build_warehouse("/tmp/chronon-test-warehouse")
```
🧩 Analysis chain
🌐 Web query:
PySpark 3.5.0 Iceberg 1.4.3 compatibility production ready
💡 Result:
Short answer: Yes — Iceberg 1.4.3 provides a Spark 3.5 runtime and is intended to be used with Spark/PySpark 3.5.x in production, provided you use the matching runtime JAR (iceberg-spark-runtime-3.5_* with the correct Scala binary) and validate your Spark build/Scala version and any storage/SDK bundles you need. [1][2]
Practical checklist before production deploy:
- Use the iceberg-spark-runtime-3.5_2.12 or _2.13 JAR that matches your Spark/PySpark Scala build. [2]
- Don’t include other Iceberg modules (iceberg-core, etc.) on the Spark classpath — only the runtime and required storage bundles (aws/gcp/azure) to avoid dependency conflicts. [1][2]
- Test your workload (reads, writes, partitioning, concurrency/transactions, compaction/manifest operations) end-to-end in a staging environment with your Spark distribution and JVM/Scala version. [1][2]
- Keep an eye on vendor compatibility (if using a distribution like Cloudera, Databricks, etc.), since some distros bundle specific Iceberg versions. [3]
Sources:
[1] Apache Iceberg releases (1.4.3 includes Spark 3.5 runtime).
[2] Apache Iceberg multi-engine support / runtime guidance (Spark 3.5 integration and runtime-jar guidance).
[3] Example vendor compatibility notes (Cloudera) — verify distro-specific pinned Iceberg/Spark combinations.
Verify Scala binary version and add documentation for Iceberg JAR dependency management.
Iceberg 1.4.3 with Spark 3.5.x is production-ready, but the example should clarify that iceberg-spark-runtime-3.5_2.12 assumes Scala 2.12. Verify your Spark build uses matching Scala 2.12 (or use _2.13 variant if needed). Additionally, document that only the runtime JAR should be on the classpath—avoid including other Iceberg modules to prevent dependency conflicts. Recommend end-to-end testing in a staging environment with your actual Spark distribution and storage backend (S3/GCS/Azure) before production use.
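The "match the Scala binary" advice can be made mechanical. The following helper is a hypothetical sketch (not from the docs or the Iceberg project) that assembles the Maven coordinate for the runtime JAR from the Spark minor version and Scala binary version in use:

```python
def iceberg_runtime_coordinate(spark_version, scala_binary, iceberg_version):
    """Return the Maven coordinate of the iceberg-spark-runtime JAR that
    matches a given Spark minor version and Scala binary version."""
    major_minor = ".".join(spark_version.split(".")[:2])  # "3.5.0" -> "3.5"
    if scala_binary not in ("2.12", "2.13"):
        raise ValueError(f"unsupported Scala binary version: {scala_binary}")
    return (f"org.apache.iceberg:iceberg-spark-runtime-"
            f"{major_minor}_{scala_binary}:{iceberg_version}")


# The doc's example pins Scala 2.12; a Scala 2.13 Spark build needs the _2.13 JAR.
print(iceberg_runtime_coordinate("3.5.0", "2.12", "1.4.3"))
# -> org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3
```

The value would then be passed to `spark.jars.packages`, keeping other Iceberg modules off the classpath as the query result recommends.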
Actionable comments posted: 2
🧹 Nitpick comments (2)
docs/source/running_on_zipline_hub/Test.md (1)
21-80: Consider generalizing the `--conf` path to cover all config types. Since this page is about testing `GroupBy`s, `Join`s and `StagingQuery`s, you might want the eval examples to mirror the backfill path pattern, e.g. `compiled/{group_bys|staging_queries|joins}/{team}/{your_conf}`, to make it clearer that `eval` works for more than joins.

docs/source/running_on_zipline_hub/Eval.md (1)
169-277: Tighten Docker usage in examples (image pinning and GitLab dind setup). Two small robustness tweaks to consider:

- Pin `ziplineai/local-eval` to a specific tag (and mention updating it over time) instead of `:latest`, so CI runs are reproducible and don't silently change behavior on image updates.
- In the GitLab CI example, `image: python:3.11` plus `services: docker:dind` typically also requires installing the Docker CLI in the job image and setting `DOCKER_HOST` (per GitLab's dind docs); calling `docker run` may otherwise fail.

These are minor, but making them explicit will help users copy the examples into real pipelines.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`docs/images/eval_sample.gif` is excluded by `!**/*.gif`
📒 Files selected for processing (3)
- docs/source/running_on_zipline_hub/Eval.md (1 hunks)
- docs/source/running_on_zipline_hub/Test.md (1 hunks)
- python/test/canary/compiled/models/gcp/listing.v1__2 (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/source/running_on_zipline_hub/Test.md
23-23: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (39)
- GitHub Check: Test Spark (Scala 2.13.17) / streaming_tests
- GitHub Check: Test Spark (Scala 2.13.17) / kv_store_tests
- GitHub Check: Test Spark (Scala 2.13.17) / udafs_tests
- GitHub Check: Test Spark (Scala 2.13.17) / analyzer_tests
- GitHub Check: Test Spark (Scala 2.13.17) / groupby_tests
- GitHub Check: Test Spark (Scala 2.13.17) / stats_tests
- GitHub Check: Test Spark (Scala 2.12.18) / stats_tests
- GitHub Check: Test Spark (Scala 2.12.18) / udafs_tests
- GitHub Check: Test Spark (Scala 2.12.18) / analyzer_tests
- GitHub Check: Test Spark (Scala 2.12.18) / join_tests
- GitHub Check: Test Spark (Scala 2.12.18) / streaming_tests
- GitHub Check: Test Spark (Scala 2.12.18) / kv_store_tests
- GitHub Check: Test Spark (Scala 2.12.18) / groupby_tests
- GitHub Check: Test Spark (Scala 2.13.17) / batch_tests
- GitHub Check: Test Spark (Scala 2.12.18) / fetcher_tests
- GitHub Check: Test Spark (Scala 2.12.18) / spark_tests
- GitHub Check: Test Spark (Scala 2.12.18) / batch_tests
- GitHub Check: Test Spark (Scala 2.13.17) / join_tests
- GitHub Check: Test Spark (Scala 2.13.17) / spark_tests
- GitHub Check: Test Spark (Scala 2.13.17) / fetcher_tests
- GitHub Check: python_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / cloud_aws_tests
- GitHub Check: Test Non-Spark (Scala 2.13.17) / cloud_aws_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / cloud_gcp_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / api_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / online_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / service_commons_tests
- GitHub Check: Test Non-Spark (Scala 2.13.17) / flink_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / service_tests
- GitHub Check: Test Non-Spark (Scala 2.13.17) / online_tests
- GitHub Check: python_lint
- GitHub Check: Test Non-Spark (Scala 2.13.17) / cloud_gcp_tests
- GitHub Check: Test Non-Spark (Scala 2.13.17) / service_commons_tests
- GitHub Check: Test Non-Spark (Scala 2.13.17) / service_tests
- GitHub Check: Test Non-Spark (Scala 2.13.17) / aggregator_tests
- GitHub Check: Test Non-Spark (Scala 2.13.17) / api_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / flink_tests
- GitHub Check: Test Non-Spark (Scala 2.12.18) / aggregator_tests
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (1)
python/test/canary/compiled/models/gcp/listing.v1__2 (1)
78-78: Env VERSION change looks fine. Updating the canary env `VERSION` to `"latest"` in this compiled test artifact is safe and does not affect model logic or schema.
```python
    # Create namespace
    print("Creating namespace 'data'...")
    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
    spark.sql(f"USE {catalog_name}")

    # Create table with schema
    print("Creating user_activities table...")
    spark.sql("""
        CREATE TABLE IF NOT EXISTS data.user_activities__0 (
            user_id STRING,
            event_time_ms BIGINT,
            session_id STRING,
            event_type STRING,
            ds STRING
        ) USING iceberg
        PARTITIONED BY (ds)
    """)
```
🧩 Analysis chain
🌐 Web query:
Spark Iceberg CREATE NAMESPACE current catalog behavior
💡 Result:
Short answer
- Spark tracks a current catalog + namespace; unqualified CREATE NAMESPACE uses the current catalog and namespace. [1]
- Iceberg’s SparkCatalog / SparkSessionCatalog implement namespace methods, and Spark passes the currentCatalog/currentNamespace to those methods, so CREATE NAMESPACE on an Iceberg catalog creates the namespace inside that catalog. (Iceberg’s SparkCatalog delegates namespace operations to the underlying Iceberg Catalog.) [2][3]
Notes / gotchas
- SparkSessionCatalog wraps/creates a SparkCatalog for Iceberg tables, so behavior is the same but the catalog is backed by the configured Iceberg implementation. [3]
- There are edge reports where people expected a new Spark catalog entry to appear in the Hive metastore but it didn’t — catalog registration vs. namespace creation are different concerns. Check your catalog config (spark.sql.catalog..*) and whether you’re using a Hive vs. session catalog. [4][1]
Sources
- Apache Iceberg Spark configuration / using catalogs. [1]
- Iceberg SparkCatalog javadoc (namespace/table/view API; currentCatalog/currentNamespace parameters). [2]
- Iceberg SparkSessionCatalog javadoc (wraps/creates SparkCatalog). [3]
- GitHub issue discussing Hive metastore vs. Spark catalog expectations. [4]
🏁 Script executed:
```shell
# Search the documentation file to understand the full context
cd docs/source/running_on_zipline_hub || true
if [ -f "Eval.md" ]; then
  head -150 Eval.md | tail -40
else
  find . -name "Eval.md" -type f 2>/dev/null | head -5
fi
```

Repository: zipline-ai/chronon
Length of output: 1523
Ensure the namespace is created in the Iceberg catalog, not the default catalog.
The CREATE NAMESPACE IF NOT EXISTS data statement runs before USE {catalog_name}, so the namespace is created in the default spark_catalog instead of the target Iceberg catalog. This causes a mismatch: the namespace data exists in spark_catalog, but the table and write operations target {catalog_name}.data. Reorder to switch catalogs first:
```diff
-    # Create namespace
-    print("Creating namespace 'data'...")
-    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
-    spark.sql(f"USE {catalog_name}")
+    # Create namespace
+    print(f"Creating namespace 'data' in catalog '{catalog_name}'...")
+    spark.sql(f"USE {catalog_name}")
+    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
```

🤖 Prompt for AI Agents

In docs/source/running_on_zipline_hub/Eval.md around lines 120 to 137, the CREATE NAMESPACE runs before switching to the Iceberg catalog so the namespace is created in the default spark_catalog; move the catalog switch so spark.sql(f"USE {catalog_name}") is executed before creating the namespace (or alternatively create the namespace fully qualified as {catalog_name}.data), then proceed to create the table; this ensures the namespace is created in the intended Iceberg catalog.
| ``` | ||
| 🟢 Eval job finished successfully | ||
| Join Configuration: gcp.demo.user_features__1 | ||
| - Left table: data.user_activity_7d__0 | ||
| - Join parts: 2 | ||
| - Conf dependencies: 3 | ||
| - External tables: 2 | ||
| - Output Schema: | ||
| [left] user_id: string | ||
| [left] event_timestamp: long | ||
| [left] ds: string | ||
| [joinPart: gcp.user_demographics__0] user_id_age: integer | ||
| [derivation] is_adult: boolean | ||
| Lineage: | ||
| [Join] gcp.demo.user_features__1 | ||
| ├── ✅ [GroupBy] gcp.user_activity_7d__0 | ||
| │ └── External: project.events.user_clicks | ||
| └── ✅ [GroupBy] gcp.user_demographics__0 | ||
| └── ✅ [StagingQuery] gcp.raw_demographics__0 | ||
| ``` |
Add a language to the example-output code fence (MD040).
The example output block is missing a language spec; consider marking it as plain text to satisfy markdownlint:
````diff
-```
+```text
````

🤖 Prompt for AI Agents

In docs/source/running_on_zipline_hub/Test.md around lines 38 to 58 the example-output code fence is missing a language spec which triggers MD040; update the opening triple-backtick to include a language (e.g., change ``` to ```text) so the block is explicitly marked as plain text and save the file.
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/source/running_on_zipline_hub/Eval.md (1)
120-123: Fix namespace creation order to target the Iceberg catalog. The namespace is created before switching catalogs, so it's created in the default `spark_catalog` instead of `ci_catalog`. Subsequent operations target `ci_catalog.data`, causing a mismatch. Move the `USE` statement first:

```diff
-    # Create namespace
-    print("Creating namespace 'data'...")
-    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
-    spark.sql(f"USE {catalog_name}")
+    # Create namespace
+    print(f"Creating namespace 'data' in catalog '{catalog_name}'...")
+    spark.sql(f"USE {catalog_name}")
+    spark.sql("CREATE NAMESPACE IF NOT EXISTS data")
```
🧹 Nitpick comments (1)
docs/source/running_on_zipline_hub/Eval.md (1)
109-117: Clarify Scala binary version and add guidance on Iceberg JAR compatibility. The `iceberg-spark-runtime-3.5_2.12` JAR assumes Scala 2.12. If your Spark build uses Scala 2.13, you'll need the `_2.13` variant. Add a note clarifying this dependency and recommending verification before production use:

```diff
         .config("spark.jars.packages",
-                "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3")
+                "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.3")  # Use _2.13 if your Spark uses Scala 2.13
         .config("spark.sql.extensions",
                 "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
         .config(f"spark.sql.catalog.{catalog_name}",
                 "org.apache.iceberg.spark.SparkCatalog")
         .config(f"spark.sql.catalog.{catalog_name}.type", "hadoop")
         .config(f"spark.sql.catalog.{catalog_name}.warehouse", warehouse_path)
+    )
+
+    # Note: Verify that the iceberg-spark-runtime JAR matches your Spark/Scala version.
+    # Use `spark.jars.packages` matching your Scala build (_2.12 vs _2.13).
+    # For production, test end-to-end with your actual Spark distribution.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- docs/source/running_on_zipline_hub/Eval.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/source/running_on_zipline_hub/Eval.md
23-23: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
| ``` | ||
| 🟢 Eval job finished successfully | ||
| Join Configuration: gcp.demo.user_features__1 | ||
| - Left table: data.user_activity_7d__0 | ||
| - Join parts: 2 | ||
| - Conf dependencies: 3 | ||
| - External tables: 2 | ||
| - Output Schema: | ||
| [left] user_id: string | ||
| [left] event_timestamp: long | ||
| [left] ds: string | ||
| [joinPart: gcp.user_demographics__0] user_id_age: integer | ||
| [derivation] is_adult: boolean | ||
| Lineage: | ||
| [Join] gcp.demo.user_features__1 | ||
| ├── ✅ [GroupBy] gcp.user_activity_7d__0 | ||
| │ └── External: project.events.user_clicks | ||
| └── ✅ [GroupBy] gcp.user_demographics__0 | ||
| └── ✅ [StagingQuery] gcp.raw_demographics__0 | ||
| ``` |
Specify language for code fence showing example output.
The example output block is missing a language identifier, which triggers a linter warning. Use `text` or `plaintext`:

````diff
-```
+```text
 🟢 Eval job finished successfully
````

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
23-23: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In docs/source/running_on_zipline_hub/Eval.md around lines 23 to 43, the fenced
code block showing example output lacks a language identifier which triggers a
linter warning; update the opening triple backticks to include a language such
as text or plaintext (e.g. ```text) so the block is explicitly marked as plain
text and the linter warning is resolved.
Summary
Docs for eval in the test/dev loop.
Plus a specific section for CI.
Checklist
Summary by CodeRabbit