tempo traces test error 115 #925

Sheeproid · 2025-09-01T10:59:01Z

No description provided.

coderabbitai · 2025-09-01T10:59:08Z

Walkthrough

Introduces a new test fixture suite for case 115 that deploys a Flask-based Checkout service with OpenTelemetry tracing to Tempo, a traffic generator that produces mixed promo/non-promo requests, toolset configuration for Kubernetes and Tempo, and an end-to-end test harness that provisions, validates logs/traces, and cleans up.

Changes

Cohort / File(s)	Summary
Checkout Service (K8s manifest + Flask app in Secret) `tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml`	Adds Secret embedding app.py (Flask, OTEL gRPC exporter to Tempo), Deployment (Python 3.11-slim, mounts Secret, startupProbe /health), and Service (port 8080). Exposes GET /health and POST /checkout with spans, attributes, error recording on promo-code path, and computed totals on success.
Traffic Generator `tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml`	Adds Secret with Python script that loops POSTs to /checkout with randomized items and conditional promo_code, logs outcomes and latency; Deployment installs requests, uses file-based startupProbe, runs script; optional OTEL lines commented out.
Test Case Orchestration `tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml`	Defines prompt/expectations, sets up namespace app-115, deploys Tempo, checkout service, and traffic generator; validates logs for promo/non-promo and processing entries; queries Tempo for traces; removes traffic generator; cleans namespace in teardown.
Toolsets Configuration `tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml`	Enables kubernetes/core, kubernetes/logs, and grafana/tempo toolsets; configures Tempo at http://localhost:3200 with healthcheck.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant TG as Traffic Generator
  participant CS as Checkout Service (Flask)
  participant DB as Simulated DB (span)
  participant Tempo as Tempo (OTLP)

  TG->>CS: POST /checkout (payload)
  activate CS
  note over CS: start span: process_checkout

  CS->>DB: start span: database_query
  alt promo_code present
    DB-->>CS: Simulated DB error
    note over CS: recordException, setStatus(ERROR)
    CS--)Tempo: OTLP spans (error)
    CS-->>TG: 500 {"error": "..."}
  else no promo_code
    DB-->>CS: Simulated success (rates/discount)
    CS-->>TG: 200 {"order_id", "subtotal","shipping","total"}
    CS--)Tempo: OTLP spans (success)
  end
  deactivate CS
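
The flow in the diagram can be sketched in plain Python. This is a minimal stand-in, not the fixture's app.py: `Span` is a hypothetical recorder used in place of the OpenTelemetry SDK, and the item fields (`price`, `quantity`), the flat shipping amount, the error text, and the order-ID format are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in for a tracing span (not the OpenTelemetry API)."""
    name: str
    status: str = "OK"
    events: list = field(default_factory=list)

    def record_exception(self, exc: Exception) -> None:
        # Keep the exception on the span and mark the span as failed.
        self.events.append(repr(exc))
        self.status = "ERROR"


def process_checkout(payload: dict, exported: list) -> tuple[int, dict]:
    """Mirror the diagram: a parent span wraps a simulated database_query span."""
    parent = Span("process_checkout")
    query = Span("database_query")
    try:
        if payload.get("promo_code"):
            # Assumed error text; the real message lives in the fixture.
            raise RuntimeError("simulated database error")
        subtotal = sum(i["price"] * i["quantity"] for i in payload.get("items", []))
        shipping = 5.0  # assumed flat rate
        body = {
            "order_id": str(uuid.uuid4()),
            "subtotal": subtotal,
            "shipping": shipping,
            "total": subtotal + shipping,
        }
        status = 200
    except RuntimeError as exc:
        query.record_exception(exc)
        parent.record_exception(exc)
        body = {"error": str(exc)}
        status = 500
    finally:
        # In the fixture, this is the point where spans would go to Tempo over OTLP.
        exported.extend([query, parent])
    return status, body
```

Calling `process_checkout` with a `promo_code` key yields a 500 and two spans marked `ERROR`; without it, the call returns 200 with computed totals.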

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Add a test for traces due to db slowness #907 — Adds a similar checkout-service with OTEL tracing and matching test fixtures; likely parallel or precursor to this test case.
Add 2 evals for loki + port-forwarding support for evals that need it #783 — Introduces port-forward handling and toolset usage that this test case leverages for Tempo access.

Suggested reviewers

aantn
moshemorad

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (10)

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml (3)
125-160: Harden the container (non-root and no privilege escalation).

Address CKV_K8S_20/23 by adding a minimal securityContext.
       containers:
       - name: traffic-generator
         image: python:3.11-slim
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 10001
+          runAsGroup: 10001
         command: ["/bin/bash", "-c"]
101-103: Fix comment wording.

Minor grammar nit.
-            # Wait 10ms to 50ms second before next request
+            # Wait 10–50 ms before the next request
12-12: Remove unused import.

datetime isn’t used.
-    from datetime import datetime
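For context, the loop the traffic-generator script runs (randomized items, an occasional promo_code, and a short pause between requests) might look roughly like the sketch below. The item catalogue, promo probability, promo value, and field names are assumptions, and the HTTP call is left out so the snippet runs without a network.

```python
import random

CATALOG = ["widget", "gadget", "gizmo"]  # hypothetical item names


def build_payload(rng: random.Random, promo_probability: float = 0.05) -> dict:
    """Build one checkout request body; promo_code is included only sometimes."""
    items = [
        {"name": rng.choice(CATALOG), "quantity": rng.randint(1, 3)}
        for _ in range(rng.randint(1, 4))
    ]
    payload = {"items": items}
    if rng.random() < promo_probability:
        payload["promo_code"] = "PROMO"  # assumed value
    return payload


def next_delay_seconds(rng: random.Random) -> float:
    """Pause of 10-50 ms before the next request, matching the reworded comment."""
    return rng.uniform(0.010, 0.050)
```

Passing a seeded `random.Random` keeps the generated traffic reproducible, which can help when a test run needs to be replayed.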
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml (4)
135-169: Add basic container hardening.

Run as non-root and disable privilege escalation.
       containers:
       - name: checkout
         image: python:3.11-slim
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 10001
+          runAsGroup: 10001
         command: ["/bin/bash", "-c"]
153-161: Consider adding a readinessProbe.

You have a startupProbe; a readinessProbe on /health improves service gating.
         startupProbe:
           httpGet:
             path: /health
             port: 8080
           initialDelaySeconds: 10
           periodSeconds: 5
           timeoutSeconds: 3
           successThreshold: 1
           failureThreshold: 24
+        readinessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          periodSeconds: 5
+          timeoutSeconds: 2
+          failureThreshold: 3
9-19: Drop unused import.

os isn’t referenced.
-    import os
68-69: Nit: placeholder style vs declared DB.

Query uses ? placeholders (SQLite style) while db.system is postgresql. It’s fine for a dummy, but consider aligning for clarity ($1, $2) or add a comment.
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml (3)
70-72: Align message and wait time (reduce flakiness).

Message says 45s but sleeps 20s. Either change text or actually wait longer to ensure promo traffic and traces.
-  echo "⏰ Letting traffic generator run for 45 seconds to generate requests"
-  sleep 20
+  echo "⏰ Letting traffic generator run for 45 seconds to generate requests"
+  sleep 45
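One alternative to a fixed sleep, offered only as a sketch, is to poll until the expected condition holds or a deadline passes. `wait_until` and its parameters are hypothetical names and are not part of the test harness; the predicate would wrap whatever check the setup script already performs, such as searching the traffic generator's logs.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout_s: float, interval_s: float = 1.0) -> bool:
    """Call predicate repeatedly until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            # Give up; the caller decides whether this is a setup failure.
            return False
        time.sleep(interval_s)
```

Polling returns as soon as the condition is met, so the wait is bounded by the timeout rather than always costing the full duration.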
74-86: Avoid --tail=-1 for portability.

Some kubectl versions reject negative tails. Default returns all logs.
-  if kubectl logs -n app-115 -l app=traffic-generator --tail=-1 | grep -q "WITH promo_code"; then
+  if kubectl logs -n app-115 -l app=traffic-generator | grep -q "WITH promo_code"; then
@@
-  if kubectl logs -n app-115 -l app=traffic-generator --tail=-1 | grep -q "WITHOUT promo_code"; then
+  if kubectl logs -n app-115 -l app=traffic-generator | grep -q "WITHOUT promo_code"; then
1-1: Optional: neutralize fixture directory name.

Guideline prefers neutral names. Consider 115_checkout_tracing instead of 115_checkout_errors_tracing (update references accordingly).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 597b8f3 and 14ea855.

📒 Files selected for processing (4)

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml (1 hunks)
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml (1 hunks)
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml (1 hunks)
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

tests/llm/**/test_case.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

Eval test cases may declare runbooks in test_case.yaml using either runbooks: {} or runbooks: {catalog: [...]}; if omitted, defaults are used

Files:

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml

tests/llm/**/*.{yaml,yml}

📄 CodeRabbit inference engine (CLAUDE.md)

tests/llm/**/*.{yaml,yml}: Each LLM eval test must use a dedicated Kubernetes namespace named app-
For Kubernetes-related eval assets, always use Secrets for scripts; do not embed scripts in inline manifests or ConfigMaps

Files:

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml

tests/llm/**

📄 CodeRabbit inference engine (CLAUDE.md)

Resource and file naming in evals should be neutral and must not hint at the problem (avoid names like broken-pod or crashloop-app)

Files:

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml

tests/llm/**/toolsets.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

tests/llm/**/toolsets.yaml: Eval toolset overrides must be defined in a separate toolsets.yaml file in the test directory (do not put toolset config in test_case.yaml)
In toolsets.yaml, all toolset-specific configuration must be nested under a config field
Only the following top-level fields are allowed in toolsets YAML: enabled, name, description, additional_instructions, prerequisites, tools, docs_url, icon_url, installation_instructions, config, url (MCP only)

Files:

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
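
The allowed-fields rule above lends itself to a small check. The sketch below assumes a single toolset entry has already been parsed into a dict; `disallowed_fields` is a hypothetical helper, and the `api_url` key in the examples is an assumed setting name, not taken from the fixture.

```python
ALLOWED_TOOLSET_FIELDS = frozenset({
    "enabled", "name", "description", "additional_instructions",
    "prerequisites", "tools", "docs_url", "icon_url",
    "installation_instructions", "config", "url",
})


def disallowed_fields(toolset: dict) -> list[str]:
    """Return the top-level keys of one toolset entry that are not allowed."""
    return sorted(k for k in toolset if k not in ALLOWED_TOOLSET_FIELDS)


# Tool-specific settings nested under config pass the check.
nested = {"enabled": True, "config": {"api_url": "http://localhost:3200"}}
# The same setting placed at the top level is flagged.
flat = {"enabled": True, "api_url": "http://localhost:3200"}
```

Here `disallowed_fields(nested)` is empty, while `disallowed_fields(flat)` reports the misplaced key.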

🧠 Learnings (3)

📚 Learning: 2025-08-24T07:21:02.611Z

Learnt from: CR
PR: robusta-dev/holmesgpt#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T07:21:02.611Z
Learning: Applies to tests/llm/**/toolsets.yaml : Only the following top-level fields are allowed in toolsets YAML: enabled, name, description, additional_instructions, prerequisites, tools, docs_url, icon_url, installation_instructions, config, url (MCP only)

Applied to files:

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml

📚 Learning: 2025-08-24T07:21:02.611Z

Learnt from: CR
PR: robusta-dev/holmesgpt#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T07:21:02.611Z
Learning: Applies to tests/llm/**/toolsets.yaml : Eval toolset overrides must be defined in a separate toolsets.yaml file in the test directory (do not put toolset config in test_case.yaml)

Applied to files:

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml

📚 Learning: 2025-08-05T00:42:23.792Z

Learnt from: vishiy
PR: robusta-dev/holmesgpt#782
File: config.example.yaml:31-49
Timestamp: 2025-08-05T00:42:23.792Z
Learning: In robusta-dev/holmesgpt config.example.yaml, the azuremonitorlogs toolset configuration shows "enabled: true" as an example of how to enable the toolset, not as a default setting. The toolset is disabled by default and requires explicit enablement in user configurations.

Applied to files:

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml

🪛 Checkov (3.2.334)

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml

[MEDIUM] 111-163: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)

[MEDIUM] 111-163: Minimize the admission of root containers

(CKV_K8S_23)

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml

[MEDIUM] 121-173: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)

[MEDIUM] 121-173: Minimize the admission of root containers

(CKV_K8S_23)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: build (3.10)
GitHub Check: build (3.11)
GitHub Check: build (3.12)
GitHub Check: build (3.11)
GitHub Check: build (3.12)
GitHub Check: build (3.10)
GitHub Check: llm_evals

🔇 Additional comments (1)

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml (1)

8-10: Good: tool-specific settings are nested under config.

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml

github-actions · 2025-09-01T11:10:05Z

Results of HolmesGPT evals

ask_holmes: 32/37 test cases were successful, 0 regressions, 2 skipped, 2 setup failures

Test suite	Test case	Status
ask	01_how_many_pods	✅
ask	02_what_is_wrong_with_pod	✅
ask	04_related_k8s_events	↪️
ask	05_image_version	✅
ask	09_crashpod	✅
ask	10_image_pull_backoff	✅
ask	110_k8s_events_image_pull	✅
ask	11_init_containers	✅
ask	13a_pending_node_selector_basic	✅
ask	14_pending_resources	✅
ask	15_failed_readiness_probe	✅
ask	17_oom_kill	✅
ask	18_crash_looping_v2	✅
ask	19_detect_missing_app_details	✅
ask	20_long_log_file_search	✅
ask	24_misconfigured_pvc	✅
ask	24a_misconfigured_pvc_basic	✅
ask	28_permissions_error	🚧
ask	29_events_from_alert_manager	↪️
ask	39_failed_toolset	✅
ask	41_setup_argo	✅
ask	42_dns_issues_steps_new_tools	⚠️
ask	43_current_datetime_from_prompt	✅
ask	45_fetch_deployment_logs_simple	✅
ask	51_logs_summarize_errors	✅
ask	53_logs_find_term	✅
ask	54_not_truncated_when_getting_pods	✅
ask	59_label_based_counting	✅
ask	60_count_less_than	🚧
ask	61_exact_match_counting	✅
ask	63_fetch_error_logs_no_errors	✅
ask	79_configmap_mount_issue	✅
ask	83_secret_not_found	✅
ask	86_configmap_like_but_secret	✅
ask	93_calling_datadog[0]	✅
ask	93_calling_datadog[1]	✅
ask	93_calling_datadog[2]	✅

Legend

✅ the test was successful
↪️ the test was skipped
⚠️ the test failed but is known to be flaky or known to fail
🚧 the test had a setup failure (not a code regression)
🔧 the test failed due to mock data issues (not a code regression)
❌ the test failed and should be fixed before merging the PR

Sheeproid added 2 commits September 1, 2025 12:49

test tempo error - wip judge fails

c8744a6

make promo code error more rare so ai wont get lucky by getting one

4fd9a75

Sheeproid requested a review from aantn September 1, 2025 11:00

Sheeproid enabled auto-merge (squash) September 1, 2025 11:00

Merge branch 'master' into traces-failure-test

14ea855

coderabbitai bot reviewed Sep 1, 2025

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml (resolved)

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml (resolved)

aantn approved these changes Sep 1, 2025

Sheeproid merged commit d86ad6f into master Sep 1, 2025
11 checks passed

Sheeproid deleted the traces-failure-test branch September 1, 2025 11:12

0xLeo258 pushed a commit to shenglei5859/holmesgpt that referenced this pull request Sep 3, 2025

tempo traces test error 115 (robusta-dev#925)

77ebcd0

This was referenced Sep 7, 2025

Add Freeform tempo toolset #941

Closed

add free form tempo toolset #948

Merged

ROB-2116: prevent tool calls responses that are too big #956

Merged

This was referenced Sep 15, 2025

Publish model benchmarks #897

Merged

New Evals for newrelic #983

Merged
