@Sheeproid
Contributor

No description provided.

@coderabbitai
Contributor

coderabbitai bot commented Sep 1, 2025

Walkthrough

Introduces a new test fixture suite for case 115 that deploys a Flask-based Checkout service with OpenTelemetry tracing to Tempo, a traffic generator that produces mixed promo/non-promo requests, toolset configuration for Kubernetes and Tempo, and an end-to-end test harness that provisions, validates logs/traces, and cleans up.

Changes

Cohort / File(s) Summary
Checkout Service (K8s manifest + Flask app in Secret)
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml
Adds Secret embedding app.py (Flask, OTEL gRPC exporter to Tempo), Deployment (Python 3.11-slim, mounts Secret, startupProbe /health), and Service (port 8080). Exposes GET /health and POST /checkout with spans, attributes, error recording on promo-code path, and computed totals on success.
Traffic Generator
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml
Adds Secret with Python script that loops POSTs to /checkout with randomized items and conditional promo_code, logs outcomes and latency; Deployment installs requests, uses file-based startupProbe, runs script; optional OTEL lines commented out.
Test Case Orchestration
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml
Defines prompt/expectations, sets up namespace app-115, deploys Tempo, checkout service, and traffic generator; validates logs for promo/non-promo and processing entries; queries Tempo for traces; removes traffic generator; cleans namespace in teardown.
Toolsets Configuration
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
Enables kubernetes/core, kubernetes/logs, and grafana/tempo toolsets; configures Tempo at http://localhost:3200 with healthcheck.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant TG as Traffic Generator
  participant CS as Checkout Service (Flask)
  participant DB as Simulated DB (span)
  participant Tempo as Tempo (OTLP)

  TG->>CS: POST /checkout (payload)
  activate CS
  note over CS: start span: process_checkout

  CS->>DB: start span: database_query
  alt promo_code present
    DB-->>CS: Simulated DB error
    note over CS: recordException, setStatus(ERROR)
    CS--)Tempo: OTLP spans (error)
    CS-->>TG: 500 {"error": "..."}
  else no promo_code
    DB-->>CS: Simulated success (rates/discount)
    CS-->>TG: 200 {"order_id", "subtotal","shipping","total"}
    CS--)Tempo: OTLP spans (success)
  end
  deactivate CS
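The flow in the diagram can be sketched in plain Python. This is an illustrative stand-in, not the code embedded in checkout-service.yaml: the `span` context manager only mimics what a tracer records, and the field names `price_cents`, `quantity`, and the flat `shipping_cents` rate are assumptions made for the example.

```python
import uuid
from contextlib import contextmanager

# Stand-in for a tracer: finished spans are collected here for inspection.
SPANS = []


@contextmanager
def span(name):
    """Record a span; mark it ERROR if the wrapped block raises."""
    record = {"name": name, "status": "OK", "attributes": {}, "exception": None}
    try:
        yield record
    except Exception as exc:
        record["status"] = "ERROR"
        record["exception"] = repr(exc)
        raise
    finally:
        SPANS.append(record)


def database_query(promo_code):
    """Simulated DB call: fails whenever a promo code is present."""
    with span("database_query") as s:
        s["attributes"]["db.system"] = "postgresql"
        if promo_code:
            raise RuntimeError("simulated database error")
        return {"shipping_cents": 500}  # hypothetical flat shipping rate


def process_checkout(payload):
    """Return (http_status, body) following the two branches of the diagram."""
    with span("process_checkout") as s:
        items = payload.get("items", [])
        s["attributes"]["items.count"] = len(items)
        try:
            rates = database_query(payload.get("promo_code"))
        except RuntimeError as exc:
            # Error branch: record the exception on the parent span, answer 500.
            s["status"] = "ERROR"
            s["exception"] = repr(exc)
            return 500, {"error": str(exc)}
        subtotal = sum(i["price_cents"] * i["quantity"] for i in items)
        shipping = rates["shipping_cents"]
        return 200, {
            "order_id": str(uuid.uuid4()),
            "subtotal": subtotal,
            "shipping": shipping,
            "total": subtotal + shipping,
        }
```

Calling `process_checkout` with a `promo_code` yields a 500 and two spans marked ERROR; without one it yields a 200 with the computed totals.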

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • aantn
  • moshemorad

@Sheeproid Sheeproid requested a review from aantn September 1, 2025 11:00
@Sheeproid Sheeproid enabled auto-merge (squash) September 1, 2025 11:00
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (10)
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml (3)

125-160: Harden the container (non-root and no privilege escalation).

Address CKV_K8S_20/23 by adding a minimal securityContext.

       containers:
       - name: traffic-generator
         image: python:3.11-slim
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 10001
+          runAsGroup: 10001
         command: ["/bin/bash", "-c"]
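As a rough illustration of what the two findings look for, the check below inspects a container spec loaded as a Python dict. It is a simplified sketch, not Checkov's implementation: it only looks at the container-level `securityContext` and ignores anything set at the pod level. The function name `security_findings` is made up for this example.

```python
def security_findings(container):
    """Return a list of hardening problems for one container spec (a dict)."""
    ctx = container.get("securityContext") or {}
    findings = []
    # The field must be explicitly false; an unset value does not count.
    if ctx.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation must be set to false")
    # The field must be explicitly true.
    if ctx.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot must be set to true")
    return findings


before = {"name": "traffic-generator", "image": "python:3.11-slim"}
after = {
    **before,
    "securityContext": {
        "allowPrivilegeEscalation": False,
        "runAsNonRoot": True,
        "runAsUser": 10001,
        "runAsGroup": 10001,
    },
}
```

With these inputs, `security_findings(before)` reports both problems and `security_findings(after)` returns an empty list.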

101-103: Fix comment wording.

Minor grammar nit.

-            # Wait 10ms to 50ms second before next request
+            # Wait 10–50 ms before the next request
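For orientation, the generator's loop body might be shaped like the sketch below. The payload fields (`sku`, `quantity`), the promo value, and the 50% promo probability are assumptions for illustration; only the `WITH promo_code` / `WITHOUT promo_code` log markers and the 10–50 ms pause come from this PR.

```python
import random


def build_payload(rng, promo_probability=0.5):
    """Build a randomized checkout payload; promo_code is added conditionally."""
    items = [
        {"sku": f"item-{rng.randint(1, 100)}", "quantity": rng.randint(1, 3)}
        for _ in range(rng.randint(1, 4))
    ]
    payload = {"items": items}
    if rng.random() < promo_probability:
        payload["promo_code"] = "PROMO"  # hypothetical value
    return payload


def describe(payload):
    """Log marker that the test harness greps for."""
    return "WITH promo_code" if "promo_code" in payload else "WITHOUT promo_code"


def next_delay_seconds(rng):
    """Pause of 10-50 ms before the next request."""
    return rng.uniform(0.010, 0.050)
```

Passing a seeded `random.Random` instance keeps the sequence reproducible, which helps when debugging a flaky run.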

12-12: Remove unused import.

datetime isn’t used.

-    from datetime import datetime
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml (4)

135-169: Add basic container hardening.

Run as non-root and disable privilege escalation.

       containers:
       - name: checkout
         image: python:3.11-slim
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 10001
+          runAsGroup: 10001
         command: ["/bin/bash", "-c"]

153-161: Consider adding a readinessProbe.

You have a startupProbe; a readinessProbe on /health improves service gating.

         startupProbe:
           httpGet:
             path: /health
             port: 8080
           initialDelaySeconds: 10
           periodSeconds: 5
           timeoutSeconds: 3
           successThreshold: 1
           failureThreshold: 24
+        readinessProbe:
+          httpGet:
+            path: /health
+            port: 8080
+          periodSeconds: 5
+          timeoutSeconds: 2
+          failureThreshold: 3
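The startupProbe values above imply a rough upper bound on how long the container may take to start before it is restarted. The helper below is a back-of-the-envelope estimate only; it ignores per-probe timeouts and scheduling jitter, and the function name is made up for this example.

```python
def startup_budget_seconds(initial_delay, period, failure_threshold):
    """Approximate time allowed for startup before the probe gives up."""
    return initial_delay + period * failure_threshold


# Values from the startupProbe in the diff above.
budget = startup_budget_seconds(initial_delay=10, period=5, failure_threshold=24)
```

For this manifest the estimate comes to 130 seconds, which leaves room for the dependency install that runs at container start.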

9-19: Drop unused import.

os isn’t referenced.

-    import os

68-69: Nit: placeholder style vs declared DB.

Query uses ? placeholders (SQLite style) while db.system is postgresql. It’s fine for a dummy, but consider aligning for clarity ($1, $2) or add a comment.

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml (3)

70-72: Align message and wait time (reduce flakiness).

Message says 45s but sleeps 20s. Either change text or actually wait longer to ensure promo traffic and traces.

-  echo "⏰ Letting traffic generator run for 45 seconds to generate requests"
-  sleep 20
+  echo "⏰ Letting traffic generator run for 45 seconds to generate requests"
+  sleep 45
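An alternative to a fixed sleep is to poll for the expected condition until a deadline passes. The sketch below shows the idea in Python; it is not part of the PR, and `wait_until`, together with its injectable `clock` and `sleep` parameters, is a hypothetical helper.

```python
import time


def wait_until(predicate, timeout_s, interval_s=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate until it returns True or timeout_s elapses.

    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The predicate could, for example, check whether both log markers have appeared, so the setup finishes as soon as the traffic is there and fails clearly after 45 seconds otherwise.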

74-86: Avoid --tail=-1 for portability.

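Some kubectl versions may reject negative tails. Note that when a label selector (-l) is used, kubectl logs defaults to the last 10 lines per pod rather than the full log, so after dropping the flag the grep only sees recent lines; pass an explicit large --tail value if older lines must be searched.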

-  if kubectl logs -n app-115 -l app=traffic-generator --tail=-1 | grep -q "WITH promo_code"; then
+  if kubectl logs -n app-115 -l app=traffic-generator | grep -q "WITH promo_code"; then
@@
-  if kubectl logs -n app-115 -l app=traffic-generator --tail=-1 | grep -q "WITHOUT promo_code"; then
+  if kubectl logs -n app-115 -l app=traffic-generator | grep -q "WITHOUT promo_code"; then

1-1: Optional: neutralize fixture directory name.

Guideline prefers neutral names. Consider 115_checkout_tracing instead of 115_checkout_errors_tracing (update references accordingly).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 597b8f3 and 14ea855.

📒 Files selected for processing (4)
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml (1 hunks)
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml (1 hunks)
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml (1 hunks)
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
tests/llm/**/test_case.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

Eval test cases may declare runbooks in test_case.yaml using either runbooks: {} or runbooks: {catalog: [...]}; if omitted, defaults are used

Files:

  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml
tests/llm/**/*.{yaml,yml}

📄 CodeRabbit inference engine (CLAUDE.md)

tests/llm/**/*.{yaml,yml}: Each LLM eval test must use a dedicated Kubernetes namespace named app-
For Kubernetes-related eval assets, always use Secrets for scripts; do not embed scripts in inline manifests or ConfigMaps

Files:

  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml
tests/llm/**

📄 CodeRabbit inference engine (CLAUDE.md)

Resource and file naming in evals should be neutral and must not hint at the problem (avoid names like broken-pod or crashloop-app)

Files:

  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/test_case.yaml
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml
  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml
tests/llm/**/toolsets.yaml

📄 CodeRabbit inference engine (CLAUDE.md)

tests/llm/**/toolsets.yaml: Eval toolset overrides must be defined in a separate toolsets.yaml file in the test directory (do not put toolset config in test_case.yaml)
In toolsets.yaml, all toolset-specific configuration must be nested under a config field
Only the following top-level fields are allowed in toolsets YAML: enabled, name, description, additional_instructions, prerequisites, tools, docs_url, icon_url, installation_instructions, config, url (MCP only)
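The allow-list rule can be expressed as a small check over one toolset entry loaded as a dict. This is a sketch for illustration rather than the validation HolmesGPT itself performs; `unexpected_fields` is a made-up name, and the `healthcheck` key and the keys inside `config` in the example are hypothetical.

```python
ALLOWED_TOP_LEVEL_FIELDS = frozenset({
    "enabled", "name", "description", "additional_instructions",
    "prerequisites", "tools", "docs_url", "icon_url",
    "installation_instructions", "config", "url",
})


def unexpected_fields(toolset):
    """Return the top-level keys of a toolset entry that are not allowed."""
    return sorted(set(toolset) - ALLOWED_TOP_LEVEL_FIELDS)


# Tool-specific settings nested under config: passes.
good = {"enabled": True, "config": {"url": "http://localhost:3200"}}
# A tool-specific setting placed at the top level: flagged.
bad = {"enabled": True, "healthcheck": "ready"}
```

Here `unexpected_fields(good)` is empty, while `unexpected_fields(bad)` names the misplaced key, which is the situation the nesting rule is meant to catch.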

Files:

  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
🧠 Learnings (3)
📚 Learning: 2025-08-24T07:21:02.611Z
Learnt from: CR
PR: robusta-dev/holmesgpt#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T07:21:02.611Z
Learning: Applies to tests/llm/**/toolsets.yaml : Only the following top-level fields are allowed in toolsets YAML: enabled, name, description, additional_instructions, prerequisites, tools, docs_url, icon_url, installation_instructions, config, url (MCP only)

Applied to files:

  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
📚 Learning: 2025-08-24T07:21:02.611Z
Learnt from: CR
PR: robusta-dev/holmesgpt#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T07:21:02.611Z
Learning: Applies to tests/llm/**/toolsets.yaml : Eval toolset overrides must be defined in a separate toolsets.yaml file in the test directory (do not put toolset config in test_case.yaml)

Applied to files:

  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
📚 Learning: 2025-08-05T00:42:23.792Z
Learnt from: vishiy
PR: robusta-dev/holmesgpt#782
File: config.example.yaml:31-49
Timestamp: 2025-08-05T00:42:23.792Z
Learning: In robusta-dev/holmesgpt config.example.yaml, the azuremonitorlogs toolset configuration shows "enabled: true" as an example of how to enable the toolset, not as a default setting. The toolset is disabled by default and requires explicit enablement in user configurations.

Applied to files:

  • tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml
🪛 Checkov (3.2.334)
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/traffic-generator.yaml

[MEDIUM] 111-163: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[MEDIUM] 111-163: Minimize the admission of root containers

(CKV_K8S_23)

tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/checkout-service.yaml

[MEDIUM] 121-173: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[MEDIUM] 121-173: Minimize the admission of root containers

(CKV_K8S_23)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.10)
  • GitHub Check: llm_evals
🔇 Additional comments (1)
tests/llm/fixtures/test_ask_holmes/115_checkout_errors_tracing/toolsets.yaml (1)

8-10: Good: tool-specific settings are nested under config.

@github-actions
Contributor

github-actions bot commented Sep 1, 2025

Results of HolmesGPT evals

  • ask_holmes: 32/37 test cases were successful, 0 regressions, 2 skipped, 2 setup failures
Test suite | Test case | Status
ask | 01_how_many_pods | ✅
ask | 02_what_is_wrong_with_pod | ✅
ask | 04_related_k8s_events | ↪️
ask | 05_image_version | ✅
ask | 09_crashpod | ✅
ask | 10_image_pull_backoff | ✅
ask | 110_k8s_events_image_pull | ✅
ask | 11_init_containers | ✅
ask | 13a_pending_node_selector_basic | ✅
ask | 14_pending_resources | ✅
ask | 15_failed_readiness_probe | ✅
ask | 17_oom_kill | ✅
ask | 18_crash_looping_v2 | ✅
ask | 19_detect_missing_app_details | ✅
ask | 20_long_log_file_search | ✅
ask | 24_misconfigured_pvc | ✅
ask | 24a_misconfigured_pvc_basic | ✅
ask | 28_permissions_error | 🚧
ask | 29_events_from_alert_manager | ↪️
ask | 39_failed_toolset | ✅
ask | 41_setup_argo | ✅
ask | 42_dns_issues_steps_new_tools | ⚠️
ask | 43_current_datetime_from_prompt | ✅
ask | 45_fetch_deployment_logs_simple | ✅
ask | 51_logs_summarize_errors | ✅
ask | 53_logs_find_term | ✅
ask | 54_not_truncated_when_getting_pods | ✅
ask | 59_label_based_counting | ✅
ask | 60_count_less_than | 🚧
ask | 61_exact_match_counting | ✅
ask | 63_fetch_error_logs_no_errors | ✅
ask | 79_configmap_mount_issue | ✅
ask | 83_secret_not_found | ✅
ask | 86_configmap_like_but_secret | ✅
ask | 93_calling_datadog[0] | ✅
ask | 93_calling_datadog[1] | ✅
ask | 93_calling_datadog[2] | ✅

Legend

  • ✅ the test was successful
  • ↪️ the test was skipped
  • ⚠️ the test failed but is known to be flaky or known to fail
  • 🚧 the test had a setup failure (not a code regression)
  • 🔧 the test failed due to mock data issues (not a code regression)
  • ❌ the test failed and should be fixed before merging the PR

@Sheeproid Sheeproid merged commit d86ad6f into master Sep 1, 2025
11 checks passed
@Sheeproid Sheeproid deleted the traces-failure-test branch September 1, 2025 11:12
0xLeo258 pushed a commit to shenglei5859/holmesgpt that referenced this pull request Sep 3, 2025
This was referenced Sep 15, 2025

3 participants