@gsw945
Contributor

@gsw945 gsw945 commented Sep 17, 2025

Closes #123

Summary by CodeRabbit

  • Bug Fixes

    • Improved reliability when configuring models: adds a fallback for Ollama-based models that cannot be deep-copied, preventing errors during setup.
    • Ensures structured extraction continues to work by reconstructing model configuration when needed.
    • Maintains JSON response formatting for extraction outputs.
  • Chores

    • No changes to public APIs; internal handling refined to reduce setup failures for certain model types.

@coderabbitai

coderabbitai bot commented Sep 17, 2025

Walkthrough

Adds a fallback in KnowledgeGraphModelConfig.with_model: if deepcopy of a model fails and an Ollama client is present, it reconstructs the model via OllamaGenerativeModel.from_json(model.to_json()), then sets response_format to {"type": "json_object"} before returning the config. No public API signatures changed.
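The fallback described in this walkthrough can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in graphrag_sdk/model_config.py: FakeOllamaModel, its fields, and copy_model are hypothetical names, and a threading.Lock stands in for whatever client state makes deepcopy raise TypeError.

```python
import copy
import threading


class FakeOllamaModel:
    """Hypothetical stand-in for a model that holds an un-copyable client."""

    def __init__(self, model_name, host=None):
        self.model_name = model_name
        self.host = host
        # A lock cannot be pickled or deep-copied, so it triggers the TypeError.
        self.ollama_client = threading.Lock()

    def to_json(self):
        # Only plain, serializable settings are exported (assumed shape).
        return {"model_name": self.model_name, "host": self.host}

    @classmethod
    def from_json(cls, data):
        # Rebuilding creates a fresh client instead of copying the old one.
        return cls(data["model_name"], host=data.get("host"))


def copy_model(model):
    """Deep-copy the model, falling back to a JSON round trip for Ollama-like models."""
    try:
        return copy.deepcopy(model)
    except TypeError:
        if getattr(model, "ollama_client", None) is not None:
            return FakeOllamaModel.from_json(model.to_json())
        raise  # not an Ollama-like model: surface the original error
```

The round trip keeps the serializable settings and gives the copy its own client, which is why it sidesteps the deepcopy failure.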

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Model deepcopy fallback for Ollama models (graphrag_sdk/model_config.py) | Wraps deepcopy with try/except. On TypeError and presence of ollama_client, lazily imports OllamaGenerativeModel and rebuilds via from_json(model.to_json()); otherwise re-raises. Ensures generation_config.response_format = {"type": "json_object"} remains applied. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant K as KnowledgeGraphModelConfig
  participant M as Model
  participant O as OllamaGenerativeModel (lazy)

  Caller->>K: with_model(model)
  Note over K: Try to deepcopy model
  K->>M: copy.deepcopy(model)
  alt deepcopy succeeds
    K->>K: set generation_config.response_format = {"type":"json_object"}
    K-->>Caller: return updated config
  else deepcopy raises TypeError
    alt model has ollama_client
      Note over K,O: Lazy import and reconstruct from JSON
      K->>M: model.to_json()
      K->>O: OllamaGenerativeModel.from_json(json)
      K->>K: set generation_config.response_format = {"type":"json_object"}
      K-->>Caller: return updated config
    else no ollama_client
      K-->>Caller: re-raise TypeError
    end
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • Ollama integration #29 — Introduces OllamaGenerativeModel JSON (de)serialization used here for fallback reconstruction when deepcopy fails.

Suggested reviewers

  • swilly22

Poem

I hopped past deepcopy’s snare,
A lockful lair, a thready glare—beware!
From JSON seeds, a model blooms anew,
Ollama whispers, “Rebuild, true.”
Now flows the graph in tidy streams,
Carrots of config fuel my dreams. 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues Check | ⚠️ Warning | The change implements a JSON-based fallback when copy.deepcopy(model) raises TypeError for Ollama-based models, which directly addresses the "prevent deepcopy/pickle errors" objective from Issue #123. However, the PR does not address several other coding objectives in #123: it does not change host/env var precedence (OLLAMA_API_BASE vs OLLAMA_HOST), does not update model discovery parsing to accept both 'name' and 'model', does not add unit/integration tests, and does not improve connection/error messages. Therefore the PR only partially satisfies the linked issue's coding requirements. | Extend the PR to cover the remaining coding items from Issue #123: enforce host precedence (explicit host arg > OLLAMA_API_BASE and remove undocumented OLLAMA_HOST reliance), normalize model discovery responses to accept 'name' or 'model', add unit/integration tests for remote Ollama discovery and model-config serialization, and improve connection/error messages to clearly indicate host/config issues; prefer copying only serializable metadata or a controlled shallow copy rather than relying solely on JSON reconstruction where appropriate. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "avoid deepcopy exception with OllamaGenerativeModel" succinctly and accurately describes the primary code change (adding a fallback to avoid deepcopy failures for Ollama models in model_config.py). It is concise, specific to the main intent, and free of noisy details, so a reviewer scanning history will understand the primary change. |
| Out of Scope Changes Check | ✅ Passed | The diff is limited to graphrag_sdk/model_config.py and implements a narrowly scoped fallback for Ollama models that fail deepcopy; no unrelated files or features were modified. This change is in-scope with the deepcopy/pickle portion of Issue #123 and does not introduce apparent out-of-scope modifications. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
graphrag_sdk/model_config.py (3)

52-59: Preserve traceback; fix Ruff TRY201; prefer absolute import.

  • Use a bare raise to re-raise the active exception unchanged, which keeps the original traceback and satisfies Ruff TRY201.
  • Use absolute import for consistency with the module’s other imports.

Apply:

-        except TypeError as te:
-            if getattr(model, "ollama_client", None) is not None:
-                from .models.ollama import OllamaGenerativeModel
+        except TypeError:
+            if getattr(model, "ollama_client", None) is not None:
+                from graphrag_sdk.models.ollama import OllamaGenerativeModel
                 extract_data_model = OllamaGenerativeModel.from_json(model.to_json())
             else:
-                raise te
+                raise

52-59: Prefer type check over attribute sentinel.

Duck-typing on ollama_client risks false positives. After the lazy import, check isinstance(model, OllamaGenerativeModel) and otherwise re-raise.

Example:

-        except TypeError:
-            if getattr(model, "ollama_client", None) is not None:
-                from graphrag_sdk.models.ollama import OllamaGenerativeModel
-                extract_data_model = OllamaGenerativeModel.from_json(model.to_json())
-            else:
-                raise
+        except TypeError:
+            from graphrag_sdk.models.ollama import OllamaGenerativeModel
+            if isinstance(model, OllamaGenerativeModel):
+                extract_data_model = OllamaGenerativeModel.from_json(model.to_json())
+            else:
+                raise
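To illustrate the false-positive risk mentioned above, here is a small sketch. Both classes are hypothetical stand-ins rather than graphrag_sdk types; the point is only that an attribute check accepts any object that happens to expose ollama_client, while isinstance does not.

```python
class OllamaLikeModel:
    """Hypothetical stand-in for OllamaGenerativeModel."""

    def __init__(self):
        self.ollama_client = object()


class UnrelatedModel:
    """Hypothetical model from another provider that exposes the same attribute name."""

    def __init__(self):
        self.ollama_client = object()


def looks_like_ollama(model):
    # Duck-typed check, as in the merged change.
    return getattr(model, "ollama_client", None) is not None


def is_ollama(model):
    # Type check, as suggested in this comment.
    return isinstance(model, OllamaLikeModel)
```

With the attribute check, UnrelatedModel would be routed into the Ollama rebuild path, and a JSON round trip through the wrong class would presumably fail or produce the wrong model type; the type check re-raises instead.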

60-60: Confirm provider support for response_format=json_object.

If model is Ollama-based, ensure this flag is honored or gracefully ignored; otherwise extraction may fail. Add a small integration test that calls with_model() on a remote Ollama model and executes one extraction turn.

I can draft a minimal test exercising remote host preservation and with_model() behavior if helpful.
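A test along these lines might look like the sketch below. It does not import graphrag_sdk or talk to an Ollama server: StubOllamaModel and the module-level with_model function are hypothetical stand-ins, generation_config is modeled as a plain dict (an assumption), and the host URL is a placeholder. A real integration test would swap in the actual classes and run one extraction turn.

```python
import copy
import threading
import unittest


class StubOllamaModel:
    """Hypothetical stand-in; not the real OllamaGenerativeModel."""

    def __init__(self, model_name, host, generation_config=None):
        self.model_name = model_name
        self.host = host
        self.generation_config = dict(generation_config or {})
        self.ollama_client = threading.Lock()  # makes deepcopy raise TypeError

    def to_json(self):
        return {
            "model_name": self.model_name,
            "host": self.host,
            "generation_config": dict(self.generation_config),
        }

    @classmethod
    def from_json(cls, data):
        return cls(data["model_name"], data["host"], data["generation_config"])


def with_model(model):
    """Simplified stand-in for the copy step inside with_model()."""
    try:
        extract = copy.deepcopy(model)
    except TypeError:
        if not isinstance(model, StubOllamaModel):
            raise
        extract = StubOllamaModel.from_json(model.to_json())
    extract.generation_config["response_format"] = {"type": "json_object"}
    return extract


class WithModelFallbackTest(unittest.TestCase):
    def test_remote_host_and_json_format(self):
        model = StubOllamaModel("llama3", "http://remote-host:11434")
        extract = with_model(model)
        # The remote host must survive the JSON round trip.
        self.assertEqual(extract.host, "http://remote-host:11434")
        # The extraction copy asks for JSON output.
        self.assertEqual(
            extract.generation_config["response_format"], {"type": "json_object"}
        )
        # The caller's model is left untouched.
        self.assertNotIn("response_format", model.generation_config)
```

The last assertion checks the reason for copying in the first place: the JSON response format should apply only to the extraction model, not to the instance the caller passed in.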

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6862196 and f951c97.

📒 Files selected for processing (1)
  • graphrag_sdk/model_config.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
graphrag_sdk/model_config.py (1)
graphrag_sdk/models/ollama.py (3)
  • OllamaGenerativeModel (13-104)
  • from_json (88-104)
  • to_json (74-85)
🪛 Ruff (0.12.2)
graphrag_sdk/model_config.py

59-59: Use raise without specifying exception name

Remove exception name

(TRY201)

@galshubeli galshubeli merged commit 13fc340 into FalkorDB:main Sep 17, 2025
2 of 3 checks passed
@gsw945 gsw945 deleted the patch-1 branch September 18, 2025 06:55

Development

Successfully merging this pull request may close these issues.

Ollama Remote Host Issues

2 participants