Skip to content

Fix listing models #1748

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 25, 2025
Merged

Fix listing models #1748

merged 2 commits into from
Jul 25, 2025

Conversation

engelmi
Copy link
Member

@engelmi engelmi commented Jul 25, 2025

Fixes: #1735

The migration for RefFile to RefJSONFile needs to be checked, and optionally triggered, when listing models. Otherwise no files are being found and the model is listed with 0 Bytes. It also moves the migrating of RefFile to RefJSONFile into a dedicated function so that it can be removed more easily in later releases (when this is no longer needed).

Summary by Sourcery

Fix model listing by migrating legacy RefFile to the new RefJSONFile format during listing, refactor migration logic into dedicated utilities, and clean up duplicate mapping code.

Bug Fixes:

  • Trigger migration of old RefFile records when listing models so files are found and sizes calculated correctly.

Enhancements:

  • Extract migration routines into map_to_refjsonfile and migrate_reffile_to_refjsonfile in the reffile module and apply them in ModelStore.get_ref_file and global_store.list_models.

Chores:

  • Remove duplicate map_ref_file and sanitize_filename definitions from store.py and centralize filename sanitization in common.py.

Copy link
Contributor

sourcery-ai bot commented Jul 25, 2025

Reviewer's Guide

Extract migration logic for converting old RefFile formats into RefJSONFile into a dedicated function and invoke it during get_ref_file and list_models, removing redundant mapping code and relocating filename sanitization.

Sequence diagram for listing models with migration check

sequenceDiagram
    participant GlobalStore
    participant FileSystem
    participant RefFile
    participant RefJSONFile
    participant Migration
    GlobalStore->>FileSystem: List ref files in ref_dir
    loop For each ref_file_name
        GlobalStore->>Migration: migrate_reffile_to_refjsonfile(ref_file_path, snapshot_dir)
        alt Migration returns RefJSONFile
            Migration-->>GlobalStore: RefJSONFile
        else Migration returns None
            GlobalStore->>RefJSONFile: RefJSONFile.from_path(ref_file_path)
        end
        GlobalStore->>RefJSONFile: Access files for model listing
    end
Loading

Class diagram for RefFile and RefJSONFile migration changes

classDiagram
    class RefFile {
        +from_path(path: str)
        +serialize()
        +write_to_file()
        +map_to_refjsonfile(ref_file_path: str, snapshot_directory: str) RefJSONFile
    }
    class RefJSONFile {
        +from_path(path: str)
        +write_to_file()
        hash
        path
        files
    }
    class StoreFile {
        hash
        name
        type
    }
    class StoreFileType
    RefFile --|> RefJSONFile : migration
    RefJSONFile o-- StoreFile : files
    StoreFile --> StoreFileType : type
Loading

Class diagram for ModelStore and migration integration

classDiagram
    class ModelStore {
        refs_directory
        snapshots_directory
        +get_ref_file_path(model_tag: str) str
        +get_ref_file(model_tag: str) RefJSONFile|None
    }
    class RefJSONFile
    class RefFile
    ModelStore --> RefJSONFile : uses
    ModelStore --> RefFile : uses
    ModelStore ..> migrate_reffile_to_refjsonfile : calls
Loading

File-Level Changes

Change Details Files
Introduce dedicated migration routine for RefFile to RefJSONFile conversion
  • Add RefFile.map_to_refjsonfile helper to build a RefJSONFile from an old RefFile
  • Implement migrate_reffile_to_refjsonfile to detect, convert, write, and clean up old ref files
  • Import necessary modules (os, Path, Optional, generate_sha256, sanitize_filename, logger)
ramalama/model_store/reffile.py
Remove inline mapping and sanitize_filename from store.py
  • Delete sanitize_filename and map_ref_file definitions
  • Refactor ModelStore.get_ref_file to call migrate_reffile_to_refjsonfile
  • Simplify existence check and fallback logic
ramalama/model_store/store.py
Trigger migration during model listing and adjust file iteration
  • Replace direct RefFile.from_path calls with migrate_reffile_to_refjsonfile
  • Fallback to RefJSONFile.from_path if no migration occurred
  • Iterate over ref_file.files and use snapshot_file.name when constructing paths
ramalama/model_store/global_store.py
Relocate filename sanitization to common module
  • Add sanitize_filename implementation to common.py
  • Remove duplicate sanitize_filename from store.py
ramalama/common.py

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @engelmi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where models were incorrectly listed with zero bytes due to a missing migration step for reference files. It introduces a robust mechanism to detect and automatically migrate older RefFile formats to the new RefJSONFile format when models are being listed or retrieved. This ensures proper model file accounting and prepares for future removal of the old format by centralizing the migration logic into a dedicated function.

Highlights

  • RefFile Migration Logic Centralization: The logic for migrating RefFile to RefJSONFile has been extracted into a dedicated, reusable function migrate_reffile_to_refjsonfile within reffile.py. This function now handles the conversion of old reference file formats to the new JSON-based format, including determining file types and blob hashes.
  • Automatic Migration during Model Listing: The list_models function in global_store.py now proactively checks for and triggers the RefFile to RefJSONFile migration when iterating through reference files. This resolves the issue where models were being listed with incorrect sizes (0 bytes) because their associated reference files were in an outdated format.
  • Filename Sanitization Utility: A new sanitize_filename utility function has been added to common.py. This function replaces colons with hyphens in filenames, improving consistency and preventing potential issues with file paths.
  • Consistent RefJSONFile Adoption: The codebase now consistently uses RefJSONFile for managing model references. Backward compatibility is ensured by the new migration routine, which is invoked both during model listing and when retrieving individual reference files.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @engelmi - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `ramalama/model_store/reffile.py:104` </location>
<code_context>
+
+        def determine_blob_hash(filename: str) -> str:
+            blob_path = Path(os.path.join(snapshot_directory, sanitize_filename(ref_file.hash), filename)).resolve()
+            if not os.path.exists(blob_path):
+                return generate_sha256(filename)
+            return blob_path.stem
+
</code_context>

<issue_to_address>
Returning generate_sha256(filename) when the file does not exist may be misleading.

Returning a hash for a non-existent file may cause confusion or downstream errors. Consider raising an error or returning None instead.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@engelmi
Copy link
Member Author

engelmi commented Jul 25, 2025

@BobbyRadford I think this should fix your opened issue #1748. Could you verify it?

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes an issue with listing models by ensuring that legacy RefFile formats are migrated to the new RefJSONFile format on the fly. The changes include refactoring the migration logic into dedicated utility functions, which improves code organization and maintainability.

My review focuses on the new migration logic in ramalama/model_store/reffile.py. I've identified a potential data corruption issue where a hash is generated from a filename for missing files, and I've suggested a safer alternative. I also have a couple of suggestions to improve code quality and robustness by using more specific exception handling and adhering to Python conventions.

engelmi added 2 commits July 25, 2025 11:55
Fixes: containers#1735

The migration for RefFile to RefJSONFile needs to be checked, and
optionally triggered, when listing models. Otherwise no files are
being found and the model is listed with 0 Bytes.

Signed-off-by: Michael Engel <[email protected]>
@engelmi engelmi force-pushed the fix-listing-models branch from 12e8e06 to d48e404 Compare July 25, 2025 09:55
@rhatdan
Copy link
Member

rhatdan commented Jul 25, 2025

LGTM

@rhatdan rhatdan merged commit 4af2ba3 into containers:main Jul 25, 2025
25 of 61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Running with MLX runtime cannot find an existing model
2 participants