Inspect add safetensor support #1666
Conversation
Reviewer's Guide

This PR refactors and centralizes common inspection utilities, adds full safetensor format support (parsing and metadata display), strengthens error handling around snapshot creation, and improves GGUF parser warnings.

Sequence diagram for model inspection with safetensor support:

```mermaid
sequenceDiagram
    actor User
    participant Model as Model (inspect)
    participant SafetensorInfoParser
    participant SafetensorModelInfo
    User->>Model: inspect(args)
    Model->>SafetensorInfoParser: is_model_safetensor(model_name)
    alt is safetensor
        Model->>SafetensorInfoParser: parse(model_name, registry, path)
        SafetensorInfoParser->>SafetensorModelInfo: create instance
        Model->>User: print(SafetensorModelInfo.serialize(...))
    else not safetensor
        Model->>Model: (other inspection logic)
    end
```
Class diagram for new and refactored model inspection types:

```mermaid
classDiagram
    class ModelInfoBase {
        +str Name
        +str Registry
        +str Path
        +serialize(json: bool = False) str
        +to_json() str
    }
    class GGUFModelInfo {
        +serialize(json: bool = False, all: bool = False) str
        +to_json(all: bool = False) str
    }
    class SafetensorModelInfo {
        +dict Header
        +serialize(json: bool = False, all: bool = False) str
        +to_json(all: bool = False) str
    }
    class Tensor {
        +str name
        +int n_dimensions
        +list[int] dimensions
        +str type
        +int offset
    }
    ModelInfoBase <|-- GGUFModelInfo
    ModelInfoBase <|-- SafetensorModelInfo
```
Class diagram for new SafetensorInfoParser and error handling:

```mermaid
classDiagram
    class SafetensorInfoParser {
        +is_model_safetensor(model_name: str) bool
        +parse(model_name: str, model_registry: str, model_path: str) SafetensorModelInfo
    }
    class ParseError {
    }
    SafetensorInfoParser ..> SafetensorModelInfo : creates
    SafetensorInfoParser ..> ParseError : raises
```
Summary of Changes
Hello @engelmi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly extends the `inspect` command's capabilities by introducing comprehensive support for Safetensors models. It includes a substantial refactoring of the `model_inspect` module to create a more modular and extensible architecture for handling different model formats, centralizing common information structures and parsing utilities. Additionally, it improves the robustness of model snapshot creation processes across various model sources.
Highlights
- Safetensor Model Inspection: Implemented comprehensive support for inspecting Safetensors models, enabling the `inspect` command to parse and display detailed information about `.safetensor` files.
- Model Information Refactoring: Centralized common model information base classes (`ModelInfoBase`, `Tensor`) and utility functions (`adjust_new_line`, `get_terminal_width`) into a new `ramalama/model_inspect/base_info.py` module for improved modularity and extensibility.
- Dedicated Parsing Error: Introduced a specific `ParseError` exception in `ramalama/model_inspect/error.py` to standardize error handling for model parsing failures.
- Improved Snapshot Creation Robustness: Enhanced error handling during model snapshot creation in `ramalama/ollama.py` and `ramalama/url.py` to provide more informative messages upon failure.
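The safetensors layout the new parser reads is documented by Hugging Face: an 8-byte little-endian unsigned integer giving the byte length of a JSON header, followed by the header itself, then the raw tensor data. A minimal sketch of reading that header (the function name is hypothetical, not the PR's actual API):

```python
import json
import struct


def read_safetensors_header(model_path: str) -> dict:
    """Read the JSON header of a .safetensors file.

    The file starts with an 8-byte little-endian length prefix,
    followed by that many bytes of JSON mapping tensor names to
    dtype/shape/data_offsets (plus an optional "__metadata__" key).
    """
    with open(model_path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))
```

This is enough to drive the metadata display: the returned dict contains one entry per tensor, which maps naturally onto the `Header` field of `SafetensorModelInfo`.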
Hey @engelmi - I've reviewed your changes and found some issues that need to be addressed.
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `ramalama/model_inspect/base_info.py:13` </location>
<code_context>

```python
    return 80


def adjust_new_line(line: str) -> str:
    filler = "..."
    max_width = get_terminal_width()
```

</code_context>
<issue_to_address>
The logic for truncating lines in adjust_new_line may result in an empty string.
If a line is too long and lacks a newline, the function currently returns an empty string, which may cause output to be lost. Ensure the function always returns the truncated line with the filler.
</issue_to_address>
### Comment 2
<location> `ramalama/model_inspect/base_info.py:50` </location>
<code_context>

```python
        ret = ret + adjust_new_line(f" Registry: {self.Registry}")
        return ret

    def to_json(self) -> str:
        return json.dumps(self, sort_keys=True, indent=4)
```

</code_context>
<issue_to_address>
Serializing a dataclass instance directly with json.dumps may not produce the intended output.
Use asdict(self) or self.__dict__ to ensure the dataclass fields are properly serialized.
</issue_to_address>
### Comment 3
<location> `ramalama/model_inspect/safetensor_info.py:40` </location>
<code_context>

```python

        return ret

    def to_json(self, all: bool = False) -> str:
        if all:
            return json.dumps(self, default=lambda o: o.__dict__, sort_keys=True, indent=4)
```

</code_context>
<issue_to_address>
Serializing self with a lambda in json.dumps may expose internal attributes and is inconsistent with the base class.
Explicitly build the dictionary to serialize, as in the else branch, to avoid unintentionally exposing internal or base class attributes.
</issue_to_address>
Code Review
This PR adds support for inspecting safetensors models, refactors model inspection into a common base module, and improves error handling for model pulls and parsing. I've identified a couple of bugs in the new `base_info.py` module, a robustness issue in the `safetensor_info.py` file, and a potential improvement for exception messages in `ramalama/ollama.py`. Please see my comments for details and suggestions.
Force-pushed a28419b to 8e9b2e4
Please add tests.
Force-pushed 8e9b2e4 to f9f6a78
I had trouble finding a small safetensors model for a new system test, but I could find this one:
Force-pushed 9ff96f1 to 7041746
ramalama/url.py (Outdated)

```diff
-        self.model_store.new_snapshot(tag, snapshot_hash, files)
+        try:
+            self.model_store.new_snapshot(tag, snapshot_hash, files)
+        except Exception as ex:
```
Should we be catching all exceptions or a specific exception?
Is it safe to return a snapshot_file_path if creation of a new_snapshot fails?
Should be ok, I think. For HF-like repos we also catch all errors. The cleanup on failure is taken care of inside the `new_snapshot`. The return value of `pull` is actually never used; I am removing these (replacing them with simple early returns) in #1643.
Would be better to allow it to bubble up to the top, IMO. You should only catch specific exceptions, and only if you can do something to address the exception in the handler, e.g. retry.
@olliewalsh Fair point. I moved handling the `ParseError` to the top-level `pull_cli` function. The handling task here is essentially to "beautify" the error. PTAL
/packit build
Force-pushed c2618c0 to daf43a4
ramalama/cli.py (Outdated)

```python
    try:
        model.pull(args)
    except ParseError as ex:
        perror(f"Failed to pull '{model.model_name}:{model.model_tag}' : {ex}")
```
We still want the error raised, so the command exits with a non-zero status.
Oops, you are right. Moved the except block to the cli's main function and used `EINVAL` as the exit code... it doesn't really fit, but I couldn't find a better match.
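A minimal sketch of that top-level pattern (the `pull_cli` name follows the discussion above; the bodies and the `main` signature are assumptions, not the PR's actual code):

```python
import errno
import sys


class ParseError(Exception):
    """Raised when a model file cannot be parsed."""


def pull_cli(args):
    # Placeholder for the real pull logic; raises ParseError on bad input.
    raise ParseError("not a valid safetensor header")


def main(args=None) -> int:
    try:
        pull_cli(args)
    except ParseError as ex:
        # Beautify the error once, at the top level, instead of
        # catching it deep inside the pull path.
        print(f"Error: {ex}", file=sys.stderr)
        return errno.EINVAL  # "Invalid argument": closest standard errno
    return 0


if __name__ == "__main__":
    sys.exit(main())
```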
```diff
 class GGUFInfoParser:
     @staticmethod
     def is_model_gguf(model_path: str) -> bool:
         try:
             with open(model_path, "rb") as model_file:
                 magic_number = GGUFInfoParser.read_string(model_file, GGUFEndian.LITTLE, 4)
                 return magic_number == GGUFModelInfo.MAGIC_NUMBER
-        except Exception as ex:
-            console.warning(f"Failed to read model '{model_path}': {ex}")
+        except Exception:
             return False
```
Can you at least add a `logger.debug(f"Caught exception: {ex}")`?
Fair point. Exchanged the `console.warning` with a `logger.debug`. Changed ✔️

Background: One reason for removing the warning (or lowering the log level) is that applying `is_model_gguf` to a safetensor will result in an exception, and since this is part of the usual flow (I can't know in advance what model type we are checking), it shouldn't log a warning.
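The resulting pattern can be sketched as follows (the four-byte ASCII magic `GGUF` matches the GGUF format; the simplified byte read stands in for the parser's `read_string` helper):

```python
import logging

logger = logging.getLogger(__name__)

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def is_model_gguf(model_path: str) -> bool:
    """Check the GGUF magic number.

    Failures are logged at debug level only, because probing
    non-GGUF files (e.g. safetensors) is part of the normal flow
    and should not emit warnings.
    """
    try:
        with open(model_path, "rb") as f:
            return f.read(4) == GGUF_MAGIC
    except Exception as ex:
        logger.debug(f"Caught exception: {ex}")
        return False
```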
Signed-off-by: Michael Engel <[email protected]>
Relates to: containers#1663
Signed-off-by: Michael Engel <[email protected]>
Force-pushed daf43a4 to 412d561
LGTM
This PR adds support to `inspect` for safetensors models, based on the Hugging Face documentation.

Unfortunately, adding system tests for safetensors isn't really possible, as I haven't found a small model yet (the smallest I've seen is ~1GB); this wouldn't be very practical in the CI.
Summary by Sourcery
Add safetensor support to the inspect command and refactor model inspection into a common base module, while improving error handling for model pulls and parsing.