mlx fixes #1673
Conversation
Reviewer's Guide
Refactors model name normalization by introducing a shared trim_model_name helper and replacing the module-based server invocation with a direct mlx_lm.server call to streamline execution args and normalize output.
Class diagram for trim_model_name refactor and usage
classDiagram
class ModelBase {
}
class ModelInfoBase {
}
class trim_model_name {
+str trim_model_name(str model)
}
class _list_models_from_store {
}
class print_pull_message {
}
_list_models_from_store ..> trim_model_name : uses
print_pull_message ..> trim_model_name : uses
Class diagram for _build_mlx_exec_args change
classDiagram
class ModelBase {
- _build_mlx_exec_args(subcommand, model_path, args, extra)
}
ModelBase : -exec_args changed to use 'mlx_lm.server' instead of 'python -m mlx_lm'
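As a rough sketch of what that note describes (not the actual ramalama code; the body below is an assumption built around the signature shown in the diagram), the change amounts to invoking the `mlx_lm.server` console script directly instead of going through `python -m mlx_lm`:

```python
# Hypothetical sketch only; the real _build_mlx_exec_args in ramalama derives
# further options from `args` and `extra`.
def _build_mlx_exec_args(subcommand, model_path, args, extra=None):
    exec_args = [
        "mlx_lm.server",        # previously: "python", "-m", "mlx_lm", subcommand
        "--model", model_path,  # argv elements are passed verbatim, so no shell quoting
    ]
    if extra:
        exec_args.extend(extra)
    return exec_args
```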
Flow diagram for model name normalization with trim_model_name
flowchart TD
A[Start] --> B{Is model from huggingface?}
B -- Yes --> C[Replace 'huggingface://' with 'hf://']
B -- No --> D
C --> D{Is model from ollama or oci?}
D -- No --> E[Remove ':latest' suffix]
D -- Yes --> F[Do not remove suffix]
E --> G[Return normalized model name]
F --> G
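Read as code, the flow above corresponds to a small helper along these lines (a sketch only; the actual `trim_model_name` in ramalama/model.py may differ in detail):

```python
def trim_model_name(model: str) -> str:
    # Normalize the huggingface:// prefix to the shorter hf:// form.
    if model.startswith("huggingface://"):
        model = model.replace("huggingface://", "hf://", 1)
    # Drop the redundant :latest tag, except for ollama:// and oci:// references.
    if not model.startswith(("ollama://", "oci://")) and model.endswith(":latest"):
        model = model[: -len(":latest")]
    return model


# Example:
#   trim_model_name("huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest")
#   -> "hf://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors"
```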
File-Level Changes
@kush-gupt PTAL
Summary of Changes
Hello @ericcurtin, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request primarily aims to resolve issues related to the execution of mlx_lm commands and enhance the user experience by standardizing how model names are displayed. It refactors existing logic for trimming model names into a dedicated utility function, which is then applied consistently across the codebase for cleaner output.
Highlights
- MLX Execution Path Fix: The command used to execute mlx_lm has been updated from `python -m mlx_lm` to directly invoke `mlx_lm.server`. This addresses an issue where the previous command might not be reliably found or preferred on certain systems, ensuring the correct mlx_lm server is launched.
- Model Name Trimming Utility: A new utility function, `trim_model_name`, has been introduced in `ramalama/model.py`. This function centralizes the logic for converting `huggingface://` prefixes to `hf://` and removing the `:latest` suffix from model names, improving consistency and readability.
- Improved Model Name Display: The newly introduced `trim_model_name` function is now used when listing models and in user-facing pull messages. This ensures that model names are presented in a cleaner, more consistent format (e.g., `hf://` instead of `huggingface://` and without the `:latest` suffix), addressing the reported issue of confusing output during downloads.
Hey @ericcurtin - I've reviewed your changes - here's some feedback (see the sketch after this list):
- Avoid using shlex.quote on model_path when building exec_args as a list, since the literal quotes will be passed to the subprocess and may break path resolution.
- Verify that the `mlx_lm.server` entry point is always available in target environments or provide a fallback to `python -m mlx_lm` to prevent invocation errors.
- Consider centralizing the supported URI schemes (e.g. ollama://, oci://) into a shared constant rather than hardcoding them in `trim_model_name` to keep the logic easier to extend.
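A minimal sketch of how the first two points might be addressed, plus the shared-constant idea from the third, assuming hypothetical names (`resolve_mlx_server_cmd`, `build_server_args`, `SUPPORTED_PREFIX_SCHEMES`) that do not exist in the ramalama codebase:

```python
import shutil

# Hypothetical shared constant for the third suggestion: schemes whose :latest
# suffix should be preserved, kept in one place instead of inside trim_model_name.
SUPPORTED_PREFIX_SCHEMES = ("ollama://", "oci://")


def resolve_mlx_server_cmd() -> list[str]:
    # Prefer the mlx_lm.server console script; fall back to the module form the
    # review mentions if the entry point is not on PATH.
    if shutil.which("mlx_lm.server"):
        return ["mlx_lm.server"]
    return ["python", "-m", "mlx_lm"]


def build_server_args(model_path: str) -> list[str]:
    # exec_args is a plain argv list handed to the subprocess without a shell,
    # so every element is passed verbatim; shlex.quote here would embed literal
    # quote characters into the path and break resolution.
    return [*resolve_mlx_server_cmd(), "--model", model_path]
```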
Code Review
The pull request introduces a `trim_model_name` function to handle model name formatting and applies it in several places to ensure consistency. It also fixes an issue where the wrong executable was being called for MLX models.
LGTM, the reason why I wanted to leave python in the cmd was the explicit declaration that we're using that python mlx-lm for it, but it should be fine on PATH
Force-pushed from cc1a7ba to 4c5af96
mlx_lm.server is the only one in my path, at least on my system. Also, it was printing output like this, which doesn't make sense:

Downloading huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...
Trying to pull huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...

Also remove the recommendation to install via `brew install ramalama`, as it skips installing Apple-specific dependencies.

Signed-off-by: Eric Curtin <[email protected]>
LGTM
mlx_lm.server is the only one in my path, at least on my system.
Also, it was printing output like this, which doesn't make sense:
Downloading huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...
Trying to pull huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...
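With `trim_model_name` applied, those messages would presumably read along the lines of `Downloading hf://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors ...`, with the prefix shortened and the `:latest` tag dropped.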
Summary by Sourcery
Normalize model URIs and streamline subcommand invocation.
Bug Fixes:
Enhancements: