fix(gguf_parser): fix memoryerror exception when loading non-native models #1452

taronaeo · 2025-05-29T19:11:58Z

When trying to run non-native endian model files, RamaLama would use the GGUFInfoParser.read_string method to parse the model file and eventually hit a MemoryError. This PR takes the model's endianness (little or big) into account and parses the file accordingly.
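A minimal illustration of how a byte-order mismatch can lead to an oversized read, assuming the string length prefix is a 64-bit unsigned integer (the reviewer's guide below reads it as UINT64). The value 5 is an arbitrary example, and this is a sketch of the likely mechanism rather than the parser's actual code:

```python
import struct

# A string length of 5, serialized as a big-endian uint64.
length_bytes = struct.pack(">Q", 5)

# Decoding with the matching byte order recovers the intended length.
correct = struct.unpack(">Q", length_bytes)[0]

# Decoding with the opposite byte order moves the 0x05 byte to the most
# significant position, giving 5 * 2**56.
wrong = struct.unpack("<Q", length_bytes)[0]

print(correct)
print(wrong)
```

Passing a length of that magnitude to a buffered file read is the kind of request that can surface as a MemoryError.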

Tested and verified to work on x86 without affecting anything. I will test it on s390x as well tomorrow when the z15 R&D mainframe comes online.

Summary by Sourcery

Enable correct parsing of GGUF model files with varying endianness and prevent MemoryError when loading non-native endian models.

Bug Fixes:

Fix MemoryError when loading GGUF models with non-native endianness by properly handling big-endian data parsing.

Enhancements:

Extend read_string to accept an endianness parameter and delegate length reading to read_number.
Add EOF check in read_string to gracefully handle incomplete reads.
Propagate the model endianness parameter to all string-reading calls in parsing routines.
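The enhancements above can be sketched roughly as follows. This is not the code from `ramalama/gguf_parser.py`: `Endian`, `ParseError`, and `read_uint64` are hypothetical stand-ins defined here so the example is self-contained, and the sketch raises on truncated input, as the review feedback later in this thread suggests.

```python
import io
import struct
from enum import Enum


class Endian(Enum):
    """Hypothetical stand-in for GGUFEndian; values are struct prefixes."""
    LITTLE = "<"
    BIG = ">"


class ParseError(Exception):
    """Hypothetical stand-in for the parser's error type."""


def read_uint64(model, endianness: Endian) -> int:
    # Read exactly 8 bytes and decode them with the requested byte order.
    raw = model.read(8)
    if len(raw) < 8:
        raise ParseError("Unexpected EOF while reading uint64")
    return struct.unpack(f"{endianness.value}Q", raw)[0]


def read_string(model, endianness: Endian = Endian.LITTLE, length: int = -1) -> str:
    # With length == -1 the string is length-prefixed; the prefix must be
    # decoded with the model's byte order.
    if length == -1:
        length = read_uint64(model, endianness)
    raw = model.read(length)
    # EOF check: fewer bytes than requested means the input is truncated.
    if len(raw) < length:
        raise ParseError(f"Unexpected EOF: wanted {length} bytes, got {len(raw)}")
    return raw.decode("utf-8")


payload = struct.pack(">Q", 5) + b"hello"
result = read_string(io.BytesIO(payload), Endian.BIG)
print(result)
```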

sourcery-ai · 2025-05-29T19:12:03Z

Reviewer's Guide

This PR enhances the GGUFInfoParser to correctly handle little- and big-endian models by threading an explicit endianness parameter through string and number reads and adds checks to prevent MemoryErrors on truncated inputs.

Sequence Diagram for GGUFInfoParser.read_string with dynamic length and EOF check

sequenceDiagram
    participant C as Caller
    participant RS_Method as "GGUFInfoParser.read_string()"
    participant GGUFInfoParser_Static as "GGUFInfoParser (static methods)"
    participant M as "model: io.BufferedReader"

    C->>RS_Method: read_string(model, model_endianness, length=-1)
    RS_Method->>GGUFInfoParser_Static: read_number(model, GGUFValueType.UINT64, model_endianness)
    GGUFInfoParser_Static-->>RS_Method: determined_length
    RS_Method->>M: read(determined_length)
    M-->>RS_Method: raw_bytes
    alt len(raw_bytes) < determined_length
        RS_Method-->>C: return ParseError("Unexpected EOF...")
    else Bytes read successfully
        RS_Method->>RS_Method: raw_bytes.decode("utf-8")
        RS_Method-->>C: return decoded_string
    end

Sequence Diagram: Endianness Parameter Propagation in GGUFInfoParser.parse Method

sequenceDiagram
    participant Client
    participant P as "GGUFInfoParser.parse()"
    participant RS as "GGUFInfoParser.read_string()"
    participant RN as "GGUFInfoParser.read_number()"
    participant RV as "GGUFInfoParser.read_value()"
    participant M as "model: io.BufferedReader"

    Client->>P: parse(model_path, ...)
    P->>M: open(model_path, "rb")
    M-->>P: model_file_handle
    P->>RS: read_string(model, model_endianness, 4)
    RS-->>P: magic_number
    P->>P: Validate magic_number
    opt Invalid magic_number
        P-->>Client: Raise ParseError
    end

    P->>RN: read_number(model, GGUFValueType.UINT32, model_endianness)
    RN-->>P: version
    P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
    RN-->>P: tensor_count
    P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
    RN-->>P: metadata_kv_count

    loop metadata_kv_count times
        P->>RS: read_string(model, model_endianness)
        RS-->>P: key
        P->>P: read_value_type(model, model_endianness)
        P-->>P: value_type
        P->>RV: read_value(model, value_type, model_endianness)
        RV-->>P: value
        P->>P: metadata[key] = value
    end

    loop tensor_count times
        P->>RS: read_string(model, model_endianness)
        RS-->>P: tensor_name
        P->>RN: read_number(model, GGUFValueType.UINT32, model_endianness)
        RN-->>P: n_dimensions
        loop n_dimensions times
            P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
            RN-->>P: dimension_value
        end
        P->>RN: read_number(model, GGUFValueType.UINT32, model_endianness)
        RN-->>P: tensor_type
        P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
        RN-->>P: offset
        P->>P: Create Tensor object
    end
    P-->>Client: GGUFModelInfo
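The first steps of the parse sequence above (magic, version, tensor_count, metadata_kv_count) could look like the sketch below. It assumes the magic is the four ASCII bytes `GGUF` and that the caller already knows the byte order; `read_header` and the returned dictionary are hypothetical names for illustration, not RamaLama's API.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # assumption: magic compared as raw ASCII bytes


def read_header(model, byte_order: str) -> dict:
    """Read the fixed header fields in the order shown in the diagram.

    byte_order is a struct prefix: "<" for little-endian, ">" for big-endian.
    """
    magic = model.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"Invalid GGUF magic: {magic!r}")
    # uint32 version + uint64 tensor_count + uint64 metadata_kv_count.
    # With an explicit "<" or ">" prefix, struct adds no padding: 20 bytes.
    fixed = model.read(20)
    if len(fixed) < 20:
        raise ValueError("Unexpected EOF in GGUF header")
    version, tensor_count, kv_count = struct.unpack(f"{byte_order}IQQ", fixed)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Synthetic big-endian header: version 3, 2 tensors, 7 metadata entries.
header = GGUF_MAGIC + struct.pack(">IQQ", 3, 2, 7)
info = read_header(io.BytesIO(header), ">")
print(info)
```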

Updated Class Diagram for GGUFInfoParser

classDiagram
    class GGUFInfoParser {
        +is_model_gguf(model_path: str) bool
        +read_string(model: io.BufferedReader, model_endianness: GGUFEndian = GGUFEndian.LITTLE, length: int = -1) str
        +read_number(model: io.BufferedReader, value_type: GGUFValueType, model_endianness: GGUFEndian) float
        +read_value(model: io.BufferedReader, value_type: GGUFValueType, model_endianness: GGUFEndian) object
        +parse(model_name: str, model_registry: str, model_path: str) GGUFModelInfo
    }
    class GGUFEndian {
        <<enumeration>>
        LITTLE
        BIG
    }
    class GGUFValueType {
        <<enumeration>>
        UINT32
        UINT64
        BOOL
        STRING
        ARRAY
    }
    class ParseError {
        <<exception>>
        message: str
    }
    GGUFInfoParser ..> GGUFEndian : uses
    GGUFInfoParser ..> GGUFValueType : uses
    GGUFInfoParser ..> ParseError : creates

File-Level Changes

Change	Details	Files
Add explicit endianness support to string parsing	Extend read_string signature to accept model_endianness; use read_number with model_endianness to determine string length; propagate model_endianness to all read_string calls in is_model_gguf, parse, read_value, metadata, and tensor loops	`ramalama/gguf_parser.py`
Handle unexpected EOF when reading strings to prevent MemoryError	Check raw byte count against expected length after reading; return a ParseError on truncated input instead of letting decode or struct unpacking fail	`ramalama/gguf_parser.py`

sourcery-ai

Hey @taronaeo - I've reviewed your changes - here's some feedback:

read_string currently returns a ParseError object on EOF, which breaks the expected str return type—raise a ParseError exception instead of returning it.
Magic‐number detection still assumes little‐endian; consider reading the raw 4 bytes and comparing directly so big‐endian GGUF files are correctly identified.
The read_string signature is confusing with positional (endianness, length) parameters—make model_endianness keyword-only or reorder parameters for clarity.
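One way to act on the first and third points is sketched below: the error is raised rather than returned, and the byte order becomes a keyword-only argument so it cannot be confused with `length`. The signature and the use of plain `"little"`/`"big"` strings are illustrative assumptions, not the signature that was merged.

```python
import io


class ParseError(Exception):
    """Hypothetical stand-in for the parser's error type."""


def read_string(model, length: int = -1, *, model_endianness: str = "little") -> str:
    # Everything after the bare "*" must be passed by keyword.
    if length == -1:
        prefix = model.read(8)
        if len(prefix) < 8:
            raise ParseError("Unexpected EOF while reading string length")
        length = int.from_bytes(prefix, model_endianness)
    raw = model.read(length)
    if len(raw) < length:
        # Raising keeps the declared str return type honest for callers.
        raise ParseError("Unexpected EOF while reading string")
    return raw.decode("utf-8")


data = (3).to_bytes(8, "big") + b"abc"
value = read_string(io.BytesIO(data), model_endianness="big")
print(value)

# Passing the byte order positionally is rejected by Python itself.
try:
    read_string(io.BytesIO(data), 4, "big")
    positional_rejected = False
except TypeError:
    positional_rejected = True
print(positional_rejected)
```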

Here's what I looked at during the review

🟡 General issues: 2 issues found
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

ramalama/gguf_parser.py

rhatdan · 2025-05-29T19:20:48Z

LGTM

Signed-off-by: Aaron Teo <[email protected]>
fix(gguf_parser): missed some calls
Signed-off-by: Aaron Teo <[email protected]>
fix(gguf_parser): typo `return` vs `raise`
Signed-off-by: Aaron Teo <[email protected]>
fix(gguf_parser): missing staticmethod declarations
Signed-off-by: Aaron Teo <[email protected]>

taronaeo · 2025-05-30T02:48:46Z

Just tested on IBM z15 mainframe and it works as intended. If all CI passes and no other changes are required, feel free to merge into main :)

engelmi

LGTM

taronaeo requested review from rhatdan, ericcurtin, bmahabirbu, maxamillion, swarajpande5, jhjaggars, cgruver, slp and engelmi as code owners May 29, 2025 19:11

sourcery-ai bot approved these changes May 29, 2025

taronaeo force-pushed the fix/gguf-parser-string-endian branch from 6c85ca5 to b6cc604 Compare May 29, 2025 19:15

taronaeo force-pushed the fix/gguf-parser-string-endian branch from b6cc604 to 1b32a09 Compare May 30, 2025 02:05

engelmi approved these changes May 30, 2025

rhatdan merged commit 27ec0d0 into containers:main May 30, 2025
15 checks passed
