-
Notifications
You must be signed in to change notification settings - Fork 234
fix(gguf_parser): fix memoryerror exception when loading non-native models #1452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(gguf_parser): fix memoryerror exception when loading non-native models #1452
Conversation
Reviewer's GuideThis PR enhances the GGUFInfoParser to correctly handle little- and big-endian models by threading an explicit endianness parameter through string and number reads and adds checks to prevent MemoryErrors on truncated inputs. Sequence Diagram for GGUFInfoParser.read_string with dynamic length and EOF checksequenceDiagram
participant C as Caller
participant RS_Method as "GGUFInfoParser.read_string()"
participant GGUFInfoParser_Static as "GGUFInfoParser (static methods)"
participant M as "model: io.BufferedReader"
C->>RS_Method: read_string(model, model_endianness, length=-1)
RS_Method->>GGUFInfoParser_Static: read_number(model, GGUFValueType.UINT64, model_endianness)
GGUFInfoParser_Static-->>RS_Method: determined_length
RS_Method->>M: read(determined_length)
M-->>RS_Method: raw_bytes
alt len(raw_bytes) < determined_length
RS_Method-->>C: return ParseError("Unexpected EOF...")
else Bytes read successfully
RS_Method->>RS_Method: raw_bytes.decode("utf-8")
RS_Method-->>C: return decoded_string
end
Sequence Diagram: Endianness Parameter Propagation in GGUFInfoParser.parse MethodsequenceDiagram
participant Client
participant P as "GGUFInfoParser.parse()"
participant RS as "GGUFInfoParser.read_string()"
participant RN as "GGUFInfoParser.read_number()"
participant RV as "GGUFInfoParser.read_value()"
participant M as "model: io.BufferedReader"
Client->>P: parse(model_path, ...)
P->>M: open(model_path, "rb")
M-->>P: model_file_handle
P->>RS: read_string(model, model_endianness, 4)
RS-->>P: magic_number
P->>P: Validate magic_number
opt Invalid magic_number
P-->>Client: Raise ParseError
end
P->>RN: read_number(model, GGUFValueType.UINT32, model_endianness)
RN-->>P: version
P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
RN-->>P: tensor_count
P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
RN-->>P: metadata_kv_count
loop metadata_kv_count times
P->>RS: read_string(model, model_endianness)
RS-->>P: key
P->>P: read_value_type(model, model_endianness)
P-->>P: value_type
P->>RV: read_value(model, value_type, model_endianness)
RV-->>P: value
P->>P: metadata[key] = value
end
loop tensor_count times
P->>RS: read_string(model, model_endianness)
RS-->>P: tensor_name
P->>RN: read_number(model, GGUFValueType.UINT32, model_endianness)
RN-->>P: n_dimensions
loop n_dimensions times
P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
RN-->>P: dimension_value
end
P->>RN: read_number(model, GGUFValueType.UINT32, model_endianness)
RN-->>P: tensor_type
P->>RN: read_number(model, GGUFValueType.UINT64, model_endianness)
RN-->>P: offset
P->>P: Create Tensor object
end
P-->>Client: GGUFModelInfo
Updated Class Diagram for GGUFInfoParserclassDiagram
class GGUFInfoParser {
+is_model_gguf(model_path: str) bool
+read_string(model: io.BufferedReader, model_endianness: GGUFEndian = GGUFEndian.LITTLE, length: int = -1) str
+read_number(model: io.BufferedReader, value_type: GGUFValueType, model_endianness: GGUFEndian) float
+read_value(model: io.BufferedReader, value_type: GGUFValueType, model_endianness: GGUFEndian) object
+parse(model_name: str, model_registry: str, model_path: str) GGUFModelInfo
}
class GGUFEndian {
<<enumeration>>
LITTLE
BIG
}
class GGUFValueType {
<<enumeration>>
UINT32
UINT64
BOOL
STRING
ARRAY
}
class ParseError {
<<exception>>
message: str
}
GGUFInfoParser ..> GGUFEndian : uses
GGUFInfoParser ..> GGUFValueType : uses
GGUFInfoParser ..> ParseError : creates
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @taronaeo - I've reviewed your changes - here's some feedback:
- read_string currently returns a ParseError object on EOF, which breaks the expected str return type—raise a ParseError exception instead of returning it.
- Magic‐number detection still assumes little‐endian; consider reading the raw 4 bytes and comparing directly so big‐endian GGUF files are correctly identified.
- The read_string signature is confusing with positional (endianness, length) parameters—make model_endianness keyword-only or reorder parameters for clarity.
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
6c85ca5
to
b6cc604
Compare
LGTM |
Signed-off-by: Aaron Teo <[email protected]> fix(gguf_parser): missed some calls Signed-off-by: Aaron Teo <[email protected]> fix(gguf_parser): typo `return` vs `raise` Signed-off-by: Aaron Teo <[email protected]> fix(gguf_parser): missing staticmethod declarations Signed-off-by: Aaron Teo <[email protected]>
b6cc604
to
1b32a09
Compare
Just tested on IBM z15 mainframe and it works as intended. If all CI passes and no other changes are required, feel free to merge into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
When trying to run non-native endian model files, RamaLama would use the
GGUFInfoParser.read_string
method to parse the model file and eventually hit aMemoryError
. This is PR takes into consideration Little and Big Endian model endianness and parses them accordingly.Tested and verified to work on x86 without affecting anything. I will test it on
s390x
as well tomorrow when the z15 R&D mainframe comes online.Summary by Sourcery
Enable correct parsing of GGUF model files with varying endianness and prevent MemoryError when loading non-native endian models.
Bug Fixes:
Enhancements: