Draft: [NV TRT RTX EP] Fix onnx checker for constants in subgraph #25579
Conversation
@yuslepukhin Should `onnxruntime/onnxruntime/core/graph/graph.cc` lines 3768 to 3769 (at e753643) behave this way? The API description says the 'data will be copied into the graph', so I would have expected it to be false. See `onnxruntime/include/onnxruntime/core/session/onnxruntime_c_api.h` lines 4890 to 4895 (at e753643).
I am using a Phi4 model generated with the ORT GenAI builder. It has an `If` node for the large and small projections, which seems to be the issue.
With the latest change I am able to load a model with this `If` branch. Still, I would like to understand how to correctly handle initializers in subgraphs.
Yes, the recent change did not comply with the API description. This needs to be addressed.
Will this be fixed by the ORT team? This is blocking Phi on the NV EP.
Yes, I am working on it now. Could you also please share a pointer to the exact model you are using?
It is a Phi4 model from the ORT GenAI builder. I can work on getting a model shared if required.
@gedoensmax please find a moment to test this PR's branch.
### Description
Move the conversion of weights to in-memory values to the end of `Graph::Resolve()`. Modify `Inject` so it copies data into the `TensorProto`, in accordance with the C API docs.

### Motivation and Context
Type and shape inference runs as part of `Resolve()` and is unable to inspect and load initializers that point to OrtValues at that time. We therefore move the TensorProto-to-OrtValue conversion to the end of `Resolve()`.

References: #25579
I already started it on Friday but had some other system issues. Will test this on Monday.
@yuslepukhin I am still seeing the same error on a Phi4 model.
Adding this in graph.cc fixes my issues: https://github.com/microsoft/onnxruntime/blob/381c947894275b66486651208e407e2a3f0af750/onnxruntime/core/graph/graph.cc#L4266
I believe the fix is still not lowered into subgraphs that appear as attributes on an ONNX node.
There are two versions of `::ToGraphProto()`. One of them is `const` and returns a modified copy of the `GraphProto` with all in-memory references gone. The other, non-const one does not do this, because we would lose all the in-memory tags.
Co-authored-by: Dmitri Smirnov <[email protected]>
Ok, I see. I shared the model on SharePoint with you and @skottmckay.
Thank you for the model. Unfortunately, the NVIDIA linker dies at the end of the build on multiple boxes, so I cannot verify it with a TRT build. Please run this PR in your environment. Thanks!
Force-pushed from 9319fa5 to b1546da
@yuslepukhin I have been testing with TRT RTX, not the TRT EP. Maybe @chilo-ms can help with any build issues, or otherwise I am happy to help at any Europe-compatible time. I updated this branch to hold the exact code that I am executing.
@skottmckay Thanks, I got lost in the different PRs. I will close this.
Description
This PR is supposed to fix two issues around the `AddExternalInitializersFromFilesInMemory` API.

The first issue is resolved by adding a check in `Graph::InjectExternalInitializersFromFilesInMemory()`, from what I can tell. The other issue is that parsing of the `If` node fails during `Graph::Resolve` within `NvExecutionProvider::GetSupportedList` here. I tried to fix this by loading the external data in memory into raw data. This did not resolve the error, though.
@chilo-ms @skottmckay, would you be able to help out? My guess is this has something to do with the ORT Graph wrapping.