Openvino/ep weight sharing #548
  return "";
} else {
  auto input_type = graph_viewer.GetInputs()[0]->TypeAsProto()->tensor_type().elem_type();
  if (session_context_.precision == "ACCURACY" &&
Does this take care of the AUTO:GPU,CPU case for Adobe?
Is ACCURACY only for GPU? What about NPU?
ACCURACY is currently enabled for GPU. For NPU, OVEP enables only F16 precision.
The change here is to get the subgraph's input precision.
if (session_context_.precision.find("ACCURACY") != std::string::npos &&
The config for OV ACCURACY mode remains unchanged, and it takes care of the Adobe case.
}
subgraph_context_.subgraph_name = fused_node.Name();

ptr_stream_t model_stream;
Can we rename this from ptr_stream_t to something more specific, like model_stream_t?
std::filesystem::path weight_filename = session_context_.onnx_model_path_name.parent_path();
if (sw.external_weight_filename.empty() && !sw.metadata.empty()) {
  // Reasonable assumption that all metadata entries have the same external file location
  sw.external_weight_filename = sw.metadata.begin()->second.location;
Is this assumption for a single model or for two separate models?
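One way to make the "same external file location" assumption explicit is to verify it before taking the first entry. The following is only a sketch: `MetadataValue`, `MetadataMap`, and `CommonWeightLocation` are hypothetical names, and only the `location` field is taken from the diff above.

```cpp
#include <algorithm>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the metadata value type; only `location`
// appears in the diff, the real type carries more fields.
struct MetadataValue {
  std::string location;
};

using MetadataMap = std::map<std::string, MetadataValue>;

// Returns the shared external file location, or std::nullopt when the map
// is empty or the entries disagree about the location.
std::optional<std::string> CommonWeightLocation(const MetadataMap& metadata) {
  if (metadata.empty()) return std::nullopt;
  const std::string& first = metadata.begin()->second.location;
  const bool all_same = std::all_of(
      metadata.begin(), metadata.end(),
      [&first](const auto& entry) { return entry.second.location == first; });
  if (!all_same) return std::nullopt;
  return first;
}
```

A caller could then fail early (or fall back to per-entry loading) when the function returns no value, instead of silently using the first entry.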
weight_filename /= sw.external_weight_filename;
std::ifstream weight_file(weight_filename);

if (weight_file) {
Jatin to do a Coverity check.
std::filesystem::path weight_filename = session_context_.onnx_model_path_name.parent_path();
if (sw.external_weight_filename.empty() && !sw.metadata.empty()) {
  // Reasonable assumption that all metadata entries have the same external file location
  sw.external_weight_filename = sw.metadata.begin()->second.location;
We are accessing the external weight filename without any check that the path is valid.
  "To export this model, set disable_dynamic_shapes to False";
  ORT_THROW(exception_str);
}
Wasn't this an MSFT requirement, to ensure that the cache file dir and the EP context model blob file path are mapped?
@preetha-intel, is there a functionality break here?
This check ensures that either the epctx model or the OV cache files are generated, not both.
if (!session_context_.cache_dir.empty() && !session_context_.so_context_enable) {
ep.context_enable takes precedence.
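The precedence described in this reply can be pictured as a small decision function. This is a minimal sketch, not the OVEP code: `SessionContext` here is a hypothetical two-field subset (the field names follow the diff), and `CompiledArtifact` and `SelectArtifact` are illustrative names.

```cpp
#include <string>

// Hypothetical subset of the session context; field names follow the diff.
struct SessionContext {
  std::string cache_dir;
  bool so_context_enable = false;
};

enum class CompiledArtifact { kNone, kOvCache, kEpContextModel };

// ep.context_enable wins: OV cache files are produced only when a cache
// directory is set and EP context export is off.
CompiledArtifact SelectArtifact(const SessionContext& ctx) {
  if (ctx.so_context_enable) return CompiledArtifact::kEpContextModel;
  if (!ctx.cache_dir.empty()) return CompiledArtifact::kOvCache;
  return CompiledArtifact::kNone;
}
```

Under this reading, setting both `cache_dir` and `ep.context_enable` yields only the EP context model, which matches the condition quoted in the reply.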
if (blob_filename.empty()) {
  blob_filename = session_context_.onnx_model_path_name;
}
const auto name = graph_body_viewer.ModelPath().stem().string() + "_" + subgraph_context_.subgraph_name;
There are no checks on the path and its parent path to make sure we are not pointing to restricted locations. Please talk to Ankit and Vishnu about file path restrictions.
#include <fstream>
#include <utility>

#include <filesystem>
@saurabhkale17, please run cpplint to ensure this issue does not occur with the MSFT CI pipelines.
}

ov::element::Type GetOpenVINOElementType(ONNX_NAMESPACE::TensorProto_DataType dt) {
  static std::unordered_map<ONNX_NAMESPACE::TensorProto_DataType, ov::element::Type> map{
@saurabhkale17, can you take care of this cpplint issue?
7f2ff4f to a98fc3e
48ee137 to 6d1f1cf
while (!stream.eof()) {
  SharedContext::SharedWeights::Metadata::Key key;
  SharedContext::SharedWeights::Metadata::Value value;
  stream >> key.name;
How do we prevent overflow here?
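One possible answer, sketched under assumptions: drive the loop with the extraction itself instead of `eof()`, and cap each token with `std::setw`, which limits how many characters `operator>>` reads into a `std::string`. The record layout ("name location" pairs), the `Entry` type, and the 4096-character limit are all hypothetical; the real `Key` and `Value` types carry more fields.

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <map>
#include <string>

// Hypothetical, simplified record; not the actual Metadata::Value type.
struct Entry {
  std::string location;
};

constexpr std::size_t kMaxTokenLength = 4096;  // assumed limit, not from the PR

// Reads whitespace-separated "name location" pairs. Returns false on a
// truncated record or an over-long token instead of looping on a failed stream.
bool ReadMetadata(std::istream& stream, std::map<std::string, Entry>& out) {
  const int width = static_cast<int>(kMaxTokenLength + 1);
  std::string name;
  std::string location;
  // setw caps the extraction, so a hostile file cannot grow the string unbounded.
  while (stream >> std::setw(width) >> name) {
    if (name.size() > kMaxTokenLength) return false;
    if (!(stream >> std::setw(width) >> location)) return false;  // record cut short
    if (location.size() > kMaxTokenLength) return false;
    out[name] = Entry{location};
  }
  return stream.eof();  // true only if the loop stopped at end of input
}
```

Testing the extraction result also avoids the usual `while (!stream.eof())` pitfall, where the loop body runs once more after the last successful read and processes a default-constructed record.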
}
if (provider_options_map.find("device_id") != provider_options_map.end()) {
  std::string dev_id = provider_options_map.at("device_id").c_str();
Are we removing old provider options in a way that might cause backward-compatibility issues?
* Move variables from subgraph to session context for model specific properties
* Fix for redundant subgraph creation
* Remove unused variable
Hello, I have raised some basic review comments. Please determine whether they are trivial and raise a JIRA to handle them later.
LGTM. Please take care of the review comments going forward.
pi.device_type = ParseDeviceType(provider_options, "device_type");

if (provider_options.contains("device_id")) {
'device_id' is still retained, with a deprecation warning.
* Rename EP instance context as session_context
* Add support for GetEpContextNodes
* enable config option for ovep weight sharing
* add config option for ovep weight sharing
* Refactor the conditional blocks in OVEP for compilation
* Convert initializers with external data to graph inputs
* create, store and export metadata for ovep weight sharing
* fix error handling in weight sharing
* fix crash issue while setting up inputs for wai model
* pass weight sharing option to OVEP qdq stripping pass
* Aligning OVEP variable names to match the session option value they hold
* Add plumbing for context sharing plus refactoring around option handling
* Store metadata in shared context
* fix: fix provider options
* create ov tensor from meta data and external data
* create ov tensor
* Add support for binding weight as input tensors
* Fix for mapping subgraph to ov compiled network arguments
* Fix for using so_share_ep_contexts without ep.context* flags
* Add remote tensor support for NPU weight sharing
* Use a single ov::Core copy across OVEP
* Decouple provider option cache_dir from session option ep.context_file_path
* Add support for serialization and deserialization of metadata to disk
* Load blobs from relative path stored in ep_cache_context
* Use remote L0 tensors for shared weights
* fix linux ci issues
* fix ci issues
* Fix Windows build failure
* Use ifstream to load weights instead of mmaped file
* Fix for epctx models made up entirely of OVEP epctx nodes
* Limit ov::Core lifetime to that of provider object
* Enforce shared tensors cleanup on shutdown
* Add support for default device type based on project configuration
* fix: Fixed concrete_backend_ pointer double free issue on Linux
* Preetha/weight sharing fix (#545)
* Move variables from subgraph to session context for model specific properties
* Fix for redundant subgraph creation
* Remove unused variable
---------
Co-authored-by: Javier E. Martinez <[email protected]>
Co-authored-by: saurabhkale117 <[email protected]>
Co-authored-by: Preetha Veeramalai <[email protected]>
Co-authored-by: ankitm3k <[email protected]>
Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context * Add support for GetEpContextNodes * enable config option for ovep weight sharing * add config option for ovep weight sharing * Refactor the conditional blocks in OVEP for compilation * Convert initializers with external data to graph inputs * create, store and export metadata for ovep weight sharing * fix error handling in weight sharing * fix crash issue while setting up inputs for wai model * pass weight sharing option to OVEP qdq stripping pass * Aligning OVEP variable names to match the session option value they hold * Add plumbing for context sharing plus refactoring around option handling * Store metadata in shared context * fix: fix provider options * create ov tensor from meta data and external data * create ov tensor * Add support for binding weight as input tensors * Fix for mapping subgraph to ov compiled network arguments * Fix for using so_share_ep_contexts without ep.context* flags * Add remote tensor support for NPU weight sharing * Use a single ov::Core copy across OVEP * Decouple provider option cache_dir from session option ep.context_file_path * Add support for serialization and deserialization of metadata to disk * Load blobs from relative path stored in ep_cache_context * Use remote L0 tensors for shared weights * fix linux ci issues * fix ci issues * Fix Windows build failure * Use ifstream to load weights instead of mmaped file * Fix for epctx models made up entirely of OVEP epctx nodes * Limit ov::Core lifetime to that of provider object * Enforce shared tensors cleanup on shutdown * Add support for default device type based on project configuration * fix: Fixed concrete_backend_ pointer double free issue on Linux * Preetha/weight sharing fix (#545) * Move variables from subgraph to session context for model specific properties * Fix for redundant subgraph creation * Remove unused variable --------- Co-authored-by: Javier E. 
Martinez <[email protected]> Co-authored-by: saurabhkale117 <[email protected]> Co-authored-by: Preetha Veeramalai <[email protected]> Co-authored-by: ankitm3k <[email protected]> Co-authored-by: Eric Crawford <[email protected]>
* Rename EP instance context as session_context
* Add support for GetEpContextNodes
* enable config option for ovep weight sharing
* add config option for ovep weight sharing
* Refactor the conditional blocks in OVEP for compilation
* Convert initializers with external data to graph inputs
* create, store and export metadata for ovep weight sharing
* fix error handling in weight sharing
* fix crash issue while setting up inputs for wai model
* pass weight sharing option to OVEP qdq stripping pass
* Aligning OVEP variable names to match the session option value they hold
* Add plumbing for context sharing plus refactoring around option handling
* Store metadata in shared context
* fix: fix provider options
* create ov tensor from meta data and external data
* create ov tensor
* Add support for binding weight as input tensors
* Fix for mapping subgraph to ov compiled network arguments
* Fix for using so_share_ep_contexts without ep.context* flags
* Add remote tensor support for NPU weight sharing
* Use a single ov::Core copy across OVEP
* Decouple provider option cache_dir from session option ep.context_file_path
* Add support for serialization and deserialization of metadata to disk
* Load blobs from relative path stored in ep_cache_context
* Use remote L0 tensors for shared weights
* fix linux ci issues
* fix ci issues
* Fix Windows build failure
* Use ifstream to load weights instead of mmaped file
* Fix for epctx models made up entirely of OVEP epctx nodes
* Limit ov::Core lifetime to that of provider object
* Enforce shared tensors cleanup on shutdown
* Add support for default device type based on project configuration
* fix: Fixed concrete_backend_ pointer double free issue on Linux
* Preetha/weight sharing fix (#545)
* Move variables from subgraph to session context for model specific properties
* Fix for redundant subgraph creation
* Remove unused variable

---------

Co-authored-by: Javier E. Martinez <[email protected]>
Co-authored-by: saurabhkale117 <[email protected]>
Co-authored-by: Preetha Veeramalai <[email protected]>
Co-authored-by: ankitm3k <[email protected]>
Co-authored-by: Eric Crawford <[email protected]>
These changes ensure that weights are shared between two models when the session context option ep_weight_sharing is used. Key changes introduced in this feature are:
Motivation and Context
This change was required to ensure that LLMs with prefill and kvcache models can use the same share
The change was also required to ensure that EP Context nodes can be formed even when the model is partitioned into subgraphs.
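As a rough illustration of the weight-sharing idea described above, the sketch below shows two toy sessions (standing in for a prefill model and a kvcache model) obtaining the same weight buffer from one shared context, so the weight is loaded only once. Everything here is hypothetical and written for illustration: `SharedWeightContext`, `ToySession`, `MakeSession`, and the weight name `embedding.weight` are not OVEP or ONNX Runtime identifiers, and the real implementation binds OpenVINO tensors rather than `std::vector<float>`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a weight tensor read from external data.
using WeightBuffer = std::vector<float>;

// Hypothetical shared context: maps a weight name to one buffer that every
// session attached to this context reuses.
class SharedWeightContext {
 public:
  // Returns the cached buffer for `name`; calls `loader` only on first use.
  template <typename Loader>
  std::shared_ptr<const WeightBuffer> GetOrLoad(const std::string& name,
                                                Loader&& loader) {
    auto it = weights_.find(name);
    if (it != weights_.end()) {
      return it->second;  // Already loaded by an earlier session.
    }
    auto buffer = std::make_shared<const WeightBuffer>(loader());
    ++load_count_;
    weights_.emplace(name, buffer);
    return buffer;
  }

  // Number of times a weight was actually loaded (not served from cache).
  std::size_t load_count() const { return load_count_; }

 private:
  std::unordered_map<std::string, std::shared_ptr<const WeightBuffer>> weights_;
  std::size_t load_count_ = 0;
};

// Hypothetical session that holds a reference to a shared weight.
struct ToySession {
  std::string model_name;
  std::shared_ptr<const WeightBuffer> embedding;
};

// Builds a session whose weight comes from the shared context.
ToySession MakeSession(const std::string& model_name, SharedWeightContext& ctx) {
  // Placeholder loader; a real loader would read the weight file from disk.
  auto loader = [] { return WeightBuffer{0.1f, 0.2f, 0.3f}; };
  return ToySession{model_name, ctx.GetOrLoad("embedding.weight", loader)};
}
```

Creating a "prefill" and a "kvcache" session against the same `SharedWeightContext` yields two sessions whose `embedding` pointers refer to the same buffer, while `load_count()` stays at 1.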