sfatimar

These changes ensure that weights are shared between two models via the session context option ep_weight_sharing. The key changes introduced in this feature are:

  1. Creating a shared context between two models.
  2. Extracting external constant initializers and relabelling them as inputs to the model, to allow weight loading in the direct blob.
  3. Creating EP Context nodes when subgraph partitioning is happening.
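
The second change in the list above can be pictured with a minimal sketch. Every type and function name here (`ExternalData`, `Initializer`, `Graph`, `PromoteExternalInitializers`) is a hypothetical stand-in, not an ONNX Runtime or OVEP type; the sketch only shows the idea of moving externally stored initializers into the graph inputs while recording where their data lives.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not ONNX Runtime or OVEP types.
struct ExternalData {
  std::string location;  // external weight file, relative to the model
  std::size_t offset;    // byte offset of the tensor in that file
  std::size_t size;      // byte size of the tensor
};

struct Initializer {
  std::string name;
  std::optional<ExternalData> external;  // empty when the data is stored inline
};

struct Graph {
  std::vector<std::string> inputs;
  std::vector<Initializer> initializers;
};

// Moves every initializer backed by external data into the graph inputs and
// returns the metadata needed to bind the weight as an input tensor later.
std::map<std::string, ExternalData> PromoteExternalInitializers(Graph& graph) {
  std::map<std::string, ExternalData> metadata;
  std::vector<Initializer> kept;
  for (auto& init : graph.initializers) {
    if (init.external) {
      metadata.emplace(init.name, *init.external);
      graph.inputs.push_back(init.name);
    } else {
      kept.push_back(std::move(init));  // inline initializers stay in place
    }
  }
  graph.initializers = std::move(kept);
  return metadata;
}
```

With the weights exposed as inputs, two models that reference the same external file could in principle be bound to the same tensors, which is the sharing this PR is after.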

Motivation and Context

This change was required to ensure that an LLM with prefill and kvcache models can use the same share
The change was also required to ensure that EP Context nodes can be formed even when the model is being subgraph partitioned.

return "";
} else {
auto input_type = graph_viewer.GetInputs()[0]->TypeAsProto()->tensor_type().elem_type();
if (session_context_.precision == "ACCURACY" &&

Does this take care of the AUTO:GPU, CPU case for Adobe?

Is ACCURACY only for GPU? What about NPU?

ACCURACY is currently enabled for GPU. For NPU, only F16 precision is enabled from OVEP.
The change here is to get the subgraph's input precision.

if (session_context_.precision.find("ACCURACY") != std::string::npos &&

The config for OV ACCURACY mode remains unchanged, and it takes care of the Adobe case.

}
subgraph_context_.subgraph_name = fused_node.Name();

ptr_stream_t model_stream;

Can we rename this from ptr_stream_t to something more specific, like model_stream_t?

std::filesystem::path weight_filename = session_context_.onnx_model_path_name.parent_path();
if (sw.external_weight_filename.empty() && !sw.metadata.empty()) {
// Reasonable assumption that all metadata entries have the same external file location
sw.external_weight_filename = sw.metadata.begin()->second.location;

Is this assumption for a single model or for two separate models?

weight_filename /= sw.external_weight_filename;
std::ifstream weight_file(weight_filename);

if (weight_file) {

Jatin to do a Coverity check.

std::filesystem::path weight_filename = session_context_.onnx_model_path_name.parent_path();
if (sw.external_weight_filename.empty() && !sw.metadata.empty()) {
// Reasonable assumption that all metadata entries have the same external file location
sw.external_weight_filename = sw.metadata.begin()->second.location;

We are accessing the external weight filename without any check that the path is valid.
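
One possible shape for such a check, as a minimal sketch: `ResolveWeightPath` is a hypothetical helper, not part of OVEP, and it assumes the external weight file must live inside the model's directory. The check is purely lexical; it does not resolve symlinks or test that the file exists.

```cpp
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper, not part of OVEP: joins `location` onto `model_dir`
// only if it is a relative path that stays inside `model_dir`.
std::optional<fs::path> ResolveWeightPath(const fs::path& model_dir,
                                          const std::string& location) {
  if (location.empty()) return std::nullopt;

  const fs::path rel(location);
  // Reject rooted paths such as "/etc/passwd" (or "C:\\weights.bin" on Windows).
  if (rel.has_root_path()) return std::nullopt;

  // Collapse "." and ".." components without touching the filesystem.
  const fs::path norm = rel.lexically_normal();
  if (norm.empty() || norm == fs::path(".") || *norm.begin() == fs::path("..")) {
    return std::nullopt;  // the location escapes or merely names the model directory
  }
  return model_dir / norm;
}
```

A caller would open the returned path only when the optional holds a value, and could still check that the stream opened successfully, as the `if (weight_file)` test in the diff already does.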

"To export this model, set disable_dynamic_shapes to False";
ORT_THROW(exception_str);
}


This was an MSFT requirement, to ensure that the cache file dir and the EP context model blob file path are mapped?
@preetha-intel, is there a functionality break here?

@preetha-intel preetha-intel Jan 28, 2025

This check ensures that only epctx model or OV cache files will be generated.

if (!session_context_.cache_dir.empty() && !session_context_.so_context_enable) {

ep.context_enable takes higher precedence.
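
The precedence described in this reply can be sketched as a small decision function. `ArtifactKind` and `SelectArtifact` are hypothetical names introduced for illustration; only the two inputs, `cache_dir` and `so_context_enable`, come from the condition quoted above.

```cpp
#include <string>

// Hypothetical enum: which on-disk artifact a compile step would produce.
enum class ArtifactKind { kNone, kEpContextModel, kOvCache };

// Mirrors the quoted condition: the OV cache is used only when a cache
// directory is set and EP context export is not enabled.
ArtifactKind SelectArtifact(const std::string& cache_dir, bool so_context_enable) {
  if (so_context_enable) return ArtifactKind::kEpContextModel;  // takes precedence
  if (!cache_dir.empty()) return ArtifactKind::kOvCache;
  return ArtifactKind::kNone;
}
```

Under this reading, setting both options yields an epctx model and no OV cache files, so at most one kind of artifact is generated.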

if (blob_filename.empty()) {
blob_filename = session_context_.onnx_model_path_name;
}
const auto name = graph_body_viewer.ModelPath().stem().string() + "_" + subgraph_context_.subgraph_name;

There are no checks on the path and parent path to ensure we are not pointing to restricted locations. Please talk to Ankit and Vishnu about file path restrictions.

#include <fstream>
#include <utility>

#include <filesystem>

@saurabhkale17, please run cpplint to ensure this issue does not occur with the MSFT CI pipelines.

}

ov::element::Type GetOpenVINOElementType(ONNX_NAMESPACE::TensorProto_DataType dt) {
static std::unordered_map<ONNX_NAMESPACE::TensorProto_DataType, ov::element::Type> map{

@saurabhkale17, can you take care of this cpplint issue?

@ankitm3k force-pushed the openvino/ep-weight-sharing branch from 48ee137 to 6d1f1cf on January 27, 2025 08:46

while (!stream.eof()) {
SharedContext::SharedWeights::Metadata::Key key;
SharedContext::SharedWeights::Metadata::Value value;
stream >> key.name;

How do we prevent overflow here?
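
The comment does not say which overflow is meant, so the following is only a sketch of a more defensive read loop. It uses the extraction itself as the loop condition instead of `!stream.eof()`, rejects truncated records and out-of-range numbers, and caps token length. `Entry`, `MetadataMap`, `kMaxTokenLength`, and the record layout are assumptions for illustration; only the `location` field appears in the diff.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-ins for SharedContext::SharedWeights::Metadata; only
// `location` appears in the diff, the other fields are assumptions.
struct Entry {
  std::string location;
  std::size_t data_offset = 0;
  std::size_t size = 0;
};
using MetadataMap = std::map<std::string, Entry>;

constexpr std::size_t kMaxTokenLength = 4096;  // assumed upper bound

// Parses whitespace-separated records: name location data_offset size.
// Returns false on a truncated record, an out-of-range number, or an
// over-long token; entries parsed before the error remain in `out`.
bool ReadMetadata(std::istream& stream, MetadataMap& out) {
  std::string name;
  while (stream >> name) {  // the read itself is the loop condition
    Entry entry;
    if (!(stream >> entry.location >> entry.data_offset >> entry.size)) return false;
    if (name.size() > kMaxTokenLength || entry.location.size() > kMaxTokenLength) return false;
    out[name] = entry;
  }
  return stream.eof();  // true only if the loop ended because input ran out
}

// Convenience wrapper for in-memory text.
bool ReadMetadataFromString(const std::string& text, MetadataMap& out) {
  std::istringstream stream(text);
  return ReadMetadata(stream, out);
}
```

Looping on `!stream.eof()` typically runs the body once more after the last successful read, because the end-of-file flag is set only when a read hits the end; testing the extraction avoids processing that empty trailing record.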

}
if (provider_options_map.find("device_id") != provider_options_map.end()) {
std::string dev_id = provider_options_map.at("device_id").c_str();


Are we removing old provider options in a way that might cause backward-compatibility issues?

* Move variables from subgraph to session context for model specific properties

* Fix for redundant subgraph creation

* Remove unused variable
@sfatimar

Hello, I have raised some basic review comments. Please determine whether they are trivial and raise a JIRA to handle them later.

@sfatimar

LGTM. Please take care of the review comments going forward.

@sfatimar sfatimar merged commit a6698ce into ovep-develop Jan 27, 2025
7 of 17 checks passed

pi.device_type = ParseDeviceType(provider_options, "device_type");

if (provider_options.contains("device_id")) {
@preetha-intel preetha-intel Jan 28, 2025

'device_id' is still retained, with a deprecation warning.
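
Retaining an option while warning about it could look roughly like the sketch below. `ReadDeprecatedOption` is a hypothetical helper, not the OVEP implementation, and the replacement option name is supplied by the caller because the discussion does not state which option supersedes `device_id`.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

using ProviderOptions = std::map<std::string, std::string>;

// Hypothetical helper: returns the value of a deprecated option if it is
// present, and appends a warning that the caller can pass to its own logger.
std::optional<std::string> ReadDeprecatedOption(const ProviderOptions& options,
                                                const std::string& key,
                                                const std::string& replacement,
                                                std::vector<std::string>& warnings) {
  const auto it = options.find(key);
  if (it == options.end()) return std::nullopt;  // option not supplied, nothing to warn about
  warnings.push_back("Provider option '" + key + "' is deprecated; prefer '" +
                     replacement + "'.");
  return it->second;
}
```

Existing configurations that still pass the old key keep working, and the warning gives users time to migrate before the option is removed.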

sfatimar added a commit that referenced this pull request Jan 31, 2025
* Rename EP instance context as session_context

* Add support for GetEpContextNodes

* enable config option for ovep weight sharing

* add config option for ovep weight sharing

* Refactor the conditional blocks in OVEP for compilation

* Convert initializers with external data to graph inputs

* create, store and export metadata for ovep weight sharing

* fix error handling in weight sharing

* fix crash issue while setting up inputs for wai model

* pass weight sharing option to OVEP qdq stripping pass

* Aligning OVEP variable names to match the session option value they hold

* Add plumbing for context sharing plus refactoring around option handling

* Store metadata in shared context

* fix: fix provider options

* create ov tensor from meta data and external data

* create ov tensor

* Add support for binding weight as input tensors

* Fix for mapping subgraph to ov compiled network arguments

* Fix for using so_share_ep_contexts without ep.context* flags

* Add remote tensor support for NPU weight sharing

* Use a single ov::Core copy across OVEP

* Decouple provider option cache_dir from session option ep.context_file_path

* Add support for serialization and deserialization of metadata to disk

* Load blobs from relative path stored in ep_cache_context

* Use remote L0 tensors for shared weights

* fix linux ci issues

* fix ci issues

* Fix Windows build failure

* Use ifstream to load weights instead of mmaped file

* Fix for epctx models made up entirely of OVEP epctx nodes

* Limit ov::Core lifetime to that of provider object

* Enforce shared tensors cleanup on shutdown

* Add support for default device type based on project configuration

* fix: Fixed concrete_backend_ pointer double free issue on Linux

* Preetha/weight sharing fix (#545)

* Move variables from subgraph to session context for model specific properties

* Fix for redundant subgraph creation

* Remove unused variable

---------

Co-authored-by: Javier E. Martinez <[email protected]>
Co-authored-by: saurabhkale117 <[email protected]>
Co-authored-by: Preetha Veeramalai <[email protected]>
Co-authored-by: ankitm3k <[email protected]>
Co-authored-by: Eric Crawford <[email protected]>
sfatimar added a commit that referenced this pull request Jan 31, 2025
ankitm3k added a commit that referenced this pull request Jan 31, 2025
ankitm3k added a commit that referenced this pull request Jan 31, 2025
ankitm3k added a commit that referenced this pull request Feb 5, 2025
sfatimar added a commit that referenced this pull request Feb 5, 2025
sfatimar added a commit that referenced this pull request Feb 5, 2025
sfatimar added a commit that referenced this pull request Feb 5, 2025
sfatimar added a commit that referenced this pull request Feb 5, 2025
sfatimar added a commit that referenced this pull request Feb 6, 2025
ankitm3k added a commit that referenced this pull request Feb 6, 2025
ankitm3k added a commit that referenced this pull request Feb 6, 2025
ankitm3k added a commit that referenced this pull request Feb 6, 2025
sfatimar added a commit that referenced this pull request Feb 6, 2025
sfatimar added a commit that referenced this pull request Feb 6, 2025
ankitm3k added a commit that referenced this pull request Feb 6, 2025
6 participants