edgchen1 (Contributor) commented Sep 18, 2025

Description

Add support for MemcpyFromHost and MemcpyToHost ops with plugin EPs.

  • Add CPU EP fallback kernels for the memcpy ops. These are generic implementations using a data transfer manager.
  • Update SessionState::PopulateKernelCreateInfo() to fall back to CPU memcpy kernels if a node's assigned provider doesn't have them.
  • Update MemcpyTransformer to determine whether providers are CPU-based or compatible with other providers by looking at the device type instead of matching against a hardcoded list of provider types. This accommodates plugin EPs, where the provider type can't be hardcoded.
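
The changes listed above can be pictured with a small, self-contained sketch. This is not ONNX Runtime code: `DeviceType`, `Provider`, `KernelRegistry`, `IsCpuBased`, and `FindKernelWithCpuFallback` are hypothetical names chosen for illustration, and the real logic lives in `SessionState::PopulateKernelCreateInfo()` and `MemcpyTransformer`. The sketch assumes a provider can be classified by its device type alone and that a memcpy kernel missing from the assigned provider can be resolved against a CPU provider.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these are ONNX Runtime types.
enum class DeviceType { CPU, GPU, NPU };

struct Provider {
  std::string type;        // e.g. a plugin EP name that is unknown at build time
  DeviceType device_type;  // device type reported by the provider
};

// Classify by device type instead of matching a hardcoded list of provider names.
inline bool IsCpuBased(const Provider& p) {
  return p.device_type == DeviceType::CPU;
}

// Maps (provider type, op type) to a kernel name.
class KernelRegistry {
 public:
  void Register(const std::string& provider_type, const std::string& op_type,
                const std::string& kernel) {
    assert(!op_type.empty());
    kernels_[std::make_pair(provider_type, op_type)] = kernel;
  }

  // Returns nullptr when the provider has no kernel for the op.
  const std::string* Find(const std::string& provider_type,
                          const std::string& op_type) const {
    auto it = kernels_.find(std::make_pair(provider_type, op_type));
    return it == kernels_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::pair<std::string, std::string>, std::string> kernels_;
};

inline bool IsMemcpyOp(const std::string& op_type) {
  return op_type == "MemcpyFromHost" || op_type == "MemcpyToHost";
}

// Prefer the assigned provider's kernel; only memcpy ops fall back to the CPU provider.
inline const std::string* FindKernelWithCpuFallback(const KernelRegistry& registry,
                                                    const Provider& assigned,
                                                    const Provider& cpu,
                                                    const std::string& op_type) {
  if (const std::string* kernel = registry.Find(assigned.type, op_type)) {
    return kernel;
  }
  if (IsMemcpyOp(op_type)) {
    return registry.Find(cpu.type, op_type);
  }
  return nullptr;
}
```

In this sketch the fallback is deliberately limited to the two memcpy ops, so a provider that lacks any other kernel still surfaces as a missing-kernel case rather than silently running on CPU.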

Motivation and Context

Allow plugin EPs to work with models where memcpy ops are required (i.e., models where connected nodes are not fully assigned to the plugin EP).
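
To make "memcpy ops are required" concrete, here is a simplified sketch of when a copy node would be needed on an edge between two nodes. The `Device` enum and `MemcpyOpForEdge` are hypothetical and are not part of ONNX Runtime. The sketch assumes `MemcpyFromHost` copies host data to a device and `MemcpyToHost` copies device data back to the host, and it ignores cases the real transformer handles, such as inputs a kernel expects in CPU memory.

```cpp
#include <string>

// Hypothetical device classification for the producer and consumer of an edge.
enum class Device { CPU, GPU, NPU };

// Returns the memcpy op to insert on the edge, or an empty string if none is needed.
inline std::string MemcpyOpForEdge(Device producer, Device consumer) {
  if (producer == consumer) return "";                     // same device: no copy
  if (producer == Device::CPU) return "MemcpyFromHost";    // host -> device
  if (consumer == Device::CPU) return "MemcpyToHost";      // device -> host
  return "";  // device -> different device: outside the scope of this sketch
}
```

A model whose nodes are all assigned to one device never hits the copy cases, which is why the problem only shows up when a plugin EP takes part of the graph.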

github-actions bot (Contributor) left a comment

You can commit the suggested changes from lintrunner.

@edgchen1 edgchen1 marked this pull request as ready for review September 19, 2025 00:58
@edgchen1 edgchen1 changed the title [WIP] MemcpyFromHost and MemcpyToHost support for plugin EPs MemcpyFromHost and MemcpyToHost support for plugin EPs Sep 19, 2025
skottmckay (Contributor) previously approved these changes Sep 22, 2025

:shipit:

@edgchen1 edgchen1 merged commit 4545732 into main Sep 23, 2025
96 of 97 checks passed
@edgchen1 edgchen1 deleted the edgchen1/memcpy_op_support_for_plugin_eps branch September 23, 2025 17:42
adrianlizarraga pushed a commit that referenced this pull request Sep 24, 2025

Add support for `MemcpyFromHost` and `MemcpyToHost` ops with plugin EPs.

- Add CPU EP fallback kernels for the memcpy ops. These are generic
implementations using a data transfer manager.
- Update `SessionState::PopulateKernelCreateInfo()` to fall back to CPU
memcpy kernels if a node's assigned provider doesn't have them.
- Update `MemcpyTransformer` to determine whether providers are
CPU-based or compatible with other providers by looking at the device
type instead of matching against a hardcoded list of provider types.
This accommodates plugin EPs, where the provider type can't be
hardcoded.

Allow plugin EPs to work with models where memcpy ops are required
(i.e., models where connected nodes are not fully assigned to the plugin
EP).
adrianlizarraga added a commit that referenced this pull request Sep 24, 2025
### Description
Cherry-pick the following PRs into the ORT 1.23.1 branch:

- Fix Attention GQA implementation on CPU
  - **MANUAL MERGE**: see #26057
  - main merge date: Sept 15, 11:33am
  - pr: #25966
  - commit: d530b29
- Address edge GetMemInfo edge cases
  - main merge date: Sept 16, 10:32am
  - pr: #26021
  - commit: d251f3a
- Implement new Python APIs
  - main merge date: Sept 17, 11:44am
  - pr: #25999
  - commit: abc63e8
- MemcpyFromHost and MemcpyToHost support for plugin EPs
  - **MERGE CONFLICT** on file onnxruntime/test/optimizer/transpose_optimizer_test.cc. Conflicts with #25689
  - main merge date: Sept 23, 10:42am
  - pr: #26088
  - commit: 4545732
- [TRT RTX EP] Fix bug for generating the correct subgraph in GetCapability #26132
  - main merge date: Sept 23, 8:54pm
  - pr: #26132
  - commit: 72e56e7


### Motivation and Context

---------

Co-authored-by: Dmitri Smirnov
Co-authored-by: Edward Chen
Co-authored-by: Chi Lo
snnn (Member) commented Sep 25, 2025

This PR has been cherry-picked into the rel-1.23.1 branch in PR #26140. Removing the release:1.23.1 label.

TedThemistokleous added a commit to ROCm/onnxruntime that referenced this pull request Oct 17, 2025
* ORT 1.23.1 cherrypick 1 [REDO] (microsoft#26140)

### Description
Cherry-pick the following PRs into the ORT 1.23.1 branch:

- Fix Attention GQA implementation on CPU
  - **MANUAL MERGE**: see microsoft#26057
  - main merge date: Sept 15, 11:33am
  - pr: microsoft#25966
  - commit: d530b29
- Address edge GetMemInfo edge cases
  - main merge date: Sept 16, 10:32am
  - pr: microsoft#26021
  - commit: d251f3a
- Implement new Python APIs
  - main merge date: Sept 17, 11:44am
  - pr: microsoft#25999
  - commit: abc63e8
- MemcpyFromHost and MemcpyToHost support for plugin EPs
  - **MERGE CONFLICT** on file onnxruntime/test/optimizer/transpose_optimizer_test.cc. Conflicts with microsoft#25689
  - main merge date: Sept 23, 10:42am
  - pr: microsoft#26088
  - commit: 4545732
- [TRT RTX EP] Fix bug for generating the correct subgraph in GetCapability microsoft#26132
  - main merge date: Sept 23, 8:54pm
  - pr: microsoft#26132
  - commit: 72e56e7


### Motivation and Context

---------

Co-authored-by: Dmitri Smirnov
Co-authored-by: Edward Chen
Co-authored-by: Chi Lo

* ORT 1.23.1 cherrypick 2 (microsoft#26182)

### Description
Adds the following commits to the `rel-1.23.1` branch for ORT 1.23.1:


- add session_id_ to LogEvaluationStart/Stop, LogSessionCreationStart
  - main merge date: July 31, 1:05am
  - pr: microsoft#25590
  - commit: e753643
- [build] fix WebAssembly build on macOS/arm64
  - main merge date: Aug 5, 8:07am
  - pr: microsoft#25653
  - commit: 53f152b
- [CPU] MoE Kernel (microsoft#25958)
  - main merge date: Sept 10, 4:54pm
  - pr: microsoft#25958
  - commit: 930e640
- [CPU] Block-wise QMoE kernel for CPU
  - main merge date: Sept 15, 8:32am
  - pr: microsoft#26009
  - commit: 5d17734
- [C#] Implement missing APIs
  - main merge date: Sept 24, 10:50am
  - pr: microsoft#26101
  - commit: 35dcab5
- Regenerate test model with ONNX IR < 12
  - main merge date: Sept 24, 2:50pm
  - pr: microsoft#26149
  - commit: 88f2652
- [CPU] Fix compilation errors because of unused variables
  - main merge date: Sept 25, 1:21pm
  - pr: microsoft#26147
  - commit: 42fcd71
- [EP ABI] Check if nodes specified in GetCapability() have already been assigned
  - main merge date: Sept 26, 1:24am
  - pr: microsoft#26156
  - commit: 67d3ba0
- [QNN EP] Add dynamic option to set HTP performance mode
  - main merge date: Sept 26, 11:55am
  - pr: microsoft#26135
  - commit: 6cc40fd

---------

Co-authored-by: xieofxie
Co-authored-by: hualxie
Co-authored-by: Yulong Wang
Co-authored-by: Akshay Sonawane
Co-authored-by: Dmitri Smirnov
Co-authored-by: Edward Chen
Co-authored-by: quic-tirupath
Co-authored-by: quic-ashwshan

---------

Co-authored-by: Adrian Lizarraga
Co-authored-by: Dmitri Smirnov
Co-authored-by: Edward Chen
Co-authored-by: Chi Lo
Co-authored-by: xieofxie
Co-authored-by: hualxie
Co-authored-by: Yulong Wang
Co-authored-by: Akshay Sonawane
Co-authored-by: quic-tirupath
Co-authored-by: quic-ashwshan
4 participants