@fs-eire fs-eire commented Jul 22, 2025

Description

Fix a runtime error caused by releasing a buffer too early.

Motivation and Context

#25276 introduced a buffer usage optimization that reduced peak GPU memory usage. However, the implementation can cause issues in some situations because a buffer is released too early. This PR fixes the issue.

@fs-eire fs-eire requested review from Copilot and feich-ms July 22, 2025 02:39
@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a runtime error in the WebGPU buffer manager where buffers were being released prematurely, following optimizations introduced in #25276. The fix implements a deferred buffer release mechanism to prevent buffers from being freed while still in use.

Key changes:

  • Buffers that can't be cached are now stored in a pending list instead of being immediately released
  • Buffer release is deferred until the OnRefresh method is called, ensuring buffers remain valid during execution
  • Added a new member variable to track pending buffers awaiting release
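
The deferred-release pattern described in the list above can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `FakeBuffer`, `DeferredReleaseManager`, `PendingCount`, and the member name `pending_buffers_` are hypothetical stand-ins, and setting a `released` flag stands in for whatever release call the real buffer manager makes. Only the idea (park the buffer in a pending list, release it when `OnRefresh` runs) comes from the summary.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU buffer handle (the real code uses WGPUBuffer).
struct FakeBuffer {
  int id;
  bool released = false;
};

// Hypothetical manager showing only the deferred-release idea.
class DeferredReleaseManager {
 public:
  // Called for a buffer that cannot be cached. Instead of releasing it
  // right away, park it together with its size in the pending list.
  void ReleaseBuffer(FakeBuffer* buffer, std::size_t size) {
    pending_buffers_.emplace_back(buffer, size);
  }

  // Called at a refresh point. Only here are the parked buffers released.
  void OnRefresh() {
    for (auto& entry : pending_buffers_) {
      entry.first->released = true;  // stands in for the real release call
    }
    pending_buffers_.clear();
  }

  // Number of buffers still waiting for the next refresh.
  std::size_t PendingCount() const { return pending_buffers_.size(); }

 private:
  // Buffers awaiting release, paired with their sizes.
  std::vector<std::pair<FakeBuffer*, std::size_t>> pending_buffers_;
};
```

In this sketch, a buffer handed to `ReleaseBuffer` stays valid until the next `OnRefresh` call, so work issued in between can still use it. The trade-off is that memory held by pending buffers is reclaimed slightly later than with an immediate release.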
Comments suppressed due to low confidence (1)

onnxruntime/core/providers/webgpu/buffer_manager.cc:247

  • The member variable 'pending_buffers' should follow the existing naming convention with a trailing underscore to match other private members like 'buckets_' and 'buckets_limit_'.
  std::vector<std::pair<WGPUBuffer, size_t>> pending_buffers;

@qjia7 qjia7 left a comment

@fs-eire Can you point out the runtime errors and how to reproduce them? Maybe @feich-ms can take a further look and follow up. A test case would also be good.

@qjia7 qjia7 left a comment

Sorry, I thought it was a revert. Now I understand the changes. LGTM. Thanks.

@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Jul 22, 2025
@guschmue guschmue merged commit 47312d6 into main Jul 22, 2025
102 of 105 checks passed
@guschmue guschmue deleted the fs-eire/webgpu-fix-buffer-release-too-early branch July 22, 2025 16:26
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025
…oft#25485)
