Closed
Description
Demonstrate how MLTensor can be used to help web developers manage constant data (e.g., trained weights) on-device.
Dependent PRs
- MLConstantOperand: Do we need an MLConstantOperand? #668 (comment)
- MLTensor: Add MLTensor explainer #754
Motivation
- Allow constant data to be uploaded directly to the device, which is a capability that Execution Providers (EPs) leverage to prevent out-of-memory (OOM) errors (ORT example).
- Re-use constant buffers in system memory between graphs, particularly for encoder-decoder models like Whisper.
Design
An MLTensor containing constant data is associated with an MLOperand when the operand is created. At build(), the constant data is forwarded to the device. The original constant data (i.e., the ArrayBuffer input or the uploaded device data held by the MLTensor) can be discarded immediately after createConstant() succeeds.
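The lifecycle described above can be modeled without a WebNN implementation. The following is a minimal sketch using hypothetical in-memory stand-ins (MockContext, MockConstantTensor, and MockGraphBuilder are assumptions made for illustration and are not part of WebNN). It only shows the intended ordering: the source data is copied at createConstant(), operands merely reference the tensor, and the data is forwarded at build().

```javascript
// Hypothetical in-memory model of the proposed lifecycle.
// None of these classes are the real WebNN implementation.
class MockConstantTensor {
  constructor(data) {
    this.data = data; // stand-in for device-resident storage
    this.destroyed = false;
  }
  destroy() {
    this.data = null;
    this.destroyed = true;
  }
}

class MockContext {
  // Copies sourceData, so the caller's buffer can be discarded afterwards.
  async createConstant(dataType, sourceData) {
    return new MockConstantTensor(sourceData.slice());
  }
}

class MockGraphBuilder {
  constructor(ctx) {
    this.ctx = ctx;
    this.constants = [];
  }
  // Associates the tensor with an operand; no data is copied here.
  constant(tensor) {
    if (tensor.destroyed) throw new Error("tensor already destroyed");
    this.constants.push(tensor);
    return { kind: "constant", tensor };
  }
  // At build(), the constant data is forwarded into the (mock) graph.
  async build() {
    return { weights: this.constants.map((t) => Array.from(t.data)) };
  }
}

async function demo() {
  const ctx = new MockContext();
  const source = new Float32Array([1, 2, 3]);
  const tensor = await ctx.createConstant("float32", source);
  source.fill(0); // the original data is no longer needed

  const builder1 = new MockGraphBuilder(ctx);
  const builder2 = new MockGraphBuilder(ctx);
  builder1.constant(tensor);
  builder2.constant(tensor); // same tensor shared by two graphs

  const graph1 = await builder1.build();
  const graph2 = await builder2.build();
  tensor.destroy(); // optional, once both graphs are built
  return { graph1, graph2, destroyed: tensor.destroyed };
}
```

In this model both graphs receive the same weights even though the source array was overwritten right after createConstant() resolved, which is the property the design relies on.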
Example JS
// Upload constant data directly to device
constantTensor = await ctx.createConstant(...); // immutable
builder1 = new MLGraphBuilder(ctx);
builder2 = new MLGraphBuilder(ctx);
constantOp1 = builder1.constant(constantTensor);
constantOp2 = builder2.constant(constantTensor);
// ...
graph1 = await builder1.build(...);
graph2 = await builder2.build(...);
// Optional: free up system memory
constantTensor.destroy();
Proposed IDL
interface MLConstantTensor : MLTensor {};
partial interface MLContext {
Promise<MLConstantTensor> createConstant(MLOperandDataType dataType, ArrayBufferView sourceData);
};
partial interface MLGraphBuilder {
MLOperand constant(MLConstantTensor tensor);
};
Edits:
- 9/16: Added MLOperandDescriptor as required by MLOperand
- 9/18: Added constant-initializer to createTensor()
- 9/19: Reuse input(..) via constant usage flag
- 1/29: Have new tensor type passed to constant()
fdwr