Support building graphs from MLTensor containing constants #760

@bbernhar

Description

Demonstrate how MLTensor can be used to help web developers manage constant data (e.g., trained weights) on-device.

Dependent PRs

Motivation

  • Allow constant data to be uploaded directly to the device, which is a capability that Execution Providers (EPs) leverage to prevent out-of-memory (OOM) errors (ORT example).
  • Re-use constant buffers in system memory between graphs, particularly for encoder-decoder models like Whisper.

Design

An MLTensor containing constant data is associated with an MLOperand when the operand is created. At build(), the constant data is forwarded to the device. The original constant data (i.e., the ArrayBuffer input or the uploaded device data held by the MLTensor) can be discarded immediately after createConstant() succeeds.
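
The lifecycle described above can be modeled with a small mock in plain JavaScript. This is only a sketch for reasoning about the order of calls: MockContext, MockConstantTensor, and MockGraphBuilder are hypothetical stand-ins, not WebNN interfaces. The calls are synchronous for brevity, the array copies merely stand in for device uploads, and throwing when a destroyed tensor is passed to constant() is an assumption, not something the proposal specifies.

```javascript
// Hypothetical mock of the proposed lifecycle. None of these classes are WebNN APIs.
class MockConstantTensor {
  constructor(data) {
    this.deviceData = data.slice(); // the copy stands in for the device upload
    this.destroyed = false;
  }
  destroy() {
    this.deviceData = null;
    this.destroyed = true;
  }
}

class MockContext {
  // The proposed createConstant() returns a Promise; this mock is synchronous.
  createConstant(dataType, sourceData) {
    return new MockConstantTensor(sourceData);
  }
}

class MockGraphBuilder {
  constructor(ctx) {
    this.ctx = ctx;
    this.constants = [];
  }
  constant(tensor) {
    // Assumption: passing a destroyed tensor is treated as an error.
    if (tensor.destroyed) throw new Error('tensor was destroyed');
    this.constants.push(tensor); // associate the tensor with the new operand
    return { tensor };
  }
  build() {
    // The copy stands in for forwarding the constant data to the device.
    return { weights: this.constants.map((t) => t.deviceData.slice()) };
  }
}

const ctx = new MockContext();
const source = new Float32Array([1, 2, 3]);
const tensor = ctx.createConstant('float32', source);
source.fill(0); // the original buffer is no longer needed after createConstant()

const builder1 = new MockGraphBuilder(ctx);
const builder2 = new MockGraphBuilder(ctx);
builder1.constant(tensor);
builder2.constant(tensor);
const graph1 = builder1.build();
const graph2 = builder2.build();
tensor.destroy(); // in this mock, both graphs keep their own data

console.log(Array.from(graph1.weights[0]), Array.from(graph2.weights[0]));
```

In this mock, overwriting the source buffer after createConstant() and destroying the tensor after both build() calls leave the built graphs unaffected, which mirrors the ordering the design describes.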

Example JS

// Upload constant data directly to device
const constantTensor = await ctx.createConstant(...); // immutable

const builder1 = new MLGraphBuilder(ctx);
const builder2 = new MLGraphBuilder(ctx);
const constantOp1 = builder1.constant(constantTensor);
const constantOp2 = builder2.constant(constantTensor);
// ...
const graph1 = await builder1.build(...);
const graph2 = await builder2.build(...);

// Optional: free up system memory
constantTensor.destroy();

Proposed IDL

interface MLConstantTensor : MLTensor {};

partial interface MLContext {
    Promise<MLConstantTensor> createConstant(MLOperandDataType dataType, ArrayBufferView sourceData);
};

partial interface MLGraphBuilder {
    MLOperand constant(MLConstantTensor tensor);
};
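
Because createConstant() and the constant(MLConstantTensor) overload are only proposed here, a page would likely need to feature-detect them before use. The helper below is a sketch under that assumption: makeConstantOperand is a hypothetical name, and the fallback assumes the existing two-argument builder.constant(descriptor, buffer) form is available.

```javascript
// Hypothetical helper: prefer the proposed createConstant() path, otherwise
// fall back to the existing builder.constant(descriptor, buffer) overload.
async function makeConstantOperand(ctx, builder, descriptor, data) {
  if (typeof ctx.createConstant === 'function') {
    // Proposed path: upload once, then reference the tensor from the builder.
    const tensor = await ctx.createConstant(descriptor.dataType, data);
    return { operand: builder.constant(tensor), tensor };
  }
  // Fallback path: the builder reads the data from the buffer in system memory.
  return { operand: builder.constant(descriptor, data), tensor: null };
}
```

The returned tensor, when present, can be passed to any other builder that needs the same weights and destroyed once every graph that uses it has been built.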

Edits:

  • 9/16: Added MLOperandDescriptor as required by MLOperand
  • 9/18: Added constant-initializer to createTensor()
  • 9/19: Reuse input(..) via constant usage flag
  • 1/29: Have new tensor type passed to constant()
