microsoft · tianleiwu · Nov 2, 2024 · Nov 1, 2024 · Nov 1, 2024 · Nov 1, 2024
diff --git a/docs/python/api_summary.rst b/docs/python/api_summary.rst
@@ -244,9 +244,36 @@ You can also bind inputs and outputs directly to a PyTorch tensor.
     )
 
     session.run_with_iobinding(binding)
-    
+
 You can also see code examples of this API in in the `ONNX Runtime inferences examples <https://github.com/microsoft/onnxruntime-inference-examples/blob/main/python/api/onnxruntime-python-api.py>`_.
 
+Some ONNX data types (such as TensorProto.BFLOAT16, TensorProto.FLOAT8E4M3FN and TensorProto.FLOAT8E5M2) are not supported by NumPy. You can bind an input or output directly to a Torch tensor of the corresponding data type
+(such as torch.bfloat16, torch.float8_e4m3fn and torch.float8_e5m2) in GPU memory. The example below assumes ``import torch`` and ``from onnx import TensorProto``.
+
+.. code-block:: python
+
+    x = torch.ones([3], dtype=torch.float8_e5m2, device='cuda:0')
+    y = torch.empty([3], dtype=torch.bfloat16, device='cuda:0')
+    # Bind the GPU buffers by pointer; no NumPy array is involved.
+    binding = session.io_binding()
+    binding.bind_input(
+        name='X',
+        device_type='cuda',
+        device_id=0,
+        element_type=TensorProto.FLOAT8E5M2,
+        shape=tuple(x.shape),
+        buffer_ptr=x.data_ptr(),
+    )
+    binding.bind_output(
+        name='Y',
+        device_type='cuda',
+        device_id=0,
+        element_type=TensorProto.BFLOAT16,
+        shape=tuple(y.shape),
+        buffer_ptr=y.data_ptr(),
+    )
+    session.run_with_iobinding(binding)
+
 
 API Details
 ===========