forked from KhronosGroup/OpenCL-Docs
-
Notifications
You must be signed in to change notification settings - Fork 1
cl_khr_tensor extension #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from 1 commit
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
f3a2bb9
Sketch something for cl_khr_tensor
linehill bf94321
Apply suggestions from code review
linehill 36db4b6
* Add brief introduction.
linehill baa7688
Add contributors
linehill 9db1e65
* Fix name for add kernel creator
linehill 141643d
* cl_tensor_type -> cl_tensor _datatype.
linehill db91aee
* clEnqueue(Read,Write)Tensor -> clEnqueue(TranslateFrom,TranslateTo)…
linehill 6fecc4e
Refactor command buffer temporary property out of tensor
linehill f55a904
Fix cl_tensor_type -> cl_tensor_datatype
linehill 6d1c26f
Add an open question
linehill 52d8bb3
Add CL_INVALID_TENSOR error code
linehill 534bcef
Require either buffer or host_ptr to be non-NULL
linehill 7447be2
Regenerate html for cl_exp_tensor
linehill 85edd05
cl_khr_tensor.* -> cl_exp_tensor.*
linehill 8fe9046
* Add overview
linehill 363c69b
Apply suggestions from code review
linehill 6402f59
* Add command buffer counterparts for tensor translation commands
linehill 57fda1a
* translate -> import/export
linehill File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -347,7 +347,9 @@ cl_int clEnqueueTranslateToTensor( | |
|
||
* _mem_pitch_ defines the length of each dimension in elements to be | ||
used for the memory region of _buffer_ or _host_ptr_. The length of | ||
the array must be at least the rank of _tensor_ minus one. | ||
the array must be at least the rank of _tensor_ minus one. if | ||
_mem_pitch_ is NULL or _mem_pitch_[i] is zero, _mem_pitch_[i] is | ||
computed as _region_[i + 1]. | ||
|
||
* _buffer_ and _host_ptr_ refer to a valid buffer object / host | ||
allocation where data is to be read into or to be written from. | ||
|
@@ -408,7 +410,8 @@ follows in pseudo C code: | |
size_t pitch(size_t dim) { | ||
size_t pitch = 1; | ||
for (size_t i = dim; i < tensor_rank - 1; i++) | ||
pitch *= mem_pitch != NULL ? mem_pitch[i] : region[i + 1]; | ||
pitch *= | ||
(mem_pitch != NULL || mem_pitch[i] == 0) ? mem_pitch[i] : region[i + 1]; | ||
return pitch; | ||
} | ||
---- | ||
|
@@ -418,11 +421,131 @@ an abstract function that accesses a tensor element in its storage at | |
given coordinate. The method how the coordinates translate to tensor | ||
storage addresses is unspecified. | ||
|
||
*clEnqueueTranslateFsomTensor* and *clEnqueueTranslateToTensor* | ||
returns CL_SUCCESS if the function is executed | ||
successfully. Otherwise, it returns one of the following errors: | ||
|
||
* CL_INVALID_COMMAND_QUEUE if _command_queue_ is not a valid host | ||
command-queue. | ||
|
||
* CL_INVALID_CONTEXT if the context associated with _command_queue_ | ||
and buffer are not the same or if the context associated with | ||
_command_queue_ and events in _event_wait_list_ are not the same. | ||
|
||
* CL_INVALID_MEM_OBJECT if _buffer_ is not a valid buffer object. | ||
|
||
* CL_INVALID_VALUE if _tensor_origin_ or _mem_origin_ is NULL. | ||
|
||
* CL_INVALID_VALUE if the region being read or written specified by | ||
(_mem_origin_, _region_, _mem_pitch_) is out of bounds. | ||
|
||
* CL_INVALID_VALUE if any _region_ array element is 0. | ||
|
||
* CL_INVALID_VALUE if _mem_pitch_ is not NULL and _mem_pitch_[i] is | ||
not 0 and _mem_pitch_[i] is less than _region_[i]. | ||
|
||
* CL_INVALID_VALUE if _buffer_ and _host_ptr_ both are NULL or non-NULL. | ||
|
||
* CL_INVALID_EVENT_WAIT_LIST if _event_wait_list_ is NULL and | ||
_num_events_in_wait_list_ > 0, or _event_wait_list_ is not NULL and | ||
_num_events_in_wait_list_ is 0, or if event objects in | ||
_event_wait_list_ are not valid events. | ||
|
||
* CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST if the read and write | ||
operations are blocking and the execution status of any of the | ||
events in _event_wait_list_ is a negative integer value. | ||
|
||
* CL_MEM_OBJECT_ALLOCATION_FAILURE if there is a failure to allocate | ||
memory for data store associated with memory object the _tensor_ is | ||
bound to. | ||
|
||
* CL_OUT_OF_RESOURCES if there is a failure to allocate resources | ||
required by the OpenCL implementation on the device. | ||
|
||
* CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources | ||
required by the OpenCL implementation on the host. | ||
|
||
// TODO: add clEnqueueCopyTensor | ||
|
||
// TODO: add clEnqueueFillTensor? | ||
|
||
TODO: add command buffer variants for clEnqueue*Tensor. | ||
If *cl_khr_command_buffer* is is supported, then the following command | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. |
||
buffer counterparts of the *clEnqueueTranslateFromTensor* and | ||
*clEnqueueTranslateToTensor* commands are available. | ||
|
||
[source,c] | ||
---- | ||
cl_int clCommandTranslateFromTensorKHR( | ||
cl_command_buffer_khr command_buffer, | ||
cl_command_queue command_queue, | ||
cl_tensor tensor, | ||
const size_t* tensor_origin, | ||
const size_t* mem_origin, | ||
const size_t* region, | ||
const size_t* mem_pitch, | ||
cl_mem buffer, | ||
void* host_ptr, | ||
cl_uint num_sync_points_in_wait_list, | ||
const cl_sync_point_khr* sync_point_wait_list, | ||
cl_sync_point_khr* sync_point, | ||
cl_mutable_command_khr* mutable_handle); | ||
---- | ||
|
||
[source,c] | ||
---- | ||
cl_int clCommandTranslateToTensorKHR( | ||
cl_command_buffer_khr command_buffer, | ||
cl_command_queue command_queue, | ||
cl_tensor tensor, | ||
const size_t* tensor_origin, | ||
const size_t* mem_origin, | ||
const size_t* region, | ||
const size_t* mem_pitch, | ||
cl_mem buffer, | ||
const void* host_ptr, | ||
cl_uint num_sync_points_in_wait_list, | ||
const cl_sync_point_khr* sync_point_wait_list, | ||
cl_sync_point_khr* sync_point, | ||
cl_mutable_command_khr* mutable_handle); | ||
---- | ||
|
||
* _command_buffer_ refers to valid command-buffer object. | ||
|
||
* For _command_queue_, _tensor_, _tensor_origin_, _mem_origin_, | ||
_region_, _mem_pitch_, _buffer_ and _host_ptr_ parameters refer to | ||
*clEnqueueTranslateFromTensor*. | ||
|
||
* For _num_sync_points_in_wait_list_, _sync_point_wait_list_, | ||
_sync_point_, _mutable_handle_ parameters refer to | ||
*clCommandCopyBufferKHR*. | ||
|
||
*clCommandTranslateFromTensorKHR* and *clCommandTranslateFromTensorKHR* | ||
returns CL_SUCCESS if the function is executed | ||
successfully. Otherwise, it returns one of the following errors: | ||
|
||
* CL_INVALID_COMMAND_QUEUE if _command_queue_ is not NULL. | ||
|
||
* CL_INVALID_COMMAND_BUFFER_KHR if _command_buffer_ is not a valid | ||
command-buffer. | ||
|
||
* CL_INVALID_CONTEXT if the context associated with _command_queue_ | ||
and _command_buffer_ is not the same. | ||
|
||
* CL_INVALID_OPERATION if _command_buffer_ has been finalized. | ||
|
||
* CL_INVALID_VALUE if _mutable_handle_ is not NULL. | ||
|
||
* CL_INVALID_SYNC_POINT_WAIT_LIST_KHR if _sync_point_wait_list_ is | ||
NULL and _num_sync_points_in_wait_list_ is > 0, or | ||
_sync_point_wait_list_ is not NULL and _num_sync_points_in_wait_list_ is | ||
0, or if synchronization-point objects in _sync_point_wait_list_ are | ||
not valid synchronization-points. | ||
|
||
* CL_OUT_OF_RESOURCES if there is a failure to allocate resources | ||
required by the OpenCL implementation on the device. | ||
|
||
* CL_OUT_OF_HOST_MEMORY if there is a failure to allocate resources | ||
required by the OpenCL implementation on the host. | ||
|
||
==== Add New Buffer Property in Section 5.2.1 | ||
|
||
|
@@ -580,17 +703,17 @@ std::vector<float> out_data(b * m * n); | |
// optimal data layout. | ||
clEnqueueTranslateToTensor( | ||
cmd_q, in0, false, {0, 0, 0}, {0, 0, 0}, {b, m, k}, | ||
nullptr, nullptr, nullptr, in0_data.data(), 0, nullptr, nullptr); | ||
nullptr, nullptr, in0_data.data(), 0, nullptr, nullptr); | ||
clEnqueueTranslateToTensor( | ||
cmd_q, in1, false, {0, 0, 0}, {0, 0, 0}, {b, k, n}, | ||
nullptr, nullptr, nullptr, in1_data.data(), 0, nullptr, nullptr); | ||
nullptr, nullptr, in1_data.data(), 0, nullptr, nullptr); | ||
clEnqueueNDRangeKernel( | ||
cmd_q, matmul_kernel, 3, matmul_grid, nullptr, nullptr, 0, nullptr, nullptr); | ||
clEnqueueNDRangeKernel( | ||
cmd_q, add_kernel, 3, add_grid, nullptr, nullptr, 0, nullptr, nullptr); | ||
clEnqueueTranslateFromTensor( | ||
cmd_q, out, false, {0, 0, 0}, {0, 0, 0}, {b, m, n}, | ||
nullptr, nullptr, nullptr, out_data.data(), 0, nullptr, nullptr); | ||
nullptr, nullptr, out_data.data(), 0, nullptr, nullptr); | ||
---- | ||
|
||
An example use of tensors in a command buffer when cl_khr_command_buffer | ||
|
@@ -642,10 +765,10 @@ cl_command_buffer_khr cb = | |
cl_sync_point_khr in0_syncp, in1_syncp, matmul_syncp, add_syncp; | ||
clCommandTranslateToTensorKHR( | ||
cmd_b, cmd_q, in0, {0, 0, 0}, {0, 0, 0}, {b, m, k}, | ||
nullptr, nullptr, nullptr, in0_data.data(), 0, nullptr, &in0_syncp); | ||
nullptr, nullptr, in0_data.data(), 0, nullptr, &in0_syncp); | ||
clCommandTranslateToTensorKHR( | ||
cmd_b, cmd_q, in1, {0, 0, 0}, {0, 0, 0}, {b, k, m}, | ||
nullptr, nullptr, nullptr, in1_data.data(), 0, nullptr, &in1_syncp); | ||
nullptr, nullptr, in1_data.data(), 0, nullptr, &in1_syncp); | ||
clCommandNDRangeKernelKHR( | ||
cmd_b, cmd_q, nullptr, matmul_kernel, 3, matmul_grid, nullptr, nullptr, | ||
2, {in0_syncp, in2_syncp}, &matmul_syncp, nullptr); | ||
|
@@ -654,7 +777,7 @@ clCommandNDRangeKernelKHR( | |
1, {matmul_syncp}, &add_syncp, nullptr); | ||
clCommandTranslateFromTensorKHR( | ||
cmd_b, cmd_q, out, {0, 0, 0}, {0, 0, 0}, {b, k, m}, | ||
nullptr, nullptr, nullptr, out_data.data(), 1, {add_syncp}, nullptr); | ||
nullptr, nullptr, out_data.data(), 1, {add_syncp}, nullptr); | ||
|
||
// Finalize the command buffer. At this point the OpenCL | ||
// implementation may reserve enough storage for all the tensor | ||
|
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo Fsom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe import/export are even better word here instead of 'translate'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed type and renamed.