Addons - Bitonic Sort #31852
The actual sort itself is done and works as expected for multiple data types and counts, but I'm leaving this in draft for now until it's more thoroughly reviewed. As it's the first class of its type (encapsulated GPGPU sort/operation), I would like to ensure that the maintainers agree on the class structure and documentation before it goes in, as it could inform how contributors implement/structure future encapsulated GPGPU operations. Obviously this is WebGPURenderer only, and the documentation should likely change to expose this more transparently to users (class BitonicSort should also be renamed class BitonicSortGPU). There are also improvements that can be made to the class that may or may not be considered blocking such as:
Is there anything blocking this PR? If possible, I would like to use it for performance optimizations within the compute bird sample, but would like the class structure to be reviewed first for the reasons stated above.
Wouldn't it be more straightforward for a Sort Module to simply perform the sorting operation, with perhaps an option, only in very specific and advanced cases, to manually control its update? Something such as:

const bitonicSort = new BitonicSort( {
	dataBuffer: arrayToSort,
} );
renderer.setAnimationLoop( async () => {
	await bitonicSort.compute();
	renderer.compute( computeProgramUsingSortedArray );
} );

And for manual sorting step control:

const bitonicSort = new BitonicSort( {
	dataBuffer: arrayToSort,
} );
renderer.setAnimationLoop( async () => {
	while ( ! bitonicSort.isSorted ) {
		await bitonicSort.step();
	}
	renderer.compute( computeProgramUsingSortedArray );
} );
There is already a function for this at the bottom of the class:

async compute( renderer ) {
	this.globalOpsRemaining = 0;
	this.globalOpsInSpan = 0;
	this.currentDispatch = 0;
	for ( let i = 0; i < this.stepCount; i ++ ) {
		await this.computeStep( renderer );
	}
}

The Bitonic Sort example runs computeStep rather than a full compute so that it can visually represent the swaps. The current implementation uses multiple dispatches to perform the sort, so it's also leveraging the computeStep functionality. More efficient algorithms could likely do a sort in a single dispatch. Accordingly, for those sorts, the compute step function would execute different code than the main sort algorithm, but in bitonic sort's case, it goes through this.computeStep for both the complete sort and the step-by-step sort.
I would prefer something like this object syntax for the arguments.
const scene = new THREE.Scene();
const infoArray = new Uint32Array( 3, 2, 2 );
Should this be new Uint32Array( [ 3, 2, 2 ] )? Not sure if it matters.
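For reference, a quick check in plain JavaScript suggests the two forms are not equivalent: a numeric first argument is read as a length, while an array argument supplies the values. Whether that matters here depends on whether the initial contents of infoArray are read before being overwritten, which this sketch does not assume either way. The names fromNumbers and fromArray are illustrative only.

```javascript
// A numeric first argument is treated as the array length;
// the extra numeric arguments are ignored, so this is three zeros.
const fromNumbers = new Uint32Array( 3, 2, 2 );

// An array (or other iterable) argument supplies the element values.
const fromArray = new Uint32Array( [ 3, 2, 2 ] );

console.log( Array.from( fromNumbers ) ); // [ 0, 0, 0 ]
console.log( Array.from( fromArray ) ); // [ 3, 2, 2 ]
```

Both arrays happen to have length 3, so the difference only shows up in the stored values.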
Does this sort faster than the pure JS + worker sort used by PlayCanvas for 3DGS scenes?
It would depend on your use case. If you're only sorting 32 or 256 elements, then it would probably be more performant to sort on the CPU, as the dispatch time between the CPU and the GPU may be larger than the time to sort. Additionally, Bitonic Sort is only really performant for 2^n elements and is not as generalizable or as performant as a GPU radix sort. You'd also, as far as I'm aware, be the first person to ever use the TSL Bitonic Sort outside of the three.js bitonic sort example. EDIT: @arcman7 This would not be faster than your company's existing WebGPU/WGSL radix sort IMO. However, there should no longer be any barrier toward implementing a radix sort within TSL now that Three.js has workgroup barriers, subgroup functionality, dispatchWorkgroupsIndirect, and most of the compute functionality you'd need.
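To illustrate the 2^n constraint, here is a minimal CPU reference of a bitonic sorting network in plain JavaScript. It is a sketch for discussion, not the add-on's TSL code: bitonicSortCPU, isPowerOfTwo, and the steps counter are hypothetical names. Each pass of the inner `i` loop is the work a GPU version could run in parallel, and the number of such passes is log2(n) * (log2(n) + 1) / 2. The add-on's actual dispatch count may well be lower, assuming it fuses several passes into one local sort.

```javascript
// CPU reference of a bitonic sorting network (illustrative only).
function isPowerOfTwo( n ) {

	return n > 0 && ( n & ( n - 1 ) ) === 0;

}

function bitonicSortCPU( data ) {

	const n = data.length;
	if ( ! isPowerOfTwo( n ) ) throw new Error( 'length must be a power of two' );

	const out = Uint32Array.from( data );
	let steps = 0;

	// k is the size of the bitonic sequences being merged.
	for ( let k = 2; k <= n; k *= 2 ) {

		// j is the distance between the two elements being compared.
		for ( let j = k >> 1; j > 0; j >>= 1 ) {

			// Every comparison in this pass is independent of the others.
			for ( let i = 0; i < n; i ++ ) {

				const partner = i ^ j;
				if ( partner <= i ) continue; // handle each pair once

				const ascending = ( i & k ) === 0;
				const outOfOrder = ascending ? out[ i ] > out[ partner ] : out[ i ] < out[ partner ];

				if ( outOfOrder ) {

					const tmp = out[ i ];
					out[ i ] = out[ partner ];
					out[ partner ] = tmp;

				}

			}

			steps ++;

		}

	}

	return { sorted: out, steps };

}

const result = bitonicSortCPU( [ 7, 3, 5, 1, 8, 2, 6, 4 ] );
console.log( Array.from( result.sorted ), result.steps ); // [ 1, 2, 3, 4, 5, 6, 7, 8 ] 6
```

An input whose length is not a power of two would need to be padded (for example with maximum-value sentinels) before it fits this network, which is part of why a radix sort generalizes better.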

Description
Creates an add-on that encapsulates the bitonic sort functionality present in the webgpu_compute_sort_bitonic example. Currently only handles scalar inputs because I'm uncertain whether TSL currently emulates the boolean vector functionality of GLSL. I've also removed the timestamps from the bitonic sort example since they aren't really informative when presented at such high speed, and one can already perceive that the new encapsulated bitonic sort takes fewer dispatches than the previous local sort and the global-sort-only example.