Fp16 windows (depends on #3429) #3458

gkisalapl · 2025-09-04T06:26:25Z

This PR makes it possible to build on Windows with FP16
enabled; uint16_t is used as a storage-only FP16 type.

Self-evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped
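
To illustrate what "storage-only" means here, the following is a minimal sketch, not the code from this PR. It assumes the half value is kept as raw IEEE 754 binary16 bits in a `uint16_t` and widened to `float` for every arithmetic operation. The names `fp16_storage`, `fp16_to_float`, and `float_to_fp16` are hypothetical and chosen only for this example.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical alias: the 16 raw bits of a half value, with no arithmetic.
using fp16_storage = std::uint16_t;

// Widen binary16 bits to float (exact for every half value).
inline float fp16_to_float(fp16_storage h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t man = h & 0x3FFu;
  std::uint32_t bits;
  if (exp == 0) {
    if (man == 0) {
      bits = sign; // signed zero
    } else {
      // Subnormal half: shift until the implicit leading 1 appears.
      std::uint32_t shift = 0;
      while ((man & 0x400u) == 0) {
        man <<= 1;
        ++shift;
      }
      man &= 0x3FFu;
      bits = sign | ((113u - shift) << 23) | (man << 13);
    }
  } else if (exp == 31) {
    bits = sign | 0x7F800000u | (man << 13); // infinity or NaN
  } else {
    bits = sign | ((exp + 112u) << 23) | (man << 13); // rebias 15 -> 127
  }
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Narrow float to binary16 bits, rounding to nearest, ties to even.
inline fp16_storage float_to_fp16(float f) {
  std::uint32_t x;
  std::memcpy(&x, &f, sizeof x);
  const std::uint32_t sign = (x >> 16) & 0x8000u;
  x &= 0x7FFFFFFFu;
  if (x >= 0x7F800000u) { // infinity or NaN (NaN keeps a quiet bit)
    return static_cast<fp16_storage>(sign | 0x7C00u |
                                     (x > 0x7F800000u ? 0x0200u : 0u));
  }
  if (x >= 0x47800000u) { // magnitude >= 2^16: overflows to infinity
    return static_cast<fp16_storage>(sign | 0x7C00u);
  }
  if (x < 0x33000000u) { // magnitude < 2^-25: rounds to zero
    return static_cast<fp16_storage>(sign);
  }
  const std::uint32_t fexp = x >> 23;
  std::uint32_t q, rem, half;
  if (fexp < 113u) { // result is a subnormal half
    const std::uint32_t man = (x & 0x007FFFFFu) | 0x00800000u;
    const std::uint32_t shift = 126u - fexp; // in [14, 24]
    q = man >> shift;
    rem = man & ((1u << shift) - 1u);
    half = 1u << (shift - 1u);
  } else { // result is a normal half
    q = ((fexp - 112u) << 10) | ((x & 0x007FFFFFu) >> 13);
    rem = x & 0x1FFFu;
    half = 0x1000u;
  }
  if (rem > half || (rem == half && (q & 1u))) {
    ++q; // a carry into the exponent field yields the correct encoding
  }
  return static_cast<fp16_storage>(sign | q);
}
```

With a scheme like this, a compiler without a native half type only ever sees `uint16_t` loads and stores; computations would run in `float` after calling `fp16_to_float` and be narrowed again with `float_to_fp16`.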

This commit introduces mapping logic to the RMS Norm OpenCL kernel following kernel execution. Additionally, this patch updates the unit tests for the OpenCL BLAS kernels. **Self-evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Donghyeon Jeong <[email protected]>

This commit adds helper functions for SVM allocation. This patch also fixes issues where allocated memory is not destroyed. **Self-evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Donghyeon Jeong <[email protected]>

This commit introduces a fallback implementation for the fused unpack q4_0x8 and 16-bit transpose operation, which preprocesses q4_0x8 data. **Self-evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Donghyeon Jeong <[email protected]>

This commit adds AVX2 implementation of a fused unpack q4_0x8 and 16-bit transpose operation. **Self-evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Donghyeon Jeong <[email protected]>

This PR makes it possible to build on Windows with FP16 enabled; uint16_t is used as a storage-only FP16 type. **Self-evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Grzegorz Kisala <[email protected]>

gkisalapl · 2025-09-09T08:58:24Z

PR replaced by #3468

p-debski2 and others added 17 commits August 25, 2025 16:43

Integrated quantization functions

4ea5406

Moved quantize functions from GGML to nntrainer Signed-off-by: p-debski2 <[email protected]>

Added more functions for quantization

fc1f5a1

Added q8_0 row quantization, and multiple dequantization functions Signed-off-by: p-debski2 <[email protected]>
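
For readers unfamiliar with the format, here is a minimal sketch of q8_0-style row quantization. It assumes the usual scheme of 32-value blocks, each with one scale and signed 8-bit quants, and is not the code added in this commit. The names `BlockQ8`, `quantize_row_q8`, and `dequantize_row_q8` are hypothetical, and the scale is kept as `float` to stay self-contained, whereas GGML's own layout stores it as a 16-bit half.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockSize = 32; // values per block

// Hypothetical block layout: one scale plus 32 signed 8-bit quants.
struct BlockQ8 {
  float scale;
  std::int8_t q[kBlockSize];
};

// Quantize a row of n floats; n is assumed to be a multiple of kBlockSize.
inline std::vector<BlockQ8> quantize_row_q8(const float *x, std::size_t n) {
  std::vector<BlockQ8> out(n / kBlockSize);
  for (std::size_t b = 0; b < out.size(); ++b) {
    const float *src = x + b * kBlockSize;
    // The largest magnitude in the block maps to +/-127.
    float amax = 0.0f;
    for (std::size_t i = 0; i < kBlockSize; ++i) {
      amax = std::max(amax, std::fabs(src[i]));
    }
    const float d = amax / 127.0f;
    const float inv = (d != 0.0f) ? 1.0f / d : 0.0f;
    out[b].scale = d;
    for (std::size_t i = 0; i < kBlockSize; ++i) {
      out[b].q[i] = static_cast<std::int8_t>(std::lround(src[i] * inv));
    }
  }
  return out;
}

// Reconstruct floats: each value is its quant times the block scale.
inline void dequantize_row_q8(const std::vector<BlockQ8> &blocks, float *y) {
  for (std::size_t b = 0; b < blocks.size(); ++b) {
    for (std::size_t i = 0; i < kBlockSize; ++i) {
      y[b * kBlockSize + i] = blocks[b].scale * blocks[b].q[i];
    }
  }
}
```

In this sketch the reconstruction error per value is bounded by roughly half the block scale, which is why the scale is chosen per block rather than per row.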

Added include for specific implementations

bd4af5f

Added includes for intrinsic functions Signed-off-by: p-debski2 <[email protected]>

Added more ggml functions to the nntrainer project

19d74fe

Moved nntr_ggml_impl to a separate directory and added a shared header with structure definitions Signed-off-by: p-debski2 <[email protected]>

Fixed Linux build

d98cdbf

Added some util macros and functions to fix building on Linux Signed-off-by: p-debski2 <[email protected]>

Removed ggml includes from interfaces

17bca9c

Moved more ggml functions and removed the includes from interface files so that they use only the nntr_ggml implementation Signed-off-by: p-debski2 <[email protected]>

Additional type defines

fabbb0d

Added more GGML type definitions for AVX operations Signed-off-by: p-debski2 <[email protected]>

Made ggml_interface_fp16 independent

2882b42

Removed ggml includes, renamed some functions, moved some declarations to common nntr_ggml_impl headers Signed-off-by: p-debski2 <[email protected]>

Attempt at fixing Tizen build

79ca32f

Added a define guard to stop function redefinition Signed-off-by: p-debski2 <[email protected]>

Added doxygen comments

d54ccb7

Added some comments for generating docs on helper GGML structures Signed-off-by: p-debski2 <[email protected]>

Removed ggml from meson build

b2affd6

Removed ggml from meson & tizen specification, leaving the submodule for now Signed-off-by: p-debski2 <[email protected]>

Removed the ggml submodule

b95b19b

Deleted ggml as a dependency from the project Signed-off-by: p-debski2 <[email protected]>

[OpenCL] Remove unused SVM memory space

5e5703b

This commit removes pre-allocated SVM for transposed data. **Self-evaluation:** 1. Build test: [X]Passed [ ]Failed [ ]Skipped 2. Run test: [X]Passed [ ]Failed [ ]Skipped Signed-off-by: Donghyeon Jeong <[email protected]>

gkisalapl changed the title ~~Fp16 windows~~ Fp16 windows (depends on #3429) Sep 4, 2025

gkisalapl force-pushed the fp16_windows branch from fc7a4aa to ae63fde Compare September 4, 2025 06:33

gkisalapl force-pushed the fp16_windows branch from ae63fde to 798727d Compare September 4, 2025 06:36

github-actions bot added the Need Review label Sep 4, 2025

gkisalapl closed this Sep 9, 2025
