Releases: jorgealias/llama.cpp

b3913

13 Oct 06:22
92be9f1

flake.lock: Update (#9870)

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/bc947f541ae55e999ffdb4013441347d83b00feb?narHash=sha256-NOiTvBbRLIOe5F6RbHaAh6%2B%2BBNjsb149fGZd1T4%2BKBg%3D' (2024-10-04)
  → 'github:NixOS/nixpkgs/5633bcff0c6162b9e4b5f1264264611e950c8ec7?narHash=sha256-9UTxR8eukdg%2BXZeHgxW5hQA9fIKHsKCdOIUycTryeVw%3D' (2024-10-09)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

b3780

18 Sep 03:31
faf67b3

[SYCL]set context default value to avoid memory issue, update guide (…

b3772

17 Sep 04:14
23e0d70

ggml : move common CPU backend impl to new header (#9509)

b3755

15 Sep 06:03
822b632

ggml : ggml_type_name return "NONE" for invalid values (#9458)

On Windows, the quantization utility tries to print the names of types that are not set, which causes a crash.
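
The fix described above amounts to a guarded name lookup. As a rough illustration only (the real change is in ggml's C code; the table below is a small illustrative subset, and `TYPE_NAMES` and `type_name` are hypothetical names, not ggml's API), the idea can be sketched as:

```python
# Hypothetical sketch of a type-name lookup that never fails on bad input.
# TYPE_NAMES is an illustrative subset, not the full ggml type table.
TYPE_NAMES = ["f32", "f16", "q4_0", "q4_1"]


def type_name(type_id: int) -> str:
    """Return the name for type_id, or "NONE" when the id is out of range."""
    if 0 <= type_id < len(TYPE_NAMES):
        return TYPE_NAMES[type_id]
    # Invalid or unset value: return a placeholder instead of reading
    # past the end of the table.
    return "NONE"


print(type_name(1))    # a valid id
print(type_name(999))  # an invalid id falls back to "NONE"
```

With this shape, a caller that prints every entry, including unset ones, gets a harmless placeholder string rather than an invalid read.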

b3687

08 Sep 06:39
a5b5d9a

llama.android : fix build (#9350)

b3620

25 Aug 08:00
e11bd85

CPU/CUDA: Gemma 2 FlashAttention support (#8542)

* CPU/CUDA: Gemma 2 FlashAttention support

* apply logit_softcap to scale in kernel

* disable logit softcapping tests on Metal

* remove metal check
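
For readers unfamiliar with logit softcapping: attention scores are squashed as `cap * tanh(score / cap)` before the softmax, which bounds them to the range (-cap, cap). The sketch below is a plain-Python illustration, not the CPU or CUDA kernel from this release. It assumes that "apply logit_softcap to scale" means folding `1 / cap` into the attention scale, and that a cap of `0.0` means softcapping is disabled; `softcap` and `attention_logits` are hypothetical names.

```python
import math


def softcap(x: float, cap: float) -> float:
    """Reference form: squash x smoothly into (-cap, cap)."""
    return cap * math.tanh(x / cap)


def attention_logits(q, keys, scale, cap=0.0):
    """Scaled dot-product logits with optional softcapping.

    Assumption: cap == 0.0 disables softcapping. When enabled, 1/cap is
    folded into the scale once, so the inner loop needs only a tanh and
    one multiply per score.
    """
    fused_scale = scale / cap if cap != 0.0 else scale
    logits = []
    for k in keys:
        s = sum(a * b for a, b in zip(q, k)) * fused_scale
        if cap != 0.0:
            s = cap * math.tanh(s)
        logits.append(s)
    return logits


q = [1.0, 2.0]
keys = [[0.5, 0.1], [3.0, -1.0], [40.0, 40.0]]
print(attention_logits(q, keys, scale=0.7, cap=30.0))
```

Folding the division into the scale gives the same result as calling `softcap(dot * scale, cap)` per score, up to floating-point rounding.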

b3602

18 Aug 23:35
554b049

flake.lock: Update (#9068)

b3570

12 Aug 01:05
4134999

gguf-py : Numpy dequantization for most types (#8939)

* gguf-py : Numpy dequantization for most types

* gguf-py : Numpy dequantization for grid-based i-quants
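
To give a sense of what vectorized dequantization looks like, here is a minimal NumPy sketch for one simple format, Q8_0, where each block holds one little-endian float16 scale followed by 32 signed 8-bit values. It is not the gguf-py code from this release; `dequantize_q8_0` and the constants are hypothetical names chosen for the example.

```python
import numpy as np

QK8_0 = 32               # values per Q8_0 block
BLOCK_BYTES = 2 + QK8_0  # float16 scale + 32 int8 quants


def dequantize_q8_0(raw: bytes) -> np.ndarray:
    """Dequantize packed Q8_0 blocks into a flat float32 array."""
    data = np.frombuffer(raw, dtype=np.uint8)
    if data.size % BLOCK_BYTES != 0:
        raise ValueError("buffer length is not a multiple of the block size")
    blocks = data.reshape(-1, BLOCK_BYTES)
    # First 2 bytes of each block: the scale. Copy so the slice is
    # contiguous before reinterpreting it as little-endian float16.
    d = blocks[:, :2].copy().view("<f2").astype(np.float32)  # shape (n, 1)
    # Remaining 32 bytes: the quantized values, reinterpreted as int8.
    qs = blocks[:, 2:].view(np.int8).astype(np.float32)      # shape (n, 32)
    # Broadcasting multiplies every value by its block's scale.
    return (d * qs).reshape(-1)


# Build one block by hand: scale 0.5, quants -16..15.
block = np.array([0.5], dtype="<f2").tobytes() + np.arange(-16, 16, dtype=np.int8).tobytes()
print(dequantize_q8_0(block)[:4])
```

All blocks are processed in one broadcasted multiply instead of a Python loop per block, which is the main reason to do this in NumPy.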

b3511

05 Aug 00:10
0d6fb52

Install curl in runtime layer (#8693)

b3488

30 Jul 03:00
75af08c

ggml: bugfix: fix the inactive elements is agnostic for risc-v vector…