Releases · jorgealias/llama.cpp
b3913
flake.lock: Update (#9870)
Flake lock file updates:
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/bc947f541ae55e999ffdb4013441347d83b00feb?narHash=sha256-NOiTvBbRLIOe5F6RbHaAh6++BNjsb149fGZd1T4+KBg=' (2024-10-04)
→ 'github:NixOS/nixpkgs/5633bcff0c6162b9e4b5f1264264611e950c8ec7?narHash=sha256-9UTxR8eukdg+XZeHgxW5hQA9fIKHsKCdOIUycTryeVw=' (2024-10-09)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
b3780
[SYCL] set context default value to avoid a memory issue, update guide (…
b3772
ggml : move common CPU backend impl to new header (#9509)
b3755
ggml : ggml_type_name returns "NONE" for invalid values (#9458)
When running on Windows, the quantization utility attempts to print types that are not set, which leads to a crash.
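The fix boils down to a bounds check on the type-name lookup. A minimal sketch of the idea with hypothetical names (example_type, example_type_name; not the actual ggml source): out-of-range values map to "NONE" instead of indexing past the table.

    #include <stdio.h>

    enum example_type { TYPE_F32, TYPE_F16, TYPE_Q4_0, TYPE_COUNT };

    static const char * type_names[TYPE_COUNT] = { "f32", "f16", "q4_0" };

    /* Invalid values return "NONE" rather than reading past the table. */
    static const char * example_type_name(int type) {
        return (type >= 0 && type < TYPE_COUNT) ? type_names[type] : "NONE";
    }

    int main(void) {
        printf("%s\n", example_type_name(TYPE_F16)); /* "f16" */
        printf("%s\n", example_type_name(42));       /* "NONE", no crash */
        return 0;
    }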
b3687
llama.android : fix build (#9350)
b3620
CPU/CUDA: Gemma 2 FlashAttention support (#8542)
• CPU/CUDA: Gemma 2 FlashAttention support
• apply logit_softcap to scale in kernel (see the sketch below)
• disable logit softcapping tests on Metal
• remove Metal check
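For context: Gemma 2 soft-caps attention logits as cap · tanh(score / cap), and "apply logit_softcap to scale in kernel" refers to folding the 1/cap division into the existing attention scale so the kernel only adds a tanh. A minimal sketch of that formula, assuming a cap of 50.0 (Gemma 2's attention value) and a hypothetical softcap helper; the real CPU/CUDA kernels are more involved:

    #include <math.h>
    #include <stdio.h>

    /* Hypothetical helper: squash a raw attention score into (-cap, cap). */
    static float softcap(float score, float cap) {
        return cap * tanhf(score / cap);
    }

    int main(void) {
        const float cap   = 50.0f;  /* assumed attention-logit cap for Gemma 2 */
        const float scale = 0.125f; /* e.g. 1/sqrt(head_dim) for head_dim = 64 */
        const float qk    = 123.4f; /* raw q.k dot product */

        /* Folding 1/cap into the scale: multiply by scale/cap once, apply
         * tanh, then multiply by cap; identical to softcap(qk*scale, cap). */
        float capped = cap * tanhf(qk * (scale / cap));
        printf("capped: %f vs helper: %f\n", capped, softcap(qk * scale, cap));
        return 0;
    }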
b3602
flake.lock: Update (#9068)
b3570
gguf-py : Numpy dequantization for most types (#8939)
• gguf-py : Numpy dequantization for most types
• gguf-py : Numpy dequantization for grid-based i-quants
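The change itself is NumPy code in gguf-py; the C sketch below only illustrates the per-block math such dequantization vectorizes, using a simplified Q4_0-style layout (32 weights per block, 4-bit quants offset by 8; assumption: the real format stores the scale as fp16, fp32 is used here for brevity).

    #include <stdint.h>
    #include <stdio.h>

    #define QK 32  /* weights per block */

    /* Simplified Q4_0-style block (real ggml stores d as fp16). */
    struct block_q4 {
        float   d;          /* per-block scale */
        uint8_t qs[QK / 2]; /* two 4-bit quants per byte */
    };

    /* Low nibbles fill y[0..15], high nibbles y[16..31]; quants are
     * stored with an offset of 8, so each decodes to [-8, 7] * d. */
    static void dequantize_block(const struct block_q4 *b, float *y) {
        for (int j = 0; j < QK / 2; ++j) {
            y[j]          = ((b->qs[j] & 0x0F) - 8) * b->d;
            y[j + QK / 2] = ((b->qs[j] >>   4) - 8) * b->d;
        }
    }

    int main(void) {
        struct block_q4 b = { .d = 0.5f };
        b.qs[0] = 0x18;  /* low nibble 8 -> 0.0, high nibble 1 -> -3.5 */
        float y[QK];
        dequantize_block(&b, y);
        printf("y[0]=%g y[16]=%g\n", y[0], y[16]);
        return 0;
    }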
b3511
Install curl in runtime layer (#8693)
b3488
ggml : bugfix: make handling of inactive elements agnostic in the RISC-V vector…