Quantized Gorilla #160
Conversation
added self-contained Colab tutorial for llama.cpp local inference with Gorilla
Co-authored-by: Pranav Ramesh <[email protected]>
Thanks for the PR @CharlieJCJ and @pranramesh! Did you get a chance to test all three models? I remember Falcon and MPT needed some minor tweaks, so it's good to confirm they work functionally and that the outputs make logical sense. Also, while you are at it, do you mind quantizing the openfunctions models as well? The rest of it looks good; I'll go ahead and merge.
Yep, I'll take care of quantizing.
K-quantized
LGTM.
One potential follow-up is to benchmark the performance of these models against the full-precision ones.
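For reference, a minimal sketch of what such a comparison could look like with llama-cpp-python (the GGUF filename and prompt below are placeholders, not artifacts of this PR); the same prompts would then be run against the full-precision checkpoints to compare output quality:

```python
# Rough latency/throughput check for a K-quantized Gorilla GGUF.
# model_path and the prompt are placeholders; rerun the same prompts
# against the full-precision checkpoint to compare output quality.
import time
from llama_cpp import Llama

llm = Llama(model_path="gorilla-7b-hf-v1-q4_K_M.gguf", n_ctx=2048, verbose=False)

prompt = "I would like to translate 'I feel very good today.' from English to Chinese."
start = time.perf_counter()
out = llm(prompt, max_tokens=256, temperature=0.0)
elapsed = time.perf_counter() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.2f}s ({n_tokens / elapsed:.1f} tok/s)")
print(out["choices"][0]["text"])
```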
Resolves #77, with a demo displaying local inference with text-generation-webui.
K-quantized Gorilla models can be found on Hugging Face: Llama-based, MPT-based, Falcon-based, gorilla-openfunctions-v0-gguf, and gorilla-openfunctions-v1-gguf (a download sketch follows below).
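A minimal sketch for fetching one of these GGUF files from the Hub (the repo id and filename are placeholders; use the exact names from the model pages above):

```python
# Download one K-quantized GGUF file from the Hugging Face Hub.
# repo_id and filename are placeholders -- use the names from the model pages above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="gorilla-llm/gorilla-openfunctions-v1-gguf",
    filename="gorilla-openfunctions-v1-q4_K_M.gguf",
)
print("Saved to", model_path)
```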
A tutorial walkthrough on how to quantize a model using llama.cpp with different quantization methods is documented in the Colab notebook.
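At a high level the walkthrough follows the standard llama.cpp flow; a rough sketch, assuming a local llama.cpp checkout (script and binary names vary between llama.cpp versions, and all paths are placeholders):

```python
# Sketch of the usual llama.cpp quantization flow, driven from Python.
# Script/binary names (convert.py, ./quantize) match llama.cpp releases
# current at the time of this PR and may differ in newer checkouts.
import subprocess

HF_MODEL_DIR = "gorilla-7b-hf-v1"          # placeholder: local HF checkpoint
F16_GGUF = "gorilla-7b-hf-v1-f16.gguf"     # intermediate full-precision GGUF

# 1) Convert the Hugging Face checkpoint to GGUF at f16.
subprocess.run(
    ["python", "convert.py", HF_MODEL_DIR, "--outtype", "f16", "--outfile", F16_GGUF],
    check=True,
)

# 2) K-quantize to one or more target formats.
for method in ["Q4_K_M", "Q5_K_M"]:
    subprocess.run(
        ["./quantize", F16_GGUF, f"gorilla-7b-hf-v1-{method}.gguf", method],
        check=True,
    )
```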
Running local inference with Gorilla through a clean interface is simple: as demoed with text-generation-webui, add your desired models and run inference.
More details in the /inference README.

Co-authored-by: Pranav Ramesh <[email protected]>