Merged (changes from 6 commits)
5 changes: 4 additions & 1 deletion .github/_typos.toml
@@ -1,2 +1,5 @@
[default.extend-identifiers]
arange = "arange" # np.arange
arange = "arange" # np.arange

[files]
extend-exclude = ["*.ipynb"]
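If you want to verify this change locally, the typos checker can be pointed at this file with `typos --config .github/_typos.toml` from the repository root (assuming the `typos` CLI from typos-cli is installed); with the new `[files]` section it should now skip `*.ipynb` files while still allowing the `arange` identifier.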
2 changes: 1 addition & 1 deletion .github/workflows/python.yml
@@ -34,7 +34,7 @@ jobs:
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
OPENAI_ORG_ID: ${{ secrets.OPENAI_ORG_ID }}
SKIP_TESTS_NAAI: "tests/llm tests/local_llm tests/data"
SKIP_TESTS_NAAI: "tests/llm tests/data"
run: poetry run nox -s test-${{ matrix.python-version }}
quality:
runs-on: ubuntu-22.04
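The workflow above only sets `SKIP_TESTS_NAAI`; the noxfile that consumes it is not part of this diff. A minimal, hypothetical sketch of how such a variable could be turned into pytest ignore flags (the session body below is an assumption, not the project's actual noxfile):

```python
# Hypothetical sketch only: the project's real noxfile is not shown in this PR.
import os

import nox


@nox.session
def test(session: nox.Session) -> None:
    session.install(".")
    # e.g. SKIP_TESTS_NAAI="tests/llm tests/data" -> ["--ignore=tests/llm", "--ignore=tests/data"]
    skipped = os.environ.get("SKIP_TESTS_NAAI", "").split()
    session.run("pytest", *[f"--ignore={path}" for path in skipped], "tests")
```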
2 changes: 1 addition & 1 deletion .vscode/settings.json
@@ -17,6 +17,6 @@
},
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"python.testing.pytestArgs": [],
"python.testing.pytestArgs": ["-s"],
"markdown.extension.orderedList.marker": "one",
}
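For context on this settings change: `-s` is pytest's shorthand for `--capture=no`, so tests launched from the VS Code test explorer now stream `print` output live instead of capturing it; the terminal equivalent is `pytest -s`.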
16 changes: 2 additions & 14 deletions README.md
@@ -24,11 +24,9 @@ Requires: Python 3.11 or 3.12
Install the entire package from [PyPI](https://pypi.org/project/not-again-ai/) with:

```bash
$ pip install not_again_ai[llm,local_llm,statistics,viz]
$ pip install not_again_ai[data,llm,statistics,viz]
```

Note that local LLM requires separate installations and will not work out of the box due to how hardware dependent it is. Be sure to check the [notebooks](notebooks/local_llm/) for more details.

The package is split into subpackages, so you can install only the parts you need.

### Base
@@ -49,16 +49,7 @@ The package is split into subpackages, so you can install only the parts you need.
1. Using AOAI requires Entra ID authentication. See https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity for how to set this up for your AOAI deployment.
* Requires the correct role assigned to your user account and being signed into the Azure CLI.
1. (Optional) Set the `AZURE_OPENAI_ENDPOINT` environment variable.
1. Setup GitHub Models
1. Get a Personal Access Token from https://github.com/settings/tokens and set the `GITHUB_TOKEN` environment variable. The token does not need any permissions.
1. Check the [GitHub Marketplace](https://github.com/marketplace/models) to see which models are available.


### Local LLM
1. `pip install not_again_ai[llm,local_llm]`
1. Some HuggingFace transformers tokenizers are gated behind access requests. If you wish to use these, you will need to request access from HuggingFace on the model card.
* Then set the `HF_TOKEN` environment variable to your HuggingFace API token which can be found here: https://huggingface.co/settings/tokens
1. If you wish to use Ollama:
1. If you wish to use Ollama:
1. Follow the instructions at https://github.com/ollama/ollama to install Ollama for your system.
1. (Optional) [Add Ollama as a startup service (recommended)](https://github.com/ollama/ollama/blob/main/docs/linux.md#adding-ollama-as-a-startup-service-recommended)
1. (Optional) To make the Ollama service accessible on your local network from a Linux server, add the following to the `/etc/systemd/system/ollama.service` file which will make Ollama available at `http://<local_address>:11434`:
@@ -68,7 +57,6 @@
Environment="OLLAMA_HOST=0.0.0.0"
```
1. It is recommended to always have the latest version of Ollama. To update Ollama check the [docs](https://github.com/ollama/ollama/blob/main/docs/). The command for Linux is: `curl -fsSL https://ollama.com/install.sh | sh`
1. HuggingFace transformers and other requirements are hardware dependent so for providers other than Ollama, this only installs some generic dependencies. Check the [notebooks](notebooks/local_llm/) for more details on what is available and how to install it.
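After setting `OLLAMA_HOST=0.0.0.0`, reload and restart the service (`systemctl daemon-reload && systemctl restart ollama`); you can then sanity-check reachability from another machine with `curl http://<local_address>:11434/api/tags`, which should list the locally pulled models.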


### Statistics
119 changes: 54 additions & 65 deletions notebooks/llm/01_openai_chat_completion.ipynb
@@ -6,7 +6,7 @@
"source": [
"# Using OpenAI Chat Completions\n",
"\n",
"This notebook covers how to use the Chat Completions API and other features such as creating prompts and function calling."
"This notebook covers how to use the Chat Completions API and other features such as creating prompts and function calling.\n"
]
},
{
@@ -22,11 +22,11 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from not_again_ai.llm.openai_api.openai_client import openai_client\n",
"from not_again_ai.llm.chat_completion.providers.openai_api import openai_client\n",
"\n",
"client = openai_client()"
]
@@ -37,14 +37,14 @@
"source": [
"## Basic Chat Completion\n",
"\n",
"The `chat_completion` function is an easy way to get responses from OpenAI models. \n",
"It requires the prompt to the model to be formatted in the chat completion format, \n",
"see the [API reference](https://platform.openai.com/docs/api-reference/chat/create) for more details."
"The `chat_completion` function is an easy way to get responses from OpenAI models.\n",
"It requires the prompt to the model to be formatted in the chat completion format,\n",
"see the [API reference](https://platform.openai.com/docs/api-reference/chat/create) for more details.\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@@ -53,18 +53,27 @@
"'Hello! How can I assist you today?'"
]
},
"execution_count": 7,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from not_again_ai.llm.openai_api.chat_completion import chat_completion\n",
"from not_again_ai.llm.chat_completion import chat_completion\n",
"from not_again_ai.llm.chat_completion.types import ChatCompletionRequest, SystemMessage, UserMessage\n",
"\n",
"messages = [{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, {\"role\": \"user\", \"content\": \"Hello!\"}]\n",
"response = chat_completion(messages=messages, model=\"gpt-4o-mini-2024-07-18\", max_tokens=100, client=client)\n",
"messages = [\n",
" SystemMessage(content=\"You are a helpful assistant.\"),\n",
" UserMessage(content=\"Hello!\"),\n",
"]\n",
"request = ChatCompletionRequest(\n",
" messages=messages,\n",
" model=\"gpt-4o-mini-2024-07-18\",\n",
" max_completion_tokens=100,\n",
")\n",
"response = chat_completion(request, \"openai\", client)\n",
"\n",
"response[\"message\"]"
"response.choices[0].message.content"
]
},
{
@@ -75,50 +84,46 @@
"\n",
"Injecting variables into prompts is a common task and we provide the `chat_prompt` which uses [Liquid templating](https://jg-rp.github.io/liquid/).\n",
"\n",
"In the `messages_unformatted` argument, the \"content\" field can be a [Python Liquid](https://jg-rp.github.io/liquid/introduction/getting-started) template string to allow for more dynamic prompts which not only supports variable injection, but also conditional logic, loops, and comments.\n"
"In the `messages` argument, the \"content\" field can be a [Python Liquid](https://jg-rp.github.io/liquid/introduction/getting-started) template string to allow for more dynamic prompts which not only supports variable injection, but also conditional logic, loops, and comments.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'role': 'system',\n",
" 'content': '- You are a helpful assistant trying to extract places that occur in a given text.\\n- You must identify all the places in the text and return them in a list like this: [\"place1\", \"place2\", \"place3\"].'},\n",
" {'role': 'user',\n",
" 'content': 'Here is the text I want you to extract places from:\\nI went to Paris and Berlin.'}]"
"[SystemMessage(content='- You are a helpful assistant trying to extract places that occur in a given text.\\n- You must identify all the places in the text and return them in a list like this: [\"place1\", \"place2\", \"place3\"].', role=<Role.SYSTEM: 'system'>, name=None),\n",
" UserMessage(content='Here is the text I want you to extract places from:\\nI went to Paris and Berlin.', role=<Role.USER: 'user'>, name=None)]"
]
},
"execution_count": 8,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from not_again_ai.llm.openai_api.prompts import chat_prompt\n",
"from not_again_ai.llm.prompting.compile_messages import compile_messages\n",
"\n",
"place_extraction_prompt = [\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": \"\"\"- You are a helpful assistant trying to extract places that occur in a given text.\n",
"- You must identify all the places in the text and return them in a list like this: [\"place1\", \"place2\", \"place3\"].\"\"\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": \"\"\"Here is the text I want you to extract places from:\n",
" SystemMessage(\n",
" content=\"\"\"- You are a helpful assistant trying to extract places that occur in a given text.\n",
"- You must identify all the places in the text and return them in a list like this: [\"place1\", \"place2\", \"place3\"].\"\"\"\n",
" ),\n",
" UserMessage(\n",
" content=\"\"\"Here is the text I want you to extract places from:\n",
"{%- # The user's input text goes below %}\n",
"{{text}}\"\"\",\n",
" },\n",
" ),\n",
"]\n",
"\n",
"variables = {\n",
" \"text\": \"I went to Paris and Berlin.\",\n",
"}\n",
"\n",
"messages = chat_prompt(messages_unformatted=place_extraction_prompt, variables=variables)\n",
"messages = compile_messages(messages=place_extraction_prompt, variables=variables)\n",
"messages"
]
},
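The cell above only exercises a Liquid comment and variable injection. Since the prose also mentions conditionals and loops, here is a small sketch of that syntax in the same `compile_messages` flow (the template content is illustrative, not from the package):

```python
from not_again_ai.llm.chat_completion.types import UserMessage
from not_again_ai.llm.prompting.compile_messages import compile_messages

# Illustrative template: a Liquid conditional plus a loop over few-shot examples.
prompt = [
    UserMessage(
        content="""Extract places from the text below.
{%- if max_places %}
Return at most {{ max_places }} places.
{%- endif %}
{%- for example in examples %}
Example output: {{ example }}
{%- endfor %}
{{ text }}"""
    )
]

messages = compile_messages(
    messages=prompt,
    variables={
        "max_places": 3,
        "examples": ['["Paris", "Berlin"]'],
        "text": "I went to Paris and Berlin.",
    },
)
print(messages[0].content)
```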
@@ -132,12 +137,12 @@
"\n",
"We explicitly require a tokenizer since loading it has some overhead, so we want to avoid doing so many times for certain use cases.\n",
"\n",
"NOTE: This function not support counting tokens used by function calling."
"NOTE: This function not support counting tokens used by function calling.\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 4,
"metadata": {},
"outputs": [
{
@@ -149,10 +154,10 @@
}
],
"source": [
"from not_again_ai.llm.openai_api.tokens import load_tokenizer, num_tokens_from_messages\n",
"from not_again_ai.llm.prompting.providers.openai_tiktoken import TokenizerOpenAI\n",
"\n",
"tokenizer = load_tokenizer(model=\"gpt-4o-2024-05-13\")\n",
"num_tokens = num_tokens_from_messages(messages=messages, tokenizer=tokenizer, model=\"gpt-4o-mini-2024-07-18\")\n",
"tokenizer = TokenizerOpenAI(model=\"gpt-4o-mini-2024-07-18\")\n",
"num_tokens = tokenizer.num_tokens_in_messages(messages=messages)\n",
"print(num_tokens)"
]
},
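A common follow-up to counting prompt tokens is budgeting the completion size. A minimal sketch, assuming a 128k-token context window for gpt-4o-mini (the window size is an assumption here, not something this notebook states):

```python
from not_again_ai.llm.chat_completion.types import SystemMessage, UserMessage
from not_again_ai.llm.prompting.providers.openai_tiktoken import TokenizerOpenAI

messages = [
    SystemMessage(content="You are a helpful assistant."),
    UserMessage(content="Hello!"),
]

tokenizer = TokenizerOpenAI(model="gpt-4o-mini-2024-07-18")
prompt_tokens = tokenizer.num_tokens_in_messages(messages=messages)

CONTEXT_WINDOW = 128_000  # Assumed for gpt-4o-mini; verify against the model card.
print(f"{prompt_tokens} prompt tokens leave roughly {CONTEXT_WINDOW - prompt_tokens} tokens for the completion.")
```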
@@ -161,34 +166,24 @@
"metadata": {},
"source": [
"## Chat Completion with Function Calling and other Parameters\n",
"\n",
"The `chat_completion` function can also be used to call functions in the prompt and a myriad of other commonly used parameters like temperature, max_tokens, and logprobs. See the docstring for more details.\n",
"\n",
"See the [gpt-4-v.ipynb](gpt-4-v.ipynb) for full details on how to use the vision features of `chat_completion` and `chat_prompt`."
"See the [gpt-4-v.ipynb](gpt-4-v.ipynb) for full details on how to use the vision features of `chat_completion` and `chat_prompt`.\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'choices': [{'finish_reason': 'stop',\n",
" 'message': None,\n",
" 'tool_names': ['get_current_weather'],\n",
" 'tool_args_list': [{'location': 'Boston, MA', 'format': 'fahrenheit'}]},\n",
" {'finish_reason': 'stop',\n",
" 'message': None,\n",
" 'tool_names': ['get_current_weather'],\n",
" 'tool_args_list': [{'location': 'Boston, MA', 'format': 'fahrenheit'}]}],\n",
" 'completion_tokens': 40,\n",
" 'prompt_tokens': 101,\n",
" 'system_fingerprint': 'fp_611b667b19',\n",
" 'response_duration': 0.786}"
"ChatCompletionResponse(choices=[ChatCompletionChoice(message=AssistantMessage(content='', role=<Role.ASSISTANT: 'assistant'>, name=None, refusal=None, tool_calls=[ToolCall(id='call_2moHfKov3UxMINi9umf6zod1', function=Function(name='get_current_weather', arguments={'location': 'Boston, MA', 'format': 'fahrenheit'}), type='function')]), finish_reason='tool_calls', json_message=None, logprobs=None, extras={}), ChatCompletionChoice(message=AssistantMessage(content='', role=<Role.ASSISTANT: 'assistant'>, name=None, refusal=None, tool_calls=[ToolCall(id='call_LZA4NljGxgSZ6dGc4hdO9hBA', function=Function(name='get_current_weather', arguments={'location': 'Boston, MA', 'format': 'fahrenheit'}), type='function')]), finish_reason='tool_calls', json_message=None, logprobs=None, extras={})], errors='', completion_tokens=46, prompt_tokens=99, completion_detailed_tokens=None, prompt_detailed_tokens=None, response_duration=4.4759, system_fingerprint='fp_bd83329f63', extras={'prompt_filter_results': None})"
]
},
"execution_count": 10,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -221,37 +216,31 @@
"]\n",
"# Ask the model to call the function\n",
"messages = [\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": \"What's the current weather like in {{ city_state }} today? Call the get_current_weather function.\",\n",
" }\n",
" UserMessage(\n",
" content=\"What's the current weather like in {{ city_state }} today? Call the get_current_weather function.\",\n",
" )\n",
"]\n",
"\n",
"messages = chat_prompt(messages_unformatted=messages, variables={\"city_state\": \"Boston, MA\"})\n",
"messages = compile_messages(messages=messages, variables={\"city_state\": \"Boston, MA\"})\n",
"\n",
"client = openai_client()\n",
"\n",
"response = chat_completion(\n",
"request = ChatCompletionRequest(\n",
" messages=messages,\n",
" model=\"gpt-4o-mini-2024-07-18\",\n",
" client=client,\n",
" tools=tools,\n",
" tool_choice=\"required\", # Force the model to use the tool\n",
" max_tokens=300,\n",
" max_completion_tokens=300,\n",
" temperature=0,\n",
" logprobs=(True, 2), # logprobs=(True, 2) returns the log probabilities of the top 2 tokens\n",
" log_probs=True,\n",
" top_log_probs=2, # returns the log probabilities of the top 2 tokens\n",
" seed=42, # Set the seed for reproducibility. The API will also return a `system_fingerprint` field to monitor changes in the backend.\n",
" n=2, # Generate 2 completions at once\n",
")\n",
"response = chat_completion(request, \"openai\", client)\n",
"response"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
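The `ChatCompletionResponse` repr in the output above is dense, so one closing note: based purely on the field names visible in that repr, the tool calls can be pulled out of the typed response roughly like this (a sketch inferred from the printed structure, assuming `response` from the final cell):

```python
# Sketch inferred from the ChatCompletionResponse repr shown in the notebook output.
for choice in response.choices:
    for tool_call in choice.message.tool_calls or []:
        print(tool_call.function.name)       # e.g. "get_current_weather"
        print(tool_call.function.arguments)  # e.g. {'location': 'Boston, MA', 'format': 'fahrenheit'}
```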