5. Model configuration
The Coze Loop Open-source Edition supports various LLM models through the Eino framework. This document introduces the list of supported models and the steps to configure models for Coze Loop.
The models supported by the Coze Loop Open-source Edition are as follows:
- Volcengine Ark | BytePlus ModelArk
- OpenAI
- DeepSeek
- Claude
- Gemini
- Ollama
- Qwen
- Qianfan
Before modifying the model configuration, make sure you understand the following important notes:
- Ensure that the IDs of each model are globally unique and greater than zero. Do not modify the IDs after the models are online.
- Before deleting a model, please ensure that the model no longer handles online traffic.
- Make sure evaluators are configured with models that have strong function call capabilities; otherwise, the evaluators may not work properly. A sketch after this list shows how to hide an unsupported model from evaluators.
- If you use the Qianfan model, in addition to the configuration file model_config.yaml, you also need to configure qianfan_ak and qianfan_sk in the conf/default/app/runtime/model_runtime_config.yaml file; see the sketch after the Qianfan configuration block later in this document.
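For the evaluator note above, the scenario configuration described later in this document can hide a model that lacks function call support. A minimal sketch, assuming the nesting follows the dotted field paths in the field table below:

```yaml
scenario_configs:
  evaluator:
    unavailable: true # hide this model from the evaluator scenario because it does not support function calls
```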
Before deploying the Coze Loop Open-source Edition, you must first fill in the model configuration file conf/default/app/runtime/model_config.yaml to set up the list of models available to Coze Loop.
- In the model_config.yaml configuration file, each id represents a model; IDs must be unique and greater than 0.
- The most critical step in model configuration is setting the model and api_key fields; make sure these values are correct.
- Configuration fields differ between model series. Refer to the configuration file examples for the various models and modify the key settings as needed. Fields that need modification are marked with Change It.
The following uses the OpenAI and Volcengine Ark models as examples to demonstrate the steps for configuring the model file, so that you can quickly set up models when installing and testing the Coze Loop Open-source Edition. For other models, such as DeepSeek, refer to the configuration examples later in this document.
- Enter the directory conf/default/app/runtime/.
- Edit the file model_config.yaml and modify the api_key and model fields. The following content is the Volcengine Ark (Doubao) and OpenAI model configuration for the Coze Loop Open-source Edition. Overwrite the original file with the content below, then modify api_key and model, replacing them with your own Volcengine Ark and OpenAI model parameters.
models:
  - id: 1
    name: "doubao"
    frame: "eino"
    protocol: "ark"
    protocol_config:
      base_url: "" # Default is https://ark.cn-beijing.volces.com/api/v3/; if using BytePlus ModelArk, use https://ark.ap-southeast.bytepluses.com/api/v3/
      api_key: "***" # Ark model API Key. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1541594; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1361424?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
      model: "***" # Ark model Endpoint ID. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1099522; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1099522?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
    param_config:
      param_schemas:
        - name: "temperature"
          label: "temperature"
          desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
          type: "float"
          min: "0"
          max: "1.0"
          default_val: "0.7"
        - name: "max_tokens"
          label: "max_tokens"
          desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
          type: "int"
          min: "1"
          max: "4096"
          default_val: "2048"
        - name: "top_p"
          label: "top_p"
          desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
          type: "float"
          min: "0.001"
          max: "1.0"
          default_val: "0.7"
  - id: 2
    name: "openai"
    frame: "eino"
    protocol: "openai"
    protocol_config:
      api_key: "***" # OpenAI API Key
      model: "***" # OpenAI Model ID
    param_config:
      param_schemas:
        - name: "temperature"
          label: "temperature"
          desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
          type: "float"
          min: "0"
          max: "1.0"
          default_val: "0.7"
        - name: "max_tokens"
          label: "max_tokens"
          desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
          type: "int"
          min: "1"
          max: "4096"
          default_val: "2048"
        - name: "top_p"
          label: "top_p"
          desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
          type: "float"
          min: "0.001"
          max: "1.0"
          default_val: "0.7"
- Save the file.
- First deployment of the Coze Loop Open-source Edition: after modifying the configuration file, simply start the service. For detailed steps, refer to Quick Start.
  # Start the service (development mode by default)
  docker compose up --build
- Modifying the model configuration after deployment: the Coze Loop Open-source Edition automatically watches the model configuration file, so no action is required for changes to take effect. File watching can fail in some scenarios; for example, watching files mounted into containers may fail on macOS. If the model list does not update, run the following command to manually restart the service without rebuilding the image.
  # Restart the service and keep RUN_MODE consistent
  docker compose restart app
See backend/modules/llm/domain/entity/manage.go for the detailed definition of the model structure in code. The following table lists the complete set of model configuration fields, explaining the meaning of each field and whether it is required.
Parameter | Required | Example | Parameter description |
---|---|---|---|
id | Required | 1 | Unique identifier, must be greater than 0. |
name | Required | GPT-4 multimodal model | Model name (e.g., "your-model-name"). |
desc | Optional | Intelligent assistant supporting image understanding and text generation | Model description (e.g., “multi-modal chat model”). |
ability.function_call | Optional | true | Enable function call capability (default false). |
ability.multi_modal | Optional | true | Enable multi-modal capability (default false); requires configuring ability.multi_modal.image as well. |
ability.multi_modal.image.url_enabled | Optional | true | Allow processing images via URL (requires enabling multi_modal first). |
ability.multi_modal.image.binary_enabled | Optional | true | Allow processing images via binary data (requires enabling multi_modal first). |
ability.multi_modal.image.max_image_size | Optional | 20 | Maximum image size (MB), 0 means no limit. |
ability.multi_modal.image.max_image_count | Optional | 5 | Maximum number of images, 0 means no limit. |
frame | Required | eino | Fixed value "eino". |
protocol | Required | openai | Model protocol type, optional: ark/openai/deepseek/qwen/qianfan, etc. |
protocol_config.api_key | Required | sk-abc1234567890xyz | Model API token (e.g., OpenAI's sk-xxx). |
protocol_config.model | Required | gpt-4-1106-preview | The name of the called model (e.g., "gpt-4"). |
protocol_config.base_url | Optional | https://api.openai.com/v1 | Model API base URL, if not provided, the default value will be used. |
protocol_config_ark.region | Optional | cn-east-2 | For ark protocol only: region configuration (e.g., "cn-north-1"). |
protocol_config_ark.access_key | Optional | ark-access-key-123 | Only for ark protocol: access token. |
protocol_config_ark.secret_key | Optional | ark-secret-key-456 | Only for ark protocol: secret key. |
protocol_config_open_ai.by_azure | Optional | true | Only openai protocol: Whether to use Azure version (default false). |
protocol_config_claude.by_bedrock | Optional | true | Only claude protocol: Whether to use Bedrock version (default false). |
scenario_configs.default.scenario | Optional | default | Default scenario identifier, used when a scenario does not match. |
scenario_configs.default.quota.qpm | Optional | 60 | The default scenario QPM limit (0 means no limit). |
scenario_configs.default.unavailable | Optional | false | Whether the model is unavailable in the default scenario (true means not visible; default is false). |
scenario_configs.evaluator.unavailable | Optional | true | Evaluator scenario: if the model does not support function calls, this must be set to true. |
param_config.param_schemas.name | Required | temperature | Parameter name (e.g., "temperature"). |
param_config.param_schemas.label | Required | randomness of text generated by LLMs | Frontend display name (e.g., "randomness of text generated by LLMs"). |
param_config.param_schemas.desc | Required | The higher the value, the more diverse and creative the output | Frontend description (e.g., "Increase to make the output more diverse"). |
param_config.param_schemas.type | Required | float | Parameter type (float/int/bool/string). |
param_config.param_schemas.min | Required | 0 | The minimum value of the parameter (e.g., 0 for temperature). |
param_config.param_schemas.max | Required | 1.0 | The maximum value of the parameter (e.g., 4096 for max_tokens). |
param_config.param_schemas.default_val | Required | 0.7 | The default value of the parameter (e.g., 0.7 for top_p). |
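To see how the optional ability and scenario fields combine with the required ones, here is a sketch of a single model entry assembled directly from the dotted paths in the table above. The exact nesting is an assumption on my part; treat backend/modules/llm/domain/entity/manage.go and conf/default/app/model_config_example as authoritative.

```yaml
- id: 3                      # unique, greater than 0
  name: "your-model-name"
  desc: "multi-modal chat model"
  frame: "eino"              # fixed value
  protocol: "openai"
  ability:
    function_call: true      # needed for evaluator use
    multi_modal: true
    image:                   # assumption: image settings nest under ability per the table's dotted paths
      url_enabled: true      # accept images via URL
      binary_enabled: true   # accept images via binary data
      max_image_size: 20     # MB; 0 means no limit
      max_image_count: 5     # 0 means no limit
  protocol_config:
    api_key: "sk-abc1234567890xyz"
    model: "gpt-4-1106-preview"
  scenario_configs:
    default:
      scenario: "default"
      quota:
        qpm: 60              # 0 means no limit
      unavailable: false
```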
For the complete configuration example and field descriptions for each model, refer to conf/default/app/model_config_example.
You can also refer to the following minimal configurations. The field layout is basically the same for every model; the main differences are in the protocol field and the corresponding protocol_config.
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "ark"
  protocol_config:
    base_url: "" # Default is https://ark.cn-beijing.volces.com/api/v3/; if using BytePlus ModelArk, use https://ark.ap-southeast.bytepluses.com/api/v3/
    api_key: "" # Ark model API Key. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1541594; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1361424?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
    model: "" # Ark model Endpoint ID. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1099522; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1099522?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
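If you authenticate to Ark with an access key pair instead of an API key, the field table above lists protocol_config_ark fields for that; a hedged sketch with placeholder values:

```yaml
protocol_config_ark: # only for the ark protocol; alternative to protocol_config.api_key
  region: "cn-east-2"               # your Ark region
  access_key: "ark-access-key-123"  # access token
  secret_key: "ark-secret-key-456"  # secret key
```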
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "claude"
  protocol_config:
    api_key: "" # Change It. Replace with the API Key you have applied for.
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
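If you access Claude through AWS Bedrock rather than the Anthropic API, the field table above lists a protocol_config_claude.by_bedrock switch; a minimal sketch:

```yaml
protocol_config_claude: # only for the claude protocol
  by_bedrock: true # route requests through the Bedrock-hosted Claude (default false)
```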
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "deepseek"
  protocol_config:
    api_key: "" # Change It. Replace with the API Key you have applied for.
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "gemini"
  protocol_config:
    api_key: "" # Change It. Replace with the API Key you have applied for.
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "ollama"
  protocol_config:
    base_url: "" # Change It. Replace with the URL where your model is deployed.
    api_key: "" # Change It. Replace with the API Key you have applied for.
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
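A note on the Ollama base_url: when Coze Loop runs in Docker and Ollama runs on the host machine, the container usually cannot reach localhost. A hedged example, assuming Ollama listens on its default port 11434 and your Docker setup supports host.docker.internal:

```yaml
protocol_config:
  base_url: "http://host.docker.internal:11434" # assumption: Ollama on the Docker host, default port
  api_key: ""                                   # Ollama typically needs no key; leave empty unless you added auth
  model: "llama3"                               # assumption: a model you have pulled locally, e.g. via `ollama pull llama3`
```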
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "openai"
  protocol_config:
    api_key: "" # Change It. Replace with the API Key you have applied for.
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
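If your OpenAI access goes through Azure OpenAI, the field table above lists a protocol_config_open_ai.by_azure switch; a minimal sketch (pointing base_url at the Azure endpoint is an assumption on my part):

```yaml
protocol_config:
  base_url: "https://your-resource.openai.azure.com" # assumption: your Azure OpenAI endpoint
  api_key: "" # Change It. Your Azure OpenAI key.
  model: ""   # Change It. Your Azure deployment name.
protocol_config_open_ai: # only for the openai protocol
  by_azure: true # use the Azure version (default false)
```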
If using the Qianfan model, in addition to the configuration file model_config.yaml, you also need to configure qianfan_ak and qianfan_sk in the backend/modules/llm/infra/config/runtime_config.yaml file, as shown in the sketch after the following example.
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "qianfan"
  protocol_config:
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
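As noted above, Qianfan also needs credentials in the runtime configuration file (conf/default/app/runtime/model_runtime_config.yaml in a deployment, backend/modules/llm/infra/config/runtime_config.yaml in the source tree). A minimal sketch, assuming top-level keys; check the shipped file for the exact placement:

```yaml
# runtime configuration file (key placement is an assumption; verify against the shipped file)
qianfan_ak: "your-qianfan-access-key" # Qianfan Access Key (AK)
qianfan_sk: "your-qianfan-secret-key" # Qianfan Secret Key (SK)
```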
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "qwen"
  protocol_config:
    api_key: "" # Change It. Replace with the API Key you have applied for.
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
- id: 1 # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "arkbot"
  protocol_config:
    base_url: "" # Default is https://ark.cn-beijing.volces.com/api/v3/; if using BytePlus ModelArk, use https://ark.ap-southeast.bytepluses.com/api/v3/
    api_key: "" # Change It. Replace with the API Key you have applied for.
    model: "" # Change It. Replace with the model ID you have activated.
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float"
        min: "0.001"
        max: "1.0"
        default_val: "0.7"
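After editing, a quick syntax check can save a restart cycle. A hedged one-liner, assuming Python 3 with PyYAML is available on the host; any YAML linter works equally well:

```bash
# Fail fast on YAML syntax errors before restarting the service
python3 -c "import yaml; yaml.safe_load(open('conf/default/app/runtime/model_config.yaml')); print('model_config.yaml parses OK')"
```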