5. Model configuration

The Coze Loop Open-source Edition supports various LLM models through the Eino framework. This document introduces the list of supported models and the steps to configure models for Coze Loop.

Model list

The models supported by the Coze Loop Open-source Edition are as follows:

  • Volcengine Ark | BytePlus ModelArk
  • OpenAI
  • DeepSeek
  • Claude
  • Gemini
  • Ollama
  • Qwen
  • Qianfan

Important notes

Before modifying the model configuration, make sure you understand the following important notes:

  • Ensure that each model's id is globally unique and greater than zero. Do not change an id after the model is online.
  • Before deleting a model, please ensure that the model no longer handles online traffic.
  • Ensure that the evaluators are equipped with models having strong function call capabilities; otherwise, the evaluators might not function properly.
  • If you use a Qianfan model, besides the configuration file model_config.yaml, you also need to configure qianfan_ak and qianfan_sk in conf/default/app/runtime/model_runtime_config (see the Qianfan section below for a sketch).

Configure the model for Coze Loop

Before deploying the Coze Loop open-source edition, you need to first fill in the model configuration file conf/default/app/runtime/model_config.yaml to set up the list of models available to Coze Loop.

  • In the model_config.yaml configuration file, each id represents one model; ids must be globally unique and greater than zero.
  • The most critical step in model configuration is setting the model and api_key fields; make sure these values are correct.
  • Configuration fields differ between model series. Refer to the configuration file example for each model and modify the key settings as needed; fields that require modification are marked with Change It. A minimal skeleton of a single entry follows.
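
The skeleton below shows the minimal shape of one entry in the models list. The field names come from the full examples later on this page; every value is a placeholder to replace with your own.

models:
  - id: 1                # Globally unique, must be greater than 0
    name: "my-model"     # Model name
    frame: "eino"        # Fixed value
    protocol: "openai"   # ark / openai / deepseek / claude / gemini / ollama / qwen / qianfan / arkbot
    protocol_config:
      api_key: "***"     # Your provider's API key
      model: "***"       # Model or endpoint ID at the provider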

Step one: Modify the model configuration file

Here, the OpenAI and Volcengine Ark models are used as examples to demonstrate the steps for configuring the model file, so that you can quickly set up models for installing and testing the Coze Loop open-source edition. The steps for other models, such as DeepSeek, are similar.

  1. Enter the directory conf/default/app/runtime/

  2. Edit the file model_config.yaml and modify the api_key and model fields. The following content configures a Volcengine Ark (Doubao) model and an OpenAI model for the Coze Loop Open-source Edition. Overwrite the original file with the content below, then replace the api_key and model values with your own Volcengine Ark and OpenAI configuration parameters.

    models:
      - id: 1
        name: "doubao"
        frame: "eino"
        protocol: "ark"
        protocol_config:
          base_url: ""    # Default is https://ark.cn-beijing.volces.com/api/v3/; if using BytePlus ModelArk, use https://ark.ap-southeast.bytepluses.com/api/v3/
          api_key: "***"  # Ark model API Key. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1541594; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1361424?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
          model: "***"    # Ark model Endpoint ID. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1099522; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1099522?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
        param_config:
          param_schemas:
            - name: "temperature"
              label: "temperature"
              desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
              type: "float"
              min: "0"
              max: "1.0"
              default_val: "0.7"
            - name: "max_tokens"
              label: "max_tokens"
              desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
              type: "int"
              min: "1"
              max: "4096"
              default_val: "2048"
            - name: "top_p"
              label: "top_p"
              desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
              type: "float" #
              min: "0.001"
              max: "1.0"
              default_val: "0.7"
      - id: 2
        name: "openapi"
        frame: "eino"
        protocol: "openai"
        protocol_config:
          api_key: "***"  # OpenAI API Key
          model: "***"    # OpenAI Model ID
        param_config:
          param_schemas:
            - name: "temperature"
              label: "temperature"
              desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
              type: "float"
              min: "0"
              max: "1.0"
              default_val: "0.7"
            - name: "max_tokens"
              label: "max_tokens"
              desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
              type: "int"
              min: "1"
              max: "4096"
              default_val: "2048"
            - name: "top_p"
              label: "top_p"
              desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
              type: "float" #
              min: "0.001"
              max: "1.0"
              default_val: "0.7"
  3. Save the file.
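
Optionally, before applying the change, you can sanity-check that the edited file is still valid YAML. One way to do this, assuming Python with PyYAML is available on the host:

    # Parses the file; no output means the syntax is valid, a traceback points at the offending line
    python -c "import yaml; yaml.safe_load(open('conf/default/app/runtime/model_config.yaml'))"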

Step two: Apply the configuration

  • First deployment of the Coze Loop Open-source Edition: after modifying the configuration file, simply start the service. For detailed steps, refer to Quick Start.

    # Start the service, default is development mode
    docker compose up --build
  • Modifying the model configuration after deployment: the Coze Loop Open-source Edition watches the model configuration file automatically, so changes take effect without any further action. File watching can fail in some scenarios; for example, monitoring of mounted files may fail on macOS. If the model list is not updated, run the following command to restart the service manually without rebuilding the image.

    # Restart the service and keep RUN_MODE consistent
    docker compose restart app
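
To confirm that the new model list was picked up, you can follow the service logs (the service name app matches the restart command above):

    # Follow the logs of the app service
    docker compose logs -f app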

Description

See backend/modules/llm/domain/entity/manage.go for the detailed model struct definition. The following table lists all model configuration fields, describing the meaning of each field and whether it is required.

| Parameter | Required | Example | Description |
| --- | --- | --- | --- |
| id | Required | 1 | Unique identifier; must be greater than 0. |
| name | Required | GPT-4 multimodal model | Model name (e.g., "your-model-name"). |
| desc | Optional | Intelligent assistant supporting image understanding and text generation | Model description (e.g., "multi-modal chat model"). |
| ability.function_call | Optional | true | Enable function call capability (default false). |
| ability.multi_modal | Optional | true | Enable multi-modal capability (default false); requires configuring ability.multi_modal.image as well. |
| ability.multi_modal.image.url_enabled | Optional | true | Allow processing images via URL (requires multi_modal to be enabled). |
| ability.multi_modal.image.binary_enabled | Optional | true | Allow processing images via binary data (requires multi_modal to be enabled). |
| ability.multi_modal.image.max_image_size | Optional | 20 | Maximum image size (MB); 0 means no limit. |
| ability.multi_modal.image.max_image_count | Optional | 5 | Maximum number of images; 0 means no limit. |
| frame | Required | eino | Fixed value "eino". |
| protocol | Required | openai | Model protocol type: ark, openai, deepseek, qwen, qianfan, etc. |
| protocol_config.api_key | Required | sk-abc1234567890xyz | Model API key (e.g., OpenAI's sk-xxx). |
| protocol_config.model | Required | gpt-4-1106-preview | Name of the model to call (e.g., "gpt-4"). |
| protocol_config.base_url | Optional | https://api.openai.com/v1 | Model API base URL; the default value is used if not provided. |
| protocol_config_ark.region | Optional | cn-east-2 | ark protocol only: region configuration (e.g., "cn-north-1"). |
| protocol_config_ark.access_key | Optional | ark-access-key-123 | ark protocol only: access key. |
| protocol_config_ark.secret_key | Optional | ark-secret-key-456 | ark protocol only: secret key. |
| protocol_config_open_ai.by_azure | Optional | true | openai protocol only: whether to use the Azure version (default false). |
| protocol_config_claude.by_bedrock | Optional | true | claude protocol only: whether to use the Bedrock version (default false). |
| scenario_configs.default.scenario | Optional | default | Default scenario identifier, used when no scenario matches. |
| scenario_configs.default.quota.qpm | Optional | 60 | QPM limit for the default scenario (0 means no limit). |
| scenario_configs.default.unavailable | Optional | false | Whether the model is hidden in the default scenario (true means not visible; default false). |
| scenario_configs.evaluator.unavailable | Optional | true | Evaluator scenario: must be set to true if the model does not support function calls. |
| param_config.param_schemas.name | Required | temperature | Parameter name (e.g., "temperature"). |
| param_config.param_schemas.label | Required | randomness of text generated by LLMs | Frontend display name. |
| param_config.param_schemas.desc | Required | The higher the value, the more diverse and creative the output | Frontend description (e.g., "Increase to make the output more diverse"). |
| param_config.param_schemas.type | Required | float | Parameter type (float/int/bool/string). |
| param_config.param_schemas.min | Required | 0 | Minimum value of the parameter (e.g., 0 for temperature). |
| param_config.param_schemas.max | Required | 1.0 | Maximum value of the parameter (e.g., 4096 for max_tokens). |
| param_config.param_schemas.default_val | Required | 0.7 | Default value of the parameter (e.g., 0.7 for top_p). |
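
The sketch below assembles the optional ability and scenario_configs fields from the table into a single entry. The nesting mirrors the dotted field paths above, but none of the examples on this page show these fields, so verify the exact structure against conf/default/app/model_config_example before relying on it.

- id: 3
  name: "GPT-4 multimodal model"
  desc: "Intelligent assistant supporting image understanding and text generation"
  frame: "eino"
  protocol: "openai"
  ability:
    function_call: true     # Needed for evaluator scenarios
    multi_modal: true
    image:                  # Placement under ability is an assumption; check the example file
      url_enabled: true
      binary_enabled: true
      max_image_size: 20    # MB; 0 means no limit
      max_image_count: 5    # 0 means no limit
  protocol_config:
    api_key: "sk-abc1234567890xyz"
    model: "gpt-4-1106-preview"
  scenario_configs:
    default:
      scenario: "default"
      quota:
        qpm: 60             # 0 means no limit
      unavailable: false
    evaluator:
      unavailable: false    # Set to true if the model does not support function calls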

Example

For the complete configuration example and field descriptions for each model, refer to conf/default/app/model_config_example. You can also start from the minimal configurations below. The configurations are largely the same across models; the main difference is the protocol field.

Volcengine Ark

- id: 1                   # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "ark" 
  protocol_config:
    base_url: "" # Default is https://ark.cn-beijing.volces.com/api/v3/; if using BytePlus ModelArk, use https://ark.ap-southeast.bytepluses.com/api/v3/
    api_key: ""  # Ark model API Key. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1541594; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1361424?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
    model: ""    # Ark model Endpoint ID. For users in China, refer to the Volcengine Ark documentation at https://www.volcengine.com/docs/82379/1099522; for users outside China, refer to the BytePlus ModelArk documentation at https://docs.byteplus.com/en/docs/ModelArk/1099522?utm_source=github&utm_medium=readme&utm_campaign=coze_open_source
  param_config:  # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

Claude

- id: 1                    # Change It
  name: "your model name"  # Change It
  frame: "eino"
  protocol: "claude" 
  protocol_config:
    api_key: "" # Change It. Replace with the API key you have applied for
    model: ""   # Change It. Replace with the model ID you have activated
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

DeepSeek

- id: 1                    # Change It
  name: "your model name"  # Change It
  frame: "eino"
  protocol: "deepseek" 
  protocol_config:
    api_key: "" # Change It. Replace with the API key you have applied for
    model: ""   # Change It. Replace with the model ID you have activated
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

Gemini

- id: 1                   # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "gemini" 
  protocol_config:
    api_key: "" # Change It. Replace with the API key you have applied for
    model: ""   # Change It. Replace with the model ID you have activated
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

Ollama

- id: 1                   # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "ollama" 
  protocol_config:
    base_url: "" # Change It. Replace with the URL of your Ollama deployment (e.g., http://localhost:11434)
    api_key: ""  # Change It. Replace with the API key, if your Ollama deployment requires one
    model: ""    # Change It. Replace with the name of the model you have pulled in Ollama
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

OpenAI

- id: 1                   # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "openai" 
  protocol_config:
    api_key: "" # Change It. Replace with the API key you have applied for
    model: ""   # Change It. Replace with the Model ID you have activated
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

Qianfan

If you use a Qianfan model, in addition to the configuration file model_config.yaml, you also need to configure qianfan_ak and qianfan_sk in the backend/modules/llm/infra/config/runtime_config.yaml file.
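
A minimal sketch of that file, assuming qianfan_ak and qianfan_sk are top-level keys (check the shipped runtime_config.yaml for the exact layout):

qianfan_ak: ""  # Change It. Replace with your Qianfan Access Key (AK)
qianfan_sk: ""  # Change It. Replace with your Qianfan Secret Key (SK)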

- id: 1                   # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "qianfan" 
  protocol_config:
    model: ""   # Change It. Replace with the model ID you have activated
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

Qwen

- id: 1                   # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "qwen" 
  protocol_config:
    api_key: "" # Change It. Replace with the API key you have applied for
    model: ""   # Change It. Replace with the model ID you have activated
  param_config: # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"

Arkbot

- id: 1                   # Change It
  name: "your model name" # Change It
  frame: "eino"
  protocol: "arkbot" 
  protocol_config:
    base_url: "" # Default is https://ark.cn-beijing.volces.com/api/v3/; if using BytePlus ModelArk, use https://ark.ap-southeast.bytepluses.com/api/v3/
    api_key: ""  # Change It. Replace with the API key you have applied for
    model: ""    # Change It. Replace with the Model ID you have activated
  param_config:  # Usually no modifications are needed, as this determines which parameters are adjustable on the front end, their adjustable ranges, and default values.
    param_schemas:
      - name: "temperature"
        label: "temperature"
        desc: "Increasing the temperature will make the model output more diverse and creative. Conversely, lowering the temperature will make the output more compliant with instructions but less diverse. It is recommended not to adjust together with 'Top p'."
        type: "float"
        min: "0"
        max: "1.0"
        default_val: "0.7"
      - name: "max_tokens"
        label: "max_tokens"
        desc: "Controls the maximum length of tokens output by the model. Typically, 100 tokens are approximately equal to 150 Chinese characters."
        type: "int"
        min: "1"
        max: "4096"
        default_val: "2048"
      - name: "top_p"
        label: "top_p"
        desc: "During generation, selects the smallest set of tokens whose cumulative probability reaches top_p. Tokens outside the set are excluded, balancing diversity and reasonableness."
        type: "float" #
        min: "0.001"
        max: "1.0"
        default_val: "0.7"