
Commit b822314: add docs
1 parent f68fa7e

6 files changed, +87 -0 lines changed


doc/source/getting_started/installation.rst

Lines changed: 2 additions & 0 deletions

@@ -88,7 +88,9 @@ Currently, supported models include:
 - ``minicpm3-4b``
 - ``internlm3-instruct``
 - ``moonlight-16b-a3b-instruct``
+- ``qwenLong-l1``
 - ``qwen3``
+- ``minicpm4``
 .. vllm_end
 
 To install Xinference and vLLM::
doc/source/models/builtin/image/flux.1-kontext-dev.rst (new file)

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+.. _models_builtin_flux.1-kontext-dev:
+
+==================
+FLUX.1-Kontext-dev
+==================
+
+- **Model Name:** FLUX.1-Kontext-dev
+- **Model Family:** stable_diffusion
+- **Abilities:** image2image
+- **Available ControlNet:** None
+
+Specifications
+^^^^^^^^^^^^^^
+
+- **Model ID:** black-forest-labs/FLUX.1-Kontext-dev
+- **GGUF Model ID**: bullerwins/FLUX.1-Kontext-dev-GGUF
+- **GGUF Quantizations**: BF16, Q2_K, Q3_K_S, Q4_K_M, Q4_K_S, Q5_K_M, Q5_K_S, Q6_K, Q8_0
+
+
+Execute the following command to launch the model::
+
+   xinference launch --model-name FLUX.1-Kontext-dev --model-type image
+
+
+For a GGUF quantization, use the following command, replacing ``${gguf_quantization}`` with one of the quantizations listed above::
+
+   xinference launch --model-name FLUX.1-Kontext-dev --model-type image --gguf_quantization ${gguf_quantization} --cpu_offload True
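The new page stops at the launch commands. As a usage illustration (not part of this commit), here is a minimal Python sketch; it assumes a local Xinference server at the default ``http://localhost:9997`` endpoint and that image-model handles from ``xinference.client.Client`` expose an ``image_to_image`` method, as other image2image models do. File names and the prompt are placeholders.

.. code-block:: python

    # Minimal sketch (assumption, not from this commit): edit an image with
    # FLUX.1-Kontext-dev through the Xinference Python client.
    from xinference.client import Client

    client = Client("http://localhost:9997")  # assumed default local endpoint

    # Equivalent to: xinference launch --model-name FLUX.1-Kontext-dev --model-type image
    model_uid = client.launch_model(
        model_name="FLUX.1-Kontext-dev",
        model_type="image",
    )
    model = client.get_model(model_uid)

    # image2image: provide a source image plus an edit instruction.
    with open("input.png", "rb") as f:
        result = model.image_to_image(
            image=f.read(),
            prompt="Replace the background with a snowy mountain landscape",
        )
    print(result)  # response typically carries the generated image data or URLs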

doc/source/models/builtin/image/index.rst

Lines changed: 2 additions & 0 deletions

@@ -15,6 +15,8 @@ The following is a list of built-in image models in Xinference:
 
    flux.1-dev
 
+   flux.1-kontext-dev
+
    flux.1-schnell
 
    got-ocr2_0
doc/source/models/builtin/llm/index.rst

Lines changed: 7 additions & 0 deletions

@@ -491,6 +491,11 @@ The following is a list of built-in LLMs in Xinference:
      - 40960
      - Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support
 
+   * - :ref:`qwenlong-l1 <models_llm_qwenlong-l1>`
+     - chat
+     - 32768
+     - QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
+
    * - :ref:`qwq-32b <models_llm_qwq-32b>`
      - chat, reasoning, tools
      - 131072

@@ -796,6 +801,8 @@ The following is a list of built-in LLMs in Xinference:
 
    qwen3
 
+   qwenlong-l1
+
    qwq-32b
 
    qwq-32b-preview
doc/source/models/builtin/llm/qwenlong-l1.rst (new file)

Lines changed: 47 additions & 0 deletions

@@ -0,0 +1,47 @@
+.. _models_llm_qwenlong-l1:
+
+========================================
+qwenLong-l1
+========================================
+
+- **Context Length:** 32768
+- **Model Name:** qwenLong-l1
+- **Languages:** en, zh
+- **Abilities:** chat
+- **Description:** QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
+
+Specifications
+^^^^^^^^^^^^^^
+
+
+Model Spec 1 (pytorch, 32 Billion)
+++++++++++++++++++++++++++++++++++++++++
+
+- **Model Format:** pytorch
+- **Model Size (in billions):** 32
+- **Quantizations:** none
+- **Engines**: vLLM, Transformers
+- **Model ID:** Tongyi-Zhiwen/QwenLong-L1-32B
+- **Model Hubs**: `Hugging Face <https://huggingface.co/Tongyi-Zhiwen/QwenLong-L1-32B>`__, `ModelScope <https://modelscope.cn/models/iic/QwenLong-L1-32B>`__
+
+Execute the following command to launch the model. Remember to replace ``${quantization}`` with your
+chosen quantization method from the options listed above::
+
+   xinference launch --model-engine ${engine} --model-name qwenLong-l1 --size-in-billions 32 --model-format pytorch --quantization ${quantization}
+
+
+Model Spec 2 (awq, 32 Billion)
+++++++++++++++++++++++++++++++++++++++++
+
+- **Model Format:** awq
+- **Model Size (in billions):** 32
+- **Quantizations:** Int4
+- **Engines**: vLLM, Transformers
+- **Model ID:** Tongyi-Zhiwen/QwenLong-L1-32B-AWQ
+- **Model Hubs**: `Hugging Face <https://huggingface.co/Tongyi-Zhiwen/QwenLong-L1-32B-AWQ>`__, `ModelScope <https://modelscope.cn/models/iic/QwenLong-L1-32B-AWQ>`__
+
+Execute the following command to launch the model. Remember to replace ``${quantization}`` with your
+chosen quantization method from the options listed above::
+
+   xinference launch --model-engine ${engine} --model-name qwenLong-l1 --size-in-billions 32 --model-format awq --quantization ${quantization}
+
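Launching is only half the story for a chat model, so a hedged usage sketch follows (not part of this commit). It assumes the server's OpenAI-compatible API at the default ``http://localhost:9997/v1`` and that the launched model's UID is ``qwenLong-l1``; both are assumptions to adjust for your deployment.

.. code-block:: python

    # Minimal sketch (assumption, not from this commit): query qwenLong-l1 via
    # Xinference's OpenAI-compatible endpoint using the official openai client.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:9997/v1",  # assumed default Xinference endpoint
        api_key="not-used",                   # Xinference needs no real key by default
    )

    # QwenLong-L1 targets long-context reasoning, so a long document can go
    # directly into the user message (up to the 32768-token context length).
    response = client.chat.completions.create(
        model="qwenLong-l1",  # assumed model UID from the launch command above
        messages=[
            {"role": "user", "content": "Summarize the key findings: <long document here>"},
        ],
    )
    print(response.choices[0].message.content)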

doc/source/user_guide/backends.rst

Lines changed: 2 additions & 0 deletions

@@ -151,7 +151,9 @@ Currently, supported models include:
 - ``minicpm3-4b``
 - ``internlm3-instruct``
 - ``moonlight-16b-a3b-instruct``
+- ``qwenLong-l1``
 - ``qwen3``
+- ``minicpm4``
 .. vllm_end
 
 .. _sglang_backend:
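Since this hunk extends the vLLM allowlist, a short sketch of selecting the vLLM backend programmatically may help. It is an assumption, not part of the commit, and presumes ``Client.launch_model`` accepts a ``model_engine`` argument as in recent Xinference releases; the spec values come from the qwenlong-l1 page above.

.. code-block:: python

    # Minimal sketch (assumption, not from this commit): pin a newly allowlisted
    # model to the vLLM backend via the Python client.
    from xinference.client import Client

    client = Client("http://localhost:9997")  # assumed local server

    model_uid = client.launch_model(
        model_name="qwenLong-l1",
        model_engine="vllm",          # choose the vLLM backend explicitly
        model_size_in_billions=32,
        model_format="pytorch",
    )
    print(f"Launched qwenLong-l1 on vLLM with UID {model_uid}")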
