- 2024.07.17: Support the newly released InternVL2 models; the available `model_type` values are internvl2-1b, internvl2-40b, and internvl2-llama3-76b. For best practices, refer to [here](docs/source_en/Multi-Modal/internvl-best-practice.md).
- 2024.07.17: Support the training and inference of [NuminaMath-7B-TIR](https://huggingface.co/AI-MO/NuminaMath-7B-TIR). Use with model_type `numina-math-7b`.
- 🔥2024.07.16: Support exporting to Ollama and quantizing with bitsandbytes. Use `swift export --model_type xxx --to_ollama true` or `swift export --model_type xxx --quant_method bnb --quant_bits 4`.
- 2024.07.08: Support cogvlm2-video-13b-chat. You can check the best practice [here](docs/source_en/Multi-Modal/cogvlm2-video-best-practice.md).
- 2024.07.08: Support internlm-xcomposer2_5-7b-chat. You can check the best practice [here](docs/source_en/Multi-Modal/internlm-xcomposer2-best-practice.md).
- 🔥2024.07.06: Support for the llava-next-video series models: llava-next-video-7b-instruct, llava-next-video-7b-32k-instruct, llava-next-video-7b-dpo-instruct, llava-next-video-34b-instruct. You can refer to [llava-video best practice](docs/source_en/Multi-Modal/llava-video-best-practice.md) for more information.
- 🔥2024.07.06: Support InternVL2 series: internvl2-2b, internvl2-4b, internvl2-8b, internvl2-26b.
- 2024.07.06: Support codegeex4-9b-chat.
- 2024.07.04: Support internlm2_5-7b series: internlm2_5-7b, internlm2_5-7b-chat, internlm2_5-7b-chat-1m.
- 2024.07.02: Support for using vLLM for accelerating inference and deployment of multimodal large models such as the llava series and phi3-vision models. You can refer to the [Multimodal & vLLM Inference Acceleration Documentation](docs/source_en/Multi-Modal/vllm-inference-acceleration.md) for more information.
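The Ollama and bitsandbytes export commands from the 2024.07.16 item above can be written out concretely. This is a minimal sketch: `llama3-8b-instruct` is only a placeholder `model_type`, not a recommendation; substitute any model type supported by swift.

```shell
# Export a model to an Ollama Modelfile
# (placeholder model_type; substitute your own).
swift export --model_type llama3-8b-instruct --to_ollama true

# Quantize the same model to 4 bits with bitsandbytes.
swift export --model_type llama3-8b-instruct --quant_method bnb --quant_bits 4
```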
| Llava1.5<br>Llava1.6 |[Llava series models](https://github.com/haotian-liu/LLaVA)| English | 7B-34B | chat model |
| Llava-Next<br>Llava-Next-Video |[Llava-Next series models](https://github.com/LLaVA-VL/LLaVA-NeXT)| Chinese<br>English | 7B-110B | chat model |
| mPLUG-Owl |[mPLUG-Owl series models](https://github.com/X-PLUG/mPLUG-Owl)| English | 11B | chat model |
| InternVL<br>Mini-InternVL<br>InternVL2|[InternVL](https://github.com/OpenGVLab/InternVL)| Chinese<br>English |1B-40B<br>including quantized version | chat model |
| Llava-llama3 |[xtuner](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers)| English | 8B | chat model |
| Phi3-Vision | Microsoft | English | 4B | chat model |
| PaliGemma | Google | English | 3B | chat model |
<<< image1: <img>http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/cat.png</img> image2: <img>http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/animal.png</img> What is the difference between the two images?
Input an image path or URL <<<
The two images are of the same kitten, but the first image is a close-up shot, while the second image is a more distant, artistic illustration. The close-up image captures the kitten in detail, showing its fur, eyes, and facial features in sharp focus. In contrast, the artistic illustration is more abstract and stylized, with a blurred background and a different color palette. The distant illustration gives the kitten a more whimsical and dreamy appearance, while the close-up image emphasizes the kitten's realism and detail.
```
The example images are shown below:
cat:
ocr:
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
# os.environ['MODELSCOPE_API_TOKEN'] = 'Your API Token' # If the message "The request model does not exist!" appears.
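# Note (illustrative): CUDA_VISIBLE_DEVICES only takes effect if it is assigned
# before torch (or any other CUDA-initializing library) is imported, which is
# why the assignments above sit at the very top of the script.
os.environ.setdefault('CUDA_VISIBLE_DEVICES', '0')  # no-op here; shown for standalone use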
```