@@ -1,5 +1,5 @@
# MiniCPM-V-2
In this directory, you will find examples of how to apply IPEX-LLM INT4 optimizations to MiniCPM-V-2 models on [Intel GPUs](../../../README.md). For illustration purposes, we use [openbmb/MiniCPM-V-2](https://huggingface.co/openbmb/MiniCPM-V-2) as a reference MiniCPM-V-2 model.
In this directory, you will find examples of how to apply IPEX-LLM INT4 optimizations to MiniCPM-V-2 models on [Intel GPUs](../../../README.md). For illustration purposes, we use [openbmb/MiniCPM-V-2](https://huggingface.co/openbmb/MiniCPM-V-2) and [openbmb/MiniCPM-V-2_6](https://huggingface.co/openbmb/MiniCPM-V-2_6) as reference MiniCPM-V-2 models.

## 0. Requirements
To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requirements for your machine; please refer to [here](../../../README.md#requirements) for more information.
@@ -27,7 +27,7 @@ conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install timm peft
pip install timm peft transformers==4.40.0 trl
```
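
Once these dependencies are installed, the model can be loaded with IPEX-LLM's low-bit support. The snippet below is a minimal sketch of that loading step, assuming the common IPEX-LLM GPU pattern (`load_in_4bit=True` plus a `.half().to('xpu')` transfer); it is not the exact code of the example script in this directory, and the model id shown is just one of the two references above.

```python
# Minimal sketch (assumed pattern, not the exact example script): load a
# MiniCPM-V model with IPEX-LLM INT4 optimizations and move it to an Intel GPU.
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModel  # drop-in AutoModel with low-bit loading

model_path = "openbmb/MiniCPM-V-2_6"  # or "openbmb/MiniCPM-V-2"

# load_in_4bit=True applies INT4 quantization to the weights while they are loaded
model = AutoModel.from_pretrained(model_path,
                                  load_in_4bit=True,
                                  trust_remote_code=True)
model = model.half().to('xpu')  # keep non-quantized parts in FP16 and run on the XPU

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```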

### 2. Configure OneAPI environment variables for Linux
@@ -130,6 +130,18 @@ What is in the image?
In the image, there is a young child holding a teddy bear. The teddy bear appears to be dressed in a pink tutu. The child is also wearing a red and white striped dress. The background of the image includes a stone wall and some red flowers.
```

#### [openbmb/MiniCPM-V-2_6](https://huggingface.co/openbmb/MiniCPM-V-2_6)

```log
Inference time: 3.102498769760132 s
-------------------- Input --------------------
http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg
-------------------- Prompt --------------------
What is in the image?
-------------------- Output --------------------
The image features a young child holding a white teddy bear with a pink tutu. The child is wearing a striped dress and is standing in front of a stone wall with some red flowers in the background.
```

The sample input image (fetched from the [COCO dataset](https://cocodataset.org/#explore?id=264959)) is:

<a href="http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg"><img width=400px src="http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg" ></a>
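
For reference, the following sketch shows how output like the logs above can be produced with MiniCPM-V-2_6: it downloads the sample image, builds a single-turn message list, and calls the model's `chat` interface with the same arguments that appear in the example script's diff further below. It assumes `model` and `tokenizer` were loaded as in the earlier sketch and is meant as an illustration, not the script itself.

```python
# Illustrative sketch: reproduce the kind of log shown above with
# MiniCPM-V-2_6, whose chat() returns the response text directly.
# Assumes `model` and `tokenizer` from the loading sketch earlier.
import time
import requests
from PIL import Image

image_url = "http://farm6.staticflickr.com/5268/5602445367_3504763978_z.jpg"
prompt = "What is in the image?"

image = Image.open(requests.get(image_url, stream=True).raw).convert('RGB')
msgs = [{'role': 'user', 'content': prompt}]

st = time.time()
res = model.chat(image=image,
                 msgs=msgs,
                 context=None,
                 tokenizer=tokenizer,
                 sampling=False,
                 temperature=0.7)
end = time.time()

print(f'Inference time: {end - st} s')
print('-' * 20, 'Output', '-' * 20)
print(res)
```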
@@ -157,14 +157,16 @@ def _pos_embed(self, x: torch.Tensor) -> torch.Tensor:
# the prompt format here follows https://huggingface.co/openbmb/MiniCPM-V-2/blob/main/README.md
msgs = [{'role': 'user', 'content': args.prompt}]
st = time.time()
res, context, _ = model.chat(
res = model.chat(
image=image,
msgs=msgs,
context=None,
tokenizer=tokenizer,
sampling=False,
temperature=0.7
)
if model.config._name_or_path.endswith("2"):
res, context, _ = res
end = time.time()
print(f'Inference time: {end-st} s')
print('-'*20, 'Input', '-'*20)
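
The new branch above exists because the two models return different things from `chat()`: MiniCPM-V-2 returns a 3-tuple whose first element is the response text (hence the old `res, context, _ = model.chat(...)`), while MiniCPM-V-2_6 returns the response string directly. Below is a small sketch of that handling factored into a helper; the helper name is ours, not part of the example script.

```python
# Sketch of the return-value handling introduced in this change.
# `normalize_chat_result` is a hypothetical helper, not part of the example script.
def normalize_chat_result(model, res):
    # MiniCPM-V-2's repo id ends with "2", and its chat() returns a 3-tuple
    # whose first element is the response text; MiniCPM-V-2_6 returns the
    # response string directly, so it can be passed through unchanged.
    if model.config._name_or_path.endswith("2"):
        res, _context, _ = res
    return res

res = normalize_chat_result(model, model.chat(image=image,
                                              msgs=msgs,
                                              context=None,
                                              tokenizer=tokenizer,
                                              sampling=False,
                                              temperature=0.7))
print(res)
```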