Issue 1: FlashAttention Compatibility
The first issue we encountered was with FlashAttention. It can be resolved by disabling FlashAttention explicitly: wherever use_flash_attention is referenced, switch the attention implementation to "eager" to ensure compatibility and prevent errors on systems where FlashAttention is not supported.
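For instance, when the model is loaded through Hugging Face transformers, the eager path can be requested at load time. This is only a sketch: the checkpoint name is an assumption, and attn_implementation is the standard transformers argument for selecting the attention backend.

from transformers import AutoModelForCausalLM

# Sketch: request the eager attention implementation so that
# FlashAttention kernels are never required.
model = AutoModelForCausalLM.from_pretrained(
    "DAMO-NLP-SG/VideoLLaMA2-7B",   # assumed checkpoint name
    attn_implementation="eager",    # avoid flash_attention_2
    trust_remote_code=True,         # assumption: checkpoint may ship custom model code
)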
Changes made in the config.json file (see the sketch after this list):
- Changed "mm_vision_tower": "google/siglip-so400m-patch14-384" to "mm_vision_tower": "openai/clip-vit-base-patch32"
- Set "use_flash_attention": false
- Set "sliding_window": 0
Issue 2: No CUDA GPUs Available
Installed the CPU-only versions of PyTorch, TorchVision, and TorchAudio using:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
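A quick sanity check that the CPU-only build is in use:

import torch

print(torch.__version__)          # CPU-only wheels report a version ending in "+cpu"
print(torch.cuda.is_available())  # expected to print False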
Replaced the hardcoded .cuda() calls in videollama2/__init__.py and videollama2/model/__init__.py. In videollama2/__init__.py, the input tensors are now built without .cuda():
input_ids = tokenizer_multimodal_token(prompt, tokenizer, modal_token, return_tensors='pt').unsqueeze(0).long()
attention_masks = input_ids.ne(tokenizer.pad_token_id).long()
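In videollama2/model/__init__.py, the loader only sets device_map when a non-CPU device is requested, so the model is left on the CPU otherwise: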
if device != "cpu":
    kwargs['device_map'] = {"": device}
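More generally, a device-agnostic pattern (a sketch, not the repository's exact code) resolves the device once and moves tensors with .to(), which works on both CPU-only and CUDA machines:

import torch

# Sketch: pick CUDA when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

input_ids = torch.tensor([[1, 2, 3]])  # stand-in for the tokenized prompt
input_ids = input_ids.to(device)       # replaces the hardcoded input_ids.cuda()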