First download the Vicuna v1.1 weights following the instructions [here](https://github.com/lm-sys/FastChat). Update the parameter `llm_model` in `configs/models/blip2/blip2_xinstruct_vicuna7b.yaml` and `configs/models/blip2/blip2_xinstruct_vicuna13b.yaml` and in the demo configs under `projects/xinstructblip/demo/configs` to the path of the downloaded model folder.
### X-InstructBLIP Weights
-
Weights of the model are released [here](). When loading the model using the LAVIS codebase they should be automatically downloaded.
+
Weights of the model are released [here (7b)](https://github.com/salesforce/LAVIS/blob/main/lavis/configs/models/blip2/blip2_xinstruct_vicuna7b.yaml) and [here (13b)](https://github.com/salesforce/LAVIS/blob/main/lavis/configs/models/blip2/blip2_xinstruct_vicuna13b.yaml). When loading the model using the LAVIS codebase they should be automatically downloaded.
```
from lavis.models import load_model
model = load_model("blip2_vicuna_xinstruct", "vicuna7b")
@@ -223,6 +223,7 @@ Download the Audiocaps captions from [here](https://github.com/cdjkim/audiocaps/
* `original_data_file`: the path to the captions for Audiocaps downloaded above for the relevant split.
### DisCRn
+
The dataset is available here: [Audio-Video](https://storage.cloud.google.com/sfr-xinstructblip-data-research/data/discrn/audiocaps.json) and [Image-3D](https://storage.cloud.google.com/sfr-xinstructblip-data-research/data/discrn/objaverse.json).
The files `projects/xinstructblip/discrn/data_generation/objaverse_img_3d.py` and `projects/xinstructblip/discrn/data_generation/audiocaps_video_audio.py` generate the image-3D and audio-video cross-modal reasoning pairs for the DisCRn task.
#### Image-3D
The arguments are as above, with the same 3D caption data
@@ -255,4 +256,4 @@ The arguments are as above, with the same audio caption data. Note that you shou