Commit 506965b

Merge pull request #730 from artemisp/main
Update X-InstructBLIP README.md (typos, better reference to data)
2 parents ac8fc98 + efe9a8c commit 506965b

1 file changed: +4 -3 lines changed

projects/xinstructblip/README.md

Lines changed: 4 additions & 3 deletions
@@ -15,7 +15,7 @@ X-InstructBLIP a simple yet effective multimodal framework built on top of a fro

 ### LAVIS Repository
 ```
-git clone https://github.com/artemisp/LAVIS-XInstructBLIP.git # Once PR accepted change to official LAVIS
+git clone https://github.com/salesforce/LAVIS.git
 cd LAVIS-XInstructBLIP
 pip install -e .
 ```
@@ -48,7 +48,7 @@ wget -P /usr/bin https://github.com/unlimblue/KNN_CUDA/raw/master/ninja
 First download the Vicuna v1.1 weights following the instructions [here](https://github.com/lm-sys/FastChat). Update the parameter `llm_model` in `configs/models/blip2/blip2_xinstruct_vicuna7b.yaml` and `configs/models/blip2/blip2_xinstruct_vicuna13b.yaml` and in the demo configs under `projects/xinstructblip/demo/configs` to the path of the downloaded model folder.

 ### X-InstructBLIP Weights
-Weights of the model are released [here](). When loading the model using the LAVIS codebase they should be automatically downloaded.
+Weights of the model are released [here (7b)](https://github.com/salesforce/LAVIS/blob/main/lavis/configs/models/blip2/blip2_xinstruct_vicuna7b.yaml) and [here (13b)](https://github.com/salesforce/LAVIS/blob/main/lavis/configs/models/blip2/blip2_xinstruct_vicuna13b.yaml) . When loading the model using the LAVIS codebase they should be automatically downloaded.
 ```
 from lavis.models import load_model
 model = load_model("blip2_vicuna_xinstruct", "vicuna7b")
@@ -223,6 +223,7 @@ Download the Audiocaps captions from [here](https://github.com/cdjkim/audiocaps/
 * `original_data_file`: the path to the captions for Audiocaps downloaded above for the relevant split.

 ### DisCRn
+The dataset is found here: [Audio-Video](https://storage.cloud.google.com/sfr-xinstructblip-data-research/data/discrn/audiocaps.json) and [Image-3D](https://storage.cloud.google.com/sfr-xinstructblip-data-research/data/discrn/objaverse.json).
 The files `projects/xinstructblip/discrn/data_generation/objaverse_img_3d.py` are `projects/xinstructblip/discrn/data_generation/audiocaps_video_audio.py` generate the image-3d and audio-video cross-modal reasoning pairs for the DisCRn task.
 #### Image-3D
 The arguments are as above, with the same 3D caption data
@@ -255,4 +256,4 @@ The arguments are as above, with the same audio caption data. Note that you shou
 archivePrefix={arXiv},
 primaryClass={cs.CV}
 }
-```
+```
