This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Commit 9742828

Justin Johnson authored and rbgirshick committed
CLEVR dataset generation code
0 parents  commit 9742828


42 files changed, with 2612 additions and 0 deletions

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
.DS_Store
__pycache__/
*.swp
*.pyc
output/

CONTRIBUTING.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
# Contributing to clevr-dataset-gen
2+
We want to make contributing to this project as easy and transparent as
3+
possible.
4+
5+
## Pull Requests
6+
We actively welcome your pull requests.
7+
8+
1. Fork the repo and create your branch from `master`.
9+
2. If you've added code that should be tested, add tests.
10+
3. If you've changed APIs, update the documentation.
11+
4. Ensure the test suite passes.
12+
5. Make sure your code lints.
13+
6. If you haven't already, complete the Contributor License Agreement ("CLA").
14+
15+
## Contributor License Agreement ("CLA")
16+
In order to accept your pull request, we need you to submit a CLA. You only need
17+
to do this once to work on any of Facebook's open source projects.
18+
19+
Complete your CLA here: <https://code.facebook.com/cla>
20+
21+
## Issues
22+
We use GitHub issues to track public bugs. Please ensure your description is
23+
clear and has sufficient instructions to be able to reproduce the issue.
24+
25+
Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
26+
disclosure of security bugs. In those cases, please go through the process
27+
outlined on that page and do not file a public issue.
28+
29+
## Coding Style
30+
* 2 spaces for indentation rather than tabs
31+
* 80 character line length
32+
33+
## License
34+
By contributing to __________, you agree that your contributions will be licensed
35+
under the LICENSE file in the root directory of this source tree.

LICENSE

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
BSD License
2+
3+
For clevr-dataset-gen software
4+
5+
Copyright (c) 2017-present, Facebook, Inc. All rights reserved.
6+
7+
Redistribution and use in source and binary forms, with or without modification,
8+
are permitted provided that the following conditions are met:
9+
10+
* Redistributions of source code must retain the above copyright notice, this
11+
list of conditions and the following disclaimer.
12+
13+
* Redistributions in binary form must reproduce the above copyright notice,
14+
this list of conditions and the following disclaimer in the documentation
15+
and/or other materials provided with the distribution.
16+
17+
* Neither the name Facebook nor the names of its contributors may be used to
18+
endorse or promote products derived from this software without specific
19+
prior written permission.
20+
21+
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
22+
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
23+
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
24+
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
25+
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
26+
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
27+
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
28+
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
29+
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
30+
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

PATENTS

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
1+
Additional Grant of Patent Rights Version 2
2+
3+
"Software" means the clevr-dataset-gen software contributed by Facebook, Inc.
4+
5+
Facebook, Inc. ("Facebook") hereby grants to each recipient of the Software
6+
("you") a perpetual, worldwide, royalty-free, non-exclusive, irrevocable
7+
(subject to the termination provision below) license under any Necessary
8+
Claims, to make, have made, use, sell, offer to sell, import, and otherwise
9+
transfer the Software. For avoidance of doubt, no license is granted under
10+
Facebook’s rights in any patent claims that are infringed by (i) modifications
11+
to the Software made by you or any third party or (ii) the Software in
12+
combination with any software or other technology.
13+
14+
The license granted hereunder will terminate, automatically and without notice,
15+
if you (or any of your subsidiaries, corporate affiliates or agents) initiate
16+
directly or indirectly, or take a direct financial interest in, any Patent
17+
Assertion: (i) against Facebook or any of its subsidiaries or corporate
18+
affiliates, (ii) against any party if such Patent Assertion arises in whole or
19+
in part from any software, technology, product or service of Facebook or any of
20+
its subsidiaries or corporate affiliates, or (iii) against any party relating
21+
to the Software. Notwithstanding the foregoing, if Facebook or any of its
22+
subsidiaries or corporate affiliates files a lawsuit alleging patent
23+
infringement against you in the first instance, and you respond by filing a
24+
patent infringement counterclaim in that lawsuit against that party that is
25+
unrelated to the Software, the license granted hereunder will not terminate
26+
under section (i) of this paragraph due to such counterclaim.
27+
28+
A "Necessary Claim" is a claim of a patent owned by Facebook that is
29+
necessarily infringed by the Software standing alone.
30+
31+
A "Patent Assertion" is any lawsuit or other action alleging direct, indirect,
32+
or contributory infringement or inducement to infringe any patent, including a
33+
cross-claim or counterclaim.

README.md

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
1+
# CLEVR Dataset Generation
2+
3+
This is the code used to generate the [CLEVR dataset](http://cs.stanford.edu/people/jcjohns/clevr/) as described in the paper:
4+
5+
**[CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning](http://cs.stanford.edu/people/jcjohns/clevr/)**
6+
<br>
7+
<a href='http://cs.stanford.edu/people/jcjohns/'>Justin Johnson</a>,
8+
<a href='http://home.bharathh.info/'>Bharath Hariharan</a>,
9+
<a href='https://lvdmaaten.github.io/'>Laurens van der Maaten</a>,
10+
<a href='http://vision.stanford.edu/feifeili/'>Fei-Fei Li</a>,
11+
<a href='http://larryzitnick.org/'>Larry Zitnick</a>,
12+
<a href='http://www.rossgirshick.info/'>Ross Girshick</a>
13+
<br>
14+
Presented at [CVPR 2017](http://cvpr2017.thecvf.com/)
15+
16+
Code and pretrained models for the baselines used in the paper [can be found here](https://github.com/facebookresearch/clevr-iep).
17+
18+
You can use this code to render synthetic images and compositional questions for those images, like this:
19+
20+
<div align="center">
21+
<img src="images/example1080.png" width="800px">
22+
</div>
23+
24+
**Q:** How many small spheres are there? <br>
25+
**A:** 2
26+
27+
**Q:** What number of cubes are small things or red metal objects? <br>
28+
**A:** 2
29+
30+
**Q:** Does the metal sphere have the same color as the metal cylinder? <br>
31+
**A:** Yes
32+
33+
**Q:** Are there more small cylinders than metal things? <br>
34+
**A:** No
35+
36+
**Q:** There is a cylinder that is on the right side of the large yellow object behind the blue ball; is there a shiny cube in front of it? <br>
37+
**A:** Yes
38+
39+
If you find this code useful in your research then please cite
40+
41+
```
42+
@inproceedings{johnson2017clevr,
43+
title={CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning},
44+
author={Johnson, Justin and Hariharan, Bharath and van der Maaten, Laurens
45+
and Fei-Fei, Li and Zitnick, C Lawrence and Girshick, Ross},
46+
booktitle={CVPR},
47+
year={2017}
48+
}
49+
```
50+
51+
All code was developed and tested on OSX and Ubuntu 16.04.
52+
53+
## Step 1: Generating Images
54+
First we render synthetic images using [Blender](https://www.blender.org/), outputting both rendered images and a JSON file containing ground-truth scene information for each image.
55+
56+
Blender ships with its own installation of Python which is used to execute scripts that interact with Blender; you'll need to add the `image_generation` directory to the Python path of Blender's bundled Python. The easiest way to do this is by adding a `.pth` file to the `site-packages` directory of Blender's Python, like this:
57+
58+
```bash
59+
echo $PWD/image_generation >> $BLENDER/$VERSION/python/lib/python3.5/site-packages/clevr.pth
60+
```
61+
62+
where `$BLENDER` is the directory where Blender is installed and `$VERSION` is your Blender version; for example on OSX you might run:
63+
64+
```bash
65+
echo $PWD/image_generation >> /Applications/blender/blender.app/Contents/Resources/2.78/python/lib/python3.5/site-packages/clevr.pth
66+
```
67+
68+
You can then render some images like this:
69+
70+
```bash
71+
cd image_generation
72+
blender --background --python render_images.py -- --num_images 10
73+
```
74+
75+
On OSX the `blender` binary is located inside the blender.app directory; for convenience you may want to
76+
add the following alias to your `~/.bash_profile` file:
77+
78+
```bash
79+
alias blender='/Applications/blender/blender.app/Contents/MacOS/blender'
80+
```
81+
82+
If you have an NVIDIA GPU with CUDA installed then you can use the GPU to accelerate rendering like this:
83+
84+
```bash
85+
blender --background --python render_images.py -- --num_images 10 --use_gpu 1
86+
```
87+
88+
After this command terminates you should have ten freshly rendered images stored in `output/images` like these:
89+
90+
<div align="center">
91+
<img src="images/img1.png" width="260px">
92+
<img src="images/img2.png" width="260px">
93+
<img src="images/img3.png" width="260px">
94+
<br>
95+
<img src="images/img4.png" width="260px">
96+
<img src="images/img5.png" width="260px">
97+
<img src="images/img6.png" width="260px">
98+
</div>
99+
100+
The file `output/CLEVR_scenes.json` will contain ground-truth scene information for all newly rendered images.
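
As a rough sketch of how the combined scene file might be consumed, the snippet below assumes the layout written by `image_generation/collect_scenes.py` in this commit: a top-level `info` dictionary plus a `scenes` list whose entries carry `split`, `image_index`, and `image_filename`. The helper name `summarize_scenes` and the example filenames are hypothetical.

```python
import json


def summarize_scenes(data):
  """Return the split name and the image filenames ordered by image index."""
  ordered = sorted(data['scenes'], key=lambda s: s['image_index'])
  return data['info']['split'], [s['image_filename'] for s in ordered]


# Hypothetical in-memory stand-in for the contents of output/CLEVR_scenes.json.
example = json.loads('''
{
  "info": {"split": "new", "version": "1.0"},
  "scenes": [
    {"split": "new", "image_index": 1, "image_filename": "example_000001.png"},
    {"split": "new", "image_index": 0, "image_filename": "example_000000.png"}
  ]
}
''')

split, filenames = summarize_scenes(example)
print(split, filenames)
```

To use it on real output you would pass the result of `json.load` on the scene file instead of the inline example.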
101+
102+
You can find [more details about image rendering here](image_generation/README.md).
103+
104+
## Step 2: Generating Questions
105+
Next we generate questions, functional programs, and answers for the rendered images generated in the previous step.
106+
This step takes as input the single JSON file containing all ground-truth scene information, and outputs a JSON file
107+
containing questions, answers, and functional programs for the questions.
108+
109+
You can generate questions like this:
110+
111+
```bash
112+
cd question_generation
113+
python generate_questions.py
114+
```
115+
116+
The file `output/CLEVR_questions.json` will then contain questions for the generated images.
117+
118+
You can [find more details about question generation here](question_generation/README.md).

image_generation/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
output/

image_generation/README.md

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
# CLEVR Image Generation
2+
3+
Images are generated by using Blender to invoke the script `render_images.py` like this:
4+
5+
```
6+
blender --background --python render_images.py -- [args]
7+
```
8+
9+
Any arguments following the `--` will be captured by `render_images.py`.
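
The following is a minimal sketch of one way a script can pick up the arguments after `--`; it is not necessarily how `render_images.py` does it. Blender ignores everything after `--` but leaves the full command line in `sys.argv`, so the script can slice it. The helper name `extract_script_args` and the default value of `--num_images` are assumptions made for illustration.

```python
import argparse
import sys


def extract_script_args(argv=None):
  """Return the arguments that follow the first '--', or [] if there is none."""
  argv = sys.argv if argv is None else argv
  if '--' not in argv:
    return []
  return argv[argv.index('--') + 1:]


# Hypothetical parser with one flag; the default of 5 is made up.
parser = argparse.ArgumentParser()
parser.add_argument('--num_images', default=5, type=int)

# Simulated command line, as Blender would expose it through sys.argv.
blender_argv = ['blender', '--background', '--python', 'render_images.py',
                '--', '--num_images', '10']
args = parser.parse_args(extract_script_args(blender_argv))
print(args.num_images)
```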
10+
11+
This command should be run from the `image_generation` directory, since by default the script will load resources from the `data` directory.
12+
13+
When rendering on cluster machines without audio drivers installed you may need to add the `-noaudio` flag to the Blender invocation like this:
14+
15+
```
16+
blender --background -noaudio --python render_images.py -- [args]
17+
```
18+
19+
You can also run `render_images.py` as a standalone script to view help on all command line flags like this:
20+
21+
```
22+
python render_images.py --help
23+
```
24+
25+
## Setup
26+
You will need to download and install [Blender](https://www.blender.org/); code has been developed and tested using Blender version 2.78c but other versions may work as well.
27+
28+
Blender ships with its own version of Python 3.5, and it uses its bundled Python to execute scripts. You'll need to add this directory to the Python path of Blender's bundled Python with a command like this:
29+
30+
```
31+
echo $PWD >> $BLENDER/$VERSION/python/lib/python3.5/site-packages/clevr.pth
32+
```
33+
34+
where `$BLENDER` is the directory where Blender is installed and `$VERSION` is your Blender version; for example on OSX you might run:
35+
36+
```
37+
echo $PWD >> /Applications/blender/blender.app/Contents/Resources/2.78/python/lib/python3.5/site-packages/clevr.pth
38+
```
39+
40+
## Rendering Overview
41+
The file `data/base_scene.blend` contains a Blender scene used as the basis for all CLEVR images. This scene contains a ground plane, a camera, and several light sources. After loading the base scene, the positions of the camera and lights are randomly jittered (controlled with the `--key_light_jitter`, `--fill_light_jitter`, `--back_light_jitter`, and `--camera_jitter` flags).
42+
43+
After the base scene has been loaded, objects are placed one by one into the scene. The number of objects for each scene is a random integer between `--min_objects` (default 3) and `--max_objects` (default 10), and each object has a random shape, size, color, and material.
44+
45+
After placing all objects, we ensure that no objects are fully occluded; in particular each object must occupy at least 100 pixels in the rendered image (customizable using `--min_pixels_per_object`). To accomplish this, we assign each object a unique color and render a version of the scene with lighting and shading disabled, writing it to a temporary file; we can then count the number of pixels of each color in this pre-render to check the number of visible pixels for each object.
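
The pixel-counting idea can be sketched without Blender. The snippet below assumes the flat-shaded pre-render has already been loaded as a sequence of `(r, g, b)` tuples; the function name `visible_objects` and the toy data are hypothetical, and only the threshold of 100 pixels comes from the text above.

```python
from collections import Counter


def visible_objects(pixels, object_colors, min_pixels=100):
  """Map each object name to True if it covers at least min_pixels pixels.

  pixels: iterable of (r, g, b) tuples from the flat-shaded pre-render.
  object_colors: dict mapping object name -> its unique (r, g, b) color.
  """
  counts = Counter(pixels)
  return {name: counts[color] >= min_pixels
          for name, color in object_colors.items()}


# Toy image: 150 red pixels, 40 green pixels, 10 background pixels.
pixels = [(255, 0, 0)] * 150 + [(0, 255, 0)] * 40 + [(0, 0, 0)] * 10
colors = {'obj0': (255, 0, 0), 'obj1': (0, 255, 0)}
result = visible_objects(pixels, colors)
print(result)
```

A scene in which any object maps to `False` would be rejected and its objects placed again.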
46+
47+
Each invocation of `render_images.py` will render `--num_images` images, and they will be numbered starting at `--start_idx` (default 0). Using non-default values for `--start_idx` allows you to distribute rendering across many workers and recombine their results later without filename conflicts.
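
One possible way to split a job across workers is sketched below: each worker gets a contiguous block of indices, so the filenames never collide. The helper `worker_commands` is hypothetical; only the `--num_images` and `--start_idx` flags come from the text above.

```python
def worker_commands(total_images, num_workers):
  """Split total_images across workers; return one command line per worker."""
  base, extra = divmod(total_images, num_workers)
  commands = []
  start = 0
  for worker in range(num_workers):
    # The first `extra` workers render one additional image each.
    count = base + (1 if worker < extra else 0)
    if count == 0:
      continue
    commands.append(
      'blender --background --python render_images.py -- '
      '--num_images %d --start_idx %d' % (count, start))
    start += count
  return commands


commands = worker_commands(10, 3)
for command in commands:
  print(command)
```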
48+
49+
### Object Placement
50+
Each object is positioned randomly, but before actually adding the object to the scene we ensure that its center is at least `--min_dist` units away from the centers of all other objects. We also ensure that between each pair of objects, the left/right and front/back distance along the ground plane is at least `--margin` units; this helps to minimize ambiguous spatial relationships. If after `--max_retries` attempts we are unable to find a suitable position for an object, then all objects are deleted and placed again from scratch.
51+
52+
### Image Resolution
53+
By default images are rendered at `320x240`, but the resolution can be customized using the `--height` and `--width` flags.
54+
55+
### GPU Acceleration
56+
Rendering uses CPU by default, but if you have an NVIDIA GPU with CUDA installed then you can use the GPU to accelerate rendering by adding the flag `--use_gpu 1`. Blender also supports acceleration using OpenCL which allows the use of non-NVIDIA GPUs; however this is not currently supported by `render_images.py`.
57+
58+
### Rendering Quality
59+
You can control the quality of rendering with the `--render_num_samples` flag; using fewer samples will run more quickly but will result in grainy images. I've found that 64 samples is a good number to use for development; all released CLEVR images were rendered using 512 samples. The `--render_min_bounces` and `--render_max_bounces` flags control the number of bounces for transparent objects; I've found the default of 8 to work well for these options.
60+
61+
When rendering, Blender breaks up the output image into tiles and renders tiles sequentially; the `--render_tile_size` flag controls the size of these tiles. This should not affect the output image, but may affect the speed at which it is rendered. For CPU rendering smaller tile sizes may be optimal, while for GPU rendering larger tiles may be faster.
62+
63+
With default settings, rendering a 320x240 image takes about 4 seconds on a Pascal Titan X. It's very likely that these rendering times could be drastically reduced by someone more familiar with Blender, but this rendering speed was acceptable for our purposes.
64+
65+
### Saving Blender Scene Files
66+
You can save a Blender `.blend` file for each rendered image by adding the flag `--save_blendfiles 1`. These files can be more than 5 MB each, so they are not saved by default.
67+
68+
### Output Files
69+
Rendered images are stored in the `--output_image_dir` directory, which is created if it does not exist. The filename of each rendered image is constructed from the `--filename_prefix`, the `--split`, and the image index.
70+
71+
A JSON file for each scene containing ground-truth object positions and attributes is saved in the `--output_scene_dir` directory, which is created if it does not exist. After all images are rendered the JSON files for each individual scene are combined into a single JSON file and written to `--output_scene_file`. This single file will also store the `--split`, `--version` (default 1.0), `--license` (default CC-BY 4.0), and `--date` (default today).
72+
73+
When rendering large numbers of images, I have sometimes experienced random Blender crashes; saving JSON files for each scene as they are rendered ensures that you do not lose information for scenes already rendered in the event of a crash.
74+
75+
If saving Blender scene files for each image (`--save_blendfiles 1`) then they are stored in the `--output_blend_dir` directory, which is created if it does not exist.
76+
77+
### Object Properties
78+
The `--properties_json` file (default `data/properties.json`) defines the allowed shapes, sizes, colors, and materials used for objects, making it easy to extend CLEVR with new object properties.
79+
80+
Each shape (cube, sphere, cylinder) is stored in its own `.blend` file in the `--shape_dir` (default `data/shapes`); the file `X.blend` contains a single object named `X` centered at the origin with unit size. The `shapes` field of the JSON properties file maps human-readable shape names to `.blend` files in the `--shape_dir`.
81+
82+
The `colors` field of the JSON properties file maps human-readable color names to RGB values between 0 and 255; most of our colors are adapted from [Wad's Optimum 16 Color Palette](http://alumni.media.mit.edu/~wad/color/palette.html).
83+
84+
The `sizes` field of the JSON properties file maps human-readable size names to scaling factors used to scale the object models from the `--shape_dir`.
85+
86+
Each material is stored in its own `.blend` file in the `--material_dir` (default `data/materials`). The file `X.blend` should contain a single NodeTree item named X, and this NodeTree item must have a single `Color` input that accepts an RGBA value so that each material can be used with any color. The `materials` field of the JSON properties file maps human-readable material names to `.blend` files in the `--material_dir`.
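
Because colors are stored as RGB values between 0 and 255 while the material `Color` input accepts an RGBA value, some conversion is needed in between. The sketch below shows one plausible conversion, assuming Blender's usual float color range of 0 to 1 and a fully opaque alpha of 1.0; the color values in the example and the helper name `load_rgba_colors` are illustrative, not taken from `data/properties.json`.

```python
import json

# Hypothetical excerpt of a properties file; the RGB values are made up.
properties_text = '{"colors": {"red": [255, 0, 0], "gray": [128, 128, 128]}}'


def load_rgba_colors(text):
  """Map color names to RGBA tuples of floats in [0, 1] with alpha 1.0."""
  properties = json.loads(text)
  return {name: tuple(channel / 255.0 for channel in rgb) + (1.0,)
          for name, rgb in properties['colors'].items()}


rgba_colors = load_rgba_colors(properties_text)
print(rgba_colors['red'])
```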
87+
88+
### Restricting Shape / Color Combinations
89+
The optional `--shape_color_combos_json` flag can be used to restrict the colors of each shape. If provided, this should give a path to a JSON file mapping shape names to lists of allowed color names. This option can be used to render CLEVR-CoGenT images using the files `data/CoGenT_A.json` and `data/CoGenT_B.json`.
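
A restriction file of this kind could be used as sketched below. The JSON content is a made-up example of the "shape name to list of allowed color names" mapping, not the contents of `data/CoGenT_A.json`, and `sample_shape_and_color` is a hypothetical helper.

```python
import json
import random

# Hypothetical restriction file: each shape lists its allowed colors.
combos_text = '{"cube": ["gray", "blue"], "sphere": ["red"]}'


def sample_shape_and_color(combos, rng=random):
  """Pick a random shape, then a random color allowed for that shape."""
  shape = rng.choice(sorted(combos))
  color = rng.choice(combos[shape])
  return shape, color


combos = json.loads(combos_text)
shape, color = sample_shape_and_color(combos, random.Random(0))
print(shape, color)
```

Sampling the shape first and the color second means every sampled pair is allowed by construction.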

image_generation/collect_scenes.py

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
# Copyright 2017-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree. An additional grant
# of patent rights can be found in the PATENTS file in the same directory.

import argparse, json, os

"""
During rendering, each CLEVR scene file is dumped to disk as a separate JSON
file; this is convenient for distributing rendering across multiple machines.
This script collects all CLEVR scene files stored in a directory and combines
them into a single JSON file. This script also adds the version number, date,
and license to the output file.
"""

parser = argparse.ArgumentParser()
parser.add_argument('--input_dir', default='output/scenes')
parser.add_argument('--output_file', default='output/CLEVR_misc_scenes.json')
parser.add_argument('--version', default='1.0')
parser.add_argument('--date', default='7/8/2017')
parser.add_argument('--license',
    default='Creative Commons Attribution (CC-BY 4.0)')


def main(args):
  input_files = os.listdir(args.input_dir)
  scenes = []
  split = None
  for filename in input_files:
    # Skip anything in the directory that is not a per-scene JSON file.
    if not filename.endswith('.json'):
      continue
    path = os.path.join(args.input_dir, filename)
    with open(path, 'r') as f:
      scene = json.load(f)
    scenes.append(scene)
    # All scenes in the directory must belong to the same split.
    if split is not None:
      msg = 'Input directory contains scenes from multiple splits'
      assert scene['split'] == split, msg
    else:
      split = scene['split']
  # Order scenes by image index so the combined file is deterministic.
  scenes.sort(key=lambda s: s['image_index'])
  for s in scenes:
    print(s['image_filename'])
  output = {
    'info': {
      'date': args.date,
      'version': args.version,
      'split': split,
      'license': args.license,
    },
    'scenes': scenes
  }
  with open(args.output_file, 'w') as f:
    json.dump(output, f)


if __name__ == '__main__':
  args = parser.parse_args()
  main(args)