Added sample_shape parameter to RandomVariable #591

Merged
matthewdhoffman merged 4 commits into master from feature/sample_shape on Mar 27, 2017

Conversation

matthewdhoffman
Collaborator

This is a small PR that adds a keyword argument sample_shape to Edward RandomVariables that gets passed to the underlying tf.contrib.distributions Distribution's sample function. This lets us replace syntax like ed.Normal(mu=tf.ones([10, 1])*mu_vec, sigma=1.) with ed.Normal(mu=mu_vec, sigma=1., sample_shape=10).

IMO, this is clearer, since it says explicitly which slices are i.i.d., which are independent, and which are dependent. (Distribution keeps track of these with its batch shape and event shape.) It's also (very marginally) more efficient, and it makes doing algebra on the graph a little easier in some cases.
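To make the before/after concrete, here is a minimal sketch (hypothetical shapes; it assumes the mu/sigma keyword API used above):

import tensorflow as tf
from edward.models import Normal

mu_vec = tf.zeros(5)

# Old style: tile the parameters so the batch carries the 10 i.i.d. copies.
x_tiled = Normal(mu=tf.ones([10, 1]) * mu_vec, sigma=tf.ones([10, 5]))

# New style: keep the parameters small and request 10 i.i.d. draws explicitly.
x_sampled = Normal(mu=mu_vec, sigma=tf.ones(5), sample_shape=10)

# Both describe a [10, 5] collection of normals; only the second records that
# the leading dimension of 10 comes from i.i.d. sampling rather than a batch.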

@dustinvtran
Member

dustinvtran commented Mar 27, 2017

Cool! I'll push commits to this now.

An example where sample_shape is not only more efficient but necessary is the Dirichlet process. DirichletProcess(tf.ones(N), Normal(0.0, 1.0)) represents a batch of N independent Dirichlet processes (each with concentration parameter 1.0 and a standard normal base measure); this means no atoms are shared across their draws. DirichletProcess(1.0, Normal(0.0, 1.0), sample_shape=N) represents N samples from the same Dirichlet process; this is the correct generative process for mixture models and the like.
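A rough sketch of the two constructions (hypothetical N; it assumes the DirichletProcess(concentration, base) signature used above):

import tensorflow as tf
from edward.models import DirichletProcess, Normal

N = 5

# A batch of N independent Dirichlet processes: their draws share no atoms.
dp_batch = DirichletProcess(tf.ones(N), Normal(0.0, 1.0))

# N draws from a single shared Dirichlet process: atoms are shared across
# draws, which is the generative process mixture models need.
dp_shared = DirichletProcess(1.0, Normal(0.0, 1.0), sample_shape=N)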

@dustinvtran dustinvtran force-pushed the feature/sample_shape branch 3 times, most recently from 7397e35 to 28d876b on March 27, 2017 at 19:21
@dustinvtran dustinvtran force-pushed the feature/sample_shape branch from 28d876b to 2c53d8a on March 27, 2017 at 19:21
@dustinvtran
Member

dustinvtran commented Mar 27, 2017

I checked whether there were any consequences during inference. For example, the user might write

import tensorflow as tf
import edward as ed
from edward.models import Normal

# Model: 100 independent standard normals (batch shape [100]).
z = Normal(tf.zeros(100), tf.ones(100))

# Approximation: a single scalar Normal expanded to 100 i.i.d. draws via
# sample_shape, so every dimension shares one mean and one scale.
qz = Normal(tf.Variable(0.0), tf.nn.softplus(tf.Variable(0.0)), sample_shape=100)

inference = ed.KLqp({z: qz})

which is weird. Fortunately, the broadcasting during log-density calculations still works, so all variational inference methods are fine (including MAP, even though PointMass with sample_shape is really weird).
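Roughly why the broadcasting works out (a sketch, assuming RandomVariable's value() and log_prob() pass-throughs to the underlying distribution):

import tensorflow as tf
from edward.models import Normal

z = Normal(tf.zeros(100), tf.ones(100))
qz = Normal(tf.Variable(0.0), tf.nn.softplus(tf.Variable(0.0)), sample_shape=100)

x = qz.value()          # sample tensor of shape [100], since sample_shape=100
log_q = qz.log_prob(x)  # scalar parameters broadcast against x -> shape [100]
log_p = z.log_prob(x)   # matches z's [100] batch shape elementwise -> shape [100]

# The two [100]-shaped log densities line up term by term, which is what
# KLqp's objective needs.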

I made MonteCarlo raise an error if its Empirical approximations use a non-scalar sample_shape. Things might go awry, especially for auxiliary methods that use momentum.

@dustinvtran
Member

@matthewdhoffman This is ready for you to take a look at again. If you approve, feel free to merge (using the "Squash and merge" option); comments welcome.

@matthewdhoffman matthewdhoffman merged commit 23ac40b into master Mar 27, 2017
@matthewdhoffman
Collaborator Author

That KLqp example is very interesting. It's something that someone might conceivably want to do (e.g., maybe one could use it to share variances but not means across a bunch of examples). Hopefully it doesn't trip anyone up.

LGTM, merging.
