I find it very useful to recognize the functionality of different shape arguments. In tf.distributions and Edward, we distinguish `batch_shape`, `event_shape`, and `sample_shape`: batch represents the number (shape) of independent random variables defined by one object; event represents the shape of a draw from each random variable; and sample represents the number (shape) of draws for each random variable in the batch.
For example:
from edward.models import Dirichlet
x = Dirichlet(tf.ones([3, 2]), sample_shape=(5,))
x.shape
## TensorShape([Dimension(5), Dimension(3), Dimension(2)])
x.sample_shape # determined by an explicit arg
## TensorShape([Dimension(5)])
x.batch_shape # always determined by parameter's shape - event_shape
## TensorShape([Dimension(3)])
x.event_shape # always determined by parameter's inner-most dimensions
## TensorShape([Dimension(2)])
This says: define a Dirichlet object that is a batch of 3 Dirichlet random variables, each with dimensions given by a (2-1)-dimensional simplex, and draw 5 samples from each rv.
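For comparison, the same shape contract carries through to the log density in tf.distributions; a minimal sketch (assuming the TF 1.x `tf.contrib.distributions` API, where the sample shape is passed to `sample()` rather than the constructor; exact module path and argument names vary across versions):

```python
import tensorflow as tf
ds = tf.contrib.distributions

d = ds.Dirichlet(tf.ones([3, 2]))  # a batch of 3 Dirichlets over the 1-simplex
d.batch_shape    ## TensorShape([Dimension(3)])
d.event_shape    ## TensorShape([Dimension(2)])
x = d.sample(5)  # sample_shape + batch_shape + event_shape -> [5, 3, 2]
d.log_prob(x)    # event dims are reduced -> shape [5, 3]
```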
I wasn't sure how this is reflected in the current library:
- In the existing docs, `sample` says "returns a sample from the parameterized distribution with the same dimensions as the parameters." This doesn't always seem correct for non-scalar `event_shape` (multivariate distributions).
- `__init__`'s `batch_size` seems to be a convenience arg for expanding the parameter args with an extra dimension. Is it necessary, given that the parameters can be arbitrary tensors?
- `log_pdf` seems to be a convenience function for `batch_log_pdf(..., batch_size=1)`. Instead of separating them, why not combine them and avoid the `batch_size` arg altogether? For example, you can let the parameters determine the batch size, where any input to `log_pdf` must have shape `batch_shape + event_shape` by contract (a rough sketch of this contract follows the list).
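To make that contract concrete, here is a rough sketch of a hypothetical `log_pdf` with no `batch_size` arg; the function below is illustrative, not the library's actual API:

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_log_pdf(x, alpha):
    """Hypothetical log_pdf under the proposed contract.

    alpha: parameters of shape batch_shape + event_shape, e.g. (3, 2).
    x:     values with the same shape batch_shape + event_shape.
    The batch shape is read off the parameters; no batch_size arg is needed.
    Returns log densities of shape batch_shape (the event dim is summed out).
    """
    assert x.shape == alpha.shape, "values must have shape batch_shape + event_shape"
    log_norm = gammaln(alpha.sum(-1)) - gammaln(alpha).sum(-1)
    return log_norm + ((alpha - 1.0) * np.log(x)).sum(-1)

alpha = np.ones((3, 2))            # batch_shape (3,), event_shape (2,)
x = np.full((3, 2), 0.5)           # one point per batch member, on the simplex
dirichlet_log_pdf(x, alpha).shape  # (3,) == batch_shape
```

Here the batch shape is determined entirely by the parameter tensor, so `log_pdf` and `batch_log_pdf` collapse into one function.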
Maybe also relates to #146?