Distribution shapes #153

@dustinvtran

I find it very useful to recognize the functionality of different shape arguments. In tf.distributions and Edward, we distinguish batch_shape, event_shape, and sample_shape: batch represents the number (shape) of independent random variables defined by one object; event represents the shape of a draw from each random variable; and sample represents the number (shape) of draws for each random variable in the batch.

For example:

import tensorflow as tf
from edward.models import Dirichlet

x = Dirichlet(tf.ones([3, 2]), sample_shape=(5,))
x.shape
## TensorShape([Dimension(5), Dimension(3), Dimension(2)])
x.sample_shape  # determined by an explicit arg
## TensorShape([Dimension(5)])
x.batch_shape  # always the parameter's shape minus the trailing event_shape
## TensorShape([Dimension(3)])
x.event_shape  # always determined by parameter's inner-most dimensions
## TensorShape([Dimension(2)])

This says: define a Dirichlet object that is a batch of 3 Dirichlet random variables, each supported on the (2-1)-dimensional simplex (so each draw is a length-2 vector), and draw 5 samples from each random variable.
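
To make the shape bookkeeping concrete outside of Edward, here is a minimal NumPy sketch of the same convention. It is only an illustration: `split_shapes` and `sample_dirichlet` are hypothetical helper names, not part of any library, and the sampler assumes the standard construction of a Dirichlet draw as normalized independent Gamma draws.

```python
import numpy as np


def split_shapes(alpha, sample_shape=()):
    """Return (sample_shape, batch_shape, event_shape) for Dirichlet parameters.

    Hypothetical helper: the event is the inner-most dimension of `alpha`,
    and every remaining leading dimension is treated as the batch.
    """
    alpha = np.asarray(alpha, dtype=float)
    return tuple(sample_shape), alpha.shape[:-1], alpha.shape[-1:]


def sample_dirichlet(alpha, sample_shape=(), seed=0):
    """Draw samples of shape sample_shape + batch_shape + event_shape."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    # Normalizing independent Gamma(alpha_k, 1) draws over the event
    # dimension gives a Dirichlet(alpha) draw.
    g = rng.gamma(shape=alpha, size=tuple(sample_shape) + alpha.shape)
    return g / g.sum(axis=-1, keepdims=True)


alpha = np.ones((3, 2))
shapes = split_shapes(alpha, sample_shape=(5,))  # ((5,), (3,), (2,))
draws = sample_dirichlet(alpha, sample_shape=(5,))
print(shapes, draws.shape)  # draws.shape is (5, 3, 2)
```

Each slice `draws[i, j]` is one length-2 point on the simplex, so summing over the last axis gives 1 for every sample and batch member.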

I wasn't sure how this is reflected in the current library:

  • In the existing docs, sample says it "returns a sample from the parameterized distribution with the same dimensions as the parameters." This doesn't always seem correct for non-scalar event_shape (multivariate distributions).
  • __init__'s batch_size seems to be a convenience arg for expanding the parameter args with an extra dimension. Is it necessary given that the parameters can be arbitrary tensors?
  • log_pdf seems to be a convenience function for batch_log_pdf(..., batch_size=1). Instead of separating them, why not combine them and avoid the batch_size arg altogether? For example, you can let the parameters determine the batch size, where any inputs to log_pdf must have shape batch_shape + event_shape by contract.
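
As a rough sketch of the contract suggested in the last bullet (not the library's current API), a single log-density function can let the parameters determine the batch and require that inputs end in batch_shape + event_shape. The name `dirichlet_log_pdf` is hypothetical, and the example uses NumPy only; any extra leading dimensions of the input are treated as sample dimensions.

```python
import math

import numpy as np

# Element-wise log-gamma built from the standard library, to avoid extra
# dependencies in this sketch.
_lgamma = np.vectorize(math.lgamma, otypes=[float])


def dirichlet_log_pdf(x, alpha):
    """Log density of x under Dirichlet(alpha), with no batch_size argument.

    Contract (an assumption of this sketch): the trailing dimensions of `x`
    must equal alpha.shape, i.e. batch_shape + event_shape. The result has
    the shape of `x` with the event dimension reduced away.
    """
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    n = alpha.ndim
    if x.ndim < n or x.shape[x.ndim - n:] != alpha.shape:
        raise ValueError(
            "x must end in batch_shape + event_shape = %s, got %s"
            % (alpha.shape, x.shape)
        )
    # log normalizer: lgamma(sum_k alpha_k) - sum_k lgamma(alpha_k)
    log_norm = _lgamma(alpha.sum(axis=-1)) - _lgamma(alpha).sum(axis=-1)
    return ((alpha - 1.0) * np.log(x)).sum(axis=-1) + log_norm


alpha = np.ones((3, 2))           # batch_shape (3,), event_shape (2,)
x = np.full((5, 3, 2), 0.5)       # 5 samples for each of the 3 variables
log_p = dirichlet_log_pdf(x, alpha)
print(log_p.shape)                # (5, 3)
```

With this contract the same function covers the single-input case (`x` of shape (3, 2)) and the multi-sample case (`x` of shape (5, 3, 2)), which is the role batch_log_pdf and batch_size appear to play today.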

Maybe also relates to #146?
