Skip to content

Commit ea3c7ee

Browse files
authored
Add DirichletProcess random variable (#555)
* revise dirichlet process base example * add DirichletProcess random variable * move DirichletProcess to edward/models/ * replace pp_dirichlet_process_{rv,base}.py * allow non-scalar concentration parameter * add unit test * update pp_dirichlet_process.py * update DP docstring * update docs; memoize appropriately across samples * fix stick construction; simplify implementation * update docs * update docstring
1 parent 5ca457a commit ea3c7ee

12 files changed

+349
-123
lines changed
7.25 KB
Loading
14.1 KB
Loading

docs/tex/api/model-compositionality.tex

Lines changed: 48 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -141,21 +141,57 @@ \subsubsection{Bayesian Nonparametrics}
141141
collapsing the infinite-dimensional space and lazily defining the
142142
infinite-dimensional space.
143143

144-
For the collapsed approach, see the
144+
In the collapsed approach, we specify a distribution over its
145+
instantiation, and the stochastic process is implicitly marginalized
146+
out. For example, we can represent a distribution over the function
147+
evaluations of a Gaussian process, and not explicitly represent the
148+
function draw.
149+
150+
\begin{lstlisting}[language=Python]
151+
from edward.models import Bernoulli, Normal
152+
153+
def kernel(X):
154+
"""Evaluate kernel over each pair of rows (data points) in the matrix."""
155+
return
156+
157+
X = tf.placeholder(tf.float32, [N, D])
158+
y = MultivariateNormalFull(mu=tf.zeros(N), sigma=kernel(X))
159+
\end{lstlisting}
160+
161+
For more details, see the
145162
\href{/tutorials/supervised-classification}{Gaussian process classification}
146-
tutorial as an example. We specify distributions over the function
147-
evaluations of the Gaussian process, and the Gaussian process is
148-
implicitly marginalized out. This approach is also useful for Poisson
149-
process models.
163+
tutorial. This approach is also useful for Poisson process models.
150164

151-
To work directly on the infinite-dimensional space, one can leverage
152-
random variables with
165+
In the lazy approach, we work directly on the infinite-dimensional space via
153166
\href{https://www.tensorflow.org/versions/master/api_docs/python/control_flow_ops.html}{control flow operations}
154-
in TensorFlow. At runtime, the control flow will lazily define any
155-
parameters in the space necessary in order to generate samples. As an
156-
example, we use a while loop to define a
157-
\href{https://github.com/blei-lab/edward/blob/master/examples/pp_dirichlet_process.py}
158-
{Dirichlet process} according to its stick breaking representation.
167+
in TensorFlow. At runtime, the control flow will execute only the
168+
necessary computation in order to terminate. As an example, Edward
169+
provides a \texttt{DirichletProcess} random variable.
170+
171+
\begin{lstlisting}[language=Python]
172+
from edward.models import DirichletProcess, Normal
173+
174+
def plot_dirichlet_process(alpha):
175+
with tf.Session() as sess:
176+
dp = DirichletProcess(alpha, Normal, mu=0.0, sigma=1.0)
177+
samples = sess.run(dp.sample(1000))
178+
plt.hist(samples, bins=100, range=(-3.0, 3.0))
179+
plt.title("DP({0}, N(0, 1))".format(alpha))
180+
plt.show()
181+
182+
# Dirichlet process with high concentration
183+
plot_dirichlet_process(alpha=1.0)
184+
# Dirichlet process with low concentration (more spread out)
185+
plot_dirichlet_process(alpha=50.0)
186+
\end{lstlisting}
187+
188+
\includegraphics[width=350px]{/images/dirichlet-process-fig0.png}
189+
\includegraphics[width=350px]{/images/dirichlet-process-fig1.png}
190+
191+
To see the essential component defining the \texttt{DirichletProcess}, see
192+
\href{https://github.com/blei-lab/edward/blob/master/examples/pp_dirichlet_process.py}{\texttt{examples/pp_dirichlet_process.py}}
193+
in the Github repository. Its source implementation can be found at
194+
\href{https://github.com/blei-lab/edward/blob/master/edward/models/dirichlet_process.py}{\texttt{edward/models/dirichlet_process.py}}.
159195

160196
\subsubsection{Probabilistic Programs}
161197

docs/tex/api/model.tex

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -83,6 +83,8 @@ \subsubsection{Model}
8383
.. autoclass:: edward.models.RandomVariable
8484
:members:
8585

86+
.. autoclass:: edward.models.DirichletProcess
87+
8688
.. autoclass:: edward.models.Empirical
8789

8890
.. autoclass:: edward.models.PointMass

docs/tex/api/reference.tex

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@ \subsubsection{Models}
1212
{%sphinx
1313

1414
* :class:`edward.models.RandomVariable`
15+
* :class:`edward.models.DirichletProcess`
1516
* :class:`edward.models.Empirical`
1617
* :class:`edward.models.PointMass`
1718
* {{tensorflow_distributions}}
@@ -62,4 +63,4 @@ \subsubsection{Criticism}
6263
* :func:`edward.criticisms.evaluate`
6364
* :func:`edward.criticisms.ppc`
6465

65-
%}
66+
%}

docs/tex/iclr2017.tex

Lines changed: 11 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ \subsection{Deep Probabilistic Programming}
1212
The code snippets assume the following versions.
1313

1414
\begin{lstlisting}[language=bash]
15-
pip install edward==1.2.3
15+
pip install -e "git+https://github.com/blei-lab/edward.git#egg=edward"
1616
pip install tensorflow==1.0.0 # alternatively, tensorflow-gpu==1.0.0
1717
pip install keras==1.0.0
1818
\end{lstlisting}
@@ -331,19 +331,20 @@ \subsubsection{Appendix A. Model Examples}
331331
\textbf{Figure 13}. Dirichlet process mixture model \citep{antoniak1974mixtures}.
332332
\begin{lstlisting}[language=python]
333333
import tensorflow as tf
334-
from edward.models import Normal
334+
from edward.models import DirichletProcess, Normal
335+
336+
N = 1000 # number of data points
337+
D = 5 # data dimensionality
335338

336-
mu = DirichletProcess( # see script for implementation
337-
alpha=0.1, base_cls=Normal, mu=tf.zeros(D), sigma=tf.ones(D), sample_n=N)
339+
dp = DirichletProcess(
340+
alpha=1.0, base_cls=Normal, mu=tf.zeros(D), sigma=tf.ones(D))
341+
mu = dp.sample(N)
338342
x = Normal(mu=mu, sigma=tf.ones([N, D]))
339343
\end{lstlisting}
340-
To see the essential component defining the \texttt{DirichletProcess}
341-
random variable, see
344+
To see the essential component defining the \texttt{DirichletProcess}, see
342345
\href{https://github.com/blei-lab/edward/blob/master/examples/pp_dirichlet_process.py}{\texttt{examples/pp_dirichlet_process.py}}
343-
in the Github repository.
344-
A more involved version with a base distribution is available at
345-
\href{https://github.com/blei-lab/edward/blob/master/examples/pp_dirichlet_process_base.py}{\texttt{examples/pp_dirichlet_process_base.py}}
346-
in the Github repository.
346+
in the Github repository. Its source implementation can be found at
347+
\href{https://github.com/blei-lab/edward/blob/master/edward/models/dirichlet_process.py}{\texttt{edward/models/dirichlet_process.py}}.
347348

348349
\subsubsection{Appendix B. Inference Examples}
349350

edward/models/__init__.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,7 @@
22
from __future__ import division
33
from __future__ import print_function
44

5+
from edward.models.dirichlet_process import *
56
from edward.models.empirical import *
67
from edward.models.point_mass import *
78
from edward.models.random_variable import *

edward/models/dirichlet_process.py

Lines changed: 172 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,172 @@
1+
from __future__ import absolute_import
2+
from __future__ import division
3+
from __future__ import print_function
4+
5+
import tensorflow as tf
6+
7+
from edward.models.random_variable import RandomVariable
8+
from edward.models.random_variables import Bernoulli, Beta
9+
from tensorflow.contrib.distributions import Distribution
10+
11+
12+
class DirichletProcess(RandomVariable, Distribution):
13+
def __init__(self, alpha, base_cls, validate_args=False, allow_nan_stats=True,
14+
name="DirichletProcess", value=None, *args, **kwargs):
15+
"""Dirichlet process :math:`\mathcal{DP}(\\alpha, H)`.
16+
17+
It has two parameters: a positive real value :math:`\\alpha`,
18+
known as the concentration parameter (``alpha``), and a base
19+
distribution :math:`H` (``base_cls(*args, **kwargs)``).
20+
21+
Parameters
22+
----------
23+
alpha : tf.Tensor
24+
Concentration parameter. Must be positive real-valued. Its shape
25+
determines the number of independent DPs (batch shape).
26+
base_cls : RandomVariable
27+
Class of base distribution. Its shape (when instantiated)
28+
determines the shape of an individual DP (event shape).
29+
*args, **kwargs : optional
30+
Arguments passed into ``base_cls``.
31+
32+
Examples
33+
--------
34+
>>> # scalar concentration parameter, scalar base distribution
35+
>>> dp = DirichletProcess(0.1, Normal, mu=0.0, sigma=1.0)
36+
>>> assert dp.get_shape() == ()
37+
>>>
38+
>>> # vector of concentration parameters, matrix of Exponentials
39+
>>> dp = DirichletProcess(tf.constant([0.1, 0.4]),
40+
... Exponential, lam=tf.ones([5, 3]))
41+
>>> assert dp.get_shape() == (2, 5, 3)
42+
"""
43+
with tf.name_scope(name, values=[alpha]) as ns:
44+
with tf.control_dependencies([]):
45+
self._alpha = tf.identity(alpha, name="alpha")
46+
self._base_cls = base_cls
47+
self._base_args = args
48+
self._base_kwargs = kwargs
49+
50+
# Instantiate base for use in other methods such as `_get_event_shape`.
51+
self._base = self._base_cls(*self._base_args, **self._base_kwargs)
52+
53+
super(DirichletProcess, self).__init__(
54+
dtype=tf.int32,
55+
parameters={"alpha": self._alpha,
56+
"base_cls": self._base_cls,
57+
"args": self._base_args,
58+
"kwargs": self._base_kwargs},
59+
is_continuous=False,
60+
is_reparameterized=False,
61+
validate_args=validate_args,
62+
allow_nan_stats=allow_nan_stats,
63+
name=ns,
64+
value=value)
65+
66+
@property
67+
def alpha(self):
68+
"""Concentration parameter."""
69+
return self._alpha
70+
71+
def _batch_shape(self):
72+
return tf.convert_to_tensor(self.get_batch_shape())
73+
74+
def _get_batch_shape(self):
75+
return self._alpha.get_shape()
76+
77+
def _event_shape(self):
78+
return tf.convert_to_tensor(self.get_event_shape())
79+
80+
def _get_event_shape(self):
81+
return self._base.get_shape()
82+
83+
def _sample_n(self, n, seed=None):
84+
"""Sample ``n`` draws from the DP. Draws from the base
85+
distribution are memoized across ``n``.
86+
87+
Draws from the base distribution are not memoized across the batch
88+
shape: i.e., each independent DP in the batch shape has its own
89+
memoized samples. Similarly, draws are not memoized across calls
90+
to ``sample()``.
91+
92+
Returns
93+
-------
94+
tf.Tensor
95+
A ``tf.Tensor`` of shape ``[n] + batch_shape + event_shape``,
96+
where ``n`` is the number of samples for each DP,
97+
``batch_shape`` is the number of independent DPs, and
98+
``event_shape`` is the shape of the base distribution.
99+
100+
Notes
101+
-----
102+
The implementation has only one inefficiency, which is that it
103+
draws (n, batch_shape) samples from the base distribution at each
104+
iteration of the while loop. Ideally, we would only draw new
105+
samples for those in the loop returning True.
106+
"""
107+
if seed is not None:
108+
raise NotImplementedError("seed is not implemented.")
109+
110+
batch_shape = self._get_batch_shape().as_list()
111+
event_shape = self._get_event_shape().as_list()
112+
rank = 1 + len(batch_shape) + len(event_shape)
113+
# Note this is for scoping within the while loop's body function.
114+
self._temp_scope = [n, batch_shape, event_shape, rank]
115+
116+
# First stick probability, one for each sample and each DP in the
117+
# batch shape. It has shape (n, batch_shape).
118+
beta_k = Beta(a=tf.ones_like(self._alpha), b=self._alpha).sample(n)
119+
# First base distribution.
120+
# It has shape (n, batch_shape, event_shape).
121+
theta_k = tf.tile( # make (batch_shape, event_shape), then memoize across n
122+
tf.expand_dims(self._base_cls(*self._base_args, **self._base_kwargs).
123+
sample(batch_shape), 0),
124+
[n] + [1] * (rank - 1))
125+
126+
# Initialize all samples as the first base distribution.
127+
draws = theta_k
128+
# Flip coins for each stick probability.
129+
flips = Bernoulli(p=beta_k)
130+
# Get boolean tensor, returning True for samples that return tails
131+
# and are currently equal to theta_k.
132+
# It has shape (n, batch_shape).
133+
bools = tf.logical_and(
134+
tf.cast(1 - flips, tf.bool),
135+
tf.reduce_all(tf.equal(draws, theta_k), # reduce event_shape
136+
[i for i in range(1 + len(batch_shape), rank)]))
137+
138+
samples, _ = tf.while_loop(self._sample_n_cond, self._sample_n_body,
139+
loop_vars=[draws, bools])
140+
return samples
141+
142+
def _sample_n_cond(self, draws, bools):
143+
# Proceed if at least one bool is True.
144+
return tf.reduce_any(bools)
145+
146+
def _sample_n_body(self, draws, bools):
147+
n, batch_shape, event_shape, rank = self._temp_scope
148+
149+
beta_k = Beta(a=tf.ones_like(self._alpha), b=self._alpha).sample(n)
150+
theta_k = tf.tile( # make (batch_shape, event_shape), then memoize across n
151+
tf.expand_dims(self._base_cls(*self._base_args, **self._base_kwargs).
152+
sample(batch_shape), 0),
153+
[n] + [1] * (rank - 1))
154+
155+
if len(bools.get_shape()) > 1:
156+
# ``tf.where`` only index subsets when ``bools`` is at most a
157+
# vector. In general, ``bools`` has shape (n, batch_shape).
158+
# Therefore we tile ``bools`` to be of shape
159+
# (n, batch_shape, event_shape) in order to index per-element.
160+
bools = tf.tile(tf.reshape(
161+
bools, [n] + batch_shape + [1] * len(event_shape)),
162+
[1] + [1] * len(batch_shape) + event_shape)
163+
164+
# Assign True samples to the new theta_k.
165+
draws = tf.where(bools, theta_k, draws)
166+
167+
flips = Bernoulli(p=beta_k)
168+
bools = tf.logical_and(
169+
tf.cast(1 - flips, tf.bool),
170+
tf.reduce_all(tf.equal(draws, theta_k), # reduce event_shape
171+
[i for i in range(1 + len(batch_shape), rank)]))
172+
return draws, bools

0 commit comments

Comments
 (0)