
Conversation

@jgwinnup
Collaborator

Grant took a look at the estimation code and made some changes; the LM query still needs to be sped up.

@kpu
Owner

kpu commented May 22, 2015

There are `>>>>` failed-merge markers in the final code.

@jgwinnup
Collaborator Author

OK, the merge conflict should be fixed now.

@kpu
Owner

kpu commented May 22, 2015

Wait, why is the sum of the weights unity? Nothing wrong with sharpening the probability distribution.

@jgwinnup
Collaborator Author

Now that you've made me think about it, I actually remember: "interpolation" is a misnomer if the weights do not sum to one.

The math works with a non-unit sum, but then you might be extrapolating, and it would be better to call it a "log-linear combination". Each LM's distribution flattens as the sum of the weights goes to zero and steepens as the sum gets large. Luckily, it's easy to get a log-linear combination (if that's what you want) by making the nullspace matrix the identity. All of the weights being zero (or infinite) would then be allowed, contrary to our standard intuition.

The nonnegativity constraint was the more annoying one. I have some additional edits in testing that should make the active-set algorithm more elegant and efficient.

You could enforce interpolation through measures other than the sum (i.e., the 1-norm, for nonnegative weights); that's just the computationally easiest choice, at least for me. You could instead require the 2-norm, the infinity-norm, or some other measure to equal unity. You would just have to exclude weight vectors with a single nonzero component that is not unity, e.g. [0 0 .5] and [0 0 1.5].
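To make the flattening/steepening behavior concrete, here is a minimal sketch (toy numbers and names, not project code) of a brute-force-normalized log-linear combination of two unigram LMs:

```python
# Toy illustration only: log-linear combination of two unigram LMs over a
# three-word vocabulary, p(w) proportional to p1(w)^w1 * p2(w)^w2,
# normalized by brute force.
import numpy as np

p1 = np.array([0.7, 0.2, 0.1])  # toy LM 1
p2 = np.array([0.4, 0.4, 0.2])  # toy LM 2

def loglinear(w1, w2):
    unnorm = p1 ** w1 * p2 ** w2
    return unnorm / unnorm.sum()

print(loglinear(0.5, 0.5))  # weights sum to 1: a proper "interpolation"
print(loglinear(0.1, 0.1))  # sum near 0: the distribution flattens
print(loglinear(2.0, 2.0))  # large sum: the distribution sharpens
```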

@kpu
Owner

kpu commented May 23, 2015

The distribution is normalized by brute force, so it produces strictly positive probabilities even when some weights are negative. A negative weight may seem odd, but consider the case where we have an LM trained on negative examples and effectively want the likelihood ratio between a good model and a bad model. The optimality criterion will generally favor non-negative weights.

There is no reason for either constraint.
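As a toy illustration of that point (hypothetical numbers, not project code): with one negative weight the combination becomes a likelihood ratio, and the brute-force normalization still yields a proper distribution.

```python
# Toy illustration only: a weight of -1 on an LM trained on negative examples
# gives the (normalized) likelihood ratio p_good / p_bad; all probabilities
# stay positive and sum to one.
import numpy as np

p_good = np.array([0.7, 0.2, 0.1])  # toy LM trained on good text
p_bad = np.array([0.2, 0.4, 0.4])   # toy LM trained on negative examples

unnorm = p_good ** 1.0 * p_bad ** -1.0
print(unnorm / unnorm.sum())  # approx [0.824, 0.118, 0.059]: still a distribution
```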

@jsedoc
Collaborator

jsedoc commented May 24, 2015

Jeremy,

I've got some optimizations; are you done with your changes?

Regards,
João


@jgwinnup
Collaborator Author

Yeah I think so - thanks!

@kpu
Owner

kpu commented May 27, 2015

@jsedoc Please push your change. I'm holding Jeremy's change (i.e. this one) up because it imposes inappropriate constraints on the lambda values.

@kpu
Owner

kpu commented May 27, 2015

@jgwinnup Can you produce an optimizer based on Dyer's code that does not impose constraints on the lambda values?

@jgwinnup
Collaborator Author

That is probably beyond my skill right now; the comment above about the constraints was from Grant, based on the code he wrote last week. I can back out those changes if you want.
@jsedoc - can you merge your changes into master as opposed to this fork?

@jgwinnup
Collaborator Author

I just talked to Grant: there will be a version without the constraints ready in the morning, barring any problems.

@jgwinnup
Collaborator Author

OK, this latest push from Grant removes the constraints on the lambda values, as discussed. It compiles on the 'paramwork' branch; I've tried to update it to master, but I don't have a dev environment on this machine. Let me know if there are issues.

jgwinnup added 3 commits May 29, 2015 08:29
Newer version of the tuner. It has settings that seem to improve convergence speed somewhat, and the user output is clearer.

Suggested use: for a tuning corpus of thousands of lines, first tune on just a dozen or two lines, then use the resulting weights to initialize "params" for tuning the full corpus. This will probably put you in the basin of convergence for Newton's method, so you won't waste time on many bad steps. Right now doing this requires recompiling with the different initialization, but I plan to change that.
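A minimal sketch of that warm-start workflow, with hypothetical names and a toy objective: the real tuner optimizes the perplexity of the combined LM with an actual Newton step, whereas here scipy's default quasi-Newton BFGS and a toy unigram log-likelihood stand in.

```python
# Hypothetical sketch of the two-stage warm start; neg_loglik, the toy LMs,
# and the counts are all made up for illustration.
import numpy as np
from scipy.optimize import minimize

lms = np.array([[0.7, 0.2, 0.1],   # toy unigram LM 1
                [0.4, 0.4, 0.2]])  # toy unigram LM 2

def neg_loglik(weights, counts):
    """Negative log-likelihood of token counts under the log-linear mixture."""
    log_unnorm = weights @ np.log(lms)  # sum_i lambda_i * log p_i(w)
    logp = log_unnorm - np.log(np.exp(log_unnorm).sum())
    return -(counts * logp).sum()

small_counts = np.array([12.0, 5.0, 3.0])       # from a dozen or two lines
full_counts = np.array([1200.0, 480.0, 320.0])  # from the full corpus

# Stage 1: cheap tuning on the small subset.
stage1 = minimize(neg_loglik, x0=np.zeros(2), args=(small_counts,))
# Stage 2: warm-start the full-corpus tuning from the stage-1 weights.
stage2 = minimize(neg_loglik, x0=stage1.x, args=(full_counts,))
print(stage2.x)
```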
@jgwinnup
Collaborator Author

jgwinnup commented Jun 1, 2015

Chase is integrating the code on master.
