Hi John,
I have read your TRPO paper and I am trying to reproduce the Fisher-vector product calculation in C. Lines 36-37 in agentzoo.py confuse me. I copied the weights into my code, fed ob_no into the network, and checked its outputs against prob_np. It turned out that the mean values in prob_np are the original neural network outputs, not multiplied by 0.1. (I am using the Theano backend, the Swimmer-v1 test case, and an 8-64-64-2 network.) The *0.1 scaling is also not mentioned in the TRPO paper. Could you shed some light on this? The lines I mean are:
Wlast = net.layers[-1].W
Wlast.set_value(Wlast.get_value(borrow=True)*0.1)
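To make the check concrete, here is a minimal NumPy sketch of it, not the code from agentzoo.py. It assumes tanh hidden layers, zero biases, and random weights standing in for the real ones; `init_params`, `forward`, and the batch size of 5 are hypothetical choices for illustration. One possible reading it illustrates: `set_value` rescales the stored last-layer W once, so weights copied afterwards already contain the factor and the forward pass needs no further multiplication by 0.1.

```python
import numpy as np

def init_params(sizes, rng):
    """Random weights and zero biases (stand-ins for the real weights)."""
    return [(rng.standard_normal((n_in, n_out)) / np.sqrt(n_in), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass: tanh hidden layers (an assumption), linear output layer."""
    h = x
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

rng = np.random.default_rng(0)
params = init_params([8, 64, 64, 2], rng)   # the 8-64-64-2 network
ob_no = rng.standard_normal((5, 8))         # hypothetical batch of observations

mean_before = forward(params, ob_no)

# Mimic the two quoted lines: rescale the stored last-layer W once.
W_last, b_last = params[-1]
params[-1] = (W_last * 0.1, b_last)

mean_after = forward(params, ob_no)

# With a zero output bias, the new means are 0.1x the pre-scaling means.
scaled_once = np.allclose(mean_after, 0.1 * mean_before)

# Weights copied after the rescaling reproduce the means with no extra factor.
copied = [(W.copy(), b.copy()) for W, b in params]
matches_copy = np.allclose(forward(copied, ob_no), mean_after)
print(scaled_once, matches_copy)
```

If the real output layer has a nonzero bias, the first comparison would no longer hold exactly, since only W is rescaled.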
Thank you in advance!
Patrick