Discussion:
[theano-users] Theano custom Op: define grad()
Yaojie Lu
2018-07-30 01:26:26 UTC
Hello,

I want to create a custom Op for use in PyMC3.
This Op finds the root of the function f(x) = x + env*exp(x) - a*b^2, where
env = np.array([1, 2]), so the root finder should return a vector for any
given pair of a and b.
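(For reference, implicit differentiation of f(x(a, b)) = 0 gives
dx/da = b^2 / (1 + env*exp(x)) and dx/db = 2*a*b / (1 + env*exp(x));
the grad() below is my attempt to encode these.)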
This is how I define it:

from scipy import optimize
import numpy as np
import theano
import theano.tensor as tt

envod = np.array([1, 2])

def func(x, a, b, env):
    return x + env * np.exp(x) - a * b**2

def jac(x, a, b, env):
    return 1 + env * np.exp(x)

def x_from_ab(a, b, env):
    # One Newton solve of f(x) = 0 per element of env.
    value = np.zeros(len(env))
    for i in range(len(env)):
        value[i] = optimize.newton(func, 1, fprime=jac, args=(a, b, env[i]))
    return value

class Xf(tt.Op):
    itypes = [tt.dscalar, tt.dscalar]
    otypes = [tt.dvector]

    def perform(self, node, inputs, outputs):
        a, b = inputs
        x = x_from_ab(a, b, envod)
        outputs[0][0] = np.array(x)

    def grad(self, inputs, output_gradients):
        a, b = inputs
        x = self(a, b)
        g, = output_gradients
        return [-g[0] * (-b**2) / (1 + envod[0] * tt.exp(x[0])),
                -g[0] * (-2*a*b) / (1 + envod[0] * tt.exp(x[0]))]

I wonder how I should define grad(). I have read all the posts and
documentation that I could find. Any suggestion or link to a useful
reference is welcome.

Many thanks!
Arnaud Bergeron
2018-07-30 14:19:16 UTC
What problem do you have? I don’t see anything obviously wrong.
Yaojie Lu
2018-07-30 15:10:19 UTC
Thanks for your reply. I tested my code like this:

att = tt.dscalar('att')
btt = tt.dscalar('btt')
expr = Xf()(att, btt)
ga = tt.grad(expr[0], att)
print(ga.eval({att:3, btt:4}))

What I expect from ga is dx/da, which should be a vector since envod is
defined as a vector. However, from the code above I can only get dx/da with
env = envod[0]. If I remove the '[0]' index from envod and x in my grad(),
an error is thrown:
ValueError: <__main__.Xf object at 0x0000027B90B30C88>.grad returned a term
with 1 dimensions, but 0 are required.
Arnaud Bergeron
2018-07-30 21:21:54 UTC
The gradient with respect to a (a scalar) can't be a vector. That doesn't make any sense. Why do you expect this to work?

The best I can tell you is that the gradients work out if you do this: since a and b are scalars, each term grad() returns must itself be a scalar, namely the dot product of g with the corresponding vector of partial derivatives:

def grad(self, inputs, output_gradients):
    a, b = inputs
    x = self(a, b)
    g, = output_gradients
    return [(-g * (-b**2) / (1 + envod * tt.exp(x))).sum(),
            (-g * (-2*a*b) / (1 + envod * tt.exp(x))).sum()]

Also, you can use verify_grad (from theano.tests.unittest_tools) to check whether your gradient definition is correct, like this:

tt.verify_grad(Xf(), [3.0, 4.0])

This will throw an error if the analytical gradient you programmed and the numerical gradient estimated by finite differences differ too much (the tolerances are configurable; see http://deeplearning.net/software/theano_versions/dev/extending/unittest.html#validating-the-gradient).
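
For reference, here is a minimal self-contained version of what I mean (an untested sketch, assuming the definitions from your original post; depending on the Theano version, verify_grad may need an explicit numpy RandomState, so I pass one):

from scipy import optimize
import numpy as np
import theano
import theano.tensor as tt

envod = np.array([1.0, 2.0])

def func(x, a, b, env):
    return x + env * np.exp(x) - a * b**2

def jac(x, a, b, env):
    return 1 + env * np.exp(x)

def x_from_ab(a, b, env):
    # One Newton solve of f(x) = 0 per element of env.
    return np.array([optimize.newton(func, 1, fprime=jac, args=(a, b, e))
                     for e in env])

class Xf(tt.Op):
    itypes = [tt.dscalar, tt.dscalar]
    otypes = [tt.dvector]

    def perform(self, node, inputs, outputs):
        a, b = inputs
        outputs[0][0] = x_from_ab(a, b, envod)

    def grad(self, inputs, output_gradients):
        a, b = inputs
        x = self(a, b)
        g, = output_gradients
        # Implicit differentiation of f(x(a, b)) = 0 gives
        #   dx/da = b**2 / (1 + env*exp(x)),  dx/db = 2*a*b / (1 + env*exp(x));
        # for scalar inputs each returned term is the dot product of g with
        # the corresponding vector of partial derivatives.
        return [(g * b**2 / (1 + envod * tt.exp(x))).sum(),
                (g * 2*a*b / (1 + envod * tt.exp(x))).sum()]

# Compare the analytical gradient against finite differences.
tt.verify_grad(Xf(), [3.0, 4.0], rng=np.random.RandomState(42))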
Yaojie Lu
2018-07-31 04:42:07 UTC
I don't know how to take the derivative wrt a vector, so I just took the
derivative wrt the first element to demonstrate what I'd like to achieve.
Maybe that was misleading.
It seems I cannot get the gradient as a vector. But thanks for your answer
anyway! I will try to find a way around it.
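One idea I may try (an untested sketch, assuming the Xf Op with the summed grad() from your reply is defined): theano.gradient.jacobian differentiates each output element separately and stacks the results, which should give the full dx/da vector:

import theano
import theano.tensor as tt

att = tt.dscalar('att')
btt = tt.dscalar('btt')
expr = Xf()(att, btt)
# One dx/da entry per element of envod.
dxda = theano.gradient.jacobian(expr, att)
print(dxda.eval({att: 3.0, btt: 4.0}))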