r***@stanford.edu
2016-05-29 21:33:37 UTC
Hi,
I'm getting an error computing gradients through a scan in which some
intermediate values of the scan function have different sizes in different
iterations (the inputs and outputs always have the same size). Here's a
minimal example:
import numpy as np
import theano
import theano.tensor as T

d = 11
h = 7
W1 = theano.shared(name='W1', value=np.random.uniform(-0.1, 0.1, (d, h)))
W2 = theano.shared(name='W2', value=np.random.uniform(-0.1, 0.1, (h,)))
n = T.lscalar('n')
vecs = T.matrix('vecs')
inds = T.lmatrix('inds')

def recurrence(t, vecs, inds, W1, W2):
    cur_inds = inds[T.eq(inds[:, 0], t).nonzero()]
    cur_vecs = vecs[cur_inds[:, 1]]
    hidden_layers = T.tanh(cur_vecs.dot(W1))
    scores = hidden_layers.dot(W2)
    return T.sum(scores)

results, _ = theano.scan(
    fn=recurrence, sequences=[T.arange(n)], outputs_info=[None],
    non_sequences=[vecs, inds, W1, W2], strict=True)
obj = T.sum(results)
grads = T.grad(obj, [W1, W2])
f = theano.function(inputs=[n, vecs, inds], outputs=grads)

vecs_in = np.ones((10, d))
inds_in = np.array([[0, 0], [1, 1], [1, 2], [2, 3], [3, 4], [3, 5], [3, 6],
                    [3, 7], [4, 8], [4, 9]])
print f(5, vecs_in, inds_in)
Running this code results in the following error message (tried on 0.7.0,
0.8.2, and 0.9.0dev1.dev-0044349fdf4244c5b616994bf16ad2ff1ff8ce8a):
Traceback (most recent call last):
  File "edge_scores.py", line 33, in <module>
    print f(5, vecs_in, inds_in)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 912, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 899, in __call__
    self.fn() if output_subset is None else\
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 951, in rval
    r = p(n, [x[0] for x in i], o)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 940, in <lambda>
    self, node)
  File "theano/scan_module/scan_perform.pyx", line 547, in theano.scan_module.scan_perform.perform (/home/robinjia/.theano/compiledir_Linux-3.13--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:6224)
ValueError: could not broadcast input array from shape (11,4) into shape (11,2)
Apply node that caused the error: forall_inplace,cpu,grad_of_scan_fn}(n, Alloc.0, Elemwise{eq,no_inplace}.0, Alloc.0, n, n, W1, W2, vecs, inds, InplaceDimShuffle{x,0}.0)
Toposort index: 47
Inputs types: [TensorType(int64, scalar), TensorType(float64, col), TensorType(int8, matrix), TensorType(float64, matrix), TensorType(int64, scalar), TensorType(int64, scalar), TensorType(float64, matrix), TensorType(float64, vector), TensorType(float64, matrix), TensorType(int64, matrix), TensorType(float64, row)]
Inputs shapes: [(), (5, 1), (5, 10), (2, 7), (), (), (11, 7), (7,), (10, 11), (10, 2), (1, 7)]
Inputs strides: [(), (8, 8), (10, 1), (56, 8), (), (), (56, 8), (8,), (88, 8), (16, 8), (56, 8)]
Inputs values: [array(5), array([[ 1.], [ 1.], [ 1.], [ 1.], [ 1.]]), 'not shown', 'not shown', array(5), array(5), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown']
Outputs clients: [[Subtensor{int64}(forall_inplace,cpu,grad_of_scan_fn}.0, ScalarFromTensor.0)], [InplaceDimShuffle{1,0,2}(forall_inplace,cpu,grad_of_scan_fn}.1)], [Reshape{2}(forall_inplace,cpu,grad_of_scan_fn}.2, MakeVector{dtype='int64'}.0), Shape_i{1}(forall_inplace,cpu,grad_of_scan_fn}.2)]]
HINT: Re-running with most Theano optimizations disabled could give you a back-trace of when this node was created. This can be done by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
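For reference, here is a plain NumPy loop equivalent to the scan above (the variable names mirror the Theano code; the extra group_sizes list is mine, added just to show that the per-iteration slice size varies). Presumably the shapes (11,4) and (11,2) in the ValueError correspond to cur_vecs.T at t=3 and t=4, which select 4 and 2 rows respectively:

```python
import numpy as np

d, h = 11, 7
rng = np.random.RandomState(0)
W1 = rng.uniform(-0.1, 0.1, (d, h))
W2 = rng.uniform(-0.1, 0.1, (h,))
vecs_in = np.ones((10, d))
inds_in = np.array([[0, 0], [1, 1], [1, 2], [2, 3], [3, 4], [3, 5], [3, 6],
                    [3, 7], [4, 8], [4, 9]])

results = []
group_sizes = []
for t in range(5):
    cur_inds = inds_in[inds_in[:, 0] == t]   # rows of inds whose first column is t
    cur_vecs = vecs_in[cur_inds[:, 1]]       # shape (k_t, d); k_t varies with t
    group_sizes.append(cur_vecs.shape[0])
    results.append(np.tanh(cur_vecs.dot(W1)).dot(W2).sum())

print(group_sizes)  # [1, 2, 1, 4, 2] -- not constant across iterations
```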
A couple of observations:
- There's no error if I turn off optimizations (theano.config.optimizer = 'None').
- There's no error if I use a single layer with no hidden layer (i.e. if scores = cur_vecs.dot(W) for a W of the appropriate shape).
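One possible workaround, which I haven't verified against this bug: pad every step's index slice up to the largest group size and mask out the padding, so that every intermediate inside the scan body has a fixed shape. The names below (padded, mask, k_max) are mine, and the sketch is in plain NumPy just to show the masking arithmetic is equivalent:

```python
import numpy as np

vecs_in = np.ones((10, 11))
inds_in = np.array([[0, 0], [1, 1], [1, 2], [2, 3], [3, 4], [3, 5], [3, 6],
                    [3, 7], [4, 8], [4, 9]])

n_steps = 5
k_max = max(np.sum(inds_in[:, 0] == t) for t in range(n_steps))  # largest group

# Padded index matrix (n_steps, k_max) plus a 0/1 mask marking real entries;
# padding slots point at row 0 but are zeroed out by the mask.
padded = np.zeros((n_steps, k_max), dtype=np.int64)
mask = np.zeros((n_steps, k_max))
for t in range(n_steps):
    rows = inds_in[inds_in[:, 0] == t, 1]
    padded[t, :len(rows)] = rows
    mask[t, :len(rows)] = 1.0

# Inside the scan body one would use vecs[padded[t]] (always (k_max, d)) and
# multiply the scores by mask[t] before summing; e.g. for a single-layer W:
W = np.random.uniform(-0.1, 0.1, (11,))
step_scores = [(vecs_in[padded[t]].dot(W) * mask[t]).sum() for t in range(n_steps)]
ref_scores = [vecs_in[inds_in[inds_in[:, 0] == t, 1]].dot(W).sum()
              for t in range(n_steps)]
print(np.allclose(step_scores, ref_scores))  # masking reproduces the grouped sums
```

Since padded and mask can be precomputed in NumPy and passed in as inputs (or as sequences), the scan body would then be free of data-dependent shapes.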
Thanks!
Robin
--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.