佐藤優
2017-08-09 08:50:12 UTC
I wonder why the code below is invalid.
from numpy import *
import theano.tensor as T
x = T.dmatrix("x")
mx = x[...,None,:]
a = T.ones((1,3))
T.grad(mx[...,0].dot(a).sum(), a).eval({x:ones((5,10)).astype(float32)})
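As a side note (my own check, not part of the error report), the symbolic broadcast pattern introduced by the None index can be inspected directly; this is what distinguishes this case from the tensor3 version further down:

# Minimal sketch, assuming the same imports and variables as above.
print(mx.broadcastable)          # should print (False, True, False): the None axis is broadcastable
print(mx[..., 0].broadcastable)  # should print (False, True): indexing the last axis keeps the broadcast flag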
When the last line is evaluated, the following error is raised:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/yu/anaconda3/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    883             outputs =\
--> 884                 self.fn() if output_subset is None else\
    885                 self.fn(output_subset=output_subset)

ValueError: Shape mismatch: A.shape[1] != x.shape[0]

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-74-52410617594a> in <module>()
      3 mx = x[...,None,:]
      4 a = T.ones((1,3))
----> 5 T.grad(mx[...,0].dot(a).sum(), a).eval({x:ones((5,10)).astype(float32)})

/home/yu/anaconda3/lib/python3.5/site-packages/theano/gof/graph.py in eval(self, inputs_to_values)
    517         args = [inputs_to_values[param] for param in inputs]
    518
--> 519         rval = self._fn_cache[inputs](*args)
    520
    521         return rval

/home/yu/anaconda3/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    896                     node=self.fn.nodes[self.fn.position_of_error],
    897                     thunk=thunk,
--> 898                     storage_map=getattr(self.fn, 'storage_map', None))
    899             else:
    900                 # old-style linkers raise their own exceptions

/home/yu/anaconda3/lib/python3.5/site-packages/theano/gof/link.py in raise_with_op(node, thunk, exc_info, storage_map)
    323         # extra long error message in that case.
    324         pass
--> 325     reraise(exc_type, exc_value, exc_trace)
    326
    327

/home/yu/anaconda3/lib/python3.5/site-packages/six.py in reraise(tp, value, tb)
    683             value = tp()
    684         if value.__traceback__ is not tb:
--> 685             raise value.with_traceback(tb)
    686         raise value
    687

/home/yu/anaconda3/lib/python3.5/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
    882         try:
    883             outputs =\
--> 884                 self.fn() if output_subset is None else\
    885                 self.fn(output_subset=output_subset)
    886         except Exception:

ValueError: Shape mismatch: A.shape[1] != x.shape[0]
Apply node that caused the error: CGemv{inplace}(AllocEmpty{dtype='float64'}.0, TensorConstant{1.0}, InplaceDimShuffle{1,0}.0, Rebroadcast{0}.0, TensorConstant{0.0})
Toposort index: 7
Inputs types: [TensorType(float64, vector), TensorType(float64, scalar), TensorType(float64, matrix), TensorType(float64, vector), TensorType(float64, scalar)]
Inputs shapes: [(3,), (), (3, 5), (1,), ()]
Inputs strides: [(8,), (), (8, 24), (80,), ()]
Inputs values: [array([ 0.00000000e+000, 4.94065646e-324, 9.88131292e-324]), array(1.0), 'not shown', array([ 1.]), array(0.0)]
Inputs type_num: [12, 12, 12, 12, 12]
Outputs clients: [[InplaceDimShuffle{x,0}(CGemv{inplace}.0)]]
Debugprint of the apply node:
CGemv{inplace} [id A] <TensorType(float64, vector)> ''
|AllocEmpty{dtype='float64'} [id B] <TensorType(float64, vector)> ''
| |TensorConstant{3} [id C] <TensorType(int64, scalar)>
|TensorConstant{1.0} [id D] <TensorType(float64, scalar)>
|InplaceDimShuffle{1,0} [id E] <TensorType(float64, matrix)> ''
| |Alloc [id F] <TensorType(float64, matrix)> ''
| |TensorConstant{(1, 1) of 1.0} [id G] <TensorType(float64, (True, True))>
| |Shape_i{0} [id H] <TensorType(int64, scalar)> ''
| | |x [id I] <TensorType(float64, matrix)>
| |TensorConstant{3} [id C] <TensorType(int64, scalar)>
|Rebroadcast{0} [id J] <TensorType(float64, vector)> ''
| |Subtensor{int8, ::, int64} [id K] <TensorType(float64, (True,))> ''
| |InplaceDimShuffle{0,x,1} [id L] <TensorType(float64, (False, True, False))> ''
| | |x [id I] <TensorType(float64, matrix)>
| |Constant{0} [id M] <int8>
| |Constant{0} [id N] <int64>
|TensorConstant{0.0} [id O] <TensorType(float64, scalar)>
Storage map footprint:
- x, Input, Shape: (5, 10), ElemSize: 8 Byte(s), TotalSize: 400 Byte(s)
- InplaceDimShuffle{0,x,1}.0, Shape: (5, 1, 10), ElemSize: 8 Byte(s), TotalSize: 400 Byte(s)
- Alloc.0, Shape: (5, 3), ElemSize: 8 Byte(s), TotalSize: 120 Byte(s)
- InplaceDimShuffle{1,0}.0, Shape: (3, 5), ElemSize: 8 Byte(s), TotalSize: 120 Byte(s)
- AllocEmpty{dtype='float64'}.0, Shape: (3,), ElemSize: 8 Byte(s), TotalSize: 24 Byte(s)
- Subtensor{int8, ::, int64}.0, Shape: (1,), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
- Shape_i{0}.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{1.0}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{0.0}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- Constant{0}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- Rebroadcast{0}.0, Shape: (1,), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
- TensorConstant{3}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{(1, 1) of 1.0}, Shape: (1, 1), ElemSize: 8 Byte(s), TotalSize: 8 Byte(s)
- Constant{0}, Shape: (), ElemSize: 1 Byte(s), TotalSize: 1.0 Byte(s)
TotalSize: 593.0 Byte(s) 0.000 GB
TotalSize inputs: 441.0 Byte(s) 0.000 GB
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
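Following that hint, a minimal sketch of how the flag can be set (the script name below is a placeholder of mine):

# From the shell, for a single run:
#   THEANO_FLAGS='optimizer=fast_compile' python myscript.py
# Or from Python, before building and evaluating the graph:
import theano
theano.config.optimizer = 'fast_compile'   # or 'None' to disable optimizations entirely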
I thought the broadcasted operation in the script above was the cause, so I tried the same computation with no broadcasting before the gradient operation, as follows:
x = T.tensor3("x")
mx = x
a = T.ones((1,3))
T.grad(mx[...,0].dot(a).sum(), a).eval({x:ones((5,1,10)).astype(float32)})
This ran successfully and produced the following result:
array([[ 5., 5., 5.]], dtype=float32)
But why is the former case invalid?
Is the gradient with broadcasting mathematically invalid?
Why does the shape mismatch happen in the gradient computation?
Could someone explain this to me?
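For what it's worth, here is a small NumPy sketch (my own sanity check, not from the Theano run) of the same expression and its hand-computed gradient; mathematically the broadcast case seems well defined:

import numpy as np
xv = np.ones((5, 10))
av = np.ones((1, 3))
m = xv[:, None, :][..., 0]                # shape (5, 1), all ones
print(m.dot(av).sum())                    # 15.0: the forward value is fine
# The objective is sum_{i,j} m[i,0] * a[0,j], so d/da[0,j] = sum_i m[i,0] = 5
print(m.sum(axis=0) * np.ones((1, 3)))    # [[ 5.  5.  5.]], matching the tensor3 result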