Discussion:
[theano-users] RuntimeError: error getting worksize: CUDNN_STATUS_BAD_PARAM
amiBAR
2018-06-29 08:30:01 UTC
Hi,

I'm new to Theano. I'm using Keras with the Theano backend, and I'm getting the following error:


RuntimeError: error getting worksize: CUDNN_STATUS_BAD_PARAM
Apply node that caused the error: GpuDnnConv{algo='small', inplace=True,
num_groups=1}(GpuContiguous.0, GpuContiguous.0,
GpuAllocEmpty{dtype='float32', context_name=None}.0,
GpuDnnConvDesc{border_mode='valid', subsample=(1, 1), dilation=(1, 1),
conv_mode='conv', precision='float32', num_groups=1}.0, Constant{1.0},
Constant{0.0})
Toposort index: 121
Inputs types: [GpuArrayType<None>(float32, (False, False, False, True)),
GpuArrayType<None>(float32, (False, False, False, True)),
GpuArrayType<None>(float32, (False, False, False, True)),
<theano.gof.type.CDataType object at 0x7fe230476dd0>, Scalar(float32),
Scalar(float32)]
Inputs shapes: [(32, 300, 882, 1), (300, 300, 3, 1), (32, 300, 880, 1), 'No
shapes', (), ()]
Inputs strides: [(1058400, 3528, 4, 4), (3600, 12, 4, 4), (1056000, 3520,
4, 4), 'No strides', (), ()]
Inputs values: ['not shown', 'not shown', 'not shown', <capsule object NULL
at 0x7fe1b1d5d1e0>, 1.0, 0.0]
Inputs type_num: [11, 11, 11, '', 11, 11]
Outputs clients: [[Rebroadcast{?,?,?,0}(GpuDnnConv{algo='small',
inplace=True, num_groups=1}.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to
make it longer):
File "CNN_for_Sentiment_Analysis.py", line 650, in <module>
NN_nets_rr0(embeddings,train_name,test_name, dev_name,max_sent_len)
File "CNN_for_Sentiment_Analysis.py", line 394, in NN_nets_rr0
conv_0 = Conv1D(filters=nb_filter, kernel_size=n_gram[0],
activation='relu', padding="valid", name='conv_'+str(n_gram[0]))(drop0)
File
"/lium/buster1/barhoumi/miniconda2/envs/keras-theano-gensim/lib/python2.7/site-packages/keras/engine/topology.py",
line 617, in __call__
output = self.call(inputs, **kwargs)
File
"/lium/buster1/barhoumi/miniconda2/envs/keras-theano-gensim/lib/python2.7/site-packages/keras/layers/convolutional.py",
line 160, in call
dilation_rate=self.dilation_rate[0])
File
"/lium/buster1/barhoumi/miniconda2/envs/keras-theano-gensim/lib/python2.7/site-packages/keras/backend/theano_backend.py",
line 1870, in conv1d
data_format=data_format, dilation_rate=dilation_rate)
File
"/lium/buster1/barhoumi/miniconda2/envs/keras-theano-gensim/lib/python2.7/site-packages/keras/backend/theano_backend.py",
line 1916, in conv2d
filter_dilation=dilation_rate)

Debugprint of the apply node:
GpuDnnConv{algo='small', inplace=True, num_groups=1} [id A]
<GpuArrayType<None>(float32, (False, False, False, True))> ''
|GpuContiguous [id B] <GpuArrayType<None>(float32, (False, False, False,
True))> ''
| |InplaceGpuDimShuffle{0,2,1,x} [id C] <GpuArrayType<None>(float32,
(False, False, False, True))> ''
| |if{inplace,gpu} [id D] <GpuArrayType<None>(float32, 3D)> ''
| |keras_learning_phase [id E] <TensorType(uint8, scalar)>
| |GpuElemwise{Composite{(i0 * i1 * Cast{float32}(LT(i2, i3)))}}[(0,
2)]<gpuarray> [id F] <GpuArrayType<None>(float32, 3D)> ''
| | |GpuArrayConstant{[[[2.]]]} [id G] <GpuArrayType<None>(float32,
(True, True, True))>
| | |GpuReshape{3} [id H] <GpuArrayType<None>(float32, 3D)> ''
| | | |GpuAdvancedSubtensor1 [id I] <GpuArrayType<None>(float32,
matrix)> ''
| | | | |embedding/embeddings [id J] <GpuArrayType<None>(float32,
matrix)>
| | | | |GpuContiguous [id K] <GpuArrayType<None>(int64, vector)> ''
| | | | |GpuElemwise{Cast{int64}}[]<gpuarray> [id L]
<GpuArrayType<None>(int64, vector)> ''
| | | | |GpuReshape{1} [id M] <GpuArrayType<None>(int32, vector)>
''
| | | | |GpuFromHost<None> [id N] <GpuArrayType<None>(int32,
matrix)> ''
| | | | | |/input [id O] <TensorType(int32, matrix)>
| | | | |TensorConstant{(1,) of -1} [id P] <TensorType(int64,
(True,))>
| | | |MakeVector{dtype='int64'} [id Q] <TensorType(int64, vector)>
''
| | | |Shape_i{0} [id R] <TensorType(int64, scalar)> ''
| | | | |/input [id O] <TensorType(int32, matrix)>
| | | |Shape_i{1} [id S] <TensorType(int64, scalar)> ''
| | | | |/input [id O] <TensorType(int32, matrix)>
| | | |Shape_i{1} [id T] <TensorType(int64, scalar)> ''
| | | |embedding/embeddings [id J] <GpuArrayType<None>(float32,
matrix)>
| | |GPUA_mrg_uniform{GpuArrayType<None>(float32, 3D),inplace}.1 [id
U] <GpuArrayType<None>(float32, 3D)> ''
| | | |<GpuArrayType<None>(int32, matrix)> [id V]
<GpuArrayType<None>(int32, matrix)>
| | | |MakeVector{dtype='int64'} [id Q] <TensorType(int64, vector)>
''
| | |GpuArrayConstant{[[[0.5]]]} [id W] <GpuArrayType<None>(float32,
(True, True, True))>
| |GpuReshape{3} [id H] <GpuArrayType<None>(float32, 3D)> ''
|GpuContiguous [id X] <GpuArrayType<None>(float32, (False, False, False,
True))> ''
| |InplaceGpuDimShuffle{2,1,0,x} [id Y] <GpuArrayType<None>(float32,
(False, False, False, True))> ''
| |conv_3/kernel [id Z] <GpuArrayType<None>(float32, 3D)>
|GpuAllocEmpty{dtype='float32', context_name=None} [id BA]
<GpuArrayType<None>(float32, (False, False, False, True))> ''
| |Assert{msg='The convolution would produce an invalid shape (dim[0] <
0).'} [id BB] <TensorType(int64, scalar)> ''
| | |Shape_i{0} [id BC] <TensorType(int64, scalar)> ''
| | | |GpuContiguous [id B] <GpuArrayType<None>(float32, (False, False,
False, True))> ''
| | |Elemwise{ge,no_inplace} [id BD] <TensorType(bool, scalar)> ''
| | |Shape_i{0} [id BC] <TensorType(int64, scalar)> ''
| | |TensorConstant{0} [id BE] <TensorType(int8, scalar)>
| |Assert{msg='The convolution would produce an invalid shape (dim[1] <
0).'} [id BF] <TensorType(int64, scalar)> ''
| | |Shape_i{0} [id BG] <TensorType(int64, scalar)> ''
| | | |GpuContiguous [id X] <GpuArrayType<None>(float32, (False, False,
False, True))> ''
| | |Elemwise{ge,no_inplace} [id BH] <TensorType(bool, scalar)> ''
| | |Shape_i{0} [id BG] <TensorType(int64, scalar)> ''
| | |TensorConstant{0} [id BE] <TensorType(int8, scalar)>
| |Assert{msg='The convolution would produce an invalid shape (dim[2] <=
0).'} [id BI] <TensorType(int64, scalar)> ''
| | |Elemwise{Composite{((i0 - (((i1 - i2) * i3) + i4)) + i5)}}[(0, 1)]
[id BJ] <TensorType(int64, scalar)> ''
| | | |Shape_i{2} [id BK] <TensorType(int64, scalar)> ''
| | | | |GpuContiguous [id B] <GpuArrayType<None>(float32, (False, False,
False, True))> ''
| | | |Shape_i{2} [id BL] <TensorType(int64, scalar)> ''
| | | | |GpuContiguous [id X] <GpuArrayType<None>(float32, (False, False,
False, True))> ''
| | | |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
| | | |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
| | | |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
| | | |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
| | |Elemwise{gt,no_inplace} [id BN] <TensorType(bool, scalar)> ''
| | |Elemwise{Composite{((i0 - (((i1 - i2) * i3) + i4)) + i5)}}[(0, 1)]
[id BJ] <TensorType(int64, scalar)> ''
| | |TensorConstant{0} [id BE] <TensorType(int8, scalar)>
| |Elemwise{Composite{((i0 - (((i1 - i2) * i3) + i4)) + i5)}}[(0, 1)] [id
BO] <TensorType(int64, scalar)> ''
| |Shape_i{3} [id BP] <TensorType(int64, scalar)> ''
| | |GpuContiguous [id B] <GpuArrayType<None>(float32, (False, False,
False, True))> ''
| |Shape_i{3} [id BQ] <TensorType(int64, scalar)> ''
| | |GpuContiguous [id X] <GpuArrayType<None>(float32, (False, False,
False, True))> ''
| |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
| |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
| |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
| |TensorConstant{1} [id BM] <TensorType(int8, scalar)>
|GpuDnnConvDesc{border_mode='valid', subsample=(1, 1), dilation=(1, 1),
conv_mode='conv', precision='float32', num_groups=1} [id BR]
<CDataType{cudnnConvolutionDescriptor_t}> ''
| |Shape [id BS] <TensorType(int64, vector)> ''
| |GpuContiguous [id X] <GpuArrayType<None>(float32, (False, False,
False, True))> ''
|Constant{1.0} [id BT] <float32>
|Constant{0.0} [id BU] <float32>

Storage map footprint:
- embedding/embeddings, Shared Input, Shape: (30000, 300), ElemSize: 4
Byte(s), TotalSize: 36000000 Byte(s)
- InplaceGpuDimShuffle{0,2,1,x}.0, Shape: (32, 300, 882, 1), ElemSize: 4
Byte(s), TotalSize: 33868800 Byte(s)
- GpuContiguous.0, Shape: (32, 300, 882, 1), ElemSize: 4 Byte(s),
TotalSize: 33868800 Byte(s)
- GpuAllocEmpty{dtype='float32', context_name=None}.0, Shape: (32, 300,
880, 1), ElemSize: 4 Byte(s), TotalSize: 33792000 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (7, 300, 300), ElemSize:
4 Byte(s), TotalSize: 2520000 Byte(s)
- conv_7/kernel, Shared Input, Shape: (7, 300, 300), ElemSize: 4 Byte(s),
TotalSize: 2520000 Byte(s)
- conv_5/kernel, Shared Input, Shape: (5, 300, 300), ElemSize: 4 Byte(s),
TotalSize: 1800000 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (5, 300, 300), ElemSize:
4 Byte(s), TotalSize: 1800000 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (395100, 1), ElemSize: 4
Byte(s), TotalSize: 1580400 Byte(s)
- dense1/kernel, Shared Input, Shape: (395100, 1), ElemSize: 4 Byte(s),
TotalSize: 1580400 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (3, 300, 300), ElemSize:
4 Byte(s), TotalSize: 1080000 Byte(s)
- GpuContiguous.0, Shape: (300, 300, 3, 1), ElemSize: 4 Byte(s),
TotalSize: 1080000 Byte(s)
- conv_3/kernel, Shared Input, Shape: (3, 300, 300), ElemSize: 4 Byte(s),
TotalSize: 1080000 Byte(s)
- GPUA_mrg_uniform{GpuArrayType<None>(float32, 3D),inplace}.0, Shape:
(15360, 6), ElemSize: 4 Byte(s), TotalSize: 368640 Byte(s)
- <GpuArrayType<None>(int32, matrix)>, Shared Input, Shape: (15360, 6),
ElemSize: 4 Byte(s), TotalSize: 368640 Byte(s)
- /input, Input, Shape: (32, 882), ElemSize: 4 Byte(s), TotalSize: 112896
Byte(s)
- conv_5/bias, Shared Input, Shape: (300,), ElemSize: 4 Byte(s),
TotalSize: 1200 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (300,), ElemSize: 4
Byte(s), TotalSize: 1200 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (300,), ElemSize: 4
Byte(s), TotalSize: 1200 Byte(s)
- conv_3/bias, Shared Input, Shape: (300,), ElemSize: 4 Byte(s),
TotalSize: 1200 Byte(s)
- conv_7/bias, Shared Input, Shape: (300,), ElemSize: 4 Byte(s),
TotalSize: 1200 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (300,), ElemSize: 4
Byte(s), TotalSize: 1200 Byte(s)
- GpuElemwise{sub,no_inplace}.0, Shape: (32, 1), ElemSize: 4 Byte(s),
TotalSize: 128 Byte(s)
- /dense1_target, Input, Shape: (32, 1), ElemSize: 4 Byte(s), TotalSize:
128 Byte(s)
- /dense1_sample_weights, Input, Shape: (32,), ElemSize: 4 Byte(s),
TotalSize: 128 Byte(s)
- GpuFromHost<None>.0, Shape: (32, 1), ElemSize: 4 Byte(s), TotalSize: 128
Byte(s)
- TensorConstant{[132000 13..00 131400]}, Shape: (3,), ElemSize: 8
Byte(s), TotalSize: 24 Byte(s)
- TensorConstant{[2 1]}, Shape: (2,), ElemSize: 8 Byte(s), TotalSize: 16
Byte(s)
- TensorConstant{(2,) of 0}, Shape: (2,), ElemSize: 8 Byte(s), TotalSize:
16 Byte(s)
- TensorConstant{(1,) of 300}, Shape: (1,), ElemSize: 8 Byte(s),
TotalSize: 8 Byte(s)
- Adagrad/iterations, Shared Input, Shape: (), ElemSize: 8 Byte(s),
TotalSize: 8.0 Byte(s)
- Shape_i{1}.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{876}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- Assert{msg='The convolution would produce an invalid shape (dim[0] <
0).'}.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- Constant{3}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{440}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- TensorConstant{1}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- Shape_i{0}.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{132000}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- TensorConstant{(1,) of 878}, Shape: (1,), ElemSize: 8 Byte(s),
TotalSize: 8 Byte(s)
- Shape_i{3}.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{878}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- Shape_i{2}.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{438}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- TensorConstant{(1,) of -1}, Shape: (1,), ElemSize: 8 Byte(s), TotalSize:
8 Byte(s)
- Constant{2}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- Constant{1}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{439}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- TensorConstant{(1,) of 880}, Shape: (1,), ElemSize: 8 Byte(s),
TotalSize: 8 Byte(s)
- TensorConstant{131400}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- TensorConstant{(1,) of 876}, Shape: (1,), ElemSize: 8 Byte(s),
TotalSize: 8 Byte(s)
- Shape_i{1}.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{300}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- TensorConstant{880}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- Constant{0}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
- TensorConstant{131700}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0
Byte(s)
- GpuArrayConstant{[inf]}, Shape: (1,), ElemSize: 4 Byte(s), TotalSize: 4
Byte(s)
- TensorConstant{0.0}, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0
Byte(s)
- GpuArrayConstant{[[0.9999999]]}, Shape: (1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- GpuArrayConstant{[[inf]]}, Shape: (1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- GpuArrayConstant{[[0.]]}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize:
4 Byte(s)
- GpuArrayConstant{[[-1.]]}, Shape: (1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- training/Adagrad/variable, Shared Input, Shape: (1,), ElemSize: 4
Byte(s), TotalSize: 4 Byte(s)
- GpuArrayConstant{[[[0.5]]]}, Shape: (1, 1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- Constant{0.0}, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- GpuArrayConstant{[[[inf]]]}, Shape: (1, 1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- GpuArrayConstant{[[[2.]]]}, Shape: (1, 1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- GpuArrayConstant{[[[0.]]]}, Shape: (1, 1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- Constant{1.0}, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
- GpuArrayConstant{[[1.e-07]]}, Shape: (1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- GpuArrayConstant{[[[1.e-07]]]}, Shape: (1, 1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- GpuArrayConstant{[[1.]]}, Shape: (1, 1), ElemSize: 4 Byte(s), TotalSize:
4 Byte(s)
- GpuArrayConstant{[[[[0.5]]]]}, Shape: (1, 1, 1, 1), ElemSize: 4 Byte(s),
TotalSize: 4 Byte(s)
- GpuArrayConstant{[1.e-07]}, Shape: (1,), ElemSize: 4 Byte(s), TotalSize:
4 Byte(s)
- dense1/bias, Shared Input, Shape: (1,), ElemSize: 4 Byte(s), TotalSize:
4 Byte(s)
- Adagrad/lr, Shared Input, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0
Byte(s)
- GpuArrayConstant{[0.]}, Shape: (1,), ElemSize: 4 Byte(s), TotalSize: 4
Byte(s)
- TensorConstant{0}, Shape: (), ElemSize: 1 Byte(s), TotalSize: 1.0 Byte(s)
- TensorConstant{1}, Shape: (), ElemSize: 1 Byte(s), TotalSize: 1.0 Byte(s)
- GpuArrayConstant{[0]}, Shape: (1,), ElemSize: 1 Byte(s), TotalSize: 1
Byte(s)
- keras_learning_phase, Input, Shape: (), ElemSize: 1 Byte(s), TotalSize:
1.0 Byte(s)
TotalSize: 153060008.0 Byte(s) 0.143 GB
TotalSize inputs: 50450104.0 Byte(s) 0.047 GB
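For reference, the shapes in the log are internally consistent: with `border_mode='valid'`, stride 1, and no dilation, an input of length 882 and a kernel of size 3 give an output of length 880, which matches the `GpuAllocEmpty` buffer shape `(32, 300, 880, 1)` above. This can be checked without Theano using the standard output-size formula for a valid convolution (a sanity-check sketch, assuming the shapes map to the conv as shown in the "Inputs shapes" line):

```python
# Sanity-check the conv shapes from the error log (plain Python, no Theano needed).
# Shapes are copied from the "Inputs shapes" line above; the formula is the
# standard one for a 'valid' convolution with stride 1 and no dilation.

def valid_conv_length(input_len, kernel_len, stride=1, dilation=1):
    """Output length of a 1D 'valid' convolution."""
    effective_kernel = (kernel_len - 1) * dilation + 1
    return (input_len - effective_kernel) // stride + 1

batch, channels, seq_len = 32, 300, 882   # input:  (32, 300, 882, 1)
n_filters, kernel_size = 300, 3           # kernel: (300, 300, 3, 1)

out_len = valid_conv_length(seq_len, kernel_size)
print((batch, n_filters, out_len, 1))     # matches GpuAllocEmpty: (32, 300, 880, 1)
```

Since the shapes agree, the `CUDNN_STATUS_BAD_PARAM` does not look like a plain shape mismatch in the model itself.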

Does anyone know how to solve this problem? Thanks in advance :)
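One common way to isolate errors like this, assuming a recent Theano with the gpuarray backend, is to rerun with the cuDNN convolution path disabled; if training then works, the failure is specific to cuDNN rather than to the model (flag names are the standard ones from Theano's configuration, suggested here as a diagnostic, not a verified fix):

```shell
# Exclude only the cuDNN convolution optimizations:
THEANO_FLAGS='optimizer_excluding=conv_dnn' python CNN_for_Sentiment_Analysis.py

# Or disable cuDNN entirely:
THEANO_FLAGS='dnn.enabled=False' python CNN_for_Sentiment_Analysis.py
```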