[theano-users] GpuElemwise working with old backend but not with new ?

Discussion:

Rodolphe Cambier

2017-08-16 23:17:11 UTC

Hello,

I have the same code running on two computers, one with the old backend and
one with the new one. The code is the following:

import lasagne
import theano
import theano.tensor as T
import lasagne.layers as ll

max_length = 1000
learning_rate = .1

l_in = ll.InputLayer(shape=(None, max_length, 1), name="InputLayer")
l_reshape = ll.ReshapeLayer(l_in, ([0], 1, [1]), name="ReshapeLayer")
l_conv0 = ll.Conv1DLayer(l_reshape, num_filters=15, filter_size=30, stride=10,
                         nonlinearity=lasagne.nonlinearities.rectify,
                         name="Conv1DLayer_0")
l_conv1 = ll.Conv1DLayer(l_conv0, num_filters=15, filter_size=4, stride=4,
                         nonlinearity=lasagne.nonlinearities.rectify,
                         name="Conv1DLayer_1")
l_conv2 = ll.Conv1DLayer(l_conv1, num_filters=15, filter_size=1, stride=1,
                         nonlinearity=lasagne.nonlinearities.rectify,
                         name="Conv1DLayer_2")
l_out = ll.DenseLayer(ll.dropout(l_conv2, p=0.3), num_units=1,
                      nonlinearity=lasagne.nonlinearities.linear,
                      name="Denselayer")

predicted_values = lasagne.layers.get_output(l_out)
target_values = T.ivector('target_output')

predict_log = T.sgn(predicted_values) * T.log(1+T.abs_(predicted_values))
target_log = T.sgn(target_values) * T.log(1+T.abs_(target_values))

cost = T.mean(lasagne.objectives.squared_error(predict_log,target_log))
all_params = lasagne.layers.get_all_params(l_out)

updates = lasagne.updates.adagrad(cost, all_params, learning_rate)
train = theano.function([l_in.input_var, target_values],
                        [cost, predicted_values, target_values],
                        updates=updates,
                        allow_input_downcast=True)
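
For reference, the transform applied to both the predictions and the targets in the cost above is sgn(x) * log(1 + |x|). Here is a minimal plain-Python sketch of that formula on scalars. The function name signed_log is made up for illustration and is not part of Theano or Lasagne.

```python
import math


def signed_log(x):
    """Return sgn(x) * log(1 + |x|) for a scalar x.

    log1p(|x|) is always >= 0, so copying the sign of x onto it
    gives the same result as multiplying by sgn(x).
    """
    return math.copysign(math.log1p(abs(x)), x)


# The transform is odd-symmetric and compresses large magnitudes.
for value in (-1000, -1, 0, 1, 1000):
    print(value, signed_log(value))
```

The traceback further down shows Theano rewriting log(1 + abs(x)) into log1p(Abs(x)), which is why log1p is used here.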

So I set up a simple convolutional net, then I try to measure a specific
cost on it, using T.sgn and T.log.
On the old backend, this works fine.
On the new backend, it worked fine for a day (I ran it maybe 15 times),
then at some point it output:

Using cuDNN version 5105 on context None
Mapped name None to device cuda0: Tesla K40c (0000:01:00.0)
Traceback (most recent call last):
  File "quicktest.py", line 34, in <module>
    train = theano.function([l_in.input_var, target_values], [cost, predicted_values, target_values], updates =updates, allow_input_downcast=True)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/compile/pfunc.py", line 486, in pfunc
    output_keys=output_keys)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/compile/function_module.py", line 1838, in orig_function
    fn = m.create(defaults)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/compile/function_module.py", line 1712, in create
    input_storage=input_storage_lists, storage_map=storage_map)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/gof/link.py", line 699, in make_thunk
    storage_map=storage_map)[:3]
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/gof/vm.py", line 1084, in make_all
    impl=impl))
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/gof/op.py", line 955, in make_thunk
    no_recycling)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/gof/op.py", line 858, in make_c_thunk
    output_storage=node_output_storage)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/gof/cc.py", line 1215, in make_thunk
    keep_lock=keep_lock)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/gof/cc.py", line 1155, in __compile__
    keep_lock=keep_lock)
  File "/home/rcambier/miniconda2/envs/cardio_env/lib/python2.7/site-packages/theano/gof/cc.py", line 1635, in cthunk_factory
    *(in_storage + out_storage + orphd))
RuntimeError: ('The following error happened while compiling the node', GpuElemwise{Composite{((i0 * log1p(i1)) - (sgn(i2) * log1p(Abs(i2))))}}[]<gpuarray>(GpuElemwise{sgn,no_inplace}.0, GpuElemwise{Abs}[(0, 0)]<gpuarray>.0, InplaceGpuDimShuffle{0,x}.0), '\n', 'Could not initialize elemwise support')

I cannot for the love of god find out what I could have changed between the
executions. I reinstalled Theano and pygpu, but that doesn't change anything.
I can't find anyone having this error except for OpenCL-related problems
and this issue <https://github.com/Theano/Theano/issues/5541>, which is
supposed to be fixed (I am on the development version of Theano).

So if anyone has any idea of what I could do to fix the problem, it would
be very welcome :)
Thanks

--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Rodolphe Cambier

2017-08-17 03:00:06 UTC

I was able to pinpoint the problem to this part:

This code does not work:
import theano
import theano.tensor as T

a = T.ivector()
fun = theano.function([a], T.log10(a))

And this code does:
import theano
import theano.tensor as T

a = T.vector()
fun = theano.function([a], T.log10(a))

So basically, declaring the vector as int32 is what crashes the
GpuElemwise.
And I really don't know why.

Frédéric Bastien

2017-08-17 20:10:02 UTC

Thanks. I don't know when @abergeron can check that. You can probably work
around it by introducing a cast to floatX before the log10. Can you make an
issue on GitHub so we don't lose track of this?

Rodolphe Cambier

2017-08-18 01:42:30 UTC

As a workaround, I simply used a vector of floats instead of int32.
OK, I'll make an issue on GitHub.

Frédéric Bastien

2017-08-23 14:50:19 UTC
