Discussion:
[theano-users] Problem with Theano on GPU
Ruben Dario Fonnegra Tarazona
2017-11-22 00:49:45 UTC
Hi.



I'm having problems executing code in Theano. I installed the dev version
and it runs perfectly on the CPU. However, when I try to run anything on the
GPU (even LeNet on MNIST), the model doesn't run at all and the only output
is "Segmentation fault. Core dumped". To investigate, I tried to verify the
Theano installation with the command THEANO_FLAGS='';python -c "import theano;
theano.test()" and it did not work either (I attach the output in a log
file). I tried several things but couldn't solve the problem. I attach a file
with the output after executing the command, and my .theanorc file. I work on
Kubuntu 16.04 with CUDA 8 and cuDNN 6.2 on a Quadro M4000 (8 GB; the problem
is not memory allocation). I hope you can help me solve the problem. Thanks
in advance, and I will be very attentive to your answers.



What I've tried:
- Installing the stable, bleeding-edge, and dev Theano versions, each with
the corresponding libgpuarray library (same issue)
- Manually compiling OpenBLAS for the installation
- Setting the environment variable CUDA_LAUNCH_BLOCKING to 1
- Using several floatX types (float16, float32, float64)
- Using the device=cpu and device=cuda0 flags

Note: it might be the CUDA drivers; however, I can run TensorFlow on the GPU
without any problem. A minimal GPU sanity check is sketched below.
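
For reference, the GPU sanity check I mean is essentially the standard one
from the Theano documentation (sketched here from memory, not copied from my
actual scripts); it prints whether the compiled graph ran on the CPU or the
GPU. I launch it with THEANO_FLAGS=device=cuda0,floatX=float32.

from theano import function, config, shared, tensor
import numpy
import time

vlen = 10 * 30 * 768  # roughly 10 x #cores x #threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))  # elementwise exp; on the GPU this becomes a Gpu op
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(node.op, tensor.Elemwise) and
              ('Gpu' not in type(node.op).__name__)
              for node in f.maker.fgraph.toposort()]):
    print("Used the cpu")
else:
    print("Used the gpu")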




output of theano.test()
---------------------------------------------------------------------

HP-Z840-Workstation:~/Data$ THEANO_FLAGS=''; python -c "import theano;
theano.test()"

Theano version 1.0.0

theano is installed in /home/bluegum1/Theano/theano
NumPy version 1.13.3

NumPy relaxed strides checking option: True
NumPy is installed in
/home/bluegum1/.local/lib/python2.7/site-packages/numpy

Python version 2.7.12 (default, Nov 19 2016, 06:48:10) [GCC 5.4.0 20160609]

nose version 1.3.7

Using cuDNN version 6021 on context None

Mapped name None to device cuda: Quadro M4000 (0000:04:00.0)

..........................................ERROR (theano.gof.opt): Optimization failure due to: insert_bad_dtype
ERROR (theano.gof.opt): node: Elemwise{add,no_inplace}(<TensorType(float64, vector)>, <TensorType(float64, vector)>)

ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
  File "/home/bluegum1/Theano/theano/gof/opt.py", line 2059, in process_node
    remove=remove)
  File "/home/bluegum1/Theano/theano/gof/toolbox.py", line 569, in replace_all_validate_remove
    chk = fgraph.replace_all_validate(replacements, reason)
  File "/home/bluegum1/Theano/theano/gof/toolbox.py", line 518, in replace_all_validate
    fgraph.replace(r, new_r, reason=reason, verbose=False)
  File "/home/bluegum1/Theano/theano/gof/fg.py", line 486, in replace
    ". The type of the replacement must be the same.", old, new)

BadOptimization: BadOptimization Error
Variable: id 140331434464144 Elemwise{Cast{float32}}.0
Op Elemwise{Cast{float32}}(Elemwise{add,no_inplace}.0)
Value Type: <type 'NoneType'>
Old Value: None
New Value: None
Reason: insert_bad_dtype. The type of the replacement must be the same.
Old Graph:
Elemwise{add,no_inplace} [id A] <TensorType(float64, vector)> ''
|<TensorType(float64, vector)> [id B] <TensorType(float64, vector)>
|<TensorType(float64, vector)> [id C] <TensorType(float64, vector)>


New Graph:
Elemwise{Cast{float32}} [id D] <TensorType(float32, vector)> ''
|Elemwise{add,no_inplace} [id A] <TensorType(float64, vector)> ''



Hint: relax the tolerance by setting tensor.cmp_sloppy=1
or even tensor.cmp_sloppy=2 for less-strict comparison


......................................S............................./home/bluegum1/Theano/theano/compile/nanguardmode.py:150:
RuntimeWarning: All-NaN slice encountered
return np.isinf(np.nanmax(arr)) or np.isinf(np.nanmin(arr))
/home/bluegum1/Theano/theano/compile/nanguardmode.py:150: RuntimeWarning:
All-NaN axis encountered
return np.isinf(np.nanmax(arr)) or np.isinf(np.nanmin(arr))

............................................../home/bluegum1/Theano/theano/gof/vm.py:886:
UserWarning: CVM does not support memory profile, using Stack VM.
'CVM does not support memory profile, using Stack VM.')
............/home/bluegum1/Theano/theano/compile/profiling.py:283:
UserWarning: You are running the Theano profiler with CUDA enabled. Theano
GPU ops execution is asynchronous by default. So by default, the profile is
useless. You must set the environment variable CUDA_LAUNCH_BLOCKING to 1 to
tell the CUDA driver to synchronize the execution to get a meaningful
profile.
warnings.warn(msg)

....................0.0581646137166

0.0581646137166

0.0581646137166

0.0581646137166

.................................................................................................................................................................../home/bluegum1/Theano/theano/gof/vm.py:889:
UserWarning: LoopGC does not support partial evaluation, using Stack VM.
'LoopGC does not support partial evaluation, '
..........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................Violación
de segmento (`core' generado)  [Spanish locale for "Segmentation fault (core dumped)"]

---------------------------------------------------------------------

theanorc file
---------------------------------------------------------------------



[global]
floatX = float32
#device = cuda0
#device = cpu
#optimizer=fast_run
#optimizer=fast_compile  # disables the GPU
#optimizer=None

[cuda]
root = /usr/local/cuda-8.0

[nvcc]
fastmath = True

[lib]
cnmem = 1.0
w***@gmail.com
2017-11-22 09:23:18 UTC
There is not much I can say, since I am running Theano, NumPy, etc. on
Windows 7 with older versions (Theano 0.9, cuDNN ~5005) and everything works
there. Also, your configuration file looks quite different from mine. As a
reference, here is my configuration:

[global]
floatX = float32
device =cuda
mode=FAST_RUN
allow_gc=False
warn_float64=warn

[cuda]
root = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin

[gpuarray]
preallocate = 0.85

[dnn]
library_path = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64
include_path = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include

[nvcc]
flags=-LC:\Users.....
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
fastmath = True
optimizer_including=dnn
[blas]
ldflags = -lf77blas -latlas -lgfortran -lblas
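
To double-check which settings Theano actually picked up from your .theanorc
(just a quick sketch; theano.config exposes the resolved values once Theano
is imported), you can print them:

import theano
# the values below reflect .theanorc merged with any THEANO_FLAGS overrides
print(theano.config.device)   # e.g. 'cuda' or 'cpu'
print(theano.config.floatX)   # e.g. 'float32'
print(theano.config.mode)     # e.g. 'FAST_RUN'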


Pascal Lamblin
2017-11-28 17:47:15 UTC
Hi,

Can you try running "nosetests -v" or "theano-nose -v" on the command line,
rather than from Python?
That should at least give the name of the test during which it crashes.
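
For example (just a sketch, with a placeholder log-file name), you can
capture the verbose output to a file so that the name of the last test
started before the segfault is preserved:

import subprocess
# run the suite verbosely; nose prints test names to stderr, so merge it into
# the same log file to keep the last name printed before the crash
with open("theano_test.log", "wb") as log:
    subprocess.call(["theano-nose", "-v"], stdout=log, stderr=subprocess.STDOUT)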
--
Pascal Lamblin
Ruben Dario Fonnegra Tarazona
2017-11-28 18:16:08 UTC
Hi Pascal.



I already solved the problem. Apparently, the segfault was caused by an
internal problem in the cuDNN 6 library (it seemed to be a bug). After I
verified it was a cuDNN library problem, I realized I could still run code
with the dnn.enabled=False flag. In the end, though, I fixed it by
uninstalling cuDNN 6 and installing cuDNN 7, and everything works perfectly
now. Thank you very much for your response, in any case. (Y)
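
For anyone who hits the same issue, this is roughly the workaround I used
while debugging (a sketch only; the flags must be set before Theano is
imported, and every value except dnn.enabled is just a placeholder for my
usual settings):

import os
# set the flags before importing theano so they are read at initialization;
# dnn.enabled=False keeps the GPU backend but disables cuDNN
os.environ["THEANO_FLAGS"] = "device=cuda0,floatX=float32,dnn.enabled=False"
import theano
print(theano.config.dnn.enabled)  # should report 'False'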




