Justin Brody
2014-06-28 23:01:46 UTC
Hello,
I've been trying for many days to properly understand how shared variables
and symbolic variables interact in Theano, but sadly I don't think I'm
there. My ignorance is quite probably reflected in this question but I
would still be very grateful for any guidance.
I'm trying to implement a "deconvolutional network"; specifically, I have a
3-tensor of inputs (each input is a 2D image) and a 4-tensor of codes; for
the i-th input, codes[i] represents a set of codewords which together code
for input i.
I've been having a lot of trouble figuring out how to do gradient descent
on the codewords. Here are the relevant parts of my code:
codes = shared(initial_codes, name="codes")  # shared 4-tensor w/ dims (input #, code #, row #, col #)
idx = T.lscalar()

pre_loss_conv = conv2d(input=codes[idx].dimshuffle('x', 0, 1, 2),
                       filters=dicts.dimshuffle('x', 0, 1, 2),
                       border_mode='valid')
loss_conv = pre_loss_conv.reshape((pre_loss_conv.shape[2], pre_loss_conv.shape[3]))
loss_in = inputs[idx]
loss = T.sum(1./2.*(loss_in - loss_conv)**2)

del_codes = T.grad(loss, codes[idx])
delc_fn = function([idx], del_codes)
train_codes = function([input_index], loss,
                       updates=[[codes,
                                 T.set_subtensor(codes[input_index],
                                                 codes[input_index] - learning_rate*del_codes[input_index])]])
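To make the loss concrete, here is a plain-NumPy sketch of what I believe the graph above computes for a single input: the reconstruction is the sum, over codewords, of 'valid'-mode 2D convolutions of codes[idx][j] with dicts[j] (conv2d treats the code # axis as channels, and convolution flips the kernel). The shapes and names here are just illustrative assumptions:

```python
import numpy as np

def valid_conv2d(a, k):
    # 'valid'-mode 2-D convolution: kernel is flipped, as in conv2d.
    kf = k[::-1, ::-1]
    oh = a.shape[0] - k.shape[0] + 1
    ow = a.shape[1] - k.shape[1] + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            out[r, c] = np.sum(a[r:r + k.shape[0], c:c + k.shape[1]] * kf)
    return out

def reconstruction_loss(codes_i, dicts, input_i):
    # codes_i: (n_codes, ch, cw); dicts: (n_codes, fh, fw);
    # input_i: the 2-D image being reconstructed.
    recon = sum(valid_conv2d(codes_i[j], dicts[j]) for j in range(len(dicts)))
    return 0.5 * np.sum((input_i - recon) ** 2)
```

So the gradient I'm after is just d(reconstruction_loss)/d(codes_i), holding dicts and the input fixed.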
(here codes and dicts are shared tensor variables). Theano is unhappy with
this, specifically with defining
del_codes = T.grad(loss, codes[idx])
The error message I'm getting is: *theano.gradient.DisconnectedInputError:
grad method was asked to compute the gradient with respect to a variable
that is not part of the computational graph of the cost, or is used only by
a non-differentiable operator: Subtensor{int64}.0*
I'm guessing that it wants a symbolic variable instead of codes[idx]; but
then I'm not sure how to get everything connected to get the intended
effect. I'm guessing I'll need to change the final line to something like

train_codes = function([input_index], loss,
                       updates=[[codes,
                                 T.set_subtensor(codes[input_index],
                                                 codes[input_index] - learning_rate*del_codes)]])
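In plain NumPy terms, the effect I want from that set_subtensor update is: take a gradient step on codes[input_index] only, leaving every other slice of the shared 4-tensor untouched. A minimal sketch (grad_slice here is a stand-in, my assumption, for whatever expression ends up holding d(loss)/d(codes[idx])):

```python
import numpy as np

def update_code_slice(codes, idx, grad_slice, learning_rate):
    # Gradient-descent step on slice idx only; all other slices unchanged.
    new_codes = codes.copy()
    new_codes[idx] = codes[idx] - learning_rate * grad_slice
    return new_codes
```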
Can someone give me some pointers as to how to define this function
properly? I think I'm probably missing something basic about working with
Theano but I'm not sure what.
Thanks in advance!
-Justin
--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit https://groups.google.com/d/optout.