jau94
2018-04-26 07:12:04 UTC
Hi,
I am using Theano == 1.0.1 for the Sequential Matching Network
<https://github.com/MarkWuNLP/MultiTurnResponseSelection>. I have tested
their code and it works well.
But now I want to modify their *predict method* in *SMN_last.py*. I want
to be able to provide the test data to the theano.function at each call,
instead of only giving the index (line 195) into a fixed test set:
val_model = theano.function([*index*], [y,predict,cost,error], givens=val_dic,
on_unused_input='ignore')
What if the test data can vary each time I call the theano.function? I
don't want to be restricted to selecting from the batches of a fixed test
set.
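To make the difference concrete, here is a stripped-down sketch of the two
patterns I mean, with a dummy model standing in for the real network (all
names here are placeholders, not the SMN code):

import numpy as np
import theano
import theano.tensor as T

x = T.matrix('x')          # model input (placeholder model below)
out = x.sum(axis=1)        # dummy "model" output
index = T.lscalar('index')
batch_size = 20

# Pattern used by SMN_last.py (simplified): the whole test set lives in a
# shared variable, and the compiled function only receives the batch index.
data = theano.shared(np.zeros((100, 4), dtype=theano.config.floatX))
fixed_model = theano.function(
    [index], out,
    givens={x: data[index * batch_size:(index + 1) * batch_size]})

# Pattern I want: the test data itself is an input of the compiled function,
# so I can pass a different numpy array on every call.
test_set = T.matrix('test_set')
flexible_model = theano.function(
    [test_set, index], out,
    givens={x: test_set[index * batch_size:(index + 1) * batch_size]})

This is exactly the kind of substitution I try to do below with test_dic,
only with the full SMN graph instead of the dummy output.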
Therefore, I have made the following change to the *predict method* in
*SMN_last.py*. I called the new method *load_graph*.
def load_graph(U, batch_size=20, max_l=100, hidden_size=100, word_embedding_size=100, session_hidden_size=50, session_input_size
    # for optimization
    hiddensize = hidden_size
    U = U.astype(dtype=theano.config.floatX)  # Cast the embedding matrix to floatX (THIS IS STILL A NUMPY ARRAY)
    rng = np.random.RandomState(3435)  # Create a seeded random number generator
    lsize, rsize = max_l, max_l

    # DECLARE THE INPUT TENSORS!!!
    test_set = T.matrix(dtype='int32')  # The tensor that will receive the test data
    sessionmask = T.matrix()
    lx = []
    lxmask = []
    for i in range(max_turn):  # For max_turn (default=10), generate as many tensor matrices
        lx.append(T.matrix())
        lxmask.append(T.matrix())
    index = T.lscalar()  # Declare a tensor scalar
    rx = T.matrix('rx')  # Declare a tensor matrix with a name. I think this will be the response!
    rxmask = T.matrix()  # Mask for the response as a tensor matrix
    y = T.ivector('y')  # Declare an integer vector for the labels
    Words = theano.shared(value=U, name="Words")  # Declare a shared variable with the embeddings

    llayer0_input = []
    for i in range(max_turn):
        llayer0_input.append(Words[T.cast(lx[i].flatten(), dtype="int32")]
                             .reshape((lx[i].shape[0], lx[i].shape[1], Words.shape[1])))
    rlayer0_input = Words[T.cast(rx.flatten(), dtype="int32")].reshape((rx.shape[0], rx.shape[1], Words.shape[1]))
    # input: word embeddings of the mini batch
    # # # Why is it divided in train, dev, test when we are predicting?
    # # # test_set = datasets

    q_embedding = []
    offset = 2 * lsize
    test_set_lx = []
    test_set_lx_mask = []
    for i in range(max_turn):
        test_set_lx.append(T.cast(test_set[:, offset * i:offset * i + lsize], dtype=theano.config.floatX))
        test_set_lx_mask.append(T.cast(test_set[:, offset * i + lsize:offset * i + 2 * lsize], dtype=theano.config.floatX))
    test_set_rx = T.cast(test_set[:, offset * max_turn:offset * max_turn + lsize], dtype=theano.config.floatX)
    test_set_rx_mask = T.cast(test_set[:, offset * max_turn + lsize:offset * max_turn + 2 * lsize], dtype=theano.config.floatX)
    test_set_session_mask = T.cast(test_set[:, -max_turn - 1:-1], dtype=theano.config.floatX)
    test_set_y = T.cast(test_set[:, -1], dtype='int32')  # somehow put int32 here

    test_dic = {}
    for i in range(max_turn):
        test_dic[lx[i]] = test_set_lx[i][index * batch_size:(index + 1) * batch_size]
        test_dic[lxmask[i]] = test_set_lx_mask[i][index * batch_size:(index + 1) * batch_size]
    test_dic[rx] = test_set_rx[index * batch_size:(index + 1) * batch_size]
    test_dic[sessionmask] = test_set_session_mask[index * batch_size:(index + 1) * batch_size]
    test_dic[rxmask] = test_set_rx_mask[index * batch_size:(index + 1) * batch_size]
    test_dic[y] = test_set_y[index * batch_size:(index + 1) * batch_size]

    sentence2vec = GRU(n_in=word_embedding_size, n_hidden=hiddensize, n_out=hiddensize)
    for i in range(max_turn):
        q_embedding.append(sentence2vec(llayer0_input[i], lxmask[i], True))
    r_embedding = sentence2vec(rlayer0_input, rxmask, True)

    pooling_layer = ConvSim(rng, max_l, session_input_size, hidden_size=hiddensize)
    poolingoutput = []
    #test = theano.function([index], pooling_layer(llayer0_input[-4], rlayer0_input, q_embedding[i], r_embedding), givens=test_dic, on_unused_input='ignore')
    for i in range(max_turn):
        poolingoutput.append(pooling_layer(llayer0_input[i], rlayer0_input, q_embedding[i], r_embedding))

    session2vec = GRU(n_in=session_input_size, n_hidden=session_hidden_size, n_out=session_hidden_size)
    res = session2vec(T.stack(poolingoutput, 1), sessionmask)
    classifier = LogisticRegression(res, session_hidden_size, 2, rng)
    #cost = classifier.negative_log_likelihood(y)
    #error = classifier.errors(y)
    #opt = Adam()

    params = classifier.params
    params += sentence2vec.params
    params += session2vec.params
    params += pooling_layer.params
    params += [Words]

    load_params(params, model_name)

    predict = classifier.predict_prob
    #test_model = theano.function([test_set, index], [y, predict, cost, error], givens=test_dic, on_unused_input='ignore')
    test_model = theano.function([test_set, index], [predict],
                                 givens=test_dic, on_unused_input='ignore', allow_input_downcast=True)

    return test_model
Note that the main changes are:
1. Addition of the tensor *test_set = T.matrix(dtype='int32')*: the tensor
that will receive the new test input.
2. Deletion of the shared variables, since the test data is now a symbolic
tensor and no longer a numpy array stored in a shared variable.
3. No prediction is done inside the method; it only returns the compiled
function *test_model* over the created computational graph.
4. The theano.function now has two inputs: test_set and index (intended
usage is sketched right after this list).
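Roughly, the intended usage is the following (load_embeddings and
build_test_matrix are just placeholders for however the embedding matrix U
and the padded test matrix are actually built; they are not functions from
the repository):

U = load_embeddings()                      # placeholder: numpy array, vocab_size x word_embedding_size
test_model = load_graph(U, batch_size=20)  # compile the graph once

batch_a = build_test_matrix(dialogues_a)   # placeholder: int32 numpy array, shape (n, 1111)
batch_b = build_test_matrix(dialogues_b)   # a different test set with the same column layout
probs_a = test_model(batch_a, 0)           # first mini-batch of batch_a
probs_b = test_model(batch_b, 0)           # same compiled function, different data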
Then I want to run the model with the same samples as before: 200 of them,
all in a single array. The input looks like:
test_set -> (200, 1111)
test_set is a numpy array with dtype='int32'. I call the forward pass only
for the first batch (index=0):
predictions = test_model(test_set, 0)
File "main_predict_single_batch.py", line 295, in <module>
predict=model(test,0)
File
File
File
File
File
File "theano/scan_module/scan_perform.pyx", line 215, in
Inputs types: [TensorType(int64, scalar), TensorType(float32, (False,
Could someone help me discover what is wrong? What is this error messagepredict=model(test,0)
File
"/data/ijauregi/Desktop/MyPythons/py27_theano_tensorflow/lib/python2.7/site-packages/theano/compile/function_module.py",
line 917, in __call__
storage_map=getattr(self.fn, 'storage_map', None))line 917, in __call__
File
"/data/ijauregi/Desktop/MyPythons/py27_theano_tensorflow/lib/python2.7/site-packages/theano/gof/link.py",
line 325, in raise_with_op
reraise(exc_type, exc_value, exc_trace)line 325, in raise_with_op
File
"/data/ijauregi/Desktop/MyPythons/py27_theano_tensorflow/lib/python2.7/site-packages/theano/compile/function_module.py",
line 903, in __call__
self.fn() if output_subset is None else\line 903, in __call__
File
"/data/ijauregi/Desktop/MyPythons/py27_theano_tensorflow/lib/python2.7/site-packages/theano/scan_module/scan_op.py",
line 963, in rval
r = p(n, [x[0] for x in i], o)line 963, in rval
File
"/data/ijauregi/Desktop/MyPythons/py27_theano_tensorflow/lib/python2.7/site-packages/theano/scan_module/scan_op.py",
line 952, in p
self, node)line 952, in p
File "theano/scan_module/scan_perform.pyx", line 215, in
theano.scan_module.scan_perform.perform
(/home/ijauregi/.theano/compiledir_Linux-3.10-el7.x86_64-x86_64-with-redhat-7.3-Maipo-x86_64-2.7.13-64/scan_perform/mod.cpp:2628)
NotImplementedError: We didn't implemented yet the case where scan do 0(/home/ijauregi/.theano/compiledir_Linux-3.10-el7.x86_64-x86_64-with-redhat-7.3-Maipo-x86_64-2.7.13-64/scan_perform/mod.cpp:2628)
iteration
forall_inplace,cpu,scan_fn}(Elemwise{minimum,no_inplace}.0,
Elemwise{sub,no_inplace}.0, Subtensor{int64:int64:int8}.0,
Elemwise{Cast{float32}}.0, IncSubtensor{InplaceSet;:int64:}.0,
<TensorType(float32, matrix)>, <TensorType(float32, matrix)>,
<TensorType(float32, matrix)>, <TensorType(float32, matrix)>,
<TensorType(float32, matrix)>, <TensorType(float32, matrix)>,
InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0,
InplaceDimShuffle{x,0}.0)
Toposort index: 1122forall_inplace,cpu,scan_fn}(Elemwise{minimum,no_inplace}.0,
Elemwise{sub,no_inplace}.0, Subtensor{int64:int64:int8}.0,
Elemwise{Cast{float32}}.0, IncSubtensor{InplaceSet;:int64:}.0,
<TensorType(float32, matrix)>, <TensorType(float32, matrix)>,
<TensorType(float32, matrix)>, <TensorType(float32, matrix)>,
<TensorType(float32, matrix)>, <TensorType(float32, matrix)>,
InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0,
InplaceDimShuffle{x,0}.0)
Inputs types: [TensorType(int64, scalar), TensorType(float32, (False,
False, True)), TensorType(float32, 3D), TensorType(float32, (False, False,
True)), TensorType(float32, 3D), TensorType(float32, matrix),
TensorType(float32, matrix), TensorType(float32, matrix),
TensorType(float32, matrix), TensorType(float32, matrix),
TensorType(float32, matrix), TensorType(float32, row), TensorType(float32,
row), TensorType(float32, row)]
Inputs shapes: [(), (0, 200, 1), (0, 200, 300), (0, 200, 1), (2, 200,True)), TensorType(float32, 3D), TensorType(float32, matrix),
TensorType(float32, matrix), TensorType(float32, matrix),
TensorType(float32, matrix), TensorType(float32, matrix),
TensorType(float32, matrix), TensorType(float32, row), TensorType(float32,
row), TensorType(float32, row)]
100), (300, 100), (100, 100), (300, 100), (300, 100), (100, 100), (100,
100), (1, 100), (1, 100), (1, 100)]
Inputs strides: [(), (800, 4, 4), (1200, 1200, 4), (800, 4, 4), (80000,100), (1, 100), (1, 100), (1, 100)]
400, 4), (400, 4), (400, 4), (400, 4), (400, 4), (400, 4), (400, 4), (400,
4), (400, 4), (400, 4)]
Inputs values: [array(0), array([], shape=(0, 200, 1), dtype=float32),4), (400, 4), (400, 4)]
array([], shape=(0, 200, 300), dtype=float32), array([], shape=(0, 200, 1),
dtype=float32), 'not shown', 'not shown', 'not shown', 'not shown', 'not
shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown']
[[Subtensor{int64:int64:int8}(forall_inplace,cpu,scan_fn}.0,
ScalarFromTensor.0, ScalarFromTensor.0, Constant{1})]]
HINT: Re-running with most Theano optimization disabled could give you a
back-trace of when this node was created. This can be done with by setting
the Theano flag 'optimizer=fast_compile'. If that does not work, Theano
optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint anddtype=float32), 'not shown', 'not shown', 'not shown', 'not shown', 'not
shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown']
[[Subtensor{int64:int64:int8}(forall_inplace,cpu,scan_fn}.0,
ScalarFromTensor.0, ScalarFromTensor.0, Constant{1})]]
HINT: Re-running with most Theano optimization disabled could give you a
back-trace of when this node was created. This can be done with by setting
the Theano flag 'optimizer=fast_compile'. If that does not work, Theano
optimizations can be disabled with 'optimizer=None'.
storage map footprint of this apply node.
indicating to me? I couldn't find any good explanation.
The same input data works well in *SMN_last.py*.
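In case it helps to narrow it down: my guess, from the (0, 200, 1) and
(0, 200, 300) entries in "Inputs shapes", is that one of the scan ops ends
up being asked to iterate over a sequence with 0 time steps. A tiny
standalone sketch of that situation (not the SMN code, just my attempt to
isolate what I think the message means):

import numpy as np
import theano
import theano.tensor as T

# scan iterates over axis 0 of its sequences
x = T.tensor3('x')  # (n_steps, batch, features)
outputs, _ = theano.scan(fn=lambda x_t: x_t.sum(axis=1), sequences=x)
f = theano.function([x], outputs)

print(f(np.ones((3, 2, 4), dtype=theano.config.floatX)))  # 3 iterations: works
print(f(np.ones((0, 2, 4), dtype=theano.config.floatX)))  # 0 iterations: I expect this to hit the
                                                          # "scan do 0 iteration" NotImplementedError

If that guess is right, the remaining question is why the sliced sequences
end up with 0 steps only in my load_graph version and not in the original
predict method.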
--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to theano-users+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.