Laura CP
2017-08-18 11:28:17 UTC
I have implemented a seq2seq model in Theano for a summarization task. It
has an initial Embedding layer, which I initialize with GlorotUniform(),
and that works fine:
# Embedding matrix initialized with Glorot-uniform samples
self.initialization_weight = GlorotUniform()
self.W = theano.shared(
    self.initialization_weight.sample(
        (self.number_of_input_symbols, self.embedding_dimension)),
    name='W_emb')
self.trainable_variables = [self.W]
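In case it's relevant, GlorotUniform here is just the standard Glorot/Xavier uniform initializer, i.e. something equivalent to this sketch (the actual class I use has the same sample(shape) interface):

import numpy as np

class GlorotUniform(object):
    """Standard Glorot/Xavier uniform init:
    U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    def sample(self, shape):
        fan_in, fan_out = shape
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, size=shape).astype('float32')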
However, I want to improve training by setting pretrained word2vec or GloVe vectors as my Embedding layer's weights. They're stored in an external file, all.glove.vectors.txt.
After loading them into the variable init_weights (a NumPy matrix), I do the following:
# Replace the random initialization with the pretrained matrix
self.initialization_weight = init_weights
self.W = theano.shared(self.initialization_weight, name='W_emb')
self.trainable_variables = [self.W]
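For context, the loading step is roughly as follows (a simplified sketch: word_to_index is my word-to-row mapping, and the small random range for words missing from the file is arbitrary):

import numpy as np

def load_glove(path, word_to_index, dim):
    # Words missing from the file keep a small random initialization.
    weights = np.random.uniform(
        -0.05, 0.05, (len(word_to_index), dim)).astype('float32')
    with open(path) as f:
        for line in f:
            parts = line.rstrip().split()
            word, values = parts[0], parts[1:]
            if word in word_to_index and len(values) == dim:
                weights[word_to_index[word]] = np.asarray(values, dtype='float32')
    return weights

init_weights = load_glove('all.glove.vectors.txt', word_to_index, 100)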
It doesn't throw any error message, and the network starts training. The
thing is, I think something is wrong: I get results similar to before
(without the word2vec initialization), and training takes about 4 times
longer (all experiments run on the GPU).
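One thing I haven't ruled out is the dtype: text loaders typically return float64 arrays, and as far as I understand the old CUDA backend only keeps float32 shared variables on the GPU, so a float64 W_emb would silently fall back to CPU computation. A quick check (sketch, inside the model class) would be:

import theano

print(theano.config.floatX)  # usually 'float32' for GPU runs
print(self.W.dtype)          # 'float64' here would keep W_emb off the GPU

# If they differ, casting before wrapping should fix it:
self.W = theano.shared(
    init_weights.astype(theano.config.floatX), name='W_emb')

If the dtypes do differ, that alone could explain the roughly 4x slowdown.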
*Note:* the word2vec vectors were trained on a Wikipedia corpus. I tried
both 100- and 300-dimensional vectors, but with the 300-dimensional ones
training was extremely slow.
I'd appreciate it if anybody could give me a hint about what's going on. Thanks!