4.1 The Model

The Recurrent Neural Network (RNN) is basically a generalization of feed-forward neural networks to sequences. Assume that the inputs $(x_1, x_2, \ldots, x_n)$ are given. An RNN computes the sequence of outputs $(y_1, y_2, \ldots, y_n)$ by iterating the following equations:

$$h_t = \mathrm{sigm}\left(W^{hx} x_t + W^{hh} h_{t-1}\right) \tag{1}$$

$$y_t = W^{yh} h_t \tag{2}$$

RNNs can in general be used to map sequences to sequences. However, it is not clear how to use RNNs when the input sequence and the output sequence are not identical in length, i.e., when the input and output sequences have different lengths. In general sequence learning, the input sequence is mapped to a fixed-size vector using one RNN, and this vector is then mapped to the target sequence with another RNN. Although the RNN is provided with all the relevant information, it would be difficult to train because of the resulting long-term dependencies. However, the Long Short-Term Memory (LSTM) is known to learn problems with long-range temporal dependencies.

The goal of the RNN is to estimate the conditional probability $p(y_1, y_2, \ldots, y_{t'} \mid x_1, x_2, \ldots, x_t)$, where the input sequence is $x_1, x_2, \ldots, x_t$, the output sequence is $y_1, y_2, \ldots, y_{t'}$, and the lengths $t$ and $t'$ may differ. The RNN calculates this conditional probability by first computing the fixed-dimensional representation $v$ of the input sequence $(x_1, x_2, \ldots, x_t)$, which is given by the last hidden state of the RNN, and then computing the probability of $y_1, y_2, \ldots, y_{t'}$ with the standard LSTM formulation whose initial hidden state is set to the representation $v$ of $x_1, x_2, \ldots, x_t$:

$$p(y_1, y_2, \ldots, y_{t'} \mid x_1, x_2, \ldots, x_t) = \prod_{t=1}^{t'} p(y_t \mid v, y_1, \ldots, y_{t-1}) \tag{3}$$

Each $p(y_t \mid v, y_1, \ldots, y_{t-1})$ distribution is represented with a softmax layer over all the words in the vocabulary.

The actual model uses two different RNNs, one for the input sequence and another for the output sequence, because this increases the number of model parameters at negligible computational cost and helps to train the RNN with different languages. It is also valuable to reverse the order of the words of the input sequence. For example, instead of mapping the sequence a, b, c to the sequence x, y, z, the RNN maps c, b, a to x, y, z, where x, y, z is the translation of a, b, c. The RNN model reads the input sequence in reverse order because this makes the optimization much easier.

The main property of the RNN is that it converts an input sentence of variable length into a fixed vector representation, which is then used to produce the translation. Because the translation tends to be a paraphrase of the input sequence, the translation objective encourages the RNN to find sentence representations that capture meaning: sentences with similar meanings are close to one another, while sentences with different meanings are quite far from each other.

Figure 1: Overview of the Recurrent Neural Network
Source: https://leonardoaraujosantos.gitbooks.io/artificialinteligence/content/recurrent_neural_networks.html

Figure 2: Unfolding of Recurrent Neural Network

The diagrams above can be explained as follows. Figure 1 shows the overview of the recurrent neural network, i.e., how the hidden layer depends on the previous hidden layer. Figure 2 shows the unfolding of the network.
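The recurrence in equations (1) and (2) and the factorization in equation (3) can be sketched in a few lines of NumPy. This is only an illustrative sketch under several assumptions that are not in the text: a plain sigmoid RNN stands in for the LSTM, the weights are supplied by the caller (random in the usage note), the decoder is fed an embedding of the previous target word with a zero vector at the first step, and all function and parameter names (`rnn_forward`, `sequence_log_prob`, `enc`, `dec`, `embed`) are hypothetical.

```python
import numpy as np

def sigm(z):
    """Logistic sigmoid, the nonlinearity in equation (1)."""
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(xs, W_hx, W_hh, W_yh, h0=None):
    """Iterate equations (1) and (2) over the rows of xs.

    Returns the stacked outputs y_1..y_n and the last hidden state.
    """
    h = np.zeros(W_hh.shape[0]) if h0 is None else h0
    ys = []
    for x in xs:
        h = sigm(W_hx @ x + W_hh @ h)  # equation (1)
        ys.append(W_yh @ h)            # equation (2)
    return np.array(ys), h

def log_softmax(z):
    """Numerically stable log of the softmax over vocabulary scores."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def sequence_log_prob(src, tgt_ids, enc, dec, embed):
    """log p(y_1..y_t' | x_1..x_t) following equation (3).

    src:     array of shape (t, d_in), one input vector per row
    tgt_ids: list of target word indices y_1..y_t'
    enc/dec: (W_hx, W_hh, W_yh) tuples for the two separate RNNs
    embed:   assumed embedding matrix, one row per vocabulary word
    """
    # The encoder reads the source in reverse order; its last hidden
    # state is the fixed-dimensional representation v.
    _, v = rnn_forward(src[::-1], *enc)
    # Decoder input at step t is the embedding of y_{t-1}
    # (assumption: a zero vector marks the start of the sequence).
    prev = [np.zeros(embed.shape[1])] + [embed[i] for i in tgt_ids[:-1]]
    logits, _ = rnn_forward(prev, *dec, h0=v)
    # Sum of per-step log-probabilities equals the log of the product in (3).
    return sum(log_softmax(l)[i] for l, i in zip(logits, tgt_ids))
```

With randomly initialized weights, for example `enc = (rng.standard_normal((4, 3)), rng.standard_normal((4, 4)), rng.standard_normal((2, 4)))` for some `rng = np.random.default_rng(0)`, the function returns a negative log-probability, and summing the probabilities over all target sequences of a fixed length gives 1, which is a quick check that the softmax factorization is wired correctly.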