1
00:00:00,000 --> 00:00:07,000
So finally, guys, I'm quite excited that we will finally train our LSTM RNN, and

2
00:00:07,000 --> 00:00:10,000
step by step, we will see how to go ahead and train it.

3
00:00:10,000 --> 00:00:13,000
We have already done the data preprocessing and data ingestion part.

4
00:00:14,000 --> 00:00:16,000
Now let's go ahead and do this.

5
00:00:16,000 --> 00:00:20,000
So first of all, I will go ahead and write from tensorflow

6
00:00:20,000 --> 00:00:27,000
dot keras dot models import Sequential.

7
00:00:27,000 --> 00:00:29,000
So we are specifically going to use Sequential.

8
00:00:29,000 --> 00:00:38,000
Along with this, I will go ahead and write from tensorflow dot keras dot layers import Embedding,

9
00:00:39,000 --> 00:00:39,000
Embedding.

10
00:00:39,000 --> 00:00:42,000
And then along with that we will also use LSTM.

11
00:00:42,000 --> 00:00:45,000
Then I'll go ahead and use Dense.

12
00:00:45,000 --> 00:00:46,000
We require an embedding layer.

13
00:00:46,000 --> 00:00:48,000
We require an LSTM layer.

14
00:00:48,000 --> 00:00:49,000
We require a dense layer.

15
00:00:49,000 --> 00:00:51,000
And let me go ahead and use one more,

16
00:00:51,000 --> 00:00:52,000
something called Dropout.

17
00:00:52,000 --> 00:00:58,000
Dropout randomly disables some of the neurons while training so that your model

18
00:00:58,000 --> 00:00:59,000
does not overfit.
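To make the dropout idea concrete, here is a minimal pure-Python sketch of what a dropout layer does to one layer's activations. The function name `dropout`, the seed, and the toy values are only illustrative assumptions; this mimics the behaviour, it is not the Keras implementation.

```python
import random

def dropout(activations, rate, training, rng):
    """Inverted dropout: zero each value with probability `rate` while training."""
    if not training or rate == 0.0:
        return list(activations)  # at prediction time every neuron stays active
    keep_prob = 1.0 - rate
    # Surviving values are scaled up so the expected total stays the same.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

rng = random.Random(0)  # fixed seed only so the demo is repeatable
hidden = [1.0] * 10     # hypothetical activations of 10 hidden neurons
train_out = dropout(hidden, rate=0.2, training=True, rng=rng)
infer_out = dropout(hidden, rate=0.2, training=False, rng=rng)
print(train_out)
print(infer_out)
```

With a rate of 0.2, roughly one neuron in five is switched off on each training step, and nothing is switched off when predicting.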

19
00:00:59,000 --> 00:01:02,000
Now I'll go ahead and define the model.

20
00:01:02,000 --> 00:01:05,000
So first of all, I'll go ahead and initialize it with Sequential.

21
00:01:06,000 --> 00:01:09,000
Then I will go ahead and add my embedding layer.

22
00:01:09,000 --> 00:01:12,000
I hope I've already covered all of this

23
00:01:12,000 --> 00:01:14,000
in the course itself.

24
00:01:14,000 --> 00:01:15,000
Embedding.

25
00:01:15,000 --> 00:01:19,000
And then I pass total underscore words, comma,

26
00:01:19,000 --> 00:01:20,000
let's say, yeah,

27
00:01:20,000 --> 00:01:22,000
the number of dimensions that I'm going to take is 100.

28
00:01:23,000 --> 00:01:33,000
And input underscore length should be my max sequence length.

29
00:01:33,000 --> 00:01:33,000
Okay.

30
00:01:33,000 --> 00:01:36,000
This is what we are specifically using for our embedding layer.

31
00:01:36,000 --> 00:01:45,000
Then let's go ahead and write model dot add, and with that I'm going to add my LSTM

32
00:01:45,000 --> 00:01:45,000
layer.

33
00:01:45,000 --> 00:01:51,000
Let's say I will go ahead and use 150 neurons, and along with these 150 neurons,

34
00:01:51,000 --> 00:01:55,000
we'll also go ahead and write return underscore sequences

35
00:01:55,000 --> 00:01:59,000
is equal to True, so that the next LSTM layer receives the full sequence.

36
00:02:00,000 --> 00:02:03,000
Along with this, I'm also going to go ahead and add my Dropout layer.

37
00:02:03,000 --> 00:02:08,000
So let me just add a Dropout layer where I'm saying, just go ahead and make sure

38
00:02:08,000 --> 00:02:14,000
that you disable 20 percent of the hidden neurons when you are training the neural network.

39
00:02:14,000 --> 00:02:16,000
Along with this, let me just go ahead and add one more LSTM layer.

40
00:02:16,000 --> 00:02:18,000
If you want, you can add one more.

41
00:02:18,000 --> 00:02:19,000
Just see.

42
00:02:19,000 --> 00:02:21,000
Let's see whether it will be able to perform well.

43
00:02:21,000 --> 00:02:25,000
And here I'm going to set the number of neurons to 100.

44
00:02:25,000 --> 00:02:27,000
Then finally, I will go ahead and write model dot add.

45
00:02:27,000 --> 00:02:32,000
And quickly, let's create my Dense layer with total words as the number of units.

46
00:02:32,000 --> 00:02:35,000
And here I'm going to use my activation function.

47
00:02:36,000 --> 00:02:42,000
The activation is softmax, since it is a multi-class problem.

48
00:02:42,000 --> 00:02:45,000
See, over here I'm using softmax.

49
00:02:45,000 --> 00:02:51,000
The reason is very simple: softmax is my activation function here

50
00:02:51,000 --> 00:02:52,000
because my output,

51
00:02:52,000 --> 00:02:52,000
Right.

52
00:02:52,000 --> 00:02:53,000
It is multi-class.

53
00:02:53,000 --> 00:02:55,000
It is not a binary output.

54
00:02:55,000 --> 00:02:55,000
Right.

55
00:02:55,000 --> 00:02:57,000
So that is the reason I'm using softmax.

56
00:02:57,000 --> 00:02:59,000
Otherwise I would have used sigmoid.
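As a small illustration of the softmax versus sigmoid choice, here is a pure-Python sketch. The four scores stand in for a made-up four-word vocabulary; in the actual model the output layer has one score per word in total words.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 across all classes."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logit):
    """Squash a single raw score into (0, 1), the usual choice for a yes/no output."""
    return 1.0 / (1.0 + math.exp(-logit))

word_scores = [2.0, 1.0, 0.1, -1.0]  # hypothetical scores for a 4-word vocabulary
probs = softmax(word_scores)
predicted_index = probs.index(max(probs))
print(probs, sum(probs), predicted_index)
print(sigmoid(0.0))
```

Softmax gives one probability per candidate word and they compete with each other, which is what next-word prediction needs. Sigmoid gives a single independent probability.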

57
00:02:59,000 --> 00:02:59,000
Okay.

58
00:03:00,000 --> 00:03:03,000
Now we will go ahead and compile the model.

59
00:03:03,000 --> 00:03:06,000
So this will be the next step for compiling.

60
00:03:06,000 --> 00:03:08,000
So I will go ahead and write model dot compile.

61
00:03:08,000 --> 00:03:14,000
And the loss that I'm going to use is categorical cross entropy, since

62
00:03:14,000 --> 00:03:17,000
it is multi-class.

63
00:03:17,000 --> 00:03:18,000
Cross entropy.

64
00:03:18,000 --> 00:03:21,000
And the optimizer that I'm going to use,

65
00:03:21,000 --> 00:03:24,000
the optimizer will be Adam.

66
00:03:25,000 --> 00:03:26,000
Okay.

67
00:03:26,000 --> 00:03:31,000
And the metric that I'm going to use is accuracy.

68
00:03:32,000 --> 00:03:35,000
Metrics will be accuracy.
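To see what categorical cross entropy measures, here is a small worked sketch in plain Python for a single training sample. It assumes the label is one-hot encoded, which is what this loss expects; the four-class probabilities are made up for illustration.

```python
import math

def categorical_cross_entropy(one_hot_target, predicted_probs):
    """Loss for one sample: minus the log of the probability given to the true class."""
    eps = 1e-12  # guards against log(0)
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot_target, predicted_probs))

target = [0, 0, 1, 0]                 # the true next word is class 2
confident = [0.05, 0.05, 0.85, 0.05]  # most of the probability on the right word
unsure = [0.25, 0.25, 0.25, 0.25]     # a uniform guess over 4 classes
loss_confident = categorical_cross_entropy(target, confident)
loss_unsure = categorical_cross_entropy(target, unsure)
print(loss_confident, loss_unsure)
```

A uniform guess over N classes costs log(N), so with a vocabulary of several thousand words the loss starts high and the accuracy starts very low.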

69
00:03:35,000 --> 00:03:40,000
Okay, so finally I can go ahead and check my model dot summary.

70
00:03:40,000 --> 00:03:43,000
So let me go ahead and write model dot summary.

71
00:03:43,000 --> 00:03:48,000
So once I execute this, you'll be able to see I'm getting an error.

72
00:03:48,000 --> 00:03:50,000
So what is the error?

73
00:03:50,000 --> 00:03:51,000
Let's see.

74
00:03:51,000 --> 00:03:56,000
Got an unexpected keyword argument dropout.

75
00:03:56,000 --> 00:03:59,000
I should be writing it this way.

76
00:03:59,000 --> 00:04:04,000
It is not a keyword argument; instead it is a separate layer, like the other layers we have,

77
00:04:04,000 --> 00:04:05,000
added like that.

78
00:04:05,000 --> 00:04:05,000
Okay.
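Putting the spoken steps together, the model definition could look roughly like the sketch below, with Dropout added through its own model dot add call. Two things here are assumptions rather than what is shown on screen: the wrapper function `build_model` is hypothetical, and the input length is declared with an explicit `Input` layer instead of the `input_length` argument, because newer Keras releases deprecate `input_length`. The length already uses max sequence length minus one, the value that matches the training inputs.

```python
def build_model(total_words, max_sequence_len):
    """Build and compile the stacked-LSTM next-word model described in the walkthrough."""
    # Imported lazily so the helper can be defined before TensorFlow is loaded.
    from tensorflow.keras.layers import Dense, Dropout, Embedding, Input, LSTM
    from tensorflow.keras.models import Sequential

    model = Sequential()
    # ASSUMPTION: explicit Input layer in place of Embedding(..., input_length=...).
    model.add(Input(shape=(max_sequence_len - 1,)))
    model.add(Embedding(total_words, 100))        # 100-dimensional word vectors
    model.add(LSTM(150, return_sequences=True))   # full sequence goes to the next LSTM
    model.add(Dropout(0.2))                       # a separate layer, not a keyword argument
    model.add(LSTM(100))
    model.add(Dense(total_words, activation="softmax"))
    model.compile(
        loss="categorical_crossentropy",
        optimizer="adam",
        metrics=["accuracy"],
    )
    return model

# Usage, assuming total_words and max_sequence_len come from the preprocessing step:
# model = build_model(total_words, max_sequence_len)
# model.summary()
```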

79
00:04:05,000 --> 00:04:11,000
So now you can see over here my embedding layer is my first layer, then LSTM, then dropout, then the second LSTM

80
00:04:11,000 --> 00:04:12,000
layer.

81
00:04:12,000 --> 00:04:13,000
And then finally this is my output layer.

82
00:04:13,000 --> 00:04:14,000
Right.

83
00:04:14,000 --> 00:04:19,000
And the total number of trainable parameters is how much?

84
00:04:19,000 --> 00:04:22,000
So this is around 12 lakh: 1,219,418.

85
00:04:22,000 --> 00:04:23,000
Right.

86
00:04:23,000 --> 00:04:25,000
So these are my total trainable parameters.
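The parameter total can be checked by hand with the standard per-layer formulas. In the sketch below, the vocabulary size of 4818 is an assumption: it is not stated in this part, it is simply the value that makes the arithmetic land on the reported 1,219,418. The helper names are hypothetical.

```python
def embedding_params(vocab_size, embedding_dim):
    # One vector of length embedding_dim per word in the vocabulary
    return vocab_size * embedding_dim

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # A weight matrix plus one bias per output unit
    return input_dim * units + units

total_words = 4818  # ASSUMPTION: inferred from the reported total, not stated in the video
total = (
    embedding_params(total_words, 100)   # Embedding(total_words, 100)
    + lstm_params(100, 150)              # first LSTM, 150 units
    + lstm_params(150, 100)              # second LSTM, 100 units (Dropout adds none)
    + dense_params(100, total_words)     # softmax output layer
)
print(total)
```

Most of the parameters sit in the embedding and output layers, because both scale with the vocabulary size.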

87
00:04:25,000 --> 00:04:28,000
Now it's time that we go ahead and train our entire model.

88
00:04:28,000 --> 00:04:32,000
Right. Now, to train the model, I will just go ahead and write it like this.

89
00:04:32,000 --> 00:04:35,000
So let's go ahead and train the model.

90
00:04:35,000 --> 00:04:38,000
And here I'm going to write history

91
00:04:38,000 --> 00:04:41,000
is equal to model dot fit.

92
00:04:41,000 --> 00:04:46,000
And inside this, I will pass x train comma y train.

93
00:04:47,000 --> 00:04:50,000
Let's use at least 50 epochs.

94
00:04:50,000 --> 00:04:52,000
And this is going to take some time.

95
00:04:52,000 --> 00:04:54,000
It's not going to happen very fast.

96
00:04:54,000 --> 00:04:57,000
And even if I go ahead and use these 50 epochs,

97
00:04:57,000 --> 00:05:03,000
it is probably not going to give you 90 percent accuracy.

98
00:05:03,000 --> 00:05:06,000
For that, I think we need to run it for an hour.

99
00:05:06,000 --> 00:05:06,000
Okay.

100
00:05:06,000 --> 00:05:10,000
But for now, I will just show you with 50 epochs.

101
00:05:10,000 --> 00:05:15,000
So my validation data will have my x test,

102
00:05:15,000 --> 00:05:17,000
and here I will use one more bracket, since it is a tuple,

103
00:05:17,000 --> 00:05:20,000
and this will be my y underscore test.

104
00:05:20,000 --> 00:05:25,000
And I will set verbose is equal to one so that the progress gets displayed.

105
00:05:25,000 --> 00:05:29,000
Okay, so this is my history from model dot fit.

106
00:05:29,000 --> 00:05:32,000
Now quickly, let me just go ahead and execute it.

107
00:05:32,000 --> 00:05:35,000
So now my execution has started.

108
00:05:35,000 --> 00:05:38,000
Input zero of layer is incompatible with the layer:

109
00:05:38,000 --> 00:05:38,000
None,

110
00:05:38,000 --> 00:05:39,000
comma 14.

111
00:05:40,000 --> 00:05:45,000
So, guys, the one issue that we are getting over here is saying

112
00:05:45,000 --> 00:05:49,000
that input zero of layer sequential one is incompatible with the layer: expected shape None comma

113
00:05:49,000 --> 00:05:50,000
14,

114
00:05:50,000 --> 00:05:54,000
but found shape None comma 13, which is what we have given.

115
00:05:54,000 --> 00:05:59,000
The reason is very simple: here I've given the input length as max sequence length.

116
00:05:59,000 --> 00:06:01,000
Here I will just give it as max sequence length minus one.

117
00:06:01,000 --> 00:06:02,000
Okay.

118
00:06:02,000 --> 00:06:04,000
So I think that is the reason this is happening.

119
00:06:04,000 --> 00:06:06,000
So let's go ahead and execute this quickly.

120
00:06:07,000 --> 00:06:10,000
And here the epochs will go ahead and start.

121
00:06:10,000 --> 00:06:14,000
So please make sure that you look at this particular code.

122
00:06:14,000 --> 00:06:19,000
And I'm also going to keep the error as it is so that you will be able to fix it yourself.

123
00:06:19,000 --> 00:06:25,000
Again, if I look at my max sequence length over here, it is 14, but the inputs we pass are 13 tokens long,

124
00:06:25,000 --> 00:06:26,000
right.

125
00:06:26,000 --> 00:06:32,000
So since each input is one token shorter than the full sequence, to get the actual input length,

126
00:06:32,000 --> 00:06:33,000
I'm just doing minus one.
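Here is a small sketch of why the inputs come out one token shorter. It assumes the preprocessing step (not shown in this part) pads every n-gram sequence to max sequence length and then holds out the last token as the label; the token ids and the helper `pad_left` are made up for illustration.

```python
def pad_left(sequence, length):
    """Left-pad a list of token ids with zeros to a fixed length (pre-padding)."""
    return [0] * (length - len(sequence)) + sequence

# Hypothetical n-gram sequences of token ids, standing in for the real corpus.
ngrams = [[5, 2], [5, 2, 9], [5, 2, 9, 4]]
max_sequence_len = max(len(seq) for seq in ngrams)  # 4 here, 14 in the video
padded = [pad_left(seq, max_sequence_len) for seq in ngrams]

predictors = [row[:-1] for row in padded]  # every token except the last one
labels = [row[-1] for row in padded]       # the last token is the word to predict

input_length = len(predictors[0])
print(max_sequence_len, input_length)
```

Under that assumption, the model's input length has to be max sequence length minus one, which is exactly the 14 versus 13 mismatch in the error message.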

127
00:06:33,000 --> 00:06:36,000
So here you will be able to see my epochs have started.

128
00:06:36,000 --> 00:06:37,000
I'm in epoch two.

129
00:06:37,000 --> 00:06:41,000
My accuracy is 0.03 right now.

130
00:06:41,000 --> 00:06:44,000
I hope it keeps on increasing.

131
00:06:44,000 --> 00:06:46,000
So this is just 3% accuracy.

132
00:06:46,000 --> 00:06:51,000
And obviously you'll be getting this accuracy because hamlet dot txt is very complex.

133
00:06:51,000 --> 00:06:53,000
It is not that simple okay.

134
00:06:53,000 --> 00:06:56,000
So now you'll be able to see that it'll slowly increase.

135
00:06:56,000 --> 00:06:59,000
Now see, from 0.038 it has gone to 0.046.

136
00:06:59,000 --> 00:07:03,000
Then 0.055. I hope I've used everything right.

137
00:07:03,000 --> 00:07:04,000
Accuracy is the metric.

138
00:07:04,000 --> 00:07:05,000
Okay, perfect.

139
00:07:05,000 --> 00:07:06,000
But it will increase.

140
00:07:07,000 --> 00:07:10,000
You just need to run it for many, many epochs, at least 100 epochs.

141
00:07:10,000 --> 00:07:13,000
If you run that many, you should be able to get much better accuracy.

142
00:07:13,000 --> 00:07:17,000
So now the accuracy is increasing, and the loss is also decreasing over here.

143
00:07:17,000 --> 00:07:19,000
Here also the loss is decreasing.

144
00:07:19,000 --> 00:07:23,000
So let's let these epochs continue.

145
00:07:23,000 --> 00:07:29,000
And once the training gets completed, I will show you the final accuracy.

146
00:07:29,000 --> 00:07:35,000
So this is with only 50 epochs; you can increase the number of epochs to 100 and run it

147
00:07:35,000 --> 00:07:36,000
anyhow.

148
00:07:36,000 --> 00:07:39,000
Or you can also go ahead and apply early stopping if you really want.

149
00:07:39,000 --> 00:07:40,000
Okay.

150
00:07:40,000 --> 00:07:45,000
So with respect to that you can go ahead and apply early stopping if you want.

151
00:07:45,000 --> 00:07:46,000
Okay.

152
00:07:46,000 --> 00:07:48,000
But again, I did not apply it over here.

153
00:07:48,000 --> 00:07:52,000
If you want to apply it, I will give you the code for early stopping as well.

154
00:07:52,000 --> 00:07:53,000
Okay.

155
00:07:53,000 --> 00:07:59,000
That is, how to set up early stopping and where to add the particular parameter as you go

156
00:07:59,000 --> 00:08:00,000
ahead.

157
00:08:00,000 --> 00:08:00,000
Right.

158
00:08:00,000 --> 00:08:07,000
So if you really want to implement early stopping quickly, I will just add a code cell over here,

159
00:08:07,000 --> 00:08:10,000
and I will define this code over here itself.

160
00:08:10,000 --> 00:08:11,000
Right.

161
00:08:11,000 --> 00:08:14,000
So first of all, let me just copy and paste this.

162
00:08:14,000 --> 00:08:14,000
Okay.

163
00:08:14,000 --> 00:08:18,000
Now, first of all, it will tell you to go ahead and import EarlyStopping.

164
00:08:18,000 --> 00:08:21,000
So we will go ahead and do this.

165
00:08:21,000 --> 00:08:23,000
It is in the callbacks module.

166
00:08:23,000 --> 00:08:29,000
So from tensorflow dot keras dot callbacks we will be importing EarlyStopping, and when you are doing the fit,

167
00:08:29,000 --> 00:08:36,000
over here, when you are passing these arguments,

168
00:08:36,000 --> 00:08:40,000
when you're doing this particular fit, at that point you can add this

169
00:08:40,000 --> 00:08:41,000
early stopping, right?

170
00:08:41,000 --> 00:08:45,000
You can just say, my callbacks is something like this, right?

171
00:08:45,000 --> 00:08:48,000
I will be able to add this particular callback.

172
00:08:48,000 --> 00:08:48,000
Okay.

173
00:08:48,000 --> 00:08:49,000
That's it.

174
00:08:49,000 --> 00:08:51,000
Then automatically the early stopping will be enabled.
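The early stopping code is not read out in this part, so the following is only a sketch of one way it could look. The first helper is a simplified pure-Python version of the patience rule, not Keras's own code. The second shows where the callback goes in model dot fit; the function names, `patience=5`, and `restore_best_weights=True` are illustrative assumptions, and x_train, y_train, x_test, y_test are assumed to come from the earlier preprocessing step.

```python
def epoch_where_training_stops(val_losses, patience):
    """Simplified patience rule: stop after `patience` epochs in a row without a new best."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, waited = loss, 0   # new best validation loss resets the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None  # every epoch ran without triggering the rule

def fit_with_early_stopping(model, x_train, y_train, x_test, y_test, epochs=50):
    # Imported lazily so the pure-Python helper above runs without TensorFlow.
    from tensorflow.keras.callbacks import EarlyStopping

    # ASSUMPTION: these settings are examples, not values given in the video.
    early_stopping = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
    return model.fit(
        x_train, y_train,
        epochs=epochs,
        validation_data=(x_test, y_test),
        verbose=1,
        callbacks=[early_stopping],  # this is where the callback is added
    )

stop_epoch = epoch_where_training_stops([6.9, 6.5, 6.4, 6.6, 6.7, 6.8], patience=3)
print(stop_epoch)
```

With the callback in place, training can end before the requested number of epochs once the validation loss stops improving.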

175
00:08:51,000 --> 00:08:55,000
But I think this loss will keep on decreasing as we go ahead.

176
00:08:55,000 --> 00:09:01,000
So let us wait till this entire training is done, and then we will go ahead and do the prediction.

177
00:09:01,000 --> 00:09:02,000
Okay.

178
00:09:02,000 --> 00:09:04,000
So yeah, I will see you all in some time.

179
00:09:04,000 --> 00:09:05,000
Thank you.