1
00:00:00,000 --> 00:00:01,000
Hello guys.

2
00:00:01,000 --> 00:00:07,000
So we are going to continue the discussion on the working of the LSTM RNN.

3
00:00:07,000 --> 00:00:12,000
Now in this video we are going to discuss about the forget gate, uh, which is the first and the foremost

4
00:00:12,000 --> 00:00:14,000
important component in the LSTM RNN.

5
00:00:14,000 --> 00:00:15,000
Okay.

6
00:00:15,000 --> 00:00:17,000
Now yes, we are going to discuss about this.

7
00:00:17,000 --> 00:00:22,000
In the previous video we already discussed, with an example,

8
00:00:22,000 --> 00:00:23,000
what the importance of this memory cell is.

9
00:00:23,000 --> 00:00:26,000
Then we understood what each and every operation is doing right?

10
00:00:26,000 --> 00:00:27,000
Everything we discussed.

11
00:00:27,000 --> 00:00:29,000
Now let's go ahead and discuss about the forget gate.

12
00:00:29,000 --> 00:00:32,000
Now let's focus on this part okay.

13
00:00:32,000 --> 00:00:34,000
Let's focus on this part.

14
00:00:34,000 --> 00:00:36,000
So on the left hand side of this part.

15
00:00:36,000 --> 00:00:39,000
Now what is exactly happening over here.

16
00:00:39,000 --> 00:00:39,000
Right.

17
00:00:39,000 --> 00:00:46,000
Let's consider that I am working on some functionality.

18
00:00:46,000 --> 00:00:52,000
Uh, I'm trying to solve a use case which has text, and we need to predict the next word.

19
00:00:53,000 --> 00:00:54,000
Okay.

20
00:00:55,000 --> 00:00:59,000
Let's say right now for every word that I have over here.

21
00:01:00,000 --> 00:01:00,000
Right.

22
00:01:00,000 --> 00:01:01,000
Let's say this is X11.

23
00:01:01,000 --> 00:01:02,000
This is X12.

24
00:01:02,000 --> 00:01:05,000
This is X13, X14.

25
00:01:05,000 --> 00:01:11,000
And this is my next word, which is nothing but my output word y.

26
00:01:11,000 --> 00:01:14,000
So this will be my Y15 okay.

27
00:01:15,000 --> 00:01:21,000
This will be my output now for every word you know that we have to convert this into vectors, right?

28
00:01:21,000 --> 00:01:25,000
So let's consider that X11 or any word that we are going to use.

29
00:01:25,000 --> 00:01:30,000
We are just going to represent it with a vector of 3 or 4 dimensions.

30
00:01:30,000 --> 00:01:34,000
Let's say I want to represent every word with a four-dimensional vector.

31
00:01:34,000 --> 00:01:38,000
Let's say it will go like zero, two, four, six.

32
00:01:38,000 --> 00:01:41,000
So this is my first word vector.

33
00:01:41,000 --> 00:01:45,000
The next word vector can be 4, 5, 1, 2, okay.

34
00:01:45,000 --> 00:01:47,000
I'm just defining okay.

35
00:01:47,000 --> 00:01:52,000
And these are the vectors that I am actually going to give with respect to timestamp okay.
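
A minimal sketch of the word vectors described so far, written in Python with NumPy. The two vectors use the example values from the video; the variable names are assumptions for illustration, and real word vectors would come from whichever embedding technique you choose.

```python
import numpy as np

# Example 4-dimensional word vectors (illustrative values from the video).
x_11 = np.array([0.0, 2.0, 4.0, 6.0])
x_12 = np.array([4.0, 5.0, 1.0, 2.0])

# Stack the words so that row t holds the input for timestamp t.
sequence = np.stack([x_11, x_12])
print(sequence.shape)  # (2, 4): two timestamps, four dimensions each
```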

36
00:01:52,000 --> 00:01:57,000
Now let's focus on this operation okay.

37
00:01:57,000 --> 00:01:59,000
Let's focus on this operation.

38
00:01:59,000 --> 00:02:01,000
So guys now let's take this first step.

39
00:02:01,000 --> 00:02:08,000
And uh I will show you like how in terms of neural network it will look like okay, so let's consider

40
00:02:08,000 --> 00:02:08,000
this x of T.

41
00:02:09,000 --> 00:02:14,000
Uh, I've taken as four dimension or let's say that okay, it is the every word.

42
00:02:14,000 --> 00:02:16,000
We are just going to represent it with three dimension okay.

43
00:02:17,000 --> 00:02:19,000
So here is my three dimension.

44
00:02:19,000 --> 00:02:24,000
Every word that I am representing is with respect to three dimension I may use any technique.

45
00:02:24,000 --> 00:02:28,000
And that is what I'm going to pass in the LSTM RNN okay.

46
00:02:28,000 --> 00:02:31,000
Now when we say x of t and HT minus one.

47
00:02:32,000 --> 00:02:38,000
Now when we say my x of t is represented, each and every word is represented by three dimension.

48
00:02:39,000 --> 00:02:42,000
So you will also be seeing that this will be my x of t.

49
00:02:42,000 --> 00:02:47,000
You'll also be seeing my HT minus one, which is the hidden state from the previous timestamp.

50
00:02:47,000 --> 00:02:48,000
Right.

51
00:02:48,000 --> 00:02:51,000
This will also be a three dimension okay.

52
00:02:51,000 --> 00:02:53,000
This will also be three dimension.

53
00:02:53,000 --> 00:02:53,000
Why?

54
00:02:53,000 --> 00:02:54,000
I'm saying it will be three dimension.

55
00:02:54,000 --> 00:02:59,000
Because later on when we do this entire calculation right now, I'm just saying that this is the previous

56
00:02:59,000 --> 00:03:00,000
state, right?

57
00:03:00,000 --> 00:03:01,000
I will be getting three dimension.

58
00:03:01,000 --> 00:03:04,000
We have to take three dimensions and you'll be able to understand it.

59
00:03:04,000 --> 00:03:07,000
Why I'm saying this will be three dimensions.

60
00:03:07,000 --> 00:03:09,000
Uh, can't it be four dimension?

61
00:03:09,000 --> 00:03:10,000
Yes, it can be.

62
00:03:10,000 --> 00:03:12,000
See, I will just go ahead and write it over here.

63
00:03:12,000 --> 00:03:13,000
Let's let's consider.

64
00:03:13,000 --> 00:03:14,000
Okay.

65
00:03:14,000 --> 00:03:18,000
My HT minus one is also three dimension.

66
00:03:18,000 --> 00:03:19,000
Let's say one, two, four.

67
00:03:20,000 --> 00:03:21,000
Okay, I can also have four dimension.

68
00:03:21,000 --> 00:03:23,000
I can also have five dimension.

69
00:03:23,000 --> 00:03:29,000
But if I have HT minus one, which is the previous hidden state, as three dimension, then what you see

70
00:03:29,000 --> 00:03:31,000
over here as a memory cell,

71
00:03:31,000 --> 00:03:34,000
This CT minus one will also be three dimension.

72
00:03:34,000 --> 00:03:36,000
And let's put some values.

73
00:03:36,000 --> 00:03:37,000
Okay.

74
00:03:38,000 --> 00:03:43,000
Uh, in order to clear your confusion, let's consider that I'm going to use my word with four dimension.

75
00:03:43,000 --> 00:03:44,000
Okay.

76
00:03:44,000 --> 00:03:45,000
Four dimension.

77
00:03:45,000 --> 00:03:47,000
Every word I will just go ahead and use with four dimension.

78
00:03:48,000 --> 00:03:53,000
And let's say if in our previous state my HT minus one is three dimension, then my CT minus

79
00:03:53,000 --> 00:03:54,000
one will also be three dimension.

80
00:03:54,000 --> 00:03:56,000
Why I'm saying this because we'll do the calculation.

81
00:03:56,000 --> 00:03:58,000
Every calculation will be shown to you okay.

82
00:03:59,000 --> 00:04:02,000
Now let's take this xt and HT minus one.

83
00:04:02,000 --> 00:04:03,000
And we are concatenating over here.

84
00:04:03,000 --> 00:04:05,000
What does concatenating basically mean.

85
00:04:05,000 --> 00:04:09,000
So I will I'll, I'll take just as this input.

86
00:04:09,000 --> 00:04:10,000
So this will be my input.

87
00:04:10,000 --> 00:04:16,000
Let's say input words that I'm giving I know first of all I need to give HT minus one.

88
00:04:16,000 --> 00:04:16,000
Right.

89
00:04:16,000 --> 00:04:17,000
So this is my HT minus one.

90
00:04:18,000 --> 00:04:22,000
Now in the case of HT minus one as I said this is three dimension right.

91
00:04:22,000 --> 00:04:25,000
So I will be having three inputs that I will be giving over here.

92
00:04:25,000 --> 00:04:29,000
And then you have the next one that I'm passing XT next.

93
00:04:30,000 --> 00:04:34,000
Since we are concatenating it, I have already told you we have to combine this, right?

94
00:04:34,000 --> 00:04:38,000
So concatenating basically means I'm going to combine HT minus one.

95
00:04:38,000 --> 00:04:41,000
Along with this I'm also going to combine my XT.

96
00:04:41,000 --> 00:04:47,000
Now xt I will just give as four dimension: 1, 2, 3, 4.

97
00:04:47,000 --> 00:04:49,000
So this is for every word right.

98
00:04:49,000 --> 00:04:51,000
So at t is equal to two.

99
00:04:51,000 --> 00:04:57,000
Let's say I'm passing this information. HT minus one basically means h one, the previous hidden state.

100
00:04:57,000 --> 00:04:58,000
I'm getting these values.

101
00:04:58,000 --> 00:05:02,000
So this three dimension is specifically coming from h t minus one.

102
00:05:02,000 --> 00:05:08,000
And these four dimensions are basically coming from my x of uh, my uh current word in this timestamp.

103
00:05:08,000 --> 00:05:09,000
Okay.

104
00:05:09,000 --> 00:05:14,000
When we combine, when we concatenate like this, that basically means we are just combining it, okay.

105
00:05:14,000 --> 00:05:15,000
We are just combining it.
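
The concatenation step can be sketched in NumPy as below. The values are the ones used in the video (1, 2, 4 for the previous hidden state and 1, 2, 3, 4 for the current word); the variable names are assumptions for illustration.

```python
import numpy as np

h_prev = np.array([1.0, 2.0, 4.0])       # HT minus one: 3 dimensions
x_t = np.array([1.0, 2.0, 3.0, 4.0])     # xt: 4 dimensions

# Concatenating just places the previous hidden state and the
# current word vector side by side in one longer vector.
concat = np.concatenate([h_prev, x_t])

print(concat.tolist())  # [1.0, 2.0, 4.0, 1.0, 2.0, 3.0, 4.0]
print(concat.shape)     # (7,)
```

The result has 3 + 4 = 7 values, which is why the next layer receives seven inputs.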

106
00:05:15,000 --> 00:05:16,000
Perfect.

107
00:05:16,000 --> 00:05:16,000
Till here it is.

108
00:05:16,000 --> 00:05:17,000
Fine.

109
00:05:17,000 --> 00:05:19,000
Now this basically becomes my input.

110
00:05:19,000 --> 00:05:25,000
Okay, now in the next layer, once I go ahead, you know there is a neural network.

111
00:05:25,000 --> 00:05:30,000
Now when I say this is a neural network, in short, what I am actually going to do is that I will be

112
00:05:30,000 --> 00:05:35,000
having this kind of hidden layer, okay.

113
00:05:35,000 --> 00:05:40,000
Where neurons will be there, hidden neurons will be there.

114
00:05:40,000 --> 00:05:41,000
Okay.

115
00:05:41,000 --> 00:05:42,000
It can be two hidden neurons.

116
00:05:42,000 --> 00:05:43,000
It can be three hidden neurons.

117
00:05:43,000 --> 00:05:44,000
It is up to you.

118
00:05:44,000 --> 00:05:50,000
Let's for right now, let's consider that I'm just going to use three hidden neurons okay.

119
00:05:50,000 --> 00:05:52,000
So I'm just going to use this.

120
00:05:55,000 --> 00:05:58,000
Three hidden neurons okay.

121
00:05:58,000 --> 00:06:01,000
So in this hidden layer I'm having this three hidden neurons.

122
00:06:01,000 --> 00:06:08,000
And on top of this hidden layer, what is basically getting applied? An activation function is getting applied.

123
00:06:09,000 --> 00:06:09,000
Okay.

124
00:06:09,000 --> 00:06:12,000
An activation function is basically getting applied in every hidden neuron.

125
00:06:12,000 --> 00:06:14,000
And that is what it looks like, right?

126
00:06:15,000 --> 00:06:20,000
Uh, if uh, so I'm just applying an activation function on top of it okay.

127
00:06:20,000 --> 00:06:24,000
And you know in all these neurons we'll also be having a bias.

128
00:06:24,000 --> 00:06:26,000
We'll be having bias okay.

129
00:06:26,000 --> 00:06:28,000
So till here it is very much simple.

130
00:06:28,000 --> 00:06:34,000
Now what I will do I will just go ahead combine every node which looks like this.

131
00:06:35,000 --> 00:06:35,000
Okay.

132
00:06:40,000 --> 00:06:42,000
So here I'm just going to combine this.

133
00:06:44,000 --> 00:06:49,000
Okay I'll combine this to let's say this is here.

134
00:06:49,000 --> 00:06:51,000
This is here again.

135
00:06:51,000 --> 00:06:52,000
You can combine each and every layer.

136
00:06:52,000 --> 00:06:54,000
It is up to you okay.

137
00:06:55,000 --> 00:06:56,000
And you can complete this diagram.

138
00:06:56,000 --> 00:07:02,000
So this mostly becomes like a simple layer of connected neurons.

139
00:07:02,000 --> 00:07:03,000
Right.

140
00:07:03,000 --> 00:07:04,000
So this is my input.

141
00:07:04,000 --> 00:07:05,000
This is my hidden layer.

142
00:07:05,000 --> 00:07:09,000
Now once we apply the activation function then what we are going to get, we are going to basically

143
00:07:09,000 --> 00:07:12,000
get f of t f of t.

144
00:07:12,000 --> 00:07:13,000
Okay.

145
00:07:13,000 --> 00:07:18,000
So in short this part that you will be able to see which is called as forget gate.

146
00:07:18,000 --> 00:07:23,000
I have changed into something like this, and this is the entire operation that is probably going to

147
00:07:23,000 --> 00:07:24,000
take place.
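
Written as a formula, the operation drawn here is commonly expressed as follows. The symbols $W_f$ (weights), $b_f$ (bias) and $\sigma$ (the sigmoid activation used by the forget gate) are standard notation assumed here, not names written in the video, and the row-vector convention is chosen to match the diagram.

```latex
f_t = \sigma\left( [\, h_{t-1},\; x_t \,]\, W_f + b_f \right)
```

Here $[\, h_{t-1},\; x_t \,]$ is the concatenated input, and $f_t$ has one entry per hidden neuron.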

148
00:07:24,000 --> 00:07:28,000
Now let's understand how many weights are basically getting assigned.

149
00:07:28,000 --> 00:07:31,000
So here you will be able to see how many, how many inputs.

150
00:07:31,000 --> 00:07:33,000
I am giving one comma seven.

151
00:07:33,000 --> 00:07:34,000
That is one cross seven.

152
00:07:35,000 --> 00:07:35,000
Right.

153
00:07:35,000 --> 00:07:39,000
So this will basically be one cross seven because one row seven inputs.

154
00:07:40,000 --> 00:07:49,000
And then with respect to this since I have three, since I have three, uh, neurons, hidden neurons

155
00:07:49,000 --> 00:07:50,000
in the hidden layer.

156
00:07:50,000 --> 00:07:55,000
So this is basically going to be seven cross three right.

157
00:07:55,000 --> 00:07:56,000
Seven cross three weights.

158
00:07:56,000 --> 00:07:57,000
Right.

159
00:07:57,000 --> 00:07:58,000
This will be my seven cross three weights.

160
00:07:58,000 --> 00:08:04,000
And once I perform this particular operation, if I do this entire dot operation so it will be one cross

161
00:08:04,000 --> 00:08:09,000
seven uh dot seven cross three, which will be nothing.

162
00:08:09,000 --> 00:08:13,000
But here this two will get combined.

163
00:08:13,000 --> 00:08:14,000
So I will finally have one cross three.

164
00:08:14,000 --> 00:08:20,000
So that is the reason this f of t that you will be seeing is that I will be getting a output dimension

165
00:08:20,000 --> 00:08:22,000
of one cross three, right?

166
00:08:22,000 --> 00:08:23,000
One cross three.

167
00:08:23,000 --> 00:08:25,000
This is what we are basically getting.

168
00:08:25,000 --> 00:08:32,000
Okay, now you can understand why I basically took CT minus one as three dimension and HT minus one as three

169
00:08:32,000 --> 00:08:32,000
dimension.

170
00:08:32,000 --> 00:08:35,000
When this is three dimension, this has to be three dimension.

171
00:08:35,000 --> 00:08:39,000
And just by doing this calculation here, you can see that I'm able to get three dimensions right.

172
00:08:40,000 --> 00:08:44,000
Uh, if this is four dimension, this also has to be four dimension because based on the calculation

173
00:08:44,000 --> 00:08:45,000
it will happen in that way.

174
00:08:45,000 --> 00:08:45,000
Right.

175
00:08:45,000 --> 00:08:51,000
So here my f of t what I have done is that I have able to find out this output vector which will be

176
00:08:51,000 --> 00:08:52,000
of one cross three.
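
A minimal NumPy sketch of this shape calculation, assuming a sigmoid activation and three hidden neurons. The weights `W_f` and bias `b_f` are hypothetical random and zero values used only to show the shapes; in a trained LSTM they are learned.

```python
import numpy as np

def sigmoid(z):
    # Squashes every value into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

h_prev = np.array([[1.0, 2.0, 4.0]])            # shape (1, 3)
x_t = np.array([[1.0, 2.0, 3.0, 4.0]])          # shape (1, 4)
concat = np.concatenate([h_prev, x_t], axis=1)  # shape (1, 7)

W_f = rng.normal(scale=0.1, size=(7, 3))  # hypothetical weights: 7 inputs x 3 neurons
b_f = np.zeros((1, 3))                    # hypothetical bias: one per neuron

# (1, 7) @ (7, 3) -> (1, 3), then the activation is applied element by element.
f_t = sigmoid(concat @ W_f + b_f)

print(concat.shape, W_f.shape, f_t.shape)  # (1, 7) (7, 3) (1, 3)
```

Because `f_t` has one value per hidden neuron, it lines up with a three-dimensional CT minus one.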

177
00:08:52,000 --> 00:08:52,000
Okay.

178
00:08:52,000 --> 00:08:53,000
Perfect.

179
00:08:53,000 --> 00:08:54,000
Till here.

180
00:08:54,000 --> 00:08:57,000
I think everybody has understood this forget gate operation.

181
00:08:57,000 --> 00:09:02,000
Now going forward, you will be able to see that there is a point wise operation.

182
00:09:03,000 --> 00:09:09,000
So here, if I go ahead and take this forward, there is something called a point wise operation.

183
00:09:09,000 --> 00:09:17,000
Now this point wise operation is done with CT minus one, which comes from my memory cell,

184
00:09:17,000 --> 00:09:24,000
and CT minus one is the previous memory cell information that we are carrying

185
00:09:24,000 --> 00:09:25,000
forward.

186
00:09:25,000 --> 00:09:28,000
Now, once we do this point wise operation, what does this point wise operation basically mean?

187
00:09:28,000 --> 00:09:33,000
Whatever vector we are getting, we are doing a point wise operation with the previous cell state.

188
00:09:33,000 --> 00:09:34,000
Now what does this mean?

189
00:09:34,000 --> 00:09:38,000
Let's let's consider this from my previous cell state.

190
00:09:38,000 --> 00:09:40,000
I get this information CT minus one.

191
00:09:43,000 --> 00:09:44,000
CT minus one.

192
00:09:44,000 --> 00:09:49,000
And let's say I have some values like 6, 8, 9.

193
00:09:49,000 --> 00:09:49,000
Okay.

194
00:09:50,000 --> 00:09:56,000
Now when we do this point wise operation, point wise multiplication operation.

195
00:09:57,000 --> 00:09:57,000
Okay.

196
00:09:57,000 --> 00:10:06,000
And let's say my f of t that I have actually calculated is something like 0, 0, 0.

197
00:10:06,000 --> 00:10:08,000
Let's consider like this okay.

198
00:10:08,000 --> 00:10:12,000
So what we are doing, we are basically doing a point wise operation with 0, 0, 0, okay.

199
00:10:12,000 --> 00:10:15,000
Now what does 0, 0, 0 basically mean?

200
00:10:15,000 --> 00:10:21,000
0, 0, 0 basically means that, with whatever operation we are doing over here up to finding this f of t,

201
00:10:21,000 --> 00:10:29,000
I'm saying, let's say that the complete sentence context has changed.

202
00:10:33,000 --> 00:10:39,000
Now, when the complete sentence context has changed and let's say I'm getting the value 0, 0, 0,

203
00:10:39,000 --> 00:10:40,000
this is the value that we are getting.

204
00:10:40,000 --> 00:10:47,000
What does this basically mean? Once we do the point wise operation here, the previous context, whatever was

205
00:10:47,000 --> 00:10:50,000
present, we are just making it entirely to zero.

206
00:10:50,000 --> 00:10:58,000
That basically means we are removing all the previous context.

207
00:11:00,000 --> 00:11:05,000
Now here you know that when we apply a sigmoid activation function, it will give you a output between

208
00:11:05,000 --> 00:11:06,000
0 to 1.

209
00:11:06,000 --> 00:11:06,000
Right.

210
00:11:06,000 --> 00:11:09,000
It is going to give you an output between 0 to 1.
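
A small sketch of the sigmoid's range, in NumPy. The input values are arbitrary examples chosen for illustration, not values from the video.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A large negative input lands near 0, zero maps to exactly 0.5,
# and a large positive input lands near 1.
z = np.array([-10.0, 0.0, 10.0])
gate = sigmoid(z)
print(gate)
```

Strictly speaking the sigmoid only approaches 0 and 1 without reaching them, so gate values like 0, 0, 0 or 1, 1, 1 are idealized cases.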

211
00:11:09,000 --> 00:11:09,000
Right.

212
00:11:09,000 --> 00:11:12,000
And let's say I got all the outputs as zero.

213
00:11:12,000 --> 00:11:18,000
And if I'm getting the outputs as zero, that basically means, from whatever operation we have done,

214
00:11:18,000 --> 00:11:19,000
Right.

215
00:11:19,000 --> 00:11:22,000
And with respect to the new sentence, the complete context has been changed.

216
00:11:22,000 --> 00:11:26,000
So I have to remove all the context that is still remaining in the previous cell

217
00:11:26,000 --> 00:11:27,000
state.

218
00:11:27,000 --> 00:11:27,000
Okay.

219
00:11:28,000 --> 00:11:35,000
Now similarly, let's go ahead and consider that if my F of t is one, let's say if my f of t is completely

220
00:11:35,000 --> 00:11:36,000
one.

221
00:11:36,000 --> 00:11:38,000
Now if I do the point wise operation.

222
00:11:38,000 --> 00:11:40,000
So this is my first scenario.

223
00:11:40,000 --> 00:11:41,000
Now this will be my second scenario.

224
00:11:41,000 --> 00:11:45,000
Let's say my CT minus one is nothing but 6, 8,

225
00:11:45,000 --> 00:11:45,000
9.

226
00:11:46,000 --> 00:11:51,000
Now, if I do this point wise operation with 1, 1, 1, what am I actually going to get? I'm going

227
00:11:51,000 --> 00:11:53,000
to get the same vector, 6, 8, 9.

228
00:11:53,000 --> 00:12:00,000
So this actually means that I am not going to remove anything from here from the CT minus one memory

229
00:12:00,000 --> 00:12:00,000
cell.

230
00:12:00,000 --> 00:12:04,000
I'm not going to remove anything because all the context is important for me.
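
The first two scenarios can be checked with a short NumPy sketch. The cell state 6, 8, 9 and the gate values are the ones from the video; the variable names are assumptions for illustration.

```python
import numpy as np

c_prev = np.array([6.0, 8.0, 9.0])      # CT minus one

forget_all = np.array([0.0, 0.0, 0.0])  # scenario 1: the context has changed completely
keep_all = np.array([1.0, 1.0, 1.0])    # scenario 2: all the context is still important

# The * operator on NumPy arrays multiplies element by element.
after_forget = forget_all * c_prev
after_keep = keep_all * c_prev

print(after_forget.tolist())  # [0.0, 0.0, 0.0]
print(after_keep.tolist())    # [6.0, 8.0, 9.0]
```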

231
00:12:05,000 --> 00:12:05,000
Okay.

232
00:12:06,000 --> 00:12:08,000
Now similarly I may also give you another scenario.

233
00:12:08,000 --> 00:12:17,000
Let's say if I go ahead and say my CT minus one is 6, 8, 9, now I'm going to do a point wise operation with,

234
00:12:17,000 --> 00:12:23,000
let's say, 0.5, 1, 0.5.

235
00:12:23,000 --> 00:12:26,000
Now when I do this point wise operation, what does this basically mean?

236
00:12:26,000 --> 00:12:33,000
So six multiplied by 0.5 is three, eight multiplied by one will be eight, and nine multiplied by 0.5

237
00:12:33,000 --> 00:12:35,000
will be 4.5.

238
00:12:35,000 --> 00:12:35,000
Right?

239
00:12:35,000 --> 00:12:36,000
So 3, 8, 4.5.
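
The third scenario is a point wise multiplication, so the arithmetic can be verified with a short NumPy sketch. The values are from the video; the variable names are assumptions for illustration.

```python
import numpy as np

c_prev = np.array([6.0, 8.0, 9.0])  # CT minus one
f_t = np.array([0.5, 1.0, 0.5])     # forget gate output

# 6 * 0.5 = 3, 8 * 1 = 8, 9 * 0.5 = 4.5
kept = f_t * c_prev
print(kept.tolist())  # [3.0, 8.0, 4.5]
```

The first and third values are halved, while the middle value passes through unchanged.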

240
00:12:36,000 --> 00:12:43,000
Now here it says that from the memory cell I am removing some context from this value and some context

241
00:12:43,000 --> 00:12:44,000
from this value.

242
00:12:44,000 --> 00:12:46,000
But I am not removing any context from this value.

243
00:12:47,000 --> 00:12:47,000
Right.

244
00:12:47,000 --> 00:12:53,000
So in short, what we are doing by this operation is that we are adding some information or removing

245
00:12:53,000 --> 00:12:55,000
some, some information.

246
00:12:55,000 --> 00:13:00,000
Sorry, I'm not adding some information, but I am removing some information out of it.

247
00:13:00,000 --> 00:13:00,000
Right.

248
00:13:00,000 --> 00:13:02,000
I'm not adding over here, but I'm removing.

249
00:13:02,000 --> 00:13:03,000
Right.

250
00:13:03,000 --> 00:13:04,000
Removing some kind of information.

251
00:13:04,000 --> 00:13:08,000
So that is the reason this is called the forget gate.

252
00:13:08,000 --> 00:13:08,000
Right.

253
00:13:08,000 --> 00:13:11,000
So the conclusion with respect to the forget gate is that.

254
00:13:13,000 --> 00:13:18,000
Here based on the context.

255
00:13:23,000 --> 00:13:26,000
Based on the context this forget gate.

256
00:13:28,000 --> 00:13:33,000
Will let go of some information.

257
00:13:36,000 --> 00:13:40,000
Or will not let go.

258
00:13:42,000 --> 00:13:44,000
Not let go of some information.

259
00:13:46,000 --> 00:13:50,000
And that is what forgetting information is all about.

260
00:13:51,000 --> 00:13:55,000
It makes the memory cell forget some of the information based on the context, right?

261
00:13:55,000 --> 00:13:59,000
And this is how you basically understand about forget gate.

262
00:13:59,000 --> 00:14:00,000
Right.

263
00:14:00,000 --> 00:14:05,000
So I hope you were able to understand this entire LSTM architecture with respect to forget gate.

264
00:14:05,000 --> 00:14:09,000
I broke this down completely and I also helped you to understand this.

265
00:14:09,000 --> 00:14:11,000
How does this neural network look?

266
00:14:11,000 --> 00:14:11,000
Okay.

267
00:14:12,000 --> 00:14:18,000
With the help of the forget gate, what we are doing is that we may remove the complete context, or we may

268
00:14:18,000 --> 00:14:23,000
tell the memory cell to forget some of the information or to remember some of the

269
00:14:23,000 --> 00:14:25,000
information.

270
00:14:25,000 --> 00:14:26,000
Any of these can actually be done.

271
00:14:26,000 --> 00:14:30,000
And this was the importance with respect to the, uh, forget gate.

272
00:14:30,000 --> 00:14:36,000
Now, in my next video, we are going to go ahead and discuss something called

273
00:14:36,000 --> 00:14:37,000
the input gate.

274
00:14:37,000 --> 00:14:39,000
Right. Now, what exactly is this input gate?

275
00:14:39,000 --> 00:14:39,000
We'll get to know.

276
00:14:40,000 --> 00:14:42,000
But here I think we discussed till here.

277
00:14:42,000 --> 00:14:47,000
And this multiplication operation is how you forget some information

278
00:14:47,000 --> 00:14:48,000
based on the context.

279
00:14:48,000 --> 00:14:53,000
You are telling the memory cell to let go of some of the information or to

280
00:14:53,000 --> 00:14:54,000
keep remembering it.

281
00:14:54,000 --> 00:14:56,000
Okay, so yes, this was it from my side.

282
00:14:56,000 --> 00:14:59,000
I will see you all in the next video where I will be discussing about input gate.

283
00:14:59,000 --> 00:15:00,000
Thank you.

