WEBVTT

00:02.130 --> 00:08.310
Hello and welcome to this tutorial. Here, we will understand LSTM and GRU. Let us

00:08.310 --> 00:08.970
begin.

00:10.580 --> 00:17.580
LSTM stands for long short-term memory, whereas GRU stands for gated recurrent

00:17.740 --> 00:18.250
unit.

00:20.070 --> 00:22.510
RNNs suffer from short-term memory.

00:22.780 --> 00:30.580
If a sequence is long enough, then an RNN will have a hard time carrying the information from earlier

00:30.580 --> 00:32.890
time steps to the later ones.

00:33.820 --> 00:36.520
Let us understand this with the help of an example.

00:38.350 --> 00:45.490
If you are trying to process a paragraph of text to do predictions, then recurrent neural networks

00:45.490 --> 00:49.540
may leave out important information from the beginning.

00:52.800 --> 00:53.490
LSTM

00:53.670 --> 00:57.930
and GRU were created as the solution to short-term memory.

00:59.400 --> 01:02.020
They have internal mechanisms called gates.

01:02.220 --> 01:04.800
These gates can regulate the flow of information.

01:07.970 --> 01:13.320
In this picture, we have two diagrams, LSTM and GRU. On the left side,

01:13.400 --> 01:20.120
we have LSTM, and on the right side, we have GRU. In LSTM, there are four main components:

01:20.290 --> 01:26.000
forget gate, cell gate, output gate and input gate.

01:26.150 --> 01:27.320
Whereas in GRU,

01:27.350 --> 01:29.540
there are only two main components:

01:29.750 --> 01:31.790
the reset gate and the update gate.

01:32.570 --> 01:37.040
At the bottom of these two diagrams you can see symbols and their meanings.

01:37.730 --> 01:43.100
The first one is the sigmoid activation function, then the tanh activation function.

01:43.310 --> 01:50.420
After that, pointwise multiplication, then pointwise addition, and at the end, vector concatenation.

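NOTE
Editor's sketch, not part of the spoken tutorial: the five symbols named above, written out on
small Python lists. The vectors a and b and the helper sigmoid are illustrative assumptions, not
values taken from the diagrams.
```python
import math
def sigmoid(x):
    # Logistic function: squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))
# Two hypothetical vectors, chosen only for illustration.
a = [0.5, -1.0]
b = [2.0, 3.0]
squashed_01 = [sigmoid(x) for x in a]        # sigmoid activation, values in (0, 1)
squashed_pm1 = [math.tanh(x) for x in a]     # tanh activation, values in (-1, 1)
product = [x * y for x, y in zip(a, b)]      # pointwise multiplication
total = [x + y for x, y in zip(a, b)]        # pointwise addition
joined = a + b                               # vector concatenation
```
A sigmoid output near 0 or 1 is what lets a gate act like a soft switch when it is multiplied
pointwise with another vector.
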
01:51.290 --> 01:57.470
The gates that you can see here in these two diagrams can learn which data in a sequence is important

01:57.470 --> 01:59.390
to keep and which is not.

01:59.900 --> 02:06.890
By doing that, these gates pass the relevant information down the long chain of sequences to make

02:06.890 --> 02:07.940
predictions.

02:09.050 --> 02:11.220
Let us understand LSTM first.

02:11.450 --> 02:18.820
As you can see here, in LSTM, there are four gates: forget, output, update and input.

02:19.250 --> 02:23.600
The forget gate will decide what to forget from the previous memory unit.

02:24.230 --> 02:30.710
The input gate will decide what to accept inside the neuron, the update gate will update the memory cell,

02:31.100 --> 02:34.970
and the output gate will produce the output as long-term memory.

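NOTE
Editor's sketch, not part of the spoken tutorial: one LSTM time step with a single scalar hidden
unit, assuming the commonly used LSTM formulation (the tutorial itself gives no equations). The
function lstm_step, the gate names in the dictionary and every weight value are hypothetical.
What the speaker calls the update step corresponds here to the line that computes c.
```python
import math
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))
def lstm_step(x, h_prev, c_prev, w):
    # w maps a gate name to hypothetical (weight on h_prev, weight on x, bias).
    def pre(name):
        wh, wx, b = w[name]
        return wh * h_prev + wx * x + b
    f = sigmoid(pre("forget"))     # how much of the previous memory to keep
    i = sigmoid(pre("input"))      # how much new information to accept
    g = math.tanh(pre("cell"))     # candidate values for the memory update
    o = sigmoid(pre("output"))     # how much of the memory to expose
    c = f * c_prev + i * g         # update the memory cell
    h = o * math.tanh(c)           # new hidden state passed to the next step
    return h, c
# Assumed weights, identical for every gate, purely for illustration.
weights = {name: (0.5, 0.5, 0.0) for name in ("forget", "input", "cell", "output")}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, weights)
```
With a forget gate close to 1 and an input gate close to 0, c is carried forward almost
unchanged, which is how information from early time steps can survive a long sequence.
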
02:35.720 --> 02:37.980
So this is the working of LSTM.

02:38.170 --> 02:38.980
Let us understand the

02:39.270 --> 02:46.820
working of GRU. GRU is a comparatively new generation of neural networks, and it is similar to

02:46.880 --> 02:47.450
LSTM.

02:48.440 --> 02:55.000
There are only two gates in GRU: the reset gate and the update gate. The update gate acts as the

02:55.010 --> 03:00.020
forget gate and input gate inside the LSTM. The update gate

03:00.050 --> 03:07.730
decides what information to throw away and what information to add in the neuron. The reset gate is used to

03:07.730 --> 03:10.620
decide how much past information to forget.

03:11.330 --> 03:18.350
So this is the working of GRU. Using LSTM and GRU, short-term memory will be long-term
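NOTE
Editor's sketch, not part of the spoken tutorial: one GRU time step with a single scalar hidden
unit. The function gru_step, the gate names and the weight values are hypothetical. The blend
h = (1 - z) * n + z * h_prev follows one common convention; some texts swap the roles of z and
1 - z, so treat the direction as an assumption.
```python
import math
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))
def gru_step(x, h_prev, w):
    # w maps a name to hypothetical (weight on h_prev, weight on x, bias).
    wh_z, wx_z, b_z = w["update"]
    wh_r, wx_r, b_r = w["reset"]
    wh_n, wx_n, b_n = w["cand"]
    z = sigmoid(wh_z * h_prev + wx_z * x + b_z)            # update gate: keep old vs. take new
    r = sigmoid(wh_r * h_prev + wx_r * x + b_r)            # reset gate: how much past to use
    n = math.tanh(wh_n * (r * h_prev) + wx_n * x + b_n)    # candidate state
    return (1.0 - z) * n + z * h_prev                      # blend old state and candidate
# Assumed weights, purely for illustration.
weights = {"update": (0.5, 0.5, 0.0), "reset": (0.5, 0.5, 0.0), "cand": (1.0, 1.0, 0.0)}
h = 0.0
for x in [1.0, -0.5, 0.25]:
    h = gru_step(x, h, weights)
```
A single gate z does both jobs here: the share z of the old state is kept and the share 1 - z is
replaced, which mirrors the statement that the update gate acts as both forget gate and input gate.
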

03:18.350 --> 03:19.000
memory.

03:19.940 --> 03:23.760
So this is all about LSTM and GRU.

03:24.150 --> 03:26.990
I will see you in the next one. Till then,

03:27.150 --> 03:28.140
happy learning.
