WEBVTT

00:01.940 --> 00:02.460
Hi, everyone.

00:02.620 --> 00:09.950
So in the last video, we trained our own Word2Vec model and embedding matrix, and we tested it, and somewhere

00:09.990 --> 00:16.330
we found a good result also, but now we are going to use this pre-trained model, and this pre-trained

00:16.330 --> 00:18.370
model is trained by Google.

00:18.460 --> 00:20.140
So obviously you can think about

00:20.290 --> 00:26.680
the amount of data they used for training; that will be an enormous amount of data.

00:27.010 --> 00:32.560
So this model, we are just going to import it, and we will see how we can

00:34.000 --> 00:39.440
load this model, and then we will test it on our testing dataset.

00:40.050 --> 00:43.950
And let's see what results we get on exactly

00:44.100 --> 00:45.030
similar stuff.

00:48.150 --> 00:55.550
Let me clear all the outputs and let us start working. So import all those libraries: gensim and

00:55.550 --> 00:55.820
TensorFlow.

00:56.940 --> 00:59.640
One more is the Kaggle library.

01:00.390 --> 01:04.080
We need to make the Kaggle directory because, same thing,

01:04.080 --> 01:06.270
we are going to download the model from Kaggle only.

01:06.720 --> 01:10.500
Let me go to this link and you will get the idea.

01:11.530 --> 01:14.740
Next thing, we need to upload this kaggle.json.

01:15.520 --> 01:18.690
Now, as we are dealing with a different notebook,

01:19.030 --> 01:20.680
every time we have to do this thing.

01:21.430 --> 01:23.310
So kaggle.json got uploaded.

01:23.830 --> 01:24.910
Let me just copy it

01:24.950 --> 01:27.470
to the .kaggle folder.

01:29.250 --> 01:33.850
And let me set the permission, and then let's just download this dataset.

01:34.520 --> 01:35.780
Now it's a little bigger

01:35.850 --> 01:41.450
model, almost around 3 GB, so it will take a little bit of time.

01:42.230 --> 01:43.940
So I'm just fast-forwarding my video

01:43.970 --> 01:47.300
till this download completes.

01:51.710 --> 01:57.920
All right, so you can see the model is successfully downloaded, and it's almost around 3 GB, so it took

01:57.920 --> 01:58.490
a little time.

01:59.060 --> 02:00.860
Next thing, we need to unzip this model.

02:00.950 --> 02:05.240
Now, unzipping is definitely going to take a good amount of time.

02:06.340 --> 02:07.590
So I unzip the file.

02:09.990 --> 02:12.900
We can even go to this content folder.

02:15.160 --> 02:20.640
Let me go one step up, and inside the content folder it has started extracting all those things.

02:23.750 --> 02:28.870
This particular bin file: the zip file we downloaded,

02:28.910 --> 02:31.270
we are extracting it into this bin file.

02:32.420 --> 02:33.330
Let me collapse.

02:36.630 --> 02:42.540
All right, so now you can see the model is completely unzipped and it is available.

02:43.450 --> 02:46.710
I think you know the file formats, .bin and .gz.

02:46.890 --> 02:48.070
And again, it will be big.

02:48.660 --> 02:50.070
So it's quite a big model.

02:51.350 --> 02:55.750
Next, we need to load this model, and for that we are going to use KeyedVectors.

02:56.660 --> 02:59.330
We are limiting it to one lakh (100,000) vectors only.

02:59.330 --> 03:00.160
Only those we are going to load.

03:00.170 --> 03:01.910
Otherwise it will be too heavy.

03:02.420 --> 03:05.840
So let us load the model into this

03:06.090 --> 03:07.130
model variable.

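NOTE
A minimal sketch of the loading step described above, assuming gensim is installed.
The file name GoogleNews-vectors-negative300.bin and the helper load_pretrained are
assumptions for illustration; the video does not show the exact call.
```python
def load_pretrained(path="GoogleNews-vectors-negative300.bin", limit=100_000):
    """Load a pre-trained word2vec binary, keeping only the first `limit` vectors."""
    # Imported lazily so the sketch can be defined even where gensim is absent.
    from gensim.models import KeyedVectors
    # binary=True: the file is assumed to be in the binary word2vec format.
    # limit caps how many vectors are read, which keeps memory use manageable.
    return KeyedVectors.load_word2vec_format(path, binary=True, limit=limit)
# Usage (needs the unzipped file on disk):
#   model = load_pretrained()
#   len(model["man"])  # number of dimensions in one word vector
```
Raising or removing the limit loads more of the vocabulary at the cost of memory.
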
03:07.760 --> 03:11.360
And we are not doing any kind of model building.

03:11.390 --> 03:19.020
We are just loading the existing pre-trained model, and that has been pre-trained by the Google organization.

03:19.610 --> 03:23.540
And that is Google News: based on the news data available,

03:24.000 --> 03:25.340
they trained this model.

03:25.860 --> 03:28.160
All right, let's go with the word

03:29.470 --> 03:35.650
"man". So let me execute it, and it will give us the vector representation of "man".

03:36.650 --> 03:40.460
Now you can see the size of this vector is much, much bigger.

03:40.970 --> 03:45.530
If you check its length with, let's say, the len function:

03:48.960 --> 03:52.290
in the earlier case, we had just 32 dimensions.

03:52.380 --> 03:58.430
But in this Google pre-trained model, it is a 300-dimensional vector.

03:58.460 --> 04:01.790
So every single token in this case will be represented by

04:02.880 --> 04:04.570
three hundred different numbers.

04:04.960 --> 04:07.360
Now let's try the same thing,

04:07.570 --> 04:14.140
like most_similar, with this pre-trained model, and we'll compare what results we get.

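NOTE
A rough sketch of what a "most similar" lookup does: rank the other words by
cosine similarity to the query word. The toy vectors and the helper names below are
hypothetical; the real model stores 300 numbers per word.
```python
import math
def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
def most_similar(word, vectors, topn=3):
    """Rank every other word by cosine similarity to `word`."""
    scores = [(w, cosine(vectors[word], vec)) for w, vec in vectors.items() if w != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]
# Hypothetical 3-dimensional toy vectors, not values from the real model.
toy = {
    "man": [1.0, 0.0, 0.2],
    "woman": [0.9, 0.1, 0.3],
    "boy": [0.7, 0.0, 0.6],
    "car": [0.0, 1.0, 0.0],
}
ranking = most_similar("man", toy)
```
With these toy values the ranking for "man" is woman, then boy, then car.
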
04:14.620 --> 04:21.280
So in this case also we got "woman", but "boy" comes right after it; earlier, we had a couple of

04:21.550 --> 04:22.720
"girl" entries and then "boy".

04:23.380 --> 04:25.990
So a little better, right? "man" is related to "woman",

04:26.170 --> 04:27.730
and then "boy" and "teenager" again.

04:28.340 --> 04:31.100
That's OK, but good results

04:31.120 --> 04:31.570
we got, right?

04:32.200 --> 04:36.880
But the real test will come when we test on this: king minus man

04:37.120 --> 04:37.910
plus woman.

04:38.530 --> 04:39.780
So that is the vector arithmetic.

04:40.420 --> 04:42.460
And we got the king.

04:42.460 --> 04:45.610
So as the first one we got "king", but then we have "queen".

04:45.940 --> 04:48.730
So that's quite impressive, and quite good results we got.

04:49.120 --> 04:56.660
So "king" is to "man" as "queen" is to "woman". Let's try with Germany's capital, Berlin.

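NOTE
One possible explanation for "king" ranking first and "queen" second: if the
lookup uses the raw result vector and does not drop the query words, "king" itself
stays the nearest neighbour. A sketch with hypothetical 2-d toy vectors, not the
video's actual code or real model values:
```python
import math
def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))
def nearest(query, vectors, exclude=()):
    """Words ordered by cosine similarity to `query`, skipping any in `exclude`."""
    words = [w for w in vectors if w not in exclude]
    return sorted(words, key=lambda w: cosine(query, vectors[w]), reverse=True)
# Toy vectors: axis 0 is roughly "royalty", axis 1 roughly "gender" (assumed values).
toy = {"king": [1.0, 0.3], "queen": [1.0, -0.3], "man": [0.1, 0.3], "woman": [0.1, 0.1]}
# Element-wise king - man + woman.
target = [k - m + w for k, m, w in zip(toy["king"], toy["man"], toy["woman"])]
raw = nearest(target, toy)  # query words still compete
filtered = nearest(target, toy, exclude=("king", "man", "woman"))
```
Here raw starts with king followed by queen, while filtered starts with queen.
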
04:56.710 --> 05:01.930
So Paris is the capital of France and we got, yes, France.

05:01.940 --> 05:03.820
So in the earlier case also we got France.

05:03.830 --> 05:06.730
So with that model also, it works fine.

05:07.090 --> 05:10.570
But in this particular case, we got very good, quite impressive results.

05:11.050 --> 05:13.190
Let me go with the last one.

05:13.210 --> 05:15.400
Messi, football, and cricket.

05:15.460 --> 05:16.930
Now we have a good amount of data.

05:17.500 --> 05:20.020
So obviously, the first one we got is Messi, because

05:21.110 --> 05:27.720
this particular vector is quite close to Messi, but the second one onward you ought to consider.

05:27.830 --> 05:30.250
So see: Tendulkar, Dravid.

05:30.730 --> 05:38.540
They are like the king, or I would say the god, of this cricket tournament, or cricket as a sport, the same way

05:38.600 --> 05:41.920
as Messi is in the case of football; these kinds of guys are there

05:42.020 --> 05:43.070
in the case of cricket.

05:43.340 --> 05:49.010
So for those of you who don't know about cricket, I'm just giving you an idea that these two guys are the gods of

05:49.010 --> 05:49.820
the cricket

05:51.220 --> 05:57.640
sport, like that. So the conclusion is that this pre-trained model works very well, even though,

05:57.700 --> 06:02.830
and this is one more thing I want to highlight, we just imported one lakh vectors only.

06:02.950 --> 06:07.720
But there are many, many more vectors embedded inside this pre-trained model.

06:08.290 --> 06:13.270
So that is all about how to use this pre-trained model for the embedding purpose.

06:13.630 --> 06:14.860
See you in the next video.
