WEBVTT

00:01.510 --> 00:02.080
Hey, everyone.

00:02.110 --> 00:06.790
So let's continue our discussion on training your own embedding model.

00:07.240 --> 00:11.410
So in the last video, we created this model and put it up for training.

00:11.770 --> 00:13.780
Now that the training is completed,

00:14.320 --> 00:17.100
let me display this model object.

00:21.480 --> 00:21.900
All right.

00:21.930 --> 00:25.180
So we got our model object, which is of the Word2Vec class.

00:25.530 --> 00:31.650
So our model is ready. We can do the prediction, and we can see some of the applications of this word

00:31.650 --> 00:32.320
embedding.

00:33.090 --> 00:36.720
So the first one is we should execute this one first.

00:37.080 --> 00:44.330
So we are going to use model.wv, and that will return, for a particular token,

00:44.330 --> 00:47.850
how it is represented, that is, what its word vector is.

00:48.390 --> 00:49.740
So let me run it.

00:52.360 --> 01:00.570
All right, so the token "man" is represented simply by these 32 numbers.

01:01.060 --> 01:08.200
It has other applications, like most similar: for "man", which are the other 10 words

01:08.260 --> 01:09.790
that are the closest?

01:10.030 --> 01:16.780
Closest not in terms of the actual characters contained in those tokens, but in terms of the semantic

01:16.780 --> 01:22.800
representation: which are the 10 words most closely related to "man"?
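
NOTE
A minimal sketch of the "most similar" idea described above: rank tokens by
cosine similarity of their vectors rather than by shared characters.
The 3-dimensional vectors are hypothetical toy values, not output of the
trained model; the helper names cosine and most_similar are illustrative.
```python
import math
#
# Hypothetical toy vectors; a trained model would supply real ones.
TOY_VECTORS = {
    "man": [0.9, 0.1, 0.0],
    "woman": [0.8, 0.3, 0.0],
    "boy": [0.7, 0.0, 0.2],
    "mango": [0.0, 0.1, 0.9],  # shares characters with "man", but points elsewhere
}
#
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
#
def most_similar(word, vectors, topn=10):
    """Rank every other token by cosine similarity to `word`, best first."""
    query = vectors[word]
    scored = [(other, cosine(query, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]
#
neighbours = most_similar("man", TOY_VECTORS, topn=3)
print(neighbours)
```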

01:23.020 --> 01:24.520
So let me execute it.

01:27.260 --> 01:33.190
And you can see "woman" is quite close to "man", then "couple", "girl", "boy".

01:33.500 --> 01:34.730
So we got all those words,

01:35.870 --> 01:41.210
which are quite similar in nature to the word we supplied, "man".

01:41.540 --> 01:47.660
So now you can understand the power of this Word2Vec, or word embedding, kind of model, which

01:47.660 --> 01:49.500
is a prediction-based model.

01:49.970 --> 01:56.330
Now, there is no character similarity between, let's say, "boy" and "man", but semantically the model indicates that

01:57.140 --> 01:59.090
these are similar words.

01:59.300 --> 02:03.560
So this is just not possible with simple string matching.

02:04.040 --> 02:06.730
But it is possible with the help of this word embedding.
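
NOTE
To illustrate why plain string matching cannot find this relation, here is a
small sketch using Python's standard difflib module. The helper name
char_similarity and the word "mango" are illustrative additions, not part of
the lecture notebook.
```python
from difflib import SequenceMatcher
#
def char_similarity(a, b):
    """Character-level similarity in [0, 1] based on matching blocks."""
    return SequenceMatcher(None, a, b).ratio()
#
# "boy" and "man" share no characters, so string matching sees no relation.
boy_vs_man = char_similarity("boy", "man")
# "mango" contains "man", so string matching rates it as close.
mango_vs_man = char_similarity("mango", "man")
print(boy_vs_man, mango_vs_man)
```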

02:07.130 --> 02:10.550
So word embedding has given us the power to represent

02:11.750 --> 02:14.300
every single token in a high-dimensional space,

02:14.390 --> 02:17.270
and that too in a fixed-size high-dimensional space.

02:17.830 --> 02:21.080
Now let's see one more application, and you will be amazed to see it.

02:23.720 --> 02:24.110
Let's see.

02:24.560 --> 02:30.350
We'll try to do some simple vector arithmetic, like addition and subtraction. So I'm just going to subtract:

02:31.040 --> 02:35.960
the word vector of "king" minus "man", and then I'm adding "woman".

02:36.720 --> 02:38.920
And let's see what results we get.

02:41.560 --> 02:44.790
So "woman" is closest to "woman" itself,

02:46.380 --> 02:52.770
because "king" and "man" will cancel each other out. Then we have "union", "education", "religious".

02:53.070 --> 03:00.600
So in this case, we didn't get a very good result, because we were expecting that "king" minus "man" plus "woman"

03:00.630 --> 03:07.620
should be something like "queen", because "king" will cancel out "man" the same way "woman" will cancel out "queen".

03:07.890 --> 03:11.310
Now, why did this happen? Because we're limited by the data.

03:11.670 --> 03:13.740
Of course we are limited by the data.
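
NOTE
A minimal sketch of the vector arithmetic described above, using hypothetical
2-dimensional toy vectors chosen so that the analogy works; a model trained on
limited data may not behave this cleanly. Excluding the input words from the
candidates is a choice made in this sketch: without it, an input word such as
"woman" can come back as the top hit.
```python
import math
#
# Hypothetical toy vectors: axis 0 is roughly "royalty", axis 1 roughly "gender".
TOY_VECTORS = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "union": [0.2, 0.1],
}
#
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
#
def analogy(positive, negative, vectors, topn=1):
    """Add the `positive` vectors, subtract the `negative` ones, rank the rest."""
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    excluded = set(positive) | set(negative)  # skip the query words themselves
    scored = [(w, cosine(target, vec))
              for w, vec in vectors.items() if w not in excluded]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]
#
# king - man + woman
result = analogy(["king", "woman"], ["man"], TOY_VECTORS)
print(result)
```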

03:14.280 --> 03:19.440
Let's see one more: Berlin is the capital of Germany, which is a country.

03:19.920 --> 03:21.690
Then Paris is the capital of France.

03:21.720 --> 03:23.830
So let's see whether we'll get "France" or not.

03:26.990 --> 03:27.350
All right.

03:27.380 --> 03:29.550
So in this case, we got a very good result.

03:30.080 --> 03:38.090
So if we take the word vector of "Germany" minus "Berlin" plus "Paris", it comes out closest to "France".

03:38.330 --> 03:41.030
So we got an excellent result in this particular case.

03:41.330 --> 03:47.450
Now, you can imagine why it works fine in this case: because the data we're taking is

03:47.450 --> 03:48.740
world news data.

03:49.220 --> 03:51.230
You can see it is world news data from Reddit.

03:51.520 --> 03:58.000
So probably, in that dataset, words like "Germany" and "Berlin" were found together.

03:58.010 --> 04:00.140
Same with Paris and France together.

04:00.440 --> 04:06.740
That's why that kind of semantic relationship is something our Word2Vec embedding technique

04:06.950 --> 04:09.440
tries to capture very easily.

04:10.040 --> 04:12.570
Let's see about "Messi" minus "football".

04:12.620 --> 04:16.160
So, same as before: like Messi is the god of football,

04:16.490 --> 04:20.000
the same way, we are trying to find who is the god of cricket.

04:20.810 --> 04:21.260
So.

04:22.200 --> 04:28.080
Mostly, this cricket tournament is played in Asia-Pacific countries.

04:29.460 --> 04:30.900
And let's see what will come.

04:32.230 --> 04:34.860
So we got stalled and find a undergoes.

04:35.140 --> 04:40.030
So maybe we don't have enough data corresponding to cricket.

04:40.210 --> 04:41.950
And that's why this thing happened.

04:42.790 --> 04:45.010
We can try with, let's say, "hockey".

04:46.210 --> 04:51.130
I'm not sure whether "hockey" is a token available in this vocabulary.

04:52.200 --> 04:54.900
Oh, so "hockey" doesn't exist.

04:54.930 --> 04:55.920
Hockey doesn't exist.
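
NOTE
A small sketch of guarding a query against tokens that are not in the
vocabulary, as happened with "hockey" above. The toy vocabulary and the helper
name split_known are hypothetical; a trained model exposes its own
token-to-vector mapping.
```python
# Hypothetical toy vocabulary standing in for the trained model's tokens.
TOY_VECTORS = {
    "football": [0.1, 0.9],
    "cricket": [0.2, 0.8],
    "messi": [0.3, 0.7],
}
#
def split_known(tokens, vectors):
    """Split `tokens` into those present in `vectors` and those missing."""
    known = [t for t in tokens if t in vectors]
    missing = [t for t in tokens if t not in vectors]
    return known, missing
#
# Check the query words before doing any vector arithmetic with them.
known, missing = split_known(["messi", "football", "hockey"], TOY_VECTORS)
if missing:
    print("Not in vocabulary:", missing)
print("Usable tokens:", known)
```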

04:56.330 --> 04:58.380
So let's just keep it as it is.

04:58.950 --> 05:02.350
So now you can understand the power of this Word2Vec model.

05:02.370 --> 05:03.300
And we tried to

05:04.840 --> 05:10.600
train our own Word2Vec model, and we got some very, very good results, but we are still limited

05:10.600 --> 05:11.330
by the dataset.

05:11.830 --> 05:13.750
And of course, that will be OK.

05:13.810 --> 05:16.720
My idea and intention is to teach you how,

05:16.730 --> 05:18.020
with this Gensim library

05:18.030 --> 05:21.470
and your own custom dataset, you can train the model.

05:22.080 --> 05:26.990
Now, next, we are going to deal with the same Word2Vec model,

05:27.040 --> 05:32.170
but that will be a pre-trained Word2Vec model, and we will look at the same results.

05:32.630 --> 05:40.420
I would say we'll apply those very standard models, which have been trained by some big companies,

05:40.840 --> 05:45.270
and those particular models will give excellent results.

05:45.350 --> 05:46.950
So see you in the next video.
