1
00:00:00,000 --> 00:00:00,000
Hello guys.

2
00:00:00,000 --> 00:00:03,000
So we are going to continue our discussion of natural language processing.

3
00:00:03,000 --> 00:00:06,000
In this video we are going to cover word2vec.

4
00:00:06,000 --> 00:00:09,000
We have already seen what exactly word2vec is.

5
00:00:09,000 --> 00:00:11,000
It is a model trained with deep learning.

6
00:00:11,000 --> 00:00:17,000
And again, it is a kind of word embedding technique, wherein the focus is to convert words into vectors,

7
00:00:17,000 --> 00:00:22,000
making sure that the meaning of different words is actually maintained.

8
00:00:22,000 --> 00:00:26,000
Like if there are similar words, we will be getting vectors that are very near to each other if we

9
00:00:26,000 --> 00:00:31,000
find out the difference, and we will also be able to see which words are completely

10
00:00:31,000 --> 00:00:33,000
opposite based on these vectors.

11
00:00:33,000 --> 00:00:40,000
Okay, so let's discuss word2vec, and I will give you an idea of what exactly word2vec

12
00:00:40,000 --> 00:00:44,000
is and how words are getting converted into vectors.

13
00:00:45,000 --> 00:00:50,000
In the upcoming videos, I will try to show you how word2vec models are basically prepared

14
00:00:50,000 --> 00:00:52,000
with respect to architecture.

15
00:00:52,000 --> 00:00:54,000
Uh, in, in the case of deep learning models.

16
00:00:54,000 --> 00:00:58,000
And for that you really need to have knowledge about ANN models, right?

17
00:00:58,000 --> 00:01:01,000
If you really need to understand how we can train word2vec from scratch.

18
00:01:01,000 --> 00:01:02,000
Okay.

19
00:01:02,000 --> 00:01:07,000
And, uh, yeah, let's go ahead with the definition, and let's see what problems it

20
00:01:07,000 --> 00:01:08,000
actually fixes.

21
00:01:08,000 --> 00:01:15,000
So over here, word2vec is a technique for natural language processing published in 2013.

22
00:01:15,000 --> 00:01:18,000
And it was published by Google.

23
00:01:18,000 --> 00:01:23,000
An amazing company, as you already know, and they have done tons of work with respect to NLP.

24
00:01:23,000 --> 00:01:26,000
You know they're doing a lot of research.

25
00:01:26,000 --> 00:01:29,000
The word2vec algorithm uses a neural network model.

26
00:01:29,000 --> 00:01:35,000
We'll be discussing about this, how we'll be using it, uh, how it uses a neural network to learn

27
00:01:35,000 --> 00:01:36,000
word association.

28
00:01:36,000 --> 00:01:42,000
Please make sure that you understand these words: to learn word associations from a large corpus

29
00:01:42,000 --> 00:01:43,000
of text.

30
00:01:44,000 --> 00:01:51,000
Once trained, such a model can detect synonymous words or suggest additional words for a partial sentence,

31
00:01:51,000 --> 00:01:55,000
so it will be able to detect synonyms.

32
00:01:55,000 --> 00:01:59,000
It will be able to detect, uh, opposite words and many more things.

33
00:01:59,000 --> 00:02:05,000
As the name implies, word2vec represents each distinct word with a particular list of numbers called

34
00:02:05,000 --> 00:02:05,000
as vectors.

35
00:02:05,000 --> 00:02:09,000
So at the end of the day, we are converting a word into vectors.

36
00:02:09,000 --> 00:02:11,000
But this vector will have many things.

37
00:02:11,000 --> 00:02:17,000
It will be able to detect synonym words, or it will also be able to suggest additional words for partial

38
00:02:17,000 --> 00:02:18,000
sentence.

39
00:02:18,000 --> 00:02:21,000
Okay, now let's understand what exactly this is.

40
00:02:21,000 --> 00:02:24,000
Now see, guys, in bag of words and TF-IDF we have already seen this, right?

41
00:02:24,000 --> 00:02:30,000
Based on the vocabulary size, we'll get either ones or zeros, and in short we are getting

42
00:02:30,000 --> 00:02:32,000
a sparse matrix. In TF-IDF

43
00:02:32,000 --> 00:02:34,000
also, we may get decimals like 0.25 and 0.6.

44
00:02:34,000 --> 00:02:37,000
Then again, zeros are there. In word2vec

45
00:02:37,000 --> 00:02:39,000
it will be a little bit different.

46
00:02:39,000 --> 00:02:41,000
Now let me talk about how it will be different.

47
00:02:41,000 --> 00:02:44,000
Let's consider that I have a vocabulary okay.

48
00:02:44,000 --> 00:02:46,000
And I have a vocabulary.

49
00:02:46,000 --> 00:02:48,000
Let's say I have my vocabulary.

50
00:02:48,000 --> 00:02:55,000
So basically this is the number of unique words I have in my corpus. Okay, unique words I

51
00:02:55,000 --> 00:02:56,000
have in my corpus.

52
00:02:56,000 --> 00:02:58,000
Corpus basically means the whole collection of text, say a paragraph.

53
00:02:58,000 --> 00:02:59,000
Okay.

54
00:02:59,000 --> 00:03:06,000
Now let's say the vocabulary words that I specifically have are something like boy, girl, okay.

55
00:03:06,000 --> 00:03:10,000
And then we have something like King, Queen.

56
00:03:13,000 --> 00:03:20,000
And if I probably talk about some more words like apple and mango, let's say in my vocabulary, I have

57
00:03:20,000 --> 00:03:21,000
this many words.

58
00:03:21,000 --> 00:03:21,000
Okay.

59
00:03:21,000 --> 00:03:29,000
Now, one very important word I'm going to put up over here, which is called as feature representation.

60
00:03:30,000 --> 00:03:33,000
Feature representation.

61
00:03:33,000 --> 00:03:35,000
Please listen to me very, very carefully.

62
00:03:35,000 --> 00:03:36,000
A very important topic.

63
00:03:36,000 --> 00:03:45,000
Now each and every word that is present in the vocabulary will be converted into a feature representation.

64
00:03:45,000 --> 00:03:47,000
Now what exactly does this mean?

65
00:03:47,000 --> 00:03:56,000
This basically means that we are going to convert all these words into vectors based on some features.

66
00:03:56,000 --> 00:03:58,000
Now what features can these be?

67
00:03:58,000 --> 00:03:58,000
Right?

68
00:03:58,000 --> 00:04:00,000
Let me give you a very good example about it.

69
00:04:00,000 --> 00:04:06,000
But understand that when we are training a very big word2vec model, at that time you will not be

70
00:04:06,000 --> 00:04:08,000
getting a clear idea about all the features.

71
00:04:08,000 --> 00:04:14,000
But here, just to make you understand the intuition, let's discuss this

72
00:04:14,000 --> 00:04:15,000
feature representation.

73
00:04:15,000 --> 00:04:20,000
So on the left-hand side, let's say that I will be having a lot of features like

74
00:04:20,000 --> 00:04:20,000
this.

75
00:04:20,000 --> 00:04:23,000
So I may be having a feature called as gender.

76
00:04:23,000 --> 00:04:26,000
I may be having a feature called as royal.

77
00:04:26,000 --> 00:04:29,000
I may be having a feature called as age.

78
00:04:29,000 --> 00:04:31,000
I may be having a feature called as food.

79
00:04:31,000 --> 00:04:32,000
Like this.

80
00:04:32,000 --> 00:04:37,000
I will be having a lot of features, and let's say that the total number of features, you know, are

81
00:04:37,000 --> 00:04:39,000
basically 300 dimensions.

82
00:04:39,000 --> 00:04:44,000
That basically means I will be going up to an nth feature over here, and the size of this entire feature set,

83
00:04:44,000 --> 00:04:49,000
like if I probably count all of this, this will be 300 dimensions.

84
00:04:49,000 --> 00:04:54,000
Let's consider 300 dimensions, which basically means I will be having these 300 features.

85
00:04:54,000 --> 00:05:01,000
Now with respect to this whole vocabulary, we are going to represent each word in the form of a vector

86
00:05:01,000 --> 00:05:03,000
considering this feature representation.

87
00:05:03,000 --> 00:05:05,000
So this is my entire feature okay.

88
00:05:05,000 --> 00:05:09,000
What I'm actually going to do I'm going to take up all these words that are present in the vocabulary.

89
00:05:09,000 --> 00:05:15,000
Based on these particular features, we are going to assign a numerical value, okay.

90
00:05:15,000 --> 00:05:16,000
And understand one thing.

91
00:05:16,000 --> 00:05:20,000
This numerical value will be assigned based on the relation between these two things,

92
00:05:20,000 --> 00:05:26,000
that is, the vocabulary word and the feature. The features here I have given you as an example.

93
00:05:26,000 --> 00:05:31,000
But when we are training a very large word2vec model, we will not be able to see these features

94
00:05:31,000 --> 00:05:32,000
entirely.

95
00:05:32,000 --> 00:05:32,000
Right?

96
00:05:33,000 --> 00:05:38,000
If I take an example of Google, you know they have come up with this amazing word2vec model, which

97
00:05:38,000 --> 00:05:44,000
is basically trained on about 100 billion words from Google News and covers about 3 million words and phrases.

98
00:05:44,000 --> 00:05:45,000
Right.

99
00:05:45,000 --> 00:05:50,000
And I will also try to show you that practical example, and at the end of the day, you'll be able to see that

100
00:05:50,000 --> 00:05:55,000
each and every word is basically represented by a 300-dimensional feature representation.

101
00:05:55,000 --> 00:05:59,000
That basically means every word will be having a 300-dimensional

102
00:05:59,000 --> 00:06:00,000
uh, vector.

103
00:06:00,000 --> 00:06:01,000
Okay.

104
00:06:01,000 --> 00:06:03,000
Now this is super important.

105
00:06:03,000 --> 00:06:05,000
What kind of values can I have over here?

106
00:06:06,000 --> 00:06:06,000
Right.

107
00:06:06,000 --> 00:06:08,000
What kind of values can I have over here?

108
00:06:08,000 --> 00:06:12,000
Now, let's say with respect to boy, which is present in the vocabulary, okay.

109
00:06:12,000 --> 00:06:15,000
And its relationship with respect to gender.

110
00:06:15,000 --> 00:06:17,000
Let's say that I'm having minus one over here.

111
00:06:17,000 --> 00:06:19,000
Let's, let's let's just consider okay.

112
00:06:19,000 --> 00:06:20,000
Minus one over here.

113
00:06:20,000 --> 00:06:27,000
Now with respect to girl and gender the value can be plus one because it is the opposite of boy right

114
00:06:27,000 --> 00:06:28,000
opposite with respect to gender.

115
00:06:28,000 --> 00:06:31,000
If I say boy, the opposite of boy is nothing but girl.

116
00:06:31,000 --> 00:06:34,000
So minus one and plus one, these kinds of values can come.

117
00:06:34,000 --> 00:06:36,000
Now here you can see with respect to the next feature.

118
00:06:36,000 --> 00:06:38,000
Over here we have boy, we have royal.

119
00:06:38,000 --> 00:06:40,000
We obviously know we cannot say that, right.

120
00:06:40,000 --> 00:06:43,000
We don't have a sentence like, oh, he's a royal boy.

121
00:06:43,000 --> 00:06:46,000
He can be a royal prince or he can be a royal king.

122
00:06:46,000 --> 00:06:46,000
Right.

123
00:06:46,000 --> 00:06:49,000
So there is no proper relationship.

124
00:06:49,000 --> 00:06:53,000
So in this particular case, you know, there will be a value like 0.01.

125
00:06:53,000 --> 00:06:54,000
I'm just giving you as an example.

126
00:06:55,000 --> 00:06:55,000
Right.

127
00:06:55,000 --> 00:07:00,000
Similarly, with respect to boy and age, let's say that there is not much relation.

128
00:07:00,000 --> 00:07:03,000
So I'm just going to put it as 0.03 okay.

129
00:07:03,000 --> 00:07:05,000
Very near to zero.

130
00:07:05,000 --> 00:07:07,000
Now similarly I can have all these values over here.

131
00:07:07,000 --> 00:07:14,000
And all these values come from properly trained models like word2vec, which is trained

132
00:07:14,000 --> 00:07:16,000
by deep learning techniques like ANN, okay.

133
00:07:16,000 --> 00:07:19,000
And I'll show you in the next video how those models are basically trained.

134
00:07:19,000 --> 00:07:25,000
But just understand over here, each and every vocabulary word that we are seeing is represented based on

135
00:07:25,000 --> 00:07:26,000
this feature representation.

136
00:07:26,000 --> 00:07:30,000
So that basically means for boy I will be having this vector.

137
00:07:30,000 --> 00:07:37,000
So for boy for boy you can see over here I will be having this specific vector.

138
00:07:39,000 --> 00:07:39,000
Okay.

139
00:07:39,000 --> 00:07:42,000
So all these values will be here, okay.

140
00:07:42,000 --> 00:07:44,000
Now similarly with respect to girl.

141
00:07:44,000 --> 00:07:45,000
Now you can see with respect to gender.

142
00:07:45,000 --> 00:07:48,000
If boy is having minus one then this will be plus one right.

143
00:07:48,000 --> 00:07:51,000
Because it is completely opposite. With respect to royal,

144
00:07:51,000 --> 00:07:52,000
again no relationship.

145
00:07:52,000 --> 00:07:53,000
So it will be 0.02.

146
00:07:53,000 --> 00:08:00,000
Let's say here also I'm going to put 0.02, because age with respect to girl also has no specific relationship, right.

147
00:08:00,000 --> 00:08:03,000
So similarly I will be having other values like this.

148
00:08:03,000 --> 00:08:08,000
Now see with respect to King, you know I may have gender like uh, there is a relationship with respect

149
00:08:08,000 --> 00:08:09,000
to gender and King.

150
00:08:09,000 --> 00:08:12,000
So -0.92 will be there. With respect to queen,

151
00:08:12,000 --> 00:08:15,000
it can be plus 0.93, you know.

152
00:08:15,000 --> 00:08:19,000
So here you can see opposite right opposite opposite over here.

153
00:08:19,000 --> 00:08:19,000
Right.

154
00:08:19,000 --> 00:08:23,000
And similarly I can have other values, like over here royal and king are related.

155
00:08:23,000 --> 00:08:23,000
Right.

156
00:08:23,000 --> 00:08:26,000
So I can have 0.95 okay.

157
00:08:26,000 --> 00:08:29,000
And over here you can see a queen can also be royal, right.

158
00:08:29,000 --> 00:08:32,000
So over here probably the value can be 0.96 or 0.97.

159
00:08:32,000 --> 00:08:38,000
Very much near to each other, right. Now understand, because of these vectors, you know, similar words will

160
00:08:38,000 --> 00:08:41,000
be very, very close to each other.

161
00:08:41,000 --> 00:08:43,000
Because if I try to subtract it, I'll just give you an idea about it.

162
00:08:43,000 --> 00:08:48,000
So right now with respect to age also obviously there will be some relationship between age and King

163
00:08:48,000 --> 00:08:51,000
because we say that old king right with respect to age.

164
00:08:51,000 --> 00:08:53,000
So suppose let's say here I'm having 0.75.

165
00:08:53,000 --> 00:08:55,000
Here I'm having 0.68.

166
00:08:55,000 --> 00:08:57,000
And like this I will be having multiple values.

167
00:08:57,000 --> 00:08:58,000
Right.

168
00:08:58,000 --> 00:09:00,000
And we can clearly see right over here.

169
00:09:00,000 --> 00:09:00,000
It is.

170
00:09:00,000 --> 00:09:05,000
Now see, apple obviously will have no relationship with respect to gender.

171
00:09:05,000 --> 00:09:10,000
So I can probably write point uh point five or .01 something.

172
00:09:10,000 --> 00:09:10,000
Right.

173
00:09:10,000 --> 00:09:12,000
With respect to mango also there cannot be a relationship.

174
00:09:12,000 --> 00:09:13,000
So I'll just write 0.23.

175
00:09:13,000 --> 00:09:14,000
I'm just saying that okay.

176
00:09:14,000 --> 00:09:17,000
We basically, uh, can have some values over here.

177
00:09:17,000 --> 00:09:21,000
If it is not having a relation, it will be very much near to zero, like 0.05.

178
00:09:22,000 --> 00:09:22,000
Okay.

179
00:09:22,000 --> 00:09:24,000
I'm just putting some values.

180
00:09:24,000 --> 00:09:28,000
But once we train all these models, right, we will be getting these values. Here

181
00:09:28,000 --> 00:09:33,000
I'm giving you a rough idea of how each and every vector may look, okay. Now over here,

182
00:09:33,000 --> 00:09:34,000
with respect to apple

183
00:09:34,000 --> 00:09:37,000
and royal also, I will not be having much relationship.

184
00:09:37,000 --> 00:09:39,000
So let's say I'm putting minus 0.02.

185
00:09:39,000 --> 00:09:41,000
This will be plus 0.02.

186
00:09:41,000 --> 00:09:44,000
With respect to age and Apple, this may have a very good relationship, right.

187
00:09:44,000 --> 00:09:49,000
Because, uh, let's say if the apple is kept for ten days outside, you know, it

188
00:09:49,000 --> 00:09:50,000
may rot.

189
00:09:50,000 --> 00:09:50,000
Right?

190
00:09:50,000 --> 00:09:52,000
It may not have that nutritional value.

191
00:09:52,000 --> 00:09:55,000
So this, uh, with respect to age, it may have a direct relationship.

192
00:09:55,000 --> 00:09:57,000
Mango also may have a direct relationship.

193
00:09:57,000 --> 00:10:01,000
So we are going to have these values pretty much similar. With respect to food,

194
00:10:01,000 --> 00:10:03,000
yes, apple is a food item.

195
00:10:03,000 --> 00:10:04,000
Right.

196
00:10:04,000 --> 00:10:05,000
So this will probably have a good value.

197
00:10:05,000 --> 00:10:07,000
It may be 0.91.

198
00:10:07,000 --> 00:10:09,000
This is also 0.92 right.

199
00:10:09,000 --> 00:10:11,000
And similarly I'll be having different different vectors.

200
00:10:11,000 --> 00:10:11,000
Right.

201
00:10:11,000 --> 00:10:17,000
So here what we have done is that each and every vocabulary word is represented based on this feature

202
00:10:17,000 --> 00:10:18,000
representation.
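Just to make this table concrete, here is a minimal sketch in plain Python. This is only for intuition: the feature names are the ones from this example, any value I did not state clearly is a placeholder marked as assumed, and a real word2vec model learns its values during training and its features do not have readable names like this.
```python
# Toy feature representation from the walkthrough. These values are illustrative,
# not learned by any model. Columns (features): gender, royal, age, food.
FEATURES = ["gender", "royal", "age", "food"]
# Entries commented as "assumed" were not stated in the walkthrough; they are placeholders.
EMBEDDINGS = {
    "boy":   [-1.00,  0.01, 0.03, 0.01],  # food value assumed
    "girl":  [ 1.00,  0.02, 0.02, 0.01],  # food value assumed
    "king":  [-0.92,  0.95, 0.75, 0.01],  # food value assumed
    "queen": [ 0.93,  0.96, 0.68, 0.01],  # food value assumed
    "apple": [ 0.01, -0.02, 0.70, 0.91],  # age value assumed
    "mango": [ 0.23,  0.02, 0.70, 0.92],  # age value assumed
}
def feature_value(word, feature):
    """Look up how strongly a word relates to one named feature."""
    return EMBEDDINGS[word][FEATURES.index(feature)]
print(feature_value("king", "royal"))   # 0.95
print(feature_value("girl", "gender"))  # 1.0
```
So every word is just one row of numbers, one number per feature, and that row is the vector for the word.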

203
00:10:19,000 --> 00:10:19,000
Right.

204
00:10:19,000 --> 00:10:25,000
And here the feature representation need not only be 300 dimensions; it can be 100 dimensions.

205
00:10:25,000 --> 00:10:27,000
It can be different different dimensions.

206
00:10:27,000 --> 00:10:28,000
Right.

207
00:10:28,000 --> 00:10:33,000
But here I'm just showing you an example with respect to Google, and what these features will be, that will

208
00:10:33,000 --> 00:10:38,000
also not be exactly known, but just consider that I'm just giving you an intuitive example that yes,

209
00:10:38,000 --> 00:10:42,000
based on some relationship with respect to the word, you are able to get these specific vectors.

210
00:10:43,000 --> 00:10:47,000
Now, there are a lot of advantages with respect to this.

211
00:10:47,000 --> 00:10:49,000
You know why I'm saying this?

212
00:10:49,000 --> 00:10:56,000
Because suppose, let's say, I do a calculation like king minus man plus woman.

213
00:10:56,000 --> 00:11:01,000
If I do this calculation, and this is a famous calculation which is also written in the Google research paper,

214
00:11:01,000 --> 00:11:08,000
the output that I am going to get is something very close to queen.

215
00:11:09,000 --> 00:11:09,000
Okay.

216
00:11:11,000 --> 00:11:12,000
I will definitely get it as queen.

217
00:11:12,000 --> 00:11:15,000
Oh, but man and woman are not there in my vocabulary.

218
00:11:15,000 --> 00:11:16,000
It's okay.

219
00:11:16,000 --> 00:11:17,000
So let me just remove it.

220
00:11:18,000 --> 00:11:31,000
Suppose I say queen minus king plus boy, then look at the output that I'm going to get. See, queen is this vector,

221
00:11:31,000 --> 00:11:31,000
right.

222
00:11:31,000 --> 00:11:34,000
I'm subtracting king and then I'm adding

223
00:11:34,000 --> 00:11:35,000
boy.

224
00:11:35,000 --> 00:11:40,000
At the end of the day, you will be seeing that the result is much closer to girl than to any other word.

225
00:11:40,000 --> 00:11:42,000
So I'm going to get the output as girl here.
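If you want to try this kind of vector calculation on the toy values, here is a small sketch in plain Python. Please treat it as an assumption-heavy illustration: it uses only the gender, royal and age values from this example (the age values for apple and mango are assumed), the helper name analogy is hypothetical, and it computes queen minus king plus boy and then picks the closest remaining word by cosine similarity.
```python
import math
# Toy vectors (gender, royal, age); illustrative values, not learned by any model.
VECTORS = {
    "boy":   [-1.00,  0.01, 0.03],
    "girl":  [ 1.00,  0.02, 0.02],
    "king":  [-0.92,  0.95, 0.75],
    "queen": [ 0.93,  0.96, 0.68],
    "apple": [ 0.01, -0.02, 0.70],  # age value assumed
    "mango": [ 0.23,  0.02, 0.70],  # age value assumed
}
def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
def analogy(a, b, c):
    """Return the word closest to vector(a) - vector(b) + vector(c), excluding a, b and c."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(target, VECTORS[w]))
print(analogy("queen", "king", "boy"))  # girl
```
Subtracting king from queen keeps mostly the gender difference, and adding boy to that difference lands almost on top of girl.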

226
00:11:42,000 --> 00:11:44,000
I'm just doing the vector calculation.

227
00:11:44,000 --> 00:11:47,000
And this is the kind of relation we will be able to get.

228
00:11:47,000 --> 00:11:50,000
You know, this is an amazing thing.

229
00:11:50,000 --> 00:11:55,000
We are getting this kind of relation just by seeing these vectors, which have been provided by

230
00:11:55,000 --> 00:11:55,000
word2vec.

231
00:11:55,000 --> 00:11:57,000
Again, I'm not going to do the calculation here.

232
00:11:57,000 --> 00:12:01,000
I've just randomly stuffed some values, but in the real word2vec use case,

233
00:12:01,000 --> 00:12:04,000
you'll be seeing that if you do this kind of calculation you're going to get girl.

234
00:12:04,000 --> 00:12:11,000
And this I will show you practically also as we go ahead, once we use Google's

235
00:12:11,000 --> 00:12:15,000
word2vec, which is basically trained on about 100 billion words, which is quite amazing.

236
00:12:15,000 --> 00:12:15,000
Right?

237
00:12:15,000 --> 00:12:17,000
So let me give you another example.

238
00:12:17,000 --> 00:12:22,000
Let's say here, instead of using 300 dimensions, I'll represent every word by two dimensions.

239
00:12:22,000 --> 00:12:30,000
Let's say for king I have 0.95, 0.96, and man is represented by something like, uh, 0.95, 0.98.

240
00:12:30,000 --> 00:12:33,000
Let's let's represent like this, okay.

241
00:12:33,000 --> 00:12:35,000
I'm just giving some values, okay?

242
00:12:35,000 --> 00:12:38,000
Some meaningful values so that you'll get an idea about it.

243
00:12:38,000 --> 00:12:42,000
Let's say Queen is given as 0.96, let's say because this is the opposite of this.

244
00:12:42,000 --> 00:12:43,000
Right.

245
00:12:43,000 --> 00:12:47,000
And again this can be a similar keyword, uh, similar vocabulary.

246
00:12:47,000 --> 00:12:55,000
And with respect to woman, let's say that I'm having something like 0.94, 0.96. Then what

247
00:12:55,000 --> 00:13:03,000
I do is that if we do the calculation of king minus man plus woman, I'm going to get queen, right,

248
00:13:03,000 --> 00:13:04,000
as the output.

249
00:13:05,000 --> 00:13:10,000
Now what these vectors represent, that also you really need to understand. It is super, super important.

250
00:13:10,000 --> 00:13:14,000
And that is where I'm going to discuss about something called as cosine similarity.

251
00:13:14,000 --> 00:13:14,000
Okay.

252
00:13:15,000 --> 00:13:16,000
Cosine similarity.

253
00:13:16,000 --> 00:13:20,000
Super important topic with respect to understanding these things.

254
00:13:20,000 --> 00:13:21,000
Because if you understand these things and all.

255
00:13:21,000 --> 00:13:27,000
Now see, over here king is given by two values, right, 0.95 and 0.96.

256
00:13:27,000 --> 00:13:31,000
So obviously I can basically plot this in two dimensions.

257
00:13:31,000 --> 00:13:33,000
Let's say that I'm getting king over here.

258
00:13:34,000 --> 00:13:34,000
Okay.

259
00:13:34,000 --> 00:13:39,000
And in order to make you understand, let's say Queen is over here, okay.

260
00:13:39,000 --> 00:13:41,000
Or queen can come anywhere.

261
00:13:41,000 --> 00:13:43,000
Let's say man is over here, right?

262
00:13:43,000 --> 00:13:44,000
Man is over here.

263
00:13:44,000 --> 00:13:48,000
If I consider king with respect to gender, right,

264
00:13:48,000 --> 00:13:51,000
these two, king and man, are going to be the nearest words when compared to queen.

265
00:13:51,000 --> 00:13:52,000
Right?

266
00:13:52,000 --> 00:13:57,000
So how do I calculate the distance between this vector and this vector, which are provided in this

267
00:13:57,000 --> 00:13:57,000
form?

268
00:13:57,000 --> 00:14:01,000
All I do is that I try to probably calculate the distance like this.

269
00:14:02,000 --> 00:14:07,000
Okay, I try to find out the angle, and to find out the distance between two vectors, we

270
00:14:07,000 --> 00:14:14,000
use a distance formula which says it is nothing but one minus cosine similarity.

271
00:14:14,000 --> 00:14:16,000
Now what is cosine similarity?

272
00:14:16,000 --> 00:14:17,000
This is super important.

273
00:14:17,000 --> 00:14:23,000
To understand cosine similarity, I can basically say that cosine similarity is nothing but the cosine

274
00:14:23,000 --> 00:14:26,000
of the angle between these two vectors.

275
00:14:26,000 --> 00:14:29,000
Let's say the angle between these two vectors.

276
00:14:29,000 --> 00:14:32,000
Let's say, just taking an example, it is 45 degrees.

277
00:14:32,000 --> 00:14:37,000
Let's consider that. Then this is nothing but cos 45, and cos 45 is nothing but one by root two.

278
00:14:38,000 --> 00:14:41,000
And probably I think this is approximately equal to 0.7071.

279
00:14:41,000 --> 00:14:43,000
I've done the calculation.

280
00:14:43,000 --> 00:14:44,000
If it is wrong just let me know.

281
00:14:44,000 --> 00:14:44,000
Okay.

282
00:14:44,000 --> 00:14:51,000
But over here, then, the distance between these two vectors will be nothing but 1 - 0.7071.

283
00:14:51,000 --> 00:14:55,000
Now if I'm calculating the distance, it is about 0.29.

284
00:14:55,000 --> 00:15:01,000
So if I'm getting this 0.29 distance, right, I will say that okay, these particular words

285
00:15:01,000 --> 00:15:02,000
are almost similar.

286
00:15:02,000 --> 00:15:08,000
Let's say I have two more vectors, two different vectors. One of the vectors is basically

287
00:15:08,000 --> 00:15:11,000
over here, and the other vector is over here.

288
00:15:11,000 --> 00:15:13,000
Then the angle between these two is nothing but 90 degrees.

289
00:15:13,000 --> 00:15:20,000
In the case of 90 degrees, my distance will be nothing but 1 minus cos theta. Cos theta is nothing but

290
00:15:20,000 --> 00:15:24,000
cos 90, cos 90 is nothing but zero, and 1 - 0 is one.

291
00:15:24,000 --> 00:15:29,000
So I can definitely say this vector and this vector are completely different because the distance between

292
00:15:29,000 --> 00:15:30,000
them is one.

293
00:15:30,000 --> 00:15:34,000
If the distance is nearer to zero, I will say that they are almost similar vectors.

294
00:15:34,000 --> 00:15:36,000
Now in this case it is 0.29.

295
00:15:36,000 --> 00:15:38,000
I can say that okay, somewhat similar, right?

296
00:15:38,000 --> 00:15:42,000
Let's say I have one more vector which is at this same point only; then I can say that these two vectors

297
00:15:42,000 --> 00:15:49,000
are almost the same, because the angle between them is zero, cos zero is nothing but one, and one minus

298
00:15:49,000 --> 00:15:50,000
one is nothing but zero.

299
00:15:50,000 --> 00:15:51,000
Right.

300
00:15:51,000 --> 00:15:55,000
Because over here the angle between these two points is nothing but zero, right.

301
00:15:55,000 --> 00:15:56,000
There is no angle at all.

302
00:15:56,000 --> 00:15:56,000
Right.

303
00:15:56,000 --> 00:16:00,000
So if there is no angle then I will be able to find out the distance.

304
00:16:00,000 --> 00:16:06,000
In this case my distance will be nothing but 1 minus cos zero, and 1 minus cos zero is nothing but one

305
00:16:06,000 --> 00:16:08,000
minus one, which is nothing but zero.
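The three cases we just worked out, 45 degrees, 90 degrees and zero degrees, can be checked with a short sketch in plain Python. The function names here are my own choice for illustration, and the two-dimensional vectors are picked only because they make those exact angles.
```python
import math
def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
def cosine_distance(a, b):
    """Distance used in this walkthrough: 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(a, b)
print(round(cosine_distance([1, 0], [1, 1]), 4))  # 45 degrees -> 0.2929
print(round(cosine_distance([1, 0], [0, 1]), 4))  # 90 degrees -> 1.0
print(round(cosine_distance([1, 0], [3, 0]), 4))  # 0 degrees  -> 0.0
```
Notice that only the direction matters here, not the length: the last pair has different lengths but the same direction, so the distance is still zero.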

306
00:16:08,000 --> 00:16:15,000
Now if the distance is coming as zero, that basically means these two are the same word, right.

307
00:16:15,000 --> 00:16:19,000
So this is super important because recommendation also happens in this way itself.

308
00:16:19,000 --> 00:16:23,000
Now in recommendation let's say I have a movie which is like Avengers.

309
00:16:23,000 --> 00:16:25,000
Let's say Avengers is over here.

310
00:16:26,000 --> 00:16:28,000
Where do you think Iron Man will come?

311
00:16:28,000 --> 00:16:30,000
Will Iron Man come near this, or near this?

312
00:16:30,000 --> 00:16:33,000
At this particular point only, right.

313
00:16:33,000 --> 00:16:38,000
Iron Man will be coming there, and it will be based on different feature representations.

314
00:16:38,000 --> 00:16:38,000
Right?

315
00:16:39,000 --> 00:16:42,000
Whether it is a comic movie, whether it is an action movie.

316
00:16:42,000 --> 00:16:48,000
So action, comic, right, or comedy, these are all feature representations.

317
00:16:48,000 --> 00:16:54,000
Try to understand this: the movie name is basically my vocabulary word.

318
00:16:54,000 --> 00:16:57,000
It can be Avengers, it can be this one.
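The same idea can be sketched for recommendation in plain Python. Everything here is assumed for illustration: the two features (action, comedy), the numeric values, the third movie title and the helper name recommend are all hypothetical and do not come from any real recommendation system.
```python
import math
# Hypothetical movie vectors over two made-up features: (action, comedy).
MOVIES = {
    "Avengers":            [0.95, 0.30],
    "Iron Man":            [0.90, 0.35],
    "Some Comedy Movie":   [0.05, 0.95],  # placeholder title, not a real catalogue entry
}
def cosine_distance(a, b):
    """1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms
def recommend(title):
    """Return the other movie whose vector has the smallest cosine distance to title."""
    others = [m for m in MOVIES if m != title]
    return min(others, key=lambda m: cosine_distance(MOVIES[title], MOVIES[m]))
print(recommend("Avengers"))  # Iron Man
```
Because Avengers and Iron Man point in almost the same direction, the distance between them is close to zero, so that is the one that gets recommended.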

319
00:16:57,000 --> 00:16:58,000
Right?

320
00:16:58,000 --> 00:17:03,000
So I hope you are getting an idea about how word2vec is basically working.

321
00:17:03,000 --> 00:17:08,000
At the end of the day, we are basically creating a feature representation of every word, okay?

322
00:17:08,000 --> 00:17:09,000
And we are able to find it out.

323
00:17:09,000 --> 00:17:12,000
So yes, this was about word2vec.

324
00:17:12,000 --> 00:17:18,000
Now, next you need to understand how this feature representation is created

325
00:17:18,000 --> 00:17:22,000
and how this vector is basically created.

326
00:17:22,000 --> 00:17:23,000
How these vectors.

327
00:17:23,000 --> 00:17:28,000
Because here I have randomly written boy to gender is minus one, girl to gender is plus

328
00:17:28,000 --> 00:17:28,000
one.

329
00:17:28,000 --> 00:17:31,000
Because I said that okay, this may be the opposite one.

330
00:17:31,000 --> 00:17:38,000
Now you'll get an idea of how, in a deep learning neural network, basically a simple

331
00:17:38,000 --> 00:17:40,000
neural network, this entire word2vec is trained.

332
00:17:41,000 --> 00:17:43,000
And that is what I'm going to discuss in my next video.

333
00:17:43,000 --> 00:17:47,000
Then I'm going to also show you the practical implementation.

334
00:17:47,000 --> 00:17:53,000
So I hope you are able to understand this, uh, very good, amazing model developed

335
00:17:53,000 --> 00:17:57,000
by Google, a very good architecture.

336
00:17:57,000 --> 00:18:01,000
And we'll try to solve that in the upcoming video and I'll, I'll give you a clear idea about it.

337
00:18:01,000 --> 00:18:02,000
Okay.

338
00:18:02,000 --> 00:18:07,000
Just the prerequisite is that you really need to know about ANN, loss functions, optimizers and all.

339
00:18:07,000 --> 00:18:08,000
So yes, this was it.

340
00:18:08,000 --> 00:18:09,000
I will see you all in the next video.

341
00:18:09,000 --> 00:18:10,000
Thank you.

