1
00:00:00,000 --> 00:00:01,000
Hello guys!

2
00:00:01,000 --> 00:00:05,000
So I'm quite excited to start the NLP series, that is, natural language processing for machine learning.

3
00:00:05,000 --> 00:00:11,000
In this video we are going to see the entire roadmap like how you should go ahead and prepare for natural

4
00:00:11,000 --> 00:00:16,000
language processing. NLP is an amazing domain, an amazing area of tech.

5
00:00:16,000 --> 00:00:18,000
With respect to machine learning or deep learning.

6
00:00:18,000 --> 00:00:23,000
A lot of research is happening in natural language processing itself.

7
00:00:23,000 --> 00:00:29,000
So let me go ahead and share my screen, and let me explain

8
00:00:29,000 --> 00:00:32,000
the roadmap, like how you should basically go ahead with the preparation.

9
00:00:33,000 --> 00:00:37,000
So, to begin with, you probably already know machine learning.

10
00:00:37,000 --> 00:00:40,000
Okay, so in machine learning you have actually seen.

11
00:00:40,000 --> 00:00:43,000
Right, we solve two different kinds of problems.

12
00:00:43,000 --> 00:00:46,000
One is supervised, that is, supervised machine learning.

13
00:00:46,000 --> 00:00:51,000
And the other one that we solve is called unsupervised machine learning.

14
00:00:52,000 --> 00:00:58,000
Now, in most supervised machine learning use cases, we specifically solve two different kinds

15
00:00:58,000 --> 00:01:03,000
of problem statements like classification and regression.

16
00:01:03,000 --> 00:01:03,000
Right.

17
00:01:04,000 --> 00:01:06,000
In all these problem statements.

18
00:01:06,000 --> 00:01:07,000
Right.

19
00:01:07,000 --> 00:01:15,000
What we have seen is that, let's say, we have some specific set of features like F1, F2, F3, F4,

20
00:01:15,000 --> 00:01:17,000
like this, it can be any number of features.

21
00:01:17,000 --> 00:01:22,000
These features are usually called independent features.

22
00:01:22,000 --> 00:01:22,000
Right.

23
00:01:22,000 --> 00:01:26,000
So if I probably talk about this, these are my input features.

24
00:01:26,000 --> 00:01:29,000
Or I can also call these my independent features.

25
00:01:30,000 --> 00:01:32,000
Independent features.

26
00:01:32,000 --> 00:01:33,000
Right.

27
00:01:33,000 --> 00:01:38,000
And similarly I also have my output feature which is my dependent feature.

28
00:01:39,000 --> 00:01:39,000
Right.

29
00:01:40,000 --> 00:01:42,000
Which is my dependent feature.

30
00:01:43,000 --> 00:01:50,000
So our aim in a supervised machine learning model is this: with respect to these features, we

31
00:01:50,000 --> 00:01:54,000
will obviously have a lot of data points in the output.

32
00:01:54,000 --> 00:01:57,000
It can be a classification problem or a regression problem.

33
00:01:57,000 --> 00:02:04,000
And over here I may have continuous values, or I may have class labels like ones and zeros.

34
00:02:04,000 --> 00:02:04,000
It can be binary.

35
00:02:04,000 --> 00:02:05,000
It can be multi-class.

36
00:02:05,000 --> 00:02:06,000
Okay.

37
00:02:06,000 --> 00:02:10,000
So let's say I have these kinds of data points over here. What is our main aim?

38
00:02:10,000 --> 00:02:13,000
We usually create a model okay.

39
00:02:13,000 --> 00:02:15,000
We usually create a model.

40
00:02:15,000 --> 00:02:17,000
And we train the model with this data.

41
00:02:17,000 --> 00:02:18,000
Right.

42
00:02:18,000 --> 00:02:20,000
So we basically create a model.

43
00:02:20,000 --> 00:02:24,000
And we train this particular model with this specific data set.

44
00:02:24,000 --> 00:02:24,000
Right.

45
00:02:25,000 --> 00:02:31,000
With this specific dataset, our model will be able to make predictions

46
00:02:31,000 --> 00:02:33,000
whenever I give this kind of input data.

47
00:02:34,000 --> 00:02:40,000
Now in machine learning specifically, if I talk about these features, F1 can be a continuous

48
00:02:40,000 --> 00:02:46,000
feature, F2 can be a categorical feature, and there can also be different types of features

49
00:02:46,000 --> 00:02:47,000
over here.

50
00:02:47,000 --> 00:02:54,000
And in this scenario, let's say that some of our features are completely made up of

51
00:02:54,000 --> 00:02:55,000
text.

52
00:02:55,000 --> 00:03:00,000
Let's say one basic example that I really want to give is about spam classification.

53
00:03:00,000 --> 00:03:02,000
Let's say this is my example over here.

54
00:03:02,000 --> 00:03:04,000
Spam classification.

55
00:03:05,000 --> 00:03:11,000
Now in spam classification, what features may we have? Let's say my main aim

56
00:03:11,000 --> 00:03:20,000
is to detect whether an email that comes to me is spam or not spam.

57
00:03:20,000 --> 00:03:23,000
Okay, so let's say that this is a classification problem that I really want to solve.

58
00:03:24,000 --> 00:03:30,000
Now in this scenario, for the features that I may have, one feature is something

59
00:03:30,000 --> 00:03:32,000
called the email subject.

60
00:03:32,000 --> 00:03:33,000
Okay.

61
00:03:34,000 --> 00:03:37,000
I may have the next feature as email body.

62
00:03:38,000 --> 00:03:43,000
And my output feature is basically whether this mail is spam or ham.

63
00:03:43,000 --> 00:03:45,000
Ham basically means not spam.

64
00:03:45,000 --> 00:03:48,000
So let's say over here, I'll give you one example.

65
00:03:48,000 --> 00:03:50,000
Let's say the email subject is like billionaire.

66
00:03:50,000 --> 00:03:55,000
So here you can see that it is completely text, right? Now the email body.

67
00:03:55,000 --> 00:04:01,000
It can be like, you won a lottery of... okay, I'm just giving you an example.

68
00:04:01,000 --> 00:04:04,000
You won a lottery of a billion dollars.

69
00:04:04,000 --> 00:04:12,000
And I think you get these kinds of emails, right? Now, obviously, when you get this email in a real

70
00:04:12,000 --> 00:04:17,000
world scenario, we will be classifying this particular data point, and this will be spam.

71
00:04:17,000 --> 00:04:19,000
So I'll put the category as one.

72
00:04:19,000 --> 00:04:21,000
Or I can also put it as spam itself.

73
00:04:22,000 --> 00:04:29,000
Now one thing you can notice here is that in our input features we

74
00:04:29,000 --> 00:04:33,000
have continuous variables and we have categorical variables.

75
00:04:33,000 --> 00:04:38,000
And obviously we have different techniques to convert these categorical variables into numerical

76
00:04:38,000 --> 00:04:38,000
values.

77
00:04:38,000 --> 00:04:39,000
Right.

78
00:04:39,000 --> 00:04:41,000
There are techniques like one-hot encoding.

79
00:04:41,000 --> 00:04:44,000
There are techniques like target encoding and ordinal encoding.

80
00:04:44,000 --> 00:04:47,000
All those techniques are there, which we apply in feature engineering.

81
00:04:47,000 --> 00:04:55,000
But let's say that if my entire data is a text or a sentence like this, right?

82
00:04:55,000 --> 00:05:02,000
In this particular scenario, it will definitely not be that easy for a model to understand,

83
00:05:02,000 --> 00:05:02,000
right?

84
00:05:03,000 --> 00:05:06,000
Obviously, because the model cannot understand human language, right?

85
00:05:06,000 --> 00:05:07,000
Right.

86
00:05:07,000 --> 00:05:11,000
Now I have written in English; tomorrow it can be in Chinese, the day after tomorrow it can be in some

87
00:05:11,000 --> 00:05:12,000
other language.

88
00:05:12,000 --> 00:05:17,000
So the model is not directly capable of understanding this particular text.

89
00:05:17,000 --> 00:05:20,000
So what should we do in this particular scenario?

90
00:05:20,000 --> 00:05:22,000
So we have techniques.

91
00:05:22,000 --> 00:05:28,000
We have techniques where we can convert all this text into some meaningful vectors.

92
00:05:29,000 --> 00:05:30,000
Meaningful vectors.

93
00:05:30,000 --> 00:05:31,000
Now what are vectors?

94
00:05:31,000 --> 00:05:33,000
Vectors are just lists of numbers.

95
00:05:34,000 --> 00:05:40,000
But understand those vectors represent some meaningful information with respect to this particular text,

96
00:05:40,000 --> 00:05:40,000
okay.

97
00:05:40,000 --> 00:05:47,000
And whenever your input data is in the form of text or sentences, we use something called

98
00:05:47,000 --> 00:05:55,000
natural language processing so that we'll be able to process this particular data, and we'll be able

99
00:05:55,000 --> 00:05:59,000
to make the model understand it and solve use cases like spam classification.

100
00:05:59,000 --> 00:06:00,000
Right.

101
00:06:00,000 --> 00:06:04,000
So this is the entire context behind NLP and why it is so popular.

102
00:06:04,000 --> 00:06:07,000
Because nowadays you see a lot of examples.

103
00:06:07,000 --> 00:06:07,000
Alexa.

104
00:06:07,000 --> 00:06:08,000
Right.

105
00:06:08,000 --> 00:06:09,000
I think many people use Alexa.

106
00:06:09,000 --> 00:06:12,000
Many people use Google Home, right?

107
00:06:13,000 --> 00:06:18,000
Many people use some automated device. Let's say the AC is running, right?

108
00:06:18,000 --> 00:06:21,000
And you say, hey, switch off the AC, switch on the AC.

109
00:06:21,000 --> 00:06:27,000
How is that particular machine able to understand that? It is all because of natural language processing.

110
00:06:27,000 --> 00:06:33,000
Google is extensively doing some amazing research with respect to NLP, and they're coming up with some

111
00:06:33,000 --> 00:06:34,000
amazing things.

112
00:06:34,000 --> 00:06:40,000
Yes, we will be learning NLP with respect to machine learning, and as we go we will also try to

113
00:06:40,000 --> 00:06:41,000
learn with respect to deep learning.

114
00:06:41,000 --> 00:06:43,000
But just try to understand.

115
00:06:43,000 --> 00:06:47,000
In this video we are going to understand the roadmap, like how we should go ahead and prepare with

116
00:06:47,000 --> 00:06:48,000
respect to NLP.

117
00:06:48,000 --> 00:06:53,000
So again I'm going to write the roadmap of NLP okay.

118
00:06:54,000 --> 00:06:57,000
And again, we are also going to use different libraries.

119
00:06:58,000 --> 00:07:06,000
I'm just going to draw a pyramid kind of structure, and we will follow a bottom-to-top approach, okay?

120
00:07:06,000 --> 00:07:11,000
So let's go ahead and try to understand what the roadmap of NLP should be.

121
00:07:12,000 --> 00:07:16,000
To begin with, I'm just going to create a small block.

122
00:07:18,000 --> 00:07:19,000
Okay.

123
00:07:19,000 --> 00:07:26,000
To begin with, this first block is that you need to know one programming language.

124
00:07:26,000 --> 00:07:30,000
So let's say that I am going to go ahead with the Python programming language.

125
00:07:30,000 --> 00:07:32,000
Super important right.

126
00:07:32,000 --> 00:07:37,000
With the help of the Python programming language, we will obviously be able to solve a lot of NLP use cases.

127
00:07:37,000 --> 00:07:39,000
But when I go to step one.

128
00:07:39,000 --> 00:07:41,000
So this is basically the step one.

129
00:07:41,000 --> 00:07:46,000
Step one is basically called text pre-processing.

130
00:07:48,000 --> 00:07:54,000
And this text pre-processing will initially start with some basic things: whenever we have this

131
00:07:54,000 --> 00:08:00,000
kind of text data, what text pre-processing steps we need to do, and how we can clean

132
00:08:00,000 --> 00:08:01,000
that particular text data.

133
00:08:01,000 --> 00:08:04,000
All those things come under text pre-processing.

134
00:08:04,000 --> 00:08:05,000
Okay.

135
00:08:05,000 --> 00:08:10,000
The techniques that we are going to apply here include tokenization.

136
00:08:10,000 --> 00:08:16,000
Tokenization is a concept wherein you convert a paragraph into sentences, a sentence into words,

137
00:08:16,000 --> 00:08:17,000
and so on.
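A minimal sketch of tokenization, assuming plain Python with only the standard `re` module rather than NLTK; the function names `sentence_tokenize` and `word_tokenize` and the sample text are hypothetical, and the splitting rules are deliberately rough:
```python
import re
def sentence_tokenize(paragraph):
    # Split after ., ! or ? when followed by whitespace (a rough rule, not a full tokenizer).
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]
def word_tokenize(sentence):
    # Keep runs of letters, digits and apostrophes as words; punctuation is dropped.
    return re.findall(r"[A-Za-z0-9']+", sentence)
paragraph = "You won a lottery. Claim your prize now!"
sentences = sentence_tokenize(paragraph)  # paragraph -> sentences
words = [word_tokenize(s) for s in sentences]  # each sentence -> words
print(sentences)
print(words)
```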

138
00:08:18,000 --> 00:08:21,000
We are also going to learn techniques like lemmatization.

139
00:08:22,000 --> 00:08:24,000
We are going to learn techniques like stemming.

140
00:08:25,000 --> 00:08:28,000
We are also going to be introduced to something called stopwords.

141
00:08:29,000 --> 00:08:33,000
All these things will be covered in text pre-processing part one.
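As a rough illustration of stopword removal, assuming a tiny hand-written stopword set (`STOPWORDS` and `remove_stopwords` are hypothetical names; real libraries ship much longer lists):
```python
# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "of", "is", "you", "to"}
def remove_stopwords(tokens):
    # Compare in lowercase so "You" and "you" are treated the same.
    return [t for t in tokens if t.lower() not in STOPWORDS]
tokens = ["You", "won", "a", "lottery", "of", "billion", "dollars"]
filtered = remove_stopwords(tokens)
print(filtered)
```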

142
00:08:33,000 --> 00:08:35,000
So this is basically the step one.

143
00:08:35,000 --> 00:08:36,000
Okay.

144
00:08:36,000 --> 00:08:38,000
So I'm just going to write this as step one.

145
00:08:39,000 --> 00:08:40,000
Super important.

146
00:08:40,000 --> 00:08:43,000
And initially we'll be starting with this okay.

147
00:08:43,000 --> 00:08:46,000
And again, the entire detailed syllabus.

148
00:08:46,000 --> 00:08:50,000
Obviously, I'm going to go video by video to help you understand each and every thing.

149
00:08:50,000 --> 00:08:54,000
But as a thousand-foot overview, I'm just giving you this.

150
00:08:54,000 --> 00:08:57,000
There are a lot of topics inside this which we really need to be familiar with.

151
00:08:57,000 --> 00:08:59,000
Now coming to the second one.

152
00:08:59,000 --> 00:08:59,000
Okay.

153
00:08:59,000 --> 00:09:04,000
So the second one, we basically call it text pre-processing.

154
00:09:04,000 --> 00:09:05,000
Step two okay.

155
00:09:05,000 --> 00:09:09,000
So again this is also a text pre-processing technique.

156
00:09:09,000 --> 00:09:15,000
But we increase the complexity a little bit, and it tries to solve more problems.

157
00:09:15,000 --> 00:09:23,000
So here, in text pre-processing part two, we focus on converting the text data into vectors.

158
00:09:23,000 --> 00:09:25,000
So here I'm just going to write it down.

159
00:09:25,000 --> 00:09:26,000
This is my step two.

160
00:09:27,000 --> 00:09:29,000
Here I'll write.

161
00:09:29,000 --> 00:09:32,000
Or I can basically write text pre-processing part two.

162
00:09:32,000 --> 00:09:38,000
Here we are going to learn topics like bag of words, okay, and TF-IDF.

163
00:09:40,000 --> 00:09:44,000
We're also going to learn things like unigrams and bigrams.
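A small sketch of bag of words over unigrams and bigrams, assuming whitespace-split lowercase text and standard-library Python only; `ngrams`, `bag_of_words` and the two sample documents are hypothetical:
```python
from collections import Counter
def ngrams(tokens, n):
    # Slide a window of size n over the token list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
def bag_of_words(docs, n=1):
    # Build a sorted vocabulary of n-grams, then count each one per document.
    tokenized = [ngrams(d.lower().split(), n) for d in docs]
    vocab = sorted({g for doc in tokenized for g in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[g] for g in vocab])
    return vocab, vectors
docs = ["you won a lottery", "you won you won"]
vocab, vectors = bag_of_words(docs)  # unigram counts
bigram_vocab, bigram_vectors = bag_of_words(docs, n=2)  # bigram counts
print(vocab, vectors)
print(bigram_vocab, bigram_vectors)
```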

164
00:09:44,000 --> 00:09:47,000
There are a lot of concepts like this which we are going to cover.

165
00:09:47,000 --> 00:09:50,000
This is again a text pre-processing technique.

166
00:09:50,000 --> 00:09:52,000
But again, understand what the main aim is.

167
00:09:52,000 --> 00:09:57,000
The main aim is basically to convert or let me just write like this.

168
00:09:57,000 --> 00:09:58,000
Step one and step two.

169
00:09:58,000 --> 00:10:01,000
Instead of writing like this, I will just write it in a simpler way.

170
00:10:01,000 --> 00:10:03,000
Okay, so here, in step one,

171
00:10:03,000 --> 00:10:08,000
We are basically focusing on cleaning the text, right?

172
00:10:09,000 --> 00:10:11,000
Cleaning the inputs.

173
00:10:12,000 --> 00:10:13,000
Cleaning the input.

174
00:10:13,000 --> 00:10:22,000
In the second step, we are trying to focus on converting our input text to vectors.

175
00:10:22,000 --> 00:10:30,000
And this is a super important step, because these vectors should make sure that the context

176
00:10:30,000 --> 00:10:32,000
of the statement gets captured, right?

177
00:10:32,000 --> 00:10:39,000
So at the end of the day, in NLP, with whatever techniques are there right now, like Transformers,

178
00:10:39,000 --> 00:10:42,000
BERT is also there, and these are quite advanced techniques.

179
00:10:42,000 --> 00:10:47,000
If you are able to convert this input text into some meaningful vectors, you will be able to solve

180
00:10:47,000 --> 00:10:49,000
those particular use cases in a better manner, right?

181
00:10:49,000 --> 00:10:55,000
So this is basically the second step, where we focus on converting the input text into vectors.

182
00:10:55,000 --> 00:11:00,000
Still, there are more advanced text preprocessing techniques, which I will put in the third step.

183
00:11:00,000 --> 00:11:02,000
And here we focus on.

184
00:11:05,000 --> 00:11:09,000
Here I'm just going to write it as text pre-processing.

185
00:11:11,000 --> 00:11:13,000
With respect to part three.

186
00:11:13,000 --> 00:11:14,000
Okay.

187
00:11:14,000 --> 00:11:19,000
And here, what do we focus on? We use more advanced techniques, like

188
00:11:19,000 --> 00:11:22,000
Word2Vec. Word2Vec.

189
00:11:23,000 --> 00:11:25,000
Average Word2Vec.

190
00:11:27,000 --> 00:11:35,000
And this is also a technique to convert the input text into vectors.

191
00:11:37,000 --> 00:11:39,000
Now you may be thinking: Krish, in the second step

192
00:11:39,000 --> 00:11:40,000
also you have written the same thing.

193
00:11:40,000 --> 00:11:42,000
In the third step also you have written the same thing.

194
00:11:42,000 --> 00:11:49,000
Yes, guys, understand as we go from the second step to the third step, this conversion of the input

195
00:11:49,000 --> 00:11:55,000
text to vectors is better than the approach that we use in BoW, that is, bag of words,

196
00:11:55,000 --> 00:11:57,000
TF-IDF, unigrams, and bigrams.

197
00:11:57,000 --> 00:11:58,000
Right?

198
00:11:58,000 --> 00:12:01,000
But as a learner we really need to know all these steps.

199
00:12:01,000 --> 00:12:03,000
Right now.

200
00:12:03,000 --> 00:12:07,000
Over here we focus more on techniques like Word2Vec and average Word2Vec.

201
00:12:07,000 --> 00:12:12,000
Here, Word2Vec and average Word2Vec are again a kind of deep learning technique.

202
00:12:12,000 --> 00:12:16,000
But we will try to learn this and understand how it works.
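A rough sketch of the averaging part only, assuming word vectors already exist; the 3-dimensional values in `toy_vectors` are made up for illustration and are not the output of a trained Word2Vec model, and `average_vector` is a hypothetical helper:
```python
# Made-up 3-dimensional word vectors; a real Word2Vec model learns these from a corpus.
toy_vectors = {
    "you": [0.0, 1.0, 2.0],
    "won": [2.0, 1.0, 0.0],
    "lottery": [4.0, 4.0, 4.0],
}
def average_vector(tokens, vectors):
    # Average the vectors of known words, dimension by dimension; unknown words are skipped.
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    return [sum(dim) / len(known) for dim in zip(*known)]
# "a" has no vector here, so only three words contribute to the average.
sentence_vector = average_vector(["you", "won", "a", "lottery"], toy_vectors)
print(sentence_vector)
```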

203
00:12:16,000 --> 00:12:17,000
Okay.

204
00:12:17,000 --> 00:12:21,000
Now coming to the next step that we focus on right.

205
00:12:21,000 --> 00:12:30,000
If we continue understanding the entire roadmap here, we also focus on understanding RNN, LSTM, RNN,

206
00:12:32,000 --> 00:12:35,000
GRU, okay?

207
00:12:35,000 --> 00:12:38,000
GRU RNN. Now again,

208
00:12:38,000 --> 00:12:44,000
Guys, these are deep learning techniques which are used for handling or solving text-related

209
00:12:44,000 --> 00:12:49,000
use cases like spam classification, text summarization, and many more things.

210
00:12:49,000 --> 00:12:55,000
So again, over here, these are some neural networks you should be familiar with in the roadmap.

211
00:12:55,000 --> 00:13:00,000
When we enter the deep learning part, this is super important to understand.

212
00:13:00,000 --> 00:13:04,000
And again, as I said, this is a part of deep learning.

213
00:13:04,000 --> 00:13:05,000
Okay.

214
00:13:05,000 --> 00:13:09,000
But since I'm writing about the roadmap, I really need to mention all those things.

215
00:13:09,000 --> 00:13:14,000
Now coming to the next one, there is also a technique called word embedding.

216
00:13:14,000 --> 00:13:15,000
Okay.

217
00:13:15,000 --> 00:13:21,000
So this is also an amazing way to convert input text into vectors internally.

218
00:13:21,000 --> 00:13:27,000
So this particular text pre-processing also uses techniques like word embedding, okay?

219
00:13:27,000 --> 00:13:29,000
So this is also a technique.

220
00:13:29,000 --> 00:13:32,000
But this technique is basically called word embeddings.

221
00:13:32,000 --> 00:13:36,000
And word embeddings internally use some amount of Word2Vec.

222
00:13:36,000 --> 00:13:39,000
But we can also train our own word embeddings.

223
00:13:39,000 --> 00:13:40,000
Okay.

224
00:13:40,000 --> 00:13:44,000
So again, this is a technique of converting input text into vectors.

225
00:13:44,000 --> 00:13:48,000
Now similarly we have techniques like Transformers and BERT also.

226
00:13:48,000 --> 00:13:52,000
So coming to the next one, this is basically the Transformer.

227
00:13:53,000 --> 00:13:55,000
Again this is an advanced technique.

228
00:13:55,000 --> 00:14:00,000
I just really wanted to mention all these things to you so that you will be able to understand.

229
00:14:00,000 --> 00:14:07,000
So if you really want to become a pro, you really need to follow this particular path and try

230
00:14:07,000 --> 00:14:10,000
to complete everything up to BERT and try to see its applications.

231
00:14:10,000 --> 00:14:16,000
But as we go up from bottom to top, the accuracy of the model generally keeps on increasing.

232
00:14:16,000 --> 00:14:17,000
Because.

233
00:14:17,000 --> 00:14:20,000
And remember one more thing: the size of the model also increases, right?

234
00:14:20,000 --> 00:14:24,000
Whichever machine learning models or deep learning models you are trying to create

235
00:14:24,000 --> 00:14:30,000
to solve NLP use cases, their size will keep on increasing as you go towards Transformers and BERT.

236
00:14:30,000 --> 00:14:36,000
Now initially, as I said, we are going to learn NLP with respect to machine learning,

237
00:14:36,000 --> 00:14:38,000
that is, NLP for machine learning.

238
00:14:38,000 --> 00:14:43,000
So what are we going to focus on? We are going to focus on these three things first, and

239
00:14:43,000 --> 00:14:45,000
then we'll start deep learning.

240
00:14:45,000 --> 00:14:47,000
We will focus on these three.

241
00:14:47,000 --> 00:14:51,000
Okay so here I'm just going to write this important thing.

242
00:14:51,000 --> 00:14:56,000
That is, all these three steps will be part of machine learning.

243
00:14:56,000 --> 00:15:01,000
And now when I say machine learning we will be using libraries like NLTK.

244
00:15:02,000 --> 00:15:08,000
And there are also libraries like spaCy, which will actually help us perform all the tasks which

245
00:15:08,000 --> 00:15:10,000
are listed over here in the bottom three.

246
00:15:10,000 --> 00:15:11,000
Right.

247
00:15:11,000 --> 00:15:14,000
But we'll focus on one library, which is called NLTK.

248
00:15:14,000 --> 00:15:18,000
So if you know NLTK, I think learning spaCy is also very easy.

249
00:15:19,000 --> 00:15:26,000
In the case of deep learning we will usually go with libraries like TensorFlow or PyTorch.

250
00:15:26,000 --> 00:15:28,000
So both libraries are quite amazing with respect to this.

251
00:15:28,000 --> 00:15:30,000
As you know, TensorFlow is open source.

252
00:15:30,000 --> 00:15:32,000
PyTorch is open source.

253
00:15:32,000 --> 00:15:33,000
TensorFlow was created by Google.

254
00:15:33,000 --> 00:15:36,000
PyTorch was created by Facebook.

255
00:15:36,000 --> 00:15:36,000
Right?

256
00:15:36,000 --> 00:15:39,000
But at the end of the day, what is the main aim?

257
00:15:39,000 --> 00:15:46,000
Your input data is in the form of text data, and you basically have to perform some amazing kind of

258
00:15:46,000 --> 00:15:52,000
text preprocessing where you convert the input text data into vectors and you are able to solve amazing

259
00:15:52,000 --> 00:15:55,000
use cases of NLP using both machine learning and deep learning.

260
00:15:55,000 --> 00:16:00,000
And this is a brief idea about the entire roadmap, like how we are going to prepare.

261
00:16:01,000 --> 00:16:05,000
In the next video, I'm going to discuss some of the amazing use cases, that is, what we can

262
00:16:05,000 --> 00:16:07,000
basically do with the help of NLP.

263
00:16:07,000 --> 00:16:08,000
Right?

264
00:16:08,000 --> 00:16:11,000
So all those things we will be covering in the upcoming video.

265
00:16:11,000 --> 00:16:13,000
So yes, I will see you all in the next video.

266
00:16:13,000 --> 00:16:13,000
Thank you.