WEBVTT

00:00.820 --> 00:01.840
All right, everyone.

00:02.530 --> 00:09.190
So this is section six, where we are going to start learning about the real natural language processing

00:09.190 --> 00:09.700
stuff.

00:10.480 --> 00:14.630
So let's get started with what we are going to learn in this section.

00:15.730 --> 00:19.870
So this section is all about the basic NLP techniques

00:20.260 --> 00:21.220
we will be learning.

00:23.230 --> 00:26.410
So while analyzing data, the first method

00:26.530 --> 00:32.680
you could take and apply on your text data is nothing but tokenization.

00:34.650 --> 00:38.130
Then we'll move towards finding the root of the words in your data:

00:38.730 --> 00:40.410
stemming and lemmatization.

00:42.950 --> 00:48.980
How you can remove all those unimportant words, like stop words, from your text.

00:51.120 --> 00:55.440
How to apply some kind of vocabulary and rule-based matching.

00:57.070 --> 01:04.450
How you can tag each word of your text with its part of speech, which is called part-of-speech tagging.

01:05.110 --> 01:13.060
Once you assign some kind of noun or verb to some particular word in your sentence, it makes much

01:13.060 --> 01:16.550
more sense for further interpretation of your text.

01:16.990 --> 01:23.590
And if you just flatten all those words, the meaning of the text may be completely different,

01:24.020 --> 01:27.570
depending on the context in which we are using those words.

01:28.180 --> 01:30.790
So that is the kind of analysis we will do.

01:30.910 --> 01:38.220
And that is called part-of-speech tagging: whether your individual word is a noun, a pronoun, and so on.

01:38.480 --> 01:39.080
All right.

01:39.150 --> 01:39.460
Well.

01:41.440 --> 01:47.270
Named entity recognition is one of the very important and useful things while analyzing your text:

01:47.720 --> 01:49.520
whether some particular word,

01:50.030 --> 01:51.500
let's say it's Himalaya,

01:52.010 --> 01:54.470
is something like a place or a mountain.

01:54.860 --> 01:56.050
Let's say some

01:56.210 --> 01:58.640
other word is a name,

01:58.850 --> 02:02.780
so it will be a person's name or not.

02:03.620 --> 02:06.670
If it is, its entity will be a person.

02:07.460 --> 02:13.700
So this way of identifying the entity of every single token we got earlier,

02:14.540 --> 02:17.120
that is called named entity recognition.

02:17.910 --> 02:20.870
And then we will see sentence segmentation.

02:22.010 --> 02:25.760
So these are some of the basics of natural language processing

02:25.940 --> 02:27.920
we will be learning in this section.

02:28.760 --> 02:30.050
So see you in the next video.

02:30.260 --> 02:37.190
We'll get started with our first, very important step for text analysis: tokenization.