1
00:00:00,000 --> 00:00:00,000
Hello guys.

2
00:00:00,000 --> 00:00:04,000
So we are going to continue the discussion with respect to Lang Chin.

3
00:00:04,000 --> 00:00:09,000
Uh, in this video, we are going to discuss about some of the very important components of Lang Chin,

4
00:00:09,000 --> 00:00:14,000
uh, that we use while generating a gen AI application.

5
00:00:15,000 --> 00:00:20,000
Already in our previous video, we have seen the entire ecosystem of Lang Chin, and we had also done

6
00:00:20,000 --> 00:00:22,000
the setup of Lang Smith.

7
00:00:22,000 --> 00:00:28,000
Uh, and, uh, before I go ahead and do the coding, you know, and start using this library, uh,

8
00:00:28,000 --> 00:00:33,000
we will just try to understand what are the most common components that we will be specifically using.

9
00:00:33,000 --> 00:00:40,000
And in this video, I'll be taking an example of a Rag application that is nothing but retrieval augmented

10
00:00:40,000 --> 00:00:41,000
generation.

11
00:00:41,000 --> 00:00:43,000
Now what does this Rag actually means?

12
00:00:43,000 --> 00:00:46,000
Is that and we are just going to create a lot of projects as we go ahead.

13
00:00:46,000 --> 00:00:48,000
But let me just give you a brief idea about Rag.

14
00:00:48,000 --> 00:00:51,000
What exactly is a Rag application.

15
00:00:51,000 --> 00:00:57,000
Let's say uh, and by just seeing this particular diagram, you'll be able to understand okay.

16
00:00:57,000 --> 00:01:03,000
Now drag augmented generation basically means let's say you have a specific data source okay.

17
00:01:03,000 --> 00:01:07,000
Now this data source may be some kind of PDF file.

18
00:01:08,000 --> 00:01:09,000
Right.

19
00:01:09,000 --> 00:01:12,000
Let's say you have thousands and thousands of PDF files.

20
00:01:12,000 --> 00:01:18,000
Now if you want to make this PDF file as a document Q&A application?

21
00:01:18,000 --> 00:01:23,000
That basically means whenever I ask a question automatically, my gen AI application.

22
00:01:23,000 --> 00:01:28,000
Okay, my let's say if I am probably creating a gen AI app over here.

23
00:01:28,000 --> 00:01:31,000
Okay, so this is my gen app.

24
00:01:31,000 --> 00:01:39,000
What I will do is that if a user asks any query, it should be able to probably send this query to this

25
00:01:39,000 --> 00:01:40,000
entire data source.

26
00:01:40,000 --> 00:01:43,000
And from this PDF we should be able to get the response.

27
00:01:44,000 --> 00:01:48,000
Okay, so this in short is the entire process of Rag.

28
00:01:48,000 --> 00:01:56,000
But again I've just given you a 10,000 height overview right in 10,000ft height overview.

29
00:01:56,000 --> 00:02:01,000
So that is the reason I just told that, hey, we have some kind of dataset where we have thousands

30
00:02:01,000 --> 00:02:02,000
and thousands of PDF.

31
00:02:02,000 --> 00:02:07,000
And now what I'm doing is that I'm creating a generative AI application, uh, this generative AI application,

32
00:02:07,000 --> 00:02:14,000
what task it performs is that whenever a input is given from a user or regarding any question from this

33
00:02:14,000 --> 00:02:19,000
particular PDF, I should be able to get it by having a conversation with this entire data source,

34
00:02:19,000 --> 00:02:19,000
right?

35
00:02:19,000 --> 00:02:21,000
That is nothing but tons of PDF that you have.

36
00:02:21,000 --> 00:02:28,000
Okay, this is a just a basic example, but we'll discuss more about it now if I just consider this

37
00:02:28,000 --> 00:02:29,000
application okay.

38
00:02:29,000 --> 00:02:34,000
And most of the common components that we use in engine to develop this kind of application and similar

39
00:02:34,000 --> 00:02:36,000
kind of applications itself.

40
00:02:36,000 --> 00:02:40,000
So we'll try to understand step by step how we will probably be developing.

41
00:02:40,000 --> 00:02:44,000
And then along with that, what all components are used in each and every step.

42
00:02:44,000 --> 00:02:45,000
We'll try to understand it okay.

43
00:02:45,000 --> 00:02:51,000
So initially what we do is that whenever I have this kind of use case, first of all we need to load

44
00:02:51,000 --> 00:02:53,000
the data set right now.

45
00:02:53,000 --> 00:02:57,000
Loading the data set will be from various data sources.

46
00:02:57,000 --> 00:02:57,000
Okay.

47
00:02:57,000 --> 00:03:00,000
Now this data source can be anything.

48
00:03:00,000 --> 00:03:02,000
So and that is the reason we have used this diagram.

49
00:03:02,000 --> 00:03:04,000
It can be a PDF.

50
00:03:04,000 --> 00:03:06,000
It can be a Excel file.

51
00:03:06,000 --> 00:03:08,000
It can be a JSON file.

52
00:03:08,000 --> 00:03:09,000
It can be an image.

53
00:03:09,000 --> 00:03:10,000
It can be videos.

54
00:03:10,000 --> 00:03:11,000
It can be URLs.

55
00:03:11,000 --> 00:03:12,000
Right.

56
00:03:12,000 --> 00:03:18,000
So initially whenever we develop this kind of application the first step that we specifically use is

57
00:03:18,000 --> 00:03:19,000
data ingestion.

58
00:03:20,000 --> 00:03:24,000
So this is the most important component in launching okay.

59
00:03:24,000 --> 00:03:26,000
And in the upcoming video.

60
00:03:26,000 --> 00:03:33,000
In the next video I will try to show you, with the help of Lang Chain, how you will be able to perform

61
00:03:33,000 --> 00:03:36,000
data ingestion with respect to various data sources.

62
00:03:36,000 --> 00:03:40,000
Okay, so that is the first component that we are going to discuss.

63
00:03:40,000 --> 00:03:46,000
Then what we do is that after taking after after basically using this entire data, like let's say we

64
00:03:46,000 --> 00:03:52,000
have ingested this data, we have loaded this particular data, then the next step will be that we will

65
00:03:52,000 --> 00:03:56,000
divide this data will split this data into smaller chunks.

66
00:03:56,000 --> 00:03:56,000
Okay.

67
00:03:56,000 --> 00:04:05,000
So here what we are doing, we will take this entire data and we will split into text chunks.

68
00:04:05,000 --> 00:04:06,000
Okay.

69
00:04:06,000 --> 00:04:07,000
It can be text chunks.

70
00:04:07,000 --> 00:04:08,000
It can be document chunks.

71
00:04:08,000 --> 00:04:11,000
The reason is very much simple why we are basically doing this.

72
00:04:11,000 --> 00:04:17,000
Because see, in the later stages we will be specifically using LM models.

73
00:04:17,000 --> 00:04:18,000
Or it can be multi model.

74
00:04:18,000 --> 00:04:19,000
Right.

75
00:04:19,000 --> 00:04:27,000
And one of the very important uh property of this particular LM model is that it has some limitation

76
00:04:27,000 --> 00:04:31,000
limitation with respect to context.

77
00:04:31,000 --> 00:04:37,000
So for different different LM models that we will be seeing like OpenAI or Google Gemini Pro or any

78
00:04:37,000 --> 00:04:41,000
different kind of models that are even available in hugging face open source models like llama three.

79
00:04:41,000 --> 00:04:45,000
They have some limitation with respect to the context size okay.

80
00:04:46,000 --> 00:04:51,000
So based on this particular context size, what we will be doing is that there is a restriction the

81
00:04:51,000 --> 00:04:57,000
size like uh, if my LM model I can only give this much maximum context size of data so that it will

82
00:04:57,000 --> 00:05:02,000
be able to understand from where it needs to pick the query or response, and it needs to give it as

83
00:05:02,000 --> 00:05:03,000
an output.

84
00:05:03,000 --> 00:05:03,000
Right.

85
00:05:03,000 --> 00:05:06,000
So that is the reason what we do is that we divide this particular data.

86
00:05:06,000 --> 00:05:06,000
Right.

87
00:05:06,000 --> 00:05:08,000
And this data can be having 1000 PDFs.

88
00:05:08,000 --> 00:05:13,000
If you read all the all the data from a thousand PDFs, it can become a huge data itself.

89
00:05:13,000 --> 00:05:17,000
So we take that data and we'll split that into text chunks okay.

90
00:05:17,000 --> 00:05:20,000
So that is the second step that you will be seeing.

91
00:05:20,000 --> 00:05:24,000
And splitting this chunks is done by various other techniques.

92
00:05:24,000 --> 00:05:28,000
So here I will just consider this as a data transformation technique.

93
00:05:28,000 --> 00:05:32,000
So this is nothing, but this is my data transformation here.

94
00:05:32,000 --> 00:05:36,000
My main aim is to transform the data or divide this data into chunks.

95
00:05:36,000 --> 00:05:36,000
Okay.

96
00:05:36,000 --> 00:05:41,000
So this is the second important, uh, component that we will be doing.

97
00:05:41,000 --> 00:05:45,000
And again, based on our upcoming videos we will discuss each and every components.

98
00:05:45,000 --> 00:05:45,000
Okay.

99
00:05:45,000 --> 00:05:50,000
Now coming to the third and in the important step which is called as embed.

100
00:05:50,000 --> 00:05:51,000
Okay.

101
00:05:52,000 --> 00:05:54,000
Embed is nothing, but it is just embedding okay.

102
00:05:54,000 --> 00:05:57,000
So here we are basically going to do text embedding.

103
00:05:57,000 --> 00:05:59,000
Now what is text embedding basically mean.

104
00:05:59,000 --> 00:06:05,000
We will take this entire text and we will convert into vectors okay.

105
00:06:05,000 --> 00:06:07,000
Now why do we require vectors.

106
00:06:07,000 --> 00:06:13,000
Because see whenever we query this kind of data source you know unless and until we don't convert this

107
00:06:13,000 --> 00:06:20,000
text into vectors over there, algorithms such as similarity search because see, one of the most important,

108
00:06:21,000 --> 00:06:26,000
uh, algorithm that gets applied when we are working with this kind of text data is nothing but similarity

109
00:06:26,000 --> 00:06:27,000
search okay.

110
00:06:28,000 --> 00:06:30,000
Similarity search.

111
00:06:30,000 --> 00:06:33,000
And it will basically be using a cosine search over here.

112
00:06:33,000 --> 00:06:38,000
It will try to understand based on the vectors which are the similar context, which are the similar

113
00:06:38,000 --> 00:06:39,000
sentence that I'm actually searching for.

114
00:06:39,000 --> 00:06:42,000
And based on that, it will retrieve those particular context.

115
00:06:42,000 --> 00:06:42,000
Okay.

116
00:06:42,000 --> 00:06:45,000
So that is the reason we basically use embedding techniques.

117
00:06:45,000 --> 00:06:49,000
Now in this embedding techniques you have different different kinds of embedding.

118
00:06:49,000 --> 00:06:51,000
OpenAI has OpenAI embeddings.

119
00:06:51,000 --> 00:06:55,000
You also have open source embeddings for different different model.

120
00:06:55,000 --> 00:06:57,000
Even in hugging face you'll be seeing a lot of open source model.

121
00:06:57,000 --> 00:06:57,000
It.

122
00:06:57,000 --> 00:07:01,000
All of them specifically has its own embedding techniques.

123
00:07:01,000 --> 00:07:04,000
Okay, even Google Gemini Pro has its own embedding techniques.

124
00:07:04,000 --> 00:07:08,000
So we will again be seeing that whenever we are developing that particular application.

125
00:07:08,000 --> 00:07:12,000
So this was the the third step now coming to the fourth step.

126
00:07:12,000 --> 00:07:15,000
So here you can see I will take this text I'll convert this into vectors.

127
00:07:15,000 --> 00:07:17,000
The vectors will look something like this okay.

128
00:07:18,000 --> 00:07:20,000
Now this vectors.

129
00:07:20,000 --> 00:07:23,000
Once we convert it we really need to store it somewhere.

130
00:07:23,000 --> 00:07:24,000
Right.

131
00:07:24,000 --> 00:07:26,000
Because we can't just keep it in our local file.

132
00:07:26,000 --> 00:07:26,000
Right.

133
00:07:26,000 --> 00:07:29,000
So for this we will be using a store.

134
00:07:29,000 --> 00:07:30,000
And this store is nothing.

135
00:07:30,000 --> 00:07:35,000
But this is basically called as vector store database.

136
00:07:36,000 --> 00:07:36,000
Okay.

137
00:07:36,000 --> 00:07:40,000
Vector store DB we will be seeing different different vector store db.

138
00:07:41,000 --> 00:07:44,000
First example is nothing but we will go ahead and see fires.

139
00:07:44,000 --> 00:07:46,000
Then there is something called as chroma DB.

140
00:07:48,000 --> 00:07:51,000
In our course we'll also be seeing Astra DB.

141
00:07:52,000 --> 00:07:55,000
Okay so different different vector databases.

142
00:07:55,000 --> 00:08:01,000
We'll be seeing how we can basically save this entire data into this kind of vector database that we

143
00:08:01,000 --> 00:08:02,000
are going to see.

144
00:08:02,000 --> 00:08:02,000
Okay.

145
00:08:02,000 --> 00:08:05,000
And that is what this is exactly a vector database.

146
00:08:05,000 --> 00:08:10,000
And what you can also do is that from this vector database you can also directly query anything that

147
00:08:10,000 --> 00:08:11,000
you want.

148
00:08:11,000 --> 00:08:16,000
But the result or the response that you are basically going to get will be the context information.

149
00:08:16,000 --> 00:08:17,000
Okay.

150
00:08:17,000 --> 00:08:21,000
It will be the context information, and it usually uses the similarity search.

151
00:08:21,000 --> 00:08:22,000
Okay.

152
00:08:22,000 --> 00:08:24,000
So this is my fourth step.

153
00:08:24,000 --> 00:08:31,000
That usually happens now once I save this entirely into the vector db vector store db okay.

154
00:08:31,000 --> 00:08:34,000
Now we go to the second most important step okay.

155
00:08:34,000 --> 00:08:39,000
And this is my second um second you can just consider that this is my second module.

156
00:08:39,000 --> 00:08:40,000
This is my first module.

157
00:08:40,000 --> 00:08:43,000
And the first module you have this particular four steps okay.

158
00:08:44,000 --> 00:08:47,000
And in the second module you have some more steps okay.

159
00:08:47,000 --> 00:08:51,000
Now while we are working with vector store DB okay.

160
00:08:51,000 --> 00:08:59,000
In the second module you will be seeing that a user will be able to ask any questions okay.

161
00:08:59,000 --> 00:09:04,000
And what we do we also design our separate prompt.

162
00:09:04,000 --> 00:09:06,000
So here is your user question.

163
00:09:06,000 --> 00:09:09,000
Along with this we combine with this particular prompt.

164
00:09:09,000 --> 00:09:12,000
This prompt usually says that hey you are an LM model.

165
00:09:12,000 --> 00:09:14,000
You need to behave something like this.

166
00:09:14,000 --> 00:09:21,000
I'll say, hey, uh, act as a AI, AI researcher and try to answer all the questions that is given

167
00:09:21,000 --> 00:09:21,000
by the user.

168
00:09:22,000 --> 00:09:27,000
Okay, so if I give this kind of prompt, any LM model that we are going to use, it will act like an

169
00:09:27,000 --> 00:09:28,000
AI researcher.

170
00:09:28,000 --> 00:09:34,000
Okay, so here prompt is just like the message that I am giving to my AI assistant.

171
00:09:35,000 --> 00:09:36,000
Assistant okay.

172
00:09:37,000 --> 00:09:40,000
AI assistant like what it needs to behave like.

173
00:09:40,000 --> 00:09:40,000
Okay.

174
00:09:40,000 --> 00:09:47,000
So once I give that particular prompt template uh prompt, then any question that is given over here,

175
00:09:47,000 --> 00:09:51,000
we will try to combine with this particular prompt okay.

176
00:09:51,000 --> 00:09:57,000
So as soon as we give this particular question, okay, this question initially will be searched from

177
00:09:57,000 --> 00:09:58,000
this vector store.

178
00:09:58,000 --> 00:09:59,000
Okay.

179
00:09:59,000 --> 00:10:00,000
From this vector store.

180
00:10:01,000 --> 00:10:05,000
Then from that particular vector store we will get some context information.

181
00:10:05,000 --> 00:10:11,000
So for doing that purpose to basically query from the vector store we really need to understand two

182
00:10:11,000 --> 00:10:12,000
important things.

183
00:10:12,000 --> 00:10:21,000
One is there is something called as document store flow chain.

184
00:10:22,000 --> 00:10:22,000
Okay.

185
00:10:23,000 --> 00:10:26,000
This is one thing which we will be learning about okay.

186
00:10:26,000 --> 00:10:28,000
This is also one very important component.

187
00:10:28,000 --> 00:10:32,000
And the second thing is something called as the retrieval chain.

188
00:10:33,000 --> 00:10:40,000
Or I'll not just say two, I can just consider it as one because document stuff load chain will be part

189
00:10:40,000 --> 00:10:40,000
of that.

190
00:10:40,000 --> 00:10:40,000
Okay.

191
00:10:40,000 --> 00:10:45,000
So here we need to understand about something called as the retrieval chain.

192
00:10:46,000 --> 00:10:49,000
So this retrieval chain how it is basically created.

193
00:10:49,000 --> 00:10:51,000
And what is this retrieval chain.

194
00:10:51,000 --> 00:11:02,000
This retrieval chain is nothing but it is an interface which is responsible in querying the vector store

195
00:11:02,000 --> 00:11:06,000
DB vector store db.

196
00:11:06,000 --> 00:11:07,000
Okay.

197
00:11:07,000 --> 00:11:10,000
And based on this particular query, I will be able to get the response.

198
00:11:10,000 --> 00:11:17,000
And once I get the response through this retrieval chain, I will basically be having my context info.

199
00:11:17,000 --> 00:11:22,000
Context info basically means based on the query, where is the similar data in that particular PDF,

200
00:11:22,000 --> 00:11:24,000
that entire context will be picked up, right?

201
00:11:24,000 --> 00:11:31,000
And along with this context info, we will be combining this prompt template and finally giving to the

202
00:11:31,000 --> 00:11:35,000
LM model to get my output response.

203
00:11:37,000 --> 00:11:38,000
Output response.

204
00:11:39,000 --> 00:11:43,000
So this in short is the entire cycle over here.

205
00:11:43,000 --> 00:11:44,000
Okay.

206
00:11:44,000 --> 00:11:45,000
Now what will happen in the next video?

207
00:11:45,000 --> 00:11:49,000
First of all we'll see what are the different different data ingestion technique in action.

208
00:11:49,000 --> 00:11:55,000
You'll be amazed to see like what all different different data ingestion techniques are there.

209
00:11:55,000 --> 00:11:55,000
Right.

210
00:11:55,000 --> 00:12:00,000
The second thing will that I'm actually going to show you if I take up any data, how will I be able

211
00:12:00,000 --> 00:12:02,000
to divide that into chunks.

212
00:12:02,000 --> 00:12:03,000
This is also amazing.

213
00:12:03,000 --> 00:12:05,000
Like multiple properties are specifically there.

214
00:12:06,000 --> 00:12:12,000
Then the third thing that I'm actually going to show you, which all embedding techniques we can specifically

215
00:12:12,000 --> 00:12:12,000
use, right.

216
00:12:12,000 --> 00:12:18,000
The embedding techniques like Firefox, Chrome, adb, Astra DB will be seeing multiple embedding techniques

217
00:12:18,000 --> 00:12:18,000
okay.

218
00:12:18,000 --> 00:12:21,000
Okay, sorry, not face chroma db.

219
00:12:21,000 --> 00:12:22,000
This is vector store db.

220
00:12:22,000 --> 00:12:23,000
But we will try to see.

221
00:12:23,000 --> 00:12:28,000
With respect to OpenAI, I have a different embedding technique with respect to open source models.

222
00:12:28,000 --> 00:12:29,000
I have a different embedding technique.

223
00:12:29,000 --> 00:12:34,000
So we will try to see that embedding techniques which whose responsibility will be for converting your

224
00:12:34,000 --> 00:12:36,000
text data into vectors.

225
00:12:36,000 --> 00:12:40,000
And finally, you'll also be seeing that we'll also discuss about vector store DB.

226
00:12:40,000 --> 00:12:41,000
Right.

227
00:12:41,000 --> 00:12:45,000
And then we'll combine all these things and we will be creating some amazing projects.

228
00:12:45,000 --> 00:12:48,000
We'll also be learning about retrieval chain.

229
00:12:48,000 --> 00:12:51,000
So step by step we will break down each of these things.

230
00:12:51,000 --> 00:12:56,000
Because these are most of the common components that are used in every projects of JNI.

231
00:12:56,000 --> 00:12:57,000
Right.

232
00:12:57,000 --> 00:12:59,000
So I hope you like this particular video.

233
00:12:59,000 --> 00:13:03,000
These are some of the very important components of language in which you should never miss it out.

234
00:13:03,000 --> 00:13:05,000
And definitely you should know about these things.

235
00:13:05,000 --> 00:13:07,000
So yes, this was it for my side.

236
00:13:07,000 --> 00:13:09,000
I will see you all in the next video.

237
00:13:09,000 --> 00:13:09,000
Thank you.