1
00:00:00,000 --> 00:00:05,000
So guys, in this particular video we are going to discuss an end-to-end LLM project with the help of

2
00:00:05,000 --> 00:00:07,000
AWS Bedrock and LangChain.

3
00:00:07,000 --> 00:00:11,000
This was also one of the most requested videos from many people out there.

4
00:00:11,000 --> 00:00:15,000
What we are going to do in this particular project is that we are going to develop a document Q&A

5
00:00:15,000 --> 00:00:21,000
application where we will specifically be harnessing multiple models that are provided by AWS Bedrock

6
00:00:21,000 --> 00:00:23,000
like Claude and Llama 2.

7
00:00:23,000 --> 00:00:29,000
And you can also use Amazon Titan, or whatever other models it provides that you specifically want.

8
00:00:29,000 --> 00:00:33,000
We can actually use them and we can implement this entire application.

9
00:00:33,000 --> 00:00:38,000
So to begin with, I will just go ahead and show you a quick demo over here.

10
00:00:38,000 --> 00:00:40,000
So what is this specific entire application?

11
00:00:40,000 --> 00:00:41,000
This is RAG, right?

12
00:00:41,000 --> 00:00:46,000
We are trying to create a RAG system wherein we have multiple PDFs.

13
00:00:46,000 --> 00:00:50,000
All those PDFs are stored in the form of vector embeddings inside a vector store.

14
00:00:50,000 --> 00:00:57,000
And whenever we ask any query, we will be able to harness the power of AWS Bedrock, which provides different

15
00:00:57,000 --> 00:00:59,000
models like Claude and Llama 2.

16
00:00:59,000 --> 00:00:59,000
And uh.

17
00:00:59,000 --> 00:01:03,000
Based on this particular question, we can retrieve the relevant content from the PDF files.

18
00:01:03,000 --> 00:01:06,000
So let's say I'm asking what are Transformers?

19
00:01:06,000 --> 00:01:12,000
And if I go ahead and click on Claude Output, that is basically going to hit the Claude API, the Claude model.

20
00:01:12,000 --> 00:01:14,000
And I'm going to get the entire response okay.

21
00:01:14,000 --> 00:01:20,000
So currently this is running. If you want Llama 2 output, I can also go ahead and click on

22
00:01:20,000 --> 00:01:21,000
Llama 2.

23
00:01:21,000 --> 00:01:24,000
And I can actually get that specific response right.

24
00:01:24,000 --> 00:01:30,000
So whatever questions you have with respect to these PDF documents, and I will also show you the set of PDF

25
00:01:30,000 --> 00:01:34,000
documents, you will be able to get the entire response over here.

26
00:01:34,000 --> 00:01:34,000
Right?

27
00:01:34,000 --> 00:01:37,000
So, uh, why is it taking some time?

28
00:01:37,000 --> 00:01:42,000
Because I still need to write the optimized version of the code, but once it runs for the first

29
00:01:42,000 --> 00:01:45,000
time, I think after that it works absolutely fine.

30
00:01:45,000 --> 00:01:50,000
Now let me just go ahead and write, "What is YOLO?", right?

31
00:01:50,000 --> 00:01:57,000
So if I go ahead and click on Llama 2 Output. See, before, I also showed you Llama 2, uh, in my local

32
00:01:57,000 --> 00:01:58,000
system itself over there.

33
00:01:58,000 --> 00:02:01,000
It was taking a lot of time, but this is directly coming from the API model itself.

34
00:02:01,000 --> 00:02:05,000
So here is the response I am specifically getting.

35
00:02:05,000 --> 00:02:06,000
And I'll be able to get the answers.

36
00:02:06,000 --> 00:02:10,000
Now this is what I'm going to develop completely from scratch.

37
00:02:10,000 --> 00:02:15,000
If you remember, guys, in our previous session yesterday, I had uploaded a video where I showed

38
00:02:15,000 --> 00:02:18,000
you claude.py, how to invoke the model and all.

39
00:02:19,000 --> 00:02:21,000
Uh, then we also saw the same with respect to Llama 2 and all.

40
00:02:21,000 --> 00:02:25,000
Now let me just go ahead and create a new file over here.

41
00:02:25,000 --> 00:02:28,000
So let me just go ahead and name this app.py.

42
00:02:28,000 --> 00:02:32,000
Now inside this app.py I'm going to write my entire code.

43
00:02:32,000 --> 00:02:38,000
Now before I go ahead, I really need to do some installation with respect to some of the requirements.

44
00:02:38,000 --> 00:02:45,000
As I said, uh, here we are going to use not only boto3 and awscli, we are also going to use some more

45
00:02:45,000 --> 00:02:47,000
libraries like pypdf.

46
00:02:47,000 --> 00:02:50,000
So let me just go ahead and write this over here.

47
00:02:50,000 --> 00:02:55,000
So we require pypdf. Along with pypdf, we also need langchain.

48
00:02:56,000 --> 00:03:01,000
And then we will also be requiring streamlit, right? Uh.

49
00:03:01,000 --> 00:03:06,000
Along with this, we also require faiss-cpu.

50
00:03:06,000 --> 00:03:07,000
Right.

51
00:03:07,000 --> 00:03:11,000
So this is what we are going to use because from our local environment we are going to do the vector

52
00:03:11,000 --> 00:03:12,000
embeddings.

53
00:03:12,000 --> 00:03:15,000
So for this I'm going to use the FAISS DB, right?

54
00:03:15,000 --> 00:03:19,000
So only through this will we be able to put the vector embeddings into the vector store.

55
00:03:19,000 --> 00:03:22,000
So these are some of the basic requirements that I actually require.
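
A minimal sketch of a pre-flight check for the packages listed above, using only the standard library. The list name REQUIRED_MODULES and the helper missing_modules are hypothetical names for illustration; note that the faiss-cpu package is imported under the module name faiss.

```python
import importlib.util

# Importable module names for the packages mentioned in the video.
# Assumption: faiss-cpu is imported as "faiss"; the rest share their package name.
REQUIRED_MODULES = ["boto3", "awscli", "pypdf", "langchain", "streamlit", "faiss"]


def missing_modules(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Running this before and after the pip install step shows which of the requirements are still missing from the virtual environment.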

56
00:03:22,000 --> 00:03:27,000
Now, uh, awscli is also there because we really need to configure it.

57
00:03:27,000 --> 00:03:30,000
Now let me quickly go ahead and open the terminal.

58
00:03:30,000 --> 00:03:33,000
Let me clear the screen because yesterday I was creating the images over here.

59
00:03:33,000 --> 00:03:37,000
As the first step, again, you have to create a virtual environment.

60
00:03:37,000 --> 00:03:38,000
Do all this pip install and all.

61
00:03:38,000 --> 00:03:42,000
Right, I hope I don't have to tell you all those things now because I have repeated them many

62
00:03:42,000 --> 00:03:43,000
times.

63
00:03:43,000 --> 00:03:47,000
So let me go ahead and write pip install -r requirements.txt.

64
00:03:47,000 --> 00:03:51,000
If you are not able to understand it, guys, I'll be providing you this entire

65
00:03:51,000 --> 00:03:53,000
playlist in the description of this particular video.

66
00:03:53,000 --> 00:03:55,000
You can go ahead and watch it.

67
00:03:55,000 --> 00:03:58,000
Okay, but uh, you have to follow the playlist, right?

68
00:03:58,000 --> 00:04:02,000
Again, I'm not going to repeat everything from scratch, otherwise it is just a waste of time.

69
00:04:02,000 --> 00:04:03,000
Right?

70
00:04:03,000 --> 00:04:07,000
So while this installation is taking place, let me just go ahead and talk about all the things

71
00:04:07,000 --> 00:04:09,000
we are going to basically develop.

72
00:04:09,000 --> 00:04:09,000
Right.

73
00:04:09,000 --> 00:04:11,000
So this is the document Q&A search.

74
00:04:11,000 --> 00:04:16,000
So in short, my PDFs will be stored. My entire set of PDFs,

75
00:04:16,000 --> 00:04:18,000
let's consider this.

76
00:04:18,000 --> 00:04:21,000
My entire set of PDFs will be stored in a vector store.

77
00:04:21,000 --> 00:04:26,000
And from this vector store we can specifically query any information that we want.

78
00:04:26,000 --> 00:04:30,000
Along with this, we are going to harness the power of

79
00:04:30,000 --> 00:04:36,000
LangChain along with the LLM models from AWS Bedrock.

80
00:04:36,000 --> 00:04:39,000
Okay, so this is what we are going to do in this project.

81
00:04:39,000 --> 00:04:44,000
So please make sure that you prepare well, uh, understand the architecture, what we are trying to

82
00:04:44,000 --> 00:04:44,000
do.

83
00:04:44,000 --> 00:04:49,000
So our entire project deals with two important steps: one, two.

84
00:04:49,000 --> 00:04:55,000
And before this, we are also going to do a very important step, which is called data ingestion,

85
00:04:55,000 --> 00:05:01,000
right? Now in data ingestion, what we are going to do is read from the entire folder

86
00:05:01,000 --> 00:05:03,000
however many PDFs are there, all the PDFs.

87
00:05:03,000 --> 00:05:04,000
We are going to read it.

88
00:05:04,000 --> 00:05:05,000
Okay.

89
00:05:05,000 --> 00:05:10,000
So once I get all these PDFs, then my next step actually starts over here.

90
00:05:11,000 --> 00:05:16,000
I'm going to take all those documents, the PDFs over there, split them into chunks, create the embeddings.

91
00:05:16,000 --> 00:05:19,000
And here we are going to use this FAISS database.

92
00:05:19,000 --> 00:05:23,000
So here we are going to use this FAISS, right?

93
00:05:23,000 --> 00:05:25,000
The vector database.

94
00:05:25,000 --> 00:05:28,000
And through that only we will try to create this vector store.

95
00:05:28,000 --> 00:05:28,000
Right.

96
00:05:28,000 --> 00:05:31,000
So step by step I'll show you how to do this.

97
00:05:31,000 --> 00:05:32,000
Which embeddings

98
00:05:32,000 --> 00:05:34,000
are we going to specifically use here?

99
00:05:34,000 --> 00:05:36,000
See, understand, we are going to create the embeddings.

100
00:05:36,000 --> 00:05:43,000
So for embeddings, today I will show you a new model, which is called

101
00:05:43,000 --> 00:05:44,000
Amazon Titan.

102
00:05:45,000 --> 00:05:45,000
Okay.

103
00:05:45,000 --> 00:05:48,000
So that you understand you have multiple options.

104
00:05:48,000 --> 00:05:51,000
You may have implemented multiple options with respect to that.

105
00:05:51,000 --> 00:05:56,000
Because tomorrow, if you go to companies, they are definitely going to use AWS Bedrock because it

106
00:05:56,000 --> 00:05:58,000
provides a lot of features, right.

107
00:05:59,000 --> 00:06:02,000
So for creating these embeddings I will specifically use Amazon Titan.

108
00:06:03,000 --> 00:06:06,000
Other than this, if you don't want to use this, you can also use OpenAI embeddings.

109
00:06:06,000 --> 00:06:06,000
Right?

110
00:06:06,000 --> 00:06:07,000
It is up to you.

111
00:06:07,000 --> 00:06:12,000
I've also shown Google Generative AI embeddings, multiple embedding techniques, in my playlist

112
00:06:12,000 --> 00:06:14,000
with respect to Google Gemini and all.

113
00:06:14,000 --> 00:06:19,000
So after we create this vector store, now how are we going to use the LLM?

114
00:06:19,000 --> 00:06:24,000
So in the second step, whenever we ask any question, first of all the similarity search will happen

115
00:06:24,000 --> 00:06:29,000
in the vector store. Whatever relevant documents and chunks we get,

116
00:06:29,000 --> 00:06:30,000
we have to take these chunks

117
00:06:30,000 --> 00:06:33,000
and give them to my LLM model along with the prompt.
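
The similarity-search step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the toy three-dimensional vectors, the names cosine_similarity and top_k_chunks, and the choice of cosine similarity are all hypothetical; a real vector store such as FAISS works on model-generated embeddings and may use a different distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, store, k=2):
    """Return the texts of the k stored chunks most similar to the query vector.

    `store` is a list of (chunk_text, embedding_vector) pairs.
    """
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy data: hand-made vectors standing in for real embeddings.
toy_store = [
    ("Transformers use self-attention.", [0.9, 0.1, 0.0]),
    ("YOLO is an object detector.", [0.1, 0.9, 0.0]),
    ("Attention weighs token pairs.", [0.8, 0.2, 0.1]),
]

relevant = top_k_chunks([1.0, 0.0, 0.0], toy_store, k=2)
```

The chunks returned here are what would be passed to the LLM together with the prompt.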

118
00:06:34,000 --> 00:06:35,000
Like let's say that I say that, okay.

119
00:06:35,000 --> 00:06:40,000
Please summarize this entire information based on the query that I've asked in 250 words.

120
00:06:40,000 --> 00:06:45,000
So this LLM model along with this prompt is going to take this particular data and is going to give

121
00:06:45,000 --> 00:06:46,000
the answer itself.

122
00:06:46,000 --> 00:06:46,000
Right.
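
A minimal sketch of how the retrieved chunks and the question could be combined into one prompt with a 250-word limit. The template wording and the names PROMPT_TEMPLATE and build_prompt are assumptions for illustration; the video builds its prompt with LangChain's PromptTemplate later on.

```python
# Hypothetical template text; only the 250-word limit comes from the video.
PROMPT_TEMPLATE = (
    "Use the following context to answer the question at the end.\n"
    "Summarize the answer in at most {max_words} words.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(chunks, question, max_words=250):
    """Join the retrieved chunks into one context block and fill in the template."""
    context = "\n\n".join(chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question, max_words=max_words)


prompt = build_prompt(
    ["Transformers use self-attention.", "Attention weighs token pairs."],
    "What are Transformers?",
)
```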

123
00:06:46,000 --> 00:06:51,000
So both of these I'm going to develop completely from scratch, step by step.

124
00:06:51,000 --> 00:06:52,000
We'll try to see it okay.

125
00:06:52,000 --> 00:06:57,000
So uh let's quickly go ahead and let's start our coding without wasting any time.

126
00:06:57,000 --> 00:06:58,000
Okay.

127
00:06:58,000 --> 00:07:00,000
So here is my app.py.

128
00:07:00,000 --> 00:07:02,000
I said that I'm going to do it completely from scratch.

129
00:07:02,000 --> 00:07:05,000
So um, there will be some installation.

130
00:07:05,000 --> 00:07:08,000
There will be some, uh, errors that may come up.

131
00:07:08,000 --> 00:07:14,000
So we will try to also import all the libraries based on the pipeline that we have created, based on

132
00:07:14,000 --> 00:07:17,000
the steps or architectures that we have already discussed.

133
00:07:17,000 --> 00:07:22,000
So first of all, quickly let's go ahead and import json.

134
00:07:23,000 --> 00:07:25,000
So I'm going to import json.

135
00:07:25,000 --> 00:07:27,000
I'm going to import os.

136
00:07:27,000 --> 00:07:34,000
Uh, I'm going to import, uh, sys. I think I'll not be using sys, but let it be.

137
00:07:34,000 --> 00:07:37,000
These are some of the common libraries that we will specifically use.

138
00:07:37,000 --> 00:07:39,000
Along with this we are going to use boto3.

139
00:07:39,000 --> 00:07:44,000
And again guys, in my previous video I have shown you how to configure the AWS CLI.

140
00:07:44,000 --> 00:07:48,000
Please make sure that you watch the playlist, otherwise you will not be able to understand.

141
00:07:48,000 --> 00:07:53,000
Okay, if you jump directly into this particular video, no, you'll not be able to understand.

142
00:07:53,000 --> 00:07:55,000
So in the description I will give you the playlist.

143
00:07:55,000 --> 00:07:56,000
In the first video itself,

144
00:07:56,000 --> 00:07:59,000
I've shown you how you can actually configure the AWS CLI.

145
00:08:00,000 --> 00:08:00,000
Okay.

146
00:08:00,000 --> 00:08:01,000
So please make sure of this, okay?

147
00:08:01,000 --> 00:08:04,000
The next thing is that, as I said, right,

148
00:08:04,000 --> 00:08:08,000
we will be using

149
00:08:10,000 --> 00:08:19,000
the Titan Embeddings model, right, for creating vectors or to generate embeddings.

150
00:08:19,000 --> 00:08:19,000
Okay.

151
00:08:19,000 --> 00:08:21,000
So we will create this model.

152
00:08:21,000 --> 00:08:24,000
The reason why I'm showing you this is because I've never used it in any of my videos.

153
00:08:24,000 --> 00:08:30,000
Now this Titan embedding, what I will do is call it from the LangChain library, the LangChain framework.

154
00:08:30,000 --> 00:08:37,000
LangChain provides you multiple functionalities, multiple options to properly interact with

155
00:08:37,000 --> 00:08:39,000
Bedrock, right? AWS Bedrock.

156
00:08:39,000 --> 00:08:42,000
So as I said, this framework is compulsory for you all to know.

157
00:08:42,000 --> 00:08:43,000
One is LangChain and the other is LlamaIndex.

158
00:08:43,000 --> 00:08:44,000
Right.

159
00:08:44,000 --> 00:08:53,000
So what I'm going to do is write from langchain.embeddings, I'm going to import,

160
00:08:54,000 --> 00:08:56,000
let's see, Bedrock...

161
00:08:59,000 --> 00:08:59,000
Bedrock

162
00:08:59,000 --> 00:09:01,000
Embeddings, BedrockEmbeddings.

163
00:09:01,000 --> 00:09:01,000
Right.

164
00:09:01,000 --> 00:09:03,000
So I'm going to specifically use this.

165
00:09:03,000 --> 00:09:05,000
This will be a small letter, okay?

166
00:09:05,000 --> 00:09:08,000
So guys, I'm also verifying it from the documentation, okay?

167
00:09:08,000 --> 00:09:13,000
So there is documentation that is given, not very clearly, but I've implemented this entire project

168
00:09:13,000 --> 00:09:13,000
also.

169
00:09:13,000 --> 00:09:14,000
Right.

170
00:09:14,000 --> 00:09:17,000
So from langchain.embeddings import BedrockEmbeddings.

171
00:09:17,000 --> 00:09:22,000
The next thing that I'm going to use is from langchain.llms.

172
00:09:22,000 --> 00:09:26,000
Like, we can also use LangChain to call the LLM models that are present inside Bedrock.

173
00:09:26,000 --> 00:09:35,000
So that is also integrated, right? From langchain.llms.bedrock import Bedrock.

174
00:09:35,000 --> 00:09:38,000
Okay, I'm going to specifically use these two.

175
00:09:38,000 --> 00:09:43,000
One is BedrockEmbeddings from langchain.embeddings, and from langchain.llms.bedrock,

176
00:09:44,000 --> 00:09:46,000
uh, I'm going to import Bedrock.

177
00:09:46,000 --> 00:09:46,000
Okay.

178
00:09:46,000 --> 00:09:49,000
So this is specifically for the embedding part okay.

179
00:09:49,000 --> 00:09:55,000
Now there will also be some libraries that I need to import for data ingestion because I really need

180
00:09:55,000 --> 00:09:57,000
to load the data set right.

181
00:09:57,000 --> 00:10:04,000
So over here you'll be able to see I will write import numpy as np.

182
00:10:04,000 --> 00:10:05,000
Okay.

183
00:10:05,000 --> 00:10:08,000
Uh this is numpy I'm going to specifically use.

184
00:10:08,000 --> 00:10:11,000
I think by default numpy will be available.

185
00:10:11,000 --> 00:10:16,000
Import numpy as np. Perfect, so numpy as np.

186
00:10:16,000 --> 00:10:26,000
Then from langchain.text_splitter, I'm going to use a text splitter so that as soon as I load the

187
00:10:26,000 --> 00:10:31,000
document, I can use this RecursiveCharacterTextSplitter.

188
00:10:31,000 --> 00:10:34,000
Okay, so this is what we specifically use.

189
00:10:34,000 --> 00:10:37,000
And in my lecture series I've discussed everything about this.

190
00:10:37,000 --> 00:10:48,000
Then again I will go ahead and write from langchain.document_loaders import.

191
00:10:49,000 --> 00:10:52,000
Um, we will specifically use this PyPDF...

192
00:10:53,000 --> 00:10:56,000
So let me just copy this,

193
00:10:56,000 --> 00:10:57,000
PyPDFDirectoryLoader.

194
00:10:57,000 --> 00:11:01,000
So this is what we are specifically going to use, PyPDFDirectoryLoader.

195
00:11:01,000 --> 00:11:05,000
And it is actually present inside langchain.document_loaders.

196
00:11:05,000 --> 00:11:05,000
Okay.

197
00:11:05,000 --> 00:11:09,000
So this is specifically required for the data ingestion.

198
00:11:09,000 --> 00:11:14,000
With this, what I will do: I have already created one folder over here which is called data, which has

199
00:11:14,000 --> 00:11:15,000
these two PDFs.

200
00:11:15,000 --> 00:11:16,000
Right.

201
00:11:16,000 --> 00:11:21,000
I need to load all these PDFs from this particular folder and then perform all the vector embeddings

202
00:11:21,000 --> 00:11:22,000
that I require.

203
00:11:22,000 --> 00:11:24,000
So in data ingestion, first of all we will load it.

204
00:11:24,000 --> 00:11:29,000
We will split the entire documents by using RecursiveCharacterTextSplitter.

205
00:11:29,000 --> 00:11:33,000
And then after this we will convert this into vector embeddings.

206
00:11:33,000 --> 00:11:40,000
So let me just go ahead and write vector embeddings and vector store. Here I'm going to specifically

207
00:11:40,000 --> 00:11:43,000
use the FAISS index, okay? FAISS DB or ChromaDB,

208
00:11:43,000 --> 00:11:44,000
you can use either; it is up to you.

209
00:11:44,000 --> 00:11:47,000
Again I have shown both the ways in my playlist.

210
00:11:47,000 --> 00:11:51,000
So from langchain, let me just see this.

211
00:11:51,000 --> 00:11:54,000
Okay, so BedrockEmbeddings.

212
00:11:54,000 --> 00:12:00,000
Is there something called from langchain_community?

213
00:12:00,000 --> 00:12:00,000
There was, right?

214
00:12:00,000 --> 00:12:02,000
Let's see, okay.

215
00:12:02,000 --> 00:12:06,000
From langchain_community.

216
00:12:06,000 --> 00:12:08,000
Let's copy this.

217
00:12:08,000 --> 00:12:10,000
Is this correct or not.

218
00:12:10,000 --> 00:12:11,000
We'll try to see. Yeah.

219
00:12:11,000 --> 00:12:13,000
BedrockEmbeddings was there.

220
00:12:13,000 --> 00:12:16,000
Uh, let's copy this here also.

221
00:12:16,000 --> 00:12:18,000
So PyPDFDirectoryLoader is also there.

222
00:12:18,000 --> 00:12:20,000
I think there will be some warnings that will be coming.

223
00:12:20,000 --> 00:12:24,000
Uh, I think most of the, some of the libraries are in community.

224
00:12:24,000 --> 00:12:25,000
Okay.

225
00:12:25,000 --> 00:12:29,000
So here what I'm actually going to do is write from langchain.

226
00:12:30,000 --> 00:12:33,000
Let's see; if we get an error, I'll revert it again, okay?

227
00:12:34,000 --> 00:12:36,000
From langchain.text_splitter.

228
00:12:36,000 --> 00:12:37,000
Sorry, not text splitter.

229
00:12:37,000 --> 00:12:37,000
Why?

230
00:12:37,000 --> 00:12:39,000
I'm using text splitter already.

231
00:12:39,000 --> 00:12:40,000
I've actually done it.

232
00:12:40,000 --> 00:12:47,000
So I'm also going to use, for this vector embedding specifically, uh, I have to use vector, uh, FAISS.

233
00:12:47,000 --> 00:12:47,000
Right.

234
00:12:47,000 --> 00:12:53,000
So I will write vectorstores import FAISS.

235
00:12:54,000 --> 00:12:54,000
Okay.

236
00:12:54,000 --> 00:12:57,000
So FAISS is one library I'll be using.

237
00:12:57,000 --> 00:12:57,000
Again,

238
00:12:57,000 --> 00:13:00,000
this FAISS will be present inside community, I guess.

239
00:13:00,000 --> 00:13:04,000
Okay, so langchain_community,

240
00:13:04,000 --> 00:13:07,000
uh, dot vectorstores, import FAISS.

241
00:13:07,000 --> 00:13:09,000
Let's see. This is not accessible.

242
00:13:09,000 --> 00:13:12,000
So I think it is inside langchain only, okay?

243
00:13:12,000 --> 00:13:14,000
Otherwise you can see the documentation okay.

244
00:13:14,000 --> 00:13:15,000
At the end of the day no worries.

245
00:13:15,000 --> 00:13:22,000
Then after doing this I will also write from langchain.indexes.

246
00:13:22,000 --> 00:13:27,000
Okay, I will not use this also. Let's just use FAISS because we have already installed faiss-cpu.

247
00:13:27,000 --> 00:13:28,000
Okay, then.

248
00:13:28,000 --> 00:13:35,000
Um, after doing this, uh, what we are specifically going to do... Now, this is for the

249
00:13:35,000 --> 00:13:35,000
vector embedding.

250
00:13:35,000 --> 00:13:38,000
Now we need to go ahead with the LLM models.

251
00:13:38,000 --> 00:13:38,000
Right.

252
00:13:38,000 --> 00:13:46,000
So for LLM models, LangChain already provides ways to load models from AWS Bedrock, okay?

253
00:13:46,000 --> 00:13:49,000
So here I'll write from langchain.

254
00:13:51,000 --> 00:14:00,000
From langchain_community dot... One I will specifically use is prompts. Oops, from

255
00:14:00,000 --> 00:14:00,000
langchain.prompts.

256
00:14:01,000 --> 00:14:05,000
I'm going to import PromptTemplate, okay?

257
00:14:05,000 --> 00:14:11,000
And then the other one is from langchain.chains.

258
00:14:11,000 --> 00:14:17,000
So I'm going to specifically use one chain where I'll be using import RetrievalQA.

259
00:14:17,000 --> 00:14:18,000
Right.

260
00:14:18,000 --> 00:14:23,000
So since I'm going to create a Q&A chatbot with the documents, uh, document Q&A in short.

261
00:14:23,000 --> 00:14:25,000
So that is the reason I'm using this.

262
00:14:25,000 --> 00:14:27,000
And this is basically to create my own prompt template.

263
00:14:28,000 --> 00:14:34,000
Now let's call the Bedrock client so that we get access to all the models, okay?

264
00:14:34,000 --> 00:14:38,000
So I will go ahead and set up the Bedrock client over here.

265
00:14:38,000 --> 00:14:39,000
So here let me go ahead and write.

266
00:14:39,000 --> 00:14:43,000
bedrock is equal to boto3.client.

267
00:14:43,000 --> 00:14:53,000
And here I'm going to give my service_name is equal to bedrock-runtime, okay?

268
00:14:53,000 --> 00:14:59,000
So I've already shown you yesterday also, uh, in my playlist on Bedrock, I've also shown you

269
00:14:59,000 --> 00:15:02,000
how you can actually call the client itself and access the models.

270
00:15:02,000 --> 00:15:08,000
Okay, now, as I already said, the embedding that we are going to specifically use

271
00:15:08,000 --> 00:15:11,000
is called BedrockEmbeddings, right?

272
00:15:11,000 --> 00:15:14,000
So let me quickly go ahead and call this.

273
00:15:14,000 --> 00:15:16,000
I will go ahead and write bedrock_embeddings.

274
00:15:16,000 --> 00:15:21,000
And let me show you how you can actually call this embedding from Bedrock.

275
00:15:21,000 --> 00:15:23,000
So I will copy paste this.

276
00:15:23,000 --> 00:15:25,000
Let me give the model ID.

277
00:15:25,000 --> 00:15:28,000
Now this model ID you'll be able to see it over here right.

278
00:15:28,000 --> 00:15:36,000
So if you go into AWS itself, let's say I will go ahead and write embedding, okay?

279
00:15:37,000 --> 00:15:41,000
So here itself, under Titan, I think you will be able to find it.

280
00:15:41,000 --> 00:15:45,000
So foundation models, base models.

281
00:15:45,000 --> 00:15:51,000
So if you go ahead and search inside this any of the embedding models you can specifically take.

282
00:15:51,000 --> 00:15:54,000
But I am going to use this Titan embedding model okay.

283
00:15:54,000 --> 00:16:00,000
So if you click over here and scroll down, you'll get the entire details, like what is the

284
00:16:00,000 --> 00:16:02,000
embedding, what is the model ID and all.

285
00:16:02,000 --> 00:16:04,000
Okay, so I am going to specifically use this.

286
00:16:04,000 --> 00:16:07,000
So let me quickly go ahead and call this model ID.

287
00:16:07,000 --> 00:16:13,000
So this model ID I've already copied from this particular website.

288
00:16:13,000 --> 00:16:19,000
So here you can see I have copied this entire thing and I will paste it over here as my model_id.

289
00:16:19,000 --> 00:16:21,000
So you also have to do the same step okay.

290
00:16:21,000 --> 00:16:26,000
So once we do this, the next thing that we really need to give here is the client.

291
00:16:26,000 --> 00:16:27,000
And the client,

292
00:16:27,000 --> 00:16:31,000
you know, have we called any client yet or not?

293
00:16:31,000 --> 00:16:32,000
So we have called this Bedrock client.

294
00:16:32,000 --> 00:16:35,000
So this will basically be my client over here.

295
00:16:35,000 --> 00:16:35,000
Right.

296
00:16:35,000 --> 00:16:41,000
So once we give this bedrock client, that basically means it knows we are going to call this particular

297
00:16:41,000 --> 00:16:43,000
embedding model from this particular client.

298
00:16:43,000 --> 00:16:44,000
That is bedrock itself.

299
00:16:44,000 --> 00:16:48,000
In short, we are going to use the AWS LLM, uh, AWS Bedrock.
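
A sketch of the client and embeddings setup described above. Several things here are assumptions rather than what is shown on screen: the model ID string (the video copies it from the Bedrock console without reading it out, so confirm it there), the langchain_community import path (older LangChain releases expose the same class under langchain.embeddings), and the wrapper function build_bedrock_embeddings, which exists only so the imports stay local to the sketch. Creating the client relies on credentials and a default region already configured through the AWS CLI.

```python
# Assumption: a Titan text-embedding model ID. Confirm the exact ID in the
# Bedrock console for your account and region before using it.
TITAN_EMBED_MODEL_ID = "amazon.titan-embed-text-v1"


def build_bedrock_embeddings(model_id=TITAN_EMBED_MODEL_ID):
    """Create the Bedrock runtime client and wrap it in LangChain's BedrockEmbeddings."""
    # Imported here so the sketch can be read without the packages installed.
    import boto3
    from langchain_community.embeddings import BedrockEmbeddings

    # Uses the credentials and default region configured via the AWS CLI.
    bedrock = boto3.client(service_name="bedrock-runtime")
    return BedrockEmbeddings(model_id=model_id, client=bedrock)
```

In the video's app.py these two objects are module-level variables named bedrock and bedrock_embeddings; calling the function once at import time gives the same result.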

300
00:16:48,000 --> 00:16:51,000
Now let me quickly go ahead.

301
00:16:51,000 --> 00:16:53,000
And this is how, step by step, we will be going.

302
00:16:53,000 --> 00:16:57,000
First we will create this data ingestion module.

303
00:16:57,000 --> 00:16:59,000
See, this step is done.

304
00:16:59,000 --> 00:17:00,000
We have created our client.

305
00:17:00,000 --> 00:17:04,000
Now we will go ahead and implement this data ingestion. So quickly,

306
00:17:04,000 --> 00:17:07,000
let's go ahead and implement this data ingestion.

307
00:17:07,000 --> 00:17:10,000
And please make sure that you follow this.

308
00:17:10,000 --> 00:17:14,000
So here I'm going to basically write data ingestion okay.

309
00:17:14,000 --> 00:17:20,000
And, uh, I will create a step which is called data_ingestion, a function.

310
00:17:20,000 --> 00:17:25,000
Now inside this data ingestion what things we are specifically going to do.

311
00:17:25,000 --> 00:17:26,000
Right, I have a folder.

312
00:17:26,000 --> 00:17:27,000
So let me go ahead and write.

313
00:17:27,000 --> 00:17:31,000
loader is equal to PyPDFDirectoryLoader.

314
00:17:31,000 --> 00:17:34,000
And here I'm going to give my folder, which is called data.

315
00:17:34,000 --> 00:17:38,000
Right? From that data folder only I need to pick up all the PDF files, right?

316
00:17:38,000 --> 00:17:39,000
So this is the first step.

317
00:17:39,000 --> 00:17:45,000
Then I will go ahead and write documents is equal to loader.load().

318
00:17:45,000 --> 00:17:46,000
Right.

319
00:17:46,000 --> 00:17:54,000
So from this loader we are going to load all the documents. Now, in our testing, we will specifically

320
00:17:54,000 --> 00:17:57,000
use something called character text split.

321
00:17:57,000 --> 00:17:57,000
Right.

322
00:17:57,000 --> 00:17:59,000
So that is what we,

323
00:17:59,000 --> 00:18:01,000
we have already imported, right?

324
00:18:01,000 --> 00:18:02,000
RecursiveCharacterTextSplitter.

325
00:18:02,000 --> 00:18:04,000
Now how do you implement this?

326
00:18:04,000 --> 00:18:05,000
It is very simple.

327
00:18:05,000 --> 00:18:07,000
I will go ahead and write text_splitter.

328
00:18:07,000 --> 00:18:08,000
Right.

329
00:18:08,000 --> 00:18:11,000
And this I've already shown you in action.

330
00:18:11,000 --> 00:18:13,000
So it will be RecursiveCharacterTextSplitter.

331
00:18:13,000 --> 00:18:18,000
And here the first parameter will be my chunk size.

332
00:18:20,000 --> 00:18:22,000
chunk_size.

333
00:18:22,000 --> 00:18:23,000
Chunk size.

334
00:18:23,000 --> 00:18:26,000
I will set it to 1000 or 10,000.

335
00:18:27,000 --> 00:18:27,000
Right.

336
00:18:27,000 --> 00:18:33,000
And along with this chunk size, the second parameter I would like to give is chunk_overlap.

337
00:18:33,000 --> 00:18:39,000
Please give this number a somewhat larger value so that you will be able to understand it, okay?
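
To see what the two parameters do, here is a simplified fixed-window chunker in plain Python. It is an illustration only and not how RecursiveCharacterTextSplitter works internally (that class splits on separators such as paragraph and line breaks before falling back to smaller pieces); the function name chunk_text is hypothetical.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


# Small numbers make the overlap visible; the video uses values in the thousands.
chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
```

Each chunk repeats the last two characters of the one before it, which is the point of the overlap: text that straddles a chunk boundary still appears whole in at least one chunk.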

338
00:18:39,000 --> 00:18:43,000
So these two are there: text_splitter, RecursiveCharacterTextSplitter.

339
00:18:43,000 --> 00:18:45,000
And this right.

340
00:18:45,000 --> 00:18:52,000
Once we get this text splitter, I will go ahead and write docs is equal to text_splitter dot from... oh, sorry,

341
00:18:52,000 --> 00:18:55,000
dot split_documents, okay?

342
00:18:55,000 --> 00:18:58,000
And then here I'm going to give my entire documents.

343
00:18:58,000 --> 00:19:02,000
So in short we are going to split it, right, based on this RecursiveCharacterTextSplitter.

344
00:19:02,000 --> 00:19:06,000
Once we do this, then we will return the docs.

345
00:19:06,000 --> 00:19:08,000
All the docs we're going to specifically return the docs.

346
00:19:08,000 --> 00:19:13,000
So data ingestion is done, right? With respect to this particular data ingestion,

347
00:19:13,000 --> 00:19:16,000
here what we are doing is reading all the PDFs from the data folder.

348
00:19:16,000 --> 00:19:20,000
We are applying RecursiveCharacterTextSplitter with this chunk overlap.

349
00:19:20,000 --> 00:19:23,000
And then we split all those documents, right?

350
00:19:23,000 --> 00:19:26,000
So this completes our data ingestion.
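
Put together, the data ingestion function dictated above might look like the sketch below. Assumptions: the chunk_overlap default of 1000 is a placeholder (the video only says to pick a somewhat larger value), chunk_size uses the larger of the two values mentioned, and the import paths depend on the installed LangChain version (the video tries both langchain and langchain_community).

```python
def data_ingestion(data_dir="data", chunk_size=10000, chunk_overlap=1000):
    """Load every PDF in data_dir and split the pages into overlapping chunks."""
    # Imported here so the sketch can be read without the packages installed.
    # Assumption: these paths match your LangChain version.
    from langchain_community.document_loaders import PyPDFDirectoryLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    loader = PyPDFDirectoryLoader(data_dir)
    documents = loader.load()

    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,  # assumed value; not stated in the video
    )
    docs = text_splitter.split_documents(documents)
    return docs
```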

351
00:19:26,000 --> 00:19:32,000
Now the next thing is that we will go ahead with vector embeddings.

352
00:19:33,000 --> 00:19:35,000
Vector embedding and vector store.

353
00:19:35,000 --> 00:19:36,000
Right.

354
00:19:36,000 --> 00:19:40,000
And this is where we are going to specifically use that Titan embedding that we have imported.

355
00:19:40,000 --> 00:19:44,000
Along with that, what we are specifically going to do is that we are going to use FAISS.

356
00:19:44,000 --> 00:19:45,000
Right.

357
00:19:45,000 --> 00:19:48,000
So let me quickly go ahead and write it down.

358
00:19:48,000 --> 00:19:49,000
So here I'm going to write my definition.

359
00:19:50,000 --> 00:19:53,000
And quickly let's go ahead and write Get Vector Store.

360
00:19:53,000 --> 00:19:54,000
So I'll create one more function.

361
00:19:55,000 --> 00:19:59,000
Now inside this vector store I'm going to first of all give the documents, whatever documents are coming

362
00:19:59,000 --> 00:19:59,000
from here.

363
00:19:59,000 --> 00:20:04,000
Because we take these documents and then we do the embedding techniques and we perform all the embedding

364
00:20:04,000 --> 00:20:04,000
techniques.

365
00:20:04,000 --> 00:20:12,000
So in my next step, what I will go ahead and write is vectorstore underscore faiss, okay.

366
00:20:12,000 --> 00:20:14,000
So this will basically be my variable.

367
00:20:14,000 --> 00:20:18,000
And then I'm going to use FAISS.

368
00:20:18,000 --> 00:20:18,000
Okay.

369
00:20:19,000 --> 00:20:23,000
Dot from underscore documents, right.

370
00:20:23,000 --> 00:20:26,000
And we are going to implement this right.

371
00:20:26,000 --> 00:20:30,000
So in the first parameter inside FAISS we give docs.

372
00:20:30,000 --> 00:20:34,000
And the second parameter that we specifically give is my bedrock embeddings.

373
00:20:34,000 --> 00:20:34,000
Right.

374
00:20:34,000 --> 00:20:36,000
So this is the embedding.

375
00:20:36,000 --> 00:20:42,000
Specifically, we are going to use, okay, the Bedrock embedding which we have initialized over

376
00:20:42,000 --> 00:20:42,000
here.

377
00:20:42,000 --> 00:20:45,000
See over here in the bedrock embedding we have initialized.

378
00:20:45,000 --> 00:20:46,000
And that is what we are going to use it over here.

379
00:20:46,000 --> 00:20:53,000
Now after we get that vectorstore FAISS, this is basically my vector store, I will go ahead and save it on

380
00:20:53,000 --> 00:20:54,000
my local disk.

381
00:20:54,000 --> 00:20:59,000
So let me quickly write dot save underscore local.

382
00:21:00,000 --> 00:21:05,000
And inside this we are going to basically write faiss underscore index.

383
00:21:05,000 --> 00:21:05,000
Right.

384
00:21:05,000 --> 00:21:11,000
So this will basically be saved in my folder over here itself, right, on my hard disk.

385
00:21:11,000 --> 00:21:14,000
You can also save this in any database as such if you want.

386
00:21:14,000 --> 00:21:17,000
So this is my vector embedding I've done it.

387
00:21:17,000 --> 00:21:19,000
And this step is also completed.
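The vector store step described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the import path is an assumption (older LangChain releases expose FAISS under `langchain.vectorstores`), `embeddings` is assumed to be the Bedrock Titan embeddings object initialized earlier, and returning the store is a convenience added here.

```python
INDEX_DIR = "faiss_index"  # folder name used in the walkthrough


def get_vector_store(docs, embeddings, index_dir: str = INDEX_DIR):
    """Embed the split documents, build a FAISS index, and save it to disk."""
    # Imported lazily so this file can be imported without LangChain installed.
    # Assumed import path; older releases use langchain.vectorstores instead.
    from langchain_community.vectorstores import FAISS

    vectorstore_faiss = FAISS.from_documents(docs, embeddings)
    vectorstore_faiss.save_local(index_dir)  # writes the index files into index_dir
    return vectorstore_faiss
```

Saving locally keeps the demo simple; a hosted vector database would replace only the `save_local` call.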

388
00:21:19,000 --> 00:21:22,000
Now let's talk about the next step.

389
00:21:22,000 --> 00:21:26,000
In the next step, based on the imports, we have to work with LLM models.

390
00:21:26,000 --> 00:21:30,000
Now what we are going to do over here is that I'm going to create some LLM models.

391
00:21:30,000 --> 00:21:35,000
Let's say the first LLM model that I'm going to work with is Claude.

392
00:21:35,000 --> 00:21:35,000
Right.

393
00:21:35,000 --> 00:21:45,000
So I will say get Claude LLM, and inside this we will create the Anthropic model.

394
00:21:45,000 --> 00:21:52,000
Because, as I said already, AWS Bedrock is giving you the power to harness multiple models, to use the

395
00:21:52,000 --> 00:21:53,000
multiple models.

396
00:21:53,000 --> 00:21:56,000
So we can specifically develop in this application itself.

397
00:21:56,000 --> 00:21:59,000
It's just like having different different models in different, different ways.

398
00:21:59,000 --> 00:22:01,000
But here I'll try to make it much more generic.

399
00:22:01,000 --> 00:22:05,000
So here we are going to write create the Anthropic model, and I'm going to create my LLM.

400
00:22:06,000 --> 00:22:08,000
Along with this we are going to use this bedrock.

401
00:22:08,000 --> 00:22:15,000
So Bedrock, um, as I showed you in my previous tutorial, the way of invoking a model

402
00:22:15,000 --> 00:22:18,000
was something like bedrock dot invoke model.

403
00:22:19,000 --> 00:22:23,000
But now, since we are already using LangChain, we don't have to basically use invoke model.

404
00:22:23,000 --> 00:22:27,000
So what we specifically do, we just call this bedrock.

405
00:22:27,000 --> 00:22:32,000
And you can probably see where this Bedrock is present: it is present in LangChain.

406
00:22:32,000 --> 00:22:34,000
So they have actually created a wrapper.

407
00:22:34,000 --> 00:22:37,000
And internally they will invoke that specific model.

408
00:22:37,000 --> 00:22:42,000
But this is how you can use frameworks like LangChain in a generic way.

409
00:22:42,000 --> 00:22:45,000
So now I'm going to use this model as bedrock.

410
00:22:45,000 --> 00:22:49,000
Now with respect to this I will go ahead and write my model ID.

411
00:22:50,000 --> 00:22:55,000
Already I have searched my model ID so this is what is my model ID over here.

412
00:22:55,000 --> 00:23:01,000
How do you get this information? Again, from these particular examples: go and click on any model

413
00:23:01,000 --> 00:23:02,000
you will be able to.

414
00:23:02,000 --> 00:23:05,000
Let's say I want Claude, generate code.

415
00:23:05,000 --> 00:23:08,000
So with respect to this this will be my model ID right.

416
00:23:08,000 --> 00:23:12,000
So it is a generic way of finding the model ID so you don't have to even worry about it.

417
00:23:12,000 --> 00:23:13,000
Right.

418
00:23:13,000 --> 00:23:17,000
So I have my model ID, which I also covered in my previous tutorial.

419
00:23:17,000 --> 00:23:20,000
And then I will go ahead and write my client.

420
00:23:21,000 --> 00:23:24,000
My client will be nothing but bedrock.

421
00:23:25,000 --> 00:23:27,000
Bedrock.

422
00:23:27,000 --> 00:23:30,000
And the next thing that I'm going to specifically give after this, right?

423
00:23:30,000 --> 00:23:32,000
One is model arguments.

424
00:23:32,000 --> 00:23:32,000
Right.

425
00:23:32,000 --> 00:23:35,000
So model arguments with respect to this.

426
00:23:35,000 --> 00:23:38,000
Now these arguments, these arguments that you'll be seeing.

427
00:23:38,000 --> 00:23:38,000
Right.

428
00:23:38,000 --> 00:23:42,000
Where do you find them? Again, if you go here inside the body.

429
00:23:42,000 --> 00:23:48,000
And if you go to the end, right, there will be some arguments that are added like this.

430
00:23:48,000 --> 00:23:50,000
Max tokens to sample, temperature.

431
00:23:50,000 --> 00:23:50,000
This this.

432
00:23:50,000 --> 00:23:52,000
You can also add it over here.

433
00:23:52,000 --> 00:23:57,000
Now with respect to this particular model that I am actually using, this was the argument that was

434
00:23:57,000 --> 00:23:59,000
present inside that particular body.

435
00:23:59,000 --> 00:24:01,000
So that is the reason I had created this.

436
00:24:01,000 --> 00:24:01,000
Right.

437
00:24:01,000 --> 00:24:04,000
All this in a JSON file separately so that I can refer to it.

438
00:24:04,000 --> 00:24:05,000
Okay.

439
00:24:05,000 --> 00:24:07,000
So I hope you are able to understand it.

440
00:24:07,000 --> 00:24:09,000
If you are able to understand, please make sure that you hit like till now.

441
00:24:09,000 --> 00:24:11,000
And now let's go ahead.

442
00:24:11,000 --> 00:24:15,000
And here what we are specifically doing: whenever I call this function, that basically means my Claude

443
00:24:15,000 --> 00:24:17,000
model has got loaded.

444
00:24:17,000 --> 00:24:20,000
So I'm going to return the LLM model.

445
00:24:20,000 --> 00:24:25,000
Now suppose, similarly, you want to go ahead and call the Llama 2 model.

446
00:24:25,000 --> 00:24:28,000
So I will go ahead and paste it over here.

447
00:24:28,000 --> 00:24:30,000
I'll say get Llama 2 LLM, like this.

448
00:24:30,000 --> 00:24:34,000
You can actually create any number of functions as you want okay.

449
00:24:34,000 --> 00:24:36,000
So this will basically be my Llama 2.

450
00:24:36,000 --> 00:24:40,000
Now inside my llama two let me just copy this again I'm going to see my.

451
00:24:41,000 --> 00:24:44,000
So this is the Llama 2 model over here, okay.

452
00:24:44,000 --> 00:24:48,000
Only the model ID changes, and whatever the arguments are, those will change.

453
00:24:48,000 --> 00:24:54,000
So in the case of arguments, with respect to this, this is nothing but max gen len, right?

454
00:24:54,000 --> 00:24:55,000
How I'm getting it.

455
00:24:55,000 --> 00:24:57,000
Let me show you again okay.

456
00:24:57,000 --> 00:24:59,000
So let's say this is a chain of thoughts.

457
00:24:59,000 --> 00:25:01,000
Let's say that I'm going to use this okay.

458
00:25:01,000 --> 00:25:05,000
So here you can probably see this is my model ID okay.

459
00:25:06,000 --> 00:25:11,000
And then inside my arguments this is from where we'll be able to get it right.

460
00:25:11,000 --> 00:25:15,000
So this API request is pretty much important with respect to this okay.

461
00:25:15,000 --> 00:25:17,000
So return a model.

462
00:25:17,000 --> 00:25:21,000
And this is my model that I'm getting with respect to Llama 2.
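Since only the model ID and the arguments change between the two functions, they can share one helper. The sketch below assumes the LangChain `Bedrock` wrapper from `langchain_community.llms`; the model IDs and the value 512 are illustrative assumptions, so copy the real ones from the API request shown for each model in the Bedrock console. `get_llm` and `MODEL_CONFIGS` are hypothetical names.

```python
# Assumed values for illustration only; take the real model ID and arguments
# from the API request shown for each model in the Bedrock console.
MODEL_CONFIGS = {
    "claude": {
        "model_id": "anthropic.claude-v2",
        "model_kwargs": {"max_tokens_to_sample": 512},
    },
    "llama2": {
        "model_id": "meta.llama2-70b-chat-v1",
        "model_kwargs": {"max_gen_len": 512},
    },
}


def get_llm(name: str, client):
    """Return a LangChain LLM wrapper for one of the configured Bedrock models."""
    config = MODEL_CONFIGS[name]  # raises KeyError for an unknown model name
    # Lazy import; assumed path (older releases: langchain.llms.bedrock).
    from langchain_community.llms import Bedrock

    return Bedrock(
        model_id=config["model_id"],
        client=client,
        model_kwargs=config["model_kwargs"],
    )


def get_claude_llm(client):
    return get_llm("claude", client)


def get_llama2_llm(client):
    return get_llm("llama2", client)
```

Here `client` would be the Bedrock runtime client, for example one created with `boto3.client(service_name="bedrock-runtime")`. Adding another model then means adding one entry to the dictionary.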

463
00:25:21,000 --> 00:25:26,000
So now you know how to create, how to call any of the models, right.

464
00:25:26,000 --> 00:25:27,000
Data ingestion is done.

465
00:25:27,000 --> 00:25:29,000
Vector store is done.

466
00:25:29,000 --> 00:25:29,000
Everything is done.

467
00:25:29,000 --> 00:25:32,000
Now let's go ahead and create my prompt template.

468
00:25:32,000 --> 00:25:37,000
Now this prompt template that I'm actually going to create, I'm going to use something called

469
00:25:37,000 --> 00:25:37,000
LangChain.

470
00:25:37,000 --> 00:25:41,000
So in my prompt template I have written see this okay.

471
00:25:44,000 --> 00:25:48,000
So I've given a simple prompt template which you can also use it.

472
00:25:52,000 --> 00:25:54,000
So I have written human in this format.

473
00:25:54,000 --> 00:26:00,000
Use the following pieces of context to provide a concise answer to the question, and at least summarize with 250

474
00:26:00,000 --> 00:26:01,000
words with detailed explanation.

475
00:26:01,000 --> 00:26:03,000
If you don't know the answer, just say that you don't know.

476
00:26:03,000 --> 00:26:05,000
Just don't try to make up the answer.

477
00:26:05,000 --> 00:26:06,000
So context and question are there.

478
00:26:06,000 --> 00:26:10,000
And based on this assistant, whatever assistant output is there, it will get appended over here.

479
00:26:11,000 --> 00:26:14,000
Now with respect to this particular thing we use the LangChain PromptTemplate.

480
00:26:14,000 --> 00:26:17,000
We write context and question and we use it okay.

481
00:26:17,000 --> 00:26:18,000
Now this is fine.

482
00:26:18,000 --> 00:26:19,000
Everything is good.

483
00:26:19,000 --> 00:26:21,000
Now let's focus on the response part.

484
00:26:21,000 --> 00:26:26,000
So here I will go ahead and write get underscore response underscore llm.

485
00:26:26,000 --> 00:26:28,000
And here the first parameter is my llm.

486
00:26:28,000 --> 00:26:31,000
The second parameter is my vector store.

487
00:26:31,000 --> 00:26:32,000
Vectorstore FAISS.

488
00:26:33,000 --> 00:26:35,000
The third parameter that I'm going to give is my query.

489
00:26:35,000 --> 00:26:35,000
Right.

490
00:26:35,000 --> 00:26:38,000
So these three parameters we are going to play with.

491
00:26:38,000 --> 00:26:44,000
Obviously, if I want to get the response from a specific LLM, then I have to give these three pieces of information.

492
00:26:44,000 --> 00:26:50,000
Okay, get response LLM: the LLM model which I'm calling, this is my FAISS index, and this is my query,

493
00:26:50,000 --> 00:26:51,000
okay.

494
00:26:51,000 --> 00:26:57,000
Now once we get this, as you know, we have imported something called RetrievalQA.

495
00:26:57,000 --> 00:26:58,000
So RetrievalQA.

496
00:26:58,000 --> 00:27:00,000
And here also we are going to use it.

497
00:27:00,000 --> 00:27:02,000
And this is how you basically call it.

498
00:27:02,000 --> 00:27:03,000
Right.

499
00:27:03,000 --> 00:27:05,000
So I'm just going to copy it paste it over here.

500
00:27:05,000 --> 00:27:07,000
Again, I'm looking at the LangChain documentation.

501
00:27:07,000 --> 00:27:11,000
So RetrievalQA dot from chain type: I'm giving my LLM model.

502
00:27:11,000 --> 00:27:13,000
Chain type will be stuff I've already shown you.

503
00:27:13,000 --> 00:27:15,000
What is text summarization.

504
00:27:15,000 --> 00:27:17,000
Different different text summarization techniques.

505
00:27:17,000 --> 00:27:18,000
So stuff is there over here.

506
00:27:18,000 --> 00:27:20,000
This is the most important thing.

507
00:27:20,000 --> 00:27:25,000
Retriever: from where the similarity search will basically happen.

508
00:27:25,000 --> 00:27:26,000
So this is where we are doing right.

509
00:27:26,000 --> 00:27:31,000
The FAISS index basically has the entire index itself, right.

510
00:27:31,000 --> 00:27:31,000
Right.

511
00:27:31,000 --> 00:27:36,000
And then we are trying to do the similarity search along with the argument top k three or top three

512
00:27:36,000 --> 00:27:37,000
prompts.

513
00:27:37,000 --> 00:27:41,000
And then we are also saying return source document with respect to this one.

514
00:27:41,000 --> 00:27:43,000
And then I have my prompt over here.

515
00:27:43,000 --> 00:27:44,000
Right.

516
00:27:44,000 --> 00:27:45,000
So all these things are basically done.

517
00:27:45,000 --> 00:27:48,000
And then after this we will be able to get the response.

518
00:27:48,000 --> 00:27:52,000
So now once we specifically write this chain type uh arguments.

519
00:27:52,000 --> 00:27:54,000
And here we are specifically giving the prompt.

520
00:27:54,000 --> 00:27:57,000
The prompt is getting created by the prompt template over here.

521
00:27:57,000 --> 00:28:04,000
And then in the next step we will go ahead and write answer; I'll create a variable called answer.

522
00:28:04,000 --> 00:28:06,000
And we can use this qa variable.

523
00:28:07,000 --> 00:28:13,000
And inside this we will give in the form of query colon.

524
00:28:13,000 --> 00:28:16,000
And here we are going to give the query itself.

525
00:28:16,000 --> 00:28:21,000
So this query is the query that is coming in as an input, okay.

526
00:28:21,000 --> 00:28:27,000
And once we get the answer, see, when this qa is being used, right, it is nothing but

527
00:28:27,000 --> 00:28:27,000
RetrievalQA.

528
00:28:28,000 --> 00:28:33,000
So it will retrieve the response and it will store it in this particular variable.

529
00:28:33,000 --> 00:28:40,000
Now inside this variable there will be a key called result, which will give you the output.

530
00:28:41,000 --> 00:28:42,000
So if you print it right.

531
00:28:42,000 --> 00:28:44,000
Initially I printed this answer.

532
00:28:44,000 --> 00:28:46,000
I've already implemented this.

533
00:28:46,000 --> 00:28:49,000
So there is a key called result which has the entire answer.

534
00:28:50,000 --> 00:28:55,000
So this is my get response LLM, where I'm getting the entire result.
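Putting the pieces of this step together, the response function might look like the sketch below. It assumes the `RetrievalQA` chain from `langchain.chains` and passes the prompt in as a fourth argument for clarity, whereas the walkthrough uses three parameters and refers to the prompt defined earlier in the file.

```python
SEARCH_KWARGS = {"k": 3}  # retrieve the three most similar chunks


def get_response_llm(llm, vectorstore_faiss, query: str, prompt) -> str:
    """Run a retrieval QA chain over the FAISS store and return the answer text."""
    # Lazy import; assumes a LangChain release that still ships RetrievalQA.
    from langchain.chains import RetrievalQA

    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # put all retrieved chunks into a single prompt
        retriever=vectorstore_faiss.as_retriever(
            search_type="similarity", search_kwargs=SEARCH_KWARGS
        ),
        return_source_documents=True,
        chain_type_kwargs={"prompt": prompt},
    )
    answer = qa({"query": query})  # newer releases prefer qa.invoke({"query": query})
    return answer["result"]  # the generated answer; source documents sit under another key
```

Because `return_source_documents` is set, the chain's output is a dictionary rather than a plain string, which is why the answer is read from the `result` key.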

535
00:28:55,000 --> 00:28:58,000
Now let's go ahead and create our Streamlit app.

536
00:28:59,000 --> 00:29:02,000
Now quickly there are two important things with respect to this Streamlit app.

537
00:29:03,000 --> 00:29:09,000
One place I have to make sure that whenever a document is updated, you know that it should get converted

538
00:29:09,000 --> 00:29:11,000
into vector embeddings.

539
00:29:11,000 --> 00:29:17,000
So what I'm actually going to do quickly over here, I'll create a main function and let me show you

540
00:29:17,000 --> 00:29:19,000
what I will be doing okay.

541
00:29:20,000 --> 00:29:23,000
So quickly first of all I will import Streamlit as st.

542
00:29:26,000 --> 00:29:27,000
Streamlit as st.

543
00:29:30,000 --> 00:29:31,000
Fine.

544
00:29:31,000 --> 00:29:32,000
We are going to use Streamlit.

545
00:29:32,000 --> 00:29:33,000
Now see this?

546
00:29:33,000 --> 00:29:35,000
The thing that I've copied and pasted.

547
00:29:35,000 --> 00:29:39,000
So I've written Chat PDF, Chat with PDF using AWS Bedrock.

548
00:29:39,000 --> 00:29:40,000
This is my user question.

549
00:29:40,000 --> 00:29:42,000
Like what kind of question I want from the pdf file.

550
00:29:42,000 --> 00:29:53,000
Then a sidebar is basically created and I will say update or create vector store, right?

551
00:29:53,000 --> 00:29:59,000
That basically means inside this I will create a button saying that vectors update.

552
00:29:59,000 --> 00:30:05,000
That basically means once I click click right then it should go ahead and call this data ingestion.

553
00:30:05,000 --> 00:30:11,000
Now data ingestion, what it is going to do it is this data ingestion is going to read all the files

554
00:30:11,000 --> 00:30:12,000
from the data folder.

555
00:30:12,000 --> 00:30:15,000
Then it is going to take this loader documents.

556
00:30:15,000 --> 00:30:18,000
It is going to perform this recursive character text splitter.

557
00:30:18,000 --> 00:30:22,000
And then, after doing this, it is going to return all these documents.

558
00:30:22,000 --> 00:30:25,000
Right after this we are going to call this vector store.

559
00:30:25,000 --> 00:30:27,000
So one by one we will be calling it.

560
00:30:27,000 --> 00:30:30,000
So here you will be able to see data ingestion and vector store.

561
00:30:30,000 --> 00:30:30,000
We are calling.

562
00:30:30,000 --> 00:30:31,000
We get this docs.

563
00:30:32,000 --> 00:30:33,000
Now inside that vector store.

564
00:30:33,000 --> 00:30:39,000
What is happening is that we will save this in our local, uh, folder itself in the hard disk in the

565
00:30:39,000 --> 00:30:40,000
form of a FAISS index.

566
00:30:40,000 --> 00:30:47,000
So as soon as I click this button, that is Vectors Update, which will be available on the side,

567
00:30:47,000 --> 00:30:48,000
inside the sidebar.

568
00:30:49,000 --> 00:30:52,000
Meaning, in Streamlit it is on the left-hand side.

569
00:30:52,000 --> 00:30:57,000
So the vectors will get created, the vector store will get created, and it will also be saved in a local

570
00:30:57,000 --> 00:30:58,000
folder itself.

571
00:30:58,000 --> 00:30:58,000
Right.

572
00:30:58,000 --> 00:31:03,000
So over here a folder will appear with this particular name, that is, faiss index.

573
00:31:03,000 --> 00:31:04,000
This is one step.

574
00:31:04,000 --> 00:31:10,000
Now the second step is that I can go ahead and create a Claude output.

575
00:31:10,000 --> 00:31:14,000
Okay, I will create a button which is called Claude Output.

576
00:31:14,000 --> 00:31:14,000
Now see this.

577
00:31:14,000 --> 00:31:18,000
This is important to understand okay so this is my button.

578
00:31:18,000 --> 00:31:20,000
I've created another button.

579
00:31:20,000 --> 00:31:25,000
When I say Claude Output, that basically means I have to use the Claude model API.

580
00:31:25,000 --> 00:31:32,000
So the first thing, as soon as I click on the Claude Output button, my FAISS index should get loaded from

581
00:31:32,000 --> 00:31:33,000
the local.

582
00:31:33,000 --> 00:31:38,000
So that is the reason I'm writing dot load underscore local with this faiss index and the Bedrock

583
00:31:38,000 --> 00:31:40,000
embedding, the same embedding which I've actually used.

584
00:31:41,000 --> 00:31:45,000
Then I will go ahead and call this get Claude LLM, right.

585
00:31:45,000 --> 00:31:48,000
I'm going to call this get Claude LLM because I will get the LLM itself.

586
00:31:48,000 --> 00:31:54,000
Now if I go inside this function, if I go ahead and press F12, okay.

587
00:31:54,000 --> 00:31:58,000
So here you will be able to see where it is.

588
00:31:58,000 --> 00:31:59,000
Uh, get Claude LLM.

589
00:31:59,000 --> 00:32:00,000
So get Claude LLM,

590
00:32:00,000 --> 00:32:05,000
it is going to call this particular model and return our LLM over here.

591
00:32:05,000 --> 00:32:05,000
Right.

592
00:32:05,000 --> 00:32:08,000
So here I'm going to get my LLM with respect to this.

593
00:32:08,000 --> 00:32:12,000
And this LLM will specifically be used inside this function.

594
00:32:12,000 --> 00:32:17,000
That is get response underscore llm, where I have given my LLM, vectorstore FAISS, and query, right.

595
00:32:17,000 --> 00:32:20,000
When I give these three pieces of information over here, I'll get my response.

596
00:32:20,000 --> 00:32:23,000
And I am writing this particular response over here.

597
00:32:24,000 --> 00:32:25,000
Very much simple.

598
00:32:25,000 --> 00:32:25,000
Very much easy.

599
00:32:25,000 --> 00:32:26,000
Right.

600
00:32:26,000 --> 00:32:30,000
So let's execute this and let's see whether this is working fine or not okay.

601
00:32:30,000 --> 00:32:32,000
So this is my main function.

602
00:32:32,000 --> 00:32:37,000
So what I will do, I will create this if underscore underscore name with respect to this main.

603
00:32:37,000 --> 00:32:40,000
Now let's see whether we will get any error okay.

604
00:32:41,000 --> 00:32:42,000
So I will clear my screen.

605
00:32:43,000 --> 00:32:46,000
But in short we have executed each and every thing right.

606
00:32:46,000 --> 00:32:50,000
So I will go ahead and write streamlit run app dot py.

607
00:32:50,000 --> 00:32:51,000
Let's see whether we will get any errors.

608
00:32:51,000 --> 00:32:55,000
First of all, uh, okay.

609
00:32:55,000 --> 00:32:56,000
It's working fine.

610
00:32:56,000 --> 00:32:57,000
I will close this.

611
00:32:57,000 --> 00:32:58,000
Okay.

612
00:32:58,000 --> 00:33:03,000
Now, on the left-hand side, I will get this Vectors Update, Update or Create Vector Store.

613
00:33:03,000 --> 00:33:06,000
And here I have Chat PDF using AWS Bedrock.

614
00:33:06,000 --> 00:33:06,000
Okay.

615
00:33:06,000 --> 00:33:11,000
Now the first step as I said what I'm going to do here is my data folder.

616
00:33:11,000 --> 00:33:15,000
It has these PDF files: attention dot pdf, yolo dot pdf.

617
00:33:15,000 --> 00:33:18,000
So what I'm going to do I'm going to probably click on Vector Update.

618
00:33:18,000 --> 00:33:20,000
As soon as I click on Vector Update.

619
00:33:20,000 --> 00:33:21,000
Now what is going to happen.

620
00:33:21,000 --> 00:33:22,000
See over here.

621
00:33:22,000 --> 00:33:26,000
As soon as I click on Vector update, what is going to happen.

622
00:33:26,000 --> 00:33:29,000
So if this is clicked that basically means the data ingestion folder.

623
00:33:29,000 --> 00:33:30,000
It will go.

624
00:33:30,000 --> 00:33:36,000
It will read both these PDFs and then it will convert them into vectors, and we will save those vectors over

625
00:33:36,000 --> 00:33:37,000
here.

626
00:33:37,000 --> 00:33:40,000
Right now you cannot see any folder with the name faiss index.

627
00:33:40,000 --> 00:33:43,000
But now if I go ahead and click on vector update.

628
00:33:43,000 --> 00:33:44,000
Now what is going to happen.

629
00:33:44,000 --> 00:33:47,000
See as soon as this vector update will finish.

630
00:33:47,000 --> 00:33:52,000
Now processing is basically happening and this will probably come up with like done okay done status.

631
00:33:52,000 --> 00:33:54,000
Unless until we don't get any errors.

632
00:33:54,000 --> 00:33:56,000
Let's see with respect to this right what will happen.

633
00:33:57,000 --> 00:34:00,000
So it is now loading all the PDF it is performing.

634
00:34:00,000 --> 00:34:02,000
It is converting those into vectors.

635
00:34:02,000 --> 00:34:03,000
Right.

636
00:34:03,000 --> 00:34:06,000
And then it has now stored that as a FAISS index.

637
00:34:06,000 --> 00:34:12,000
Now if you go over here and see, this folder, my faiss index, has been created; it has index dot faiss

638
00:34:12,000 --> 00:34:13,000
and index dot pickle.

639
00:34:13,000 --> 00:34:15,000
Right now this is the first step.

640
00:34:15,000 --> 00:34:19,000
Now the second step is another button, which is basically Claude Output.

641
00:34:19,000 --> 00:34:21,000
Now Claude Output is over here.

642
00:34:21,000 --> 00:34:27,000
Now if I give any prompt what is attention is all you need.

643
00:34:27,000 --> 00:34:29,000
What is attention is all you need.

644
00:34:30,000 --> 00:34:33,000
Okay, if I go ahead and click on Claude Output.

645
00:34:33,000 --> 00:34:36,000
Now it is going to call the Claude model, right.

646
00:34:36,000 --> 00:34:40,000
So if you probably go ahead and see the code again what is basically happening.

647
00:34:40,000 --> 00:34:43,000
First we will load this FAISS index from local disk.

648
00:34:43,000 --> 00:34:46,000
Then we will call the Claude LLM model.

649
00:34:46,000 --> 00:34:52,000
Then we will call this get response LLM, where we are giving all three pieces of information, including the user question.

650
00:34:52,000 --> 00:34:56,000
And now finally you'll be able to see I'm getting the output right now.

651
00:34:56,000 --> 00:34:58,000
This is with respect to Claude Output, right.

652
00:34:58,000 --> 00:35:03,000
So similarly, if I go ahead and write what is YOLO, I will go ahead and click on Claude Output.

653
00:35:03,000 --> 00:35:06,000
Now again see follow this pattern inside it will go again.

654
00:35:06,000 --> 00:35:08,000
It will load this FAISS index.

655
00:35:08,000 --> 00:35:09,000
Right.

656
00:35:09,000 --> 00:35:12,000
And then we will call this particular Claude model.

657
00:35:12,000 --> 00:35:14,000
Now, how can the performance be improved?

658
00:35:14,000 --> 00:35:16,000
This FAISS index, I will not load each and every time.

659
00:35:16,000 --> 00:35:18,000
So let me do one thing.

660
00:35:18,000 --> 00:35:22,000
Let me call this once okay.

661
00:35:23,000 --> 00:35:25,000
Let me do one thing.

662
00:35:28,000 --> 00:35:34,000
For the first time I will just keep it outside this okay.

663
00:35:34,000 --> 00:35:36,000
But again it will give us an error.

664
00:35:36,000 --> 00:35:38,000
But I will give you that as an assignment.

665
00:35:38,000 --> 00:35:40,000
First of all, let's work on this okay.

666
00:35:40,000 --> 00:35:42,000
So here you can see my YOLO is also coming.

667
00:35:42,000 --> 00:35:44,000
Now see, if I want to add Llama 2 and all.

668
00:35:44,000 --> 00:35:46,000
What I need to do I will go over here.

669
00:35:46,000 --> 00:35:49,000
My Llama 2 function is already created, if you remember.

670
00:35:49,000 --> 00:35:54,000
See, over here is my Llama 2 function, get Llama 2 LLM, right?

671
00:35:54,000 --> 00:35:56,000
So I will create another button.

672
00:35:56,000 --> 00:35:58,000
I will copy it like this.

673
00:35:58,000 --> 00:36:03,000
If my st dot button is Llama 2, right?

674
00:36:03,000 --> 00:36:07,000
So this will be my Llama 2 Output.

675
00:36:08,000 --> 00:36:12,000
Now instead of calling get Claude LLM, I will say get Llama 2 LLM.

676
00:36:12,000 --> 00:36:16,000
So now we will go ahead and check whether it is working fine for Llama 2 or not.

677
00:36:16,000 --> 00:36:19,000
So I will save this quickly.

678
00:36:19,000 --> 00:36:22,000
Now I'll click on Llama 2.

679
00:36:22,000 --> 00:36:24,000
Now let's see whether we'll get the output or not.

680
00:36:24,000 --> 00:36:26,000
Again, the same thing is going to happen.

681
00:36:26,000 --> 00:36:29,000
First of all, it is going to load the FAISS index.

682
00:36:29,000 --> 00:36:32,000
After that it is going to take the LLM model.

683
00:36:32,000 --> 00:36:33,000
It is going to give all the information.

684
00:36:33,000 --> 00:36:35,000
And then finally you'll be getting the answer.

685
00:36:35,000 --> 00:36:42,000
Similarly I can go ahead and write what is attention is all you need okay.

686
00:36:42,000 --> 00:36:46,000
So if I go ahead and click on Llama 2, then again I will be able to get the response.

687
00:36:46,000 --> 00:36:50,000
So all of this is basically happening with AWS Bedrock.

688
00:36:50,000 --> 00:36:55,000
So step by step I have shown you almost everything: the data ingestion step, what libraries are

689
00:36:55,000 --> 00:36:58,000
specifically getting used, all this code, I have written it in front of you.

690
00:36:58,000 --> 00:37:03,000
The Streamlit file we have actually created, I would suggest go ahead and try it from your side, and

691
00:37:03,000 --> 00:37:05,000
this is how you are able to get the response.

692
00:37:05,000 --> 00:37:11,000
But the best thing is that all these models are available in AWS Bedrock and it is already scalable.

693
00:37:11,000 --> 00:37:13,000
You can actually use it according to your needs.