1
00:00:00,000 --> 00:00:00,000
Hello guys!

2
00:00:00,000 --> 00:00:06,000
In this video, I am going to show you some of the amazing, powerful features of, uh, Nvidia Nim.

3
00:00:06,000 --> 00:00:09,000
Uh, and again, which was recently announced by Nvidia.

4
00:00:09,000 --> 00:00:14,000
Uh, and if I talk about Nvidia Nim, it is a latest breakthrough in the generative AI development.

5
00:00:14,000 --> 00:00:15,000
Okay.

6
00:00:15,000 --> 00:00:20,000
Nvidia Nim is a set of inference microservices for deploying AI models.

7
00:00:20,000 --> 00:00:24,000
And it revolutionizes the way how we deploy generative AI enterprises.

8
00:00:24,000 --> 00:00:29,000
Along with this, Nvidia Nim offers multiple AI models, right?

9
00:00:29,000 --> 00:00:31,000
It can be a model, a model, multi model.

10
00:00:31,000 --> 00:00:35,000
Not only that, it also provides you an Nvidia AI foundation model.

11
00:00:35,000 --> 00:00:38,000
Just with the help of APIs, you'll be able to integrate in your application.

12
00:00:38,000 --> 00:00:42,000
You'll be able to seamlessly run it and it is quite highly scalable.

13
00:00:42,000 --> 00:00:44,000
So in this video I'm going to talk about this.

14
00:00:44,000 --> 00:00:47,000
I'm also going to show you multiple examples with the help of coding.

15
00:00:47,000 --> 00:00:51,000
So please make sure that you watch this video till the end because this is an amazing feature.

16
00:00:51,000 --> 00:00:55,000
And as I always say, there will be many, many LM models that will be coming up.

17
00:00:55,000 --> 00:01:01,000
But the clear winner will be the company that provides the best inferencing thing for us.

18
00:01:01,000 --> 00:01:01,000
Right.

19
00:01:01,000 --> 00:01:03,000
So let us go ahead.

20
00:01:03,000 --> 00:01:04,000
And this is the page over here.

21
00:01:04,000 --> 00:01:06,000
You can see instantly run and deploy generative AI.

22
00:01:06,000 --> 00:01:13,000
Explore the latest community built AI models or AI with API optimized and accelerated by Nvidia, and

23
00:01:13,000 --> 00:01:15,000
then deploy anywhere with Nvidia name.

24
00:01:15,000 --> 00:01:20,000
Uh, you'll also be able to experience leading open source models and I will be showing more about this.

25
00:01:20,000 --> 00:01:25,000
You know, uh, you can also do integration, uh, just by using an API call away.

26
00:01:25,000 --> 00:01:30,000
Uh, along with this, you will be able to run anyway, accelerate your AI deployment with Nvidia name,

27
00:01:30,000 --> 00:01:32,000
uh, how to buy.

28
00:01:32,000 --> 00:01:32,000
So here it is.

29
00:01:32,000 --> 00:01:38,000
And the best thing is that just to for you to try it out, uh, once you probably create a page you'll

30
00:01:38,000 --> 00:01:39,000
be getting 1000 credits.

31
00:01:39,000 --> 00:01:44,000
So which will be more than sufficient to explore and probably call multiple models to you.

32
00:01:44,000 --> 00:01:44,000
Right?

33
00:01:44,000 --> 00:01:46,000
So let us go ahead and let us see.

34
00:01:46,000 --> 00:01:48,000
So first of all go to this particular page.

35
00:01:48,000 --> 00:01:52,000
Anyhow, I will be giving you this link in the description of this particular video.

36
00:01:52,000 --> 00:01:53,000
Just click on try it now.

37
00:01:53,000 --> 00:01:56,000
And once you probably go ahead and click on try It now.

38
00:01:56,000 --> 00:01:58,000
Here you will be able to see all the models right.

39
00:01:58,000 --> 00:02:01,000
So with respect to models you can see llama 370 B.

40
00:02:01,000 --> 00:02:06,000
It has almost all the open source model foundation models along with that Nvidia Foundation models.

41
00:02:06,000 --> 00:02:08,000
Also it has right even open source model.

42
00:02:08,000 --> 00:02:09,000
Also it has.

43
00:02:09,000 --> 00:02:10,000
So here you'll be able to see gamma.

44
00:02:10,000 --> 00:02:12,000
You'll be able to see 85 images.

45
00:02:12,000 --> 00:02:14,000
So multi-model model LM model.

46
00:02:14,000 --> 00:02:18,000
So here with respect to models you can see reasoning is there for reasoning.

47
00:02:18,000 --> 00:02:19,000
You can use all these models.

48
00:02:19,000 --> 00:02:20,000
Visual design.

49
00:02:20,000 --> 00:02:23,000
Uh all these models are actually there retrieval right.

50
00:02:23,000 --> 00:02:26,000
If you want to probably implement retrieval it is also there.

51
00:02:26,000 --> 00:02:29,000
Then here you have speech biology gaming.

52
00:02:29,000 --> 00:02:31,000
Multiple things are specifically over here.

53
00:02:31,000 --> 00:02:33,000
So let me show you one example over here.

54
00:02:33,000 --> 00:02:38,000
And then we'll also try to create an end to end application that will be a Rag application using this

55
00:02:38,000 --> 00:02:39,000
Nvidia name.

56
00:02:39,000 --> 00:02:42,000
Uh so let's go ahead and let's start our project.

57
00:02:42,000 --> 00:02:46,000
But before I go ahead, uh, let me just go ahead and show you one of the things like, uh, so before

58
00:02:46,000 --> 00:02:48,000
I start, uh, any project.

59
00:02:48,000 --> 00:02:51,000
Right, it is very much important that you also need to have an API key.

60
00:02:51,000 --> 00:02:53,000
Now, how to probably get an API key.

61
00:02:53,000 --> 00:02:58,000
Let's say that in my project, in my Rag application, I want to probably use llama 370 billion instruct,

62
00:02:58,000 --> 00:02:59,000
right?

63
00:02:59,000 --> 00:03:02,000
And right now, since the inferencing is happening in the Nvidia name itself.

64
00:03:02,000 --> 00:03:07,000
So if you go ahead and click this particular model here, you'll be able to see something like this.

65
00:03:07,000 --> 00:03:08,000
Right here you will be able to chat.

66
00:03:08,000 --> 00:03:14,000
So let's say if I say hi and if I send a message you'll be able to see that I'll be able to get a response.

67
00:03:14,000 --> 00:03:16,000
Uh, how are you?

68
00:03:16,000 --> 00:03:16,000
Okay.

69
00:03:16,000 --> 00:03:19,000
Any question that you want over here, how are you?

70
00:03:19,000 --> 00:03:21,000
So here you can see this is the code.

71
00:03:21,000 --> 00:03:21,000
Right.

72
00:03:21,000 --> 00:03:22,000
And when I'm writing how are you.

73
00:03:22,000 --> 00:03:25,000
So this is the code that is basically getting set with respect to the content.

74
00:03:25,000 --> 00:03:27,000
So you can use this particular code.

75
00:03:27,000 --> 00:03:29,000
And you can also call it right.

76
00:03:29,000 --> 00:03:30,000
And I'll also be showing you.

77
00:03:30,000 --> 00:03:32,000
So once I probably send how are you.

78
00:03:32,000 --> 00:03:34,000
You'll be able to get the response over here.

79
00:03:34,000 --> 00:03:38,000
Now when you go to this uh bill.nvidia.com.

80
00:03:38,000 --> 00:03:39,000
First of all you need to log in.

81
00:03:39,000 --> 00:03:40,000
Right.

82
00:03:40,000 --> 00:03:42,000
So here you can see I have logged in over here.

83
00:03:42,000 --> 00:03:46,000
If you have not uh if you don't have an account I would suggest please go ahead and create an account.

84
00:03:46,000 --> 00:03:49,000
And here you can see I have 954 credits left.

85
00:03:49,000 --> 00:03:50,000
Initially.

86
00:03:50,000 --> 00:03:52,000
You will be getting 1000 when you are probably creating a new account.

87
00:03:52,000 --> 00:03:57,000
Okay, now uh, this is the thing here you can actually see, right?

88
00:03:57,000 --> 00:04:01,000
Uh, the entire, uh, code is also visible, right?

89
00:04:01,000 --> 00:04:03,000
So I will show you how you can run this particular code.

90
00:04:03,000 --> 00:04:08,000
And again over here you can see that there is an API key that is required in order to generate the API

91
00:04:08,000 --> 00:04:08,000
key.

92
00:04:08,000 --> 00:04:11,000
All you have to do is that click on get API key over here right.

93
00:04:11,000 --> 00:04:15,000
So here you'll be able to see uh get API key in this green color.

94
00:04:15,000 --> 00:04:16,000
And just click on this.

95
00:04:16,000 --> 00:04:20,000
This key authenticates your Nvidia AI foundation endpoint for test and evaluation.

96
00:04:20,000 --> 00:04:22,000
And just go ahead and generate the key.

97
00:04:22,000 --> 00:04:26,000
Now I will be making sure that I copy this particular key, because I'm going to use this particular

98
00:04:26,000 --> 00:04:28,000
key for my coding purpose okay.

99
00:04:28,000 --> 00:04:30,000
Now let's go ahead.

100
00:04:30,000 --> 00:04:32,000
And first of all let's go ahead and open my VS code.

101
00:04:32,000 --> 00:04:38,000
And I will show you how we can go ahead and create this entire project, that code that you'll be able

102
00:04:38,000 --> 00:04:38,000
to see over there.

103
00:04:38,000 --> 00:04:41,000
Write this code also will try to run it okay.

104
00:04:41,000 --> 00:04:42,000
So let's go ahead step by step.

105
00:04:42,000 --> 00:04:43,000
Let's go ahead and do this.

106
00:04:43,000 --> 00:04:48,000
So first of all uh let me go ahead and create my environment.

107
00:04:48,000 --> 00:04:49,000
So here I'm going to use conda.

108
00:04:49,000 --> 00:04:58,000
So conda create minus p v and v python double equal to 3.10 I'll just use 3.10 and let us go ahead and

109
00:04:58,000 --> 00:04:59,000
create this environment.

110
00:04:59,000 --> 00:05:00,000
Okay.

111
00:05:00,000 --> 00:05:06,000
Now over here you definitely require multiple, uh, multiple, uh, requirements, um, in multiple

112
00:05:06,000 --> 00:05:08,000
packages in the requirement dot txt.

113
00:05:08,000 --> 00:05:10,000
So we will also go ahead and update that okay.

114
00:05:10,000 --> 00:05:15,000
So first of all, uh, let me just go ahead and yes, one more thing that I'm going to show you, the

115
00:05:15,000 --> 00:05:18,000
Rag application that we are going to create will do it with the help of Lang Lang.

116
00:05:19,000 --> 00:05:21,000
And also so Lang Syne also has an integration for that.

117
00:05:21,000 --> 00:05:26,000
So first of all, let me just quickly go ahead and write requirements dot txt.

118
00:05:27,000 --> 00:05:27,000
Okay.

119
00:05:27,000 --> 00:05:32,000
So let me go ahead and create a file requirements dot txt.

120
00:05:32,000 --> 00:05:38,000
So first of all what all packages I specifically require I'll be using OpenAI because in this particular

121
00:05:38,000 --> 00:05:41,000
code that I see right there was uh OpenAI over here.

122
00:05:41,000 --> 00:05:43,000
So let me just first of all copy this.

123
00:05:43,000 --> 00:05:43,000
Okay.

124
00:05:43,000 --> 00:05:45,000
I will be requiring this key.

125
00:05:45,000 --> 00:05:47,000
So I've copied it in my node back notebook.

126
00:05:47,000 --> 00:05:49,000
So this is my entire code over here.

127
00:05:49,000 --> 00:05:51,000
So let me just copy this entire code okay.

128
00:05:51,000 --> 00:05:54,000
Copy the code and paste it over here.

129
00:05:54,000 --> 00:05:59,000
So let me just go ahead and create my app.py file and let me paste it over here okay.

130
00:05:59,000 --> 00:06:00,000
And we'll go step by step.

131
00:06:00,000 --> 00:06:02,000
We'll understand what exactly this is okay.

132
00:06:02,000 --> 00:06:07,000
So first of all in the requirement dot txt I am going to use OpenAI along with this.

133
00:06:07,000 --> 00:06:12,000
Uh you'll also be seeing that uh I will be using Python dot env.

134
00:06:12,000 --> 00:06:17,000
The reason why I'm importing this because in the in my env file I will be creating that API key.

135
00:06:17,000 --> 00:06:18,000
So I will be requiring that okay.

136
00:06:19,000 --> 00:06:20,000
Uh this is the next thing.

137
00:06:20,000 --> 00:06:25,000
And along with this uh, right now I'll just keep this two packages over here just to run that code.

138
00:06:25,000 --> 00:06:32,000
Now, uh, let me go ahead and write pip install minus our requirement dot txt.

139
00:06:32,000 --> 00:06:35,000
Before that, let me activate my VNC environment.

140
00:06:35,000 --> 00:06:38,000
So conda activate v and v okay.

141
00:06:38,000 --> 00:06:44,000
So this is the activate environment that I will specifically be using now or what I'm actually going

142
00:06:44,000 --> 00:06:44,000
to do.

143
00:06:44,000 --> 00:06:50,000
Just go ahead and write pip install pip install minus r requirement dot txt okay.

144
00:06:50,000 --> 00:06:53,000
So the installation will specifically happen over here.

145
00:06:53,000 --> 00:06:55,000
Uh all the installation will happen.

146
00:06:55,000 --> 00:06:59,000
And here you'll be able to see that once the installation is taken place I will go ahead and run this

147
00:06:59,000 --> 00:07:02,000
particular code, because here I just require OpenAI.

148
00:07:02,000 --> 00:07:06,000
Now let's understand this particular code and uh, what all things are there in this code.

149
00:07:06,000 --> 00:07:11,000
So first of all I am importing from OpenAI, I import OpenAI and then we are creating a client with

150
00:07:11,000 --> 00:07:12,000
respect to OpenAI.

151
00:07:12,000 --> 00:07:19,000
The base URL over here will be this integrate.api.nvidia.com/v one which is given from the code.

152
00:07:19,000 --> 00:07:21,000
This is my API key okay.

153
00:07:21,000 --> 00:07:24,000
Please make sure that you don't have this publicly visible API key.

154
00:07:24,000 --> 00:07:26,000
Instead, what you can do you can.

155
00:07:26,000 --> 00:07:32,000
As you know that I have already imported Python dot env so we can directly create an environment variable

156
00:07:32,000 --> 00:07:33,000
and use this.

157
00:07:33,000 --> 00:07:33,000
Okay.

158
00:07:33,000 --> 00:07:34,000
So I will do that.

159
00:07:34,000 --> 00:07:39,000
Uh, first of all I'll create an environment variable over here and let me do that first step uh, so

160
00:07:39,000 --> 00:07:43,000
that I can actually use it when I'm creating my end to end project.

161
00:07:43,000 --> 00:07:44,000
So this is my dot env.

162
00:07:44,000 --> 00:07:47,000
Uh, this is my environment variable that I'm going to create.

163
00:07:47,000 --> 00:07:49,000
right Nvidia underscore API underscore key.

164
00:07:50,000 --> 00:07:51,000
And this is my API key okay.

165
00:07:51,000 --> 00:07:54,000
And I can call this API key wherever I want.

166
00:07:54,000 --> 00:07:58,000
Um by using uh this Python dot env file okay.

167
00:07:59,000 --> 00:08:01,000
And that I'll show you once I probably create a rack project.

168
00:08:01,000 --> 00:08:06,000
Now once I probably create this particular client, then, uh, we have to use client dot chat completion

169
00:08:06,000 --> 00:08:07,000
dot create.

170
00:08:07,000 --> 00:08:09,000
Here we are giving the model name.

171
00:08:09,000 --> 00:08:09,000
Again.

172
00:08:09,000 --> 00:08:10,000
It is been given.

173
00:08:10,000 --> 00:08:11,000
The entire code is given over here.

174
00:08:11,000 --> 00:08:15,000
See, I did not write it any like Nvidia name is already providing this.

175
00:08:15,000 --> 00:08:18,000
And just imagine just using this and directly executing it.

176
00:08:18,000 --> 00:08:19,000
It's quite amazing.

177
00:08:19,000 --> 00:08:22,000
Then messages role is equal to user content.

178
00:08:22,000 --> 00:08:22,000
How are you?

179
00:08:22,000 --> 00:08:24,000
Okay so I have written this particular message.

180
00:08:24,000 --> 00:08:31,000
Or let me just go ahead and write uh provide me a paragraph, provide me, uh, provide me an essay.

181
00:08:31,000 --> 00:08:32,000
Okay.

182
00:08:32,000 --> 00:08:34,000
On machine learning.

183
00:08:34,000 --> 00:08:34,000
Okay.

184
00:08:37,000 --> 00:08:37,000
Okay.

185
00:08:37,000 --> 00:08:40,000
Provide me an article on machine learning.

186
00:08:40,000 --> 00:08:40,000
Okay.

187
00:08:40,000 --> 00:08:41,000
Something like this.

188
00:08:42,000 --> 00:08:42,000
Okay.

189
00:08:45,000 --> 00:08:50,000
So then we are setting up the temperature value top underscore P will be one max tokens 1024.

190
00:08:50,000 --> 00:08:51,000
And stream is equal to true.

191
00:08:51,000 --> 00:08:53,000
When we are keeping stream is equal to true.

192
00:08:53,000 --> 00:08:55,000
That basically means the entire completion will.

193
00:08:55,000 --> 00:08:59,000
We will be able to, uh, see in the form of uh, streams.

194
00:08:59,000 --> 00:09:03,000
And this is basically there in OpenAI right then from chunk in completion.

195
00:09:03,000 --> 00:09:05,000
We are executing this particular code.

196
00:09:05,000 --> 00:09:09,000
Now, let me just go ahead and run this and let's see whether we will be able to get the output or not.

197
00:09:09,000 --> 00:09:12,000
So here I'm going to just write Python app dot Pi.

198
00:09:12,000 --> 00:09:18,000
And uh here uh see machine learning all the answers is coming over here.

199
00:09:18,000 --> 00:09:19,000
It is streaming.

200
00:09:19,000 --> 00:09:21,000
It is giving you the entire output.

201
00:09:21,000 --> 00:09:22,000
This is perfect.

202
00:09:22,000 --> 00:09:23,000
You are able to get the output.

203
00:09:23,000 --> 00:09:26,000
So that basically means you're just able to execute it and how quick it is.

204
00:09:26,000 --> 00:09:28,000
Nvidia Nim.

205
00:09:28,000 --> 00:09:33,000
Trust me, the inferencing is very, very fast and that is what it is going to basically bring up a

206
00:09:33,000 --> 00:09:35,000
breakthrough and an amazing revolution, right?

207
00:09:36,000 --> 00:09:39,000
Uh, in the entire process of generative AI development, and at the end of the day, company really

208
00:09:39,000 --> 00:09:41,000
needs to think about inferencing.

209
00:09:41,000 --> 00:09:44,000
And if I talk about Nvidia, it is the king of GPUs.

210
00:09:44,000 --> 00:09:44,000
Right?

211
00:09:44,000 --> 00:09:48,000
So the inferencing needs to be obviously very, very good okay.

212
00:09:48,000 --> 00:09:48,000
Okay.

213
00:09:48,000 --> 00:09:50,000
So this was it.

214
00:09:50,000 --> 00:09:53,000
Uh, now here, uh, this, uh, simple thing we have actually done.

215
00:09:53,000 --> 00:09:56,000
Now, let us go ahead and do some amazing end to end project.

216
00:09:56,000 --> 00:10:01,000
And this time I'm actually going to show you a project which is basically a kind of RAC project and

217
00:10:01,000 --> 00:10:04,000
how you can use along with Lang Chain.

218
00:10:04,000 --> 00:10:06,000
Uh, that also I'll probably show you.

219
00:10:06,000 --> 00:10:10,000
So first of all, I will go ahead and uh update my requirement dot txt.

220
00:10:10,000 --> 00:10:14,000
So here I'm going to use lang chain Nvidia underscore I underscore endpoints.

221
00:10:14,000 --> 00:10:20,000
This will be the lang chain integration which will actually help you to call all the Nvidia um models

222
00:10:20,000 --> 00:10:21,000
that it has in him.

223
00:10:21,000 --> 00:10:26,000
Then you will be also importing lang chain underscore community uh five CPU.

224
00:10:26,000 --> 00:10:29,000
Along with this, since I'm going to create a Streamlit app, I'm going to use this.

225
00:10:29,000 --> 00:10:32,000
And one more thing that I'm going to use is pi PDF.

226
00:10:32,000 --> 00:10:32,000
Okay.

227
00:10:32,000 --> 00:10:36,000
So all this uh, requirements I'm actually going to use it.

228
00:10:36,000 --> 00:10:41,000
Now let me just open the terminal again with respect to this, I'm going to go ahead and write pip install

229
00:10:42,000 --> 00:10:45,000
minus our requirement dot txt okay.

230
00:10:46,000 --> 00:10:48,000
So the installation will take place.

231
00:10:48,000 --> 00:10:50,000
Uh it is going to take again some time with respect to this.

232
00:10:50,000 --> 00:10:55,000
Let me close this uh folders are visible over here okay.

233
00:10:55,000 --> 00:10:59,000
So once this is done, I will go ahead and start my development.

234
00:10:59,000 --> 00:10:59,000
Okay.

235
00:10:59,000 --> 00:11:04,000
So here, uh, let me go ahead and write final app dot pi.

236
00:11:06,000 --> 00:11:06,000
Okay.

237
00:11:06,000 --> 00:11:08,000
Now final app dot pi.

238
00:11:08,000 --> 00:11:11,000
Uh, this is where I am going to specifically write all my code.

239
00:11:11,000 --> 00:11:14,000
Uh, and, uh, we'll be seeing what all things we are going to use.

240
00:11:14,000 --> 00:11:15,000
Okay.

241
00:11:15,000 --> 00:11:21,000
Now, as you know, uh, whenever we are creating a Rag application, we will be using some kind of

242
00:11:21,000 --> 00:11:21,000
things, right?

243
00:11:21,000 --> 00:11:23,000
Let's say PDF or something as such.

244
00:11:23,000 --> 00:11:26,000
So, uh, I have some of the PDFs over here.

245
00:11:26,000 --> 00:11:27,000
Okay.

246
00:11:27,000 --> 00:11:32,000
So this for PDFs I have over here I will copy this okay.

247
00:11:32,000 --> 00:11:35,000
And let me open this particular folder.

248
00:11:35,000 --> 00:11:40,000
So we are going to read all the PDFs from here from this US Kansas census.

249
00:11:40,000 --> 00:11:42,000
So all the PDFs is over here.

250
00:11:42,000 --> 00:11:46,000
We'll read this and we'll try to create a Rag application wherein we will be asking any questions related

251
00:11:46,000 --> 00:11:47,000
to this.

252
00:11:47,000 --> 00:11:50,000
Here we are also going to perform embedding okay.

253
00:11:50,000 --> 00:11:52,000
Now when I say we are going to perform embedding.

254
00:11:52,000 --> 00:11:54,000
So we will be using Nvidia embeddings okay.

255
00:11:54,000 --> 00:12:00,000
And that is what we are I'm going to show you also so quickly uh let's go ahead and import uh Streamlit

256
00:12:00,000 --> 00:12:00,000
okay.

257
00:12:00,000 --> 00:12:03,000
So first of all I'm going to import Streamlit as st.

258
00:12:03,000 --> 00:12:14,000
Then I'm going to import OS okay okay then from Nvidia I'm going to import uh embeddings and chat Nvidia.

259
00:12:14,000 --> 00:12:17,000
So this is a library that I'm going to import over here.

260
00:12:17,000 --> 00:12:23,000
So this Nvidia embedding will be again used with the help of APIs and chat Nvidia to call any models

261
00:12:23,000 --> 00:12:25,000
that are available in Nvidia name okay.

262
00:12:25,000 --> 00:12:29,000
And this is the library that is used in integration with long chain.

263
00:12:29,000 --> 00:12:32,000
Then uh you have this web based loader.

264
00:12:32,000 --> 00:12:37,000
So since uh uh I okay, I need to probably read from my peer directory.

265
00:12:37,000 --> 00:12:42,000
So for that I will be using from lantern underscore community dot document loaders.

266
00:12:42,000 --> 00:12:48,000
Since I'm also going to use output parser create document chain and recursive character text splitter.

267
00:12:48,000 --> 00:12:49,000
So I am also going to use this.

268
00:12:49,000 --> 00:12:55,000
So these are some of the libraries that I'm actually going to use for my entire course, uh project

269
00:12:55,000 --> 00:12:57,000
right where we'll be using this.

270
00:12:57,000 --> 00:13:01,000
So here you can see the entire uh installation has also been done.

271
00:13:01,000 --> 00:13:07,000
Now let me quickly go ahead and write from dot E and EMV import load underscore dot EMV.

272
00:13:07,000 --> 00:13:12,000
So then we are going to initialize load underscore dot e and v so that we will be able to call all my

273
00:13:12,000 --> 00:13:13,000
environment variables.

274
00:13:13,000 --> 00:13:21,000
Now uh, load uh, the API or Nvidia API key because we need to load it.

275
00:13:21,000 --> 00:13:27,000
So for that I will go ahead and write OS dot environment environ okay.

276
00:13:27,000 --> 00:13:33,000
And this will be nothing but an Nvidia underscore API key.

277
00:13:33,000 --> 00:13:35,000
And this is what I have actually created in my env file.

278
00:13:35,000 --> 00:13:36,000
Right.

279
00:13:36,000 --> 00:13:37,000
So here you go.

280
00:13:37,000 --> 00:13:40,000
And then just go ahead and write OS dot get env.

281
00:13:40,000 --> 00:13:46,000
And here we are going to use Nvidia API key okay.

282
00:13:46,000 --> 00:13:48,000
So my environment variable is ready.

283
00:13:48,000 --> 00:13:53,000
Uh now the next step or what we are basically going to do is that we are going to call our LM model.

284
00:13:53,000 --> 00:13:55,000
So let me just go ahead and write my LM model.

285
00:13:55,000 --> 00:13:58,000
And this time I'm going to use Chat Nvidia.

286
00:13:58,000 --> 00:13:59,000
Right.

287
00:13:59,000 --> 00:14:00,000
So chat Nvidia.

288
00:14:00,000 --> 00:14:03,000
And as you all know I'm going to use which model.

289
00:14:04,000 --> 00:14:06,000
So chat Nvidia is there I guess.

290
00:14:06,000 --> 00:14:09,000
So chat Nvidia okay chat Nvidia.

291
00:14:09,000 --> 00:14:11,000
And I'm going to use my model name.

292
00:14:11,000 --> 00:14:13,000
Uh the same model name that I've actually called.

293
00:14:13,000 --> 00:14:16,000
That is nothing but meta llama 370 billion parameter okay.

294
00:14:17,000 --> 00:14:19,000
So this basically becomes my LM model.

295
00:14:19,000 --> 00:14:27,000
Now once this LM model is loaded, since I have to read from this particular folder all the PDFs.

296
00:14:27,000 --> 00:14:32,000
So let me just go ahead and create my one function, which is called as vector embedding because I have

297
00:14:32,000 --> 00:14:37,000
to create vectors for all this PDF file.

298
00:14:37,000 --> 00:14:37,000
Right.

299
00:14:38,000 --> 00:14:41,000
Uh, here I'm going to specifically use sessions okay.

300
00:14:41,000 --> 00:14:43,000
So that I'll be able to access it here and there.

301
00:14:43,000 --> 00:14:46,000
So here, uh, let me go ahead and write form if vectors.

302
00:14:48,000 --> 00:14:58,000
If vectors I will create a session which is called as vectors not in s t dot session underscore state.

303
00:14:58,000 --> 00:14:58,000
Okay.

304
00:14:58,000 --> 00:15:02,000
And this uh I will create another line.

305
00:15:02,000 --> 00:15:05,000
Now I will go ahead and write s t dot session.

306
00:15:08,000 --> 00:15:12,000
Session underscore state okay dot embedding.

307
00:15:12,000 --> 00:15:14,000
So first of all I'm going to create my embeddings.

308
00:15:14,000 --> 00:15:16,000
And I'm going to initialize to Nvidia embedding okay.

309
00:15:16,000 --> 00:15:21,000
So this is the embedding that we are going to use in order to convert the document into vectors or the

310
00:15:21,000 --> 00:15:22,000
text into vectors okay.

311
00:15:22,000 --> 00:15:24,000
So this is the first thing.

312
00:15:24,000 --> 00:15:26,000
So here you have to also everything looks fine.

313
00:15:26,000 --> 00:15:34,000
Then uh I'm going to probably also create s t dot session underscore state okay.

314
00:15:34,000 --> 00:15:36,000
Dot loader okay.

315
00:15:36,000 --> 00:15:39,000
And this time I'm going to use my pi pdf.

316
00:15:39,000 --> 00:15:42,000
Let's see pi PDF directory loader.

317
00:15:42,000 --> 00:15:44,000
And I will be giving my folder location.

318
00:15:44,000 --> 00:15:46,000
That is US census okay.

319
00:15:46,000 --> 00:15:49,000
So let me quickly copy this over here.

320
00:15:50,000 --> 00:15:57,000
So this is my uh another library that I have to use it Okay, so let's see, uh, till here, if everything

321
00:15:57,000 --> 00:16:02,000
is working fine or not, I will just go ahead and open my terminal and let's see if I'm getting any

322
00:16:02,000 --> 00:16:02,000
error.

323
00:16:02,000 --> 00:16:03,000
Okay.

324
00:16:03,000 --> 00:16:10,000
Streamlit, run final app dot pi.

325
00:16:10,000 --> 00:16:13,000
Okay, so once this is running, I think it should be running.

326
00:16:13,000 --> 00:16:14,000
And I don't think so.

327
00:16:14,000 --> 00:16:15,000
Any problem should be there.

328
00:16:16,000 --> 00:16:17,000
Okay, so this looks fine.

329
00:16:17,000 --> 00:16:19,000
Yeah it is giving me a blank page.

330
00:16:19,000 --> 00:16:21,000
Uh, now it looks fine.

331
00:16:21,000 --> 00:16:22,000
Everything is fine.

332
00:16:22,000 --> 00:16:24,000
Let me quickly go ahead and close this.

333
00:16:25,000 --> 00:16:25,000
Okay.

334
00:16:25,000 --> 00:16:26,000
Now.

335
00:16:26,000 --> 00:16:27,000
Perfect.

336
00:16:27,000 --> 00:16:29,000
So US census I'm actually keeping the up in this.

337
00:16:29,000 --> 00:16:36,000
So basically this pie PDF directory loader is going to read this entire PDF file inside this US census

338
00:16:36,000 --> 00:16:36,000
then.

339
00:16:36,000 --> 00:16:43,000
Now what I'm actually going to do is that uh, after reading from this particular loader write variable,

340
00:16:43,000 --> 00:16:49,000
I'm going to basically write st dot session underscore state.

341
00:16:53,000 --> 00:16:58,000
Session underscore state dot docs okay.

342
00:16:58,000 --> 00:17:03,000
And here uh we are going to specifically use st underscore session underscore state loader dot loader.

343
00:17:03,000 --> 00:17:07,000
So what loader dot loader will specifically give me all the documents over here.

344
00:17:07,000 --> 00:17:12,000
And along with this I, I'm going to probably do the character text splitting.

345
00:17:12,000 --> 00:17:15,000
Uh, because I need to probably do the splitting itself after this.

346
00:17:15,000 --> 00:17:17,000
So once I get my documents.

347
00:17:17,000 --> 00:17:22,000
So then what we are going to basically do is that we are going to write SD dot session, underscore

348
00:17:22,000 --> 00:17:28,000
state dot text, underscore splitter, recursive character text splitter, chunk size.

349
00:17:28,000 --> 00:17:31,000
I'm going to take it as 700 chunk overlap I'm going to take it as 50.

350
00:17:31,000 --> 00:17:38,000
And here you'll be able to see final document is there uh, with split documents, SD, dot docs.

351
00:17:38,000 --> 00:17:39,000
And I've taken the top 30 docs.

352
00:17:39,000 --> 00:17:40,000
Okay.

353
00:17:40,000 --> 00:17:43,000
So we are splitting this particular documents with respect to this.

354
00:17:43,000 --> 00:17:45,000
So guys step by step you have seen this.

355
00:17:45,000 --> 00:17:46,000
First of all we created our embedding.

356
00:17:46,000 --> 00:17:49,000
Then we read the entire directory all the PDFs.

357
00:17:49,000 --> 00:17:51,000
We had it in the loader.

358
00:17:51,000 --> 00:17:55,000
This loader dot load load will basically give you the entire documents.

359
00:17:55,000 --> 00:17:59,000
And then we are taking this particular documents and applying recursive character text splitter where

360
00:17:59,000 --> 00:18:01,000
the entire documents will be divided into chunks.

361
00:18:01,000 --> 00:18:02,000
Okay.

362
00:18:02,000 --> 00:18:06,000
Uh, here the chunk size is basically taken as 700 with overlap of 50.

363
00:18:06,000 --> 00:18:10,000
And finally we get our final document by using the split documents.

364
00:18:10,000 --> 00:18:12,000
Uh, and I'm going to take the top 30 records.

365
00:18:12,000 --> 00:18:13,000
Okay.

366
00:18:13,000 --> 00:18:18,000
So once I probably get the final documents, uh, at the end of the day, I need to convert this into

367
00:18:18,000 --> 00:18:18,000
vectors.

368
00:18:18,000 --> 00:18:22,000
So I'm going to basically write s t underscore session state okay.

369
00:18:23,000 --> 00:18:25,000
Dot vectors.

370
00:18:26,000 --> 00:18:38,000
Uh and here I'm going to use fis again weight dot vectors is equal to here uh we are going to use fis

371
00:18:38,000 --> 00:18:43,000
from documents SD dot session final underscore document and SD dot session state embeddings.

372
00:18:43,000 --> 00:18:45,000
So here we have also created our embeddings.

373
00:18:45,000 --> 00:18:47,000
And the same embeddings will be used over here.

374
00:18:47,000 --> 00:18:53,000
And finally this vector will be nothing but it will be our uh vector database.

375
00:18:53,000 --> 00:18:54,000
In short the vector database okay.

376
00:18:54,000 --> 00:19:00,000
So this entirely this function, what it is going to do is that it is going to read all the PDFs from

377
00:19:00,000 --> 00:19:01,000
that folder.

378
00:19:01,000 --> 00:19:03,000
It will divide all the documents into chunks.

379
00:19:03,000 --> 00:19:06,000
And then finally it will convert into a vector and store it in a vector database.

380
00:19:06,000 --> 00:19:09,000
So that is what this entire function is basically going to do.

381
00:19:10,000 --> 00:19:15,000
Now, uh, it's time uh, we go ahead and probably write s t dot title.

382
00:19:16,000 --> 00:19:21,000
And here I'm going to basically use Nvidia name demo.

383
00:19:21,000 --> 00:19:22,000
Okay.

384
00:19:22,000 --> 00:19:27,000
Uh, once I probably use this uh, as you know that I'm going to probably create my chat prompt template.

385
00:19:27,000 --> 00:19:28,000
So let me go ahead and define it.

386
00:19:29,000 --> 00:19:32,000
Uh chat prompt template I have already imported it over here okay.

387
00:19:32,000 --> 00:19:36,000
So this chat prompt template, as I said it is a kind of a Rag application.

388
00:19:36,000 --> 00:19:38,000
So I'll say answer the question based on the provided context only.

389
00:19:38,000 --> 00:19:41,000
Please provide the most accurate response based on the question.

390
00:19:41,000 --> 00:19:42,000
So here is my context.

391
00:19:42,000 --> 00:19:44,000
Here is my question okay.

392
00:19:44,000 --> 00:19:52,000
Once I probably have this uh, now I'll go ahead and create my entire a single text prompt, uh, text

393
00:19:52,000 --> 00:19:52,000
input.

394
00:19:52,000 --> 00:19:54,000
I'll say enter your question from the documents.

395
00:19:55,000 --> 00:19:58,000
Whatever question you have from the documents, you are going to probably ask it over here.

396
00:19:58,000 --> 00:20:00,000
Now I will also create a button.

397
00:20:00,000 --> 00:20:03,000
So I'm going to right from start button okay.

398
00:20:04,000 --> 00:20:08,000
If s t dot button.

399
00:20:10,000 --> 00:20:14,000
And here let's say I'm saying document embedding.

400
00:20:14,000 --> 00:20:16,000
If this is clicked right.

401
00:20:16,000 --> 00:20:22,000
If this is clicked I will specifically call my vector embedding function okay.

402
00:20:22,000 --> 00:20:28,000
And once I have my vector embedding function, I will go ahead and update this and I'll say, hey,

403
00:20:29,000 --> 00:20:35,000
my vector store DB is ready, okay.

404
00:20:35,000 --> 00:20:39,000
And this time the vector store that we are using is a farce, right?

405
00:20:39,000 --> 00:20:41,000
Fast vector store DB is ready.

406
00:20:41,000 --> 00:20:44,000
And by using which embedding technique.

407
00:20:46,000 --> 00:20:51,000
Is ready using Nvidia embeddings.

408
00:20:51,000 --> 00:20:51,000
Right.

409
00:20:51,000 --> 00:20:54,000
So this embeddings we are specifically used over here.

410
00:20:54,000 --> 00:20:54,000
Right.

411
00:20:54,000 --> 00:20:59,000
So what happens is that as soon as I probably click this button it is going to call this entire vector

412
00:20:59,000 --> 00:21:00,000
embedding.

413
00:21:00,000 --> 00:21:03,000
And it is going to make sure that you have this vector store DB ready.

414
00:21:03,000 --> 00:21:03,000
Okay.

415
00:21:03,000 --> 00:21:09,000
And then finally I will go ahead and uh so once this is done I will go ahead and write if prompt one.

416
00:21:09,000 --> 00:21:15,000
If the prompt one is there, if I am probably searching for anything, and if I press enter, okay,

417
00:21:15,000 --> 00:21:20,000
then the next thing that I'm actually going to do is that create my document chain.

418
00:21:20,000 --> 00:21:26,000
And for this I'm going to use my create stuff document chain here I have to give my LM comma prompt.

419
00:21:26,000 --> 00:21:27,000
Right.

420
00:21:27,000 --> 00:21:30,000
So this we have already seen in our Lang chain playlist itself.

421
00:21:30,000 --> 00:21:30,000
Right.

422
00:21:30,000 --> 00:21:32,000
So here you have this.

423
00:21:32,000 --> 00:21:39,000
And then finally you will be able to see retriever is equal to est dot session underscore state dot

424
00:21:39,000 --> 00:21:41,000
vectors as retriever.

425
00:21:41,000 --> 00:21:42,000
So this vectors db.

426
00:21:42,000 --> 00:21:47,000
When we use with as retrievers this basically becomes an interface to retrieve all the data from here.

427
00:21:47,000 --> 00:21:48,000
Right.

428
00:21:48,000 --> 00:21:50,000
So this basically becomes my retriever.

429
00:21:50,000 --> 00:21:54,000
And since this vector is basically stored in the session state, we are going to basically use this

430
00:21:54,000 --> 00:21:55,000
session state itself.

431
00:21:55,000 --> 00:22:00,000
Then after taking this entire retriever, what we are basically going to do, we are going to create

432
00:22:00,000 --> 00:22:02,000
our retrieval chain.

433
00:22:02,000 --> 00:22:02,000
Okay.

434
00:22:02,000 --> 00:22:05,000
And here we are going to use uh okay.

435
00:22:05,000 --> 00:22:07,000
Create retrieval chain.

436
00:22:07,000 --> 00:22:13,000
And here we are going to use my retriever comma document chain okay.

437
00:22:13,000 --> 00:22:16,000
So this two things I am going to basically use it and close it over here.

438
00:22:16,000 --> 00:22:18,000
Since already my LLM is basically called.

439
00:22:18,000 --> 00:22:26,000
Now this create retrieval chain will basically create my retrieval underscore chain which I'm going

440
00:22:26,000 --> 00:22:27,000
to use it over here.

441
00:22:27,000 --> 00:22:32,000
But at the end of the day this model that is basically getting called it is from name.

442
00:22:32,000 --> 00:22:32,000
Right.

443
00:22:32,000 --> 00:22:38,000
So Nvidia name Nvidia name inferencing.

444
00:22:39,000 --> 00:22:39,000
Right.

445
00:22:39,000 --> 00:22:40,000
Inferencing.

446
00:22:41,000 --> 00:22:41,000
Perfect.

447
00:22:43,000 --> 00:22:44,000
So here it is.

448
00:22:44,000 --> 00:22:51,000
Uh and uh after you get this retrieval chain, all I have to probably do is that I will start the time.

449
00:22:52,000 --> 00:22:53,000
Okay.

450
00:22:53,000 --> 00:22:55,000
So import time.

451
00:22:55,000 --> 00:23:02,000
Let's go ahead and import time over here because I'm going to use the time functionality so that I will

452
00:23:02,000 --> 00:23:04,000
be also able to measure and see how fast this is.

453
00:23:04,000 --> 00:23:05,000
Start time.

454
00:23:05,000 --> 00:23:11,000
Uh, then we are invoking based on the prompt that we have, and we finally get the response and we

455
00:23:11,000 --> 00:23:12,000
are writing the entire response.

456
00:23:12,000 --> 00:23:13,000
Okay.

457
00:23:13,000 --> 00:23:19,000
Now, um, along with this, uh, this meta, uh, open source model, like llama three, also provides

458
00:23:19,000 --> 00:23:20,000
you some context information.

459
00:23:20,000 --> 00:23:27,000
So below, by using again Streamlit, we will try to display the entire we'll try to display the entire,

460
00:23:27,000 --> 00:23:28,000
uh, context also.

461
00:23:28,000 --> 00:23:33,000
So here I'm going to write it from with x t dot x standard document similarity search.

462
00:23:33,000 --> 00:23:36,000
And here we are going to basically display the context.

463
00:23:36,000 --> 00:23:37,000
Now let's go ahead and run this.

464
00:23:37,000 --> 00:23:39,000
And I think it should probably work.

465
00:23:39,000 --> 00:23:42,000
And we are going to basically use this chart in Nvidia itself.

466
00:23:42,000 --> 00:23:44,000
Now let's quickly open the terminal.

467
00:23:44,000 --> 00:23:49,000
So guys now let's go ahead and run this code and see that whether everything is working fine.

468
00:23:49,000 --> 00:23:53,000
So here I'm going to basically write Python final app.py.

469
00:23:54,000 --> 00:23:56,000
Oh sorry I have to basically run with Streamlit.

470
00:23:56,000 --> 00:24:02,000
So Streamlit run our final app Dot Pi.

471
00:24:02,000 --> 00:24:02,000
Okay.

472
00:24:02,000 --> 00:24:06,000
Anyhow, I'll be giving you the entire code in the description of this particular video.

473
00:24:06,000 --> 00:24:09,000
Go ahead and check it out and definitely go ahead and use Nvidia Nim.

474
00:24:09,000 --> 00:24:13,000
Now the first thing over here is that I'll go ahead and click on Document Embedding.

475
00:24:13,000 --> 00:24:16,000
So as soon as I probably click this, what it is going to do from that entire folder, it is going to

476
00:24:16,000 --> 00:24:20,000
take out all the documents, convert that into vectors, store it in the vector database.

477
00:24:20,000 --> 00:24:23,000
There we have used Nvidia embeddings.

478
00:24:23,000 --> 00:24:26,000
So we will just wait till the vector db is ready.

479
00:24:26,000 --> 00:24:30,000
And then we will go ahead and ask any question that we want like a Rag application.

480
00:24:30,000 --> 00:24:32,000
So still it is taking time.

481
00:24:32,000 --> 00:24:33,000
There are many files 4 to 5 files.

482
00:24:33,000 --> 00:24:35,000
So obviously there are a lot of documents.

483
00:24:35,000 --> 00:24:39,000
And again it is basically happening at completely with the Nvidia name inferencing.

484
00:24:39,000 --> 00:24:41,000
So let's wait for some time.

485
00:24:41,000 --> 00:24:44,000
So here you can see the vector store DB is ready.

486
00:24:44,000 --> 00:24:50,000
Let me quickly go over here and see whether I'll be able to see my vector DB Okay.

487
00:24:51,000 --> 00:24:57,000
So now the next thing what I'm actually going to do over here is that go ahead and ask any question

488
00:24:57,000 --> 00:24:57,000
that I want.

489
00:24:57,000 --> 00:24:58,000
Right.

490
00:24:58,000 --> 00:25:05,000
So now over here I'm going to basically say differences in let's let's go ahead and ask this question.

491
00:25:05,000 --> 00:25:10,000
What is the differences in the uninsured rate by state in 2022.

492
00:25:10,000 --> 00:25:19,000
So here I'm going to put this question What is the differences in the UN uninsured rate by state in

493
00:25:19,000 --> 00:25:20,000
2022.

494
00:25:20,000 --> 00:25:25,000
So it obviously needs to pick up all this information and probably give you us an answer.

495
00:25:25,000 --> 00:25:29,000
So here according to the context the difference in the uninsured state are.

496
00:25:29,000 --> 00:25:36,000
So here you can see a low of 2.4% and 16.6% compared to the national rate, uh, 8.0%.

497
00:25:36,000 --> 00:25:41,000
And from the help of document similarity search, you can see what all context it is basically taken

498
00:25:41,000 --> 00:25:42,000
and is giving you the results.

499
00:25:42,000 --> 00:25:46,000
And this entire thing is basically happening with Nvidia Nim, right?

500
00:25:46,000 --> 00:25:49,000
All the models that are probably there from Nvidia embeddings to open source model.

501
00:25:49,000 --> 00:25:52,000
And here we have specifically using llama three itself.

502
00:25:52,000 --> 00:25:54,000
So go ahead and check it out.

503
00:25:54,000 --> 00:25:58,000
And definitely uh, just go ahead and explore more models that you want uh based on different different

504
00:25:58,000 --> 00:26:00,000
use cases like reasoning.

505
00:26:00,000 --> 00:26:02,000
You have something different visual design.

506
00:26:02,000 --> 00:26:04,000
You have something different Retrieval speech.

507
00:26:04,000 --> 00:26:07,000
If you have if you want to probably check it out, you can go ahead and check it out.

508
00:26:07,000 --> 00:26:10,000
And here you also have lot of open source models, uh, foundation models.

509
00:26:10,000 --> 00:26:13,000
So it is up to you go ahead and explore this.

510
00:26:13,000 --> 00:26:19,000
So I feel uh, Nvidia name Nvidia has done an amazing work and definitely this solves a lot of problems.

511
00:26:19,000 --> 00:26:20,000
So yes, this was it for my side.

512
00:26:20,000 --> 00:26:21,000
I will see you all in the next video.

513
00:26:21,000 --> 00:26:22,000
Have a great day ahead.

514
00:26:22,000 --> 00:26:26,000
And all the information regarding this will be given in the description of this particular video.

515
00:26:26,000 --> 00:26:27,000
So thank you.

516
00:26:27,000 --> 00:26:28,000
I'll see you in the next video.

517
00:26:28,000 --> 00:26:28,000
Bye bye.