1
00:00:00,000 --> 00:00:04,000
So guys, we are going to continue our generative AI series on AWS cloud.

2
00:00:04,000 --> 00:00:10,000
And in this video I'm going to show you how you can deploy your Hugging Face model.

3
00:00:10,000 --> 00:00:12,000
Let it be an open source model or any LLM model.

4
00:00:12,000 --> 00:00:16,000
Also, specifically in AWS SageMaker that is Amazon SageMaker.

5
00:00:17,000 --> 00:00:23,000
Amazon SageMaker is one of the important services in AWS where you can complete the entire

6
00:00:23,000 --> 00:00:29,000
life cycle of a data science project or an AI project completely along with MLOps till the deployment

7
00:00:29,000 --> 00:00:30,000
each and every thing.

8
00:00:30,000 --> 00:00:35,000
So in this particular video, I will be showing you how we can take up any Hugging Face model, in short,

9
00:00:35,000 --> 00:00:38,000
and deploy it in SageMaker itself.

10
00:00:38,000 --> 00:00:41,000
So please make sure that you watch this video till the end.

11
00:00:41,000 --> 00:00:43,000
And one more very important thing guys.

12
00:00:43,000 --> 00:00:46,000
Just check out this particular video.

13
00:00:46,000 --> 00:00:50,000
There may be some amount of charges that you incur.

14
00:00:50,000 --> 00:00:54,000
So please be careful with respect to this whenever you start creating.

15
00:00:54,000 --> 00:00:59,000
And finally, once you have created that particular endpoint, make sure that you delete all the endpoints afterwards.

16
00:00:59,000 --> 00:01:02,000
And I will be providing you the entire guidance as we go ahead.
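
One way to act on the cleanup advice above is a small helper that lists endpoints and deletes the matching ones. This is only a sketch under assumptions: the helper name `delete_endpoints`, the `name_contains` filter, and the `dry_run` flag are illustrative and not from the video, and `sm_client` is assumed to be a `boto3.client("sagemaker")` object. Deleting an endpoint does not remove its endpoint configuration or model, so those may need separate cleanup.

```python
def delete_endpoints(sm_client, name_contains, dry_run=True):
    """Find SageMaker endpoints whose name contains `name_contains`.

    With dry_run=True (the default) nothing is deleted; the matching
    names are only returned so they can be reviewed first.
    """
    matched = []
    # list_endpoints is paginated, so walk every page of results.
    paginator = sm_client.get_paginator("list_endpoints")
    for page in paginator.paginate(NameContains=name_contains):
        for endpoint in page["Endpoints"]:
            matched.append(endpoint["EndpointName"])

    if not dry_run:
        # Delete only after listing has finished.
        for name in matched:
            sm_client.delete_endpoint(EndpointName=name)
    return matched


# Hypothetical usage (needs AWS credentials and boto3 installed):
# import boto3
# print(delete_endpoints(boto3.client("sagemaker"), "huggingface"))
```

Running it once with the default `dry_run=True` shows what would be removed before anything is actually deleted.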

17
00:01:02,000 --> 00:01:05,000
So first of all, we are just going to go ahead and search for AWS SageMaker.

18
00:01:05,000 --> 00:01:11,000
And to start with, how to work with Amazon SageMaker Studio and all.

19
00:01:11,000 --> 00:01:12,000
I will also be talking about that.

20
00:01:12,000 --> 00:01:17,000
So first of all, you can just go ahead and click on Getting Started.

21
00:01:17,000 --> 00:01:18,000
Okay.

22
00:01:18,000 --> 00:01:24,000
And with respect to AWS SageMaker, the documentation is pretty amazing, right?

23
00:01:24,000 --> 00:01:28,000
So if you don't know about AWS or Amazon SageMaker, it provides machine learning capabilities that

24
00:01:28,000 --> 00:01:33,000
are purpose built for data scientists and developers to prepare, build, train and deploy high quality

25
00:01:33,000 --> 00:01:34,000
ML models efficiently.

26
00:01:34,000 --> 00:01:39,000
So what we are going to do is go step by step, and I'm going to show you.

27
00:01:39,000 --> 00:01:43,000
So I'll go to the domain part and I will go ahead and create the domain.

28
00:01:43,000 --> 00:01:46,000
So once I probably go ahead and create the domain.

29
00:01:46,000 --> 00:01:48,000
So here there are two options.

30
00:01:48,000 --> 00:01:52,000
One is set up for organization and the other one is set up for single user.

31
00:01:52,000 --> 00:01:52,000
Okay.

32
00:01:53,000 --> 00:01:57,000
And right now we are just going to test it for a single user.

33
00:01:57,000 --> 00:02:01,000
So what I'm going to do is that I'm going to select this here.

34
00:02:01,000 --> 00:02:03,000
You will be able to get an IAM role automatically.

35
00:02:03,000 --> 00:02:07,000
It will create it with the AmazonSageMakerFullAccess policy.

36
00:02:08,000 --> 00:02:10,000
Public internet access, standard encryption.

37
00:02:11,000 --> 00:02:11,000
SageMaker Studio.

38
00:02:11,000 --> 00:02:13,000
You'll get the access to this.

39
00:02:13,000 --> 00:02:18,000
You'll be getting shareable SageMaker Studio notebooks, SageMaker Canvas, and IAM authentication.

40
00:02:18,000 --> 00:02:23,000
So everything is built in; it will provide you this entire environment where you

41
00:02:23,000 --> 00:02:25,000
can actually do the coding.

42
00:02:25,000 --> 00:02:27,000
So just go ahead and click this.

43
00:02:27,000 --> 00:02:32,000
So once you go ahead and click this, you will be able to see something like, okay, the

44
00:02:32,000 --> 00:02:35,000
setup is going to take place and it is going to take some amount of time.

45
00:02:35,000 --> 00:02:40,000
Again, how much time it takes over here can vary.

46
00:02:40,000 --> 00:02:44,000
It will automatically create the users and everything, and you will be able to get it, right?

47
00:02:44,000 --> 00:02:50,000
So here, if I go ahead and just reload this, you will be able to see that my third domain

48
00:02:50,000 --> 00:02:54,000
is over here getting set up; its status is pending.

49
00:02:54,000 --> 00:02:57,000
And it will probably take some amount of time based on the usage.

50
00:02:57,000 --> 00:03:01,000
But other than that, you can see that I've already created two.

51
00:03:01,000 --> 00:03:06,000
So since this is already getting created, I will just go ahead and show you some more examples

52
00:03:06,000 --> 00:03:07,000
with respect to this.

53
00:03:07,000 --> 00:03:12,000
So let me just go ahead and quickly click on this Quick Setup domain.

54
00:03:12,000 --> 00:03:16,000
You'll be able to see that by default it will be creating a user okay.

55
00:03:16,000 --> 00:03:21,000
You can add any number of users based on the organization that you're specifically working in.

56
00:03:21,000 --> 00:03:25,000
Let's say you want to provide access to multiple users in the same domain itself.

57
00:03:25,000 --> 00:03:26,000
You can also provide it.

58
00:03:26,000 --> 00:03:26,000
Okay.

59
00:03:26,000 --> 00:03:31,000
So by default, as soon as you provide the user access,

60
00:03:31,000 --> 00:03:35,000
there is also one very important thing that you will be able to see.

61
00:03:35,000 --> 00:03:36,000
Right?

62
00:03:36,000 --> 00:03:40,000
And that is nothing but the user information over here, along with this.

63
00:03:40,000 --> 00:03:42,000
Let me just hide my face here.

64
00:03:42,000 --> 00:03:44,000
You'll be able to see that you'll be getting a launch button.

65
00:03:44,000 --> 00:03:47,000
Okay, so through this, you'll be able to access Canvas.

66
00:03:47,000 --> 00:03:51,000
I will be talking about Canvas as we go ahead in further videos.

67
00:03:51,000 --> 00:03:51,000
TensorBoard.

68
00:03:51,000 --> 00:03:57,000
I hope everybody knows about TensorBoard. Profiler. But we are just interested in working on this particular

69
00:03:57,000 --> 00:03:58,000
Studio, right?

70
00:03:58,000 --> 00:04:04,000
So once I click on Studio, you will be able to see that I get this

71
00:04:04,000 --> 00:04:06,000
entire Amazon SageMaker.

72
00:04:06,000 --> 00:04:11,000
It will be a studio completely, which will actually provide you the entire ecosystem with respect to

73
00:04:11,000 --> 00:04:14,000
any kind of development that you really want to do.

74
00:04:14,000 --> 00:04:17,000
Okay, so here you can see JupyterLab is there.

75
00:04:17,000 --> 00:04:23,000
If you want to quickly start deploying, fine-tuning, and evaluating pre-trained models, specifically for

76
00:04:23,000 --> 00:04:23,000
LLMs.

77
00:04:23,000 --> 00:04:25,000
You can also go ahead with this.

78
00:04:25,000 --> 00:04:30,000
If you want to go ahead and do some kind of AutoML,

79
00:04:30,000 --> 00:04:33,000
You can also see over here there is an option model evaluation.

80
00:04:33,000 --> 00:04:35,000
So you will find this entire ecosystem okay.

81
00:04:35,000 --> 00:04:39,000
Now what I'm going to do is just click on this JupyterLab.

82
00:04:39,000 --> 00:04:43,000
And right now you will be able to see that nothing is running over here okay.

83
00:04:43,000 --> 00:04:45,000
Nothing is running specifically.

84
00:04:45,000 --> 00:04:50,000
And what I'm going to do is create a JupyterLab space.

85
00:04:50,000 --> 00:04:50,000
Okay.

86
00:04:50,000 --> 00:04:53,000
Over here, a JupyterLab space, so that I will be able to work in it.

87
00:04:53,000 --> 00:04:53,000
Okay.

88
00:04:53,000 --> 00:04:59,000
So here I'm just going to write test demo SageMaker okay.

89
00:04:59,000 --> 00:05:02,000
And this I'm going to show you with hugging space okay.

90
00:05:02,000 --> 00:05:03,000
Uh sorry.

91
00:05:03,000 --> 00:05:04,000
Hugging Face models.

92
00:05:04,000 --> 00:05:04,000
Right.

93
00:05:04,000 --> 00:05:09,000
So here, first of all, it will ask you to select the instance, right?

94
00:05:10,000 --> 00:05:11,000
There are multiple instances.

95
00:05:11,000 --> 00:05:13,000
You can select one for your use case, again.

96
00:05:13,000 --> 00:05:20,000
These instances, if I go ahead and check the SageMaker instances over here: for different

97
00:05:20,000 --> 00:05:23,000
instances there are different prices.

98
00:05:23,000 --> 00:05:28,000
So you really need to take care with this because, see, if you're working in a company, based

99
00:05:28,000 --> 00:05:30,000
on the requirement, you can select the instances.

100
00:05:30,000 --> 00:05:33,000
So here you can see on-demand pricing.

101
00:05:33,000 --> 00:05:40,000
So let's say I select ml.t3.medium at $0.05 per hour; it provides you a CPU with two cores and 4 GB of

102
00:05:40,000 --> 00:05:41,000
memory.

103
00:05:41,000 --> 00:05:45,000
But understand if you are specifically working with generative AI models, right.

104
00:05:45,000 --> 00:05:48,000
You definitely require a huge amount of data, right?

105
00:05:48,000 --> 00:05:49,000
So a huge amount of space.

106
00:05:49,000 --> 00:05:53,000
So if you go further, there will be a good number of them.

107
00:05:53,000 --> 00:05:59,000
Amazing systems with accelerated computing, which provide you not only CPU cores but also GPUs.

108
00:05:59,000 --> 00:05:59,000
Right.

109
00:05:59,000 --> 00:06:04,000
So as you keep on using more and more, you'll be able to see the charges will be going higher and higher.

110
00:06:04,000 --> 00:06:08,000
But in this specific video, I just want to show you a demo.

111
00:06:08,000 --> 00:06:13,000
But again, if you just get this idea, then in any company you go to, you will be able to work it

112
00:06:13,000 --> 00:06:13,000
out.

113
00:06:13,000 --> 00:06:13,000
Right.

114
00:06:13,000 --> 00:06:18,000
So I'm going to take a small system and work with this. Right now, which small system am I going

115
00:06:18,000 --> 00:06:19,000
to specifically take.

116
00:06:19,000 --> 00:06:23,000
So if I go up, here is something called ml.m5.2xlarge.

117
00:06:23,000 --> 00:06:28,000
So if you go ahead and see over here, ml.m5.2xlarge.

118
00:06:28,000 --> 00:06:32,000
So this basically provides you eight vCPUs and 32 GB of memory.

119
00:06:32,000 --> 00:06:38,000
The charge will be $0.461 per hour for using this particular instance.
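
To make the pricing point concrete, here is a rough cost estimate using the two hourly rates quoted above. It is only a sketch: the helper name `estimate_cost` and the 8-hour duration are illustrative, and the rates are the ones mentioned in the video, so current on-demand pricing for your region should be checked before relying on them.

```python
def estimate_cost(hourly_rate_usd, hours, instance_count=1):
    """Rough on-demand cost: hourly rate x hours x number of instances."""
    return round(hourly_rate_usd * hours * instance_count, 4)


# Rates as quoted in the video; they may differ by region and over time.
RATE_ML_T3_MEDIUM = 0.05
RATE_ML_M5_2XLARGE = 0.461

# Estimated cost of leaving each instance running for 8 hours.
print("ml.t3.medium, 8 h:", estimate_cost(RATE_ML_T3_MEDIUM, 8))
print("ml.m5.2xlarge, 8 h:", estimate_cost(RATE_ML_M5_2XLARGE, 8))
```

The same arithmetic applies to a deployed endpoint, which keeps billing for every hour it stays up.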

120
00:06:38,000 --> 00:06:43,000
The reason I'm doing this is that I will just give you an example of one model,

121
00:06:43,000 --> 00:06:44,000
how it is deployed and all.

122
00:06:44,000 --> 00:06:45,000
Okay.

123
00:06:45,000 --> 00:06:46,000
And we'll see to that.

124
00:06:46,000 --> 00:06:47,000
Okay.

125
00:06:47,000 --> 00:06:49,000
So here I'm just going to keep the storage at 10 GB.

126
00:06:49,000 --> 00:06:53,000
And now I will go ahead and click on Run Space.

127
00:06:53,000 --> 00:06:54,000
Okay.

128
00:06:54,000 --> 00:07:00,000
So once Run Space is clicked, you will be able to see that my entire environment will be ready

129
00:07:00,000 --> 00:07:02,000
and I should be able to work it out.

130
00:07:02,000 --> 00:07:02,000
Okay.

131
00:07:02,000 --> 00:07:08,000
And with respect to this, any kind of code that you run, right? In my recent video, you have

132
00:07:08,000 --> 00:07:15,000
also seen that I created an entire Lambda function with API Gateway.

133
00:07:15,000 --> 00:07:17,000
I've shown you how to create the endpoint and all.

134
00:07:17,000 --> 00:07:19,000
So all those things you can do with this also.

135
00:07:19,000 --> 00:07:22,000
But here more amazing endpoints.

136
00:07:22,000 --> 00:07:25,000
A more amazing ecosystem and features you will be able to get.

137
00:07:25,000 --> 00:07:28,000
Okay, so this is going to take some amount of time.

138
00:07:28,000 --> 00:07:29,000
So okay now it is created.

139
00:07:29,000 --> 00:07:32,000
Now I can just go ahead and click on Open JupyterLab.

140
00:07:33,000 --> 00:07:39,000
Now understand, with respect to Hugging Face, there are multiple ways to load a specific model, okay?

141
00:07:39,000 --> 00:07:41,000
There are multiple ways.

142
00:07:41,000 --> 00:07:45,000
I'm going to talk about all those ways, and this will be important for you because you really need

143
00:07:45,000 --> 00:07:50,000
to have an idea of how we can actually work in a system, right?

144
00:07:50,000 --> 00:07:53,000
So once this is loaded, the first step.

145
00:07:53,000 --> 00:07:59,000
Okay, what you can do... I will just go ahead and change my theme, so I'll make it JupyterLab Dark.

146
00:07:59,000 --> 00:07:59,000
Okay.

147
00:07:59,000 --> 00:08:01,000
So here you have options.

148
00:08:01,000 --> 00:08:03,000
You have options for notebook.

149
00:08:03,000 --> 00:08:08,000
You have options for console, terminal, Python, anything you specifically want to work with. I will just

150
00:08:08,000 --> 00:08:10,000
go ahead and open terminal.

151
00:08:10,000 --> 00:08:10,000
Uh, sorry.

152
00:08:10,000 --> 00:08:11,000
Notebook.

153
00:08:11,000 --> 00:08:17,000
And the first thing that I will do is just go ahead and write pip install sagemaker.

154
00:08:17,000 --> 00:08:17,000
Okay.

155
00:08:17,000 --> 00:08:19,000
Which is the upgraded version.

156
00:08:19,000 --> 00:08:21,000
Just go ahead and install that automatically.

157
00:08:21,000 --> 00:08:26,000
The installation will take place, and all the recent updates that are there with respect

158
00:08:26,000 --> 00:08:30,000
to SageMaker will get installed. And understand, this is my entire environment, the same Jupyter

159
00:08:30,000 --> 00:08:33,000
notebook that we usually specifically work on.

160
00:08:33,000 --> 00:08:34,000
And all right.

161
00:08:34,000 --> 00:08:39,000
So here you'll be able to see this. Now, to go ahead with what we really need to do initially.

162
00:08:39,000 --> 00:08:41,000
So I will go ahead and import sagemaker.

163
00:08:41,000 --> 00:08:41,000
Okay.

164
00:08:41,000 --> 00:08:45,000
And there are some steps that you definitely need to follow as we start.

165
00:08:45,000 --> 00:08:45,000
Right.

166
00:08:45,000 --> 00:08:47,000
So I'm going to import boto3.

167
00:08:47,000 --> 00:08:50,000
And I'll be providing you all the code in the description of this particular video.

168
00:08:50,000 --> 00:08:53,000
So first of all we'll go ahead and create our session.

169
00:08:53,000 --> 00:08:58,000
The session will be nothing but sagemaker.Session, okay, sagemaker.Session.

170
00:08:58,000 --> 00:09:01,000
So by default whatever is the session we will be able to get it.

171
00:09:01,000 --> 00:09:07,000
Initially, we have not set up any session bucket or anything as such.

172
00:09:07,000 --> 00:09:09,000
You can also create that particular session.

173
00:09:09,000 --> 00:09:14,000
So I'm just going to write over here even with some comments so that it will be beneficial for you.

174
00:09:14,000 --> 00:09:19,000
So here you can see the SageMaker session bucket, used for uploading data, models, and logs.

175
00:09:19,000 --> 00:09:25,000
If you want to upload custom data or download the Hugging Face model into this

176
00:09:25,000 --> 00:09:28,000
particular S3 bucket and retrieve it or use it from there,

177
00:09:28,000 --> 00:09:29,000
You can actually do that.

178
00:09:29,000 --> 00:09:32,000
SageMaker will automatically create this bucket if it does not exist.

179
00:09:32,000 --> 00:09:35,000
Okay, so you can probably go ahead and write this particular condition.

180
00:09:35,000 --> 00:09:41,000
And here you'll be able to see that if there is nothing as such, if it is None, I'm just going

181
00:09:41,000 --> 00:09:45,000
to, if the session bucket is not there, go ahead and create the

182
00:09:45,000 --> 00:09:46,000
default bucket.

183
00:09:46,000 --> 00:09:46,000
Okay.

184
00:09:46,000 --> 00:09:49,000
So this is the code that we do initially.

185
00:09:49,000 --> 00:09:53,000
And this bucket will specifically be used to upload any custom data or training data.

186
00:09:53,000 --> 00:09:56,000
It can be models after the inferencing and multiple things.

187
00:09:57,000 --> 00:10:04,000
Now other than this, we really need to focus on the role management over here because, since

188
00:10:04,000 --> 00:10:10,000
we are using AWS SageMaker to execute any kind of code, specifically by using the services of

189
00:10:10,000 --> 00:10:13,000
AWS SageMaker, we really need to provide the role access and all.

190
00:10:13,000 --> 00:10:17,000
So for this, we can just go ahead and write a try/except block.

191
00:10:17,000 --> 00:10:23,000
So here I'm just going to go ahead and write try, and let me just go ahead and do pass over here

192
00:10:23,000 --> 00:10:26,000
and let me go ahead and write the except block.

193
00:10:26,000 --> 00:10:31,000
With respect to this particular except block, I will just go ahead and handle a ValueError.

194
00:10:31,000 --> 00:10:34,000
And the best thing over here is that you'll get all the suggestions, right?

195
00:10:34,000 --> 00:10:36,000
And that is pretty much amazing, right?

196
00:10:36,000 --> 00:10:41,000
Once you are using it, let's say my IAM user is not yet set.

197
00:10:41,000 --> 00:10:43,000
So it will obviously give an exception.

198
00:10:43,000 --> 00:10:45,000
So it will go to this particular ValueError.

199
00:10:45,000 --> 00:10:49,000
And here I will first of all go ahead and set up my user.

200
00:10:49,000 --> 00:10:51,000
So, the IAM user, in short.

201
00:10:51,000 --> 00:10:54,000
So I'll go ahead and write boto3.client.

202
00:10:54,000 --> 00:10:57,000
And I'm also checking out the documentation page.

203
00:10:57,000 --> 00:11:01,000
It is always good to check out the documentation page and work in that particular way.

204
00:11:01,000 --> 00:11:04,000
And then I'm going to basically set up my role.

205
00:11:04,000 --> 00:11:08,000
I'm going to use iam.get_role.

206
00:11:08,000 --> 00:11:13,000
Now it is going to pick up the role that is configured with this particular Jupyter notebook.

207
00:11:13,000 --> 00:11:13,000
Right.

208
00:11:13,000 --> 00:11:23,000
So I'm just going to write RoleName, okay, is equal to

209
00:11:23,000 --> 00:11:25,000
sagemaker_execution_role, okay.

210
00:11:25,000 --> 00:11:30,000
And this is basically going to be set up as our role.

211
00:11:30,000 --> 00:11:33,000
And from that we are going to take the ARN.

212
00:11:33,000 --> 00:11:33,000
Right.

213
00:11:33,000 --> 00:11:39,000
That is how we specifically identify the unique role that we are trying to give.

214
00:11:39,000 --> 00:11:39,000
Okay.

215
00:11:39,000 --> 00:11:41,000
So all this thing is done over here.

216
00:11:41,000 --> 00:11:47,000
Let me just add the spaces that are actually required whenever we work with this, okay.

217
00:11:47,000 --> 00:11:49,000
So here you can see my role is also set up.

218
00:11:49,000 --> 00:11:51,000
This is my exception.

219
00:11:51,000 --> 00:11:56,000
Now let me quickly go ahead and write in the try, once the role is set up.

220
00:11:56,000 --> 00:12:00,000
So I can basically go ahead and write sagemaker.get_execution_role.

221
00:12:00,000 --> 00:12:01,000
Okay.

222
00:12:01,000 --> 00:12:04,000
This will be responsible for getting the execution role.

223
00:12:04,000 --> 00:12:07,000
If by default nothing is set up, then you can call it out

224
00:12:07,000 --> 00:12:09,000
based on the role that it has.

225
00:12:09,000 --> 00:12:10,000
It has.

226
00:12:10,000 --> 00:12:10,000
Okay.

227
00:12:11,000 --> 00:12:15,000
Again, one common thing that I've seen is that we have to provide spaces over here.

228
00:12:15,000 --> 00:12:17,000
So that is one of the things over here.

229
00:12:17,000 --> 00:12:21,000
So please make sure that you keep on working with respect to that whenever you're writing any code.

230
00:12:22,000 --> 00:12:24,000
After that I will just go ahead and write:

231
00:12:24,000 --> 00:12:28,000
session is equal to sagemaker.Session.

232
00:12:28,000 --> 00:12:29,000
I'm just going to call my session.

233
00:12:29,000 --> 00:12:32,000
And here I'm just going to use my default bucket.

234
00:12:32,000 --> 00:12:38,000
So here one of the parameters that you can see is nothing but default_bucket, which

235
00:12:38,000 --> 00:12:43,000
will be assigned to my default bucket, or sorry, not default bucket, the SageMaker session bucket.

236
00:12:43,000 --> 00:12:46,000
That is my default parameter that is specifically required.

237
00:12:46,000 --> 00:12:50,000
Okay, so this is the first step that you really need to do, okay.

238
00:12:50,000 --> 00:12:58,000
And then finally I can go ahead and print my... you can see how it is going to pick up the role ARN, okay,

239
00:12:58,000 --> 00:13:00,000
SageMaker role ARN.

240
00:13:00,000 --> 00:13:05,000
And here I'm going to basically set it to role, okay.

241
00:13:05,000 --> 00:13:08,000
So this is the role that is going to get printed.

242
00:13:08,000 --> 00:13:10,000
And next print is equal to.

243
00:13:10,000 --> 00:13:14,000
And here you can also go ahead and write your SageMaker session.

244
00:13:15,000 --> 00:13:26,000
So the SageMaker session region, if I want to display it, okay, it will be nothing but

245
00:13:26,000 --> 00:13:27,000
session.boto_region_name.

246
00:13:27,000 --> 00:13:27,000
Right.

247
00:13:27,000 --> 00:13:29,000
So this is also there.

248
00:13:29,000 --> 00:13:32,000
So once I execute it, let's see whether it will be working

249
00:13:32,000 --> 00:13:33,000
Fine.

250
00:13:33,000 --> 00:13:36,000
So here you can see: not applying SDK defaults.

251
00:13:36,000 --> 00:13:38,000
This is the "not applying" info message, okay.

252
00:13:38,000 --> 00:13:40,000
Some error I can see over here.

253
00:13:40,000 --> 00:13:41,000
Okay.

254
00:13:41,000 --> 00:13:44,000
Let's go ahead and do that again okay.

255
00:13:44,000 --> 00:13:45,000
Now it's working fine.

256
00:13:45,000 --> 00:13:48,000
You can see the SageMaker role is basically this.

257
00:13:48,000 --> 00:13:54,000
And this is the default role that the domain has created with respect to the user.

258
00:13:54,000 --> 00:13:56,000
And then us-east-1 is the region I'm actually working in.

259
00:13:56,000 --> 00:13:56,000
Okay.

260
00:13:56,000 --> 00:13:58,000
So this is the first step.
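
The first step described above (create a session, fall back to the default bucket, resolve the execution role inside try/except, then print the role ARN and region) can be sketched roughly as follows. Wrapping it in a function named `setup_sagemaker` and the parameter names are illustrative choices, not from the video; the fallback role name is the one spoken in the video and may differ in your account.

```python
def setup_sagemaker(session_bucket=None,
                    fallback_role_name="sagemaker_execution_role"):
    """Return (role_arn, session) for later SageMaker calls."""
    # Imported here so the function can be defined even when the SDKs
    # are not installed yet (pip install sagemaker boto3).
    import boto3
    import sagemaker

    sess = sagemaker.Session()
    # If no bucket was given, use the session's default bucket;
    # SageMaker creates it if it does not exist.
    if session_bucket is None:
        session_bucket = sess.default_bucket()

    try:
        # Works inside SageMaker Studio and notebook environments.
        role = sagemaker.get_execution_role()
    except ValueError:
        # Outside SageMaker, look the role up by name through IAM.
        iam = boto3.client("iam")
        role = iam.get_role(RoleName=fallback_role_name)["Role"]["Arn"]

    sess = sagemaker.Session(default_bucket=session_bucket)
    return role, sess


# Hypothetical usage inside the notebook:
# role, sess = setup_sagemaker()
# print(f"sagemaker role arn: {role}")
# print(f"sagemaker session region: {sess.boto_region_name}")
```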

261
00:13:58,000 --> 00:14:03,000
Now that I have the role and all, I can go ahead and call any kind of model that

262
00:14:03,000 --> 00:14:04,000
I specifically want.

263
00:14:04,000 --> 00:14:09,000
So first, we will try to call one type of model.

264
00:14:09,000 --> 00:14:12,000
And I will just go ahead and show you the code.

265
00:14:12,000 --> 00:14:13,000
Okay.

266
00:14:13,000 --> 00:14:21,000
So here I'm calling one model, which is called distilbert-base-uncased-distilled-squad, and this is

267
00:14:21,000 --> 00:14:23,000
specifically used for question answering.

268
00:14:23,000 --> 00:14:26,000
So I am basically going to give the hub configuration.

269
00:14:26,000 --> 00:14:27,000
See, the Hugging Face Hub.

270
00:14:27,000 --> 00:14:32,000
If you don't know about the Hugging Face Hub, there you have a lot of models which you can

271
00:14:32,000 --> 00:14:34,000
use for multiple use cases.

272
00:14:34,000 --> 00:14:37,000
So here you can see the Hugging Face model ID.

273
00:14:37,000 --> 00:14:41,000
I have to give whatever my model ID is and the kind of task that I'm looking for; I'll

274
00:14:41,000 --> 00:14:45,000
be giving these in the form of key-value pairs in this particular variable called hub.

275
00:14:45,000 --> 00:14:46,000
Okay.

276
00:14:46,000 --> 00:14:48,000
Then we go ahead and create the Hugging Face model.

277
00:14:48,000 --> 00:14:49,000
Right.

278
00:14:49,000 --> 00:14:52,000
And for that we have already imported it:

279
00:14:52,000 --> 00:14:54,000
from sagemaker.huggingface.model import HuggingFaceModel.

280
00:14:54,000 --> 00:14:54,000
Right.

281
00:14:54,000 --> 00:15:00,000
And then finally you'll be able to see that I'm passing env as hub, okay, the role I've already set up,

282
00:15:00,000 --> 00:15:04,000
the transformers version, the PyTorch version, and the Python version, 3.9.

283
00:15:04,000 --> 00:15:08,000
Okay, so these are the parameters you really need to give with respect to this.
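
The hub configuration and model creation described above can be sketched like this. Treat it as an illustration rather than the exact notebook code: the helper names are mine, the model ID is my reading of the spoken name, and the `transformers_version` and `pytorch_version` defaults are assumptions because only Python 3.9 is stated in the video. Those three versions have to match a combination the SageMaker Hugging Face containers support.

```python
def build_hub_config(model_id, task):
    """Key-value pairs telling the container which Hub model and task to load."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def create_huggingface_model(role, hub,
                             transformers_version="4.26",  # assumed value
                             pytorch_version="1.13",       # assumed value
                             py_version="py39"):
    """Create the SageMaker model object; nothing is deployed yet."""
    # Imported lazily so the config helper works without the SDK installed.
    from sagemaker.huggingface.model import HuggingFaceModel

    return HuggingFaceModel(
        env=hub,
        role=role,
        transformers_version=transformers_version,
        pytorch_version=pytorch_version,
        py_version=py_version,
    )


hub = build_hub_config(
    "distilbert-base-uncased-distilled-squad",  # question-answering model
    "question-answering",
)
print(hub)

# Hypothetical usage once a role ARN is available:
# huggingface_model = create_huggingface_model(role, hub)
```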

284
00:15:08,000 --> 00:15:08,000
Okay.

285
00:15:08,000 --> 00:15:16,000
So I've already executed this, and I don't want to deploy this entire model again in

286
00:15:16,000 --> 00:15:19,000
another Jupyter notebook because it is going to definitely take time.

287
00:15:19,000 --> 00:15:22,000
So just to give you an idea, I have already done that.

288
00:15:22,000 --> 00:15:24,000
So let me just show you over here.

289
00:15:24,000 --> 00:15:27,000
This is another instance of mine, which I was actually working on.

290
00:15:28,000 --> 00:15:32,000
So here, second, is my Hugging Face model, which I have actually given.

291
00:15:32,000 --> 00:15:35,000
Third, how to deploy it to SageMaker Inference.

292
00:15:35,000 --> 00:15:38,000
As you know, I have selected ml.m5.xlarge.

293
00:15:38,000 --> 00:15:40,000
Okay, so I am here.

294
00:15:40,000 --> 00:15:43,000
I'm writing predictor by calling deploy on the same Hugging Face model.

295
00:15:43,000 --> 00:15:46,000
I'll be setting the initial instance count to one, okay.

296
00:15:46,000 --> 00:15:48,000
Because I just require one instance.

297
00:15:48,000 --> 00:15:49,000
And what is the instance type.

298
00:15:49,000 --> 00:15:52,000
The instance type is ml.m5.xlarge, okay.

299
00:15:52,000 --> 00:15:53,000
Where I'm specifically deploying.

300
00:15:53,000 --> 00:15:57,000
Other than that, I need to work out how I should give my input.

301
00:15:57,000 --> 00:16:02,000
So I will be creating it in this particular format, because this model, that is,

302
00:16:02,000 --> 00:16:07,000
distilbert-base-uncased-distilled-squad, requires the input data in this particular format.

303
00:16:07,000 --> 00:16:10,000
That is, question: what is used for inference? Then,

304
00:16:10,000 --> 00:16:10,000
Context.

305
00:16:10,000 --> 00:16:16,000
My name is so-and-so. I've just given a small example so that I don't incur more charges.

306
00:16:16,000 --> 00:16:18,000
And then we'll go ahead and predict the data.

307
00:16:18,000 --> 00:16:24,000
So here you'll be able to see, if I go ahead and write predictor.predict, okay, with respect to this

308
00:16:24,000 --> 00:16:27,000
particular data because this data is required in that particular use case.

309
00:16:27,000 --> 00:16:32,000
So here you can see the score is this one, start is this one, end is this one, and the answer is SageMaker, right?

310
00:16:32,000 --> 00:16:34,000
So, what is used for inference?

311
00:16:34,000 --> 00:16:38,000
So here you can see that, for this particular question, you have SageMaker.

312
00:16:38,000 --> 00:16:39,000
It is going to pick up.
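
The deploy and predict steps described above look roughly like the following sketch. The helper names `build_qa_payload` and `deploy_and_ask` and the example context sentence are illustrative, not from the video; `huggingface_model` is assumed to be the `HuggingFaceModel` object created earlier. Calling `deploy_and_ask` creates a real endpoint that is billed until it is deleted.

```python
def build_qa_payload(question, context):
    """Input format expected by a question-answering pipeline."""
    return {"inputs": {"question": question, "context": context}}


def deploy_and_ask(huggingface_model, payload, instance_type="ml.m5.xlarge"):
    """Deploy to one instance, then send a single prediction request."""
    predictor = huggingface_model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
    )
    # For question answering the result is a dict with
    # score, start, end, and answer.
    return predictor, predictor.predict(payload)


payload = build_qa_payload(
    "What is used for inference?",
    # Hypothetical context sentence, for illustration only.
    "My name is Alex and this model is used with SageMaker for inference.",
)
print(payload)

# Hypothetical usage (deployment takes several minutes):
# predictor, result = deploy_and_ask(huggingface_model, payload)
# print(result["answer"])
# predictor.delete_model()
# predictor.delete_endpoint()
```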

313
00:16:39,000 --> 00:16:43,000
Understand, when I execute this code it is going to take 5 to 10 minutes to deploy.

314
00:16:43,000 --> 00:16:45,000
And you will be able to get your endpoint, okay.

315
00:16:45,000 --> 00:16:48,000
Just to give you an idea, what does an endpoint look like?

316
00:16:48,000 --> 00:16:51,000
I'll just go over here, just a second.

317
00:16:52,000 --> 00:16:56,000
What does the endpoint look like, if you really want to see?

318
00:16:56,000 --> 00:17:00,000
So here, if I go back to my SageMaker Studio, I've created multiple endpoints.

319
00:17:00,000 --> 00:17:04,000
So if you go to deployments you'll be able to see endpoints over here.

320
00:17:04,000 --> 00:17:04,000
Right.

321
00:17:04,000 --> 00:17:05,000
These two endpoints are there.

322
00:17:05,000 --> 00:17:10,000
Let's say I want to go ahead and see this particular endpoint okay I can click on this particular endpoint.

323
00:17:10,000 --> 00:17:13,000
And once we deploy it that endpoint will be created.

324
00:17:13,000 --> 00:17:18,000
If I want to test inference, you'll be able to see that I'll be getting this

325
00:17:18,000 --> 00:17:22,000
application/json content type and whatever body I'm giving, like how I'm giving over here.

326
00:17:22,000 --> 00:17:23,000
This is my entire body, right.

327
00:17:23,000 --> 00:17:28,000
And with respect to that particular body, if I give over here also and just test it, I should be able

328
00:17:28,000 --> 00:17:32,000
to get if I just click on send request, I should be able to get my output right.

329
00:17:32,000 --> 00:17:35,000
And there are also multiple options like auto scaling.

330
00:17:35,000 --> 00:17:38,000
You can scale to whatever things you specifically want based on the charges.

331
00:17:38,000 --> 00:17:43,000
But here you will be able to see that I have deployed this model on this particular

332
00:17:43,000 --> 00:17:44,000
instance.

333
00:17:44,000 --> 00:17:49,000
And you'll be able to just use that particular endpoint wherever you want in your code, anywhere

334
00:17:49,000 --> 00:17:51,000
as such, based on your requirement.

335
00:17:51,000 --> 00:17:52,000
Right.
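
To make "use that endpoint in your code" concrete, here is a minimal sketch of calling a deployed endpoint with boto3. The helper names and the endpoint name are hypothetical, not taken from the video; the `sagemaker-runtime` client and its `invoke_endpoint` call are the standard boto3 API.

```python
import json


def build_invoke_request(endpoint_name, payload):
    """Build the keyword arguments for sagemaker-runtime's invoke_endpoint."""
    return {
        "EndpointName": endpoint_name,        # hypothetical name, use your own
        "ContentType": "application/json",    # same content type the Studio test UI shows
        "Body": json.dumps(payload),
    }


def invoke(endpoint_name, payload, region_name=None):
    """Send the payload to a live endpoint (needs AWS credentials and incurs charges)."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    client = boto3.client("sagemaker-runtime", region_name=region_name)
    response = client.invoke_endpoint(**build_invoke_request(endpoint_name, payload))
    return json.loads(response["Body"].read())


request = build_invoke_request("my-hf-endpoint", {"inputs": "hello"})
print(request["ContentType"])
```

Calling `invoke("my-hf-endpoint", {...})` only works while the endpoint exists, so remember to delete it afterwards.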

336
00:17:52,000 --> 00:17:54,000
So this was the entire thing, right.

337
00:17:54,000 --> 00:17:55,000
Let's try something more.

338
00:17:55,000 --> 00:17:55,000
Okay.

339
00:17:55,000 --> 00:17:58,000
I will again create some more data over here.

340
00:17:59,000 --> 00:18:07,000
My name is Chris, and

341
00:18:09,000 --> 00:18:10,000
I teach data science.

342
00:18:10,000 --> 00:18:15,000
I'll just give this: "and I teach data science."

343
00:18:15,000 --> 00:18:16,000
Okay.

344
00:18:16,000 --> 00:18:17,000
So.

345
00:18:20,000 --> 00:18:22,000
What does Chris like?

346
00:18:24,000 --> 00:18:25,000
Okay, I'm just going to execute this.

347
00:18:25,000 --> 00:18:27,000
Let's see what I'm going to get.

348
00:18:27,000 --> 00:18:30,000
For the answer, I'm calling predictor.predict.

349
00:18:30,000 --> 00:18:33,000
So you have a predict method here.

350
00:18:33,000 --> 00:18:35,000
You can see the answer is "data science", right?

351
00:18:35,000 --> 00:18:39,000
What does Chris like? Uh, what does Chris teach?

352
00:18:39,000 --> 00:18:41,000
So you can basically do okay.

353
00:18:41,000 --> 00:18:43,000
You'll be able to get the answer data science okay.
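
The question-answering call above can be sketched as follows. The `inputs` / `question` / `context` layout is the format the Hugging Face inference container accepts for question-answering models; the helper names are hypothetical, and the response shape used in `extract_answer` (a dict with an `answer` key) is an assumption to check against your own output.

```python
def build_qa_payload(question, context):
    """Build a question-answering request body for a Hugging Face endpoint."""
    return {"inputs": {"question": question, "context": context}}


def extract_answer(response):
    """Pull the answer text out of a response (assumed to be a dict with an 'answer' key)."""
    return response["answer"]


data = build_qa_payload(
    question="What does Chris teach?",
    context="My name is Chris and I teach data science.",
)

# With a deployed endpoint you would then run:
#   response = predictor.predict(data)
#   print(extract_answer(response))
print(data)
```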

354
00:18:43,000 --> 00:18:46,000
So in short this is just a simple model, right?

355
00:18:46,000 --> 00:18:48,000
I've probably deployed this particular model.

356
00:18:48,000 --> 00:18:51,000
You'll be able to see that this is my entire SageMaker.

357
00:18:51,000 --> 00:18:56,000
And here, you know, it took some amount of time. I think these dots are like one every minute,

358
00:18:56,000 --> 00:19:01,000
around 6 to 7 minutes to deploy this entire model in this particular instance, which I've actually

359
00:19:01,000 --> 00:19:01,000
created.

360
00:19:01,000 --> 00:19:08,000
Now, just to give you an idea of what deploying an LLM looks like.

361
00:19:08,000 --> 00:19:10,000
So I'll be giving you multiple examples.

362
00:19:10,000 --> 00:19:14,000
So first of all, you have to update SageMaker, the same thing as before, keeping your role and

363
00:19:15,000 --> 00:19:16,000
region name.

364
00:19:16,000 --> 00:19:21,000
Then you are basically going to call the SageMaker image URI.

365
00:19:22,000 --> 00:19:22,000
Right?

366
00:19:22,000 --> 00:19:24,000
Since we are going to run it as a container.

367
00:19:24,000 --> 00:19:28,000
So here you can see the Hugging Face Deep Learning Container in AWS SageMaker.

368
00:19:28,000 --> 00:19:32,000
So the entire LLM will basically be put in a container.

369
00:19:32,000 --> 00:19:34,000
Then you can actually call that particular container over here.

370
00:19:34,000 --> 00:19:38,000
So in order to get the container, you will be using, from sagemaker.huggingface,

371
00:19:38,000 --> 00:19:41,000
get_huggingface_llm_image_uri.

372
00:19:41,000 --> 00:19:47,000
And then here is your version that you are specifically using and what kind of image you are actually

373
00:19:47,000 --> 00:19:47,000
coming up with.

374
00:19:47,000 --> 00:19:52,000
Okay, just to show you, if I probably give you an example, I will go ahead and install all this.

375
00:19:52,000 --> 00:19:53,000
Let's see.

376
00:19:53,000 --> 00:19:55,000
You'll be able to see the URI okay.

377
00:19:55,000 --> 00:19:59,000
And then it will probably download it and upload it.

378
00:19:59,000 --> 00:20:01,000
So there will be a lot of tasks in that.

379
00:20:01,000 --> 00:20:07,000
And a lot of money will be required because, again, at the end of the day... see over here, you

380
00:20:07,000 --> 00:20:08,000
have probably got this.

381
00:20:08,000 --> 00:20:10,000
Now let me just execute this again.

382
00:20:10,000 --> 00:20:12,000
So here is your SageMaker role.

383
00:20:12,000 --> 00:20:16,000
And then you can see which URI I am actually calling, right?

384
00:20:16,000 --> 00:20:20,000
So if I probably go and see this is the URI that is basically required right.

385
00:20:20,000 --> 00:20:21,000
The image URI.

386
00:20:21,000 --> 00:20:23,000
So if you know about Docker and containers.

387
00:20:23,000 --> 00:20:27,000
So once you get the URI, then you should be able to load the Hugging Face model.

388
00:20:27,000 --> 00:20:35,000
Now here also you can see we are calling Falcon 40B Instruct, and the number of GPUs, how

389
00:20:35,000 --> 00:20:35,000
many you want.

390
00:20:35,000 --> 00:20:38,000
And just imagine that instance type that we are going to use.

391
00:20:38,000 --> 00:20:40,000
is ml.g5.12xlarge.

392
00:20:40,000 --> 00:20:44,000
I'll just show you the cost and you will be pretty much shocked to see the cost.

393
00:20:45,000 --> 00:20:51,000
So here you can see the instance storage in GB is 1 x 3800.

394
00:20:51,000 --> 00:20:54,000
Total GPU memory is 96.

395
00:20:54,000 --> 00:20:57,000
You know, memory per GPU is 24.

396
00:20:57,000 --> 00:20:58,000
Network bandwidth is 40.

397
00:20:59,000 --> 00:21:03,000
And the GPU model is NVIDIA A10G, right?

398
00:21:03,000 --> 00:21:07,000
Memory is nothing but 192, and CPU cores is nothing but 48.

399
00:21:07,000 --> 00:21:09,000
Just imagine how much the charges will be.

400
00:21:09,000 --> 00:21:14,000
So that is the reason why I have actually taken a small model and shown it to you. Later on,

401
00:21:14,000 --> 00:21:18,000
you can do the same with any model once you are working in a company, right?

402
00:21:18,000 --> 00:21:21,000
So that is the reason we have selected all these parameters over here.

403
00:21:21,000 --> 00:21:23,000
Then we initialize the Hugging Face model.

404
00:21:23,000 --> 00:21:24,000
Then we deploy it.

405
00:21:24,000 --> 00:21:25,000
That's it.

406
00:21:25,000 --> 00:21:25,000
Right.
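
The steps just described (get the LLM container image URI, set the model parameters, initialize the Hugging Face model, deploy) can be sketched like this. The model ID `tiiuae/falcon-40b-instruct`, the GPU count of 4, and the timeout value are assumptions based on the instance shown, not values confirmed from the notebook; `build_llm_config` and `deploy_llm` are hypothetical helper names.

```python
def build_llm_config(model_id, num_gpus):
    """Environment variables passed to the Hugging Face LLM container."""
    return {
        "HF_MODEL_ID": model_id,       # model to pull from the Hugging Face Hub
        "SM_NUM_GPUS": str(num_gpus),  # number of GPUs to shard the model across
    }


def deploy_llm(role, config, instance_type="ml.g5.12xlarge"):
    """Deploy the model to a real endpoint (requires the sagemaker SDK; incurs charges)."""
    # Imported lazily so the config helper above runs without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    image_uri = get_huggingface_llm_image_uri("huggingface")  # latest LLM container
    model = HuggingFaceModel(role=role, image_uri=image_uri, env=config)
    return model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        container_startup_health_check_timeout=600,  # assumed value; large models load slowly
    )


config = build_llm_config("tiiuae/falcon-40b-instruct", num_gpus=4)
print(config)
```

`deploy_llm(role, config)` returns a predictor; it is left uncalled here because a GPU instance of this size is expensive to keep running.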

407
00:21:25,000 --> 00:21:29,000
And then after that you can probably use the same payload.

408
00:21:29,000 --> 00:21:32,000
You can structure the payload over here with the prompt.

409
00:21:32,000 --> 00:21:34,000
And then you will be able to get the text okay.
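
A minimal sketch of structuring the prompt payload and reading back the text. The `inputs` / `parameters` layout and the `generated_text` key follow the Hugging Face LLM container's usual request and response format; the parameter values and helper names are illustrative assumptions.

```python
def build_prompt_payload(prompt, max_new_tokens=256, temperature=0.7):
    """Build a text-generation request body for an LLM endpoint."""
    return {
        "inputs": prompt,
        "parameters": {
            "do_sample": True,
            "temperature": temperature,        # illustrative default
            "max_new_tokens": max_new_tokens,  # illustrative default
        },
    }


def extract_text(response):
    """Return the generated text from a response shaped like [{'generated_text': ...}]."""
    return response[0]["generated_text"]


payload = build_prompt_payload("What is Amazon SageMaker?")

# With a deployed LLM endpoint you would then run:
#   response = predictor.predict(payload)
#   print(extract_text(response))
print(payload["parameters"])
```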

410
00:21:34,000 --> 00:21:39,000
So anyhow, I will be giving you all these examples in the description of this particular video.

411
00:21:39,000 --> 00:21:44,000
But again, yes, I've said that I've been uploading videos on AWS SageMaker and all, but definitely

412
00:21:44,000 --> 00:21:49,000
there are a lot of charges involved, but it is always good to have knowledge about it.

413
00:21:49,000 --> 00:21:51,000
That is the main aim of this particular video.

414
00:21:51,000 --> 00:21:53,000
So I hope you like this particular video.

415
00:21:53,000 --> 00:21:54,000
This was it from my side.

416
00:21:54,000 --> 00:21:57,000
Again, you can refer to multiple examples.

417
00:21:57,000 --> 00:22:03,000
Here I will also give you this GitHub link from the documentation of Hugging Face.

418
00:22:03,000 --> 00:22:07,000
So there are multiple labs which have been created, which you can also go ahead and refer to.

419
00:22:07,000 --> 00:22:08,000
Okay.

420
00:22:08,000 --> 00:22:13,000
So this uh, some from our set copy this.

421
00:22:13,000 --> 00:22:18,000
So these are amazing lab details that are provided. I will be providing you this entire code

422
00:22:18,000 --> 00:22:19,000
itself.

423
00:22:19,000 --> 00:22:23,000
Other than that, you can also easily see the documentation of Hugging Face on SageMaker.

424
00:22:23,000 --> 00:22:24,000
So yes, this was it from my side.

425
00:22:24,000 --> 00:22:25,000
I'll see you in the next video.

426
00:22:25,000 --> 00:22:26,000
Have a great day ahead.

427
00:22:26,000 --> 00:22:27,000
Thank you all.

428
00:22:27,000 --> 00:22:27,000
Take care.

429
00:22:27,000 --> 00:22:28,000
Bye bye.

