1
00:00:00,000 --> 00:00:00,000
Hello guys.

2
00:00:00,000 --> 00:00:04,000
So we are going to continue the discussion with respect to creating the chatbot.

3
00:00:04,000 --> 00:00:08,000
In our previous video we already discussed the prompt template with

4
00:00:08,000 --> 00:00:10,000
the conversation history.

5
00:00:10,000 --> 00:00:16,000
Now we are going to focus on how to manage the conversation history.

6
00:00:16,000 --> 00:00:21,000
One important concept to understand when building a chatbot is how to manage conversation history.

7
00:00:21,000 --> 00:00:27,000
If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window

8
00:00:27,000 --> 00:00:27,000
of the LLM.

9
00:00:27,000 --> 00:00:32,000
Therefore, it is important to add a step that limits the size of the messages that you are passing

10
00:00:32,000 --> 00:00:32,000
in.
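To make the overflow concrete, here is a minimal sketch in plain Python. It is not LangChain code: the 20-word budget and the one-token-per-word count are hypothetical stand-ins for a real context window and a real tokenizer.

```python
# Hypothetical budget standing in for the model's context window.
BUDGET_WORDS = 20

history = []  # grows by two messages on every turn
for turn in range(1, 6):
    history.append(("human", f"question number {turn} from the user"))
    history.append(("ai", f"answer number {turn} from the model"))
    # Crude stand-in for token counting: one token per word.
    used = sum(len(text.split()) for _, text in history)
    print(f"turn {turn}: {used} words, over budget: {used > BUDGET_WORDS}")
```

With nothing limiting the list, the count passes the budget on the second turn and keeps climbing.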

11
00:00:32,000 --> 00:00:36,000
Okay, so let me show you how it can be done.

12
00:00:36,000 --> 00:00:42,000
So here I will import one important library.

13
00:00:42,000 --> 00:00:49,000
I'm going to write from langchain_core.messages.

14
00:00:49,000 --> 00:00:52,000
And here we are going to import SystemMessage.

15
00:00:52,000 --> 00:00:53,000
So SystemMessage.

16
00:00:53,000 --> 00:00:59,000
Along with that, we are also going to use one more helper, which is called trim_messages.

17
00:00:59,000 --> 00:01:02,000
Now, what exactly is trim_messages?

18
00:01:02,000 --> 00:01:06,000
Okay so here I will just go ahead and write over here.

19
00:01:06,000 --> 00:01:06,000
Okay.

20
00:01:06,000 --> 00:01:12,000
So trim_messages.

21
00:01:12,000 --> 00:01:14,000
This will help.

22
00:01:14,000 --> 00:01:20,000
This is a helper that reduces how many messages we are sending to the model.

23
00:01:20,000 --> 00:01:23,000
Okay, so let me write some brief information over here.

24
00:01:23,000 --> 00:01:29,000
So: a helper to reduce how many messages we are sending to the model.

25
00:01:29,000 --> 00:01:34,000
The trimmer allows us to specify how many tokens we want to keep along with the other parameters.

26
00:01:34,000 --> 00:01:38,000
like whether you always want to keep the system message and whether to allow partial

27
00:01:38,000 --> 00:01:39,000
messages or not.

28
00:01:39,000 --> 00:01:43,000
Okay, so this is specifically what we do with the help of trim_messages.

29
00:01:43,000 --> 00:01:48,000
And we have already imported trim_messages over here, okay.

30
00:01:48,000 --> 00:01:51,000
Now quickly I will go ahead and initialize this trimmer.

31
00:01:51,000 --> 00:01:54,000
And I'll go ahead and write trim_messages.

32
00:01:54,000 --> 00:01:56,000
Inside this there are some parameters.

33
00:01:56,000 --> 00:02:01,000
There are some parameters that we will be playing with, so the first parameter over here

34
00:02:01,000 --> 00:02:03,000
is nothing but max_tokens.

35
00:02:03,000 --> 00:02:05,000
Let's say I want to set max_tokens to 70.

36
00:02:05,000 --> 00:02:06,000
I want to limit it to 70.

37
00:02:06,000 --> 00:02:07,000
Okay.

38
00:02:07,000 --> 00:02:10,000
Then I will ask for a strategy.

39
00:02:10,000 --> 00:02:14,000
We will go ahead and say, hey, what strategy do you need to follow for trimming it?

40
00:02:14,000 --> 00:02:15,000
Right.

41
00:02:15,000 --> 00:02:16,000
And there are multiple strategies.

42
00:02:16,000 --> 00:02:19,000
One of the strategies over here is the one that we are going to use.

43
00:02:19,000 --> 00:02:24,000
Last: if I say last, it is going to focus on the most recent part of the conversation.

44
00:02:24,000 --> 00:02:27,000
From there it is going to count the tokens backward until it reaches the limit.

45
00:02:27,000 --> 00:02:27,000
Okay.
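As a rough illustration of what the last strategy means, here is a plain-Python sketch. It is not LangChain's implementation: keep_last and count_words are hypothetical names, and counting one token per word stands in for a real token counter.

```python
def count_words(messages):
    """Stand-in token counter: one token per whitespace-separated word."""
    return sum(len(content.split()) for _, content in messages)


def keep_last(messages, max_tokens, token_counter=count_words):
    """Walk backward from the newest message and keep whole messages
    for as long as the running total stays within max_tokens."""
    kept = []
    for message in reversed(messages):
        if token_counter([message] + kept) > max_tokens:
            break
        kept.insert(0, message)
    return kept


history = [
    ("human", "hi I am Bob"),
    ("ai", "hi"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "what is 2 + 2"),
    ("ai", "4"),
    ("human", "thanks"),
]

for role, content in keep_last(history, max_tokens=8):
    print(role, content)
```

With a budget of 8 words, only the last four messages survive; the vanilla ice cream message would push the total to 13, so the walk stops there.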

46
00:02:28,000 --> 00:02:34,000
Then there is also one more parameter, which is called token_counter.

47
00:02:34,000 --> 00:02:37,000
See, max_tokens is the maximum token count of the trimmed messages.

48
00:02:37,000 --> 00:02:40,000
token_counter basically means a function or LLM for counting tokens.

49
00:02:40,000 --> 00:02:41,000
Right.

50
00:02:41,000 --> 00:02:46,000
So here, if I write token_counter, I'm going to pass my model, which will be used

51
00:02:46,000 --> 00:02:48,000
for counting the tokens.

52
00:02:48,000 --> 00:02:49,000
Right.

53
00:02:49,000 --> 00:02:52,000
Then I will say include_system.

54
00:02:52,000 --> 00:02:55,000
So there are more parameters over here.

55
00:02:55,000 --> 00:03:01,000
If you see over here there is also maybe one more parameter not able to see it over here.

56
00:03:01,000 --> 00:03:02,000
And this is one example.

57
00:03:02,000 --> 00:03:02,000
It is given.

58
00:03:02,000 --> 00:03:03,000
Right.

59
00:03:03,000 --> 00:03:07,000
I will discuss it, okay? I'll give you one very good example over here.

60
00:03:07,000 --> 00:03:12,000
So here we will go ahead and set include_system equal to True.

61
00:03:12,000 --> 00:03:16,000
I'm saying, hey, include the system message, because the system message is important.

62
00:03:16,000 --> 00:03:17,000
We cannot remove it.

63
00:03:17,000 --> 00:03:21,000
We cannot, because that is where we say what exactly the LLM should do.

64
00:03:21,000 --> 00:03:26,000
Okay, then I'll say allow_partial equal to False.

65
00:03:26,000 --> 00:03:29,000
So I don't want partial messages; I just want to trim whole messages.

66
00:03:29,000 --> 00:03:30,000
That's it.

67
00:03:30,000 --> 00:03:35,000
And here I will set start_on, which says where the trimmed history needs to start.

68
00:03:35,000 --> 00:03:38,000
It should start from the human message, okay.

69
00:03:38,000 --> 00:03:39,000
Perfect.

70
00:03:39,000 --> 00:03:42,000
So this is the trimmer that I'm actually going to use okay.
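In the video the trimmer is built as trim_messages(max_tokens=70, strategy="last", token_counter=model, include_system=True, allow_partial=False, start_on="human"). The sketch below mimics those options in plain Python so it can run without a model. It is an approximation under stated assumptions, not LangChain's code: trim_last and count_words are hypothetical names, and tokens are counted as words.

```python
def count_words(messages):
    """Stand-in token counter: one token per whitespace-separated word."""
    return sum(len(content.split()) for _, content in messages)


def trim_last(messages, max_tokens, include_system=True, start_on="human"):
    rest = list(messages)
    system = []
    # include_system: set the leading system message aside so it is always kept.
    if include_system and rest and rest[0][0] == "system":
        system, rest = rest[:1], rest[1:]
    budget = max_tokens - count_words(system)
    # Keep whole messages only (no partial messages), newest first.
    kept = []
    for message in reversed(rest):
        if count_words([message] + kept) > budget:
            break
        kept.insert(0, message)
    # start_on: drop leading messages until the window opens on that role.
    while kept and kept[0][0] != start_on:
        kept.pop(0)
    return system + kept


history = [
    ("system", "you are a good assistant"),
    ("human", "hi I am Bob"),
    ("ai", "hi"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "what is 2 + 2"),
    ("ai", "4"),
    ("human", "thanks"),
]

trimmed = trim_last(history, max_tokens=13)
for role, content in trimmed:
    print(role, content)
```

Here the system message uses 5 of the 13 words, the remaining 8 cover the last four messages, and the leading AI message is then dropped so the window starts on a human message.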

71
00:03:42,000 --> 00:03:44,000
Now let me do let me do one thing.

72
00:03:44,000 --> 00:03:49,000
Let me keep a set of messages. Okay, so here is my set of messages.

73
00:03:49,000 --> 00:03:51,000
So here you can see the system message content:

74
00:03:51,000 --> 00:03:52,000
You're a good assistant.

75
00:03:52,000 --> 00:03:55,000
Then the human message says, hi, I am Bob.

76
00:03:55,000 --> 00:03:56,000
Then the AI message says hi.

77
00:03:56,000 --> 00:03:59,000
The human message says I like vanilla ice cream.

78
00:03:59,000 --> 00:04:00,000
The AI message says nice.

79
00:04:00,000 --> 00:04:02,000
This is the response from the LLM.

80
00:04:02,000 --> 00:04:06,000
Then the human message asks what is two plus two, and the AI message says four.

81
00:04:06,000 --> 00:04:08,000
Human message says thanks.

82
00:04:08,000 --> 00:04:12,000
So this is the conversation that has happened between the human and the AI.

83
00:04:12,000 --> 00:04:12,000
Okay.

84
00:04:13,000 --> 00:04:19,000
So now if I take this set of messages, and if I use

85
00:04:19,000 --> 00:04:23,000
trimmer.invoke and pass it over here, now see the magic.

86
00:04:23,000 --> 00:04:26,000
See what will happen if I execute it.

87
00:04:26,000 --> 00:04:27,000
Okay.

88
00:04:27,000 --> 00:04:29,000
trim_messages, allow_partials.

89
00:04:29,000 --> 00:04:30,000
Okay.

90
00:04:30,000 --> 00:04:32,000
Allow partial should not be there or what?

91
00:04:32,000 --> 00:04:36,000
So here you can see that instead of allow_partials, it should be allow_partial.

92
00:04:36,000 --> 00:04:36,000
Okay.

93
00:04:36,000 --> 00:04:42,000
Now, if I go ahead and execute it here, you'll be able to see that it will apply all the trimming

94
00:04:42,000 --> 00:04:44,000
on this with the help of that same model.

95
00:04:44,000 --> 00:04:49,000
Now you can see the system message has been kept, and here you can see the human message.

96
00:04:49,000 --> 00:04:52,000
Hi I am Bob so hi I am Bob.

97
00:04:52,000 --> 00:04:54,000
Over here you can see hi I like vanilla ice cream.

98
00:04:54,000 --> 00:04:55,000
So everything is basically taken.

99
00:04:55,000 --> 00:04:58,000
The reason is I've taken all the 65 tokens.

100
00:04:58,000 --> 00:05:03,000
Sorry, if I go ahead and count it, the total number of tokens may be 65 or

101
00:05:03,000 --> 00:05:04,000
70 over there.

102
00:05:04,000 --> 00:05:07,000
So for this I will just go ahead and use 45 tokens.

103
00:05:07,000 --> 00:05:09,000
Now let's see whether it will change.

104
00:05:09,000 --> 00:05:11,000
So here you can see you are a good assistant.

105
00:05:11,000 --> 00:05:14,000
And it has started now from I like vanilla ice cream.

106
00:05:14,000 --> 00:05:17,000
See I like vanilla ice cream.

107
00:05:17,000 --> 00:05:17,000
Right.

108
00:05:17,000 --> 00:05:21,000
So you can see from here that the top two have been trimmed off.

109
00:05:21,000 --> 00:05:22,000
Right.

110
00:05:22,000 --> 00:05:22,000
Not top two.

111
00:05:22,000 --> 00:05:25,000
The top two messages, the first human and AI exchange, have been trimmed off.

112
00:05:25,000 --> 00:05:26,000
Okay.

113
00:05:26,000 --> 00:05:29,000
And you can see that all the remaining messages are still there.

114
00:05:29,000 --> 00:05:33,000
And this is how you specifically manage the entire information.

115
00:05:33,000 --> 00:05:37,000
We don't need to pass the entire context every time.

116
00:05:37,000 --> 00:05:42,000
The reason is very simple because there is a limitation with respect to every context window in our

117
00:05:42,000 --> 00:05:43,000
LLM models.

118
00:05:43,000 --> 00:05:43,000
Okay.

119
00:05:43,000 --> 00:05:46,000
Now let's do one thing.

120
00:05:46,000 --> 00:05:49,000
Uh, we can also go ahead and use chains.

121
00:05:49,000 --> 00:05:50,000
Right.

122
00:05:50,000 --> 00:05:56,000
Right now we are just passing the whole list of messages, and every time we

123
00:05:56,000 --> 00:05:58,000
need to pass it through this trimmer.

124
00:05:58,000 --> 00:05:59,000
Right.

125
00:05:59,000 --> 00:06:01,000
So in a chain, how do we pass it?

126
00:06:01,000 --> 00:06:06,000
Okay, so if I'm creating a chain, how do I pass this particular

127
00:06:06,000 --> 00:06:08,000
trimmer function? That is what we are going to see.

128
00:06:08,000 --> 00:06:12,000
So here I will write from operator import itemgetter, okay.

129
00:06:12,000 --> 00:06:16,000
And I'll tell you what exactly we do with this itemgetter later.

130
00:06:16,000 --> 00:06:20,000
First of all, I'll import my runnable library.

131
00:06:20,000 --> 00:06:26,000
So here I will say from langchain_core.runnables, okay.

132
00:06:27,000 --> 00:06:31,000
And from runnables I'm going to import RunnablePassthrough, okay.

133
00:06:31,000 --> 00:06:35,000
So it will be RunnablePassthrough, okay.

134
00:06:35,000 --> 00:06:40,000
Now once I have this RunnablePassthrough, I think it should be imported.

135
00:06:40,000 --> 00:06:41,000
Lang chain.

136
00:06:41,000 --> 00:06:42,000
Lang chain.

137
00:06:42,000 --> 00:06:43,000
Okay.

138
00:06:43,000 --> 00:06:45,000
Now see this?

139
00:06:45,000 --> 00:06:46,000
Whenever I create my chain.

140
00:06:46,000 --> 00:06:50,000
Okay, I know that first when I'm creating this chain.

141
00:06:50,000 --> 00:06:57,000
So now if I want to apply the trimmer function to my list

142
00:06:57,000 --> 00:07:00,000
of messages, I can use this RunnablePassthrough.

143
00:07:00,000 --> 00:07:03,000
And this I will be providing in my chain itself.

144
00:07:03,000 --> 00:07:04,000
I'll say .assign.

145
00:07:05,000 --> 00:07:11,000
And inside this I will write messages equal to itemgetter, okay.

146
00:07:11,000 --> 00:07:14,000
And here the key that we pass to itemgetter is messages.

147
00:07:14,000 --> 00:07:19,000
By using this itemgetter of messages, we will be able to retrieve the entire list of messages that is available

148
00:07:19,000 --> 00:07:20,000
in the prompt template.

149
00:07:20,000 --> 00:07:23,000
And along with this we will apply the trimmer.
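In the video this step is written with RunnablePassthrough.assign and itemgetter("messages") followed by the trimmer. The sketch below shows the same idea with the standard library only. The trimmer and assign_messages functions here are hypothetical stand-ins, not LangChain objects.

```python
from operator import itemgetter


def trimmer(messages):
    """Hypothetical stand-in for the real trimmer: keep the last two messages."""
    return messages[-2:]


def assign_messages(inputs):
    """Return a copy of the input dict with 'messages' replaced by the
    trimmed list, leaving every other key untouched."""
    get_messages = itemgetter("messages")  # same as inputs["messages"]
    return {**inputs, "messages": trimmer(get_messages(inputs))}


inputs = {"messages": ["m1", "m2", "m3"], "language": "English"}
print(assign_messages(inputs))
```

The point of the assign step is that only the messages key is rewritten; the language key passes through unchanged, so the prompt still receives both inputs.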

150
00:07:23,000 --> 00:07:24,000
Okay.

151
00:07:24,000 --> 00:07:30,000
So, step by step, I have applied this over here, right?

152
00:07:30,000 --> 00:07:38,000
The next important thing is that, along with this, I will chain it with the

153
00:07:38,000 --> 00:07:38,000
prompt.

154
00:07:38,000 --> 00:07:38,000
Okay.

155
00:07:38,000 --> 00:07:41,000
Then I will go ahead and concatenate.

156
00:07:41,000 --> 00:07:43,000
And this is basically using LCEL, right?

157
00:07:43,000 --> 00:07:45,000
I will continue with the model, okay.

158
00:07:45,000 --> 00:07:46,000
So this basically becomes my chain.
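The chain runs three steps in order: trim, then prompt, then model. As a rough picture of that left-to-right flow, here is a plain-Python sketch. The pipe helper and all three step functions are hypothetical stand-ins; in the video the real pieces are joined with the LCEL pipe operator.

```python
from functools import reduce


def pipe(*steps):
    """Compose steps left to right, feeding each output into the next step."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def trim_step(inputs):
    # Hypothetical trimmer: keep only the two most recent messages.
    return {**inputs, "messages": inputs["messages"][-2:]}


def prompt_step(inputs):
    # Hypothetical prompt: a system line that uses `language`, then the messages.
    system = f"You are a helpful assistant. Answer in {inputs['language']}."
    return [system, *inputs["messages"]]


def model_step(prompt_messages):
    # Hypothetical model: reports how many messages it received.
    return f"received {len(prompt_messages)} messages"


chain = pipe(trim_step, prompt_step, model_step)
response = chain({"messages": ["m1", "m2", "m3"], "language": "English"})
print(response)
```

The stand-in model only ever sees what survives the trim step, which is why anything trimmed away cannot influence the answer.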

159
00:07:46,000 --> 00:07:49,000
Now if I say chain.response,

160
00:07:50,000 --> 00:07:53,000
sorry, chain.invoke.

161
00:07:53,000 --> 00:07:56,000
We have to call invoke, okay.

162
00:07:56,000 --> 00:07:58,000
And here we will be passing the parameters.

163
00:07:58,000 --> 00:08:01,000
The first parameter that we are going to pass is nothing but messages.

164
00:08:01,000 --> 00:08:04,000
And this will basically go with all my messages that I have.

165
00:08:04,000 --> 00:08:05,000
Okay.

166
00:08:06,000 --> 00:08:09,000
So all the messages over here you can see right.

167
00:08:09,000 --> 00:08:12,000
This is the messages that I have actually created okay.

168
00:08:12,000 --> 00:08:14,000
And along with this messages I will pass it over here.

169
00:08:14,000 --> 00:08:18,000
I will say, hey, let's add my human message to this.

170
00:08:18,000 --> 00:08:21,000
So it will be my HumanMessage.

171
00:08:21,000 --> 00:08:27,000
I'll say content: what ice cream do I like?

172
00:08:27,000 --> 00:08:32,000
Okay, I'll just go ahead and ask what ice cream do I like?

173
00:08:32,000 --> 00:08:34,000
So this will basically be my content.

174
00:08:34,000 --> 00:08:37,000
I should not give this in this one.

175
00:08:37,000 --> 00:08:38,000
So this will basically be my content okay.

176
00:08:38,000 --> 00:08:42,000
Content over here and this will be my first information.

177
00:08:42,000 --> 00:08:46,000
And again, we need to give this entire human message;

178
00:08:46,000 --> 00:08:47,000
I need to give it in the form of a list.

179
00:08:47,000 --> 00:08:50,000
So let me just go ahead and close this.

180
00:08:50,000 --> 00:08:54,000
Now, for the second parameter, I will write language

181
00:08:54,000 --> 00:08:58,000
equal to English, because in the prompt template we have used two parameters.

182
00:08:59,000 --> 00:08:59,000
Right.

183
00:08:59,000 --> 00:09:03,000
So finally I will store this in my response.

184
00:09:03,000 --> 00:09:09,000
And here I will display response.text, or sorry,

185
00:09:09,000 --> 00:09:10,000
response.content.

186
00:09:10,000 --> 00:09:15,000
So once I execute it, I'm getting an error about messages over here.

187
00:09:15,000 --> 00:09:16,000
Okay.

188
00:09:16,000 --> 00:09:19,000
This should be in the form of key-value pairs.

189
00:09:19,000 --> 00:09:20,000
Okay.

190
00:09:20,000 --> 00:09:22,000
So guys, now let's go ahead and execute this.

191
00:09:22,000 --> 00:09:26,000
So once I execute, you can see: as an AI, I don't have access to your personal preferences.

192
00:09:26,000 --> 00:09:31,000
Why are we not able to find out what ice cream I like,

193
00:09:31,000 --> 00:09:32,000
right?

194
00:09:32,000 --> 00:09:36,000
The reason is very simple: the trimmer is applying this limit of 45 tokens, right,

195
00:09:36,000 --> 00:09:42,000
max_tokens, and when it applies the trimming to these messages, that context is no longer there.

196
00:09:42,000 --> 00:09:42,000
Right?

197
00:09:42,000 --> 00:09:46,000
So that is the reason you can see over here that I'm not getting that particular information.

198
00:09:46,000 --> 00:09:49,000
So it is saying I don't know your favorite ice cream over here.

199
00:09:49,000 --> 00:09:54,000
And I think this one, I like vanilla ice cream, has also gone, okay,

200
00:09:54,000 --> 00:09:57,000
since we have changed the

201
00:09:57,000 --> 00:09:58,000
maximum number of tokens to 45.

202
00:09:58,000 --> 00:09:59,000
Okay.

203
00:09:59,000 --> 00:10:03,000
But let's see whether it will be able to remember this:

204
00:10:03,000 --> 00:10:04,000
What is two plus two.

205
00:10:04,000 --> 00:10:10,000
So here I will quickly invoke the chain one more time.

206
00:10:10,000 --> 00:10:15,000
And this time I will ask: what math problem did I ask?

207
00:10:15,000 --> 00:10:19,000
So if I execute it, you can see: you asked what is two plus two.

208
00:10:19,000 --> 00:10:19,000
Okay.

209
00:10:20,000 --> 00:10:24,000
So this way you are able to get the information that is still in the trimmed history.

210
00:10:24,000 --> 00:10:24,000
Okay.

211
00:10:25,000 --> 00:10:29,000
Um, again uh we can go ahead and again still work with.

212
00:10:29,000 --> 00:10:37,000
Now let's wrap this in the message history, okay.

213
00:10:37,000 --> 00:10:41,000
We will wrap it in the message history.

214
00:10:41,000 --> 00:10:47,000
Now in order to wrap it again, what I really need to do is that I will go ahead and create my config

215
00:10:47,000 --> 00:10:48,000
with message history.

216
00:10:48,000 --> 00:10:54,000
So I will write with_message_history: RunnableWithMessageHistory, the chain, and get_session_history for the session

217
00:10:54,000 --> 00:10:54,000
history.

218
00:10:54,000 --> 00:10:57,000
And here you'll be able to see all this information.

219
00:10:57,000 --> 00:10:57,000
Okay.

220
00:10:57,000 --> 00:11:01,000
Um, now I will just go ahead and set up my config.

221
00:11:01,000 --> 00:11:05,000
So my config will be nothing but configurable.

222
00:11:05,000 --> 00:11:07,000
This is one of the keys that is used.

223
00:11:07,000 --> 00:11:16,000
And I will write my session_id, which will be nothing but chat5, okay,

224
00:11:16,000 --> 00:11:18,000
so let's say this will be my chat5.

225
00:11:18,000 --> 00:11:21,000
So here I'll just go ahead and execute it okay.
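To show why a brand-new session id such as chat5 starts with no memory, here is a minimal plain-Python sketch of a per-session history store. It only imitates the idea: the store dict, this version of get_session_history, and the invoke wrapper are hypothetical stand-ins, and the reply text is made up for illustration.

```python
store = {}  # session_id -> list of (role, content) messages


def get_session_history(session_id):
    """Return the history list for a session, creating it on first use."""
    return store.setdefault(session_id, [])


def invoke(user_text, config):
    """Hypothetical wrapper: look up the session, record the turn, and
    report how many earlier messages the model would have seen."""
    history = get_session_history(config["configurable"]["session_id"])
    seen_before = len(history)
    history.append(("human", user_text))
    history.append(("ai", f"(reply after seeing {seen_before} earlier messages)"))
    return history[-1][1]


config = {"configurable": {"session_id": "chat5"}}
print(invoke("what's my name?", config))
print(invoke("what math problem did I ask?", config))
```

The first call in a fresh session sees zero earlier messages, so nothing about the name or the math problem from the earlier hand-written list is available to it.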

226
00:11:21,000 --> 00:11:27,000
Now based on this particular context, let me have some conversation, okay.

227
00:11:28,000 --> 00:11:32,000
And here I will ask: what's my name?

228
00:11:32,000 --> 00:11:34,000
Obviously, I'm not going to get anything, right?

229
00:11:34,000 --> 00:11:36,000
So human message is this.

230
00:11:36,000 --> 00:11:38,000
And I'm calling this particular configuration.

231
00:11:38,000 --> 00:11:42,000
And even if I ask about the math problem, right,

232
00:11:42,000 --> 00:11:47,000
what math problem did I ask, again it shows me that, as a large language model, I have no memory of the

233
00:11:47,000 --> 00:11:49,000
past conversation.

234
00:11:49,000 --> 00:11:51,000
So, uh, this was it.

235
00:11:51,000 --> 00:11:54,000
You know, I hope you were able to understand this entire thing.

236
00:11:55,000 --> 00:11:56,000
And this is important.

237
00:11:56,000 --> 00:11:59,000
Whenever you really want to work with the chat conversation history.

238
00:11:59,000 --> 00:11:59,000
Right.

239
00:11:59,000 --> 00:12:03,000
So what did we do in all these three or four videos?

240
00:12:03,000 --> 00:12:04,000
We played with messages.

241
00:12:04,000 --> 00:12:09,000
We played with the trimmer, we played with managing the conversation history with the prompt template, and how you

242
00:12:09,000 --> 00:12:13,000
can actually work with the entire conversation history.

243
00:12:13,000 --> 00:12:16,000
We have discussed all these things.

244
00:12:16,000 --> 00:12:19,000
And these are some of the very important components that will be required whenever you're building your

245
00:12:19,000 --> 00:12:20,000
chatbot.

246
00:12:20,000 --> 00:12:22,000
So yes, this was it from my side.

247
00:12:22,000 --> 00:12:23,000
I hope you liked this particular video.

248
00:12:23,000 --> 00:12:24,000
I'll see you all in the next video.

249
00:12:24,000 --> 00:12:25,000
Thank you.