1
00:00:00,000 --> 00:00:01,000
Hello guys.

2
00:00:01,000 --> 00:00:04,000
So we are going to continue our discussion of text summarization.

3
00:00:04,000 --> 00:00:08,000
We have already seen stuff document chain summarization.

4
00:00:08,000 --> 00:00:11,000
Then we saw map reduce chain summarization.

5
00:00:11,000 --> 00:00:16,000
And finally we are going to go ahead and explore refine chain summarization.

6
00:00:16,000 --> 00:00:22,000
Now this is just a slight modification compared to map reduce.

7
00:00:22,000 --> 00:00:25,000
Let's say you have this entire document.

8
00:00:25,000 --> 00:00:25,000
Okay.

9
00:00:26,000 --> 00:00:32,000
Now when we have this entire document, let me just go ahead and erase this and write it in a better

10
00:00:32,000 --> 00:00:32,000
way.

11
00:00:32,000 --> 00:00:35,000
So let's say I have a document over here.

12
00:00:37,000 --> 00:00:39,000
So these are my documents.

13
00:00:39,000 --> 00:00:43,000
Let's say I have divided these documents into chunks, right?

14
00:00:43,000 --> 00:00:44,000
Smaller chunks.

15
00:00:44,000 --> 00:00:46,000
So this is my chunk one.

16
00:00:46,000 --> 00:00:46,000
Chunk two.

17
00:00:46,000 --> 00:00:48,000
Like this, I have a lot of chunks.

18
00:00:49,000 --> 00:00:57,000
Now, whenever we use refine chain summarization, we will take this first chunk.

19
00:00:57,000 --> 00:00:59,000
We will give it to a prompt template.

20
00:01:01,000 --> 00:01:07,000
Along with this prompt, we will pass it to the LLM to get our summary.

21
00:01:07,000 --> 00:01:10,000
So let's say once I get this summarization.

22
00:01:12,000 --> 00:01:17,000
Then this is one specific output that we are going to get.

23
00:01:17,000 --> 00:01:17,000
Okay.

24
00:01:17,000 --> 00:01:24,000
Now for the second chunk, what will happen before sending this to the prompt?

25
00:01:27,000 --> 00:01:30,000
This to the prompt and the LLM.

26
00:01:31,000 --> 00:01:37,000
We are also going to pass the previous summarization result along with this.

27
00:01:37,000 --> 00:01:38,000
Right.

28
00:01:38,000 --> 00:01:44,000
So if I want to explain this, the definition given over here is that refine basically

29
00:01:44,000 --> 00:01:48,000
means updating a rolling summary by iterating over the document in sequence.

30
00:01:48,000 --> 00:01:48,000
Okay.

31
00:01:49,000 --> 00:01:56,000
If I want to talk about refine, this basically means that for every chunk, whenever we are

32
00:01:56,000 --> 00:02:01,000
passing it to the prompt, once we get the summary of that chunk, it is going to be combined with the

33
00:02:01,000 --> 00:02:02,000
next chunk.

34
00:02:02,000 --> 00:02:04,000
And then again it is going to be sent to the prompt and LLM.

35
00:02:04,000 --> 00:02:08,000
We will get our next summarization over here.

36
00:02:08,000 --> 00:02:08,000
Okay.

37
00:02:08,000 --> 00:02:12,000
So let's say that I have got my next summary here, right?

38
00:02:13,000 --> 00:02:14,000
And then what will happen?

39
00:02:14,000 --> 00:02:17,000
This summary will later be combined with the next chunk.

40
00:02:17,000 --> 00:02:23,000
And here we will be able to pass it to another prompt and another

41
00:02:23,000 --> 00:02:24,000
LLM.

42
00:02:24,000 --> 00:02:27,000
So this is basically getting rolled up right.

43
00:02:27,000 --> 00:02:32,000
So that is the reason the documentation describes it as updating a rolling summary:

44
00:02:32,000 --> 00:02:38,000
we are taking the summary and rolling it over one by one as we go ahead to the next

45
00:02:38,000 --> 00:02:38,000
chunk.

46
00:02:38,000 --> 00:02:38,000
Right?

47
00:02:38,000 --> 00:02:40,000
So this is my chunk one, chunk two like that.

48
00:02:40,000 --> 00:02:45,000
And finally after covering all the chunks we get the final summary right.
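The rolling summary described above can be sketched in plain Python. This is only a minimal illustration of the idea, not LangChain's actual implementation: `refine_summarize`, `fake_llm`, and the prompt wording are all hypothetical names and text made up for this example, and `fake_llm` is a stand-in for a real model call.

```python
from typing import Callable, List


def refine_summarize(chunks: List[str], llm: Callable[[str], str]) -> str:
    """Roll a summary forward over the chunks, one LLM call per chunk."""
    if not chunks:
        return ""
    # First chunk: a plain summarization prompt.
    summary = llm(f"Write a concise summary of the following:\n{chunks[0]}")
    # Every later chunk is sent together with the summary produced so far.
    for chunk in chunks[1:]:
        prompt = (
            f"Existing summary:\n{summary}\n\n"
            f"Refine it using this additional context:\n{chunk}"
        )
        summary = llm(prompt)
    return summary


# Hypothetical stand-in for a real LLM: records each prompt, returns a label.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"summary-{len(calls)}"


final_summary = refine_summarize(["chunk one", "chunk two", "chunk three"], fake_llm)
print(final_summary)
```

Note that the calls are sequential: each one depends on the previous summary, which is where this differs from map reduce, where the chunks are summarized independently and combined afterwards.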

49
00:02:46,000 --> 00:02:51,000
So this is the refine technique.

50
00:02:51,000 --> 00:02:56,000
And here you can see how it is a little bit different from MapReduce and the stuff chain, okay.

51
00:02:56,000 --> 00:03:02,000
So guys now let's go ahead and see the practical implementation of refine chain for summarization.

52
00:03:02,000 --> 00:03:04,000
So for this we don't have to do much.

53
00:03:04,000 --> 00:03:06,000
Just use the same load_summarize_chain.

54
00:03:06,000 --> 00:03:10,000
And here we are going to pass my LLM model as the first parameter.

55
00:03:10,000 --> 00:03:13,000
The second parameter will basically be my chain type.

56
00:03:13,000 --> 00:03:18,000
When we set the chain type over here to refine, we will be using the refine chain.

57
00:03:18,000 --> 00:03:22,000
The third thing that we are going to use is verbose equal to true, just to see

58
00:03:22,000 --> 00:03:23,000
all the information.

59
00:03:23,000 --> 00:03:30,000
And finally, when I go ahead and call chain.run on our final documents, I think

60
00:03:30,000 --> 00:03:32,000
we should be able to get our output summary.

61
00:03:32,000 --> 00:03:37,000
So my output summary will be equal to this.

62
00:03:37,000 --> 00:03:41,000
And let's print this output summary over here.
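The steps just described can be sketched as follows. This assumes a classic LangChain release where `load_summarize_chain` is importable from `langchain.chains.summarize`; the import path may differ in your installed version, and `chain.run` is deprecated in newer releases in favor of `invoke`. The wrapper name `summarize_with_refine` is hypothetical, and `llm` and `docs` are assumed to be the model and the chunked documents prepared earlier in this series.

```python
def summarize_with_refine(llm, docs):
    """Build a refine summarization chain and run it over the chunked docs."""
    # Imported inside the function so the sketch loads without LangChain;
    # this path is an assumption that holds for classic LangChain releases.
    from langchain.chains.summarize import load_summarize_chain

    chain = load_summarize_chain(
        llm,                  # first parameter: the LLM model
        chain_type="refine",  # second parameter: selects the refine chain
        verbose=True,         # print the intermediate prompts and summaries
    )
    return chain.run(docs)


# Usage, assuming `llm` and `final_documents` already exist:
# output_summary = summarize_with_refine(llm, final_documents)
# print(output_summary)
```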

63
00:03:41,000 --> 00:03:42,000
So here you can see.

64
00:03:42,000 --> 00:03:50,000
Now see, over here also, this refining is basically happening,

65
00:03:50,000 --> 00:03:51,000
right.

66
00:03:51,000 --> 00:03:56,000
See at the end of the day refine summarization basically means it's more about refining the summary

67
00:03:56,000 --> 00:03:57,000
itself.

68
00:03:57,000 --> 00:04:02,000
When we get the summary of the first chunk, we send it along with the second chunk, again passing

69
00:04:02,000 --> 00:04:04,000
it to the prompt and LLM model.

70
00:04:04,000 --> 00:04:10,000
And at each and every step, we repeat that process to get a much

71
00:04:10,000 --> 00:04:12,000
better form of the summary.

72
00:04:12,000 --> 00:04:15,000
And that is what we can do with the help of load_summarize_chain.

73
00:04:17,000 --> 00:04:22,000
So in this series of videos we have seen all three types.

74
00:04:22,000 --> 00:04:24,000
One is stuff document chain.

75
00:04:24,000 --> 00:04:26,000
Then we have MapReduce.

76
00:04:26,000 --> 00:04:28,000
Then finally we have the refine chain.

77
00:04:28,000 --> 00:04:30,000
We have also seen the theoretical intuition.

78
00:04:30,000 --> 00:04:32,000
Yes this was it from my side.

79
00:04:32,000 --> 00:04:39,000
In the next video we are going to develop an end-to-end project implementation for both structured

80
00:04:39,000 --> 00:04:40,000
and unstructured type of content.

81
00:04:40,000 --> 00:04:41,000
Okay.

82
00:04:41,000 --> 00:04:43,000
And that is what we are going to do in the next video.

83
00:04:43,000 --> 00:04:45,000
So yes, I will see you all in the next video.

84
00:04:45,000 --> 00:04:45,000
Thank you.