1
00:00:00,000 --> 00:00:00,000
Hello guys.

2
00:00:00,000 --> 00:00:02,000
So we are going to continue our lecture series.

3
00:00:02,000 --> 00:00:07,000
In our previous video, we saw how, with the help of the OpenAI API, we can

4
00:00:07,000 --> 00:00:09,000
create a generative AI app.

5
00:00:09,000 --> 00:00:14,000
In one of the examples, I took this entire website as my data source.

6
00:00:14,000 --> 00:00:19,000
I read all the content and converted it into documents.

7
00:00:19,000 --> 00:00:22,000
Later on, I divided those documents into chunks.

8
00:00:22,000 --> 00:00:27,000
After that, we used an embedding technique, specifically the OpenAI embedding technique,

9
00:00:27,000 --> 00:00:30,000
to convert this text into vectors.

10
00:00:30,000 --> 00:00:34,000
And finally, we also used FAISS, which is a vector database.

11
00:00:34,000 --> 00:00:41,000
Then, with the help of a document chain and retrieval, we were able to create an amazing LLM

12
00:00:41,000 --> 00:00:47,000
gen AI application, where we were also able to use LLM models along with our prompt engineering,

13
00:00:47,000 --> 00:00:48,000
that is, a prompt template.

14
00:00:48,000 --> 00:00:51,000
And we were able to get the response from this particular text.

15
00:00:51,000 --> 00:00:56,000
So we did all those things. Now, the scenario is that many people will not have

16
00:00:56,000 --> 00:01:01,000
OpenAI API access, and they may not even have a credit card, or they may not be adding

17
00:01:01,000 --> 00:01:02,000
$5 credits.

18
00:01:02,000 --> 00:01:03,000
Right.

19
00:01:03,000 --> 00:01:08,000
So for that, this video will be super important, because here I am going to use Ollama.

20
00:01:08,000 --> 00:01:14,000
Along with this, I will be using open-source LLM models, which you can run completely locally.

21
00:01:14,000 --> 00:01:15,000
Right.

22
00:01:15,000 --> 00:01:20,000
If you don't know about Ollama: with it, you will be able to run large language models, specifically open-source

23
00:01:20,000 --> 00:01:22,000
large language models, on your local machine.

24
00:01:22,000 --> 00:01:27,000
Then I will show you how you can create this generative AI application with the

25
00:01:27,000 --> 00:01:28,000
help of this.

26
00:01:28,000 --> 00:01:28,000
Okay.

27
00:01:28,000 --> 00:01:30,000
So quickly, let's do one thing.

28
00:01:30,000 --> 00:01:33,000
First of all, you need to go ahead and download Ollama.

29
00:01:33,000 --> 00:01:33,000
Right.

30
00:01:33,000 --> 00:01:38,000
Ollama is available for macOS, Linux, and Windows.

31
00:01:38,000 --> 00:01:39,000
Right.

32
00:01:39,000 --> 00:01:42,000
So it is available for all these operating systems.

33
00:01:43,000 --> 00:01:45,000
But on Windows, you need to have Windows 10 or later.

34
00:01:45,000 --> 00:01:47,000
So right now I have Windows 11.

35
00:01:47,000 --> 00:01:50,000
So if I go ahead and download this I will go ahead and click it.

36
00:01:50,000 --> 00:01:53,000
Here you'll be able to see that an .exe file gets downloaded.

37
00:01:53,000 --> 00:01:55,000
So this is the .exe file.

38
00:01:55,000 --> 00:01:59,000
You just need to double-click it and keep pressing Next, Next, Next.

39
00:01:59,000 --> 00:02:02,000
And then Ollama will start running.

40
00:02:02,000 --> 00:02:03,000
Okay.

41
00:02:03,000 --> 00:02:08,000
Just to give you an idea, once Ollama is running in the background, among the background

42
00:02:08,000 --> 00:02:10,000
services, you'll be able to see this kind of icon.

43
00:02:10,000 --> 00:02:11,000
Right.

44
00:02:11,000 --> 00:02:12,000
And this icon is nothing

45
00:02:12,000 --> 00:02:16,000
but the Ollama icon, which indicates that it is running as a background process.

46
00:02:16,000 --> 00:02:17,000
Okay.

47
00:02:17,000 --> 00:02:23,000
So once you do the installation, I will go ahead and tell you

48
00:02:23,000 --> 00:02:27,000
what you can do with Ollama. Now, about Ollama: if you go ahead and visit

49
00:02:27,000 --> 00:02:28,000
this GitHub.

50
00:02:28,000 --> 00:02:28,000
Right.

51
00:02:28,000 --> 00:02:29,000
It is completely open source.

52
00:02:29,000 --> 00:02:31,000
Anybody can use it here.

53
00:02:31,000 --> 00:02:38,000
It supports a lot of open-source LLM models, like Llama 3 8B, Llama 3 70B, Phi-3, Gemma, Gemma 2B,

54
00:02:39,000 --> 00:02:40,000
Gemma, Mistral.

55
00:02:40,000 --> 00:02:43,000
Then you also have Llama 2 Uncensored, LLaVA, Solar.

56
00:02:43,000 --> 00:02:46,000
So there are many different open-source models, like Gemma.

57
00:02:46,000 --> 00:02:47,000
Everything is there.

58
00:02:47,000 --> 00:02:48,000
Okay.

59
00:02:48,000 --> 00:02:54,000
Now, once you have installed Ollama, how do you download an entire model?

60
00:02:54,000 --> 00:02:56,000
So I will just go ahead and open my command prompt.

61
00:02:56,000 --> 00:03:00,000
You can open your terminal if you are on macOS or Linux.

62
00:03:00,000 --> 00:03:07,000
Now, inside this, I will go ahead and write a command to run any model that I want to work with.

63
00:03:07,000 --> 00:03:10,000
Let's say that I want to go ahead and work with Llama 3.

64
00:03:10,000 --> 00:03:13,000
So, first of all, we need to download Llama 3.

65
00:03:13,000 --> 00:03:15,000
So here I will just go ahead and click this.

66
00:03:15,000 --> 00:03:18,000
I'll remove this over here and paste it over here.

67
00:03:18,000 --> 00:03:18,000
Right.

68
00:03:18,000 --> 00:03:21,000
So as soon as I go ahead and write ollama run llama3,

69
00:03:21,000 --> 00:03:25,000
So initially, let's say Llama 3 is not yet downloaded on my machine.

70
00:03:25,000 --> 00:03:28,000
Then it will go ahead and download the entire LLM model.

71
00:03:28,000 --> 00:03:34,000
And then you'll even be able to execute it in this command prompt by just giving some input

72
00:03:34,000 --> 00:03:35,000
and getting the response.

73
00:03:35,000 --> 00:03:37,000
So let me just go ahead and press enter.

74
00:03:37,000 --> 00:03:41,000
So here you'll be able to see that Llama 3 is already downloaded on my machine.

75
00:03:41,000 --> 00:03:46,000
So you will not see any download progress here. Usually, when we are running

76
00:03:46,000 --> 00:03:49,000
Llama 3, you know, it is nothing

77
00:03:49,000 --> 00:03:53,000
but, let's say I'm going to use the 8-billion-parameter model, it is nothing

78
00:03:53,000 --> 00:03:55,000
but a 4.7 GB file.

79
00:03:55,000 --> 00:03:56,000
Right.

80
00:03:56,000 --> 00:03:58,000
So here you can see what happens as soon as I write ollama run llama3.

81
00:03:59,000 --> 00:04:03,000
I have already downloaded it on my local machine. For you, if you are doing it for the first

82
00:04:03,000 --> 00:04:08,000
time, you'll be able to see Llama 3 getting downloaded here.

83
00:04:08,000 --> 00:04:12,000
Okay, so in my case, since I do not want to make this particular video long,

84
00:04:12,000 --> 00:04:14,000
I did this download beforehand.

85
00:04:14,000 --> 00:04:18,000
Right. Now, here, let me go ahead and write a message.

86
00:04:18,000 --> 00:04:18,000
Hi.

87
00:04:18,000 --> 00:04:21,000
You can see how quickly we are able to get the response.

88
00:04:21,000 --> 00:04:23,000
This response is specifically coming from Llama 3.

89
00:04:23,000 --> 00:04:26,000
So let me go ahead and write: What is generative AI?

90
00:04:26,000 --> 00:04:28,000
I will also be able to get the response quickly.

91
00:04:28,000 --> 00:04:31,000
And this model is there on my local machine.

92
00:04:31,000 --> 00:04:34,000
It is responding from there, and you can see how fast it is.

93
00:04:34,000 --> 00:04:35,000
Right.

94
00:04:35,000 --> 00:04:38,000
Obviously, you need to have a good system configuration.

95
00:04:38,000 --> 00:04:40,000
If you do not have it, it may take some time to get the response.

96
00:04:40,000 --> 00:04:43,000
Okay, so obviously you can do this.

97
00:04:43,000 --> 00:04:45,000
Then I'll go ahead and write exit.

98
00:04:45,000 --> 00:04:46,000
Okay.

99
00:04:46,000 --> 00:04:51,000
So here you can see that I am able to have any type of conversation here,

100
00:04:51,000 --> 00:04:51,000
okay.

101
00:04:51,000 --> 00:04:55,000
Similarly, any model that you really want to work with, right.

102
00:04:55,000 --> 00:05:00,000
Let's say I go ahead and write ollama run. There is also one more model, which is called Gemma.

103
00:05:00,000 --> 00:05:00,000
Right.

104
00:05:00,000 --> 00:05:04,000
So let's go ahead and download this Gemma 7B, okay.

105
00:05:04,000 --> 00:05:09,000
So here you can see I'm just writing ollama run gemma:2b.

106
00:05:09,000 --> 00:05:10,000
2 billion parameters.

107
00:05:10,000 --> 00:05:12,000
So here you can see.

108
00:05:14,000 --> 00:05:18,000
Uh quickly we will go ahead and see this.

109
00:05:18,000 --> 00:05:22,000
So if I go ahead and write this Gemma run command.

110
00:05:22,000 --> 00:05:25,000
So first of all, let me do one thing quickly.

111
00:05:26,000 --> 00:05:26,000
Okay.

112
00:05:26,000 --> 00:05:31,000
So let me just execute it over here or let me just go ahead and open my command prompt again.

113
00:05:31,000 --> 00:05:32,000
Okay.

114
00:05:32,000 --> 00:05:33,000
And you can run it from anywhere.

115
00:05:33,000 --> 00:05:36,000
So I will just go ahead and copy this entire thing.

116
00:05:36,000 --> 00:05:37,000
Okay.

117
00:05:37,000 --> 00:05:38,000
Paste it over here.

118
00:05:38,000 --> 00:05:42,000
Now see, I have also already downloaded Gemma 2B, so I'm getting the prompt directly.

119
00:05:42,000 --> 00:05:42,000
Hi.

120
00:05:43,000 --> 00:05:44,000
Who are you?

121
00:05:45,000 --> 00:05:45,000
Okay.

122
00:05:45,000 --> 00:05:47,000
Let me go ahead and write this particular message.

123
00:05:47,000 --> 00:05:48,000
Who are you?

124
00:05:48,000 --> 00:05:50,000
I'm a large language model trained by Google.

125
00:05:50,000 --> 00:05:55,000
I'm a conversational AI that can assist with a wide range of tasks, including language translation,

126
00:05:55,000 --> 00:05:57,000
information retrieval, and creative writing.

127
00:05:57,000 --> 00:05:58,000
Right?

128
00:05:58,000 --> 00:05:59,000
How can I help you today?

129
00:05:59,000 --> 00:06:07,000
Please provide me a Python code to play snake game.

130
00:06:07,000 --> 00:06:09,000
Okay, let's just go ahead and write this.

131
00:06:09,000 --> 00:06:11,000
So this is my entire Python code.

132
00:06:11,000 --> 00:06:12,000
You'll be able to see this.

133
00:06:12,000 --> 00:06:15,000
And if you don't know about Gemma, it is an open-source model by Google.

134
00:06:15,000 --> 00:06:16,000
Right.

135
00:06:16,000 --> 00:06:18,000
So we will be using these kinds of models.

136
00:06:18,000 --> 00:06:24,000
Along with this, there are a lot of open-source models, like Phi-3 Mini; you have Mistral, you

137
00:06:24,000 --> 00:06:25,000
have Neural Chat.

138
00:06:25,000 --> 00:06:26,000
You have Code Llama.

139
00:06:26,000 --> 00:06:31,000
Code Llama is specifically for getting responses about any kind of code that you have.

140
00:06:31,000 --> 00:06:34,000
So you'll be able to run all these models.

141
00:06:34,000 --> 00:06:34,000
Okay.

142
00:06:34,000 --> 00:06:37,000
So this was about the initial setup of Ollama.

143
00:06:37,000 --> 00:06:40,000
And again, it is very simple.

144
00:06:40,000 --> 00:06:45,000
You just go over here and download the .exe file, or, for macOS or Linux, you get that particular

145
00:06:45,000 --> 00:06:48,000
file extension and just go ahead and install it.

146
00:06:48,000 --> 00:06:50,000
And once you install it, it will be running.

147
00:06:50,000 --> 00:06:54,000
Then, with the help of the command prompt, to use any model, let's say I want to go

148
00:06:54,000 --> 00:06:57,000
ahead and use Llama 2, or I want to go ahead and use Llama 3,

149
00:06:57,000 --> 00:06:59,000
I first have to download it.

150
00:06:59,000 --> 00:07:00,000
That is compulsory over there.

151
00:07:00,000 --> 00:07:06,000
Let's say that in my next example, I will be showing you how I will be using Gemma 2B, the 2 billion

152
00:07:06,000 --> 00:07:09,000
parameter model, specifically to create my generative AI application.

153
00:07:09,000 --> 00:07:12,000
And for that, you first have to download it on your local machine.

154
00:07:12,000 --> 00:07:13,000
Right.

155
00:07:13,000 --> 00:07:18,000
So in my next video I will be showing you how you can go ahead and create a generative AI application

156
00:07:18,000 --> 00:07:19,000
completely end to end.

157
00:07:19,000 --> 00:07:20,000
That is what we are going to discuss.

158
00:07:20,000 --> 00:07:22,000
So I hope you like this particular video.

159
00:07:22,000 --> 00:07:23,000
I will see you all in the next video.

160
00:07:23,000 --> 00:07:24,000
Thank you.