1
00:00:08,190 --> 00:00:10,640
Hey guys, this is Caleb with Devslopes.com.

2
00:00:10,640 --> 00:00:15,640
And in this video we're going to dive right into adding our Core ML model.

3
00:00:15,660 --> 00:00:20,850
We're going to pass an image into it and actually use machine learning to make a prediction of what

4
00:00:20,850 --> 00:00:22,760
it thinks that it's seeing in the image.

5
00:00:22,770 --> 00:00:24,080
Very very cool.

6
00:00:24,120 --> 00:00:30,440
In the last video we left off with this: we have our camera view, we have our nice labels and our UIViews,

7
00:00:30,450 --> 00:00:36,380
and when we tap the screen it takes a photo of what it sees and it saves it to the image view.

8
00:00:36,450 --> 00:00:41,360
We're going to use that in just a moment to pass into our Core ML model.

9
00:00:41,380 --> 00:00:47,400
OK so to begin we're actually going to pull open Safari and we're going to go ahead and head to the

10
00:00:47,400 --> 00:00:49,560
Core ML site from Apple.

11
00:00:49,560 --> 00:00:54,860
If you don't remember, it's developer.apple.com/machine-learning.

12
00:00:54,930 --> 00:01:01,950
Now if you scroll way down to the bottom, there is a section called Working with Models, and basically

13
00:01:01,950 --> 00:01:08,580
Apple has provided quite a few different machine learning models that work with Core ML.

14
00:01:08,730 --> 00:01:11,620
The one we're going to use is called SqueezeNet.

15
00:01:11,820 --> 00:01:18,630
And what it does is it detects the dominant objects present in an image from a set of 1000 categories

16
00:01:19,020 --> 00:01:27,240
like trees, animals, food, vehicles, people and more, but with an overall footprint of 4.7 megabytes. It's

17
00:01:27,240 --> 00:01:28,590
really really small.

18
00:01:28,650 --> 00:01:33,270
And if you actually look at some of these other Core ML models, you'll see they're pretty big: 102

19
00:01:33,270 --> 00:01:35,270
megabytes, 94 megabytes,

20
00:01:35,270 --> 00:01:37,770
553 megabytes.

21
00:01:37,770 --> 00:01:42,540
So this one is just way more efficient and way smaller.

22
00:01:42,540 --> 00:01:46,260
I mean, that's just good for our app; it will make the app's size much smaller in the end.

23
00:01:46,260 --> 00:01:51,750
So to get that model, go ahead and just click Download Core ML Model and it'll save to your downloads,

24
00:01:51,750 --> 00:01:53,280
which you can see right here.

25
00:01:58,650 --> 00:02:01,930
Which you can see right here, and it is called

26
00:02:02,050 --> 00:02:06,650
SqueezeNet.mlmodel. That, of course, stands for machine learning model.

27
00:02:06,660 --> 00:02:11,820
Now what we need to do is open our Xcode project and add this file into the project, so go ahead and

28
00:02:11,820 --> 00:02:18,030
pull open your Xcode project, and I'm actually going to just drag this from Safari and drop it right

29
00:02:18,030 --> 00:02:19,830
into my project.

30
00:02:19,860 --> 00:02:26,910
Now of course, select "Copy items if needed" and "Create groups", and make sure that the vision app target has been selected.

31
00:02:27,050 --> 00:02:33,480
OK, click Finish and we're going to go check out what is in this ML model, so click on that and you'll

32
00:02:33,480 --> 00:02:36,410
notice here it says that it is a machine learning model.

33
00:02:36,490 --> 00:02:39,280
It's of type Neural Network Classifier.

34
00:02:39,300 --> 00:02:46,710
So now you understand how the structure is working, and it gives you a nice description and also references

35
00:02:46,710 --> 00:02:47,700
the inputs.

36
00:02:47,700 --> 00:02:49,160
OK this is important to know.

37
00:02:49,410 --> 00:02:58,210
So we input an image, and you see right here that it is BGR, 227, and you can't see it, but that's 227.

38
00:02:58,320 --> 00:03:05,720
That means that the input image that it can classify is 227 pixels by 227 pixels.

39
00:03:05,940 --> 00:03:08,530
The outputs are what we get at the end.

40
00:03:08,530 --> 00:03:13,710
OK, so we get classLabelProbs, which is the probability of each category.

41
00:03:13,890 --> 00:03:18,020
And we also get classLabel, which is the most likely image category.

42
00:03:18,120 --> 00:03:23,130
So it'll tell us what it thinks it is and the probability that it thinks it's correct.

43
00:03:23,160 --> 00:03:28,380
If you notice here, we have SqueezeNet, and it says that it is Swift generated source.

44
00:03:28,530 --> 00:03:35,370
That means that SqueezeNet is accessible to us as a model; we can use it just like we can use

45
00:03:35,400 --> 00:03:36,680
any other class.

46
00:03:36,810 --> 00:03:39,550
And let's click this little arrow here so I can show you that.

47
00:03:39,570 --> 00:03:40,200
Check it out.

48
00:03:40,320 --> 00:03:47,470
So it is a class, a Swift class. It's ready for us to use, with SqueezeNetInput right here, and

49
00:03:47,520 --> 00:03:49,230
SqueezeNetOutput if we scroll down.

50
00:03:49,230 --> 00:03:55,780
We can actually see the whole SqueezeNet class right here. It's very cool and all available for us.

51
00:03:55,800 --> 00:03:59,880
That way you can take a look at this more and you can kind of dive in deeper if you want to but for

52
00:03:59,880 --> 00:04:05,070
now we're just going to go ahead and we're going to add this into the Resources folder and go to

53
00:04:05,060 --> 00:04:05,830
CameraVC.

54
00:04:05,850 --> 00:04:06,840
Let's use it.

55
00:04:06,840 --> 00:04:08,820
Let's see what it can do.

56
00:04:08,840 --> 00:04:12,720
So to begin we're going to go ahead and we're going to actually import two frameworks.

57
00:04:12,720 --> 00:04:17,440
First we're going to import CoreML, and last we're going to import Vision.

58
00:04:17,560 --> 00:04:24,870
OK, these are both part of the Core ML package. Core ML handles just kind of your basic, overarching,

59
00:04:24,930 --> 00:04:32,730
all of the machine learning stuff, and Vision specifically handles things like face recognition and object

60
00:04:32,730 --> 00:04:33,420
recognition.

61
00:04:33,450 --> 00:04:37,730
We already talked about that in one of the first videos, but just a quick reminder.

62
00:04:37,740 --> 00:04:43,710
All right, so now we get to actually use this machine learning model we've dragged in, SqueezeNet, and

63
00:04:43,710 --> 00:04:49,200
to do that go ahead and scroll down, and where we created photoData, we're actually going to use

64
00:04:49,200 --> 00:04:54,690
that same photoData and pass it into our Core ML model, SqueezeNet.

65
00:04:54,780 --> 00:05:01,280
So to do that, go ahead and go beneath photoData, and we're going to use a do-try-catch block, OK?

66
00:05:01,500 --> 00:05:08,340
We need to be able to catch any errors that are thrown, and using Vision can throw errors, and so we're

67
00:05:08,340 --> 00:05:12,540
going to set it up so that it can handle that and not cause a crash in our application.

68
00:05:12,540 --> 00:05:13,960
So go ahead and type do.

69
00:05:14,460 --> 00:05:17,660
And we're going to just set up the catch block as well while we're at it.

70
00:05:18,060 --> 00:05:18,630
Perfect.

71
00:05:18,630 --> 00:05:22,310
And this is where we're going to handle errors in just a moment.

72
00:05:22,440 --> 00:05:25,590
But inside the do block we're going to type let

73
00:05:25,590 --> 00:05:31,180
model equals try VNCoreMLModel, OK?

74
00:05:31,200 --> 00:05:35,890
Now we're going to open a parenthesis and we're going to pass in an MLModel.

75
00:05:35,910 --> 00:05:42,570
Now if you remember, SqueezeNet is an ML model, so what we can do is we can just pass in SqueezeNet:

76
00:05:42,990 --> 00:05:48,180
we can instantiate it, and if we put a period we can pull out the MLModel.

77
00:05:48,180 --> 00:05:55,560
If you go to look at this class, you can see that there is a variable here that is for an MLModel,

78
00:05:55,570 --> 00:06:00,030
so we can pull that out and we can use that as a Vision Core ML model.

79
00:06:00,030 --> 00:06:00,840
Really really cool.

80
00:06:00,840 --> 00:06:06,840
So basically we're using SqueezeNet, but we're passing it through Vision because Vision has a lot of

81
00:06:06,840 --> 00:06:09,400
really cool stuff for identifying things in images.
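
As a rough sketch of this step (not the exact code from the video), wrapping a Core ML model for use with Vision could look like the following. The helper name makeVisionModel is hypothetical; in the project described here, the argument would be SqueezeNet().model.

```swift
import CoreML
import Vision

// Hypothetical helper: wraps any Core ML model so Vision can drive it.
// In the video's project the argument would be SqueezeNet().model.
func makeVisionModel(from mlModel: MLModel) -> VNCoreMLModel? {
    do {
        // VNCoreMLModel(for:) throws, so it has to be called with try.
        return try VNCoreMLModel(for: mlModel)
    } catch {
        // Print the problem instead of crashing the app.
        debugPrint(error)
        return nil
    }
}
```

Returning nil on failure is just one option for a sketch; the video keeps everything inside a single do block instead.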

82
00:06:09,650 --> 00:06:10,890
Okay.

83
00:06:10,980 --> 00:06:16,200
And of course we're using try because VNCoreMLModel can throw errors, and you'll notice in

84
00:06:16,200 --> 00:06:20,000
a while it'll say that we're having some errors because we're not properly handling them in our catch

85
00:06:20,000 --> 00:06:20,300
block.

86
00:06:20,310 --> 00:06:21,420
But we will.

87
00:06:21,450 --> 00:06:23,370
So we now have a model.

88
00:06:23,370 --> 00:06:24,290
This is good.

89
00:06:24,300 --> 00:06:30,480
The model's like our brain, and now we need to make a request, which is kind of like a thought, OK, where

90
00:06:30,810 --> 00:06:32,820
we're allowing our brain to have a thought.

91
00:06:32,850 --> 00:06:34,990
And then it's going to do something with that thought later.

92
00:06:34,980 --> 00:06:39,640
So after we've created our model go ahead and type let request.

93
00:06:39,840 --> 00:06:43,460
And this is where we're going to use a VNCoreMLRequest.

94
00:06:43,460 --> 00:06:45,230
OK, it's right there ready for you.

95
00:06:45,330 --> 00:06:48,150
Now if you put an open parenthesis, you can see that there are two options.

96
00:06:48,150 --> 00:06:52,740
There's one with a model, and one with a model and a completion handler.

97
00:06:52,740 --> 00:06:59,390
We're going to use that one because our completion handler will be where we will handle what happens

98
00:06:59,390 --> 00:07:02,110
when we get a response back from this request.

99
00:07:02,120 --> 00:07:02,580
OK.

100
00:07:02,690 --> 00:07:06,170
Now our model of course is going to be our model.

101
00:07:06,170 --> 00:07:12,090
And now this completion handler: this is where we're going to call a function we will write in a moment.

102
00:07:12,260 --> 00:07:18,240
So for now I'm going to go ahead and write that function but I'm not going to use it yet.

103
00:07:18,410 --> 00:07:25,400
So up here in our CameraVC class we need to write that function, and we're going to put it beneath

104
00:07:25,400 --> 00:07:26,910
didTapCameraView.

105
00:07:27,050 --> 00:07:32,700
So I'm going to go ahead and type func resultsMethod.

106
00:07:32,800 --> 00:07:33,150
All right.

107
00:07:33,160 --> 00:07:33,410
Oops.

108
00:07:33,410 --> 00:07:37,220
And we're going to need to pass in a request of type VNRequest.

109
00:07:37,220 --> 00:07:43,760
That's a vision request and we need to also give it the capability of using the errors that could come

110
00:07:43,760 --> 00:07:49,380
from our request, so that's going to be of type Error.

111
00:07:49,880 --> 00:07:51,480
And there isn't always an error.

112
00:07:51,530 --> 00:07:53,660
That's why we're creating it as an optional.

113
00:07:53,660 --> 00:07:58,270
All right so go ahead and give it some curly braces and we will come back to handle

114
00:08:00,540 --> 00:08:02,880
changing the label text.

115
00:08:02,880 --> 00:08:07,460
OK, we'll come back to that in a moment, but right now we just need to be able to use resultsMethod.

116
00:08:07,620 --> 00:08:08,280
OK.

117
00:08:08,520 --> 00:08:11,840
So our completion handler is going to be resultsMethod.

118
00:08:12,270 --> 00:08:20,220
And what we're going to do is we're going to basically just call the name of our function here because

119
00:08:20,670 --> 00:08:26,330
when we create a request it automatically has a VNRequest and it automatically has an error.

120
00:08:26,490 --> 00:08:31,320
So that can just be passed in by default to this function so we don't actually need to put in any of

121
00:08:31,320 --> 00:08:32,040
the parameters.

122
00:08:32,040 --> 00:08:34,210
That's kind of the cool thing about using this.
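
Here is a small sketch of what "just calling the name of the function" means. The class name ClassifierSketch and the method makeRequest are hypothetical stand-ins for CameraVC; the point is only that resultsMethod already has the shape Vision expects for a completion handler.

```swift
import CoreML
import Vision

final class ClassifierSketch {
    // Same shape as VNRequestCompletionHandler: (VNRequest, Error?) -> Void.
    func resultsMethod(request: VNRequest, error: Error?) {
        // The label updates would go here.
    }

    func makeRequest(with model: VNCoreMLModel) -> VNCoreMLRequest {
        // Pass the method by name, with no arguments;
        // Vision supplies the request and the optional error later.
        return VNCoreMLRequest(model: model, completionHandler: resultsMethod)
    }
}
```

One thing to keep in mind with this style: the request holds on to the object whose method was passed in, so it stays alive as long as the request does.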

123
00:08:34,280 --> 00:08:36,600
So we have a model that's our brain.

124
00:08:36,600 --> 00:08:42,210
We have a request, that's kind of like our brain making a thought, and now the handler is going

125
00:08:42,210 --> 00:08:46,760
to basically take that thought and it's going to try to turn it into something that we can use.

126
00:08:46,770 --> 00:08:51,770
So let's go ahead and let's create a handler by typing let handler.

127
00:08:52,260 --> 00:08:57,370
And that's going to be a VNImageRequestHandler.

128
00:08:57,390 --> 00:08:58,400
OK.

129
00:08:58,440 --> 00:09:00,320
Now of course there are lots of options here.

130
00:09:00,330 --> 00:09:08,180
As you can see: CIImage, CGImage, CVPixelBuffer, Data and URL. What we're going to do is we're

131
00:09:08,190 --> 00:09:13,900
actually going to go ahead and just click on the one with data and options; options isn't necessary.

132
00:09:13,920 --> 00:09:15,350
We don't need to use the options.

133
00:09:15,390 --> 00:09:16,280
So we'll get rid of it.

134
00:09:16,500 --> 00:09:18,590
But we are going to pass in our data.

135
00:09:18,750 --> 00:09:19,960
What data you ask.

136
00:09:20,040 --> 00:09:21,260
Our photo data.

137
00:09:21,300 --> 00:09:21,850
OK.

138
00:09:22,050 --> 00:09:26,640
That's the data we're going to give it so that it can analyze and determine what's in the image so pass

139
00:09:26,650 --> 00:09:31,510
in photoData and go ahead and unwrap it so that we can use it.

140
00:09:31,530 --> 00:09:37,350
Now what we need to do is we basically need to use our handler to perform our request. Our handler is

141
00:09:37,350 --> 00:09:40,590
kind of like a part of our brain that handles our ability to think.

142
00:09:40,590 --> 00:09:45,120
So our handler is going to take our thought here and it's going to try to process it.

143
00:09:45,240 --> 00:09:45,740
OK.

144
00:09:45,990 --> 00:09:53,370
So let's go ahead and we're going to actually need to use try as well because the handler can also throw

145
00:09:53,370 --> 00:09:59,610
errors, and so let's go ahead and type try handler.perform, and you see right here:

146
00:09:59,620 --> 00:10:02,460
It can take an array of VNRequests.

147
00:10:02,880 --> 00:10:07,690
And so we're going to pass it an array with our request.

148
00:10:07,720 --> 00:10:08,030
OK.

149
00:10:08,040 --> 00:10:12,170
So if we try to build and run here you'll notice it will work.

150
00:10:12,180 --> 00:10:14,720
But the issue is that we're not handling the errors.

151
00:10:14,730 --> 00:10:16,670
If there's a problem we should know.

152
00:10:16,890 --> 00:10:18,720
So to do that we're just going to use

153
00:10:18,720 --> 00:10:20,350
debugPrint like we have been doing.

154
00:10:20,700 --> 00:10:23,040
And we're going to just call the error.

155
00:10:23,100 --> 00:10:24,080
If there's an error,

156
00:10:24,120 --> 00:10:25,270
it'll handle it here.

157
00:10:25,540 --> 00:10:25,760
OK.

158
00:10:25,800 --> 00:10:27,450
That's really easy.

159
00:10:27,450 --> 00:10:35,280
So now what we're doing is we're basically saying that we have a model, OK? This is like

160
00:10:35,280 --> 00:10:37,650
our brain; it can process the image.

161
00:10:37,650 --> 00:10:41,540
We have a request that is using our brain to make a thought.

162
00:10:41,670 --> 00:10:47,710
We have a handler that's going to basically take the data and compare it against our little thought.

163
00:10:47,790 --> 00:10:53,310
And then it's going to perform that request, where it compares them and produces some data for us. That is

164
00:10:53,370 --> 00:10:56,980
actually all we need to do to use machine learning.
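
Putting the model, the request and the handler together, a minimal sketch of the whole flow might look like this. The free function classify is hypothetical; in the video this code sits inside CameraVC, right after photoData is created, the model comes from SqueezeNet().model, and the completion handler is resultsMethod.

```swift
import CoreML
import Vision

// Hypothetical wrapper around the three steps described above.
func classify(photoData: Data,
              using mlModel: MLModel,
              completion: @escaping VNRequestCompletionHandler) {
    do {
        // The "brain": a Core ML model wrapped for Vision.
        let model = try VNCoreMLModel(for: mlModel)
        // The "thought": a request that reports back through the completion handler.
        let request = VNCoreMLRequest(model: model, completionHandler: completion)
        // The handler takes the image data and performs the request on it.
        let handler = VNImageRequestHandler(data: photoData)
        try handler.perform([request])
    } catch {
        // Both the model creation and perform can throw.
        debugPrint(error)
    }
}
```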

165
00:10:57,000 --> 00:11:03,610
For now we're going to actually pull out the information from our request in this function, resultsMethod.

166
00:11:03,810 --> 00:11:08,250
And that's how we're going to actually display what our ML model is thinking.

167
00:11:08,430 --> 00:11:15,390
So let's do that now. Go to resultsMethod, and what we're going to do is we're going to create a constant

168
00:11:15,630 --> 00:11:18,060
to hold the results that come in from our request.

169
00:11:18,060 --> 00:11:21,910
Remember it just gets passed in straight here through our completion handler.

170
00:11:21,930 --> 00:11:28,980
So get rid of these commented-out words, it's not code, and we're going to use guard let to create an

171
00:11:28,980 --> 00:11:32,520
instance of our results, so go ahead and type guard

172
00:11:32,520 --> 00:11:36,610
let results, and that is going to be equal to request.

173
00:11:36,630 --> 00:11:43,670
And if you type the period after that you can see that there is results inside of a request.

174
00:11:43,710 --> 00:11:49,950
So very cool, we can get that, but we need to put it into an array because we're going

175
00:11:49,950 --> 00:11:55,160
to cycle through it and we're going to check to make sure that we get the most confident result.

176
00:11:55,160 --> 00:11:56,440
We don't want something

177
00:11:56,750 --> 00:11:58,760
it's like one tenth of a percent sure about.

178
00:11:58,770 --> 00:12:02,160
We don't really care about that we want something that is almost positive.

179
00:12:02,160 --> 00:12:10,950
So go ahead and we're going to cast this as an array of VNClassificationObservation, and you know

180
00:12:10,950 --> 00:12:17,010
what, you guys can dive deeper into these descriptions for what these are, but basically it is scene classification

181
00:12:17,010 --> 00:12:20,580
information produced by an image analysis request.

182
00:12:20,580 --> 00:12:21,360
All right.

183
00:12:21,630 --> 00:12:28,130
So it's basically using it to determine what is in the image that it was sent in the request.

184
00:12:28,350 --> 00:12:32,260
So with a guard let, at the end we need to type else.

185
00:12:32,520 --> 00:12:34,770
And then inside of here it's pretty common just to return.

186
00:12:34,770 --> 00:12:39,900
So if it doesn't work, if we don't get any results, we're just going to return so that we don't

187
00:12:39,900 --> 00:12:46,260
have a crash. If we do, then we have a nice constant here called results that has our results.
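
The guard let described here can be sketched as follows. This is a standalone version of resultsMethod for illustration; it only prints each result, whereas the video goes on to update labels.

```swift
import Vision

func resultsMethod(request: VNRequest, error: Error?) {
    // request.results is optional, and the cast fails if the observations
    // are not classifications; in either case we simply return.
    guard let results = request.results as? [VNClassificationObservation] else { return }

    for classification in results {
        // identifier is a String, confidence is a Float between 0 and 1.
        print(classification.identifier, classification.confidence)
    }
}
```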

188
00:12:46,320 --> 00:12:48,630
So that looks great.

189
00:12:48,630 --> 00:12:56,930
And beneath results we're going to go ahead and type for classification in results.

190
00:12:56,940 --> 00:13:01,990
And the reason for that is because we are using VNClassificationObservation, and that's what we're doing:

191
00:13:01,990 --> 00:13:08,050
we're classifying objects in an image based on criteria that the ML model understands.

192
00:13:08,050 --> 00:13:12,000
So for each classification let's think what we want to do here.

193
00:13:12,520 --> 00:13:18,700
What we're going to do is basically we're going to set it up so that if it is more than 50 percent confident

194
00:13:19,540 --> 00:13:22,180
then it's going to say what it thinks it is.

195
00:13:22,210 --> 00:13:26,680
If it's less than 50 percent confident it's going to say I'm not really sure what this is.

196
00:13:26,680 --> 00:13:27,840
Please try again.

197
00:13:28,090 --> 00:13:29,260
Simple as that.

198
00:13:29,290 --> 00:13:35,300
So to do that we're going to go ahead and type if classification.confidence.

199
00:13:35,470 --> 00:13:37,010
Now you notice that.

200
00:13:37,210 --> 00:13:44,900
Well, I might have to, actually, just a second: if classification.confidence is less than 0.5, OK, 50 percent.

201
00:13:45,010 --> 00:13:50,140
And if I select this you can see that VNConfidence is of type Float.

202
00:13:50,220 --> 00:13:50,610
OK.

203
00:13:50,710 --> 00:13:52,270
It's a float it's a number that comes in.

204
00:13:52,270 --> 00:13:58,510
So we have 50 percent here: 0.5, fifty hundredths, or five tenths.

205
00:13:58,780 --> 00:14:05,860
If it is less than 50 percent what we're going to do is we're going to set the text label here in our

206
00:14:06,700 --> 00:14:10,540
main upper bar we're going to set that to say I'm not sure what this is.

207
00:14:10,540 --> 00:14:11,700
Please try again.

208
00:14:12,010 --> 00:14:21,560
So go ahead and let's type self.identificationLabel.text, and that's going to be equal to

209
00:14:21,790 --> 00:14:23,790
I'm not sure what this is.

210
00:14:23,800 --> 00:14:26,650
Please try again.

211
00:14:26,650 --> 00:14:26,960
All right.

212
00:14:26,980 --> 00:14:27,610
And you know what.

213
00:14:27,610 --> 00:14:32,570
If we're not really confident, we shouldn't even show this confidence label, so go ahead and

214
00:14:32,590 --> 00:14:37,990
type self.confidenceLabel.text, and we're just going to set that to be an empty string because

215
00:14:37,990 --> 00:14:39,060
there's no need to show it.

216
00:14:39,070 --> 00:14:45,700
if there is not much confidence that it actually works, OK? And if that happens as well, there's not really

217
00:14:45,760 --> 00:14:48,930
a need for it to continue so we're just going to call break.

218
00:14:48,940 --> 00:14:51,350
So that leaves the for loop.

219
00:14:51,440 --> 00:14:52,270
OK.

220
00:14:52,810 --> 00:14:58,230
What we're going to do next is what happens if we do get the proper results.

221
00:14:58,240 --> 00:15:02,800
Let's say that we get something it thinks it's 98 percent sure that it just saw an umbrella.

222
00:15:02,880 --> 00:15:04,930
OK, so else, right?

223
00:15:05,040 --> 00:15:10,050
That means it's greater than 50 percent if the confidence is more than 50 percent.

224
00:15:10,060 --> 00:15:17,110
Go ahead and type self.identificationLabel.text, and that's going to be equal to

225
00:15:18,010 --> 00:15:19,380
classification.identifier.

226
00:15:19,410 --> 00:15:21,560
The identifier is what it thinks it saw.

227
00:15:21,570 --> 00:15:27,760
So in the example I showed you at the beginning of this course when you tap it takes a photo and it

228
00:15:27,760 --> 00:15:29,430
shows maybe umbrella.

229
00:15:29,830 --> 00:15:33,520
And we will program it later to speak out loud to say what it thinks it is.

230
00:15:33,520 --> 00:15:41,050
But this identifier is a string and it returns a name of an object or an item and that's all we need

231
00:15:41,050 --> 00:15:47,400
to do next we need to go ahead and set our confidence label to show how confident it thinks it is.

232
00:15:47,410 --> 00:15:54,490
So go ahead and type self.confidenceLabel.text, and that's going to be equal to a string of course,

233
00:15:54,490 --> 00:16:01,270
and we need to type CONFIDENCE, just like we have here, with the capital letters, and now we're

234
00:16:01,270 --> 00:16:06,130
going to use string interpolation to pass in a value that is not a string but we're going to cast it

235
00:16:06,130 --> 00:16:07,820
as a string so that it can be used.

236
00:16:07,840 --> 00:16:10,610
So we're going to use

237
00:16:10,780 --> 00:16:12,270
classification.confidence.

238
00:16:12,330 --> 00:16:12,610
OK.

239
00:16:12,610 --> 00:16:13,900
Pretty cool.

240
00:16:13,900 --> 00:16:20,260
And we're going to actually multiply that by 100 because right now if the value came in at 0.5 that

241
00:16:20,260 --> 00:16:22,620
might look like 0.5 percent sure.

242
00:16:22,640 --> 00:16:23,380
We don't want that.

243
00:16:23,380 --> 00:16:29,440
So if we were to multiply this by 100 it would move the decimal place over and it would say 50 percent.

244
00:16:29,440 --> 00:16:29,700
OK.

245
00:16:29,740 --> 00:16:33,720
So that's what we want because 0.5 is 50 percent.

246
00:16:33,820 --> 00:16:38,270
Now the issue is that it would be fifty point zero percent.

247
00:16:38,350 --> 00:16:43,030
And I don't want to have to deal with the additional numbers on the outside.

248
00:16:43,030 --> 00:16:48,520
So I'm actually going to cast this as an integer and then put another set of parentheses on the outside so

249
00:16:48,520 --> 00:16:50,450
that this number is a whole number.

250
00:16:50,590 --> 00:16:51,550
51 percent.

251
00:16:51,550 --> 00:16:53,220
37 percent.

252
00:16:53,290 --> 00:16:57,400
If you want you can leave it at thirty seven point nine eight seven six five percent sure.

253
00:16:57,400 --> 00:17:02,210
I just got kind of annoyed reading like eight numbers after the decimal place, so I cast it as an Int.

254
00:17:02,230 --> 00:17:04,720
That's just my personal preference.

255
00:17:04,840 --> 00:17:07,470
And let's see: CONFIDENCE, we have the classification.

256
00:17:07,480 --> 00:17:10,870
And at the very end we need to add a percentage mark.
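
The arithmetic here is easy to check on its own. Below is a minimal sketch; the function name confidenceText and the exact "CONFIDENCE: " prefix are assumptions for illustration, and a plain Float stands in for classification.confidence.

```swift
// Float stands in for classification.confidence (VNConfidence is a Float).
func confidenceText(for confidence: Float) -> String {
    // Multiply by 100 first, then convert to Int, which drops the decimals
    // (it truncates toward zero rather than rounding).
    return "CONFIDENCE: \(Int(confidence * 100))%"
}

print(confidenceText(for: 0.5))  // CONFIDENCE: 50%
```

Because Int(...) truncates, a value such as 37.98765 shows up as 37 and not 38; that is the trade-off of the cast described here.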

257
00:17:10,890 --> 00:17:12,150
All right.

258
00:17:12,310 --> 00:17:14,360
That looks pretty good.

259
00:17:15,070 --> 00:17:17,920
You know what, do you guys want to go try it? Let's go see if it worked.

260
00:17:18,160 --> 00:17:23,230
OK we're going to go ahead and build and run it and it will show up here as soon as it builds and runs

261
00:17:23,230 --> 00:17:25,030
it will close.

262
00:17:25,210 --> 00:17:25,690
Maybe.

263
00:17:25,840 --> 00:17:26,950
Yeah there it is.

264
00:17:27,310 --> 00:17:27,960
All right.

265
00:17:28,150 --> 00:17:34,910
Building and installing to my phone it'll pop open.

266
00:17:34,920 --> 00:17:36,370
There it is.

267
00:17:36,390 --> 00:17:36,720
All right.

268
00:17:36,720 --> 00:17:38,730
Should we try it? If it worked,

269
00:17:38,730 --> 00:17:40,160
we should tap the screen.

270
00:17:40,290 --> 00:17:45,750
It should analyze the photo and it should predict what it thinks it is and return a level of confidence.

271
00:17:45,750 --> 00:17:46,200
Let's try.

272
00:17:46,200 --> 00:17:46,410
Ready.

273
00:17:46,410 --> 00:17:48,650
Three two one.

274
00:17:50,590 --> 00:17:51,760
I'm not sure what this is.

275
00:17:51,760 --> 00:17:54,250
Please try again.

276
00:17:54,310 --> 00:17:55,450
I'm not sure what this is.

277
00:17:55,450 --> 00:17:56,700
Please try again.

278
00:17:58,700 --> 00:17:59,260
Ha.

279
00:17:59,480 --> 00:18:01,880
Interesting.

280
00:18:01,900 --> 00:18:04,030
So it does not appear to know what this is.

281
00:18:04,030 --> 00:18:10,940
Let's try some other things. Maybe it might know what my wallet is, or my desk. Caleb's wallet, OK.

282
00:18:11,130 --> 00:18:12,650
Not sure what this is interesting.

283
00:18:12,660 --> 00:18:15,530
So what is the problem here.

284
00:18:16,870 --> 00:18:24,340
Let's see we're setting the label you know what let's go ahead let's actually print out the classification

285
00:18:24,340 --> 00:18:24,910
results

286
00:18:29,090 --> 00:18:30,930
let's print out the identifier.

287
00:18:30,960 --> 00:18:35,670
This way we can see everything that it's thinking through, so that we know if it's

288
00:18:38,710 --> 00:18:38,960
yeah.

289
00:18:39,040 --> 00:18:40,450
So we can get an idea of what it's thinking.

290
00:18:40,450 --> 00:18:45,290
So when I tap the screen oh it says remote control and modem.

291
00:18:45,310 --> 00:18:50,320
I wonder why. Oh, OK, so I think I know what the problem is.

292
00:18:50,320 --> 00:18:52,530
So basically what we're doing is we're going through.

293
00:18:52,690 --> 00:18:58,000
If the confidence is less than 50 percent basically we go through we say I don't know what this is.

294
00:18:58,000 --> 00:19:00,490
And we break meaning we leave our for loop.

295
00:19:00,730 --> 00:19:06,880
But then if it is more than 50 percent confident it goes here and it identifies this but then it loops

296
00:19:06,880 --> 00:19:09,620
through again and it hits something that it's not confident on.

297
00:19:09,640 --> 00:19:14,740
And it shows I'm not sure what this is so when we hit something that it is confident on more than 50

298
00:19:14,740 --> 00:19:16,340
percent we should break.

299
00:19:16,390 --> 00:19:17,350
All right.

300
00:19:17,770 --> 00:19:20,360
OK so we should go ahead and do that.

301
00:19:20,740 --> 00:19:27,190
Let's break on both so that if it's not confident it breaks; if it is totally confident

302
00:19:27,190 --> 00:19:30,390
on the first one it will just go right to here and that's fine.

303
00:19:30,390 --> 00:19:32,390
So let's build and run.

304
00:19:32,470 --> 00:19:37,180
Basically we were just cutting out our most confident result because we were going through some of the

305
00:19:37,210 --> 00:19:38,710
non-confident ones first.
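
To see why the second break matters, here is a small self-contained sketch of the loop. The Observation struct is a hypothetical stand-in for VNClassificationObservation, and the function returns the label text instead of setting it on a UILabel. It also assumes the results arrive with the most confident one first, which matches what the printed output in the video suggests.

```swift
// Hypothetical stand-in for VNClassificationObservation.
struct Observation {
    let identifier: String
    let confidence: Float
}

// Returns the text that would be shown in the identification label.
func labelText(for results: [Observation]) -> String {
    var text = ""
    for classification in results {
        if classification.confidence < 0.5 {
            text = "I'm not sure what this is. Please try again."
            break
        } else {
            text = classification.identifier
            // Without this break, the loop moves on to the next, less
            // confident result and overwrites the label with "not sure".
            break
        }
    }
    return text
}

print(labelText(for: [Observation(identifier: "remote control", confidence: 0.99),
                      Observation(identifier: "modem", confidence: 0.01)]))
// remote control
```

Since both branches now break, only the first result is ever inspected, so the same behavior could be written without a loop by checking results.first.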

306
00:19:38,710 --> 00:19:41,780
So now that it's built let's take a look.

307
00:19:41,830 --> 00:19:48,310
When I tap on the screen, look at that: remote control, it's 99 percent confident.

308
00:19:48,320 --> 00:19:53,670
Let's scan something that maybe it doesn't know maybe it doesn't know a Rubik's cube.

309
00:19:53,670 --> 00:19:54,390
Let's try it.

310
00:19:55,470 --> 00:19:55,750
OK.

311
00:19:55,770 --> 00:19:56,730
I'm not sure what this is.

312
00:19:56,730 --> 00:19:57,300
Please try again.

313
00:19:57,300 --> 00:20:00,540
That's fine because I doubt it was trained to know what a Rubik's cube is.

314
00:20:00,600 --> 00:20:01,700
So that's good to know.

315
00:20:01,890 --> 00:20:06,410
When I scan the remote again: remote control, 97 percent confident.

316
00:20:06,420 --> 00:20:07,190
Very cool.

317
00:20:07,200 --> 00:20:08,970
So it looks like it's working.

318
00:20:08,970 --> 00:20:10,250
This is awesome.

319
00:20:10,290 --> 00:20:13,140
So we have just successfully integrated machine learning.

320
00:20:13,140 --> 00:20:19,650
I kid you not, it's that easy to pass in images and make predictions based on a trained model. We just gave

321
00:20:19,650 --> 00:20:26,430
our app the ability to think. How incredible is that? Amazing work. In the next video we're actually going

322
00:20:26,430 --> 00:20:31,460
to set it up so that we can toggle the flash to turn on and off.

323
00:20:31,650 --> 00:20:36,870
And then, at the very end, we're going to set up an AVSpeechSynthesizer to

324
00:20:36,870 --> 00:20:42,720
speak the results to us, thinking about maybe users who might be vision impaired, who might not be able

325
00:20:42,720 --> 00:20:48,310
to see very well but want to classify objects they can hear what the app is seeing.

326
00:20:48,390 --> 00:20:49,350
Very very cool.

327
00:20:49,350 --> 00:20:52,100
So we're going to go ahead and move on to the next video.

328
00:20:52,110 --> 00:20:57,600
Amazing work with this one, guys. We are on our way to having a full-fledged machine learning app that can

329
00:20:57,600 --> 00:21:00,230
analyze and identify items in photos.

330
00:21:00,240 --> 00:21:01,470
Very very cool.

331
00:21:01,470 --> 00:21:02,250
Awesome work guys.

332
00:21:02,250 --> 00:21:03,710
We'll see you in the next video.