1
00:00:02,980 --> 00:00:06,930
Hey, what is up everybody, this is Caleb Stultz with Devslopes.com.

2
00:00:06,940 --> 00:00:13,630
And today we're going to be building an awesome app called Scribe that basically is going to use iOS

3
00:00:13,630 --> 00:00:21,310
10's new Speech framework to analyze an audio file and transcribe it into text, just like you can with

4
00:00:21,310 --> 00:00:24,000
Siri voice dictation or with Siri.

5
00:00:24,010 --> 00:00:30,700
Obviously some cool uses for this might be a camera app where when you hold it out to take a selfie

6
00:00:30,700 --> 00:00:36,890
and say cheese it can analyze the word cheese and then know that you want to take a photo and then it'll

7
00:00:36,910 --> 00:00:38,100
take a photo for you.

8
00:00:38,110 --> 00:00:38,770
Very cool.

9
00:00:38,770 --> 00:00:43,780
So, lots of very voice-driven types of tasks.

10
00:00:43,780 --> 00:00:46,340
This app is going to work a little bit like this.

11
00:00:46,360 --> 00:00:51,820
So basically you're going to click on that red circle button right there and watch what happens when

12
00:00:51,820 --> 00:00:53,050
it plays my recording.

13
00:01:01,640 --> 00:01:02,150
Nice.

14
00:01:02,150 --> 00:01:03,020
Click it again.

15
00:01:03,840 --> 00:01:04,290
Right.

16
00:01:04,430 --> 00:01:05,630
Just might be.

17
00:01:05,770 --> 00:01:09,260
Now I've never made.

18
00:01:09,380 --> 00:01:16,400
OK, so as you can see, when we click the button here it loads an audio file that I have prerecorded

19
00:01:17,510 --> 00:01:25,190
and then it will begin the analysis: the Speech framework will analyze the

20
00:01:25,190 --> 00:01:31,190
audio and then convert it into text and will basically drop it into a text field that we have right

21
00:01:31,190 --> 00:01:31,910
here.

22
00:01:32,000 --> 00:01:37,190
Now there are a couple of things to know when we are using the speech framework.

23
00:01:37,190 --> 00:01:43,070
You must get permission to use it, just like you would with MapKit or anything else regarding the user's

24
00:01:43,070 --> 00:01:48,320
personal information; a recording of their voice counts as personal information.

25
00:01:48,320 --> 00:01:53,480
Even though this app is not going to be recording their voice using their microphone,

26
00:01:53,600 --> 00:01:55,340
it's still a requirement to get it to work.

27
00:01:55,340 --> 00:01:56,780
And we'll talk about that later.

28
00:01:56,780 --> 00:02:03,410
So let's go ahead and get rid of that and I want to show you some of the available resources online

29
00:02:03,440 --> 00:02:06,670
regarding the speech recognition API.

30
00:02:06,950 --> 00:02:13,450
There is not very much on it online yet because I think it's still in beta actually.

31
00:02:13,640 --> 00:02:16,470
But what we're going to do today works really well.

32
00:02:16,640 --> 00:02:23,060
So if you want, you can go to the WWDC 2016 videos and you can watch video

33
00:02:23,060 --> 00:02:24,520
509.

34
00:02:24,800 --> 00:02:29,330
And this is going to basically give you the nuts and bolts of the speech recognition API.

35
00:02:29,480 --> 00:02:34,780
It's really cool actually and there's a lot of amazing uses for this and it always makes me think of

36
00:02:34,790 --> 00:02:40,580
what is possible and just the potential of opening up an API like this to developers.

37
00:02:40,670 --> 00:02:46,220
So I would recommend watching this as kind of a prerequisite to finishing this video.

38
00:02:46,520 --> 00:02:52,340
And then of course you're going to want to go through the iOS 10 documentation called What's New in

39
00:02:52,370 --> 00:02:56,500
iOS, and there's an entire section here called Speech Recognition.

40
00:02:56,690 --> 00:03:00,260
And I'm just going to kind of breeze through this really quick and then we'll get right into making

41
00:03:00,260 --> 00:03:01,270
that app.

42
00:03:01,280 --> 00:03:07,200
So basically you can recognize and transcribe speech into text.

43
00:03:07,310 --> 00:03:12,670
It can be from both real time live audio and prerecorded audio.

44
00:03:13,160 --> 00:03:23,270
So each speech recognition that you want to do requires a recognizer, a request, and a recognition task.

45
00:03:23,270 --> 00:03:32,660
So the recognizer basically recognizes that there is speech in this audio. The request basically will pull

46
00:03:32,660 --> 00:03:40,460
the file from wherever it's stored whether it's live or on your device like you recorded an audio session

47
00:03:40,460 --> 00:03:43,610
before and then you want to analyze it and transcribe it.

48
00:03:43,850 --> 00:03:47,510
And then the recognition task basically puts those two together.

49
00:03:47,600 --> 00:03:54,530
So it'll run the file through and then basically the result will provide that transcription that

50
00:03:54,530 --> 00:03:54,940
we want.

51
00:03:54,980 --> 00:03:58,280
So we'll get into this later.
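
The recognizer, request, and recognition task described above could be sketched in Swift roughly like this. It is a minimal sketch and not the code from the video: the helper name `transcribeFile(at:onText:)` and its callback are hypothetical, and it assumes speech recognition permission has already been granted.

```swift
import Speech

// Hypothetical helper: transcribes a prerecorded audio file at `url`
// and hands the best transcription so far to `onText`.
// Assumes the user has already authorized speech recognition.
func transcribeFile(at url: URL, onText: @escaping (String) -> Void) {
    // 1. The recognizer. The initializer is failable (it returns nil when
    //    the device locale is not supported), so unwrap it first.
    guard let recognizer = SFSpeechRecognizer(), recognizer.isAvailable else {
        print("Speech recognizer is not available right now.")
        return
    }

    // 2. The request. This one points at a file on disk; live audio
    //    would use a different request type.
    let request = SFSpeechURLRecognitionRequest(url: url)

    // 3. The recognition task. It runs the request through the recognizer
    //    and calls the handler with results or an error.
    _ = recognizer.recognitionTask(with: request) { result, error in
        if let error = error {
            print("Recognition failed: \(error.localizedDescription)")
            return
        }
        if let result = result {
            onText(result.bestTranscription.formattedString)
        }
    }
}
```

The handler can fire more than once as partial results come in, so a caller that only wants the finished text could check `result.isFinal` before using it.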

52
00:03:58,610 --> 00:04:04,220
And just like we said before, we're going to be using what's called NSSpeechRecognitionUsageDescription.

53
00:04:04,220 --> 00:04:10,550
We're going to add that key to our app's Info.plist and it'll basically pop up a prompt that will

54
00:04:10,550 --> 00:04:16,690
ask us for permission to use speech recognition and this will probably actually be an app guideline

55
00:04:16,750 --> 00:04:17,570
in the future.

56
00:04:17,650 --> 00:04:21,050
But it says here: when you adopt speech recognition in your app,

57
00:04:21,130 --> 00:04:26,110
be sure to indicate to users that their speech is being recognized and that they should not make sensitive

58
00:04:26,140 --> 00:04:28,100
utterances at the time.

59
00:04:28,150 --> 00:04:32,950
So if they're talking about something secure and they're nervous about it being leaked out you just

60
00:04:32,950 --> 00:04:38,180
basically need to tell them don't make a bad choice of what you're saying with our app.

61
00:04:38,560 --> 00:04:39,150
OK.

62
00:04:39,280 --> 00:04:40,530
So that is that.
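
The permission step mentioned above could look roughly like this in Swift. This is a sketch under assumptions: the helper name `requestSpeechPermission(completion:)` is hypothetical, and the system prompt only appears if the app's Info.plist contains the `NSSpeechRecognitionUsageDescription` key with a short explanation string.

```swift
import Speech

// Hypothetical helper: asks the user for speech recognition permission
// and reports whether it was granted.
// Requires NSSpeechRecognitionUsageDescription in the app's Info.plist.
func requestSpeechPermission(completion: @escaping (Bool) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        // The callback is not guaranteed to arrive on the main queue,
        // so hop to main before any UI work happens in `completion`.
        DispatchQueue.main.async {
            switch status {
            case .authorized:
                completion(true)
            case .denied, .restricted, .notDetermined:
                completion(false)
            @unknown default:
                completion(false)
            }
        }
    }
}
```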

63
00:04:40,570 --> 00:04:47,230
Let's go ahead and pull open a shiny new Xcode project in Xcode beta.

64
00:04:47,410 --> 00:04:52,890
At the time of this recording I'm using Xcode 8.0 beta 3.

65
00:04:53,230 --> 00:05:00,070
And so just go ahead, click Create a new Xcode project. Single View Application is fine, and let's call

66
00:05:00,070 --> 00:05:03,520
this Scribe. Let's give it a cool name.

67
00:05:03,520 --> 00:05:04,900
We don't need anything else here.

68
00:05:04,900 --> 00:05:06,470
No Core Data, no unit tests.

69
00:05:06,520 --> 00:05:08,050
Click next.

70
00:05:08,050 --> 00:05:10,950
And I'm going to save it to my desktop just like that.

71
00:05:12,300 --> 00:05:12,810
OK.

72
00:05:13,040 --> 00:05:14,300
Super cool.

73
00:05:14,300 --> 00:05:16,630
So by the way yes that is the name of my phone.

74
00:05:16,640 --> 00:05:21,380
I use that so that people don't try to connect to my hotspot without my authorization.

75
00:05:21,410 --> 00:05:22,160
Anyway moving on.

76
00:05:22,160 --> 00:05:28,700
So we are going to go ahead and start by building out our UI first and then afterwards we're going to

77
00:05:28,700 --> 00:05:30,900
go ahead and code up some code.

78
00:05:30,910 --> 00:05:37,820
So let's begin by adding in a view to the top. Drag it up here.

79
00:05:37,820 --> 00:05:39,490
And this is just going to be a banner bar.

80
00:05:39,500 --> 00:05:40,520
It's not necessary.

81
00:05:40,550 --> 00:05:45,350
But I like making my apps look nice even if they serve a really simple purpose.

82
00:05:45,590 --> 00:05:48,290
So I'm going to make it 55 high.

83
00:05:48,290 --> 00:05:55,560
Give it a color of red, and if you want to use this red, the hex color is

84
00:05:55,570 --> 00:05:56,780
DB1C1C.

85
00:05:57,050 --> 00:05:58,270
So there you go.

86
00:05:58,280 --> 00:05:58,750
All right.

87
00:05:58,790 --> 00:06:00,630
So let's give this some constraints.

88
00:06:00,650 --> 00:06:06,410
Let's pin it to the top left and right and give it a fixed height of 55.

89
00:06:06,420 --> 00:06:12,920
Next up we're going to give it a label and let's drop that down there right in the center.

90
00:06:13,280 --> 00:06:22,610
Let's call this Scribe, and let's make it a little bigger here and recenter it again. Center the text, and

91
00:06:22,880 --> 00:06:28,180
the font that I like to use when you select Custom, I like to use Avenir Next.

92
00:06:28,220 --> 00:06:32,390
I just think it looks really cool; it's nice and round and clean.

93
00:06:32,510 --> 00:06:37,000
So let's do that and maybe I'll bump it up to 20.

94
00:06:37,490 --> 00:06:41,630
And I want to give it a white color that looks good.

95
00:06:41,830 --> 00:06:42,090
OK.

96
00:06:42,100 --> 00:06:48,620
So next let's go ahead and center this in this red container that it's in.

97
00:06:48,790 --> 00:06:49,440
You know what.

98
00:06:49,480 --> 00:06:50,890
Actually that's not going to do what I want.

99
00:06:50,890 --> 00:06:52,690
That's going to put it in the center of our app.

100
00:06:52,780 --> 00:06:56,610
So let's select both the text and our view.

101
00:06:56,620 --> 00:06:59,350
Sometimes you may have to go over here and select both.

102
00:06:59,800 --> 00:07:06,510
And now in the align constraints menu, do horizontal and vertical centers. That should be good.

103
00:07:07,330 --> 00:07:07,700
OK.

104
00:07:07,750 --> 00:07:13,330
So let's think: we need our record button, we need a message that says Play and Transcribe, and we need

105
00:07:13,360 --> 00:07:14,380
a text field.

106
00:07:14,380 --> 00:07:21,660
So let's go ahead and add those. UIButton: drag it in here.

107
00:07:21,700 --> 00:07:22,990
That looks good.

108
00:07:23,290 --> 00:07:31,240
And I'm going to make this a square by going to my attributes, sorry, my size inspector, and let's make

109
00:07:31,240 --> 00:07:41,770
it size 60 by 60. Go back to the attributes inspector, remove the text by selecting that, deleting it, and

110
00:07:41,770 --> 00:07:47,680
pressing Enter and then we want the background color down here at the bottom of our button to match

111
00:07:47,770 --> 00:07:49,480
our red color on the top.

112
00:07:49,480 --> 00:07:51,480
Now don't worry that this is not yet round.

113
00:07:51,490 --> 00:07:55,180
I'm going to teach you how to round it out and make it look really nice.

114
00:07:55,550 --> 00:07:58,580
So let's just go ahead and position that, maybe there. That looks...

115
00:07:58,750 --> 00:07:59,910
Yeah that looks good.

116
00:08:00,310 --> 00:08:02,280
OK, so now I need Play and Transcribe.

117
00:08:02,380 --> 00:08:06,960
So select this UILabel, press Command-D to duplicate it.

118
00:08:06,970 --> 00:08:11,250
And I'm just going to drag it down here to the bottom if it will let me.

119
00:08:11,320 --> 00:08:11,980
Please be nice.

120
00:08:11,980 --> 00:08:12,800
There we go.

121
00:08:12,990 --> 00:08:13,400
OK.

122
00:08:13,480 --> 00:08:15,340
So we're going to center that bad boy.

123
00:08:15,730 --> 00:08:25,340
And we want to set the color of this label to dark gray and we want to stretch it out a bit.

124
00:08:25,390 --> 00:08:30,710
And we want to center it again, just doing some cleanup work here. Play

125
00:08:31,090 --> 00:08:35,270
and Transcribe.

126
00:08:35,270 --> 00:08:36,650
Awesome.

127
00:08:36,740 --> 00:08:38,090
So there's that.

128
00:08:38,210 --> 00:08:41,990
That is in place, the button is looking good.

129
00:08:42,770 --> 00:08:48,320
And you know what we need to do some constraints here so click on your label and let's go ahead and

130
00:08:49,400 --> 00:08:58,400
let's give it a horizontal constraint in the container and pin it to the bottom at 8, just like that.

131
00:08:58,430 --> 00:09:02,120
This one we're going to want to do kind of a similar thing.

132
00:09:02,120 --> 00:09:07,850
We're going to want to give it a fixed height and a fixed width and let's go ahead and pin it to the

133
00:09:07,850 --> 00:09:10,480
bottom just like that.

134
00:09:10,490 --> 00:09:14,820
But we also need to center it in the container horizontally.

135
00:09:15,260 --> 00:09:22,130
So do that in the alignment constraints and click Add 1 Constraint, and next all we need to do is

136
00:09:22,130 --> 00:09:27,030
add in our UITextField and our UI is almost done.

137
00:09:27,290 --> 00:09:28,050
Oh I'm sorry.

138
00:09:28,070 --> 00:09:29,530
UITextView.

139
00:09:29,930 --> 00:09:32,780
So we're going to drop in one just like that.

140
00:09:33,050 --> 00:09:35,810
And you see how it snaps 20 to the left.

141
00:09:35,810 --> 00:09:41,190
We want it to be about the same at the top so just kind of eyeball it and we can perfect it later.

142
00:09:41,380 --> 00:09:47,860
We want it 20 from here as well, just for aesthetic purposes. Give it some constraints.

143
00:09:47,880 --> 00:09:51,380
We were close.

144
00:09:51,860 --> 00:10:00,320
Actually we can go ahead and remove this and we can say 20 here, 20 here, 20 here and click Add 4 Constraints.

145
00:10:01,700 --> 00:10:02,230
OK.

146
00:10:02,260 --> 00:10:03,660
There we go.

147
00:10:04,260 --> 00:10:05,820
And it's giving us a little error there.

148
00:10:05,820 --> 00:10:06,820
Let's move it down.

149
00:10:06,840 --> 00:10:07,330
OK.

150
00:10:08,980 --> 00:10:13,630
Now I'm trying to figure out why it is not allowing me to do what I want.

151
00:10:13,640 --> 00:10:16,130
It's giving me this weird number thing.

152
00:10:16,130 --> 00:10:17,780
Let's go ahead and try that again.

153
00:10:17,930 --> 00:10:26,120
Let's delete it and let's actually go ahead and leave Constrain to margins on and let's just say 0, 0,

154
00:10:26,130 --> 00:10:31,110
0, 0. Pretty sure that'll give us what we want.

155
00:10:31,630 --> 00:10:32,560
Oh never mind.

156
00:10:32,570 --> 00:10:33,650
No no no this is good.

157
00:10:33,700 --> 00:10:36,680
We want it to be 20 from this 20 from this.

158
00:10:36,820 --> 00:10:40,120
And then the margin is actually it's OK to be zero.

159
00:10:40,120 --> 00:10:51,390
So let's change this to be 20 and let's go back. Whoops, there we go.

160
00:10:51,390 --> 00:10:57,410
And we want this constraint for the bottom space to also be 20. OK.

161
00:10:57,680 --> 00:11:00,790
So now it's still giving us a weird problem.

162
00:11:00,790 --> 00:11:02,450
We can check that out later.

163
00:11:02,750 --> 00:11:04,260
But for now this looks pretty good.

164
00:11:04,520 --> 00:11:07,220
So let's go ahead to the attributes inspector.

165
00:11:07,490 --> 00:11:14,000
Let's go ahead and change the font from System to, whoops, to Custom.

166
00:11:14,060 --> 00:11:16,010
Go ahead and give it Avenir Next.

167
00:11:16,040 --> 00:11:19,190
And I'm going to bump it up to size.

168
00:11:19,190 --> 00:11:20,430
I think 18 is good.

169
00:11:20,600 --> 00:11:21,120
Yeah.

170
00:11:21,990 --> 00:11:28,120
Let's make this look a little trendier, make it a light color, and let's center it.

171
00:11:28,260 --> 00:11:31,670
Of course let's bump the color down to dark gray.

172
00:11:32,130 --> 00:11:38,250
And you know what I think this is pretty good but in the original app I left a little message here just

173
00:11:38,250 --> 00:11:39,540
so that people know what to do.

174
00:11:39,540 --> 00:11:42,420
So if you double click here, I typed something like this:

175
00:11:42,420 --> 00:11:50,500
Tap the button below to transcribe the audio embedded in this app.

176
00:11:50,670 --> 00:11:56,370
And I said that because our app is actually going to have a preloaded audio file that will kind of do

177
00:11:56,370 --> 00:12:01,710
all the heavy lifting, and so I'm just telling them that there is an embedded audio file.
