1
00:00:05,980 --> 00:00:10,630
So let's go through some common database
terminology, that should be useful for

2
00:00:10,630 --> 00:00:13,540
you when we're working with the
databases in this section of the course.

3
00:00:13,540 --> 00:00:18,490
And the idea here is that it'll
make it a lot easier to understand the

4
00:00:18,490 --> 00:00:22,170
app, when you understand what the
fundamentals are, and what all the

5
00:00:22,170 --> 00:00:27,840
actual various terms mean. Now at the very basic level, the database is the

6
00:00:27,840 --> 00:00:33,780
container for all the data that you store,
or that resides inside it. Now when you

7
00:00:33,780 --> 00:00:38,530
use the term database, you're referring
to the entire data, as well as the

8
00:00:38,530 --> 00:00:44,100
structure it's actually stored in,
and in addition, any queries and views on

9
00:00:44,100 --> 00:00:49,660
that data. Now in SQLite, which is
the database used for Android apps,

10
00:00:49,660 --> 00:00:54,960
the entire database contents are stored
in one single file, but that isn't true

11
00:00:54,960 --> 00:00:59,850
of most large database systems. Now the
Database Dictionary provides a

12
00:00:59,850 --> 00:01:04,980
comprehensive list of the structures and
types of data, that are used in recording

13
00:01:04,980 --> 00:01:10,720
the data. So basically, it describes all the
tables and fields within the database. Now

14
00:01:10,720 --> 00:01:15,190
more on the specifics later, but in
SQLite there is a table in each

15
00:01:15,190 --> 00:01:20,890
database called SQL_Lite_master, that has this information

16
00:01:20,890 --> 00:01:26,110
in it. Now you can query that table, but there
are commands that do it for you, so you

17
00:01:26,110 --> 00:01:29,920
don't have to understand the structure
of the master table. But it's there for

18
00:01:29,920 --> 00:01:33,190
you anyway, and as we go through the
app, you'll see how these commands

19
00:01:33,190 --> 00:01:36,190
actually work.

20
00:01:36,760 --> 00:01:42,610
Now, a table is a collection of related
data held in a database. So think of

21
00:01:42,610 --> 00:01:46,180
a contact database, for example, that
stores the name and the address and the

22
00:01:46,180 --> 00:01:50,770
phone number, of perhaps your customers.
Or what about an invoice table that

23
00:01:50,770 --> 00:01:55,180
records the invoice number in details of
the invoice. So in this slide there are

24
00:01:55,180 --> 00:02:00,160
two tables; Contacts and Invoices, that are
used to store information about contacts

25
00:02:00,160 --> 00:02:05,290
and invoice details. Now its SQL
databases such as SQLite, again used

26
00:02:05,290 --> 00:02:09,720
in Android app development, or another
database like Microsoft SQL Server -

27
00:02:09,720 --> 00:02:14,290
they're ideal for storing structured
data that can be organized neatly into

28
00:02:14,290 --> 00:02:18,880
rows and columns, like you can see in
these examples. Now with all the interest

29
00:02:18,880 --> 00:02:24,910
in Big Data, there are now databases such as No
SQL or Hadoop, that can cope with data

30
00:02:24,910 --> 00:02:27,730
that doesn't have such an obvious
structure, but we're going to be

31
00:02:27,730 --> 00:02:30,730
restricting our use of database to
structured data.

32
00:02:32,720 --> 00:02:38,540
So a Field is the basic unit of data in
a table. So a field in a database can be

33
00:02:38,540 --> 00:02:43,610
thought of, probably in a similar way, to
what an instance variable in Java is. And

34
00:02:43,610 --> 00:02:47,570
you've seen those obviously in different
apps, and just like a class variable, a

35
00:02:47,570 --> 00:02:52,940
database field has a name and a type.
Now the type restricts what kind of

36
00:02:52,940 --> 00:02:57,620
data can be stored in a field, for
example, it could be a string or it could

37
00:02:57,620 --> 00:03:02,930
accept numbers. Now many databases also
allowed date fields, large text fields

38
00:03:02,930 --> 00:03:07,220
and also fields where you can store
things like photographs or audio - and

39
00:03:07,220 --> 00:03:11,780
these field types can
often, are often called BLOBs, which is

40
00:03:11,780 --> 00:03:16,490
supposed to stand for Binary Large
OBject. It's a great name, and I can't

41
00:03:16,490 --> 00:03:20,090
help thinking that the original acronym
was something like L.B.O, for Large Binary

42
00:03:20,090 --> 00:03:23,090
Object, and they came up with a cooler sounding name afterwards.

43
00:03:24,230 --> 00:03:29,620
Now Fields are often referred to as
columns in databases. I know this can be

44
00:03:29,620 --> 00:03:33,260
technically confusing sometimes because
if you come from an Excel background,

45
00:03:33,260 --> 00:03:36,200
you'll find that the definition is
probably a bit different to what you

46
00:03:36,200 --> 00:03:41,090
think of as a column. Now in a spreadsheet,
the term column refers to an entire set

47
00:03:41,090 --> 00:03:46,370
of data extending across many rows. but
in a relational database, column

48
00:03:46,370 --> 00:03:48,830
generally refers to a single entry -

49
00:03:48,830 --> 00:03:52,250
although, when talking about the
structure of a table rather than the

50
00:03:52,250 --> 00:03:56,540
actual data, then you could talk about a
column to hold the Invoice Number, which

51
00:03:56,540 --> 00:04:01,250
is probably closer to the spreadsheet
use of the term. Now when referring to

52
00:04:01,250 --> 00:04:05,560
the data in a database, though, we're
talking about an individual item, like a

53
00:04:05,560 --> 00:04:10,400
base unit of data. Now relational
databases existed nearly 10 years before

54
00:04:10,400 --> 00:04:13,940
the first spreadsheet program, and we
just have to accept that column means

55
00:04:13,940 --> 00:04:16,940
slightly different things in each case.
But more on that shortly.

56
00:04:18,630 --> 00:04:23,550
Now a row or record, is a single set of
data for all fields that are in that

57
00:04:23,550 --> 00:04:28,770
table. So if you've got four columns like
the example on the screen, and if your

58
00:04:28,770 --> 00:04:33,300
table has an invoice number or a date,
a description and an amount, then a row

59
00:04:33,300 --> 00:04:37,320
represents those four values for a
single invoice. So it's really a

60
00:04:37,320 --> 00:04:41,820
collection of all the columns that
comprise the details of one entry in

61
00:04:41,820 --> 00:04:46,560
that table. So the highlighted record
holds the details for invoice number two,

62
00:04:46,560 --> 00:04:51,060
which was a laptop costing just over
thousand dollars, sold on the

63
00:04:51,060 --> 00:04:57,390
twenty-fourth of may 2016. And you can use
either of the terms, row or record, to

64
00:04:57,390 --> 00:05:01,620
identify it, but the correct relational
database terminology is actually row.

65
00:05:01,620 --> 00:05:03,630
....

66
00:05:03,630 --> 00:05:08,700
Now a Flat File database stores all
the data in a single file, which can

67
00:05:08,700 --> 00:05:12,510
result in a lot of duplication of data.
Now here,

68
00:05:12,510 --> 00:05:17,160
ISP's credit limit needs to be increased,
so that they can purchase the monitor, as

69
00:05:17,160 --> 00:05:20,070
otherwise, they'd actually go over the
limit.

70
00:05:20,070 --> 00:05:24,060
So in order to increase the limit, the
data in three rows would have to be

71
00:05:24,060 --> 00:05:25,970
modified.

72
00:05:25,970 --> 00:05:31,040
Now as you can see, every row in the
table that contains a record for ISP,

73
00:05:31,040 --> 00:05:36,650
must be changed in order to increase the
credit limit. So Flat File databases are

74
00:05:36,650 --> 00:05:41,210
not used very often anymore, but they
were fairly popular in the early days as

75
00:05:41,210 --> 00:05:46,040
they directly mapped to those card index
records, that companies sometimes used.

76
00:05:46,040 --> 00:05:51,050
Now if we were storing names, addresses
and phone number type information, then a

77
00:05:51,050 --> 00:05:55,490
Flat File database is fine for the job.
And even today, people still use address

78
00:05:55,490 --> 00:06:00,230
books and rolex style contact systems,
with each person's details on a

79
00:06:00,230 --> 00:06:04,850
different card. But there isn't really
much need to relate the individual cards

80
00:06:04,850 --> 00:06:05,840
to each other -

81
00:06:05,840 --> 00:06:11,030
this works perfectly well. Now as you can see
from this invoice example, though, trying

82
00:06:11,030 --> 00:06:16,040
to store all the data in a single table
results in duplicate data. Now using a

83
00:06:16,040 --> 00:06:20,390
relational database, tables can be
related to other tables which is very

84
00:06:20,390 --> 00:06:22,370
useful.

85
00:06:22,370 --> 00:06:27,290
So continuing on with the invoice
example; we can split the data out into

86
00:06:27,290 --> 00:06:32,380
a customer table, which contains standard
company data such as their name, address

87
00:06:32,380 --> 00:06:36,590
and a credit limit, and another table
called invoices, containing all that

88
00:06:36,590 --> 00:06:41,810
customer's purchases. The Name column in
the Customer table is related to the

89
00:06:41,810 --> 00:06:46,720
Name column in the Invoices table. Now in
relational database terms, this is called

90
00:06:46,720 --> 00:06:52,760
a Join. In fact, in this example we have a
one-to-many join because there can, and

91
00:06:52,760 --> 00:06:57,830
probably will, be many invoice rows for
each customer. Using a relational model,

92
00:06:57,830 --> 00:07:01,970
updating a customer's credit limit
involves changing the data in just a

93
00:07:01,970 --> 00:07:06,800
single row. So there is a mechanism to
join these two tables, to link the

94
00:07:06,800 --> 00:07:11,440
individual records in each table, to each
other. And you'll often see designs where a

95
00:07:11,440 --> 00:07:15,080
third table is used to provide the link,
as in this next slide.

96
00:07:17,480 --> 00:07:21,890
And it's also very common to use a
linking table to relate the data to two

97
00:07:21,890 --> 00:07:28,100
other tables. Now here, when an invoice
record is stored, a new record is created

98
00:07:28,100 --> 00:07:33,740
in the Customer_Invoices table
to link, for example, invoice 0004

99
00:07:33,740 --> 00:07:38,840
with customer ISP. Now one advantage of
this is that the invoice table only

100
00:07:38,840 --> 00:07:43,640
contains data relating to invoices. The
rows are not cluttered up with

101
00:07:43,640 --> 00:07:48,560
customer information of any kind, not
even the customer name. Splitting the

102
00:07:48,560 --> 00:07:51,860
data up like this is known as
normalisation. Now database

103
00:07:51,860 --> 00:07:56,480
normalization is basically the process
of removing redundant, duplicated and

104
00:07:56,480 --> 00:08:01,190
irrelevant data from the
tables, and the more that this is done,

105
00:08:01,190 --> 00:08:05,720
the higher the level of normalisation. If
you look into normalisation, you'll

106
00:08:05,720 --> 00:08:11,450
find that you can go up to level 6 - Sixth
Normal Form - but in most practical

107
00:08:11,450 --> 00:08:15,530
applications it's ready to go beyond the
third level. Now it's an interesting

108
00:08:15,530 --> 00:08:19,490
subject but the maths can get a bit
horrible at the higher levels. Our

109
00:08:19,490 --> 00:08:23,060
example on the slide isn't quite as
normal we should be, because we've

110
00:08:23,060 --> 00:08:26,780
used the customer name as the link
between the customer and invoice tables.

111
00:08:26,780 --> 00:08:31,130
Now if one of our customers changes its
name, which is quite a common thing to happen,

112
00:08:31,130 --> 00:08:35,929
then we'll have to update each of the
relevant entries in the Customer

113
00:08:35,929 --> 00:08:40,309
_Invoices table as well. Now the
usual way to deal with this is to use a

114
00:08:40,309 --> 00:08:46,550
unique ID field, that stays the same for
each customer once its allocated. Now we

115
00:08:46,550 --> 00:08:50,090
shouldn't store the customer balances in
the table, as that's best calculated

116
00:08:50,090 --> 00:08:53,260
when needed.

117
00:08:53,260 --> 00:08:58,690
So a View is a way of looking at the
data in a format similar to a table, but

118
00:08:58,690 --> 00:09:04,210
bringing data together from more than one
Joined table. So in this example, the view

119
00:09:04,210 --> 00:09:09,520
contains columns from the Customer table
and the Invoices table. So a view can

120
00:09:09,520 --> 00:09:13,270
just contain just some columns from a
single table - for example, just the

121
00:09:13,270 --> 00:09:16,930
Description column from the Invoices table to produce a list of items that

122
00:09:16,930 --> 00:09:23,230
have been sold. Now in SQLite, the data
in a view can't be updated. So that means

123
00:09:23,230 --> 00:09:27,340
that you can't add a new row to a view and have it place the new data into the

124
00:09:27,340 --> 00:09:32,080
the relevant tables. Now some databases
such as Microsoft SQL Server do allow

125
00:09:32,080 --> 00:09:37,120
this, but in that case, the views do have
restrictions on the columns they can and

126
00:09:37,120 --> 00:09:41,410
must contain, if they're used in this way.
So that's not something we need to worry

127
00:09:41,410 --> 00:09:43,960
about because we can't do that in SQLite.

128
00:09:43,960 --> 00:09:47,290
Alright, so that's enough to get us started,
and we're going to explore these terms

129
00:09:47,290 --> 00:09:52,240
more as we go through SQLite. So
coming up in the next videos, we're going

130
00:09:52,240 --> 00:09:57,010
to discuss a a few more database concepts
and I'll be going into more detail. We'll

131
00:09:57,010 --> 00:10:01,630
be using practical examples and actually
using the SQLite database. Now

132
00:10:01,630 --> 00:10:06,670
SQLite is designed to be embedded in
applications, and is actually a library

133
00:10:06,670 --> 00:10:11,980
that's called from our application code.
It does ship with a shell program that

134
00:10:11,980 --> 00:10:16,210
you can use to create databases and
interrogate them, though, and we'll start off by

135
00:10:16,210 --> 00:10:20,410
using that to explore the commands
available in SQLite. Now before we

136
00:10:20,410 --> 00:10:24,040
can use that shell program, we need to
make sure that it's available on your

137
00:10:24,040 --> 00:10:28,420
system's path. so the next three videos
are going to show you how to do that for

138
00:10:28,420 --> 00:10:32,680
Windows, Mac and Linux. So there's one
video for each operating system, so

139
00:10:32,680 --> 00:10:36,400
follow the instructions in the relevant
video for your operating system and you

140
00:10:36,400 --> 00:10:40,180
can skip over the videos that aren't
relevant. So I'll see you in one of those

141
00:10:40,180 --> 00:10:40,600
videos.

