1

00:00:01,170  -->  00:00:06,240
Know that you understand how a hash table works it's time for us to learn about Hasek.

2

00:00:06,780  -->  00:00:07,850
But before doing that.

3

00:00:07,980  -->  00:00:12,270
Let's briefly look at the set interface that has set implements.

4

00:00:12,750  -->  00:00:19,180
We know the tech interface extends the collection interface and it models a mathematical set which means

5

00:00:19,180  -->  00:00:24,240
up unlike a list a set can not how duplicate is due to this.

6

00:00:24,300  -->  00:00:29,540
It is useful when uniqueness matters are also fast lookups matter.

7

00:00:29,610  -->  00:00:37,230
We only discussed about how a hash table helps that Fosler cups on Hassid is based on Hostin as it internally

8

00:00:37,230  -->  00:00:43,970
uses hash map he called up the list interface element position wasn't was the idea that is insertion

9

00:00:43,980  -->  00:00:48,040
order is important but with set insertion order does not matter.

10

00:00:48,330  -->  00:00:56,190
But what matters is fast lookups and uniqueness sockets it is a subway interface offset and it can additionally

11

00:00:56,190  -->  00:00:59,110
be useful for sorting the elements.

12

00:00:59,130  -->  00:01:06,950
So in addition to uniqueness on fossil lookups sort set will also help with sorting the elements unlike

13

00:01:06,950  -->  00:01:13,170
list set interface does not add any new methods on top of what it inherits from the collection interface

14

00:01:13,170  -->  00:01:14,090
.

15

00:01:14,130  -->  00:01:19,740
However due to the fact that it does not allow duplicate it just places additional requirements on some

16

00:01:19,740  -->  00:01:26,070
of the inherited methods and also the constructors for example for the I don't admit that it explicitly

17

00:01:26,070  -->  00:01:30,110
states that duplicates are not allowed.

18

00:01:30,370  -->  00:01:37,010
Now let's look at Hasek as the name implies it is a hash table based implementation of a second duffus

19

00:01:37,880  -->  00:01:38,900
an alley.

20

00:01:38,910  -->  00:01:44,730
It uses a hash map which stores key value based hash map will be discussed later.

21

00:01:45,090  -->  00:01:52,080
But since Hachette starts on the individual objects those objects will be stored Eskies while an empty

22

00:01:52,080  -->  00:01:56,550
object will be stored as a value not an empty object.

23

00:01:56,550  -->  00:01:59,730
It is actually an instance of the object class.

24

00:02:00,150  -->  00:02:05,700
If you're interested you can decode the source code of the augmented in-house set.

25

00:02:05,850  -->  00:02:11,660
It also allows one non-value on here or its typical use cases.

26

00:02:11,760  -->  00:02:17,380
It is useful when the need is rapid lookup rapid insertion and rapid deletion.

27

00:02:17,400  -->  00:02:22,650
And we already know this from our discussion on hash table incisional that should not matter too.

28

00:02:22,680  -->  00:02:29,640
If we want to use Hachigian finally does better than our IList If remove all or retain all operations

29

00:02:29,700  -->  00:02:31,810
are to be used frequently.

30

00:02:32,070  -->  00:02:37,950
If you recall our discussion from idlest Lawson they said that agreement on what it did it through each

31

00:02:37,950  -->  00:02:44,390
element of the collection would remove those elements that are present in the input collection an index

32

00:02:44,500  -->  00:02:49,590
of idlest each the more operation has a linear time complexity Gudu element.

33

00:02:49,600  -->  00:02:54,370
Yes but with Hasek each animal would be constant.

34

00:02:55,020  -->  00:03:03,450
So it is better than idlest not to look at a demo or has set OK for this demo.

35

00:03:03,520  -->  00:03:08,860
I have created this new class called us said demo and it is added to the collections package.

36

00:03:09,150  -->  00:03:12,230
And this class has this method called us Hassidim.

37

00:03:12,390  -->  00:03:17,340
So we are going to defend this method and this particular Demel on some code has already been created

38

00:03:17,340  -->  00:03:17,660
here.

39

00:03:17,700  -->  00:03:26,670
So we will also extend that so fast let's create a set of strings so it will be set of string.

40

00:03:28,380  -->  00:03:33,390
So let's also import a set on how Shekh let's just do can go shuffle.

41

00:03:33,510  -->  00:03:36,020
So always remember to program to an interface.

42

00:03:36,050  -->  00:03:38,000
Ok we already discussed that.

43

00:03:38,180  -->  00:03:41,130
And one of the effective delite terms also to come unstuck.

44

00:03:41,210  -->  00:03:41,450
OK.

45

00:03:41,460  -->  00:03:45,200
So that way if you program to an interface you can always change the implementation.

46

00:03:45,230  -->  00:03:47,760
Now let's go ahead and add a couple of strings to this

47

00:03:47,760  -->  00:03:55,140
.

48

00:03:55,980  -->  00:03:57,850
So it's going to be a B.

49

00:03:57,940  -->  00:04:02,710
And let's also add another string quite a non that's run this.

50

00:04:02,710  -->  00:04:08,240
So we're simply spending the set one and it is only printing A and B.

51

00:04:08,280  -->  00:04:11,230
So the second is not getting at it because it's a duplicate.

52

00:04:11,280  -->  00:04:14,270
And we know that Hassid does not allow duplicate.

53

00:04:14,670  -->  00:04:21,820
So that's why it has set of strings non It's creator has set off this custom plus called book.

54

00:04:22,400  -->  00:04:27,690
OK so book is defined at the bottom here and it has three feet slightly altered on either on that are

55

00:04:27,690  -->  00:04:29,130
also getters and setters for that.

56

00:04:29,220  -->  00:04:34,310
And there is also a constructor which will take the same fuse as the input parameter.

57

00:04:34,660  -->  00:04:40,050
And here in this class we are creating two instances of the book and we are passing the same arguments

58

00:04:40,170  -->  00:04:44,130
for the for the book name for the author and also the ear.

59

00:04:44,160  -->  00:04:46,440
It is basically the year publication.

60

00:04:46,560  -->  00:04:52,320
So we have two books here and we are creating this new Haslet instance and we're adding the books to

61

00:04:52,320  -->  00:04:52,870
this.

62

00:04:52,870  -->  00:04:54,290
And we're simply printing it.

63

00:04:54,540  -->  00:04:56,980
So if Iran does so here it is.

64

00:04:57,030  -->  00:05:01,130
But this is not readable because it is printing the object references so.

65

00:05:01,150  -->  00:05:04,180
I'd like to string method here in the book.

66

00:05:04,240  -->  00:05:10,070
It's going to source said Jondrette to string on let's use all of the fields here.

67

00:05:11,400  -->  00:05:13,800
So here if you go down.

68

00:05:13,800  -->  00:05:15,650
So here had a two string mattered.

69

00:05:15,800  -->  00:05:22,490
Now let me run it now as you can see it has printing both the books and the corresponding three rallies

70

00:05:22,510  -->  00:05:22,890
.

71

00:05:23,220  -->  00:05:26,050
But here we have a duplicate here.

72

00:05:26,550  -->  00:05:27,100
So here.

73

00:05:27,120  -->  00:05:29,650
But the objects are logically equal.

74

00:05:29,820  -->  00:05:30,300
OK.

75

00:05:30,600  -->  00:05:36,270
So has said Normally we use it only for only if we want uniqueness.

76

00:05:36,270  -->  00:05:41,250
So if we want if we do not want the second element to be added because it is a duplicate because it

77

00:05:41,250  -->  00:05:45,560
is logically equal then in that case we need to do more.

78

00:05:45,720  -->  00:05:51,680
So right now we know that has set is based on hash map and don't only uses map on hash map and then

79

00:05:51,690  -->  00:05:58,320
it is based on hash hash table and promote hash table discussion we know that it uses the hash code

80

00:05:58,400  -->  00:06:05,460
it uses it has a hash function on the hash function uses the hash code mattered in order to in order

81

00:06:05,460  -->  00:06:07,600
to hash into the hash table right.

82

00:06:07,710  -->  00:06:09,690
In order to find the correct bucket number.

83

00:06:09,900  -->  00:06:14,550
Now in this case if the second element is a duplicate then it needs to have the same hash quarter as

84

00:06:14,550  -->  00:06:15,590
a hostile limit.

85

00:06:15,690  -->  00:06:21,720
But in this case both of these objects then have different hash scorched because the hash code method

86

00:06:21,720  -->  00:06:29,070
is defined in the Object class so when Hatchard invokes the hash code method on Book 1 and Book to both

87

00:06:29,070  -->  00:06:33,830
of them will be different because the hash code motard typically the way it is implemented is it and

88

00:06:33,830  -->  00:06:40,590
some memory address or off the corresponding object after converting it into an int representation.

89

00:06:40,700  -->  00:06:44,470
OK so both of them are actually two different objects.

90

00:06:44,490  -->  00:06:47,760
So both of them have two different memory addresses and because of that.

91

00:06:48,060  -->  00:06:53,950
So we get different values but it is written by the hash called method and that has matter is in the

92

00:06:53,980  -->  00:06:54,800
Object class.

93

00:06:54,930  -->  00:07:00,630
So what we need to do is we need to override it so that hardboard these instances we get the same value

94

00:07:01,380  -->  00:07:07,370
same value is written so that we should really not add the second second block.

95

00:07:07,680  -->  00:07:08,730
So let's do that.

96

00:07:08,760  -->  00:07:13,800
Let's go ahead and override the hash code matter.

97

00:07:14,850  -->  00:07:19,200
So here is the public record and so on.

98

00:07:19,210  -->  00:07:21,640
And it's a hash.

99

00:07:21,680  -->  00:07:26,070
Or let's say that I do the same.

100

00:07:26,070  -->  00:07:27,620
Very tender.

101

00:07:27,690  -->  00:07:32,910
I mean let's let's just use data to identify the duplicate.

102

00:07:32,910  -->  00:07:39,210
So in this case a hash could really use that title will invoke the Hoshko matter on Titan.

103

00:07:39,600  -->  00:07:46,790
So far since but since the titles of the books or seen this this particular statement in rockish and

104

00:07:46,800  -->  00:07:51,120
rendered on the same hash good for both the invocations.

105

00:07:51,120  -->  00:07:55,170
OK so now we're on it.

106

00:07:56,400  -->  00:07:58,790
But then the second book is still getting entered.

107

00:07:59,010  -->  00:08:04,780
The reason for that is because we have all read under the hash code mattered.

108

00:08:04,830  -->  00:08:08,860
So both of them have the same hash cord and they get to the same bucket.

109

00:08:09,180  -->  00:08:15,390
But what happens is this particular book the book one has been added to the linked list in the linked

110

00:08:15,390  -->  00:08:16,290
list.

111

00:08:16,290  -->  00:08:23,970
Now the second book also get has to the same bucket then but hash table will try to check if this is

112

00:08:23,970  -->  00:08:28,220
actually equal to that and for that equals method is used.

113

00:08:28,290  -->  00:08:30,320
We have the equal muttered in the Object class.

114

00:08:30,330  -->  00:08:31,560
It is defined there.

115

00:08:31,830  -->  00:08:34,040
So that needs to be overridden also.

116

00:08:34,230  -->  00:08:40,020
So if it is not overridden so has table will actually be using the equals method from the Object class

117

00:08:40,370  -->  00:08:43,290
on Dodik was murdered by the by default.

118

00:08:43,320  -->  00:08:45,510
Well Paul Hoffman equality operation.

119

00:08:45,510  -->  00:08:50,050
So when we when it does an equality operation it compares the object references.

120

00:08:50,170  -->  00:08:50,780
OK.

121

00:08:51,090  -->  00:08:55,510
An object offenses are obviously different because they're both are two different objects.

122

00:08:55,710  -->  00:09:01,610
So we need to also override the equals matter and that is already done here.

123

00:09:01,620  -->  00:09:09,270
So let me just uncommon this but the equals method for the Hoshko we are using title for the equals

124

00:09:09,600  -->  00:09:12,780
just for this example we can use either an author.

125

00:09:13,110  -->  00:09:15,960
So let me run it now.

126

00:09:16,050  -->  00:09:18,670
Now you can see that the second book is not it.

127

00:09:19,000  -->  00:09:19,480
OK.

128

00:09:19,770  -->  00:09:26,010
So what needs to do what needs to be done is the hash code matter needs to be overridden and also the

129

00:09:26,100  -->  00:09:28,960
equals method has to be overridden.

130

00:09:29,430  -->  00:09:34,410
Because a hash board they get mapped to the same hash but the second object will also be mapped at the

131

00:09:34,410  -->  00:09:38,490
same bucket when Hastert performs the equals operation.

132

00:09:38,490  -->  00:09:44,910
It compares it with the book one and both of them have the same ear and the same author.

133

00:09:44,970  -->  00:09:50,820
So it would read on a true and because of got it it's not going to read the second book.

134

00:09:51,240  -->  00:09:58,380
Now even in the first example here when we were using strings so the string also here string is also

135

00:09:58,380  -->  00:10:04,200
an object but string also has you know if you look at the source code you will see that it has already

136

00:10:04,200  -->  00:10:09,490
been bought hash called method as well as the second method which is the equals method.

137

00:10:09,510  -->  00:10:14,690
So because of that it did not add the second the second the duplicate.

138

00:10:14,910  -->  00:10:16,290
So this is what we have.

139

00:10:16,500  -->  00:10:23,840
And that's actually common this odd bit of this because they can be auto generated.

140

00:10:23,970  -->  00:10:31,080
So if you're going to source here and here it is generated hash code unequals and it is asking if you

141

00:10:31,080  -->  00:10:33,010
need to use all entry fees or.

142

00:10:33,030  -->  00:10:35,590
But all three fees for both ultimate's.

143

00:10:36,030  -->  00:10:40,770
So here you can see that this is the hash code metadata that has been auto generated and this is the

144

00:10:40,830  -->  00:10:42,160
equals method.

145

00:10:42,180  -->  00:10:44,980
So these are very different from what we have written earlier.

146

00:10:45,060  -->  00:10:50,790
So the Hoshko method is using some some interior some int variables also.

147

00:10:50,790  -->  00:10:53,540
So this is based on the best practice.

148

00:10:53,550  -->  00:10:59,460
So what we need and what we know it was very very basic Hoshko method but this is like based on some

149

00:10:59,460  -->  00:11:00,540
best practices.

150

00:11:00,810  -->  00:11:03,280
So this is actually a research area itself.

151

00:11:03,760  -->  00:11:05,070
HALPERT behind the hash.

152

00:11:05,070  -->  00:11:06,510
Hash functions.

153

00:11:06,570  -->  00:11:10,520
So so mathematicians on Twitter dical computer scientists work on that.

154

00:11:10,770  -->  00:11:11,880
So this is how it is.

155

00:11:11,880  -->  00:11:15,090
So eclipse is using the best practice.

156

00:11:15,180  -->  00:11:20,460
In fact next we are going to briefly discuss an item from authoritative Java and it also discusses about

157

00:11:20,460  -->  00:11:25,360
this exact same kind of formula and it is.

158

00:11:25,440  -->  00:11:32,530
It is already implemented by eclipse so it is very important to use a good hash code matter.

159

00:11:33,030  -->  00:11:39,650
And in the item that we are going to discuss it seems the String class used a bad hash code method prior

160

00:11:39,680  -->  00:11:43,820
to Jalapa and because of that it gave lot of issues.

161

00:11:43,950  -->  00:11:46,520
When lot of strings were added to hash tables.

162

00:11:46,640  -->  00:11:52,380
Ok so it is very important to use a good hash function and this is a good hash function which is Auto

163

00:11:52,380  -->  00:11:53,820
generator.

164

00:11:53,830  -->  00:12:00,350
The target is now let's head back to the slide and let's just look at the item there.

165

00:12:01,580  -->  00:12:05,580
And here is Item 9 from Effective Java which is all this.

166

00:12:05,580  -->  00:12:06,990
All right Hoshko matter.

167

00:12:07,070  -->  00:12:12,020
When you override equals and this happens to be a common in w question.

168

00:12:12,510  -->  00:12:19,300
So if you want to treat two different objects as logically equal you must go right but equals on Hoshko

169

00:12:19,350  -->  00:12:20,570
methods.

170

00:12:20,720  -->  00:12:26,460
This or that or writing only Haskell method was not helpful in preventing that duplicate from getting

171

00:12:26,460  -->  00:12:27,610
added.

172

00:12:27,630  -->  00:12:32,270
Similarly if only equals was overridden but not caused matter.

173

00:12:32,550  -->  00:12:38,610
Then it is very likely that the duplicate object will be stored in a completely different bucket.

174

00:12:38,610  -->  00:12:45,150
In fact the contract of Hoshko method explicitly states that if two objects are equal according to their

175

00:12:45,180  -->  00:12:51,710
equal Medek then calling the corresponding hash code methods must also produce the same injuries.

176

00:12:52,350  -->  00:12:56,390
Let's actually review this in the Hoshko that's contract.

177

00:12:57,450  -->  00:13:01,400
OK here is a Hoshko method from the object class.

178

00:13:01,820  -->  00:13:03,760
And here is what we discussed earlier.

179

00:13:03,790  -->  00:13:05,280
You can also read it later.

180

00:13:05,370  -->  00:13:07,280
It's as if two objects are equal.

181

00:13:07,470  -->  00:13:09,740
According to the equals method.

182

00:13:09,960  -->  00:13:16,070
Then calling the hash code method on each of the two objects must produce the same integer result.

183

00:13:16,250  -->  00:13:24,270
OK so that's what the effective delite also says and it also says that if two objects are unequal according

184

00:13:24,270  -->  00:13:30,810
to the equals method then the Hoshko method can still produce the same and degenerate business.

185

00:13:30,900  -->  00:13:34,220
OK that's what this particular statement that Turquand says.

186

00:13:34,330  -->  00:13:41,600
OK so if the objects are not equal then the houseboat can still produce the same integer reasons.

187

00:13:41,700  -->  00:13:48,320
However this last statement recommends that for unequal objects it says it is better for the hash court

188

00:13:48,330  -->  00:13:54,840
method to produce distinct integer results so that rate would improve the performance of hash tables

189

00:13:54,840  -->  00:13:55,220
.

190

00:13:55,350  -->  00:14:00,990
So that is what is recommended but there is no requirement that it has to be like that.

191

00:14:01,020  -->  00:14:05,910
The hash quote motard candidate and the same value even for unequal objects.

192

00:14:06,000  -->  00:14:09,140
So there is no requirement from the Java language specification.

193

00:14:09,150  -->  00:14:16,020
It is dependent on the Java language implementation but typically they the hash code method uses the

194

00:14:16,020  -->  00:14:18,640
memory address so the objects are equal.

195

00:14:18,930  -->  00:14:24,270
Obviously it would read under different hash Goertz indistinct Oshkosh on that as how that is the good

196

00:14:24,270  -->  00:14:29,850
way to implement but it is not required from the language of the specification.

197

00:14:30,180  -->  00:14:31,040
So that's about it.

198

00:14:31,080  -->  00:14:36,960
So we looked at Hasek and it is useful when uniqueness on Fosco cups matter but not in social order

199

00:14:36,960  -->  00:14:37,860
.

200

00:14:37,860  -->  00:14:43,890
Now if you want to insert objects of your own class and if you want to maintain uniqueness and logically

201

00:14:43,920  -->  00:14:49,610
equal objects then make sure to override both hash code as well as Eclipse method.

202

00:14:49,800  -->  00:14:50,190
Thank you
