1
00:00:00,630 --> 00:00:01,740
In the last lesson,

2
00:00:01,800 --> 00:00:06,800
we managed to get our API to work and we got back some live data.

3
00:00:07,800 --> 00:00:08,910
But when we ran it,

4
00:00:09,030 --> 00:00:13,500
we saw that some of the text that we were getting back was formatted really

5
00:00:13,500 --> 00:00:17,970
strangely with these pound signs and ampersands,

6
00:00:18,090 --> 00:00:21,290
and it's not the actual text that we see.

7
00:00:21,920 --> 00:00:24,290
So what's happening here? Well,

8
00:00:24,380 --> 00:00:28,310
what we're actually seeing here are called HTML entities,

9
00:00:28,880 --> 00:00:33,880
and they're a way of replacing certain characters in HTML so that it doesn't

10
00:00:35,060 --> 00:00:38,990
get confused with HTML code. So for example,

11
00:00:39,020 --> 00:00:44,020
the less than symbol could be a part of HTML code.

12
00:00:44,600 --> 00:00:48,590
And instead of using that, we have to use the

13
00:00:48,650 --> 00:00:51,890
&lt and then a semicolon.

14
00:00:52,850 --> 00:00:55,100
So if we look down this table,

15
00:00:55,370 --> 00:01:00,370
we can see that this &quot; actually stands for a double quotation mark.

16
00:01:01,940 --> 00:01:05,930
And that would make sense cause it's saying "Mario Kart

17
00:01:05,930 --> 00:01:07,400
64"

18
00:01:08,060 --> 00:01:11,570
and this &#039;,

19
00:01:11,570 --> 00:01:16,570
if we look up in this list, is actually a single quotation mark.

20
00:01:17,570 --> 00:01:20,960
And that would make sense as well, cause it would be Stalin's death.
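
A quick way to double-check a numeric entity without the table is sketched below. It assumes only that the number in an entity such as &#039; is a character code, which Python's built-in chr() can turn back into a character. The variable name decoded is made up for this illustration.

```python
# The digits in a numeric entity such as &#039; are a character code.
# chr() turns a character code back into the character it stands for.
codes = (34, 39, 60)  # double quote, apostrophe, less-than sign

# "decoded" is a hypothetical name used only for this sketch.
decoded = {code: chr(code) for code in codes}

for code, char in decoded.items():
    print(f"&#{code:03d}; stands for {char}")
```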

21
00:01:21,740 --> 00:01:26,270
So how do we get hold of the actual human readable text? Well,

22
00:01:26,420 --> 00:01:30,740
we can use this tool called the free formatter to

23
00:01:31,160 --> 00:01:36,050
unescape the HTML results that we're getting back from our API.

24
00:01:36,890 --> 00:01:40,040
I've copied and pasted this part we've got here.

25
00:01:41,540 --> 00:01:43,790
And if I go ahead and click on unescape,

26
00:01:44,060 --> 00:01:49,060
you can see that it formats it into the original human readable format

27
00:01:49,430 --> 00:01:53,480
and now it says in "Mario Kart 64" Waluigi is a playable character.

28
00:01:54,110 --> 00:01:58,580
And if I paste the Cold War ended with Joseph Stalin, blah, blah, blah,

29
00:01:59,030 --> 00:02:02,120
death and I click unescape, then you can see 

30
00:02:02,120 --> 00:02:06,500
it says the Cold War ended with Joseph Stalin's death and it replaces that with

31
00:02:06,560 --> 00:02:08,449
an apostrophe. Now,

32
00:02:08,449 --> 00:02:13,070
essentially we know what to Google and that's kind of the first step towards

33
00:02:13,070 --> 00:02:14,240
solving any problem.

34
00:02:14,720 --> 00:02:18,620
So if you Google for escaping HTML entities in Python,

35
00:02:18,860 --> 00:02:22,940
then the first result we get in Stack Overflow gives us the answer.

36
00:02:23,540 --> 00:02:28,340
We have to import the html module and use one of the functions in that module

37
00:02:28,340 --> 00:02:33,110
called unescape in order to unescape the text that we're getting back.
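
As a minimal sketch of that answer, the snippet below runs html.unescape from Python's standard library on two strings. The strings are typed in by hand to resemble the API text described in this lesson; they are not copied from a real response.

```python
import html

# Hand-written samples shaped like the escaped text described in the lesson.
raw_mario = "In &quot;Mario Kart 64&quot;, Waluigi is a playable character."
raw_stalin = "The Cold War ended with Joseph Stalin&#039;s death."

# html.unescape converts named and numeric entities back into characters.
clean_mario = html.unescape(raw_mario)
clean_stalin = html.unescape(raw_stalin)

print(clean_mario)
print(clean_stalin)
```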

38
00:02:33,950 --> 00:02:37,790
The part where we need this is in our quiz brain,

39
00:02:38,270 --> 00:02:42,260
because that's the part where we format it into our user answer.

40
00:02:43,130 --> 00:02:45,330
Let's change the question

41
00:02:45,380 --> 00:02:50,380
text to be equal to the self.current_question.text,

42
00:02:50,870 --> 00:02:52,640
so this part that we have here

43
00:02:52,970 --> 00:02:57,970
which is being put into our input and we can use this q_text instead.

44
00:02:59,800 --> 00:03:03,640
But instead of using just the text that we get back from the API,

45
00:03:03,970 --> 00:03:06,550
we're going to import the html module.

46
00:03:08,550 --> 00:03:08,970
Yeah.

47
00:03:08,970 --> 00:03:13,970
And we're going to use the function inside this html module called unescape

48
00:03:14,700 --> 00:03:18,150
to unescape this string that we get from the API.

49
00:03:18,930 --> 00:03:23,700
And now if I run this code again, you can see that this time,

50
00:03:23,730 --> 00:03:28,730
no matter what is inside the string, say an apostrophe in this case or a double

51
00:03:31,860 --> 00:03:35,580
quote in this case, they're all being formatted correctly.
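
The change described above might look roughly like the sketch below. Only self.current_question.text, q_text, and html.unescape come from the lesson. The Question class, the next_prompt method, the prompt wording, and the sample question are assumptions made for illustration, and the prompt is returned as a string rather than passed to input() so the sketch runs without user interaction.

```python
import html


class Question:
    # Hypothetical minimal question model; the lesson only names .text.
    def __init__(self, text, answer):
        self.text = text
        self.answer = answer


class QuizBrain:
    def __init__(self, question_list):
        self.question_number = 0
        self.question_list = question_list
        self.current_question = None

    def next_prompt(self):
        """Build the prompt for the next question with entities unescaped."""
        self.current_question = self.question_list[self.question_number]
        self.question_number += 1
        # Unescape the raw API text before showing it to the user.
        q_text = html.unescape(self.current_question.text)
        return f"Q.{self.question_number}: {q_text} (True/False): "


# Hand-written sample question in the escaped form described in the lesson.
sample = Question("The Cold War ended with Joseph Stalin&#039;s death.", "False")
quiz = QuizBrain([sample])
prompt = quiz.next_prompt()
print(prompt)
```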

52
00:03:36,780 --> 00:03:37,613
There you have it.

53
00:03:38,070 --> 00:03:42,270
We started off with some strange characters and after a bit of Googling around,

54
00:03:42,270 --> 00:03:47,190
we found the solution to turn them into human readable text. As a programmer,

55
00:03:47,220 --> 00:03:49,500
this is a skill that you have to really hone.

56
00:03:49,800 --> 00:03:53,040
This is something that is going to take you to the next level

57
00:03:53,250 --> 00:03:55,680
to this intermediate++ level.

58
00:03:56,010 --> 00:03:59,040
You have to find solutions to your own problems,

59
00:03:59,340 --> 00:04:01,020
and Google is your best friend.

