Hint 1

Remember to send your user agent and accepted languages in the request headers. We discussed this in the Amazon Price Scraping project on Day 47.
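
As a rough sketch of what that can look like, the snippet below builds a headers dictionary and attaches it to a request object without sending anything. The URL and the header values are placeholders invented for illustration; substitute the values your own browser reports. The standard library is used here so the example runs without extra installs, but the same dictionary can be passed to the requests library as requests.get(url, headers=HEADERS).

```python
from urllib.request import Request

# Hypothetical values: copy the real ones from your own browser.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}

# Placeholder URL, not the real listings page.
URL = "https://example.com/listings"

# Attach the headers to the request; nothing is sent over the network here.
request = Request(URL, headers=HEADERS)
```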


Hint 2

The address data can be quite messy, with stray newlines, extra whitespace and pipe symbols mixed in.

There are many ways to clean this up. One way is to use Python's .replace() and .strip() methods to remove the newlines, whitespace and pipe symbols.
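
Here is one possible sketch of that cleanup. The function name clean_address and the sample string are made up for illustration, and the split/join step for collapsing repeated spaces is an extra choice on top of .replace() and .strip(), not the only way to do it.

```python
def clean_address(raw: str) -> str:
    """Remove newlines, pipe symbols and surplus whitespace from an address."""
    # Turn newlines and pipes into plain spaces.
    text = raw.replace("\n", " ").replace("|", " ")
    # Trim the ends, then collapse any run of whitespace into a single space.
    return " ".join(text.strip().split())


# Made-up sample resembling a messy scraped address.
sample = "\n   Rincon Green | 88 Howard St, San Francisco, CA   \n"
cleaned = clean_address(sample)
```

If you would rather keep a separator between the building name and the street address, replace the pipe with a comma instead of a space.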


Hint 3

The price text for listings with multiple properties differs from the text for listings with a single property. A listing with a single property will have a price like $1,234/mo or $1,234+/mo, but a listing with multiple properties will include the number of bedrooms in the price, like $1,234+ 1bd. Try to clean up this data as well.
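
A minimal sketch of one way to normalise the three formats mentioned above is to cut the text at the first "+", "/" or space. The function name clean_price is hypothetical, and the sketch assumes the price always comes first in the string.

```python
def clean_price(raw: str) -> str:
    """Keep only the leading dollar amount, e.g. '$1,234+ 1bd' becomes '$1,234'."""
    text = raw.strip()
    # Cut at the first occurrence of each separator that can follow the amount.
    for separator in ("+", "/", " "):
        text = text.split(separator)[0]
    return text


# The three formats described in the hint.
prices = [clean_price(p) for p in ("$1,234/mo", "$1,234+/mo", "$1,234+ 1bd")]
```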


Partial Solution

If you got stuck on the data cleaning and BeautifulSoup, you can look at the solution to the first part of the capstone project here: https://gist.github.com/TheMuellenator/7e45f9b977e90419146c4a2ee1713087


Complete Solution

Here's the complete solution that includes both BeautifulSoup and Selenium:

https://gist.github.com/TheMuellenator/1318b1084a74e9b559f9820438b4a931