House hunting can be daunting because there is so much information to consider. rhousehunter aims to simplify the information-gathering process with four simple R functions. This document shows you how to use the functions in the rhousehunter package to gather rental information from Craigslist with ease.
The first function in our package is scraper(). It takes the URL of the main housing and apartment rentals page of Craigslist BC. Set the argument online = TRUE to scrape directly from the internet. When online = FALSE, scraper() reads from a local HTML file instead, which can be handy if the Craigslist website is down or for internal development and testing. Please note that you cannot pass the URL of an individual listing. The example below uses online = FALSE.
url <- "https://vancouver.craigslist.org/d/apartments-housing-for-rent/search/apa"
scraped_data <- scraper(url, online = FALSE)
head(scraped_data)
#> # A tibble: 6 x 3
#> listing_url price house_type
#> <chr> <chr> <chr>
#> 1 https://vancouver.craigslist.org/bnc/apa/d/burnaby-must-see~ $1,250 1br-600ft~
#> 2 https://vancouver.craigslist.org/rds/apa/d/surrey-bedroom-b~ $1,300 2br-
#> 3 https://vancouver.craigslist.org/van/apa/d/vancouver-furnis~ $1,850 1br-500ft~
#> 4 https://vancouver.craigslist.org/van/apa/d/vancouver-yaleto~ $3,695 2br-900ft~
#> 5 https://vancouver.craigslist.org/van/apa/d/vancouver-bed-ba~ $2,390 2br-748ft~
#> 6 https://vancouver.craigslist.org/van/apa/d/cozy-bedroom-apa~ $1,675 1br-500ft~
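Because scraper() expects a search page rather than an individual listing, it can help to check a URL before passing it in. The following is a minimal sketch in base R, not part of rhousehunter: the helper is_search_page() is hypothetical, the listing URL is a made-up placeholder, and the check assumes that search pages contain "/search/apa" while individual listings contain "/apa/d/", as the URLs shown above suggest.

```r
# Hypothetical helper (not part of rhousehunter): TRUE when a URL looks like
# a Craigslist search page rather than an individual listing.
# Assumption: search pages contain "/search/apa", listings contain "/apa/d/".
is_search_page <- function(url) {
  grepl("/search/apa", url, fixed = TRUE) & !grepl("/apa/d/", url, fixed = TRUE)
}

search_url <- "https://vancouver.craigslist.org/d/apartments-housing-for-rent/search/apa"
# Placeholder listing URL, for illustration only
listing_url <- "https://vancouver.craigslist.org/van/apa/d/example-listing/0000000000.html"

is_search_page(search_url)
is_search_page(listing_url)
```

A guard such as stopifnot(is_search_page(url)) before the call to scraper() would then fail early on a listing URL.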
Our data_cleaner() function is a straightforward and powerful tool. It turns the tibble generated by scraper() into a clean and tidy tibble. It takes a single input: the output of scraper().
cleaned_data <- data_cleaner(scraped_data)
head(cleaned_data)
#> # A tibble: 6 x 5
#> listing_url price num_bedroom area_sqft city
#> <chr> <int> <int> <int> <chr>
#> 1 https://vancouver.craigslist.org/bnc/apa/d~ 1250 1 600 burna~
#> 2 https://vancouver.craigslist.org/rds/apa/d~ 1300 2 NA surrey
#> 3 https://vancouver.craigslist.org/van/apa/d~ 1850 1 500 vanco~
#> 4 https://vancouver.craigslist.org/van/apa/d~ 3695 2 900 vanco~
#> 5 https://vancouver.craigslist.org/van/apa/d~ 2390 2 748 vanco~
#> 6 https://vancouver.craigslist.org/van/apa/d~ 1675 1 500 <NA>
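To give a feel for the kind of transformation involved, here is a minimal base R sketch that turns strings shaped like the price and house_type columns into integers. It is an illustration under assumptions, not the implementation of data_cleaner(): the sample data frame and the helper extract_int() are hypothetical, and the patterns assume that bedrooms are marked with "br" and area with "ft".

```r
# Hypothetical raw values shaped like the scraper() output
raw <- data.frame(
  price = c("$1,250", "$1,300"),
  house_type = c("1br-600ft", "2br-"),
  stringsAsFactors = FALSE
)

# Strip the dollar sign and thousands separator, then convert to integer
price <- as.integer(gsub("[$,]", "", raw$price))

# Hypothetical helper: return the digits that precede `suffix`,
# or NA when the suffix does not appear in the string
extract_int <- function(x, suffix) {
  pattern <- paste0("([0-9]+)", suffix)
  hits <- regmatches(x, regexec(pattern, x))
  vapply(
    hits,
    function(hit) if (length(hit) == 2) as.integer(hit[2]) else NA_integer_,
    integer(1)
  )
}

num_bedroom <- extract_int(raw$house_type, "br")
area_sqft <- extract_int(raw$house_type, "ft")
```

A listing without an area, such as "2br-", yields NA for area_sqft, which matches the missing value shown in the second row of the cleaned output.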
The data_filter() function allows you to filter the cleaned data to find rentals that meet your specifications. Its inputs are the tibble generated by data_cleaner(), numeric values for the minimum price, maximum price, minimum square footage, and minimum number of bedrooms, and a string giving the city of the desired rentals. It outputs a tibble with the matching results.
filtered_data <- data_filter(cleaned_data,
                             min_price = 1000,
                             max_price = 2000,
                             sqrt_ft = 500,
                             num_bedroom_input = 1,
                             city_input = 'Vancouver')
filtered_data
#> # A tibble: 44 x 5
#> listing_url price num_bedroom area_sqft city
#> <chr> <int> <int> <int> <chr>
#> 1 https://vancouver.craigslist.org/van/apa/~ 1850 1 500 vanco~
#> 2 https://vancouver.craigslist.org/van/apa/~ 1675 1 500 <NA>
#> 3 https://vancouver.craigslist.org/van/apa/~ 1700 3 NA vanco~
#> 4 https://vancouver.craigslist.org/van/apa/~ 1575 1 500 <NA>
#> 5 https://vancouver.craigslist.org/van/apa/~ 1550 1 500 <NA>
#> 6 https://vancouver.craigslist.org/van/apa/~ 2000 2 850 vanco~
#> 7 https://vancouver.craigslist.org/bnc/apa/~ 1450 1 900 <NA>
#> 8 https://vancouver.craigslist.org/rds/apa/~ 1500 2 NA <NA>
#> 9 https://vancouver.craigslist.org/van/apa/~ 1650 1 NA vanco~
#> 10 https://vancouver.craigslist.org/nvn/apa/~ 1750 1 505 vanco~
#> # ... with 34 more rows
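The output above includes rows whose area or city is NA, which suggests that listings with missing values are kept rather than dropped. The following base R sketch reproduces that kind of filter on a small hypothetical data frame. It is an inference from the printed results, not the implementation of data_filter(), and the sample rows are invented for illustration.

```r
# Hypothetical cleaned listings, shaped like the data_cleaner() output
listings <- data.frame(
  price = c(1850L, 3695L, 1700L, 900L),
  num_bedroom = c(1L, 2L, 3L, 1L),
  area_sqft = c(500L, 900L, NA, 450L),
  city = c("vancouver", "vancouver", NA, "surrey"),
  stringsAsFactors = FALSE
)

# Keep rows inside the price range with enough bedrooms.
# Assumption: a missing area or city does not exclude a listing.
keep <- listings$price >= 1000 &
  listings$price <= 2000 &
  listings$num_bedroom >= 1 &
  (is.na(listings$area_sqft) | listings$area_sqft >= 500) &
  (is.na(listings$city) | listings$city == tolower("Vancouver"))

matches <- listings[keep, ]
```

Here the first and third rows are kept: the second is over the maximum price, and the fourth is under the minimum price.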
At this stage, you can choose to email your filtered results as a .csv file with send_email(). You will need to supply the email address you wish to send the results to and the filtered tibble. You can also set the optional email_subject argument to change the subject line. If the function runs without error, you should receive an email from pyhousehunter@gmail.com in the recipient's inbox.
send_email(email_recipient = "elabandari@gmail.com",
           filtered_data = filtered_data,
           email_subject = 'Results from RHouseHunter')
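If you would also like a local copy of the results, a data frame can be written to a .csv file with base R. This is a minimal sketch that is independent of send_email(): the sample data frame is hypothetical, and the file goes to a temporary path chosen by tempfile().

```r
# Hypothetical filtered results, for illustration only
results <- data.frame(
  price = c(1850L, 1675L),
  num_bedroom = c(1L, 1L)
)

# Write to a temporary .csv file without row names
csv_path <- tempfile(fileext = ".csv")
write.csv(results, csv_path, row.names = FALSE)

# Read the file back to confirm the contents survived the round trip
round_trip <- read.csv(csv_path)
```

Replacing csv_path with a path of your choice keeps the file after the R session ends.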
We hope rhousehunter makes your house hunting process easier.