R/nba_scraper.R
nba_scraper.Rd
This function scrapes tabular data from the ESPN NBA website using RSelenium and returns the scraped data as a tibble. Users can specify the season year and season type. Users should also specify the port for the Selenium driver; it should match the port used when setting up the Docker container driver (if the user is using Docker as per the "rsketball" package repo instructions). By default, the function does not write a csv file; it writes one only when a string is supplied for "csv_path".
nba_scraper( season_year = 2018, season_type = "regular", port = 4445L, csv_path = NULL )
Argument | Description |
---|---|
season_year | int from 2001 to 2019 (upper limit based on latest year) |
season_type | string. Either "regular" or "postseason". |
port | int with L suffix. Must not be negative. Should be same as port configuration used for Docker container driver setup. |
csv_path | string for csv file. Defaults to NULL. If specified, must end with ".csv". |
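The argument rules above can be sketched as a small validation helper. This is only an illustration of the documented constraints, not the package's actual checks: `validate_nba_args` is a hypothetical name, and the error messages are assumptions.

```r
# Hypothetical helper (not part of rsketball): checks inputs against the
# constraints listed in the arguments table and stops with a message otherwise.
validate_nba_args <- function(season_year = 2018, season_type = "regular",
                              port = 4445L, csv_path = NULL) {
  # season_year: a single whole number between 2001 and 2019
  if (!is.numeric(season_year) || length(season_year) != 1 ||
      is.na(season_year) || season_year != as.integer(season_year) ||
      season_year < 2001 || season_year > 2019) {
    stop("season_year must be a whole number from 2001 to 2019")
  }
  # season_type: exactly "regular" or "postseason"
  if (!is.character(season_type) || length(season_type) != 1 ||
      !(season_type %in% c("regular", "postseason"))) {
    stop("season_type must be \"regular\" or \"postseason\"")
  }
  # port: an integer (written with the L suffix) that is not negative
  if (!is.integer(port) || length(port) != 1 || is.na(port) || port < 0L) {
    stop("port must be a non-negative integer, e.g. 4445L")
  }
  # csv_path: NULL, or a single string ending in ".csv"
  if (!is.null(csv_path)) {
    if (!is.character(csv_path) || length(csv_path) != 1 ||
        !grepl("\\.csv$", csv_path)) {
      stop("csv_path must be a string ending with \".csv\"")
    }
  }
  invisible(TRUE)
}
```

Calling `validate_nba_args(2017, "postseason", 4445L, "nba_2017_playoffs.csv")` passes silently, while a port written without the L suffix (for example `4445`) is rejected because it is a double rather than an integer.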
A tibble of scraped ESPN NBA data
For detailed use cases, please refer to the vignette: https://ubc-mds.github.io/rsketball/articles/rsketball-vignette.html
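To show why the port must match the Docker container setup, here is a minimal sketch of opening a Selenium session with RSelenium on that port. It assumes RSelenium is installed and a Selenium Chrome server is already listening on localhost; `open_espn_session` is a hypothetical helper and the default URL is a placeholder assumption, not necessarily the page that `nba_scraper()` visits.

```r
# Sketch only: assumes a Selenium server (e.g. a Docker container) is
# listening on localhost at `port`. Not the package's internal code.
open_espn_session <- function(port = 4445L,
                              url = "https://www.espn.com/nba/stats") {
  # The port here must equal the host port mapped to the Selenium container.
  remote <- RSelenium::remoteDriver(
    remoteServerAddr = "localhost",
    port = port,
    browserName = "chrome"
  )
  # silent = TRUE suppresses the session capabilities that open() prints.
  remote$open(silent = TRUE)
  remote$navigate(url)
  remote
}
```

A caller would typically finish with `session$close()` once scraping is done. The session details printed in the examples on this page appear to be the capabilities that `open()` reports when it is not run silently.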
# \donttest{
# Scrape regular season 2018/19 without saving to a csv file
nba_2018 <- nba_scraper(2018, season_type = "regular", port = 4445L)
#> [1] "Scraping commencing. Please wait!"
#> [1] "Connecting to remote server"
#> $acceptInsecureCerts
#> [1] FALSE
#>
#> $browserName
#> [1] "chrome"
#>
#> $browserVersion
#> [1] "80.0.3987.106"
#>
#> $chrome
#> $chrome$chromedriverVersion
#> [1] "80.0.3987.106 (f68069574609230cf9b635cd784cfb1bf81bb53a-refs/branch-heads/3987@{#882})"
#>
#> $chrome$userDataDir
#> [1] "/tmp/.com.google.Chrome.ZMaT4W"
#>
#>
#> $`goog:chromeOptions`
#> $`goog:chromeOptions`$debuggerAddress
#> [1] "localhost:36929"
#>
#>
#> $networkConnectionEnabled
#> [1] FALSE
#>
#> $pageLoadStrategy
#> [1] "normal"
#>
#> $platformName
#> [1] "linux"
#>
#> $proxy
#> named list()
#>
#> $setWindowRect
#> [1] TRUE
#>
#> $strictFileInteractability
#> [1] FALSE
#>
#> $timeouts
#> $timeouts$implicit
#> [1] 0
#>
#> $timeouts$pageLoad
#> [1] 300000
#>
#> $timeouts$script
#> [1] 30000
#>
#>
#> $unhandledPromptBehavior
#> [1] "dismiss and notify"
#>
#> $webdriver.remote.sessionid
#> [1] "4463b32b1093d87ef6e59119e7523367"
#>
#> $id
#> [1] "4463b32b1093d87ef6e59119e7523367"
#>
#> [1] "Data scraping of 2018 regular season completed."

# Scrape playoffs season 2017/18 while saving to a local csv file.
nba_2017 <- nba_scraper(2017, season_type = "postseason", port = 4445L,
                        csv_path = "nba_2017_playoffs.csv")
#> [1] "Scraping commencing. Please wait!"
#> [1] "Connecting to remote server"
#> $acceptInsecureCerts
#> [1] FALSE
#>
#> $browserName
#> [1] "chrome"
#>
#> $browserVersion
#> [1] "80.0.3987.106"
#>
#> $chrome
#> $chrome$chromedriverVersion
#> [1] "80.0.3987.106 (f68069574609230cf9b635cd784cfb1bf81bb53a-refs/branch-heads/3987@{#882})"
#>
#> $chrome$userDataDir
#> [1] "/tmp/.com.google.Chrome.Lyq2CJ"
#>
#>
#> $`goog:chromeOptions`
#> $`goog:chromeOptions`$debuggerAddress
#> [1] "localhost:43195"
#>
#>
#> $networkConnectionEnabled
#> [1] FALSE
#>
#> $pageLoadStrategy
#> [1] "normal"
#>
#> $platformName
#> [1] "linux"
#>
#> $proxy
#> named list()
#>
#> $setWindowRect
#> [1] TRUE
#>
#> $strictFileInteractability
#> [1] FALSE
#>
#> $timeouts
#> $timeouts$implicit
#> [1] 0
#>
#> $timeouts$pageLoad
#> [1] 300000
#>
#> $timeouts$script
#> [1] 30000
#>
#>
#> $unhandledPromptBehavior
#> [1] "dismiss and notify"
#>
#> $webdriver.remote.sessionid
#> [1] "4df99c49f160c00ec54c5239bff260ef"
#>
#> $id
#> [1] "4df99c49f160c00ec54c5239bff260ef"
#>
#> [1] "Data scraping of 2017 postseason season completed."
# }