First, let's retrieve a GDP-related data set from the Open Government Portal using `gdpimporterr`. The output of `gdpimporterr` is a list of two elements: the first is a data frame that can be used for downstream data wrangling and analysis, and the second is a character vector containing the table title taken from the metadata.
# Use gdpimporterr to download and import data
raw_data <- gdpimporterr("https://www150.statcan.gc.ca/n1/tbl/csv/36100400-eng.zip")
knitr::kable(head(raw_data[[1]]))
REF_DATE | GEO | DGUID | North American Industry Classification System (NAICS) | UOM | UOM_ID | SCALAR_FACTOR | SCALAR_ID | VECTOR | COORDINATE | VALUE | STATUS | SYMBOL | TERMINATED | DECIMALS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1997 | Newfoundland and Labrador | 2016A000210 | All industries [T001] | Percentage share | 246 | units | 0 | v54255922 | 1.10 | 100.00 | NA | NA | NA | 2 |
1997 | Newfoundland and Labrador | 2016A000210 | Goods-producing industries [T002] | Percentage share | 246 | units | 0 | v62356022 | 1.20 | 24.66 | NA | NA | NA | 2 |
1997 | Newfoundland and Labrador | 2016A000210 | Service-producing industries [T003] | Percentage share | 246 | units | 0 | v62356023 | 1.21 | 75.34 | NA | NA | NA | 2 |
1997 | Newfoundland and Labrador | 2016A000210 | Industrial production [T010] | Percentage share | 246 | units | 0 | v62356024 | 1.22 | 14.46 | NA | NA | NA | 2 |
1997 | Newfoundland and Labrador | 2016A000210 | Information and communication technology sector [T013] | Percentage share | 246 | units | 0 | v62356025 | 1.23 | 3.50 | NA | NA | NA | 2 |
1997 | Newfoundland and Labrador | 2016A000210 | Energy sector [T016] | Percentage share | 246 | units | 0 | v62356026 | 1.24 | 4.60 | NA | NA | NA | 2 |
raw_data[[2]]
#> [1] "Gross domestic product (GDP) at basic prices, by industry, provinces and territories, percentage share"
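Because the result is a two-element list, it can help to give each element a descriptive name before moving on. The following is a minimal sketch that assumes only the list structure described above. It uses a small hand-built stand-in (`demo_raw`, a hypothetical object) instead of a live download, and the names `demo_table` and `demo_title` are illustrative.

```r
# Hypothetical stand-in mimicking the two-element structure described above:
# element 1 is a data frame, element 2 is a character vector with the title.
demo_raw <- list(
  data.frame(
    REF_DATE = c(1997, 1997),
    GEO = c("Newfoundland and Labrador", "Newfoundland and Labrador"),
    VALUE = c(100.00, 24.66),
    stringsAsFactors = FALSE
  ),
  "Gross domestic product (GDP) at basic prices, by industry, provinces and territories, percentage share"
)

# Give each element a descriptive name for the later steps
demo_table <- demo_raw[[1]]
demo_title <- demo_raw[[2]]

# Basic sanity checks before any wrangling
stopifnot(is.data.frame(demo_table), is.character(demo_title))
str(demo_table)
```

With a real download, the same two assignments would apply to `raw_data[[1]]` and `raw_data[[2]]`.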
Next, `gdpcleanerr` renames the columns and drops those that are not needed for analysis, preparing the data for summary statistics (`gdpdescriberr`) and visualization (`gdpplotterr`).
# Use gdpcleanerr to clean raw data
clean_data <- gdpcleanerr(raw_data[[1]])
knitr::kable(head(clean_data))
Date | Location | NAICS_Class | Unit | Scale | Value |
---|---|---|---|---|---|
1997 | Newfoundland and Labrador | All industries [T001] | Percentage share | units | 100.00 |
1997 | Newfoundland and Labrador | Goods-producing industries [T002] | Percentage share | units | 24.66 |
1997 | Newfoundland and Labrador | Service-producing industries [T003] | Percentage share | units | 75.34 |
1997 | Newfoundland and Labrador | Industrial production [T010] | Percentage share | units | 14.46 |
1997 | Newfoundland and Labrador | Information and communication technology sector [T013] | Percentage share | units | 3.50 |
1997 | Newfoundland and Labrador | Energy sector [T016] | Percentage share | units | 4.60 |
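Comparing the raw and cleaned tables above suggests the column mapping `REF_DATE` to `Date`, `GEO` to `Location`, the NAICS column to `NAICS_Class`, `UOM` to `Unit`, `SCALAR_FACTOR` to `Scale`, and `VALUE` to `Value`. A rough base R equivalent might look like the sketch below. It is not the implementation of `gdpcleanerr`, only an illustration of the select-and-rename step it appears to perform; `demo_raw_table`, `column_map`, and `demo_clean` are hypothetical names.

```r
# Two raw rows copied from the first table above (subset of columns)
demo_raw_table <- data.frame(
  REF_DATE = c(1997, 1997),
  GEO = c("Newfoundland and Labrador", "Newfoundland and Labrador"),
  DGUID = c("2016A000210", "2016A000210"),
  `North American Industry Classification System (NAICS)` =
    c("All industries [T001]", "Goods-producing industries [T002]"),
  UOM = c("Percentage share", "Percentage share"),
  SCALAR_FACTOR = c("units", "units"),
  VALUE = c(100.00, 24.66),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Assumed mapping from raw column names to cleaned column names,
# inferred from the two tables shown above
column_map <- c(
  REF_DATE = "Date",
  GEO = "Location",
  "North American Industry Classification System (NAICS)" = "NAICS_Class",
  UOM = "Unit",
  SCALAR_FACTOR = "Scale",
  VALUE = "Value"
)

# Keep only the mapped columns, then rename them
demo_clean <- demo_raw_table[, names(column_map)]
names(demo_clean) <- unname(column_map)

print(demo_clean)
```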
`gdpdescriberr` produces a customized table of summary statistics in a compact, readable format.
# Use gdpdescriberr to produce basic summary statistics
stats <- gdpdescriberr(clean_data, Value, Location, .stats = c("mean", "sd", "max"), dec = 3)
knitr::kable(stats)
V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | V11 | V12 | V13 | V14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Location | Alberta | British Columbia | Manitoba | New Brunswick | Newfoundland and Labrador | Northwest Territories | Nova Scotia | Nunavut | Ontario | Prince Edward Island | Quebec | Saskatchewan | Yukon |
mean | 13.935 | 12.503 | 12.569 | 12.534 | 13.910 | 13.347 | 12.406 | 12.364 | 12.597 | 12.175 | 12.699 | 13.532 | 12.243 |
sd | 22.316 | 22.571 | 22.357 | 22.466 | 22.830 | 22.914 | 22.912 | 23.101 | 22.646 | 22.731 | 22.329 | 22.029 | 23.353 |
max | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
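To make the layout of this table easier to interpret, here is a minimal base R sketch of a grouped summary with statistics as rows and locations as columns. It assumes a cleaned data frame with `Location` and `Value` columns, uses made-up values purely for illustration, and is not how `gdpdescriberr` is implemented internally.

```r
# Made-up illustration values (not taken from the actual data set)
demo_clean <- data.frame(
  Location = c("Alberta", "Alberta", "Ontario", "Ontario"),
  Value = c(100, 20, 100, 30),
  stringsAsFactors = FALSE
)

# Split the values by location, then compute each statistic per group,
# rounding to 3 decimals as in the dec = 3 call above
by_location <- split(demo_clean$Value, demo_clean$Location)
demo_stats <- sapply(by_location, function(x) {
  round(c(mean = mean(x), sd = sd(x), max = max(x)), 3)
})

# Rows are statistics, columns are locations
print(demo_stats)
```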
Finally, `gdpplotterr` produces a line plot of GDP values across provinces.
# Use gdpplotterr to produce a plot
gdpplotterr(clean_data)
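For readers who want to see what such a plot involves, the sketch below draws one line per location over time with base R graphics. It assumes a cleaned data frame with `Date`, `Location`, and `Value` columns and uses made-up values; the actual appearance and options of `gdpplotterr` may differ, and the names `demo_clean` and `wide` are hypothetical.

```r
# Made-up illustration values (not taken from the actual data set)
demo_clean <- data.frame(
  Date = rep(c(1997, 1998, 1999), times = 2),
  Location = rep(c("Alberta", "Ontario"), each = 3),
  Value = c(10, 12, 11, 20, 19, 21),
  stringsAsFactors = FALSE
)

# Reshape to a matrix with one row per date and one column per location
wide <- tapply(
  demo_clean$Value,
  list(demo_clean$Date, demo_clean$Location),
  mean
)

# Draw one line per location
line_cols <- seq_len(ncol(wide))
matplot(
  as.numeric(rownames(wide)), wide,
  type = "l", lty = 1, col = line_cols,
  xlab = "Date", ylab = "Value"
)
legend("topleft", legend = colnames(wide), lty = 1, col = line_cols)
```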