scaler Apply standard scaling to numerical features.
scaler(X_train, X_validation, X_test, colnames)
| Argument | Type | Description |
|---|---|---|
| x_train | data.frame | Dataframe of the train set containing the columns to be scaled. |
| x_valid | data.frame | Dataframe of the validation set containing the columns to be scaled. |
| x_test | data.frame | Dataframe of the test set containing the columns to be scaled. |
| colnames | vector | Vector of column names for the numeric features. |
list of data.frame A list of three dataframes holding x_train, x_valid and x_test separately, in that order: the first element is x_train, the second is x_valid and the third is x_test.
x_train <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(34, 35, 56), usage = c(4, 6, 9))
x_validation <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(29, 65, 13), usage = c(5, 27, 10))
x_test <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(20, 35, 18), usage = c(9, 6, 0))
colnames <- c('counts', 'usage')
X_train = scaler(x_train, x_validation, x_test, colnames)$X_train
X_test = scaler(x_train, x_validation, x_test, colnames)$X_test
X_validation = scaler(x_train, x_validation, x_test, colnames)$X_validation
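For intuition, standard scaling can be sketched in base R roughly as follows. This is a minimal illustration, not the package's implementation: the helper name standard_scale_sketch is hypothetical, and it assumes that means and standard deviations are estimated on the train set only and then applied to all three sets, which this page does not confirm. The list element names follow the accessors used in the example above.

```r
# Hypothetical sketch of standard scaling; NOT the package's actual code.
# Assumption: statistics are fitted on the train set and reused for the
# validation and test sets.
standard_scale_sketch <- function(x_train, x_valid, x_test, colnames) {
  # Per-column mean and sample standard deviation from the train set
  centers <- sapply(x_train[colnames], mean)
  scales <- sapply(x_train[colnames], sd)

  # Subtract the train mean and divide by the train sd, column by column;
  # columns not listed in colnames are left untouched
  apply_scaling <- function(df) {
    for (col in colnames) {
      df[[col]] <- (df[[col]] - centers[[col]]) / scales[[col]]
    }
    df
  }

  list(
    X_train = apply_scaling(x_train),
    X_validation = apply_scaling(x_valid),
    X_test = apply_scaling(x_test)
  )
}

x_train <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(34, 35, 56), usage = c(4, 6, 9))
x_validation <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(29, 65, 13), usage = c(5, 27, 10))
x_test <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(20, 35, 18), usage = c(9, 6, 0))

# Call once and pull each dataframe out of the returned list
scaled <- standard_scale_sketch(x_train, x_validation, x_test, c('counts', 'usage'))
X_train <- scaled$X_train
X_validation <- scaled$X_validation
X_test <- scaled$X_test
```

Under this assumption, the scaled train columns have mean 0 and standard deviation 1, while the validation and test columns generally do not, because they are transformed with the train statistics.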