scaler Apply standard scaling to numerical features.

scaler(X_train, X_validation, X_test, colnames)

Arguments

colnames

vector Vector of column names of the numeric features to be scaled.

x_train

data.frame Dataframe of train set containing columns to be scaled.

x_valid

data.frame Dataframe of validation set containing columns to be scaled.

x_test

data.frame Dataframe of test set containing columns to be scaled.

Value

list of data.frame A list of three data frames: the scaled x_train first, the scaled x_valid second and the scaled x_test third.

Examples

x_train <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(34, 35, 56), usage = c(4, 6, 9))
x_validation <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(29, 65, 13), usage = c(5, 27, 10))
x_test <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(20, 35, 18), usage = c(9, 6, 0))
colnames <- c('counts', 'usage')
X_train = scaler(x_train, x_validation, x_test, colnames)$X_train
X_test = scaler(x_train, x_validation, x_test, colnames)$X_test
X_validation = scaler(x_train, x_validation, x_test, colnames)$X_validation
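
For readers who want to see what standard scaling involves, the following is a minimal sketch in base R and not the package's actual implementation. The function name scaler_sketch is hypothetical. It assumes that the mean and standard deviation are estimated on the training set only and then applied to all three sets, which this page does not state. The list element names X_train, X_validation and X_test mirror those used in the examples above.

```r
# Hypothetical stand-in for scaler(); a sketch, not the package's code.
scaler_sketch <- function(x_train, x_valid, x_test, colnames) {
  # Assumption: centre and spread are estimated on the training set only.
  # Note that sd() uses the n - 1 denominator (sample standard deviation).
  mu <- sapply(x_train[colnames], mean)
  sigma <- sapply(x_train[colnames], sd)

  # Standardize the selected columns of one data frame,
  # leaving all other columns (e.g. 'colors') untouched.
  scale_cols <- function(df) {
    for (col in colnames) {
      df[[col]] <- (df[[col]] - mu[[col]]) / sigma[[col]]
    }
    df
  }

  list(
    X_train = scale_cols(x_train),
    X_validation = scale_cols(x_valid),
    X_test = scale_cols(x_test)
  )
}

x_train <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(34, 35, 56), usage = c(4, 6, 9))
x_validation <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(29, 65, 13), usage = c(5, 27, 10))
x_test <- data.frame(colors = c('Blue', 'Red', 'Green'), counts = c(20, 35, 18), usage = c(9, 6, 0))
colnames <- c('counts', 'usage')

# Call once and pull the three data frames out of the returned list.
scaled <- scaler_sketch(x_train, x_validation, x_test, colnames)
X_train <- scaled$X_train
X_validation <- scaled$X_validation
X_test <- scaled$X_test
```

Under these assumptions the scaled training columns have mean 0 and standard deviation 1, while the validation and test columns generally do not, because they are scaled with the training statistics. Calling the function once and indexing the result avoids repeating the same computation three times.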