Replace missing values in data frame columns using the specified methods. Separate methods can be applied to numerical columns and to categorical columns.
fill_missing(x_train, x_test, column_list, num_imp, cat_imp)
Argument | Description |
---|---|
x_train | training set dataframe to be transformed |
x_test | test set dataframe to be transformed |
column_list | named list of two character vectors of column names; the elements must be named 'numeric' and 'categorical' |
num_imp | method for numerical imputation, options are "mean" and "median" |
cat_imp | method for categorical imputation, only option is "mode" |
A named list with two data frames: "x_train", the training set with missing values filled, and "x_test", the test set with missing values filled.
x_tr <- data.frame('x' = c(2.5, 3.3, NA), 'y' = c(1, NA, 1))
x_test <- data.frame('x' = c(NA), 'y' = c(NA))
fill_missing(x_tr, x_test, list("numeric" = c('x'), "categorical" = c('y')), 'mean', 'mode')
#> $x_train
#>     x y
#> 1 2.5 1
#> 2 3.3 1
#> 3 2.9 1
#>
#> $x_test
#>     x y
#> 1 2.9 1
#>
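
To make the behaviour in the example above more concrete, here is a minimal sketch of how mean/median and mode imputation could be written in base R. It is not the package's actual implementation: `impute_sketch` and `mode_value` are hypothetical names introduced only for illustration. The sketch assumes that fill values are computed from the training set and reused for the test set, which is consistent with the example (the test set has no observed values, yet `x` is filled with 2.9), and that ties for the mode go to the value seen first.

```r
# Hypothetical helper: most frequent non-missing value of a vector.
# Assumption: ties are broken in favour of the value that appears first.
mode_value <- function(v) {
  v <- v[!is.na(v)]
  uniq <- unique(v)
  uniq[which.max(tabulate(match(v, uniq)))]
}

# Hypothetical sketch of the imputation logic, not the package's own code.
# Assumption: fill values come from x_train and are applied to both sets.
impute_sketch <- function(x_train, x_test, column_list,
                          num_imp = "mean", cat_imp = "mode") {
  stopifnot(all(c("numeric", "categorical") %in% names(column_list)))
  stopifnot(num_imp %in% c("mean", "median"), cat_imp == "mode")

  num_fun <- if (num_imp == "mean") mean else stats::median

  # Numerical columns: fill with the training mean or median.
  for (col in column_list$numeric) {
    fill <- num_fun(x_train[[col]], na.rm = TRUE)
    x_train[[col]][is.na(x_train[[col]])] <- fill
    x_test[[col]][is.na(x_test[[col]])] <- fill
  }

  # Categorical columns: fill with the training mode.
  for (col in column_list$categorical) {
    fill <- mode_value(x_train[[col]])
    x_train[[col]][is.na(x_train[[col]])] <- fill
    x_test[[col]][is.na(x_test[[col]])] <- fill
  }

  list(x_train = x_train, x_test = x_test)
}

# Usage with the same data as the example above.
x_tr <- data.frame(x = c(2.5, 3.3, NA), y = c(1, NA, 1))
x_te <- data.frame(x = NA, y = NA)
res <- impute_sketch(x_tr, x_te, list(numeric = "x", categorical = "y"))
```

Taking the fill values from the training set alone keeps information from the test set out of the preprocessing step, which is the usual reason for passing both data frames to a single function.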