Replaces missing values in data frame columns using the specified imputation methods. Separate methods can be applied to numerical and categorical columns.

fill_missing(x_train, x_test, column_list, num_imp, cat_imp)

Arguments

x_train

training set dataframe to be transformed

x_test

test set dataframe to be transformed

column_list

named list with two character vectors of column names; the elements must be named 'numeric' and 'categorical'.

num_imp

method for numerical imputation; options are "mean" and "median"

cat_imp

method for categorical imputation, only option is "mode"

Value

named list with two data frames: "x_train", the training set with missing values filled, and "x_test", the test set with missing values filled

Examples

x_tr <- data.frame('x' = c(2.5, 3.3, NA), 'y' = c(1, NA, 1))
x_test <- data.frame('x' = c(NA), 'y' = c(NA))
fill_missing(x_tr, x_test, list("numeric" = c('x'), "categorical" = c('y')), 'mean', 'mode')
#> $x_train
#>     x y
#> 1 2.5 1
#> 2 3.3 1
#> 3 2.9 1
#>
#> $x_test
#>     x y
#> 1 2.9 1
#>
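In the example above, the missing test value of x is filled with 2.9, the mean of the training column, which suggests that fill values are computed from the training set and then applied to both data frames. The sketch below illustrates that idea in base R. It is not the package's source code: the names fill_missing_sketch and mode_value are hypothetical, and the assumption that statistics come only from x_train is inferred from the example output rather than documented.

```r
# Hypothetical helper: most frequent non-missing value of a vector.
mode_value <- function(v) {
  v <- v[!is.na(v)]
  uniq <- unique(v)
  uniq[which.max(tabulate(match(v, uniq)))]
}

# Hypothetical sketch, not the package's implementation.
# Assumption: fill values are computed on the training set only
# and reused for the test set.
fill_missing_sketch <- function(x_train, x_test, column_list,
                                num_imp = "mean", cat_imp = "mode") {
  # Pick the numerical summary function; reject unknown options.
  num_fun <- switch(num_imp,
                    mean = function(v) mean(v, na.rm = TRUE),
                    median = function(v) median(v, na.rm = TRUE),
                    stop("num_imp must be 'mean' or 'median'"))
  if (cat_imp != "mode") stop("cat_imp must be 'mode'")

  # Numerical columns: fill NAs with the training-set mean or median.
  for (col in column_list$numeric) {
    fill <- num_fun(x_train[[col]])
    x_train[[col]][is.na(x_train[[col]])] <- fill
    x_test[[col]][is.na(x_test[[col]])] <- fill
  }

  # Categorical columns: fill NAs with the training-set mode.
  for (col in column_list$categorical) {
    fill <- mode_value(x_train[[col]])
    x_train[[col]][is.na(x_train[[col]])] <- fill
    x_test[[col]][is.na(x_test[[col]])] <- fill
  }

  list(x_train = x_train, x_test = x_test)
}
```

Computing the fill values on the training set alone keeps information from the test set out of the preprocessing step, which is the usual reason for passing both data frames to one function.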