Handles missing values in a dataframe

Replaces missing values in the specified data frame columns. Numerical columns and categorical columns can each be imputed with a different method.

fill_missing(x_train, x_test, column_list, num_imp, cat_imp)

Arguments

x_train	training set dataframe to be transformed
x_test	test set dataframe to be transformed
column_list	named list of two character vectors of column names; the elements must be named 'numeric' and 'categorical'.
num_imp	method for numerical imputation, options are "mean" and "median"
cat_imp	method for categorical imputation, only option is "mode"
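
As a small illustration of the column_list shape described above, the list can be built from the column types of a data frame. This is only a sketch: the data frame `df` and the rule "every non-numeric column is categorical" are assumptions made for the example, not something this page prescribes.

```r
# Hypothetical input: one numeric column and one character column.
df <- data.frame(x = c(2.5, 3.3, NA), y = c("a", NA, "a"))

# TRUE for numeric columns, FALSE otherwise.
is_num <- vapply(df, is.numeric, logical(1))

# Assumption: every non-numeric column is treated as categorical.
column_list <- list(
  numeric = names(df)[is_num],
  categorical = names(df)[!is_num]
)
```

A numeric column can still be listed under 'categorical' by hand, as the example on this page does with column 'y'.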

Value

named list with two data frames: "x_train", the training set with missing values filled, and "x_test", the test set with missing values filled

Examples

x_tr <- data.frame('x' = c(2.5, 3.3, NA), 'y' = c(1, NA, 1))
x_test <- data.frame('x' = c(NA), 'y' = c(NA))
fill_missing(x_tr, x_test, list("numeric" = c('x'),
 "categorical" = c('y')), 'mean', 'mode')
#> $x_train
#>     x y
#> 1 2.5 1
#> 2 3.3 1
#> 3 2.9 1
#> 
#> $x_test
#>     x y
#> 1 2.9 1
#>
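
In the output above, the test set is filled with 2.9 and 1, which are the mean and mode of the training columns, so the fill values appear to be computed from x_train and reused for x_test. The following is a minimal sketch of that behaviour, not the package's actual implementation; `fill_missing_sketch` and `mode_value` are hypothetical names introduced here for illustration.

```r
# Hypothetical helper: most frequent non-missing value of a vector.
mode_value <- function(v) {
  v <- v[!is.na(v)]
  uniq <- unique(v)
  uniq[which.max(tabulate(match(v, uniq)))]
}

# Illustrative sketch only; assumes fill values come from the training set.
fill_missing_sketch <- function(x_train, x_test, column_list,
                                num_imp = "mean", cat_imp = "mode") {
  stopifnot(all(c("numeric", "categorical") %in% names(column_list)))

  # Pick the numerical statistic; anything else is rejected.
  num_fun <- switch(num_imp,
    mean = function(v) mean(v, na.rm = TRUE),
    median = function(v) median(v, na.rm = TRUE),
    stop("num_imp must be 'mean' or 'median'")
  )
  if (cat_imp != "mode") stop("cat_imp must be 'mode'")

  for (col in column_list$numeric) {
    fill <- num_fun(x_train[[col]])  # statistic from the training set only
    x_train[[col]][is.na(x_train[[col]])] <- fill
    x_test[[col]][is.na(x_test[[col]])] <- fill
  }
  for (col in column_list$categorical) {
    fill <- mode_value(x_train[[col]])  # most frequent training value
    x_train[[col]][is.na(x_train[[col]])] <- fill
    x_test[[col]][is.na(x_test[[col]])] <- fill
  }

  list(x_train = x_train, x_test = x_test)
}
```

The sketch does not handle a training column that is entirely NA; a real implementation would need to decide what to do in that case.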
