rb4model is an R package that makes data pre-processing fast and easy. It provides four separate functions that give users flexibility in handling many different types of datasets, whether found in the wild or collected by the users themselves. With rb4model, users can pre-process their data and have it ready for the machine learning model of their choice.
Here, we will replace missing values in the airquality dataset with the mean.
head(airquality)
#> Ozone Solar.R Wind Temp Month Day
#> 1 41 190 7.4 67 5 1
#> 2 36 118 8.0 72 5 2
#> 3 12 149 12.6 74 5 3
#> 4 18 313 11.5 62 5 4
#> 5 NA NA 14.3 56 5 5
#> 6 28 NA 14.9 66 5 6
head(missing_val(airquality, 'mean'))
#> Ozone Solar.R Wind Temp Month Day
#> 1 41.00000 190.0000 7.4 67 5 1
#> 2 36.00000 118.0000 8.0 72 5 2
#> 3 12.00000 149.0000 12.6 74 5 3
#> 4 18.00000 313.0000 11.5 62 5 4
#> 5 42.12931 185.9315 14.3 56 5 5
#> 6 28.00000 185.9315 14.9 66 5 6
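For readers who want to see what mean imputation does, here is a minimal base R sketch of the same idea. The helper `impute_mean` is hypothetical and is not part of rb4model. It assumes that each numeric column's missing values are filled with the mean of that column's observed values, which is consistent with the output above.

```r
# Hypothetical helper (not part of rb4model): replace NAs in each numeric
# column with the mean of that column's observed (non-missing) values.
impute_mean <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      missing <- is.na(df[[col]])
      df[[col]][missing] <- mean(df[[col]], na.rm = TRUE)
    }
  }
  df
}

filled <- impute_mean(airquality)
head(filled)
```

Integer columns such as Ozone become doubles after imputation, which is why the imputed columns in the output above print with decimals.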
Here, we will split the mtcars dataset into numerical and categorical features.
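The call for this step is not shown here, so the following is only a base R sketch of the idea. The helper `split_by_type` is hypothetical and is not the rb4model function. Because every column of mtcars is stored as numeric, the sketch first converts a few coded columns to factors; that conversion is an assumption made purely for illustration.

```r
# Hypothetical helper (not the rb4model function): split a data frame into
# its numeric columns and its non-numeric (categorical) columns.
split_by_type <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  list(
    numeric = df[, is_num, drop = FALSE],
    categorical = df[, !is_num, drop = FALSE]
  )
}

# Assumption for illustration: treat these coded columns as categorical.
cars <- mtcars
for (col in c("cyl", "vs", "am", "gear")) {
  cars[[col]] <- factor(cars[[col]])
}

parts <- split_by_type(cars)
names(parts$numeric)
names(parts$categorical)
```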
Here, we will fit a general linear model to the iris dataset and return its root mean squared error.
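The call for this step is not shown here either. As a rough sketch of what it computes, the base R code below fits a linear model with `lm()` and calculates the root mean squared error by hand. The choice of Sepal.Length as the response and the use of training-set residuals are assumptions, not the package's documented behaviour.

```r
# Assumption: predict Sepal.Length from the other three measurements.
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)

# Root mean squared error of the fitted values on the training data.
rmse <- sqrt(mean(residuals(fit)^2))
rmse
```

An error measured on the training data is optimistic; a held-out split would give a fairer estimate.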
Here, we will perform forward feature selection on the iris dataset.
y <- iris$Species
x <- iris[c(1,2,3,4)]
ffs <- ForwardSelection(feature=x, label=y, my_mod="rf")
#> note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
#>
#> note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
#>
#> note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
head(x[ffs])
#> Sepal.Width
#> 1 3.5
#> 2 3.0
#> 3 3.2
#> 4 3.1
#> 5 3.6
#> 6 3.9
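To illustrate the general technique, here is a small sketch of greedy forward selection. It is not the internals of `ForwardSelection()`. It swaps in linear discriminant analysis from the MASS package in place of the random forest used above, and scores each candidate feature set by leave-one-out accuracy. The helpers `loo_accuracy` and `forward_select` are hypothetical names.

```r
library(MASS)  # recommended package shipped with R; provides lda()

# Hypothetical scoring function: leave-one-out accuracy of LDA on `cols`.
loo_accuracy <- function(x, y, cols) {
  pred <- lda(as.matrix(x[, cols, drop = FALSE]), grouping = y, CV = TRUE)$class
  mean(pred == y)
}

# Hypothetical greedy forward selection: repeatedly add the feature that
# gives the best score, and stop when no candidate improves on it.
forward_select <- function(x, y) {
  selected <- character(0)
  best <- 0
  repeat {
    candidates <- setdiff(names(x), selected)
    if (length(candidates) == 0) break
    scores <- vapply(
      candidates,
      function(cand) loo_accuracy(x, y, c(selected, cand)),
      numeric(1)
    )
    if (max(scores) <= best) break  # no candidate improves the score
    best <- max(scores)
    selected <- c(selected, candidates[which.max(scores)])
  }
  selected
}

selected <- forward_select(iris[1:4], iris$Species)
head(iris[selected])
```

The selected features depend on the model and the scoring rule, so this sketch will not necessarily pick the same columns as the random forest run above.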