Calculates the correlation between every pair of features in a numeric data frame and returns the feature pairs whose correlation is higher than the threshold value.
get_correlated_features(x, threshold, consider_sign = FALSE)
x: Input data frame or tibble containing numeric variables.
threshold: Float value for the correlation above which feature pairs will be returned.
consider_sign: (boolean) Determines whether the correlation value is checked for magnitude only or for sign (positive/negative) as well. The default (FALSE) checks only the magnitude.
Data frame (tibble) containing the columns feature_1, feature_2, and the corresponding correlation.
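The filtering described above can be sketched in base R. This is only an illustrative approximation, not the package's implementation: the name `sketch_correlated_features` is hypothetical, Pearson correlation (the default of `cor()`) is assumed, and `consider_sign = TRUE` is assumed to compare the signed correlation against the threshold.

```r
# Hypothetical sketch for illustration; not the package's actual code.
sketch_correlated_features <- function(x, threshold, consider_sign = FALSE) {
  # Pairwise Pearson correlations (assumed method) as a square matrix
  corr <- stats::cor(x)

  # Reshape the matrix into long form: one row per ordered feature pair
  pairs <- as.data.frame(as.table(corr), stringsAsFactors = FALSE)
  names(pairs) <- c("feature_1", "feature_2", "correlation")

  # Drop the diagonal, where a feature is paired with itself
  pairs <- pairs[pairs$feature_1 != pairs$feature_2, ]

  # Magnitude-only check by default; signed check otherwise (assumed semantics)
  score <- if (consider_sign) pairs$correlation else abs(pairs$correlation)
  pairs <- pairs[score > threshold, ]

  # Sort by feature names and tidy up for display
  pairs <- pairs[order(pairs$feature_1, pairs$feature_2), ]
  rownames(pairs) <- NULL
  pairs$correlation <- round(pairs$correlation, 2)
  pairs
}
```

Under these assumptions, a strongly negative pair such as one with correlation -0.9 passes a threshold of 0.7 when `consider_sign = FALSE`, but is filtered out when `consider_sign = TRUE`.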
x <- data.frame('age' = c(23, 13, 7, 45),
                'height' = c(1.65, 1.23, 0.96, 1.55),
                'income' = c(20, 120, 120, 25))
get_correlated_features(x, threshold = 0.7)
#>   feature_1 feature_2 correlation
#> 1       age    height        0.75
#> 2       age    income       -0.81
#> 3    height       age        0.75
#> 4    height    income       -0.93
#> 5    income       age       -0.81
#> 6    income    height       -0.93
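As a cross-check, the values in this output can be reproduced with base R's `cor()`, assuming the function uses Pearson correlation (an assumption, although the numbers agree). The correlation matrix is symmetric, which is why each pair appears twice in the output, once in each order.

```r
# Same example data as above
x <- data.frame(age = c(23, 13, 7, 45),
                height = c(1.65, 1.23, 0.96, 1.55),
                income = c(20, 120, 120, 25))

# Full Pearson correlation matrix, rounded to two decimals
corr <- round(cor(x), 2)

corr["age", "height"]     # 0.75
corr["age", "income"]     # -0.81
corr["height", "income"]  # -0.93

# Symmetry: the (height, age) entry equals the (age, height) entry
isSymmetric(corr)         # TRUE
```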