Calculates the correlation between every pair of features in a numeric data frame and returns the pairs whose correlation exceeds the threshold value.

get_correlated_features(x, threshold, consider_sign = FALSE)

Arguments

x

Input DataFrame or tibble containing numeric variables

threshold

Numeric correlation threshold. Feature pairs whose correlation is above this value are returned.

consider_sign

(logical) Determines whether the correlation is compared against the threshold by magnitude only, or by sign (positive/negative) as well. The default, FALSE, checks only the magnitude.

Value

Data frame or tibble containing feature_1, feature_2, and the corresponding correlation for each returned pair.

Examples

x <- data.frame('age'=c(23, 13, 7, 45),
                'height'=c(1.65, 1.23, 0.96, 1.55),
                'income'=c(20, 120, 120, 25))
get_correlated_features(x, threshold=0.7)
#>   feature_1 feature_2 correlation
#> 1       age    height        0.75
#> 2       age    income       -0.81
#> 3    height       age        0.75
#> 4    height    income       -0.93
#> 5    income       age       -0.81
#> 6    income    height       -0.93
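
The filtering described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the name sketch_correlated_features is hypothetical, Pearson correlation (the default of stats::cor) is assumed, rounding to two decimals is assumed from the printed output, and consider_sign = TRUE is assumed to mean that the signed correlation, rather than its absolute value, is compared against the threshold.

```r
# Hypothetical stand-in for illustration; NOT the package's own code.
sketch_correlated_features <- function(x, threshold, consider_sign = FALSE) {
  # Pairwise correlations between all columns (Pearson is an assumption).
  corr <- stats::cor(as.data.frame(x))

  # One row per ordered pair of features. expand.grid varies its first
  # argument fastest, which matches the column-major order of as.vector().
  pairs <- expand.grid(feature_1 = rownames(corr),
                       feature_2 = colnames(corr),
                       stringsAsFactors = FALSE)
  pairs$correlation <- as.vector(corr)

  # Drop each feature's correlation with itself (always 1).
  pairs <- pairs[pairs$feature_1 != pairs$feature_2, ]

  if (consider_sign) {
    # Assumed semantics: compare the signed value with the threshold.
    keep <- pairs$correlation > threshold
  } else {
    # Magnitude only: strong negative correlations are kept as well.
    keep <- abs(pairs$correlation) > threshold
  }

  result <- pairs[keep, ]
  result <- result[order(result$feature_1, result$feature_2), ]
  rownames(result) <- NULL
  result$correlation <- round(result$correlation, 2)  # rounding is an assumption
  result
}

x <- data.frame(age = c(23, 13, 7, 45),
                height = c(1.65, 1.23, 0.96, 1.55),
                income = c(20, 120, 120, 25))

magnitude_only <- sketch_correlated_features(x, threshold = 0.7)
positive_only <- sketch_correlated_features(x, threshold = 0.7,
                                            consider_sign = TRUE)
```

In this sketch, magnitude_only holds the same six rows as the example output above, because each pair is reported in both orders. Under the assumed semantics, positive_only keeps only the two age/height rows, since the correlations involving income are negative.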