This function takes a data frame and one categorical feature and produces a histogram that visualizes the distribution of that feature. Users can instead plot a density graph of the feature by setting plot_type = "density". The function also supports customizing the color, plot title, font size, color scheme, density opacity, and facet layout.
categorical_eda(
  data,
  xval,
  plot_type = "histogram",
  color = NULL,
  title = NULL,
  font_size = 10,
  color_scheme = "Tableau 20",
  opacity = 0.6,
  facet_factor = NULL,
  facet_col = NULL
)
Argument | Description |
---|---|
data | A tibble or a data frame. |
xval | A variable used to represent the x-axis. |
plot_type | An optional character variable used to specify plot type. Options include "histogram" and "density". |
color | An optional variable that determines the color grouping of the plot. Defaults to NULL. |
title | An optional character variable used to set the title of the plot |
font_size | An optional integer variable used to set the font size |
color_scheme | An optional character variable used to set the color scheme. Defaults to "Tableau 20". |
opacity | An optional numeric variable between 0 and 1 used to specify the fill opacity of the density plot. Defaults to 0.6. |
facet_factor | An optional character variable used to specify facet factor |
facet_col | An optional numeric variable used to specify number of facet columns |
A ggplot2 object, either a histogram or a density plot
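Because the return value is a ggplot2 object, the usual ggplot2 tooling should apply to it. The sketch below illustrates this with a stand-in plot built directly from the built-in mtcars data, since it assumes nothing about how categorical_eda() is implemented. The object names (p, p2, out_file) and the output file name are illustrative only.

```r
library(ggplot2)

# Stand-in for a plot returned by categorical_eda(); any ggplot2 object
# can be extended and saved in the same way.
p <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# Further layers, labels, and themes are added with `+`.
p2 <- p +
  labs(x = "Cylinders", y = "Count") +
  theme_minimal()

# ggsave() writes the plot to disk; the path and dimensions are arbitrary.
out_file <- file.path(tempdir(), "categorical_plot.png")
ggsave(out_file, plot = p2, width = 6, height = 4)
```

Calling print(p2), or typing p2 at the console, renders the plot in the active graphics device.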
library(palmerpenguins)
categorical_plot <- categorical_eda(
  data = penguins,
  xval = body_mass_g,
  color = island,
  title = "Histogram of Body Mass in Different Sex",
  facet_factor = "island",
  facet_col = 1
)
#> Warning: Ignoring unknown parameters: binwidth, bins, pad
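For readers who want to see roughly what such a plot involves, here is a minimal sketch of comparable histogram and density plots written directly in ggplot2. It is not the package's implementation. The mapping of opacity to alpha, of facet_factor and facet_col to facet_wrap(), and of font_size to the theme text size are assumptions based on the argument descriptions, and bins = 30 is an arbitrary choice. The warning shown in the example is the kind ggplot2 emits when a layer receives binning parameters that its stat does not use.

```r
library(ggplot2)
library(palmerpenguins)

# Drop rows with missing body mass so ggplot2 does not warn about removed values.
df <- subset(penguins, !is.na(body_mass_g))

# Histogram variant: fill by island, one facet per island in a single column.
hist_plot <- ggplot(df, aes(x = body_mass_g, fill = island)) +
  geom_histogram(bins = 30) +
  facet_wrap(vars(island), ncol = 1) +
  labs(title = "Body mass by island") +
  theme(text = element_text(size = 10))

# Density variant: alpha is assumed to play the role of the `opacity` argument.
dens_plot <- ggplot(df, aes(x = body_mass_g, fill = island)) +
  geom_density(alpha = 0.6) +
  facet_wrap(vars(island), ncol = 1) +
  labs(title = "Body mass by island") +
  theme(text = element_text(size = 10))
```

Both objects are ordinary ggplot2 plots, so they can be printed, extended with `+`, or saved in the same way as the value returned by categorical_eda().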