This function takes a data frame and one categorical feature and produces a histogram that visualizes the distribution of that feature. Users can plot a density graph instead by setting plot_type. The function also supports customization of color, plot title, font size, color scheme, opacity, faceting, and other common configurations.

categorical_eda(
  data,
  xval,
  plot_type = "histogram",
  color = NULL,
  title = NULL,
  font_size = 10,
  color_scheme = "Tableau 20",
  opacity = 0.6,
  facet_factor = NULL,
  facet_col = NULL
)

Arguments

data

A tibble or a data frame

xval

The variable in data to plot on the x-axis.

plot_type

An optional character variable used to specify the plot type. Options are "histogram" (the default) and "density".

color

An optional character variable used to set the variable that colors the plot

title

An optional character variable used to set the title of the plot

font_size

An optional integer variable used to set the font size

color_scheme

An optional character variable used to set the color scheme

opacity

An optional numeric variable used to specify the fill opacity of the density plot (default 0.6)

facet_factor

An optional character variable used to specify the facet factor

facet_col

An optional numeric variable used to specify the number of facet columns

Value

A ggplot2 object, either a histogram or a density plot

Examples

library(palmerpenguins)

# Histogram of body mass, colored and faceted by island
categorical_plot <- categorical_eda(
  data = penguins,
  xval = body_mass_g,
  color = island,
  title = "Histogram of Body Mass by Island",
  facet_factor = "island",
  facet_col = 1
)
#> Warning: Ignoring unknown parameters: binwidth, bins, pad
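
For comparison, the kind of plot requested by plot_type = "density" can be sketched directly in ggplot2. This is only an illustrative approximation, not the internal implementation of categorical_eda: the object names penguins_complete and density_sketch are hypothetical, the mapping of opacity, facet_factor, facet_col and font_size onto ggplot2 calls is an assumption, and the color_scheme setting is left out.

```r
library(ggplot2)
library(palmerpenguins)

# Drop rows with missing body mass so the density layer has complete data.
penguins_complete <- penguins[!is.na(penguins$body_mass_g), ]

# Assumed correspondence to the arguments documented above:
#   opacity      -> alpha in geom_density()
#   facet_factor -> the variable passed to facet_wrap()
#   facet_col    -> ncol in facet_wrap()
#   font_size    -> base text size in theme()
density_sketch <- ggplot(
  penguins_complete,
  aes(x = body_mass_g, fill = island)
) +
  geom_density(alpha = 0.6) +
  facet_wrap(vars(island), ncol = 1) +
  labs(title = "Density of Body Mass by Island") +
  theme(text = element_text(size = 10))
```

Printing density_sketch draws one density panel per island, stacked in a single column.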