This function takes a tibble or data frame, two numeric columns, and a grouping column for color, and produces either a scatter plot or a line plot to visualize the relationship between the two numeric features. Users can optionally change the defaults for plot type, title, font size, and color scheme, and toggle a log transformation for the x- and y-axes.
numerical_eda(
  data,
  xval,
  yval,
  color,
  title = NULL,
  plot_type = "scatter",
  font_size = 10,
  color_scheme = "Tableau 10",
  x_transform = FALSE,
  y_transform = FALSE
)
Argument | Description |
---|---|
data | A tibble or data frame object. |
xval | A numeric variable used to represent the x-axis. |
yval | A numeric variable used to represent the y-axis. |
color | A character variable used to group the data points in different colors. |
title | An optional character variable used to set the title and axis. |
plot_type | An optional character string selecting how the relationship between xval and yval is drawn: "scatter" (the default) or "line". |
font_size | An optional integer variable used to set the font size. |
color_scheme | An optional character variable used to set the color scheme. |
x_transform | An optional logical indicating whether to apply a log transformation to the x-axis. |
y_transform | An optional logical indicating whether to apply a log transformation to the y-axis. |
numerical_plot: A ggplot2 object containing the numerical plot.
df <- iris
numerical_plot <- numerical_eda(
  data = df,
  xval = Petal.Length,
  yval = Petal.Width,
  color = Species,
  title = "Petal.Length vs Petal.Width",
  font_size = 12
)
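
To make the optional arguments more concrete, here is a rough sketch of how they might map onto plain ggplot2 layers. It is not the package's source: `sketch_numerical_plot()` is a hypothetical stand-in, and the use of `scale_x_log10()`/`scale_y_log10()` (base-10 logs) for the transforms and of `theme(text = ...)` for the font size are assumptions. The `color_scheme` argument is left out because the source does not say how the palette is applied.

```r
library(ggplot2)

# Hypothetical stand-in for numerical_eda(); an illustration only,
# not the function's actual implementation.
sketch_numerical_plot <- function(data, xval, yval, color,
                                  title = NULL,
                                  plot_type = "scatter",
                                  font_size = 10,
                                  x_transform = FALSE,
                                  y_transform = FALSE) {
  # Unquoted column names are forwarded to aes() with the embrace operator.
  p <- ggplot(data, aes(x = {{ xval }}, y = {{ yval }}, color = {{ color }}))

  # Choose the geometry from plot_type: "line" or (default) "scatter".
  layer <- if (plot_type == "line") geom_line() else geom_point()
  p <- p + layer

  # Assumption: the log toggles behave like base-10 log axis scales.
  if (x_transform) p <- p + scale_x_log10()
  if (y_transform) p <- p + scale_y_log10()

  # Title and font size applied through labs() and theme().
  p + labs(title = title) + theme(text = element_text(size = font_size))
}

# Line plot of the iris petal measurements with a log-scaled x-axis.
line_plot <- sketch_numerical_plot(
  data = iris,
  xval = Petal.Length,
  yval = Petal.Width,
  color = Species,
  title = "Petal.Length vs Petal.Width",
  plot_type = "line",
  x_transform = TRUE
)
```

Because the result is an ordinary ggplot2 object, further layers or themes can be added to it with `+`, which should also hold for the object returned by `numerical_eda()`.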