Almost every data analysis project involves some exploratory data analysis (EDA) and data preprocessing. Both are crucial, unavoidable steps in a data analysis workflow. There are some very common tasks in EDA, which can include:
Typically these steps are followed by some preprocessing, such as imputation and dealing with outliers. Together, these steps can require a lot of code that ends up being repeated across projects. To solve this issue, we designed the R package eaziReda, which wraps that code into four convenient functions so you can quickly and easily carry out EDA, along with some simple preprocessing, in just a few lines of code!
You can install the development version from GitHub with:
Documentation and usage examples for eaziReda can be found here.
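As a rough illustration of the kind of repeated boilerplate the package is meant to replace, here is a minimal base-R sketch of a missing-value table, a simple median imputation, and an IQR-based outlier flag. The helper names (`missing_summary`, `impute_median`, `flag_outliers_iqr`), the choice of median imputation, and the 1.5 * IQR rule are all assumptions made for this example. They are not the eaziReda API and may not match how the package implements these steps.

```r
# Hypothetical helpers written in base R for illustration only.
# These are NOT eaziReda functions.

# Count and percentage of missing values for each column.
missing_summary <- function(df) {
  n_missing <- colSums(is.na(df))
  data.frame(
    column = names(df),
    n_missing = as.integer(n_missing),
    pct_missing = 100 * as.numeric(n_missing) / nrow(df),
    stringsAsFactors = FALSE
  )
}

# Replace NAs in numeric columns with the column median (an assumed method).
impute_median <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      med <- median(df[[col]], na.rm = TRUE)
      df[[col]][is.na(df[[col]])] <- med
    }
  }
  df
}

# Flag values more than k * IQR beyond the quartiles (an assumed rule).
flag_outliers_iqr <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- k * (q[2] - q[1])
  !is.na(x) & (x < q[1] - fence | x > q[2] + fence)
}

# Toy data made up for this example.
toy <- data.frame(
  height = c(1.6, 1.7, NA, 1.8, 9.9),
  group = c("a", "b", "b", NA, "a"),
  stringsAsFactors = FALSE
)

print(missing_summary(toy))

filled <- impute_median(toy)
is_outlier <- flag_outliers_iqr(filled$height)

# Keep only the rows whose height was not flagged.
print(filled[!is_outlier, ])
```

Even for a toy dataframe this takes a few dozen lines, which is the repetition the functions listed below are intended to remove.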
missing_impute: This function takes in a dataframe and generates a table listing the number and percentage of missing values for each column. It also gives the user the option of applying simple imputation to the entire dataframe in place. The imputation method can be customized by the user.
outliers_detect: This function takes in a vector and returns a boolean vector with outliers marked according to a method that the user can customize. It also gives the user the option to remove all the outliers in place.
corr_plot: This function takes in a dataframe and a list of feature names and generates a correlation plot for the given features.
histograms: This function takes in a dataframe and a list of feature names and generates histograms for numeric features and bar plots for categorical features.
remove_outliers: This function removes the outliers from the given vector based on a second vector that marks the outliers’ indices.
While there aren’t a ton of packages in R that do only EDA, quite a few of them include it as secondary functionality. Here are a few packages we found that do something similar:
Please note that the eaziReda project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.