preprocess reads data in several formats, such as txt, json, and csv, and returns it as a data frame. To use preprocess in a project:

library(EDAhelperR)
library(knitr)

Read CSV data from a URL

file_path = "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv"
df = preprocess(file_path)
kable(df[3:6, 3:6])
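| Pclass | Name                                         | Sex    | Age |
|-------:|:---------------------------------------------|:-------|----:|
|      3 | Heikkinen, Miss. Laina                       | female |  26 |
|      1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female |  35 |
|      3 | Allen, Mr. William Henry                     | male   |  35 |
|      3 | Moran, Mr. James                             | male   |  NA |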

Read local data

file_path = readr::readr_example("mtcars.csv")
df = preprocess(file_path)
kable(df[3:6, 3:6])
| disp |  hp | drat |    wt |
|-----:|----:|-----:|------:|
|  108 |  93 | 3.85 | 2.320 |
|  258 | 110 | 3.08 | 3.215 |
|  360 | 175 | 3.15 | 3.440 |
|  225 | 105 | 2.76 | 3.460 |

Read data with different methods for dealing with missing values

file_path = readr::readr_example("mtcars.csv")
df = preprocess(file_path, method = "mean")
kable(head(df))
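|  mpg | cyl | disp |  hp | drat |    wt |  qsec | vs | am | gear | carb |
|-----:|----:|-----:|----:|-----:|------:|------:|---:|---:|-----:|-----:|
| 21.0 |   6 |  160 | 110 | 3.90 | 2.620 | 16.46 |  0 |  1 |    4 |    4 |
| 21.0 |   6 |  160 | 110 | 3.90 | 2.875 | 17.02 |  0 |  1 |    4 |    4 |
| 22.8 |   4 |  108 |  93 | 3.85 | 2.320 | 18.61 |  1 |  1 |    4 |    1 |
| 21.4 |   6 |  258 | 110 | 3.08 | 3.215 | 19.44 |  1 |  0 |    3 |    1 |
| 18.7 |   8 |  360 | 175 | 3.15 | 3.440 | 17.02 |  0 |  0 |    3 |    2 |
| 18.1 |   6 |  225 | 105 | 2.76 | 3.460 | 20.22 |  1 |  0 |    3 |    1 |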
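
The mtcars example file contains no missing values, so method = "mean" makes no visible difference in the output above. As a rough illustration of what mean imputation generally does, here is a minimal base R sketch. The helper impute_mean and the toy data frame are hypothetical and are not part of EDAhelperR; this is an assumption about what the option means, not the package's actual implementation.

```r
# Hypothetical helper (not from EDAhelperR): replace NA values in every
# numeric column with the mean of that column's observed values.
impute_mean <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      missing <- is.na(df[[col]])
      df[[col]][missing] <- mean(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# Toy data with one missing age, loosely modelled on the Titanic rows above.
toy <- data.frame(
  age = c(26, 35, 35, NA),
  sex = c("female", "female", "male", "male")
)

imputed <- impute_mean(toy)
print(imputed)
```

Non-numeric columns such as sex are left untouched in this sketch; how preprocess treats them is not stated here.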

Read data with extra readr settings

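file_path = readr::readr_example("mtcars.csv")
# skip = 6 drops the header line and the first five data rows, so the column
# names are taken from the data frame read in the previous example.
df = preprocess(file_path, method = "mean", skip = 6, col_names = colnames(df))
kable(head(df))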
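|  mpg | cyl |  disp |  hp | drat |   wt |  qsec | vs | am | gear | carb |
|-----:|----:|------:|----:|-----:|-----:|------:|---:|---:|-----:|-----:|
| 18.1 |   6 | 225.0 | 105 | 2.76 | 3.46 | 20.22 |  1 |  0 |    3 |    1 |
| 14.3 |   8 | 360.0 | 245 | 3.21 | 3.57 | 15.84 |  0 |  0 |    3 |    4 |
| 24.4 |   4 | 146.7 |  62 | 3.69 | 3.19 | 20.00 |  1 |  0 |    4 |    2 |
| 22.8 |   4 | 140.8 |  95 | 3.92 | 3.15 | 22.90 |  1 |  0 |    4 |    2 |
| 19.2 |   6 | 167.6 | 123 | 3.92 | 3.44 | 18.30 |  1 |  0 |    4 |    4 |
| 17.8 |   6 | 167.6 | 123 | 3.92 | 3.44 | 18.90 |  1 |  0 |    4 |    4 |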
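
To show what the skip and col_names settings do on their own, the same read can be sketched with readr::read_csv directly. This assumes preprocess passes these arguments through to readr, as the heading suggests. The variable names header and tail_rows are illustrative only.

```r
library(readr)

file_path <- readr_example("mtcars.csv")

# Read only the header row (zero data rows) to capture the column names.
header <- names(read_csv(file_path, n_max = 0, show_col_types = FALSE))

# skip = 6 drops the header line plus the first five data rows;
# col_names supplies the names the skipped header would have provided.
tail_rows <- read_csv(
  file_path,
  skip = 6,
  col_names = header,
  show_col_types = FALSE
)

print(head(tail_rows))
```

Without col_names, readr would treat the first line left after skipping as the header, so the sixth data row would be used as column names rather than data.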