preprocess
preprocess can be used to read data in different formats, such as txt, json, and csv, and return it as a data frame. The examples below show how to use preprocess in a project.
Read CSV data from a buffer
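library(knitr)  # provides kable()
# Load the Titanic CSV from a remote URL into a data frame
file_path <- "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv"
df <- preprocess(file_path)
# Display rows 3 to 6 and columns 3 to 6 as a table
kable(df[3:6, 3:6])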
Pclass | Name | Sex | Age |
3 | Heikkinen, Miss. Laina | female | 26 |
1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 |
3 | Allen, Mr. William Henry | male | 35 |
3 | Moran, Mr. James | male | NA |
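In base R, reading from a buffer can mean parsing CSV text that is already in memory rather than a file on disk. The sketch below is not how preprocess works internally (the source does not show that); it only illustrates the idea with `read.csv` and its `text` argument. The name `csv_text` is introduced here for illustration, and its rows reuse values from the output above.

```r
# CSV content held in memory as a character vector (one element per line).
# Quoted fields keep the commas inside names from being treated as separators.
csv_text <- c(
  'Name,Sex,Age',
  '"Heikkinen, Miss. Laina",female,26',
  '"Allen, Mr. William Henry",male,35',
  '"Moran, Mr. James",male,'
)

# read.csv() parses the text directly when given the `text` argument.
df_buf <- read.csv(text = csv_text, stringsAsFactors = FALSE)

print(df_buf)
print(is.na(df_buf$Age))  # the empty Age field is read as NA
```

Blank fields in numeric columns are treated as missing by `read.csv`, which is why the last row ends up with `NA` for Age.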
Read local data
108 | 93 | 3.85 | 2.320 |
258 | 110 | 3.08 | 3.215 |
360 | 175 | 3.15 | 3.440 |
225 | 105 | 2.76 | 3.460 |
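The call that produced the output above is not shown in the source. As a hedged illustration of reading a local file by format, the sketch below defines a hypothetical helper, `read_local` (not part of preprocess), that picks a base R reader from the file extension. The built-in `mtcars` dataset is written to a temporary file purely as stand-in data, the txt branch assumes tab-delimited text, and the json branch assumes the jsonlite package is installed.

```r
# Hypothetical helper: choose a reader based on the file extension.
read_local <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv  = read.csv(path, stringsAsFactors = FALSE),
    txt  = read.delim(path, stringsAsFactors = FALSE),  # assumes tab-delimited text
    json = {
      if (!requireNamespace("jsonlite", quietly = TRUE)) {
        stop("Reading JSON requires the 'jsonlite' package.")
      }
      jsonlite::fromJSON(path)
    },
    stop("Unsupported file extension: ", ext)
  )
}

# Stand-in data: write the built-in mtcars dataset to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

local_df <- read_local(csv_path)
print(local_df[3:6, 3:6])  # same row and column slice as in the first example
```

Dispatching on the extension keeps the calling code identical across formats; a real implementation might also inspect the file contents instead of trusting the extension.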
Read data with different methods for dealing with missing values
21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |

18.1 | 6 | 225.0 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 |
14.3 | 8 | 360.0 | 245 | 3.21 | 3.57 | 15.84 | 0 | 0 | 3 | 4 |
24.4 | 4 | 146.7 | 62 | 3.69 | 3.19 | 20.00 | 1 | 0 | 4 | 2 |
22.8 | 4 | 140.8 | 95 | 3.92 | 3.15 | 22.90 | 1 | 0 | 4 | 2 |
19.2 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.30 | 1 | 0 | 4 | 4 |
17.8 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.90 | 1 | 0 | 4 | 4 |
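The source does not show which missing-value options preprocess accepts, so the following is only a sketch of two common strategies in base R: dropping incomplete rows and imputing numeric columns with the column mean. The helper `handle_missing` and its `method` argument are hypothetical names introduced for illustration, and the `toy` data frame is made up.

```r
# Hypothetical helper illustrating two common missing-value strategies.
handle_missing <- function(df, method = c("drop", "mean")) {
  method <- match.arg(method)
  if (method == "drop") {
    # Remove every row that contains at least one NA.
    return(df[complete.cases(df), , drop = FALSE])
  }
  # Replace NAs in numeric columns with that column's mean.
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      missing <- is.na(df[[col]])
      df[[col]][missing] <- mean(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# Small made-up data frame with one missing value per column.
toy <- data.frame(x = c(1, NA, 3), y = c(10, 20, NA))

dropped <- handle_missing(toy, "drop")  # keeps only the first row
imputed <- handle_missing(toy, "mean")  # fills x with 2 and y with 15
print(dropped)
print(imputed)
```

Dropping rows shrinks the data, while imputation keeps every row but changes the values, which is one way two outputs from the same file can differ in which rows they show.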