Reading arguments

Arguments

Here, we introduce four arguments shared by pd.read_csv() and pd.read_excel():

  • index_col
  • header
  • nrows
  • usecols

If you wish to know more, see the pandas documentation for pd.read_csv() and pd.read_excel().

index_col

df = pd.read_csv('data/cereal.csv', index_col="name")
df.head(3)
                   mfr  type  calories  protein  ...  shelf  weight  cups     rating
name
100% Bran            N  Cold        70        4  ...      3     1.0  0.33  68.402973
100% Natural Bran    Q  Cold       120        3  ...      3     1.0  1.00  33.983679
All-Bran             K  Cold        70        4  ...      3     1.0  0.33  59.425505

3 rows × 15 columns


df = pd.read_csv('data/cereal.csv', index_col=0)
df.head(3)
                   mfr  type  calories  protein  ...  shelf  weight  cups     rating
name
100% Bran            N  Cold        70        4  ...      3     1.0  0.33  68.402973
100% Natural Bran    Q  Cold       120        3  ...      3     1.0  1.00  33.983679
All-Bran             K  Cold        70        4  ...      3     1.0  0.33  59.425505

3 rows × 15 columns
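
The two calls above suggest that index_col accepts either a column label or a zero-based column position. The sketch below checks that both give the same result. It assumes a small in-memory CSV in place of data/cereal.csv: the three rows come from the output above, but only four of the columns are kept, and the variable names are illustrative.

```python
import io

import pandas as pd

# Illustrative stand-in for data/cereal.csv: only four of the columns are kept.
csv_text = """name,mfr,type,calories
100% Bran,N,Cold,70
100% Natural Bran,Q,Cold,120
All-Bran,K,Cold,70
"""

# Select the index column by label ...
by_label = pd.read_csv(io.StringIO(csv_text), index_col="name")
# ... or by zero-based position; "name" is the first column, so position 0.
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Compare the two DataFrames, then look a row up by its index label.
print(by_label.equals(by_position))
print(by_label.loc["All-Bran", "calories"])
```

Once the names are the index, rows can be looked up directly by label with .loc, as the last line does.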

header

candybars = pd.read_csv('data/candybars-h.csv')
candybars
     This dataset was created by Hayley Boyce in February 2020.  Unnamed: 1  Unnamed: 2  Unnamed: 3  ...  Unnamed: 7       Unnamed: 8  Unnamed: 9               Unnamed: 10
0             Note this is not a complete dataset and there ...         NaN         NaN         NaN  ...         NaN              NaN         NaN                       NaN
1                                                          name      weight   chocolate     peanuts  ...     coconut  white_chocolate       multi  available_canada_america
...                                                         ...         ...         ...         ...  ...         ...              ...         ...                       ...
25                                                     Oh Henry          51           1           1  ...           0                0           0                      Both
26                                            Cookies and Cream          43           0           0  ...           0                1           0                      Both

27 rows × 11 columns

candybars = pd.read_csv('data/candybars-h.csv', header=2)
candybars
                  name  weight  chocolate  peanuts  ...  coconut  white_chocolate  multi  available_canada_america
0         Coffee Crisp      50          1        0  ...        0                0      0                    Canada
1         Butterfinger     184          1        1  ...        0                0      0                   America
2                 Skor      39          1        0  ...        0                0      0                      Both
...                ...     ...        ...      ...  ...      ...              ...    ...                       ...
22          Almond Joy      46          1        0  ...        1                0      0                   America
23            Oh Henry      51          1        1  ...        0                0      0                      Both
24   Cookies and Cream      43          0        0  ...        0                1      0                      Both

25 rows × 11 columns
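
The header argument gives the zero-based line that holds the column labels, and every line before it is skipped. The sketch below reproduces the situation above on a small in-memory CSV. The file contents are an assumption: the second note line uses placeholder wording, the trailing commas mimic a spreadsheet export, and only three columns are kept.

```python
import io

import pandas as pd

# Illustrative stand-in for data/candybars-h.csv: two note lines precede the
# real header. The second note's wording is a placeholder, and the trailing
# commas mimic how a spreadsheet export pads short rows.
csv_text = """This dataset was created by Hayley Boyce in February 2020.,,
A second free-text note (placeholder wording).,,
name,weight,available_canada_america
Coffee Crisp,50,Canada
Butterfinger,184,America
Skor,39,Both
"""

# Default behaviour: the first line is treated as the header, so the notes
# and the real column labels end up mixed into the data.
raw = pd.read_csv(io.StringIO(csv_text))

# header=2: the third line (zero-based line 2) supplies the column labels,
# and the two lines before it are skipped.
clean = pd.read_csv(io.StringIO(csv_text), header=2)

print(list(raw.columns))
print(list(clean.columns))
print(clean.shape)
```

With the notes skipped, weight is parsed as a number instead of text.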

nrows

candybars = pd.read_csv('data/candybars.csv', nrows=7)
candybars
                         name  weight  chocolate  peanuts  ...  coconut  white_chocolate  multi  available_canada_america
0                Coffee Crisp      50          1        0  ...        0                0      0                    Canada
1                Butterfinger     184          1        1  ...        0                0      0                   America
2                        Skor      39          1        0  ...        0                0      0                      Both
...                       ...     ...        ...      ...  ...      ...              ...    ...                       ...
4                        Twix      58          1        0  ...        0                0      1                      Both
5    Reeses Peanutbutter Cups      43          1        1  ...        0                0      1                      Both
6                3 Musketeers      54          1        0  ...        0                0      0                   America

7 rows × 11 columns
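
The nrows argument limits how many data rows are read; the header line does not count toward the limit. A minimal sketch, assuming a three-column, three-row in-memory subset of data/candybars.csv (the variable names are illustrative):

```python
import io

import pandas as pd

# Illustrative subset of data/candybars.csv (three of the eleven columns).
csv_text = """name,weight,available_canada_america
Coffee Crisp,50,Canada
Butterfinger,184,America
Skor,39,Both
"""

# nrows counts data rows only; the header line is not included in the count.
preview = pd.read_csv(io.StringIO(csv_text), nrows=2)

print(preview)
print(len(preview))
```

This can be handy for previewing a large file without loading all of it.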

usecols

candybars = pd.read_csv('data/candybars.csv', usecols=[0, 1, 10])
candybars
                  name  weight  available_canada_america
0         Coffee Crisp      50                    Canada
1         Butterfinger     184                   America
2                 Skor      39                      Both
...                ...     ...                       ...
22          Almond Joy      46                   America
23            Oh Henry      51                      Both
24   Cookies and Cream      43                      Both

25 rows × 3 columns

candybars = pd.read_csv('data/candybars.csv', usecols=['name', 'weight', 'available_canada_america'])
candybars
                  name  weight  available_canada_america
0         Coffee Crisp      50                    Canada
1         Butterfinger     184                   America
2                 Skor      39                      Both
...                ...     ...                       ...
22          Almond Joy      46                   America
23            Oh Henry      51                      Both
24   Cookies and Cream      43                      Both

25 rows × 3 columns
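
The two calls above select the same columns, once by zero-based position and once by label. The sketch below checks that equivalence on a small in-memory CSV. It is an assumed four-column subset of data/candybars.csv, so the last column sits at position 3 rather than 10. It also shows that pandas ignores the order of the usecols list and returns columns in file order.

```python
import io

import pandas as pd

# Illustrative subset of data/candybars.csv (four of the eleven columns).
csv_text = """name,weight,chocolate,available_canada_america
Coffee Crisp,50,1,Canada
Butterfinger,184,1,America
Skor,39,1,Both
"""

# Select columns by zero-based position ...
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 1, 3])
# ... or by label; both keep the same three columns.
by_label = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["name", "weight", "available_canada_america"],
)

# The order given to usecols is ignored: columns come back in file order.
reordered = pd.read_csv(io.StringIO(csv_text), usecols=["weight", "name"])

print(by_position.equals(by_label))
print(list(reordered.columns))
```

Labels are usually the safer choice, since positions shift if columns are added to or removed from the file.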

Let’s apply what we learned!