Good Function Design Choices

How to write good functions

What makes a function useful?

Is a function more useful when it does more operations?

Does adding parameters make your functions more or less useful?

These are all questions we need to think about when writing functions.

1. Avoid “hard coding.”

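Hard coding means embedding values directly in your code instead of storing them in variables or passing them in as arguments. Below, squares_a_list hard-codes the exponent 2, while exponent_a_list takes the exponent as an argument and so covers many more cases.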

def squares_a_list(numerical_list):
    new_squared_list = list()
    
    for number in numerical_list:
        new_squared_list.append(number ** 2)
    
    return new_squared_list


def exponent_a_list(numerical_list, exponent):
    new_exponent_list = list()
    
    for number in numerical_list:
        new_exponent_list.append(number ** exponent)
    
    return new_exponent_list
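
To see why the parameterized version is more flexible, here is a minimal sketch that calls exponent_a_list from above with two different exponents. The input list and the names numbers, squared, and cubed are illustrative assumptions, not part of the original lesson.

```python
def exponent_a_list(numerical_list, exponent):
    """Raise every number in the list to the given exponent."""
    new_exponent_list = list()

    for number in numerical_list:
        new_exponent_list.append(number ** exponent)

    return new_exponent_list


# Illustrative input (an assumption for this sketch)
numbers = [1, 2, 3, 4]

# The same function covers what squares_a_list does...
squared = exponent_a_list(numbers, 2)

# ...and any other exponent, without writing a new function.
cubed = exponent_a_list(numbers, 3)

print(squared)
print(cubed)
```

Because the exponent is an argument rather than a value buried in the function body, changing the behavior only requires changing the call.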

2. Less is More

A function that loads, groups, and plots in one step is hard to reuse and hard to debug. Splitting it into smaller functions that each do one job lets you inspect and reuse the intermediate results.

import altair as alt
import pandas as pd


def load_filter_and_average(file, grouping_column, plotting_column):
    # Loads, summarizes, and plots in one function: three jobs in one place
    df = pd.read_csv("data/" + file)
    source = df.groupby(grouping_column).mean(numeric_only=True).reset_index()
    chart = alt.Chart(source, width=500, height=300).mark_bar().encode(
        x=alt.X(grouping_column),
        y=alt.Y(plotting_column)
    )
    return chart


bad_idea = load_filter_and_average('cereal.csv', 'mfr', 'rating')
bad_idea


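def grouped_means(df, grouping_column):
    # Only summarizes: the data frame is passed in, nothing is loaded or plotted
    grouped_mean = df.groupby(grouping_column).mean(numeric_only=True).reset_index()
    return grouped_mean


cereal = pd.read_csv('data/cereal.csv')
cereal_mfr = grouped_means(cereal, 'mfr')
cereal_mfr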
    mfr    calories   protein       fat  ...     shelf    weight      cups     rating
 0    A  100.000000  4.000000  1.000000  ...  2.000000  1.000000  1.000000  54.850917
 1    G  111.363636  2.318182  1.363636  ...  2.136364  1.049091  0.875000  34.485852
 2    K  108.695652  2.652174  0.608696  ...  2.347826  1.077826  0.796087  44.038462
..   ..         ...       ...       ...  ...       ...       ...       ...        ...
 4    P  108.888889  2.444444  0.888889  ...  2.444444  1.064444  0.714444  41.705744
 5    Q   95.000000  2.625000  1.750000  ...  2.375000  0.875000  0.823750  42.915990
 6    R  115.000000  2.500000  1.250000  ...  2.000000  1.000000  0.871250  41.542997

7 rows × 14 columns

def plot_mean(df, grouping_column, plotting_column):
    # Only plots: expects a data frame that has already been summarized
    chart = alt.Chart(df, width=500, height=300).mark_bar().encode(
        x=alt.X(grouping_column),
        y=alt.Y(plotting_column)
    )
    return chart


plot1 = plot_mean(cereal_mfr, 'mfr', 'rating')
plot1
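
One benefit of the smaller grouped_means function is that its result can feed several different downstream steps, not only a plot. The sketch below illustrates this with a small made-up data frame; toy_cereal, top_rated, and calories_by_mfr are hypothetical names and values chosen for the example, not the real cereal data.

```python
import pandas as pd


def grouped_means(df, grouping_column):
    """Return the mean of every numeric column for each group."""
    return df.groupby(grouping_column).mean(numeric_only=True).reset_index()


# Hypothetical toy data standing in for the cereal data frame
toy_cereal = pd.DataFrame({
    "mfr": ["A", "G", "G", "K"],
    "calories": [100, 110, 130, 90],
    "rating": [50.0, 30.0, 40.0, 45.0],
})

means = grouped_means(toy_cereal, "mfr")

# Downstream use 1: find the manufacturer with the highest mean rating
top_rated = means.sort_values("rating", ascending=False).iloc[0]["mfr"]

# Downstream use 2: build a lookup of mean calories per manufacturer
calories_by_mfr = dict(zip(means["mfr"], means["calories"]))

print(top_rated)
print(calories_by_mfr)
```

Neither of these uses would be possible with a function that only hands back a finished chart.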

3. Return a single object

Returning several objects at once produces a tuple, so the caller has to remember which position holds which result and index into it.

def load_filter_and_average(file, grouping_column, plotting_column):
    df = pd.read_csv("data/" + file)
    source = df.groupby(grouping_column).mean(numeric_only=True).reset_index()
    chart = alt.Chart(source, width=500, height=300).mark_bar().encode(
        x=alt.X(grouping_column),
        y=alt.Y(plotting_column)
    )
    # Returns two objects, so the caller receives a tuple
    return chart, source


another_bad_idea = load_filter_and_average('cereal.csv', 'mfr', 'rating')
another_bad_idea
(alt.Chart(...),
    mfr    calories   protein  ...    weight      cups     rating
 0    A  100.000000  4.000000  ...  1.000000  1.000000  54.850917
 1    G  111.363636  2.318182  ...  1.049091  0.875000  34.485852
 2    K  108.695652  2.652174  ...  1.077826  0.796087  44.038462
 ..  ..         ...       ...  ...       ...       ...        ...
 4    P  108.888889  2.444444  ...  1.064444  0.714444  41.705744
 5    Q   95.000000  2.625000  ...  0.875000  0.823750  42.915990
 6    R  115.000000  2.500000  ...  1.000000  0.871250  41.542997
 
 [7 rows x 14 columns])
another_bad_idea[0]
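
The same trade-off can be sketched without pandas or Altair. In this minimal example, which uses a plain list as a stand-in for the data frame, mean_and_range returns a tuple that must be indexed by position, while the two single-purpose functions each return one clearly named result. All function and variable names here are hypothetical and chosen only for illustration.

```python
def mean_and_range(values):
    """Discouraged: returns two results packed into one tuple."""
    return sum(values) / len(values), max(values) - min(values)


def mean(values):
    """Return only the mean."""
    return sum(values) / len(values)


def value_range(values):
    """Return only the difference between the largest and smallest value."""
    return max(values) - min(values)


# Illustrative input (an assumption for this sketch)
values = [2, 4, 9]

# The caller has to know that position 0 is the mean and position 1 the range
both = mean_and_range(values)
mean_from_tuple = both[0]

# Each call returns exactly one object, and its name says what it is
values_mean = mean(values)
values_range = value_range(values)

print(both)
print(values_mean, values_range)
```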

4. Keep global variables in their global environment

A function that reads a global variable directly breaks as soon as that variable changes. Pass the data in as an argument instead.

def grouped_means(df, grouping_column):
    grouped_mean = df.groupby(grouping_column).mean(numeric_only=True).reset_index()
    return grouped_mean


cereal = pd.read_csv('data/cereal.csv')

def bad_grouped_means(grouping_column):
    # Relies on the global variable `cereal` instead of taking the data as an argument
    grouped_mean = cereal.groupby(grouping_column).mean(numeric_only=True).reset_index()
    return grouped_mean


bad_cereal_grouping = bad_grouped_means('mfr')
bad_cereal_grouping.head(3)
    mfr    calories   protein  ...    weight      cups     rating
 0    A  100.000000  4.000000  ...  1.000000  1.000000  54.850917
 1    G  111.363636  2.318182  ...  1.049091  0.875000  34.485852
 2    K  108.695652  2.652174  ...  1.077826  0.796087  44.038462

3 rows × 14 columns


cereal = "let's change it to a string" 
bad_cereal_grouping = bad_grouped_means('mfr')
AttributeError: 'str' object has no attribute 'groupby'

Detailed traceback: 
  File "<string>", line 1, in <module>
  File "<string>", line 2, in bad_grouped_means
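
By contrast, the version that takes the data frame as an argument is unaffected by whatever happens to a global name. The sketch below is a minimal illustration under stated assumptions: toy_cereal is a small made-up data frame, not the real cereal data, and the global cereal is deliberately set to a string to mirror the failure above.

```python
import pandas as pd


def grouped_means(df, grouping_column):
    """Depends only on its arguments, not on any global variable."""
    return df.groupby(grouping_column).mean(numeric_only=True).reset_index()


# The global name no longer holds a data frame...
cereal = "let's change it to a string"

# ...but the function still works, because the data is passed in explicitly.
# Hypothetical toy data for illustration only.
toy_cereal = pd.DataFrame({
    "mfr": ["A", "A", "B"],
    "rating": [10.0, 20.0, 30.0],
})

good_cereal_grouping = grouped_means(toy_cereal, "mfr")
print(good_cereal_grouping)
```

Everything the function needs is visible in its call, which also makes it easier to test with small inputs like this one.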

Let’s apply what we learned!