7.1. Exercises

Merge Questions

Coding questions

Instructions:
Running a coding exercise for the first time could take a bit of time for everything to load. Be patient, it could take a few minutes.

When you see ____ in a coding exercise, replace it with what you assume to be the correct code. Run it and see if you obtain the desired output. Submit your code to validate if you were correct.

Make sure you remove the hash (#) symbol in the coding portions of this question. We have commented them so that the line won’t execute and you can test your code after each step.

Merging I

In this exercise, you are going to join two dataframes lego_inventory_parts.csv and lego-colors.csv and answer a few multiple-choice questions. The multiple-choice questions are being asked with the intention of using the code cell to write your own code in any way that helps to answer the question.

Tasks:

  • Combine the two dataframes to make 1 large complete dataframe by using an outer join.
  • Make sure to set the argument indicator to True.
  • Name the new dataframe lego_tower.
Hint 1
  • Are you naming your new dataframe lego_tower?
  • Are you using the arguments left_on='color_id', right_on=id, how='outer' and indicator=True?
Fully worked solution:



Merging II

This question may be a bit more challenging. We are wondering about the inventory of a store. We want to see which Lego sets are in stock and if so how many? After all, the store needs to make sure there are enough sets in stock to meet demand.

Tasks:

  • Combine the two dataframes to make one large complete dataframe by using an inner join.
  • Name the new dataframe lego_stock.
  • Group the new dataframe by set_num and find how many groups there are using .ngroups

This question is in two parts and we are going to walk you through how to tackle it.

Hint 1
  • Are you naming your new dataframe lego_stock?
  • Are you using the arguments left_on='set_num', right_on='set_num' and how='inner'?
  • Are you grouping my using groupby('set_num')?
Fully worked solution:


Ah, it appears we have multiple rows for some of the same sets.

Although it shows initially the we have 2846 different sets due to the number of rows in lego_stock, when we group them by set_num we actually only get 2306 different sets. This means that we have some rows with the same set_num but with different inventory quantities.

How are we going to get the stock quantity of each set now?

We are going to have to sum up the quantity of each set using .groupby() and.agg().

Tasks:

  • Use .groupby() and .agg() to sum up the quantity of each set and save this as store_inventory.
  • Inner join store_inventory with lego_sets and use chaining to sort the dataframe in descending order based on in-stock quantity
  • Save this new dataframe as store_inventory_details.
  • Display the new dataframe.
Hint 1
  • Are you naming your new dataframe store_inventory?
  • Are you aggregating using .agg({\'quantity\':\'sum\'})?
  • Are you using the arguments left_index=True, right_index=True, how='inner'?
  • Are you sorting in descending order of quantity?
Fully worked solution:


Now we can return to our initial problem of identifying how many Lego sets are in stock.