Electricity (Part 1)
This is the start of a multi-unit data science adventure involving a set of real-world temperature and electricity usage data in Cambridge, MA. For this first assignment, we will do a bit of data wrangling that will help us get our data in a form that will be useful for data visualization and analysis down the road.
1) The Data
Download the following .py file: electricity_1.py
There are two types of input data we will be working with:
temp_array is a 2D array containing information about the daily temperatures in Cambridge, MA. In each row:
- The first column represents the year in which the measurement was taken.
- The second column represents the month in which the measurement was taken (as a number).
- The third column represents the day (of the month) in which the measurement was taken.
- The fourth, fifth, and sixth columns contain the high temperature, low temperature, and average temperature on that day, respectively. These temperatures are all measured in degrees Fahrenheit.
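To make the row layout concrete, here is a minimal sketch of what a few rows of temp_array might look like and how the columns could be unpacked. The values are invented for illustration, and the name sample_temp_array and the list-of-lists representation are assumptions; the actual data and its representation are defined in electricity_1.py.

```python
# Hypothetical sample rows (values invented for illustration only).
# Columns: year, month, day, high (°F), low (°F), average (°F)
sample_temp_array = [
    [2015, 1, 1, 34.0, 21.0, 27.5],
    [2015, 1, 2, 40.0, 30.0, 35.0],
    [2015, 2, 1, 28.0, 12.0, 20.0],
]

# Unpack each row into its six columns, in the order described above.
for year, month, day, high, low, avg in sample_temp_array:
    print(f"{year}-{month:02d}-{day:02d}: high={high}, low={low}, avg={avg}")
```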
kwh_array is a 2D array containing information about the average monthly electricity usage of a single two-person household in Cambridge. In each row:
- The first column represents the year in which the measurement was taken.
- The second column represents the month in which the measurement was taken (as an abbreviation like 'Jan' for January).
- The third column represents the household's electricity usage during that month, in units of kilowatt-hours (kWh).
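Because kwh_array stores months as abbreviations while temp_array stores them as numbers, one of the two will likely need converting before the arrays can be compared. The sketch below shows one possible conversion. The sample values are invented, the names sample_kwh_array, MONTH_NUMBER, and month_to_number are hypothetical, and it assumes every abbreviation follows the three-letter English pattern suggested by 'Jan'.

```python
# Hypothetical sample rows (values invented for illustration only).
# Columns: year, month abbreviation, usage (kWh)
sample_kwh_array = [
    [2015, 'Jan', 310.0],
    [2015, 'Feb', 295.0],
]

# Assumed abbreviations: three-letter English month names, as in 'Jan'.
MONTH_ABBRS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
MONTH_NUMBER = {abbr: num for num, abbr in enumerate(MONTH_ABBRS, start=1)}


def month_to_number(abbr):
    """Map an abbreviation such as 'Jan' to its month number (1-12)."""
    return MONTH_NUMBER[abbr]


for year, abbr, kwh in sample_kwh_array:
    print(year, month_to_number(abbr), kwh)
```

If the real data is stored in an array type that holds a single element type, the year and usage columns may arrive as strings and need an explicit int() or float() conversion.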
2) Data Processing
In later assignments, we will explore the relationship between these two quantities (electricity consumption and temperature), but that is hard given the current form of the data. One particular source of difficulty stems from the fact that the average temperatures we have are reported daily, but the electricity usage numbers are reported monthly.
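One way to bridge the daily/monthly mismatch is to collapse the daily temperatures into a single value per (year, month). The sketch below does this by taking the mean of the daily average temperatures. The helper name monthly_average_temps is hypothetical and is not one of the functions in electricity_1.py, the sample values are invented, and averaging the daily averages is only one reasonable choice of summary.

```python
from collections import defaultdict


def monthly_average_temps(temp_rows):
    """Return {(year, month): mean of the daily average temperatures}."""
    totals = defaultdict(lambda: [0.0, 0])  # (year, month) -> [sum, count]
    for year, month, _day, _high, _low, avg in temp_rows:
        entry = totals[(int(year), int(month))]
        entry[0] += float(avg)
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical sample rows: year, month, day, high, low, average (°F)
sample = [
    [2015, 1, 1, 34.0, 21.0, 27.5],
    [2015, 1, 2, 40.0, 30.0, 35.0],
    [2015, 2, 1, 28.0, 12.0, 20.0],
]
monthly = monthly_average_temps(sample)
print(monthly)  # {(2015, 1): 31.25, (2015, 2): 20.0}
```

Keying the result by (year, month) means it can later be matched against monthly electricity usage for the same year and month.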
To help with this, complete the three functions format_kwh, format_temps, and join_arrays in the provided electricity_1.py file. When you are ready, upload your file for testing below: