COMPSCI 2120/9642/DIGIHUM 2220 1.0 documentation

Assignment 1: COVID-19 Data Analysis (Ontario)


In this assignment you will analyze a real dataset that tracks the status of COVID-19 cases in Ontario (Status of COVID-19 cases in Ontario).

The following dataset is a “snippet” of the original, simplified to make coding easier, and contains the data for March 2020:

  • covidtesting_cs2120-snippet.csv (Update: This downloads as a ".txt" file from the UWO server for some reason. Please see OWL "Announcements" for the csv file)

  • Download and save it in the same directory as your assignment submission.


“covidtesting_cs2120-snippet.csv” contains the following column labels:

  • Reported_Date: date of reporting in format YYYY-MM-DD

  • Confirmed_Negative-Cumulative: number of negative cases (cumulative)

  • Confirmed_Positive-Daily: number of positive cases (daily count)
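To get a feel for these columns, here is a minimal sketch of loading a file with this layout using pandas. The helper name load_covid_csv and the three sample rows are made up for illustration (the numbers are placeholders, not real Ontario data), and the template may load the file differently.

```python
import io

import pandas as pd

# Made-up rows in the same layout as the assignment file.
# The numbers are placeholders, NOT real Ontario data.
SAMPLE_CSV = """Reported_Date,Confirmed_Negative-Cumulative,Confirmed_Positive-Daily
2020-03-01,100,2
2020-03-02,130,3
2020-03-03,175,1
"""


def load_covid_csv(source):
    """Read the CSV and parse Reported_Date (YYYY-MM-DD) into datetimes.

    `source` can be a file path or any file-like object.
    This helper is hypothetical; it is not part of the template.
    """
    return pd.read_csv(source, parse_dates=["Reported_Date"])


# io.StringIO lets us treat the sample text like an open file.
df = load_covid_csv(io.StringIO(SAMPLE_CSV))

print(list(df.columns))  # the three column labels
print(df.dtypes)         # Reported_Date should be a datetime column
```

With the real file saved next to your script, the same helper could be called as load_covid_csv("covidtesting_cs2120-snippet.csv").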

Disclaimer

Based on a quick viewing of the data, I “believe” that the Confirmed Negative values are cumulative and that Confirmed Positive is a daily count. I have created the assignment with these assumptions. They could be wrong and have been applied only to create this assignment, so don’t do this when using real data for your own research.


If you’re interested in downloading the original file, here is a link to it:

You’ll be using a template (a1_for_students.py). In your assignment submission you will need to perform the following tasks:

  • Add a column labeled “Confirmed_Negative-Daily” which contains the daily negative cases (create this from the “Confirmed_Negative-Cumulative” column using the appropriate function calls). To create this you will need to fix an error in the get_daily_increase() function.

  • Add a column labeled “Confirmed_Positive-Cumulative” which contains the cumulative positive cases (create this from the “Confirmed_Positive-Daily” column using the appropriate function calls). To create this you will need to fix an error in the get_cumulative_increase() function.

  • Find the maximum increase in daily confirmed positive cases using the appropriate function. Print out the result in a full sentence.

  • Export the complete dataframe (including the two columns you added) to a csv file (using the appropriate function).
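If the difference between "cumulative" and "daily" counts is unclear, the idea behind the tasks above can be sketched with plain lists. This is only an illustration under stated assumptions: the function names daily_from_cumulative and cumulative_from_daily are hypothetical (they are not the template's get_daily_increase() and get_cumulative_increase()), the numbers are made up, the count before the first day is assumed to be 0, and "maximum increase" is read here as the largest daily count.

```python
def daily_from_cumulative(cumulative):
    """Turn running totals into per-day counts.

    Each day's count is today's total minus yesterday's total.
    Assumption: the total before the first day is 0.
    """
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


def cumulative_from_daily(daily):
    """Turn per-day counts into running totals."""
    running = 0
    totals = []
    for count in daily:
        running += count
        totals.append(running)
    return totals


# Made-up numbers, NOT real Ontario data.
cumulative_negative = [100, 130, 175]
daily_positive = [2, 3, 1]

daily_negative = daily_from_cumulative(cumulative_negative)
cumulative_positive = cumulative_from_daily(daily_positive)

print(daily_negative)       # [100, 30, 45]
print(cumulative_positive)  # [2, 5, 6]

# Reporting a result in a full sentence with an f-string.
largest = max(daily_positive)
print(f"The largest single-day count of confirmed positive cases was {largest}.")
```

Notice that the two conversions undo each other: converting the daily list to cumulative and back returns the original daily list.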

Here is the template:

Don’t worry too much

  • This template contains lists and dataframes.

  • Lists are formatted like this: [1, 2, 3, 4, "bubble", "doll"]

  • Dataframes are organized into rows and columns. Think of them as csv files which can be manipulated (i.e. add rows/columns, etc.) using pandas.

  • For this assignment you will be correcting errors which are related to expressions and parameter naming. You will also be calling on (appropriate) functions. So, don’t worry too much about understanding all of the function implementation details.
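As a quick illustration of the two data structures mentioned above, here is a small sketch that builds a list and a dataframe, adds a column, and writes the result out as csv text. The column "Example_Doubled" and all values are made up for this example, and the template's own functions may do these steps differently.

```python
import pandas as pd

# A list can mix numbers and strings; items are accessed by position.
mixed_list = [1, 2, 3, 4, "bubble", "doll"]
print(mixed_list[0], mixed_list[-1], len(mixed_list))

# A dataframe is a table: each dictionary key becomes a column label.
# Made-up values, NOT real Ontario data.
df = pd.DataFrame({
    "Reported_Date": ["2020-03-01", "2020-03-02", "2020-03-03"],
    "Confirmed_Positive-Daily": [2, 3, 1],
})

# A column can be pulled out as an ordinary list...
daily_values = df["Confirmed_Positive-Daily"].tolist()

# ...and a new column can be added by assigning a list of the same length.
# "Example_Doubled" is a hypothetical column, used only to show the syntax.
df["Example_Doubled"] = [value * 2 for value in daily_values]

# With no file path, to_csv returns the csv contents as a string.
# Passing a path, e.g. df.to_csv("output.csv", index=False), writes a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Setting index=False leaves out the row numbers that pandas would otherwise write as an extra first column.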

Installing Pandas (with PyCharm)

If you don’t have PyCharm

  • If you don’t have PyCharm or are having trouble installing pandas, please feel free to send me an email or come to my office hours.

What to submit on OWL

  • Your version of a1_for_students.py. Make sure your NAME and STUDENT NUMBER appear in a comment at the top of the program.

  • A text file containing the output of your program.

  • The output csv file which contains the original data PLUS the two added columns.

  • Include all of this in a compressed (zipped) folder.

Some hints

  • Start with fixing the bugs/errors. Then move to the function calls.

  • Test your code with print statements where necessary.

  • If you need help, ask! Try the OWL forums, ask your TA, email me or come to my office hours (or book a Zoom meeting). We’re here to help.