Assignment 3: Visualization + Data Structures
=============================================
.. image:: ../img/visualization.png
   :width: 900
   :height: 250

* **Worth**: 10%
* **Due**: November 20th at 11:59 pm
* **Marking Scheme**: `a3_marking_scheme <a3_marking_scheme.txt>`_

.. admonition:: A note about this and past assignments
   :class: Warning

   Assignment 1 and Assignment 2 both involved *sifting through someone elses code and adding additional functionalities*. While this is a great skill to have (and one which is necessary for practical programming), these assignments **did not allow for any creativity or flexibility**.

   With this in mind (and also based on some of your feedback from the mid-term review) I've created Assignment 3. For this assignment you will be able to ``write all of your own code``; moreover you will have some freedom in selecting your data and how you'd like to visualize it. 

   If you have any feedback on this Assignment or any other ones, feel free to email me @ jmorra6@uwo.ca with any suggestions or comments. Of course, also please do feel free to email me or come to office hours (or book a Zoom) if you need any help.

The idea
^^^^^^^^
* You will be picking one or more datasets from `The Princeton Intro CS Repository <https://introcs.cs.princeton.edu/java/data/>`_ or `The UCI Machine Learning Repository <https://archive.ics.uci.edu/ml/index.php>`_ (or from somewhere else... just cite where you got it from)

* You will create ``3 *different* visualizations``. You could choose, for example, to use a **scatter plot** to illustrate some data (or a **pie chart**, or a **box plot**, or a **histogram**, etc.), or you could go the less traditional route and use something like a **word cloud**. 

* Where necessary (i.e. moreso for more traditional plots like scatter plots, pie charts, etc.) you will need an *appropriate title* and *axis titles*; you will also be marked on *how your graph looks* (i.e. good spacing, appropriate scale). 

* You must create and implement your own ``list(s)``, ``tuple(s)``, and ``dictionary(ies)`` **somewhere in your code** (i.e. you might, for example, use a dictionary to store your data and extract the keys and values into lists for your plot). You will need ``at least 1 of each data structure`` which you are using in the process of creating your visualizations.

.. admonition:: More on the above point
   :class: Note

   You do not need 1 data structure for each plot; you can mix and match the 3 types... when marking we are only looking to see that you are somehow using dictionaries, lists, and tuples.

   As long as you create and use them somewhere in the process of creating your visuals, it's good. You wouldn't, however, get the mark for a dictionary if you, for example, just created a dictionary such as ``{"James": "car", "Bob": "duck"}`` and printed it.

* Below are a few samples for your reference.

Sample 1
--------
A scatter plot showing that happiness is correlated with GDP per capita.

.. image:: ../img/list_plot.png
   :width: 700

source: `World Happiness Report <https://www.kaggle.com/unsdsn/world-happiness/>`_ 


Sample 2
--------
A bar graph which weakly shows (for this school) that more study time leads to higher grades, but also that over-studying could be less effective.

.. image:: ../img/tupl_plot.png
   :width: 700

source: `Student Performance <https://archive.ics.uci.edu/ml/datasets/Student+Performance#/>`_ 


Sample 3
--------
A word cloud which showcases the most-used words in "Magna Carta".

.. image:: ../img/wordplot.png
   :width: 700

source: `Magna Carta <https://introcs.cs.princeton.edu/java/data/magna-carta.txt>`_ 


What's the point?
^^^^^^^^^^^^^^^^^
I want to make sure that you can do 2 things:

1. Visualize some data with Python

2. Create and implement lists, tuples, and dictionaries

A resource to get started:
^^^^^^^^^^^^^^^^^^^^^^^^^^
`This resource <https://towardsdatascience.com/introduction-to-data-visualization-in-python-89a54c97fbed>`_ is good, but you can use any resource and any plotting/visualization library that you want (matplotlib, pandas, seaborn, bokeh, etc.). 

How you will be marked:
^^^^^^^^^^^^^^^^^^^^^^^
`a3_marking_scheme <a3_marking_scheme.txt>`_

What to submit (in zipped folder)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``a3.py``: a Python file containing your code

``v1.png``: your first visualization

``v2.png``: your second visualization

``v3.png``: your third visualization 

``ref.txt``: Cite the datasets you used here and indicate which reference corresponds to which visual(s). The format doesn't matter, so long as you link to what dataset or datasets you used.