From 1556a4812554b5c4f5d4e5bc0fd7f61c6793fe0e Mon Sep 17 00:00:00 2001
From: Christopher Brooks <cab938@mail.usask.ca>
Date: Mon, 28 Oct 2019 12:28:57 +0000
Subject: [PATCH] Good luck everyone

---
 midterm.ipynb       | 401 ++++++++++++++++++++++++++++++++++++++++++++
 midterm_history.csv |  43 +++++
 2 files changed, 444 insertions(+)
 create mode 100644 midterm.ipynb
 create mode 100644 midterm_history.csv
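
Reviewer note on the record format described in Question 1 (win-loss-tie,
with ties omitted when they are 0): below is a minimal sketch of how one
such string could be split into numbers, using only the standard library.
The helper names parse_record and parse_year are hypothetical and are not
part of the notebook or of any expected solution. The sketch assumes the
separator may be either an en dash or a plain hyphen, since both appear in
midterm_history.csv, and that a year may carry a trailing marker such as
" †".

```python
import re

# Assumption: records are separated by an en dash or a plain hyphen.
_DASH = re.compile(r"[–-]")


def parse_record(record):
    """Return (wins, losses, ties) from a 'win-loss-tie' string.

    Ties default to 0 when the third field is omitted.
    """
    parts = [int(p) for p in _DASH.split(record.strip())]
    if len(parts) == 2:
        parts.append(0)
    if len(parts) != 3:
        raise ValueError(f"unexpected record format: {record!r}")
    return tuple(parts)


def parse_year(year):
    """Return the leading four-digit year, ignoring any trailing marker."""
    match = re.match(r"\s*(\d{4})", year)
    if match is None:
        raise ValueError(f"unexpected year format: {year!r}")
    return int(match.group(1))
```

In a pandas solution the same patterns could probably be applied column-wise
with the string accessor instead of row by row; the plain functions are used
here only to keep the sketch self-contained.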
diff --git a/midterm.ipynb b/midterm.ipynb
new file mode 100644
index 0000000..fa04b63
--- /dev/null
+++ b/midterm.ipynb
@@ -0,0 +1,401 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# SI 330: Midterm Examination\n",
+    "\n",
+    "Fall 2019, October 28th, 2019, Christopher Brooks\n",
+    "\n",
+    "You have 80 minutes, from 8:30am to 9:50am, to complete this exam. When you have finished the exam, you must upload **your .ipynb notebook** to Canvas here: https://umich.instructure.com/courses/320857/assignments/868191\n",
+    "\n",
+    "You are allowed to search for API documentation or examples to refresh your understanding, as well as use regex testing sites if you would like. There is to be no communication with other individuals in the class or out of it.\n",
+    "\n",
+    "Advice: Don't overthink things, do what you can, and show your thinking process. Full grades are awarded for correct and well-written solutions; partial grades will be awarded for partial or poorly written solutions. Use your time wisely.\n",
+    "\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Question 1: Data Cleaning\n",
+    "Continuing my journey of having to learn something about sports, I decided to go look up some stats on the Wolverines Football history. I saved this dataframe in the file `midterm_history.csv`. But like most stuff from Wikipedia, it's in need of some data cleaning in order to be useful."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Part A\n",
+    "I would like you to take the CSV file and demonstrate to me the techniques you have learned in this course thus far by transforming it into a table with the following columns:\n",
+    "1. **coach_firstname**: The first name of the coach\n",
+    "2. **coach_lastname**: The surname of the coach\n",
+    "3. **overall_wins**: A number indicating how many games were won overall\n",
+    "4. **overall_losses**: A number indicating how many games were lost overall\n",
+    "5. **overall_ties**: A number indicating how many games were tied overall\n",
+    "6. **big10_wins**: The same thing as overall_wins but for the Big Ten Record\n",
+    "7. **big10_losses**: The same thing as overall_losses but for the Big Ten Record\n",
+    "8. **big10_ties**: The same thing as overall_ties but for the Big Ten Record\n",
+    "\n",
+    "Also, please set the index to be the **year** the record was made.\n",
+    "\n",
+    "Note: the format of most game records on Wikipedia is *win-loss-tie*, where ties are omitted if they are 0."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Year</th>\n",
+       "      <th>Coach</th>\n",
+       "      <th>Overall record</th>\n",
+       "      <th>Big Ten record</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>1898</td>\n",
+       "      <td>Gustave Ferbert</td>\n",
+       "      <td>10–0</td>\n",
+       "      <td>3–0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>1901 †</td>\n",
+       "      <td>Fielding H. Yost</td>\n",
+       "      <td>11–0</td>\n",
+       "      <td>4–0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>1902</td>\n",
+       "      <td>Fielding H. Yost</td>\n",
+       "      <td>11–0</td>\n",
+       "      <td>5–0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>1903 †</td>\n",
+       "      <td>Fielding H. Yost</td>\n",
+       "      <td>11–0–1</td>\n",
+       "      <td>3–0–1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>1904 †</td>\n",
+       "      <td>Fielding H. Yost</td>\n",
+       "      <td>10–0</td>\n",
+       "      <td>2–0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     Year             Coach Overall record Big Ten record\n",
+       "0    1898   Gustave Ferbert           10–0            3–0\n",
+       "1  1901 †  Fielding H. Yost           11–0            4–0\n",
+       "2    1902  Fielding H. Yost           11–0            5–0\n",
+       "3  1903 †  Fielding H. Yost         11–0–1          3–0–1\n",
+       "4  1904 †  Fielding H. Yost           10–0            2–0"
+      ]
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "df=pd.read_csv(\"midterm_history.csv\")\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Your code here"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Question 2: Data Manipulation\n",
+    "Imagine this hypothetical situation: I'm sitting out in the backyard with my daughter (Katie) and we're watching the squirrels climb along the trees. We jointly name each squirrel and identify its type (e.g. a gray squirrel versus a chipmunk). Then I try to find out which kinds of trees the squirrels live in, while Katie tries to see the speed at which the squirrels run on different species of trees. So we have two `DataFrame` objects created, one about squirrels and their living conditions, and one about squirrels and their observed speed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>name</th>\n",
+       "      <th>type</th>\n",
+       "      <th>tree</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>carl</td>\n",
+       "      <td>chipmunk</td>\n",
+       "      <td>oak</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>suzie</td>\n",
+       "      <td>gray</td>\n",
+       "      <td>walnut</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>andy</td>\n",
+       "      <td>gray</td>\n",
+       "      <td>walnut</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>bob</td>\n",
+       "      <td>black</td>\n",
+       "      <td>oak</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>john</td>\n",
+       "      <td>black</td>\n",
+       "      <td>walnut</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>anthony</td>\n",
+       "      <td>eastern red</td>\n",
+       "      <td>pine</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "      name         type    tree\n",
+       "0     carl     chipmunk     oak\n",
+       "1    suzie         gray  walnut\n",
+       "2     andy         gray  walnut\n",
+       "3      bob        black     oak\n",
+       "4     john        black  walnut\n",
+       "5  anthony  eastern red    pine"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>name</th>\n",
+       "      <th>speed</th>\n",
+       "      <th>tree</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>carl</td>\n",
+       "      <td>2</td>\n",
+       "      <td>oak</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>carl</td>\n",
+       "      <td>5</td>\n",
+       "      <td>walnut</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>carl</td>\n",
+       "      <td>4</td>\n",
+       "      <td>walnut</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>suzie</td>\n",
+       "      <td>4</td>\n",
+       "      <td>walnut</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>bob</td>\n",
+       "      <td>6</td>\n",
+       "      <td>walnut</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>bob</td>\n",
+       "      <td>2</td>\n",
+       "      <td>oak</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "    name  speed    tree\n",
+       "0   carl      2     oak\n",
+       "1   carl      5  walnut\n",
+       "2   carl      4  walnut\n",
+       "3  suzie      4  walnut\n",
+       "4    bob      6  walnut\n",
+       "5    bob      2     oak"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "\n",
+    "dad_df=pd.DataFrame([[\"carl\",\"chipmunk\",\"oak\"],\n",
+    "        [\"suzie\",\"gray\",\"walnut\"],\n",
+    "        [\"andy\",\"gray\",\"walnut\"],\n",
+    "        [\"bob\",\"black\",\"oak\"],\n",
+    "        [\"john\",\"black\",\"walnut\"],\n",
+    "        [\"anthony\",\"eastern red\",\"pine\"]], columns=[\"name\",\"type\",\"tree\"])\n",
+    "\n",
+    "katie_df=pd.DataFrame([[\"carl\",2,\"oak\"],\n",
+    "          [\"carl\",5,\"walnut\"],\n",
+    "          [\"carl\",4,\"walnut\"],\n",
+    "          [\"suzie\",4,\"walnut\"],\n",
+    "          [\"bob\",6,\"walnut\"],\n",
+    "          [\"bob\",2, \"oak\"]], columns=[\"name\",\"speed\",\"tree\"])\n",
+    "\n",
+    "display(dad_df)\n",
+    "display(katie_df)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Part A\n",
+    "Write a function to return the mean and standard deviation (from `numpy`) of the average speed of a squirrel by type. Only include squirrels for whom Katie has collected some running data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Your code here"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Part B\n",
+    "On average, what are the fastest and slowest squirrel/tree combinations? Build a dataframe where each column is labeled with a tree type and the values of the cells are average speeds of different kinds of squirrels (so row index values should be squirrel types)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Your code here"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/midterm_history.csv b/midterm_history.csv
new file mode 100644
index 0000000..faf0be0
--- /dev/null
+++ b/midterm_history.csv
@@ -0,0 +1,43 @@
+Year,Coach,Overall record,Big Ten record
+1898,Gustave Ferbert,10–0,3–0
+1901 †,Fielding H. Yost,11–0,4–0
+1902,Fielding H. Yost,11–0,5–0
+1903 †,Fielding H. Yost,11–0–1,3–0–1
+1904 †,Fielding H. Yost,10–0,2–0
+1906 †,Fielding H. Yost,4–1,1–0
+1918 †,Fielding H. Yost,5–0,2–0
+1922 †,Fielding H. Yost,6–0–1,4–0
+1923 †,Fielding H. Yost,8–0,4–0
+1925,Fielding H. Yost,7–1,5–1
+1926 †,Fielding H. Yost,7–1,5–0
+1930 †,Harry Kipke,8–0–1,5–0
+1931 †,Harry Kipke,8–1–1,5–1
+1932 †,Harry Kipke,8–0,6–0
+1933 †,Harry Kipke,7–0–1,5–0–1
+1943 †,Fritz Crisler,8–1,6–0
+1947,Fritz Crisler,10–0,6–0
+1948,Bennie Oosterbaan,9–0,6–0
+1949 †,Bennie Oosterbaan,6–2–1,4–1–1
+1950,Bennie Oosterbaan,6–3–1,4–1–1
+1964,Bump Elliott,9–1,6–1
+1969 †,Bo Schembechler,8–3,6–1
+1971,Bo Schembechler,11–1,8–0
+1972 †,Bo Schembechler,10–1,7–1
+1973 †,Bo Schembechler,10-0–1,7–0–1
+1974 †,Bo Schembechler,10–1,7–1
+1976 †,Bo Schembechler,10–2,7–1
+1977 †,Bo Schembechler,10–2,7–1
+1978 †,Bo Schembechler,10–2,7–1
+1980,Bo Schembechler,10–2,8–0
+1982,Bo Schembechler,8–4,8–1
+1986 †,Bo Schembechler,11–2,7–1
+1988,Bo Schembechler,9–2–1,7–0–1
+1989,Bo Schembechler,10–2,8–0
+1990 †,Gary Moeller,9–3,6–2
+1991,Gary Moeller,10–2,8–0
+1992,Gary Moeller,9–0–3,6–0–2
+1997,Lloyd Carr,12–0,8–0
+1998 †,Lloyd Carr,10–3,7–1
+2000 †,Lloyd Carr,9–3,6–2
+2003,Lloyd Carr,10–3,7–1
+2004 †,Lloyd Carr,9–3,7–1
