Assignment 2 on time series

cab938 · Nov 2, 2019 · 45afa6e · 45afa6e
1 parent 59becf8
commit 45afa6e
Show file tree

Hide file tree

Showing 2 changed files with 1,135 additions and 0 deletions.
diff --git a/assignment2.ipynb b/assignment2.ipynb
@@ -0,0 +1,102 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "nbgrader": {
+     "grade": false,
+     "grade_id": "cell-4cc1a4b58a7804ce",
+     "locked": true,
+     "schema_version": 1,
+     "solution": false
+    }
+   },
+   "source": [
+    "# Assignment 2\n",
+    "For this assignment you'll be looking at some of my physical activity data, which I've put in `datasets/strava.csv`."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "nbgrader": {
+     "grade": false,
+     "grade_id": "cell-118357d73b1f8d41",
+     "locked": true,
+     "schema_version": 1,
+     "solution": false
+    }
+   },
+   "source": [
+    "## Question 1 (25%)\n",
+    "This data is a bit of a mess. It includes running and cycling! Your first job is to identify the different run *sessions* from the dataset and list their *start* and *end* times. You know a given activity is a run if it has a value for `Power` (since I often use a stryd footpod) or if it came before Sept 15th (since that's when I bought a bike). And you can assume that a session is made up of contiguous values at at least a 20 minute resolution. That is, if I am running then stop recording data for 20 minutes or more then start running again that's two sessions. But if I only stop for 7 minutes, then start up again, that's just one session.\n",
+    "\n",
+    "Create a `DataFrame` which is made up of three columns, the `start` and `end` times of each run as described above, with a `length` field and the with the session number, starting with `0` of course, as the index.\n",
+    "\n",
+    "**Note**: You can find pandas frequency strings for the resampling here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Question 2 (10%)\n",
+    "How many minutes long was my longest run?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Question 3 (25%)\n",
+    "It's common to estimate the calorie burn when your heart rate is between 90 and 150 beats per minute. The the equation below is suitable for men:\n",
+    "\n",
+    "$$((-55.0969 + (0.6309*H) + (0.1988*W) + (0.2017*A)) \\div 4.184) * 60 * T$$\n",
+    "\n",
+    "Where HR is your heart rate in beats per minute, W is your weight in kilograms (let's be generous and assume I'm roughly 240lbs), A is the age in years (so let's go 41 here), and T is the exercise duration in hours.\n",
+    "\n",
+    "What I would like is for you to calculate my rolling average 5 minute caloric burn for the session which began at time `2019-08-17 11:50:36`, and report to me the time I had just finished 1 minute of my top caloric burn."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Question 4 (40%)\n",
+    "Today, November 1st 2019, I went out for a bike ride and wiped out. Not a little loss of control or a little skid, we're talking flying off the bike in traffic kind of wipe out. I finished my ride as I had planned, and when I was telling my wife about the incident she asked me where but I couldn't really remember. I did remember a few things though - can you help me track down where I wiped out from my ride log data, which I put in `datasets/wipout.csv`?\n",
+    "\n",
+    "Specifically I remember that:\n",
+    "1. I launched off the bike. So at some point both my reported speed `enhanced_speed` and my reported  `cadence` from my wearables would have been 0.\n",
+    "2. I was at least 5 minutes into my ride at the time, headed out of the city, and that it wasn't in the last ten minutes of my ride.\n",
+    "3. I know my `heart_rate` didn't drop right away. Heart rate responds relatively slowely so while my speed and cadence dropped nearly right away to zero (since my bike was lying in the road) my heart rate was still high. More specifically, I would say that compared to the 2 minutes before and 2 minutes after the event, my `heart_rate` was more than one standard deviation higher while I was lying on the grass.\n",
+    "4. I don't know if this helps, but I do remember stopping during my ride *before* I wiped out to give an in promptu lecture to a motorist on whether cars belonged in the bike lane. So some of the above criteria might also relate to that event. But I know I actually wiped out after talking to that driver.\n",
+    "\n",
+    "Where and when did I wipe out? In units of `timestamp` and `position_lat_degrees` and `position_long_degrees` please, so I can look this up on Google Maps.\n",
+    "\n",
+    "**Hint**: it might be useful to consider the heart rate carefully. Remember z-scored from STATS 250? No worries if you don't, a z-score can be created with the `scipy.stats.zscore` function. You just pass in a list of values and it returns to you a list of z-scores. This mapping is the number of standard deviations from the mean each data point is. So if a given data point has a z-score of 1.2, it is 1.2 standard deviations greater than the mean of whatever data you passed in. And z-scores can be negative too, so a z-score of -3.2 for a given data point means that data point is much smaller -- more than three standard deviations smaller -- than the mean of the rest of the data you calculated z-scores for."
+   ]
+  }
+ ],
+ "metadata": {
+  "celltoolbar": "Create Assignment",
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}