Update and fix notebooks
AnesBenmerzoug committed Jun 3, 2024
1 parent 77d1827 commit e94f1d7
Showing 11 changed files with 1,360 additions and 645 deletions.
163 changes: 129 additions & 34 deletions notebooks/nb_20_dynamic_programming.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,8 @@
"slide_type": "skip"
},
"tags": [
"remove-input",
"remove-output",
"remove-input-nbconv",
"remove-output-nbconv",
"ActiveScene"
"ActiveScene",
"remove-cell"
]
},
"outputs": [],
Expand All @@ -46,7 +43,8 @@
"slide_type": "skip"
},
"tags": [
"ActiveScene"
"ActiveScene",
"remove-cell"
]
},
"outputs": [],
Expand All @@ -58,13 +56,18 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
"editable": true,
"init_cell": true,
"jupyter": {
"source_hidden": true
},
"scene__Initialization": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"ActiveScene"
"ActiveScene",
"remove-cell"
]
},
"outputs": [],
Expand All @@ -88,7 +91,8 @@
"slide_type": "skip"
},
"tags": [
"ActiveScene"
"ActiveScene",
"remove-cell"
]
},
"outputs": [],
Expand Down Expand Up @@ -260,6 +264,8 @@
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"raw_mimetype": "",
"slideshow": {
"slide_type": "subslide"
}
Expand All @@ -284,7 +290,7 @@
"\n",
"$$\n",
"\\begin{equation}\n",
"u^*_k = \\displaystyle \\argmin_{u \\in \\mathbf{U}}\n",
"u^*_k = \\displaystyle \\underset{u \\in \\mathbf{U}}{\\text{argmin}} \n",
"\\left[ c(\\mathbf{x}_k, \\mathbf{u}_k) + V(f(\\mathbf{x}_k, \\mathbf{u}_k)) \\right].\n",
"\\end{equation}\n",
"$$"
Expand All @@ -293,6 +299,7 @@
{
"cell_type": "markdown",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": "subslide"
}
Expand Down Expand Up @@ -325,7 +332,7 @@
"slide_type": ""
},
"tags": [
"hide-input"
"remove-input"
]
},
"outputs": [],
Expand All @@ -336,7 +343,12 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
}
},
"source": [
"We wish to travel from node A to node G at minimum cost. If the cost represents time then we want to find the shortest path from A to G.\n",
"\n",
Expand All @@ -348,15 +360,28 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
}
},
"source": [
"We start by determining all possible paths."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"plot_all_paths_graph(G)"
Expand All @@ -368,17 +393,16 @@
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"remove-cell"
]
}
},
"source": [
"We then compute the cost-to-go at each node to determine the shortest path.\n",
"\n",
"Each node in this new graph represents a state. We start from the tail (the last states) and recursively compute the cost of each state transition.\n",
"\n",
"Let $c(n_1, n_2)$ be the cost of moving from node $n_1$ to node $n_2$ and $V(n)$ be the optimal cost-to-go from node $n$. We have $$V({\\text{G}}) = 0$$.\n",
"Let $c(n_1, n_2)$ be the cost of moving from node $n_1$ to node $n_2$ and $V(n)$ be the optimal cost-to-go from node $n$.\n",
"\n",
"We have $V({\\text{G}}) = 0$ because the cost of going from node $G$ to itself is 0.\n",
"\n",
"We start with nodes **F** and **E**:\n",
"\n",
Expand Down Expand Up @@ -424,22 +448,22 @@
"Now that we have computed the optimal cost-to-go, we can proceed in a forward manner to determine the best path:\n",
"\n",
"$$\n",
"\\pi^* = \\underset{n_2}{\\argmin} [c(n_1, n_2) + V(n_2)]\n",
"\\pi^* = \\underset{n_2}{\\text{argmin}} [c(n_1, n_2) + V(n_2)]\n",
"$$\n",
"\n",
"For the first action (step) we have:\n",
"\n",
"$$\n",
"\\pi^*_0 &= \\underset{n_2 \\in \\{ B, C, D \\}}{\\argmin} \\left[ c(A, n_2) + V(n_2) \\right] \\\\ \n",
"&= \\underset{n_2}{\\argmin} \\left[ c(A, n_2 = B) + V(n_2 = B), c(A, n_2 = C) + V(n_2 = C), c(A, n_2 = D) + V(n_2 = D) \\right] \\\\\n",
"&= \\underset{n_2}{\\argmin} \\left[ 4 + 2, 5 + 3, 3 + 6 \\right] \\\\\n",
"\\pi^*_0 &= \\underset{n_2 \\in \\{ B, C, D \\}}{\\text{argmin}} \\left[ c(A, n_2) + V(n_2) \\right] \\\\ \n",
"&= \\underset{n_2}{\\text{argmin}} \\left[ c(A, n_2 = B) + V(n_2 = B), c(A, n_2 = C) + V(n_2 = C), c(A, n_2 = D) + V(n_2 = D) \\right] \\\\\n",
"&= \\underset{n_2}{\\text{argmin}} \\left[ 4 + 2, 5 + 3, 3 + 6 \\right] \\\\\n",
"&= B\n",
"$$\n",
"\n",
"Proceeding in the same way, we get:\n",
"\n",
"$$\n",
"\\pi^* &= \\{\\pi^*_0, \\pi^*_1, \\pi^*_2\\} &= \\{\\text{B, E, G} \\}\n",
"\\pi^* = \\{ \\pi^*_0, \\pi^*_1, \\pi^*_2 \\} = \\{ \\text{B, E, G} \\}\n",
"$$\n",
"\n",
"The shortest path is ABEG."
Expand All @@ -448,15 +472,28 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"plot_all_paths_graph(G, show_solution=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
}
},
"source": [
"### Value Iteration\n",
"\n",
Expand Down Expand Up @@ -528,7 +565,15 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"remove-input"
]
},
"outputs": [],
"source": [
"%%html\n",
Expand All @@ -538,7 +583,15 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"env = create_grid_world_environment(render_mode=\"rgb_array\", max_steps=50)\n",
Expand All @@ -556,7 +609,15 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"env.reset()\n",
Expand All @@ -574,7 +635,15 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"G = convert_graph_to_directed(G)\n",
Expand All @@ -583,7 +652,12 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
}
},
"source": [
"We wish for the car to travel from its starting cell in red to the target cell in green. If the cost represents time and each step has the same cost then we want to find the shortest path to the goal.\n",
"\n",
Expand Down Expand Up @@ -611,21 +685,34 @@
"Compute the optimal cost-to-go at each node.\n",
"\n",
"You can use `dict(G.nodes(data=True))` to get a dictionary that maps the nodes to their attributes\n",
"and you can use `G.start_node` and `G.target_node` to access the start and end (i.e. goal) nodes, respectively.\n",
"and you can use `G.start_node` and `G.target_node` to access the start and target nodes, respectively.\n",
":::"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
}
},
"source": [
":::{exercise-end}\n",
":::"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"remove-cell"
]
},
"source": [
"````{solution} grid-world\n",
"````"
Expand All @@ -634,7 +721,15 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"# Your solution here"
Expand Down
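For readers of this diff, the two steps that the notebook text describes (backing up the cost-to-go from the target, then picking each action greedily in the forward pass) can be sketched in plain Python. This is only an illustrative sketch, not the notebook's own solution: the edge costs from A and the values V(B), V(C), V(D) are the ones quoted in the diff, while the function names `greedy_step` and `compute_cost_to_go` and the `toy_edges` graph are hypothetical, since the full edge list of the notebook's graph is not visible here.

```python
# Values quoted in the worked example of the notebook diff.
edge_cost_from_a = {"B": 4, "C": 5, "D": 3}  # c(A, n_2)
known_cost_to_go = {"B": 2, "C": 3, "D": 6}  # V(n_2)


def greedy_step(edge_costs, cost_to_go):
    """Return the successor n_2 that minimises c(n_1, n_2) + V(n_2)."""
    return min(edge_costs, key=lambda n2: edge_costs[n2] + cost_to_go[n2])


def compute_cost_to_go(edges, target):
    """Repeat the backup V(n) = min over n_2 of [c(n, n_2) + V(n_2)] until stable.

    Assumes non-negative edge costs. `edges` maps node -> {successor: cost}.
    """
    values = {target: 0}
    changed = True
    while changed:
        changed = False
        for node, successors in edges.items():
            # Only successors whose cost-to-go is already known can be used.
            reachable = [
                cost + values[n2] for n2, cost in successors.items() if n2 in values
            ]
            if reachable and min(reachable) < values.get(node, float("inf")):
                values[node] = min(reachable)
                changed = True
    return values


first_action = greedy_step(edge_cost_from_a, known_cost_to_go)
first_total = edge_cost_from_a[first_action] + known_cost_to_go[first_action]
print(first_action, first_total)  # B 6

# Hypothetical toy graph (NOT the notebook's graph): S -> M -> T and S -> T.
toy_edges = {"S": {"M": 1, "T": 5}, "M": {"T": 2}}
toy_values = compute_cost_to_go(toy_edges, "T")
print(toy_values["S"], toy_values["M"])  # 3 2
```

The first print reproduces the choice of B as the first action from the quoted numbers (4 + 2 is smaller than 5 + 3 and 3 + 6). The repeated-backup loop is one simple way to obtain the cost-to-go without fixing a node ordering in advance; on an acyclic graph processed from the tail, a single backward sweep would do the same job.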