From 4559ea625f062cee0107be4d6322c90590f1323c Mon Sep 17 00:00:00 2001 From: "m.mahdizadeh" Date: Sat, 2 Oct 2021 01:03:11 +0330 Subject: [PATCH 1/3] add notebook --- .../adversarial attacks.ipynb | 1 + notebooks/adversarial_attack/metadata.yml | 29 +++++++++++++++++++ 2 files changed, 30 insertions(+) create mode 100644 notebooks/adversarial_attack/adversarial attacks.ipynb create mode 100644 notebooks/adversarial_attack/metadata.yml diff --git a/notebooks/adversarial_attack/adversarial attacks.ipynb b/notebooks/adversarial_attack/adversarial attacks.ipynb new file mode 100644 index 0000000..eba874a --- /dev/null +++ b/notebooks/adversarial_attack/adversarial attacks.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"index.ipynb","provenance":[],"collapsed_sections":[],"toc_visible":true},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.8"}},"cells":[{"cell_type":"markdown","metadata":{"id":"LDerBS_foQl2"},"source":["# Adversarial attaks and robustness\n","\n","\n"]},{"cell_type":"markdown","metadata":{"id":"a5z34ayPNA13"},"source":["# Table of Contents\n","1. [Introduction](#introduction)\n","2. [The importance of robustness](#importance)\n","3. [Attacks](#attack)\n"," * [PGD](#pgd)\n"," * [FGSM](#fgsm)\n"," * [DeepFool](#deepfool)\n","5. [robustness](#robustness)\n"," * [Adversarial training](#advers_train)\n"," * [Provable defenses](#prove_defense)\n","6. [Example](#code_example)\n","7. [References](#references)"]},{"cell_type":"markdown","metadata":{"id":"2OotVuBgfqME"},"source":["## Introduction "]},{"cell_type":"markdown","metadata":{"id":"ZeHq0bAh4c2u"},"source":["The problem of unusual mistakes that matchine learning algorithm make, was not a serious avenue of study until the algorithm started to work very well most of the time, as now that is actually the exception rather than the rule. Most machine learning techniques were designed to work on specific problem sets in which the training and test data are generated from the same statistical distribution. When those models are applied to the real world, adversaries may create some data that violates that statistical assumption.\n","\n","An adversarial attack is a method to generate adversarial examples. Therefore, an adversarial example is an input to a machine learning model that is purposely designed to cause a model to make a mistake in its predictions despite resembling a valid input to a human.\n","\n","For example, in the picture bellow we start with a panda that has not been modified in any way and the network train on the imageNet data set is able to recognize it as being a panda with about 60% confidence in that decision. The optimal direction to modify the image in a way that cause the network to make a mistake is given by the image in the middle, which looks a lot like noise to a human but it actually is carefully computed as a function of the parameters of the network. 
If the image of the structured attack is multiplied by a very small coefficient and added to the original picture, it would create a picture that a human cannot tell apart from the original panda, but that tiny change is enough to fool the network into recognizing the image of the panda as a gibbon with 99.9% confidence in that decision; thus, the attack is not merely finding the decision boundary and barely crossing it to change the decision.\n","\n","\n","

\n"," \n","

\n","\n","\n","\n"]},{"cell_type":"markdown","metadata":{"id":"gYV4Z8O_fqMF"},"source":["## The importance of robustness "]},{"cell_type":"markdown","metadata":{"id":"-thmGRK44qD6"},"source":["We just seen examples of how easily machine learning models can be fooled into making wrong predictions by adding slight noise to the input that may be imperceptible to humans. This problem occurs, because we learn and use models without understanding how they work internally and whether they actually learn concepts that would mirror human cognition. While the panda gibbon example might look interesting, it is easy to reframe adversarial examples as safety and security problems.\n","\n","For examples, by taping a small stickers to a stop sign we can fool a vision system to recognize the sign as a \"Speed Limit 45” sign, implying that this might lead to wrong and dangerous actions taken by an autonomous car or even natural changes to inputs may lead to wrong predictions with safety consequences. The most common examples are self driving cars driving in foggy weather conditions or with a slightly smeared camera, all resulting in small modifications of the input. The pictures in bellow are examples of possible input transformations, mirroring potential conditions in the real world for a self driving system leading to wrong predictions of the steering angle.\n","

\n"," \n","

\n","\n","Therefore, the models that are used should be robust to these kinks of attacks. Robustness is the idea that a model’s prediction is stable to small variations in the input, because it’s prediction is based on reliable abstractions of the real task that mirror how a human would perform the task.\n","\n","More examples of the danger of adversarial attacks can be found on [this](https://rll.berkeley.edu/adversarial/) and [this](https://arxiv.org/abs/1701.04143) papers."]},{"cell_type":"markdown","metadata":{"id":"iEn8GUyrfqMO"},"source":["## Attacks "]},{"cell_type":"markdown","metadata":{"id":"CjLahM_Vbqz2"},"source":["Machine learning algorithms develop their behavior through experience. They are complex mathematical functions that transform inputs to outputs. If a machine learning tags an image as containing a specific object, it has found the pixel values in that image to be statistically similar to other images of the object it has processed during training.\n","\n","Adversarial attacks exploit this characteristic to confound machine learning algorithms by manipulating their input data. For instance, by adding tiny and inconspicuous patches of pixels to an image.\n","\n","in the following, some of the known current techniques for generating adversarial examples are listed."]},{"cell_type":"markdown","metadata":{"id":"_4ZBEZCDfqMO"},"source":["### PGD "]},{"cell_type":"markdown","metadata":{"id":"UcGXgwZ_fqMP"},"source":["Fill later\n"]},{"cell_type":"code","metadata":{"id":"1Z203XsFfG4x"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"zoqWqW24fqMQ"},"source":["### FGSM "]},{"cell_type":"markdown","metadata":{"id":"4L4mL1nt5d99"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"UstdBiu75eKS"},"source":["### DeepFool "]},{"cell_type":"markdown","metadata":{"id":"pxixV6SG6ldE"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"acf3jhH7fqMS"},"source":["## Robustness "]},{"cell_type":"markdown","metadata":{"id":"3V8OgzSpff0S"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"r4gF7qMR5zwm"},"source":["### Adversarial training "]},{"cell_type":"markdown","metadata":{"id":"Zbj9Zxlp6CxS"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"kHneZ5816DET"},"source":["### Provable defenses "]},{"cell_type":"markdown","metadata":{"id":"eebanxiL59Oq"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"29Hx95TFfqMU"},"source":["## Example \n"]},{"cell_type":"markdown","metadata":{"id":"CdJiNHvH6nWn"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"8BDIzkJhfqMV"},"source":["## References \n"]},{"cell_type":"markdown","metadata":{"id":"TAfujAPs6n1B"},"source":["Fill later\n"]}]} \ No newline at end of file diff --git a/notebooks/adversarial_attack/metadata.yml b/notebooks/adversarial_attack/metadata.yml new file mode 100644 index 0000000..7e88335 --- /dev/null +++ b/notebooks/adversarial_attack/metadata.yml @@ -0,0 +1,29 @@ +title: Adversarial attacks and robustness + +meta: + - name: keywords + content: Artificial Intelligence, Adversarial attacks, robustness + +header: + title: Adversarial attacks and robustness + description: | + In this notebook we talk about diffrent kinds of attacks on machine learning and how to resist them. 
+ +authors: + label: + position: top + text: Authors + kind: people + content: + - name: matinamehdizadeh + role: Author + contact: + - link: https://github.com/matinamehdizadeh + icon: fab fa-github + - link: mailto://matinamehdizadeh@gmail.com + icon: fas fa-envelope + +comments: + label: false + kind: comments + From 9d4b29b96e4ba2fca9faa5eff6ddbb00fe14a6c4 Mon Sep 17 00:00:00 2001 From: "m.mahdizadeh" Date: Sat, 2 Oct 2021 01:49:41 +0330 Subject: [PATCH 2/3] correct some spelling --- notebooks/adversarial_attack/adversarial attacks.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/notebooks/adversarial_attack/adversarial attacks.ipynb b/notebooks/adversarial_attack/adversarial attacks.ipynb index eba874a..ef20580 100644 --- a/notebooks/adversarial_attack/adversarial attacks.ipynb +++ b/notebooks/adversarial_attack/adversarial attacks.ipynb @@ -1 +1 @@ -{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"index.ipynb","provenance":[],"collapsed_sections":[],"toc_visible":true},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.8"}},"cells":[{"cell_type":"markdown","metadata":{"id":"LDerBS_foQl2"},"source":["# Adversarial attaks and robustness\n","\n","\n"]},{"cell_type":"markdown","metadata":{"id":"a5z34ayPNA13"},"source":["# Table of Contents\n","1. [Introduction](#introduction)\n","2. [The importance of robustness](#importance)\n","3. [Attacks](#attack)\n"," * [PGD](#pgd)\n"," * [FGSM](#fgsm)\n"," * [DeepFool](#deepfool)\n","5. [robustness](#robustness)\n"," * [Adversarial training](#advers_train)\n"," * [Provable defenses](#prove_defense)\n","6. [Example](#code_example)\n","7. [References](#references)"]},{"cell_type":"markdown","metadata":{"id":"2OotVuBgfqME"},"source":["## Introduction "]},{"cell_type":"markdown","metadata":{"id":"ZeHq0bAh4c2u"},"source":["The problem of unusual mistakes that matchine learning algorithm make, was not a serious avenue of study until the algorithm started to work very well most of the time, as now that is actually the exception rather than the rule. Most machine learning techniques were designed to work on specific problem sets in which the training and test data are generated from the same statistical distribution. When those models are applied to the real world, adversaries may create some data that violates that statistical assumption.\n","\n","An adversarial attack is a method to generate adversarial examples. Therefore, an adversarial example is an input to a machine learning model that is purposely designed to cause a model to make a mistake in its predictions despite resembling a valid input to a human.\n","\n","For example, in the picture bellow we start with a panda that has not been modified in any way and the network train on the imageNet data set is able to recognize it as being a panda with about 60% confidence in that decision. The optimal direction to modify the image in a way that cause the network to make a mistake is given by the image in the middle, which looks a lot like noise to a human but it actually is carefully computed as a function of the parameters of the network. 
If the image of the structured attack is multiplied by a very small coefficient and added to the original picture, it would create a picture that a human cannot tell from the original panda, but that tiny change is enough to fool the network into recognizing the image of panda as being a gibbon with 99.9% confidence in that decision; thus, it is not barely fining the decision boundary and barely crossing it to change the decision.\n","\n","\n","

\n"," \n","

\n","\n","\n","\n"]},{"cell_type":"markdown","metadata":{"id":"gYV4Z8O_fqMF"},"source":["## The importance of robustness "]},{"cell_type":"markdown","metadata":{"id":"-thmGRK44qD6"},"source":["We just seen examples of how easily machine learning models can be fooled into making wrong predictions by adding slight noise to the input that may be imperceptible to humans. This problem occurs, because we learn and use models without understanding how they work internally and whether they actually learn concepts that would mirror human cognition. While the panda gibbon example might look interesting, it is easy to reframe adversarial examples as safety and security problems.\n","\n","For examples, by taping a small stickers to a stop sign we can fool a vision system to recognize the sign as a \"Speed Limit 45” sign, implying that this might lead to wrong and dangerous actions taken by an autonomous car or even natural changes to inputs may lead to wrong predictions with safety consequences. The most common examples are self driving cars driving in foggy weather conditions or with a slightly smeared camera, all resulting in small modifications of the input. The pictures in bellow are examples of possible input transformations, mirroring potential conditions in the real world for a self driving system leading to wrong predictions of the steering angle.\n","

\n"," \n","

\n","\n","Therefore, the models that are used should be robust to these kinks of attacks. Robustness is the idea that a model’s prediction is stable to small variations in the input, because it’s prediction is based on reliable abstractions of the real task that mirror how a human would perform the task.\n","\n","More examples of the danger of adversarial attacks can be found on [this](https://rll.berkeley.edu/adversarial/) and [this](https://arxiv.org/abs/1701.04143) papers."]},{"cell_type":"markdown","metadata":{"id":"iEn8GUyrfqMO"},"source":["## Attacks "]},{"cell_type":"markdown","metadata":{"id":"CjLahM_Vbqz2"},"source":["Machine learning algorithms develop their behavior through experience. They are complex mathematical functions that transform inputs to outputs. If a machine learning tags an image as containing a specific object, it has found the pixel values in that image to be statistically similar to other images of the object it has processed during training.\n","\n","Adversarial attacks exploit this characteristic to confound machine learning algorithms by manipulating their input data. For instance, by adding tiny and inconspicuous patches of pixels to an image.\n","\n","in the following, some of the known current techniques for generating adversarial examples are listed."]},{"cell_type":"markdown","metadata":{"id":"_4ZBEZCDfqMO"},"source":["### PGD "]},{"cell_type":"markdown","metadata":{"id":"UcGXgwZ_fqMP"},"source":["Fill later\n"]},{"cell_type":"code","metadata":{"id":"1Z203XsFfG4x"},"source":[""],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"zoqWqW24fqMQ"},"source":["### FGSM "]},{"cell_type":"markdown","metadata":{"id":"4L4mL1nt5d99"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"UstdBiu75eKS"},"source":["### DeepFool "]},{"cell_type":"markdown","metadata":{"id":"pxixV6SG6ldE"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"acf3jhH7fqMS"},"source":["## Robustness "]},{"cell_type":"markdown","metadata":{"id":"3V8OgzSpff0S"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"r4gF7qMR5zwm"},"source":["### Adversarial training "]},{"cell_type":"markdown","metadata":{"id":"Zbj9Zxlp6CxS"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"kHneZ5816DET"},"source":["### Provable defenses "]},{"cell_type":"markdown","metadata":{"id":"eebanxiL59Oq"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"29Hx95TFfqMU"},"source":["## Example \n"]},{"cell_type":"markdown","metadata":{"id":"CdJiNHvH6nWn"},"source":["Fill later\n"]},{"cell_type":"markdown","metadata":{"id":"8BDIzkJhfqMV"},"source":["## References \n"]},{"cell_type":"markdown","metadata":{"id":"TAfujAPs6n1B"},"source":["Fill later\n"]}]} \ No newline at end of file +{"nbformat":4,"nbformat_minor":2,"metadata":{"colab":{"name":"index.ipynb","provenance":[],"collapsed_sections":[],"toc_visible":true},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.8"}},"cells":[{"cell_type":"markdown","source":["# Adversarial attaks and robustness\n","\n","\n"],"metadata":{"id":"LDerBS_foQl2"}},{"cell_type":"markdown","source":["# Table of Contents\n","1. [Introduction](#introduction)\n","2. [The importance of robustness](#importance)\n","3. 
[Attacks](#attack)\n"," * [PGD](#pgd)\n"," * [FGSM](#fgsm)\n"," * [DeepFool](#deepfool)\n","5. [robustness](#robustness)\n"," * [Adversarial training](#advers_train)\n"," * [Provable defenses](#prove_defense)\n","6. [Example](#code_example)\n","7. [References](#references)"],"metadata":{"id":"a5z34ayPNA13"}},{"cell_type":"markdown","source":["## Introduction "],"metadata":{"id":"2OotVuBgfqME"}},{"cell_type":"markdown","source":["The problem of unusual mistakes that machine learning algorithms make, was not a serious avenue of study until the algorithm started to work very well most of the time, as now that is actually the exception rather than the rule. Most machine learning techniques were designed to work on specific problem sets in which the training and test data are generated from the same statistical distribution. When those models are applied to the real world, adversaries may create some data that violates that statistical assumption.\r\n","An adversarial attack is a method to generate adversarial examples. Therefore, an adversarial example is an input to a machine learning model that is purposely designed to cause a model to make a mistake in its predictions despite resembling a valid input to a human.\r\n","For example, in the picture below we start with a panda that has not been modified in any way and the network train on the imageNet data set is able to recognize it as being a panda with about 60% confidence in that decision. The optimal direction to modify the image in a way that cause the network to make a mistake is given by the image in the middle, which looks a lot like noise to a human but it actually is carefully computed as a function of the parameters of the network. If the image of the structured attack is multiplied by a very small coefficient and added to the original picture, it would create a picture that a human cannot tell from the original panda, but that tiny change is enough to fool the network into recognizing the image of panda as being a gibbon with 99.9% confidence in that decision; thus, it is not barely fining the decision boundary and barely crossing it to change the decision.\r\n","\r\n","\r\n","\r\n","\r\n","

\r\n"," \r\n","

\r\n","\r\n","\r\n","\r\n"],"metadata":{"id":"ZeHq0bAh4c2u"}},{"cell_type":"markdown","source":["## The importance of robustness "],"metadata":{"id":"gYV4Z8O_fqMF"}},{"cell_type":"markdown","source":["We just saw examples of how easily machine learning models can be fooled into making wrong predictions by adding slight noise to the input that may be unnoticeable to humans. This problem occurs, because we learn and use models without understanding how they work internally and whether they actually learn concepts that would mirror human cognition. While the panda gibbon example might look interesting, it is easy to reframe adversarial examples as safety and security problems.\r\n","\r\n","For examples, by taping a small sticker to a stop sign we can fool a vision system to recognize the sign as a \"Speed Limit 45” sign, implying that this might lead to wrong and dangerous actions taken by an autonomous car or even natural changes to inputs may lead to wrong predictions with safety consequences. The most common examples are self driving cars driving in foggy weather conditions or with a slightly smeared camera, all resulting in small modifications of the input. The pictures in below are examples of possible input transformations, mirroring potential conditions in the real world for a self driving system leading to wrong predictions of the navigation angle.\r\n","

\r\n"," \r\n","

\r\n","\r\n","Therefore, the models that are used should be robust to these kinks of attacks. Robustness is the idea that a model’s prediction is stable to small variations in the input, because it’s prediction is based on reliable concepts of the real task that mirror how a human would perform the task.\r\n","\r\n","More examples of the danger of adversarial attacks can be found on [this](https://rll.berkeley.edu/adversarial/) and [this](https://arxiv.org/abs/1701.04143) papers."],"metadata":{"id":"-thmGRK44qD6"}},{"cell_type":"markdown","source":["## Attacks "],"metadata":{"id":"iEn8GUyrfqMO"}},{"cell_type":"markdown","source":["Machine learning algorithms develop their behavior through experience. They are complex mathematical functions that transform inputs to outputs. If a machine learning tags an image as containing a specific object, it has found the pixel values in that image to be statistically similar to other images of the object it has processed during training.\r\n","\r\n","Adversarial attacks exploit this characteristic to confuse machine learning algorithms by manipulating their input data. For instance, by adding tiny and unremarkable patches of pixels to an image.\r\n","\r\n","in the following, some of the known current techniques for generating adversarial examples are listed."],"metadata":{"id":"CjLahM_Vbqz2"}},{"cell_type":"markdown","source":["### PGD "],"metadata":{"id":"_4ZBEZCDfqMO"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"UcGXgwZ_fqMP"}},{"cell_type":"code","execution_count":null,"source":[],"outputs":[],"metadata":{"id":"1Z203XsFfG4x"}},{"cell_type":"markdown","source":["### FGSM "],"metadata":{"id":"zoqWqW24fqMQ"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"4L4mL1nt5d99"}},{"cell_type":"markdown","source":["### DeepFool "],"metadata":{"id":"UstdBiu75eKS"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"pxixV6SG6ldE"}},{"cell_type":"markdown","source":["## Robustness "],"metadata":{"id":"acf3jhH7fqMS"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"3V8OgzSpff0S"}},{"cell_type":"markdown","source":["### Adversarial training "],"metadata":{"id":"r4gF7qMR5zwm"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"Zbj9Zxlp6CxS"}},{"cell_type":"markdown","source":["### Provable defenses "],"metadata":{"id":"kHneZ5816DET"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"eebanxiL59Oq"}},{"cell_type":"markdown","source":["## Example \n"],"metadata":{"id":"29Hx95TFfqMU"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"CdJiNHvH6nWn"}},{"cell_type":"markdown","source":["## References \n"],"metadata":{"id":"8BDIzkJhfqMV"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"TAfujAPs6n1B"}}]} \ No newline at end of file From c33365d5a5350a4403cf7d5beb7e8fad557968e9 Mon Sep 17 00:00:00 2001 From: matinamehdizadeh Date: Mon, 22 Aug 2022 14:21:42 +0430 Subject: [PATCH 3/3] add notebooks --- notebooks/MLOps/index.ipynb | 480 +++++++++++++ .../metadata.yml | 8 +- .../Recurrent Neural Networks/index.ipynb | 663 ++++++++++++++++++ .../Recurrent Neural Networks/metadata.yml | 29 + .../adversarial attacks.ipynb | 1 - 5 files changed, 1176 insertions(+), 5 deletions(-) create mode 100644 notebooks/MLOps/index.ipynb rename notebooks/{adversarial_attack => MLOps}/metadata.yml (61%) create mode 100644 notebooks/Recurrent Neural Networks/index.ipynb create mode 100644 notebooks/Recurrent Neural Networks/metadata.yml delete 
mode 100644 notebooks/adversarial_attack/adversarial attacks.ipynb diff --git a/notebooks/MLOps/index.ipynb b/notebooks/MLOps/index.ipynb new file mode 100644 index 0000000..06f279d --- /dev/null +++ b/notebooks/MLOps/index.ipynb @@ -0,0 +1,480 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "LDerBS_foQl2" + }, + "source": [ + "# MLOps\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "a5z34ayPNA13" + }, + "source": [ + "# Table of Contents\n", + "1. [Introduction](#introduction)\n", + "2. [Machine Learning Lifecycle](#mll)\n", + "3. [MLOps Tools](#tools)\n", + " * [Data Management](#data)\n", + " * [Modeling](#model)\n", + " * [Operationalization](#operation)\n", + "4. [Example](#code_example)\n", + "5. [Conclusion](#conclusion)\n", + "6. [References](#references)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2OotVuBgfqME" + }, + "source": [ + "## Introduction " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZeHq0bAh4c2u" + }, + "source": [ + "MLOps, also known as Machine Learning Operations for Production, is a set of standardized practices that can be utilized to build, deploy, and govern the lifecycle of ML models. This setup helps to ease the interaction among cross-functional teams and provides an automated platform to keep track of everything required for the complete cycle of ML models. MLOps practices also result in increased scalability, security, and reliability of the ML systems, leading to shorter development cycles and escalated profits from the ML projects. \n", + "\n", + "
\n", + "
\n", + "\n", + "

\n", + " \n", + "

\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gYV4Z8O_fqMF" + }, + "source": [ + "## Machine Learning Lifecycle " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-thmGRK44qD6" + }, + "source": [ + "MLOps lifecycle has seven different stages. All the processes happen iteratively, and the success of the entire machine learning system comes with the successful execution of each of these stages.\n", + "\n", + "The machine learning lifecycle is the process of developing, deploying, and managing a machine learning model for a specific application. The lifecycle typically consists of:\n", + "\n", + "

\n", + " \n", + "

\n", + "\n", + "ML Development: This is the basic step that involves creating a complete pipeline beginning from data processing to model training and evaluation codes. \n", + "\n", + "Model Training: Once the setup is ready, the next logical step is to train the model. Here, continuous training functionality is also needed to adapt to new data or address specific changes. \n", + "\n", + "Model Evaluation: Performing inference over the trained model and checking the accuracy/correctness of the output results. \n", + "\n", + "Model Deployment: When the proof of concept stage is accomplished, the other part is to deploy the model according to the industry requirements to face the real-life data. \n", + "\n", + "Prediction Serving: After deployment, the model is now ready to serve predictions over the incoming data. \n", + "\n", + "Model Monitoring: Over time, problems such as concept drift can make the results inaccurate hence continuous monitoring of the model is essential to ensure proper functioning. \n", + "\n", + "Data and Model Management: It is a part of the central system that manages the data and models. It includes maintaining storage, keeping track of different versions, ease of accessibility, security, and configuration across various cross-functional teams. \n", + "\n", + "\n", + "Models are deployed across the organization and in various systems without a consistent way to monitor them. Models have been in production for a long time and never refreshed." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iEn8GUyrfqMO" + }, + "source": [ + "## MLOps Tools " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CjLahM_Vbqz2" + }, + "source": [ + "One of the challenges in ML lifecycle management is manual labor. Every step and the transition between steps are manual. It means data scientists need to collect, analyze, and process data for each application manually. They need to examine their older models to develop new ones and manually fine-tune each time. A large amount of time is allocated to model monitoring to prevent performance degradation. A successful deployment of machine learning models at scale requires automation of steps of the lifecycle. Automation decreases the time allocated to resource-consuming steps such as feature engineering, model training, monitoring, and retraining. It frees up time to rapidly experiment with new models.\n", + "\n", + "The MLOps tools help organizations apply DevOps practices to the process of creating and using AI and machine learning models. These tools are typically used by machine learning engineers, data scientists, and DevOps engineers. MLOps tools can be divided into three major areas.\n", + "\n", + "

\n", + " \n", + "

\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_4ZBEZCDfqMO" + }, + "source": [ + "### Data Management " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UcGXgwZ_fqMP" + }, + "source": [ + "MLOps Tools for data management consist of data labeling tools which are used to label large volumes of data such as texts, images, or audios and data versioning tools which enable managing different versions of datasets and storing them in an accessible and well-organized way.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zoqWqW24fqMQ" + }, + "source": [ + "### Modeling " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4L4mL1nt5d99" + }, + "source": [ + "MLOps Tools for modeling consist of feature engineering tools that automate the process of extracting useful features from raw datasets to create better training data for machine learning models like [Feast](https://github.com/feast-dev/feast). Another tool is for experiment tracking which save all the necessary information about different experiments like [MLFlow](https://mlflow.org) and the last tool is for Hyperparameter Optimization that automate the process of searching and selecting hyperparameters that give optimal performance for machine learning models." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UstdBiu75eKS" + }, + "source": [ + "### Operationalization " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pxixV6SG6ldE" + }, + "source": [ + "MLOps Tools for operationalization consist of model deployment tools which facilitate integrating ML models into a production environment to make predictions like [Kubeflow](https://www.kubeflow.org). the other tool concerning operationalization is for model monitoring which detect data drifts and anomalies over time and allow setting up alerts in case of performance issues." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "29Hx95TFfqMU" + }, + "source": [ + "## Example \n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CdJiNHvH6nWn" + }, + "source": [ + "In this section we see an example of ml lifrcycle using MLFlow. MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It is designed to work with any machine learning library, determine most things about your code by convention, and require minimal changes to integrate into an existing codebase.\n", + "First, we install and import nessecary packages." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we define our metric for evaluation." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the next cell, we first read the wine-quality csv file from the URL and then split the data into training and test sets.\n", + "Then, we split the target from our data set which is the quality column and at the end we register our model.\n", + "\n", + "The mlflow.start_run function start a new MLflow run, setting it as the active run under which metrics and parameters will be logged, mlflow.log_metric function logs a single key-value metric, mlflow.log_param function logs a single key-value param in the currently active run, mlflow.log_artifact function logs a local file or directory as an artifact and mlflow.set_tracking_uri function set tracking store URI.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Overwriting train.py\n" + ] + } + ], + "source": [ + "%%writefile train.py\n", + "# !pip install mlflow\n", + "import os\n", + "import warnings\n", + "import sys\n", + "\n", + "import pandas as pd\n", + "import numpy as np\n", + "from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score\n", + "from sklearn.model_selection import train_test_split\n", + "from sklearn.linear_model import ElasticNet\n", + "from urllib.parse import urlparse\n", + "import mlflow\n", + "import mlflow.sklearn\n", + "\n", + "import logging\n", + "\n", + "logging.basicConfig(level=logging.WARN)\n", + "logger = logging.getLogger(__name__)\n", + "\n", + "\n", + "def eval_metrics(actual, pred):\n", + " rmse = np.sqrt(mean_squared_error(actual, pred))\n", + " mae = mean_absolute_error(actual, pred)\n", + " r2 = r2_score(actual, pred)\n", + " return rmse, mae, r2\n", + "\n", + "warnings.filterwarnings(\"ignore\")\n", + "np.random.seed(40)\n", + "\n", + "csv_url = (\"http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv\")\n", + "try:\n", + " data = pd.read_csv(csv_url, sep=\";\")\n", + "except Exception as e:\n", + " logger.exception(\"Unable to download training & test CSV, check your internet connection. 
Error: %s\", e)\n", + "\n", + "train, test = train_test_split(data)\n", + "\n", + "train_x = train.drop([\"quality\"], axis=1)\n", + "test_x = test.drop([\"quality\"], axis=1)\n", + "train_y = train[[\"quality\"]]\n", + "test_y = test[[\"quality\"]]\n", + "\n", + "alpha = float(sys.argv[1]) if len(sys.argv) > 1 else 0.5\n", + "l1_ratio = float(sys.argv[2]) if len(sys.argv) > 2 else 0.5\n", + "\n", + "with mlflow.start_run():\n", + " lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)\n", + " lr.fit(train_x, train_y)\n", + "\n", + " predicted_qualities = lr.predict(test_x)\n", + "\n", + " (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)\n", + "\n", + " print(\"Elasticnet model (alpha=%f, l1_ratio=%f):\" % (alpha, l1_ratio))\n", + " print(\" RMSE: %s\" % rmse)\n", + " print(\" MAE: %s\" % mae)\n", + " print(\" R2: %s\" % r2)\n", + "\n", + " mlflow.log_param(\"alpha\", alpha)\n", + " mlflow.log_param(\"l1_ratio\", l1_ratio)\n", + " mlflow.log_metric(\"rmse\", rmse)\n", + " mlflow.log_metric(\"r2\", r2)\n", + " mlflow.log_metric(\"mae\", mae)\n", + "\n", + " tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme\n", + "\n", + " if tracking_url_type_store != \"file\":\n", + " mlflow.sklearn.log_model(lr, \"model\", registered_model_name=\"ElasticnetWineModel\")\n", + " else:\n", + " mlflow.sklearn.log_model(lr, \"model\")" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Elasticnet model (alpha=0.600000, l1_ratio=0.800000):\n", + " RMSE: 0.8326325509502465\n", + " MAE: 0.6676500690618903\n", + " R2: 0.0177082428508879\n" + ] + } + ], + "source": [ + "!python train.py 0.6 0.8" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!mlflow ui" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, we serve our model which is to host machine-learning models (on the cloud or on premises) and to make their functions available via API so that applications can incorporate AI into their systems. Model serving is crucial, as a business cannot offer AI products to a large user base without making its product accessible." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!mlflow models serve -m \"/Users/model\" --no-conda -p 1234" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!curl -X POST -H \"Content-Type:application/json; format=pandas-split\" --data '{\"columns\":[\"alcohol\", \"chlorides\", \"citric acid\", \"density\", \"fixed acidity\", \"free sulfur dioxide\", \"pH\", \"residual sugar\", \"sulphates\", \"total sulfur dioxide\", \"volatile acidity\"],\"data\":[[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]]}' http://127.0.0.1:1234/invocations" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next step is to deploy our model using ducker. First we build the image and then deploy it to our cluster. 
One way to do this is by applying the respective Kubernetes manifests through the kubectl CLI" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!mlflow models build-docker \\\n", + " -m ./mlruns/0/d1a8010b10f84f5a9b0a51e2b420efb2/artifacts/model \\\n", + " -n my-docker-image \\\n", + " --enable-mlserver\n", + "\n", + "!kubectl apply -f my-config.yaml" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%writefile my-manifest.yaml\n", + "\n", + "apiVersion: serving.kserve.io/v1beta1\n", + "kind: InferenceService\n", + "metadata:\n", + " name: mlflow-model\n", + "spec:\n", + " predictor:\n", + " containers:\n", + " - name: mlflow-model\n", + " image: my-docker-image\n", + " ports:\n", + " - containerPort: 8080\n", + " protocol: TCP\n", + " env:\n", + " - name: PROTOCOL\n", + " value: v2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8BDIzkJhfqMV" + }, + "source": [ + "## Conclusion \n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TAfujAPs6n1B" + }, + "source": [ + "MLOps solution provides data scientists with an easier and efficient way to maintain monitor models. By getting models into production and bridging the gap between the stakeholder teams, they can focus on data science. With the help of MLOps, deployment can be done on any platform.\n", + "\n", + "In this nootboke we talk about MLOps and its lifecycle and the nessecity of using it. and at the end we saw an simple example of developing and deploying a model using MLFlow which is a library used for MLOps in python. " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## References \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "[MLOps concepts for busy engineers: model serving](https://spell.ml/blog/mlops-concepts-model-serving-X385lREAACcAAGzS)\n", + "
\n", + "[MLOps Principles](https://ml-ops.org/content/mlops-principles)\n", + "
\n", + "[MLOps Python Tutorial for Beginners -Get Started with MLOps](https://www.projectpro.io/data-science-in-python-tutorial/mlops-python-tutorial-for-beginners#mcetoc_1fglt18dug)\n", + "
\n", + "[The MLOps–A Complete Guide and tutorial](https://www.devopsschool.com/blog/the-mlops-a-complete-guide-and-tutorial/)\n", + "
\n", + "[Machine Learning, Pipelines, Deployment and MLOps Tutorial](https://www.datacamp.com/tutorial/tutorial-machine-learning-pipelines-mlops-deployment#why-mlops-)\n", + "
\n", + "[Introduction to MLOps](https://www.youtube.com/watch?v=Kvxaj6pHeVA)" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "index.ipynb", + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.8" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/notebooks/adversarial_attack/metadata.yml b/notebooks/MLOps/metadata.yml similarity index 61% rename from notebooks/adversarial_attack/metadata.yml rename to notebooks/MLOps/metadata.yml index 7e88335..64c9968 100644 --- a/notebooks/adversarial_attack/metadata.yml +++ b/notebooks/MLOps/metadata.yml @@ -1,13 +1,13 @@ -title: Adversarial attacks and robustness +title: MLOps meta: - name: keywords - content: Artificial Intelligence, Adversarial attacks, robustness + content: Artificial Intelligence, MLOps header: - title: Adversarial attacks and robustness + title: MLOps description: | - In this notebook we talk about diffrent kinds of attacks on machine learning and how to resist them. + In this notebook we talk about MLOps with a simple example of model deployment. authors: label: diff --git a/notebooks/Recurrent Neural Networks/index.ipynb b/notebooks/Recurrent Neural Networks/index.ipynb new file mode 100644 index 0000000..0456f32 --- /dev/null +++ b/notebooks/Recurrent Neural Networks/index.ipynb @@ -0,0 +1,663 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "LDerBS_foQl2" + }, + "source": [ + "# Recurrent Neural Networks\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "a5z34ayPNA13" + }, + "source": [ + "# Table of Contents\n", + "1. [Introduction](#introduction)\n", + "2. [Training](#train)\n", + "3. [Architectures](#architectures)\n", + " * [One to Many](#otm)\n", + " * [Many to One](#mto)\n", + " * [Many to Many](#mtm)\n", + "6. [Example](#code_example)\n", + "6. [Conclusion](#conclusion)\n", + "7. [References](#references)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2OotVuBgfqME" + }, + "source": [ + "## Introduction " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZeHq0bAh4c2u" + }, + "source": [ + "Traditional feed-forward neural networks take in a fixed amount of input data all at the same time and produce a fixed amount of output each time. However, in some context in machine learning we want to have more flexibility in the types of data that our model can process. therefore, we move to this idea of recurrent neural networks (RNN). A recurrent neural network is a special type of an artificial neural network adapted to work for time series data or data that involves sequences; Meaning, RNNs do not consume all the input data at once. Instead, they take them in one at a time and in a sequence. At each step, the RNN does a series of calculations before producing an output. The output, known as the hidden state, is then combined with the next input in the sequence to produce another output. This process continues until the model is programmed to finish or the input sequence ends. To sum up, RNNs have the concept of memory that helps them store the states or information of previous inputs to generate the next output of the sequence.\n", + "\n", + "
\n", + "\n", + "

\n", + " \n", + "

\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gYV4Z8O_fqMF" + }, + "source": [ + "## Training " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-thmGRK44qD6" + }, + "source": [ + "We can think about RNNs in two ways. one is this concept of having a hidden state that feeds back at itself recurrently. The other one is to think about unrolling this computational graph for multiple time steps. This would help understanding the recurrent network easier.\n", + "\n", + "

\n", + " \n", + "

\n", + "\n", + " $x_t$ is the input at time step t. To keep things simple we assume that $x_t$ is a scalar value with a single feature. You can extend this idea to a d-dimensional feature vector.\n", + "
\n", + " $o_t$ is the output of the network at time step t. We can produce multiple outputs in the network but for this example we assume that there is one output.\n", + "
\n", + " $h_t$ vector stores the values of the hidden states at time t. This is also called the current context. $h_0$ vector is initialized to zero.\n", + "
\n", + " $w_t$ is weight matrix.\n", + "
\n", + " At every time step we can unfold the network for k time steps to get the output at time step k+1. The unfolded network is very similar to the feedforward neural network.\n", + "
\n", + " Now that we are seeing recurrent neural network as an feedforward neural network with k step, we can easily compute the outputs.\n", + "\n", + "
\n", + " $h_t = f_w(h_{t-1}, x_t) = tanh(w_{hh}h_{t-1} + w_{xh}x_t)$\n", + "
\n", + " $y_t = w_{yh}h_t$\n", + "
\n", + " \n", + " During training, for each piece of training data we will have a corresponding ground-truth label that we want the model to output. After receiving these outputs, we will calculate the loss of that process, which measures how far off, the model’s output is from the correct answer. Using this loss, we can calculate the gradient of the loss function for back-propagation.\n", + "With the gradient that we just obtained, we can update the weights in the model accordingly. Combined with the forward pass, back-propagation is looped over and again, allowing the model to become more accurate with its outputs each time as the weight matrices values are modified to pick out the patterns of the data.\n", + "\n", + "Although it may look as if each RNN cell is using a different weight as shown in the graphics, all of the weights are actually the same as that RNN cell is essentially being re-used throughout the process. This may lead to one of RNNs disadvantages which is the vanishing gradient problem, where the gradients used to compute the weight update may get very close to zero due to multiplication of the same matrix over and over again which prevents the network from learning new weights. The deeper the network, the more pronounced is this problem.\n", + "\n", + "The pseudo-code for training is given below. The value of k which is the recursion factor can be selected by the user for training. In the pseudo-code below $p_t$ is the target value at time step t:\n", + "\n", + "Repeat till stopping criterion is met:\n", + "
\n", + "Set all h to zero.\n", + "
\n", + "Repeat for t = 0 to k\n", + "
\n", + "Forward propagate the network over the unfolded network for k time steps to compute all h and y.\n", + "
\n", + "Compute the error as: $error = y_{k} - p_{k}$\n", + "
\n", + "Backpropagate the error across the unfolded network and update the weights.\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iEn8GUyrfqMO" + }, + "source": [ + "## Architectures " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CjLahM_Vbqz2" + }, + "source": [ + "RNNs are really flexible and can adapt to your needs. As you will see in the images below, your input and output size can come in different forms, yet they can still be fed and extracted from the RNN model. There are different types of recurrent neural networks with varying architectures that are shown below." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_4ZBEZCDfqMO" + }, + "source": [ + "### One to Many " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UcGXgwZ_fqMP" + }, + "source": [ + "This type of neural network has a input which is an object of fixed size like an image and the output is a sequence of variable lenght, such as a caption where diffrent captions might have diffrent number of words, so our output needs to be variable at lenght. \n", + "
\n", + "\n", + "

\n", + " \n", + "

" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zoqWqW24fqMQ" + }, + "source": [ + "### Many to One " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4L4mL1nt5d99" + }, + "source": [ + "This RNN takes a sequence of inputs that could be variably sized like a text and generates a single output. Sentiment analysis is a good example of this kind of network where a given sentence can be classified as expressing positive or negative sentiments or in a computer vision contex, you might imagine taking as input, a video which might have variable number of frames and we want to read this entire video of potentioally variable lenght and at the end, make a classification decision about the kind of activity that is going on in that video.\n", + "\n", + "

\n", + " \n", + "

" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UstdBiu75eKS" + }, + "source": [ + "### Many to Many " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pxixV6SG6ldE" + }, + "source": [ + "This RNN takes a sequence of inputs and generates a sequence of outputs. Machine translation is one of the examples where our input might be some sentence in English, which could have a variable lenght and our output is the same sentence but in French, which also could have a variable length and crucially the length of the English sentence might be diffrent from the lenght of the French sentence so we need some models that have the capacity to accept both variable length sequences on the input and the output.\n", + "\n", + "We might also consider problems in computer vision contex, where our input is variably length like a video sequence with variable number of frames and we want to make a decision for each element of that input sequence. which in the context of video, is making a classification decision along every frame of that video.\n", + "\n", + "

\n", + " \n", + "

\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As we saw above, RNNs are like a general paradigm for handling variable sized sequenced data that allow us to capture all of these diffrent types of setups in our models." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "29Hx95TFfqMU" + }, + "source": [ + "## Example \n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CdJiNHvH6nWn" + }, + "source": [ + "In this example we will be implementing a simple RNN character model with PyTorch to familiarize ourselves with the PyTorch library and get started with RNNs. \n", + "In this implementation, we will be building a model that can complete your sentence based on a few characters or a word used as input.\n", + "\n", + "We will start off by installing and importing the main packages that we will use." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "#!pip3 install torch\n", + "# !pip3 install numpy\n", + "import torch\n", + "from torch import nn\n", + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We have to set our device first. we would use gpu if available and cpu if not." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "GPU not available, CPU used\n" + ] + } + ], + "source": [ + "if torch.cuda.is_available():\n", + " device = torch.device(\"cuda\")\n", + " print(\"GPU is available\")\n", + "else:\n", + " device = torch.device(\"cpu\")\n", + " print(\"GPU not available, CPU used\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then, we will define the sentences that we want our model to output when fed with the first word or the first few characters and create a dictionary out of all the characters that we have in the sentences and map them to an integer." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "text = ['hey how are you','good i am fine','have a nice day']\n", + "chars = set(''.join(text))\n", + "int2char = dict(enumerate(chars))\n", + "char2int = {char: ind for ind, char in int2char.items()}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we will be padding our input sentences to ensure that all the sentences are of the sample length. While RNNs are typically able to take in variably sized inputs, we will usually want to feed training data in batches to speed up the training process. In order to used batches to train on our data, we'll need to ensure that each sequence within the input data are of equal size.\n", + "\n", + "Therefore, in most cases, padding can be done by filling up sequences that are too short with 0 values and trimming sequences that are too long. In our case, we'll be finding the length of the longest sequence and padding the rest of the sentences with blank spaces to match that length." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "maxlen = len(max(text, key=len))\n", + "\n", + "for i in range(len(text)):\n", + " while len(text[i]) (Batch Size, Sequence Length, One-Hot Encoding Size)\n" + ] + } + ], + "source": [ + "dict_size = len(char2int)\n", + "seq_len = maxlen - 1\n", + "batch_size = len(text)\n", + "\n", + "def one_hot_encode(sequence, dict_size, seq_len, batch_size):\n", + " features = np.zeros((batch_size, seq_len, dict_size), dtype=np.float32)\n", + " \n", + " for i in range(batch_size):\n", + " for u in range(seq_len):\n", + " features[i, u, sequence[i][u]] = 1\n", + " return features\n", + "\n", + "input_seq = one_hot_encode(input_seq, dict_size, seq_len, batch_size)\n", + "print(\"Input shape: {} --> (Batch Size, Sequence Length, One-Hot Encoding Size)\".format(input_seq.shape))\n", + "\n", + "input_seq = torch.from_numpy(input_seq)\n", + "target_seq = torch.Tensor(target_seq)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To start building our own neural network model, we can define a class that inherits PyTorch’s base class (nn.module) for all neural network modules. After doing so, we can start defining some variables and also the layers for our model under the constructor. For this model, we wll only be using one layer of RNN followed by a fully connected layer. The fully connected layer will be in-charge of converting the RNN output to our desired output shape.\n", + "\n", + "We also have to define the forward pass function under forward() as a class method. The order the forward function is sequentially executed, therefore have to pass the inputs and the zero-initialized hidden state through the RNN layer first, before passing the RNN outputs to the fully-connected layer.\n", + "\n", + "The last method that we have to define is the method that we called earlier to initialize the hidden state - init_hidden(). This basically creates a tensor of zeros in the shape of our hidden states.\n", + "\n", + "Then we create an instance of our model and initialize the hyperparameters and start the training process." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "class Model(nn.Module):\n", + " def __init__(self, input_size, output_size, hidden_dim, n_layers):\n", + " super(Model, self).__init__()\n", + "\n", + " self.hidden_dim = hidden_dim\n", + " self.n_layers = n_layers\n", + " self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True) \n", + " self.fc = nn.Linear(hidden_dim, output_size)\n", + " \n", + " def forward(self, x):\n", + " \n", + " batch_size = x.size(0)\n", + " hidden = self.init_hidden(batch_size)\n", + " out, hidden = self.rnn(x, hidden)\n", + " out = out.contiguous().view(-1, self.hidden_dim)\n", + " out = self.fc(out)\n", + " \n", + " return out, hidden\n", + " \n", + " def init_hidden(self, batch_size):\n", + " hidden = torch.zeros(self.n_layers, batch_size, self.hidden_dim).to(device)\n", + " return hidden" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "model = Model(input_size=dict_size, output_size=dict_size, hidden_dim=12, n_layers=1)\n", + "model = model.to(device)\n", + "\n", + "n_epochs = 100\n", + "lr=0.01\n", + "\n", + "criterion = nn.CrossEntropyLoss()\n", + "optimizer = torch.optim.Adam(model.parameters(), lr=lr)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Epoch: 10/100............. Loss: 2.5228\n", + "Epoch: 20/100............. Loss: 2.1118\n", + "Epoch: 30/100............. Loss: 1.7116\n", + "Epoch: 40/100............. Loss: 1.3229\n", + "Epoch: 50/100............. Loss: 0.9832\n", + "Epoch: 60/100............. Loss: 0.7112\n", + "Epoch: 70/100............. Loss: 0.5081\n", + "Epoch: 80/100............. Loss: 0.3617\n", + "Epoch: 90/100............. Loss: 0.2649\n", + "Epoch: 100/100............. Loss: 0.2016\n" + ] + } + ], + "source": [ + "input_seq = input_seq.to(device)\n", + "for epoch in range(1, n_epochs + 1):\n", + " optimizer.zero_grad() # Clears existing gradients from previous epoch\n", + " #input_seq = input_seq.to(device)\n", + " output, hidden = model(input_seq)\n", + " output = output.to(device)\n", + " target_seq = target_seq.to(device)\n", + " loss = criterion(output, target_seq.view(-1).long())\n", + " loss.backward() # Does backpropagation and calculates gradients\n", + " optimizer.step() # Updates the weights accordingly\n", + " \n", + " if epoch%10 == 0:\n", + " print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')\n", + " print(\"Loss: {:.4f}\".format(loss.item()))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we have to test our model." 
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def predict(model, character):\n",
+    "    character = np.array([[char2int[c] for c in character]])\n",
+    "    character = one_hot_encode(character, dict_size, character.shape[1], 1)\n",
+    "    character = torch.from_numpy(character)\n",
+    "    character = character.to(device)\n",
+    "    \n",
+    "    out, hidden = model(character)\n",
+    "\n",
+    "    # Take the scores of the last time step and pick the most likely character\n",
+    "    prob = nn.functional.softmax(out[-1], dim=0).data\n",
+    "    char_ind = torch.max(prob, dim=0)[1].item()\n",
+    "\n",
+    "    return int2char[char_ind], hidden"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 19,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def sample(model, out_len, start='hey'):\n",
+    "    model.eval()\n",
+    "    start = start.lower()\n",
+    "    chars = [ch for ch in start]\n",
+    "    size = out_len - len(chars)\n",
+    "    # Repeatedly feed the generated characters back in and append the prediction\n",
+    "    for ii in range(size):\n",
+    "        char, h = predict(model, chars)\n",
+    "        chars.append(char)\n",
+    "\n",
+    "    return ''.join(chars)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'good i am fine '"
+      ]
+     },
+     "execution_count": 20,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "sample(model, 15, 'good')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As we can see, the model is able to come up with the sentence ‘good i am fine ’ when we feed it the word ‘good’, which is what we intended it to do.\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Conclusion "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this notebook we discussed:\n",
+    "
\n", + "- How a recurrent neural network handles sequential data\n", + "
\n", + "- Unfolding a recurrent neural network\n", + "
\n", + "- Training and back propagation in time\n", + "
\n", + "- Various architectures and variants of RNN\n", + "
\n", + "- Simple example of a vanilla RNN\n", + "\n", + "This is just the tip of the iceberg when it comes to Recurrent Neural Networks. While the vanilla RNN is rarely used in solving NLP or sequential problems, having a good grasp of the basic concepts of RNNs will definitely aid in your understanding as you move towards the more popular GRUs and LSTMs." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8BDIzkJhfqMV" + }, + "source": [ + "## References \n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TAfujAPs6n1B" + }, + "source": [ + "[Recurrent Neural Networks Lecture, Stanford University School of Engineering](https://www.youtube.com/c/stanfordengineering)\n", + "
\n", + "[Recurrent Neural Network (RNN) Tutorial: Types, Examples, LSTM and More](https://www.simplilearn.com/tutorials/deep-learning-tutorial/rnn)\n", + "
\n", + "[RNN walkthrough](https://github.com/gabrielloye/RNN-walkthrough)\n", + "
\n", + "[An Introduction To Recurrent Neural Networks And The Math That Powers Them](https://machinelearningmastery.com/an-introduction-to-recurrent-neural-networks-and-the-math-that-powers-them/)\n", + "
\n", + "[A Tour of Recurrent Neural Network Algorithms for Deep Learning](https://machinelearningmastery.com/recurrent-neural-network-algorithms-for-deep-learning/)\n", + "
\n", + "[A Beginner’s Guide on Recurrent Neural Networks with PyTorch](https://blog.floydhub.com/a-beginners-guide-on-recurrent-neural-networks-with-pytorch/)" + ] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "index.ipynb", + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.6" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/notebooks/Recurrent Neural Networks/metadata.yml b/notebooks/Recurrent Neural Networks/metadata.yml new file mode 100644 index 0000000..7b8f304 --- /dev/null +++ b/notebooks/Recurrent Neural Networks/metadata.yml @@ -0,0 +1,29 @@ +title: Recurrent Neural Networks + +meta: + - name: keywords + content: Artificial Intelligence, Recurrent Neural Networks + +header: + title: Recurrent Neural Networks + description: | + In this notebook we talk about Recurrent Neural Networks. + +authors: + label: + position: top + text: Authors + kind: people + content: + - name: matinamehdizadeh + role: Author + contact: + - link: https://github.com/matinamehdizadeh + icon: fab fa-github + - link: mailto://matinamehdizadeh@gmail.com + icon: fas fa-envelope + +comments: + label: false + kind: comments + diff --git a/notebooks/adversarial_attack/adversarial attacks.ipynb b/notebooks/adversarial_attack/adversarial attacks.ipynb deleted file mode 100644 index ef20580..0000000 --- a/notebooks/adversarial_attack/adversarial attacks.ipynb +++ /dev/null @@ -1 +0,0 @@ -{"nbformat":4,"nbformat_minor":2,"metadata":{"colab":{"name":"index.ipynb","provenance":[],"collapsed_sections":[],"toc_visible":true},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.8"}},"cells":[{"cell_type":"markdown","source":["# Adversarial attaks and robustness\n","\n","\n"],"metadata":{"id":"LDerBS_foQl2"}},{"cell_type":"markdown","source":["# Table of Contents\n","1. [Introduction](#introduction)\n","2. [The importance of robustness](#importance)\n","3. [Attacks](#attack)\n"," * [PGD](#pgd)\n"," * [FGSM](#fgsm)\n"," * [DeepFool](#deepfool)\n","5. [robustness](#robustness)\n"," * [Adversarial training](#advers_train)\n"," * [Provable defenses](#prove_defense)\n","6. [Example](#code_example)\n","7. [References](#references)"],"metadata":{"id":"a5z34ayPNA13"}},{"cell_type":"markdown","source":["## Introduction "],"metadata":{"id":"2OotVuBgfqME"}},{"cell_type":"markdown","source":["The problem of unusual mistakes that machine learning algorithms make, was not a serious avenue of study until the algorithm started to work very well most of the time, as now that is actually the exception rather than the rule. Most machine learning techniques were designed to work on specific problem sets in which the training and test data are generated from the same statistical distribution. When those models are applied to the real world, adversaries may create some data that violates that statistical assumption.\r\n","An adversarial attack is a method to generate adversarial examples. 
Therefore, an adversarial example is an input to a machine learning model that is purposely designed to cause a model to make a mistake in its predictions despite resembling a valid input to a human.\r\n","For example, in the picture below we start with a panda that has not been modified in any way and the network train on the imageNet data set is able to recognize it as being a panda with about 60% confidence in that decision. The optimal direction to modify the image in a way that cause the network to make a mistake is given by the image in the middle, which looks a lot like noise to a human but it actually is carefully computed as a function of the parameters of the network. If the image of the structured attack is multiplied by a very small coefficient and added to the original picture, it would create a picture that a human cannot tell from the original panda, but that tiny change is enough to fool the network into recognizing the image of panda as being a gibbon with 99.9% confidence in that decision; thus, it is not barely fining the decision boundary and barely crossing it to change the decision.\r\n","\r\n","\r\n","\r\n","\r\n","

\r\n"," \r\n","

\r\n","\r\n","\r\n","\r\n"],"metadata":{"id":"ZeHq0bAh4c2u"}},{"cell_type":"markdown","source":["## The importance of robustness "],"metadata":{"id":"gYV4Z8O_fqMF"}},{"cell_type":"markdown","source":["We just saw examples of how easily machine learning models can be fooled into making wrong predictions by adding slight noise to the input that may be unnoticeable to humans. This problem occurs, because we learn and use models without understanding how they work internally and whether they actually learn concepts that would mirror human cognition. While the panda gibbon example might look interesting, it is easy to reframe adversarial examples as safety and security problems.\r\n","\r\n","For examples, by taping a small sticker to a stop sign we can fool a vision system to recognize the sign as a \"Speed Limit 45” sign, implying that this might lead to wrong and dangerous actions taken by an autonomous car or even natural changes to inputs may lead to wrong predictions with safety consequences. The most common examples are self driving cars driving in foggy weather conditions or with a slightly smeared camera, all resulting in small modifications of the input. The pictures in below are examples of possible input transformations, mirroring potential conditions in the real world for a self driving system leading to wrong predictions of the navigation angle.\r\n","

\r\n"," \r\n","

\r\n","\r\n","Therefore, the models that are used should be robust to these kinks of attacks. Robustness is the idea that a model’s prediction is stable to small variations in the input, because it’s prediction is based on reliable concepts of the real task that mirror how a human would perform the task.\r\n","\r\n","More examples of the danger of adversarial attacks can be found on [this](https://rll.berkeley.edu/adversarial/) and [this](https://arxiv.org/abs/1701.04143) papers."],"metadata":{"id":"-thmGRK44qD6"}},{"cell_type":"markdown","source":["## Attacks "],"metadata":{"id":"iEn8GUyrfqMO"}},{"cell_type":"markdown","source":["Machine learning algorithms develop their behavior through experience. They are complex mathematical functions that transform inputs to outputs. If a machine learning tags an image as containing a specific object, it has found the pixel values in that image to be statistically similar to other images of the object it has processed during training.\r\n","\r\n","Adversarial attacks exploit this characteristic to confuse machine learning algorithms by manipulating their input data. For instance, by adding tiny and unremarkable patches of pixels to an image.\r\n","\r\n","in the following, some of the known current techniques for generating adversarial examples are listed."],"metadata":{"id":"CjLahM_Vbqz2"}},{"cell_type":"markdown","source":["### PGD "],"metadata":{"id":"_4ZBEZCDfqMO"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"UcGXgwZ_fqMP"}},{"cell_type":"code","execution_count":null,"source":[],"outputs":[],"metadata":{"id":"1Z203XsFfG4x"}},{"cell_type":"markdown","source":["### FGSM "],"metadata":{"id":"zoqWqW24fqMQ"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"4L4mL1nt5d99"}},{"cell_type":"markdown","source":["### DeepFool "],"metadata":{"id":"UstdBiu75eKS"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"pxixV6SG6ldE"}},{"cell_type":"markdown","source":["## Robustness "],"metadata":{"id":"acf3jhH7fqMS"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"3V8OgzSpff0S"}},{"cell_type":"markdown","source":["### Adversarial training "],"metadata":{"id":"r4gF7qMR5zwm"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"Zbj9Zxlp6CxS"}},{"cell_type":"markdown","source":["### Provable defenses "],"metadata":{"id":"kHneZ5816DET"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"eebanxiL59Oq"}},{"cell_type":"markdown","source":["## Example \n"],"metadata":{"id":"29Hx95TFfqMU"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"CdJiNHvH6nWn"}},{"cell_type":"markdown","source":["## References \n"],"metadata":{"id":"8BDIzkJhfqMV"}},{"cell_type":"markdown","source":["Fill later\n"],"metadata":{"id":"TAfujAPs6n1B"}}]} \ No newline at end of file