Merge pull request #11 from dlmbl/2024
adjavon authored Aug 21, 2024
2 parents e95b6f4 + 0fca9ec commit f0da865
Showing 29 changed files with 3,297 additions and 5,396 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/build-notebooks.yaml
@@ -0,0 +1,32 @@
name: Build Notebooks
on:
  push:

jobs:
  run:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"

      - name: Install dependencies
        run: |
          python -m pip install -U pip
          python -m pip install jupytext nbconvert
      - name: Build notebooks
        run: |
          jupytext --to ipynb --update-metadata '{"jupytext":{"cell_metadata_filter":"all"}}' solution.py
          jupyter nbconvert solution.ipynb --TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_cell_tags solution --to notebook --output exercise.ipynb
          jupyter nbconvert solution.ipynb --TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_cell_tags task --to notebook --output solution.ipynb
      - uses: EndBug/add-and-commit@v9
        with:
          add: solution.ipynb exercise.ipynb
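
The "Build notebooks" step can be pictured as filtering notebook cells by tag. Below is a minimal sketch of that idea, assuming the tag-removal preprocessor simply drops every cell whose `metadata.tags` contains the given tag. The helper `remove_tagged_cells` and the toy notebook dictionary are hypothetical illustrations, not part of nbconvert or of this repository.

```python
import copy


def remove_tagged_cells(notebook, tag):
    """Return a copy of `notebook` without the cells tagged with `tag`.

    Hypothetical helper: it only mimics the effect of removing cells by tag
    on a notebook that has already been loaded as a plain dictionary.
    """
    filtered = copy.deepcopy(notebook)
    filtered["cells"] = [
        cell
        for cell in filtered["cells"]
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return filtered


# Toy notebook with one untagged cell, one "task" cell and one "solution" cell.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Intro"},
        {"cell_type": "code", "metadata": {"tags": ["task"]}, "source": "x = ...  # TODO"},
        {"cell_type": "code", "metadata": {"tags": ["solution"]}, "source": "x = 42"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

# The exercise notebook drops the solutions; the solution notebook drops the tasks.
exercise = remove_tagged_cells(notebook, "solution")
solution = remove_tagged_cells(notebook, "task")
```

In this picture, untagged cells survive in both outputs, which is why shared explanatory text only needs to be written once in `solution.py`.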
45 changes: 27 additions & 18 deletions README.md
@@ -1,37 +1,46 @@
# Exercise 9: Explainable AI and Knowledge Extraction
# Exercise 8: Explainable AI and Knowledge Extraction

## Overview
The goal of this exercise is to learn how to probe what a pre-trained classifier has learned about the data it was trained on.

We will be working with a simple example: a fun variation on the MNIST dataset that you will have seen in previous exercises in this course.
Unlike regular MNIST, our dataset is classified not by number, but by color! The question is... which colors fall within which class?

![CMNIST](assets/cmnist.png)

In this exercise, we will return to conventional, gradient-based attribution methods to see what they can tell us about what the classifier knows.
We will see that, even for such a simple problem, there is some information that these methods do not give us.
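
As a rough illustration of what "gradient-based attribution" means, here is a toy sketch that is not the exercise's code. It assumes a hypothetical one-pixel "classifier" (`class_score`, a logistic score over RGB values with made-up weights) and approximates the gradient of the score with respect to each input channel by finite differences.

```python
import math


def class_score(pixel, weights, bias=0.0):
    """Hypothetical classifier: logistic score of a single RGB pixel."""
    z = sum(w * x for w, x in zip(weights, pixel)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gradient_attribution(pixel, weights, eps=1e-6):
    """Approximate d(score)/d(input) per channel with forward differences."""
    base = class_score(pixel, weights)
    gradients = []
    for i in range(len(pixel)):
        bumped = list(pixel)
        bumped[i] += eps
        gradients.append((class_score(bumped, weights) - base) / eps)
    return gradients


# Made-up weights: red pushes the score up, green is ignored, blue pushes it down.
weights = [2.0, 0.0, -1.0]
pixel = [0.5, 0.5, 0.5]
attribution = gradient_attribution(pixel, weights)
```

The sign and size of each entry say how strongly a small change in that channel moves the score. They do not say which other color the classifier would accept instead.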

We will then train a generative adversarial network, or GAN, to try to create counterfactual images.
These images are modifications of the originals, which are able to fool the classifier into thinking they come from a different class!
We will evaluate this GAN using our classifier: is it really able to change an image's class in a meaningful way?

Finally, we will combine the two methods, attribution and counterfactuals, to get a full explanation of what exactly the classifier is doing. We will learn whether it can teach us anything, and whether we should trust it!

## Setup

Before anything else, in the super-repository called `DL-MBL-2023`:
Before anything else, in the super-repository called `DL-MBL-2024`:
```
git pull
git submodule update --init 09_knowledge_extraction
git submodule update --init 08_knowledge_extraction
```

Then, if you have any other exercises still running, please save your progress and shut down those kernels.
This is a GPU-hungry exercise so you're going to need all the GPU memory you can get.

Next, run the setup script. It might take a few minutes.
```
cd 09_knowledge_extraction
source setup.sh
cd 08_knowledge_extraction
sh setup.sh
```
This will:
- Create a `mamba` environment for this exercise
- Download and unzip data and pre-trained network
- Create a `conda` environment for this exercise
- Download the data and train the classifier we're learning about
Feel free to have a look at the `setup.sh` script to see the details.


Next, begin a Jupyter Lab instance:
```
jupyter lab
```
...and continue with the instructions in the notebook.
Next, open the exercise notebook!

## Overview
### Acknowledgments

In this exercise we will:
1. Train a classifier to predict, from 2D EM images of synapses, which neurotransmitter is (mostly) used at that synapse
2. Use a gradient-based attribution method to try to find out what parts of the images contribute to the prediction
3. Train a CycleGAN to create counterfactual images
4. Run a discriminative attribution from counterfactuals
This notebook was written by Diane Adjavon, from a previous version written by Jan Funke and modified by Tri Nguyen, using code from Nils Eckstein.
Binary file added assets/cmnist.png
Binary file added assets/same_class_diff_color.png
Binary file added assets/same_color_diff_class.png
Binary file added assets/stargan.png
5 changes: 5 additions & 0 deletions create_environment.sh
@@ -0,0 +1,5 @@
# Contains the steps that I used to create the environment, kept for reference
mamba create -n 08_knowledge_extraction python=3.11 pytorch torchvision pytorch-cuda=12.1 -c conda-forge -c pytorch -c nvidia
mamba activate 08_knowledge_extraction
pip install -r requirements.txt
mamba env export > environment.yaml
Empty file removed dac/__init__.py
Empty file.
72 changes: 0 additions & 72 deletions dac/activations.py

This file was deleted.
