Mike Wojnowicz and Karin Knudson
This workshop will introduce students to a variety of models in statistical machine learning that can be used for anomaly detection. Topics will include: Mixture Models, Hidden Markov Models, Latent Dirichlet Allocation, Variational Autoencoders, and Normalizing Flows. Areas of application range from cybersecurity to biology to speech recognition.
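All of these models support anomaly detection in the same basic way: fit a probability model to data assumed to be normal, then flag points to which the fitted model assigns low likelihood. Below is a minimal sketch of that idea, assuming scikit-learn is available; the two-component mixture, the simulated data, and the 1st-percentile threshold are illustrative assumptions of ours, not workshop code.

```python
# Minimal sketch: density-based anomaly detection with a Gaussian mixture.
# The data, component count, and threshold are illustrative assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# "Normal" training data: two well-separated clusters in 2-D.
normal_data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(500, 2)),
    rng.normal(loc=[5.0, 5.0], scale=1.0, size=(500, 2)),
])

# Fit a density model to the normal data.
model = GaussianMixture(n_components=2, random_state=0).fit(normal_data)

# Score new points by their log-likelihood under the fitted model;
# unusually low scores suggest anomalies.
new_points = np.array([[0.1, -0.2], [20.0, -20.0]])
scores = model.score_samples(new_points)

# Illustrative threshold: the 1st percentile of the training scores.
threshold = np.percentile(model.score_samples(normal_data), 1)
print(scores < threshold)  # expect [False  True]
```

The same recipe applies to the other models in the list; only the density estimator changes.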
Students should have passing familiarity with basic statistical concepts (maximum likelihood, Bayes' rule, expectations, covariance matrices, conditional probability) and should be comfortable with Python programming. We will also assume knowledge of calculus.
The slides, exercises, and wrapper code will continue to evolve over the course of the week.
The course is broken into modules, at a pace of slightly less than one per day. Each module has roughly four submodules. A typical submodule will consist of about 20 minutes of presentation, followed by about 20 minutes of interactive group activity.
We will execute this vision only approximately: some submodules will have no exercises, some will have multiple, and some exercises will occur mid-slides.
- Module 1: Basics
- Frequentist approaches and Maximum Likelihood
- Exponential Families
- Bayesian approaches and Conjugate Bayesian Models
- Module 2: Expectation Maximization (EM)
- Introduction (K-means)
- Mixture Models
- Hidden Markov Models
- Expectation Maximization, Generally (a minimal EM sketch follows this outline)
- Module 3: Probabilistic Graphical Models
- Probabilistic Graphical Models
- Module 4: Variational Inference
- Overview (and Relation to EM)
- Bayesian Mixture Models
- Latent Dirichlet Allocation
- Module 5: Black Box Models
- Variational Autoencoders (VAE)
- Normalizing Flows
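As a preview of Module 2 (see the note in the outline above), here is a minimal sketch of Expectation Maximization for a two-component, one-dimensional Gaussian mixture, written from the standard E-step/M-step updates (see, e.g., Bishop Chapter 9 in the readings below). The synthetic data and the initialization are illustrative assumptions.

```python
# Minimal sketch: EM for a two-component 1-D Gaussian mixture.
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 1-D data: two clusters (illustrative assumption).
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])

# Initial guesses for mixing weights, means, and variances.
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def gaussian_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

for _ in range(100):
    # E-step: posterior responsibility of each component for each point.
    dens = pi * gaussian_pdf(x[:, None], mu, var)   # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate parameters from responsibility-weighted data.
    n_k = resp.sum(axis=0)
    pi = n_k / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / n_k
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / n_k

print(pi, mu, var)  # should approach (0.5, 0.5), (-2, 3), (1, 1)
```

With more components or dimensions the bookkeeping grows, which is why in practice one typically reaches for a library implementation, but the alternating E-step/M-step structure stays the same.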
By the end of this workshop, we hope that you will:
- Have some sense of various possible ways to approach an unsupervised learning problem (e.g., frequentist vs. Bayesian); the sketch after this list contrasts these two approaches on a simple example.
- Have some intuition about the models listed in the description. This intuition should, for example, support intelligent use of APIs when modeling data.
- Understand what expectation maximization and variational inference are, and how they underlie the learning of models such as those above.
- Develop and/or reinforce some understanding of modeling tools (e.g., how to determine conditional independence in a probabilistic graphical model, what exponential families mean and what they are used for, etc.).
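To make the first goal concrete, here is a minimal sketch contrasting the two approaches on the simplest estimation problem, a coin's heads probability. Because the Beta prior is conjugate to the Bernoulli likelihood (a Module 1 topic), the posterior is available in closed form; the Beta(2, 2) prior and the simulated flips are illustrative assumptions.

```python
# Minimal sketch: frequentist MLE vs. a conjugate Bayesian update.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
flips = rng.binomial(n=1, p=0.7, size=20)   # 20 coin flips; true p = 0.7
heads = int(flips.sum())
tails = len(flips) - heads

# Frequentist: the maximum likelihood estimate is the sample proportion.
p_mle = heads / len(flips)

# Bayesian: a Beta prior is conjugate to the Bernoulli likelihood, so the
# posterior is Beta with the observed counts added to the pseudo-counts.
a0, b0 = 2.0, 2.0                            # Beta(2, 2) prior (assumption)
posterior = stats.beta(a0 + heads, b0 + tails)

print("MLE:", p_mle)
print("Posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```

Note how the posterior mean pulls the MLE toward the prior mean; as the number of flips grows, the two estimates converge.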
Recommended readings, organized by topic:
- Statistical Inference
- Casella, G., & Berger, R. L. (2002). Statistical inference (2nd ed.). Pacific Grove, CA: Duxbury.
- Introduction to Bayesian Modeling
- Hoff, P. D. (2009). A first course in Bayesian statistical methods (Vol. 580). New York: Springer.
- Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis (3rd ed.). CRC Press.
- Exponential Families
- Chapter 8 of Michael I. Jordan's book draft: https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/other-readings/chapter8.pdf
- Chapter 9 of Michael I. Jordan's book draft: https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/other-readings/chapter9.pdf
- Mixture Models and EM
- Section 2.3 and Chapter 9 of Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
- Sections 6.8 and 8.5 of Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd ed.). Springer.
- Hidden Markov Models (HMMs)
- Section 13.2 of Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
- Ghahramani, Z. (2001). An introduction to hidden Markov models and Bayesian networks. In Hidden Markov models: applications in computer vision (pp. 9-41).
- Probabilistic Graphical Models
- Chapter 8 of Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
- Ghahramani, Z. (2001). An introduction to hidden Markov models and Bayesian networks. In Hidden Markov models: applications in computer vision (pp. 9-41).
- Introduction to VI
- Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational inference: A review for statisticians. Journal of the American Statistical Association, 112(518), 859-877.
- Latent Dirichlet Allocation
- Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
- Normalizing Flows
- Dinh, L., Sohl-Dickstein, J., & Bengio, S. (2016). Density estimation using Real NVP. arXiv preprint arXiv:1605.08803.
- Papamakarios, G., Nalisnick, E., Rezende, D. J., Mohamed, S., & Lakshminarayanan, B. (2019). Normalizing flows for probabilistic modeling and inference. arXiv preprint arXiv:1912.02762.
- Variational Autoencoders
- Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.