This repository contains implementations of various AI techniques for playing Tic-Tac-Toe (or Noughts and Crosses).
Recursive brute-force search of the solution space, using a pruning method (alpha-beta) that ignores parts of the tree that cannot affect the outcome of the search.
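Below is a minimal sketch of such a search, assuming a board encoded as a 9-tuple with 1 for X, -1 for O, and 0 for empty; the function names and encoding are illustrative, not this repository's actual API.

```python
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(b):
    """Return 1 if X has three in a row, -1 if O has, else 0."""
    for i, j, k in LINES:
        if b[i] != 0 and b[i] == b[j] == b[k]:
            return b[i]
    return 0

def alphabeta(b, player, alpha=-2, beta=2):
    """Exact value of position `b` (from X's view) with `player` to move."""
    w = winner(b)
    if w:
        return w
    if 0 not in b:
        return 0                                  # draw
    best = -2 * player                            # worst case for the side to move
    for i, c in enumerate(b):
        if c:
            continue
        v = alphabeta(b[:i] + (player,) + b[i+1:], -player, alpha, beta)
        if player == 1:
            best, alpha = max(best, v), max(alpha, v)
        else:
            best, beta = min(best, v), min(beta, v)
        if alpha >= beta:                         # this subtree cannot change the result
            break
    return best

print(alphabeta((0,) * 9, 1))  # perfect play from the empty board: 0 (a draw)
```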
Using a training set of (state, value) pairs provided by an oracle (currently covering 10% of the state space), we train a neural network (either an MLP or a ConvNet) that generalises to the rest of the state space.
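A minimal sketch of this setup in PyTorch is shown below; the state encoding, network sizes, 200-position sample, and the inline minimax oracle are illustrative stand-ins for the repository's actual oracle and models.

```python
import random
import torch
import torch.nn as nn

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != 0 and b[i] == b[j] == b[k]:
            return b[i]
    return 0

def oracle(b, player):
    """Exact game value (from X's view) of `b` with `player` to move."""
    w = winner(b)
    if w:
        return w
    if 0 not in b:
        return 0
    vals = [oracle(b[:i] + (player,) + b[i+1:], -player)
            for i, c in enumerate(b) if c == 0]
    return max(vals) if player == 1 else min(vals)

def random_state():
    """A reachable position drawn from a random playout."""
    b, player = (0,) * 9, 1
    for _ in range(random.randrange(9)):
        if winner(b):
            break
        i = random.choice([i for i, c in enumerate(b) if c == 0])
        b = b[:i] + (player,) + b[i+1:]
        player = -player
    return b, player

# Oracle-labelled training set covering a fraction of the state space.
data = [random_state() for _ in range(200)]
X = torch.tensor([b for b, _ in data], dtype=torch.float32)
y = torch.tensor([[float(oracle(b, p))] for b, p in data])

mlp = nn.Sequential(nn.Linear(9, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh())
opt = torch.optim.Adam(mlp.parameters(), lr=1e-2)
for _ in range(300):                              # fit the labelled subset
    opt.zero_grad()
    loss = nn.functional.mse_loss(mlp(X), y)
    loss.backward()
    opt.step()
```

Held-out states (the unlabelled remainder of the space) can then be scored by the trained network to check generalisation.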
AlphaZero / Lc0-style self-play reinforcement learning. Starting from a randomly initialised network, an agent plays itself over and over, updating both its move predictions and its state-value estimates.
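The sketch below shows a heavily simplified version of this loop, assuming a two-headed network (policy logits plus a scalar value) trained from the final game outcome. The real AlphaZero / Lc0 recipe also runs MCTS and trains the policy head toward the search's visit counts; that part is omitted here, making this closer to plain policy-gradient self-play.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyValueNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(9, 64), nn.ReLU())
        self.policy_head = nn.Linear(64, 9)       # logits over the 9 squares
        self.value_head = nn.Linear(64, 1)        # scalar value estimate

    def forward(self, x):
        h = self.body(x)
        return self.policy_head(h), torch.tanh(self.value_head(h))

def winner_of(board):
    lines = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]
    for i, j, k in lines:
        if board[i] != 0 and board[i] == board[j] == board[k]:
            return board[i]
    return None

def self_play_game(net):
    """Play one game against itself; return per-move records and the outcome."""
    board, player, records = [0.0] * 9, 1.0, []
    while True:
        x = torch.tensor(board) * player          # always from the side-to-move view
        with torch.no_grad():
            logits, _ = net(x)
        logits = logits.masked_fill(torch.tensor(board) != 0, float('-inf'))
        move = torch.distributions.Categorical(logits=logits).sample().item()
        records.append((x, move, player))
        board[move] = player
        w = winner_of(board)
        if w is not None or all(c != 0 for c in board):
            return records, (0.0 if w is None else w)
        player = -player

net = PolicyValueNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for game in range(1000):
    records, z = self_play_game(net)
    opt.zero_grad()
    loss = torch.tensor(0.0)
    for x, move, player in records:
        logits, v = net(x)
        target = z * player                       # outcome from this player's view
        # Policy: raise the probability of moves that led to a win, lower losers.
        loss = loss + target * F.cross_entropy(logits.unsqueeze(0), torch.tensor([move]))
        # Value: regress toward the actual game outcome.
        loss = loss + F.mse_loss(v.squeeze(), torch.tensor(target))
    loss.backward()
    opt.step()
```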
- Iterated Distillation and Amplification
- AlphaZero paper on arXiv
- Dominik Klein's *Neural Networks for Chess* (from which our implementation was essentially cloned, with some stylistic changes)