This repository contains all the files for the ENSAE course: Advanced ML

AntoineTSP/Advanced_ML

Advanced ML: Course Project

Observing the semantics of textual data through embedding coupled with t-SNE

t-SNE custom implementation

We implemented our own t-SNE method, following the Scikit-Learn estimator conventions (a constructor that takes hyperparameters and a fit_transform method). To create an instance and fit it to data:

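from TSNE_code.TSNE_utils import TSNE

# X is assumed to be an array-like of shape (n_samples, n_features)
custom_tsne = TSNE(n_components=2, perplexity=15,
                   adaptive_learning_rate=True, patience=50,
                   n_iter=1000, early_exaggeration=4)

# Fit the model and return the low-dimensional embedding
custom_embedding = custom_tsne.fit_transform(X, verbose=3)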
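
To give an intuition for the perplexity argument used above, here is a minimal, dependency-free sketch of how t-SNE turns a target perplexity into a Gaussian bandwidth for a single point, following the reference paper. The function name conditional_probabilities and its parameters are hypothetical and chosen for illustration; they are not taken from TSNE_code, whose internals may differ.

```python
import math


def conditional_probabilities(sq_dists, perplexity, tol=1e-5, max_iter=100):
    """Return p_{j|i} for one point, given its squared distances to the others.

    Binary-searches the Gaussian precision beta = 1 / (2 * sigma**2) until the
    entropy of the distribution matches log(perplexity).
    """
    target_entropy = math.log(perplexity)
    beta, beta_min, beta_max = 1.0, 0.0, math.inf
    d_min = min(sq_dists)  # shifting distances leaves the normalized result unchanged
    probs = []
    for _ in range(max_iter):
        weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
        total = sum(weights)
        probs = [w / total for w in weights]
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        if abs(entropy - target_entropy) < tol:
            break
        if entropy > target_entropy:
            # Distribution too flat: increase precision (narrower Gaussian).
            beta_min = beta
            beta = beta * 2.0 if beta_max == math.inf else (beta + beta_max) / 2.0
        else:
            # Distribution too peaked: decrease precision (wider Gaussian).
            beta_max = beta
            beta = (beta + beta_min) / 2.0
    return probs


# Toy example: squared distances from one point to five others.
sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]
probs = conditional_probabilities(sq_dists, perplexity=3.0)
effective_perplexity = math.exp(-sum(p * math.log(p) for p in probs if p > 0.0))
```

A higher perplexity yields a wider Gaussian, so each point effectively takes more neighbors into account.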

Documentation

Every t-SNE function is documented with a docstring. We also used Sphinx to generate NumPy-style HTML documentation from these docstrings, which is easier to browse.

To view this documentation, open the build/html/index.html file in a browser.
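
For readers unfamiliar with the NumPy docstring style, here is a small illustrative example of the layout Sphinx can render into HTML, for instance through the sphinx.ext.napoleon extension. The function pairwise_squared_distances is hypothetical and not part of TSNE_code, and the extension actually used in this repository is an assumption.

```python
def pairwise_squared_distances(points):
    """Compute squared Euclidean distances between all pairs of points.

    Parameters
    ----------
    points : list of list of float
        Input data, one inner list per sample.

    Returns
    -------
    list of list of float
        Symmetric matrix ``D`` where ``D[i][j]`` is the squared distance
        between sample ``i`` and sample ``j``.
    """
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]
```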

t-SNE on famous test datasets

We applied our t-SNE implementation to the MNIST and Olivetti datasets, both to check that it behaves correctly and to compare it with Scikit-Learn's implementation.
These comparisons can be found in the MNIST and Olivetti folders, which contain the notebooks with our comparative studies.
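
Beyond visual inspection, one possible way to quantify how closely two embeddings of the same data agree is to measure how many nearest neighbors each point keeps from one embedding to the other. The sketch below is dependency-free and purely illustrative; knn_indices and neighbor_overlap are hypothetical helpers, and this is not necessarily the comparison performed in the notebooks.

```python
def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        dists = [
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        ]
        dists.sort()
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighbor_overlap(embedding_a, embedding_b, k=5):
    """Average fraction of k-nearest neighbors shared by two embeddings."""
    nn_a = knn_indices(embedding_a, k)
    nn_b = knn_indices(embedding_b, k)
    shared = [len(a & b) / k for a, b in zip(nn_a, nn_b)]
    return sum(shared) / len(shared)


# Toy example: a rotated copy preserves every neighborhood,
# while a shuffled layout does not.
emb_a = [[0, 0], [1, 0], [2, 0], [10, 0], [11, 0], [12, 0]]
emb_rotated = [[0, 0], [0, 1], [0, 2], [0, 10], [0, 11], [0, 12]]
emb_shuffled = [[0, 0], [10, 0], [1, 0], [11, 0], [2, 0], [12, 0]]

same = neighbor_overlap(emb_a, emb_rotated, k=2)
different = neighbor_overlap(emb_a, emb_shuffled, k=2)
```

Because t-SNE embeddings are only defined up to rotation and reflection, a neighborhood-based score of this kind is more meaningful than comparing coordinates directly.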

LSTM and embedding

The LSTM model is trained on the IMDB dataset in the notebook LSTM.ipynb, which also visualizes the learned word embedding with several t-SNE instances.
An interactive plot of the three-dimensional t-SNE projection of our word embedding is available in interactive_3d_plot.html.

Reference paper

Visualizing Data using t-SNE, Laurens van der Maaten, Geoffrey Hinton; Journal of Machine Learning Research, 9(86):2579−2605, 2008.
