This Jupyter notebook classifies biomedical cancer documents into one of three categories: 'Thyroid_Cancer', 'Colon_Cancer', or 'Lung_Cancer'. The problem statement and the dataset are available at https://www.kaggle.com/datasets/falgunipatel19/biomedical-text-publication-classification.
This notebook shows how neural network embeddings can be used to solve this classification problem. Two alternatives are implemented: an embedding layer that uses pre-computed word embeddings, and one that does not, so its embeddings are learned during training.
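As a rough illustration of what an embedding layer does, the sketch below uses plain Python rather than Keras. It maps words to integer ids, pads each document to a fixed length, and looks each id up in a randomly initialized table, which mirrors the case without pre-computed embeddings. The helper names (`build_word_index`, `encode`, `init_embeddings`), the example documents, and the sizes are all hypothetical and are not taken from the notebook.

```python
import random

def build_word_index(documents):
    """Assign each distinct word an integer id; id 0 is reserved for padding."""
    word_index = {}
    for doc in documents:
        for word in doc.lower().split():
            if word not in word_index:
                word_index[word] = len(word_index) + 1
    return word_index

def encode(doc, word_index, max_len):
    """Turn a document into a fixed-length list of ids.

    Unknown words also map to 0 here, a simplification for this sketch.
    """
    ids = [word_index.get(w, 0) for w in doc.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))

def init_embeddings(vocab_size, dim, seed=0):
    """Randomly initialized lookup table; row 0 (padding) stays all zeros."""
    rng = random.Random(seed)
    table = [[0.0] * dim]
    table += [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
              for _ in range(vocab_size)]
    return table

# Made-up documents, for illustration only
docs = ["thyroid nodule biopsy", "colon polyp biopsy results"]
word_index = build_word_index(docs)
table = init_embeddings(len(word_index), dim=4)

sequence = encode(docs[0], word_index, max_len=5)
vectors = [table[i] for i in sequence]  # the embedding lookup itself
```

In a real network the rows of `table` would be trainable weights that are updated by backpropagation along with the rest of the classifier.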
The pre-computed embeddings are the GloVe word embeddings trained on 2014 English Wikipedia, downloaded from https://nlp.stanford.edu/projects/glove.
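GloVe vectors are distributed as plain text, one word per line followed by its space-separated float components. A minimal sketch of turning such a file into an embedding matrix aligned with a word index might look like the following. The function names, the tiny in-memory sample, and the example word index are hypothetical; which GloVe file the notebook actually loads (for instance `glove.6B.100d.txt`) is an assumption.

```python
import io

def load_glove(lines):
    """Parse GloVe text format into a dict of word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def build_embedding_matrix(word_index, glove, dim):
    """Row i holds the GloVe vector for the word with id i.

    Row 0 (padding) and words missing from GloVe stay all zeros.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = glove.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix

# Tiny in-memory stand-in for a real GloVe file (made-up values)
sample = io.StringIO("cancer 0.1 0.2 0.3\nlung 0.4 0.5 0.6\n")
glove = load_glove(sample)

word_index = {"lung": 1, "cancer": 2, "thyroid": 3}  # hypothetical vocabulary
matrix = build_embedding_matrix(word_index, glove, dim=3)
```

With a real file, `load_glove` could be fed an open file handle, for example `with open(path, encoding="utf-8") as f: glove = load_glove(f)`. The resulting matrix would then typically be converted to a NumPy array and supplied as the initial weights of the embedding layer.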
The code is written in Python using Keras. Intermediate outputs are printed in the notebook for clarity.