
Releases: KarelZe/thesis

Changes between 27 February and 5 March

05 Mar 08:53
4b65e8f

What's Changed

I didn't work on the thesis full-time this week; I spent some time on exam prep.

Writing 📖

Other Changes

Outlook 🎒

  • finish remaining tasks from last week
  • exam prep

Full Changelog: 23-09...23-10

Changes between February, 20th and February, 26th

26 Feb 18:39
12041e7

What's Changed

Empirical Study ⚗️

  • Add notes, code, tests, and Chapter on effective spread🍕 by @KarelZe in #184

Writing 📖

  • Add chapter on Regression Trees🎄 by @KarelZe in #170
  • Add section on attention maps🧭 by @KarelZe in #172
  • Edit in review comments🎒 by @KarelZe in #174
  • Optimized citations/typesetting and extended check_formalia.py 🐍 by @KarelZe in #175
  • Edit in comments from second review👨‍🎓 by @KarelZe in #179
  • Add visualizations for layer norm🍇 by @KarelZe in #178
  • Add chapter on TabTransformer📑 by @KarelZe in #180
  • Add chapter on FT-Transformer🕹️ by @KarelZe in #181
  • Add notes and viz on train-test-split🍿 by @KarelZe in #182

Other Changes

Outlook 🎒

  1. Write the chapter on the gradient boosting procedure
  2. Finish the attention and embeddings chapter. Add some nice visuals!
  3. Integrate feedback
  4. Resolve my small TODOs in LaTeX sources / go through warnings / fix overflows
  5. Loosely research how pre-training on unlabelled data can be implemented in PyTorch
  6. (merge and rework the Chapter on feature engineering)

Full Changelog: 23-08...23-09

Changes between February, 13th and February, 19th

19 Feb 16:04
90bfcf0

What's Changed

Writing 📖

  • Refactor and enhance stacked hybrid rules to separate chapter 🔢 by @KarelZe in #155
  • Extend chapter on LR algorithm📖 by @KarelZe in #156
  • Research on trade initiator for CBOE / ISE 📑 by @KarelZe in #157
  • Improve readability of Overview over Transformers 🤖 by @KarelZe in #158
  • Rewrite chapter on positional encoding🧵 by @KarelZe in #159
  • Rewrite chapter position-wise FFN for clarity🎱 by @KarelZe in #160
  • Rewrite chapter on residual connections🔗 by @KarelZe in #161
  • Update citation style and table of symbols🎙️ by @KarelZe in #162
  • Add feature set definition to appendix🧃 by @KarelZe in #164
  • Add visualizations of Transformer for tabular data🖼️ by @KarelZe in #165
  • Improve captioning and transitions for Transformer chapters 🍞 by @KarelZe in #166
  • Fix and simplify formulas❤️‍🩹 by @KarelZe in #167
  • Streamline and extend the chapter on LR algorithm📑 by @KarelZe in #168
  • Rewrite layer norm chapter and fuse with residual connections 🍔 by @KarelZe in #169
  • Restructure chapter on trade initiator🪴 by @KarelZe in #163

Outlook 🏍️

  • Merge and rework the chapters on FTTransformer, TabTransformer, token embeddings, feature engineering, and attention maps
  • Write a chapter on decision trees and gradient boosting as well as attention
  • Create nice visualizations for categorical embeddings and layer norm
  • Integrate feedback from @lxndrblz and @pheusel
  • Improve the transformer implementation, e.g., by choosing different search spaces, using numerical embeddings, fixing sample weighting, and completing experiments with PyTorch 2.0
  • Investigate the results of the current models, e.g., robustness, effective spread, spread, partial dependence plots, etc. (see #8)

Full Changelog: 23-07...23-08

Changes between February, 6th and February, 12th

12 Feb 17:29
3ef3b29

Due to the slow progress last week, I decided to switch plans and focus on writing. I wrote all chapters on classical trade classification rules (9 pages) and incorporated them into thesis.pdf. I also gathered several ideas for improving the transformer chapters.

What's Changed

Writing 📖

Other Changes

Outlook 🐿️

(same as last week, as I worked on the classical trade classification rules)

  • Complete notes and write a draft on the selection of (semi-) supervised approaches
  • Rethink the Transformer chapter. I'm still not happy with the overall quality and will probably spend more time rewriting and rethinking it.
  • Improve the transformer implementation, e.g., by choosing different search spaces, using numerical embeddings, fixing sample weighting, and completing experiments with PyTorch 2.0
  • Investigate the results of the current models, e.g., robustness, effective spread, spread, partial dependence plots, etc. (see #8)

Full Changelog: 23-06...23-07

Changes between January, 30th and February, 5th

05 Feb 16:18
6ad26ab

What's Changed

Writing 📖

  • Rewrite transformer chapters for clarity by @KarelZe in #139
  • Fix merge and build errors in reports 🐞 by @KarelZe in #140
  • Chapter on related works 👪 by @KarelZe in #141
  • Add notes on depth, trade size, and CLNV rule💸 by @KarelZe in #142
  • Improve notes on tick rule, quote rule, LR algorithm, and EMO rule💸 by @KarelZe in #144
  • Notes for meeting and misc pre-writing changes🐿️ by @KarelZe in #145

Other Changes

Outlook 🧪

  • Complete notes and write a draft on the selection of (semi-) supervised approaches
  • Rethink the Transformer chapter. I'm still not happy with the overall quality and will probably spend more time rewriting and rethinking it.
  • Improve the transformer implementation, e.g., by choosing different search spaces, using numerical embeddings, fixing sample weighting, and completing experiments with PyTorch 2.0
  • Investigate the results of the current models, e.g., robustness, effective spread, spread, partial dependence plots, etc. (see #8)

Full Changelog: 23-05...23-06

Changes between January, 23rd and January, 29th

29 Jan 11:23
2c29052

What's Changed

Empirical Study ⚗️

  • Feature engineering for a very large dataset 🌌 by @KarelZe in #126
  • Add retraining for gradient boosting [+ 2 %] 🍾 by @KarelZe in #130
  • Improve accuracy of TabTransformer [+ 5 % from prev.]🪅 by @KarelZe in #129
  • Fix cardinalities of Transformer implementation🪲 by @KarelZe in #132

Writing 📖

  • Complete notes on layer norm🍔 by @KarelZe in #123
  • Chapter on layer norm + notes on SSL and embeddings for tabular data 🧲 by @KarelZe in #131
  • Add chapter on embeddings of tabular data💤 by @KarelZe in #133
  • Fix broken references in expose 🔗 by @KarelZe in #135
  • Rework chapters on transformer 🤖 by @KarelZe in #134
  • [WIP] Add a chapter on attention, self-attention, multi-headed attention, and cross attention🅰️ by @KarelZe in #136

Other Changes

Outlook 🚀

  • Complete notes and write a draft on related works
  • Complete notes and write a draft on the selection of (semi-) supervised approaches
  • Complete notes and write a draft on classical trade classification rules
  • Try to shorten and streamline the theoretical background by one page; also aim for better understanding and improved visualizations
  • Improve the transformer implementation, e.g., by choosing different search spaces, using numerical embeddings, fixing sample weighting, and completing experiments with PyTorch 2.0

Full Changelog: 23-04...23-05

Changes between January, 16th and January, 22nd

22 Jan 15:19
547d10a

What's Changed

Empirical Study ⚗️

Writing 📖

Other Changes

Outlook 🧪

  • It wasn't easy to obtain Jupyter resources on the cluster last week, so training and improving the Transformer didn't progress as initially hoped; two SLURM jobs are still pending. I had some success with small-scale experiments, though, with the FTTransformer reaching a performance similar to gradient boosting. The results from gradient boosting with option features also look promising; see readme.md.
  • After reading https://github.com/google-research/tuning_playbook, I decided to break training and tuning down into smaller chunks, which I hope will give us more insights. I have already experimented with gradient tracking and added an option to automatically find the maximum batch size (a sketch follows this list). I also restructured my notes on how to proceed with training and tuning. I will add an option to keep certain parameters static, plan to add a much simpler baseline such as logistic regression, simplify the evaluation, and might experiment with retraining.
  • Writing progressed more slowly than I anticipated for various reasons. I still have to write the chapters on attention and MHSA, as well as on the pre-training of transformers.
  • I'll use next week to clean up the remaining tasks. 💯
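
As a rough illustration of the batch-size search mentioned above, the sketch below doubles the batch size until the GPU runs out of memory. The helper `make_batch` and the default limits are placeholders, not the actual implementation in the repository.

```python
import torch
import torch.nn.functional as F


def find_max_batch_size(model, make_batch, start=64, limit=2 ** 16):
    """Double the batch size until the GPU runs out of memory.

    `make_batch(n)` is a hypothetical helper returning (features, labels)
    tensors with n rows; the real data pipeline differs.
    """
    device = torch.device("cuda")
    model = model.to(device)
    largest_ok, n = None, start
    while n <= limit:
        try:
            features, labels = make_batch(n)
            logits = model(features.to(device))
            # include the backward pass so activations and gradients count
            loss = F.cross_entropy(logits, labels.to(device))
            loss.backward()
            largest_ok, n = n, n * 2
        except RuntimeError as err:  # CUDA OOM surfaces as a RuntimeError
            if "out of memory" in str(err).lower():
                break
            raise
        finally:
            model.zero_grad(set_to_none=True)
            torch.cuda.empty_cache()
    return largest_ok
```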

Full Changelog: 23-03...23-04

Changes between January, 9th and January, 15th

15 Jan 13:29
3ca6b27

What's Changed

Empirical Study ⚗️

  • Run feature engineering on large scale (100 %) 💡 by @KarelZe in #109
  • Run exploratory data analysis on cluster (10 %) by @KarelZe in #108

Writing 📖

  • Add chapter on input embedding (finished) positional encoding (cont'd) 🛌 by @KarelZe in #107
  • Finish chapter on positional encoding🧵 by @KarelZe in #111
  • Add chapter on TabTransformer🔢 by @KarelZe in #112
  • Add chapter on FTTransformer 🤖 by @KarelZe in #113
  • Correction of column embedding in chapter TabTransformer 🤖 by @KarelZe in #115

Other Changes

Outlook 💡

  • Perform a code review of all previously written code.
  • Continue with transformer week. 🤖 Mainly write remaining chapters on the classical transformer architecture, attention and MHSA, as well as pre-training of transformers.
  • Research additional tricks from literature to optimize training behaviour of transformers. Structure them for the chapter on training and tuning our models.
  • Increase performance of current transformer implementations by applying the tricks from above to match the performance of gradient-boosted trees.
  • Add shared embeddings to the TabTransformer implementation.
  • Restructure notes and draft chapter for model selection of supervised and semi-supervised models.

Full Changelog: 23-02...23-03

Changes between January, 2nd and January, 8th

08 Jan 17:14
f1491c0

What's Changed

Empirical Study ⚗️

  • Create sklearn-compatible estimators 🦜 by @KarelZe in #93. A common sklearn-like interface is necessary for further aspects of training and evaluation, such as calculating SHAP values, creating learning curves, or simplifying hyperparameter tuning (see the estimator sketch after this list).
  • Interpretability with SHAP and attention maps 🐇 by @KarelZe in #85. Kernel SHAP values can now be calculated for all models (classical + ML-based); this was marketed as one of the contributions of my paper. I still need to research how to handle high correlation between features in Kernel SHAP. Attention maps can be calculated for the transformer-based models (a usage snippet follows after this list).
  • Add sample weighting to TransformerClassifier 🏋️ by @KarelZe in #100. Samples in the training set are weighted similarly to how it is done in CatBoost, so that more recent observations become more important (see the weighting sketch after this list).
  • Early stopping based on accuracy for TransformerClassifier 🧁 by @KarelZe in #102. Early stopping is now performed based on validation accuracy instead of log loss, so the early-stopping criteria for neural networks and gradient boosting are now consistent (a sketch follows after this list).
  • Improve robustness and tests of TabDataset 🚀 by @KarelZe in #101
  • Add instructions on using SLURM 🐧 by @KarelZe in #103. SLURM lets us run a script across multiple nodes of the bwHPC cluster and for extended periods, which is required for the final training.
  • Finalize exploratory data analysis 🚏 by @KarelZe in #105.
  • Finalize feature engineering🪄 by @KarelZe in #104
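
For illustration, a scikit-learn-compatible classifier along the lines of #93 could look like the minimal sketch below. Only the interface (`fit`, `predict`, `predict_proba`) reflects what the release notes describe; the hyperparameters and placeholder bodies are assumptions.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class TransformerClassifier(BaseEstimator, ClassifierMixin):
    """Minimal sklearn-style wrapper; the real training loop lives elsewhere."""

    def __init__(self, epochs: int = 10, lr: float = 1e-3):
        self.epochs = epochs
        self.lr = lr

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        self.classes_ = np.unique(y)
        # ... train the underlying PyTorch model here ...
        return self

    def predict_proba(self, X):
        check_is_fitted(self)
        X = check_array(X)
        # placeholder: uniform class probabilities instead of real model output
        return np.full((X.shape[0], len(self.classes_)), 1.0 / len(self.classes_))

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]
```

With `fit`, `predict`, and `predict_proba` in place, SHAP, learning curves, and hyperparameter search can treat classical rules and ML-based models uniformly.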
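Given such an interface, computing Kernel SHAP values as in #85 takes only a few lines with the `shap` package; `X_train`, `y_train`, and `X_test` below are placeholders for the actual feature matrices.

```python
import shap

# clf: any fitted estimator exposing predict_proba (classical rule or ML model)
clf = TransformerClassifier().fit(X_train, y_train)
# background: a small reference sample of the training features
background = shap.sample(X_train, 100)
explainer = shap.KernelExplainer(clf.predict_proba, background)
shap_values = explainer.shap_values(X_test, nsamples=200)
```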
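One way to implement the recency weighting from #100 is a linear ramp over the time-ordered training set, as sketched below; the exact scheme in the repository may differ.

```python
import numpy as np
import torch


def recency_weights(n_samples: int) -> torch.Tensor:
    """Linearly increasing weights so that newer trades matter more.

    The exact scheme used in #100 may differ; this ramp is illustrative.
    """
    weights = np.linspace(0.1, 1.0, num=n_samples)
    return torch.as_tensor(weights / weights.mean(), dtype=torch.float32)


# Usage: scale the per-sample loss during training (weights must follow the
# chronological order of the training set), e.g.
# loss = (F.cross_entropy(logits, labels, reduction="none") * batch_weights).mean()
```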
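The accuracy-based early stopping from #102 could follow a generic patience pattern like this sketch; the patience value is an assumption.

```python
class EarlyStopping:
    """Stop when validation accuracy has not improved for `patience` epochs.

    A generic sketch; the patience value and tie-breaking in #102 may differ.
    """

    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best_accuracy = float("-inf")
        self.epochs_without_improvement = 0

    def step(self, val_accuracy: float) -> bool:
        """Return True if training should stop."""
        if val_accuracy > self.best_accuracy:
            self.best_accuracy = val_accuracy
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```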

Writing 📖

  • Pre-write feature engineering chapter 🪛 by @KarelZe in #88
  • Write chapter on attention maps (finished) and gbm (cont'd) 🧭 by @KarelZe in #99. While implementing attention maps (see #85), I noticed that the common practice for calculating attention maps in the tabular domain is myopic. I researched approaches for transformers from other domains, e.g., machine translation, and documented my findings in this chapter. The chosen approaches take all attention layers into account and can handle attention heads of varying importance (see the sketch after this list).
  • Questions for bi-weekly meeting❓ by @KarelZe in #106
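
One approach of this kind is attention rollout, sketched below; whether the thesis uses exactly this variant or a head-weighted one is not stated here, so treat the snippet as illustrative. Heads are averaged uniformly in this sketch; weighting heads (e.g., by gradients) is one way to reflect their varying importance.

```python
import torch


def attention_rollout(attentions: list) -> torch.Tensor:
    """Combine attention maps across all layers (attention rollout).

    `attentions` holds one tensor per layer with shape
    (batch, heads, tokens, tokens). Heads are averaged uniformly here.
    """
    rollout = None
    for layer_attention in attentions:
        avg = layer_attention.mean(dim=1)  # average over heads
        eye = torch.eye(avg.size(-1), device=avg.device).unsqueeze(0)
        layer = 0.5 * avg + 0.5 * eye      # account for the residual connection
        layer = layer / layer.sum(dim=-1, keepdim=True)  # re-normalise rows
        rollout = layer if rollout is None else layer @ rollout
    return rollout  # (batch, tokens, tokens) token-to-token relevance
```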

Other Changes

Outlook 🧪

  • Start the Transformer week 🎉. I will spend next week and the week after improving the Transformer-based models. I want to dive into learning-rate scheduling, warm-up, etc. (a scheduling sketch follows below). I will also pre-write the chapters on FTTransformer, TabTransformer, the classical Transformer, and self-attention.
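
As a reference point for those scheduling experiments, one common recipe is linear warm-up followed by cosine decay; the sketch below is one possible setup, not necessarily the configuration that will end up in the thesis code.

```python
import math

import torch


def warmup_cosine_schedule(optimizer, warmup_steps: int, total_steps: int):
    """Linear warm-up followed by cosine decay; a common recipe,
    not necessarily the one that ends up in the thesis code."""

    def lr_lambda(step: int) -> float:
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```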

Full Changelog: 23-01...23-02

Changes between December, 26th and January, 1st

01 Jan 10:31
630fea6

What's Changed

Christmas break🎄

Other Changes

Full Changelog: v0.2.7...cw-01