Thamme Gowda thammegowda

🔭 I’m currently working on neural machine translation and imbalanced learning
🌱 I’m currently learning …
👯 I’m looking to collaborate on …
🤔 I’m looking for help with …
💬 Ask me about … anything
📫 How to reach me: @thammegowda
😄 Pronouns: he/him/his
⚡ Fun fact: …

Tools

mtdata, datasets downloader
sotastream, streaming approach to training data
RTG, NMT toolkit
BotEval, chatbot eval
nlcodec, vocabulary manager
awkg, pythonic awk
Sparkler, a webcrawler on Apache Spark
Parser-Indexer: Java, Python
Web apps (I wasn’t trying to be a web dev):
- Supervising UI, for image labeling
- NLLB Serve, an MT webapp to serve huggingface models

Research

See my latest publications on Google Scholar

Repo	Description	Status	Note
PyMarian	Python bindings to Marian C++; `pip install pymarian`	Complete✅	Paper; PyPI
BotEval	Facilitating human evaluation of chatbots; `pip install boteval`	Complete✅	Paper @ ACL2024 Demos; Demos; PyPI
Cometoid	Distilling strong reference based metrics into stronger reference-less metrics	Complete✅	Paper @ WMT2023; Models on Huggingface
sotastream	A streaming approach to machine translation training; `pip install sotastream`	Complete✅	Paper @ NLP OSS 2023; PyPI
016-many-eng-v2	Many-to-English (v2)	Complete✅
015-nmt-ablation	Transformer ablation, showing that the model can work without an encoder.	Complete✅
014-udhr-dataset	Parallel sentence alignment from Universal Declaration of Human Rights corpus	WIP/Incomplete◒
013-nmt-codeswitching	Done	Complete✅	Paper
012-macrobert	Macro sampling in BERT	Didn’t work❌	Maybe we should revisit
011-imb-learn	Imbalanced machine learning: case studies in image recognition, text classification, and machine translation	Incomplete◒	Docs
010-hyperparam-theory	A theory of hyperparameters	Incomplete◒	Book idea! Needs more time. 🕙
009-nmt-toolkits	A survey of NMT toolkits	Incomplete◒	Lost interest
008-asr-eval-macro	Macro-averaged evaluation for automatic speech recognition	Incomplete◒	(Some positive results, but needs more evidence)
007-mt-eval-macro	Macro Average: Rare Types are Important Too	Complete✅	NAACL 2021
006-many-to-eng	Many-to-English machine translation tools, data, and pretrained models	Complete✅	ACL 2021 Demos. Demo page
005-nmt-imbalance	Finding the optimal vocabulary size for neural machine translation	Complete✅	EMNLP 2020 Findings
005-nmt-imbalance-old	Neural machine translation with imbalanced classes	Complete✅	Rejected from *ACL; arXiv link
004-nmt-learning-curve	NMT learning curve revisited.	Complete✅	Not published
image-forensics-MFSec17	An Approach for Automatic and Large Scale Image Forensics	Complete✅	MFSec 2017
autoextractor	Clustering webpages based on structure and style similarity	Complete✅	IEEE IRI 2016
