From 4e0b2350f2052f6d3bdd07187415d32485b4e76a Mon Sep 17 00:00:00 2001
From: Nan Xiao
Date: Sat, 26 Oct 2024 23:30:43 -0400
Subject: [PATCH] Remove less relevant text for blog post

---
 docs/articles/get-started.md  | 32 +-------------------------------
 docs/articles/get-started.qmd | 31 +------------------------------
 2 files changed, 2 insertions(+), 61 deletions(-)

diff --git a/docs/articles/get-started.md b/docs/articles/get-started.md
index 5839181..e9ef1aa 100644
--- a/docs/articles/get-started.md
+++ b/docs/articles/get-started.md
@@ -11,31 +11,7 @@
 python3 examples/get-started.py
 ```
 
-## Introduction
-
-Fitting topic models at scale using classical algorithms on CPUs can be
-slow. Carbonetto et al. (2022) demonstrated the equivalence between
-Poisson NMF and multinomial topic model likelihoods, and proposed a
-novel optimization strategy: fit a Poisson NMF via coordinate descent,
-then recover the corresponding topic model through a simple
-transformation. This method was implemented in their R package,
-[fastTopics](https://cran.r-project.org/package=fastTopics).
-
-Building on this theoretical insight, tinytopics adopts an alternative
-approach by directly solving a sum-to-one constrained neural Poisson
-NMF, optimized using stochastic gradient methods, implemented in
-PyTorch. Although this approach may not have the same theoretical
-guarantees, it does have a few potential practical benefits:
-
-- Scalable: Runs efficiently on both CPUs and GPUs and enables
-  large-scale topic modeling tasks.
-- Extensible: The model architecture is flexible and can be extended,
-  for example, by adding regularization or integrating with other data
-  modalities.
-- Minimal: The core implementation is kept simple and readable,
-  reflecting the package name: **tiny**topics.
-
-This article shows a canonical tinytopics workflow using a simulated
+Let’s walk through a canonical tinytopics workflow using a synthetic
 dataset.
 
 ## Import tinytopics
@@ -170,9 +146,3 @@ plot_top_terms(
 ```
 
 ![](images/get-started/F-top-terms-learned.png)
-
-## References
-
-Carbonetto, P., Sarkar, A., Wang, Z., & Stephens, M. (2021).
-Non-negative matrix factorization algorithms greatly improve topic model
-fits. arXiv Preprint arXiv:2105.13440.
diff --git a/docs/articles/get-started.qmd b/docs/articles/get-started.qmd
index 3d160c8..807bf2a 100644
--- a/docs/articles/get-started.qmd
+++ b/docs/articles/get-started.qmd
@@ -14,30 +14,7 @@ eval: false
 python3 examples/get-started.py
 ```
 
-## Introduction
-
-Fitting topic models at scale using classical algorithms on CPUs can be slow.
-Carbonetto et al. (2022) demonstrated the equivalence between Poisson NMF
-and multinomial topic model likelihoods, and proposed a novel optimization
-strategy: fit a Poisson NMF via coordinate descent, then recover the
-corresponding topic model through a simple transformation.
-This method was implemented in their R package,
-[fastTopics](https://cran.r-project.org/package=fastTopics).
-
-Building on this theoretical insight, tinytopics adopts an alternative
-approach by directly solving a sum-to-one constrained neural Poisson NMF,
-optimized using stochastic gradient methods, implemented in PyTorch.
-Although this approach may not have the same theoretical guarantees,
-it does have a few potential practical benefits:
-
-- Scalable: Runs efficiently on both CPUs and GPUs and enables large-scale
-  topic modeling tasks.
-- Extensible: The model architecture is flexible and can be extended,
-  for example, by adding regularization or integrating with other data modalities.
-- Minimal: The core implementation is kept simple and readable, reflecting
-  the package name: **tiny**topics.
-
-This article shows a canonical tinytopics workflow using a simulated dataset.
+Let's walk through a canonical tinytopics workflow using a synthetic dataset.
 
 ## Import tinytopics
 
@@ -168,9 +145,3 @@ plot_top_terms(
 ```
 
 ![](images/get-started/F-top-terms-learned.png)
-
-## References
-
-Carbonetto, P., Sarkar, A., Wang, Z., & Stephens, M. (2021).
-Non-negative matrix factorization algorithms greatly improve topic model fits.
-arXiv Preprint arXiv:2105.13440.
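The Introduction text removed by this patch describes tinytopics as directly solving a sum-to-one constrained neural Poisson NMF with stochastic gradient methods in PyTorch. For context, here is a rough standalone sketch of that idea; this is not tinytopics' actual API, and every name below is made up for illustration:

```python
import torch

torch.manual_seed(0)

# Toy document-term count matrix: n documents, m terms, k topics.
n, m, k = 100, 30, 3
X = torch.poisson(torch.rand(n, m) * 5.0)

# Unconstrained logits; a softmax keeps each row of L and F on the
# probability simplex (the sum-to-one constraint described above).
L_logit = torch.randn(n, k, requires_grad=True)  # document-topic
F_logit = torch.randn(k, m, requires_grad=True)  # topic-term
opt = torch.optim.Adam([L_logit, F_logit], lr=0.1)

for _ in range(200):
    L = torch.softmax(L_logit, dim=1)
    F = torch.softmax(F_logit, dim=1)
    rate = L @ F + 1e-10
    # Poisson negative log-likelihood, up to an X-dependent constant.
    loss = (rate - X * torch.log(rate)).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

L = torch.softmax(L_logit, dim=1)
F = torch.softmax(F_logit, dim=1)
```

Because each row of `L` and `F` is a softmax output, the fitted factors are directly interpretable as document-topic proportions and topic-term distributions, with no post-hoc transformation needed.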