
Commit

Remove less relevant text for blog post
nanxstats committed Oct 27, 2024
1 parent 1a27f23 commit 4e0b235
Showing 2 changed files with 2 additions and 61 deletions.
32 changes: 1 addition & 31 deletions docs/articles/get-started.md
@@ -11,31 +11,7 @@
 python3 examples/get-started.py
 ```
 
-## Introduction
-
-Fitting topic models at scale using classical algorithms on CPUs can be
-slow. Carbonetto et al. (2022) demonstrated the equivalence between
-Poisson NMF and multinomial topic model likelihoods, and proposed a
-novel optimization strategy: fit a Poisson NMF via coordinate descent,
-then recover the corresponding topic model through a simple
-transformation. This method was implemented in their R package,
-[fastTopics](https://cran.r-project.org/package=fastTopics).
-
-Building on this theoretical insight, tinytopics adopts an alternative
-approach by directly solving a sum-to-one constrained neural Poisson
-NMF, optimized using stochastic gradient methods, implemented in
-PyTorch. Although this approach may not have the same theoretical
-guarantees, it does have a few potential practical benefits:
-
-- Scalable: Runs efficiently on both CPUs and GPUs and enables
-  large-scale topic modeling tasks.
-- Extensible: The model architecture is flexible and can be extended,
-  for example, by adding regularization or integrating with other data
-  modalities.
-- Minimal: The core implementation is kept simple and readable,
-  reflecting the package name: **tiny**topics.
-
-This article shows a canonical tinytopics workflow using a simulated
+Let’s walk through a canonical tinytopics workflow using a synthetic
 dataset.
 
 ## Import tinytopics
@@ -170,9 +146,3 @@ plot_top_terms(
 ```
 
 ![](images/get-started/F-top-terms-learned.png)
-
-## References
-
-Carbonetto, P., Sarkar, A., Wang, Z., & Stephens, M. (2021).
-Non-negative matrix factorization algorithms greatly improve topic model
-fits. arXiv Preprint arXiv:2105.13440.
31 changes: 1 addition & 30 deletions docs/articles/get-started.qmd
@@ -14,30 +14,7 @@ eval: false
 python3 examples/get-started.py
 ```
 
-## Introduction
-
-Fitting topic models at scale using classical algorithms on CPUs can be slow.
-Carbonetto et al. (2022) demonstrated the equivalence between Poisson NMF
-and multinomial topic model likelihoods, and proposed a novel optimization
-strategy: fit a Poisson NMF via coordinate descent, then recover the
-corresponding topic model through a simple transformation.
-This method was implemented in their R package,
-[fastTopics](https://cran.r-project.org/package=fastTopics).
-
-Building on this theoretical insight, tinytopics adopts an alternative
-approach by directly solving a sum-to-one constrained neural Poisson NMF,
-optimized using stochastic gradient methods, implemented in PyTorch.
-Although this approach may not have the same theoretical guarantees,
-it does have a few potential practical benefits:
-
-- Scalable: Runs efficiently on both CPUs and GPUs and enables large-scale
-  topic modeling tasks.
-- Extensible: The model architecture is flexible and can be extended,
-  for example, by adding regularization or integrating with other data modalities.
-- Minimal: The core implementation is kept simple and readable, reflecting
-  the package name: **tiny**topics.
-
-This article shows a canonical tinytopics workflow using a simulated dataset.
+Let's walk through a canonical tinytopics workflow using a synthetic dataset.
 
 ## Import tinytopics
 
@@ -168,9 +145,3 @@ plot_top_terms(
 ```
 
 ![](images/get-started/F-top-terms-learned.png)
-
-## References
-
-Carbonetto, P., Sarkar, A., Wang, Z., & Stephens, M. (2021).
-Non-negative matrix factorization algorithms greatly improve topic model fits.
-arXiv Preprint arXiv:2105.13440.
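The introduction removed by this commit describes the Carbonetto et al. result that a fitted Poisson NMF can be converted into a multinomial topic model by a simple transformation (fastTopics implements this in R). As a minimal illustration of that transformation, here is a hypothetical NumPy sketch, not the fastTopics or tinytopics code: each row of the topic-term factor F is normalized to probabilities, and the absorbed per-topic scales are folded into the document loadings before normalizing those as well.

```python
import numpy as np

def poisson2multinom(L, F):
    """Recover multinomial topic model parameters (theta, phi)
    from nonnegative Poisson NMF factors X ~ Poisson(L @ F).

    L: (n_docs, k) document loadings; F: (k, n_terms) topic factors.
    Returns theta (doc-topic, rows sum to 1) and phi (topic-term,
    rows sum to 1) such that theta @ phi gives each document's
    term probabilities.
    """
    s = F.sum(axis=1)                          # per-topic scale
    phi = F / s[:, None]                       # normalize topic-term rows
    T = L * s[None, :]                         # absorb scales into loadings
    theta = T / T.sum(axis=1, keepdims=True)   # normalize doc-topic rows
    return theta, phi

# Small demonstration on random nonnegative factors.
rng = np.random.default_rng(0)
L = rng.gamma(2.0, size=(5, 3))
F = rng.gamma(2.0, size=(3, 8))
theta, phi = poisson2multinom(L, F)
```

Because each row of `theta @ phi` equals the corresponding row of `L @ F` divided by its row sum, the Poisson rates and the recovered multinomial probabilities describe the same relative term intensities per document.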
