Fix CPU model perf by increasing learning rate
nanxstats committed Oct 20, 2024
1 parent 84872fa commit 50db051
Showing 10 changed files with 44 additions and 19 deletions.
10 changes: 3 additions & 7 deletions README.md
@@ -5,9 +5,9 @@ matrix factorization (NMF), built on PyTorch and runs on GPU.

## Installation

First, install a CUDA-enabled version of
[PyTorch](https://pytorch.org/get-started/locally/) and run with an Nvidia GPU.
At the moment, only the Linux and Windows versions of PyTorch support CUDA.
First, [install PyTorch](https://pytorch.org/get-started/locally/).
To run on Nvidia GPUs, install a CUDA-enabled version on supported platforms
(at the moment, Linux and Windows).

You can install tinytopics from PyPI:

@@ -24,7 +24,3 @@ python3 -m pip install -e .
```

Try [getting started](articles/get-started.md).

## Known issues

- [ ] Running on CPU produces different (and worse) models than on GPU.
13 changes: 12 additions & 1 deletion docs/articles/get-started.md
@@ -40,7 +40,7 @@ X, true_L, true_F = generate_synthetic_data(n, m, k, avg_doc_length=256 * 256)
Train the model

``` python
model, losses = fit_model(X, k)
model, losses = fit_model(X, k, learning_rate=0.01)
```

Plot loss curve
@@ -51,6 +51,17 @@ plot_loss(losses, output_file="loss.png")

![](images/loss.png)

!!! tip

    Model performance can be sensitive to the learning rate.
    If you see suboptimal results or performance discrepancies between
    models trained on CPU and GPU, tuning the learning rate can help.

    For example, the default learning rate of 0.001 on this synthetic
    dataset can lead to inconsistent results between devices (a worse model
    on CPU than GPU). Increasing the learning rate to 0.01 significantly
    improves model fit and gives consistent performance across both devices.
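The sensitivity described in the tip above can be illustrated with a toy example. This is not tinytopics code — just plain gradient descent on a one-dimensional quadratic — but it shows how, under a fixed epoch budget like `fit_model`'s default of 200, a too-small learning rate can leave the optimizer far from convergence:

``` python
# Toy illustration (not tinytopics): gradient descent on f(w) = (w - 3)^2.
# With a fixed epoch budget, a 10x smaller learning rate can leave a
# much larger final loss.
def descend(learning_rate, num_epochs=200, w=0.0):
    for _ in range(num_epochs):
        grad = 2 * (w - 3)        # derivative of (w - 3)^2
        w -= learning_rate * grad
    return (w - 3) ** 2           # final loss

for lr in (0.001, 0.01):
    print(f"lr={lr}: final loss {descend(lr):.6f}")
```

Here `lr=0.01` drives the loss close to zero within 200 steps, while `lr=0.001` does not — the same qualitative effect the tip describes for the NMF model, although the real loss surface is of course far less well behaved than a quadratic.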

## Post-process results

Derive matrices
13 changes: 12 additions & 1 deletion docs/articles/get-started.qmd
@@ -43,7 +43,7 @@ X, true_L, true_F = generate_synthetic_data(n, m, k, avg_doc_length=256 * 256)
Train the model

```{python}
model, losses = fit_model(X, k)
model, losses = fit_model(X, k, learning_rate=0.01)
```

Plot loss curve
@@ -54,6 +54,17 @@ plot_loss(losses, output_file="loss.png")

![](images/loss.png)

!!! tip

    Model performance can be sensitive to the learning rate.
    If you see suboptimal results or performance discrepancies between
    models trained on CPU and GPU, tuning the learning rate can help.

    For example, the default learning rate of 0.001 on this synthetic
    dataset can lead to inconsistent results between devices (a worse model
    on CPU than GPU). Increasing the learning rate to 0.01 significantly
    improves model fit and gives consistent performance across both devices.

## Post-process results

Derive matrices
Binary file modified docs/articles/images/F-top-terms-learned.png
Binary file modified docs/articles/images/L-learned.png
Binary file modified docs/articles/images/L-true.png
Binary file modified docs/articles/images/loss.png
10 changes: 3 additions & 7 deletions docs/index.md
@@ -5,9 +5,9 @@ matrix factorization (NMF), built on PyTorch and runs on GPU.

## Installation

First, install a CUDA-enabled version of
[PyTorch](https://pytorch.org/get-started/locally/) and run with an Nvidia GPU.
At the moment, only the Linux and Windows versions of PyTorch support CUDA.
First, [install PyTorch](https://pytorch.org/get-started/locally/).
To run on Nvidia GPUs, install a CUDA-enabled version on supported platforms
(at the moment, Linux and Windows).

You can install tinytopics from PyPI:

@@ -24,7 +24,3 @@ python3 -m pip install -e .
```

Try [getting started](articles/get-started.md).

## Known issues

- [ ] Running on CPU produces different (and worse) models than on GPU.
13 changes: 12 additions & 1 deletion examples/get-started.py
@@ -52,7 +52,7 @@
# In[ ]:


model, losses = fit_model(X, k)
model, losses = fit_model(X, k, learning_rate=0.01)


# Plot loss curve
@@ -65,6 +65,17 @@

# ![](images/loss.png)
#
# !!! tip
#
#     Model performance can be sensitive to the learning rate.
#     If you see suboptimal results or performance discrepancies between
#     models trained on CPU and GPU, tuning the learning rate can help.
#
#     For example, the default learning rate of 0.001 on this synthetic
#     dataset can lead to inconsistent results between devices (a worse model
#     on CPU than GPU). Increasing the learning rate to 0.01 significantly
#     improves model fit and gives consistent performance across both devices.
#
# ## Post-process results
#
# Derive matrices
4 changes: 2 additions & 2 deletions tinytopics/fit.py
@@ -3,16 +3,16 @@
from .models import NeuralPoissonNMF


def fit_model(X, k, num_epochs=200, batch_size=64, learning_rate=0.001, device=None):
def fit_model(X, k, learning_rate=0.001, num_epochs=200, batch_size=64, device=None):
"""
Fit topic model via sum-to-one constrained neural Poisson NMF using batch gradient descent.
Args:
X (torch.Tensor): Document-term matrix.
k (int): Number of topics.
learning_rate (float, optional): Learning rate for Adam optimizer. Default is 0.001.
num_epochs (int, optional): Number of training epochs. Default is 200.
batch_size (int, optional): Number of documents per batch. Default is 64.
learning_rate (float, optional): Learning rate for Adam optimizer. Default is 0.001.
device (torch.device, optional): Device to run the training on. Defaults to CUDA if available, otherwise CPU.
Returns:
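One side effect of reordering `fit_model`'s keyword defaults (moving `learning_rate` before `num_epochs`) is that any caller passing these positionally under the old order would now silently set a different parameter. The sketch below uses a hypothetical stub that only echoes its arguments — no PyTorch or tinytopics required — to show why keyword arguments are the safe calling convention here:

``` python
# Hypothetical stub mirroring the reordered fit_model signature in this
# commit; it echoes its arguments so the keyword-vs-positional point is
# visible without installing tinytopics or PyTorch.
def fit_model(X, k, learning_rate=0.001, num_epochs=200, batch_size=64, device=None):
    return {"learning_rate": learning_rate, "num_epochs": num_epochs}

# Keyword arguments are unaffected by the reorder:
print(fit_model(None, 10, num_epochs=100))  # learning_rate keeps its default

# A positional call written against the OLD order (num_epochs in the third
# slot) would now be interpreted as a learning rate of 100:
print(fit_model(None, 10, 100))
```

Since the documented examples all pass `learning_rate=0.01` by keyword, they are unaffected by the reorder.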
