Update on "add memory metrics to TensorBoard"
<img width="1391" alt="Screenshot 2024-02-15 at 5 19 09 PM" src="https://github.com/pytorch-labs/torchtrain/assets/150487191/af8a2efb-13ff-4e8f-84f2-b245784747ed">



[ghstack-poisoned]
tianyu-l committed Feb 17, 2024
1 parent e1a577d commit b77c89f
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions train.py
@@ -222,8 +222,8 @@ def main(args):
 gpu_mem_stats = gpu_metrics.get_current_stats(return_data=True)

 metrics = {
-    "loss/global_avg": global_avg_loss,
-    "loss/global_max": global_max_loss,
+    "loss_metrics/global_avg_loss": global_avg_loss,
+    "loss_metrics/global_max_loss": global_max_loss,
     "wps": wps,
     "memory_current/active(%)": gpu_mem_stats.active_curr,
     "memory_current/allocated(%)": gpu_mem_stats.allocated_curr,
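
The rename above moves both loss values under a shared `loss_metrics/` prefix. As a rough illustration of why that matters, the sketch below groups the metric tags by the text before the first `/`, which is assumed here to be how TensorBoard arranges scalar charts into sections. The helper `group_by_section` and the numeric values are hypothetical and are not part of `train.py`.

```python
from collections import defaultdict


def group_by_section(metrics):
    """Group metric tags by the text before the first '/' (hypothetical helper)."""
    sections = defaultdict(list)
    for tag in metrics:
        # A tag without '/' ends up in a section named after itself.
        section = tag.split("/", 1)[0]
        sections[section].append(tag)
    return dict(sections)


# Placeholder values for illustration only; in train.py they come from the
# training loop and from gpu_mem_stats.
metrics = {
    "loss_metrics/global_avg_loss": 2.31,
    "loss_metrics/global_max_loss": 2.87,
    "wps": 1500.0,
    "memory_current/active(%)": 41.2,
    "memory_current/allocated(%)": 38.9,
}

sections = group_by_section(metrics)
for section, tags in sections.items():
    print(section, tags)
```

Under that assumption, the two loss tags share one section, the memory tags share another, and `wps` stands alone.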