Home
Latte is a GPU server, donated in part by NVIDIA Corp. for use by the CS community. It features 8 datacenter-class NVIDIA Tesla P100 GPUs, which offer a large speedup for machine learning and related GPU computing tasks. The TensorFlow and PyTorch libraries are available for use as well.
To begin using `latte`, you first need a CSUA account and membership in the `ml2018` group. You can check whether you are a member by logging into soda.csua.berkeley.edu and running the `id` command.
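For example, a membership check might look like the following; the username and the numeric IDs in the sample output are placeholders, not real values.

```bash
# Check your group memberships on soda.
# If ml2018 appears in the groups list, you already have access.
id
# Hypothetical output:
# uid=1234(myusername) gid=100(users) groups=100(users),2018(ml2018)
```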
To get a CSUA account, please visit our office in 311 Soda and an officer will create an account for you.
To get into the `ml2018` group, send an email to [email protected] with the following:
- Name
- CSUA Username
- Intended use
Once we receive your email, we will give you access to the group.
Once you have an account, you can log into latte.csua.berkeley.edu over SSH. This will bring you onto the `slurmctld` machine. From here, you can begin setting up your jobs.
`slurmctld` is meant for testing only. There are limits on the amount of compute you can use on this machine.
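Logging in is a standard SSH session; for example (the username below is a placeholder):

```bash
# Connect to latte; you will land on the slurmctld machine.
ssh [email protected]
```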
The `/datasets/` directory has some publicly available datasets under `/datasets/share/`. If you are using your own dataset, please place it in `/datasets/` inside a subdirectory of your choosing. `/datasets/` has the restricted deletion bit set, so anything you put in your subdirectory cannot be deleted by anyone but you (and root). Make sure the dataset you're adding does not already exist.
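A rough sketch of adding your own dataset; the directory name `my-dataset` and the source path are placeholders:

```bash
# Check whether an equivalent dataset is already available.
ls /datasets/ /datasets/share/

# Create a subdirectory you own and copy your data into it.
mkdir /datasets/my-dataset
cp -r ~/raw-data /datasets/my-dataset/
```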
When you first log in, you will have an empty home directory. The contents of your home directory on soda are in `/sodahome/`, which is mounted over a network filesystem and will be slower than `/home`. While it may be annoying to copy files over, I assure you nothing is worse than doing file operations over a network-mounted filesystem.
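For example, to work on a copy of a project from your soda home directory (`my-project` is a placeholder name):

```bash
# Copy the project from the network-mounted soda home directory
# into the local home directory on latte, then work on the local copy.
cp -r /sodahome/$USER/my-project ~/my-project
```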
Once your program runs correctly, you can submit it as a job.
Slurm is used to manage job scheduling on `latte`.
To run a job, you need to submit it using the `sbatch` command. You can read about how to use Slurm here.
This will send the job to one of the GPU nodes and run it.
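As a minimal sketch, a batch script might look like the following. The job name, resource requests, and training script are placeholder assumptions; the partitions and GPU limits actually configured on latte may differ, so consult the Slurm documentation linked above.

```bash
#!/bin/bash
# train.sh -- minimal example Slurm batch script (all values are placeholders).
#SBATCH --job-name=my-training-job
#SBATCH --gres=gpu:1            # request one GPU
#SBATCH --time=01:00:00         # one-hour time limit
#SBATCH --output=slurm-%j.out   # stdout/stderr goes to slurm-<jobid>.out

python train.py
```

Submit it with `sbatch train.sh` and check on it with `squeue -u $USER`.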
If you have any questions, please email [email protected].