Description

This module provisions a PBS Server Host to operate and administer a PBS Professional cluster. The following requirements must be observed:

  • An existing Altair license server with sufficient licenses to run PBS Pro must be available.
  • Jobs should be submitted from a network filesystem mounted on all hosts, so that job files and their logs can be transferred between hosts.

Example

The following example snippet demonstrates use of the server module in concert with the pbspro-preinstall and filestore modules.

  - id: pbspro_server
    source: community/modules/scheduler/pbspro-server
    use:
    - homefs
    - pbspro_preinstall
    settings:
      pbs_license_server:  ## IP address or resolvable DNS name of license server
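
As a slightly fuller sketch, the snippet below sets a few of the optional inputs documented in the Inputs table. The license server address (taken from a documentation-only IP range), the queue name batchq, and the qmgr commands are placeholder values chosen for illustration, not recommendations. Required inputs that are not shown are assumed to be supplied through use or through deployment variables, as in the example above.

```yaml
  - id: pbspro_server
    source: community/modules/scheduler/pbspro-server
    use:
    - homefs
    - pbspro_preinstall
    settings:
      # Placeholder address (RFC 5737 documentation range); replace with your license server.
      pbs_license_server: 192.0.2.10
      # 6200 is the module default; shown only to make the setting visible.
      pbs_license_server_port: 6200
      # Illustrative qmgr commands in the style printed by `qmgr -c 'print server'`.
      # "batchq" is a hypothetical queue name.
      server_conf: |
        create queue batchq
        set queue batchq queue_type = Execution
        set queue batchq enabled = True
        set queue batchq started = True
        set server default_queue = batchq
```

Passing server_conf as a YAML block scalar keeps each qmgr command on its own line, which matches the line-oriented format the input description asks for.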

GPU Support

More information on GPU support in PBS Pro and other Cluster Toolkit modules can be found in docs/gpu-support.md.

Support

PBS Professional is licensed and supported by Altair. This module is maintained and supported by the Cluster Toolkit team in collaboration with Altair.

License

Copyright 2022 Google LLC

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Requirements

| Name | Version |
|------|---------|
| terraform | >= 0.14.0 |

Providers

No providers.

Modules

| Name | Source | Version |
|------|--------|---------|
| pbs_install | github.com/GoogleCloudPlatform/hpc-toolkit//community/modules/scripts/pbspro-install | v1.36.0&depth=1 |
| pbs_qmgr | github.com/GoogleCloudPlatform/hpc-toolkit//community/modules/scripts/pbspro-qmgr | v1.36.0&depth=1 |
| pbs_server | github.com/GoogleCloudPlatform/hpc-toolkit//modules/compute/vm-instance | v1.36.0&depth=1 |
| server_startup_script | github.com/GoogleCloudPlatform/hpc-toolkit//modules/scripts/startup-script | v1.36.0&depth=1 |

Resources

No resources.

Inputs

Name Description Type Default Required
auto_delete_boot_disk Controls if boot disk should be auto-deleted when instance is deleted. bool true no
bandwidth_tier Tier 1 bandwidth increases the maximum egress bandwidth for VMs.
Using the tier_1_enabled setting will enable both gVNIC and TIER_1 higher bandwidth networking.
Using the gvnic_enabled setting will only enable gVNIC and will not enable TIER_1.
Note that TIER_1 only works with specific machine families & shapes and requires an image that supports gVNIC. See official docs for more details.
string "not_enabled" no
client_host_count Number of client hosts to configure number 0 no
client_hostname_prefix Name prefix for client hosts string n/a yes
deployment_name Cluster Toolkit deployment name. Cloud resource names will include this value. string n/a yes
disk_size_gb Size of disk for instances. number 200 no
disk_type Disk type for instances. string "pd-standard" no
enable_oslogin Enable or Disable OS Login with "ENABLE" or "DISABLE". Set to "INHERIT" to inherit project OS Login setting. string "ENABLE" no
enable_public_ips If set to true, instances will have public IPs on the internet. bool true no
execution_host_count Number of execution hosts to configure number n/a yes
execution_hostname_prefix Name prefix for execution hosts string n/a yes
guest_accelerator List of the type and count of accelerator cards attached to the instance.
list(object({
type = string,
count = number
}))
null no
instance_count Number of instances number 1 no
instance_image Instance Image

Expected Fields:
name: The name of the image. Mutually exclusive with family.
family: The image family to use. Mutually exclusive with name.
project: The project where the image is hosted.
map(string)
{
"family": "hpc-centos-7",
"project": "cloud-hpc-image-public"
}
no
labels Labels to add to the instances. Key-value pairs. map(string) n/a yes
local_ssd_count The number of local SSDs to attach to each VM. See https://cloud.google.com/compute/docs/disks/local-ssd. number 0 no
local_ssd_interface Interface to be used with local SSDs. Can be either 'NVME' or 'SCSI'. No effect unless local_ssd_count is also set. string "NVME" no
machine_type Machine type to use for the instance creation string "c2-standard-60" no
metadata Metadata, provided as a map map(string) {} no
name_prefix Name prefix for PBS execution hostnames string null no
network_interfaces A list of network interfaces. The options match that of the terraform
network_interface block of google_compute_instance. For descriptions of the
subfields or more information see the documentation:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance#nested_network_interface

_NOTE:_ If network_interfaces are set, network_self_link and
subnetwork_self_link will be ignored, even if they are provided through
the use field. bandwidth_tier and enable_public_ips also do not apply
to network interfaces defined in this variable.

Subfields:
network (string, required if subnetwork is not supplied)
subnetwork (string, required if network is not supplied)
subnetwork_project (string, optional)
network_ip (string, optional)
nic_type (string, optional, choose from ["GVNIC", "VIRTIO_NET"])
stack_type (string, optional, choose from ["IPV4_ONLY", "IPV4_IPV6"])
queue_count (number, optional)
access_config (object, optional)
ipv6_access_config (object, optional)
alias_ip_range (list(object), optional)
list(object({
network = string,
subnetwork = string,
subnetwork_project = string,
network_ip = string,
nic_type = string,
stack_type = string,
queue_count = number,
access_config = list(object({
nat_ip = string,
public_ptr_domain_name = string,
network_tier = string
})),
ipv6_access_config = list(object({
public_ptr_domain_name = string,
network_tier = string
})),
alias_ip_range = list(object({
ip_cidr_range = string,
subnetwork_range_name = string
}))
}))
[] no
network_self_link The self link of the network to attach the VM. string "default" no
network_storage An array of network attached storage mounts to be configured.
list(object({
server_ip = string,
remote_mount = string,
local_mount = string,
fs_type = string,
mount_options = string,
client_install_runner = map(string)
mount_runner = map(string)
}))
[] no
on_host_maintenance Describes maintenance behavior for the instance. If left blank, this defaults to MIGRATE, except when placement_policy, spot provisioning, or GPUs require it to be TERMINATE. string null no
pbs_data_service_user PBS Data Service POSIX user string "pbsdata" no
pbs_exec Root path in which to install PBS string "/opt/pbs" no
pbs_home PBS working directory string "/var/spool/pbs" no
pbs_license_server IP address or DNS name of PBS license server string n/a yes
pbs_license_server_port Networking port of PBS license server number 6200 no
pbs_server_rpm_url Path to PBS Pro Server Host RPM file string n/a yes
placement_policy Control where your VM instances are physically located relative to each other within a zone.
object({
vm_count = number,
availability_domain_count = number,
collocation = string,
})
null no
project_id Project in which Google Cloud Storage bucket will be created string n/a yes
region Default region for creating resources string n/a yes
server_conf A sequence of qmgr commands in format as generated by qmgr -c 'print server' string "# empty qmgr configuration file" no
service_account Service account to attach to the instance. See https://www.terraform.io/docs/providers/google/r/compute_instance_template.html#service_account.
object({
email = string,
scopes = set(string)
})
{
"email": null,
"scopes": [
"https://www.googleapis.com/auth/devstorage.read_write",
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring.write",
"https://www.googleapis.com/auth/servicecontrol",
"https://www.googleapis.com/auth/service.management.readonly",
"https://www.googleapis.com/auth/trace.append"
]
}
no
spot Provision VMs using discounted Spot pricing, allowing for preemption bool false no
startup_script Startup script used on the instance string null no
subnetwork_self_link The self link of the subnetwork to attach the VM. string null no
tags Network tags, provided as a list list(string) [] no
threads_per_core Sets the number of threads per physical core. By setting threads_per_core
to 2, Simultaneous Multithreading (SMT) is enabled extending the total number
of virtual cores. For example, a machine of type c2-standard-60 will have 60
virtual cores with threads_per_core equal to 2. With threads_per_core equal
to 1 (SMT turned off), only the 30 physical cores will be available on the VM.

The default value of "0" will turn off SMT for supported machine types, and
will fall back to GCE defaults for unsupported machine types (t2d, shared-core
instances, or instances with fewer than 2 vCPUs).

Disabling SMT can improve performance in many HPC workloads, so it is
disabled by default where compatible.

null = SMT configuration will use the GCE defaults for the machine type
0 = SMT will be disabled where compatible (default)
1 = SMT will always be disabled (will fail on incompatible machine types)
2 = SMT will always be enabled (will fail on incompatible machine types)
number 0 no
zone Default zone for creating resources string n/a yes
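
To make the threads_per_core behavior described above concrete, the following settings fragment is a minimal sketch that keeps SMT enabled on the default machine type. The values are illustrative assumptions; only the setting names and defaults come from the Inputs table.

```yaml
    settings:
      machine_type: c2-standard-60
      # 2 = SMT always enabled; per the description above, all 60 virtual cores are exposed.
      # With 1, or with the default of 0 on this machine type, only the 30 physical cores
      # would be available.
      threads_per_core: 2
      # Illustrative boot disk overrides (the defaults are "pd-standard" and 200).
      disk_type: pd-ssd
      disk_size_gb: 250
```

Whether enabling SMT helps depends on the workload, so a value like this is best treated as something to benchmark rather than a default to copy.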

Outputs

| Name | Description |
|------|-------------|
| pbs_server | Name of the controller node |