Choosing Instance Types

When configuring a cloud-based cluster, there are many instance types to choose from, and each cloud service provider (CSP) has its own naming conventions. Elements of an instance type name correspond to characteristics of the physical machines on which clusters are deployed. This page explains the instance naming conventions for AWS, Azure, and Google clusters.

Please note that specific instance types are subject to change as CSPs add and remove hardware; however, you can always refer to this page to help identify nodes and any additional features that are included with them.

AWS

On AWS, instance types are named following a pattern based on:

instance family
instance generation
processor family
additional capabilities
instance size

An example of an AWS instance type from their documentation

Instance Families

Syntax	Description
c	Compute optimized
d	Dense storage
f	FPGA
g	Graphics intensive
hpc	High performance computing
inf	AWS Inferentia
m	General purpose
mac	macOS
p	GPU accelerated
r	Memory optimized
t	Burstable performance
trn	AWS Trainium
u	High memory
vt	Video transcoding
x	Memory intensive

Processor Families

Syntax	Description
a	AMD processors
g	AWS Graviton processors
i	Intel processor

Note for the i syntax: Many older Intel-based instance types do not include this code. It was likely added when AWS began offering more Graviton and AMD-based options.

Additional Capabilities

Syntax	Description
d	Instance store volumes
n	Network and EBS optimized
e	Extra storage or memory
z	High performance
flex	Flex instance
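
The naming pattern and lookup tables above can be sketched as a small parser. This is a minimal illustration, not an official AWS tool: the function name parse_aws_instance_type, the regular expression, and the rule that a processor code (when present) is the first letter after the generation number are all assumptions made for this example. Names that do not follow the simple layout (for example, some high-memory u types) are rejected rather than guessed at.

```python
import re

# Lookup tables copied from the Processor Families and Additional Capabilities tables.
PROCESSOR_FAMILIES = {"a": "AMD", "g": "AWS Graviton", "i": "Intel"}
CAPABILITIES = {
    "d": "Instance store volumes",
    "n": "Network and EBS optimized",
    "e": "Extra storage or memory",
    "z": "High performance",
}

# Assumed layout: <family letters><generation digits><attribute letters>[-flex].<size>
INSTANCE_TYPE = re.compile(r"^([a-z]+)(\d+)([a-z]*)(-flex)?\.([a-z0-9-]+)$")


def parse_aws_instance_type(name):
    """Split an AWS instance type name into its naming elements (hypothetical helper)."""
    match = INSTANCE_TYPE.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    family, generation, attributes, flex, size = match.groups()

    # Heuristic (an assumption): a processor code, if any, is the first attribute letter.
    processor = None
    if attributes and attributes[0] in PROCESSOR_FAMILIES:
        processor = PROCESSOR_FAMILIES[attributes[0]]
        attributes = attributes[1:]

    # Remaining letters are treated as additional capabilities.
    capabilities = [CAPABILITIES.get(code, f"unknown ({code})") for code in attributes]
    if flex:
        capabilities.append("Flex instance")

    return {
        "family": family,
        "generation": int(generation),
        "processor": processor,  # None when the name carries no processor code
        "capabilities": capabilities,
        "size": size,
    }


if __name__ == "__main__":
    for example in ("c5n.18xlarge", "c6in.24xlarge", "hpc6a.48xlarge"):
        print(example, parse_aws_instance_type(example))
```

A processor value of None only means the name carries no processor code; as noted above, many older Intel-based types omit the i.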

Instance Generations

Older instance generations usually remain available for a period of time, but newer generations are recommended for optimal performance.

documentation for the current generation of AWS instances
documentation for the previous generation of AWS instances

Selection Guidelines

For CPU-based workloads, most PW users will want to select compute-optimized instance types, which include the c and hpc instance families:

c5n.18xlarge: One of PW’s default configuration instances. c5 instances are based on Intel Skylake processors. Note that the name has no i because it predates the addition of other processor types to the family.
c6in.24xlarge: A newer generation of the c instance family. Note that this instance includes an i in the name to separate it from c6a (AMD) and c6g (Graviton) instances.
hpc6a.48xlarge: AMD EPYC-based instances designed specifically for HPC workloads

For GPU-based workloads, look for instances in the g and p families:

g5.48xlarge: g5 instances are equipped with NVIDIA A10G Tensor Core GPUs and AMD EPYC processors.
p3.16xlarge: p3 instances include NVIDIA V100 Tensor Core GPUs. Sizes up to p3.16xlarge use Intel Xeon E5-2686 v4 (Broadwell) processors, while p3dn.24xlarge uses Intel Skylake processors.

Note

Instance options vary by zone and region. If you're trying to use a specific instance type and it's not visible in the dropdown, try changing to a different region first.

Azure

Azure's naming structure follows this pattern:

[Family] + [Sub-family] + [# of vCPUs] + [Constrained vCPUs] + [Additive Features] + [Accelerator Type] + [Version]

On the PW platform, the instance's tier (for example, Standard) is also shown as a prefix to the name.

For example, the Azure instance Standard_HC44rs can be broken down into:

Tier: Standard
Family: H
Sub-family: C
CPUs: 44
Additive Features: rs
- r: RDMA capable
- s: Premium Storage capable

Instance Families

Syntax	Description
A	Entry-level VMs for dev/test
Bs	Economical burstable VMs
D	General purpose compute
E	Optimized for in-memory applications
F	Compute optimized virtual machines
G	Memory and storage optimized virtual machines
H	High Performance Computing virtual machines
Ls	Storage optimized virtual machines
M	Memory optimized virtual machines
Mv2	Largest memory optimized virtual machines
N	GPU-enabled virtual machines

Instance Sub-families

Many Azure instance families include sub-families with different features. For example, H-Series instances come in two flavors:

HB: Up to 120 AMD EPYC 7003-series CPU cores, 448 GB of RAM, and no hyperthreading
HC: Up to 44 Intel Xeon Platinum 8168 processor cores, 8 GB of RAM per CPU core, no hyperthreading, and up to 4 Managed Disks

Additional Capabilities

Syntax	Description
a	AMD-based processor
b	Block Storage performance
d	diskful (that is, a local temp disk is present); this feature is for newer Azure VMs; see Ddv4 and Ddsv4-series
i	isolated size
l	low memory; a lower amount of memory than the memory intensive size
m	memory intensive; the largest amount of memory in a particular size
p	ARM CPU
t	tiny memory; the smallest amount of memory in a particular size
s	Premium Storage capable, including possible use of Ultra SSD
C	confidential
NP	node packing
r	RDMA capable

Please note that this is not a complete list of additive features. Additionally, these identifiers do not appear in every node type they could apply to. For example, Standard_HB60rs instances have AMD EPYC processors, but don’t have an a listed as an additional capability.
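
The Azure naming pattern can be sketched as a regular expression that pulls a size name apart. This is an illustration under stated assumptions rather than an Azure API: the function name parse_azure_size and the regex are hypothetical, only the lowercase codes from the table above are decoded, and the sketch assumes the constrained vCPU count is written as -N after the vCPU count and the version as a _vN suffix.

```python
import re

# Lowercase additive feature codes copied from the table above (not a complete list).
ADDITIVE_FEATURES = {
    "a": "AMD-based processor",
    "b": "Block Storage performance",
    "d": "diskful (local temp disk)",
    "i": "isolated size",
    "l": "low memory",
    "m": "memory intensive",
    "p": "ARM CPU",
    "t": "tiny memory",
    "s": "Premium Storage capable",
    "r": "RDMA capable",
}

# Assumed layout:
# <Tier>_<Family><Sub-family><vCPUs>[-<constrained>]<features>[_<Accelerator>][_v<version>]
AZURE_SIZE = re.compile(
    r"^(?P<tier>[A-Za-z]+)_"
    r"(?P<family>[A-Z])(?P<subfamily>[A-Z]*)"
    r"(?P<vcpus>\d+)(?:-(?P<constrained>\d+))?"
    r"(?P<features>[a-z]*)"
    r"(?:_(?P<accelerator>[A-Z][A-Za-z0-9]*))?"
    r"(?:_v(?P<version>\d+))?$"
)


def parse_azure_size(name):
    """Break an Azure size name into its naming elements (hypothetical helper)."""
    match = AZURE_SIZE.match(name)
    if match is None:
        raise ValueError(f"unrecognized Azure size name: {name!r}")
    parts = match.groupdict()
    return {
        "tier": parts["tier"],
        "family": parts["family"],
        "subfamily": parts["subfamily"] or None,
        "vcpus": int(parts["vcpus"]),
        "constrained_vcpus": int(parts["constrained"]) if parts["constrained"] else None,
        # Map each feature letter to its description; unknown letters are kept visible.
        "features": {
            code: ADDITIVE_FEATURES.get(code, "unknown") for code in parts["features"]
        },
        "accelerator": parts["accelerator"],
        "version": int(parts["version"]) if parts["version"] else None,
    }


if __name__ == "__main__":
    print(parse_azure_size("Standard_HC44rs"))
```

Because the identifiers are applied inconsistently, the absence of a code in the parsed result does not prove that the feature is absent from the node.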

Instance Generations (version)

Like other cloud providers, Azure routinely updates its instances with newer generations. Azure has product pages for each instance series that describe its specifications, current generation, and additional features. Information for the H-Series nodes can be found on the H-Series product page.

Selection Guidelines

For compute clusters, we suggest using Azure H-Series nodes as they are InfiniBand/RDMA enabled for high-speed networking. Our primary default cluster configuration uses Standard_HC44rs instances.

For Lustre, stick to instances that have d and s listed as additional features for their enhanced storage functionality.

Google

Machine Families & Series

Google instances fall into one of four categories (called families) and are further categorized by their series and generation.

General-purpose: best price-performance ratio for a variety of workloads
- e2
- n2, n2d, n1
- c3
- tau t2d, tau t2a
Compute-optimized: highest performance per core on Compute Engine and optimized for compute-intensive workloads
- h3
- c2, c2d
Memory-optimized: ideal for memory-intensive workloads, offering more memory per core than other machine families, with up to 12 TB of memory
- m3, m2, m1
Accelerator-optimized: ideal for massively parallelized Compute Unified Device Architecture (CUDA) compute workloads, such as machine learning (ML) and high-performance computing (HPC); this family is the best option for workloads that require GPUs
- a2
- g2
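
The family and series list above can be expressed as a lookup table. The sketch below is illustrative only: the helper describe_machine_type is hypothetical, it assumes a machine type name begins with its series followed by a hyphen (as in h3-standard-88), and it assumes the Tau series appear in names as t2d and t2a without the "tau" prefix.

```python
# Series grouped by machine family, copied from the list above.
MACHINE_FAMILIES = {
    "General-purpose": ["e2", "n2", "n2d", "n1", "c3", "t2d", "t2a"],
    "Compute-optimized": ["h3", "c2", "c2d"],
    "Memory-optimized": ["m3", "m2", "m1"],
    "Accelerator-optimized": ["a2", "g2"],
}

# Invert the table so a series can be looked up directly.
SERIES_TO_FAMILY = {
    series: family
    for family, series_list in MACHINE_FAMILIES.items()
    for series in series_list
}


def describe_machine_type(machine_type):
    """Return the series and machine family for a name like 'h3-standard-88'."""
    # Assumption: everything before the first hyphen is the series.
    series, _, shape = machine_type.partition("-")
    family = SERIES_TO_FAMILY.get(series)
    if family is None:
        raise ValueError(f"series {series!r} is not in the lookup table")
    return {"series": series, "family": family, "shape": shape}


if __name__ == "__main__":
    for example in ("h3-standard-88", "c2-standard-60"):
        print(example, describe_machine_type(example))
```

Series that Google adds later will raise a ValueError until the table is updated, which is preferable to silently guessing a family.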

Selection Guidelines

h3-standard-88 is Google’s newest node type suitable for HPC workloads. This type features 88 vCPUs (no hyperthreading), 352 GB of memory, and up to 200 Gbps of network egress bandwidth.

c2-standard-60 instances are smaller than the h3 nodes, but are also well suited for HPC applications.

For GPUs, try the a2 series.
