Choosing Instance Types

When configuring a cloud-based cluster, there are many instance types to choose from, and each cloud service provider (CSP) has its own naming conventions. Each element of an instance type name corresponds to a characteristic of the underlying hardware that the cluster is deployed on. This page explains these naming conventions for AWS, Azure, and Google clusters.

Please note that specific instance types are subject to change as CSPs add and remove hardware; however, you can always refer to this page to help identify nodes and any additional features that are included with them.

AWS

On AWS, instance types are named following a pattern based on:

  • instance family
  • instance generation
  • processor family
  • additional capabilities
  • instance size

[Figure: An example of an AWS instance type with labels for each element, from the AWS documentation]

Instance Families

| Syntax | Description |
|--------|-------------|
| c | Compute optimized |
| d | Dense storage |
| f | FPGA |
| g | Graphics intensive |
| hpc | High performance computing |
| inf | AWS Inferentia |
| m | General purpose |
| mac | macOS |
| p | GPU accelerated |
| r | Memory optimized |
| t | Burstable performance |
| trn | AWS Trainium |
| u | High memory |
| vt | Video transcoding |
| x | Memory intensive |

Processor Families

| Syntax | Description |
|--------|-------------|
| a | AMD processors |
| g | AWS Graviton processors |
| i | Intel processors |

Note for the i syntax: Many older Intel-based instance types do not include this code. It was likely added when AWS began offering more Graviton and AMD-based options.

Additional Capabilities

| Syntax | Description |
|--------|-------------|
| d | Instance store volumes |
| n | Network and EBS optimized |
| e | Extra storage or memory |
| z | High performance |
| flex | Flex instance |
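
The naming pattern above can be sketched as a small parser. This is only an illustration: the regular expression and the function name `parse_aws_instance_type` are hypothetical, and the code assumes that a processor code (a, g, or i), when present, is the first letter after the generation number. Names that do not follow the pattern, or that use codes not listed in the tables on this page, are not handled.

```python
import re

# Codes copied from the Processor Families and Additional Capabilities tables.
PROCESSOR_FAMILIES = {
    "a": "AMD processors",
    "g": "AWS Graviton processors",
    "i": "Intel processors",
}
ADDITIONAL_CAPABILITIES = {
    "d": "Instance store volumes",
    "n": "Network and EBS optimized",
    "e": "Extra storage or memory",
    "z": "High performance",
    "flex": "Flex instance",
}

# Family letters, generation digits, optional attribute letters,
# optional "-flex" suffix, then "." and the instance size.
AWS_NAME = re.compile(
    r"^(?P<family>[a-z]+)(?P<generation>\d+)(?P<attrs>[a-z]*)"
    r"(?P<flex>-flex)?\.(?P<size>[\w-]+)$"
)


def parse_aws_instance_type(name):
    """Split an AWS instance type name into its naming elements."""
    match = AWS_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized AWS instance type: {name!r}")

    attrs = match["attrs"]
    processor = None
    # Assumption: a processor code, if present, comes first.
    if attrs and attrs[0] in PROCESSOR_FAMILIES:
        processor, attrs = attrs[0], attrs[1:]

    # Every remaining letter is treated as one additional capability.
    capabilities = list(attrs)
    if match["flex"]:
        capabilities.append("flex")

    return {
        "family": match["family"],
        "generation": int(match["generation"]),
        "processor": processor,
        "capabilities": capabilities,
        "size": match["size"],
    }


for example in ("c5n.18xlarge", "c6in.24xlarge", "hpc6a.48xlarge"):
    print(example, parse_aws_instance_type(example))
```

Older Intel-based names such as c5n.18xlarge come back with `processor` set to `None`, which matches the note about the i code above.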

Instance Generations

The generation is the number that follows the instance family, such as the 5 in c5n. AWS usually keeps older instance generations available for a period of time, but we suggest using newer generations for better performance.

Selection Guidelines

For CPU-based workloads, most PW users will want to select compute-optimized instance types, which include the c and hpc instance families:

  • c5n.18xlarge: One of PW’s default configuration instances. c5 instances are based on Intel Skylake processors. Note that the name has no i because it predates the addition of other processors to the family.
  • c6in.24xlarge: A newer generation of the c instance family. Note that this instance includes an i in the name to separate it from c6a (AMD) and c6g (Graviton) instances.
  • hpc6a.48xlarge: AMD EPYC-based instances designed specifically for HPC workloads.

For GPU-based workloads, look for instances in the g and p families:

  • g5.48xlarge: g5 instances are equipped with NVIDIA A10G Tensor Core GPUs and AMD EPYC processors.
  • p3.16xlarge: p3 instances include Intel Xeon E5-2686 v4 (Broadwell) processors and NVIDIA V100 Tensor Core GPUs; only the larger p3dn variant uses Intel Skylake processors.

Note: Instance options vary by zone and region. If you're trying to use a specific instance type and it's not visible in the dropdown, try changing to a different region first.

Further Reading

You can read more about AWS instances and naming conventions any time by visiting this page of their documentation.

Azure

Azure's naming structure follows this pattern:

[Family] + [Sub-family] + [# of vCPUs] + [Constrained vCPUs] + [Additive Features] + [Accelerator Type] + [Version]

On the PW platform, the name is also prefixed with the instance's tier (for example, Standard).

For example, the Azure instance Standard_HC44rs can be broken down into:

  • Tier: Standard
  • Family: H
  • Sub-family: C
  • # of vCPUs: 44
  • Additive Features: rs
    • r: RDMA capable
    • s: Premium Storage capable

Instance Families

| Syntax | Description |
|--------|-------------|
| A | Entry-level VMs for dev/test |
| Bs | Economical burstable VMs |
| D | General purpose compute |
| E | Optimized for in-memory applications |
| F | Compute optimized virtual machines |
| G | Memory and storage optimized virtual machines |
| H | High Performance Computing virtual machines |
| Ls | Storage optimized virtual machines |
| M | Memory optimized virtual machines |
| Mv2 | Largest memory optimized virtual machines |
| N | GPU-enabled virtual machines |

Instance Sub-families

Many Azure instance families include sub-families with different features. For example, H-Series instances come in two flavors:

  • HB: Up to 120 AMD EPYC 7003-series CPU cores, 448 GB of RAM, and no hyperthreading
  • HC: Up to 44 Intel Xeon Platinum 8168 processor cores, 8 GB of RAM per CPU core, no hyperthreading, and up to 4 Managed Disks

Additional Capabilities

| Syntax | Description |
|--------|-------------|
| a | AMD-based processor |
| b | Block Storage performance |
| d | diskful (that is, a local temp disk is present); this feature is for newer Azure VMs; see Ddv4 and Ddsv4-series |
| i | isolated size |
| l | low memory; a lower amount of memory than the memory intensive size |
| m | memory intensive; the largest amount of memory in a particular size |
| p | ARM CPU |
| t | tiny memory; the smallest amount of memory in a particular size |
| s | Premium Storage capable, including possible use of Ultra SSD |
| C | confidential |
| NP | node packing |
| r | RDMA capable |

Please note that this is not a complete list of additive features. Additionally, these identifiers are not used in all node types that may apply to them. For example, Standard_HB60rs instances have AMD EPYC processors, but don’t have an a listed as an additional capability.

Instance Generations (version)

Like other cloud providers, Azure routinely updates its instances with newer generations. Azure has a product page for each instance series that describes its specifications, current generation, and additional features. Information for the H-Series nodes can be found on this page.

Selection Guidelines

For compute clusters, we suggest using Azure H-Series nodes as they are InfiniBand/RDMA enabled for high-speed networking. Our primary default cluster configuration uses Standard_HC44rs instances.

For Lustre, stick to instances that list both d (local temp disk) and s (Premium Storage capable) as additional features, for their enhanced storage functionality.
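
The Lustre guideline can be expressed as a quick check on the instance name. This is a sketch only: the function name `has_lustre_storage_features` and the regular expression are hypothetical, and the code assumes the additive features are the lowercase letters that directly follow the vCPU count.

```python
import re

# Optional tier prefix, family/sub-family letters, vCPU count,
# optional constrained vCPU count, then the additive feature letters.
AZURE_FEATURES = re.compile(
    r"^(?:[A-Za-z]+_)?[A-Z]+\d+(?:-\d+)?(?P<features>[a-z]*)"
)


def has_lustre_storage_features(name):
    """Return True if the name lists both d and s as additive features."""
    match = AZURE_FEATURES.match(name)
    if match is None:
        raise ValueError(f"unrecognized Azure instance type: {name!r}")
    return {"d", "s"} <= set(match["features"])


for example in ("Standard_D4ds_v4", "Standard_HC44rs"):
    print(example, has_lustre_storage_features(example))
```

Standard_HC44rs lists s but not d, so it does not pass this check, while a Ddsv4-series name such as Standard_D4ds_v4 does.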

Further Reading

You can read more about Azure naming conventions by visiting this page of their documentation.

You can read more about Azure instance types on this page of their documentation.

Google

Machine Families & Series

Google instances fall into one of four categories (called families) and are further categorized by their series and generation.

  • General-purpose: best price-performance ratio for a variety of workloads
    • e2
    • n2, n2d, n1
    • c3
    • tau t2d, tau t2a
  • Compute-optimized: highest performance per core on Compute Engine and optimized for compute-intensive workloads
    • h3
    • c2, c2d
  • Memory-optimized: ideal for memory-intensive workloads, offering more memory per core than other machine families, with up to 12 TB of memory
    • m3, m2, m1
  • Accelerator-optimized: ideal for massively parallelized Compute Unified Device Architecture (CUDA) compute workloads, such as machine learning (ML) and high-performance computing (HPC); this family is the best option for workloads that require GPUs
    • a2
    • g2

Selection Guidelines

h3-standard-88 is Google’s newest node type that's suitable for HPC workloads. This type features 88 vCPUs (no hyperthreading), 352 GB of memory, and up to 200 Gbps of network egress bandwidth.

c2-standard-60 instances are smaller than the h3 nodes, but are also well suited for HPC applications.

For GPUs, try the a2 series.
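
As a rough sketch, the series listed under Machine Families & Series can be used to look up the family of a machine type. The function name `describe_google_machine_type` is hypothetical, and the code assumes names take the form series-type-size, as in h3-standard-88 and c2-standard-60. The size is reported as a vCPU count only when it is purely numeric.

```python
# Series grouped by family, copied from the list on this page.
SERIES_TO_FAMILY = {
    "e2": "General-purpose",
    "n2": "General-purpose",
    "n2d": "General-purpose",
    "n1": "General-purpose",
    "c3": "General-purpose",
    "t2d": "General-purpose",
    "t2a": "General-purpose",
    "h3": "Compute-optimized",
    "c2": "Compute-optimized",
    "c2d": "Compute-optimized",
    "m3": "Memory-optimized",
    "m2": "Memory-optimized",
    "m1": "Memory-optimized",
    "a2": "Accelerator-optimized",
    "g2": "Accelerator-optimized",
}


def describe_google_machine_type(name):
    """Look up the machine family for a name such as h3-standard-88."""
    series, _, rest = name.partition("-")
    machine_type, _, size = rest.partition("-")
    family = SERIES_TO_FAMILY.get(series)
    if family is None:
        raise ValueError(f"unknown Google machine series: {series!r}")
    return {
        "series": series,
        "family": family,
        "type": machine_type,
        # Assumption: a purely numeric size is the vCPU count.
        "vcpus": int(size) if size.isdigit() else None,
    }


for example in ("h3-standard-88", "c2-standard-60"):
    print(example, describe_google_machine_type(example))
```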

Further Reading

You can read more about Google instances and naming conventions any time by visiting this page of their documentation.