HPC 101 - Scalability

Introduction

Scalability is a vast and complex topic within the HPC research field, so this section serves only as a brief starting point. In short, scalability refers to how the performance of a workload changes as computational resources are increased.

Calculating the performance gain from scaling up

For example, suppose you have a workload that takes 10 days to complete on a single node with 64 cores (threads). When you add another node with the same specifications, scalability is measured by the reduction in runtime that you observe. If the workload now completes in 5 days, you have achieved a 2x speedup (a 100% increase in performance) from doubling the resources. This type of scaling is referred to as linear scaling.
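
As a rough illustration, the calculation above can be expressed as a short Python sketch. The function names and numbers here are purely illustrative and mirror the 10-day / 5-day example; they are not part of any HPC tool.

    # A minimal sketch of the speedup calculation described above.
    # Numbers mirror the example: 10 days on 1 node, 5 days on 2 nodes.

    def speedup(time_before, time_after):
        """Speedup = old runtime divided by new runtime (2.0 means 2x faster)."""
        return time_before / time_after

    def parallel_efficiency(time_before, time_after, resource_factor):
        """Speedup divided by the factor of added resources (1.0 = perfectly linear)."""
        return speedup(time_before, time_after) / resource_factor

    if __name__ == "__main__":
        s = speedup(10.0, 5.0)                    # 2.0  -> 2x (100%) faster
        e = parallel_efficiency(10.0, 5.0, 2.0)   # 1.0  -> linear scaling
        print(f"Speedup: {s:.1f}x, efficiency: {e:.0%}")

An efficiency close to 1.0 (100%) indicates linear scaling; values well below that indicate diminishing returns, as discussed in the next section.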

Scenarios for performance scaling

However, scaling is not always linear. Scaling may remain linear up to a specific number of cores (threads) or nodes and then diminish, for example yielding only a 0.1x (10%) performance improvement after increasing the resources by 100%. Performance may even decrease beyond a certain point.
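
One common, simplified way to model this diminishing return is Amdahl's law, which is not named in this guide but captures the idea that the portion of a workload that cannot be parallelized limits the achievable speedup. The sketch below assumes a hypothetical workload that is 95% parallelizable; real workloads must be measured rather than assumed.

    # Illustrative only: Amdahl's law as one simple model of diminishing scaling.
    # serial_fraction is an assumed value for a hypothetical workload.

    def amdahl_speedup(n_cores, serial_fraction):
        """Theoretical speedup on n_cores when serial_fraction of the work stays serial."""
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

    if __name__ == "__main__":
        serial_fraction = 0.05  # assumed: 5% of the workload cannot be parallelized
        for cores in (1, 2, 4, 8, 64, 128, 1024):
            print(f"{cores:>5} cores -> {amdahl_speedup(cores, serial_fraction):5.1f}x speedup")
        # The output flattens toward 1 / 0.05 = 20x, no matter how many cores are added.

This is only one model; communication overhead, I/O, and memory bandwidth can cause performance to flatten or even degrade sooner than the formula suggests.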

The scenarios referred to above are often dictated by how the specific software being utilized achieves parallelization, i.e. how it decomposes and distributes a workload in a parallel computing environment. Thus, again, it is important that users understand how the software they wish to utilize in the HPC environment performs this parallelization. In this way, users can aid the HPC staff in optimizing the scalability of the software for the specific HPC infrastructure provided, which in turn benefits the rest of the HPC user base.