This article was authored by AzureCAT Marc van Eijk. It was edited by Bruce Hamilton and reviewed by Damir Bersinic, Gavin Kemp, Daniel Neumann, and Emmanuel Sache.
Table of Contents:
- Scalability - This article
- Summary & Learn more
Scalability is the ability of a system to handle increased load. The load on an application can vary over time as outside factors change the size of its audience and as the size and scope of the application itself change.
For the core discussion of this pillar, see Scalability in Pillars of software quality.
A horizontal scaling approach for hybrid applications allows for adding more instances to meet demand and then disabling them during quieter periods.
In hybrid scenarios, scaling out individual components requires additional consideration when components are spread across clouds. Scaling one part of the application can require the scaling of another. For example, if the number of client connections increases but the application’s web services are not scaled out appropriately, the load on the database might saturate the application.
Some application components can scale out linearly, while others have scaling dependencies and might be limited in the extent to which they can scale. For example, a VPN tunnel that provides hybrid connectivity between the locations of the application components has limits on the bandwidth and latency it can deliver. How are components of the application scaled to ensure these requirements are met?
Ascertain scaling thresholds. To handle the various dependencies in your application, determine the extent to which application components in different clouds can scale independently of each other while still meeting the requirements to run the application. Hybrid applications often need to scale particular areas of the application to handle a feature as it interacts with and affects the rest of the application. For example, exceeding a certain number of front-end instances may require scaling the back-end.
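The front-end and back-end example can be written down as an explicit dependency rule. The following is a minimal sketch, not a prescribed implementation: the `ScalingDependency` type, the ratio of four front-end instances per back-end instance, and the ceiling of five back-end instances are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class ScalingDependency:
    """Hypothetical rule linking an upstream component to a dependent one."""
    upstream: str        # component whose growth drives scaling
    dependent: str       # component that must scale along with it
    ratio: int           # upstream instances one dependent instance can serve
    max_dependent: int   # ceiling on dependent instances (assumed limit)


def required_dependent_instances(dep, upstream_count):
    """Return (instances to run, whether the demand fits under the ceiling)."""
    needed = ceil(upstream_count / dep.ratio)
    within_limit = needed <= dep.max_dependent
    return min(needed, dep.max_dependent), within_limit


# Assumed example: each back-end instance serves four front-end instances.
rule = ScalingDependency("front-end", "back-end", ratio=4, max_dependent=5)
```

Calling `required_dependent_instances(rule, 10)` reports that three back-end instances are needed and that this fits under the ceiling. A result with `False` in the second position marks the point at which the components can no longer scale independently, which is the threshold this consideration asks you to find.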
Define scale schedules. Most applications have busy periods, so you need to aggregate their peak times into schedules to coordinate optimal scaling.
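One way to aggregate peak times is to merge the overlapping busy windows of individual components into a single schedule. The sketch below assumes that each window is a hypothetical `(start_hour, end_hour)` pair on a 24-hour clock; real schedules would also need time zones and days of the week.

```python
def merge_peak_windows(windows):
    """Merge overlapping or touching (start_hour, end_hour) windows."""
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # This window overlaps the previous one, so extend it.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged


# Assumed peak windows reported by three components of one application.
component_peaks = [(8, 11), (10, 13), (18, 21)]
schedule = merge_peak_windows(component_peaks)
```

With these assumed inputs the two morning windows collapse into one, so a coordinated scale-out can cover both components instead of scaling each one separately.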
Use a centralized monitoring system. Platform monitoring capabilities can provide autoscaling, but hybrid applications need a centralized monitoring system that aggregates system health and load. A centralized monitoring system can initiate the scaling of a resource in one location and of a dependent resource in another location. Additionally, a central monitoring system can track which clouds autoscale resources and which clouds don’t.
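The aggregation and cross-location behavior described here can be sketched in a few lines. This is an illustrative model only: the sample format, the location and component names, and the `dependencies` mapping are assumptions, and a real system would read metrics from each platform's monitoring service rather than from an in-memory list.

```python
from statistics import mean


def aggregate_load(samples):
    """Average CPU percentage per (location, component) across all samples."""
    grouped = {}
    for location, component, cpu_percent in samples:
        grouped.setdefault((location, component), []).append(cpu_percent)
    return {key: mean(values) for key, values in grouped.items()}


def plan_actions(load, threshold, dependencies):
    """List scale-out actions for overloaded resources and their dependents."""
    actions = []

    def add(target):
        action = ("scale_out", target)
        if action not in actions:  # avoid scheduling the same resource twice
            actions.append(action)

    for key, cpu_percent in sorted(load.items()):
        if cpu_percent > threshold:
            add(key)
            for dependent in dependencies.get(key, []):
                add(dependent)
    return actions


# Assumed samples: a web tier in one cloud, a database tier on-premises.
samples = [("azure", "web", 90), ("azure", "web", 80), ("onprem", "db", 40)]
dependencies = {("azure", "web"): [("onprem", "db")]}
actions = plan_actions(aggregate_load(samples), 75, dependencies)
```

In this example the database is scaled even though its own load is below the threshold, because the overloaded web tier in the other location depends on it.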
Leverage autoscaling capabilities (as available). If autoscaling capabilities are part of your architecture, you implement autoscaling by setting thresholds that define when an application component needs to be scaled up, out, down, or in. For example, a component that handles client connections might be autoscaled in one cloud to handle increased load, which in turn requires other dependencies of the application, spread across different clouds, to be scaled as well. The autoscaling capabilities of these dependent components must be ascertained.
If autoscaling is not available, consider implementing scripts and other resources to accommodate manual scaling, triggered by thresholds in the centralized monitoring system.
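Whether scaling is automatic or scripted, the threshold logic has the same shape. The following is a minimal sketch of a scale-out and scale-in decision with separate upper and lower thresholds; the function name, the parameters, and the one-instance step size are assumptions, and scaling up or down (resizing an instance) is not modeled.

```python
def scaling_decision(metric, scale_out_above, scale_in_below,
                     current, minimum, maximum):
    """Return the instance count to run after evaluating one metric reading."""
    if metric > scale_out_above and current < maximum:
        return current + 1  # add an instance, bounded by the maximum
    if metric < scale_in_below and current > minimum:
        return current - 1  # remove an instance, bounded by the minimum
    return current          # between thresholds or at a bound: no change
```

Keeping a gap between the two thresholds, for example 75 percent to scale out and 30 percent to scale in, reduces the chance that a component repeatedly scales out and back in around a single value. A manual script triggered by the centralized monitoring system could call the same function and then apply the returned count.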
Determine expected load by location. Hybrid applications that handle client requests might primarily rely on a single location. When the load of client requests exceeds a threshold, additional resources can be added in a different location to distribute the load of inbound requests. Make sure that the client connections can handle the increased load, and determine any automated procedures the client connections need in order to do so.
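The overflow pattern can be worked through with simple arithmetic. The sketch below assumes two locations with hypothetical capacities expressed in requests per second; the names `primary` and `secondary` and all of the numbers are illustrative.

```python
def distribute_requests(total_rps, primary_capacity_rps, secondary_capacity_rps):
    """Fill the primary location first, then send overflow to the secondary."""
    primary = min(total_rps, primary_capacity_rps)
    overflow = total_rps - primary
    secondary = min(overflow, secondary_capacity_rps)
    return {
        "primary": primary,
        "secondary": secondary,
        "unserved": overflow - secondary,  # load that neither location absorbs
    }


# Assumed figures: 1,200 requests/s against capacities of 1,000 and 500.
plan = distribute_requests(1200, 1000, 500)
```

A nonzero `unserved` value indicates that the expected load exceeds the combined capacity of both locations, which is a signal to revisit the scaling thresholds before that load arrives.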
Next Article: Availability