Auto-Scaling Algorithms in Serverless Cloud Environments

Authors

  • Revathi Arul, Independent Researcher, Velachery, Chennai, India (IN) – 600042

Keywords:

Auto-scaling; Serverless computing; Function-as-a-Service; Predictive scaling; Cloud performance

Abstract

Serverless cloud computing, epitomized by Function-as-a-Service (FaaS) platforms, lets developers focus on code logic while cloud providers abstract away infrastructure concerns. Because billing is fine-grained, based on actual execution duration and per-request consumption, serverless reduces the need for upfront capacity planning and minimizes idle infrastructure costs. However, inherent workload variability and abrupt request bursts introduce performance and cost challenges. Traditional reactive auto-scaling approaches, which provision additional function instances only after utilization thresholds are breached, often incur cold-start delays and transient latency spikes. Conversely, fully predictive algorithms, which rely on historical time-series forecasting, can misestimate sudden demand changes, leading to under-provisioning that degrades user experience or over-provisioning that raises cost. In this manuscript, we propose a novel Hybrid Predictive-Reactive (HPR) auto-scaling algorithm specifically tailored for serverless environments.

The algorithm integrates lightweight single exponential smoothing for near-term workload forecasting with reactive threshold triggers: it scales out proactively when forecasts anticipate imminent capacity exhaustion and adjusts reactively when actual utilization deviates beyond safe bounds. Controlled experiments are conducted in an enhanced CloudSim-based simulation framework, employing synthetic sinusoidal patterns, randomized Poisson bursts, and an industry-standard real-world FaaS workload trace. Average response time, scaling latency, CPU utilization, throughput, and cost per thousand requests are systematically evaluated. Compared against baseline reactive-only and predictive-only schemes, HPR reduces mean response time by over 20%, lowers average scaling latency by 25%, increases utilization by 8%, and cuts cost by 8% on average. These results underscore the effectiveness of combining predictive foresight with reactive safety nets in achieving both stringent Service Level Objectives (SLOs) and cost efficiency in serverless auto-scaling. Implications for practical deployment and avenues for integrating advanced machine-learning forecasts and dynamic threshold tuning are also discussed.
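To make the hybrid idea concrete, the following is a minimal Python sketch of one possible predictive-reactive control loop. It is not the paper's implementation: the abstract does not specify parameter values or the exact decision rule, so the class name `HybridScaler`, the smoothing factor, the per-instance capacity, the thresholds, and the one-instance-at-a-time scale-in policy are all illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class HybridScaler:
    """Illustrative hybrid predictive-reactive scaler; every default is an assumption."""
    alpha: float = 0.5                    # smoothing factor for single exponential smoothing
    capacity_per_instance: float = 100.0  # requests/s one instance can serve (assumed)
    target_utilization: float = 0.7       # utilization level used when sizing the fleet
    upper_threshold: float = 0.85         # reactive scale-out bound
    lower_threshold: float = 0.30         # reactive scale-in bound
    min_instances: int = 1
    max_instances: int = 100
    instances: int = 1
    forecast: Optional[float] = None

    def _needed(self, rate: float) -> int:
        # Instances required to serve `rate` at the target utilization.
        return math.ceil(rate / (self.capacity_per_instance * self.target_utilization))

    def step(self, observed_rate: float) -> int:
        """Consume one observed request rate and return the new instance count."""
        # Single exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}
        if self.forecast is None:
            self.forecast = observed_rate
        else:
            self.forecast = self.alpha * observed_rate + (1 - self.alpha) * self.forecast

        utilization = observed_rate / (self.instances * self.capacity_per_instance)
        predicted_need = self._needed(self.forecast)

        # Predictive path: scale out ahead of demand if the forecast exceeds capacity.
        desired = max(self.instances, predicted_need)

        if utilization > self.upper_threshold:
            # Reactive safety net: the forecast lagged, so size for the observed load.
            desired = max(desired, self._needed(observed_rate))
        elif utilization < self.lower_threshold and predicted_need < self.instances:
            # Scale in gradually (one instance per step) to avoid oscillation.
            desired = max(predicted_need, self.instances - 1)

        self.instances = min(max(desired, self.min_instances), self.max_instances)
        return self.instances


if __name__ == "__main__":
    scaler = HybridScaler()
    for rate in (50, 400, 400, 10):
        print(rate, scaler.step(rate))
```

In this sketch the burst from 50 to 400 requests/s is handled by the reactive branch, because the smoothed forecast lags the sudden jump, while the later drop in load releases capacity only one instance per step. A real deployment would also need to account for cold-start time when deciding how far ahead to forecast, which this sketch does not model.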

Published

2025-09-04

How to Cite

Arul, Revathi. “Auto-Scaling Algorithms in Serverless Cloud Environments”. International Journal of Advanced Research in Computer Science and Engineering (IJARCSE) 1, no. 3 (September 4, 2025): Sep(35–42). Accessed January 22, 2026. https://ijarcse.org/index.php/ijarcse/article/view/77.
