A common problem in high-traffic systems is scaling. When applications take a long time to initialize, scaling in the middle of a traffic spike can noticeably increase latency and even cause downtime. Predictive scaling with machine learning involves gathering and using data to forecast incoming spikes and allocate resources efficiently within a microservices-based infrastructure. By using machine learning, systems can predict upcoming traffic and scale resources ahead of demand, so that new instances are already initialized when the spike arrives. This approach enables businesses to stay ahead of their resource needs and operate their cloud-based services reliably and cost-effectively.
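To make the idea concrete, the following is a minimal sketch of the forecast-then-scale loop, not a proposed solution. It assumes traffic is sampled as requests per second at fixed intervals, that one replica can serve a known number of requests per second, and that a replica needs a known number of intervals to start up. A least-squares trend line stands in for the machine learning model, and all function names and parameters (`forecast_linear`, `replicas_needed`, `rps_per_replica`) are hypothetical.

```python
import math


def forecast_linear(history, steps_ahead):
    """Extrapolate a least-squares line fitted to `history`.

    A deliberately simple stand-in for a real ML forecasting model.
    `steps_ahead` is the number of sampling intervals to look ahead,
    e.g. the time a new replica needs to initialize (assumed known).
    """
    n = len(history)
    if n == 0:
        return 0.0
    if n == 1:
        return float(history[0])
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


def replicas_needed(predicted_rps, rps_per_replica, min_replicas=1):
    """Translate a predicted request rate into a replica count.

    `rps_per_replica` is an assumed, measured capacity of one replica.
    """
    demand = max(predicted_rps, 0.0)
    return max(min_replicas, math.ceil(demand / rps_per_replica))


# Illustrative numbers only: traffic rising by 20 requests/s per interval.
history = [100, 120, 140, 160, 180]
predicted = forecast_linear(history, steps_ahead=3)
reactive_target = replicas_needed(history[-1], rps_per_replica=50)
predictive_target = replicas_needed(predicted, rps_per_replica=50)
```

With these illustrative numbers, a reactive policy sizes the system for the current load, while the predictive policy sizes it for the load expected once newly started replicas are ready, and therefore requests capacity earlier.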
In this thesis, design and implement predictive scaling in a cloud microservice architecture using machine learning. The goal is to build a model that can accurately forecast traffic spikes and scale the system accordingly, based on gathered data such as traffic patterns, CPU utilization, memory usage and other metrics. Evaluate your solution by implementing and/or analyzing other scaling methods, such as reactive and proactive scaling, and comparing it against them.
Bachelor/Master of Science in Computer Science/Engineering
In this thesis, investigate these questions: