
Introduction
A common problem companies face is how to scale reporting and other time-intensive tasks that can take 10+ minutes to complete.
If you don't plan ahead, it is easy for your application to become overwhelmed and for your infrastructure to fall apart.
In this short article, we will explore the queue-based load leveling pattern and look at how it can help alleviate this issue.
Queue-Based Load Leveling
How The Pattern Works
Queue-based load leveling is a technique that decouples large tasks from the main application by storing them in a queue to be processed later.
- The term "queue-based" refers to the use of queues to store the tasks that can be processed later.
- The term "load leveling" refers to converting an irregular or large number of concurrent requests into a steady stream.
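As a rough illustration of both terms, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. The function name `level_load` and its parameters are made up for this example, and a real system would use an external queue rather than an in-memory one. The point is only that a burst of tasks arrives at once, while a fixed number of workers drains it at a steady pace.

```python
import queue
import threading
import time


def level_load(tasks, process, workers=1):
    """Put a burst of tasks on a queue and drain it with a fixed number of workers.

    `process` is any callable that handles one task (hypothetical, supplied by the caller).
    """
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            task = q.get()
            if task is None:  # sentinel: no more work for this worker
                q.task_done()
                return
            out = process(task)
            with lock:
                results.append(out)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    # The whole burst is accepted immediately; nothing is processed inline.
    for task in tasks:
        q.put(task)
    # One sentinel per worker so every thread shuts down cleanly.
    for _ in threads:
        q.put(None)

    q.join()
    for t in threads:
        t.join()
    return results
```

However many tasks arrive, at most `workers` of them are being processed at any moment, which is the "leveling" part of the pattern.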
Development Challenges
Beware! You may need to refactor your application to support this. Let's take reporting as an example.
Before, the user would simply click a button, wait 10 minutes, and the report would be returned in a single request.
Under the queue-based load leveling pattern, we would split this action into two parts:
- Process Now: Return a quick success message saying the report is being generated
- Process Later: Generate the report when resources are available
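The two-part split above could be sketched roughly as follows. Everything here is hypothetical and for illustration only: `request_report`, `work_once`, `generate_report`, the in-memory `report_queue`, and the `reports` dictionary standing in for wherever finished reports would be stored.

```python
import queue
import uuid

report_queue = queue.Queue()  # stand-in for a real message queue
reports = {}                  # stand-in for a results store (database, object storage, ...)


def request_report(params):
    """Process Now: record the job, enqueue it, and return straight away."""
    job_id = str(uuid.uuid4())
    reports[job_id] = {"status": "pending", "result": None}
    report_queue.put((job_id, params))
    return {"status": "accepted", "job_id": job_id}


def generate_report(params):
    """Hypothetical slow report builder; the real work would go here."""
    return f"report for {params['customer']}"


def work_once():
    """Process Later: take one job off the queue and generate its report.

    Returns False when the queue is empty, True when a job was handled.
    """
    try:
        job_id, params = report_queue.get_nowait()
    except queue.Empty:
        return False
    reports[job_id] = {"status": "done", "result": generate_report(params)}
    return True
```

The caller gets a `job_id` back immediately and can check `reports[job_id]["status"]` later, which is the part of the application that usually needs refactoring.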
SQS
A tool we can use to achieve this in AWS is Amazon Simple Queue Service (Amazon SQS).
SQS is a fully managed queue system. There are lots of alternatives, but if you're already in the AWS ecosystem then it can make sense to leverage their managed solution.
SQS provides message queues where you can send, store, and receive messages.
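A minimal sketch of sending and receiving might look like the following. It assumes a client object with the same `send_message`, `receive_message`, and `delete_message` calls as the boto3 SQS client (for example one created with `boto3.client("sqs")`). The helper names, the JSON message layout, and the queue URL are assumptions made for this example.

```python
import json


def send_report_request(sqs, queue_url, job_id, params):
    """Send one report job to the queue as a JSON message."""
    body = json.dumps({"job_id": job_id, "params": params})
    return sqs.send_message(QueueUrl=queue_url, MessageBody=body)


def handle_batch(sqs, queue_url, handler):
    """Receive up to 10 messages, run `handler` on each, and delete it on success."""
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=20,      # long polling: wait up to 20s for messages
    )
    handled = 0
    # The "Messages" key is absent when nothing was received.
    for msg in resp.get("Messages", []):
        handler(json.loads(msg["Body"]))
        # Deleting marks the message as processed. If the handler raises,
        # the message is not deleted and becomes visible again later.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
        handled += 1
    return handled
```

Passing the client in as an argument keeps the helpers easy to exercise with a fake client in tests, without touching AWS.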
There are two types:
- Standard Queues: Best-effort ordering and at-least-once delivery (a message may occasionally be delivered more than once), with very high throughput.
- FIFO (First-In-First-Out) Queues: Strict ordering and exactly-once processing, but lower throughput.
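The difference shows up in code in two places, sketched below under some assumptions. FIFO queue names end in `.fifo` and their messages need a `MessageGroupId` (plus a `MessageDeduplicationId` unless content-based deduplication is enabled on the queue). With a standard queue, the consumer should tolerate seeing the same message twice. The helper `build_send_params`, the class `IdempotentHandler`, and its in-memory `seen` set are hypothetical names for this example.

```python
def build_send_params(queue_url, body, group_id=None, dedup_id=None):
    """Build keyword arguments for an SQS send_message call."""
    params = {"QueueUrl": queue_url, "MessageBody": body}
    if queue_url.endswith(".fifo"):
        if group_id is None:
            raise ValueError("FIFO queues require a MessageGroupId")
        params["MessageGroupId"] = group_id
        # Optional only if content-based deduplication is enabled on the queue.
        if dedup_id is not None:
            params["MessageDeduplicationId"] = dedup_id
    return params


class IdempotentHandler:
    """Skip jobs that were already handled, since a standard queue may redeliver."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # in-memory only; a real system would persist this

    def __call__(self, job_id, payload):
        if job_id in self.seen:
            return False  # duplicate delivery, nothing to do
        self.handler(payload)
        self.seen.add(job_id)
        return True
```

For report generation, where order rarely matters, a standard queue with an idempotent consumer is often enough, though that is a judgment call for each application.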
That's it! They're really that simple.
Interactive Demo
Let's look at the infrastructure load when we decouple the reporting from the main application.
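For readers without access to the interactive version, a tiny simulation gives a similar picture. This is a simplified model with made-up numbers: `simulate` and its parameters are hypothetical, time is counted in abstract ticks, and each worker is assumed to finish a fixed number of jobs per tick.

```python
def simulate(arrivals, workers, jobs_per_worker_per_tick=1):
    """Return the queue depth after each tick for a given arrival pattern.

    `arrivals` is a list with the number of new jobs arriving at each tick.
    """
    depth = 0
    history = []
    capacity = workers * jobs_per_worker_per_tick
    for arriving in arrivals:
        depth += arriving               # the burst lands on the queue
        depth -= min(depth, capacity)   # workers drain at a fixed rate
        history.append(depth)
    return history


burst = [10, 0, 0, 0, 0, 6, 0, 0]
print(simulate(burst, workers=2))  # [8, 6, 4, 2, 0, 4, 2, 0]
print(simulate(burst, workers=4))  # [6, 2, 0, 0, 0, 2, 0, 0]
```

Adding workers shortens the backlog, but the load on each worker never exceeds its fixed capacity, no matter how large the burst is.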

Summary
Queue-based load leveling can be a great pattern to have in your pocket if you work with irregular traffic patterns that trigger time-consuming processes.
Great When
- Your application is subject to overloading
- You can process part of the request later
Bad When
- You need an instant response from the application