
Understanding Eventloops (Tokio Internals)

Abhishek Tripathi · 3 min read

Prelude

This is the first post in a four-part series on the mechanics behind the Tokio runtime in Rust. It focuses on the challenges in a multi-threaded event loop / server that force us to reach for async runtimes like Tokio.

Index of the four part series:

  1. Visualizing Tokio Internals: Part I - Multi-Threaded Event Loop / Server
  2. Visualizing Tokio Internals: Part II - Reactor
  3. Visualizing Tokio Internals: Part III - Wakers
  4. Visualizing Tokio Internals: Part IV - Executors

Multi-Threaded Event Loop / Server

What challenges in a multi-threaded event loop force us to reach for async runtimes like Tokio?

Phase 0: The Problem

Learning Objective
After reading this you will be able to answer:

Why do we need async runtimes like Tokio?

  • Resource Efficiency: Traditional thread-per-connection models waste system resources
  • Scalability: Async enables handling thousands of connections with minimal overhead
  • Performance: Event-driven architecture reduces context switching and memory usage
  • Cost-Effective: Better resource utilization means lower infrastructure costs

Modern applications, especially network services, need to handle many things concurrently. Imagine a web server handling thousands of client connections simultaneously.

A naive approach is to dedicate one Operating System (OS) thread to each connection. Let's see why this doesn't scale well.
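
In Rust, the naive model looks roughly like the sketch below: a blocking echo server built on std::net and std::thread. The port and the echo loop are placeholders standing in for a real protocol handler.

```rust
use std::io::{Read, Write};
use std::net::TcpListener;
use std::thread;

fn main() -> std::io::Result<()> {
    // Thread-per-connection: every accepted socket gets its own OS thread.
    let listener = TcpListener::bind("127.0.0.1:8080")?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        thread::spawn(move || {
            let mut buf = [0u8; 1024];
            // Blocking read: the OS parks this thread until bytes arrive,
            // but its stack and kernel bookkeeping stay allocated the whole time.
            while let Ok(n) = stream.read(&mut buf) {
                if n == 0 {
                    break; // client closed the connection
                }
                // Echo the bytes back (stand-in for real request handling).
                if stream.write_all(&buf[..n]).is_err() {
                    break;
                }
            }
        });
    }
    Ok(())
}
```

Every connection that is merely waiting for bytes in this sketch still costs a whole OS thread. The dashboard below shows what that adds up to under load.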

The Thread-Per-Connection Resource Drain

The visualization below shows resource consumption (CPU/Memory) and throughput limits of a blocking thread-per-connection model.

Description:

Imagine a dashboard resembling htop or Task Manager:

  1. CPU Usage: Bars representing individual CPU cores.
  2. Memory Usage: A single bar showing total RAM consumption.
  3. Active Threads: A counter or list showing running OS threads.
  4. Requests/Second: A throughput meter.
  5. Incoming Requests Queue: A visual queue of pending connections.

Simulation:

  • Start: The server starts. CPU/Memory usage is low. Throughput is 0. Few base threads exist.
  • Low Load: Simulate a few incoming connections (~10). For each, a new OS thread is created.
    • Visual: Active Threads count increases slightly. Memory usage ticks up slightly. CPU usage might blip as threads start but stays relatively low if connections are mostly idle. Throughput matches the request rate.
  • High Load: Simulate hundreds or thousands of incoming connections. Many connections involve waiting for network I/O (reading request body, waiting for database, sending response).
    • Visual:
      • Active Threads: The count explodes. Each thread requires kernel resources and its own stack (~MBs).
      • Memory Usage: The Memory bar shoots up dramatically, potentially hitting system limits.
      • CPU Usage: CPU bars likely thrash. Even if threads are mostly waiting (blocked on I/O), the OS spends significant time context switching between them. This is overhead, not useful work.
      • Requests Queue: The incoming request queue grows rapidly; threads are created, but many quickly block on I/O, and the server struggles to accept new connections.
      • Requests/Second: The throughput meter hits a plateau far below the incoming request rate, possibly even decreasing as context-switching overhead dominates.
[Interactive dashboard: an htop-style view of the thread-per-connection server at low load, showing 10 active connections, 12 threads, 0 requests/sec, an empty request queue, per-core CPU usage in the low single digits across 8 cores, memory usage around 8%, and a "Top Threads" table with one blocked http-conn OS thread per connection.]

[Interactive chart: "Thread-Per-Connection Resource Monitor" with connection-count and simulation-speed controls, plotting CPU usage, memory usage, thread count, throughput, and request queue length as resource utilization (%), alongside a live status message (at low load: "System is handling connections efficiently. Resources are well-utilized with minimal overhead.").]

Figure 1: Interactive visualization of thread-per-connection scaling issues. As connection count increases, resources are consumed by thread overhead, while throughput plateaus and then declines due to context switching costs.
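
You can reproduce the thread-count and memory curves from Figure 1 on your own machine with a minimal sketch like the one below. The 5,000 threads and the 60-second sleep are arbitrary stand-ins for connections blocked on network I/O; watch the process in htop while it runs.

```rust
use std::thread;
use std::time::Duration;

fn main() {
    // Simulate 5,000 connections that are "waiting on I/O" by parking
    // 5,000 OS threads in a sleep. No useful work is done, yet each thread
    // still reserves a multi-megabyte stack (mostly virtual memory until
    // touched) plus kernel scheduling state.
    // Note: thread::spawn panics if the OS refuses to create another thread,
    // which is itself part of the lesson about hard thread limits.
    let handles: Vec<_> = (0..5_000)
        .map(|_| {
            thread::spawn(|| {
                thread::sleep(Duration::from_secs(60));
            })
        })
        .collect();

    // Keep the process alive long enough to observe it in htop.
    for handle in handles {
        let _ = handle.join();
    }
}
```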
Insight

We need a way to handle many waiting tasks concurrently without dedicating an OS thread to each one while it waits. This leads us to asynchronous programming.
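
As a preview of where this series is heading, here is a minimal sketch of the same echo server on top of an async runtime, assuming Tokio with its default multi-threaded scheduler. Each connection becomes a lightweight task whose waiting state lives in a small heap allocation rather than a dedicated OS thread.

```rust
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8080").await?;
    loop {
        let (mut stream, _addr) = listener.accept().await?;
        // A task, not a thread: while this connection waits for bytes,
        // it occupies no OS thread at all.
        tokio::spawn(async move {
            let mut buf = [0u8; 1024];
            loop {
                match stream.read(&mut buf).await {
                    Ok(0) | Err(_) => break, // connection closed or errored
                    Ok(n) => {
                        if stream.write_all(&buf[..n]).await.is_err() {
                            break;
                        }
                    }
                }
            }
        });
    }
}
```

How the runtime knows which task to wake when bytes arrive, and on which thread to run it, is exactly what the reactor, waker, and executor posts in this series unpack.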