Thread model#

If you were given an open source middleware and asked to analyze its implementation, what would you start with? The answer might be:

Source code modules
Abstract concepts and design patterns
Threads

For modern open source middleware, I think the thread/process model is the most important. This is because modern middleware generally relies on multiple processes or multiple threads to fully utilize hardware resources. No matter how well the abstractions are encapsulated or how elegantly the design patterns are applied, the program ultimately has to run on the CPU as threads. How the threads are divided by function, and how they communicate and synchronize with each other, is both the difficult part and the key point.

Simply put, Envoy's threading design pattern is non-blocking + event-driven + multiple worker threads. Similar design patterns have gone by many names in the history of software design.

This section assumes that the reader is already familiar with Envoy’s event-driven model. If not, see the Event-driven section of this book. This section draws on: Envoy threading model - Matt Klein

Unlike Node.js, which runs a single thread, Envoy supports multiple worker threads, each running its own independent event loop, in order to take full advantage of multi-core CPUs. This design comes at a cost, because the worker threads and the main thread are not completely independent of each other. They need to share some data, such as:

Upstream Cluster’s endpoints, health status…
Various monitoring statistical metrics

Threading overview#

Figure: Threading overview

Source: Envoy threading model - Matt Klein

Envoy uses several different types of threads, as shown above. The two most important ones are described below:

Main: This thread owns server startup and shutdown, all xDS API handling (including DNS, health checking, and general cluster management), runtime, stat flushing, admin, and general process management (signals, hot restart, etc.). Everything that happens on this thread is asynchronous and “non-blocking.” In general the main thread coordinates all critical process functionality that does not require a large amount of CPU to accomplish. This allows the majority of management code to be written as if it were single threaded.

Worker: By default, Envoy spawns a worker thread for every hardware thread in the system. (This is controllable via the --concurrency option). Each worker thread runs a “non-blocking” event loop that is responsible for listening on every listener (there is currently no listener sharding), accepting new connections, instantiating a filter stack for the connection, and processing all IO for the lifetime of the connection. Again, this allows the majority of connection handling code to be written as if it were single threaded.
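The worker model described above can be sketched in a few lines. This is a minimal Python illustration of the pattern, not Envoy's C++ code: the `Worker` class, the queue-based loop, and the round-robin assignment of connections are all assumptions made for the example (in Envoy the workers accept connections from the listeners themselves). It shows one worker per hardware thread by default, an override similar in spirit to `--concurrency`, and every event of a connection being handled by the same worker.

```python
import os
import queue
import threading


class Worker:
    """One worker thread running its own event loop (a task queue here)."""

    def __init__(self, name):
        self.name = name
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)
        self.handled = []  # appended to only by this worker's own thread

    def start(self):
        self._thread.start()

    def post(self, task):
        """Schedule a callable to run on this worker's loop."""
        self._tasks.put(task)

    def stop(self):
        self._tasks.put(None)  # sentinel: leave the loop
        self._thread.join()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:
                break
            task(self)


def run_demo(concurrency=None, connections=8):
    # Default: one worker per hardware thread, unless the caller overrides it.
    n = concurrency or os.cpu_count() or 1
    workers = [Worker(f"worker_{i}") for i in range(n)]
    for w in workers:
        w.start()
    for conn_id in range(connections):
        owner = workers[conn_id % n]  # hypothetical assignment policy
        # Every event of this connection is handled by the same worker.
        for event in ("accept", "read", "write", "close"):
            owner.post(
                lambda w, c=conn_id, e=event: w.handled.append(
                    (c, e, threading.current_thread().name)
                )
            )
    for w in workers:
        w.stop()  # join, so reading w.handled afterwards is safe
    return workers
```

Because `handled` is only ever touched by the worker that owns it, the per-connection code needs no locks, which is the sense in which it can be written as if it were single threaded.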

Thread Local#

Because of the way Envoy separates main thread responsibilities from worker thread responsibilities, there is a requirement that complex processing can be done on the main thread and then made available to each worker thread in a highly concurrent way. This section describes Envoy’s Thread Local Storage (TLS) system at a high level. In the next section I will describe how it is used for handling cluster management.

Source: Envoy threading model - Matt Klein

Figure: Thread Local Storage (TLS) system

Source: Envoy threading model - Matt Klein

Figure: Cluster manager threading

If every read and write of the shared data had to take a lock, concurrency would suffer. Because the data does not need to be strictly consistent across threads in real time, the Envoy authors borrowed the idea of the Linux kernel’s [read-copy-update (RCU)](https://en.wikipedia.org/wiki/Read-copy) and implemented their own Thread Local data synchronization mechanism. The underlying implementation builds on C++11’s thread_local storage and on libevent’s event_active(&raw_event_, EV_TIMEOUT, 0) call, which makes an event active so that its callback is run by the event loop that owns it.
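The read-copy-update idea can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Envoy's implementation: `threading.local` stands in for C++ `thread_local`, posting a callable to a queue stands in for activating an event on a worker's loop, and `Worker`, `publish`, and `read_health` are hypothetical names. The writer never modifies data in place; it copies, changes the copy, and asks each worker to swap its own reference.

```python
import queue
import threading
from types import MappingProxyType

_tls = threading.local()  # per-thread storage, playing the role of thread_local


class Worker:
    def __init__(self, name):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)
        self._thread.start()

    def post(self, fn):
        """Schedule fn on this worker's loop; returns immediately."""
        self._tasks.put(fn)

    def call(self, fn):
        """Run fn on the worker and wait for its result (demo helper)."""
        done = queue.Queue(maxsize=1)
        self.post(lambda: done.put(fn()))
        return done.get()

    def stop(self):
        self._tasks.put(None)
        self._thread.join()

    def _run(self):
        _tls.snapshot = MappingProxyType({})  # this thread's initial view
        while True:
            fn = self._tasks.get()
            if fn is None:
                break
            fn()


def publish(workers, current, **changes):
    """Copy the current data, modify the copy, hand it to every worker."""
    new_snapshot = MappingProxyType({**current, **changes})  # read-only view
    for w in workers:
        # Each worker replaces its own reference when its loop gets to it.
        w.post(lambda s=new_snapshot: setattr(_tls, "snapshot", s))
    return new_snapshot


def read_health(host):
    """Runs on a worker: reads that thread's own reference, no lock taken."""
    return _tls.snapshot.get(host)
```

A worker may keep seeing the old snapshot for a short time after `publish` returns, which is acceptable only because, as noted above, the consistency requirements are relaxed.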

The following figure is based on Envoy threading model - Matt Klein. It uses Cluster Manager as an example to illustrate, at the source-code level, how Envoy uses the Thread Local mechanism to share data between threads.

Figure: ThreadLocal Classes


The above figure can be briefly described as follows:

The main thread initializes ThreadLocal::InstanceImpl and registers each Dispatcher with it.
The main thread notifies all worker threads to create their local ThreadLocalClusterManagerImpl.
When the main thread detects that a Cluster has been deleted, it notifies the ThreadLocalClusterManagerImpl of each worker thread to delete the Cluster.
When TCPProxy on a worker thread tries to connect to an OnDemand Cluster (unknown cluster), it gets the thread-local ThreadLocalClusterManagerImpl.
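The four steps above can be sketched as a toy model in Python. The class names below (`Dispatcher`, `ThreadLocalRegistry`, `LocalClusterManager`) and their methods are hypothetical, chosen only to echo the roles of the Envoy classes in the figure; they are not Envoy's API, and the real C++ code is considerably more involved.

```python
import queue
import threading


class Dispatcher:
    """A thread plus its event loop; stands in for one worker's dispatcher."""

    def __init__(self, name):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)
        self._thread.start()

    def post(self, fn):
        self._tasks.put(fn)

    def call(self, fn):
        """Run fn on this thread and wait for the result (demo helper)."""
        done = queue.Queue(maxsize=1)
        self.post(lambda: done.put(fn()))
        return done.get()

    def stop(self):
        self._tasks.put(None)
        self._thread.join()

    def _run(self):
        while True:
            fn = self._tasks.get()
            if fn is None:
                break
            fn()


class LocalClusterManager:
    """Per-worker view of the clusters; only its owning thread touches it."""

    def __init__(self, clusters):
        self.clusters = set(clusters)


class ThreadLocalRegistry:
    """Toy counterpart of the role played by ThreadLocal::InstanceImpl."""

    def __init__(self):
        self._dispatchers = []
        self._data = threading.local()  # one independent namespace per thread

    def register_thread(self, dispatcher):
        self._dispatchers.append(dispatcher)

    def run_on_all_threads(self, fn):
        """Post fn to every registered thread; fn receives that thread's data."""
        for d in self._dispatchers:
            d.post(lambda: fn(self._data))

    def has_cluster(self, name):
        """Called on a worker: consults that worker's own cluster manager."""
        return name in self._data.cluster_manager.clusters


def demo():
    workers = [Dispatcher("worker_0"), Dispatcher("worker_1")]
    registry = ThreadLocalRegistry()
    for w in workers:  # step 1: register each dispatcher
        registry.register_thread(w)
    # Step 2: every worker creates its own local cluster manager.
    registry.run_on_all_threads(
        lambda tls: setattr(tls, "cluster_manager", LocalClusterManager(["a", "b"]))
    )
    before = [w.call(lambda: registry.has_cluster("b")) for w in workers]
    # Step 3: the main thread asks every worker to drop cluster "b".
    registry.run_on_all_threads(lambda tls: tls.cluster_manager.clusters.discard("b"))
    # Step 4: code running on a worker looks the cluster up in its local copy.
    after = [w.call(lambda: registry.has_cluster("b")) for w in workers]
    for w in workers:
        w.stop()
    return before, after
```

Each worker's queue is processed in order, so a lookup posted after the removal always sees the updated local copy, while the main thread never touches a worker's data directly.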

Ref#

Envoy threading model - Matt Klein
