Node.js is traditionally known as a single-threaded, asynchronous, event-driven JavaScript runtime.
It’s designed for building scalable network applications using non-blocking I/O and an event-driven asynchronous paradigm rather than relying on multithreading.
Before diving into the details, let’s differentiate between the terms synchrony, asynchrony, and multithreading:
- Synchrony: Synchrony involves processing tasks sequentially, where each task must be completed before the next begins. A single thread handles one task at a time, often resulting in idle time if a task involves waiting (e.g., for I/O).
- Asynchrony: Asynchrony allows a single thread to manage multiple tasks without waiting for each task to complete. Tasks are split into smaller chunks or callbacks and rely on signals or event-driven mechanisms to notify the thread when they are complete.
- Multithreading: Multithreading uses multiple threads to execute tasks concurrently. Each thread can independently handle a task, enabling parallel execution and reducing idle time for CPU-bound or I/O-intensive operations.
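The difference between the first two terms can be sketched with the fs module. This is a minimal illustration, assuming a scratch file in the OS temp directory; the names demoFile and readBothWays are hypothetical and exist only for this example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file so the example is self-contained.
const demoFile = path.join(os.tmpdir(), 'sync-vs-async-demo.txt');
fs.writeFileSync(demoFile, 'hello');

function readBothWays(file) {
  return new Promise((resolve, reject) => {
    const order = [];

    // Synchronous: the thread waits here until the read has finished.
    fs.readFileSync(file, 'utf8');
    order.push('sync read finished');

    // Asynchronous: the read is started and the callback runs later.
    fs.readFile(file, 'utf8', (err) => {
      if (err) return reject(err);
      order.push('async callback ran');
      resolve(order);
    });

    // Runs before the callback above, because the thread did not wait.
    order.push('statement after readFile()');
  });
}

readBothWays(demoFile).then((order) => console.log(order));
```

The statement placed after the readFile() call is recorded before the callback, which shows the single thread carrying on instead of idling while the read is in flight.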
Is multithreading useful for I/O-bound tasks?
Well, for network applications, having multiple threads simply waiting on I/O is often inefficient. Threads consume resources, active or not.
Even an idle thread holds memory for its stack and adds work for the scheduler, resources that could otherwise go to threads performing actual computation.
Context switching between threads also adds overhead, as the CPU must save the current thread’s registers, program counter, and other state information, then load the corresponding data for the next thread to execute.
Moreover, shared memory access by multiple threads can lead to concurrency issues like race conditions, deadlocks, or resource starvation.
Event-driven asynchronous I/O, on the other hand, eliminates the need to manage multiple threads, enhances scalability, and simplifies application design by avoiding thread management complexities.
Thread-based networking is relatively inefficient and very difficult to use. Furthermore, users of Node.js are free from worries of dead-locking the process since there are no locks.
Almost no function in Node.js directly performs I/O, so the process never blocks unless I/O is performed with the synchronous methods of the standard library. Because so little blocks, scalable systems are very reasonable to develop in Node.js.
Does Node.js use threads? Yes, it does.
Node.js uses threads in two ways:
- The main Event Loop thread: Executes your JavaScript code (initialization and callbacks) and handles non-blocking asynchronous operations such as network I/O.
- The Worker Pool (a.k.a. thread pool) threads: Handle work offloaded from I/O APIs that the OS can’t perform asynchronously, as well as certain CPU-intensive operations.
Note: we have no direct control over Worker Pool threads, as they are managed by libuv; only the pool size can be tuned, through the UV_THREADPOOL_SIZE environment variable.
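As a rough illustration of the Worker Pool at work, the asynchronous crypto.pbkdf2 call is one of the APIs that Node.js runs on libuv's pool threads. The sketch below is an assumption-laden demo: the salt, iteration count, and the helper name deriveKey are made up for the example.

```javascript
const crypto = require('crypto');

// Wrap the callback-style API in a Promise. The key derivation itself
// runs on a Worker Pool thread, not on the Event Loop thread.
function deriveKey(password) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2(password, 'demo-salt', 100000, 32, 'sha256', (err, key) => {
      if (err) reject(err);
      else resolve(key.toString('hex'));
    });
  });
}

const startedAt = Date.now();
Promise.all(['a', 'b', 'c', 'd'].map(deriveKey)).then((keys) => {
  console.log(`derived ${keys.length} keys in ${Date.now() - startedAt} ms`);
});

// Printed first: the Event Loop is not blocked while the pool is busy.
console.log('Event Loop is free while the pool works');
```

With the default pool size of four threads, the four derivations can proceed in parallel; a fifth would have to wait for a free pool thread.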
Addressing CPU-intensive tasks beyond the Worker Pool
Consider a synchronous CPU-intensive task, such as hashing every element of a large array using the crypto module.
This blocking operation ties up the Event Loop thread, preventing it from handling other incoming requests until it’s done.
Because Node.js handles many clients with few threads, if a thread blocks while handling one client’s request, pending client requests may not get a turn until the thread finishes its callback or task.
The fair treatment of clients is thus the responsibility of your application. This means you shouldn’t do too much work for any client in any single callback or task.
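The effect of such a blocking task can be made visible by scheduling a 0 ms timer just before it starts. The following is a minimal sketch; SHA-256 as the hash, the array size, and the names hashAllSync and measureBlocking are assumptions for illustration.

```javascript
const crypto = require('crypto');

// Synchronously hash every element; nothing else runs on this thread meanwhile.
function hashAllSync(items) {
  return items.map((item) =>
    crypto.createHash('sha256').update(String(item)).digest('hex'));
}

// Schedule a 0 ms timer, then block; report how late the timer actually fired.
function measureBlocking(items) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    let hashingMs = 0;
    setTimeout(() => {
      resolve({ hashingMs, timerDelayMs: Date.now() - scheduledAt });
    }, 0);
    hashAllSync(items); // the Event Loop cannot run the timer until this returns
    hashingMs = Date.now() - scheduledAt;
  });
}

const demoItems = Array.from({ length: 200000 }, (_, i) => `item-${i}`);
measureBlocking(demoItems).then(({ hashingMs, timerDelayMs }) => {
  console.log(`hashing took ${hashingMs} ms; a 0 ms timer fired after ${timerDelayMs} ms`);
});
```

The timer stands in for any other client's callback: it cannot fire until the hashing has finished, however short its requested delay.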
Several kinds of synchronous, CPU-intensive work (and the attacks that trigger it) should not be allowed to run for long on the Event Loop thread:
- ReDoS (Regular expression Denial of Service): Using a vulnerable regular expression.
- JSON DoS (JSON Denial of Service): Using large JSON objects in JSON.parse or JSON.stringify.
- Certain synchronous Node.js APIs, such as zlib.inflateSync, fs.readFileSync, child_process.execSync, etc.
- Computationally heavy algorithms (e.g., O(N²) operations on large datasets).
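To make the first item concrete, here is a small sketch of a regular expression with nested quantifiers, a commonly cited vulnerable shape. The pattern, the input sizes, and the helper name timeMatch are assumptions chosen for the demo; actual timings depend on the machine and the Node.js version, so none are claimed here.

```javascript
// Nested quantifiers let the engine try very many ways to split the run of
// "a" characters before it can conclude that the input does not match.
const vulnerablePattern = /^(a+)+$/;

function timeMatch(length) {
  const input = 'a'.repeat(length) + '!'; // the trailing "!" forces a failed match
  const startedAt = process.hrtime.bigint();
  const matched = vulnerablePattern.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - startedAt) / 1e6;
  return { length, matched, elapsedMs };
}

// Each step adds only a few characters; compare how elapsedMs grows.
for (const length of [10, 15, 20, 22]) {
  console.log(timeMatch(length));
}
```

Because the match runs synchronously, all of that time is spent on the Event Loop thread, which is why attacker-controlled input combined with such a pattern amounts to a denial of service.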
Introducing Node.js worker threads
Node.js v12.11.0 stabilised the worker_threads module, which had been experimental since its introduction in v10.5.0.
Workers (threads) are useful for performing CPU-intensive JavaScript operations.
They do not help much with I/O-intensive work: Node.js’s built-in asynchronous I/O operations are more efficient than Workers can be.
Let’s start with a simple example from the Node.js documentation to demonstrate how we can create worker threads:
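What follows is a minimal sketch modeled on the pattern used in the Node.js documentation (a Worker wrapped in a Promise), not the documentation's exact listing. To keep it in one file, the worker's source is passed as a string with the eval: true option; the summing task and the names workerSource and sumInWorker are assumptions for illustration.

```javascript
const { Worker } = require('worker_threads');

// Source code executed inside the worker thread.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // Stand-in for expensive work: sum the integers from 1 to workerData.
  let sum = 0;
  for (let i = 1; i <= workerData; i++) sum += i;
  parentPort.postMessage(sum);
`;

function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    // workerData is cloned and made available inside the worker.
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.on('message', resolve);
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

sumInWorker(1000000).then((sum) => console.log('sum from worker:', sum));
```

In a real project the worker code would more commonly live in its own file, with that file's path passed to the Worker constructor instead of a source string.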
How can worker threads communicate with their parent thread?
The message event is emitted for any incoming message, that is, whenever port.postMessage() sends data through the channel.
Internally, a Worker object has a built-in pair of MessagePort objects that are already associated with each other when the Worker is created.
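Two-way communication over this default channel can be sketched as follows. The parent uses worker.postMessage() and worker.on('message'), while the worker uses parentPort; the upper-casing task, the inline worker source, and the names echoSource and shout are assumptions for the example.

```javascript
const { Worker } = require('worker_threads');

// The worker replies to every message with an upper-cased copy.
const echoSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (text) => {
    parentPort.postMessage(text.toUpperCase());
  });
`;

function shout(text) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(echoSource, { eval: true });
    worker.once('message', (reply) => {
      resolve(reply);
      // The worker keeps listening for messages, so stop it explicitly.
      worker.terminate();
    });
    worker.once('error', reject);
    worker.postMessage(text); // parent -> worker
  });
}

shout('ping').then((reply) => console.log('worker replied:', reply));
```

Messages are copied between threads rather than shared, so neither side can mutate the other's objects through this channel.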
For more complex scenarios, you can create a custom MessageChannel instead of using the default channel.
Here is another example from the Node.js documentation that demonstrates creating a MessageChannel object to be used as the underlying communication channel between the two threads:
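The sketch below follows the same idea as the documentation's example but is not its exact listing: one port of a new MessageChannel is transferred to the worker over the default channel, and the worker then answers on that port. The inline worker source and the names channelWorkerSource and receiveOverCustomChannel are assumptions for illustration.

```javascript
const { Worker, MessageChannel } = require('worker_threads');

// The worker receives a port over the default channel and replies on it.
const channelWorkerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.once('message', ({ port }) => {
    port.postMessage('the worker is sending this');
    port.close();
  });
`;

function receiveOverCustomChannel() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(channelWorkerSource, { eval: true });
    const { port1, port2 } = new MessageChannel();
    worker.once('error', reject);
    port2.once('message', resolve);
    // The second argument is the transfer list: port1 now belongs to the worker.
    worker.postMessage({ port: port1 }, [port1]);
  });
}

receiveOverCustomChannel().then((value) => console.log('received:', value));
```

A dedicated channel like this is handy when several independent conversations with the same worker should not be mixed on the default port.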
Worker thread standard channels
In the main thread, writes to process.stderr and process.stdout are synchronous when the streams are connected to a file, which prevents issues like unexpectedly interleaved output from console.log() or console.error(), or output being lost if process.exit() is called before an asynchronous write finishes. A worker’s standard streams are exposed to the parent thread through three properties:
- worker.stderr: If stderr: true wasn’t passed to the Worker constructor, data is piped to the parent thread’s process.stderr duplex stream.
- worker.stdin: If stdin: true was passed to the Worker constructor, data written to this stream will be available in the worker thread as process.stdin.
- worker.stdout: If stdout: true wasn’t passed to the Worker constructor, data is piped to the parent thread’s process.stdout duplex stream.
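As a hedged illustration of the stdout option, the sketch below passes stdout: true so that the worker's output is not piped to the parent's process.stdout automatically and can be collected from worker.stdout instead. The inline worker source and the helper name captureWorkerStdout are assumptions for the example.

```javascript
const { Worker } = require('worker_threads');

// Run `source` in a worker and collect everything it writes to stdout.
function captureWorkerStdout(source) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, stdout: true });
    let output = '';
    worker.stdout.setEncoding('utf8');
    worker.stdout.on('data', (chunk) => { output += chunk; });
    // 'end' fires once the worker has exited and all data has been read.
    worker.stdout.on('end', () => resolve(output));
    worker.once('error', reject);
  });
}

captureWorkerStdout(`console.log('hello from the worker');`).then((output) => {
  console.log('captured:', JSON.stringify(output));
});
```

Without stdout: true, the same console.log() call inside the worker would simply appear on the parent process's standard output.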
Let’s solve the problem we faced earlier
To avoid blocking the Event Loop with the CPU-intensive task of hashing the array elements, delegate the work to a worker thread. Once completed, the worker thread will return the hashed array to the main thread.
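On the main-thread side, this could look like the following sketch. The file name main.js, the helper name hashInWorker, and the default location of worker.js (next to this file) are assumptions; the array to hash is handed to the worker through workerData.

```javascript
// main.js (hypothetical file name)
const path = require('path');
const { Worker } = require('worker_threads');

// Runs the worker file with `items` as workerData and resolves with the
// first message the worker posts back (expected to be the hashed array).
function hashInWorker(items, workerFile = path.join(__dirname, 'worker.js')) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerFile, { workerData: items });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

// Example usage (requires a worker.js file next to this one):
// const items = Array.from({ length: 100000 }, (_, i) => `item-${i}`);
// hashInWorker(items).then((hashes) => console.log(hashes.length));
```

While the worker is busy, the main thread only waits for a message event, so it stays free to serve other requests.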
And in the same folder, let’s create a worker.js file to hold the worker logic:
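A minimal sketch of such a worker file is shown below. SHA-256 and the helper name hashAll are assumptions, since the text only says the elements are hashed with the crypto module; the worker reads the array from workerData and posts the result back through parentPort.

```javascript
// worker.js (sketch)
const crypto = require('crypto');
const { parentPort, workerData } = require('worker_threads');

// Hash every element with SHA-256 (an assumed choice of algorithm).
function hashAll(items) {
  return items.map((item) =>
    crypto.createHash('sha256').update(String(item)).digest('hex'));
}

// parentPort is null when this file is not running inside a worker thread.
if (parentPort) {
  parentPort.postMessage(hashAll(workerData));
}
```

Guarding on parentPort lets the same file be loaded outside a worker, for example to unit-test hashAll directly.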
This approach prevents blocking the main Event Loop, allowing it to handle other requests concurrently.
Conclusion
Offloading CPU-intensive synchronous tasks to worker threads and leaving only I/O-bound asynchronous tasks to the Event Loop can dramatically improve Node.js application performance.
Node.js worker threads operate in isolated contexts, minimizing traditional concurrency issues and relying on message passing for communication between the main thread and worker threads.