Threading

Message processing nodes and parsers must work in a multi-instance, multithreaded environment. There can be many node objects or parser objects each with many syntax elements, and there can be many threads executing methods on these objects. The message broker design ensures that a message object and any objects it owns are used only by the thread that receives and processes the message through the message flow.

An instance of a message flow processing node is shared by all the threads that service the message flow in which the node is defined. A parser instance, by contrast, is used by only a single message flow thread.

A user-defined extension should adhere to this model, and should avoid the use of global data or resources that require semaphores to serialize access across threads. Such serialization can result in performance bottlenecks.

User-defined extension implementation functions must be reentrant, and any functions they invoke must also be reentrant. All user-defined extension utility functions are fully reentrant.

Although a user-defined extension can spawn additional threads if required, it is essential that the same thread returns control to the broker on completion of an implementation function. Failure to do this will compromise the integrity of the broker and will produce unpredictable results.

Execution model

When an execution group is initialized, the appropriate loadable implementation libraries (LILs) are made available to the runtime. The execution group runtime process starts and spawns a dedicated configuration thread. In the message flow execution environment, the message flow is thread-safe: you can run message flows concurrently on many operating system threads without having to consider serialization issues. Any user-defined nodes that you implement must not compromise this threading model. Note the following points:
  • An input message sent to a message flow is processed only by the thread that received it. No thread or context switching takes place during message processing.
  • Data structures accessed by message flows are visible only to a single thread, and exist only for the lifetime of the message being processed.
  • A single instance of a message flow is shared between all the threads in the message flow thread pool. This is possible because message flow nodes do not hold state.
  • The memory requirements of an execution group are not unduly affected by running message flows on more operating system threads.

Threading model

The following example of a user-defined input node illustrates the threading considerations you should be aware of when designing and implementing your own user-defined nodes.

A message flow can be configured to run on a set of threads. The size of that set is determined by the number of input nodes in the message flow and by the value of the additionalInstances property of the message flow; together, these determine the maximum capacity of the thread pool that the message flow can use. Therefore, if your message flow has processing requirements that dictate single-threaded execution, define only one input node and leave the additionalInstances property at its default value of zero.

A typical order of events for input node processing looks like this:
  1. Input node construction takes place.
  2. A thread is demanded from the thread pool.
  3. The allocated thread starts in the node's run method.
  4. Configuration (or reconfiguration) is committed.
  5. Initialization processing is performed on the thread's context.
  6. The thread connects to the broker's queue manager.
  7. A message group and buffer object are created.
  8. A queue open request for the input queue is sent to the queue manager. This queue is kept open for the duration of the thread's lifetime.
  9. The input node enters a message processing loop.
  10. When a message is received, the data buffer contains the header and body of the message.
  11. Message objects are created and attached to the thread's group.
  12. Thread dispatching is activated if multiple threads are specified.
  13. Message data is propagated downstream.
You should note the following:
  • Your input node will implement the chosen message flow threading model.
  • Your input node will always have at least one thread either reading from its input source or actively processing a message received by it. If a message flow has multiple input nodes, then any additional thread instances are available to service any of the input nodes, as determined by the dispatching policy of that input node.
Threads can be demanded or requested. When your message flow is deployed, the input node demands an initial thread. Although the message flow has a pool of threads associated with it, the input node is responsible for the dispatching policy, so it must always ensure that one instance of itself is running on a thread. Because the default value of the additionalInstances property is zero, any further requests for a thread fail if you have defined multiple input nodes. As a result, a message flow can consume more threads than you expect, and if you have defined multiple input nodes, one of them could be starved of threads.

Allowing the broker to start additional copies of the message flow in separate threads by using the additionalInstances property is the most efficient way of preventing the input queue from becoming a bottleneck. However, because separate threads process messages from the queue in parallel, use this property only when the order in which messages are processed is not important.

Threads are created as a result of input node construction and operation. A thread in the pool is either active or idle; idle threads remain in the pool until they are dispatched by an input node, or until the broker is shut down.

The figure below illustrates thread allocation in a message flow.

Thread allocation in a message flow



Initially, Thread 1 is demanded (A), and waits for messages. When a message arrives (B), Thread 1 propagates the message, and dispatches Thread 2. Thread 2 receives a message immediately (C), and propagates, and dispatches Thread 3, which waits for a message (D). Thread 1 completes (E), and returns to the thread pool. Thread 3 then receives a message (F), dispatches Thread 1, and propagates the message. Thread 1 now waits for a message to arrive on the queue (G).

It is worth noting the point marked H. At this point in the flow, Thread 1 acquires a message but, because all other threads are active at that time, it cannot dispatch. The message is propagated anyway.

After this message has been propagated, Thread 2 completes (I), receives a new message from the queue, and propagates this new message. Thread 3 then completes (J), and returns to the pool. Thread 2 then also completes (K). Because it did not dispatch, when Thread 1 completes (L), it cannot return to the thread pool even though there are no messages on the queue, and instead waits for a message to arrive on the queue.

Note the following points about thread behavior in WebSphere Message Broker:
  • Threads are only created if required by the workload. This means that an execution group process can use fewer threads than it has been configured for.
  • Unless all available threads are actively processing within a message flow, one thread will always be reading the input queue.
  • If the workload increases at any point, other input nodes in the same message flow will be able to acquire threads that have been returned to the thread pool.
  • If a thread acquires a message but all other threads are active at the time, it cannot dispatch; the message is propagated anyway. When that thread completes, because it did not dispatch, it does not return to the thread pool even if there are no messages on the queue, and instead waits for a message to arrive.
Related concepts
User-defined input nodes
Related reference
cniDispatchThread