IBM WebSphere Telecom Web Services Server, Version 7.1

Guidelines for choosing limits for SLA Cluster Enforcement

This section provides information and examples to help you configure the long-term SLA Cluster Enforcement policies at the requester level, service level, and operation level.

Overview

At the core of the long-term SLA implementation is the concept of a sliding window that is defined by a combination of two policies (message.sla.TimeWindow and message.sla.TimeUnit). The guiding principle is that in any given time window, the system does not allow more tokens to be consumed than are allowed by the most granular rate policy defined for the request. The rate policies are checked in this order of precedence: message.sla.ClusterOperationRate, then message.sla.ClusterServiceRate, then message.sla.ClusterRequesterRate.
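The precedence rule can be illustrated with a minimal Python sketch. The function name effective_rate and the dictionary representation of a policy set are assumptions made for illustration only; they are not part of the product.

```python
# Hypothetical helper (not a TWSS API): pick the rate policy that governs a request.
# The tuple lists the rate policies from most granular to least granular.
RATE_PRECEDENCE = (
    "message.sla.ClusterOperationRate",
    "message.sla.ClusterServiceRate",
    "message.sla.ClusterRequesterRate",
)


def effective_rate(policies):
    """Return (attribute, limit) for the most granular rate policy present."""
    for attribute in RATE_PRECEDENCE:
        if attribute in policies:
            return attribute, int(policies[attribute])
    return None  # no cluster rate policy is defined for this request


# Assumed in-memory form of a retrieved policy set (attribute -> value).
policies = {
    "message.sla.ClusterRequesterRate": "500",
    "message.sla.ClusterServiceRate": "100",
    "message.sla.TimeUnit": "Second",
    "message.sla.TimeWindow": "600",
}

print(effective_rate(policies))  # ('message.sla.ClusterServiceRate', 100)
```

Here the service-level rate wins because no operation-level rate is present in the policy set.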

All three examples apply only to the SLA Cluster Enforcement mediation primitive. They cannot be applied to the SLA Local Enforcement mediation primitive.

Requester level policies

In this example, a telecom network operator has exposed Terminal Location (TL) and Terminal Status (TS) services to a client (Requester1) using TWSS. Assume that the network operator assigns a weight of 10 to each TL and TS operation, and wants to restrict Requester1 to send at most 10 requests (of either TL or TS) within a time window of 600 seconds.

In this case, the policy set is coded as follows:
<requesterID>Requester1</requesterID>
	<policies>
		<policy attribute="message.sla.ClusterEnabled" value="true"/>
		<policy attribute="message.sla.ClusterWeight" value="10"/>
		<policy attribute="message.sla.ClusterRequesterRate" value="100"/>
		<policy attribute="message.sla.TimeUnit" value="Second"/>
		<policy attribute="message.sla.TimeWindow" value="600"/>
	</policies>		
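The ClusterRequesterRate value of 100 follows from the weight and the intended request count. The following Python sketch shows that arithmetic. It assumes that a request consumes its weight once per target, which is consistent with the examples in this section but is not stated as a formula there; the function names are hypothetical.

```python
# Illustrative arithmetic only; these helpers are not part of TWSS.

def tokens_for(weight, targets=1):
    """Tokens consumed by one request (assumption: weight is charged per target)."""
    return weight * targets


def rate_for(weight, max_requests):
    """Rate limit that admits max_requests single-target requests per window."""
    return weight * max_requests


def max_requests(rate, weight):
    """Number of single-target requests that fit within a rate limit."""
    return rate // weight


print(rate_for(weight=10, max_requests=10))  # 100
print(tokens_for(weight=10, targets=5))      # 50
print(max_requests(rate=100, weight=10))     # 10
```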

With these policies in effect, here is an example of what would happen to a set of requests originating with Requester1. Some of the requests are rejected because the SLA policy is defined at the requester level for a time window of 600 seconds.

(The notations indicate that the first request is a Terminal Location request consuming 50 tokens, for example a getLocation request with 5 targets. The request is sent 60 seconds after the start of the interval in question.)
Time line showing requests in the example using requester level policies. Refer to the accompanying table for details.
Table 1. Example using requester level policies
Time (seconds)  Request  Disposition
60              TL (50)  Accepted
180             TS (50)  Accepted
360             TL (10)  Rejected, because the total number of tokens (100) has been exhausted for the 600-second window.
540             TS (10)  Rejected
620             TL (10)  Rejected
680             TS (30)  Accepted, because only 50 tokens have been consumed during the previous 600 seconds; therefore enough tokens are available to accommodate this request.
820             TS (50)  Accepted
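The dispositions in Table 1 can be reproduced with a small in-memory sliding-window sketch in Python. This is an illustration, not the product implementation (which stores its records in a relational database). The class name and the exact boundary rule (a record ages out once it is a full window old) are assumptions.

```python
from collections import deque


class SlidingWindowLimiter:
    """In-memory sketch of a token limit enforced over a sliding time window."""

    def __init__(self, limit, window):
        self.limit = limit        # corresponds to message.sla.ClusterRequesterRate
        self.window = window      # corresponds to message.sla.TimeWindow, in seconds
        self.accepted = deque()   # (timestamp, tokens) for accepted requests only

    def try_consume(self, now, tokens):
        # Drop records that have aged out of the window (boundary rule is assumed).
        while self.accepted and self.accepted[0][0] <= now - self.window:
            self.accepted.popleft()
        used = sum(n for _, n in self.accepted)
        if used + tokens > self.limit:
            return False  # rejected requests are not recorded
        self.accepted.append((now, tokens))
        return True


limiter = SlidingWindowLimiter(limit=100, window=600)
# (time in seconds, tokens) for the requests in Table 1
requests = [(60, 50), (180, 50), (360, 10), (540, 10), (620, 10), (680, 30), (820, 50)]
results = [limiter.try_consume(t, tokens) for t, tokens in requests]

for (t, tokens), ok in zip(requests, results):
    print(t, tokens, "Accepted" if ok else "Rejected")
```

The request at 680 seconds is accepted because the 50 tokens consumed at 60 seconds are no longer inside the window by then.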

Service level policies

In this example, a telecom network operator has exposed short messaging (SMS), Terminal Location (TL), and Terminal Status (TS) services to a client (Requester1) using TWSS. Assume that the network operator assigns a weight of 1 to each SMS request and a weight of 10 to each TL and TS operation. Also assume that the network operator wants to restrict Requester1 to send at most 10 TL and 10 TS requests (each of them counted separately) and any number of SMS requests provided the total consumed capacity does not exceed 500 tokens, within a time window of 600 seconds.

For Requester1, message.sla.ClusterRequesterRate (at the requester level) is set to 500, which allows all requests from Requester1 to consume up to 500 tokens collectively. The message.sla.ClusterServiceRate policy is set to 100 for both TS and TL. For those two services, this policy overrides the message.sla.ClusterRequesterRate policy, so TL and TS can each consume up to 100 tokens.

In this case the policy set retrieved for any TL or TS request is coded as follows:
<requesterID>Requester1</requesterID>
	<policies>
		<policy attribute="message.sla.ClusterEnabled" value="true"/>
		<policy attribute="message.sla.ClusterWeight" value="10"/>
		<policy attribute="message.sla.ClusterRequesterRate" value="500"/>
		<policy attribute="message.sla.ClusterServiceRate" value="100"/>
		<policy attribute="message.sla.TimeUnit" value="Second"/>
		<policy attribute="message.sla.TimeWindow" value="600"/>
	</policies>  
The policy set retrieved for an SMS request is coded as follows:
<requesterID>Requester1</requesterID>
	<policies>
		<policy attribute="message.sla.ClusterEnabled" value="true"/>
		<policy attribute="message.sla.ClusterWeight" value="1"/>
		<policy attribute="message.sla.ClusterRequesterRate" value="500"/>
		<policy attribute="message.sla.TimeUnit" value="Second"/>
		<policy attribute="message.sla.TimeWindow" value="600"/>
	</policies>

With these policies in effect, and assuming that message.sla.ClusterServiceRate is defined for both TL and TS services, here is an example of what would happen to a set of requests originating with Requester1.

(As in the previous example, the notations indicate that the first request is a Terminal Location request consuming 50 tokens, for example a getLocation request with 5 targets. The request is sent 60 seconds after the start of the interval in question.)
Time line showing requests in the example using service level policies. Refer to the accompanying table for details.
Table 2. Example using service level policies
Time (seconds)  Request   Disposition
60              TL (50)   Accepted
180             TS (100)  Accepted, because both TL and TS are allowed to consume 100 tokens each over the interval.
360             TL (50)   Accepted; however, now the 100-token limit has been reached for both TL and TS.
540             TS (10)   Rejected, because TS has used its limit of 100 tokens within the past 600 seconds.
620             TL (10)   Rejected, because TL has also used its limit of 100 tokens within the past 600 seconds.
680             TL (30)   Accepted, because only 50 tokens have been consumed by TL during the previous 600 seconds; therefore enough tokens are available to accommodate this request.
750             TS (50)   Rejected, because TS is still at its threshold of 100 tokens within the past 600 seconds.
810             TS (50)   Accepted, because more than 600 seconds have elapsed since TS used up its limit of 100 tokens.
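Table 2 can be replayed with a Python sketch that keeps a separate usage history for each service. The replay function and its data layout are hypothetical. The sketch models only the service-level limits for TL and TS; it does not model SMS requests or the requester-level budget of 500 tokens.

```python
from collections import defaultdict

WINDOW = 600                            # message.sla.TimeWindow, in seconds
SERVICE_RATE = {"TL": 100, "TS": 100}   # message.sla.ClusterServiceRate per service


def replay(requests, limits, window):
    """Return one accept/reject outcome per (time, service, tokens) request."""
    history = defaultdict(list)  # service -> [(timestamp, tokens)] for accepted requests
    outcomes = []
    for now, service, tokens in requests:
        # Keep only the records that are still inside the sliding window.
        live = [(t, n) for t, n in history[service] if t > now - window]
        used = sum(n for _, n in live)
        accepted = used + tokens <= limits[service]
        if accepted:
            live.append((now, tokens))
        history[service] = live
        outcomes.append(accepted)
    return outcomes


table2 = [
    (60, "TL", 50), (180, "TS", 100), (360, "TL", 50), (540, "TS", 10),
    (620, "TL", 10), (680, "TL", 30), (750, "TS", 50), (810, "TS", 50),
]
outcomes = replay(table2, SERVICE_RATE, WINDOW)

for (now, service, tokens), ok in zip(table2, outcomes):
    print(now, service, tokens, "Accepted" if ok else "Rejected")
```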

Service operation level policies

This example demonstrates how to set up policies for different operations within a service. Consider a telecom network operator who has exposed TL service to a client (Requester1) using TWSS. Assume that the network operator assigns a weight of 10 to each TL operation. Also assume that the network operator wants to allow Requester1 to send at most 10 getLocation and 10 getLocationForGroup requests within a time window of 600 seconds.

For Requester1, service TL, and operation getLocation, the message.sla.ClusterOperationRate policy is defined to be 100. For Requester1, service TL, and operation getLocationForGroup, the message.sla.ClusterOperationRate policy is also defined to be 100. Thus getLocation and getLocationForGroup requests are allocated 100 tokens each. Note that if a message.sla.ClusterServiceRate policy is defined, that policy is enforced for all TL operations other than getLocation and getLocationForGroup. In a similar way, if a message.sla.ClusterRequesterRate policy is defined, that policy applies to all services other than TL.

In this case the policy set retrieved for getLocation and getLocationForGroup requests from Requester1 is coded as follows:
<requesterID>Requester1</requesterID>
	<policies>
		<policy attribute="message.sla.ClusterEnabled" value="true"/>
		<policy attribute="message.sla.ClusterWeight" value="10"/>
		<policy attribute="message.sla.ClusterRequesterRate" value="500"/>
		<policy attribute="message.sla.ClusterServiceRate" value="50"/>
		<policy attribute="message.sla.ClusterOperationRate" value="100"/>
		<policy attribute="message.sla.TimeUnit" value="Second"/>
		<policy attribute="message.sla.TimeWindow" value="600"/>
	</policies>  
In this case, the ClusterOperationRate (100) overrides both ClusterServiceRate (50) and ClusterRequesterRate (500). With a weight of 10, this allows Requester1 to consume 100 tokens on getLocation, which corresponds to 10 single-target getLocation requests per time window. The ClusterServiceRate (50) value applies to all operations for which no ClusterOperationRate is defined. The ClusterRequesterRate (500) value applies to all other services for which ClusterServiceRate is not defined.

With these policies in effect, here is an example of what would happen to a set of requests originating with Requester1.

(The notations indicate that the first request is a getLocation request consuming 50 tokens, for example a request with 5 targets. The request is sent 60 seconds after the start of the interval in question.)
Time line showing requests in the example using service operation level policies. Refer to the accompanying table for details.
Table 3. Example using service operation level policies
Time (seconds)  Request            Disposition
60              getLoc (50)        Accepted
180             getLocGroup (100)  Accepted
360             getLoc (50)        Accepted; however, now the 100-token limit has been reached for both getLocation and getLocationForGroup. 300 tokens remain for other request types, based on the difference between the ClusterRequesterRate value (500) and the 200 tokens already consumed.
540             getLocGroup (10)   Rejected, because getLocationForGroup has used its limit of 100 tokens within the past 600 seconds.
620             getLoc (10)        Rejected, because getLocation has also used its limit of 100 tokens within the past 600 seconds.
680             getLoc (30)        Accepted, because only 50 tokens have been consumed by getLocation during the previous 600 seconds; therefore enough tokens are available to accommodate this request.
750             getLocGroup (50)   Rejected, because getLocationForGroup is still at its threshold of 100 tokens within the past 600 seconds.
810             getLocGroup (50)   Accepted, because more than 600 seconds have elapsed since getLocationForGroup used up its limit of 100 tokens.
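To see why the later rows of Table 3 are accepted or rejected, it can help to compute how many tokens each operation still has available at a given time. The following Python sketch does this for the requests accepted up to 360 seconds. The function name available and the tuple layout are assumptions for illustration; only the operation-level limit of 100 tokens is modeled.

```python
def available(accepted, operation, now, limit=100, window=600):
    """Tokens that an operation may still consume at time `now`."""
    used = sum(
        tokens
        for t, op, tokens in accepted
        if op == operation and t > now - window  # record is still inside the window
    )
    return limit - used


# Requests from Table 3 accepted up to 360 seconds: (time, operation, tokens)
accepted = [
    (60, "getLocation", 50),
    (180, "getLocationForGroup", 100),
    (360, "getLocation", 50),
]

print(available(accepted, "getLocation", 620))          # 0   -> a 10-token request is rejected
print(available(accepted, "getLocation", 680))          # 50  -> a 30-token request is accepted
print(available(accepted, "getLocationForGroup", 750))  # 0   -> a 50-token request is rejected
print(available(accepted, "getLocationForGroup", 810))  # 100 -> a 50-token request is accepted
```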

General guidelines

In general, follow these guidelines when defining SLA policies:
  • Even though the Service Policy Manager allows you to define the policy values for message.sla.ClusterRequesterRate, message.sla.ClusterServiceRate and message.sla.ClusterOperationRate at any level, you should define message.sla.ClusterRequesterRate only at the requester level, message.sla.ClusterServiceRate only at the service level and message.sla.ClusterOperationRate only at the operation level.
  • The time window should be defined at the same level where the rate policy is defined.
  • If you change the policies to modify the rate or duration of the time window, your changes take effect as soon as the next request is sent. However, if you change the duration of the time window, requests that have already been placed expire according to the policy values that were in force when they were placed.
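One way to picture the last guideline is to assume that each usage record carries its own expiry time, fixed when the request is accepted. The following Python sketch is based on that assumption; the actual storage layout used by the mediation primitive is not described here, and the names UsageRecord, record_request, and tokens_in_use are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    tokens: int
    expires_at: int  # assumed to be fixed when the request is accepted


def record_request(now, tokens, window):
    """Create a record that expires according to the window in force at `now`."""
    return UsageRecord(tokens=tokens, expires_at=now + window)


def tokens_in_use(records, now):
    """Sum the tokens of all records that have not yet expired."""
    return sum(r.tokens for r in records if r.expires_at > now)


# A request is placed while the time window is 600 seconds.
records = [record_request(now=0, tokens=50, window=600)]
# The window is then shortened to 60 seconds and another request is placed.
records.append(record_request(now=100, tokens=10, window=60))

print(tokens_in_use(records, 150))  # 60: both records are still live
print(tokens_in_use(records, 200))  # 50: the older record still follows the 600-second window
```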
Here are some general guidelines for enhancing performance:
  • The long-term SLA Enforcement mediation primitive uses a relational database to store the SLA-related data. Whenever there is a request for which SLA needs to be enforced, a database-read is performed. For each successful request, a new record is created in the database. However, there is no database record for a request that has violated the SLA. Depending on the trace level, an entry may be created in the trace file.
  • For performance reasons, it is recommended that you keep the time window fairly short, especially for services that generate a high number of requests per second. For example, if a sendSMS operation is running at 100 requests per second and you set the SLA window to 10 minutes, the SLA calculation can involve an aggregation of about 60,000 records (100 requests per second over 600 seconds). A more reasonable time window in this case would be one minute.
  • Expired SLA data is cleaned from the table using a background thread. The behavior of the cleaner thread can be controlled using the mediation promoted properties named cleanupInterval and maxEntriesPerCleanup. The 'cleanupInterval' specifies the interval after which the cleanup task will be executed. The 'maxEntriesPerCleanup' specifies the number of records that will be cleared from the database when the cleanup job is executed once.
  • The cleanup thread can also affect performance. It is advisable to run the cleanup less often, with a batch size approximately equal to the number of records that might have been created during the time window. Alternatively, the cleanup can be disabled and a batch job outside the mediation primitive can be used to clean the data during off-peak periods.
  • When you have multiple Access Gateway flows deployed within the same cluster, then each of these flows (enterprise applications) will have one cleaner thread running. For better performance, run the cleaner thread in one application only and disable it in the remaining applications. This can be done by setting the cleanupInterval to zero in the remaining applications.
  • Even if the enterprise application is stopped in the WebSphere® Integrated Solutions Console, the cleaner thread will continue to run.
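The sizing guidance above can be turned into a quick estimate. The following Python sketch computes an upper bound on the number of records in a time window, assuming that every request is accepted and therefore recorded. The same figure can serve as a starting point for maxEntriesPerCleanup. The function name is hypothetical.

```python
def records_per_window(requests_per_second, window_seconds):
    """Upper bound on SLA records in one window (assumes every request is accepted)."""
    return requests_per_second * window_seconds


# sendSMS at 100 requests per second
ten_minute_window = records_per_window(100, 600)
one_minute_window = records_per_window(100, 60)

print(ten_minute_window)  # 60000 records aggregated per SLA check
print(one_minute_window)  # 6000 records aggregated per SLA check

# Starting point for maxEntriesPerCleanup, following the batch-size guideline.
suggested_batch_size = one_minute_window
print(suggested_batch_size)  # 6000
```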



(C) Copyright IBM Corporation 2009. All Rights Reserved.