BLD FLASH 9546
SOURCE MATERIAL DATED: 11/95

USING COMPRESSION TECHNIQUES TO IMPROVE PRINTER THROUGHPUT

Printer throughput is affected by many factors: complexity of the datastream, amount of data processed per impression, bandwidth of the communications path, host system processing power, and printer controller memory/processing power are all significant factors that must be analyzed to ensure maximum throughput (and rated speed where possible) is achieved.

With the general migration away from local channel-attached printers to remote, distributed, LAN or communications controller-attached printers, the 'weakest link' with the most profound effect on printer throughput is shifting to the bandwidth of the communications path (both hardware and software).

In a perfect world, all installations would ensure their communications equipment and software provided more than enough bandwidth to keep their printers constantly fed. However, reality is far from perfect and many customers today try to balance the conflicting goals of optimal bandwidth and minimal costs. In some cases, the telephony infrastructure within a country actually dictates the use of sub-optimal line speeds. In others, customers will not gather the necessary information to determine what bandwidth is optimal, but instead will treat a new printer as if it were just another terminal or other 3270-like device. In still others, customers will make a conscious decision to use sub-optimal lines and then look to their IBM support team for ingenious ways to maintain rated speed.

This flash attempts to provide the IBM account team with enough information to guide the customer in remote printer pre-installation performance-related activities. It will compare (in a very general way) various forms of data compression/compaction, and contrast compression on the communications layer with that between a host print driver and internal printer microcode.
It will also provide general guidelines for estimating bandwidth requirements. These can be used during pre-install activities and are also applicable in situations where an installed printer suffers from degraded throughput. In the latter case, the bandwidth estimation can help focus attention on where the performance bottleneck resides. If bandwidth is deemed the culprit, the compression techniques discussed here can be benchmarked with customer data to determine if they can provide relief in situations where line speed cannot be increased.

Wherever possible, the IBM account team should assist the customer by providing guidelines to ensure the appropriate line speed is installed to keep a printer 'fed' with enough data to maintain rated speed. Appendix C to this flash gives two conservative formulas which can be used to calculate needed bandwidth.

When bandwidth cannot be increased, tuning of the communications path (such as setting higher RU sizes and increasing pacing values) and decreasing the amount of data sent across the lines are the primary means of incrementally improving throughput to the optimal value in the 'real' world. This flash addresses compression techniques most customers can use to optimize the available communications bandwidth.

NOTE: While bandwidth is the focus of this flash, turnaround time can also play a significant role in reducing printer throughput. This is not really much of a factor with land-based communications paths (the norm today), but becomes important if satellite-based communications are used. Satellites tend to have high bandwidth, but slow turnaround. If satellites are used to route data, pacing values should be set to very high levels (such as 50 or more). This will help prevent 'dead' time while the host waits for the secondary LU (the printer) to send back acknowledgements on data buffer receives before the host can send additional data.
Since acknowledgements are also required for resource loading (such as sending a font character set to the printer), land-based communication paths are almost always required when high-speed printers (such as an IBM 3900-0W1) are attached. Communication specialists should be involved early in situations where satellite-based communications are planned.

THE MECHANICS OF COMPRESSION
____________________________

Compression can occur on different levels. If the entire data path is examined, it begins at the host level with the printer driver (PSF). PSF receives data from print spool and converts it into the proper datastream required by the printer. For remote printers (the primary subject of this flash), the print datastream must then pass to the host communications layer (for example, MVS VTAM). VTAM dissects the datastream into 'packets' (RUs) and then places the data onto communication lines (usually via a communications controller such as a 3745).

On the other end of the line, the secondary (or receiving) communications product receives the packets and rebuilds the print datastream. In the case of PSF/2, this is CM/2. For PSF/6000 (or an AFCCU printer attached via SDLC), this is SNA Server/6000.

As a final step the print datastream is accepted and processed by the receiving print driver (PSF/2, PSF/6000, or in the case of an SDLC-attached AFCCU printer, the actual printer controller).

The diagram below illustrates the end-to-end printing path:

         HOST SYSTEM                           RECEIVING SYSTEM
   (MVS, VSE, VM, OS/400)                        (OS/2, AIX)
-----------------------------          --------------------------------
|             !             |  COMMS   |               !              |
| ---------   !   --------- |  PATH    |  ---------    !   ---------  |
| | HOST  |   !   | SNA   | |<---      |  | SNA   |    !   | APPL  |  |
| | APPL  |   !   | PROD  | |          |  |       |    !   |       |  |
| |       |   !   |       | |      --->|  | (CM/2)|    !   |(PSF/2)|  |
| | (PSF) |   !   | (VTAM)| |          |  |(SNA/6)|    !   |(PSF/6)|  |
| ---------   !   --------- |          |  ---------    !   ---------  |
|             !             |          |               !              |
-----------------------------          --------------------------------
                  |<--- Communications Layer ---->|
  |<--------------- Host to Receiving Appl. Layer ---------------->|

COMPRESSION TECHNIQUES
______________________

Where possible, the best place for compression is within the communications layer. This limits the amount of processing cycles the print drivers must devote to compression techniques, and allows the communications equipment (especially lines) to be optimized.

As an example, in the case of PSF/MVS and PSF/2, the communication layer is between MVS/VTAM and CM/2. So the first thing to examine is whether the communications products (at both the originating and receiving nodes) can be set up for compression. In the case of IBM SNA products, this is RU compression. RU compression can be specified in all system versions of VTAM, as well as CM/2. Both nodes must indicate compression is allowed for it to be activated.

RU compression is possible with all host VTAM products and PSF/2 (either PSF Direct or PSF/2 DPF). While compression will most often be applied in situations where low-to-medium speed printers (such as the 3116 or 3130) are attached via low-speed lines (19.2 Kbps and below), it also applies in high-volume production applications such as a 3900-DW system attached via PSF/2 and running in two-up duplex mode on 18x11 paper via PSF Direct, as the bandwidth requirement in this scenario could exceed 1 Mb/sec (in other words - 3/4 of a T1 line). With a combination of RU compression, 8 KB RU size, and reasonable pacing values (say 6), the bandwidth requirement could well drop into the 350-400 Kbps range.

In a more typical environment where compression can improve printer throughput, consider the following real-life example:

A customer in a country where bandwidth is severely constrained had a 3935 connected to PSF/MVS via a 9.6 Kbps SDLC line. Removing the bandwidth bottleneck was not a viable option, so alternatives were researched and tested on site.
For a particular text-intensive benchmarking job, the initial performance was 6 ipm. Implementing blank compression on the host (discussed below in the section entitled "General Line Data Compression within MVS") improved the performance to 10 ipm, which was still below customer requirements. Next PSF/2 was installed between the line and the 3935 so that RU compression could be used between MVS/VTAM and CM/2 on the PSF/2 system. With this change, performance jumped to 26 ipm. (While this was a significant improvement - 2.6 times the performance without RU compression - it turns out the customer did not choose to implement a PSF/2-based solution, as it increased the overall cost by several thousand dollars. We'll continue this example later...)

One might think we need look no further, as RU compression seems to be a viable technique to help alleviate bandwidth restrictions. However, RU compression is not possible when the remote print server is PSF/6000, because the SNA Server/6000 product does not support RU decompression.

This means we can't use the communications layer to support compression between PSF/MVS and PSF/6000, for example. It also means that we can't use the communications layer to perform data compression between a host print driver and a comms-attached AFCCU printer (because an SDLC-attached AFCCU printer actually is using SNA Server/6000 under the covers to receive data from the host system).

Line speeds of 19.2 Kbps or less for SDLC-attached AFCCU printers can prevent the printers from running at rated speed for many applications. Since we cannot use the communications layer to compress print data, we must look further upstream (to the host application) and attempt to perform data compression there. This also implies we may need to decompress the data at the receiving end (within the AFCCU printer control unit).
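
The RU compression rules in this section can be summarized in a few lines of code. The following Python sketch is illustrative only: the table and function names are hypothetical, and the capabilities listed are limited to what this flash states (VTAM and CM/2 support RU compression, SNA Server/6000 does not, and both nodes must allow compression before it is activated).

```python
# Hypothetical summary table; capabilities are those stated in this flash.
RU_COMPRESSION_SUPPORT = {
    "VTAM": True,
    "CM/2": True,
    "SNA Server/6000": False,
}


def ru_compression_possible(sending_product, receiving_product,
                            sender_allows=True, receiver_allows=True):
    """Return True only if both communications products support RU
    compression and both nodes are set up to allow it."""
    supported = (RU_COMPRESSION_SUPPORT.get(sending_product, False)
                 and RU_COMPRESSION_SUPPORT.get(receiving_product, False))
    return supported and sender_allows and receiver_allows


# PSF/MVS to PSF/2: the communications layer is VTAM to CM/2.
print(ru_compression_possible("VTAM", "CM/2"))             # True
# PSF/MVS to PSF/6000 or to an SDLC-attached AFCCU printer.
print(ru_compression_possible("VTAM", "SNA Server/6000"))  # False
```

Unknown product names are treated as not supporting compression, which keeps the sketch on the conservative side.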

HUFFMAN COMPRESSION (UNIQUE TO SDLC AFCCU PRINTERS)
___________________________________________________

For both PSF/MVS and PSF/2, we have added Huffman compression between the host application (PSF) and a receiving SDLC-attached AFCCU printer.

This compressed data flows across the line and continues through the communications layer (SNA Server/6000) within the AFCCU printer. The data is then expanded by a Huffman decompressor (part of the AFCCU microcode) and finally processed normally by the actual printer controller.

This function ONLY applies to AFCCU printers attached via SDLC to either PSF/MVS or PSF/2. So today this is restricted to SDLC-attached 3130s or 3935s driven by either a PSF/MVS V2.1.x host (with APAR OW13534 installed), or a PSF/2 V2.0 server. Support for Huffman compression is not yet available for PSF/MVS V2.2, but it is coming soon (see APAR OW16122).

Please note there is no Huffman compression possible between PSF/MVS and a remote server (either PSF/2 or PSF/6000). Huffman compression ONLY applies between a host PSF (specifically PSF/MVS or PSF/2) and an SDLC-attached AFCCU printer (3130 or 3935).

In the example above for the 3935 connected to PSF/MVS via a 9.6 Kbps line, performance of the text-intensive application was measured with Huffman compression activated, but without VTAM RU compression. Printer throughput was measured at 17 ipm (versus 6 ipm with no compression and 26 ipm when PSF/2 was installed and RU compression activated). While the performance with Huffman was less than that with PSF/2 and RU compression, the Huffman solution did not require the additional expense of a PSF/2 system acting as print server, and hence was the solution chosen by the customer.

GENERAL LINE DATA COMPRESSION WITHIN MVS
________________________________________

There are other forms of 'data compression' possible, particularly in an MVS environment.
These are as follows (note that these apply to ALL AFP devices):

o 'Blank compression' is available with PSF/MVS APAR OW07350. This function compresses blanks within line data records when 6 or more contiguous blanks are encountered in a print line (it is activated by the COMPRESS=YES JCL keyword in the PRINTDEV).

o For line data with TRAILING blanks, JES2 provides truncation of trailing blanks within line data records on spool. This is activated via the BLNKTRNC=YES keyword (which defaults to YES if not specified) within an OUTCLASS statement (so it applies to SYSOUT classes).

OTHER FORMS OF DATA COMPRESSION
_______________________________

In addition to line data (text) compression, "compression" is also possible for two other data forms - image and graphics. ("Compression" is used in quotes here because GOCA is not strictly compressed data - rather it involves shorthand representations for describing graphical objects. However, as GOCA is used in place of 'simple' bit-mapped image data, the GOCA object will often be much smaller than its corresponding bitmap (IM1) representation.)

Both types of compressed data (IOCA for image and GOCA for graphics) can greatly reduce the amount of data sent between the host application and the receiving system, and also result in DASD savings on host libraries. Data in IOCA/GOCA form is passed as is by the communications layer, and is decompressed/interpreted within the receiving printer's microcode.

All IBM AFCCU printers support both IOCA and GOCA, as do most of IBM's coax/twinax attached IPDS printers (such as 3116 or 4028). Both are further supported by IBM CCU printers equipped with the AFIG feature (such as 3835-2 or 3900-001) and by PCL4/5 printers (controlled by PSF/6000 or PSF/2).

Image and/or graphics in "compressed" form are handled particularly well by AFCCU printers, but in other printers may not yield as high throughput as uncompressed image/graphics, as decompression can require significant processing time within printer microcode. As a result, there are once again trade-offs between line optimization and printer processor cycles. Existing performance flashes describe these trade-offs via examples.

SUMMARY
_______

Without a doubt, the 'best' answer in situations where printer throughput degrades solely due to 'data starvation' is to increase line speed to ensure enough bandwidth is available even for the most complex data-intensive applications. However, as with most computing decisions, cost is one of the most important parameters, and it may not be possible to provide the required line speed. In these cases, compression should be benchmarked to see if it can provide visible benefits.

In any given customer environment, the best form of compression may be a combination of the methods described above. (A word of caution - while some of the techniques above can work together, such as blank truncation and either Huffman or RU compression, there are rapidly diminishing returns when one attempts to compress data that has already been manipulated by a compression algorithm. In fact, compressing data that has already been treated by a compression algorithm can actually INCREASE the amount of data, rather than reduce it.)

As a general rule, MVS/JES2 accounts should always use the default trailing blank truncation. Many customers can also take advantage of RU compression, and should do so whenever bandwidth is a concern. In addition, maximizing RU sizes (which is hardware-dependent) and pacing values helps optimize the line and move the maximum amount of data between the host and the receiving systems.
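
The word of caution above about compressing data twice can be demonstrated with any general-purpose compressor. The sketch below uses Python's zlib module (DEFLATE) purely as a stand-in; it is NOT the RU compression or Huffman implementation used by VTAM, CM/2, or PSF, and the sample records are synthetic, generated only for this illustration.

```python
import random
import zlib

random.seed(1995)  # fixed seed so the synthetic sample is repeatable
WORDS = ["INVOICE", "ACCOUNT", "TOTAL", "BALANCE", "DATE",
         "CUSTOMER", "AMOUNT", "PAGE", "REPORT", "NUMBER"]

# Build 500 print-like records of 80 bytes, padded with trailing blanks.
records = []
for _ in range(500):
    count = random.randint(2, 6)
    text = " ".join(random.choice(WORDS) for _ in range(count))
    records.append(text.ljust(80))
data = "".join(records).encode("ascii")

once = zlib.compress(data)    # first pass: large reduction expected
twice = zlib.compress(once)   # second pass: little or no further gain

print("original bytes:        ", len(data))
print("compressed once:       ", len(once))
print("compressed a 2nd time: ", len(twice))
```

On text like this, the first pass removes most of the redundancy, so the second pass has almost nothing left to work with and its framing overhead can make the result slightly larger. The exact byte counts depend on the data and the compressor.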

For the particular case of SDLC-attached 3130s and 3935s with line speeds below 56 Kbps, Huffman compression should be carefully considered (where the printers are connected to either PSF/MVS or PSF/2) if performance degradation is present.

APPENDIX A: COMPRESSION IN PSF/MVS
__________________________________

When PSF/MVS is driving a printer, compression of the data sent to the printer may enhance the throughput of the printer. There are two types of compression that PSF itself can perform. One is line-data only compression which can be activated by specifying the optional COMPRESS keyword on the PRINTDEV statement and the other is Huffman compression which can be used to compress any type of data. The characteristics of the majority of data sent to the printer and the printer used will determine the optimal compression type. If possible, try both separately to see which gives the best results.

Huffman compression may improve performance, because the algorithm compresses on a byte-by-byte basis and a different compression bit code is assigned to represent each indivisible character in the uncompressed data. The number of bits used to represent each uncompressed character depends upon how frequently the character occurs in the data to be compressed. Characters that occur frequently are assigned short compression bit codes and characters that occur less frequently are assigned longer compression bit codes.

The Huffman compression algorithm can be used on any type of data (line data, MO:DCA-P, image, fonts, etc.). Data compressed using the Huffman algorithm can only be sent to those AFCCU printers (SDLC attached) that contain the decompression microcode. Note that no printer microcode updates are necessary.

Huffman compression may improve printer throughput when the following conditions are met:

o The printer is attached via a low-speed communications line, for example, a 9.6 or 19.2 Kbps line.

o Printer throughput is limited by the attachment.

o The arrangement of the data (occurrence and position) is irregular, for example, text that has many embedded blanks will do well.

The compression ratio varies considerably. It does best with irregular data and with data in double-byte font form. In general, text data as above will give compression ratios between 1.25:1 and 1.75:1. Double-byte data will do even better.

Using Huffman compression may significantly increase the number of CPU cycles that PSF requires. It may also change PSF from being an I/O-bound process to being a CPU-bound process. The cost-effectiveness of compression depends upon a balance between the availability and cost of additional host CPU cycles to perform compression and the cost of additional communications bandwidth.

For maximum data transfer, remember to set the RU size. On a clean line (retransmissions are infrequent) choose the largest possible RU size.

APSUX07 is used to tell PSF/MVS to activate Huffman compression, so that the data is compressed before it is sent to the printer, by setting the XTP7HCA flag in the initialization call. The XTP7HCA flag defaults to B'0'. The default specifies that PSF will not compress the IPDS data. If the flag is set to B'1', then PSF will compress the IPDS data using the Huffman compression algorithm. The flag can be set during the INIT call, and it remains set between calls; only APSUX07 can change the state of this flag.

APPENDIX B: HUFFMAN COMPRESSION IN PSF/2
________________________________________

Details of PSF/2's implementation of Huffman compression are given in the "What's New" online reference, which is placed in the PSF/2 folder as part of PSF/2 V2.0 installation.

Briefly, Huffman compression applies ONLY to SDLC-attached 3130s or 3935s. These attach to a PSF/2 system via the PS/2 Multi-Protocol/A adapter.

Huffman compression is activated by setting an environment variable within the OS/2 CONFIG.SYS file.
As a result, it applies on the boot following the setting of the environment variable within CONFIG.SYS, and remains active until CONFIG.SYS is modified and the server rebooted. The required line is:

     SET PSFSDLCCOMPRESS=Plu-Alias1;Plu-Alias2;...

where Plu-Aliasx is the alias defined for the printer in CM/2.

The printer senses Huffman compression automatically (no printer configuration settings are required) and will decompress the print data if Huffman compression was used.

APPENDIX C: ESTIMATING PRINTER PERFORMANCE
__________________________________________

For distributed printers attached to a host via some conglomeration of communications equipment, the available line bandwidth is a critical component in determining overall throughput.

When planning for a remote printer installation, or when trying to determine whether line speed is a factor when a printer's throughput falls below optimal values, a rough calculation is helpful. The following can be used for rough estimates:

     I = printer speed in impressions per minute
     P = bytes of application data per impression (per side of a sheet)
     L = line speed in bits/second

Assuming 25% overhead (this includes control bytes within the datastream and overhead caused by communications controls), the formula is:

     L = I*P/6

This formula is appropriate for most basic text applications, where the application formats each line of data. It also applies when limited image is included in the data (you'll need to include image bytes per impression in the determination of the P variable). The formula will be on the conservative side in most cases.

For applications with a higher ratio of control characters to data bytes (for example, formatting tables with DCF or using field-level formatting to map fields within a line record to multiple locations on the page), a higher overhead factor is required. Assuming 50% total overhead, the formula becomes:

     L = I*P/5

Given any two of the variables in either of the formulas above, the remaining variable can be estimated.

An example might be helpful. Let's assume we have an IBM 3130 printer attached via comms line to a PSF/MVS host.

If we want to try and estimate what speed line we need to drive the printer at rated speed, we first must estimate the number of bytes of data per impression. Suppose we plan to print listings 2-up on 8-1/2 by 11 inch paper, and each logical page of data is composed of 80 bytes of text per line, and 45 lines per page.

We know the rated speed of the 3130 is 30 impressions per minute. Thus I=30. We can calculate P (bytes/impression) as follows:

     P = 80 (bytes/line) * 45 (lines/logical page) * 2 (logical pages/impression) = 7200

Since this is simple text, we use the first formula. Hence:

     L = I*P/6 = 30*7200/6 = 36,000 bits/sec

So for this application, we should be safe using a 38.4 Kbps line.

Let's examine this from another perspective. Let's still assume a 3130 printer. But now let's assume the line speed is set at 19.2 Kbps. How many bytes per page can the application generate and still allow us to achieve rated speed?

Here we have I (30) and also L (19,200). Rearranging the formula, we see:

     P = 6*L/I = 6*19,200/30 = 3,840 bytes/impression

If we maintain the assumption that our application is generating lines of 80 characters each, we can then see that the maximum number of lines of data per impression we can send is 3,840/80 = 48 lines per impression.

Clearly in this case we'd need to use some form of compression to try and improve overall throughput. This example would be a good case for either Huffman compression (assuming the 3130 is attached to PSF/MVS via SDLC) or for VTAM-CM/2 RU compression (if the 3130 were managed by a PSF/2 system, which in turn was attached to the host).

Note that the formula can be adjusted by an additional fudge factor if the overall compression ratio is known.
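
The two worked examples above can be reproduced with a short Python sketch. The function names are hypothetical. The divisors 6 and 5 are consistent with converting bytes per minute to bits per second (multiply by 8, divide by 60) and then adding 25% or 50% overhead; that reading is an inference from the formulas, not something this flash states explicitly.

```python
def required_line_speed(ipm, bytes_per_impression, overhead=0.25):
    """Estimate the line speed (bits/second) needed to sustain `ipm`.

    With overhead=0.25 this equals I*P/6; with overhead=0.50 it
    equals I*P/5 (8 bits/byte, 60 seconds/minute).
    """
    bits_per_minute = ipm * bytes_per_impression * 8 * (1 + overhead)
    return bits_per_minute / 60


def max_bytes_per_impression(ipm, line_bps, overhead=0.25):
    """Rearranged form: bytes/impression a given line can sustain."""
    return line_bps * 60 / (8 * (1 + overhead) * ipm)


# 3130 at 30 ipm, 2-up listings: 80 bytes * 45 lines * 2 logical pages.
p = 80 * 45 * 2
print(required_line_speed(30, p))           # 36000.0 bits/sec

# Same printer on a 19.2 Kbps line.
print(max_bytes_per_impression(30, 19200))  # 3840.0 bytes/impression
```

Treat the results as rough planning estimates, in the same spirit as the formulas themselves.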
As an example of such an adjustment, if we assume RU compression reduces the data by 40%, then the first formula becomes:

     L = (I*P/6)*.6

In other words, we only need 60% as much line speed when data is reduced 40% (.6 = 1-.4).

APPENDIX D: QUESTIONS AND ANSWERS
_________________________________

1) Can I use VTAM compression with PSF/2 DIRECT?

Yes. PSF/2 PSF Direct uses Communication Manager. So, like PSF/2 DPF, it can use VTAM RU compression.

2) Can I use PSF/MVS Huffman compression with PSF Direct and AFCCU printers?

No. PSF/2 PSF Direct scans the data stream for a few IPDS commands that it wouldn't recognize if the data was compressed. Besides, presumably the printer is attached to the PS/2 by a high speed link. So, end-to-end compression is not important; only compression between MVS and PSF/2 matters (that is, RU compression over the communications layer).

3) What about PSF/MVS Huffman compression with PSF/2 DPF and AFCCU printers?

No, this also won't work. PSF/2 DPF must be able to understand the IPDS data stream. It needs to respond to STM and XOH/OPC commands. And, to perform printer error recovery, it needs to be able to understand the LCC and LPD commands as it prints the spooled data. We did NOT add Huffman decompression logic to the receiving end of DPF.

4) Why was Huffman compression added at all?

Huffman arose because the AIX communication product, SNA Server/6000, does not support RU compression/decompression. It is important to note that the AFCCU controllers use a RISC System/6000 running AIX and SNA Server/6000. So, to provide compression support for the AFCCU printers attached via slow SDLC lines, Huffman routines were added to the sending side of PSF/MVS and PSF/2.

5) Can Huffman compression be used with printers like 3116 attached via slow lines?

No. While the host software compresses the data, it is up to the printer control unit to decompress the data back to normal IPDS. The only control units today with this capability are the 3935 and 3130.

6) Shouldn't Huffman compression be used all the time to help alleviate line bottlenecks?

Actually, no. Each individual situation must be evaluated separately. Compression is a trade-off between CPU cycles and communication bandwidth. It is appropriate when CPU cycles are available and communication bandwidth is low. For high communications bandwidth (such as T1 lines) on a fully-utilized CPU, compression robs precious CPU cycles while not translating into increased printer throughput. Huffman compression is designed to make a 3130 on a 19.2 Kb/s line run at 30 ipm instead of 18 or 20. It isn't designed to make a 3900-0W1 print at 229 ipm instead of 20.

7) Is compression active on a job by job basis?

No. Once Huffman is turned on by exit 07, the data is compressed as long as we have an SNA connection. PSF does not care who or what we are connected to or what printer is on the other end (there is no OPC or STM checking).

8) My printer is not running at rated speed. It's attached via a 19.2 Kbps line. Is communications bandwidth the cause of the problem?

Not necessarily. The cause could be data-dependent, such as having hex data that is not mapped in the font codepage. Running the application with printer data checks unblocked should be an early problem determination step. You should also use the formulas in Appendix C to ensure bandwidth deserves a second look. If it does, a further problem determination technique is to actually measure the line utilization. If the line seems to be underutilized, the problem probably is not bandwidth, but some other factor (not enough printer memory, too many resources per page, 240 pel overlays with significant image data printed on a 300 pel device, etc.). So before jumping the gun and blaming bandwidth, try to eliminate other causes of performance bottlenecks.