Power6 High-End System Firmware Fix History - Release levels EH3xx

Firmware Description and History


EH350 For Impact, Severity, and other firmware definitions, refer to the 'Glossary of firmware terms' at: http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
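The level identifiers used throughout this history (for example EH350_176_038) can be split mechanically when comparing installed and available levels. The sketch below assumes the usual reading of the three numeric fields as release level, service pack level, and last disruptive service pack level; that interpretation is not stated in this document and should be checked against the glossary linked above. The names FirmwareLevel, parse_level, and newest are illustrative only.

```python
import re
from typing import Iterable, NamedTuple


class FirmwareLevel(NamedTuple):
    platform: str         # two-letter prefix, e.g. "EH"
    release: int          # e.g. 350
    service_pack: int     # e.g. 176 (assumed meaning)
    last_disruptive: int  # e.g. 38 (assumed meaning)


# Two capital letters followed by three 3-digit fields separated by "_".
_LEVEL_RE = re.compile(r"^([A-Z]{2})(\d{3})_(\d{3})_(\d{3})$")


def parse_level(text: str) -> FirmwareLevel:
    """Split an identifier such as 'EH350_176_038' into its fields."""
    match = _LEVEL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a firmware level: {text!r}")
    platform, release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(platform, int(release), int(service_pack), int(last_disruptive))


def newest(levels: Iterable[FirmwareLevel]) -> FirmwareLevel:
    """Return the level with the highest (release, service pack) pair."""
    return max(levels, key=lambda lv: (lv.release, lv.service_pack))
```

For example, parse_level("EH350_176_038") yields the fields EH, 350, 176, and 38.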
EH350_176_038 / FW350.H0 01/20/17	Impact: Availability Severity: ATT System firmware changes that affect all systems A problem was fixed for a Live Partition Mobility migration that resulted in the source managed system going to the management console Incomplete state after the migration to the target system was completed. This problem is very rare and has only been detected once. The problem trigger is that the source partition does not halt execution after the migration to the target system. The management console went to the Incomplete state for the source managed system when it failed to delete the source partition because the partition would not stop running. When this problem occurred, the customer network was running very slowly and this may have contributed to the failure. The recovery action is to re-IPL the source system but that will need to be done without the assistance of the management console. For each partition that has an OS running on the source system, shut down the partition from the OS. Then from the Advanced System Management Interface (ASMI), power off the managed system. Alternatively, the system power button may also be used to do the power off. If the management console Incomplete state persists after the power off, the managed system should be rebuilt from the management console. For more information on management console recovery steps, refer to this IBM Knowledge Center link: https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm A rare problem was fixed for a system hang that can occur when dynamically moving "uncapped" partitions to a different shared processor pool. To prevent a system hang, the "uncapped" partitions should be changed to "capped" before doing the move. A problem was fixed for Live Partition Mobility (LPM) migrations from FW860.10 or FW860.11 to older levels of firmware. 
Subsequent DLPAR of Virtual Adapters will fail with HMC error message HSCL294C, which contains text similar to the following: "0931-007 You have specified an invalid drc_name." This issue affects partitions installed with AIX 7.2 TL 1 and later. Partitions installed with VIOS, IBM i, or earlier levels of AIX are not affected by this issue. System firmware changes that affect certain systems On systems with IBM i partitions, a problem was fixed for frequent logging of Informational errors of B7005120 for the HMC closed pipe condition for messages sent to IBM i partitions. The HMC closed pipe to the hypervisor does not represent an error but is a normal operating state that does not need concern or service. Therefore, the informational logging of the HMC closed pipe condition has been removed. Without the fix, IBM support and the customer should ignore the B7005120 informational error logs. Concurrent hot add/repair maintenance (CHARM) firmware fixes A problem was fixed for a concurrent maintenance operation on the I/O drawers that causes the four embedded storage controllers in the FC 5791, FC 5794, FC 5797, or FC 5798 I/O drawers to not show up on the I/O properties panel on the HMC. These I/O drawer features provide a 4U high I/O drawer containing twenty PCI-X slots and up to sixteen hot-swap disk bays. These I/O drawers attach to the CEC either via 12X or RIO-2 attachment cables. If the I/O is already lost when the fix is applied, it can be restored by using the Advanced System Management Interface (ASMI) to invoke "System Configuration > Configure I/O enclosure > Clear inactives" or it can be recovered by re-IPLing the system. Without the fix applied, the I/O can be recovered by re-IPLing the system.
EH350_172_038 / FW350.G1 06/23/16	Impact: Availability Severity: SPE System firmware changes that affect all systems A security problem was fixed in OpenSSL for a possible service processor reset on a null pointer de-reference during RSA PSS signature verification. The Common Vulnerabilities and Exposures issue number is CVE-2015-3194. System firmware changes that affect certain systems On systems with dedicated processor partitions, a problem was fixed for the dedicated processor partition becoming intermittently unresponsive. The problem can be circumvented by changing the partition to use shared processors. This is a follow-on to the fix provided in 350.G0 for a different issue for delays in dedicated processor partitions that were caused by low I/O utilization.
EH350_171_038 / FW350.G0 02/05/16	Impact: Security Severity: SPE System firmware changes that affect all systems A problem was fixed for some service processor error logs not getting reported to the OS partitions as needed. The service processor was not checking for a successful completion code on the error log message send, so it was not doing retries of the send to the OS when that was needed to ensure that the OS received the message. For systems with an invalid P-side or T-side in the firmware, a problem was fixed in the partition firmware Run-Time Abstraction Services (RTAS) so that system Vital Product Data (VPD) is returned at least from the valid side instead of returning no VPD data. This allows AIX host commands such as lsmcode, lsvpd, and lsattr that rely on the VPD data to work to some extent even if there is one bad code side. Without the fix, all the VPD data is blocked from the OS until the invalid code side is recovered by either rejecting the firmware update or attempting to update the system firmware again. A security problem was fixed for an OpenSSL specially crafted X.509 certificate that could cause the service processor to reset in a denial-of-service (DoS) attack. The Common Vulnerabilities and Exposures issue number is CVE-2015-1789. A security problem was fixed in OpenSSL where a remote attacker could cause an infinite loop on the service processor using malformed Elliptic Curve parameters during the SSL authentication. This would cause the service processor performance problems and also prevent new management console connections from being made. To recover from this attack, a reset or power cycle of the service processor is needed after scheduling and completing a normal shutdown of running partitions. The Common Vulnerabilities and Exposures issue number is CVE-2015-1788. 
A security problem was fixed in the lighttpd server on the service processor where a remote attacker, while attempting authentication, could insert strings into the lighttpd server log file. Under normal operations on the service processor, this does not impact anything because the log is disabled by default. The Common Vulnerabilities and Exposures issue number is CVE-2015-3200. A problem was fixed for the bulk power controller (BPC) not being able to connect to a service processor with Security Mode set to "SSLv3 Disabled". The Advanced System Management Interface (ASMI) is used to change the Security Mode to "SSLv3 Disabled". This highest level of security protection does not allow service processor clients to connect using the SSLv3 protocol. A problem was fixed for a Network boot/install failure using bootp in a network with switches using the Spanning Tree Protocol (STP). A Network boot/install using lpar_netboot on the management console was enhanced to allow the number of retries to be increased. If the user is not using lpar_netboot, the number of bootp retries can be increased using the SMS menus. If the SMS menus are not an option, the STP in the switch can be set up to allow packets to pass through while the switch is learning the network configuration. System firmware changes that affect certain systems On PowerVM systems with dedicated processor partitions with low I/O utilization, the dedicated processor partition may become intermittently unresponsive. The problem can be circumvented by changing the partition to use shared processors.
EH350_166_038 05/14/15	Impact: Availability Severity: SPE System firmware changes that affect all systems A problem was fixed with the fspremote service tool to make it support TLSv1.2 connections to the service processor to be compatible with systems that had been fixed for the OpenSSL Padding Oracle On Downgraded Legacy Encryption (POODLE) vulnerabilities. After the POODLE fix is installed, by default the system only allows secured connections from clients using the TLSv1.2 protocol. A problem was fixed for a partition deletion error on the management console with error code 0x4000E002 and message "...insufficient memory for PHYP". The partition delete operation has been adjusted to accommodate the temporary increase in memory usage caused by memory fragmentation, allowing the delete operation to be successful. A problem was fixed for I/O adapters so that BA400002 errors were changed to informational for memory boundary adjustments made to the size of DMA map-in requests. These DMA size adjustments were marked as UE previously for a condition that is normal. A security problem was fixed in OpenSSL where the service processor would, under certain conditions, accept Diffie-Hellman client certificates without the use of a private key, allowing a user to falsely authenticate. The Common Vulnerabilities and Exposures issue number is CVE-2015-0205. A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) messages. A specially crafted DTLS message could exhaust all available memory and cause the service processor to reset. The Common Vulnerabilities and Exposures issue number is CVE-2015-0206. A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) messages. A specially crafted DTLS message could trigger a null pointer de-reference and cause the service processor to reset. 
The Common Vulnerabilities and Exposures issue number is CVE-2014-3571. A security problem was fixed in OpenSSL to fix multiple flaws in the parsing of X.509 certificates. These flaws could be used to modify an X.509 certificate to produce a certificate with a different fingerprint without invalidating its signature, and possibly bypass fingerprint-based blacklisting. The Common Vulnerabilities and Exposures issue number is CVE-2014-8275. A security vulnerability, commonly referred to as GHOST, was fixed in the service processor glibc functions gethostbyname() and gethostbyname2() that allowed remote users of the functions to cause a buffer overflow and execute arbitrary code with the permissions of the server application. There is no way to exploit this vulnerability on the service processor but it has been fixed to remove the vulnerability from the firmware. The Common Vulnerabilities and Exposures issue number is CVE-2015-0235. A security problem was fixed in OpenSSL where a remote attacker could crash the service processor with malformed Elliptic Curve private keys. The Common Vulnerabilities and Exposures issue number is CVE-2015-0209. A security problem was fixed in OpenSSL where a remote attacker could crash the service processor with a specially crafted X.509 certificate that causes an invalid pointer, out-of-bounds write, or a null pointer de-reference. The Common Vulnerabilities and Exposures issue numbers are CVE-2015-0286, CVE-2015-0287, and CVE-2015-0288. System firmware changes that affect certain systems On a system with redundant service processors, a problem was fixed for an operations panel core dump with SRC B181A0FA during an administrative failover (AFO) of the service processor. On a system with redundant service processors, a problem was fixed for bad pointer reference in the mailbox function during data synchronization between the two service processors. 
The de-reference of the bad pointer caused a core dump, reset/reload, and fail-over to the backup service processor. On systems that have Active Memory Sharing (AMS) partitions, a problem was fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove that leaves a logical memory block (LMB) in an unusable state until partition reboot. On systems with partitions using shared processors, a problem was fixed that could result in latency or timeout issues with I/O devices. A problem was fixed that could result in unpredictable behavior if a memory UE is encountered while relocating the contents of a logical memory block during one of these operations: - Using concurrent maintenance to perform a hot repair of a node. - Reducing the size of an Active Memory Sharing (AMS) pool. A problem was fixed for systems in networks using the Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to prevent network ping errors and boot from network (bootp) failures. The Address Resolution Protocol (ARP) table information on the Juniper aggregated switches is not being shared between the switches and that causes problems for address resolution in certain network configurations. Therefore, the CEC network stack code has been enhanced to add three gratuitous ARPs (ARP replies sent without a request received) before each ping and bootp request to ensure that all the network switches have the latest network information for the system. On systems in IPv6 networks, a problem was fixed for a network boot/install failing with SRC B2004158 and IP address resolution failing using neighbor solicitation to the partition firmware client. For systems with an IBM i load source disk attached to an Emulex-based fibre channel adapter such as F/C #5735, a problem was fixed that caused an IBM i load source boot to fail with SRC B2006110 logged and a message to the boot console of "SPLIT-MEM Out of Room". 
This problem occurred for load source disks that needed extra disk scans to be found, such as those attached to a port other than the first port of a fibre channel adapter (first port requires fewest disk scans). Concurrent hot add/repair maintenance (CHARM) firmware fixes A problem was fixed for concurrent maintenance operations to limit hardware retries on failed hardware so that it can be concurrently repaired. A problem was fixed for concurrent maintenance to prevent a hardware unavailable failure when doing consecutive concurrent remove and add operations to an I/O Hub adapter for a drawer. A problem was fixed for the servicing of a bulk power controller (BPC) that may cause the cross power Static Circuit Breaker (SCB) on the other BPC to trip, leaving the SCB inactivated at the end of the service procedure with a call home SRC 14012A85 or 14012B85 logged.
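The gratuitous ARP enhancement in this entry (three unsolicited ARP replies before each ping and bootp request) can be illustrated with a short sketch of what such a frame contains. This is not the firmware's code: the function names are hypothetical, and the choice of a broadcast destination and broadcast target hardware address is an assumption about a common gratuitous ARP layout. The sketch only builds the frame bytes and does not transmit anything.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_OPCODE_REPLY = 2


def gratuitous_arp_reply(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying an unsolicited ARP reply for (mac, ip)."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType.
    ethernet = BROADCAST_MAC + mac + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                 # hardware type: Ethernet
        0x0800,            # protocol type: IPv4
        6,                 # hardware address length
        4,                 # protocol address length
        ARP_OPCODE_REPLY,  # a reply sent without a request
        mac, ip_bytes,            # sender hardware and protocol address
        BROADCAST_MAC, ip_bytes,  # target: same IP announces the sender's own binding
    )
    return ethernet + arp


def announcements_before_request(mac: bytes, ip: str, count: int = 3) -> list:
    """Return the frames to send ahead of one ping or bootp request."""
    return [gratuitous_arp_reply(mac, ip)] * count
```

Because sender and target protocol addresses are identical, every switch that sees the frame can refresh its entry for the system without having asked for it.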
EH350_163_038 01/08/15	Impact: Security Severity: SPE System firmware changes that affect all systems A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed a man-in-the-middle attacker, via a specially crafted fragmented handshake packet, to force a TLS/SSL server to use TLS 1.0, even if both the client and server supported newer protocol versions. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3511. A security problem was fixed in OpenSSL for formatting fields of security certificates without null-terminating the output strings. This could be used to disclose portions of the program memory on the service processor. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3508. Multiple security problems were fixed in the way that OpenSSL handled Datagram Transport Layer Security (DTLS) packets. A specially crafted DTLS handshake packet could cause the service processor to reset. The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and CVE-2014-3507. A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet with an included Supported EC Point Format extension could cause the service processor to reset. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3509. A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Diffie Hellman (DH) key exchange. A specially crafted handshake packet could cause the service processor to reset. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3510. 
A security problem was fixed in OpenSSL for memory leaks that allowed remote attackers to cause a denial of service (out of memory on the service processor). The Common Vulnerabilities and Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567. A security problem was fixed in OpenSSL for padding-oracle attacks known as Padding Oracle On Downgraded Legacy Encryption (POODLE). This attack allows a man-in-the-middle attacker to obtain a plain text version of the encrypted session data. The Common Vulnerabilities and Exposures issue number is CVE-2014-3566. The service processor POODLE fix is based on a selective disablement of SSLv3 using the Advanced System Management Interface (ASMI) "System Configuration/Security Configuration" menu options. The Security Configuration options of "Disabled", "Default", and "Enabled" for SSLv3 determine the level of protection from POODLE. The management console also requires a POODLE fix for APAR MB03867 (FIX FOR CVE-2014-3566 FOR HMC V7 R7.7.0 SP4 with PTF MH01482) to eliminate all vulnerability to POODLE and allow use of option 1 "Disabled" as shown below: -1) Disabled: This highest level of security protection does not allow service processor clients to connect using SSLv3, thereby eliminating any possibility of a POODLE attack. All clients must be capable of using TLS to make the secured connections to the service processor to use this option. This requires the management console be at a minimum level of HMC V7 R7.7.0 SP4 with POODLE PTF MH01482. -2) Default: This medium level of security protection disables SSLv3 for the web browser sessions to ASMI and for the CIM clients and assures them of POODLE-free connections. But the legacy management consoles are allowed to use SSLv3 to connect to the service processor. This is intended to allow non-POODLE compliant HMC levels to be able to connect to the CEC servers until they can be planned and upgraded to the POODLE compliant HMC levels. 
Running a non-POODLE compliant HMC to a service processor in "Default" mode will prevent the ASMI-proxy sessions from the HMC from connecting as these proxy sessions require SSLv3 support in ASMI. -3) Enabled: This basic level of security protection enables SSLv3 for all service processor client connections. It relies on all clients being at POODLE fix compliant levels to provide full POODLE protection using the TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3 connections. This option is intended for customer sites on protected internal networks that have a large investment in legacy hardware that needs SSLv3 to make browser and HMC connections to the service processor. The level of POODLE protection actually achieved in "Enabled" mode is determined by the percentage of clients that are at the POODLE fix compliant levels.
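The three SSLv3 Security Configuration options in this entry can be summarized as a small decision function. The sketch below is a reading aid only, not the service processor's logic: the enum and function names are hypothetical, and the three client categories are limited to those named in the text above.

```python
from enum import Enum


class SecurityMode(Enum):
    DISABLED = "Disabled"  # option 1: highest protection
    DEFAULT = "Default"    # option 2: medium protection
    ENABLED = "Enabled"    # option 3: basic protection


class Client(Enum):
    ASMI_BROWSER = "web browser session to ASMI"
    CIM = "CIM client"
    LEGACY_HMC = "legacy management console"


def sslv3_allowed(mode: SecurityMode, client: Client) -> bool:
    """Return True if the client may connect to the service processor with SSLv3."""
    if mode is SecurityMode.DISABLED:
        # No client may use SSLv3; every client must be TLS-capable.
        return False
    if mode is SecurityMode.ENABLED:
        # SSLv3 is enabled for all service processor client connections.
        return True
    # Default: SSLv3 is disabled for ASMI browser sessions and CIM clients,
    # but legacy management consoles may still use it.
    return client is Client.LEGACY_HMC
```

Note that in "Default" mode the ASMI-proxy sessions from a non-POODLE compliant HMC still fail, because those sessions depend on SSLv3 support in ASMI, which this mode turns off.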
EH350_159_038 06/25/14	Impact: Security Severity: HIPER New Features and Functions Support was dropped for Secure Socket Layer (SSL) Version 2 and SSL weak and medium cipher suites in the service processor web server (Lighttpd). Unsupported web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if those browsers do not support SSL version 3. Supported web browsers for Power6 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24). System firmware changes that affect all systems HIPER/Pervasive: A security problem was fixed in the OpenSSL Montgomery ladder implementation for the ECDSA (Elliptic Curve Digital Signature Algorithm) to protect sensitive information from being obtained with a flush and reload cache side-channel attack to recover ECDSA nonces from the service processor. The Common Vulnerabilities and Exposures issue number is CVE-2014-0076. The stolen ECDSA nonces could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor. Therefore, the HMC access password for the CEC should be changed after applying this fix. HIPER/Pervasive: A security problem was fixed in the OpenSSL Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) to not allow Heartbeat Extension packets to trigger a buffer over-read to steal private keys for the encrypted sessions on the service processor. The Common Vulnerabilities and Exposures issue number is CVE-2014-0160 and it is also known as the heartbleed vulnerability. The stolen private keys could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor. Therefore, the HMC access password for the CEC should be changed after applying this fix. 
HIPER/Pervasive: A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication. A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224. HIPER/Pervasive: A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments. This could be used to execute arbitrary code on the service processor. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195. HIPER/Pervasive: Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service. These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands. The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198. HIPER/Pervasive: A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221. HIPER/Pervasive: A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange. A specially crafted handshake packet could cause the service processor to reset. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470. 
A problem was fixed that caused the system information LED to be lit without a corresponding SRC and error log for the event. This problem typically occurs when an operating system on a partition terminates abnormally. A security problem was fixed in the service processor Lighttpd web server that allowed denial of service vulnerabilities for the Advanced System Management Interface (ASMI). The Common Vulnerabilities and Exposures issue numbers for this problem are CVE-2011-4362 and CVE-2012-5533. A problem was fixed on the service processor where the Small-Footprint CIM Broker Daemon (SFCBD) process was accessing a null pointer and failing with a core dump, triggering an FSP dump to collect the core. A problem was fixed that caused a security scan of the Advanced System Management Interface (ASMI) to fail. The Lighttpd web server configuration cipher list was updated to improve the security. A security problem in the Secure Socket Layer (SSL) protocol on the service processor was fixed to prevent a man-in-the-middle attack. The Common Vulnerabilities and Exposures issue number is CVE-2011-3389. A security problem was fixed for the Lighttpd web server that allowed arbitrary SQL commands to be run on the service processor of the CEC. The Common Vulnerabilities and Exposures issue number is CVE-2014-2323. A security problem was fixed for the Lighttpd web server where improperly-structured URLs could be used to view arbitrary files on the service processor of the CEC. The Common Vulnerabilities and Exposures issue number is CVE-2014-2324. A problem was fixed that caused a "code accept" during a concurrent firmware installation from the HMC to fail with SRC E302F85C. A security problem was fixed in the service processor TCP/IP stack to discard illegal TCP/IP packets that have the SYN and FIN flags set at the same time. An explicit packet discard was needed to prevent further processing of the packet that could result in a bypass of the iptables firewall rules. 
System firmware changes that affect certain systems On systems using Dynamic Host Configuration Protocol (DHCP) IP addresses, a problem was fixed that caused communication hangs when DHCP client processes were unable to renew their IP addresses. The iptables rules needed to be updated to open DHCP ports 67 and 68 to prevent the DHCP network traffic from being filtered by the service processor. On a system with partitions with redundant Virtual Asynchronous Services Interface (VASI) streams, a problem was fixed that caused the system to terminate with SRC B170E540. The affected partitions include Active Memory Sharing (AMS), encapsulated state partitions, and hibernation-capable partitions. The problem is triggered when the management console attempts to change the active VASI stream in a redundant configuration. This may occur due to a stream reconfiguration caused by Live Partition Mobility (LPM); reconfiguring from a redundant Paging Service Partition (PSP) to a single-PSP configuration; or conversion of a partition from AMS to dedicated memory. On systems involved in a series of consecutive Live Partition Mobility (LPM) operations, a memory leak problem was fixed in the Run-Time Abstraction Services (RTAS) that caused a partition run time AIX crash with SRC 0c20. Other possible symptoms include error logs with SRC BA330002 (RTAS memory allocation failure). On a system with a disk device with multiple boot partitions, a problem was fixed that caused System Management Services (SMS) to list only one boot partition. Even though only one boot partition was listed in SMS, the AIX bootlist command could still be used to boot from any boot partition. For a partition with a 256MB Real Memory Offset (RMO) region size that has been migrated from a Power8 system to Power7 or Power6 using Live Partition Mobility, a problem was fixed that caused a failure on the next boot of the partition with a BA210000 log with a CA000091 checkpoint just prior to the BA210000. 
The fix dynamically adjusts the memory footprint of the partition to fit on the earlier Power systems. On systems with a redundant service processor, a problem was fixed that caused an SRC B150D15E to be erroneously logged after a failover to the sibling service processor. On systems with a redundant service processor and multiple nodes, a problem was fixed where the second of two consecutive Administrative Failovers (AFOs) for the service processor would fail with B181EF9A and B1813918 SRCs reported to the error log. The first AFO of the two is successful. Concurrent hot add/repair maintenance (CHARM) firmware fixes A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail with SRC B181394F.
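The SYN and FIN packet discard described in this entry amounts to a test on the TCP flags byte. A minimal sketch of that test follows, assuming a raw TCP header without options; the function names are hypothetical and this is not the service processor's implementation.

```python
import struct

TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_ACK = 0x10


def tcp_flags(tcp_header: bytes) -> int:
    """Return the flags byte (offset 13) of a raw TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("TCP header must be at least 20 bytes")
    return tcp_header[13]


def is_illegal_syn_fin(tcp_header: bytes) -> bool:
    """True when SYN and FIN are both set, a combination no valid segment uses."""
    flags = tcp_flags(tcp_header)
    return bool(flags & TCP_SYN) and bool(flags & TCP_FIN)


def make_header(flags: int) -> bytes:
    """Build a 20-byte TCP header with the given flags, for demonstration only."""
    # Fields: source port, destination port, sequence, acknowledgment,
    # data offset (5 words), flags, window, checksum, urgent pointer.
    return struct.pack("!HHIIBBHHH", 1024, 443, 0, 0, 5 << 4, flags, 65535, 0, 0)
```

A filter built on this check would drop the packet as soon as is_illegal_syn_fin() returns True, before any later rule can match on one of the two flags in isolation.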
EH350_149_038 07/25/13	Impact: Availability Severity: SPE System firmware changes that affect all systems A problem was fixed that caused the wrong MCM (processor module) to be called out for certain types of failures. A problem was fixed that caused the managed system to go to the incomplete state on the management console after a partition was deleted. A problem was fixed that caused an error log generated by the partition firmware to show conflicting firmware levels. This problem occurs after a firmware update or a logical partition migration (LPM) operation on the system. The firmware was enhanced to display on the management console the correct number of concurrent live partition mobility (LPM) operations that are supported. A problem was fixed that caused the state of the Host Ethernet Adapter (HEA) port to be reported as down when the physical port is actually up. A problem was fixed that caused the partition target of a logical partition migration (LPM) to have its UTC time shifted forward from the actual time on the source partition. A problem was fixed that caused an HMC code update failure for the FSP on the accept operation with SRC B1811402, or caused the FSP to be unable to boot on the updated side. System firmware changes that affect certain systems On systems with I/O towers attached, a problem was fixed that caused multiple service processor reset/reloads if the tower was continuously sending invalid System Power Control Network (SPCN) status data. On a partition with a large number of potentially bootable devices, a problem was fixed that caused the partition to fail to boot with a default catch, and SRC BA210000 may also be logged. On systems running AIX or Linux, a problem was fixed that caused the operating system to halt when an InfiniBand Host Channel Adapter (HCA) fails or malfunctions. On systems running Active Memory Sharing (AMS) partitions, a timing problem was fixed that may occur if the system is undergoing AMS pool size changes. 
A problem was fixed that caused a migrated partition to reboot during transfer to a VIOS 2.2.2.0, and later, target system. A manual reboot would be required if transferred to a target system running an earlier VIOS release. Migration recovery may also be necessary. Concurrent hot add/repair maintenance (CHARM) firmware fixes A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail with SRC B181C350. On systems running multiple IBM i partitions that are configured to communicate with each other via virtual Opticonnect, concurrent hot add/repair maintenance operations may time-out. When this problem occurs, a platform reboot may be required to recover.
EH350_143_038 01/09/13	Impact: Function Severity: ATT System firmware changes that affect all systems A problem was fixed that caused the hypervisor to be left in an inconsistent state after a partition create operation failed. A problem was fixed that caused the hypervisor to become unresponsive and the managed system to go to the incomplete state on the management console. A problem was fixed that caused the service processor to fail to boot after a concurrent firmware update; this causes a system crash. System firmware changes that affect certain systems A problem was fixed that prevented the HMC command "lshwres" from showing any I/O adapters if any adapter name contained the ampersand character in the VPD. The Power Hypervisor was enhanced to ensure better synchronization of vSCSI and NPIV I/O interrupts to partitions. On systems running AIX or Linux, a problem was fixed that caused a partition to fail to boot with SRC CA260203. This problem also can cause concurrent firmware updates to fail. Concurrent hot add/repair maintenance (CHARM) firmware fixes A problem was fixed that caused the Hypervisor to become unresponsive during a concurrent maintenance operation.
EH350_132_038 07/27/12	Impact: Availability Severity: SPE New Features and Functions Support was added for live partition mobility between systems running Ex350 system firmware and 8246-L2S systems. System firmware changes that affect certain systems On systems booting from an NPIV (N-port ID virtualization) device, a problem was fixed that caused the boot to intermittently terminate with the message "PReP-BOOT: unable to load full PReP image.". This problem occurs more frequently on the IBM V7000 Storage System running the SAN Volume Controller (SVC), but not on every boot. A problem was fixed that caused the system to checkstop (SRC B114E550) after logging an unrecoverable error (SRC B70069F4) on an I/O hub. On systems on which Internet Explorer (IE) is used to access the Advanced System Management Interface (ASMI) on the Hardware Management Console (HMC), a problem was fixed that caused IE to hang for about 10 minutes after saving changes to network parameters on the ASMI. On systems running the AIX operating system, a problem was fixed that caused the hypervisor to crash with SRC B7000103, after an HEA (Host Ethernet Adapter) error was logged, when there is a lot of AIX activity on the HEAs.
EH350_126_038 05/02/12	Impact: Availability Severity: HIPER - High Impact/PERvasive, Should be installed as soon as possible. System firmware changes that affect all systems The firmware was enhanced to log SRCs BA180030 and BA180031 as informational instead of predictive. A problem was fixed that caused uncorrectable SRC B1818A10 to be erroneously logged after a successful concurrent firmware installation. The firmware was enhanced to increase the threshold of soft NVRAM errors on the service processor to 32 before SRC B15xF109 is logged. (Replacement of the service processor is recommended if more than one B15xF109 is logged per week.) System firmware changes that affect certain systems HIPER/Pervasive: On systems with PCI adapters in a feature code (F/C) 5803 or 5873 I/O drawer assigned to a Virtual I/O Server (VIOS), and on systems with the I/O adapters in a CEC drawer assigned to a VIOS, a problem was fixed that caused the system to crash with SRC B700F103. A problem was fixed that caused the hypervisor to hang during a concurrent operation on a F/C 5802, 5803, 5873 or 5877 I/O drawer. Recovering from the hypervisor hang required a platform reboot. On systems performing Live Partition Mobility (LPM), a problem was fixed that caused a partition to crash if the following sequence of operations is performed: 1. The partition is configured with, and is using, more than 1 dedicated processor. 2. The partition is migrated using LPM from a POWER6 to a POWER7 platform. 3. At any time following the migration from POWER6 to POWER7, one or more of the dedicated processors is removed from the partition using a Dynamic Logical Partitioning (DLPAR) operation. Once these 3 steps have been done, a partition crash is likely if either: - The partition is subsequently migrated to any other platform (POWER6 or POWER7) using LPM, or - The partition is resumed from hibernation. 
A problem was fixed that caused the output of the AIX command "uname -m" to be incorrect on the POWER7 system after a successful Live Partition Migration (LPM) operation from a POWER6 to a POWER7 system. A problem was fixed that caused booting from a virtual fibre channel tape device to fail with SRC B2008105. Concurrent hot add/repair maintenance (CHARM) firmware fixes A problem was fixed that caused unrecoverable SRCs B1813918 and B182953C during a CHARM operation.
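The Live Partition Mobility crash condition described in the EH350_126_038 entry can be restated as a simple check. The following is a minimal Python sketch, not IBM tooling: the `PartitionHistory` record, its field names, and the action strings are all hypothetical and exist only to illustrate the three-step sequence and the two triggers named in the entry.

```python
from dataclasses import dataclass


@dataclass
class PartitionHistory:
    """Hypothetical record of what has happened to one partition."""
    dedicated_processors: int            # step 1: count in use before the migration
    migrated_p6_to_p7: bool              # step 2: LPM from POWER6 to POWER7
    dlpar_removed_processor_after: bool  # step 3: DLPAR removal after that migration


# Hypothetical labels for the two follow-up actions named in the entry.
TRIGGERS = {"lpm_migrate", "resume_from_hibernation"}


def sequence_completed(h: PartitionHistory) -> bool:
    """True when all three steps from the fix description have occurred."""
    return (
        h.dedicated_processors > 1
        and h.migrated_p6_to_p7
        and h.dlpar_removed_processor_after
    )


def crash_likely(h: PartitionHistory, next_action: str) -> bool:
    """True when the sequence is complete and next_action is a named trigger."""
    return sequence_completed(h) and next_action in TRIGGERS
```

On firmware older than EH350_126, a partition for which `sequence_completed` is true would, under these assumptions, be a candidate for avoiding a further LPM migration or hibernation resume until the fix is installed.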
EH350_120_038 11/09/11	Impact: Availability Severity: HIPER - High Impact/PERvasive, Should be installed as soon as possible. System firmware changes that affect all systems A problem was fixed that caused the system to terminate when rebooting after the power was removed, then reapplied. A problem was fixed that caused the message "IPL: 500 - Internal Server Error" to be displayed when the Hardware Management Console option was selected (which is under the System Information option) on the Advanced System Management Interface (ASMI). On systems running more than 100 logical partitions, a problem was fixed that caused a concurrent firmware installation to fail. A problem was fixed that caused a system's partition dates to revert back to 1969 after the service processor or its battery was replaced. This occurred regardless of whether or not the service processor's time-of-day (TOD) clock was correctly set during the service action. A problem was fixed that caused a partition migration operation to abort when the partition has more than 4096 virtual slots. A problem was fixed that caused the message "500 - Internal Server Error." to be displayed when a setting was changed on the Advanced System Management Interface's (ASMI's) power on/off menu, when the change was attempted when the system was powering down. A problem was fixed that caused booting or installing a partition or system from a USB device to fail with error code BA210012. This usually occurs when an operating system (OS) other than the OS that is already on the partition or system is being booted or installed. On the System Management Services (SMS) remote IPL (RIPL) menus, a problem was fixed that caused the SMS menu to continue to show that an Ethernet device is configured for iSCSI, even though the user has changed it to BOOTP. A problem was fixed that caused a firmware installation from the HMC with the "do not auto accept" option selected to fail. 
The field replaceable unit (FRU) call-out list for clock card failures was enhanced to reduce the number of parts replaced. A problem was fixed that caused the bulk power controller (BPC) to erroneously log SRCs B181843C and B181EF88, and a PWR dump to be generated. System firmware changes that affect certain systems HIPER/Non-Pervasive: On systems running Active Memory Sharing (AMS) with a F/C 5803 or 5873 I/O drawer attached, a problem was fixed that caused the system to crash with SRC B170E540 after a warm boot or platform dump IPL. On systems running a virtual I/O (VIO) partition, or using a Shared Ethernet Adapter (SEA), a problem was fixed that caused a severe performance degradation. On systems running IBM i partitions, a problem was fixed that caused changing the processor weight on an IBM i partition to 255 to have no effect. On systems using the utility capacity on demand (COD) feature, a problem was fixed that prevented the hypervisor from correctly crediting the time used when the sequence number of the activation code reached certain values. On systems with an iSCSI network, a problem was fixed that caused the system to hang when booting from an iSCSI device in the system management services (SMS) menus. On systems with an iSCSI network, when booting a logical partition using that iSCSI network, a problem was fixed that caused the iSCSI gateway parameter displayed on the screen to be incorrect. It did not impact iSCSI boot functionality. On systems using fibre channel adapters, the firmware was enhanced by the addition of a new option in the system management services (SMS) Multiboot menu that facilitates zoning of physical and virtual fibre channel adapters. On systems with external I/O drawers, the firmware was enhanced such that SRCs 10001B02 and 1000911C place a call home. 
On systems with external InfiniBand or PCI-E drawers or towers, a problem was fixed that caused the system to crash with SRC B7000103 if the I/O hub adapter crashed at the same time an external drawer or tower was being initialized. Concurrent hot add/repair maintenance (CHARM) firmware fixes On partitions running Red Hat Linux 6.1, a problem was fixed that caused a node evacuation operation to fail. HIPER/Non-Pervasive: On systems with a F/C 5803 or 5873 I/O drawer attached, a problem was fixed that caused the system to crash with SRC B170E540 after a warm boot or platform dump IPL. A problem was fixed that caused the host Ethernet adapters (HEA) to be in a non-functional state after a hot node add.
EH350_108_038 07/07/11	Impact: Availability Severity: HIPER - High Impact/PERvasive, Should be installed as soon as possible. System firmware changes that affect all systems A problem was fixed that caused some of the extended error log data to be parsed incorrectly. This problem only occurs on systems with a large number of deconfigured components. System firmware changes that affect certain systems HIPER: On systems running VIOS, a problem was fixed that caused the system to crash with SRC B700F103. On systems running shared processor partitions, a problem was fixed that caused a partition to hang until powered off and back on. Concurrent maintenance (CM) firmware fixes The firmware was enhanced to allow the concurrent replacement of the secondary service processor even if the service processor redundancy policy is set to "disabled".
EH350_107_038 06/06/11	Impact: Availability Severity: ATT New Features and Functions Support for the attachment of a System Director Management Console (SDMC). System firmware changes that affect all systems PARTITION-DEFERRED: A problem was fixed that prevented virtual LANs (VLANs) in a VIOS with partition ID of 1 from being displayed as bootable devices in the system management services (SMS) menus. A problem was fixed that prevented a hardware management console (HMC) from being permanently disconnected using the Advanced System Management Interface (ASMI) menus. A problem was fixed that prevented the timed-power-on command from turning the system back on if the service processor's clock was adjusted to an earlier time. Adjustment of the service processor's clock could have been done through the operating system or the Advanced System Management Interface (ASMI). This problem could occur during the fall when clocks are set back when daylight saving time ends, for example. A problem was fixed that caused certain service processor error log entries with a severity of "predictive", and a failing subsystem of "service processor firmware", to be erroneously converted to "informational". A problem was fixed that caused the HMC2 port on the advanced system management interface (ASMI) to erroneously default to static IP addressing instead of dynamic. A problem was fixed that caused a firmware installation to fail with SRC B181EF7C. A problem was fixed that prevented processor resources from being moved to another partition by a DLPAR (dynamic LPAR) operation. A problem was fixed that prevented partitions from booting. The firmware was enhanced to list the attached devices when viewing the adapter information for a partition profile on the HMC GUI. A problem was fixed that could cause the target partition to crash after a successful P6 to P7 partition migration. 
Possible AIX error log entries include: label: DSI_PROC, resource: SYSVMM, with description: "DATA STORAGE INTERRUPT, PROCESSOR". Other partition-related crash descriptors may also be logged. A problem was fixed that could cause AIX error log entries following a successful partition migration. Possible AIX error log entries include: label: RTAS_ERROR, resource: sysplanar0, with description: "INTERNAL ERROR CODE". Other errors may also be logged. A problem was fixed that caused a partition to crash with SRC BA330002 after several concurrent installations of system firmware, or partition migrations, without a reboot. A problem was fixed that caused multiple DR_DMA_MIGRATE_FAIL entries in the AIX error log. A problem was fixed that caused the installation of some versions of Linux to fail. A problem was fixed that caused a partition migration or partition hibernation operation to hang with the partition left in the "suspending" state. The firmware was enhanced to log SRC B1768B76 as informational instead of unrecoverable. A problem was fixed that caused the platform to become unresponsive; this was indicated by an incomplete state on the HMC. When this problem occurred, the partitions on the managed system became unresponsive. A problem was fixed that caused the managed system to go to the incomplete state on the HMC. The firmware was enhanced to log a predictive SRC if the Ethernet cables are misplugged (swapped) on a node controller. The firmware was enhanced such that a call home is not made when an error logged by the system controller or the network controller is informational, or recovered, and the reset/reload bit is set. The field replaceable unit (FRU) list for SRC B158C004, which indicates a clock card failure, was enhanced to include additional parts. On systems with a F/C 5803 or 5873 I/O expansion drawer, a problem was fixed that caused SRC B7006907 to be erroneously logged. 
System firmware changes that affect certain systems On systems running Active Memory Sharing (AMS), a problem was fixed that caused an AMS partition to crash with SRC B700F103. This problem may occur when reducing the size of the AMS pool (or doing a hot node repair on a model MMB or MHB) at the same time as dynamically creating an AMS partition, or changing an AMS partition's maximum memory. On systems using logical host Ethernet adapter (LHEA) ports, a problem was fixed that caused the activation of a partition that is using an LHEA logical port (LPORT) to hang at C2008104, and the HMC to show an Incomplete status for the system. On systems using capacity on demand (CoD), a problem was fixed that caused multiple informational B7005300 SRCs to be logged, which caused the error log to wrap, and predictive and unrecoverable SRC data to be lost. On systems running IBM i partitions, a problem was fixed in which IBM i network installation capability was not reported correctly to the HMC after installation of the firmware service pack that enabled this function without rebooting the managed system. On systems running IBM i partitions, a problem was fixed that caused a RAID array of SCSI disks to be exposed if an MES upgrade was done, or a system plan was created. On systems and partitions running IBM i, a problem was fixed that caused the operating system to use excessive processor cycles. Concurrent Maintenance (CM) firmware fixes On systems with five or more nodes and consecutive hub numbers, a problem was fixed that caused B6005121 to be erroneously logged (and call home) after a concurrent maintenance operation on a node.
EH350_103_038 02/21/11	Impact: Data Severity: HIPER - High Impact/PERvasive, Should be installed as soon as possible. System firmware changes that affect certain systems HIPER: IBM testing has uncovered a potential undetected data corruption issue when a mobility operation is performed on an AMS (Active Memory Sharing) partition. The data corruption can occur in rare instances due to a problem in IBM firmware. This issue was discovered during internal IBM testing, and has not been reported on any customer system. IBM recommends that systems running on EH340_075 or later move to EH350_103 to pick up the fix for this potential problem. (Firmware levels older than EH340_075 are not exposed to the problem.) On systems with a F/C 5803 or 5873 I/O drawer attached, a problem was fixed that caused a partition to crash during a page migration operation. A problem was fixed that caused a partition to crash with SRC BA330002 after several concurrent installations of system firmware, or partition migrations, without a reboot. A problem was fixed that caused AIX licensing issues when migrating a partition from a P6 to a P7 system. On systems running IBM i partitions, a problem was fixed that caused SRC BA040030 to be erroneously logged, and a call home to be made, even though the partition booted successfully.
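The exposure window stated in the EH350_103_038 entry (EH340_075 or later is exposed, older levels are not, and EH350_103 carries the fix) can be written as a small check. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, only the EH340 and EH350 releases are handled, and every EH340 level at or after 075 is flagged because the entry names only EH350_103 as containing the fix. Whether any later EH340 level also contains it is not stated there.

```python
import re
from typing import Tuple

# Matches the leading "EHrrr_nnn" part of a level such as "EH340_075_039".
LEVEL_RE = re.compile(r"^(EH\d{3})_(\d{3})")


def parse_level(level: str) -> Tuple[str, int]:
    """Split a level string into (release, service pack number)."""
    m = LEVEL_RE.match(level)
    if m is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    return m.group(1), int(m.group(2))


def needs_ams_mobility_fix(level: str) -> bool:
    """True if the level falls in the window the entry describes as exposed."""
    release, service_pack = parse_level(level)
    if release == "EH340":
        # Levels older than EH340_075 are stated to be unexposed.
        return service_pack >= 75
    if release == "EH350":
        # EH350_103 is the level named as carrying the fix.
        return service_pack < 103
    # The entry says nothing about other releases, so refuse to guess.
    raise ValueError(f"release {release} is outside the scope of this sketch")
```

For example, `needs_ams_mobility_fix("EH350_085_038")` returns `True`, while `needs_ams_mobility_fix("EH350_103_038")` returns `False`.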
EH350_085_038 10/26/10	Impact: Availability Severity: HIPER - High Impact/PERvasive, Should be installed as soon as possible. System firmware changes that affect all systems HIPER: A problem was fixed that caused the HMC to show the server's status as incomplete, and SRC B7000602 to be logged against SFLPHMCCMDTASK in serviceable events. This problem can also cause the system to crash when it occurs. HIPER: A problem was fixed that caused repeated reset/reloads of the service processor, and fail-overs, to occur after a hypervisor-initiated reset/reload of the service processor was completed. That led to loss of communication between the service processor and the hypervisor (indicated by SRC B182951C). A problem was fixed that caused a CEC node to unexpectedly lose power, and caused a system crash. Systems that had a service processor role change at server power off are exposed to this issue. A problem was fixed that caused SRC B181440D to be erroneously logged. The firmware was enhanced to log SRC B181D30B as informational instead of predictive. The firmware was enhanced to list the attached devices when viewing the adapter information for a partition profile on the HMC GUI. A problem was fixed that caused the hypervisor to issue almost continuous reset/reload requests to the service processor. System firmware changes that affect certain systems The firmware was enhanced to support the network installation of the IBM i operating system from the hardware management console (HMC) command line interface (CLI). On systems using the IPv6 protocol, a problem was fixed that caused valid link local and unique link local addresses to be erroneously invalidated. This prevented the port with that address from being used for network boot or network installation.
EH350_071_038 06/30/10	Impact: Usability Severity: SPE System firmware changes that affect all systems DEFERRED: A problem was fixed that could result in a system checkstop while running floating point computations. Although this is a high-impact problem, it has a very low probability of occurring. A problem was fixed that caused a call home to be erroneously made with SRC B181E911, and a service processor dump to be taken unnecessarily. A problem was fixed that caused the HMC to show a status of "Incomplete" for the managed system, and numerous service processor dumps to be generated. The firmware was enhanced to improve the callouts for certain types of processor failures that log SRC B1xxE504. The firmware was enhanced to make a hidden error log visible when a failing GX adapter caused errors to be logged by the processor run-time (PRDF) diagnostics. The firmware was enhanced to improve the callouts when NVRAM corruption is detected in the bulk power controller's (BPC's) service processor. On systems running EH350_xxx firmware, a problem was fixed that prevented the reset/reload bit from being set correctly in a service processor error log entry. A problem was fixed that caused SRC B181E617 to be erroneously logged and a service processor dump to be unnecessarily generated. System firmware changes that affect certain systems On systems running the IBM i operating system, a problem was fixed that caused a DLPAR move operation with an IOP (I/O processor) and IOA (I/O adapter) to fail intermittently. The DLPAR operation was successful, but the IOA failed to power on in the new partition. Concurrent maintenance (CM) firmware fixes A problem was fixed that would cause a concurrent maintenance operation to fail if the HMC was rebooted before the previous CM operation was complete. 
On systems with F/C 5803 or F/C 5873 I/O drawers attached and a boot device in the drawer, a problem was fixed that prevented a partition from booting after the concurrent repair of the GX adapter that connects the 5803 or 5873 drawer to the system, or to the node that contains the GX adapter.
EH350_049_038 03/10/10	Impact: Serviceability Severity: HIPER System firmware changes that affect all systems HIPER: A problem was fixed that caused the system to crash if the server was running AIX and had a F/C 5802 or 5877 drawer (in a 19" rack), or F/C 5803 or 5873 drawer (in a 24" rack), attached. DEFERRED: This fix corrects the handling of a specific processor instruction sequence that has the potential to result in undetected data errors. This specific instruction sequence has only been observed in a small number of highly tuned Floating Point intensive applications. However, it is strongly recommended that this fix be applied to all POWER6 systems. This fix has the potential to decrease system performance on applications that make extensive use of floating point divide, square root, or estimate instructions. A problem was fixed that prevented an SRC from being recorded in the service processor dump produced by a host-initiated reset. A problem was fixed that prevented the repair of a deconfigured system controller from being completed successfully. A problem was fixed that caused SRC 10009135, followed by 10009139, to be erroneously logged. These SRCs indicate a system power control network (SPCN) loop is being broken, then re-established. The firmware was enhanced to allow a temporary threshold reduction for processor unit book interconnect predictive errors. A problem was fixed that caused a reset/reload of a network controller. A problem was fixed that, under certain rare circumstances, caused a partition to hang when being shut down. A problem was fixed that caused the system to hang with SRCs B182953C, B182954C and B17BE434 being logged. The firmware was enhanced to detect and handle 12X InfiniBand I/O drawer cabling errors better. A problem was fixed that, under certain rare circumstances, caused the system to become unresponsive and appear to hang when page migration occurred on a PCIe slot. 
System firmware changes that affect certain systems A problem was fixed that caused a virtual SCSI or virtual fibre channel adapter to be seen by the operating system as not bootable when it was added to a partition using a dynamic LPAR (DLPAR) operation. On systems running IBM i, a problem was fixed that caused booting the operating system from a fibre channel device to fail with SRC 576B8301. On systems with a F/C 5802 or 5877 drawer attached, a problem was fixed that could impact the performance of a 4-port Ethernet adapter F/C 5272, 5275, 5279, 5280, 5525, 5526, or 5527 installed in that drawer. In partitions running AIX or Linux, a problem was fixed that caused the addition of an I/O slot to a partition using a dynamic LPAR (DLPAR) add operation to fail. On systems with shared processors, a problem was fixed that caused the partitions to hang and become unresponsive for very short periods of time. A problem was fixed that prevented the IPv6 DHCP address from being displayed on the advanced system management interface (ASMI) network configuration screens when IPv6 and DHCP were enabled. This only occurred on systems with virtual LAN (VLAN) addresses (such as eth0.30, eth0.31), and when IPv6 addresses were assigned to the eth0.xx interface. On systems running redundant VIOS partitions, a problem was fixed that prevented Ethernet traffic from being properly bridged between the two partitions. This problem also prevented shared Ethernet adapter failover from working correctly. A problem was fixed that caused the hypervisor to loop unnecessarily and consume too many processor cycles. This impacted the performance of the system. Concurrent maintenance (CM) firmware fixes The firmware was enhanced such that if an Ethernet cable is misplugged on a node controller during a concurrent node add operation, the node add operation will be completed successfully. A problem was fixed that prevented the concurrent repair of a redundant system controller. 
A problem was fixed that caused unpredictable system behavior if a capacity on demand (CoD) or a virtualization engine technology (VET) activation code was entered and accepted after a node 0 evacuation was done. The unpredictable machine behavior might also have occurred, if a node 0 evacuation failed, a system dump was taken, and a memory-preserving IPL was then initiated. A problem was fixed that caused a concurrent maintenance operation after a node evacuation to fail. When this problem occurred, the system erroneously states that a platform memory dump is pending. A problem was fixed that prevented a concurrent maintenance operation from completing successfully.
EH350_038_038 10/30/09	Impact: Function Severity: Special Attention New Features and Functions: Support for the concurrent repair of a system controller. Support for the concurrent removal of 12X-attached 24" I/O drawers. Support for a USB-attached half-high 5.25" backup device using a removable hard disk drive (HDD). Support for a platform dump that is not disruptive. Support for i5/OS multipath storage I/O through VIOS partitions. System firmware changes that affect all systems A problem was fixed that might cause a concurrent firmware maintenance (CFM) operation to fail repeatedly, or a concurrent maintenance (CM) operation to fail repeatedly, when a large number of I/O loop errors were being logged during the CFM operation. The firmware was enhanced to handle system dumps (SYSDUMPs) larger than 4GB in size. On systems running system firmware release EH340, a problem was fixed that caused a dynamic LPAR (DLPAR) operation on memory to fail until the platform was rebooted. The firmware was enhanced to improve the performance of the F/C 5732, 5735, and 5769 PCI-E adapters. The firmware was enhanced such that SRCs B181F126, B181F127, and B181F129 are correctly logged, and unnecessary calls home are no longer made for these SRCs. A problem was fixed that caused a repair and verify (R&V) operation on the hardware management console (HMC) to fail with the message "Exception encountered while rendering panel as HTML". The firmware was enhanced such that a generic B1817201 SRC will no longer be logged when a cache error occurs on a node controller (NC). Unique SRCs will now be logged for cache failures, and upper and lower thresholds have been added to the NC cache error logging scheme. The firmware was enhanced to improve the field replaceable unit (FRU) callouts for SRCs B1xxC004 and B1xxC005. A problem was fixed that might cause the system to crash with SRC B181E504, then SRC B1813909, being logged. 
The firmware was enhanced to more accurately describe the reason memory was deconfigured on the advanced system management interface (ASMI) memory deconfiguration screen. The firmware was enhanced such that when a certain type of hardware failure occurs in a bulk power controller (BPC), the appropriate errors will be logged instead of SRCs B1818601 and B1818611, which indicate a firmware failure. On systems using the HEA (host Ethernet adapter), also known as the Integrated Virtual Ethernet (IVE) function, a problem was fixed that caused link failures if the HEA was connected to certain third-party Ethernet switches. A problem causing an unexpected increment in the Pxs_TXIME register, but not affecting network performance, was also fixed. Concurrent maintenance (CM) firmware fixes A problem was fixed that caused SRC B181A494 to be erroneously logged if a concurrent maintenance operation took longer than 60 minutes. On systems with 24" I/O drawers, a problem was fixed that might cause a partition to crash, with a system reboot required for recovery, when a F/C 5797 or 5798 drawer was concurrently added. On systems with four drawers, a problem was fixed that caused the system controller to perform a reset/reload, which caused a concurrent maintenance operation to fail, on the fourth node (P4). A problem was fixed that caused the concurrent replacement of an InfiniBand GX adapter or I/O planar to fail if a partition owned an embedded device on the planar.
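Each entry in this history opens with the same header pattern: a level identifier, a date in MM/DD/YY form, and an Impact keyword. The following is a minimal Python sketch for pulling those fields out of an entry line, for example to sort or filter the history. The `EntryHeader` type and `parse_header` function are hypothetical names, and the third numeric field of the level identifier is kept as an opaque number because this document does not define it.

```python
import re
from datetime import date, datetime
from typing import NamedTuple


class EntryHeader(NamedTuple):
    release: str      # e.g. "EH350"
    level: int        # second field of the identifier, e.g. 143
    third_field: int  # third field, meaning not defined in this document
    released: date
    impact: str       # single keyword following "Impact:"


HEADER_RE = re.compile(
    r"^(?P<release>EH\d{3})_(?P<level>\d{3})_(?P<third>\d{3})"
    r"(?:\s*/\s*\S+)?"                    # optional alias such as "/ FW350.H0"
    r"\s+(?P<date>\d{2}/\d{2}/\d{2})"     # MM/DD/YY
    r"\s+Impact:\s*(?P<impact>\S+)"
)


def parse_header(line: str) -> EntryHeader:
    """Extract the header fields from the start of one entry line."""
    m = HEADER_RE.match(line)
    if m is None:
        raise ValueError("line does not start with a recognizable entry header")
    released = datetime.strptime(m.group("date"), "%m/%d/%y").date()
    return EntryHeader(
        release=m.group("release"),
        level=int(m.group("level")),
        third_field=int(m.group("third")),
        released=released,
        impact=m.group("impact"),
    )
```

The Severity field is deliberately not parsed here because its wording varies between entries (for example "ATT", "Attention", "Special Attention", and a longer HIPER phrase).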


EH340 For Impact, Severity and other Firmware definitions, Please refer to the below 'Glossary of firmware terms' url: http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
EH340_132_039 12/01/10	Impact: Availability Severity: SPE System firmware changes that affect all systems A problem was fixed that caused the HMC to show a status of "Incomplete" for the managed system, and numerous service processor dumps to be generated. A problem was fixed that caused the hardware called out for SRC B1xxF201 to be incorrect. The firmware was enhanced to log SRC B181D30B as informational instead of predictive. The firmware was enhanced to list the attached devices when viewing the adapter information for a partition profile on the HMC GUI. A problem was fixed that caused the HMC2 port on the advanced system management interface (ASMI) to erroneously default to static IP addressing instead of dynamic. System firmware changes that affect certain systems A problem was fixed that prevented the timed-power-on function from turning the system back on if the service processor's clock was adjusted to an earlier time. This problem could occur during the fall when clocks are set back when daylight saving time ends, for example. A problem was fixed that caused a partition to fail to reboot, or fail to boot if it had been shut down once since the platform was booted, with SRC B2001230 and word 3 = 000000BF. This failure can be seen on a partition that owns a PCI, PCI-E, or PCI-X slot.
EH340_122_039 05/19/10	Impact: Availability Severity: ATT System firmware changes that affect all systems DEFERRED: This fix corrects the handling of a specific processor instruction sequence that has the potential to result in undetected data errors. This specific instruction sequence has only been observed in a small number of highly tuned floating point-intensive applications. However, it is strongly recommended that this fix be applied to all POWER6 systems. This fix has the potential to decrease system performance on applications that make extensive use of floating point divide, square root, or estimate instructions. A problem was fixed that prevented an SRC from being recorded in the service processor dump produced by a host-initiated reset. A problem was fixed that caused a reset/reload of a node controller. A problem was fixed that caused the system to become unresponsive and appear to hang when page migration occurred on a PCIe slot. The firmware was enhanced to improve the callouts for certain types of processor failures that log SRC B1xxE504. The firmware was enhanced to improve the callouts when NVRAM corruption is detected in the bulk power controller's (BPC's) service processor. System firmware changes that affect certain systems A problem was fixed that caused a virtual SCSI or virtual fibre channel adapter to be seen by the operating system as not bootable when it was added to a partition using a dynamic LPAR (DLPAR) operation. In partitions running AIX or Linux, a problem was fixed that caused the addition of an I/O slot to a partition using a dynamic LPAR (DLPAR) add operation to fail. On systems running redundant VIOS partitions, a problem was fixed that prevented Ethernet traffic from being properly bridged between the two partitions. This problem also prevented shared Ethernet adapter failover from working correctly. 
A problem was fixed that caused the system to crash with SRC B7000103 when a concurrent maintenance operation was performed on an I/O slot directly from a partition (using AIX SMIT or IBM i HST). A problem was fixed that caused a system or partition running Linux to crash when the "serv_config -l" command was run. On systems running active memory sharing (AMS), the firmware was enhanced so that error messages indicating "out of compliance" issues with the memory (HMC SRC HSCL031F) will not be generated if the user allocates more memory than is installed in the system. (Allocating more memory than is installed in the system is supported in active memory sharing.) On systems using InfiniBand switches for processor clustering, a problem was fixed that caused InfiniBand ports to intermittently drop out. A problem was fixed that caused the hypervisor to loop unnecessarily and consume too many processor cycles. This impacted the performance of the system. Concurrent maintenance (CM) firmware fixes A problem was fixed that caused the concurrent addition of a node to fail with SRC B181A422. A problem was fixed that caused unpredictable system behavior if a capacity on demand (CoD) or a virtualization engine technology (VET) activation code was entered and accepted after a node 0 evacuation was done. The unpredictable machine behavior might also have occurred, if a node 0 evacuation failed, a system dump was taken, and a memory-preserving IPL was then initiated. A problem was fixed that caused a concurrent maintenance operation after a node evacuation to fail. When this problem occurred, the system erroneously states that a platform memory dump is pending. A problem was fixed that prevented a concurrent maintenance operation from completing successfully. 
On systems with F/C 5803 or F/C 5873 I/O drawers attached and a boot device in the drawer, a problem was fixed that prevented a partition from booting after the concurrent repair of the GX adapter that connects the 5803 or 5873 drawer to the system, or to the node that contains the GX adapter.
EH340_112_039 12/16/09	Impact: Serviceability Severity: HIPER System firmware changes that affect all systems HIPER: A problem was fixed that might cause the system to crash if the server is running AIX and has a F/C 5802 or 5877 drawer (in a 19" rack), or F/C 5803 or 5873 drawer (in a 24" rack), attached. On systems with a lot of memory, the firmware was enhanced to reduce the time partition migrations take from hours to minutes. A problem was fixed that might cause the system to crash with SRC B181E504, then SRC B1813909, being logged. The firmware was enhanced such that SRCs B181F126, B181F127, and B181F129 are correctly handled, and no longer cause unnecessary calls home to be made. The firmware was enhanced such that SRC B1817201, when generated by a bulk power controller (BPC), is correctly handled. A problem was fixed that caused the system to hang with SRCs B182953C, B182954C, and B17BE434 being logged. A problem was fixed that caused SRC 10009135, followed by 10009139, to be erroneously logged. These SRCs indicate a system power control network (SPCN) loop is being broken, then re-established. The firmware was enhanced to allow a temporary threshold reduction for processor unit book interconnect predictive errors. System firmware changes that affect certain systems On a single system running Oracle in multiple partitions, with multiple IBM LHCAs connected in the same subnet, a problem was fixed that caused the remaining partitions to lose their reliable datagram socket (RDS) heartbeat connections after the reboot of a single partition. There is a greater probability of encountering this problem if the partition being rebooted has a large partition memory assigned to it. On systems using the HEA (host Ethernet adapter), also known as the Integrated Virtual Ethernet (IVE) function, a problem was fixed that caused link failures if the HEA was connected to certain third-party Ethernet switches. 
A problem causing an unexpected increment in the Pxs_TXIME register, but not affecting network performance, was also fixed. Concurrent maintenance (CM) firmware fixes On systems with four nodes, a problem was fixed that caused the system controller to perform a reset/reload, which caused a concurrent maintenance operation to fail, on the fourth node (P4). A problem was fixed that caused the concurrent replacement of an InfiniBand GX adapter or I/O planar to fail if a partition owned an embedded device on the planar. The firmware was enhanced such that if an Ethernet cable is misplugged on a node controller during a concurrent node add operation, the node add operation will be completed successfully.

EH340_101_039

09/23/09

Impact: Serviceability Severity: Attention

System firmware changes that affect all systems

DEFERRED: The firmware was enhanced to reduce the number of correctable errors (CEs) being erroneously logged against the memory bus with SRC B124E504.
The firmware was enhanced such that SRC B181F126 is correctly managed, and no longer calls home unnecessarily for this problem.

EH340_095_039

08/20/09

Impact: Function Severity: HIPER

System firmware changes that affect all systems

DEFERRED: This fix corrects the handling of a specific processor instruction sequence that was generated on a particular heavily-tuned High Performance Computing (HPC) application. This specific instruction sequence has the potential to produce an incorrect result. This instruction sequence has only been observed in a single HPC application. However, it is strongly recommended that you apply this fix.
The firmware was enhanced such that a generic B1817201 SRC will no longer be logged when a cache error occurs on a node controller (NC). Unique SRCs will now be logged for cache failures, and upper and lower thresholds have been added to the NC cache error logging scheme.

System firmware changes that affect certain systems

HIPER for systems with F/C 5803 or 5873 drawers attached: A problem was fixed that prevented node concurrent maintenance operations on systems with F/C 5803 or 5873 drawers attached to them.
On systems with F/C 5802 or 5877 drawers attached, a problem was fixed that prevented an I/O slot's power LED from accurately reflecting the state of the I/O slot in a 5802 or 5877 drawer, under certain circumstances.
A problem was fixed that under certain rare circumstances caused a partition to crash when a 24" InfiniBand I/O drawer (feature code 5797 or 5798) was concurrently added. When this problem occurred, rebooting the system was required to recover.
On systems running system firmware EH340_075 and Active Memory Sharing, a problem was fixed that might have caused a partition to lose I/O entitlement after the partition was moved from one system to another using PowerVM Mobility.
On systems running system firmware EH340_075 and Active Memory Sharing, a problem was fixed that might have caused a partition to fail to boot with SRC B700F103 if the partition had more than 24 virtual processors assigned to it.
On systems running system firmware release EH340, a problem was fixed that might have caused the I/O performance to be degraded if a node evacuation operation was performed (as part of a concurrent maintenance operation to fix a failing I/O adapter or drawer) after the repair was complete.
On systems with external I/O towers attached, the firmware was enhanced so that the system will not crash when SRC B7006981 is logged for certain types of I/O hardware failures.

Concurrent maintenance (CM) firmware fixes

A problem was fixed that might have caused the performance of an I/O loop (attached to a 12X I/O adapter) to be degraded if a B7006982, B7006984, B7006985, B70069F2, B70069F3, or B70069F4 SRC is logged after a concurrent maintenance operation on that loop.
A problem was fixed that caused concurrent maintenance operations on memory DIMMs to fail if the replacement DIMMs were functionally equivalent to the original DIMMs, but did not have the same CCIN (customer card identification number).
A problem was fixed that caused B1xxB889 SRCs to be erroneously logged during a node evacuation operation. (Node evacuation is one step in a concurrent maintenance operation on a node.)
A problem was fixed that caused the system to crash during a hot node or GX adapter repair with certain hardware configurations.
A problem was fixed that caused replacement of a system controller with power off, and the system at standby, to fail.
A problem was fixed that caused the system to crash during a hot node repair or upgrade.

EH340_075_039

05/26/09

Impact: Function Severity: HIPER

New features and functions:

- DEFERRED: Support for F/C 5803 (24" I/O drawer) and F/C 5873 (diskless 24" I/O drawer). Attention: After this level of firmware is installed, the platform must be powered off, then powered on, before the 5803 or 5873 I/O drawer is added to the system.
- DEFERRED: Support for PowerVM Active Memory Sharing. Attention: After this level of firmware is installed, the platform must be powered off, then powered on to activate the PowerVM Active Memory Sharing function. Attention: If EH340_075 has been installed, and the new PowerVM Active Memory Sharing function has been activated, and you want to back-level the system firmware, the active memory sharing pool must be deactivated and deleted prior to back-leveling the system firmware. IBM does not recommend back-leveling the system firmware.

System firmware changes that affect all systems:

HIPER: A problem was fixed that caused a system to fail to reboot after a B1xxE504 SRC was logged, due to a processor interconnection bus failure. The same SRC, B1xxE504, was logged when the reboot failed.
A problem was fixed that caused non-terminating SRCs (such as B1818A1E) that indicate registry read errors to be logged during a disruptive installation of system firmware.
A problem was fixed that prevented the system from powering on after the "reset service processor settings" or "reset all settings" option was selected in the advanced system management interface (ASMI) menus.
A problem was fixed that caused the detailed data at the end of an "early power off warning type 5" AIX error log entry to be filled with invalid data instead of zeros.
A problem was fixed that caused the secondary system controller to reset/reload with SRC B1xxB741 being logged, if the system controller lost the communication path to one of the node controllers.
A problem was fixed that prevented all of the necessary files from being synchronized between the primary and the secondary service processors. One possible symptom of this problem was the time-of-day clocks being out of synch after a service processor failover.
A problem was fixed that caused SRC B1818601 to be logged, and a service processor dump to be generated, at runtime.
A problem was fixed that caused the number of empty GX adapter slots displayed by the advanced system management interface (ASMI) to be incorrect.
A problem was fixed that prevented a newly installed 12X I/O adapter from being recognized if the system controller was at standby, and the newly installed adapter was a 12X RIO adapter and the previous adapter was a 12X InfiniBand adapter, or vice-versa.
The firmware was enhanced so that SRC B1xxE458 (with word 6=0000E42B) will be logged as informational instead of generating a call home.
The firmware was enhanced such that error logs with relevant information will be created when a system crashes under certain circumstances, rather than a generic SRC (B1813410), with very little debug information, being logged.
A problem was fixed that caused the system to hang when terminating if the system had been in power save mode.
The firmware was enhanced so that if the secondary system controller remains hung after the primary system controller successfully boots, a predictive error will be logged, and a call home will be made.
A problem was fixed that caused SRC B181D312 to be logged, and a call home to be made, when a bulk power controller (BPC) and a hardware management console (HMC) are temporarily disconnected.
The firmware was enhanced such that if an attempt is made to enable redundancy when the system is booting, the error log entry that is made will be informational instead of predictive.
The firmware was enhanced so that a call home will be made if the hypervisor issues a "terminate immediate" interrupt.
A problem was fixed that caused SRC 11001D12 to be erroneously logged when the system was booting.
A problem was fixed that caused incorrect field replaceable unit (FRU) part numbers to be returned for the BPA scroll assembly, UEPO panel and the CEC MDA scroll assembly.
The firmware was enhanced so that the service processor only logs SRC B1A38B24 when a valid network set up error is found. The callouts for this SRC were also improved.
The firmware was enhanced so that SRCs B181720D, B1818A13, and B1818A0F, and occasionally a service processor dump, will not be generated when the service processor's two Ethernet interfaces are on the same subnet. (This is an invalid configuration.)

System firmware changes that affect certain systems:

In systems using InfiniBand switches for processor clustering, a problem was fixed that caused packets to be dropped under certain circumstances.
On systems running firmware release EH340, a problem was fixed that caused data in the platform dump to be invalid.
On systems with five or more nodes, a problem was fixed that prevented the identify LED function from turning on the correct node's LED.
On systems with a large number of I/O drawers, a communication problem was fixed that caused unnecessary system controller failovers, unnecessary reset/reloads, and unnecessary dumps, and SRC B181F105 to be logged.
On systems with a large number of I/O drawers, the firmware was enhanced to reduce the boot time.

Concurrent maintenance (CM) firmware fixes:

DEFERRED: A problem was fixed that caused SRC B150A422 to be erroneously logged, and the advanced system management interface (ASMI) to erroneously show deconfigured processor cores, if system firmware was installed while a node was deactivated due to a concurrent maintenance operation.
DEFERRED: A problem was fixed that caused SRC B181B171 to be logged, and the system to crash, during a concurrent node repair or concurrent GX adapter repair.
A problem was fixed that prevented a concurrent add or repair of a GX adapter from being re-attempted if a reset/reload of the primary system controller occurred during the GX add part of the initial procedure.
A problem was fixed that might cause a concurrent node repair, a concurrent I/O expansion unit repair, a concurrent PCI slot repair, or a DLPAR removal or moving of I/O slots to fail if the I/O hardware involved is in a failed state.
A problem was fixed that caused a hot node repair operation to fail if 16GB huge pages were configured on the system.
On systems using on/off (temporary) memory capacity on demand (COD), the firmware was enhanced to improve memory COD's interaction with other tools (such as Inventory Scout in AIX), and to make the billing process easier.
A problem was fixed that caused a concurrent node add or repair operation to fail if the operation immediately followed an upgrade of system firmware from EH330_xxx to EH340_039, then a concurrent installation of EH340_061.

EH340_061_039

04/20/09

Impact: Function Severity: Special Attention

System firmware changes that affect all systems:

DEFERRED: A problem was fixed that caused the advanced system management interface (ASMI) menus to become unresponsive, and the system to appear to hang, when a GX adapter slot reservation was attempted when the system was at service processor standby.
A problem was fixed that caused the service processor diagnostics to report a "TOD (time-of-day) overflow" error, instead of an uncorrectable memory error, when failures occurred on memory DIMMs.
A problem was fixed that prevented the service processor from automatically booting from the permanent (or P) side if the temporary (or T) side of the firmware flash was corrupted. When the problem occurred, the service processor stopped instead of booting from the P side.
A problem was fixed that might have caused the system to crash when a processor was dynamically removed when the system was running. If the system is running the EH340 release of system firmware, this problem can also occur during a concurrent maintenance operation.
The firmware was enhanced such that data corruption in the Anchor (VPD) will be corrected by the firmware, rather than having to have the Anchor card replaced.
A problem was fixed that caused non-terminating SRCs (such as B1818A1E) that indicate registry read errors to be logged during a disruptive installation of system firmware.
A problem was fixed that prevented the system from powering on after the "reset to factory settings" option was selected in the advanced system management interface (ASMI) menus.
The firmware was enhanced to improve the service processor's capability to recover from bad bits in the flash memory. A predictive error, or an unrecoverable error, will be logged against the card that contains the system firmware if the number of correctable or uncorrectable errors exceeds the threshold.
A problem was fixed that caused a partition being migrated to crash on the target system.
On systems running the EH340 release of system firmware, a problem was fixed that caused an abort code to be logged in the Virtual I/O Server (VIOS) error log on the source system after a successful partition migration.
A problem was fixed that caused a partition being migrated to become unresponsive on the target system when firmware-assisted dump was enabled.
The firmware was enhanced so that SRC BA210012 will not generate a call home when logged.
The callouts for SRC B181E6ED, which is logged when a system is booted with service processor redundancy disabled, were improved to indicate that redundancy was disabled rather than calling out a firmware failure.
A problem was fixed that caused hardware to be deconfigured when the system encountered network errors, even though the SRCs were being logged as informational.
A problem was fixed that prevented all of the necessary files from being synchronized between the primary and secondary service processors. One possible symptom of this problem was the time-of-day clocks being out of synch after a service processor failover.

System firmware changes that affect certain systems:

On systems with firmware release EH340 installed, a problem was fixed that caused a system firmware installation to fail with SRC E302F9D3 being erroneously logged.
On systems with 16GB DIMMs and firmware release EH340 installed, a problem was fixed that prevented the concurrent replacement of a distributed converter assembly (DCA) in a processor node.
On systems with external I/O drawers, a problem was fixed that could cause the system to hang on checkpoint C700406E during a "warm" reboot (a reboot in which the processor drawer is power-cycled but the I/O drawers are not).
On systems running system firmware release EH340 and IBM i partitions, a problem was fixed that caused message CPF9E7F, CPF9E2D or CPF9E5E (which indicates a licensing key problem) to be received by the IBM i partitions when the number of physical processors was greater than the number of IBM i licenses.
On systems with virtual fiber channel disks, a problem was fixed that prevented the system management services (SMS) from displaying the virtual fiber channel disks if the virtual fiber channel server reported that any of them were reserved.

Concurrent maintenance (CM) firmware fixes

DEFERRED: On systems running system firmware release EH340, a problem was fixed that caused the system to checkstop during the "hot add" of a GX I/O adapter card.
A problem was fixed that caused a concurrent maintenance operation to be halted with SRC B181A433 being logged.
A problem was fixed that caused concurrent maintenance operations, if attempted immediately after a disruptive firmware installation, to be disabled.
A problem was fixed that caused SRC B150D15E to be erroneously logged during a concurrent node addition or concurrent memory upgrade.
On systems with five or more processor nodes, a problem was fixed that caused the identify LED function to turn on the wrong node's LED.
A problem was fixed that caused a concurrent processor add operation, after a disruptive installation of system firmware, to fail with SRC B181A422 being logged.
A problem was fixed that caused concurrent maintenance operations, if attempted immediately after a concurrent firmware installation, to be disabled.
A problem was fixed that caused a concurrent node add to fail after a disruptive firmware installation with SRC B181A422 being logged.
A problem was fixed that prevented a concurrent add or repair of a GX adapter from being re-attempted if a reset/reload of the primary system controller occurred during the GX add part of the initial procedure.

EH340_039_039

11/21/08

Impact: Function Severity: Attention

New Features and Functions:

Support for concurrent processor node addition, as well as hot and cold node repair.
Support for up to 30 feature code 5791, 5797, 5798, 5807, 5808, and 5809 I/O drawers in two powered I/O racks, with the limitation that no more than 12 of those 30 drawers can be feature codes 5791, 5797, 5798, 5807, 5808, and 5809.
Support for migrating memory DIMMs from POWER5 model 59x systems to model FHA systems.
Support for concurrently connecting an I/O rack to a model FHA system.
Support for the 8GB fiber channel adapter, F/C 5735.
Support for a virtual tape device.
Support for USB flash memory storage devices.
Support in the system controller firmware for IPv6.
Support in the hypervisor for three types of hardware performance monitors.
Support for installing AIX and Linux using the integrated virtualization manager (IVM).
On systems running AIX, support was added for an enhanced power and thermal management capability. When static power save mode is selected, AIX will "fold" processors to free processors which can then be put in the "nap" state.

System firmware changes that affect all systems:

A problem was fixed that prevented the default partition environment in the advanced system management interface (ASMI) power on/off menu from being set to "i5/OS" when it was blank.
The firmware was enhanced so that SRC B1xx3409, which indicates an invalid state change (such as pushing the power on button twice quickly) will be logged as informational instead of predictive, and will not call home.
A problem was fixed that caused a service processor dump to be taken and SRC B181EF88 to be logged, even though the operation of the system was not affected.
On systems that are managed by a hardware management console (HMC), a problem was fixed that, under certain rare circumstances, caused SRC B181E411 to be logged, a call home to be made, and a service processor dump to be taken.
The firmware was enhanced so that SRC B1812224, which indicates that the user attempted to enable redundancy when the managed system was booting, will be logged as informational instead of predictive.
A problem was fixed that prevented error log entries on the secondary service processor (or system controller) from generating a serviceable event on the hardware management console (HMC).
A problem was fixed that, under certain rare circumstances, caused SRC B1754202 to be erroneously logged (as a predictive error with a call home) after a disruptive firmware installation.
A problem was fixed that caused SRC B1818A0F to be erroneously logged during a firmware installation when service processor (or system controller) failover is disabled.
A problem was fixed that prevented the machine type and model data from being added to a node controller's error log entries.
On systems with external I/O frames, a problem was fixed that might have prevented the firmware from "unthrottling" processors after entering power save mode.

System firmware changes that affect certain systems

On systems with the integrated x-series adapter (IXA), a problem was fixed that prevented the creation of a system plan on the HMC.
On systems with multiple host channel adapter (HCA) cards, a problem was fixed that caused logical ports on the HCA cards to be intermittently inactive.
In networks using a time server, a problem was fixed that caused the date on a client system to be reset to 1969 if the client system lost power.

EH330
For Impact, Severity, and other firmware definitions, refer to the 'Glossary of firmware terms' URL below:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs

EH330_104_034

04/26/10

Impact: Availability Severity: ATT

System firmware changes that affect all systems

DEFERRED: This fix corrects the handling of a specific processor instruction sequence that has the potential to result in undetected data errors. This specific instruction sequence has only been observed in a small number of highly tuned floating point-intensive applications. However, it is strongly recommended that this fix be applied to all POWER6 systems. This fix has the potential to decrease system performance on applications that make extensive use of floating point divide, square root, or estimate instructions.
A problem was fixed that caused SRC B181440C to be erroneously logged, and a call home to be erroneously made, during the installation of system firmware.
A problem was fixed that caused SRC B1818A0A to be erroneously logged during a concurrent firmware update.
The firmware was enhanced such that SRCs B181F126, B181F127, and B181F129 are correctly logged, and no longer cause unnecessary calls home to be made.
The firmware was enhanced so that SRC B181720D, and occasionally a service processor dump, will not be generated when the service processor's two Ethernet interfaces are on the same subnet. (This is an invalid configuration.)
In partitions running AIX or Linux, a problem was fixed that, under certain rare circumstances, caused the addition of an I/O slot to a partition using a dynamic LPAR (DLPAR) add operation to fail.
A problem was fixed that caused the system to hang with SRCs B182953C, B182954C and B17BE434 being logged.
A problem was fixed that caused SRC B1818902 to be erroneously logged during a firmware installation.
A problem was fixed that caused a reset/reload of a node controller.

System firmware changes that affect certain systems

On partitions running AIX or Linux, a problem was fixed that caused a dynamic LPAR (DLPAR) operation that adds an I/O slot to fail.
On systems running redundant VIOS partitions, a problem was fixed that prevented Ethernet traffic from being properly bridged between the two partitions. This problem also prevented shared Ethernet adapter failover from working correctly.
On systems using InfiniBand switches for processor clustering, a problem was fixed that caused InfiniBand ports to intermittently drop out.

EH330_095_034

08/31/09

Impact: Usability Severity: HIPER

System firmware changes that affect all systems

DEFERRED: This fix corrects the handling of a specific processor instruction sequence that was generated on a particular heavily-tuned High Performance Computing (HPC) application. This specific instruction sequence has the potential to produce an incorrect result. This instruction sequence has only been observed in a single HPC application. However, it is strongly recommended that you apply this fix.
HIPER: A problem was fixed that caused the migration of a partition using shared processors to fail with a reason code of 4180043, or caused the source system to hang or crash.
A problem was fixed that caused SRC 1000911B to be erroneously logged during a reset/reload of the service processor.

System firmware changes that affect certain systems

On systems with 7311-D11, 7314-G30, 5790, or 5796 19" drawers attached, a problem was fixed that caused SRC 10009138 to be erroneously logged.

Concurrent maintenance (CM) firmware fixes

A problem was fixed that caused SRC B7005603 to be erroneously logged when a F/C 5802 or 5877 19" drawer was concurrently added to the system.

EH330_092_034

05/18/09

Impact: Usability Severity: Special Attention

System firmware changes that affect all systems:

DEFERRED: A problem was fixed that caused the advanced system management interface (ASMI) menus to become unresponsive, and the system to appear to hang, when a GX adapter slot reservation was attempted when the system was at service processor standby.
The firmware was enhanced to improve the service processor's capability to recover from bad bits in the flash memory. A predictive error, or an unrecoverable error, will be logged against the card that contains the system firmware if the number of correctable or uncorrectable errors exceeds the threshold.
A problem was fixed that prevented the service processor from automatically booting from the permanent (or P) side if the temporary (or T) side of the firmware flash was corrupted. When the problem occurred, the service processor stopped instead of booting from the P side.
The firmware was enhanced so that SRC B1xxE458 (with word 6=0000E42B) will be logged as informational instead of generating a call home.
A problem was fixed that caused non-terminating SRCs (such as B1818A1E) that indicate registry read errors to be logged during a disruptive installation of system firmware.
The firmware was enhanced to improve the field replaceable unit (FRU) callouts when a clock failure occurs.
A problem was fixed that caused a partition being migrated to become unresponsive on the target system when firmware-assisted dump was enabled.
The callouts for SRC B181E6ED, which is logged when a system is booted with service processor redundancy disabled, were improved to indicate that redundancy was disabled rather than calling out a firmware failure.
A problem was fixed that caused hardware to be deconfigured when the system encountered network errors, even though the SRCs were being logged as informational.
A problem was fixed that caused the detailed data at the end of an "early power off warning type 5" AIX error log entry to be filled with invalid data instead of zeros.
A problem was fixed that caused a partition being migrated to crash on the target system.
A problem was fixed that might cause a system to crash with SRC B170E504 when a processor was dynamically deconfigured.
The firmware was enhanced such that when data is written to the VPD (Anchor) card, the results are verified, resulting in fewer VPD cards being replaced.
A problem was fixed that prevented all of the necessary files from being synchronized between the primary and the secondary system controllers. One possible symptom of this problem was the time-of-day clocks being out of synch after a system controller failover.
A problem was fixed that caused SRC B1818601 to be logged, and a service processor dump to be generated, at runtime.

System firmware changes that affect certain systems:

In systems using InfiniBand switches for processor clustering, a problem was fixed that caused packets to be dropped under certain circumstances.
On systems with five or more nodes, a problem was fixed that prevented the identify LED function from turning on the correct node's LED.

EH330_076_034

12/05/08

Impact: Serviceability Severity: HIPER

System firmware changes that affect all systems:

DEFERRED and HIPER: The system initialization settings were changed to reduce the likelihood of a system crash under extremely rare circumstances.
HIPER: A problem was fixed that caused a system to fail to reboot after a B1xxE504 SRC was logged, due to a processor interconnection bus failure. The same SRC, B1xxE504, was logged when the reboot failed.
A problem was fixed that caused SRC 11001D1x to be erroneously logged during system boot.
A problem was fixed that, if a platform dump occurred, might have caused a reset/reload of the service processor and corrupted the platform dump.
A problem was fixed that caused incorrect field replaceable unit (FRU) part numbers to be returned for the BPF scroll assembly, UEPO panel and the CEC MDA scroll assembly.
A problem was fixed that prevented the system from rebooting if an error occurred during a memory-preserving IPL.
The firmware was enhanced so that if a system with redundant system controllers is booted with redundancy disabled, a call home error will be logged.
The firmware was enhanced so that a call home will be made if the hypervisor issues a "terminate immediate" interrupt.
A problem was fixed that prevented service processor and hypervisor error log entries from being reported to the operating system after a successful partition migration. This problem only affected the partition that was migrated.
On systems running AIX or Linux, a problem was fixed that, under certain rare circumstances, might cause the operating system to crash.
A problem was fixed that, in certain configurations, caused the removal of a host Ethernet adapter (HEA) port to fail when using a dynamic LPAR (DLPAR) operation.
A problem was fixed that, under certain rare circumstances, caused the hypervisor to crash when it was booting with SRC B6000103 being logged.
A problem was fixed that, under certain circumstances, prevented the operating system from recovering a PCI-E adapter on which a temporary enhanced error handling (EEH) error occurred.
A problem was fixed that, under certain rarely occurring circumstances, caused the system to crash if an L2 or L3 cache failure was not discovered and repaired when it initially occurred.
A problem was fixed that caused the service processor diagnostics to call out a processor as the failing item, instead of the memory DIMMs, when a large number of memory error correction coding (ECC) errors occurred.
A problem was fixed that prevented the system from powering on after the "reset to factory settings" option was selected in the advanced system management interface (ASMI) menus.
A problem was fixed that caused the wrong field replaceable unit (FRU) to be called out when SRC B152F109, which indicates a problem with the NVRAM in a bulk power controller (BPC), was logged.
A problem was fixed that might cause a default catch to occur when booting from an iSCSI device.

System firmware changes that affect certain systems:

On systems with a host Ethernet adapter (HEA) or host channel adapter (HCA) assigned to a Linux partition, a problem was fixed that prevented the partition from booting if 512 GB, 1 TB, or 1.5 TB of memory was assigned to the partition. When this problem occurred, SRC B700F105 was logged.
In systems with clustered processors, various problems were fixed in the InfiniBand interconnection networks.
A problem was fixed that, under certain circumstances, caused an AIX or Linux partition to fail to boot with SRC D200E0AF being logged.
On systems with external I/O frames, a problem was fixed that might have prevented the firmware from "unthrottling" processors after entering power save mode.

EH330_046_034

08/28/08

Impact: Function Severity: HIPER

System firmware changes that affect all systems:

DEFERRED and HIPER: A problem was fixed in which, under certain rarely occurring circumstances, an application could cause a processor to go into an error state, and the system to crash.
HIPER: A problem was fixed that caused the system to terminate abnormally with SRC B131E504.
HIPER: A problem was fixed that might cause a partition to crash during a partition migration before the migration was complete.
DEFERRED: Enhancements were made to the system firmware to reduce the system boot time on power up.
DEFERRED: A problem was fixed in which, under certain rare circumstances, the new secondary system controller was not able to communicate with the system after a system controller failover occurred.
DEFERRED: A problem was fixed that caused SRC B1608CB0 to be logged if a separate I/O frame is attached to the CEC frame.
A problem was fixed that caused multiple instances of SRC B1818A03 and B1818A0A to be logged erroneously, and multiple calls home to be made, during a frame connection reset.
A problem was fixed that caused SRC B1819506 to be erroneously generated, and a call home to be made, when service processor (or system controller) error log entries were generated faster than they could be processed.
A problem was fixed that caused the hardware management console (HMC) to show an "Incomplete" state after it attempted to read a file with an incorrect size from the service processor (or system controller). This problem also occurred if the "factory configuration" option was used on the advanced system management interface (ASMI) menus.
Enhancements were made to the firmware to improve the FRU callouts for certain types of failures of the time-of-day clock circuitry.
A problem was fixed that prevented a dump file larger than 4 GB from being successfully off-loaded to the hardware management console (HMC).
On systems with redundant bulk power controllers, a problem was fixed that caused the hardware management console (HMC) to get stuck at "Pending Authentication" for one of the bulk power controllers (BPCs).
On systems with I/O drawers attached, a problem was fixed that might have caused some I/O slots in the drawers not to be configured when the system was booted.
In systems with clustered processors, various problems were fixed in the InfiniBand interconnection networks.
A problem was fixed that caused the location codes of the external InfiniBand ports on a 5791 I/O drawer with the InfiniBand interface to be reported incorrectly on the HMC.
A problem was fixed that caused SRC B7006971 to be generated because the firmware was incorrectly performing operations on PCI-Express I/O adapters during dynamic LPAR (DLPAR) operations on memory.
A problem was fixed that might have caused an out-of-memory condition in the hypervisor, with SRC B7000200 being logged.
A problem was fixed in the thermal management firmware that caused SRCs B1812635 and B1812636 to be logged, and the system or node to run in low power mode when it should have been in nominal, or nominal when it should have been in low power mode.
A problem was fixed that caused SRC B1818A10 to be erroneously generated after a successful installation of system firmware.
A problem was fixed that caused the AIX commands "lsmcode" and "diag" to fail after a partition migration.
A problem was fixed that caused the message "BA330000malloc error!" to be displayed on the operating system console after a partition migration, even though SRC BA330000 had not been logged. When this problem occurred, the partition migration appeared to be successful. However, a process within the partition was either hung or had failed, and in most cases the partition had to be rebooted to fully recover.
A problem was fixed that caused the status of the connection between the hardware management console (HMC) and the service processor to be set to an invalid state. This might cause problems when the HMC and service processor tried to communicate.
A problem was fixed that caused partitions that were being rebooted to hang at D200E0AF after a concurrent firmware update under certain circumstances.
A problem was fixed that prevented the replacement of a system controller from completing successfully if the system controller had been guarded out prior to its replacement.
A problem was fixed that caused the system controller to go through an unnecessary reset/reload cycle when a checkstop occurred or the system was powered off.
Enhancements were made to the firmware to improve the FRU callouts for certain types of failures of the node controller.
A problem was fixed that caused predictive SRC B181EF88 to be logged when, under certain circumstances, a system controller failover occurred at runtime.
A problem was fixed in which, if redundancy was disabled and the emergency power off (EPO) switch was then used to power off the system, redundancy was erroneously enabled when the system came back up.
Enhancements were made to the firmware to improve the FRU callouts for certain types of failures of a node controller.
A problem was fixed that caused the service processor (or system controller) to lose its communication link with the hypervisor, and SRC A181D000 to be logged, under certain rare circumstances.
On systems using virtual shared processor pools (VSPP), a problem was fixed that caused the number of processors assigned to the partitions to be reduced after a memory-preserving IPL.

EH330_034_034

06/10/08

Impact: Function Severity: HIPER

This level is a disruptive update from the prior level, EH330_018. The system should be powered off before installing this level of system firmware. If this level is installed when the system is running, the CECs will be rebooted, causing all partitions to be terminated, and a reboot will be required.
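
The level names in this history (for example EH330_034_034) can be read as release, service pack level, and last disruptive service pack level. The sketch below shows one way to check whether an update would be disruptive. It assumes the commonly documented rule that an update is disruptive when the installed service pack level is lower than the last disruptive level of the target, or when the release changes. The names FirmwareLevel, parse_level, and is_disruptive are hypothetical and are not part of any IBM tool.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # e.g. "EH330"
    service_pack: int     # e.g. 34
    last_disruptive: int  # e.g. 34


# Pattern for names such as "EH330_046_034" (assumed format: PPNNN_FFF_DDD).
_LEVEL_RE = re.compile(r"^([A-Z]{2}\d{3})_(\d{3})_(\d{3})$")


def parse_level(text: str) -> FirmwareLevel:
    """Split a level name into release, service pack, and last disruptive level."""
    match = _LEVEL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a firmware level name: {text!r}")
    return FirmwareLevel(match.group(1), int(match.group(2)), int(match.group(3)))


def is_disruptive(installed: FirmwareLevel, target: FirmwareLevel) -> bool:
    """Return True if moving from installed to target would be disruptive.

    Assumption: a release change is always disruptive, and within a release
    the update is disruptive when the installed service pack level is lower
    than the target's last disruptive service pack level.
    """
    if installed.release != target.release:
        return True
    return installed.service_pack < target.last_disruptive


if __name__ == "__main__":
    ga = parse_level("EH330_018_018")
    sp1 = parse_level("EH330_034_034")
    sp2 = parse_level("EH330_046_034")
    print(is_disruptive(ga, sp1))   # update from the GA level to EH330_034_034
    print(is_disruptive(sp1, sp2))  # update from EH330_034_034 to EH330_046_034
```

Under that assumed rule, an update from EH330_018_018 to EH330_034_034 is reported as disruptive, which agrees with the note above, while an update from EH330_034_034 to EH330_046_034 is not.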

System firmware changes that affect all systems:

HIPER: A problem was fixed that caused a concurrent firmware installation to hang with SRC BA00E840 being logged. This problem may also cause a partition migration to hang, under certain circumstances, with the same SRC, BA00E840, being logged. This SRC will be logged when this level of firmware is installed and will generate a call home; it should be ignored. It will not be logged during subsequent installations.
HIPER: The processor initialization settings were changed to reduce the likelihood of a processor going into an error state and causing a checkstop or system crash.
HIPER: A problem was fixed that, under certain circumstances, caused a system termination during a service processor failover.
HIPER: A problem was fixed that caused large numbers of enhanced error handling (EEH) errors to be logged against the 4-port gigabit Ethernet adapter, F/C 5740, under certain circumstances.
HIPER: On systems with redundant system controllers installed and enabled, a problem was fixed that might cause a communications hang between the two system controllers. When this occurred, it triggered a reset/reload of the primary system controller, and the resulting fail-over to the secondary system controller failed in such a way that the system crashed.
Several problems were fixed that might cause one or both of the clock cards to be deconfigured, and erroneously called out as bad, when the system boots up from the power-off state.
A problem was fixed that caused the /tmp directory on the system controllers and the service processor in the bulk power controller (BPC) to fill up, which resulted in an out-of-memory condition. When this problem occurred, the system controllers or service processor in the BPC usually performed a reset/reload. This is one possible cause of SRC B1817201 being logged.
A problem was fixed that prevented the "i5/OS enable/disable" setting (in the ASMI power on/off menu) from taking effect when the system is booted. This solution requires the system to be booted up to hypervisor standby twice after the setting is changed to "enabled". This will be fixed in a future service pack to remove the requirement for the second boot to hypervisor standby.
A problem was fixed that caused the firmware to receive a false error indication when reading the registers on the LED controller. SRC B1811340 was logged when this happened.
A problem was fixed that prevented an error fail-over to the secondary system controller from completing successfully.
A problem was fixed that might have caused a system firmware installation to fail with SRC B18138B7 being logged.
A problem was fixed that caused an error log to be generated that called out system controller A (Un-P1-C2), instead of the correct callout, which was system controller B (Un-P2-C5).
A problem was fixed that caused the P1 LED on the front light strip to be on when it should have been off.
A problem was fixed that caused the wrong memory DIMM location to be called out when certain types of failures occurred.
A problem was fixed that might have caused cache chip failures when the system is operating in Power Save mode. Error log entries that might indicate that this problem is occurring include correctable errors and uncorrectable errors in L2, i-cache and d-cache memory, parity errors, and SRC B181E504.
The firmware was enhanced so that the IDs "celogin1" and "celogin2" allow an authorized service provider to log into the bulk power controller (BPC).
A problem was fixed that caused a partition using a host channel adapter (HCA) or host Ethernet adapter (HEA) to appear to hang (with progress code D200C1FF being displayed) before successfully shutting down. The amount of time the partition appeared to hang depended on the amount of memory assigned to the partition and the usage of HCA or HEA.
A problem was fixed that prevented the HMC from connecting to the managed system if the HMC's DHCP server IP range is changed when the managed system is running.
The error logging and FRU callout firmware was enhanced so that if a failure occurs on one or both clock cards, only one will get deconfigured, and the system will continue to try to boot instead of terminating.
The firmware was enhanced to improve the system memory error recovery.
The firmware was enhanced so that the contents of the /tmp directory are included when a service processor dump is taken.
A problem was fixed in the hypervisor that might cause a partition migration to fail.
The firmware was enhanced so that:

A failure when writing VPD to a P6 processor will cause the node to be deconfigured rather than terminating the system.
The failure of a VPD write operation will not corrupt the VPD table. Such corruption may lead to unnecessary system down-time and unnecessary FRU replacement.

System firmware changes that affect certain systems:

On systems using QLogic InfiniBand switches, a problem was fixed that caused the PortInfo:linkWidthActive and PortInfo:linkSpeedActive values to be stored inaccurately and shown incorrectly in the display of subnet parameters.

EH330_018_018

05/13/08

Impact: New Severity: New

GA Level