Power7 System Firmware Fix History - Release levels AM710, AM720, AM730, AM780


Firmware Description and History

AM780
For Impact, Severity, and other firmware definitions, refer to the 'Glossary of firmware terms' at the following URL:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
AM780_097_040 / FW780.83

03/05/18
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY

Impact:  Availability      Severity:  SPE

System firmware changes that affect certain systems

  • On systems running IBM i partitions at IBM i V6R1 or V7R1 at less than TR5, a problem was fixed for IBM i partitions failing to boot with SRC B600690B.  If the IBM i partition is running, a DLPAR add of I/O may fail.  This problem was introduced with FW780.80, is also present in FW780.81 and FW780.82, and always happens at these levels.  If the update to the fixed firmware level is not wanted, the problem can be resolved by moving up to IBM i 7.1 TR5 or later.
    For more information, see the following IBM Tech Note:  https://www.ibm.com/support/docview.wss?uid=nas8N1022482
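The affected combination described above can be written as a small check.  The following Python sketch is illustrative only: the function name and the way the IBM i level is passed in (a version string plus a Technology Refresh number) are assumptions, not part of any IBM tool.  It also assumes that every V6R1 partition is affected, which is one reading of the text.

```python
# Firmware levels that the text above identifies as carrying the problem.
AFFECTED_FW_LEVELS = {"FW780.80", "FW780.81", "FW780.82"}


def ibmi_boot_problem_applies(fw_level, ibmi_version, tech_refresh=0):
    """Return True if an IBM i partition matches the affected combination.

    fw_level:      for example "FW780.82"
    ibmi_version:  "V6R1" or "V7R1"; any other value is treated as unaffected
    tech_refresh:  Technology Refresh number for V7R1 (hypothetical encoding)
    """
    if fw_level not in AFFECTED_FW_LEVELS:
        return False
    if ibmi_version == "V6R1":
        # Assumption: V6R1 is always below the TR5 threshold.
        return True
    if ibmi_version == "V7R1":
        return tech_refresh < 5
    return False
```

For example, ibmi_boot_problem_applies("FW780.82", "V7R1", 4) is true, while the same partition at TR5 or at FW780.83 is not flagged.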
AM780_096_040 / FW780.82

01/31/18
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY

Impact:  Security      Severity:  SPE

Response for Recent Security Vulnerabilities

  • In response to recently reported security vulnerabilities, this firmware update is being released to address Common Vulnerabilities and Exposures issue number CVE-2017-5715.  In addition, Operating System updates are available to mitigate the CVE-2017-5753 and CVE-2017-5754 security issues. This pertains to the following models:
    1) IBM Power 770 (9117-MMB)
    2) IBM Power 780 (9179-MHB)
    This firmware update also addresses CVE-2017-5715 for IBM i, along with updates for AIX and Linux, for the following models:
    1) IBM Power 770 (9117-MMD)
    2) IBM Power 780 (9179-MHD)
    3) IBM Power ESE (8412-EAD)
AM780_094_040 / FW780.81

01/09/18
Systems 8412-EAD; 9117-MMD; and 9179-MHD ONLY

Impact:  Security      Severity:  SPE

New features and functions

  • In response to recently reported security vulnerabilities, this firmware update is being released to address Common Vulnerabilities and Exposures issue numbers CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754.  Note that a subsequent FW release is required for CVE-2017-5715 for IBM i and will replace this FW update when available.  In addition, Operating System updates are required in conjunction with this FW level for CVE-2017-5753 and CVE-2017-5754.
    The models addressed by this service pack update have the P7+ processor: 
    1) IBM Power 770 (9117-MMD)
    2) IBM Power 780 (9179-MHD)
    3) IBM Power ESE (8412-EAD)
AM780_091_040 / FW780.80

12/13/17
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact:  Availability      Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed for an intermittent core dump of netsCommonMsgServer on the service processor with a serviceable callout for SRC B181EF88.  This problem can be triggered by brief network outages that cause the HMC to disconnect and reconnect to the service processor, causing race conditions in the HMC session shutdowns.
  • A problem was fixed for an invalid date from the service processor causing the customer date and time to go to the Epoch value (01/01/1970) without a warning or chance for a correction.  With the fix,  the first IPL attempted on an invalid date will be rejected with a message alerting the user to set the time correctly in the service processor.  If the warning is ignored and the date/time is not corrected, the next IPL attempt will complete to the OS with the time reverted to the Epoch time and date.  This problem is very rare but it has been known to occur on service processor replacements when the repair step to set the date and time on the new service processor was inadvertently skipped by the service representative.
  • A  problem was fixed for incorrect low affinity scores for a partition reported from the HMC "lsmemopt" command when a partition has filled an entire drawer.  A low score indicates the placement is poor but in this case the placement is actually good.  More information on affinity scores for partitions and the Dynamic Platform Optimizer can be found at the IBM Knowledge Center: https://www.ibm.com/support/knowledgecenter/en/9119-MME/p8hat/p8hat_dpoovw.htm.
  • A problem was fixed for spurious loggings of SRCs A7004715 and A7001730 for system VPD errors that did not reflect actual problems in the system Vital Product Data (VPD) card.  With the fix,  the VPD card  SRCs are now reported only after a certain error threshold is achieved to ensure that replacement of the VPD card will help resolve the VPD problems.
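The invalid-date handling described in the list above (reject the first IPL, then fall back to the Epoch on the next attempt) can be modeled as a two-step check.  This is a minimal sketch of the documented behavior, not the firmware logic; the class name, method name, and return values are hypothetical.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class IplDateCheck:
    """Models the documented behavior for an invalid service processor date."""

    def __init__(self):
        self.warned = False  # True once an IPL has been rejected

    def attempt_ipl(self, sp_date):
        """sp_date is None when the service processor date is invalid."""
        if sp_date is not None:
            self.warned = False
            return ("ipl", sp_date)
        if not self.warned:
            # First attempt with an invalid date: reject and warn the user.
            self.warned = True
            return ("rejected", None)
        # Warning ignored: the IPL completes with the time at the Epoch.
        return ("ipl", EPOCH)
```

Calling attempt_ipl(None) twice on the same object returns a rejection first and an IPL at the Epoch second, matching the sequence in the fix description.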

System firmware changes that affect certain systems

  • On systems with mirrored memory running IBM i partitions, a problem was fixed for memory fails in the partition that also caused the system to crash.  The system failure will occur any time that IBM i partition memory towards the beginning of the partition's assigned memory fails.  With the fix, the memory failure is isolated to the impacted partition, leaving the rest of the system unaffected.
  • A  problem was fixed for a Power Enterprise Pool (PEP) system losing its assigned processor and memory resources after an IPL of the system.  This is an intermittent problem caused by a small timing window that makes it possible for the server to not get the IPL-time assignment of resources from the HMC.  If this problem occurs, it can be corrected by the HMC to recover the pool without needing another IPL of the system.
AM780_089_040 / FW780.70

07/26/17
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact:  Availability      Severity:  ATT

New features and functions

  • Support for the Advanced System Management Interface (ASMI) was changed to allow the special characters of "I", "O", and "Q" to be entered for the serial number of the I/O Enclosure under the Configure I/O Enclosure option.  These characters have only been found in an IBM serial number rarely, so typing in these characters will normally be an incorrect action.  However, the special character entry is not blocked by ASMI anymore so it is able to support the exception case.  Without the enhancement, the typing of one of the special characters causes message "Invalid serial number" to be displayed.
  • Support for firmware updates using  USB was enabled.   Without the change, entitlement checks prevent the USB code update from running on systems with FW780.
  • Support was added  for the Universally Unique IDentifier (UUID) property for each partition.  The UUID provides each partition with an identifier that is persisted by the platform across partition reboots, reconfigurations, OS reinstalls, partition migration,  and hibernation.

System firmware changes that affect all systems

  • A problem was fixed for an intermittent IPL failure with SRC B181E6C7 for a deadlock condition when testing the clocks during the IPL.  The problem state can be recovered by doing another IPL.  The problem is triggered by an error in the IPL clock test causing an interrupt handler to switch to the redundant clock and deadlock.  With the fix, the clock fault is handled and the bad clock is guarded, with the IPL completing on the redundant clock.
  • A problem was fixed for a partition boot fail or hang from a Fibre Channel device having fabric faults.  Some of the fabric errors returned by the VIOS are not interpreted correctly by the Open Firmware VFC driver, causing the hang instead of generating helpful error logs.
  • A problem was fixed for an SRC BA090006 serviceable event log occurring whenever an attempt was made to boot from an ALUA  (Asymmetric Logical Unit Access) drive.  These drives are always busy by design and cannot be used for a partition boot, but no service action is required if a user inadvertently tries to do that.  Therefore, the SRC was changed to be an informational log.
  • A problem was fixed for a Power Enterprise Pool (PEP) resource Grace Period not being reset when the server is in the "Out of Compliance" state and the resource has been returned to put the server back in Compliance.  The Grace Period was not being reset after a double-commit of a resource (doing a "remove" of an active resource) was resolved by restarting the server with the double-committed resource.  When the Grace Period ends, the "double-committed" resources on the server have to have been freed up from use to prevent the server from going to "Out of Compliance".  If the user fails to free up the resource, the PEP is in an "Out of Compliance" state, and the only PEP actions allowed are ones to free up the double-commit.  Once that is completed, the PEP is back In Compliance.  The loss of the Grace Period for the error makes it difficult to move resources around in the PEP.  Without the fix, the user can "Add" another PEP resource to the server, and the action of adding a PEP resource resets the Grace Period timer.  One could then "Remove" that one PEP resource just added, and then any further "removes" of PEP resources would behave as expected with the full Grace Period in effect.
  • A problem was fixed for Power Enterprise Pool (PEP) IFL processor assignments causing an "Out of Compliance" for normal processor licenses.  The number of IFL processors purchased was first credited as satisfying any "unreturned" PEP processor resources, thus potentially leaving the system "Out Of Compliance" since IFL processors should not be taking the place of the normal (expensive) processor usage.  In this situation, without the fix, the user will need to either purchase more "expensive" non-IFL processors to satisfy the non-IFL workloads or adjust the partitions to reduce the usage of non-IFL processors.  This is a very infrequent problem for the following reasons:
    1) PEP processors are infrequently left "unreturned" for short periods of time for specialized operations such as LPM migrations
    2) The user would have to purchase IFL processors from IBM, which is not a common occurrence.
    3) The user would have to put in a COD key for IFL processors while a PEP processor is still "unreturned".
  • A problem was fixed for a Power Enterprise Pool (PEP) resource Grace Period being short by one hour with 71 hours provided instead of 72.  The Grace Period is provided when all PEP resources are assigned and the user double-uses these resources (typically this is done for a Live Partition Mobility (LPM) migration).  This "borrowing" is temporarily permitted in this case even if there are not enough licenses to cover resources in both servers. The PEP goes into "Approaching Out Of Compliance", indicating the user has a certain amount of time to resolve this double-use. The problem here is that the time length of this Grace Period lasts one hour less than stated.  For a 72-hour Grace Period (the standard setting), the user only gets 71 hours.  The user sees "71 hours remaining" (correct) on first display at start,  then right away, if the user displays again, 70 hours is shown remaining.  But thereafter, the Grace Period time decrements correctly for the time remaining.
  • A problem was fixed for Power Enterprise Pool (PEP) non-applicable error messages being displayed when re-entering PEP XML files for PEP updates, in which one of the XML operations calls for Conversion of Perm Resources to PEP Resources.  There is no error as the PEP key was accepted on the first use.  The following message may be seen on the HMC and can be ignored:   "...HSCL0520 A Mobile CoD processor conversion code to convert 0 permanently activated processors to Mobile CoD processors on the managed system has been entered.  HSCL050F This CoD code is not valid for your managed system.  Contact your CoD administrator."

System firmware changes that affect certain systems

  • On systems with IBM i partitions, a problem was fixed for frequent logging of informational B7005120 errors due to communications path closed conditions during messaging from HMCs to IBM i partitions.  In the majority of cases these errors are due to normal operating conditions and not due to errors that require service or attention.  The logging of informational errors due to this specific communications path closed condition that are the result of normal operating conditions has been removed.
AM780_084_040 / FW780.60

01/16/17
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact:  Availability      Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed for a Live Partition Mobility migration that resulted in the source managed system going to the Hardware Management Console (HMC) Incomplete state after the migration to the target system was completed.  This problem is very rare and has only been detected once.  The problem trigger is that the source partition does not halt execution after the migration to the target system.  The HMC went to the Incomplete state for the source managed system when it failed to delete the source partition because the partition would not stop running.  When this problem occurred, the customer network was running very slowly and this may have contributed to the failure.  The recovery action is to re-IPL the source system but that will need to be done without the assistance of the HMC.  For each partition that has an OS running on the source system, shut down the partition from the OS.  Then from the Advanced System Management Interface (ASMI), power off the managed system.  Alternatively, the system power button may also be used to do the power off.  If the HMC Incomplete state persists after the power off, the managed system should be rebuilt from the HMC.  For more information on HMC recovery steps, refer to this IBM Knowledge Center link: https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
  • A problem was fixed for a latency time of about 2 seconds being added to a target Live Partition Mobility (LPM) migration system when there is a latency time check failure.  With the fix, in the case of a latency time check failure, a much smaller default latency is used instead of two seconds.  This error would not be noticed if the customer system is using an NTP time server to maintain the time.
  • A problem was fixed for a shared processor pool partition showing an incorrect zero "Available Pool Processor" (APP) value after a concurrent firmware update.  The zero APP value means that no idle cycles are present in the shared processor pool but in this case it stays zero even when idle cycles are available.  This value can be displayed using the AIX "lparstat" command.  If this problem is encountered, the partitions in the affected shared processor pool can be dynamically moved to a different shared processor pool.  Before the dynamic move, the  "uncapped" partitions should be changed to "capped" to avoid a system hang. The old affected pool would continue to have the APP error until the system is re-IPLed.
  • A rare problem was fixed for a system hang that can occur when dynamically moving "uncapped" partitions to a different shared processor pool.  To prevent a system hang, the "uncapped" partitions should be changed to "capped" before doing the move.
  • A problem was fixed for a blank SRC in the LPA dump for user-initiated non-disruptive adjunct dumps.  The SRC is needed for problem determination and dump analysis.
  • A problem was fixed for incorrect error messages from the Advanced System Management Interface (ASMI) functions when the system is powered on but in the  "Incomplete State".  For this condition, ASMI was assuming the system was powered off because it could not communicate to the PowerVM hypervisor.  With the fix, the ASMI error messages will indicate that ASMI functions have failed because of the bad hypervisor connection instead of falsely stating that the system is powered off.
  • A problem was fixed for Live Partition Mobility (LPM) migrations from FW860.10 or FW860.11 to older levels of firmware. Subsequent DLPAR of Virtual Adapters will fail with HMC error message HSCL294C, which contains text similar to the following:  "0931-007 You have specified an invalid drc_name." This issue affects partitions installed with AIX 7.2 TL 1 and later. Not affected by this issue are partitions installed with VIOS, IBM i, or earlier levels of AIX.
AM780_080_040 / FW780.50

06/29/16
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact:  Availability      Severity:  SPE

New Features and Functions

  • Support was added for the Stevens6+ option of the internal tray loading DVD-ROM drive with F/C #EU13.  This is an 8X/24X(max) Slimline SATA DVD-ROM Drive.  The Stevens6+ option is a FRU hardware replacement for the Stevens3+.  MTM 7226-1U3 (Oliver)  FC 5757/5762/5763 attaches to IBM Power Systems and lists Stevens6+ as optional for Stevens3+.  If the Stevens6+  DVD drive is installed on the system without the required firmware support, the boot of an AIX partition will fail when the DVD is used as the load source.  Also, an IBM i partition cannot consistently boot from the DVD drive using D-mode IPL.  A SRC C2004130 may be logged for the load source not found error.
  • Support was added for systems to be able to automatically convert permanently activated resources (processor and memory) to Mobile CoD resources for use in a Power Enterprise Pool (PEP).  The ability to do a CoD resource license conversion requires a minimum HMC level of V8R8.4.0 or later.  More information on how to use a PEP for a group of systems to share Mobile Capacity on Demand (CoD) processor resources and memory resources can be found in the IBM Knowledge Center at the following link: https://www.ibm.com/support/knowledgecenter/HW4M4/p8ha2/systempool_cod.htm.

System firmware changes that affect all systems

  • A problem was fixed for PCI adapters locking up when powered on.  The problem is rare but frequency varies with the specific adapter models.  A system power down and power up is required to get the adapter out of the locked state.
  • A security problem was fixed in OpenSSL for a possible service processor reset on a null pointer de-reference during RSA PSS signature verification.  The Common Vulnerabilities and Exposures issue number is CVE-2015-3194.
  • A problem was fixed for hypervisor task failures in adjunct partitions with a SRC B7000602 reported in the error log.  These failures occur during adjunct partition reboots for concurrent firmware updates but are extremely rare and require a re-IPL of the system to recover from the task failure.  The adjunct partitions may be associated with the VIOS or I/O virtualization for the physical adapters such as done for SR-IOV.
  • A problem was fixed for a shortened "Grace Period" for "Out of Compliance" users of a Power Enterprise Pool (PEP).   The "Grace Period" is short by one hour, so the user has one less hour to resolve compliance issues before the HMC disallows any more borrowing of PEP resources.  For example, if the "Grace Period" should have been 48 hours as shown in the "Out of Compliance" message, it really is 47 hours in the hypervisor firmware.  The borrowing of PEP resources is not a common usage scenario.  It is most often found in Live Partition Mobility (LPM) migrations where PEP resources are borrowed from the source server and loaned to the target server.
  • A problem was fixed for the Advanced System Management Interface "Network Services/Network Configuration" "Reset Network Configuration" button that was not resetting the static routes to the default factory setting.  The manufacturing default is to have no static routes defined so the fix clears any static routes that had been added.  A circumvention to the problem is to use the ASMI "Network Services/Network Configuration/Static Route Configuration" "Delete" button before resetting the network configuration.
  • A problem was fixed for a sequence of two or more Live Partition Mobility migrations that caused a partition to crash with a SRC BA330000 logged (Memory allocation error in partition firmware).  The sequence of LPM migrations that can trigger the partition crash is as follows:
    The original source partition level can be any FW760.xx, FW763.xx, FW770.xx, FW773.xx, FW780.xx, or FW783.xx P7 level or any FW810.xx, FW820.xx, FW830.xx, or FW840.xx P8 level.  It is migrated first to a system running one of the following levels:
    1) FW730.70 or later 730 firmware or
    2) FW740.60 or later 740 firmware
    And then a second migration is needed to a system running one of the following levels:
    1) FW760.00 - FW760.20 or
    2) FW770.00 - FW770.10
    The twice-migrated system partition is now susceptible to the BA330000 partition crash during normal operations until the partition is rebooted.  If an additional LPM migration is done to any firmware level, the thrice-migrated partition is also susceptible to the partition crash until it is rebooted.
    With the fix applied, the susceptible partitions may still log multiple BA330000 errors but there will be no partition crash.  A reboot of the partition will stop the logging of the BA330000 SRC.
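The migration sequence above amounts to a rule over three firmware levels: the original source, the first target, and the second target.  The following Python sketch encodes that rule as described; the function names are hypothetical and the level format "FWnnn.nn" is assumed from the levels quoted in this document.

```python
import re

# Release families named in the text as possible original source levels.
ORIGIN_RELEASES = {760, 763, 770, 773, 780, 783, 810, 820, 830, 840}


def parse_level(level):
    """Split a level such as "FW760.20" into (release, service_pack)."""
    match = re.fullmatch(r"FW(\d{3})\.(\d{2})", level)
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    return int(match.group(1)), int(match.group(2))


def is_susceptible(origin, first_target, second_target):
    """True if the two migrations match the sequence described above."""
    o_rel, _ = parse_level(origin)
    f_rel, f_sp = parse_level(first_target)
    s_rel, s_sp = parse_level(second_target)

    origin_ok = o_rel in ORIGIN_RELEASES
    # First hop: FW730.70 or later 730, or FW740.60 or later 740.
    first_ok = (f_rel == 730 and f_sp >= 70) or (f_rel == 740 and f_sp >= 60)
    # Second hop: FW760.00 - FW760.20, or FW770.00 - FW770.10.
    second_ok = (s_rel == 760 and s_sp <= 20) or (s_rel == 770 and s_sp <= 10)
    return origin_ok and first_ok and second_ok
```

A partition that started on FW780.40, moved to FW730.70, and then moved to FW760.10 is flagged; changing the first target to FW730.60 or the second target to FW770.20 clears the flag.  The sketch does not model the third migration or the reboot that ends the exposure.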

System firmware changes that affect certain systems

  • On systems having an IBM i partition with more than 64 cores, a performance problem was fixed with the choice of processor cores assigned to the partition.  This problem only pertains to the Power 780 (9179-MHD) and the Power 795 (9119-FHB).
  • On systems with a PowerVM Active Memory Sharing (AMS) partition with AIX  Level 7.2.0.0 or later with Firmware Assisted Dump enabled, a problem was fixed for a Restart Dump operation failing into KDB mode.  If "q" is entered to exit from KDB mode, the partition fails to start.  The AIX partition must be powered off and back on to recover.  The problem can be circumvented by disabling Firmware Assisted Dump (default is enabled in AIX 7.2).
  • For a system partition with more than 64 cores, a problem was fixed for Live Partition Mobility (LPM)  migration operations failing with HSCL365C.  The partition migration is stopped because the platform detects a firmware error anytime the partition has more than 64 cores.  This problem only pertains to the Power 780 (9179-MHD) and the Power 795 (9119-FHB).
  • On systems with dedicated processor partitions,  a problem was fixed for the dedicated processor partition becoming intermittently unresponsive. The problem can be circumvented by changing the partition to use shared processors.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • DEFERRED:  A problem was fixed for an I/O performance slow-down that can occur after a concurrent repair of a GX bus I/O adapter with a Feature Code of #1808, #1816, #1914, #EN22, #EN23, or #EN25.  A re-IPL of the system after the concurrent repair operation corrects the I/O performance issue.  This fix requires an IPL of the system to take effect.
AM780_075_040 / FW780.40

12/16/15
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact:  Availability      Severity:  SPE

New Features and Functions

  • Support was added to the service processor to allow control of Dynamic Power Mode from the Hardware Management Console (HMC).  This power mode allows modifying a processor frequency,  either to reduce energy consumption or to overclock the processor and boost the machine speed.   There are four power modes possible:
    1) Disable Power Saver mode – this is the default.  The processor frequency is not changed and the resource operates at 100% of the nominal processor frequency at all times.
    2) Enable Static Power Saver mode – activates the Power Saver mode, fixing the processor frequency and voltage at a predetermined low-power mode.
    3) Enable Dynamic Power Saver (favor power) mode – guarantees power savings by limiting the maximum frequency of the system under peak utilization.
    4) Enable Dynamic Power Saver (favor performance) mode – allows a higher frequency range at high utilization.
    There is existing support to control Dynamic Power Mode from the Advanced System Management Interface (ASMI) with the "System Configuration /Power Management/ Power Mode Setup" panel options.   With the new support, the HMC can also control the Dynamic Power Modes with CLI commands lspwrmgmt (list the current power mode configuration) and chpwrmgmt (change the power mode):
    chpwrmgmt -m managed-system -r sys -o {enable | disable}  [-t {static | dynamic_favor_perf | dynamic_favor_power | fixed_max_frequency}]  [--help]
    For more information on the HMC CLI chpwrmgmt command, see the following link in the IBM Knowledge Center:  https://www-01.ibm.com/support/knowledgecenter/HW4L4/p8edm/chpwrmgmt.html
    The HMC must be at V8R8.2.0  or later to have the Dynamic Power Mode feature.
  • Support was added to the Advanced System Management Interface (ASMI) to be able to add an IPv4 static route definition for each ethernet interface on the service processor.  Using a static route definition, a Hardware Management Console (HMC) configured on a private subnet that is different from the service processor subnet is now able to connect to the service processor and manage the CEC.  A static route persists until it is deleted or until the service processor settings are restored to manufacturing defaults.  The static route is managed with the ASMI panel "Network Services/Network Configuration/Static Route Configuration" IPv4 radio button.  The "Add" button is used to add a static route (only one is allowed for each ethernet interface) and the "Delete" button is used to delete the static route.
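The chpwrmgmt syntax quoted in the Dynamic Power Mode item above can be assembled programmatically before it is sent to the HMC.  The following Python sketch only builds the argument list from the options shown in that syntax line; it does not run anything on an HMC.  The function name and the managed system name in the usage note are hypothetical.

```python
# Values taken from the chpwrmgmt syntax line quoted in this document.
OPERATIONS = ("enable", "disable")
MODE_TYPES = (
    "static",
    "dynamic_favor_perf",
    "dynamic_favor_power",
    "fixed_max_frequency",
)


def build_chpwrmgmt(managed_system, operation, mode_type=None):
    """Return the chpwrmgmt command as a list of arguments."""
    if operation not in OPERATIONS:
        raise ValueError(f"operation must be one of {OPERATIONS}")
    args = ["chpwrmgmt", "-m", managed_system, "-r", "sys", "-o", operation]
    if mode_type is not None:
        if mode_type not in MODE_TYPES:
            raise ValueError(f"mode_type must be one of {MODE_TYPES}")
        args += ["-t", mode_type]
    return args
```

For a managed system named "Server-9117-MMD" (a made-up name), " ".join(build_chpwrmgmt("Server-9117-MMD", "enable", "dynamic_favor_perf")) yields the command line to enter at the HMC prompt.  Validating the option values locally catches typing errors before the command reaches the HMC.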

System firmware changes that affect all systems

  • For a partition that has been migrated with Live Partition Mobility (LPM) from FW730 to FW740 or later, a problem was fixed for a Main Storage Dump (MSD) IPL failing with SRC B2006008.  The MSD IPL can happen after a system failure and is used to collect failure data.  If the partition is rebooted anytime after the migration, the problem cannot happen.  The potential for the problem existed between the active migration and a partition reboot.
  • A problem was fixed for partial loss of Entitlement for On/Off Memory Capacity On Demand (also called Elastic COD).  Users with large amounts of Entitlement on the system of greater than "65535 GB * Days" could have had a truncation of the Entitlement value on a re-IPL of the system.  To recover lost Entitlement, the customer can request another On/Off Enablement Code from IBM support to "re-fill" their entitlement.
  • A problem was fixed for an incorrect restriction on the amount of "Unreturned"  resources allowed for a Power Enterprise Pool (PEP).  PEP allows for logical moving of resources (processors and memory) from one server to another.  Part of this is 'borrowing' resources from one server to move to another. This may result in "Unreturned" resources on the source server. The management console controls how many total "Unreturned" PEP resources can exist.  For this problem,  the user had some "Unreturned" PEP memory and asked to borrow more but this request was incorrectly refused by the hypervisor.
  • On systems where memory relocation (as done by using Live Partition Mobility (LPM) ) and a partition reboot are occurring simultaneously, a problem for a system termination was fixed.  The potential for the problem existed between the active migration and the partition reboot.
  • A problem was fixed in the hypervisor power off to protect from rare NVRAM corruption in the address space where the partition profiles are stored.  The B7005301 SRC is logged on the next IPL after the corruption that takes the system into the Hardware Management Console (HMC) recovery state.  The HMC found the partition profiles corrupted in NVRAM.  The HMC partition profile recovery procedure must be used to restore the partition profiles from the HMC.
  • A problem was fixed for a hypervisor adjunct partition that failed with "SRC B2009008 LP=32770" for an unexpected SR-IOV adapter configuration.  Without the fix, the system must be re-IPLed to correct the adjunct error.  This error is infrequent and can only occur if an adapter port configuration is being changed at the same time that error recovery is occurring for the adapter.
  • A security problem was fixed for an OpenSSL specially crafted X.509 certificate that could cause the service processor to reset in a denial-of-service (DOS) attack.  The Common Vulnerabilities and Exposures issue number is CVE-2015-1789.
  • A security problem was fixed in OpenSSL where a remote attacker could cause an infinite loop on the service processor using malformed Elliptic Curve parameters during the SSL authentication.  This would cause the service processor performance problems and also prevent new management console connections from being made.  To recover from this attack, a reset or power cycle of the service processor is needed after scheduling and completing a normal shutdown of running partitions.  The Common Vulnerabilities and Exposures issue number is CVE-2015-1788.
  • A security problem was fixed in the lighttpd server on the service processor where a remote attacker, while attempting authentication, could insert strings into the lighttpd server log file.  Under normal operations on the service processor, this does not impact anything because the log is disabled by default.  The Common Vulnerabilities and Exposures issue number is CVE-2015-3200.
  • A problem was fixed for a Network boot/install failure using bootp in a network with switches using the Spanning Tree Protocol (STP).  A Network boot/install using lpar_netboot on the management console was enhanced to allow the number of retries to be increased.  If the user is not using lpar_netboot, the number of bootp retries can be increased using the SMS menus.  If the SMS menus are not an option, the STP in the switch can be set up to allow packets to pass through while the switch is learning the network configuration.
  • A problem was fixed in the run-time abstraction services (RTAS) extended error handling (EEH) recovery for EEH events for SR-IOV Virtual Functions (VFs) to fully reconfigure the VF devices after an EEH event.  Since the physical adapter does recover from the EEH event itself, and there are no error logs generated, it might not be immediately apparent that the VF did not fully reconfigure.  This prevents certain PCIe settings from being established for interrupts and performance settings, leading to unexpected adapter behavior and errors in the partition.
  • For systems with an invalid P-side or T-side in the firmware, a problem was fixed in the partition firmware Run-Time Abstraction Services (RTAS) so that system Vital Product Data (VPD) is returned at least from the valid side instead of returning no VPD data.  This allows AIX host commands such as lsmcode, lsvpd, and lsattr that rely on the VPD data to work to some extent even if there is one bad code side.  Without the fix, all the VPD data is blocked from the OS until the invalid code side is recovered by either rejecting the firmware update or attempting to update the system firmware again.
  • A problem was fixed that prevented a second management console from being added to the system.  In some cases, network outages caused defunct management console connection entries to remain in the service processor connection table, making connection slots unavailable for new management consoles.  A reset of the service processor could be used to remove the defunct entries and allow the second management console to connect.
  • A problem was fixed for some service processor error logs not getting reported to the OS partitions as needed.  The service processor was not checking for a successful completion code on the error log message send, so it was not doing retries of the send to the OS when that was needed to ensure that the OS received the message.
  • A problem was fixed for an incorrect call home for SRC B1818A0F.  There was no real problem, so this call home should have been ignored.  This occurred when dynamic IP configurations were being done on the service processor and the DHCP server was not responding.  The correct solution was to fix the network configuration so that the DHCP server could be found on the network.
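
The error log reporting fix in the list above (checking the completion code of a send and retrying when needed) follows a common pattern.  A minimal sketch of that pattern in Python, under stated assumptions: the names send_with_retry, SUCCESS, the send callable, and the retry limit are hypothetical and chosen for illustration only; they do not reflect the actual service processor firmware interfaces.

```python
from typing import Callable

SUCCESS = 0  # hypothetical completion code meaning "message delivered"


def send_with_retry(send: Callable[[bytes], int], message: bytes,
                    max_attempts: int = 3) -> bool:
    """Send a message, checking the completion code of every attempt.

    Returns True as soon as one attempt reports SUCCESS, and False if
    all attempts fail. The limit of 3 attempts is an assumption.
    """
    for _ in range(max_attempts):
        if send(message) == SUCCESS:
            return True
    return False
```

Without the completion code check, a failed first send would go unnoticed and the message would never reach the OS.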

System firmware changes that affect certain systems

  • On systems using PowerVM with shared processor partitions that are configured as capped or in a shared processor pool, there was a problem found that delayed the dispatching of the virtual processors which caused performance to be degraded in some situations.  Partitions with dedicated processors are not affected.   The problem is rare and can be mitigated, until the service pack is applied, by creating a new shared processor AIX or Linux partition and booting it to the SMS prompt; there is no need to install an operating system on this partition.  Refer to help document http://www.ibm.com/support/docview.wss?uid=nas8N1020863 for additional details.
  • On a system with an IBM i partition running 7.2 or later with 4K sector disks, a problem was fixed for a machine check being incorrectly issued.
  • On a system with an AIX partition and a Linux partition, a problem was fixed for dynamically moving an adapter that uses DMA from the Linux partition to the AIX partition that caused AIX to fail by going into KDB mode (0c20 crash).  The management console showed the following message for the partition operation:  "Dynamic move of I/O resources failed.  The I/O slot dynamic partitioning operation failed.".  The error was caused by Linux using 64K mappings for the DMA window and AIX using 4K mappings for the DMA window, causing incorrect calculations on AIX when it received the adapter.  Until the fix is applied, the adapters that use DMA should only be moved from Linux to AIX when the partitions are powered off.
  • For Integrated Virtualization Manager (IVM) managed systems with more than 64 active partitions, a problem was fixed for recovery from Live Partition Mobility (LPM) errors.  Without the fix, the IVM managed system partition can appear to still be running LPM after LPM has aborted, preventing retries of the LPM operation.  In this case, the partition must be stopped and restarted to clear the LPM error state.  The problem is not frequent because it requires a failed LPM on a partition with a partition ID that is greater than 64.  This defect only pertains to the IBM Power ESE (8412-EAD).
  • On systems with IBM i partitions that have a load source device with 4K sectors, a problem has been fixed for Mainstore Dump (MSD) failing with a B200F00C SRC.  Without the fix, the  IBM i 4K sector load source devices are not supported for MSD and always fail.
  • For non-HMC managed systems in Manufacturing Default Configuration (MDC) mode with a single host partition, a problem was fixed for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP that were not off-loaded to the host OS.  This is an infrequent error caused by a timing error that causes the dump notification signal to the host OS to be lost.  The missing/pending dumps can be retrieved by rebooting the host OS partition.  The rebooted host OS will receive new notifications of the dumps that have to be off-loaded.
  • A problem was fixed for an IPL termination with a B150B10C SRC and B121C770 error logs.  This problem only occurred on a multiple node system and does not pertain to the Power ESE (8412-EAD).  The problem was intermittent so a re-ipl of the CEC normally resolved the problem.
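
The Linux-to-AIX dynamic adapter move problem in the list above came from one side describing the DMA window with 64K mappings and the other side interpreting it with 4K mappings.  The following Python sketch only illustrates the arithmetic of such a mismatch; the function names and the mapping count are hypothetical, and the actual hypervisor calculation is not described in this document.

```python
KIB = 1024


def dma_window_bytes(num_mappings: int, page_size: int) -> int:
    """Size in bytes of a DMA window covered by num_mappings pages."""
    return num_mappings * page_size


def mappings_needed(window_bytes: int, page_size: int) -> int:
    """Number of mappings of page_size needed to cover window_bytes."""
    return -(-window_bytes // page_size)  # ceiling division


# Illustrative count only: 32768 mappings is an assumed value.
count = 32768
size_as_64k = dma_window_bytes(count, 64 * KIB)  # size with 64K mappings
size_as_4k = dma_window_bytes(count, 4 * KIB)    # same count read as 4K
print(size_as_64k // size_as_4k)                 # ratio between the two views
```

The same mapping count read with the wrong page size gives a window that is off by a factor of 16, which is the kind of incorrect calculation the fix avoids.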
AM780_071_040 / FW780.30

04/22/15
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact: Security         Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed for the iptables process consuming all available memory, causing an "out of memory" dump and reset/reload of the service processor.
  • A problem was fixed for the callout on power good (pgood) fault SRC 11002634 so that it includes the CEC enclosure and the failing FRU.  Previously, the callout was missing the failing FRU.
  • A problem was fixed with the fspremote service tool to make it support TLSv1.2 connections to the service processor to be compatible with systems that had been fixed for the OpenSSL Padding Oracle On Downgraded Legacy Encryption (POODLE) vulnerabilities.  After the POODLE fix is installed, by default the system only allows secured connections from clients using the TLSv1.2 protocol.
  • A problem was fixed for performance dumps to speed its processing so it is able to handle partitions with a large number of processors configured.  Previously, for large systems, the performance dump took too long in collecting performance data to be useful in the debugging of some performance problems.
  • A problem was fixed for a faulty ambient temperature sensor that triggered emergency power offs with SRC 11007203 or 11007203 even though the temperature was not over the limit.  If the ambient temperatures are high now, the errors will be logged for call home service but they will not trigger an emergency power off.
  • A problem was fixed to prevent a hypervisor task failure if multiple resource dumps running concurrently run out of dump buffer space.  The failed hypervisor task could prevent basic logical partition operations from working.
  • A problem was fixed for a partition deletion error on the management console with error code 0x4000E002 and message "...insufficient memory for PHYP".  The partition delete operation has been adjusted to accommodate the temporary increase in memory usage caused by memory fragmentation, allowing the delete operation to be successful.
  • A problem was fixed for I/O drawer MTMS updates where a hypervisor memory leak would cause reconfiguration operations to fail or cause resources to no longer show up for user configuration.
  • A security problem was fixed in OpenSSL where the service processor would, under certain conditions, accept Diffie-Hellman client certificates without the use of a private key, allowing a user to falsely authenticate.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0205.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) messages.  A specially crafted DTLS message could exhaust all available memory and cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0206.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) messages.  A specially crafted DTLS message could cause a null pointer de-reference and cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number is CVE-2014-3571.
  • A security problem was fixed in OpenSSL to fix multiple flaws in the parsing of X.509 certificates.  These flaws could be used to modify an X.509 certificate to produce a certificate with a different fingerprint without invalidating its signature, and possibly bypass fingerprint-based blacklisting.  The Common Vulnerabilities and Exposures issue number is CVE-2014-8275.
  • A security vulnerability, commonly referred to as GHOST, was fixed in the service processor glibc functions gethostbyname() and gethostbyname2() that allowed remote users of the functions to cause a buffer overflow and execute arbitrary code with the permissions of the server application.  There is no way to exploit this vulnerability on the service processor but it has been fixed to remove the vulnerability from the firmware.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0235.
  • A problem was fixed in the Advanced System Management Interface (ASMI) to reword a confusing message for systems with no deconfigured resources.  The "System Service Aids/Deconfiguration Records" message text for this situation was changed from "Deconfiguration data is currently not available." to "No deconfigured resources found in the system."
  • A problem was fixed for a hypervisor deadlock that results in the system being in an "Incomplete state" as seen on the management console.  This deadlock is the result of two hypervisor tasks using the same locking mechanism for handling requests between the partitions and the management console.  Except for the loss of the management console control of the system, the system is operating normally when the "Incomplete state" occurs.
  • A security problem was fixed in OpenSSL where a remote attacker could crash the service processor with malformed Elliptic Curve private keys.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0209.
  • A security problem was fixed in OpenSSL where a remote attacker could crash the service processor with a specially crafted X.509 certificate that causes an invalid pointer, out-of-bounds write, or a null pointer de-reference.  The Common Vulnerabilities and Exposures issue numbers are CVE-2015-0286,  CVE-2015-0287, and CVE-2015-0288.

System firmware changes that affect certain systems

  • On systems with redundant service processors and unlicensed cores, a problem was fixed with firmware update to prevent SRC B170B838 errors on unlicensed cores after an administrative failover (AFO) to the backup service processor.
  • On systems with redundant service processors, a problem was fixed for serviceable events being missing on the management console for the case of a backup service processor termination error.  The error log from the failed backup service processor did not get synchronized to the primary service processor.
  • On a system with redundant service processors, a problem was fixed for a bad pointer reference in the mailbox function during data synchronization between the two service processors.  The de-reference of the bad pointer caused a core dump, reset/reload, and fail-over to the backup service processor.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed for a hypervisor hang at progress code C7004091 during the IPL or hangs during serviceability tasks to the I/O drawer.
  • On systems using the Virtual I/O Server (VIOS) to share physical I/O resources among client logical partitions, a problem was fixed for memory relocation errors during page migrations for the virtual control blocks.  These errors caused a CEC termination with SRC B700F103.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • A problem was fixed that could result in unpredictable behavior if a memory UE is encountered while relocating the contents of a logical memory block during one of these operations:
    - Using concurrent maintenance to perform a hot repair of a node.
    - Reducing the size of an Active Memory Sharing (AMS) pool.
    - On systems using mirrored memory, using the memory mirroring optimization tool.
    - Performing a Dynamic Platform Optimizer (DPO) operation.
  • On systems using Virtual Shared Processor Pools (VSPP), a problem was fixed for an inaccurate pool idle count over a small sampling period.
  • A problem was fixed that could result in latency or timeout issues with I/O devices.  On systems using Power7+ processors (IBM Power 770 (9117-MMD), IBM Power 780 (9179-MHD), and IBM Power ESE (8412-EAD)), this issue only impacts shared processor partitions.
  • For a system with Virtual Trusted Platform Module (VTPM) partitions, a problem was fixed for a management console error that occurred while restoring a backup profile that caused the system to go to the management console "Incomplete state".  The failed system had a suspended VTPM partition and a B7000602 SRC logged.
  • On systems with redundant service processors, a problem was fixed to add a missing check for a broken FSI link-1 pin.  The broken FSI link-1 pin was detectable during fail-over attempts to the backup service processor which failed.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed for concurrent maintenance to prevent a hardware unavailable failure when doing consecutive concurrent remove and add operations to an I/O Hub adapter for a drawer.
AM780_068_040 / FW780.21

01/07/15
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact:  Security      Severity:  HIPER

System firmware changes that affect all systems

  • A security problem was fixed in OpenSSL for padding-oracle attacks known as Padding Oracle On Downgraded Legacy Encryption (POODLE).  This attack allows a man-in-the-middle attacker to obtain a plain text version of the encrypted session data.  The Common Vulnerabilities and Exposures issue number is CVE-2014-3566.  The service processor POODLE fix is based on a selective disablement of SSLv3 using the Advanced System Management Interface (ASMI) "System Configuration/Security Configuration" menu options.  The Security Configuration options of "nist_sp800_131a", "nist_compat", and "legacy" affect the disablement of SSLv3 and determine the level of protection from POODLE.  The management console also requires a POODLE fix for APAR MB03867 (FIX FOR CVE-2014-3566 FOR HMC V7 R7.9.0 SP1 with PTF MH01484) to eliminate all vulnerability to POODLE and allow use of option 1 "nist_sp800_131a" as shown below:
    -1) nist_sp800_131a (SSLv3 disabled):  This highest level of security protection does not allow service processor clients to connect using SSLv3, thereby eliminating any possibility of a POODLE attack.  All clients must be capable of using TLS v1.2 to make the secured connections to the service processor to use this option.  This requires the management console be at a minimum level that has a POODLE fix such as HMC V7 R7.9.0 SP1 with POODLE PTF MH01484.
    -2) nist_compat (default mode - SSLv3 enabled for HMC):  This medium level of security protection disables SSLv3 (TLS v1.2 must be used instead) for the web browser sessions to ASMI and for the CIM clients and assures them of POODLE-free connections.  But the older management consoles are allowed to use SSLv3 to connect to the service processor.  This is intended to allow non-POODLE compliant HMC levels to be able to connect to the CEC servers until they can be planned and upgraded to the POODLE compliant HMC levels.  Running a non-POODLE compliant HMC to a service processor in this default mode will prevent the ASMI-proxy sessions from the HMC from connecting as these proxy sessions require SSLv3 support in ASMI.
    -3) legacy (SSLv3 enabled):  This basic level of security protection enables SSLv3 for all service processor client connections.  It relies on all clients being at POODLE fix compliant levels to provide full POODLE protection using the TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3 connections.  This legacy option is intended for customer sites on protected internal networks that have a large investment in older hardware that needs SSLv3 to make browser and HMC connections to the service processor.  The level of POODLE protection actually achieved in legacy mode is determined by the percentage of clients that are at the POODLE fix compliant levels.
  • A security problem was fixed in OpenSSL for memory leaks that allowed remote attackers to cause a denial of service (out of memory on the service processor). The Common Vulnerabilities and Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
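
The "nist_sp800_131a" option above requires every client to connect with TLS v1.2.  As a generic client-side illustration only (this is not the HMC, ASMI, or fspremote implementation, and the function name is hypothetical), the Python standard library ssl module can build a context that refuses anything older than TLS v1.2:

```python
import ssl


def make_tls12_client_context() -> ssl.SSLContext:
    """Return a client context that will not negotiate below TLS v1.2.

    Hypothetical helper for illustration; the default context already
    enables certificate verification and host name checking.
    """
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls12_client_context()
print(ctx.minimum_version)
```

A client configured this way cannot be downgraded to SSLv3, which is the protocol version the POODLE attack depends on.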

System firmware changes that affect certain systems

  • HIPER/Pervasive:  On systems using PowerVM firmware, a performance problem was fixed that may affect shared processor partitions where there is a mixture of dedicated and shared processor partitions with virtual IO connections, such as virtual ethernet or Virtual IO Server (VIOS) hosting, between them.  In high availability cluster environments this problem may result in a split brain scenario.
  • On systems with redundant service processors,  a problem was fixed so that a backup memory clock failure with SRC B120CC62 is handled without terminating the system running on the primary memory clock.
AM780_066_040 / FW780.20

10/16/14
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact: Data            Severity:  HIPER

New Features and Functions

  • Support was added for using the Mellanox ConnectX-3 Pro 10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.

System firmware changes that affect all systems

  • A problem was fixed that caused the Advanced System Management Interface (ASMI) menu for Memory Low Power State to be displayed even though it is not applicable to the system.  These systems do not have the DIMM type required for memory low power state.
  • A problem was fixed that caused the Utility COD display of historical usage data to be truncated on the management console.
  • A problem was fixed for memory relocation failing during a partition reboot with SRC B700F103 logged.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • A problem was corrected that resulted in B7005300 error logs.
  • A problem was fixed for Utility COD Processors where incorrect SRCs A7004735 and A7004736 are logged when utility processors are activated.  The messages try to convey a problem that does not exist (no out of processor compliance condition actually exists).
  • A problem was fixed for the Advanced System Management Interface (ASMI) to change the Dynamic Platform Optimizer (DPO) VET capability setting from "False" to "True".  DPO is available on all systems to use without a license required.  Even though the VET for DPO was set to "False", it did not interfere with the running of DPO.
  • A problem was fixed for the Advanced System Management Interface (ASMI) that allowed possible cross-site request forgery (CSRF) exploitation of the ASMI user session to do unwanted tasks on the service processor.
  • A problem was fixed for I/O adapters so that BA400002 errors were changed to informational for memory boundary adjustments made to the size of DMA map-in requests.  These DMA size adjustments were marked as UE previously for a condition that is normal.
  • Multiple security problems were fixed in the Network Time Protocol (NTP) client for buffer overflows that could be exploited to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2009-1252 and CVE-2009-0159.
  • A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed a man-in-the-middle attacker, via a specially crafted fragmented handshake packet, to force a TLS/SSL server to use TLS 1.0, even if both the client and server supported newer protocol versions.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3511.
  • A security problem was fixed in OpenSSL for formatting fields of security certificates without null-terminating the output strings.  This could be used to disclose portions of the program memory on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3508.
  • Multiple security problems were fixed in the way that OpenSSL handled Datagram Transport Layer Security (DTLS) packets.  A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and CVE-2014-3507.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests.  A specially crafted DTLS handshake packet with an included Supported EC Point Format extension could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3509.
  • A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Diffie Hellman (DH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3510.
  • A problem was fixed that caused a service processor reset/reload and a SRC B1818601 error log during an IPL when adjusting the speeds of the system fans.  This problem would normally have a successful recovery with a good IPL of the system unless two other reset/reloads of the service processor had occurred within the last 15 minutes.
  • A security problem in GNU Bash was fixed to prevent arbitrary commands hidden in environment variables from being run during the start of a Bash shell.  Although GNU Bash is not actively used on the service processor, it does exist in a library so it has been fixed.  This is IBM Product Security Incident Response Team (PSIRT) issue #2211.  The Common Vulnerabilities and Exposures issue numbers for this problem are CVE-2014-6271, CVE-2014-7169, CVE-2014-7186, and CVE-2014-7187.

System firmware changes that affect certain systems

  • HIPER/Pervasive:  A problem was fixed in PowerVM where the effect of the problem is non-deterministic but may include an undetected corruption of data, although IBM test has not been able to make this condition occur. This problem is only possible if VIOS (Virtual I/O Server) version 2.2.3.x or later is installed and the following statement is true:  A Shared Ethernet Adapter (SEA) with fail over enabled is configured on the VIOS.
  • A problem was fixed for Live Partition Mobility (LPM) migrations from Power7+ systems that use the nest accelerator (NX) for compression and encryption usage that caused the migrated partition to revert to software compression instead of using the NX hardware.  Some operating system negotiated functions may not operate correctly and could impact performance.
    This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power 780 (9179-MHB) systems.
  • A problem was fixed for performance slow-downs during Main Storage Dump (MSD) that can happen when SR-IOV adapters are updating.  An option was added to MSD to prevent SR-IOV updates during the dump.
    This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power 780 (9179-MHB) systems.
  • On systems that have Active Memory Sharing (AMS) partitions and deduplication enabled, a problem was fixed for not being able to resume a hibernated AMS partition.  Previously,  resuming a hibernated AMS partition could give checksum errors with SRC B7000202 logged and the partition would remain in the hibernated state.
  • On systems that have Active Memory Sharing (AMS) partitions, a problem was fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove that leaves a logical memory block (LMB) in an unusable state until partition reboot.
  • On systems with a partition with SR-IOV enabled, a problem was fixed for a partition with one or more virtual functions (VFs), also known as a port slice,  causing the system to TI with SRC B7000103 logged.
    This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power 780 (9179-MHB) systems.
  • On systems with a partition with SR-IOV enabled, a performance problem for concurrent updates was resolved by delaying updates to SR-IOV firmware and I/O adapters as needed to minimize impacts on running workloads.  SR-IOV delayed fixes can be activated immediately using the "Updating SR-IOV Firmware" procedure from the IBM Knowledge Center:
    IBM Power 770 (9117-MMD):  http://www-01.ibm.com/support/knowledgecenter/9117-MMD/p7hb1/p7hb1_updating_sriov_firmware.htm
    IBM Power 780 (9179-MHD):  http://www-01.ibm.com/support/knowledgecenter/9179-MHD/p7hb1/p7hb1_updating_sriov_firmware.htm
    IBM Power ESE (8412-EAD):  http://www-01.ibm.com/support/knowledgecenter/8412-EAD/p7hb1/p7hb1_updating_sriov_firmware.htm
    This fix does not pertain to the IBM Power 770 (9117-MMB) or IBM Power 780 (9179-MHB) systems.
  • On systems in IPv6 networks, a  problem was fixed for a network boot/install failing with SRC B2004158 and IP address resolution failing using neighbor solicitation to the partition firmware client.
  • On systems that have a boot disk located on a SAN, a problem was fixed where the SAN boot disk would not be found on the default boot list and then the boot disk would have to be selected from SMS menus.  This problem would normally be seen for new partitions that had tape drives configured before the SAN boot disk.
  • On systems with a partition that has a 256MB Real Memory Offset (RMO) region size that has been migrated from a Power8 system to  Power7 or Power6 using Live Partition Mobility (LPM), a problem was fixed that caused a failure on the next boot of the partition with a BA210000 log with a CA000091 checkpoint just prior to the BA210000.  The fix dynamically adjusts the memory footprint of the partition to fit on the earlier Power systems.
  • On systems with redundant service processors, a problem was fixed in the run-time error failover to the backup service processor so it does not terminate on FRU support interface (FSI) errors.  In the case of FSI errors on the new primary service processor, the primary will do a reset/reload instead of a terminate.
    This fix does not pertain to the IBM Power ESE (8412-EAD).
  • On systems with mirrored memory and a Logical Memory Block (LMB) size of 16MB, a problem was fixed for an LMB memory leak during an IPL that caused partition configuration errors.
  • A problem was fixed for systems in networks using the Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to prevent network ping errors and boot from network (bootp) failures.  The Address Resolution Protocol (ARP) table information on the Juniper aggregated switches is not being shared between the switches and that causes problems for address resolution in certain network configurations.  Therefore, the CEC network stack code has been enhanced to add three gratuitous ARPs (ARP replies sent without a request received) before each ping and bootp request to ensure that all the network switches have the latest network information for the system.
  • For systems with an IBM i load source disk attached to an Emulex-based fibre channel adapter such as F/C #5735, a problem was fixed that caused an IBM i load source boot to fail with SRC B2006110 logged and a message to the boot console of "SPLIT-MEM Out of Room".  This problem occurred for load source disks that needed extra disk scans to be found, such as those attached to a port other than the first port of a fibre channel adapter (first port requires fewest disk scans).
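
The Juniper switch fix in the list above relies on gratuitous ARPs, that is, ARP replies sent without a request having been received.  The Python sketch below builds one such Ethernet frame to show what is on the wire.  It is an illustration under stated assumptions, not the CEC network stack code: the function name is hypothetical, and setting the target hardware address to broadcast is one common convention among several.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_OP_REPLY = 2


def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP reply.

    Sender and target protocol addresses are both the sender's own IP,
    so every switch and host that sees the frame can refresh its tables.
    """
    ip_bytes = socket.inet_aton(ip)
    eth_header = struct.pack("!6s6sH", BROADCAST_MAC, mac, ETHERTYPE_ARP)
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,              # hardware type: Ethernet
        0x0800,         # protocol type: IPv4
        6,              # hardware address length
        4,              # protocol address length
        ARP_OP_REPLY,   # operation: reply
        mac, ip_bytes,  # sender hardware and protocol address
        BROADCAST_MAC,  # target hardware address (assumed convention)
        ip_bytes,       # target protocol address equals sender's IP
    )
    return eth_header + arp_body


# Example values only: the MAC and IP below are made up for illustration.
frame = build_gratuitous_arp(bytes.fromhex("02005e000001"), "192.0.2.10")
# Three copies, matching the count of gratuitous ARPs described above.
frames_before_request = [frame] * 3
```

Actually transmitting the frame would need a raw socket and elevated privileges, which is outside the scope of this sketch.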

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed for a power off failure of an expansion drawer (F/C 5802 or F/C 5877) during a concurrent repair.  The power off commands to the drawer are now tried again using the System Power Control Network (SPCN) serial connection to the drawer to allow the repair to continue.
AM780_059_040 / FW780.11

06/23/14
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact:  Security      Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication.  A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments.  This could be used to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195.
  • HIPER/Pervasive:  Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service.  These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470.
  • A security problem was fixed in the service processor TCP/IP stack to discard illegal TCP/IP packets that have the SYN and FIN flags set at the same time.  An explicit packet discard was needed to prevent further processing of the packet that could result in a bypass of the iptables firewall rules.
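
The SYN and FIN check described in the last item above can be illustrated with a short Python sketch.  This is not the service processor code: the function name is hypothetical, and the only facts relied on are the standard TCP header layout (flags in the byte at offset 13, FIN as bit 0x01, SYN as bit 0x02).

```python
import struct

TCP_FIN = 0x01
TCP_SYN = 0x02


def has_illegal_syn_fin(tcp_header: bytes) -> bool:
    """Return True if both SYN and FIN are set in a raw TCP header."""
    if len(tcp_header) < 14:
        raise ValueError("truncated TCP header")
    flags = tcp_header[13]
    both = TCP_SYN | TCP_FIN
    return (flags & both) == both


def make_header(flags: int) -> bytes:
    """Build a minimal 20-byte TCP header with the given flag bits.

    Ports and sequence numbers are arbitrary example values.
    """
    return struct.pack("!HHIIBBHHH", 12345, 443, 0, 0, 5 << 4, flags, 0, 0, 0)
```

A packet filter would drop any packet for which has_illegal_syn_fin returns True before doing any further processing.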
AM780_056_040 / FW780.10

04/25/14
Systems 8412-EAD; 9117-MMB; 9117-MMD; 9179-MHB and 9179-MHD ONLY
Impact: Serviceability         Severity:  SPE

New Features and Functions

  • Support for the 9117-MMD, 9179-MHD and 8412-EAD systems.
  • Support was added to the Virtual I/O Server (VIOS) for shared storage pool mirroring (RAID-1) using the virtual SCSI (VSCSI) storage adapter to provide redundancy for data storage.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support was added to the Management Console command line to allow configuring a shared control channel for multiple pairs of Shared Ethernet Adapters (SEAs).  This simplifies the control channel configuration to reduce network errors when the SEAs are in fail-over mode.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support was added for Single Root I/O Virtualization (SR-IOV) that enables the hypervisor to share an SR-IOV-capable PCI-Express adapter across multiple partitions.  The SR-IOV mode is supported for the following Ethernet Network Interface Controller (NIC) I/O adapters (SR-IOV supported in both native mode and through VIOS):
    -   F/C EN10 and CCIN 2C4C - Integrated Multi-function Card with Dual 10Gb Ethernet RJ45 and Copper Twinax
    -   F/C EN11 and CCIN 2C4D - Integrated Multi-function Card with Dual 10Gb Ethernet RJ45 and Short Range (SR) Optical
    -   F/C EN0H and CCIN 2B93 - PCI Express Generation 2 (PCIe2)  2x10Gb FCoE 2x1Gb Ethernet SFP+ Adapter
    -   F/C EN0K and CCIN 2CC1 - PCI Express Generation 2 (PCIe2)  4-port (10Gb FCoE & 1Gb Ethernet) SFP+Copper and RJ45
    System firmware updates the adapter firmware level on these adapters to 1.1.58.4 when a supported adapter is placed into SR-IOV mode.
    The SR-IOV mode for Ethernet NIC is supported on the following OS levels:
    -   AIX 6.1Y TL3 SP2, or later
    -   AIX 7.1N TL3 SP2, or later
    -   IBM i 7.1 with TR8, or later
    -   SUSE Linux Enterprise Server 11 SP3
    -   Red Hat Enterprise Linux 6.5
    -   VIOS 2.2.3.2, or later
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support was added to the Advanced System Management Interface (ASMI) to provide a menu for "Power Supply Idle Mode".  Using the "Power Supply Idle Mode" menu, the mode can be either enabled, to save power by idling power supplies when possible, or disabled, to keep all power supplies fully on and allow a balanced load to be maintained on the power distribution units (PDUs) of the system.  Enabling power supply idle mode helps to reduce overall power usage when the system load is very light by having one power supply deliver all the power while the second power supply is maintained in a low power state.  All power supplies must be present and have support for power supply idle mode before power supply idle mode can be enabled.
    Power Supply Idle Mode is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support was added for monitored compliance of the Power Integrated Facility for Linux (IFL).  IFL is an optional lower cost per processor core activation for Linux-only workloads on IBM Power Systems.  Power IFL processor cores can be activated that are restricted to running Red Hat Linux or SUSE Linux.  In contrast, processor cores that are activated for general-purpose workloads can run any supported operating system.  Power IFL processor cores are enabled by feature code ELJ1 using Capacity Upgrade on Demand (CUoD).  Linux partitions can use IFL processors and the other processor cores, but AIX and IBM i5/OS cannot use the IFL processors.  The IFL monitored compliance process will send customer alert messages to the management console if the system is out of compliance for the number of IFL processors and general-purpose workload processors that are in active use compared to the number that have been licensed.
    Power IFL and monitored compliance are not supported on the IBM Power ESE (8412-EAD) system because it has the AIX operating system only.
  • System recovery for interrupted AC power and Voltage Regulator Module (VRM) failures has been enhanced for systems with multiple CEC enclosures such that a power AC or VRM fault on one CEC drawer will no longer block the other CEC drawers from powering on.  Previously, all CEC enclosures in a system needed valid AC power before the power on of the system could proceed.
    This system recovery feature does not pertain to the IBM Power ESE (8412-EAD) system because it is a single CEC enclosure system.
  • Support for IBM PCIe 3.0 x8 dual 4-port SAS RAID adapter with 12 GB cache with feature code EJ0L and CCIN 57CE.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support was added to the Management Console and the Virtual I/O Server (VIOS) to provide the capability to enable and disable individual virtual Ethernet adapters from the management console.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support was added for the IBM Flash Adapter 90 (#ES09) PCIe 2.0 x8 with 0.9TB of usable enterprise multi-level cell (eMLC) flash memory.  The system recognizes the PCI device as a high power device needing additional cooling and increases the fan speeds accordingly.  This flash feature also provides:
        -  Up to 325K read IOPS and less than 100 microsecond latency.
        -  Four independent flash controllers.
        -  Capacitive emergency power loss protection.
        -  Half-length, full-height PCIe card form factor.
    The IBM Flash Adapter 90 is not included in base AIX installation media.  AIX feature support can be acquired at IBM Fix Central: http://www-933.ibm.com/support/fixcentral/ by selecting the Product Group System Storage.  This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support for Management Console logical partition Universally Unique IDs (UUIDs) so that the HMC preserves the UUID for logical partitions on backup/restore and migration.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support for IBM PCIe 3.0 x8 non-caching 2-port SAS RAID adapter with feature code EJ0J and CCIN 57B4.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support for Power Enterprise System Pools allows for the aggregation of Capacity on Demand (CoD) resources, including processors and memory, to be moved from one pool server to any other pool server as needed.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support for a Management Console Performance and Capacity Monitor (PCM) function to monitor and manage both physical and virtual resources.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • Support for virtual server network (VSN) Phase 2 that delivers IEEE standard 802.1Qbg based on Virtual Ethernet Port Aggregator (VEPA) switching.  This supports the Management Console assignment of the VEPA switching mode to virtual Ethernet switches used by the virtual Ethernet adapters of the logical partitions.  The server properties in the Management Console will show the capability "Virtual Server Network Phase 2 Capable" as "True" for the system.
    This feature is not supported on IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
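
The IFL monitored compliance item above compares processors in active use with the number licensed.  The following is a minimal sketch of such a check, not the firmware's actual algorithm.  The accounting rules (non-Linux usage must fit within general-purpose activations, and total usage must fit within all activations) and every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    os: str            # "Linux", "AIX", or "IBM i"
    cores_in_use: int


def ifl_compliance(partitions, general_purpose_licensed, ifl_licensed):
    """Return (compliant, message) under the assumed accounting rules."""
    # Assumption: only Linux partitions may be counted against IFL cores,
    # so all other usage must be covered by general-purpose activations.
    non_linux = sum(p.cores_in_use for p in partitions if p.os != "Linux")
    total = sum(p.cores_in_use for p in partitions)

    if non_linux > general_purpose_licensed:
        return False, "non-Linux workloads exceed general-purpose activations"
    if total > general_purpose_licensed + ifl_licensed:
        return False, "total cores in use exceed all activations"
    return True, "in compliance"


if __name__ == "__main__":
    parts = [Partition("lnx1", "Linux", 6), Partition("aix1", "AIX", 4)]
    print(ifl_compliance(parts, general_purpose_licensed=4, ifl_licensed=4))
```

In this sketch a Linux partition may spill over onto general-purpose cores, which matches the statement that Linux partitions can use both kinds of processor cores.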

System firmware changes that affect all systems

  • A problem was fixed that prevented an HMC-managed system from being converted to manufacturing default configuration (MDC) mode when the management console command "lpcfgop -m <server> -o clear" failed to create the default partition.  The management console went to the incomplete state for this error.
  • A problem was fixed that logged an incorrect call home B7006956 NVRAM error during a power off of the system.  This error log indicates that the NVRAM of the system is in error and will be cleared on the next IPL of the system.  However, there is no NVRAM error and the error log was created because a reset/reload of the service processor occurred during the power off.
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Hardware Deconfiguration/Clear All Deconfiguration Errors" menu option was enhanced to clarify that when selecting "Hardware Resources" value of "All hardware resources", the service processor deconfiguration data is not cleared.   The "Service processor" must be explicitly selected for that to be cleared.
  • A firmware code update problem was fixed that caused the Hardware Management Console (HMC) to go to "Incomplete State" for the system with SRC E302F880 when assignment of a partition universal unique identifier (UUID) failed for a partition that was already running.  This problem happens for disruptive code updates from pre-770 levels to 770 or later levels.
  • A problem was fixed that caused frequent SRC B1A38B24 error logs with a call home every 15 seconds when service processor network interfaces were incorrectly configured on the same subnet.  The frequency of the notification of the network subnet error has been reduced to once every 24 hours.
  • A problem was fixed that caused a memory clock failure to be called out as failure in the processor clock FRU.
  • A problem was fixed where a 12V DC power-good (pGood) input fault was reported as a SRC 11002620 with the wrong FRU callout of Un-P1 for system backplane.  The FRU callout for SRC 11002620 has been corrected to Un-P2 for I/O card.
  • A problem was fixed that prevented guard error logs from being reported for FRUs that were guarded during the system power on.  This could happen if the same FRU had been previously reported as guarded on a different power on of the system.  The requirement is now met that guarded FRUs are logged on every power on of the system.
  • A problem was fixed for the Advanced System Management Interface (ASMI) "Login Profile/Change Password" menu where ASMI would fail with "Console Internal Error, status code 500" displayed on the web browser when an incorrect current password was entered.
  • A problem was fixed for a system with pool resources where a resource remove operation caused the number of unreturned resources to become incorrect.  This problem occurred if the system first became out of compliance with overdue unreturned resources and then another removal of pool resources from the server was attempted.
  • A problem was fixed for the Advanced System Management Interface (ASMI)  "System Information/Firmware Maintenance History" menu option on the service processor to display the firmware maintenance history instead of the message  "No code update history log was found".
  • A problem was fixed for a Live Partition Mobility (LPM) suspend and transfer of a partition that caused the time of day to skip ahead to an incorrect value on the target system.  The problem only occurred when a suspended partition was migrated to a target CEC that had a hypervisor time that was later than the source CEC.
  • A problem was fixed for IBM Power Enterprise System Pools that prevented the management console from changing from the backup to the master role for the enterprise pool.  The following error message was displayed on management console:  "HSCL90F7 An internal error occurred trying to set a new master management console for the Power enterprise pool. Try the operation again.  If this error persists, contact your service representative."
    This defect does not pertain to the IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • A problem was fixed for Live Partition Mobility (LPM) where a 2x performance decrease occurs during the resume phase of the migration when migrating from a system with 780 or later firmware back to a system with a pre-780 level of firmware.
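
The SRC B1A38B24 item above reduces a repeated notification from every 15 seconds to once every 24 hours.  A minimal sketch of that kind of per-SRC throttling follows.  The class and method names are hypothetical and this is not the service processor's implementation.

```python
import time

DAY_SECONDS = 24 * 60 * 60


class ThrottledNotifier:
    """Allow at most one notification per SRC per min_interval seconds."""

    def __init__(self, min_interval=DAY_SECONDS, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock        # injectable so the logic can be tested
        self.last_sent = {}       # SRC string -> time of last notification

    def should_notify(self, src):
        now = self.clock()
        last = self.last_sent.get(src)
        if last is not None and now - last < self.min_interval:
            return False          # still inside the quiet period
        self.last_sent[src] = now
        return True


if __name__ == "__main__":
    notifier = ThrottledNotifier()
    print(notifier.should_notify("B1A38B24"))   # first report goes out
    print(notifier.should_notify("B1A38B24"))   # repeat is suppressed
```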

System firmware changes that affect certain systems

  • On systems with multiple CEC drawers or nodes, a problem was fixed in the service processor Advanced System Management Interface (ASMI) performance dump collection that only allowed performance data to be collected for the first node of the system.  The  "System Service Aids/Performance Dump" menu of the ASMI is used to work with the performance dump.
  • On systems involved in a series of consecutive Live Partition Mobility (LPM) operations, a memory leak problem was fixed in the run time abstraction service (RTAS) that caused a partition run time AIX crash with SRC 0c20.  Other possible symptoms include error logs with SRC BA330002 (RTAS memory allocation failure).
  • On systems running Dynamic Platform Optimizer (DPO) with one or more unlicensed processors, a problem was fixed where the system performance was significantly degraded during the DPO operation.  The amount of performance degradation was more for systems with larger numbers of unlicensed processors.
  • On systems with a redundant service processor, a problem was fixed where the service processor allowed a clock failover to occur without a SRC B158CC62 error log and without a hardware deconfiguration record for the failed clock source.  This resulted in the system running with only one clock source and without any alerts to warn that clock redundancy had been lost.
  • DEFERRED:  On systems with a redundant service processor, a problem was fixed that caused a system termination with SRC B158CC62 during a clock failover initiated by certain types of clock card failures.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such customers may wait for the next planned service window to activate the deferred fix via a system reboot.
    This problem does not pertain to IBM Power 770 (9117-MMB) and IBM Power 780 (9179-MHB) systems.
  • On systems with a management console and service processors configured with Internet Protocol version 6 (IPv6) addresses,  a problem was fixed that prevented the management console from discovering the service processor.  The Service Location Protocol (SLP) on the service processor was not being enabled for IPv6, so it was unable to respond to IPv6 queries.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that occurred during Offline Converter Assembly (OCA) replacement operations.  The fix prevents a false Voltage Regulator Module (VRM) fault and the logging of SRCs 10001511 or 10001521 from occurring.  The false fault resulted in the OCA LED getting stuck in an on or "fault" state and the OCA not powering on.
  • On systems with one memory clock deconfigured, a problem was fixed where the system failed to IPL using the second memory clock with SRCs B158CC62 and B181C041 logged.
  • On systems that require in-band flash to update system firmware, a problem was fixed so in-band update would not fail if the Permanent (P) or the Temporary (T) side of the service processor was marked invalid.   Attempting to in-band flash from the AIX or Linux command line failed with a BA280000 log reported.  Attempting to in-band flash from the AIX diagnostics menus also failed because the flash menu options did not appear in this case.
  • On a system with a partition that has both an AIX and a Linux boot source to support dual booting, a problem was fixed that caused the Host Ethernet Adapter (HEA) to be disabled when rebooting from Linux to AIX.  Linux had disabled interrupts for the HEA on power down, causing an error for AIX when it tried to use the HEA to access the network.
  • On a system with a disk device with multiple boot partitions, a problem was fixed that caused System Management Services (SMS) to list only one boot partition.  Even though only one boot partition was listed in SMS, the AIX bootlist command could still be used to boot from any boot partition.
  • On systems with a redundant service processor with AC power missing to the node containing the anchor card, a problem was fixed that caused an IPL failure with SRC B181C062 when the anchor card could not be found in the vital product data (VPD) for the system.  With the fix, the system is able to find the anchor card and IPL since the anchor card gets its power from the service processor cable, not from the node where it resides.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • On a system with sixteen or more logical partitions, a problem was fixed for a memory relocation error during concurrent hot node repair that caused a hang or a failure.  The problem can also be triggered by mirrored memory defragmentation on a system with selective memory mirroring.
AM780_054_040 / FW780.02

04/18/14
Systems  9117-MMB and 9179-MHB ONLY
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems
  • HIPER/Pervasive:  A  security problem was fixed in the OpenSSL Montgomery ladder implementation for the ECDSA (Elliptic Curve Digital Signature Algorithm) to protect sensitive information from being obtained with a flush and reload cache side-channel attack to recover ECDSA nonces from the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0076.  The stolen ECDSA nonces could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) to not allow Heartbeat Extension packets to trigger a buffer over-read to steal private keys for the encrypted sessions on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0160 and it is also known as the Heartbleed vulnerability.  The stolen private keys could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • A  security problem was fixed for the Lighttpd web server that allowed arbitrary SQL commands to be run on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2323.
  • A security problem was fixed for the Lighttpd web server where improperly-structured URLs could be used to view arbitrary files on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2324.
AM780_050_040 / FW780.01

03/10/14
Systems  9117-MMB and 9179-MHB ONLY
Impact:  Data      Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Non-Pervasive:  A problem was fixed for a potential silent data corruption issue that may occur when a Live Partition Mobility (LPM) operation is performed from a system (source system) running a firmware level earlier than AH780_040 or AM780_040 to a system (target system) running AH780_040 or AM780_040.
AM780_040_040 / FW780.00

12/06/13
Systems  9117-MMB and 9179-MHB ONLY
Impact:  New      Severity:  New

New Features and Functions

  • Support was added to upgrade the service processor to OpenSSL version 1.0.1 and for compliance with National Institute of Standards and Technology (NIST) Special Publication 800-131A.  SP800-131A compliance required the use of stronger cryptographic keys and more robust cryptographic algorithms.
  • Support was added to the Virtual I/O Server (VIOS) for Universal Serial Bus (USB) removable hard-disk drive (HDD) devices.
  • Support was added in Advanced System Management Interface (ASMI) to facilitate capture and reporting of debug data for system performance problems.  The  "System Service Aids/Performance Dump" menu was added to ASMI to perform this function.
  • Support was added to the Management Console for group-based LDAP authentication.
  • Partition Firmware was enhanced to be able to recognize and boot from disks formatted with the GUID Partition Table (GPT) format that are capable of being greater than 2TB in size.  GPT is a standard for the layout of the partition table on a physical hard disk, using globally unique identifiers (GUID), that does not have the 2TB limit that is imposed by the DOS partition format.
  • The call home data for every serviceable event of the system was enhanced to include information on every guarded element (processor, memory, I/O chip, etc.) and contains the part number and location codes of the FRUs and the service processor de-configuration policy settings.
  • Support for Dynamic Platform Optimizer (DPO) enhancements to show the logical partition current and potential affinity scores.  The Management Console has also been enhanced to show the partition scoring.  The operating system (OS) levels that support DPO:

                ◦ AIX 6.1 TL8 or later
                ◦ AIX 7.1 TL2 or later
                ◦ VIOS 2.2.2.0
                ◦ IBM i 7.1 PTF MF56058
                ◦ Linux RHEL7
                ◦ Linux SLES12

         Note: If DPO is used with an older version of the OS that predates the above levels, either:
                   - The partition needs to be rebooted after DPO completes to optimize placement, or
                   - The partition is excluded from participating in the DPO operation (through a command line option on the "optmem" command that is used to initiate a
                      DPO operation).

  • Support for Dynamic Platform Optimizer (DPO) on 9117-MMB and 9179-MHB systems.
  • Support for Management Console command line to configure the ECC call home path for SSL proxy support.
  • Support for Management Console to minimize recovery state problems by using the hypervisor and VIOS configuration data to recreate partition data when needed.
  • Support for Management Console to provide scheduled operations to check if the partition affinity falls below a threshold and alert the user that Dynamic Platform Optimizer (DPO) is needed.
  • Support for enhanced platform serviceability to extend call home to include hardware in need of repair and to issue periodic service events to remind of failed hardware.
  • Support for Virtual I/O Server (VIOS) to support 4K block size DASD as a virtual device.
  • Support for performance improvements for concurrent Live Partition Mobility (LPM) migrations.
  • Support for Management Console to handle all Virtual I/O Server (VIOS) configuration tasks and provide assistance in configuring partitions to use redundant VIOS.
  • Support for Management Console to maintain a profile that is synchronized with the current configuration of the system, including Dynamic Logical Partitioning (DLPAR) changes.
  • Support for Virtual I/O Server (VIOS) for an IBMi client data connection to a SIS64 device driver backed by VSCSI physical volumes.
  • Support was dropped for Secure Sockets Layer (SSL) protocol version 2 and SSL weak and medium cipher suites in the service processor web server (Lighttpd).  Unsupported web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if those browsers do not support SSL version 3.  Supported web browsers for Power7 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
  • Support was added in Advanced System Management Interface (ASMI) "System Configuration/Firmware Update Policy" menu to detect and display the appropriate Firmware Update Policy (depending on whether system is HMC managed) instead of requiring the user to select the Firmware Update Policy.  The menu also displays the "Minimum Code Level Supported" value.
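
The 2TB limit of the DOS partition format mentioned in the GPT item above follows from its 32-bit sector addressing.  The short calculation below works through the arithmetic, assuming 512-byte logical sectors (the sector size is an assumption; disks with larger logical sectors shift the limit accordingly).  GPT uses 64-bit logical block addresses.

```python
SECTOR_BYTES = 512      # assumed logical sector size
TIB = 2 ** 40           # bytes in one tebibyte


def max_addressable_bytes(lba_bits, sector_bytes=SECTOR_BYTES):
    """Largest byte range reachable with lba_bits-wide sector addresses."""
    return (2 ** lba_bits) * sector_bytes


if __name__ == "__main__":
    dos_limit = max_addressable_bytes(32)   # DOS/MBR: 32-bit sector fields
    gpt_limit = max_addressable_bytes(64)   # GPT: 64-bit LBAs
    print("DOS partition format limit:", dos_limit // TIB, "TiB")
    print("GPT addressing limit:", gpt_limit // TIB, "TiB")
```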

System firmware changes that affect all systems

  • A problem was fixed that caused a service processor OmniOrb core dump with SRC B181EF88 logged.
  • A problem was fixed that caused the system attention LED to stay lit when a bad FRU was replaced.
  • A problem was fixed that caused a memory leak of 50 bytes of service processor memory for every call home operation.  This could potentially cause an out of memory condition for the service processor when running over an extended period of time without a reset.
  • A problem was fixed that caused a L2 cache error to not guard out the faulty processor, allowing the system to checkstop again on an error to the same faulty processor.
  • A problem was fixed that caused an HMC code update for the FSP to fail on the accept operation with SRC B1811402, or left the FSP unable to boot on the updated side.
  • A problem was fixed that caused a system checkstop during hypervisor time keeping services.
  • A problem was fixed that caused a built-in self test (BIST) for GX slots to create corrupt error log values that core dumped the service processor with a B18187DA.  The corruption was caused by a failure to initialize the BIST array to 0 before starting the tests.
  • The Hypervisor was enhanced to allow the system to continue to boot using the redundant Anchor (VPD) card, instead of stopping the Hypervisor boot and logging SRC B7004715,  when the primary Anchor card has been corrupted.
  • A problem was fixed with the Dynamic Platform Optimizer (DPO) that caused memory affinity to be incorrectly reported to the partitions before the memory was optimized.   When this occurs, the performance is impacted over what would have been gained with the optimized memory values.
  • A problem was fixed that caused a migrated partition to reboot during transfer to a VIOS 2.2.2.0, and later, target system. A manual reboot would be required if transferred to a target system running an earlier VIOS release. Migration recovery may also be necessary.
  • A problem was fixed that can cause Anchor (VPD) card corruption and A70047xx SRCs to be logged.  Note: If a serviceable event with SRC A7004715 is present or was logged previously, damage to the VPD card may have occurred.  After the fix is applied, replacement of the Anchor VPD card is recommended in order to restore full redundancy.
  • The firmware was enhanced to display on the management console the correct number of concurrent Live Partition Mobility (LPM) operations that is supported.
  • A problem was fixed that caused a 1000911E platform event log (PEL) to be marked as not call home.  The PEL is now a call home to allow for correction.  This PEL is logged when the hypervisor has changed the Machine Type Model Serial Number (MTMS) of an external enclosure to UTMP.xxx.xxxx because it cannot read the vital product data (VPD), or the VPD has invalid characters, or the MTMS is a duplicate of another enclosure.
  • A problem was fixed that caused the state of the Host Ethernet Adapter (HEA) port to be reported as down when the physical port is actually up.
  • When powering on a system partition, a problem was fixed that caused the partition universal unique identifier (UUID) to not get assigned, causing a B2006010 SRC in the error log.
  • For the sequence of a reboot of a system partition followed immediately by a power off of the partition, a problem was fixed where the hypervisor virtual service processor (VSP) incorrectly retained locks for the powered off partition, causing the CEC to go into recovery state during the next power on attempt.
  • A problem was fixed that caused an error log generated by the partition firmware to show conflicting firmware levels.  This problem occurs after a firmware update or a Live Partition Mobility (LPM) operation on the system.
  • A problem was fixed that caused the system attention LED to be lit without a corresponding SRC and error log for the event.  This problem typically occurs when an operating system on a partition terminates abnormally.
  • A problem was fixed that caused the slot index to be missing for virtual slot number 0 for the dynamic reconfiguration connector (DRC) name for virtual devices.  This error was visible from the management console when using commands such as "lshwres -r virtualio --rsubtype slot -m machine" to show the hardware resources for virtual devices.
  • A problem was fixed that caused a system checkstop with SRC B113E504 for a recoverable hardware fault.
  • A problem was fixed during resource dump processing that caused a read of an invalid system memory address and a SRC B181C141.  The invalid memory reference resulted from the service processor incorrectly referencing memory that had been relocated by the hypervisor.

System firmware changes that affect certain systems

  • A problem was fixed that caused fans to increase to maximum speeds with SRC B130B8AF logged as a result of thermal sensors with calibration errors.
  • On systems with an I/O tower attached, a problem was fixed that caused multiple service processor reset/reloads if the tower was continuously sending invalid System Power Control Network (SPCN) status data.
  • On systems with a redundant service processor, a problem was fixed that caused fans to run at a high-speed after a failover to the sibling service processor.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, the firmware was enhanced to guarantee that an SRC will be generated when there is a power supply voltage fault.  If no SRC is generated, a loss of power redundancy may not be detected, which can lead to a drawer crash if the other power supply goes down.  This also fixes a problem that causes an 8 GB Fiber channel adapter in the drawer to fail if the 12V level fails in one Offline Converter Assembly (OCA).
  • On systems managed by an HMC with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that caused the hardware topology on the management console for the managed system to show "null" instead of "operational" for the affected I/O drawers.
  • On systems with a redundant service processor, a problem was fixed that prevented the deconfiguration details of a guarded sibling service processor from being shown in the Advanced System Management Interface (ASMI).
  • On systems with a redundant service processor, a problem was fixed that caused a SRC B150D15E to be erroneously logged after a failover to the sibling service processor.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed where an Offline Converter Assembly (OCA) fault would appear to persist after an OCA micro-reset or OCA replacement.  The fault bit reported to the OS may not be cleared, indicating a fault still exists in the I/O drawer after it has been repaired.
  • When switching between turbocore and maxcore mode, a problem was fixed that caused the number of supported partitions to be reduced by 50%.
  • On systems in turbocore mode with unlicensed processors, a problem was fixed that caused an incorrect processor count.  The AIX command lparstat gave too high a value for "Active Physical CPUs in system" when it included unlicensed turbocore processors in the count instead of just counting the licensed processors.
  • A problem was fixed that was caused by an attempt to modify a virtual adapter from the management console command line when the command specifies it is an Ethernet adapter, but the virtual ID specified is for an adapter type other than Ethernet.  The managed system has to be rebooted to restore communications with the management console when this problem occurs; SRC B7000602 is also logged.
  • On systems running AIX or Linux, a problem was fixed that caused the operating system to halt when an InfiniBand Host Channel Adapter (HCA) adapter fails or malfunctions.
  • On systems running AIX or Linux, a hang in a Live Partition Mobility (LPM) migration for remote restart-capable partitions was fixed by adding a time-out for the required paging space to become available.  If after five minutes the required paging space is not available, the start migration command returns an error code of 0x40000042 (PagingSpaceNotReady) to the management console.
  • On systems running Dynamic Platform Optimizer (DPO) with no free memory, a problem was fixed that caused the Hardware Management Console (HMC) lsmemopt command to report the wrong status of completed with no partitions affected.  It should have indicated that DPO failed due to insufficient free memory.  DPO can only run when there is free memory in the system.
  • On systems with partitions using physical shared processor pools, a problem was fixed that caused partition hangs if the shared processor pool was reduced to a single processor.
  • On a system running a Live Partition Mobility (LPM) operation, a problem was fixed that caused the partition to successfully appear on the target system, but hang with a 2005 SRC.
  • A problem was fixed that caused SRC BA330000 to be logged after the successful migration of a partition running Ax740_xxx or Ax730_xxx firmware to a system running Ax760, or a later release, of firmware.  This problem can also cause SRCs BA330002, BA330003, and BA330004 to be erroneously logged over time when a partition is migrated from a system running Ax760, or a later release, to a system running Ax740_xxx or Ax730_xxx firmware.
  • On systems using IPv6 addresses, the firmware was enhanced to reduce the time it takes to install an operating system using the Network Installation Manager (NIM).
  • On systems managed by a management console, a problem was fixed that caused a partition to become unresponsive when the AIX command "update_flash -s" is run.
  • On systems with turbo-core enabled that are a target of Live Partition Mobility (LPM),  a problem was fixed where cache properties were not recognized and SRCs BA280000 and BA250010 reported.
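
The LPM paging space item above replaces an indefinite wait with a five-minute time-out that returns 0x40000042 (PagingSpaceNotReady).  The following sketch shows that pattern as a polling loop with a deadline.  The function name, the polling interval, and the success value are hypothetical; only the error code and the five-minute limit come from the fix description.

```python
import time

PAGING_SPACE_NOT_READY = 0x40000042   # error code quoted in the fix text
SUCCESS = 0                           # assumed success value (hypothetical)


def start_migration(paging_space_ready, timeout_s=300, poll_s=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Wait for paging space; give up with an error after timeout_s seconds."""
    deadline = clock() + timeout_s
    while True:
        if paging_space_ready():
            return SUCCESS
        if clock() >= deadline:
            # Fail instead of hanging so the caller can report the problem.
            return PAGING_SPACE_NOT_READY
        sleep(poll_s)


if __name__ == "__main__":
    rc = start_migration(lambda: True)
    print(hex(rc))
```

Passing the clock and sleep functions as parameters keeps the loop testable without waiting five real minutes.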

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail on an erroneously logged error for the service processor battery with  SRCs B15A3303, B15A3305, and  B181EA35 reported.
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail if a memory channel failure on the CEC was followed by a service processor reset/reload.
  • A problem was fixed that caused SRC B15A3303  to be erroneously logged as a predictive error on the service processor sibling after a successful concurrent repair maintenance operation for the real-time clock (RTC) battery.
  • A problem was fixed that prevented the I/O slot information from being presented on the management console after a concurrent node repair.
  • A problem was fixed that caused Capacity on Demand (COD) "Out of Compliance" messages during concurrent maintenance operations when the system was actually in compliance for the licensed amount of resources in use.


AM730
Systems  9117-MMB and 9179-MHB ONLY
For Impact, Severity and other Firmware definitions, Please refer to the below 'Glossary of firmware terms' url:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
AM730_146_035 / FW730.A0

01/28/15
Impact: Security         Severity:  ATT

New Features and Functions

  • Support was added for using the Mellanox ConnectX-3 Pro 10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.
  • An enhancement was made for the Global Interrupt Queue (GIQ) so that interrupts are presented in a round-robin fashion in partitions that have idle processors instead of GIQ directed interrupts favoring lower numbered processors.
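
The difference between favoring lower-numbered processors and round-robin presentation can be shown with a small Python sketch.  This is an assumed model for illustration only: the hypervisor's actual GIQ selection logic is not described in this document, and the function and class names below are hypothetical.

```python
def pick_lowest_idle(idle):
    """Model of the earlier behavior: always favor the lowest-numbered idle processor."""
    return min(idle) if idle else None


class RoundRobinPicker:
    """Model of the enhanced behavior: rotate through the idle processors."""

    def __init__(self, num_processors):
        self.num_processors = num_processors
        self.next_start = 0  # processor number where the next search begins

    def pick(self, idle):
        # Scan once around the ring, starting just after the last processor chosen.
        for offset in range(self.num_processors):
            cpu = (self.next_start + offset) % self.num_processors
            if cpu in idle:
                self.next_start = (cpu + 1) % self.num_processors
                return cpu
        return None  # no idle processor available
```

With processors 0, 1, and 3 idle out of four, pick_lowest_idle returns 0 for every interrupt, while successive calls to RoundRobinPicker.pick return 0, 1, 3, and then 0 again, so the interrupt load is spread across the idle processors.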

System firmware changes that affect all systems

  • A  security problem was fixed for the Lighttpd web server that allowed arbitrary SQL commands to be run on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2323.
  • A security problem was fixed for the Lighttpd web server where improperly-structured URLs could be used to view arbitrary files on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2324.
  • A security problem was fixed for the Network Time Protocol (NTP) client that allowed remote attackers to execute arbitrary code via a crafted packet containing an extension field.  The Common Vulnerabilities and Exposures issue number is CVE-2009-1252.
  • A security problem was fixed for the Network Time Protocol (NTP) client for a buffer overflow that allowed remote NTP servers to execute arbitrary code via a crafted response.  The Common Vulnerabilities and Exposures issue number is CVE-2009-0159.
  • A security problem was fixed in the service processor TCP/IP stack to discard illegal TCP/IP packets that have the SYN and FIN flags set at the same time.  An explicit packet discard was needed to prevent further processing of the packet that could result in a bypass of the iptables firewall rules.
  • A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed a man-in-the-middle attacker, via a specially crafted fragmented handshake packet, to force a TLS/SSL server to use TLS 1.0, even if both the client and server supported newer protocol versions. The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3511.
  • A security problem was fixed in OpenSSL for formatting fields of security certificates without null-terminating the output strings.  This could be used to disclose portions of the program memory on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3508.
  • Multiple security problems were fixed in the way that OpenSSL handled Datagram Transport Layer Security (DTLS) packets.  A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and CVE-2014-3507.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests.  A specially crafted DTLS handshake packet with an included Supported EC Point Format extension could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3509.
  • A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Diffie Hellman (DH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3510.
  • A security problem was fixed in OpenSSL for memory leaks that allowed remote attackers to cause a denial of service (out of memory on the service processor). The Common Vulnerabilities and Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
  • A security problem was fixed in the Advanced System Management Interface (ASMI) to block click-jacking attempts. This prevents framing of the original ASMI page with a top layer on it with dummy buttons that could trick the user into clicking on a link.
  • A problem was fixed that caused a "code accept" during a concurrent firmware installation from the management console to fail with SRC E302F85C.
  • A problem was fixed for the callout on power good (pgood) fault SRC 11002634 so that it includes the CEC enclosure and the failing FRU.  Previously, the callout was missing the failing FRU.
  • A security problem was fixed in OpenSSL for padding-oracle attacks known as Padding Oracle On Downgraded Legacy Encryption (POODLE).  This attack allows a man-in-the-middle attacker to obtain a plain text version of the encrypted session data. The Common Vulnerabilities and Exposures issue number is CVE-2014-3566.  The service processor POODLE fix is based on a selective disablement of SSLv3 using the Advanced System Management Interface (ASMI) "System Configuration/Security Configuration" menu options.  The Security Configuration options of "Disabled", "Default", and "Enabled" for SSLv3 determine the level of protection from POODLE.  The management console also requires a POODLE fix for APAR MB03867 (Fix for CVE-2014-3566 for HMC V7 R7.9.0 SP1 with PTF MH01484) to eliminate all vulnerability to POODLE and allow use of option 1 "Disabled" as shown below:
    -1) Disabled:  This highest level of security protection does not allow service processor clients to connect using SSLv3, thereby eliminating any possibility of a POODLE attack.  All clients must be capable of using TLS to make the secured connections to the service processor to use this option.  This requires the management console be at a recommended minimum level of HMC V7 R7.9.0 SP1 with POODLE PTF MH01484.
    -2) Default:  This medium level of security protection disables SSLv3 for the web browser sessions to ASMI and for the CIM clients and assures them of POODLE-free connections.  But the legacy management consoles are allowed to use SSLv3 to connect to the service processor.  This is intended to allow non-POODLE compliant HMC levels to be able to connect to the CEC servers until they can be planned and upgraded to the POODLE compliant HMC levels.  Running a non-POODLE compliant HMC to a service processor in  "Default" mode will prevent the ASMI-proxy sessions from the HMC from connecting as these proxy sessions require SSLv3 support in ASMI.
    -3) Enabled:  This basic level of security protection enables SSLv3 for all service processor client connections.  It relies on all clients being at POODLE fix compliant levels to provide full POODLE protection using the TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3 connections.  This option is intended for customer sites on protected internal networks that have a large investment in legacy hardware that needs SSLv3 to make browser and HMC connections to the service processor.  The level of POODLE protection actually achieved in "Enabled" mode is determined by the percentage of clients that are at the POODLE fix compliant levels.
  • A problem was fixed for a Live Partition Mobility (LPM) suspend and transfer of a partition that caused the time of day to skip ahead to an incorrect value on the target system.  The problem only occurred when a suspended partition was migrated to a target CEC that had a hypervisor time that was later than the source CEC.
  • A problem was fixed that could result in latency or timeout issues with I/O devices.
  • A problem was fixed for I/O adapters so that BA400002 errors were changed to informational for memory boundary adjustments made to the size of DMA map-in requests.  These DMA size adjustments were marked as UE previously for a condition that is normal.
  • A problem was fixed for the Advanced System Management Interface (ASMI) that allowed possible cross-site request forgery (CSRF) exploitation of the ASMI user session to do unwanted tasks on the service processor.
  • A problem was fixed for intermittent B181EF88 SRCs and netsSlp core dumps during network configurations on the service processor.  This error caused call home activity for the SRC and dumps but otherwise had no impact to the CEC functionality.
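
The three SSLv3 Security Configuration options described in the POODLE fix above can be summarized as a lookup of which client classes may still connect with SSLv3.  The Python sketch below is a reading aid under stated assumptions, not service processor code: the client-class names (asmi_browser, cim_client, management_console) and the function name are hypothetical labels chosen for the example, and the table only restates the option descriptions.

```python
# Client classes that may still use SSLv3 under each ASMI option,
# as read from the option descriptions.  The class names are hypothetical.
SSLV3_ALLOWED = {
    "Disabled": set(),                    # no client may connect with SSLv3
    "Default": {"management_console"},    # only legacy management consoles
    "Enabled": {"asmi_browser", "cim_client", "management_console"},
}


def sslv3_permitted(option, client_class):
    """Return True if the given client class may use SSLv3 under the option."""
    if option not in SSLV3_ALLOWED:
        raise ValueError(f"unknown Security Configuration option: {option!r}")
    return client_class in SSLV3_ALLOWED[option]
```

For example, sslv3_permitted("Default", "asmi_browser") is False, which matches the note that ASMI-proxy sessions from a non-POODLE compliant HMC cannot connect in "Default" mode because they need SSLv3 support in ASMI.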

System firmware changes that affect certain systems

  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed for a hypervisor hang at progress code C7004091 during the IPL or hangs during serviceability tasks to the I/O drawer.
  • On systems that have Active Memory Sharing (AMS) partitions, a problem was fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove that leaves a logical memory block (LMB) in an unusable state until partition reboot.
  • On systems using the Virtual I/O Server (VIOS) to share physical I/O resources among client logical partitions, a problem was fixed for memory relocation errors during page migrations for the virtual control blocks.  These errors caused a CEC termination with SRC B700F103.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • A problem was fixed that could result in unpredictable behavior if a memory UE is encountered while relocating the contents of a logical memory block during one of these operations:
    - Using concurrent maintenance to perform a hot repair of a node.
    - Reducing the size of an Active Memory Sharing (AMS) pool.
    - On systems using mirrored memory, using the memory mirroring optimization tool.
  • A problem was fixed for systems in networks using the Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to prevent network ping errors and boot from network (bootp) failures.  The Address Resolution Protocol (ARP) table information on the Juniper aggregated switches is not being shared between the switches and that causes problems for address resolution in certain network configurations.  Therefore, the CEC network stack code has been enhanced to add three gratuitous ARPs (ARP replies sent without a request received) before each ping and bootp request to ensure that all the network switches have the latest network information for the system.
  • On systems in IPv6 networks, a  problem was fixed for a network boot/install failing with SRC B2004158 and IP address resolution failing using neighbor solicitation to the partition firmware client.
  • For systems with an IBM i load source disk attached to an Emulex-based fibre channel adapter such as F/C #5735, a problem was fixed that caused an IBM i load source boot to fail with SRC B2006110 logged and a message to the boot console of "SPLIT-MEM Out of Room".  This problem occurred for load source disks that needed extra disk scans to be found, such as those attached to a port other than the first port of a fibre channel adapter (first port requires fewest disk scans).
  • On systems with a partition that has a 256MB Real Memory Offset (RMO) region size that has been migrated from a Power8 system to  Power7 or Power6 using Live Partition Mobility (LPM), a problem was fixed that caused a failure on the next boot of the partition with a BA210000 log with a CA000091 checkpoint just prior to the BA210000.  The fix dynamically adjusts the memory footprint of the partition to fit on the earlier Power systems.
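
The gratuitous ARP behavior described above for the Juniper switch fix can be sketched in Python as follows.  This is a minimal illustration, not the CEC network stack code.  The function names and the send_frame callback are hypothetical, and setting the target hardware address to the broadcast address is an assumption (implementations differ on that field); the count of three announcements before each request comes from the fix description.

```python
import struct

BROADCAST_MAC = b"\xff" * 6
GRATUITOUS_ARP_COUNT = 3  # count stated in the fix description


def build_gratuitous_arp(mac, ip):
    """Build an Ethernet frame carrying a gratuitous ARP reply for (mac, ip).

    mac is 6 raw bytes and ip is 4 raw bytes (IPv4).
    """
    ethernet = BROADCAST_MAC + mac + struct.pack("!H", 0x0806)  # EtherType ARP
    # Hardware type 1 (Ethernet), protocol 0x0800 (IPv4), lengths 6 and 4, opcode 2 (reply).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    # Sender and target protocol addresses are the same in a gratuitous ARP.
    # Using the broadcast address as target hardware address is an assumption.
    arp += mac + ip + BROADCAST_MAC + ip
    return ethernet + arp


def send_request(send_frame, mac, ip, request_frame):
    """Send three gratuitous ARPs, then the ping or bootp request itself."""
    announcement = build_gratuitous_arp(mac, ip)
    for _ in range(GRATUITOUS_ARP_COUNT):
        send_frame(announcement)
    send_frame(request_frame)
```

Passing a list's append method as send_frame is enough to see the ordering: three identical 42-byte ARP frames followed by the request frame.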

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed for concurrent maintenance operations to limit hardware retries on failed hardware so that it can be concurrently repaired.
  • A problem was fixed for a power off failure of an expansion drawer (F/C 5802 or F/C 5877) during a concurrent repair.  The power off commands to the drawer are now tried again using the System Power Control Network (SPCN) serial connection to the drawer to allow the repair to continue.
  • A problem was fixed for concurrent maintenance to prevent a hardware unavailable failure when doing consecutive concurrent remove and add operations to an I/O Hub adapter for a drawer.
AM730_142_035 / FW730.91

06/24/14
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication.  A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments.  This could be used to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195.
  • HIPER/Pervasive:  Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service.  These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470.
AM730_127_035 / FW730.90

04/02/14
Impact: Availability         Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed that caused a built-in self test (BIST) for GX slots to create corrupt error log values that core dumped the service processor with a B18187DA.  The corruption was caused by a failure to initialize the BIST array to 0 before starting the tests.
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Hardware Deconfiguration/Clear All Deconfiguration Errors" menu option was enhanced to clarify that when selecting "Hardware Resources" value of "All hardware resources", the service processor deconfiguration data is not cleared.   The "Service processor" must be explicitly selected for that to be cleared.
  • A problem was fixed that prevented guard error logs from being reported for FRUs that were guarded during the system power on.  This could happen if the same FRU had been previously reported as guarded on a different power on of the system.  The requirement is now met that guarded FRUs are logged on every power on of the system.
  • DEFERRED: A problem was fixed that caused a system checkstop with SRC B113E504 for a recoverable hardware fault.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such customers may wait for the next planned service window to activate the deferred fix via a system reboot.
  • A problem was fixed that caused a memory clock failure to be called out as a failure in the processor clock FRU.

System firmware changes that affect certain systems

  • On systems with a redundant service processor, a problem was fixed where the service processor allowed a clock failover to occur without a SRC B158CC62 error log and without a hardware deconfiguration record for the failed clock source.  This resulted in the system running with only one clock source and without any alerts to warn that clock redundancy had been lost.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed where an Offline Converter Assembly (OCA) fault would appear to persist after an OCA micro-reset or OCA replacement.  The fault bit reported to the OS may not be cleared, indicating a fault still exists in the I/O drawer after it has been repaired.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that occurred during Offline Converter Assembly (OCA) replacement operations.  The fix prevents a false Voltage Regulator Module (VRM) fault and the logging of SRCs 10001511 or 10001521.  The false fault resulted in the OCA LED getting stuck in an on or "fault" state and the OCA not powering on.
  • On systems involved in a series of consecutive Live Partition Mobility (LPM) operations, a memory leak problem was fixed in the run time abstraction service (RTAS) that caused a partition run time AIX crash with SRC 0c20.  Other possible symptoms include error logs with SRC BA330002 (RTAS memory allocation failure).
  • On a system with partitions with redundant Virtual Asynchronous Services Interface (VASI) streams,  a problem was fixed that caused the system to terminate with SRC B170E540.  The affected partitions include Active Memory Sharing (AMS), encapsulated state partitions, and hibernation-capable partitions.  The problem is triggered when the management console attempts to change the active VASI stream in a redundant configuration.  This may occur due to a stream reconfiguration caused by Live Partition Mobility (LPM); reconfiguring from a redundant Paging Service Partition (PSP) to a single-PSP configuration; or conversion of a partition from AMS to dedicated memory.
  • On systems with one memory clock deconfigured, a problem was fixed where the system failed to IPL using the second memory clock with SRCs B158CC62 and B181C041 logged.
  • On a system with a disk device with multiple boot partitions, a problem was fixed that caused System Management Services (SMS) to list only one boot partition.  Even though only one boot partition was listed in SMS, the AIX bootlist command could still be used to boot from any boot partition.
  • On a system with a partition with an AIX and Linux boot source to support dual booting, a problem was fixed that caused the Host Ethernet Adapter (HEA) to be disabled when rebooting from Linux to AIX.  Linux had disabled interrupts for the HEA on power down, causing an error for AIX when it tried to use the HEA to access the network.
AM730_122_035 / FW730.80

09/18/13
Impact:  Availability     Severity:  SPE

Note:  This service pack includes several critical concurrent fixes and a deferred fix which has a very low probability of occurrence.   IBM recommends that customers concurrently install the service pack, to protect their system against known issues, but can wait to activate the deferred fix, via a system reboot, until the next scheduled service window.

New Features and Functions

  • Support was dropped for Secured Socket Layer (SSL) Version 2 and SSL weak and medium cipher suites in the service processor web server (Lighttpd).  Unsupported web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if those browsers do not support SSL version 3.  Supported web browsers for Power7 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).

System firmware changes that affect all systems

  • On systems with utility processors,  an accounting problem with utility processor minutes was fixed.
  • A problem was fixed that caused a migrated partition to reboot during transfer to a VIOS 2.2.2.0, and later, target system. A manual reboot would be required if transferred to a target system running an earlier VIOS release.  Migration recovery may also be necessary.
  • A problem was fixed that caused a service processor dump to be generated with SRC B18187DA "NETC_RECV_ER" logged.
  • A problem was fixed that caused an L2 cache error to not guard out the faulty processor, allowing the system to checkstop again on an error to the same faulty processor.
  • A problem was fixed that caused an HMC code update of the FSP to fail on the accept operation with SRC B1811402, or that left the FSP unable to boot on the updated side.
  • A problem was fixed that caused a 1000911E platform event log (PEL) to be marked as not call home.  The PEL is now a call home to allow for correction.  This PEL is logged when the hypervisor has changed the Machine Type Model Serial Number (MTMS) of an external enclosure to UTMP.xxx.xxxx because it cannot read the vital product data (VPD), or the VPD has invalid characters, or if the MTMS is a duplicate to another enclosure.
  • A problem was fixed that caused the state of the Host Ethernet Adapter (HEA) port to be reported as down when the physical port is actually up.
  • A problem was fixed that caused the system attention LED to be lit without a corresponding SRC and error log for the event.  This problem typically occurs when an operating system on a partition terminates abnormally.
  • DEFERRED:  A problem was fixed that caused a system checkstop during hypervisor time keeping services. This deferred fix addresses a problem that has a very low probability of occurrence.  As such customers may wait for the next planned service window to activate the deferred fix via a system reboot.

System firmware changes that affect certain systems

  • On systems with a redundant service processor, a problem was fixed that caused fans to run at a high-speed after a failover to the sibling service processor.
  • On systems with a redundant service processor, a problem was fixed that prevented the deconfiguration details of a guarded sibling service processor from being shown in the Advanced System Management Interface (ASMI).
  • On systems with a F/C 5802 or 5877 I/O drawer installed, the firmware was enhanced to guarantee that an SRC will be generated when there is a power supply voltage fault.  If no SRC is generated, a loss of power redundancy may not be detected, which can lead to a drawer crash if the other power supply goes down.  This also fixes a problem that causes an 8 GB Fibre Channel adapter in the drawer to fail if the 12V level fails in one Offline Converter Assembly (OCA).
  • On systems managed by an HMC with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that caused the hardware topology on the management console for the managed system to show "null" instead of "operational" for the affected I/O drawers.
  • On systems with a redundant service processor, a problem was fixed that caused an SRC B150D15E to be erroneously logged after a failover to the sibling service processor.
  • On systems with turbo-core enabled that are a target of a Live Partition Mobility (LPM) operation, a problem was fixed where cache properties were not recognized and SRCs BA280000 and BA250010 were reported.
  • When switching between turbocore and maxcore mode, a problem was fixed that caused the number of supported partitions to be reduced by 50%.
  • On systems running AIX or Linux, a problem was fixed that caused the operating system to halt when an InfiniBand Host Channel Adapter (HCA) fails or malfunctions.
  • A problem was fixed in the run-time abstraction services (RTAS) extended error handling (EEH) for fundamental reset that caused partitions to crash during adapter updates.  The fundamental reset of adapters now returns a valid return code.  The adapter drivers using fundamental reset affected by this fix are the following:
    • QLogic PCIe Fibre Channel adapters (combo card)
    • IBM PCIe Obsidian
    • Emulex BE3-based ethernet adapters
    • Broadcom-based PCIe2 4-port 1Gb ethernet
    • Broadcom-based FlexSystem EN2024 4-port 1Gb ethernet for compute nodes

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail on an erroneously logged error for the service processor battery with  SRCs B15A3303, B15A3305, and  B181EA35 reported.
  • A problem was fixed that caused SRC  B15A3303  to be erroneously logged as a predictive error on the service processor sibling after a successful concurrent repair maintenance operation for the real-time clock (RTC) battery.
AM730_114_035 / FW730.70

04/03/13
Impact:  Availability     Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed that caused a card (and its children) that was removed after the system was booted to continue to be listed in the guard menus in the Advanced System Management Interface (ASMI).
  • A problem was fixed that prevented predictive guard errors from being deleted on the secondary service processor.  This caused hardware to be erroneously guarded out if a service processor failover occurred, then the system was rebooted.
  • A problem was fixed that caused SRC B1813221, which indicates a failure of the battery on the service processor, to be erroneously logged after a service processor reset or power cycle.
  • A problem was fixed that caused various SRCs to be erroneously logged at boot time including B181E6C7 and B1818A14.
  • A problem was fixed that caused a code update operation to fail with a time-out error, creating a call-home with SRC B1818A0F.  This problem is more likely to occur on HMC-managed systems experiencing a high level of management activity during a code update.
  • A problem was fixed that caused system fans to be erroneously called out as failing with one or more of the following SRCs: 11007610, 11007620, 11007630, 11007640, or 11007650.
  • A problem was fixed that caused the service processor (or system controller) to crash when it boots from the new level during a concurrent firmware installation.
  • A problem was fixed that caused SRC B7006A72 to be erroneously logged.
  • A problem was fixed that caused the system power to be throttled, resulting in decreased performance.  This problem typically occurs after a PCI adapter is plugged into a node (CEC drawer), and can also happen when a dedicated I/O partition is powered on or off.
  • The Power Hypervisor was enhanced to ensure better synchronization of vSCSI and NPIV I/O interrupts to partitions.
  • A problem was fixed that caused SRC B15A3303 ("CEC Hardware: Time-Of-Day Hardware Predictive Error") to be erroneously logged, and the time-of-day to be set to Jan 1, 1970.
  • A problem was fixed that was caused by an attempt to modify a virtual adapter from the management console command line when the command specifies it is an Ethernet adapter, but the virtual ID specified is for an adapter type other than Ethernet.  The managed system has to be rebooted to restore communications with the management console when this problem occurs; SRC B7000602 is also logged.

System firmware changes that affect certain systems

  • On systems with an I/O tower attached, a problem was fixed that caused SRCs 10009135 and 10009139 to be erroneously logged.
  • A problem was fixed that caused various parts to be erroneously guarded out in some cases, and the clock card being called out as defective in other cases, when both ac cords providing power to a drawer were unplugged when the system was powered on.
  • On systems running Selective Memory Mirroring (SMM), a problem was fixed that caused the hypervisor to hang or crash when an uncorrectable hardware error occurred in a memory DIMM.
  • On systems with redundant service processors, a problem was fixed that caused the sibling service processor state to show up as "unknown" in the service processor error log if a code synchronization problem was detected after a service processor was replaced.
  • On systems with an I/O tower attached, a problem was fixed that caused multiple service processor reset/reloads if the tower was continuously sending invalid System Power Control Network (SPCN) status data.
  • A problem was fixed that caused the HMC to display incorrect data for a virtual Ethernet adapter's transactions statistics.
  • A problem was fixed that caused a hibernation resume operation to hang if the connection to the paging space is lost near the end of the resume processing.  This is more likely on a partition that supports remote restart.
  • A problem was fixed that caused the system to terminate with a bad address checkstop during mirroring defragmentation.
  • A problem was fixed that prevented the HMC command "lshwres" from showing any I/O adapters if any adapter name contained the ampersand character in the VPD.
  • On a system running a Live Partition Mobility (LPM) operation, a problem was fixed that caused the partition to successfully appear on the target system, but hang with a 2005 SRC.
  • On a partition with a large number of potentially bootable devices, a problem was fixed that caused the partition to fail to boot with a default catch, and SRC BA210000 may also be logged.
  • On systems running Active Memory Sharing (AMS) partitions, a problem was fixed that may arise due to the incorrect handling of a return code in an error path during the Live Partition Mobility (LPM) of an AMS partition.
  • On systems running Active Memory Sharing (AMS) partitions, a timing problem was fixed that may occur if the system is undergoing AMS pool size changes.

Concurrent hot add/repair maintenance (CHARM) firmware fixes

  • A problem was fixed that caused SRC B15738B0 to be erroneously logged after a successful concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused a concurrent hot add/repair maintenance (CHARM) operation to fail after this sequence of events occurred:
    1. A user-initiated platform system dump is requested (from the ASMI or management console).
    2. A service processor reset/reload takes place while dump collection is in progress.
    3. A concurrent hot add/repair maintenance operation is attempted.
  • On systems in which there are no processors in the shared processor pool, a problem was fixed that caused the Hypervisor to become unresponsive (the service processor starts logging time-out errors against the Hypervisor, and the HMC can no longer talk to the Hypervisor) during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused a hypervisor memory leak during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused a concurrent node repair or upgrade to fail during the system deactivation step with a hypervisor error code of 0x300.
  • A problem was fixed that caused the system to terminate with a bad address checkstop during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused the system to hang if memory relocation is performed during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused partition activations to fail during or after a node repair operation.
  • A problem was fixed that caused synchronization problems in an application using the Barrier Synchronization Register (BSR) facility during the memory relocation that occurs in a concurrent hot add/repair maintenance operation.
  • A problem was fixed that prevented the I/O slot information from being presented on the management console after a concurrent node repair.
  • On systems running multiple IBM i partitions that are configured to communicate with each other via virtual Opticonnect, concurrent hot add/repair maintenance operations may time out.  When this problem occurs, a platform reboot may be required to recover.
AM730_099_035

10/24/12
Impact:  Availability      Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • HIPER/Non-Pervasive: DEFERRED:  A problem was fixed that caused a system crash with SRC B170E540.
  • HIPER/Non-Pervasive:  A related problem was also fixed that could cause a live lock on the power bus resulting in a system crash.
  • To address poor placement of partitions following a reboot of a server with unlicensed cores, the firmware was enhanced to run the affinity manager when the initialize configuration operation is done from the HMC.  A problem was also fixed that caused the hypervisor to be left in an inconsistent state after a partition create operation failed.
AM730_095_035

08/23/12
Impact:  Availability      Severity:  SPE

New Features and Functions

  • Support for booting the IBM i operating system from a USB tape drive.

System firmware changes that affect all systems

  • A problem was fixed that caused a partition with dedicated processors to hang with SRC BA33xxxx when rebooted, after it was migrated using a Live Partition Mobility (LPM) operation from a system running Ax730 to a system running Ax740, or vice versa.
  • The firmware was enhanced to call out the correct field replaceable units (FRUs) when SRC B124E504 with description "Chnl init TO due to SN stuck in recovery" was logged.
  • A problem was fixed that caused SRC B1818A10 to be erroneously logged after a system firmware installation.
  • A problem was fixed that caused booting from a virtual fibre channel tape device to fail with SRC B2008105.
  • The firmware was enhanced to log SRCs BA180030 and BA180031 as informational instead of predictive.
  • A problem was fixed that caused a "code accept" during a concurrent firmware installation from the HMC to fail with SRC E302F85C.  This is most likely to occur on model FHB systems.
  • On systems running the AIX operating system, a problem was fixed that caused the hypervisor to crash with SRC B7000103, after an HEA (Host Ethernet Adapter) error was logged, when there is a lot of AIX activity on the HEAs.
  • A problem was fixed that caused the suspension of a partition to fail if a large amount of data has to be stored to resume the partition.
  • A problem was fixed that caused a system crash with unrecoverable SRC B7000103 with "ErFlightRecorder" in the failing stack.
  • On systems booting from an NPIV (N-port ID virtualization) device, a problem was fixed that caused the boot to intermittently terminate with the message "PReP-BOOT: unable to load full PReP image.".  This problem occurs more frequently on the IBM V7000 Storage System running the SAN Volume Controller (SVC), but not on every boot.
  • A problem was fixed that caused SRC B181E6F1 with the description "RMGR_PERSISTENT_EVENT_TIMEOUT" to be erroneously logged.
  • A problem was fixed that prevented a change to the system operating mode ("M" or "N") made in the Advanced System Management Interface (ASMI) menu from being displayed in the physical control (operator) panel.
  • A problem was fixed that caused a memory leak in the service processor firmware.
  • A problem was fixed that caused SRC B155A491 to be erroneously logged during multiple system IPLs.  This SRC may cause the system to terminate.
  • A problem was fixed that caused the lsstat command on the HMC to display an erroneously high number of packets transmitted and received on a vlan interface.
System firmware changes that affect certain systems
  • The firmware was enhanced to fix a potential performance degradation on systems utilizing the stride-N stream prefetch instructions dcbt (with TH=1011) or dcbtst (with TH=1011).  Typical applications executing these algorithms include High Performance Computing, data-intensive applications exploiting streaming instruction prefetches, and applications utilizing the Engineering and Scientific Subroutine Library (ESSL) 5.1.
  • On systems on which Internet Explorer (IE) is used to access the Advanced System Management Interface (ASMI) on the Hardware Management Console (HMC), a problem was fixed that caused IE to hang for about 10 minutes after saving changes to network parameters on the ASMI.
  • A problem was fixed that caused informational SRC A70047FF, which may indicate that the Anchor (VPD) card should be replaced, to be erroneously logged again after the Anchor card was replaced.
  • A problem was fixed that caused a network installation of IBM i to fail when the client was on the same subnet as the server.
  • On systems with a 5796 or 5797 I/O drawer attached, a problem was fixed that could cause a system hang.
  • On systems with a feature code (F/C) 5802 or 5877 I/O drawer attached, a problem was fixed that prevented the system from booting with SRC B1818903, with a signature of "SINK_REASON_CODE_FILE_LOCK_TIMEOUT".
  • On systems with the F/C 1804 (Integrated 4 Port (2x1Gb and 2x10Gb SFP+ Optical-SR ports)) or F/C 1813 (Integrated, 4 Port (2x1Gb and 2x10Gb SFP+ Copper twinax ports)), the firmware was enhanced to prevent the attached network switch from prematurely shutting down the Ethernet port due to link flaps detected during IPL.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that prevented the DASD roll-up fault LED from working properly after a node add or node remove operation.
  • A problem was fixed that caused a hot node repair operation to fail with PhypRc=0x0300, indicating the deactivate system resource operation failed.
  • During a CHARM replacement of a memory card on a system running with mirrored memory, a problem was fixed that caused the operation to fail with "PhypRc = 0x0326".
AM730_087_035

05/18/12
Impact:  Availability      Severity:  SPE 

New Features and Functions 

  • Support for IBM i Live Partition Mobility (LPM)
System firmware changes that affect all systems
  • A problem was fixed that prevented the user from changing the boot mode or keylock setting after a remote restart-capable partition is created, even after the partition's paging device is on-line.
System firmware changes that affect certain systems
  • The firmware resolves undetected N-mode stability problems and improves error reporting on the feature code (F/C) 5802 and 5877 I/O drawer power subsystem.
AM730_078_035

03/14/12
Impact:  Availability      Severity:  SPE

System firmware changes that affect all systems

  • The firmware was enhanced to properly display a memory controller that has been guarded out manually on the "Deconfiguration Records" menu option (under "System Service Aids") on the Advanced System Management Interface (ASMI).
  • A problem was fixed that caused multiple service processor dumps to be unnecessarily taken during a concurrent firmware update.  SRC B181EF9A, which indicates that the dump space on the service processor is full, was logged as a result.
  • The firmware was enhanced to increase the threshold for recoverable SRC B113E504 so that the processor core reporting the SRC is not guarded out.  This prevents unnecessary performance loss and the unnecessary replacement of processor modules.
  • A problem was fixed that caused SRC B7000602 to be erroneously logged at power on.
  • The firmware was enhanced to recognize new USB-attached devices so that they will be listed as boot devices in the System Management Services (SMS) menus.
  • A problem was fixed that caused booting or installing a partition or system from a USB device to fail with error code BA210012.  This usually occurs when an operating system (OS) other than the OS that is already on the partition or system is booted or installed.
  • On the System Management Services (SMS) remote IPL (RIPL) menus, a problem was fixed that caused the SMS menu to continue to show that an Ethernet device is configured for iSCSI, even though the user has changed it to BOOTP.
  • The firmware was enhanced to log SRCs BA180030 and BA180031 as informational instead of predictive.
  • The firmware was enhanced to increase the threshold of soft NVRAM errors on the service processor to 32 before SRC B15xF109 is logged.  (Replacement of the service processor is recommended if more than one B15xF109 is logged per week.)
  • A problem was fixed that caused a system to crash when the system was in low power (or safe) mode, and the system attempted to switch over to nominal mode.
  • On a multi-drawer system, a problem was fixed that prevented the system attention LED from correctly reflecting the status of the DASD fault LEDs in drawers 2, 3, and 4.
  • A problem was fixed that caused the system to fail to boot with SRC B1xxB507.
  • A problem was fixed that prevented a node from being deconfigured manually using the Advanced System Management Interface (ASMI).
  • A problem was fixed that caused system fans to be erroneously called out as failing.
System firmware changes that affect certain systems
  • A problem was fixed that caused the hypervisor to hang during a concurrent operation on a F/C 5802, 5803, 5873 or 5877 I/O drawer.  Recovering from the hypervisor hang required a platform reboot.
  • A problem was fixed that impacted performance if profiling was enabled in one or more partitions.  Performance profiling is enabled:
    - In an AIX or VIOS partition using the tprof (-a, -b, -B, -E option) command or pmctl (-a, -E option) command.
    - In an IBM i partition when the PEX *TRACE profile (TPROF) collections or PEX *PROFILE collections are active.
    - In a Linux partition using the perf command, which is available in RHEL6 and SLES11; profiling with oprofile does not cause the problem.
  • A problem was fixed that prevented the operating system from being notified that a F/C 5802 or 5877 I/O drawer had recovered from an input power fault (SRC 10001512 or 10001522).
  • On a system that is being upgraded from Ax720 system firmware to Ax730 system firmware, the firmware was enhanced to log B1818A0F as informational instead of predictive if it occurs during the firmware upgrade.
  • On systems running Active Memory Sharing (AMS), the allocation of the memory was enhanced to improve performance.
  • A problem was fixed that caused the suspension of a logical partition running Active Memory Sharing (AMS) to fail because the disk headers had not been erased.
  • On systems with an iSCSI network, when booting a logical partition using that iSCSI network, a problem was fixed that caused the iSCSI gateway parameter displayed on the screen to be incorrect.  It did not impact iSCSI boot functionality.
  • On systems running Active Memory Sharing (AMS) and Active Memory Mirroring (AMM), a problem was fixed that caused memory allocation to fail.  This in turn caused a partition to fail to boot with SRC A2009030.
  • On systems using affinity groups, a problem was fixed that prevented one of the partitions from being placed correctly.
  • On 9117-MMB and 9179-MHB systems without an optional GX adapter, a problem was fixed that caused the system fans to ramp up to their maximum speed.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused a checkstop to occur during a node repair operation.
  • A problem was fixed that caused the system to hang during a CHARM operation.
  • A problem was fixed that caused multiple types of failures (CHARM node operations and Advanced Energy Manager (AEM) state changes, among others), after a CHARM hot node operation on the first (top) drawer was followed by a concurrent firmware installation.
  • On systems with more than one node, a problem was fixed that caused a CHARM operation on node B to fail with a Repair and Verify (R&V) panel that indicated a "Deactivate power domain for the FruType.CEC_ENCLOSURE at U78C0.001.xxxxxx" failure due to a "0x0007 COMMAND_TIMEOUT".
AM730_066_035

12/08/11
Impact:  Availability       Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect certain systems

  • HIPER/Pervasive on systems with a Virtual Input/Output (VIO) client running AIX, and with a F/C 5802 or 5877 I/O drawer attached:  A problem was fixed that caused the system to crash with SRC B700F103.
AM730_065_035

11/22/11
Impact: Availability           Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible. 

System firmware changes that affect all systems 

  • HIPER/Pervasive:  On systems running firmware level AM730_049 or AM730_058, a problem was fixed that caused the target server to hang, or go to the incomplete state on the management console, after a Live Partition Mobility (LPM) operation.  This problem can also occur when a partition hibernation operation is done.
AM730_058_035

11/07/11
Impact: Availability           Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible. 

New Features and Functions 

  • Support for the PCIe2 1.8GB cache RAID SAS adapter (tri-port 6Gb), F/C 5913.
System firmware changes that affect all systems
  • A problem was fixed that caused SRC B7005442 to be erroneously logged, and functional processor cores to be erroneously guarded out, when an error occurred in the operating system or an application.
  • HIPER/Non-Pervasive: A problem was fixed that caused the system to crash with SRC B18187DA.
  • A problem was fixed that prevented a partition from being activated with SRC B2006009.
  • HIPER/Pervasive:  A problem was fixed that caused the managed system to go to the incomplete state with SRC B7000602, and have to be rebooted, if these conditions were met:
    - An inactive partition is present on the managed system.
    - A concurrent system firmware update to AM730_049 was done.
    - The inactive partition is deleted before being activated with the new firmware level, either by the user or a partition migration operation.
System firmware changes that affect certain systems
  • On systems or logical partitions with a large number of virtual processors, a performance problem was fixed that prevented the utilization of the entitled capacity of partitions.
  • A problem was fixed that caused a shared processor partition that is configured with two virtual processors, and an entitled capacity of 1.0 processors, to hang when only one processor is in the physical shared pool.
  • A problem was fixed that caused the managed system's processors displayed by the HMC to be incorrect.  This problem occurs when the system is booted when no partitions are defined, which for example can occur after an MES model upgrade.
  • On systems with redundant service processors, a problem was fixed that caused a service processor firmware synchronization to fail with SRC E302F842 when:
    - A system firmware upgrade to a new release (from AM720_xxx to AM730_yyy, for example) was installed, then
    - A service processor card was replaced.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • On a system with mirrored memory, a problem was fixed that caused a hot node repair operation to fail.
  • A problem was fixed that caused the host Ethernet adapters (HEA) to be in a non-functional state after a hot node add.
AM730_049_035

09/15/11
Impact:  Performance      Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • A problem was fixed that caused SRC B18138B7 to be erroneously logged, and the service processor to terminate, when errors were continuously logged due to failing hardware.
  • A problem was fixed that caused the Advanced System Management Interface (ASMI) menus to be displayed in English no matter which language was selected.
  • The firmware was enhanced to verify that no uncorrectable memory errors are present in all of a partition's memory when the hypervisor accesses that memory.
  • The firmware was enhanced to reduce the number of times informational SRC 10009002 is logged when a system is booted.
  • A problem was fixed that caused two calls home for the same error to be made when a platform dump was generated.
  • A problem was fixed that caused unrecoverable SRC B181A809 to be erroneously logged.
  • A problem was fixed that caused a system boot to terminate with unrecoverable SRC B181A403.
  • A problem was fixed that prevented a platform system dump from being deleted when the file system space on the service processor was full.
  • A problem was fixed that prevented an encapsulated state partition from being activated after a main store dump (MSD).
  • A problem was fixed that caused a partition to fail to activate when the activation took place within an hour of the system being powered on.  This problem is much more likely to occur on large systems with a large number of I/O slots.
  • A problem was fixed that caused the system to terminate when rebooting after the power was removed, then reapplied.
  • A problem was fixed that caused a firmware installation from the HMC with the "do not auto accept" option selected to fail.
  • A problem was fixed that caused a partition to fail with SRC B170E540 when rebooting after an unrecoverable error was logged that impacted the partition's reserved memory area (RMA).
  • A problem was fixed that caused SRCs B181156C and B181A40F to be erroneously logged after a service processor reset.
  • The firmware was enhanced to delay the rebooting of a partition after an uncorrectable error (UE) is logged in the partition's memory.  This gives the service processor sufficient time to guard out the memory in which the UE occurred.
  • The firmware was enhanced to log SRC B181C3251 as informational rather than predictive.
  • The firmware was enhanced to log SRC B1812A11 as informational, instead of "service action required", when the thermal/power management device (TPMD) is successfully reset.
  • A problem was fixed that erroneously caused SRC B18186x1 to be logged and an FSP dump to be generated.
  • The field replaceable unit (FRU) callouts were enhanced for SRC B181E550.
  • A problem was fixed that caused a system's partition dates to revert back to 1969 after the service processor or its battery was replaced.  This occurred regardless of whether or not the service processor's time-of-day (TOD) clock was correctly set during the service action.
  • A problem was fixed that caused the system to crash with SRC B700F103.
System firmware changes that affect certain systems
  • HIPER/Pervasive:  On systems running VIOS, a problem was fixed that caused the system to crash with SRC B700F103.
  • HIPER/Pervasive:  On systems with processors that don't have memory associated with them, a problem was fixed that was degrading system performance.
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused the system to crash during the creation of a logical partition (LPAR).
  • On systems with two or more drawers, a problem was fixed that caused SRCs B181156C and B181A40F to be erroneously logged.
  • On a system that terminates when in dynamic power save mode, a problem was fixed that caused SRCs B150B943, B113C660, and B113C661 to be erroneously logged when the system rebooted.
  • On systems running more than 100 logical partitions, a problem was fixed that caused a concurrent firmware installation to fail.
  • On systems running IBM i partitions, a problem was fixed that prevented IBM i partitions that were suspended from being reactivated after a main store dump (MSD).
  • On systems running IBM i partitions, a problem was fixed that caused changing the processor weight on an IBM i partition to 255 to have no effect.
  • On systems running Active Memory Sharing (AMS), a problem was fixed that prevented the virtual I/O server (VIOS) partition associated with an AMS pool from shutting down.
  • On systems with partitions with dedicated memory assigned, a problem was fixed that caused a resume operation on a partition with dedicated memory to fail with HMC SRC HSC0A945.
  • On systems running an IBM i partition with dedicated memory, and redundant virtual I/O server (VIOS) partitions, a problem was fixed that caused the resumption of the IBM i partition to fail if the hypervisor failed-over to the other VIOS partition while the IBM i partition was in hibernation.
  • The firmware was enhanced to allow the installation of IBM i from the HMC command line interface (CLI) using the "chsysstate" command.
  • On systems running shared processor partitions, a problem was fixed that caused a partition to hang until powered off and back on.
  • On systems running the Advanced Energy Manager (AEM), a problem was fixed that caused the work rate calculation for a processor to be incorrect if the system dropped into safe mode.
  • On systems from which a node has been removed, a problem was fixed that caused the node to continue to be listed when the Processing Unit Deconfiguration option is selected on the Advanced System Management Interface (ASMI) menus.
  • On systems with an uninterruptible power supply (UPS) attached, a problem was fixed that caused the system to power cycle after a power failure, instead of waiting for power to be restored before powering on.
  • A problem was fixed that prevented an automatic system reboot after a checkstop when a service processor fail over occurred during the checkstop recovery.
  • On systems with F/C 1954 (4-port GB Ethernet adapter) installed, a problem was fixed that prevented the adapter from being configured during boot, and caused two B7006970 SRCs to be erroneously logged.
  • On systems running VIOS, a problem was fixed that caused the location code in the output of the VIOS command "lsmap -npiv -all" to be incorrect.
  • A problem was fixed that caused a partition migration operation to abort when the partition has more than 4096 virtual slots.
  • On systems running Active Memory Sharing (AMS), the firmware was enhanced to reduce the time required to migrate an AMS partition.
  • On systems running DB2 pureScale, a problem was fixed that caused intermittent remote direct memory access (RDMA) errors, and a core dump of the pureScale server process.
  • On systems with processors that don't have memory associated with them, the firmware was enhanced to improve boot time and system performance.
  • A problem was fixed that caused the system to appear to hang, and a service processor reset/reload to occur, when multiple hardware errors occurred.
  • On systems running virtual switches, the firmware was enhanced to limit the number of partitions that have access to a particular vswitch.
  • On systems with more than 1000 partitions, a problem was fixed that caused the error logs to be flooded with informational SRC B7005120 when all of the partitions are rebooted at the same time.
  • On systems in which a service processor had been guarded out manually, a problem was fixed that caused the Deconfiguration Records option, which is under the System Service Aids in the Advanced System Management Interface (ASMI), to display null data for that service processor.
  • On systems with redundant service processors, a problem was fixed that prevented a service processor fail-over from occurring.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • On partitions running Red Hat Linux 6.1, a problem was fixed that caused a node evacuation operation to fail.
  • A problem was fixed that caused the system to crash during a hot GX adapter repair.
AM730_035_035

05/27/11
Impact:  New        Severity:  New

New Features and Functions

  • Support for the attachment of a System Director Management Console (SDMC).
  • Support for up to 1000 partitions on 9117-MMB, 9179-MHB, and 9119-FHB systems.
  • Support for IBM i live partition hibernation.
  • Support for server platform system dumps (SYSDUMP files) larger than 4GB.
  • Support for power savings settings for certain partitions and the system processor pool.
  • Support for the feature code (F/C) 5887 media drawer.
  • Support for remote restart of partitions.


AM720
Systems 9117-MMB and 9179-MHB ONLY
For Impact, Severity and other Firmware definitions, Please refer to the below 'Glossary of firmware terms' url:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
AM720_113_064

06/28/12
Impact: Availability          Severity:  SPE

New Features and Functions

  • PARTITION-DEFERRED: Support for Live Partition Mobility (LPM) between systems running Ax720 system firmware, and 8246-L2S systems.

System firmware changes that affect all systems

  • The firmware was enhanced to increase the threshold of soft NVRAM errors on the service processor to 32 before SRC B15xF109 is logged.  (Replacement of the service processor is recommended if more than one B15xF109 is logged per week.)
  • The firmware was enhanced to call out the correct field replaceable units (FRUs) when SRC B124E504 with description "Chnl init TO due to SN stuck in recovery" was logged.
  • A problem was fixed that caused informational SRC A70047FF, which may indicate that the Anchor (VPD) card should be replaced, to be erroneously logged again after the Anchor card was replaced.
  • A problem was fixed that caused booting from a virtual fibre channel tape device to fail with SRC B2008105.
  • A problem was fixed that caused a dynamic LPAR (DLPAR) add operation to fail on an empty PCI slot that is not hot-pluggable.
  • The firmware resolves undetected N-mode stability problems and improves error reporting on the feature code (F/C) 5802 and 5877 I/O drawer power subsystem.
  • A problem was fixed that caused a system to crash when the system was in low power (or safe) mode, and the system attempted to switch over to nominal mode.
System firmware changes that affect certain systems
  • The firmware was enhanced to fix a potential performance degradation on systems utilizing the stride-N stream prefetch instructions dcbt (with TH=1011) or dcbtst (with TH=1011).  Typical applications executing these algorithms include High Performance Computing, data-intensive applications exploiting streaming instruction prefetches, and applications utilizing the Engineering and Scientific Subroutine Library (ESSL) 5.1.
  • A problem was fixed that caused the hypervisor to hang during a concurrent operation on a F/C 5802, 5803, 5873 or 5877 I/O drawer.  Recovering from the hypervisor hang required a platform reboot.
  • On systems with the F/C 1804 (Integrated 4 Port (2x1Gb and 2x10Gb SFP+ Optical-SR ports)) or F/C 1813 (Integrated, 4 Port (2x1Gb and 2x10Gb SFP+ Copper twinax ports)), the firmware was enhanced to prevent the attached network switch from prematurely shutting down the Ethernet port due to link flaps detected during IPL.
  • A problem was fixed that impacted performance if profiling was enabled in one or more partitions.  Performance profiling is enabled:
        - In an AIX or VIOS partition using the tprof (-a, -b, -B, -E option) command or pmctl (-a, -E option) command.
        - In an IBM i partition when the PEX *TRACE profile (TPROF) collections or PEX *PROFILE collections are active.
        - In a Linux partition using the perf command, which is available in RHEL6 and SLES11; profiling with oprofile does not cause the problem.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused multiple types of failures (CHARM node operations and Advanced Energy Manager (AEM) state changes, among others), after a CHARM hot node operation on the first (top) drawer (or the first logical node) was followed by a concurrent firmware installation.
  • A problem was fixed that caused unrecoverable SRCs B1813918 and B182953C during a CHARM operation.
  • A problem was fixed that caused SRC BA180020 to be erroneously logged when the partitions on a system were restarted after a CHARM operation.
AM720_108_064

01/23/12
Impact: Availability           Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • HIPER/Non-Pervasive:  A problem was fixed that caused the system to crash with SRC B18187DA.
  • The firmware was enhanced to log SRC B1768B76 as informational instead of unrecoverable.
  • The firmware was enhanced to increase the threshold for recoverable SRC B113E504 so that the processor core reporting the SRC is not guarded out.  This prevents performance loss and the unnecessary replacement of processor modules.
  • A problem was fixed that prevented a platform system dump from being deleted when the file system space on the service processor was full.
  • The firmware was enhanced to log SRC B1812A11 as informational, instead of "service action required", when the thermal/power management device (TPMD) is successfully reset.
  • The field replaceable unit (FRU) callouts were enhanced for SRC B181E550.
  • A problem was fixed that caused the message "500 - Internal Server Error." to be displayed when a setting was changed on the Advanced System Management Interface (ASMI) power on/off menu while the system was powering down.
  • A problem was fixed that erroneously caused SRC B1818601 to be logged and an FSP dump to be generated.
  • The firmware was enhanced to log an error, instead of causing a kernel panic, if a guard record was corrupted or truncated.
  • A problem was fixed that caused the wrong error code to be logged when the memory test took longer than normal during system boot.
  • A problem was fixed that caused a system's partition dates to revert back to 1969 after the service processor or its battery was replaced.  This occurred regardless of whether or not the service processor's time-of-day (TOD) clock was correctly set during the service action. 
  • A problem was fixed that caused the system to appear to hang, and a service processor reset/reload to occur, when multiple hardware errors occurred.
  • A problem was fixed that caused SRC B7005442 to be erroneously logged, and functional processor cores to be guarded out, when an error occurred in the operating system or an application.
  • A problem was fixed that caused multiple service processor dumps to be unnecessarily taken during a concurrent firmware update.  SRC B181EF9A, which indicates that the dump space on the service processor is full, was logged as a result.
  • The firmware was enhanced by the addition of a new option in the system management services (SMS) "Multi-boot" menu that facilitates zoning of physical and virtual fibre channel adapters.
  • A problem was fixed that caused a partition migration operation to abort when the partition has more than 4096 virtual slots.
  • A problem was fixed that caused SRC B18138B7 to be erroneously logged, and the service processor to terminate, when errors were continuously logged due to failing hardware.
  • A problem was fixed that caused a firmware installation from the HMC with the "do not auto accept" option selected to fail.
  • A problem was fixed that caused the system to fail to boot with SRC B1xxB507.
  • A problem was fixed that caused system fans to be erroneously called out as failing.
System firmware changes that affect certain systems
  • HIPER/Pervasive on systems with a Virtual Input/Output (VIO) client running AIX, and with a F/C 5802 or 5877 I/O drawer attached:  A problem was fixed that caused the system to crash with SRC B700F103.
  • On systems running more than 100 logical partitions, a problem was fixed that caused a concurrent firmware installation to fail.
  • On systems running the Advanced Energy Manager (AEM) that terminate while in dynamic power save mode, a problem was fixed that caused SRCs B150B943, B113C660, and B113C661 to be erroneously logged when the system rebooted.
  • On systems running Active Memory Sharing (AMS), the firmware was enhanced to reduce the time required to migrate an AMS partition.
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused the system to crash during the creation of a logical partition (LPAR).
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused the activation of an AMS partition to fail with SRC B2006009.
  • On systems running VIOS, a problem was fixed that caused the location code in the output of the VIOS command "lsmap -npiv -all" to be incorrect.
  • A problem was fixed that caused a shared processor partition that is configured with two virtual processors and an entitled capacity of 1.0 processors to hang when only one processor is in the physical shared pool.
  • On systems running iSCSI, a problem was fixed that caused the system to hang when booting from an iSCSI device in the system management services (SMS) menus.
  • On the System Management Services (SMS) remote IPL (RIPL) menus, a problem was fixed that caused the SMS menu to continue to show that an Ethernet device is configured for iSCSI, even though the user has changed it to BOOTP.
  • On systems running the Advanced Energy Manager (AEM), a problem was fixed that caused the work rate calculation for a processor to be incorrect if the system dropped into safe mode.
  • On systems from which a node has been removed, a problem was fixed that caused the node to continue to be listed when the Processing Unit Deconfiguration option was selected on the Advanced System Management Interface (ASMI) menus.
  • On systems in which a service processor had been guarded out manually, a problem was fixed that caused the Deconfiguration Records option, which is under the System Service Aids in the Advanced System Management Interface (ASMI), to display null data for that service processor.
  • A problem was fixed that prevented the operating system from being notified that a F/C 5802 or 5877 I/O drawer had recovered from an input power fault (SRC 10001512 or 10001522).
  • On a multi-drawer system, a problem was fixed that prevented the system attention LED from correctly reflecting the status of the DASD fault LEDs in drawers 2, 3, and 4.
  • On systems using Capacity on Demand (CoD), a problem was fixed that caused informational SRC B7005300 to be logged so often that the error logs wrapped, and other information in the error logs was lost.
  • On systems that are upgraded from Ax710 system firmware to Ax720 system firmware, a problem was fixed that caused the utility processor capacity-on-demand (CoD) parameters to erroneously change when the Ax720 system firmware was installed.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • On a system with mirrored memory, a problem was fixed that caused a hot node repair operation to fail.
  • A problem was fixed that caused the host Ethernet adapters (HEA) to be in a non-functional state after a hot node add.
  • A problem was fixed that caused the hypervisor's memory usage to grow during a CHARM node evacuation operation.  When this problem occurred, the amount of reserved memory (the memory the hypervisor is using) increased, and the amount of available memory decreased, as viewed on the Hardware Management Console (HMC).
AM720_102_064

08/03/11
Impact: Availability           Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • A problem was fixed that caused the Advanced System Management Interface (ASMI) menus to be displayed in English no matter which language was selected.
  • The firmware was enhanced to verify that no uncorrectable memory errors are present in all of a partition's memory when the hypervisor accesses that memory.
System firmware changes that affect certain systems
  • HIPER:  On systems running VIOS, a problem was fixed that caused the system to crash with SRC B700F103.
  • On systems running the Enhanced Cache Option (ECO), a problem was fixed that caused the number of processor cores to be reported incorrectly.
  • On partitions running Red Hat Linux 6.1, a problem was fixed that caused a partition migration operation to fail.
  • On systems with an uninterruptible power supply (UPS) attached, a problem was fixed that caused the system to power cycle between power on and power off after a power failure, instead of waiting for the power to be restored before powering on.
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused the system to crash with SRC B170E540 after a warm boot or platform dump IPL.
  • A problem was fixed that prevented the virtual I/O server (VIOS) partition associated with an Active Memory Sharing (AMS) pool from shutting down.
  • A problem was fixed that caused a resume operation on a partition with dedicated memory to fail with HMC SRC HSC0A945.
  • On systems running an IBM i partition with dedicated memory, and redundant virtual I/O server (VIOS) partitions, a problem was fixed that caused the resumption of the IBM i partition to fail if the hypervisor failed over to the other VIOS partition while the IBM i partition was in hibernation.
  • On systems running shared processor partitions, a problem was fixed that caused a partition to hang until powered off and back on.
  • On systems running DB2 pureScale, a problem was fixed that caused intermittent remote direct memory access (RDMA) errors, and a core dump of the pureScale server process.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • On partitions running Red Hat Linux 6.1, a problem was fixed that caused a node evacuation operation to fail.
  • A problem was fixed that caused the system to crash with SRC B170E540 during a node evacuation that is done after a warm boot or platform dump IPL.
AM720_101_064

05/20/11
Impact: Availability           Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • HIPER:  IBM testing has uncovered a potential undetected data corruption issue.  The problem can occur in rare instances due to an issue in the firmware and is most likely to impact hypervisor data.  This issue was discovered during internal IBM testing, and has not been reported on any customer system.  However, IBM recommends that customers running on POWER7 systems with Ax720_090 and earlier firmware move to Ax720_101.  POWER7 systems running with Ax710 firmware do not have an exposure to this issue, so no action is recommended.
  • HIPER:  A problem was fixed that caused the hypervisor to delay dispatching a partition even though it was ready to run, which added latency (delays) that adversely affected performance.  This problem can affect POWER7 systems running any level of Ax720 firmware prior to Ax720_101.
  • A problem was fixed that caused certain service processor error log entries with a severity of "predictive", and a failing subsystem of "service processor firmware", to be erroneously converted to "informational".
  • A problem was fixed that caused three B181951C SRCs to be erroneously logged, and the system IPL time to increase by as much as an hour.  This problem is more likely to occur on systems with firmware level AL720_082 or AL720_090, AM720_084 or AM720_090, or AH720_082 or AH720_090 installed.
  • A problem was fixed that caused the EnergyScale firmware to erroneously go into safe mode when processor 0 was guarded out.
  • A problem was fixed that caused SRC B1812A61 to be erroneously logged.
  • A problem was fixed that prevented the setting of the boot diagnostic level in the power on/off menu (in the Advanced System Management Interface (ASMI)) from being shown correctly after it was changed.
  • A problem was fixed that prevented a system dump from being off-loaded from the service processor.  When this occurred, additional dumps were not allowed.
  • The firmware was enhanced so that a message is displayed if setting the brand keyword in the ASMI menu (System Configuration -> Program Vital Product Data -> System brand) fails because the service processor is not in the correct state.
  • The firmware was enhanced such that a call home is not made when an error logged by the system controller, node controller, or service processor is informational, or recovered, and the reset/reload bit is set.
  • A problem was fixed that caused multiple DR_DMA_MIGRATE_FAIL entries in the AIX error log.
  • A problem was fixed that caused SRC B7000803 to be erroneously logged multiple times.
  • A problem was fixed that prevented processor resources from being moved to another partition by a DLPAR (dynamic LPAR) operation.
  • A problem was fixed that prevented partitions from booting.
  • A problem was fixed that caused the HMC component interval activity report to always show 100% uncapped CPU available.
  • A problem was fixed that caused incorrect data to be displayed in the "Deconfiguration Records" menu option on the ASMI (System Service Aids > Deconfiguration Records) when a service processor was guarded out.
  • A problem was fixed that prevented the battery on the secondary service processor from being called out when it needed to be replaced.
  • The firmware was enhanced to log SRC 11007610, 11007620, 11007630, 11007640, or 11007650 only when a system fan's speed drops below 2800 RPM.
  • A problem was fixed that prevented the system fans from running at the correct speed if a service processor reset to runtime was done, then a fan failure occurred.
  • A problem was fixed that caused the green power enclosure LED to be off, instead of blinking at a slow rate, when the system is at standby.
  • On systems with AM710 system firmware, a problem was fixed that caused utility capacity on demand (CoD) processors to erroneously become enabled when a firmware upgrade was done to AM720.
  • A problem was fixed that caused an administrative service processor fail-over (AFO), followed by another AFO without a reset in between, to fail.  If this occurred during a concurrent hot add or repair maintenance operation on a 9117-MMB or 9179-MHB, the operation failed.
  • A problem was fixed that caused VIOS partitions to fail to boot.
  • A problem was fixed that caused a partition suspend operation to hang.  When this problem occurred, all subsequent suspend operations were locked out as well.
  • A problem was fixed that could cause the target partition to crash after a successful P6 to P7 partition migration.  Possible AIX error log entries include:  label: DSI_PROC, resource:  SYSVMM, with description: "DATA STORAGE INTERRUPT, PROCESSOR".  Other partition-related crash descriptors may also be logged.
  • A problem was fixed that could cause AIX error log entries following a successful partition migration.  Possible AIX error log entries include: label: RTAS_ERROR, resource: sysplanar0, with description: "INTERNAL ERROR CODE".  Other errors may also be logged.
  • A problem was fixed that caused the installation of some versions of Linux to fail.
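
The HIPER guidance at the top of this list (systems at Ax720_090 and earlier should move to Ax720_101; Ax710 systems are not exposed) can be checked mechanically.  The following is a minimal Python sketch, not an IBM tool.  The function names are hypothetical, and the reading of the level name as release, service pack level, and last disruptive level is an assumption based on how the levels are written in this document (for example, AM720_101_064).

```python
import re

# Assumed layout of a level name: A<platform letter><release>_<service pack>[_<last disruptive>]
# e.g. AM720_101_064 or AL720_090. The field meanings are an assumption, not taken from IBM tooling.
LEVEL_RE = re.compile(r"^A([A-Z])(\d{3})_(\d{3})(?:_(\d{3}))?$")


def parse_level(name):
    """Split a firmware level name into its numeric fields (hypothetical helper)."""
    match = LEVEL_RE.match(name)
    if match is None:
        raise ValueError("unrecognized firmware level name: %r" % name)
    platform, release, service_pack, last_disruptive = match.groups()
    return {
        "platform": platform,
        "release": int(release),
        "service_pack": int(service_pack),
        "last_disruptive": int(last_disruptive) if last_disruptive else None,
    }


def should_move_to_720_101(name):
    """Return True for Ax720 levels at _090 or earlier, per the HIPER note in this list."""
    level = parse_level(name)
    return level["release"] == 720 and level["service_pack"] <= 90


if __name__ == "__main__":
    for installed in ("AM720_084_064", "AM720_101_064", "AM710_119_043"):
        print(installed, should_move_to_720_101(installed))
```

The check compares only the release and service pack fields; it does not model the "certain systems" conditions listed for individual fixes.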
System firmware changes that affect certain systems
  • On systems with two HMCs attached, a problem was fixed that caused one of the HMCs to frequently go to an incomplete state.
  • On systems running IBM i partitions, a problem was fixed that caused a RAID array of SCSI disks to be exposed if an MES upgrade was done, or a system plan was created.
  • On systems running IBM i partitions, a problem was fixed that caused SRC BA040030 to be erroneously logged, and a call home to be made, even though the partition booted successfully.
  • On systems using the host Ethernet adapter (HEA) function, a problem was fixed that caused the HMC to erroneously report that deleting a logical port had failed.
  • On partitions running Active Memory Sharing (AMS), a problem was fixed that prevented shutdown of a partition when all paging VIOS partitions servicing the partition were hung and unable to complete outstanding I/O operations.
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused an AMS partition to crash with SRC B700F103.  This problem may occur when reducing the size of the AMS pool (or doing a hot node repair on a model MMB or MHB) at the same time as dynamically creating an AMS partition, or changing an AMS partition's maximum memory.
  • A problem was fixed that caused AIX licensing issues when migrating a partition from a POWER6 to a POWER7 system.
  • The "USB Service Functions" option was removed from the ASMI menus on 9117-MMB, 9179-MHB and 9119-FHB systems, which do not support this function.
  • On systems with a F/C 5802 or 5877 I/O expansion drawer, a problem was fixed that caused SRC 10003144 or 10003154 to be erroneously logged when a repair was done on the I/O drawer.
  • On systems with a F/C 5802 or 5877 I/O expansion drawer, a problem was fixed that caused the lamp test on the HMC to turn off all of the LEDs when the test was complete instead of returning them to their original states.
  • On stand-alone systems running AIX or Linux, and on systems managed by IVM (Integrated Virtualization Manager), a problem was fixed that prevented platform dumps from being off-loaded, or resulted in corrupted or incomplete platform dumps.
  • A problem was fixed that caused the exhaust heat index value displayed by IBM Director to be invalid when the system is located near, at, or below sea level.
  • On systems on which a NIM installation is being set up using the system management services (SMS) menus, the firmware was changed to limit the packet size options to 512 and 1024 bytes.
  • On systems with Selective Memory Mirroring and the Enhanced Cache Option enabled, a problem was fixed that caused unpredictable system behavior when a processor hardware failure occurred.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused the hot repair of a GX adapter to fail, and the system to crash, if the GX adapter had previously logged a non-checkstop type of error.
  • A problem was fixed that caused the system to crash with SRCs B170E540, B181F02D and B700F103 during a hot node upgrade (memory), or hot node repair, of node A.
AM720_090_064

03/07/11
Impact: Data           Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect certain systems 
  • HIPER: IBM testing has uncovered a potential undetected data corruption issue when a mobility operation is performed on an AMS (Active Memory Sharing) partition.  The data corruption can occur in rare instances due to a problem in IBM firmware.  This issue was discovered during internal IBM testing, and has not been reported on any customer system.
  • On systems with a F/C 5802 or 5877 I/O drawer attached, a problem was fixed that caused a partition to crash during a page migration operation.
  • On systems with a F/C 5723 communications adapter, a problem was fixed that prevented the adapter from being seen by partition firmware (PFW) if the adapter was not in the first 144 slots that are probed by PFW.
AM720_084_064

01/04/11
Impact:  Function       Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

New Features and Functions

  • Support for partition suspend/resume.  AIX 6.1 TL6 SP3 or later, or AIX 7.1 TL0 SP2 or later, is required for partition suspend/resume.
  • CEC Hot Node Add & Repair Maintenance (CHARM) support on Power 770 and Power 780 systems.
NOTE: Support for CHARM operations on Power 770 and Power 780 systems was introduced in 7.2.0 (AM720_064).  However, IBM recommends installing this firmware level (service pack 7.2.1, AM720_084) before starting a CHARM operation on a Power 770 or Power 780 system.

System firmware changes that affect all systems

  • HIPER:  On systems using the HEA (host Ethernet adapter) function, and on which a CEC concurrent maintenance operation that requires a node evacuation is being done, this fix corrects an issue that has the potential to corrupt information stored in the system memory, which may cause undetected data errors.  This issue was discovered during internal IBM testing, and while it has not been reported on any customer systems, IBM strongly recommends that this fix be applied to all model MMA systems that are running AIX partitions.
  • HIPER:  A problem was fixed that caused repeated reset/reloads of the service processor, and fail-overs, to occur after a hypervisor-initiated reset/reload of the service processor was completed.  That led to loss of communication between the service processor and the hypervisor (indicated by SRC B182951C).
  • A problem was fixed that caused disks that were not bootable to be displayed in the system management services boot menus.  This problem also prevented the operating system level from being displayed for bootable hard disks in the system management services boot menus.
  • A problem was fixed that caused an error log indicating a dynamic LPAR (DLPAR) error when no DLPAR operations were done, and unrecoverable SRCs BA180010 and BA250010 to be erroneously logged, when a recoverable enhanced error handling (EEH) error was logged on an I/O adapter.
  • The firmware was enhanced to use the fan speed signal, as well as the fan present signal, to determine if fans are present in a drawer.  This change keeps the firmware from shutting down a drawer, and logging 110076x1 SRCs, if the fans are functional but the fan presence signals are corrupted.
  • A problem was fixed that caused a service processor reset/reload, a service processor dump to be taken, and B181EF88 to be logged.
  • A problem was fixed that caused the managed system to go to the incomplete state on the HMC.
  • A problem was fixed that caused the system to hang at C700406E during boot.
  • A problem was fixed that caused the platform to become unresponsive; this was indicated by an "incomplete" state on the HMC.  When this problem occurred, the partitions on the managed system became unresponsive.
  • A problem was fixed that caused SRC B1561111 to be erroneously logged, and the control (operator) panel to be erroneously deactivated, if there was no activity on the control panel for several weeks.

System firmware changes that affect certain systems
  • On systems with a solid state disk drive (SSD), the fan speeds were increased to provide additional cooling to the SSDs.
  • A problem was fixed that caused a virtual SCSI or virtual fibre channel adapter to be seen by the operating system as not bootable when it was added to a partition using a dynamic LPAR (DLPAR) operation.
  • A problem was fixed that caused the system ID to change, which caused software licensing problems, when a live partition mobility operation was done where the target system was an 8203-E4A or an 8204-E8A.
  • PARTITION-DEFERRED:  A problem was fixed that caused SRC BA210000 to be erroneously logged on the target system when a partition was moved (using Live Partition Mobility) from a Power7 system to a Power6 system.
  • A problem was fixed that caused SRC BA280000 to be erroneously logged on the target system when a partition was moved (using Live Partition Mobility) from a Power7 system to a Power6 system.
  • A problem was fixed that caused a partition to hang following a partition migration operation (using Live Partition Mobility) from a system running Ax720 system firmware to a system running Ex340, or older, system firmware.
  • A problem was fixed that caused a system or partition running Linux to crash when the "serv_config -l" command was run.
  • On systems using the HEA broadcast/multicast application to send and receive millions of packets, such as video streaming, the packet storm mitigation algorithm was enhanced so that a packet will only be dropped when a packet storm is detected.
  • A problem was fixed that caused a partition to fail to reboot with SRC B2001230 and word 3 = 000000BF.  This failure can be seen on a partition that owns a PCI, PCI-E, or PCI-X slot.
  • On systems with a F/C 5802 or 5877 I/O drawer attached, and a PCI-E adapter in the CEC, a problem was fixed that caused the system to crash during a page migration operation with SRC B700F103.

Concurrent maintenance (CM) firmware fixes
  • A problem was fixed that caused the system to hang with SRC B170E540 during a node repair operation.
AM720_064_064

09/17/10
Impact:  Availability        Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

New Features and Functions

  • Support for autonomic IPL, which allows the service processor to decide which diagnostic tests to run at boot time.
  • Support for the network installation of the IBM i operating system from the hardware management console (HMC) command line interface (CLI).
  • Support for the 7216-1U2 media drawer.
  • Support for VIOS storage integration.
  • Support for 32 GB DIMMs, F/C 5602.
  • Support for CEC Hot Node Add & Repair Maintenance (CHARM) operations on Power 770 and Power 780 systems.  However, IBM recommends installing firmware service pack 7.2.1 (AM720_084) before starting a CHARM operation on a Power 770 or Power 780 system.

System firmware changes that affect all systems

  • HIPER: A problem was fixed that caused the HMC to show the server's status as incomplete, and SRC B7000602 to be logged against SFLPHMCCMDTASK in serviceable events.  This problem can also cause the system to crash when it occurs.
  • HIPER: A problem was fixed that caused an AIX or Linux partition to fail to boot with SRC B2008151, which prevented further access to that partition, and potentially prevented prior LPAR configuration changes from being completed.  A reboot of the server is required to recover from this problem.
  • HIPER: A problem was fixed that could have prevented a successful emergency fail-over to the backup service processor if multiple reset/reload commands were issued to both service processors at roughly the same time.  If the emergency fail-over was not successful, the system might have been terminated by either the service processor or the hypervisor.  This problem did not affect an administrative fail-over of the service processor.
  • HIPER: A problem was fixed that caused informational SRC B70069DA from a host Ethernet adapter (HEA) to be logged erroneously.  These messages are sent from the hypervisor to the service processor and cause unnecessary loading of the hypervisor-service processor communication link.
  • HIPER: A problem was fixed that caused a partition, or the hypervisor, to appear to hang, then recover; the time of the apparent hang varied.  SRC B182953C might also be logged.
  • A problem was fixed that prevented an SRC from being recorded in a service processor dump produced by a host-initiated reset.
  • The wording on the memory deconfiguration menu on the advanced system management interface (ASMI) was enhanced to better differentiate between guarded resources and deconfigured resources.
  • The firmware was enhanced to improve the hardware called out with SRC B121B8AB.
System firmware changes that affect certain systems
  • On systems running host Ethernet adapter (HEA), a problem was fixed that caused unrecoverable SRCs BA154050 and BA154070 to be erroneously logged.
  • On systems with a F/C 5802 or 5877 I/O drawer attached, the firmware was enhanced to allow the drawer to power on with only one working offline converter assembly (OCA).
  • On 9179-MHB systems running the enhanced cache option (ECO), a problem was fixed that periodically caused the system fans to run at full speed.



AM710
Systems  9117-MMB and 9179-MHB ONLY
For Impact, Severity and other Firmware definitions, Please refer to the below 'Glossary of firmware terms' url:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
AM710_119_043

12/07/11
Impact:  Serviceability       Severity:  SPE

New Features and Functions

  • Support for F/C 5289, a full-height 2-port async EIA-232 PCIe adapter.
  • Support for F/C 5290, a low-profile 2-port async EIA-232 PCIe adapter.

System firmware changes that affect all systems

  • A problem was fixed that prevented a system dump from being off-loaded from the service processor.  When this occurred, additional dumps were not allowed.
  • The firmware was enhanced to log SRC B1768B76 as informational instead of unrecoverable.
  • The firmware was enhanced to log SRC B1812A11 as informational, instead of service action required, when the thermal/power management device (TPMD) is successfully reset.
  • A problem was fixed that caused the message "500 - Internal Server Error." to be displayed when a setting on the Advanced System Management Interface (ASMI) power on/off menu was changed while the system was powering down.
  • A problem was fixed that caused a system's partition dates to revert back to 1969 after the service processor or its battery was replaced.  This occurred regardless of whether or not the service processor's time-of-day (TOD) clock was correctly set during the service action.
  • The firmware was enhanced to call out an isolation procedure (FSPSP63) when SRC B160B73F is logged.
  • A problem was fixed that caused SRC B7005442 to be erroneously logged, and functional processor cores to be guarded out, when an error occurred in the operating system or an application.
  • A problem was fixed that caused a partition migration or partition hibernation operation to hang with the partition left in the "suspending" state.
  • The firmware was enhanced by the addition of a new option in the system management services (SMS) "Multi-boot" menu that facilitates zoning of physical and virtual fibre channel adapters.
  • On the System Management Services (SMS) remote IPL (RIPL) menus, a problem was fixed that caused the SMS menu to continue to show that an Ethernet device is configured for iSCSI, even though the user has changed it to BOOTP.
  • On a multi-drawer system, a problem was fixed that prevented the system attention LED from correctly reflecting the status of the DASD fault LEDs in drawers 2, 3, and 4.
  • A problem was fixed that caused the wrong voltage regulator module (VRM) to be called out with SRC 11002630.
  • A problem was fixed that caused a firmware installation from the HMC with the "do not auto accept" option selected to fail.

System firmware changes that affect certain systems

  • On systems running VIOS, a problem was fixed that prevented virtual LANs (VLANs) in a VIOS with partition ID of 1 from being displayed as bootable devices in the system management services (SMS) menus.
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused a partition to crash with SRC B700F103 if the size of an AMS pool is reduced at the same time as an AMS partition is dynamically created, or an AMS partition's maximum memory is changed. 
  • On a system that terminates when in dynamic power save mode, a problem was fixed that caused SRCs B150B943, B113C660, and B113C661 to be erroneously logged when the system rebooted.
  • On partitions running Red Hat Linux 6.1, a problem was fixed that caused a partition migration operation to fail.
  • A problem was fixed that caused the installation of some versions of Linux to fail.
  • On systems running AIX, a problem was fixed that caused AIX to log an "INTERNAL ERROR CODE" against sysplanar0 after a partition migration operation from a POWER7 to a POWER6 system.
  • A problem was fixed that caused a partition migration operation to abort when the partition has more than 4096 virtual slots.
  • On systems running the Advanced Energy Manager (AEM), a problem was fixed that caused the work rate calculation for a processor to be incorrect if the system dropped into safe mode.
  • On systems running VIOS, a problem was fixed that caused the location code in the output of the "lsmap -npiv -all" command to be incorrect.
  • On systems running AIX partitions, a problem was fixed that caused the virtual memory manager in AIX to crash on the target system after the migration of a partition from a POWER6 system to a POWER7 system.
  • On systems on which a NIM installation is being set up using the system management services (SMS) menus, the firmware was changed to limit the packet size options to 512 and 1024 bytes.
  • On systems running iSCSI, a problem was fixed that caused the system to hang when booting from an iSCSI device in the system management services (SMS) menus.
  • On systems with an iSCSI network, when booting a logical partition using that iSCSI network, a problem was fixed that caused the iSCSI gateway parameter displayed on the screen to be incorrect.  It did not impact iSCSI boot functionality.
  • On systems running a virtual I/O (VIO) partition, or using a Shared Ethernet Adapter (SEA), a problem was fixed that caused a severe performance degradation.
  • On systems using Capacity on Demand (CoD), a problem was fixed that caused informational SRC B7005300 to be logged so often that the error logs wrapped, and other information in the error logs was lost.
  • On systems with more than one drawer, a problem was fixed that prevented the battery on the secondary service processor from being called out when it needed to be replaced.
AM710_117_043

07/27/11
Impact:  Availability         Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect certain systems

  • HIPER:  On systems running VIOS, a problem was fixed that caused the system to crash with SRC B700F103. 
  • On systems running more than 100 logical partitions, a problem was fixed that caused a concurrent firmware installation to fail.
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused the system to crash with SRC B170E540 after a warm boot or platform dump IPL.
  • On systems running shared processor partitions, a problem was fixed that caused a partition to hang until powered off and back on.
AM710_115_043

05/27/11
Impact:  Data               Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • The firmware was enhanced to log SRC 11007610, 11007620, 11007630, 11007640, or 11007650 only when a system fan's speed drops below 2800 RPM.
  • A problem was fixed that prevented the system fans from running at the correct speed if a service processor reset to runtime was done, then a fan failure occurred.
  • A problem was fixed that caused the exhaust heat index value displayed by IBM Director to be invalid when the system is located near, at, or below sea level.
  • A problem was fixed that caused a firmware installation to fail with SRC B181EF7C.
AM710_114_043

03/25/11
Impact:  Data               Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

New Features and Functions

  • Support for the network installation of the IBM i operating system from the hardware management console (HMC) command line interface (CLI).

System firmware changes that affect all systems

  • A problem was fixed that prevented the timed-power-on function from turning the system back on if the service processor's clock was adjusted to an earlier time.  This problem could occur, for example, in the fall when clocks are set back at the end of daylight saving time.
  • A problem was fixed that caused the managed system to go to the incomplete state on the HMC.
  • A problem was fixed that prevented the green power enclosure LED from blinking when the service processor was at standby.
  • A problem was fixed that caused incorrect fan speeds to be reported in call home data.
  • The firmware was enhanced to log an SRC when a system fan's speed drops below 3800 RPM.
  • A problem was fixed that caused multiple DR_DMA_MIGRATE_FAIL entries in the AIX error log.

System firmware changes that affect certain systems
  • HIPER: IBM testing has uncovered a potential undetected data corruption issue when a mobility operation is performed on an AMS (Active Memory Sharing) partition.  The data corruption can occur in rare instances due to a problem in IBM firmware.  This issue was discovered during internal IBM testing, and has not been reported on any customer system.
  • On systems with two or more nodes, a problem was fixed that caused SRC B1818A10 to be erroneously logged.
  • A problem was fixed that prevented the lamp test from running if it was initiated from the operator (control) panel or the ASMI (advanced system management interface).
  • A problem was fixed that caused the HMC2 port on the advanced system management interface (ASMI) to erroneously default to static IP addressing instead of dynamic.
  • A problem was fixed that caused a partition to fail to reboot, or fail to boot if it had been shut down once since the platform was booted, with SRC B2001230 and word 3 = 000000BF.  This failure can be seen on a partition that owns a PCI, PCI-E, or PCI-X slot.
  • A problem was fixed that prevented processor resources from being moved to another partition by a DLPAR (dynamic LPAR) operation.
  • On systems with a F/C 5802 or 5877 I/O drawer attached, a problem was fixed that caused a partition to crash during a page migration operation.
  • A problem was fixed that caused a component report (component interval activity) on the HMC to show 100% uncapped CPU available; it remained at 100% and did not move up or down as the other systems used and released processor resources.
  • A problem was fixed that caused AIX licensing issues when migrating a partition from a P6 to a P7 system. When this problem was present on the target system after the partition migration (but before the partition was rebooted), the AIX command "uname -m" might not have produced the expected result.
  • The firmware was enhanced to list the attached devices when viewing the adapter information for a partition profile on the HMC GUI.
  • A problem was fixed that caused a partition to crash with SRC BA330002 after several concurrent installations of system firmware, or partition migrations, without a reboot.
  • On stand-alone systems running AIX or Linux, and on systems managed by IVM (Integrated Virtualization Manager), a problem was fixed that prevented platform dumps from being off-loaded, or resulted in corrupted or incomplete platform dumps.
  • On systems with a F/C 5803 or 5877 I/O drawer attached, a problem was fixed that caused a power-on unit operation from the HMC to time out, and informational SRC 10009107 to be logged, if the following conditions were met:
          - The SPCN firmware update policy in the ASMI was set to "Expanded", instead of "Enabled" (the default and recommended setting).
          - An installation of the power firmware on the I/O drawer was taking place over the SPCN cables.
AM710_099_043

11/11/10
Impact:  Availability         Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed that caused the platform to become unresponsive; this was indicated by an "incomplete" state on the HMC.  When this problem occurred, the partitions on the managed system became unresponsive.
System firmware changes that affect certain systems
  • On systems with a F/C 5802 or 5877 I/O drawer attached, the firmware was enhanced to allow the drawer to power on with only one working offline converter assembly (OCA).
  • On systems with a F/C 5802 or 5877 I/O drawer attached, and a PCI-E adapter in the CEC, a problem was fixed that caused the system to crash during a page migration operation with SRC B700F103.
AM710_097_043

10/04/10
Impact:  Availability         Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • HIPER: This fix corrects an issue that has the potential to corrupt information stored in the POWER7 core's translation cache and may cause undetected data errors.  This issue was discovered during internal IBM testing, and while it has not been reported on any customer system, IBM strongly recommends that this fix be applied to all POWER7 systems.
  • HIPER: A problem was fixed that caused the HMC to show the server's status as incomplete, and SRC B7000602 to be logged against SFLPHMCCMDTASK in serviceable events. This problem can also cause the system to crash when it occurs.
  • HIPER: A problem was fixed that caused repeated reset/reloads of the service processor, and fail-overs, to occur after a hypervisor-initiated reset/reload of the service processor was completed.  That led to loss of communication between the service processor and the hypervisor (indicated by SRC B182951C).
  • HIPER: A problem was fixed that caused an AIX or Linux partition to fail to boot with SRC B2008151, which prevented further access to that partition and potentially prevented prior LPAR configuration changes from being completed.  A reboot of the server is required to recover from this problem.
  • HIPER:  A problem was fixed that could have prevented a successful emergency fail-over to the backup service processor if multiple reset/reload commands were issued to both service processors at roughly the same time.  If the emergency fail-over was not successful, the system might have been terminated by either the service processor or the hypervisor.  This problem did not affect an administrative fail-over of the service processor.
  • HIPER:  A problem was fixed that caused informational SRC B70069DA from a host Ethernet adapter (HEA) to be logged erroneously.  These messages are sent from the hypervisor to the service processor and cause unnecessary loading of the hypervisor-service processor communication link.
  • The firmware was enhanced to improve the hardware called out with SRC B121B8AB.
  • A problem was fixed that prevented the user from turning off an indicator LED using the ASMI menus when the previous state of the LED was fault/identify.
  • A problem was fixed that caused SRC B1812A60 to be erroneously logged.
  • The firmware was enhanced so that the bus numbering on all model MMB systems with I/O drawers or towers attached will be consistent.
  • A problem was fixed that caused an AIX or Linux partition to crash with SRC B2008151 logged.
  • A problem was fixed that occurred when a recoverable enhanced error handling (EEH) error was logged on an I/O adapter:  an error log indicating a dynamic LPAR (DLPAR) error was created even though no DLPAR operations had been done, and unrecoverable SRCs BA180010 and BA250010 were erroneously logged.

System firmware changes that affect certain systems

  • On 9179-MHB systems running the enhanced cache option (ECO), a problem was fixed that periodically caused the system fans to run at full speed.
  • The firmware was enhanced to support the network installation of the IBM i operating system from the hardware management console (HMC) command line interface (CLI).
  • On single-CEC-drawer systems, a problem was fixed that caused a system to hang in the "undetermined" state during the code activation step of a firmware installation, then go off-line completely after about ten minutes.
 
Product and Development Engineering recommends the installation of AM710_097 to eliminate any exposure to the above issues.
Updating to this level of firmware can be performed concurrently.
AM710_086_043

07/21/10
Impact:  Function         Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • HIPER:  A problem was fixed that caused a system crash with SRC B170E504 with word 8 of the SRC data = 0x01EE0005.  Although this problem can occur under other circumstances, it is most likely to occur when running shared partitions or in SMT2 (symmetric multi-threading 2) mode. 
System firmware changes that affect certain systems
  • HIPER:  On systems in dynamic power save mode using the Active Energy Manager plug-in with Systems Director, a problem was fixed that caused SRC B1812616, then a hardware checkstop (SRC B113E504), to be logged.
AM710_083_043

06/07/10
Impact:  Serviceability          Severity:  ATT

System firmware changes that affect all systems

  • The firmware was enhanced to dynamically update the IPL speed on the control (operator) panel when the IPL speed is changed by another method.
  • A problem was fixed that caused the service processor to crash with SRC B181720D due to an out-of-memory condition.
  • A problem was fixed that caused SRC B113E504, with a description of "Undefined Error Code", to be erroneously logged.
  • The firmware was enhanced such that if a memory controller fails or overheats, the memory controller/DIMM will be called out rather than the processor card.
  • Two problems were fixed that caused SRCs B181B8F8 and B181B86A to be erroneously logged during a mainstore dump.  In both cases, the SRC being logged prevented the mainstore dump data from being collected and erroneously called out hardware for replacement.
  • A problem was fixed that prevented the reset/reload bit from being set correctly in a service processor error log entry.
  • A problem was fixed that caused a call home to be erroneously made with SRC B181E911, and a service processor dump to be taken unnecessarily.
  • A problem was fixed that caused the HMC to show a status of "Incomplete" for the managed system, and numerous service processor dumps to be generated.
System firmware changes that affect certain systems
  • PARTITION-DEFERRED:  A problem was fixed that caused SRC BA210000 to be erroneously logged on the target system when a partition was moved (using Live Partition Mobility) from a Power7 system to a Power6 system.
  • A problem was fixed that caused SRC BA280000 to be erroneously logged on the target system when a partition was moved (using Live Partition Mobility) from a Power7 system to a Power6 system.
  • On systems with the Active Energy Manager in IBM Director activated, a problem was fixed that caused a small error in the processor usage calculations.
  • On systems with redundant service processors, a problem was fixed that caused SRC B181E617 to be erroneously logged and a service processor dump to be unnecessarily generated.
  • A problem was fixed that caused a system or partition running Linux to crash when the "serv_config -l" command was run.
  • On systems in an i5/OS clustering configuration, a problem was fixed that prevented a partition in an I/O pool from being deleted.
AM710_065_043

03/16/10
Impact:  New            Severity:  New

GA Level