PCIe Gen3 x16 GPU Adapter (FC EC49, EC4B, CCIN 2CE9)

 

 

******* PLEASE READ THIS ENTIRE NOTICE *********

 
DATE: November 3, 2016 

Table of Contents

 

1.0 Microcode and Document Revision History

2.0 General information

3.0 Installation time

4.0 Machines Affected

5.0 Linux Requirements

 

 

1.0 Microcode and Document Revision History

Firmware Level: 80.21.1F.00.01, 80.21.1F.00.02

Description: Latest release, required for new Hynix memory support. Compatible with current Micron memory.
Self-extracting images with NVIDIA Firmware Update Utility Version 5.323.0

 

The firmware levels below are no longer supported by IBM once they have been removed from the microcode download website.

It is best practice to update to the latest FW level, not only for IBM support of these products, but also for optimal performance and to ensure that all required HW/FW fixes are installed. Once new FW has been released to the field, we will provide a 6-month grace period for customers to update these products to the currently supported FW level.

Please update to the latest level at your earliest convenience.

Firmware Level: 80.21.1B.00.01, 80.21.1B.00.02

Description: Original release

 

 

 

Document Revision History

Date: 11/03/2016

Description: Created Readme file with latest VBIOS levels 80.21.1F.00.01 and 80.21.1F.00.02

 

 

 

 

2.0 General information

This Readme file gives directions on how to update the VBIOS on the NVIDIA K80 PCIe Gen3 x16 GPU Adapter (FC EC49, EC4B, CCIN 2CE9).

Special Requirement:

Reboot of the OS is required.

Non-Concurrent Download:

The microcode installation does NOT support concurrent download.

NOTE: It is recommended that the installation be scheduled during a maintenance window or during non-peak production periods.

Supported OS: Linux (EC49, EC4B)

· Ubuntu 14.04.2, or later, with CUDA 7.5, or later

· Red Hat Enterprise Linux 7.2, little endian, or later, with CUDA 7.5, or later

· NVIDIA driver support can be downloaded directly from NVIDIA

 

3.0 Installation time

Approximately 10 minutes.

 

4.0 Machines Affected

Feature Codes

EC49 (for the GPU on the 822LC system)
EC4B (for the GPU on the expansion drawer on the E870 or E880 systems)

CCIN 2CE9

5.0 Linux Requirements

Instructions (for Linux) to update VBIOS FW:

 

1.  Identify GPU Adapters.
In this example we have 2 K80 adapters. Each K80 shows up as 2 GPUs.

# lspci |grep -i Nvidia

0002:03:00.0 3D controller: NVIDIA Corporation Device 102d (rev a1)

0002:04:00.0 3D controller: NVIDIA Corporation Device 102d (rev a1)

0006:03:00.0 3D controller: NVIDIA Corporation Device 102d (rev a1)

0006:04:00.0 3D controller: NVIDIA Corporation Device 102d (rev a1)

 

# nvidia-smi -L

GPU 0: Tesla K80 (UUID: GPU-ee166c42-ed8a-ba18-573e-d25fa04b1713)

GPU 1: Tesla K80 (UUID: GPU-fdea1681-9e98-e775-c508-c2fa00966c26)

GPU 2: Tesla K80 (UUID: GPU-ede2afd0-6631-e8a4-c85a-e634ca885632)

GPU 3: Tesla K80 (UUID: GPU-cef54d95-55fb-9523-c4a8-5c1b9ba0ec7a)
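
The count of physical adapters can be derived from the lspci listing, since each K80 shows up as 2 GPUs. The following is a minimal sketch, not part of the IBM procedure. The function name count_k80 is hypothetical, and the match on "Device 102d" or "Tesla K80" assumes the device is listed in one of those two forms.

```shell
#!/bin/sh
# count_k80: read `lspci` output on stdin and report the number of K80 GPU
# devices and the number of physical adapters (two GPUs per adapter).
# Assumption: each K80 GPU line contains "NVIDIA" plus either
# "Device 102d" or "Tesla K80", as in the listings in this Readme.
count_k80() {
    gpus=$(grep -i 'nvidia' | grep -c -i -E 'device 102d|tesla k80')
    echo "GPUs: $gpus, adapters: $((gpus / 2))"
}

# Typical use on a live system:
#   lspci | count_k80
```

An odd GPU count would suggest that one half of an adapter is not being detected and should be investigated before flashing.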

 

2. Verify the current Firmware level with nvidia-smi -q

 

# nvidia-smi -q |grep -E 'GPU 00|VBIOS'

GPU 0002:03:00.0

    VBIOS Version                   : 80.21.1B.00.01

GPU 0002:04:00.0

    VBIOS Version                   : 80.21.1B.00.02

GPU 0006:03:00.0

    VBIOS Version                   : 80.21.1B.00.01

GPU 0006:04:00.0

    VBIOS Version                   : 80.21.1B.00.02 

 

If you see any level lower than 80.21.1F.00.01 for K80 GPU 0 and 80.21.1F.00.02 for K80 GPU 1, an update is recommended.
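
On systems with many adapters, the check above can be scripted. The following is a minimal sketch under stated assumptions: the function name outdated_gpus is hypothetical, and it flags any VBIOS level other than the two levels named in this Readme rather than performing a true "lower than" comparison, so a newer level would also be reported.

```shell
#!/bin/sh
# outdated_gpus: read `nvidia-smi -q` output on stdin and print one line per
# GPU whose VBIOS is not 80.21.1F.00.01 or 80.21.1F.00.02.
# Assumption: the output has "GPU <bus id>" header lines followed by a
# "VBIOS Version : <level>" line, as in the listing above.
outdated_gpus() {
    awk '
        /^GPU [0-9A-Fa-f]+:/ { bus = $2 }     # remember the current bus ID
        /VBIOS Version/ {
            ver = $NF                          # last field is the level
            if (ver != "80.21.1F.00.01" && ver != "80.21.1F.00.02")
                print bus " " ver " update recommended"
        }
    '
}

# Typical use on a live system:
#   nvidia-smi -q | outdated_gpus
```

No output means every GPU already reports one of the two current levels.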

 

3. Download self_extract_2080_k80_80.21.1F.00.01 and self_extract_2080_k80_80.21.1F.00.02 from Fix Central.

They are included in the downloadable package K80_2080_firmware.tgz

Unpack with command: tar -xzvf K80_2080_firmware.tgz
Change directories: cd K80_2080_firmware/

 

 

4. Verify the size and checksum of the executables before flashing:

# ls -l self_extract_2080_k80_80.21.1F.00.01

-rwxr-xr-x 1 root root 6550824 Nov 3 14:22 self_extract_2080_k80_80.21.1F.00.01

# ls -l self_extract_2080_k80_80.21.1F.00.02

-rwxr-xr-x 1 root root 6550824 Nov 3 14:22 self_extract_2080_k80_80.21.1F.00.02

 

# sum self_extract_2080_k80_80.21.1F.00.01

46130  6398

# sum self_extract_2080_k80_80.21.1F.00.02

56678  6398
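
The size and checksum comparison can also be done by a small helper instead of by eye. This is a sketch only. The function name verify_image is hypothetical, and the expected values in the usage comment are the ones shown in the listing above. The checksum is compared as text, so it must be supplied exactly as sum prints it.

```shell
#!/bin/sh
# verify_image FILE EXPECTED_SIZE EXPECTED_SUM
# Returns 0 only if the size in bytes and the first field printed by `sum`
# both match the expected values; otherwise prints a message and returns 1.
verify_image() {
    file=$1; want_size=$2; want_sum=$3
    [ -f "$file" ] || { echo "$file: not found" >&2; return 1; }
    have_size=$(wc -c < "$file" | tr -d ' ')
    have_sum=$(sum "$file" | awk '{print $1}')
    if [ "$have_size" = "$want_size" ] && [ "$have_sum" = "$want_sum" ]; then
        echo "$file: OK"
    else
        echo "$file: MISMATCH (size $have_size, sum $have_sum)" >&2
        return 1
    fi
}

# Values taken from the listing above:
#   verify_image self_extract_2080_k80_80.21.1F.00.01 6550824 46130 &&
#   verify_image self_extract_2080_k80_80.21.1F.00.02 6550824 56678
```

Do not flash an image that fails this check; download the package again instead.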

 

 

5. Update the VBIOS by running the executable files as root user.

The executable will detect all available K80 adapters and update the respective GPU with the correct VBIOS firmware.

 

Note: You must unload the NVIDIA driver before running the update.


# sudo rmmod nvidia
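
rmmod fails if another loaded module still uses the driver. The following sketch lists the loaded modules whose names start with "nvidia" and what uses each of them, so that dependent modules can be unloaded first. The function name nvidia_modules is hypothetical, and the sketch assumes the usual lsmod layout of module name, size, use count, and user list. Which modules are actually loaded depends on the installed driver.

```shell
#!/bin/sh
# nvidia_modules: read `lsmod` output on stdin and print each loaded module
# whose name starts with "nvidia", followed by the modules that use it
# ("-" if none). The first line of lsmod output is a header and is skipped.
nvidia_modules() {
    awk 'NR > 1 && $1 ~ /^nvidia/ { print $1, "used by:", ($4 == "" ? "-" : $4) }'
}

# Typical use on a live system:
#   lsmod | nvidia_modules
```

Modules reported with "used by: -" can be removed first, followed by the modules they depended on.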


Update the VBIOS for the first GPU on each K80 (image 80.21.1F.00.01):

# ./self_extract_2080_k80_80.21.1F.00.01

 

NVIDIA Firmware Update Utility

Version 5.323.0

 

                              *** IMPORTANT ***

Do not turn off the computer or attempt to reboot your computer while the

NVIDIA firmware is being updated.  If the computer is turned off, or

power is lost, you may be unable to restart your computer.

 

Searching for display adapters to update...

Adapter: PLX (8747h)          Device Path: S:02,B:01,D:00,F:00

 

Adapter: Tesla K80            Device Path: S:02,B:03,D:00,F:00

 

Firmware Image Version: 80.21.1F.00.01

Found display adapter suitable for this update:

Tesla K80            Device Path: S:02,B:03,D:00,F:00

                     Current Firmware Version: 80.21.1B.00.01

CONFIRM: You are about to update the firmware of the display adapter.

Are you sure you want to do this?

Press 'y' to confirm ('s' to skip, 'a' to abort):  y

.........................................................

Update successful.

 

Firmware image has been updated from version 80.21.1B.00.01 to 80.21.1F.00.01.

 

 

Adapter: Tesla K80            Device Path: S:02,B:04,D:00,F:00

 

Board is incompatible with firmware version 80.21.1B.00.02.

 

No more matches found.

Update the VBIOS for the second GPU on each K80 (image 80.21.1F.00.02):

# ./self_extract_2080_k80_80.21.1F.00.02

 

NVIDIA Firmware Update Utility

Version 5.323.0

 

                              *** IMPORTANT ***

Do not turn off the computer or attempt to reboot your computer while the

NVIDIA firmware is being updated.  If the computer is turned off, or

power is lost, you may be unable to restart your computer.

 

Searching for display adapters to update...

Adapter: PLX (8747h)          Device Path: S:02,B:01,D:00,F:00

 

Adapter: Tesla K80            Device Path: S:02,B:03,D:00,F:00

 

Board is incompatible with firmware version 80.21.1F.00.01.

 

Adapter: Tesla K80            Device Path: S:02,B:04,D:00,F:00

 

Firmware Image Version: 80.21.1F.00.02

Found display adapter suitable for this update:

Tesla K80            Device Path: S:02,B:04,D:00,F:00

                     Current Firmware Version: 80.21.1B.00.02

CONFIRM: You are about to update the firmware of the display adapter.

Are you sure you want to do this?

Press 'y' to confirm ('s' to skip, 'a' to abort):  y

.........................................................

Update successful.

 

Firmware image has been updated from version 80.21.1B.00.02 to 80.21.1F.00.02.

 

 

No more matches found.

 

6. At this point the new VBIOS FW has been written to the K80 adapter. However, this FW will not take effect until the system is rebooted.

Reloading the drivers is not enough; the system must be rebooted.

7. After the reboot, verify the VBIOS FW upgrade with nvidia-smi -q. This requires that you reload the nvidia driver.
Note: The modprobe command may differ depending on which version of the nvidia driver is installed in your OS.

# modprobe nvidia-352

# nvidia-smi -q |grep -E 'GPU 00|VBIOS'

GPU 0002:03:00.0

    VBIOS Version                   : 80.21.1F.00.01

GPU 0002:04:00.0

    VBIOS Version                   : 80.21.1F.00.02

GPU 0006:03:00.0

    VBIOS Version                   : 80.21.1F.00.01

GPU 0006:04:00.0

    VBIOS Version                   : 80.21.1F.00.02