NVIDIA Tesla P100 SXM2 NVLINK GPU (FC EC4C, EC4D, EC4F, CCIN EC4C)

 

 

******* PLEASE READ THIS ENTIRE NOTICE *********

 
DATE: January 23, 2017

Table of Contents

 

1.0 Microcode and Document Revision History

2.0 General information

3.0 Installation time

4.0 Machines Affected

5.0 Linux Requirements

 

 

1.0 Microcode and Document Revision History

Firmware Level      Description

86.00.26.00.02      GA release fw.
                    Self-extracting image with NVIDIA Firmware Update Utility Version 5.323.0

 

The firmware levels below are no longer supported by IBM once they have been removed from the microcode download website.

It is best practice to update to the latest FW level, not only for IBM support of these products but also for optimal performance and to ensure that all required HW/FW fixes are installed. Once new FW has been released to the field, we will provide a 6-month grace period for customers to update these products to the currently supported FW level.

Please update to the latest level at your earliest convenience.

86.00.1C.00.01      Early shipment release

 

 

 

Document Revision History

Date                Description

1/23/2017           Creating readme file with latest VBIOS – 86.00.26.00.02

 

 

 

 

2.0 General information

This Readme file gives directions on how to update the VBIOS on the NVIDIA Tesla P100 SXM2 NVLINK GPU (FC EC4C, EC4D, EC4F, CCIN EC4C).

Special Requirement:

A reboot of the OS is required for the new firmware to take effect.

Non-Concurrent Download:

The microcode installation does NOT support concurrent download.

NOTE: It is recommended that the installation be scheduled during a maintenance window or during non-peak production periods.

Supported OS
Linux

 

3.0 Installation time

Approximately 10 minutes.

 

4.0 Machines Affected

S822LC (8335-GTB)

Feature Codes:

EC4C Air-Cooled NVIDIA Tesla P100 GPU (for First Pair)
EC4D Air-Cooled NVIDIA Tesla P100 GPU (for Second Pair)
EC4F Water-Cooled NVIDIA Tesla P100 GPU

CCIN EC4C

5.0 Linux Requirements

Instructions (for Linux) to update VBIOS FW:

 

1.  Identify GPU Adapters.
In this example we have 1 GP100 adapter. Each GP100 shows up as 1 GPU.

# lspci |grep -i Nvidia

0002:01:00.0 3D controller: NVIDIA Corporation GP100GL (rev a1)

# nvidia-smi -L

GPU 0: Tesla P100-SXM2-16GB (UUID: GPU-3a3e26d8-8c35-59de-caa1-8a64cc270df9)
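
On systems with several adapters, it can help to extract the PCI addresses from the lspci output so they can be cross-checked against nvidia-smi later. The following is a minimal sketch, not part of the firmware package: the function name nvidia_pci_addresses is hypothetical, and it assumes the GPUs are listed as "3D controller" devices, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the firmware package): reads lspci-style
# text on stdin and prints the PCI address of each NVIDIA 3D controller.
nvidia_pci_addresses() {
    awk 'tolower($0) ~ /nvidia/ && /3D controller/ { print $1 }'
}

# Only query the hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci | nvidia_pci_addresses
fi
```

The number of addresses printed should match the number of GPUs reported by nvidia-smi -L.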

 

2. Verify the current Firmware level with nvidia-smi -q

 

# nvidia-smi -q |grep -E 'GPU 00|VBIOS'

GPU 0002:01:00.0

    VBIOS Version                   : 86.00.26.00.02

If any Tesla P100-SXM2 GPU reports a level lower than 86.00.26.00.02, an update is recommended.
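
The comparison against 86.00.26.00.02 can also be scripted. The sketch below is an illustration only: the helper names vbios_to_int and vbios_lt are hypothetical, and it assumes that a VBIOS version always consists of five dot-separated, two-digit hexadecimal fields (which is consistent with the levels 86.00.1C.00.01 and 86.00.26.00.02 listed in this document). It uses the nvidia-smi --query-gpu option to read the version of each GPU.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the firmware package).
# Assumption: a VBIOS version is five dot-separated, two-digit hexadecimal
# fields, as in 86.00.1C.00.01.

# Convert a version string to a single integer by dropping the dots
# and reading the remaining digits as one hexadecimal number.
vbios_to_int() {
    printf '%d\n' "0x$(printf '%s' "$1" | tr -d '.')"
}

# Succeeds (exit status 0) when version $1 is lower than version $2.
vbios_lt() {
    [ "$(vbios_to_int "$1")" -lt "$(vbios_to_int "$2")" ]
}

required=86.00.26.00.02

# Only query the GPUs when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,vbios_version --format=csv,noheader |
    while IFS=', ' read -r idx ver; do
        # Skip lines that do not look like a version (for example an error message).
        case $ver in
            ''|*[!0-9A-Fa-f.]*) continue ;;
        esac
        if vbios_lt "$ver" "$required"; then
            echo "GPU $idx: VBIOS $ver is lower than $required, update recommended"
        else
            echo "GPU $idx: VBIOS $ver is at or above $required"
        fi
    done
fi
```

Comparing the fields as hexadecimal matters here: 1C is lower than 26, which a plain decimal comparison would not handle.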

 

3. Download auto_confirm_self_extract_h403_0201_890_0__8600260002 from Fix Central

 

 

4. Verify the size and checksum of the executable before flashing:

# ls -l auto_confirm_self_extract_h403_0201_890_0__8600260002

-rwxr-xr-x. 1 root root 6550824 Jan 18 13:22 auto_confirm_self_extract_h403_0201_890_0__8600260002

# sum auto_confirm_self_extract_h403_0201_890_0__8600260002

12336  6398
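
The size and checksum check can be automated so that a corrupted download is never flashed. This is a minimal sketch: the function name verify_image is hypothetical, and the expected values are the ones shown in the listing above. It assumes that sum uses its default BSD algorithm with 1K blocks, which matches the output shown.

```shell
#!/bin/sh
# Hypothetical helper (not part of the firmware package): succeeds only if the
# file's byte size, BSD checksum and 1K block count (as printed by `sum`)
# all match the expected values.
verify_image() {
    file=$1 want_size=$2 want_sum=$3 want_blocks=$4
    [ -r "$file" ] || return 2
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size=$(( $(wc -c < "$file") ))
    # Reading from stdin keeps the file name out of the output of sum.
    set -- $(sum < "$file")
    [ "$size" -eq "$want_size" ] &&
    [ "$1" -eq "$want_sum" ] &&
    [ "$2" -eq "$want_blocks" ]
}

image=auto_confirm_self_extract_h403_0201_890_0__8600260002
if [ -e "$image" ]; then
    if verify_image "$image" 6550824 12336 6398; then
        echo "$image: size and checksum match"
    else
        echo "$image: MISMATCH, do not flash" >&2
    fi
fi
```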

 

5. Update the VBIOS by running the executable file as the root user.

The executable detects all available GP100 adapters and updates each GPU with the new VBIOS firmware.

 

Note: You must unload the nvidia driver, and any other modules that depend on it, before running the update. You may first need to stop services or applications that are using the nvidia modules, such as nvidia-persistenced. Unload dependent modules (for example nvidia_uvm) before the nvidia module itself.

# systemctl stop nvidia-persistenced

# rmmod nvidia_uvm nvidia
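
Before starting the update, it is worth confirming that no nvidia modules remain loaded. The following is a minimal sketch under the assumption that all relevant modules have names beginning with "nvidia"; the function name loaded_nvidia_modules is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the firmware package): prints the name of
# every loaded module whose name starts with "nvidia". The module list is read
# from the given file, by default /proc/modules.
loaded_nvidia_modules() {
    awk '$1 ~ /^nvidia/ { print $1 }' "${1:-/proc/modules}"
}

if [ -r /proc/modules ]; then
    still_loaded=$(loaded_nvidia_modules)
    if [ -n "$still_loaded" ]; then
        # Unquoted on purpose so the names are printed on one line.
        echo "Still loaded, do not flash yet:" $still_loaded
    else
        echo "No nvidia modules are loaded."
    fi
fi
```

If a module is reported as still loaded, a service or application is probably still holding the driver open and must be stopped first.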


Update the VBIOS on the GP100 adapter:

# ./auto_confirm_self_extract_h403_0201_890_0__8600260002

NVIDIA Firmware Update Utility

Version 5.323.0

 

                              *** IMPORTANT ***

Do not turn off the computer or attempt to reboot your computer while the

NVIDIA firmware is being updated.  If the computer is turned off, or

power is lost, you may be unable to restart your computer.

 

Searching for display adapters to update...

Adapter: Tesla P100-SXM2-16GB Device Path: S:03,B:01,D:00,F:00

 

Firmware Image Version: 86.00.26.00.02

Found display adapter suitable for this update:

Tesla P100-SXM2-16GB Device Path: S:03,B:01,D:00,F:00

                     Current Firmware Version: 86.00.26.00.02

UPDATE WILL BEGIN IN ABOUT 3 SECONDS.

 

....................................

Update successful.

 

Firmware image has been updated from version 86.00.26.00.02 to 86.00.26.00.02.

 

No more matches found.
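
When the update is run from a script, the output of the utility can be captured and checked for the success message. This is a sketch only: it assumes the utility prints the line "Update successful." exactly as in the sample run above, and the function name update_succeeded and the log path are hypothetical. It does not confirm that every adapter in a multi-GPU system was updated.

```shell
#!/bin/sh
# Hypothetical helper (not part of the firmware package): succeeds if the
# captured output contains the "Update successful." line shown in the sample run.
update_succeeded() {
    grep -q '^Update successful\.' "$1"
}

# Example use (commented out so that nothing is flashed by accident):
#   ./auto_confirm_self_extract_h403_0201_890_0__8600260002 2>&1 | tee /tmp/vbios_update.log
#   update_succeeded /tmp/vbios_update.log && echo "Reboot to activate the new VBIOS."
```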

 

6. At this point the new VBIOS FW has been written to the GP100 adapter. However, this FW will not take effect until the system is rebooted.

Reloading the drivers is not enough; the system must be rebooted.

 

7. After the reboot, verify the VBIOS FW upgrade with nvidia-smi -q. The nvidia driver must be loaded for this check; load it manually if it was not loaded at boot.
Note: The modprobe command may differ depending on the version of the nvidia driver installed in your OS.

# modprobe nvidia

# nvidia-smi -q |grep -E 'GPU 00|VBIOS'

GPU 0002:01:00.0

    VBIOS Version                   : 86.00.26.00.02