A year ago I posted a blog about replacing a failed array controller, only it turned out the array controller didn’t really fail. The array controller needed firmware.
This year I actually have an array controller fail so I can share a few extra tips.
The Hardware, 5 of the following:
DL380 Gen9
3 disk groups of:
1x800gb SSD, 7×1.2TB HDD
1 Disk Group connected to the P440ar
1 Disk Group connected to a P440
1 Disk Group connected to another P440
All of the array controllers are in HBA mode, meaning the disks are passed directly to ESXi for VSAN’s consumption.
So I noticed that I was missing a diskgroup on one of the hosts, and traced that back to one of the controllers.
The ILO and IML both noted a bad controller. The Storage tab had a red X on the controller. I noted the PCI location and the SN.
Once I received the replacement part, I did the following:
- Put the host into maintenance mode
- Replace the failed card
- Powered on the host
- Set the card back into HBA mode
- Enable SSH
- SSh into box
- /opt/hp/hpssacli/bin/hpssacli ctrl all show statusSmart Array P440 in Slot 3 (HBA Mode)Controller Status: OK
Smart Array P440 in Slot 4
Controller Status: OK
Cache Status: Not Configured
Battery/Capacitor Status: OK
Smart Array P440ar in Slot 0 (Embedded) (HBA Mode)
Controller Status: OK
- /opt/hp/hpssacli/bin/hpssacli ctrl slot=4 modify hbamode=on forced
- Reboot Host
- Exit Maintenance Mode
At this point the host should find the diskgroup and disks and start resyncing.