Linux Guest Unresponsive After vMotion

By | February 22, 2015

Last week my Linux sysadmins had two Linux guests become unresponsive after DRS vMotioned them. The Linux team pointed the finger at the hosts the guests vMotioned from and to. I wasn’t so sure, so I grabbed the logs and opened a case with VMware.

VMware didn’t find anything, and the guest logs stopped after the vMotion. The performance graphs showed that CPU usage jumped considerably after the vMotion.

Nothing made sense until I remembered that the Linux team had been asking about using PowerCLI to change an e1000 virtual NIC to VMXNET3. I pinged the Linux guys and learned that these VMs had been ‘staged’ with the change and were due to be rebooted the day after we had the issue. They had used PowerCLI to change the NIC type (probably in the VMX file), but the change doesn’t take effect until a power cycle. The vMotion occurred before the VMs could be power cycled. I believe they did something like this:
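A minimal sketch of what such a staged NIC change might look like in PowerCLI. This is an assumption for illustration, not necessarily the command the Linux team ran. The VM name is hypothetical, and the sketch assumes a session has already been opened with Connect-VIServer.

```powershell
# Hypothetical VM name, used only for illustration.
$vmName = 'linux-guest-01'

# Assumes an existing vCenter session (Connect-VIServer was run earlier).
# Find the VM's e1000 adapters and change their type to VMXNET3.
# On a running VM the guest keeps using the old device until a power cycle.
Get-VM -Name $vmName |
    Get-NetworkAdapter |
    Where-Object { $_.Type -eq 'e1000' } |
    Set-NetworkAdapter -Type Vmxnet3 -Confirm:$false
```

Running Get-NetworkAdapter against the VM afterwards would be one way to see which adapter type the configuration now reports.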

We set up a new VM in a similar manner to their staged upgrade and did a vMotion. Boom, CPU spike. The guest wasn’t unresponsive, but there also wasn’t any application running in this VM. This makes sense, since a vMotion is like a quick stop/start of the VM, and the VMX file is probably read in at that point. Linux probably couldn’t handle the NIC swap in this manner, though.

TL;DR: if you use PowerCLI to swap a virtual NIC to VMXNET3, either power cycle the VM immediately or DISABLE DRS until after the VM is power cycled.
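One way to keep DRS away from a staged VM without touching the rest of the cluster is a per-VM DRS override. The sketch below assumes the Set-VM cmdlet's -DrsAutomationLevel parameter and an existing Connect-VIServer session. The VM name is again hypothetical.

```powershell
# Hypothetical VM name, used only for illustration.
$vmName = 'linux-guest-01'

# Stop DRS from migrating this VM while the NIC change is staged.
Get-VM -Name $vmName |
    Set-VM -DrsAutomationLevel Disabled -Confirm:$false

# ...power cycle the VM during the maintenance window...

# Afterwards, hand the VM back to the cluster's DRS setting.
Get-VM -Name $vmName |
    Set-VM -DrsAutomationLevel AsSpecifiedByCluster -Confirm:$false
```

This only blocks DRS-initiated moves. A manual vMotion or a host entering maintenance mode could still move the VM, so power cycling promptly remains the safer option.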
