I have been working with VMware on a possible Fault Tolerance (FT) bug and reached the point where I needed to recreate the bug while they had enhanced logging enabled.
My test bed consisted of the following:
- 3 x BL460c Gen8 hosts with 256GB RAM and 8 FlexNICs (3 VMkernel ports: Management, vMotion, NFS/FT)
- 30 RHEL 6.4 VMs with varying amounts of memory and CPU (I put CPU limits on all of the VMs) and a 100GB ext4 filesystem mounted at /local
- 4 RHEL 6.4 VMs with 1 vCPU and 4GB of memory, used for FT
- The stress package, built and installed at /usr/local/bin/stress
In each VM I ran stress with various combinations of options, but this is the general format:
- stress -c 10 -i 10 --vm 4 --vm-bytes 1G
- This spins up 10 processes hammering the CPU, 10 processes calling sync() in a loop (which, I believe, flushes the buffer cache to disk), and 4 processes each allocating 1GB of memory.
- You can also add "--hdd 10 --hdd-bytes 10G" to get 10 processes that each write the amount given by --hdd-bytes to disk. Be sure to run it from /local or another directory with sufficient free space, or it will abort.
- EDIT: 06/11/2015 -> After using the tool more, I have standardized on stress -c 1 --vm 1 --vm-bytes 3G. For a single-vCPU FT VM, a single CPU process is enough, and I didn't really need to test the IO. My FT VMs are usually 4GB, sometimes 8GB, so I use a single memory process sized between 3GB and 7GB. This simplifies the settings greatly.
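The standardized settings above can be wrapped in a couple of small shell helpers. This is only a sketch, not something from my original test runs: the function names stress_args and has_free_gb are made up for illustration, and the 1GB of headroom left for the guest OS is an assumption inferred from the 4GB -> 3G and 8GB -> 7G pairing.

```shell
#!/bin/sh
# Sketch only: helper names and HEADROOM_GB are assumptions, not part of stress.

STRESS_BIN=/usr/local/bin/stress   # install path used in the test bed above
HEADROOM_GB=1                      # assumed memory left for the guest OS

# Print the stress arguments for a single-vCPU VM with the given RAM in GB.
stress_args() {
    vm_gb=$1
    if [ "$vm_gb" -le "$HEADROOM_GB" ]; then
        echo "VM too small for the assumed headroom: ${vm_gb}GB" >&2
        return 1
    fi
    echo "-c 1 --vm 1 --vm-bytes $((vm_gb - HEADROOM_GB))G"
}

# Succeed if the filesystem holding directory $1 has at least $2 GB available.
# Uses POSIX df output (-P) in 1K blocks (-k); column 4 is the available space.
has_free_gb() {
    dir=$1
    need_gb=$2
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}
```

On a 4GB FT VM this would be run as "$STRESS_BIN" $(stress_args 4). If you add the --hdd options, a guard such as cd /local && has_free_gb . 100 before launching stress avoids the abort caused by a full filesystem; the 100 here is just an example threshold.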
For my specific tests I needed to migrate the primary and secondary FT VMs around and also turn FT on and off.
For turning FT on and off I used a PowerCLI function from vNiklas, which is included in the code below.
For migrating the VMs around I wrote my own function.
    if (( Get-PSSnapin -name Vmware.Vimautomation.core -ErrorAction SilentlyContinue ) -eq $null )
    {
        Add-PSSnapin vmware.vimautomation.core
    }

    function Set-VMFaultTolerance{
    <#
    .SYNOPSIS
    Enable FT for the VM sent as parameter

    .DESCRIPTION
    Use this function to enable or disable Fault Tolerance

    .PARAMETER xyz

    .NOTES
    Author: Niklas Akerlund / RTS
    Date: 2012-02-27
    #>
        param (
            [Parameter(Position=0,Mandatory=$true,HelpMessage="A virtual machine",
            ValueFromPipeline=$True)]
            $VM,
            $VMHost = $null,
            [switch]$enableFT,
            [switch]$disableSecondaryFT,
            [switch]$removeFT,
            [switch]$failOverFT,
            [switch]$restartSecondaryFT
        )

        if ($VM.ExtensionData.Config.FtInfo -ne $null){
            $VM = Get-VM $VM | where {$_.Extensiondata.Config.FtInfo.Role -eq 1}
        } else {
            $VM = Get-VM $VM
        }

        if ($VMHost -ne $null){
            $VMHost = Get-VMHost $VMHost | Get-View
            if ($VMHost -ne $null) {
                $VMHostobj = New-Object VMware.Vim.ManagedObjectReference
                $VMHostobj.type = "HostSystem"
                $VMHostobj.value = $VMHost.MoRef.Value
                $VMHost = $VMHostobj
            }
        }

        if ($enableFT) {
            if ($VM.ExtensionData.Config.FtInfo -eq $null){
                $VMview = $VM | Get-View
                $VMview.CreateSecondaryVM($VMHost)
            } else{
                $VMsec = Get-VM $VM.Name | where {$_.Extensiondata.Config.FtInfo.Role -eq 2}| Get-View
                $VMobj = New-Object VMware.Vim.ManagedObjectReference
                $VMobj.type = "VirtualMachine"
                $VMobj.value = $VMsec.MoRef.Value
                $VMview = $VM | Get-View
                $VMview.EnableSecondaryVM($VMobj, $null)
            }
        }elseif ($disableSecondaryFT) {
            if ($VM.ExtensionData.Config.FtInfo -ne $null){
                $VMsec = Get-VM $VM.Name | where {$_.Extensiondata.Config.FtInfo.Role -eq 2 -and $_.PowerState -eq "PoweredOn"}| Get-View
                if ($VMsec -ne $null){
                    $VMobj = New-Object VMware.Vim.ManagedObjectReference
                    $VMobj.type = "VirtualMachine"
                    $VMobj.value = $VMsec.MoRef.Value
                    $VMview = $VM | Get-View
                    $VMview.DisableSecondaryVM($VMobj)
                }else {
                    Write-Host "The Secondary is already disabled"
                }
            }else {
                Write-Host "This VM is not FT enabled"
            }
        }elseif ($failOverFT) {
            if ($VM.ExtensionData.Config.FtInfo -ne $null){
                $VMsec = Get-VM $VM.Name | where {$_.Extensiondata.Config.FtInfo.Role -eq 2 -and $_.PowerState -eq "PoweredOn"}| Get-View
                if ($VMsec -ne $null){
                    $VMobj = New-Object VMware.Vim.ManagedObjectReference
                    $VMobj.type = "VirtualMachine"
                    $VMobj.value = $VMsec.MoRef.Value
                    $VMview = $VM | Get-View
                    $VMview.MakePrimaryVM($VMobj)
                }else {
                    Write-Host "The Secondary is disabled"
                }
            }else {
                Write-Host "This VM is not FT enabled"
            }
        }elseif ($restartSecondaryFT) {
            if ($VM.ExtensionData.Config.FtInfo -ne $null){
                $VMsec = Get-VM $VM.Name | where {$_.Extensiondata.Config.FtInfo.Role -eq 2 -and $_.PowerState -eq "PoweredOn"}| Get-View
                if ($VMsec -ne $null){
                    $VMobj = New-Object VMware.Vim.ManagedObjectReference
                    $VMobj.type = "VirtualMachine"
                    $VMobj.value = $VMsec.MoRef.Value
                    $VMview = $VM | Get-View
                    $VMview.TerminateFaultTolerantVM($VMobj)
                }else {
                    Write-Host "The Secondary is disabled"
                }
            }else {
                Write-Host "This VM is not FT enabled"
            }
        }elseif ($removeFT){
            if ($VM.ExtensionData.Config.FtInfo -ne $null){
                $VMview = Get-VM $VM | where {$_.Extensiondata.Config.FtInfo.Role -eq 1}| Get-View
                $VMview.TurnOffFaultToleranceForVM()
            } else {
                Write-Host "This VM is not FT enabled"
            }
        }
    }

    function migrate-FT{
    <#
    .SYNOPSIS
    Migrate FT primary or secondary vm

    .DESCRIPTION
    Use this function to move the primary or secondary vm as part of a stress test

    .PARAMETER xyz

    .NOTES
    Author: Chris Chua
    Date: 2014-11-04
    #>
        param (
            [Parameter(Position=0,Mandatory=$true,HelpMessage="A virtual machine",
            ValueFromPipeline=$True)]
            $VM,
            [switch]$movePrimary,
            [switch]$moveSecondary
        )

        if($movePrimary){
            $FTnum=1
            $FTsecondnum=2
        }
        if($moveSecondary){
            $FTnum=2
            $FTsecondnum=1
        }

        if ($VM.ExtensionData.Config.FtInfo -ne $null){
            $PrimaryVM = Get-VM $VM | where {$_.Extensiondata.Config.FtInfo.Role -eq $FTnum}
            $PrimaryHost = $PrimaryVM.host.name
            $SecondaryVM = Get-VM $VM | where {$_.Extensiondata.Config.FtInfo.Role -eq $FTsecondnum}
            $SecondaryHost = $SecondaryVM.host.name
        } else {
            Write-Host "VM does not have FT enabled"
            return 1
        }

        $cluster = Get-Cluster -VM $PrimaryVM
        foreach ($vmhost in $cluster | Get-VMHost){
            if (($vmhost.name -ne $PrimaryHost) -and ($vmhost.name -ne $SecondaryHost)){
                $DestinationHost=$vmhost.Name
            }
        }
        Move-VM -VM $PrimaryVM -Destination (Get-VMHost $DestinationHost) -RunAsync
    }

    Connect-VIServer VCENTER

    while($true) {
        $vm1 = Get-VM "FTVM" | where {$_.Extensiondata.Config.FtInfo.Role -eq 1}
        Write-Host "moving primaries"
        migrate-FT -VM $vm1 -movePrimary
        Write-Host "moving secondaries"
        migrate-FT -VM $vm1 -moveSecondary
        Write-Host "disabling ft"
        Set-VMFaultTolerance -VM $vm1 -removeFt
        Write-Host "enabling ft"
        $vm1 = Get-VM "FTVM"
        Set-VMFaultTolerance -VM $vm1 -enableFT
        Write-Host "failover ft"
        $vm1 = Get-VM "FTVM" | where {$_.Extensiondata.Config.FtInfo.Role -eq 1}
        set-vmfaulttolerance -VM $vm1 -failoverFT
        sleep 120
    }
Replace the VCENTER and FTVM placeholders with your vCenter server and the name of your FT VM and you are good to go. The script loops forever, so stop it manually (Ctrl+C) when you are done.
I also added alarms on the FT test cluster so that I would know if an FT VM had to be restarted by HA, which would indicate that something had gone wrong.