Updated: Comparing Performance R5 vs R1 with and without Compression/Dedupe on All-Flash VSAN

March 27, 2017

In my last post, I shared some HCI Bench numbers in different configurations.

Chen Wei (the author of HCI Bench) thought the numbers were a bit low and offered to review my config and logs. He recommended decreasing the number of disks and threads, clearing the host cache, and preparing the disks. Initially I had just used the ‘easy’ setup, which already does most of those things automatically, but I got impatient waiting for the tests to finish (disk prep alone usually took two hours).

He also suggested a few advanced settings:


esxcfg-advcfg -s 131072 /LSOM/blPLOGCacheLines

esxcfg-advcfg -s 32768 /LSOM/blPLOGLsnCacheLines

[root@w2-pe-vsan-esx-001:~] vsish
/> cat /config/LSOM/intOpts/blLLOGCacheLines
Vmkernel Config Option {
   Default value:128
   Min value:16
   Max value:32768
   Current value:128
   hidden config option:1
   Description:Number of cache lines for bucketlist in LLOG (Requires Reboot)
}
/> exit


Get the max value (32768 here, though it might be different in your environment) and set the option to that max by running:

[root@w2-pe-vsan-esx-001:~] esxcfg-advcfg -s 32768 /LSOM/blLLOGCacheLines
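For reference, you can confirm what each host is actually running with esxcfg-advcfg -g before and after making the change. This is just a sketch of the check I'd do on every host in the cluster; the LLOG option's description says it requires a reboot, and I'm assuming the two PLOG options behave the same way:

# Check the current values on each host (before and after setting them)
esxcfg-advcfg -g /LSOM/blPLOGCacheLines
esxcfg-advcfg -g /LSOM/blPLOGLsnCacheLines
esxcfg-advcfg -g /LSOM/blLLOGCacheLines
# ...then reboot the host so the new values take effect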

I didn’t find any documentation on blLLOGCacheLines, but based on the issue I was having he mentioned that the VSAN objects were not being distributed across the cluster, which hurt the R5 tests the most. The blPLOGCacheLines and blPLOGLsnCacheLines settings are mentioned in the VSAN Performance Whitepaper.

My standard ESXi host settings also disable SSH after 15 minutes, which meant the host cache was not being cleared (the cache is cleared between disk prep and disk warmup, which is at least a few hours into the test). From the command line I increased the timeouts to six hours (you can also do this from Advanced Settings in the web client):

esxcli system settings advanced set -o /UserVars/ESXiShellTimeOut -i 21600

esxcli system settings advanced set -o /UserVars/ESXiShellInteractiveTimeOut -i 21600
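To double-check that the change took effect (21600 seconds is 6 hours), something like this should show the new values:

# Verify the timeout settings; both should now report 21600
esxcli system settings advanced list -o /UserVars/ESXiShellTimeOut
esxcli system settings advanced list -o /UserVars/ESXiShellInteractiveTimeOut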

Here are my updated results:

| RAID | R5 (Compression/Dedupe) | R1 (Compression/Dedupe) | R5 (No Compression/No Dedupe) | R1 (No Compression/No Dedupe) | R1 Hybrid |
|---|---|---|---|---|---|
| IOPS (IO/s) | 67909.1 | 125662.92 | 59836.49 | 160063.7 | 61970.21 |
| THROUGHPUT (MB/s) | 265.27 | 490.86 | 233.75 | 625.24 | 242.06 |
| LATENCY (ms) | 7.5479 | 4.1603 | 8.5948 | 3.3625 | 9.4491 |
| R_LATENCY (ms) | 4.4493 | 4.1294 | 1.9708 | 2.4579 | 4.9589 |
| W_LATENCY (ms) | 14.7758 | 4.2324 | 24.0651 | 5.4739 | 19.9434 |


| | R5 w/compress | R1 w/compress | R5 | R1 | R1 Hybrid |
|---|---|---|---|---|---|
| R5 w/compress | | 54.04% | 113.49% | 42.43% | 109.58% |
| R1 w/compress | 185.05% | | 210.01% | 78.51% | 202.78% |
| R5 | 88.11% | 47.62% | | 37.38% | 96.56% |
| R1 | 235.70% | 127.38% | 267.50% | | 258.29% |
| R1 Hybrid | 91.25% | 49.31% | 103.57% | 38.72% | |
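Each cell in that matrix is simply the row configuration’s IOPS divided by the column configuration’s IOPS from the first table. As a quick spot-check (awk here is just a convenient calculator, my own choice and not part of the test workflow):

awk 'BEGIN { printf "%.2f%%\n", 125662.92 / 160063.7 * 100 }'   # R1 w/compress vs. R1: ~78.51%
awk 'BEGIN { printf "%.2f%%\n", 67909.1 / 125662.92 * 100 }'    # R5 w/compress vs. R1 w/compress: ~54.04%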

Both R1 cases picked up roughly 60K IOPS compared to my previous results.

The R5 results also look different now that they are no longer hindered by the configuration settings, with R5 w/compression actually edging out R5 w/o compression.

Note that this system has two diskgroups on each of the four nodes.

For raw power, R1 with no compression is the best choice, but it isn’t really the most economical. For the best balance, my conclusion is still the same: R1 with compression. If compression cuts storage use in half, you still get about 78% of the IOPS you would have without compression.

I was trying to lean towards R5 w/compression as a good bang for your buck, but it delivers only about 54% of the IOPS of R1 w/compression (and 42% of R1 without compression), and it is definitely not a 2x gain in storage efficiency over R1 with compression.
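To put a rough number on the space side, assuming the standard vSAN FTT=1 overheads (RAID-1 mirroring consumes 2x raw capacity, while RAID-5 erasure coding is 3+1 and consumes about 1.33x):

# Usable capacity from 100 units of raw capacity, before any dedupe/compression
awk 'BEGIN { raw=100; r1=raw/2; r5=raw/(4.0/3); printf "R1: %.1f   R5: %.1f   R5/R1: %.2fx\n", r1, r5, r5/r1 }'
# R1: 50.0   R5: 75.0   R5/R1: 1.50x -> roughly 1.5x more usable space, not 2x

Since dedupe/compression applies to both layouts, that ratio stays about the same once space efficiency is turned on.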

For the 4-node use case, I am very happy with the R5 with compression option. I did not have requirements for blazing-fast speed, but I definitely needed the space savings to make the economics of VSAN Enterprise and the SSDs work.
