VMworld is behind us, and we all have takeaways from the conference that we want to apply to our production environments at work. A big one for me came from the Performance Best Practices and Troubleshooting session (VSP3866). This session was jam-packed with best practices for monitoring, tuning, and troubleshooting VMs and hosts with CPU, memory, storage, or networking issues.
A lot of information was covered in a short time, and I jotted down many different scenarios and fixes that I wanted to apply to my own production cluster. The one that stuck out most for me, however, was something called The Resource Pool Priority-Pie Paradox. This is nothing new; it's been around for quite some time. Craig Risinger has a great guest blog post on Duncan Epping's Yellow Bricks blog here dating back to February of 2010. The main point of the article is that having many VMs in a production pool with high shares and few VMs in a test pool with low shares could, in some scenarios, leave each of your production VMs with less CPU and memory than your test VMs.
Although there have been many other blog posts about this subject, it was something that I had never noticed or even thought of. The main reason it has never affected our environment is that resource pool shares only kick in when contention occurs, and since we have the physical resources to support all of our VMs, we have never seen the share mechanism come into play. If contention ever does occur, however, this would become a major issue. It's best to read the post on Duncan's blog for a more in-depth explanation, but for my own learning I decided to recreate this with a simple lab example.
I have created a cluster containing two resource pools (Production and Test). Production has its shares set to High, whereas Test has its shares set to Low. I've used 6 small VMs (1 vCPU, 256 MB RAM) for this example, laid out in a 5 to 1 ratio of Production to Test. High shares outweigh Low shares 4 to 1, so if the share mechanism were to kick in, the Production resource pool would receive 80% of the resources to split amongst its 5 VMs (16%/VM) and the Test pool would receive 20% of the resources for its single VM (20%/VM). Looking at the 'Worst Case Scenario' column in the screenshots below, you see that in fact it's much better to be offered the big piece of small pie…
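To make the arithmetic above concrete, here is a minimal Python sketch of the worst-case split. It is a simplification, not how the scheduler is actually implemented: it assumes every VM is demanding resources at once, that VMs inside a pool have equal shares, and that no reservations or limits are set. The function name `per_vm_fraction` and the pool weights of 4 and 1 (standing in for the High to Low ratio) are my own illustrative choices.

```python
from fractions import Fraction


def per_vm_fraction(pools):
    """Return each pool's worst-case fraction of cluster resources per VM.

    pools maps a pool name to (pool_shares, vm_count). Assumes full
    contention, equal VMs within a pool, and no reservations or limits.
    """
    total_shares = sum(shares for shares, _ in pools.values())
    return {
        name: Fraction(shares, total_shares) / vm_count
        for name, (shares, vm_count) in pools.items()
    }


# High vs. Low pool shares expressed as a 4:1 weighting (illustrative values).
lab = per_vm_fraction({"Production": (4, 5), "Test": (1, 1)})

for pool, fraction in lab.items():
    print(f"{pool}: {float(fraction):.0%} of cluster resources per VM")
```

Under those assumptions each Production VM ends up with a smaller slice than the lone Test VM, which is the paradox in a nutshell.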
So, what is the answer? I think I will take the easy way out and say it depends. It depends on the amount of resources in your environment, it depends on the VMs that reside in your resource pools, and it depends on the limits, reservations, and shares set up on your resource pools. In this situation, simply setting the shares to Custom, with Production at 9500 and Test at 500, results in the following:
As you can see, the Production VMs increased to 636/VM and the Test VM decreased to 169. You can set the custom shares to whatever you need in order to get your desired 'Worst Case Scenario'. You can also add reservations and limits, but the main point is that you need to do the math for your environment. Remember, a 2 vCPU VM will get twice the shares of a 1 vCPU VM, which in turn will sway the numbers even more. So, right-size your VMs, keep an eye on your 'Worst Case Scenario', and if all else fails, hook up with @DuncanYB or @FrankDenemen on Twitter.
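As one way of "doing the math", the sketch below works backwards: given the number of VMs in each pool and how many times more resources you want a Production VM to get than a Test VM under contention, it suggests a custom share split. The helper `pool_shares_for_ratio` and the 10,000-share total are hypothetical choices for illustration, and the model again assumes equally sized VMs with no reservations or limits.

```python
def pool_shares_for_ratio(prod_vms, test_vms, per_vm_ratio, total_shares=10000):
    """Suggest (production_shares, test_shares) for a desired per-VM ratio.

    per_vm_ratio is how many times more resources one Production VM should
    receive than one Test VM when every VM is contending. total_shares is an
    arbitrary scale chosen for this example.
    """
    # Each pool's weight is its VM count times the per-VM priority.
    prod_weight = prod_vms * per_vm_ratio
    test_weight = test_vms
    prod_shares = round(total_shares * prod_weight / (prod_weight + test_weight))
    return prod_shares, total_shares - prod_shares


# 5 Production VMs, 1 Test VM, each Production VM favoured 4:1 over the Test VM.
prod_shares, test_shares = pool_shares_for_ratio(5, 1, 4)
print(f"Production: {prod_shares} shares, Test: {test_shares} shares")
```

Plugging the 9500/500 split from the lab into the same model gives each Production VM 19% and the Test VM 5%, which lines up with the 636 versus 169 figures above.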