VCP 5 - Objective 5.1 – Create and Configure VMware Clusters

Describe DRS virtual machine entitlement

When available resources do not meet the demands of the environment, contention occurs. When it does, you need to know how much of each resource a VM is entitled to. This is controlled by the resource allocation settings of the VMs, which are broken down into three categories: shares, reservations, and limits.

Shares

Shares specify the relative importance of a VM (or resource pool). For example, if one VM has twice as many shares as another, it is entitled to twice as much of the resource when contention occurs.
Shares can be set to High, Normal, Low, or Custom. The first three map to a 4:2:1 ratio.
- High – 2000 shares/CPU, 20 shares/MB of configured VM Memory
- Normal – 1000 shares/CPU, 10 shares/MB of configured VM Memory
- Low – 500 shares/CPU, 5 shares/MB of configured VM Memory.
- Custom – specified by the user. Beware: this value stays the same as VMs are powered on and off.
Shares only make sense when applied at a sibling level. A parent container can be assigned shares, and all of its child objects are assigned shares that reflect their relative importance within that parent container.
Shares apply only to powered-on VMs.
When a new VM is powered on, the relative priority of all of its sibling VMs changes.
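
To make the share math concrete, here is a minimal sketch of how a contended resource might be split among sibling VMs in proportion to their shares. The function name and the VM names are made up for illustration; this is not how vSphere exposes the calculation, just the arithmetic behind it.

```python
def divide_by_shares(capacity, sibling_shares):
    """Split contended capacity among powered-on siblings in proportion to shares."""
    total = sum(sibling_shares.values())
    return {name: capacity * shares / total
            for name, shares in sibling_shares.items()}


# Two single-vCPU siblings (2000 and 1000 CPU shares) competing for 3000 MHz.
siblings = {"vm-a": 2000, "vm-b": 1000}
print(divide_by_shares(3000, siblings))
# {'vm-a': 2000.0, 'vm-b': 1000.0}

# Powering on a third sibling changes the relative priority of the others.
siblings["vm-c"] = 2000
print(divide_by_shares(3000, siblings))
# {'vm-a': 1200.0, 'vm-b': 600.0, 'vm-c': 1200.0}
```

Note how vm-a and vm-b keep the same share values in both runs, yet each receives less once vm-c joins as a sibling.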

Reservations

Reservations specify the guaranteed minimum allocation of resources for a VM.
You may only power on a VM if there are enough unreserved resources to meet the VM's reservation.
The host will guarantee the reservation, even when contention occurs.
Reservations are specified in concrete units (MHz or MB) and default to 0.

Limits

Limits specify the upper bound for CPU, memory, or storage I/O that can be allocated.
A host can always allocate more resources than a VM's reservation, but never more than a VM's limit, whether contention is occurring or not.
Expressed in concrete units.
Default is unlimited and in most cases there is no need to change this.
Benefits – lets you simulate having fewer resources, or contention.
Drawbacks – could waste idle resources. Resources cannot be assigned above a VM's limit even if they are available.
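
The reservation and limit rules above can be sketched in a few lines. This is a simplified model with hypothetical function names, assuming a single resource measured in one unit (say MHz); it ignores shares entirely.

```python
def can_power_on(host_capacity, already_reserved, vm_reservation):
    """A VM may power on only if unreserved capacity covers its reservation."""
    unreserved = host_capacity - already_reserved
    return vm_reservation <= unreserved


def guaranteed(demand, reservation=0):
    """Portion of a VM's demand that is guaranteed even under contention."""
    return min(demand, reservation)


def satisfiable(demand, limit=None):
    """Most a VM can ever receive; limit=None models the 'unlimited' default."""
    return demand if limit is None else min(demand, limit)


print(can_power_on(10000, 8000, 2500))   # False, only 2000 is unreserved
print(guaranteed(4000, reservation=1000))
print(satisfiable(4000, limit=3000))     # capped even if the host is idle
```

The last call shows the drawback mentioned above: the VM wants 4000 but can never get more than its limit, regardless of how idle the host is.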

Create/Delete a DRS/HA Cluster

DRS Clusters are a collection of ESXi hosts with shared resources. DRS gives you the following cluster level resource management capabilities.

Load Balancing – the usage and distribution of CPU and memory among all hosts and VMs is continuously monitored. DRS compares this to an ideal resource utilization given the attributes of the cluster's resource pools and VMs, the current demand, and the imbalance target. Depending on the settings, it will then perform or recommend migrations to balance the load.
Power Management – When DPM is enabled, DRS compares the total resources of the cluster to the demands of the cluster's VMs, including recent history. If possible, it migrates VMs off of hosts so that those hosts can be placed into standby power mode.
Affinity Rules – Allows you to control the placement of VMs to hosts by assigning rules.
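
As a rough illustration of the load balancing check, the sketch below treats each host's load as total VM demand divided by host capacity and measures imbalance as the standard deviation of those loads. That is an assumption based on the "host load deviation" values shown on the cluster Summary tab; the real DRS algorithm is more involved, and the function names and target value here are hypothetical.

```python
from statistics import pstdev


def host_load(vm_demands_mhz, capacity_mhz):
    """Normalized load of one host: total VM demand over host capacity."""
    return sum(vm_demands_mhz) / capacity_mhz


def is_imbalanced(hosts, target_deviation):
    """hosts is a list of (vm_demands_mhz, capacity_mhz) tuples."""
    loads = [host_load(vms, cap) for vms, cap in hosts]
    return pstdev(loads) > target_deviation


cluster = [([4000, 3000], 10000), ([1000], 10000)]
print(is_imbalanced(cluster, target_deviation=0.2))
```

A more aggressive migration threshold would correspond to a smaller target deviation in this model, so more clusters would count as imbalanced.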

There are a few requirements before you can create a DRS cluster.

All hosts within the cluster need to be attached to shared storage
All volumes on the hosts must use the same volume names.
All processors must be from the same vendor class and the same processor family. EVC will help to solve the feature differences between the family, but processors must be of the same family.
All vMotion requirements must be met (explained later)

Creating an HA/DRS Cluster

Right click a Datacenter object and select 'New Cluster'
Give the cluster a name.
Check whether to enable or disable HA and/or DRS in the cluster.
Select an automation level for DRS.
- Manual – Initial placement and migration recommendations are both displayed and need to be approved.
- Partially Automated – Initial placement is performed automatically; migration recommendations are displayed.
- Fully Automated – Initial placement and migration are both fully automated.
Set the migration threshold (Priority 1 – Priority 5)
Select whether to enable DPM and configure its settings
- Off – no DPM
- Manual – Power on/off recommendations are displayed but must be applied manually
- Automatic – vCenter will bring hosts in and out of standby according to the threshold settings
Select whether to enable host monitoring for HA or not (this allows the hosts to exchange their heartbeats).
Select whether to enable or disable Admission control and set the desired Admission Control Policy.
- Host failures the cluster tolerates – Specified in the number of hosts.
- Percentage of cluster resources reserved as failover spare capacity – % for CPU and memory
- Specify failover hosts – specify host to use for HA failover.
Specify the Virtual Machine Cluster Defaults
- VM restart priority – Disabled, Low, Medium, High
- Host Isolation response – Leave Powered On, Power Off, Shutdown
Select the VM Monitoring Settings (Disabled, VM Monitoring, VM and Application Monitoring) and the monitoring sensitivity (Low, Medium, High)
Select whether to enable or disable EVC and select its corresponding Mode.
Select your swap file policy for VMs in the cluster.
- Store with Virtual Machine
- Store on a datastore specified by host

Deleting an HA/DRS Cluster

Pretty easy: right click on the cluster and select 'Remove'.

Add/Remove ESXi Hosts from a DRS/HA Cluster

The procedure for adding a host to an HA/DRS cluster differs between hosts already managed by vCenter and those that are not. After a host has been added, the VMs residing on it become part of the cluster and will be protected by HA and migrated with DRS.

Adding a managed host

Select the host and drag it into the target cluster object.
Select what to do with the VMs and resource pools that reside on the host.
- Put this host's VMs in the cluster's root resource pool – vCenter strips the host of all of its resource pools and hierarchy and places all the VMs into the cluster's root resource pool. Share allocations might need to be changed manually afterwards, since they are relative to resource pools.
- Create a resource pool for this host's VMs and resource pools – vCenter creates a top-level resource pool that becomes a direct child of the cluster. All of the host's resource pools and VMs are then placed inside it. You can supply a name for the new resource pool.

Adding an unmanaged host

Right click the cluster and select Add Host.
Supply the host name/IP and authentication credentials
You are then presented with the same options as above regarding existing VMs and resource pools.

Removing a host from a cluster

There are certain precautions to take when removing a host from a cluster and you must take the following into account

Resource Pool Hierarchies

When a host is removed, the host retains only its root resource pool. All resource pools created in the cluster are removed from the host, even if you chose to create one for the host when it joined the cluster.
VMs – A host needs to be in maintenance mode to leave a cluster, thus all running VMs must be migrated off of the host or powered off.
Invalid Clusters – By removing a host, you decrease the overall resources of the cluster. If reservations are set on the VMs, the cluster could be marked yellow and an alarm triggered. You could also affect HA and failover capacity.

The process to remove a host is as follows

Place host in maintenance mode
Now you may either drag it to a different location in the inventory, or right click and select 'Remove'.

Add/Remove virtual machines from a DRS/HA Cluster

VMs can be added to a cluster in a few ways

When you add a host to a cluster, all VMs on the host are added as well
When a VM is created, the wizard will prompt you for the location to place it. You can select a host, cluster, or resource pool within the cluster.
You can use the Migrate VM wizard to migrate a VM into a cluster, or simply drag the VM into the cluster's hierarchy.

Removing a VM from a Cluster

When you remove a host from a cluster, any powered-off VMs that remain on the host are removed from the cluster along with it.
Use the Migrate VM wizard to move the VM outside of the cluster. If the VM is a member of a DRS cluster rules group, a warning will be displayed but it will not stop the migration.

Configure Storage DRS

Storage DRS is new to vSphere 5 and provides the following resource management capabilities

Space Utilization Load Balancing – A threshold can be set for space use. When usage exceeds it, SDRS will generate recommendations or perform migrations to balance space across the datastores.
I/O Latency Load Balancing – A threshold for latency can also be set to avoid bottlenecks. SDRS will again migrate VMs in order to alleviate high I/O load.
Anti-Affinity Rules – Rules can be created to keep the disks of a VM on different datastores.

Storage DRS is applied to a datastore cluster and can then be overridden per VM, just as DRS can. Also like DRS, SDRS provides initial placement and ongoing balancing. SDRS is invoked at a configured frequency (by default, every 8 hours) or whenever one or more of the datastores within the cluster exceeds its space threshold.

Storage DRS makes recommendations to enforce SDRS rules and to balance space and I/O. The reason given for a recommendation is either 'Balance datastore space used' or 'Balance datastore I/O load'. In some cases SDRS makes mandatory recommendations, such as when a datastore is out of space, when anti-affinity or affinity rules are being violated, or when a datastore is entering maintenance mode and must be evacuated.

Configuring SDRS

In the datastores inventory, right click and select 'New Datastore Cluster'. Give the cluster a name and check the Enable SDRS box.
Select your automation level (No Automation, Fully Automated).
Select your runtime rules. If you choose to enable I/O metrics for recommendations, Storage I/O Control will be enabled on all datastores in the cluster. Set your utilized space and I/O latency thresholds (80% utilized and 15 ms latency by default).
You can also click Advanced Options and set a utilization difference threshold between source and destination (5% default), the check frequency (8 hrs default), and the I/O imbalance threshold (aggressive to conservative).
Select the hosts or clusters you wish to add the datastore cluster to.
Select the datastores you wish to include in the datastore cluster.
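
Here is a small sketch of how the default SDRS thresholds from the steps above fit together. It is a simplification under my own assumptions (the function names are made up, and the real SDRS cost/benefit analysis considers much more), but it shows why a datastore over 80% does not always trigger a move.

```python
def space_move_recommended(src_used_pct, dst_used_pct,
                           space_threshold_pct=80, min_difference_pct=5):
    """Recommend a move only if the source is over the space threshold and
    the destination is emptier by at least the utilization difference."""
    over_threshold = src_used_pct > space_threshold_pct
    worthwhile = (src_used_pct - dst_used_pct) >= min_difference_pct
    return over_threshold and worthwhile


def io_move_recommended(src_latency_ms, latency_threshold_ms=15):
    """I/O balancing is considered once latency exceeds the threshold."""
    return src_latency_ms > latency_threshold_ms


print(space_move_recommended(85, 60))  # over 80% and 25 points apart
print(space_move_recommended(85, 83))  # over 80% but only 2 points apart
print(io_move_recommended(20))
```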

Once SDRS is initially set up, right clicking the datastore cluster and selecting 'Edit Settings' presents some additional options.

SDRS Scheduling – Used to change the thresholds and settings in order to balance your datastores at a scheduled time.
Rules – Affinity and Anti-affinity rules to keep VM disks together or apart. Done on a per VM basis.
Virtual Machine Settings – can change the automation level on a per VM basis, as well as select whether to keep vmdk's together or not.

Configure Enhanced vMotion Compatibility

Enhanced vMotion Compatibility (EVC) is a feature that hides or masks certain CPU features on all hosts in a cluster so that every host presents the same feature set to VMs. This improves CPU compatibility between hosts and allows vMotion to occur. EVC leverages AMD-V Extended Migration (AMD) and Intel FlexMigration (Intel) to present a common baseline processor, which in EVC terms is the EVC mode.

In order to use EVC, hosts and VMs must meet the following requirements

All VMs in the cluster that are using a feature set greater than the target EVC mode must be powered off or migrated out of the cluster before enabling EVC
All hosts must have CPUs from a single vendor
All hosts must be running ESX(i) 3.5 U2 or higher
All hosts must be connected to vCenter
All hosts must have their advanced CPU features enabled in the BIOS (AMD-V or Intel VT, as well as AMD No eXecute (NX) or Intel eXecute Disable (XD))
All hosts should be configured for vMotion
All hosts must have the supported CPUs for the mode you enable.

Create an EVC Cluster

Create an empty cluster, enable EVC and select the desired EVC mode.
Select a host to move into the cluster
If the host's feature set is greater than the EVC mode, do one of the following
- Power off the VMs on the host
- Migrate the VMs to another host
Drag the host into the cluster
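
One way to picture the EVC checks is to treat CPU feature sets as plain sets. The sketch below is only a mental model: the feature names and the baseline are illustrative placeholders, not the actual contents of any EVC mode, and the function names are hypothetical.

```python
def host_supports_mode(host_features, mode_features):
    """A host can join only if it offers every feature the mode exposes."""
    return mode_features <= host_features


def vms_to_power_off(vm_features_by_name, mode_features):
    """Running VMs that already see features beyond the mode must be
    powered off or migrated out before EVC is enabled."""
    return sorted(name for name, features in vm_features_by_name.items()
                  if not features <= mode_features)


baseline = {"sse2", "sse3"}                 # illustrative EVC mode
new_host = {"sse2", "sse3", "sse4.1"}       # feature set greater than the mode
running = {"web": {"sse2", "sse3"}, "db": {"sse2", "sse3", "sse4.1"}}

print(host_supports_mode(new_host, baseline))
print(vms_to_power_off(running, baseline))
```

In this model the host is fine to add, but the "db" VM was powered on with a feature the baseline hides, so it is the one that has to be dealt with first.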

Enable EVC on an existing cluster

Select the cluster
If VMs are running on hosts that have feature sets greater than the desired EVC mode, you must power them off or migrate them to another host/cluster, and then migrate them back after enabling EVC.
Ensure all hosts in the cluster have CPUs from a single vendor.
Edit the cluster settings and enable EVC with the desired mode.
Power the VMs back on or migrate them back.

Changing EVC Mode

If you raise the mode, be sure all hosts support the new mode. VMs can continue running, but they will not have access to the new features available in the EVC mode until they are powered off and back on. A guest restart is not enough; a full power cycle is required.

To lower the mode, you must power off VMs that are using features beyond the new mode, change the mode, and power them back on.

Monitor a DRS/HA Cluster

There are a few different tabs from which you can monitor an HA/DRS cluster once you select it.

Summary Tab

General box shows
- Running status of HA/DRS
- EVC Mode
VMware HA box shows
- Admission Control
- Current Failover Capacity – number of hosts available for failover
- Configured Failover Capacity – depends on admission control policy selected
- Status of Host/VM/Application Monitoring
- Advanced runtime info will show you the current slot size, the total slots, used slots, available slots, failover slots, total powered on VMs, total hosts, and total good hosts.
- Cluster Status shows which host is the master and which are the slaves, the number of protected and unprotected VMs, and which datastores are being used for datastore heartbeating.
- Configuration issues will display any configuration issues with the hosts.
vSphere DRS box shows
- Migration Automation Level
- DPM Automation Level
- Current number of DRS recommendations and faults
- Migration Threshold
- Target host load deviation and Standard host load deviation
- The resource distribution chart shows the CPU and memory utilization of VMs, summed per host.
HA and DRS will also raise alerts across the top of the Summary tab and flag the affected host with either a warning or an error.

DRS Tab

More detailed look at recommendations, faults, and history.
The ability to trigger DRS and apply recommendations

A cluster enabled for vSphere HA will turn red when the number of powered-on VMs exceeds the failover requirements. This only occurs if admission control is enabled. DRS is not affected by this.

Configure migration thresholds for DRS and virtual machines

I explained the DRS portion of migration thresholds above. You can, however, override the cluster's automation level on a per-VM basis by setting the VM's automation level to Disabled, Default (inherit from cluster), Manual, Partially Automated, or Fully Automated.
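
The per-VM override boils down to a simple lookup: a VM left at Default inherits whatever the cluster uses. A minimal sketch, with hypothetical names and the behaviour table taken from the automation level descriptions earlier in this post:

```python
# (automatic initial placement, automatic migration) per automation level.
BEHAVIOUR = {
    "manual": (False, False),
    "partially automated": (True, False),
    "fully automated": (True, True),
}


def effective_automation(cluster_level, vm_override="default"):
    """Return the automation level that actually applies to a VM."""
    return cluster_level if vm_override == "default" else vm_override


level = effective_automation("fully automated", vm_override="manual")
print(level, BEHAVIOUR[level])
print(effective_automation("fully automated"))
```

"Disabled" is deliberately left out of the behaviour table, since the post does not spell out what it does for initial placement.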

Configure automation levels for DRS and virtual machines

Whoops, just mentioned this above. 🙂

Create VM-Host and VM-VM affinity rules

VM-VM Affinity/Anti-Affinity Rules

Specifies whether VMs should run on the same host or be kept on separate hosts.
You might want to keep VMs on the same host for performance reasons.
You might want to keep VMs separated to ensure certain VMs remain running if one host fails.
If two VM-VM rules conflict with each other, the older rule takes precedence over the newer one and the newer one is disabled.
DRS also gives higher precedence to preventing violations of anti-affinity rules than of affinity rules.
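
The "older rule wins" behaviour can be sketched as follows. The class, field names, and conflict test are my own assumptions for illustration (two rules are treated as conflicting when one keeps together at least two VMs that the other keeps apart); vCenter does its own bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class VmVmRule:
    name: str
    kind: str            # "affinity" or "anti-affinity"
    vms: frozenset
    created: int         # creation order; lower means older
    enabled: bool = True


def conflicts(a, b):
    """Opposite kinds that share two or more VMs cannot both be satisfied."""
    return a.kind != b.kind and len(a.vms & b.vms) >= 2


def resolve(rules):
    """Disable any rule that conflicts with an older, still-enabled rule."""
    ordered = sorted(rules, key=lambda r: r.created)
    for i, newer in enumerate(ordered):
        for older in ordered[:i]:
            if older.enabled and conflicts(older, newer):
                newer.enabled = False
                break
    return ordered


together = VmVmRule("keep-together", "affinity", frozenset({"app", "db"}), 1)
apart = VmVmRule("keep-apart", "anti-affinity", frozenset({"app", "db", "web"}), 2)
for rule in resolve([apart, together]):
    print(rule.name, rule.enabled)
```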

VM-Host Affinity/Anti-Affinity Rules

Specifies whether or not VMs in a VM DRS group should or shouldn't run on hosts in a host DRS group.
May want to keep certain VMs running on certain hosts due to licensing issues.
Options to specify whether the rule is a hard rule (must not/must run on hosts) or a soft rule (should/should not run on hosts).

Enable/Disable Host Monitoring

Host monitoring is one of the technologies that HA uses to determine whether or not a host is isolated. Enabling and disabling it is quite simple and is done through the HA settings of the cluster: check or uncheck the Host Monitoring checkbox.

Enable/Configure/Disable virtual machine and application monitoring

Virtual Machine Monitoring

Acts much like HA, however it will restart individual virtual machines if their VMware tools heartbeats are not received within a set time.
Enabled/Disabled within the VM Monitoring section of the HA configuration options on the cluster
Monitoring sensitivity is configurable as follows
- Low – VM will restart if no heartbeat between host and VM within 2 minutes. VM will restart 3 times every 7 days.
- Medium – no heartbeat for 60 seconds, 3 restarts within 24 hrs.
- High – no heartbeat for 30 seconds, 3 restarts per hour.
- Custom – allows you to customize interval, number of restarts and time frame.
Can have a global cluster setting as well as a per VM setting
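
The sensitivity presets above amount to three numbers each: a failure interval, a maximum number of restarts, and the window those restarts are counted in. A minimal sketch of the resulting decision, with hypothetical names and timestamps expressed in plain seconds:

```python
SENSITIVITY = {
    # level: (failure interval in s, max restarts, restart window in s)
    "low": (120, 3, 7 * 24 * 3600),
    "medium": (60, 3, 24 * 3600),
    "high": (30, 3, 3600),
}


def should_restart(level, seconds_since_heartbeat, past_restart_times, now):
    """Restart only if heartbeats have been missing for the full interval
    and the restart budget for the current window is not used up."""
    interval, max_restarts, window = SENSITIVITY[level]
    if seconds_since_heartbeat < interval:
        return False
    recent = [t for t in past_restart_times if now - t < window]
    return len(recent) < max_restarts


print(should_restart("high", 45, [], now=10000))
print(should_restart("medium", 45, [], now=10000))
print(should_restart("high", 45, [9000, 9500, 9900], now=10000))
```

The same 45 seconds of silence triggers a restart at High but not at Medium, and the third call is refused because three restarts already happened within the last hour.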

Application monitoring

Restarts individual VMs if their VMware tools application heartbeats are not received within a set time.
Enabled/Disabled within the VM Monitoring section of the HA configuration options on the cluster
In order to use application monitoring, you must obtain the appropriate SDK or use an application that supports VMware application monitoring and set it up to send heartbeats.
Deployed on a per VM basis. I believe it uses the same monitoring sensitivity as VM Monitoring.

Configure admission control for HA and virtual machines

Admission control is used to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that virtual machine reservations are respected. Admission control could prevent you from powering on a VM, migrating a VM into a cluster, or increasing the amount of resources allotted to a VM. Even when admission control is disabled, vSphere HA keeps at least two hosts powered on in a cluster, even if DPM is enabled and could consolidate all VMs onto a single host.

There are three types of Admission control policies that you can use for HA

Host Failures Cluster Tolerates

Specify the number of hosts that a cluster can tolerate if they fail.
vSphere will reserve the required resources to restart all the VMs on those failed hosts.
It does this by
- Calculates a slot size – a slot is a logical representation of the CPU and memory needed by any powered on VM in the cluster. The CPU component is the largest CPU reservation of any powered on VM; if there are no reservations, a default of 32 MHz is used. The memory component is the largest memory reservation plus overhead. There is no default here.
- Determines the number of slots in the cluster
- Determines the current failover capacity of the cluster – the number of hosts that can fail and still leave enough slots to satisfy all VMs
- Determines whether the current failover capacity is less than the configured failover capacity. If it is, admission control will deny the operation requested.
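
The slot steps above can be worked through in code. This is a sketch under simplifying assumptions: function names are mine, every powered on VM needs exactly one slot, and the worst case is modelled by failing the largest hosts first. The sample numbers are made up, with memory overhead set to 0 to keep the arithmetic easy to follow.

```python
DEFAULT_CPU_SLOT_MHZ = 32


def slot_size(vms):
    """vms: list of (cpu_reservation_mhz, mem_reservation_mb, mem_overhead_mb)."""
    largest_cpu = max(vm[0] for vm in vms)
    cpu = largest_cpu if largest_cpu > 0 else DEFAULT_CPU_SLOT_MHZ
    mem = max(vm[1] + vm[2] for vm in vms)
    return cpu, mem


def slots_per_host(host_cpu_mhz, host_mem_mb, cpu_slot, mem_slot):
    """A host holds as many slots as its scarcer resource allows."""
    return min(host_cpu_mhz // cpu_slot, host_mem_mb // mem_slot)


def current_failover_capacity(hosts, vms):
    """Number of hosts that can fail while the rest still hold every VM."""
    cpu_slot, mem_slot = slot_size(vms)
    slots = sorted(slots_per_host(c, m, cpu_slot, mem_slot) for c, m in hosts)
    failures = 0
    # Remove the largest remaining host while the others still have enough slots.
    while slots and sum(slots[:-1]) >= len(vms):
        slots.pop()
        failures += 1
    return failures


vms = [(2000, 1024, 0), (2000, 1024, 0), (1000, 2048, 0),
       (1000, 1024, 0), (1000, 1024, 0)]
hosts = [(9000, 9216), (9000, 6144), (6000, 6144)]

print(slot_size(vms))                         # (2000, 2048)
print(current_failover_capacity(hosts, vms))  # 1
```

With a configured failover capacity of 1 this cluster is still compliant. Power on one large-reservation VM, though, and the slot size grows for everyone, which is where the surprises come from.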

Percentage of Cluster resources reserved

HA will reserve a specific percentage of cluster CPU and memory for recovery of host failures
It does this by
- Calculates the total resource requirements for all powered on VMs
- Calculates the total host resources available for VMs
- Calculates the current CPU failover capacity and current memory failover capacity.
- Determines whether the current CPU or memory failover capacity is less than the configured capacity. If so, it denies the operation.
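
The percentage policy is easier to see with numbers. The sketch below assumes failover capacity is computed as (total host resources minus total VM requirements) divided by total host resources; the function names and sample figures are my own.

```python
def failover_capacity(total_host, total_required):
    """Fraction of cluster resources still free for failover."""
    return (total_host - total_required) / total_host


def admits(total_cpu, required_cpu, total_mem, required_mem, configured_pct):
    """Deny when either CPU or memory capacity drops below the configured %."""
    cpu_ok = failover_capacity(total_cpu, required_cpu) * 100 >= configured_pct
    mem_ok = failover_capacity(total_mem, required_mem) * 100 >= configured_pct
    return cpu_ok and mem_ok


# 24 GHz / 21 GB of host resources, VMs requiring 7 GHz / 6 GB, 25% reserved.
print(round(failover_capacity(24000, 7000), 2))      # 0.71
print(admits(24000, 7000, 21504, 6144, 25))
print(admits(24000, 20000, 21504, 6144, 25))
```

The last call fails on CPU alone: memory is fine, but only about 17% of CPU is left, which is below the 25% that was set aside.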

Specify Failover Hosts

Pretty simple: you specify the hosts you want to use for failover.
These hosts are then not available to run VMs; they are set aside for HA.

HA admission control is a complicated thing, but easy to set up. Simply select your policy from the HA configuration options in the cluster configuration.

Determine appropriate failover methodology and required resources for an HA implementation

Policies should be picked based on your availability needs and characteristics of your cluster. You should certainly consider the following

Resource Fragmentation

Occurs when there are enough resources available in aggregate, but they are spread across multiple hosts, so no single host has enough resources to run the VM.
The Host Failures Cluster Tolerates policy avoids this with its slot mechanism.
The percentage policy does not, since it only looks at a percentage of the cluster's resources as a whole.

Flexibility of Failover Resource Reservation

Host Failures allows you to specify the number of hosts that can fail
Percentage allows you to look at the cluster resources as a whole
Failover hosts allows you to determine where and which hosts will be used.

Heterogeneity of Cluster

When using large virtual machines, the Host Failures Cluster Tolerates slot size grows very large, which can give unexpected results, especially if you use reservations.
The remaining two policies are not affected as much by the 'monster VM'.

2 thoughts on “VCP 5 – Objective 5.1 – Create and Configure VMware Clusters”

Peter Cronwright says:

June 10, 2012 at 3:39 am

In vSphere 5, VMware added a further verification mechanism to VM Monitoring. To avoid false positives VM Monitoring also monitors I/O activity of the virtual machine.

1. mwpreston says:
  
  June 11, 2012 at 8:06 am
  
  Thanks Peter for the info…