Monthly Archives: November 2013

8 weeks of #VCAP – Section 3 scenario – CPU Affinity!

Thanks once again to Tom Verhaeg for this great scenario.

The voice team has recently set up Cisco Unity. The VoIP administrator sends you an e-mail. To comply with Cisco best practices, the Cisco Unity VM needs to have CPU affinity set. You really don’t like this, but the VoIP administrator and your boss insist. Make it happen…

Damn, this really isn’t a fun thing to do. CPU affinity restricts a VM to running only on the specific cores / processors that you specify. There may be some requirements for this (such as the above), but overall you shouldn’t do it. It gets in the way of NUMA scheduling and, more importantly, of Fully Automated DRS! To set affinity on a VM in a DRS cluster, the DRS level should either be manual or partially automated.

The process itself isn’t that complicated. Edit the settings of the VM and go to the resources tab. Under advanced CPU, you find the option for CPU affinity.

cpuaffinity1
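The affinity box takes a comma separated list of CPU numbers, with a hyphen for ranges (for example 0, 2, 4-7). If you want to double check which logical CPUs a string like that actually covers, here is a minimal Python sketch. The function name parse_affinity is made up for this example and is not part of any VMware tool.

```python
def parse_affinity(spec):
    """Expand an affinity string such as "0, 2, 4-7" into sorted CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            if low > high:
                raise ValueError(f"bad range: {part}")
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_affinity("0, 2, 4-7"))
```

This only expands the string. It does not check the result against the number of logical CPUs that the host really has.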

If you do not see the Scheduling Affinity piece on a DRS-Cluster host, you are running DRS in fully automated mode. You can set DRS to manual for this VM by going to the cluster settings, and under DRS select Virtual Machine options. Set the DRS mode for this VM to either disabled, manual or partially automated.

cpuaffinity2

Hurray!

8 weeks of #VCAP – More Networking Scenarios by Tom!

Another top notch scenario built by Tom Verhaeg! (blog/twitter)  Thanks Tom!

Your recent work on the new portgroup was top notch! Now, the network administrators have some new requirements. You currently use one pNIC for the DvS. A second pNIC has been connected to the network and you have been tasked with adding it to the DvS. Also ensure that the DvS_StorageNetwork Port Group only uses the new pNIC and does VLAN tagging on VLAN ID 20.

Another networking objective. Woohoo! Alright, let us first check out the current network adapters available on the host:

ns-scenario1

Alright, so vmnic2 is the one that we can add to DvS_AMS01. Go over to the networking view (Ctrl + Shift + N) and edit the settings of your DvS. We first need to check if the DvS allows for 2 uplinks, instead of just 1.

ns-scenario2

And check this out! It’s still set to 1. This is a good one to remember for the exam: on the DvS object itself, you configure the maximum number of physical adapters (also called uplink ports) per host. So set that one to 2 and let’s continue with adding vmnic2 to the DvS.

Since the host is already connected to the DvS, click the DvS and select Manage Hosts. You will find your host, and you can add the second nic.

ns-scenario3

You could also do this from the hosts and clusters view, do whatever works for you.

Now that we have added that pNIC to the DvS, we need to create the DvS_StorageNetwork port group. Remember that we need to do VLAN tagging on VLAN ID 20 here. Create the new port group now; its settings should look like this:

ns-scenario4

Now, for the last part: As ESXi does load balancing by default (originating port ID based) we will now have load balancing on the DvS_ProductionNetwork, which is great, but not what we need for the Storage Network.

Open up the settings of that port group and go to the Teaming and Failover section.

ns-scenario5

Both uplink ports are now under Active Uplinks. Let’s review real quick what the options are:

Active Uplinks – actively being used for traffic flow

Standby Uplinks – will only become active when a failure occurs on one of the active uplinks

Unused Uplinks – this adapter will never be used for this port group

We need to ensure that it will never use this uplink, so move the dvUplink1 over to the Unused Uplinks. It should then look like this:

ns-scenario6

Hurray!
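To make the Active / Standby / Unused behaviour a bit more concrete, here is a small Python sketch of how the three lists decide which uplinks carry traffic. It is a simplified model for study purposes only, not how ESXi implements teaming, and the function name uplinks_in_use is made up. I am also assuming the new pNIC landed on dvUplink2.

```python
def uplinks_in_use(active, standby, unused, failed=()):
    """Return the uplinks a port group would use, given failed uplinks.

    Unused uplinks are accepted as input but never returned, which is the
    whole point of putting an uplink in that list.
    """
    in_use = [u for u in active if u not in failed]
    spares = [u for u in standby if u not in failed]
    # Each failed active uplink is replaced by one healthy standby, if any.
    missing = len(active) - len(in_use)
    in_use.extend(spares[:missing])
    return in_use


storage_pg = {"active": ["dvUplink2"], "standby": [], "unused": ["dvUplink1"]}
print(uplinks_in_use(**storage_pg))
print(uplinks_in_use(failed=["dvUplink2"], **storage_pg))
```

Note what the second call shows: with dvUplink1 set to unused, the storage port group has nothing to fail over to if the new pNIC goes down. That is exactly what the scenario asked for, but it is worth knowing.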

8 weeks of #VCAP – Network Scenario by @tomverhaeg

First off I want to thank Tom Verhaeg (blog/twitter) for providing this scenario.  Tom got in contact with me and wanted to do what he can to help out with the 8 weeks of #VCAP series, as he is going through a similar process to me in studying for the VCAP5-DCA.  So props to Tom for taking the time and initiative to give back.  Hopefully we see more from him in the coming weeks!  Even better for me, as I can run through some scenarios that I didn't make up 🙂  Be sure to follow Tom on Twitter and check out his blog.  Thanks for the help Tom!!!

Your company leverages the full Enterprise Plus licensing and has set up a Distributed vSwitch. Recently, the number of ports needed on a particular portgroup exceeded the number configured. You are tasked with creating a new portgroup called DvS_ProductionNetwork, which only connects the running VMs and also functions when vCenter is down.

Off we go again. So, let’s recall. There are 3 different options of port binding on a DvS. 

Static binding – Creates a port group with a manually set number of ports. A port is assigned whenever a vNIC is added to a VM. You can connect a vNIC to a static-binding port group only through vCenter.

Dynamic binding (Deprecated in vSphere 5.0!) – A port is assigned to a vNIC when the VM is powered on and its vNIC is in a connected state. You can connect to a dynamic-binding port group only through vCenter.

Ephemeral binding – A port is assigned to a vNIC when the VM is powered on and its vNIC is in a connected state. This binding method bypasses vCenter, allowing you to manage virtual machine networking when vCenter is down.

So, that’s the one we need! Ephemeral binding! Luckily, it’s quite simple to configure. Hop over to the networking inventory (Ctrl + Shift + N) and create the new port group. Give it a name and leave the number of ports on the default of 128.

Now edit the settings of this port group, and select Ephemeral binding under the port binding dropdown. Also note that the number of ports is greyed out now.
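If you like to see the decision written down, here is a tiny Python sketch that encodes the three binding types exactly as described above and filters them by the two requirements from the scenario. The dictionary and the function name bindings_for are made up for this example.

```python
# Properties of each binding type, taken from the descriptions in this post.
BINDINGS = {
    "static":    {"port_assigned_at_power_on": False, "needs_vcenter": True},
    "dynamic":   {"port_assigned_at_power_on": True,  "needs_vcenter": True},
    "ephemeral": {"port_assigned_at_power_on": True,  "needs_vcenter": False},
}


def bindings_for(only_running_vms, works_without_vcenter):
    """Return the binding types that satisfy both requirements."""
    matches = []
    for name, props in BINDINGS.items():
        if only_running_vms and not props["port_assigned_at_power_on"]:
            continue
        if works_without_vcenter and props["needs_vcenter"]:
            continue
        matches.append(name)
    return matches


print(bindings_for(only_running_vms=True, works_without_vcenter=True))
```

Only one binding type survives both filters, which is a handy way to remember it for the exam.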

Hurray!

tom

PHD Virtual provides free Recovery Time Calculator

phd_virtual_partner_logo

Have you ever wondered what your plan of attack will be in the event you need to recover your virtual infrastructure and the applications running on it?  How long before users will be able to resume their work?  Well, PHD Virtual, makers of PHD Virtual Backup, have put the power in your hands to figure out just what that number is.  Today they have released a free, yes FREE, tool dubbed the RTA calculator which can provide visibility into what your organization's recovery time will be.

The tool itself runs on Windows, and simply points to your vCenter instance.  From there you have a wizard driven interface where you select which VMs you would like to analyze and set a boot order.  Now the magic happens.  RTA will take a snapshot of the selected VMs and utilize linked clones to create a copy of these.  Lastly, it calculates the total time it will take for you to boot that group of VMs, leaving you with a nice estimate of your RTA (Recovery Time Actual).

phd

You can go and grab your free copy of the RTA calculator here – and for those that may be interested in seeing it in action there's a pretty cool video embedded below.

8 weeks of #VCAP – Storage Scenarios (Section 1 – Part 2)

Hopefully you all enjoyed the last scenario based post because you are about to get another one 🙂  Kind of a different take on covering the remaining skills from the storage section, section 1.  So, here we go!

Scenario 1

A coworker has come to you complaining that every time he performs storage related functions from within the vSphere client, VMware kicks off these long running rescan operations.  He's downright sick of seeing them and wants them to stop, saying he will rescan when he feels the need to, rather than having vSphere decide when to do it.  Make it happen!

So, quite the guy your coworker, thinking he's smarter than the inner workings of vSphere, but luckily we have a way we can help him.  The functions we are going to perform are also part of the VCAP blueprint – coincidence?  Either way, the answer to our coworker's prayers is something called vCenter Server storage filters and there are 4 of them, explained below…

RDM Filter (config.vpxd.filter.rdmFilter) – filters out LUNs that are already mapped as an RDM

VMFS Filter (config.vpxd.filter.vmfsFilter) – filters out LUNs that are already used as a VMFS datastore

Same Hosts and Transports Filter (config.vpxd.filter.sameHostsAndTransportsFilter) – filters out LUNs that cannot be used as a datastore extent

Host Rescan Filter (config.vpxd.filter.hostRescanFilter) – Automatically rescans storage adapters after storage-related management functions are performed.

As you might have concluded, it's the Host Rescan Filter that we will need to set up.  Also, you may have concluded that these are advanced vCenter Server settings, judging by the config.vpxd prefixes.  What is conclusive is that all of these filters are enabled by default – so if we need to disable one, such as the Host Rescan Filter, we will need to set the corresponding key to false.  Another funny thing is that we won't see these keys listed by default.  Basically they are silently enabled.  Anyways, let's get on to solving our coworker's issue.

Head into the advanced settings of vCenter Server (Home->vCenter Server Settings->Advanced Options).  From here, disabling the host rescan filter is as easy as entering config.vpxd.filter.hostRescanFilter as the key and false as the value in the text boxes near the bottom of the screen and clicking 'Add' – see below

hostrescanfilter

And voila!  That coworker of yours should no longer have to put up with those pesky storage rescans after he's done performing his storage related functions.
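The "silently enabled" part trips people up, so here is a quick Python sketch of the logic: a filter counts as on unless its key exists and is set to false. This is just a model of the behaviour described above. The dictionary stands in for the vCenter advanced settings list, and filter_enabled is a made-up helper name.

```python
HOST_RESCAN = "config.vpxd.filter.hostRescanFilter"


def filter_enabled(advanced_settings, key):
    """A storage filter is on unless its key is present and set to false."""
    return advanced_settings.get(key, "true").strip().lower() != "false"


settings = {}                           # fresh vCenter: key not listed at all
print(filter_enabled(settings, HOST_RESCAN))

settings[HOST_RESCAN] = "false"         # what we just added in the GUI
print(filter_enabled(settings, HOST_RESCAN))
```

So a missing key and a key set to true mean the same thing, and only an explicit false turns the automatic rescans off.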

Scenario 2

You work for the mayor's office in the largest city in Canada.  The mayor himself has told you that he installed some SSD into a host last night and it is showing as mpx.vmhba1:C0:T0:L0 – but not being picked up as SSD!  You mention that you think those are simply SAS disks but he insists they aren't (what is this guy, on crack? :)).  Either way, you are asked if there is anything you can do to somehow 'trick' vSphere into thinking that this is in fact an SSD.

Ok, so this one isn't that bad really, a whole lot of words for one task.  Although most SSD devices will be tagged as SSD by default there are times when they aren't.  Obviously this device isn't an SSD, but the thing is we can tag it as SSD if we want to.  To start, we need to find the identifier of the device we wish to tag.  This time I'm going to run esxcfg-scsidevs to do so (with -c to show a compact display).

esxcfg-scsidevs -c

From there I'll grab the identifier of the device I wish to tag, in my case mpx.vmhba1:C0:T0:L0 – (crazy Rob Ford).  Now if I have a look at that device with the esxcli command I can see that it is most certainly not SSD.

esxcli storage core device list -d mpx.vmhba1:C0:T0:L0

ssd-no

So, our first step is to find out which SATP is claiming this device.  The following command will let us do just that

esxcli storage nmp device list -d mpx.vmhba1:C0:T0:L0

whichsatp

Alright, so now that we know the SATP we can go ahead and define a SATP rule that states this is SSD

esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d mpx.vmhba1:C0:T0:L0 -o enable_ssd

And from here we need to reclaim the device

esxcli storage core claiming reclaim -d mpx.vmhba1:C0:T0:L0

And another look at the device listing should now show us that we are dealing with a device that is SSD.

esxcli storage core device list -d mpx.vmhba1:C0:T0:L0

ssd-yes

So there you go Mr. Ford, I mean Mr. Mayor – it's now SSD!!!!

And that's all for now 🙂

8 weeks of #VCAP – Random Storage Scenarios (Section 1 – Part 1)

So my 8 weeks of #VCAP is quickly turning into just under 4 weeks of #VCAP, so as I attempt to learn and practice everything on the blueprint you might find that I'm jumping around quite a bit.  Also, I thought I would try presenting myself with a scenario with this post.  Now all of the prep for the scenario is made by myself, therefore it's a pretty simple thing for me to solve, but nonetheless it will help get me into the act of reading a scenario and performing the tasks that are in it.  So, this post will cover a bunch of random storage skills listed in Objective 1 of the blueprint – without further ado, the scenario

Scenario 1

Let's say we've been tasked with the following.  We have an iSCSI datastore (iSCSI2) which utilizes iSCSI port binding to provide multiple paths to our array.  We want to change the default PSP for iSCSI2 from MRU to Fixed, and set the preferred path to travel down C0:T1:L0 – only one problem, C0:T1:L0 doesn't seem to be available at the moment.  Fix the issues with C0:T1:L0, change the PSP on iSCSI2 and set the preferred path.

Alright, so to start this one off let's first have a look at why we can't see that second path to our datastore.  If you aren't seeing the path at all when browsing through the GUI, the first place I would look is the claimrules (now how did I know that 🙂 ) to make sure that the path isn't masked away – remember the LUN Masking section.  So ssh on into your host and run the following command.

esxcli storage core claimrule list

scenario1-1

As you can see from my output, LUN masking is most certainly the reason we can't see the path.  Rule 5001 loads the MASK_PATH plugin on the exact path that is in question.  So, do you remember from the LUN Masking post how we get rid of it?  If not, we are going to go ahead and do it here again.

First step, we need to remove that rule.  That's done using the following command.

esxcli storage core claimrule remove -r 5001

Now that it's gone we can load the current list into runtime with the following command

esxcli storage core claimrule load

But we aren't done yet!  Instead of waiting for the next reclaim to happen or the next reboot, let's go ahead and unclaim that path from the MASK_PATH plugin.  Again, we use esxcli to do so

esxcli storage core claiming unclaim -t location -A vmhba33 -C 0 -T 1 -L 0

And rescan that hba in question – why not just do it via command line since we are already there…

esxcfg-rescan vmhba33
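The part that is easy to fumble under exam pressure is splitting a runtime name like vmhba33:C0:T1:L0 into the -A, -C, -T and -L arguments. Here is a small Python sketch that does the split and prints the unclaim command used above. It only builds the string and never touches a host. The helper name unclaim_command is made up.

```python
import re

# Runtime path names look like vmhba33:C0:T1:L0 (adapter, channel, target, LUN).
PATH_RE = re.compile(r"^(vmhba\d+):C(\d+):T(\d+):L(\d+)$")


def unclaim_command(runtime_name):
    """Build the esxcli unclaim command for one path, by location."""
    match = PATH_RE.match(runtime_name)
    if match is None:
        raise ValueError(f"not a runtime path name: {runtime_name}")
    adapter, channel, target, lun = match.groups()
    return (
        "esxcli storage core claiming unclaim -t location "
        f"-A {adapter} -C {channel} -T {target} -L {lun}"
    )


print(unclaim_command("vmhba33:C0:T1:L0"))
```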

And voila – flip back into your Manage Paths section of iSCSI2 and you should see both paths are now available.  Now we can move on to the next task, which is switching the PSP on iSCSI2 from MRU to Fixed.  We will be doing this via the command line a bit later, and since you are probably already in the GUI checking your path status and we are only doing it on one LUN, we can get away with simply changing this via the vSphere Client.  Honestly, it's all about just selecting a dropdown at this point – see below.

managepaths

I circled the 'Change' button on this screenshot because it's pretty easy to simply select from the drop down and go and hit close.  Nothing will happen until you actually press 'Change' so don't forget that.  Also, remember, the PSP is set on a per-host basis.  So if you have more than one host and the VCAP didn't specify to do it on only one host, you will have to go and repeat everything you did on the other hosts.  Oh, and setting the preferred path is as easy as right-clicking the desired path and marking it as preferred.  And, this scenario is completed!

Scenario 2

The storage team thanks you very much for doing that but requirements have changed and they now wish for all of the iSCSI datastores, both current and any newly added datastores, to utilize the Round Robin PSP.  How real life is that, people changing their mind 🙂

No problem you might say!  We can simply change the PSP on each and every iSCSI datastore – not a big deal, there's only three of them.  Well, you could do this, but the question specifically mentions that we need to have the PSP set to Round Robin on all newly added iSCSI datastores as well, so there's a bit of command line work we have to do.  And, since we used the vSphere Client to set the PSP in the last scenario, we'll do it via command line in this one.

First up, let's switch over our existing iSCSI datastores (iSCSI1, iSCSI2, iSCSI3).  To do this we will need their identifier which we can get from the GUI, however since we are doing the work inside the CLI, why not utilize it to do the mappings.  To have a look at identifiers and their corresponding datastore names we can run the following

esxcfg-scsidevs -m

maptodatastore

As you can see there are three datastores we will be targeting here.  The identifier that we need is the first string field listed, beginning with t10 and ending with :1 (although we don't need the :1).  Once we have the identifier of the device we want to alter, we can change its PSP with the following command.

esxcli storage nmp device set -d t10.FreeBSD_iSCSI_Disk______000c299f1aec010_________________ -P VMW_PSP_RR

So, just do this three times, once for each datastore.  Now, to have any newly added datastores default to round robin we need to first figure out what SATP the iSCSI datastores are utilizing, then associate the VMW_PSP_RR PSP with it.  We can use the following command to see which SATP is associated with our devices.

esxcli storage nmp device list

defaultsatp

As you can see, our iSCSI datastores are being claimed by the VMW_SATP_DEFAULT_AA SATP.  So, our next step would be to associate the VMW_PSP_RR PSP with this SATP – I know, crazy acronyms!  To do that we can use the following command.

esxcli storage nmp satp set -s VMW_SATP_DEFAULT_AA -P VMW_PSP_RR

This command will ensure that any newly added iSCSI datastores claimed by the default AA SATP will get the round robin PSP.

At this point we are done with this scenario, but while I was doing this I realized there might be a quicker way to change those PSPs on our existing LUNs.  If we associate our SATP with our PSP first, then we can simply utilize the following command on each of our datastores to force them to change their PSP back to the default (which will be RR since we just changed it).

esxcli storage nmp device set -d t10.FreeBSD_iSCSI_Disk______000c299f1aec010_________________ -E

Of course we have to run this on each datastore as well – oh, and on every host 😉
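Since this is the same command repeated per device, I find it helps to write the order out once. Below is a Python sketch that prints the SATP default change first and then the reset for each device. It only builds strings from the commands shown in this post and runs nothing on a host. The two device identifiers are the ones that appear in this post (the third one isn't shown, so it is left out), and rr_commands is a made-up name.

```python
def rr_commands(devices, satp="VMW_SATP_DEFAULT_AA"):
    """Return the commands that make Round Robin the default, then reset devices."""
    commands = [f"esxcli storage nmp satp set -s {satp} -P VMW_PSP_RR"]
    for device in devices:
        # -E puts the device back on the default PSP of its SATP.
        commands.append(f"esxcli storage nmp device set -d {device} -E")
    return commands


devices = [
    "t10.FreeBSD_iSCSI_Disk______000c299f1aec000_________________",
    "t10.FreeBSD_iSCSI_Disk______000c299f1aec010_________________",
]
for line in rr_commands(devices):
    print(line)
```

The order matters here: if you reset the devices before changing the SATP default, they simply go back to whatever the old default was.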

Scenario 3

Big Joe, your coworker, just finished reading a ton of vSphere related material because his poor little SQL server on his iSCSI datastore just isn't cutting it in terms of performance.  He read some best practices which stated that the max IOPs for the Round Robin policy should be changed to 1.  He requested that you do so for his datastore (iSCSI1).  The storage team has given you the go ahead but said not to touch any of the other datastores or you're fired.

Nice, so there is really only one thing to do in this scenario – change the max IOPs setting for the iSCSI1 device.  So, first off, let's get our identifier for iSCSI1

esxcfg-scsidevs -m

Once we have our identifier we can take a look at the round robin settings for that device with the following command

esxcli storage nmp psp roundrobin deviceconfig get -d t10.FreeBSD_iSCSI_Disk______000c299f1aec000_________________

rr-getinfo

As we can see, the IOOperation Limit is 1000, meaning it will send 1000 I/Os down a path before switching to the next.  Big Joe is pretty adamant we switch this to 1, so let's go ahead and do that with the following command.

esxcli storage nmp psp roundrobin deviceconfig set -d t10.FreeBSD_iSCSI_Disk______000c299f1aec000_________________ -t iops -I 1

Basically what we define with the above command is that we will change that 1000 to 1 (-I), and specify that the type of switching we will use is iops (-t).  This could also be set with -t bytes, entering the number of bytes to send before switching with -B.
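To get a feel for what that limit changes, here is a toy Python simulation of round robin path switching. It is only an illustration of the counting logic (send N I/Os, then move to the next path) and not a model of real array behaviour or performance. The path labels and the function name simulate_round_robin are made up.

```python
from collections import Counter


def simulate_round_robin(paths, io_count, iops_limit):
    """Return (number of path switches, I/Os carried per path)."""
    per_path = Counter()
    switches = 0
    current = 0
    sent_on_current = 0
    for _ in range(io_count):
        if sent_on_current == iops_limit:
            # Limit reached on this path: move on to the next one.
            current = (current + 1) % len(paths)
            sent_on_current = 0
            switches += 1
        per_path[paths[current]] += 1
        sent_on_current += 1
    return switches, dict(per_path)


paths = ["path A", "path B"]
print(simulate_round_robin(paths, io_count=1500, iops_limit=1000))
print(simulate_round_robin(paths, io_count=1500, iops_limit=1))
```

With the default of 1000 a short burst of I/O mostly lands on one path, while a limit of 1 alternates on every single I/O. Whether that actually helps Big Joe's SQL server depends on the array, so treat the best practice with the usual care.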

So, that's basically it for this post!  Let me know if you like the scenario based posts over me just rambling on about how to do a certain task!  I've still got lots more to cover so I'd rather put it out there in a format that you all prefer!  Use the comments box below!  Good Luck!

8 weeks of #VCAP – iSCSI Port Binding

My plan is to go over all the skills in Objective 1.3 but before we get into PSA commands and what not let's first configure iSCSI port binding – this way we will have a datastore with multiple paths that we can fiddle around with 🙂

First off, iSCSI port binding basically takes two separate paths to an iSCSI target (the paths are defined by vmkernel ports) and binds them to the iSCSI initiator.  So, we need two vmkernel ports.  They can be on the same switch or separate switches, but the key is that each vmkernel port can only have one network adapter assigned to it.  Meaning the vSwitch can contain multiple NICs, but you need to ensure that the config is overridden at the vmkernel port level to only have one NIC active.  Let's have a look at this.  Below you will see the current setup of my vmkernel ports (IPStore1 and IPStore2).

ipBondBefore

As you can see, my configuration here is actually wrong and needs to be adjusted – remember, one nic per vmkernel port.  So, with a little click magic we can turn it into what you see below.

ipBondAfter

Basically, for IPStore1 I have overridden the default switch config on the vmkernel port, setting vmnic0 as active and vmnic1 as unused.  For IPStore2 we will do the same except the opposite (hehe, nice, that makes no sense) – basically, override but this time set vmnic1 as active and vmnic0 as unused.  This way we are left with two vmkernel ports, each utilizing a different NIC.
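The rule is simple enough to write as a check: a vmkernel port qualifies only when it has exactly one active NIC and no standby NICs, with everything else unused. Here is a Python sketch of that check. The "before" layout is a guess at a typical wrong setup (both NICs active) and not a transcription of my screenshot, and binding_compliant is a made-up helper name.

```python
def binding_compliant(active, standby):
    """One active uplink, no standby uplinks, everything else unused."""
    return len(active) == 1 and len(standby) == 0


# Hypothetical "before" layout: both NICs active on both vmkernel ports.
before = {
    "IPStore1": {"active": ["vmnic0", "vmnic1"], "standby": []},
    "IPStore2": {"active": ["vmnic0", "vmnic1"], "standby": []},
}
# Layout after the override described above.
after = {
    "IPStore1": {"active": ["vmnic0"], "standby": []},
    "IPStore2": {"active": ["vmnic1"], "standby": []},
}

for label, layout in (("before", before), ("after", after)):
    for port, nics in layout.items():
        print(label, port, binding_compliant(**nics))
```

Note that putting the second NIC in standby instead of unused also fails the check, which is an easy mistake to make in the GUI.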

Now that we have the requirements set up and configured we can go ahead and get started on binding the vmkernel ports.  This is not a hard thing to do!  What we are going to want to do is right-click on our software iSCSI initiator and select 'Properties'.  From there we can browse to the 'Network Configuration' tab and simply click 'Add'.  We should now see something similar to below.

ipbond

As you can see above, our VMkernel adapters are listed.  If they weren't, that would indicate that they are not eligible for binding, meaning we haven't met the requirements outlined earlier.  By selecting IPStore1 and then going back in and selecting IPStore2 ( I know, you can't do it at the same time 🙂 ), then selecting OK, then performing the recommended rescan you will have completed the task.  We can now see below, inside of the 'Manage Paths' section for a datastore that has been mounted with our iSCSI initiator, that we have some nifty multipath options.  First, we have an additional channel and path listed, and we are also able to switch our PSP to things like Round Robin!

ipbondmultipath

And kapow!  That's it!  We are done!  In the next post we will look at how to perform some PSP/PSA related commands against this bad boy!  

Holy crap the book is done – Troubleshooting vSphere Storage is available!

As some of you may know, for the past (what feels like years but is probably closer to) 6 months or so I have been working on a book project revolving around troubleshooting storage in a vSphere environment.  At last I'm happy to say that the book is finally published and sitting on a variety of websites (Packt, Amazon) waiting to be purchased and consumed by you 🙂 !  The book, cleverly titled 'Troubleshooting vSphere Storage', is 150 pages of straight to the point exercises that a vSphere admin can work through when dealing with storage visibility, contention, and capacity issues.

2062EN_mockupcover_normal

Early on when I was pondering the idea of doing this I had no idea about the amount of work and time commitment that writing a book would consume!  I most certainly have a new found respect for the rock stars that are putting out 500 page books out there!  It really takes a major commitment from the authors, reviewers, and editors to get everything done!  Speaking of reviewers, my technical reviewers, Angelo Luciani ( blog / twitter ), Jason Langer ( blog / twitter ), and Eric Wright ( blog / twitter ) were key to me actually finishing this project.  Their feedback was awesome and without it, well, who knows what state the book would be in. So a big thanks goes out to them for all their help!

Needless to say I'm pretty excited to have a published piece of work out there – and if it helps just one person, well, then I guess I've done what I set out to do 🙂

8 weeks of #VCAP – The ESXi Firewall

Alright, continuing in the realm of security, let's have a look at the built in firewall on ESXi.  This post will relate directly to Objective 7.2 on the blueprint!  Basically, a lot of this work can be done in either the GUI or the CLI, so choose what you are most comfortable with.  I'll be jumping back and forth between both!  Some things are just easier in the GUI I find….anyways, I only have like 4 weeks to go so let's get going…

First up, enable/disable pre configured services

Easy/Peasy!  Hit up the 'Security Profile' on a host's configuration tab and select 'Properties' in the 'Services' section.  You should see something similar to what's below

builtinservices

I guess as far as enabling/disabling you would simply stop the service and set it to manual automation.

Speaking of automation, that's the second skill

As you can see above we have a few options in regards to automation behavior. We can Start/Stop with the host (basically on startup and shutdown), Start/Stop manually (we will go in here and do it), or Start automatically when …( I have no idea what this means 🙂 sorry – let me know in the comments 🙂 ).  Anyways, that's all there is to this!

We are flying through this, Open/Close Ports

Same spot as above just hit the 'Properties' link on the Firewall section this time.  Again, this is just as easy – just check/uncheck the boxes beside the service containing the port you want to open or close!  Have a look below – it's pretty simple!

opencloseports

Another relevant spot here is the 'Firewall' button at the bottom.  Aside from opening and closing a port, we can also specify which networks are able to get through if our port is open.  Below I'm allowing access only from the 192.168.1.0/24 network.  

allowedips

Again this can be done within the CLI, but I find it much easier to accomplish inside of the GUI.  But, that's a personal preference so pick your poison!

That's what I get for talking about the CLI, custom services!

Aha!  Too much talk of the CLI leads us to a task that can only be completed via the CLI; Custom Services.  Basically, if you have a service that utilizes ports that aren't covered off by the default services you need to create your own spiffy little service so you can enable/disable it and open/close those ports and allow access to it.  So, off to the CLI we go…

The services in the ESXi firewall are defined by XML files located in /etc/vmware/firewall.  The service.xml file contains the bulk of them and you can define yours in there, or you can simply add any xml file to the directory and it will be picked up (so long as it is defined properly).  If you have enabled HA you are in luck – you will see an fdm.xml file there.  Since the VCAP is time sensitive this might be your quickest way out, as you can just copy that file, rename it to your service and modify it as needed.  If not, then you will have to get into service.xml and copy text out of there.  I'm going to assume HA is enabled and go the copy/modify route.

So, copy fdm.xml to your service name

cp fdm.xml mynewservice.xml

Before modifying mynewservice.xml you will need to add write permission to it, use the following to do so…

chmod o+w mynewservice.xml

Now vi mynewservice.xml – if you don't know how to use 'vi', well, you better just learn, go find a site 🙂  Let's say we have a requirement to open up inbound tcp/udp 8000 and tcp/udp 8001 on the outbound.  We would make that file look as follows, simply replacing the name and ports and setting the enabled flag.

customservice
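If you would rather see the shape of that file as text, here is a Python sketch that builds a service definition for the ports in this example. Big caveat: the element names (ConfigRoot, service, rule, direction, protocol, porttype, port, enabled, required) are written from my memory of the stock files in /etc/vmware/firewall, so treat them as an assumption and compare against fdm.xml on your own host before trusting it. The function name build_service is made up.

```python
import xml.etree.ElementTree as ET


def build_service(name, rules, enabled=True):
    """Build a firewall service definition (assumed layout, see note above)."""
    root = ET.Element("ConfigRoot")
    service = ET.SubElement(root, "service", id="0000")
    ET.SubElement(service, "id").text = name
    for index, (direction, protocol, port) in enumerate(rules):
        rule = ET.SubElement(service, "rule", id=f"{index:04d}")
        ET.SubElement(rule, "direction").text = direction
        ET.SubElement(rule, "protocol").text = protocol
        ET.SubElement(rule, "porttype").text = "dst"
        ET.SubElement(rule, "port").text = str(port)
    ET.SubElement(service, "enabled").text = "true" if enabled else "false"
    ET.SubElement(service, "required").text = "false"
    return ET.tostring(root, encoding="unicode")


rules = [
    ("inbound", "tcp", 8000),
    ("inbound", "udp", 8000),
    ("outbound", "tcp", 8001),
    ("outbound", "udp", 8001),
]
print(build_service("mynewservice", rules))
```

On the exam you will be editing the file by hand in vi anyway, so the value of this is mostly in seeing that tcp and udp each need their own rule per direction, four rules in total.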

Alright, save that bad boy, and probably it's a good idea to run 'chmod o-w mynewservice.xml' and take away that write permission.  If you go and look at your services, or simply run 'esxcli network firewall ruleset list' you might say, "hey, where's my new service?"  Well, it won't show up until you refresh the firewall – to do so, use the following command..

esxcli network firewall refresh

Now you can go check in the GUI or do the following to list out your services…

esxcli network firewall ruleset list

ruleset

Woot!  Woot!  It's there!  But wait, it's disabled.  No biggie, we can go ahead and enable it just as we did the others in the steps earlier in this post – or, hey, since we are in the CLI let's just do it now!

esxcli network firewall ruleset set -r mynewservice -e true

And that's that!  You are done!  If asked to set the allowedIP information, I'd probably just jump back to the GUI and do that!

Set firewall security level – More CLI goodness

Well before we can set the firewall security level let's first understand what security levels are available to us.  ESXi gives us three…

High – This is the default – basically, the firewall blocks all incoming and outgoing ports except for the essential ports it needs to run.

Medium  – All Incoming is blocked, except for any port you open – outgoing is a free for all

Low – Nada – have at it, everything is open.  

Anyway, we can get the default action by specifying

esxcli network firewall get

and to change it we have a few options…  Passing '-d false' would set us to DROP (the default HIGH security level), passing a '-d true' will set us up to PASS traffic (I think this would be the medium security) and setting a '-e false' will disable the firewall completely (the low settings).  So, to switch to medium we could do the following

esxcli network firewall set -d true

I could be wrong here, so if I am just let me know and I'll update it 🙂

And guess what?  We are done with the firewall!  I would practice this stuff as it's easily measurable and it can be quickly determined whether you did something right or wrong – I'd bet this will be on the exam in one way or another.  Good Luck!

8 weeks of #VCAP – Security

Just as I said I'm going to hop around from topic to topic, so without further ado we move from HA to security. This post will be pretty much all of objective 7 on the blueprint – some things I may gloss over while focusing heavily on others.  

So first up is Objective 7.1 – now there is a lot of information in here and I'll just pull out the most important in my opinion, as well as the tasks I don't commonly perform.  So that said, I'm going to leave out the users, groups, lockdown mode, and AD authentication.  These things are pretty simple to configure anyways.  Also, this whole authentication proxy thing – I'm just going to hope for the best that it isn't on the exam 🙂  So, let's get started on this beast of an objective.

SSH

Yeah, we all enable it right – and we all suppress that warning with that advanced setting.  The point is, ssh is something that is near and dear to all our hearts, and we like to have the ability to access something via the CLI in case the GUI or vCenter or something is down.  So with that said, let's have a look at what the blueprint states in regards to SSH – customization.  Aside from enabling and disabling this, which is quite easy so I won't go over it, I'm not sure what the blueprint is getting at.  I've seen lots of sites referencing the timeout setting so we can show that.  Simply change the value in the Advanced Settings of a host to the desired time in seconds (UserVars->ESXiShellTimeOut) as shown below

esxishelltimeout

As far as 'Customize SSH settings for increased security' goes, I'm not sure what else you can enable/disable or tweak to do so.  If you are familiar with sshd I suppose you could prevent root from logging in and simply utilize SSH with a local user account.  

Certificates and SSL

The blueprint mentions the enabling and disabling of certificate checking.  This is simply done by checking/unchecking a checkbox in the SSL section of the vCenter Server settings.

The blueprint also calls out the generation of ESXi host certs.  Before doing any sort of certificate generation or crazy ssl administration always back your original certs up.  These are located in /etc/vmware/ssl – just copy them somewhere.  To regenerate new certs simply shell into ESXi and run generate-certificates – this will create new certs and keys, ignore the error regarding the config file 🙂  After doing this you will need to restart your management agents (/etc/init.d/hostd restart) and quite possibly reconnect your host to vCenter.

To deploy a CA-signed cert, simply copy your cert and key to the same directory (/etc/vmware/ssl), make sure they are named rui.crt and rui.key, and restart hostd the same as above.

As far as SSL timeouts go, I couldn't find this in any of the recommended tools for this objective; it's actually in the security guide (which makes sense, right, we are doing the security objective #fail).  Either way, you need to edit the /etc/vmware/hostd/config.xml file and add the following two entries to modify the SSL read and handshake timeout values respectively (they are in milliseconds, remember)

<readTimeoutMs>15000</readTimeoutMs>

<handshakeTimeoutMs>15000</handshakeTimeoutMs>

Once again you will need to restart hostd after doing this!
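If you'd rather see the edit than imagine it, here's a rough Python sketch that drops those two entries into an XML string.  The SAMPLE fragment is a made-up, heavily trimmed stand-in for config.xml, and the placement (read timeout under vmacore/http, handshake timeout under vmacore/ssl) is my recollection of the security guide's example, so treat both as assumptions and check the guide before touching a real host.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for /etc/vmware/hostd/config.xml.
SAMPLE = "<config><vmacore><http/><ssl/></vmacore></config>"


def set_timeout(root, parent_path, tag, millis):
    """Create or update <tag> under parent_path with a millisecond value."""
    parent = root.find(parent_path)
    if parent is None:
        raise ValueError(f"missing element: {parent_path}")
    node = parent.find(tag)
    if node is None:
        node = ET.SubElement(parent, tag)
    node.text = str(millis)


def apply_ssl_timeouts(xml_text, read_seconds, handshake_seconds):
    """Return xml_text with both timeouts set; inputs are in seconds."""
    root = ET.fromstring(xml_text)
    # Assumed placement, see the note above the code.
    set_timeout(root, "vmacore/http", "readTimeoutMs", read_seconds * 1000)
    set_timeout(root, "vmacore/ssl", "handshakeTimeoutMs", handshake_seconds * 1000)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(apply_ssl_timeouts(SAMPLE, 15, 15))
```

The only real point here is the unit conversion: you think in seconds, the file wants milliseconds.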

Password policies

Yikes!  You want to get confused try and understand the pam password policies.  I'll do my best to explain it – keep in mind it will be high level though – this is in the blueprint however I'm not sure if they are going to have you doing this on the exam.  Either way, it's good to know…  Honestly, I don't think I'm going to memorize this, if you work with it daily then you might, but me, no!  I'll just know that it is also in the security guide (search for PAM).  Anyways, here's the command

password requisite /lib/security/$ISA/pam_passwdqc.so retry=N min=N0,N1,N2,N3,N4

Wow!  So what the hell does that mean?  Well, first off N represents numbers (N = retry attempts, N0 = length of password if only using one character class, N1 = length if using two character classes, N2 = length of words inside passphrases, N3 = length if using three character classes, N4 = length if using all four character classes).  Character classes are basically lower case, upper case, numbers and special characters.  They also confuse things by slamming the passphrase settings right in the middle as well – Nice!  Either way, this is the example from the security guide.

password requisite /lib/security/$ISA/pam_passwdqc.so retry=3 min=12,9,8,7,6

This translates into three retry attempts, 12 character password min if using only one class, 9 character minimum if using two classes, 7 character minimum if using three classes, and 6 character minimum if using all four classes.  As well, passphrases are required to have words that are at least 8 characters long.
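To make that mapping stick, here's a small Python sketch that parses the line and checks a password's length against the number of character classes it uses.  The function names are mine, and the class counting is deliberately simplified: as far as I know the real pam_passwdqc ignores an uppercase letter at the start and a digit at the end when counting classes, and it does a lot more than length checks.

```python
import string

LINE = "password requisite /lib/security/$ISA/pam_passwdqc.so retry=3 min=12,9,8,7,6"


def parse_passwdqc(line):
    """Pull retry and the five min= values out of a pam_passwdqc line."""
    opts = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
    n0, n1, n2, n3, n4 = (int(v) for v in opts["min"].split(","))
    return {
        "retry": int(opts["retry"]),
        # Number of character classes -> minimum password length.
        "min_by_classes": {1: n0, 2: n1, 3: n3, 4: n4},
        # N2 sits in the middle and belongs to passphrases.
        "passphrase_word_len": n2,
    }


def count_classes(password):
    """Count how many of the four character classes appear (simplified)."""
    checks = (
        str.islower,
        str.isupper,
        str.isdigit,
        lambda c: c in string.punctuation,
    )
    return sum(any(check(c) for c in password) for check in checks)


def long_enough(password, policy):
    """True if the password meets the minimum length for its class count."""
    classes = count_classes(password)
    return classes > 0 and len(password) >= policy["min_by_classes"][classes]


if __name__ == "__main__":
    policy = parse_passwdqc(LINE)
    for candidate in ("abcdefgh", "abcdefgh1", "abCdef1!"):
        print(candidate, long_enough(candidate, policy))
```

Notice how N2 gets pulled out separately, which is exactly the "passphrase setting slammed in the middle" that makes the line hard to read.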

No way can I remember this, I'm just going to remember Security Guide + CTRL+F + PAM 🙂

I'm going to cut this post off here and give the ESXi firewall its own post – my head hurts!!!! 🙂

8 weeks of #VCAP – HA

Although High Availability is something I’ve been configuring for many years now I thought it might be a good idea to go over the whole process again.  This became especially evident after watching the HA section of Jason Nash’s TrainSignal/PluralSight course, as I quickly realized there are a lot of HA advanced settings that I’ve never modified or tested – with that said, here’s the HA post.

First off I’m not going to go over the basic configuration of HA – honestly, it’s a checkbox right – I think we can all handle that.  I will give a brief description of a few of HA bullet points that are listed within the blueprint and point everyone where we can manage them.

First up, Admission Control

HA-1-AdminisionControl.

When an HA event occurs in our cluster, we need to ensure that enough resources are available to successfully failover our infrastructure – Admission control dictates just how many resources we will set aside for this event.  If our admission control policies are violated, no more VMs can be powered on inside of our cluster – yikes!  There are three types…

Specify Failover Host – Ugly!  Basically you assign a host as the host that will be used in the event of an HA event.  The result of an HA event is the only time that this host will have VMs running on it – all other times, it sits there wasting money 🙂

Host failures cluster tolerates – This is perhaps the most complicated policy.  Essentially a slot size is calculated for CPU and memory, and the cluster then does some calculations to determine how many slots each host can hold.  It then reserves enough failover slots in your cluster to ensure that the configured number of hosts can fail and still have their VMs restarted elsewhere.  There will be much more on slot size later on in this post so don’t worry if that doesn’t make too much sense.

Percentage of Cluster resources reserved – This is probably the one I use most often.  Allows you to reserve a certain percentage of both CPU and Memory for VM restarts.
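For the percentage policy, the math as I understand it from the availability guide is roughly (total capacity minus what the powered-on VMs require) divided by total capacity.  Here's a hedged Python sketch for a single resource; the function names and numbers are made up, the real check is done for both CPU and memory, and memory also adds per-VM overhead, which I'm ignoring here.

```python
def failover_capacity_pct(total_capacity, reservations, default_reservation):
    """Percent of capacity left after summing what the VMs require.

    A VM with no reservation (0) is counted at default_reservation.
    """
    required = sum(r if r > 0 else default_reservation for r in reservations)
    return 100.0 * (total_capacity - required) / total_capacity


def can_power_on(total_capacity, reservations, new_vm, reserved_pct,
                 default_reservation):
    """True if powering on new_vm keeps capacity at or above reserved_pct."""
    after = failover_capacity_pct(
        total_capacity, reservations + [new_vm], default_reservation
    )
    return after >= reserved_pct


if __name__ == "__main__":
    # Hypothetical cluster: 10000 MHz of CPU, 25% reserved for failover.
    running = [1000, 1000]
    print(failover_capacity_pct(10000, running, 32))
    print(can_power_on(10000, running, 5000, 25, 32))
    print(can_power_on(10000, running, 6000, 25, 32))
```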

So, back to slot size – a slot is made up of two components: memory and CPU.  HA will take the largest memory reservation of any powered-on VM in your environment and use that as its memory slot size.  So even if you have 200 VMs that have only 2GB of RAM, if you place a reservation on just one VM of say, oh, 8GB of RAM, your memory slot size will be 8GB.  If you do not have any reservations set, the slot size is deemed to be 0MB + memory overhead.

As for CPU, the same rules apply – the slot size is the largest reservation set on a powered-on VM.  If no reservations are used, the slot size is deemed to be 32MHz.  Both the CPU and memory slot sizes can be controlled by a couple of HA advanced settings – das.slotCpuInMhz and das.slotMemInMb (**Note – all HA advanced settings start with das. – so if you are doing the test and you can’t remember one, simply open the Availability doc and search for das – you’ll find them).  These do not change the default slot size values; rather, they specify an upper limit on how large a slot size can be.
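Here's how I picture the slot size rule, as a minimal Python sketch.  The function name is mine, the default and cap arguments stand in for the das. settings, and I'm leaving memory overhead out, so it's a model of the rule rather than what HA literally runs.

```python
def slot_size(reservations, default, cap=None):
    """Slot size for one resource (CPU in MHz or memory in MB).

    reservations: reservation of each powered-on VM, 0 meaning none set.
    default: value used for a VM with no reservation (32 for CPU).
    cap: optional upper limit, like das.slotCpuInMhz or das.slotMemInMb.
    """
    per_vm = [r if r > 0 else default for r in reservations]
    size = max(per_vm, default=default)
    return min(size, cap) if cap is not None else size


if __name__ == "__main__":
    # 200 VMs with no memory reservation plus one with 8GB reserved.
    print(slot_size([0] * 200 + [8192], default=0))
    # No CPU reservations anywhere falls back to the default.
    print(slot_size([0, 0, 0], default=32))
    # A 500MHz reservation capped by an upper limit of 64.
    print(slot_size([500, 0], default=32, cap=64))
```

The cap only ever pulls the slot size down; if the largest reservation is already below it, nothing changes.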

So let’s have a look at these settings and slot size – first up, we can see our current slot size by selecting the ‘Advanced Runtime Info’ link on a cluster’s Summary tab.  As shown below, my current slot size is 500MHz for CPU and 32MB for memory; I also have 16 total slots, 4 of which have been taken.

ha-2-slotsizebefore

So let’s now set the das.slotCpuInMhz advanced setting to something lower than 500 – say we only ever want our CPU slot size for a VM to be 64MHz.  Within the cluster’s HA settings (Right-click cluster->Edit Settings, vSphere HA) you will see an Advanced Options button; select that and set das.slotCpuInMhz to 64 as shown below.

ha-2-slotsizeadv

Now we have essentially stated that HA should use the smaller of either the largest VM CPU reservation or the value for das.slotCpuInMhz as our CPU slot size.  A quick check on our runtime settings again reflects the change we just made.  Also, if you look, you will see that we have also increased our total available slots to 128, since we are now using a CPU slot size of 64MHz rather than 500.

ha-2-slotsizeafter

So that’s admission control and slot sizes in a nutshell.  Seems like a good task to have you limit or change some slot sizes on the exam.  Also, I’m not sure how much troubleshooting needs to be performed on the exam but if presented with any VMs failing to power on scenarios, slot sizes and admission control could definitely be the answer.
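For troubleshooting those power-on failures, it helps to see how slot sizes turn into slot counts.  Here's a hedged Python sketch of the idea as I understand it: each host holds as many slots as its scarcer resource allows, and the cluster tolerates as many host failures (largest hosts first) as still leave room for every used slot.  The names and host capacities are made up for illustration.

```python
def slots_per_host(host_cpu_mhz, host_mem_mb, cpu_slot, mem_slot):
    """A host holds as many slots as its scarcer resource allows."""
    return min(host_cpu_mhz // cpu_slot, host_mem_mb // mem_slot)


def failures_tolerated(host_slots, used_slots):
    """Count how many hosts can fail, largest first, with room to spare."""
    remaining = sorted(host_slots)  # ascending, so pop() drops the largest
    tolerated = 0
    while remaining:
        remaining.pop()
        if sum(remaining) < used_slots:
            break
        tolerated += 1
    return tolerated


if __name__ == "__main__":
    # Hypothetical host: 1000MHz CPU and 2048MB memory available for VMs.
    print(slots_per_host(1000, 2048, cpu_slot=500, mem_slot=32))
    print(slots_per_host(1000, 2048, cpu_slot=64, mem_slot=32))
    # Three hosts with 4, 3 and 3 slots, 5 slots in use.
    print(failures_tolerated([4, 3, 3], used_slots=5))
```

Shrinking the CPU slot size in the first two calls is the same trick as setting das.slotCpuInMhz: the host hasn't changed, it just counts more slots.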

More Advanced Settings

As you may have seen in the earlier screenshots there were a few other of those das. advanced settings shown.  Here’s a few that you may need to know for the exam, maybe, maybe not, either way, good to know…

das.heartbeatDsPerHost – used to increase the number of heartbeat datastores used – default is 2, however can be overridden to a maximum of 5.  Requires complete reconfiguration of HA on the hosts.

das.vmMemoryMinMb – value to use for the memory slot size if no reservation is present – default of 0

das.slotMemInMb – upper value of a memory slot size – meaning we can limit how large the slot size can be by using this value.

das.vmCpuMinMhz – value to use for the cpu slot size if no reservations are present – default of 32.

das.slotCpuInMhz – upper value of a CPU slot size – meaning we can limit how large the slot size can be by using this value

das.isolationAddress – can be used to change the IP address that HA pings when determining isolation – by default this is the default gateway.

das.isolationAddressX – can be used to add additional IPs to ping – X can be any number between 0 and 9.

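das.useDefaultIsolationAddress – can be used to specify whether HA should use the default isolation address (the default gateway) at all – set it to false when you only want the addresses given in das.isolationAddressX to be pinged.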

Anyways, those are the most commonly used settings – again, any others will be listed in the availability guide so use that if needed to find others on the exam – but remember, having to open those pdf’s will take away valuable time.
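If you want a quick memory aid for the limits mentioned in that list, here's a small Python sketch that sanity-checks an option name and value.  It only encodes what's written above (the das. prefix, 2 to 5 heartbeat datastores, isolation address index 0 to 9); the function is my own invention, not anything vSphere ships.

```python
import re


def validate_ha_option(name, value):
    """Return an error string for a bad option, or None if it looks fine."""
    if not name.startswith("das."):
        return "HA advanced options start with 'das.'"
    if name == "das.heartbeatDsPerHost" and not 2 <= int(value) <= 5:
        return "heartbeat datastores per host must be between 2 and 5"
    # Matches das.isolationAddress followed by an index, e.g. das.isolationAddress3.
    indexed = re.fullmatch(r"das\.isolationAddress(\d+)", name)
    if indexed and not 0 <= int(indexed.group(1)) <= 9:
        return "isolation address index must be between 0 and 9"
    return None


if __name__ == "__main__":
    print(validate_ha_option("das.heartbeatDsPerHost", "4"))
    print(validate_ha_option("das.heartbeatDsPerHost", "6"))
    print(validate_ha_option("das.isolationAddress10", "10.0.0.1"))
```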

Other random things

Just a few notes on some other parts of HA that I haven’t used that often.  The first being VM Monitoring.  VM Monitoring is a process that will monitor for heartbeats and I/O activity from the VMware Tools service inside your virtual machines.  If it doesn’t detect activity from the VM, it determines that it has failed and can proceed with a reboot of that VM.  vSphere has a few options for VM monitoring that we can use to help prevent false positives and unneeded VM reboots.

Failure Interval – How long, in seconds, VM monitoring will go without seeing heartbeats or I/O activity before it treats the VM as failed.

Minimum Uptime – The amount of time in seconds that VM monitoring will wait after a power on or restart before it starts to poll.

Maximum Per VM Resets – the number of times that a VM can be reset in a given time period (Reset Time Window)

Reset Time Window – the time period that the maximum per-VM resets applies to – specified in hours
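Those four settings fit together something like the Python sketch below.  It's my own simplified model: times are plain seconds, last_activity stands for the last heartbeat or I/O activity seen, and the default values are just illustrative numbers, not a claim about what your cluster uses.

```python
def should_reset(now, powered_on_at, last_activity, past_resets,
                 failure_interval=30, min_uptime=120,
                 max_resets=3, window_hours=1):
    """Decide whether VM monitoring would reset the VM at time `now`."""
    if now - powered_on_at < min_uptime:
        return False  # still inside the grace period after power-on
    if now - last_activity < failure_interval:
        return False  # heartbeat or I/O seen recently enough
    window_start = now - window_hours * 3600
    recent = [t for t in past_resets if t >= window_start]
    # Only reset if the VM hasn't used up its resets for this window.
    return len(recent) < max_resets


if __name__ == "__main__":
    print(should_reset(1000, 0, 990, []))
    print(should_reset(1000, 0, 900, []))
    print(should_reset(5000, 0, 4900, [4000, 4300, 4600]))
```

The last call is the interesting one: the VM looks dead, but it has already been reset three times inside the window, so it gets left alone.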

The blueprint also mentions heartbeat datastore dependencies and preferences.  Quickly, vSphere will choose which datastores to use as HA heartbeat datastores automatically, depending on a number of things like storage transport, number of hosts connected, etc.  We can change this as well in our options.  We can instruct vSphere to choose only from our preferred list (so selecting just 2 datastores in turn lets us determine exactly which datastores are used), or we can tell it to use our preferred datastores if possible and, if it can’t, go ahead and choose the ones it wants.
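The difference between the two preference modes is easier to see in code.  Here's a hedged Python sketch; the function is made up, and it assumes the candidates list is already in whatever order vSphere would rank the datastores, since that ranking is outside what I'm modelling.

```python
def pick_heartbeat_datastores(candidates, preferred, only_preferred, count=2):
    """Pick up to `count` heartbeat datastores.

    candidates: datastore names, assumed already ranked by vSphere.
    preferred: names we ticked in the preferred list.
    only_preferred: True to never fall back to non-preferred datastores.
    """
    chosen = [ds for ds in candidates if ds in preferred][:count]
    if not only_preferred:
        for ds in candidates:
            if len(chosen) == count:
                break
            if ds not in chosen:
                chosen.append(ds)
    return chosen


if __name__ == "__main__":
    stores = ["ds1", "ds2", "ds3"]
    print(pick_heartbeat_datastores(stores, {"ds3"}, only_preferred=True))
    print(pick_heartbeat_datastores(stores, {"ds3"}, only_preferred=False))
    print(pick_heartbeat_datastores(stores, {"ds2", "ds3"}, only_preferred=True))
```

The third call is the "pick exactly two and you decide" case from above.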

As well, almost all of the cluster-wide defaults, such as isolation response and restart priority, can be overridden on a per-VM basis.  This is pretty easy so I won’t explain it, but I just wanted to mention that it can be done.

I’d say that’s enough for HA – it’s not a hard item to administer.  That said, lab it, lab all of it!  Practice Practice Practice.