VCSA 6.5 Migration deployment sizes limited!

Recently I finally bit the bullet and decided to bring the vCenter portion of a vSphere environment up to version 6.5.  Since the migration from a Windows-based vCenter to the VCSA is now a supported path I thought it would be a good time to migrate to the appliance as well.  So with that I ran through a few blogs I found in regards to the migration, checked out the vSphere Upgrade Guide and peeled through a number of KBs looking for gotchas.  With my knowledge in hand I headed into the migration.

At this point I had already migrated my external Windows-based PSC to version 6.5 and got started on the migration of the Windows-based vCenter Server.  Following the wizard I was prompted for the typical SSO information along with where I would like to place the appliance.  The problem, though, came when I was prompted to select a deployment size for my new VCSA.  The only options available were Large and X-Large.  That might not have been a big deal if the environment actually required that amount of resources – but looking at the table below, those deployment sizes are scoped for the 1,000-hosts-and-above mark.

DeploymentSize

Did this environment have 1000+ hosts and 10,000+ VMs?  Absolutely not!  At its largest it contained maybe 70 hosts and a few hundred VMs running on them – a Small configuration at best, Medium if you want to be conservative!  At first I thought maybe I was over-provisioned in terms of resources on my current vCenter Server – but again, it only had 8 vCPUs and 16GB of RAM.  With nothing out of the ordinary with vCenter itself I turned my attention to the database – and that’s where my attention stayed, as it was currently sitting at a size of 200GB.  Honestly, this seemed super big to me, and knowing that it had been through a number of upgrades over the years I figured I would make it my goal to shrink it down as small as possible before trying again!  TL;DR version – the database was the culprit and I did end up with the “Small” option – but I did a number of things after a frenzy of Google searches – all listed below…

WAIT!!!!  Don’t be that guy!  Make sure you have solid backups and can restore if things here go sideways – engage VMware GSS if needed – don’t just “do what I do” 🙂


Reset the vpx provider

The vpx data provider basically supplies the object cache for vCenter – caching all inventory objects such as hosts, clusters, VMs, etc. in order to provide that super-snappy response time in the vSphere Web Client 6.0 (is this sarcasm?).  Anyways, resetting this provider will essentially reduce the size of our Inventory Service database.  Now, the problem in versions prior to 5.5 Update 3 is that there was no way to reset individual data providers – in order to do one you had to do them all – and that meant losing all of your tags, storage profiles/policies, etc.  Thankfully, 5.5 U3 and 6.0 allow us to reset just vpx, leaving the rest of our environment intact.  In order to do so we must first get into the vSphere Inventory Service Managed Object Browser (MOB) and get the UUID of the vpx provider.  NOTE: this is a different MOB than the one you may be used to logging into – see below.

First, log into the Inventory Service MOB by pointing your browser to https://vCenterIP/invsvc/mob1/.  From there, simply click the ‘RetrieveAllProviderConfigs’ link within the Methods section as shown below.

invsvcprovider

In the pop up dialog, click ‘Invoke Method’, then run a search for vpx

vpxprovider

It’s the providerUuid string that we are looking for – go ahead and copy that string to your clipboard and return to https://vCenterIP/invsvc/mob1/ – this time clicking the ‘ResetProviderContent’ link under Methods.  In the pop up dialog, paste in your copied UUID and click ‘Invoke Method’ as shown below…

resetcontent

After a little while the window should refresh and hopefully you see no errors!  For me the reset took roughly 5 minutes to complete…

Getting rid of logs

Although vCenter does its own log rotation you may want to check just how much space your logs are taking up on your current vCenter Server before migrating, as some of this data is processed during the migration/upgrade.  I freed up around 30GB of disk by purging some old logs – not a lot, but 30GB that didn’t need to be copied across the wire during the migration.  There is a great KB article here outlining the location and purpose of all of the vCenter Server log files – have a look at it and then peruse your install to see what you may be able to get rid of.  For the Windows version of vCenter you can find all of the logs in the %ALLUSERSPROFILE%\VMware\vCenterServer\logs\ folder.  I mostly purged anything that was gzipped and archived from most of the subfolders within this directory.  Again, not a difference maker in terms of unlocking my “Small” deployment option – but certainly a time-saver during the migration!  So what was the culprit that was not allowing me to select “Small”?  Yeah, let’s get to that right now…

My Bloated vCenter Database

bloateddb

Yeah, 200GB is a little much, right?  Even after resetting the vpx provider and shrinking the database files I was still sitting pretty high!  So, since I had no intention of migrating historical events, tasks and performance data, I thought I’d look at purging it beforehand!  Now if you have ever looked at the tables within your vCenter Server database you will find that VMware seems to create a lot of tables by appending a number to the VPX_HIST_STAT table name.  I had a lot of these – and going through them one by one wasn’t an option I felt like pursuing.  Thankfully, there’s a KB that provides a script to clean all of this up – you can find that here!  Go and get the MSSQL script in that KB and copy it over to your SQL Server.  Once you stop the vCenter Server service we can simply run the following command via the command prompt on our SQL Server to peel through and purge our data.

sqlcmd -S IP-address-or-FQDN-of-the-database-machine\instance_name -U vCenter-Server-database-user -P password -d database-name -v TaskMaxAgeInDays=task-days -v EventMaxAgeInDays=event-days -v StatMaxAgeInDays=stat-days -i download-path\2110031_MS_SQL_task_event_stat.sql

Obviously you will need to assign some values to the parameters passed (TaskMaxAgeInDays, EventMaxAgeInDays, & StatMaxAgeInDays).  For these you have a few options.

  • -1 – skips the respective parameter and deletes no data
  • 1 or more – purges data older than that number of days
  • 0 – deletes it all!

For instance, I went with 0, making my command look like the following…

sqlcmd -S IP-address-or-FQDN-of-the-database-machine\instance_name -U vCenter-Server-database-user -P password -d database-name -v TaskMaxAgeInDays=0 -v EventMaxAgeInDays=0 -v StatMaxAgeInDays=0 -i download-path\2110031_MS_SQL_task_event_stat.sql

After purging this data and running a shrink on both my data and log files I finally had my vCenter database reduced in size – but only to 30GB, which in all honesty still seemed a bit large to me.  After running the migration process again I still didn’t see my “Small” deployment option.  So I went looking for other large tables within the database and…..
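If you find yourself hunting for space hogs the same way, a generic SQL Server query along these lines will rank the tables by reserved space.  This is just my own sketch, nothing VMware-specific, but it’s exactly how a bloated table makes itself obvious…

-- List the 10 largest tables in the vCenter database by reserved space
-- (plain SQL Server DMV query - not from any VMware KB)
SELECT TOP (10)
    t.name AS table_name,
    SUM(p.reserved_page_count) * 8 / 1024 AS reserved_mb,
    SUM(CASE WHEN p.index_id IN (0, 1) THEN p.row_count ELSE 0 END) AS row_count
FROM sys.dm_db_partition_stats AS p
JOIN sys.tables AS t ON t.object_id = p.object_id
GROUP BY t.name
ORDER BY reserved_mb DESC;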

Hello VPX_TEXT_ARRAY

It’s not very nice to meet you at all!!!  After finally getting down to this table – and running sp_spaceused 'VPX_TEXT_ARRAY' – I found that it was sitting at a whopping 27GB.  Again, a flurry of Googling!  What is VPX_TEXT_ARRAY and what data does it hold?  Can I purge it?  Well, yes….and no.  VPX_TEXT_ARRAY, from what I can gather, keeps track of VM/host/datastore information – including information in regards to snapshots being performed on your VMs.  Also from what I can gather – in my environment anyways – this data has been accumulating in this table since, well, the beginning of time!  So, think about backup/replication products which constantly perform snapshots on VMs in order to protect them – yeah, this could cause that table to grow.  Also, if you are like me, and have a database that has been through a number of upgrades over the years, you may end up with quite a bit of data within this table, as it doesn’t seem to be processed by any sort of maintenance job.  In my case, 7 million records resided within VPX_TEXT_ARRAY.  Now, don’t just go and truncate that table, as it most likely has current data residing in it – data vCenter needs in order to work – there’s a reason it tracks it all in the first place, right?  Instead, we have to parse through the table, comparing the records with those in the VPX_ENTITY table, ensuring we only delete items which do not exist there.  The SQL you can use to do so is below…

DELETE FROM VPX_TEXT_ARRAY
WHERE NOT EXISTS(SELECT 1 FROM VPX_ENTITY WHERE ID=VPX_TEXT_ARRAY.MO_ID)
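Before running that delete it’s worth knowing how many orphaned rows you’re actually in for – a quick count using the same predicate (my own sketch, not part of the KB) will tell you up front…

-- Count the orphaned rows the DELETE above would remove
SELECT COUNT(*) AS orphaned_rows
FROM VPX_TEXT_ARRAY
WHERE NOT EXISTS(SELECT 1 FROM VPX_ENTITY WHERE ID=VPX_TEXT_ARRAY.MO_ID);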

A long and boring process – 18 hours later I was left with a mere 9000 records in my VPX_TEXT_ARRAY table.  Almost 7 million removed.  Just a note: there is a KB outlining this information as well – in which it says to drop to SINGLE_USER mode.  You can if you wish, but I simply stopped my vCenter Server service and stayed in MULTI_USER so I could check in from time to time to ensure I was still actually removing records.  An sp_spaceused 'VPX_TEXT_ARRAY' in another query window will let you track just that.  Also, it might be easier, if you have the space, to set the initial size of your transaction logs to something bigger than the amount of data in this table.  This saves SQL from having to grow them as it deletes records – you can always go back in the end and reset the initial size of the tlogs to 0 to shrink them.
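If you’d rather not sit through one giant multi-hour transaction, deleting in batches is another way to keep the transaction log under control – each batch commits on its own, so the log never has to hold all 7 million rows at once.  A rough sketch is below; the 50,000-row batch size and the logical file names are assumptions you would adjust for your own database.

-- Purge the orphans in batches (vCenter Server service stopped,
-- just like with the one-shot DELETE above)
WHILE 1 = 1
BEGIN
    DELETE TOP (50000) FROM VPX_TEXT_ARRAY
    WHERE NOT EXISTS(SELECT 1 FROM VPX_ENTITY WHERE ID=VPX_TEXT_ARRAY.MO_ID);

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to purge

    CHECKPOINT;  -- lets a SIMPLE-recovery database reuse log space between batches
END

-- Afterwards, shrink the data and log files back down.
-- 'VCDB' and 'VCDB_log' are assumed logical file names - check yours first with:
--   SELECT name FROM sys.database_files;
DBCC SHRINKFILE ('VCDB');
DBCC SHRINKFILE ('VCDB_log');

Either way, the sp_spaceused trick in another query window still works for watching the row count fall.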

So – a dozen coffees and a few days later I finally ran another shrink on both the data and log files, setting their initial sizes to 0, and voila – a 3GB database.  Another run at the migration and upgrade and there it was – the option to be “Small”!  Again, this worked in my environment – it may not work in yours – but it might help get you pointed in the right direction!  Do reach out if you have any questions, and do ensure you have solid backups before you attempt any of this – or anything you read on the net, really 🙂  Also, there’s always that Global Support Services thing that VMware provides if you want some help!  Thanks for reading!

Spring forward to the Toronto VMUG UserCon

Ahh, Spring – most people describe this as a time when the rain falls and cleans everything up around us – flowers blooming, grass growing – a sign of warmth to come!  In Canada though, it’s a sign of giant muddy snow piles full of gravel, salt and sand from all of the plowing and shoveling performed all Winter long – for me, it’s a muddy white dog and two little munchkins tracking muck all over the house.  All that said, there is some hope for Spring this year!  March 23rd marks the date for our next Toronto VMUG UserCon – so, if you want to escape the mud and the muck come on down to the Metro Toronto Convention Centre this Thursday and join 600+ of your peers for some great learning, technical sessions and some awesome keynotes!  We’ve got a great one planned this year and I just wanted to highlight some of the keynotes and sponsors we have lined up for Thursday!

First up – Mr. Frank Denneman

Over the years we have been lucky enough to have some awesome keynote speakers for our UserCon – this year is no exception!  I’m super excited to hear from Frank Denneman!  If you don’t know who Frank is let me try and enlighten you a little – this man literally wrote the book on DRS – three times!  The “HA and DRS/Clustering Deepdive” books – written by Frank and his co-author Duncan Epping – are honestly some of the greatest tech books ever.  They’re written in a style that is easy to read, and have literally taught me so much about HA and DRS I can’t even begin to explain it all!  Certainly a must-read for any VMware admin.  Frank moved on from VMware for a little while to work with PernixData as the CTO and has just recently returned to VMware, taking on the role of Senior Staff Architect within their SDDCaaS Cloud Platform Business Unit.  Frank will be giving a talk titled “A Closer Look at VMware Cloud on AWS”.  With VMware and Amazon recently announcing a partnership allowing us to consume bare-metal ESXi from within the wide range of Amazon’s data centers, this will most certainly be an interesting keynote explaining just how it works – and what we can expect from it in terms of unified management between our on-premises and AWS infrastructure.

The Breakouts and Panels!

After Frank, the morning breakout sessions will kick off – here we will have sessions from a variety of partners and vendors who provide everything from hardware to storage to backup to monitoring.  You will see all of the familiar names here, with 30-minute breakout sessions covering their technologies.  Take a look at our sponsors below – without these companies these events wouldn’t be possible!  A round of sessions from VMware follows a couple of rounds of sessions from third-party vendors, then lunch, and a VCDX/aspiring-VCDX panel talk where you can be sure to get some in-depth answers to any questions you may have about design, architecture, or day-to-day management of your VMware infrastructure.

Drinks, Food, and DiscoPosse

After lunch we have another couple of rounds of breakout sessions by VMware and our sponsors – with a reception following immediately thereafter.  vSphere with Operations Management will sponsor our networking reception, complete with drinks and appetizers – a perfect way to end what I’m sure will be a jam-packed day!  That said, what’s a beer without entertainment, right?  We are super happy to have our own VMUG co-leader Eric Wright (@discoposse) giving our closing keynote for the day!  Think of this a little like the technology version of CBC’s Hometown Heroes segment that they offer on Hockey Night in Canada!  Eric, our own hometown hero, will deliver a jam-packed hour of all things VMware and Terraform, showing us just how easy it is to start automating our infrastructure with the open-source software!  I got a sneak peek of this at our last local VMUG meeting and this is something you won’t want to miss!

Free Stuff!

Then, yes, of course – giveaways!  We have some pretty cool prizes this year including cold hard cash (VISA gift cards), GoPros, and the ever-popular grand prize of a complete vSphere homelab!  This is on top of all the great giveaways we see from our sponsors!

So if you aren’t busy this Thursday, register now and drop in – we’d love to see you there!  Even if you are busy, cancel everything and come on down!  Can’t make it?  Follow along via Twitter with the hashtag #tovmug – and hey, we have more meetings coming up as well to help you all get the Toronto VMUG experience.  Our Q2 meeting is May 31st, sponsored by Veeam and Mid-Range, and our Q3 meeting is tentatively set for September 19th with sponsors Zerto and Tanium (still in development) – come and check us out.  As always, stay connected.  You can follow us on Twitter, connect on LinkedIn, watch our website, or become a member of the Toronto VMUG Community in order to stay up to date on all things tovmug!  See you Thursday!


Don’t delay!  Register now for the March 23rd Toronto VMUG UserCon!


What to expect from VeeamON 2017

I’ve had the opportunity to attend both of the previous VeeamON conferences in Vegas as well as the mini VeeamON forum last year in the UK, and since it’s still a relatively new conference on the scene I thought I’d give everyone a bit of an overview and a heads-up as to what to expect from the event!  Before going too far into how the event is laid out let’s first take a look at the logistics.  While I do like Vegas it tends to get a bit monotonous when it comes to conferences – making them all kind of feel like the same event.  That’s why I was ecstatic to hear that VeeamON 2017 will be held in New Orleans this year, from May 16th through the 18th!  So, as Veeam embarks on its third VeeamON event I thought I might go over a bit of what to expect for those that may be unfamiliar with the backup vendor’s availability event.

Expect A LOT of technical information

With over 80 breakout sessions you can most certainly expect to learn something!  The thing about the breakouts at VeeamON though is their level of technical depth.  I’ve been to many breakout sessions at other conferences that tend to be pretty marketing-heavy – while VeeamON most certainly has a marketing agenda, the sessions themselves are very technical – with a 100 level being the least technical and a 400 level introducing you to things you never even knew existed!  I can honestly say that I was skeptical when attending my first VeeamON – wondering how they could have so many breakout sessions dealing solely with backup – man was I wrong!  Veeam B&R is a big application that touches a lot of different aspects of your infrastructure – think repository best practices, proxy sizing, automation, etc.  This year, with the addition of new products such as O365 backup, Agents for Linux/Windows and the many storage integrations with partners, you can bet that there will be plenty of content to be shared.

Expect a smaller, more intimate conference

VeeamON, compared to the bigger conferences, is relatively small.  With roughly 2500 people in attendance last year and over 3000 expected this year the conference is not as spread out as what you may be used to – which is a good thing!  Honestly, it’s nice being able to keep everything relatively confined to the same space, and even nicer to have no crazy lineups to cross the street like at the Moscone.  I found that VeeamON made it very easy to find people – whether you are looking for that person or not.  Meaning, don’t be surprised to accidentally run into some Veeam executives in the hallways – or even the CEO in the elevator 🙂  The atmosphere during the conference days at VeeamON is nice – not so loud that you can’t have a conversation – and the solution exchange isn’t overrun with vendors competing to see who has the loudest mic.  It’s a nice, low-key conference, which makes it easy to have those valuable hallway conversations that are usually the best benefit of any conference.

Expect to learn a little more about the “other hypervisor”

VMworld – the place you go to learn all there is to know about vSphere.  MS Ignite – the place you go to get all your Hyper-V knowledge!  VeeamON – since Veeam B&R supports both vSphere and Hyper-V, you are going to hear a lot about both hypervisors.  You’ll see your typical VMware crowd intermingling with…you know, the other guys, all in support of the product that is protecting their infrastructure.  I’ve written about how the Vanguard program bridges this gap before – and the VeeamON conference is fairly similar in how it brings together the best of both the vSphere and Hyper-V worlds.  As my good friend Angelo Luciani always says, “We are all in this together!”

Expect announcements!

This is a given, right – every vendor-organized conference is always organized around some sort of announcement or product release!  VeeamON 2014 saw the introduction of Endpoint Backup Free Edition, while VeeamON 2015 saw its OS counterpart announced with Veeam Backup for Linux!  All the while lifting the lid on some major enhancements and features in their core product, Veeam Backup & Replication.  So what will we see this year in New Orleans – your guess is as good as mine.  Veeam just recently had a major event where they announced the evolution of the physical Windows/Linux backup products (Veeam Agent for Windows/Linux) into paid versions coupled with the Veeam Backup Console for centralized management of our endpoints – as well, we saw the release of Veeam Backup for O365.  What else is left to announce?  I’m sure we will hear more about v10 and some top-secret features from it, but with all of the other new product announcements one might think there is nothing left to release – but a wise man who worked for Veeam once told me that they have this shelf containing a lot of products and ideas – you never know when they will take something down off of it 🙂

Expect to have ALL your questions answered

Veeam sends a lot of employees – engineers, tech marketing folks – to this conference, and I mean A LOT.  At the last VeeamON you couldn’t even walk through the Aria casino without running into at least a half dozen Veeam engineers.  What this means is, if you have questions, VeeamON is the perfect venue to ask them.  I can pretty much guarantee you that they will all be answered – there will be an SME on site dealing in the areas you are having trouble with.  So don’t just make VeeamON all about learning – try and get some of those pain points that have been bugging you for a while firmed up while at the conference.  Everyone is approachable and more than willing to give you a few minutes.

Expect an EPIC party

Sometimes you just have to let go, right?  If you have ever been to a Veeam party at any of the VMworlds you know that Veeam knows how to do just that!  In fact, I’ve heard Veeam described more than once as a “drinking company with a backup problem” 🙂  I don’t quite see it as being like that, but certainly you have to agree that Veeam knows how to throw a party and make you feel welcome.  Whether you are just arriving and hitting up the welcome reception or you are attending the main VeeamON party I know you will have a good time, with good food and good drinks!  Veeam understands that it can’t be all about business all the time – so take the opportunity at the parties to let loose a little and meet someone new!  I’ve made many lifelong friends doing just that!

So there you have it!  Hopefully I’ve helped paint the picture of what VeeamON is like for me and maybe helped you understand it a little more!  I’m super excited for VeeamON in New Orleans this May and I hope to see you there!

Runecast – Proactive performance for your VMware environment – Part 2 – Knowledge Base Articles, Best Practices, and Hardening Guidelines

logo

In part 1 of our Runecast review we took a look at just how quickly we can get Runecast installed and configured within our environment.  We had a brief look at the Runecast dashboard, which highlights any misconfigurations, unapplied Knowledge Base articles, or non-compliant security settings.  We saw that within just a few minutes we were reporting on all this information from within our environment, and comparing it to up-to-date lists of best practices and hardening guidelines.  With KBs, Best Practices, and Hardening Guidelines being at the heart of Runecast it’s best we take a more in-depth look at how we report on, manage, and resolve them within our environment.  That is exactly what this final part of the review will focus on.

So with all that said let’s start diving deeper into our test environment to see if we can solve any problems!  As we can see above, I currently have 38 issues that were detected within my small little lab setup, broken down into 5 critical, 19 major, and 14 medium.  Clicking on any severity item within the dashboard display will take us directly to a filtered view of our issues list, or we can view all issues by selecting Issues List along the left-hand navigational menu.

runecastissues

By default, our issues appear rolled up – to get more information in regards to the Knowledge Base article, Best Practice or Security setting we can click the ‘+’ icon next to our issue as shown above.  As we can see here, Runecast is reporting that we don’t have NTP configured on our ESXi host, falling under the Best Practice category.  Certainly time is an important thing in the world of computing, so I can see why they would flag this as a critical issue.  We can also see after expanding the issue that we have a lot of other information available to us – a more detailed description of the problem, as well as ratings, impact, and a link to any reference material/knowledge base article or security hardening guide to further explain the issue and how to fix it.  This is very handy to have.  Right from within Runecast we can discover our issues and immediately jump into a document, user guide, or KB article outlining the problems and resolutions.

The ‘Findings’ tab within the expanded issue allows us to view the inventory objects within our environment that the issue applies to – in this case, both of our ESXi hosts.  I should note here that we do not need to first click on an issue to view its associated objects – we can do this in the reverse direction as well by using the Inventory item on the left-hand navigation.  Inventory essentially gets us to the same place, but allows us to browse through our vCenter inventory, selecting a host, cluster, datastore, VM, etc. and displaying just its associated issues.  Either way we get to the same information – just a couple of routes to get there.

Another useful tab on this screen is the ‘Note’ tab.  As shown below we are able to input any notes or information we want that applies to this issue (or KB/security setting for that matter).  This can be extremely useful if we have multiple people working within the Runecast environment, or even just as documentation for yourself as to why you are making or not making a certain configuration change.

runecast-issue-notes

In order to clear issues within Runecast we have a couple of options – the first, and probably the most preferred method, is to simply fix your issue – I’ve since set up NTP on my hosts and no longer see this issue being reported.  That said, as mentioned above there may be times when we have an issue present for a certain reason, especially in the best practices category – think of something like the forged transmits setting.  For this, we can simply click the ‘Ignore’ link next to an issue and create an object filter, as shown below, by giving it a name and selecting the objects it applies to.

runecast-issues-ignore

After applying the filter the issue in question will no longer be reported in Runecast.  We can edit or remove this filter at any time by selecting the ‘Filter’ tab from within Runecast’s settings in order to reset anything we may want to.

From within the ‘Configuration Analysis’ section we are able to view our issues in a different fashion.

First up, ‘KBs Discovered’ will show us all of the KBs that have been discovered that apply to our environment.  It does this by parsing the VMware Knowledge Base and pulling down only those KBs which apply to the hardware and software versions we have running within our virtual infrastructure.  As we can see below we still have the same options as we did within the Issues List screen – we have our link out to the actual VMware KB article, the article is also embedded into Runecast, and we can add notes and choose to ‘Ignore’ certain KBs that may not apply.

runecast-kb

The ‘Best Practices’ and ‘Security Hardening’ sections take somewhat of a different approach as to how they are displayed.  Since best practices and security settings are actual configurations that we can choose to make in our environment they are displayed in a simple Pass/Fail fashion – passing if we meet the criteria of the practice or security setting, and failing if we do not.  This gives us the ability to quickly see things such as “How many major items from the security guideline have we implemented?” or “Have we applied all of the ‘critical’ best practices to our environment?”

runecast-bp

As we can see above we are getting a pass on our NTP settings, as we have already tackled them from the Issues screen.  We are however receiving a fail in terms of Remote TSM, which essentially means having SSH enabled on our hosts.  In my environments this is a known configuration setting, so I would most likely choose to create a filter to ignore this security setting.

The last section of Runecast I want to go over is the Log Analysis section.  Within here we have another couple of screens we can access – KBs Discovered and Verbose dashboards.  The KBs Discovered section here deals solely with those KBs that specify certain patterns which are visible in the logs, such as KB 2144934, where you can see below the “you see entries similar to…”

runecast-vmware-kb

Nobody likes searching through log files – it’s a long and tedious task.  In this situation, since we are already shipping our logs to Runecast, why not let the analyzer go ahead and comb them for you?  If it finds a pattern that applies to any specific KB article, it will be flagged here.  This allows us to be quite proactive in nature – alerting us to a KB issue that we may not even know we have.

As far as ‘Verbose Dashboards’ goes, this allows us to quickly get a grasp on all of the events occurring within our log files.  Again, the task of combing through log files and grepping out certain items such as SCSI aborts on the command line can be daunting, not to mention very time consuming.  Here, as shown below, we can do this directly from within the Runecast UI.

runecast-verbose

As you can see we have a lot of options to filter out the events within the logs to get just the data we are looking for.  For instance, we can define that we only want to see those log entries flagged as an error and applying only to a certain ESXi host.  We can also define a time period of logs to parse – from predefined settings of the last 1/3/7/30 days to a custom period set up by us if we needed to audit a certain event at a certain time.  This is a very useful feature to have within the UI.  Since Runecast already has the log data in order to determine issues, why not give us a screen to analyze the raw data?  I can see this being super useful for things such as searching for certain logins during a specific time period – something that isn’t easy to do sitting within the CLI of an ESXi host.

Runecast really has a very nice product here – it brings a lot of information out of our environment and puts it front and center in a very easy, simple UI.  It’s so easy to set up as well – simply deploy the OVA, point it at our vCenter and right away we know how our environment stacks up in terms of best practices and security guidelines – and we have discovered any potential issues we may have, with all of the information on how to fix them.  All of this, in about 5 minutes.  Think about the flip side of this: downloading best practices and the hardening guide and going through each line item one by one, looking up build numbers and then searching through mountains of VMware KBs – not something I want to do.

While other products providing similar functionality, such as vROps and Log Insight, may bring us more metrics, Runecast instead displays only what we need to see to properly troubleshoot our environment, keeping the UI clean, crisp and easy to use.  Aside from that, when compared to vROps, Runecast doesn’t come with the install footprint, nor the price tag, and as far as I know it is the only product on the market which parses and filters out VMware KBs for us.  As far as development goes Runecast isn’t holding back – with a beta version set to be released soon we can see features such as multitenancy being added to the product, as well as a few more undisclosed features set to be released in Q1/Q2 of this year.

Runecast comes with a fully featured, free 30-day trial, but honestly the product gives you valuable information in the first 15 minutes – so 30 days is more than long enough to get your environment up to snuff.  That said, in order to keep your environment running at its peak you will want to consult Runecast often, as we all know how fast best practices and security guidelines can change in our industry.  Runecast automatically adjusts to these changes – ensuring your environment is ALWAYS compliant.  The amount of time Runecast saves you is instantly recognized, and the fact that they are constantly connected to the VMware Knowledge Base and hardening guides means you are always “in the know” about how your environment is configured according to the “preferred” way – even if your environment changes, or the “preferred” way changes!  If you want to try out Runecast and see what it has to offer for yourself you can do so by signing up for their 30-day trial!  I guarantee you will find something in need of some attention in your environment!

Runecast – Proactive performance for your VMware environment! – Part 1 – Configuration

Have you ever opened up the VMware Hardening Guide and checked your environment against every single item listed?  How about combed through the VMware Knowledge Base looking for all the KB articles that apply to the exact software builds and hardware you have?  No?  How about taken a list of industry best practices and ensured that you are indeed configured in the best possible way?  Of course we haven’t – that would certainly take a lot of time, and most organizations simply don’t have the resources to throw at those types of tasks.  All that said, what if I told you that there was a piece of software that could pretty much instantly tell you whether or not you are compliant in those exact three scenarios?  Interested yet?  I thought you might be…

Enter Runecast

logo

Before writing this review I’d never heard of Runecast, so first, a little bit about the company.  Runecast was founded in 2014 in the quaint ol’ city of London in the UK.  Their goal: to provide proactive monitoring for our vSphere environments in order to save us time, prevent outages before they happen, ensure compliance at all times and simply make our environments more secure.  Now there are only four things listed there – but they are four things that Runecast does really, really well.  With that said, I could talk about how much I enjoyed doing this review forever, but it’s best just to jump right in and get monitoring…

Configuration

runecast-addvcenter

As far as installation goes Runecast comes bundled as a virtual appliance, so it’s just a matter of deploying the analyzer into our environment.  To help you get started Runecast offers a 30-day full-featured free trial that you can try out!  Configuration-wise we really only have a couple of steps to perform: pointing the Runecast Analyzer at our vCenter Server and configuring our ESXi hosts to forward their logs.  After deployment you should be brought to a screen similar to the one shown to the left.  Simply follow the ‘Settings’ link and enter your required vCenter Server information into Runecast as shown below.

runecast-vcenteradditiondetails

Remember how we mentioned that configuration is divided into two steps?  The first, connecting to our vCenter environment, is now complete.  The second, setting up the forwarding of logs, is completely optional and can be completed at any time.  We can still get valuable data from Runecast without having log forwarding set up, however in order to achieve a more holistic view of our environment we will continue to set up log forwarding.

There are many ways to set up our ESXi hosts to send their logs to Runecast.  We can set them up manually, use a PowerCLI script, or enter the Runecast Analyzer information into our Host Profile.  The Runecast interface has the smarts to configure this for us as well.  This review will follow the steps to set up log forwarding from within the Runecast Analyzer UI.

Selecting the “Status” section from the Log Analysis group, and then clicking on the ‘wrench’ icon, will allow us to configure one or many of our hosts to send their log files to Runecast.  This process provides the same results as if we were to go and set the syslog advanced setting directly in the host’s configuration – that said, utilizing Runecast for this seems like a much more automated and easier process.  As you can see below, we also have the option to send our VM log files as well, which is a good idea if you are looking for complete visibility into your virtualization stack.

runecast-logging

As far as configuration goes we are now done!  That’s it!  Two simple steps and we are ready to start detecting problems within our environment.  The process of going out and collecting data from our vCenter Server is called ‘Analyze’ within Runecast.  Our analysis can be configured to occur on a schedule by navigating to the settings page (gear icon in the top right) or can be run on-demand by clicking the ‘Analyze Now’ button from any screen within the application.

runecast-analyze

How long this process takes greatly depends on the size of your environment.  My test environment, albeit simple and small, only took a couple of minutes to gather the data.  I’m sure this time would increase in a 32-host cluster with 1000 or so VMs though.  That said, for the amount of data it gathers and the amount of comparisons going on behind the scenes, Runecast does a very efficient job at processing everything.

Navigating back to the ‘Dashboard’ as shown below immediately lets us start to explore the results of this analysis process.  Almost instantaneously we can see many issues and best practices that can be applied within our environment.  As you can see below I had a number of issues discovered – and I’d only had Runecast up and running for less than 5 minutes.

runecast-dashboard

Runecast Terminology

Let’s take a minute and dig a little into the data that is displayed on the ‘Dashboard’ screen.  Mostly everything that Runecast monitors and does is rolled up here, giving us an at-a-glance view of everything we need to know.  Let’s break down the items that we are seeing here…

Issues – The term “issue” within Runecast basically represents a detected problem in our infrastructure – this can come from any single or combined instance of configuration settings, log file analysis, or software and hardware versions.  Although the source of a discovered issue could be configuration settings or log files, all issues belong to one of three categories within Runecast: Knowledge Base articles, Security Guidelines, or Best Practices, explained below…

KBs – Runecast actively plows through the vast amounts of VMware Knowledge Base articles and displays to us any that may apply to our environment based on the hardware and software versions and configuration we are running.

Best Practices – All of our inventory objects and configuration items are routinely scanned to determine whether or not they meet any best practices related to VMware.  This allows us to see if we simply Pass or Fail in terms of having our environment running in its best possible configuration.

Security Compliance – Security Compliance takes all of the items within the official VMware Security Hardening guides and compares them to the configuration of our infrastructure.  At a glance we are able to see how we stack up against the recommended security practices provided by VMware.

It’s these four items – Issues, KBs, Best Practices, and Security Compliance – that are at the core of the Runecast analytical engine.  Runecast automatically combs through all of these items and determines which ones apply to our environment, then reports back in a slick, clean UI, allowing us to see whether we are in compliance or not!  In the next part of our review we will go into each of these items in a lot more detail – explaining how to drill down, resolve, and exclude certain items from our dashboards.  For now, I certainly recommend checking out Runecast for yourself – as you saw, it’s a simple install that can be up and running in your environment very quickly.  So, while you wait for part 2 of the review, head on over to the Runecast page and grab yourself a free 30-day trial to start reporting on your environment.  I’m sure you will be surprised at all of the abnormalities and non-compliant configurations you find right off the hop – I know I was!  Stay tuned for part 2.

Automation using the Nakivo API

api

The Software Defined Data Center – it’s everywhere.  You can’t go to any big trade show in the IT industry without hearing the phrase “Software Defined X” being tossed around at all of the booths.  Over the last decade or so we have seen software take center stage in our data centers – being the glue that holds everything together.  With this focus on software it’s extremely important that companies develop and support APIs within their products.  First, it’s our way of taking application x and integrating it with application y.  Second, it’s important for the success of the company – without an API, organizations may look elsewhere for a solution that provides one, and without an API vendors cannot securely control access into their solutions, leaving customers developing unsupported and faulty applications to get around it.

One big example that shows the benefit of API integrations that I always like to use is the deployment of a VM.  Sure, we use our hypervisor of choice to take our templates and clone VMs from them, providing some sort of automation and orchestration around the configuration of said VM – but the job doesn’t simply end there – we have monitoring solutions we may need to add our VM into, we have IP management tools to integrate with in order to retrieve IPs and DNS information, and most importantly, we have to ensure that our newly created VM is adequately protected in terms of backup and recovery.  With so many hands inside of the data center creating VMs our backup administrators might not always know a certain solution has been created – and when a failure occurs, there’s a pretty good chance we won’t be able to recover without any backups – so it’s this situation we will look at today…

Automatically protecting our VMs

Our software of choice today will be Nakivo Backup and Replication, from a company based out of Silicon Valley providing data protection solutions.  Nakivo provides full API integration into their backup suite, allowing administrators and developers to create automation around the creation, modification, and removal of jobs.  The scope of our integration will be as follows – we will create a simple vRealize Orchestrator workflow that will allow us to right-click a VM from within the vSphere Web Client and add that VM into an already existing backup job.  From here I’ll let your imagination run wild – maybe you integrate this code into your VM deployment workflow to automatically protect VMs on creation – the point is that we have a starting point to look at the possibilities of consuming Nakivo’s API and creating some automation around backup and recovery within your environment.

nakivoapi-apidoc

A little about the Nakivo API

Before we get into the actual creation of the vRO workflow it’s best we understand a little bit about the Nakivo API itself.  Nakivo provides an API based around JSON content – so all of our requests and responses will be formatted as JSON.  These requests are all sent as POSTs, and are always directed at the /c/router endpoint (ie https://ip_of_nakivo:4443/c/router).  As far as authentication goes Nakivo utilizes cookie-based authentication – what this means is that our first request will be sent to the login method, upon which we will receive a JSESSIONID which we will have to pass with every subsequent request in order to secure our connection.  As we can see from the example request below, requests need to be formatted in such a way that we first specify an instance (IE AuthenticationManagement, BackupManagement, InventoryManagement, etc) and a method (IE login, saveJob, getJob, etc).  From there we attach the data associated with the method and instance, as well as a transaction id (tid).  The transaction id can utilize an auto-incrementing integer if you like, or can simply be set to any integer – its main purpose is to group multiple method calls into a single POST – which we won’t be doing anyways, so you will see I always use 1.

var requestJSON = "{'action': 'AuthenticationManagement','method':'login','data': [admin,VMware1!,true],'type': 'rpc','tid': 1}";

Above we show an example of a login request in JavaScript, because that is the language of choice for vRealize Orchestrator, which we will be using – but do remember that you could use PHP/Java/PowerShell – whatever language you want, so long as you can form an HTTP request and send JSON along with it.

On with the workflow

Before diving right into the code it’s best to take a look at the different sections or components that we will need to run through in order to add a given VM to a Nakivo job through vRealize Orchestrator.  With that said we can break the process down into the following sections…

  • Add Nakivo as an HTTPRest object within vRO
  • Create a workflow with the VM as an input parameter and the Nakivo HTTPRest object as an attribute
  • Create some variables in regards to our VM (IE name, cluster, etc)
  • Login to Nakivo to retrieve a session
  • Retrieve our target job
  • Find the VM's cluster ID within Nakivo (the cluster ID is required in order to find the actual VM within Nakivo)
  • Gather VM information from within Nakivo
  • Gather information about our repository from within Nakivo
  • Build the JSON request and add the VM to the job

With our workflow broken down into manageable chunks let’s go ahead and start coding

Add Nakivo as an HTTPRest object.

If you have ever worked with the HTTPRest plugin within vRO then this will seem like review – however, for those that haven’t, let’s take a look at the process of getting this set up.  From within the workflow view simply run the ‘Add a REST host’ workflow located under the HTTP-REST/Configuration folders.  As far as parameters go simply give the host a name, use https://ip_of_nakivo:4443 as the URL, and be sure to select ‘Yes’ under the certificate acceptance as shown below.

nakivoapi-addhost

The remaining steps don’t matter much as far as adding Nakivo as a REST host within vRO is concerned – for authentication I selected basic and provided the credentials for Nakivo.  This really doesn’t matter, as we are going to use cookie/header-based authentication through our code anyways – however, something needs to be selected and inputted within vRO.  After clicking submit the NakivoAPI REST host should be added to our vRO inventory.

Workflow creation

As far as the workflow goes I’ve tried to keep it as simple as possible, requiring only 1 input attribute and 1 input parameter, as follows:

  • Input Attribute (Name: NakivoAPI – Type: RESTHost – Value: set to the Nakivo object created earlier)

Nakivoapi-attribute

  • Input Parameter (Name: sourceVM – Type: VC:VirtualMachine )

nakivoapi-parameter

Code time!

After this, simply drag and drop a scriptable task into the Schema and we can get started with the code!  I’ve always found it easier to simply display all the code and then go through the main sections line by line afterwards.  The JavaScript we need can be found below…

 1  var vmName = sourceVM.name
 2  var cluster = System.getModule("com.vmware.library.vc.cluster").getComputeResourceOfVm(sourceVM);
 3  var clusterName = cluster.name;
 4
 5  // login and retrieve sessionID
 6  var requestJSON = "{'action': 'AuthenticationManagement','method':'login','data': [admin,VMware1!,true],'type': 'rpc','tid': 1}";
 7  var request = NakivoAPI.createRequest("POST", "/c/router", requestJSON);
 8  request.setHeader("Content-Type","application/json");
 9  var response = request.execute();
10  var headers = response.getAllHeaders();
11  var cookie = headers.get("Set-Cookie");
12
13  // retrieve target job
14  requestJSON = "{'action': 'JobManagement','method':'getJob','data': [1],'type': 'rpc','tid': 1}";
15  request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
16  request.setHeader("Content-Type","application/json");
17  request.setHeader("Cookie", cookie);
18  response = request.execute();
19  var jsonResponse = JSON.parse(response.contentAsString);
20  var job = jsonResponse.data;
21
22  // find clusterID
23  requestJSON = "{'action': 'InventoryManagement','method':'collect','data': [{'viewType':'VIRTUAL_ENVIRONMENT'}],'type': 'rpc','tid': 1}";
24  request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
25  request.setHeader("Content-Type","application/json");
26  request.setHeader("Cookie", cookie);
27  response = request.execute();
28  jsonResponse = JSON.parse(response.contentAsString);
29
30  // reduce to datacenters
31  var vcenter = jsonResponse.data.children[0];
32  var datacenters = vcenter.children;
33  var datacenter;
34  var cluster;
35  for ( var p in datacenters)
36  {
37      for (var c in datacenters[p].children)
38      {
39          if (datacenters[p].children[c].name == clusterName)
40          {
41              cluster = datacenters[p].children[c];
42          }
43      }
44  }
45  var clusterid = cluster.identifier;
46
47  // look in cluster for VM info...
48  requestJSON = "{'action': 'InventoryManagement','method':'list','data': [{'nodeType':'VMWARE_CLUSTER','nodeId': '" + clusterid + "','includeTypes': ['VM'] }],'type': 'rpc','tid': 1}";
49  request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
50  request.setHeader("Content-Type","application/json");
51  request.setHeader("Cookie", cookie);
52  response = request.execute();
53  jsonResponse = JSON.parse(response.contentAsString);
54  var vms = JSON.parse(response.contentAsString);
55  vms = vms.data.children;
56  var vm;
57  for (var p in vms)
58  {
59      if (vms[p].name == vmName)
60      {
61          vm = vms[p];
62      }
63  }
64
65  // get more info on VM
66  requestJSON = "{'action': 'InventoryManagement','method':'getNodes','data': [true, ['"+ vm.vid + "']],'type': 'rpc','tid': 1}";
67  request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
68  request.setHeader("Content-Type","application/json");
69  request.setHeader("Cookie", cookie);
70  response = request.execute();
71  var vminfo = JSON.parse(response.contentAsString);
72  vminfo = vminfo.data.children[0];
73  var vmdisk = vminfo.extendedInfo.disks[0].vid;
74
75  // get target storage
76  requestJSON = "{'action': 'InventoryManagement','method':'list','data': [{'includeTypes': ['BACKUP_REPOSITORY'] }],'type': 'rpc','tid': 1}";
77  request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
78  request.setHeader("Content-Type","application/json");
79  request.setHeader("Cookie", cookie);
80  response = request.execute();
81  jsonResponse = JSON.parse(response.contentAsString);
82  var targetVid = jsonResponse.data.children[0].vid;
83
84  //build data portion of JSON to add VM to job
85  var jsonSTR = '{ "sourceVid": "' + vminfo.vid + '","targetStorageVid": "' + targetVid + '","mappings": [{"type": "NORMAL","sourceVid": "' + vmdisk + '"}], "appAwareEnabled": false}';
86  var json = JSON.parse(jsonSTR);
87
88  //push new object to original job
89  job.objects.push(json);
90  System.log(JSON.stringify(job));
91
92  // let's try and push this back in now....
93  requestJSON = "{'action': 'JobManagement','method': 'saveJob', 'data': [" + JSON.stringify(job) + "],'type': 'rpc','tid': 1}";
94  request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
95  request.setHeader("Content-Type","application/json");
96  request.setHeader("Cookie", cookie);
97  response = request.execute();
98
99  // done!!!!

Lines 1-3 – Here we simply set up a few variables we will need later on within the script: vmName, which is assigned the name attribute of our input parameter sourceVM, as well as the name of the cluster the VM belongs to, retrieved by running a built-in action.  Both of these variables will be needed when we attempt to gather all of the information required to add the VM to the backup job.

Lines 5-11 – This is our request to log in to Nakivo.  As you can see we simply create our request and send the login method, along with the associated login data, to the AuthenticationManagement interface.  This request basically authenticates us and sends back the JSESSIONID that we need in order to make subsequent requests, which we store in a cookie variable on line 11.

Lines 13-20 – Again, we make a request to Nakivo, this time to get the job that we want to add the VM to.  I only have one job within my environment so I’ve simply utilized the getJob method and sent a data value of 1 (the job ID) since I know that is my one and only job ID in the system.  If you don’t know the job ID you may need to write a similar request to get it – Nakivo does provide methods within their API to search for a job ID by the job name.  Also note, since this is a subsequent request after a login, we are sending our cookie authentication data on line 17 – and we are taking our response data and storing it in a variable named job on line 20 – we will need this later when we update the job.

Lines 22-45 – This is basically a request to the InventoryManagement interface that we can use to find out the ID (as it is known within Nakivo) of the cluster housing the virtual machine.  First, on line 23 we build a request to return our complete virtual infrastructure inventory – which we then parse through on lines 35-44, looking for a match on our cluster name.  I’ve had to loop through datacenters as my test environment contains more than one virtual datacenter.  Finally, on line 45 we simply assign the clusterid variable the Nakivo identifier of the cluster.

Lines 47-73 – Here we use our cluster identifier and basically list out the VM inventory within it.  After looping through, when we find a match on our VM name we simply assign it to a vm variable.  We then, on line 66, send a request to the InventoryManagement interface again, this time looking at the virtual machine level and sending the identifier of our newly discovered VM.  Once we have the response we assign the identifier of the VM’s disk(s) to a variable on line 73.  Again, I know this environment and I know the VM only contains one disk, so I’ve hard-coded my index – if it was unknown, or truly automated, you would most likely have to loop through the disks here to get your desired output.

Lines 75-82 – This block of code is used to get the identifier of the target storage, or repository, within Nakivo.  Again we need this information for our final request that will add the VM to the job – and again, this is a known environment, so I could simply hard-code my array index on line 82 to return the proper repository (as there is only one).

Lines 84-90 – Here we are simply building out the JSON variable that we need in order to push all of the information we have previously gathered above.  We basically form our string on line 85, convert it to JSON directly after, and push it into the original job variable we set on line 20.

Lines 92-99 – Ah, finally – this block takes all of our hard work and pushes the job back into the saveJob method of the Nakivo JobManagement interface.  Once executed, you should see your job info within Nakivo update, reflecting the new VM added to the job.

So there you have it! A completely automated way of selecting a VM within vRealize Orchestrator and adding it to a Nakivo backup job – all without having to open up the Nakivo UI at all!

But wait, there’s more!

Ahh – we eliminated the need to open up the Nakivo UI, but how about eliminating the Orchestrator client as well – and simply executing this workflow from directly within the vSphere Web Client?  Sounds like a good idea to me!  If you have properly integrated vRO and vSphere – and I say “properly” because it can sometimes be difficult – then doing this is a pretty easy task.

Within the ‘Context Actions’ tab on our vRO configuration within the web client simply click ‘+’ to add a new action.  As shown below we can simply browse our workflow library and select our newly created Nakivo workflow and associate that with the right-click context menu of a virtual machine.

nakivoapi-context

What we have essentially done now is allow our administrators to simply right-click on a VM, browse to ‘All vRealize Orchestrator Actions’ and click on our workflow name.  From there the vRO workflow will take the associated VM (the one we right-clicked on) and assign it to our sourceVM parameter – meaning we’ve taken the complete process of logging into Nakivo, editing our backup job, adding a new VM, and saving it, and converted it into a simple right-click followed by a left-click – without having to leave the vSphere Web Client!

nakivoapi-rightclick

So, in all, this is a pretty basic example of some of the things we can do with the Nakivo API – and it followed a pretty simple, stripped-down workflow – but the point is that Nakivo offers a wide variety of methods and integration points into their product.  Pretty much anything you can do within the GUI can be performed by making calls to the API.  This is what helps a product integrate into the Software Defined Data Center – and what allows administrators to save time and provide consistency, all the while ensuring our data is protected.  Nakivo also has a wide variety of documentation as well as a Java SDK built around their API, complete with documents and explanations around all of the interfaces provided.  If you are interested in learning more about Nakivo’s API, or Nakivo’s products in general, head on over to their site here – you can get started for the low cost of free!  Until next time, happy automating!