Category Archives: Posts

Veeam announces the new Veeam Powered Network (Veeam PN)

During the final keynote of VeeamON 2017 Veeam took the stage and threw down the gauntlet with a brand new product release: the Veeam Powered Network, or Veeam PN for short.

Veeam PN is a new product, not a feature added to any others, and was initially developed to solve an internal issue within Veeam.  Veeam has a lot of employees and developers in remote sites all across the world – and the pain of constantly connecting those sites together via VPN, coupled with the frustration of tunnels dropping all the time, gave birth to Veeam PN.  It feels a lot like how a VMware fling comes to life: first internal only, then released to the masses, then actually built out as an application offering.  Although Veeam PN can be used to establish this connectivity between any sites at all, the real benefits and the initial design intentions all focus on Microsoft Azure.

Veeam PN – Disaster Recovery to Microsoft Azure

Veeam PN is deployed into your Azure environment via the Azure Marketplace.  Once your cloud network has been established, another virtual appliance is deployed from veeam.com into your on-premises environments.  From there it’s as simple as setting up which networks you wish to have access into Azure and importing the automatically generated site configuration files at your remote sites – with that, you have a complete and secure site-to-site tunnel established.  I’m not sure of the scalability of Veeam PN just yet, but I do know it supports having multiple sites connected into Azure for those ROBO situations.  Remote workers on the road can simply connect into Veeam PN and download a configuration file that simplifies their setup of the OpenVPN client to establish a client-to-site VPN.

VeeamPN

So at this point you may be thinking “Why would Veeam develop this tech focused around networking, and what does it have to do with backup or DR?”  Well, let’s couple this together with a little feature Veeam has called “Direct Restore to Microsoft Azure”.  By recovering our VMs and physical endpoints directly into Azure, and then easily establishing the network connectivity using Veeam PN, we can now leverage true DR in the cloud in an easy to use, scalable, and secure way.  This is the “nirvana” of recovery that we have all been looking for.

One more thing – it’s free!

There it is – the Veeam way!  They released Backup and Replication with a free tier, Windows/Linux endpoint agents free, Direct Restore to Azure – free, the explorer tech – free!  Let’s add one more to that list!  Veeam PN is absolutely free!  And even though they have talked a lot about it being leveraged for Azure, companies and organizations can essentially use this technology to connect any of their sites and clients together – absolutely free!

Details around any betas or GA haven’t been revealed yet but keep your eyes open and I’ll do my best to help spread around any opportunity for you to get your hands on the new Veeam Powered Network!

A glimpse into #VeeamVanguard day!

Sure the Veeam Vanguard program comes complete with tons of great swag and free trips to VeeamON and whatnot – but in all honesty the biggest benefit of the program, in my opinion, is the access that Veeam provides – access to fellow Vanguards and access to key people within Veeam, across the whole company from executives to SEs.  Here at VeeamON 2017 we get a special day jam-packed full of access – and below is a bit of a lowdown on what happened (or as much as we can tell you about anyways).

Veeam Availability Orchestrator – Michael White

The day started off with a couple of hours with Michael White (@mwVme) giving the lowdown on Veeam Availability Orchestrator – one of Veeam’s newest products, which helps orchestrate and automate disaster recovery failover.  Before actually getting into any product specifics, Michael went through a brief discussion about what Disaster Recovery and Business Continuity actually are, and how we can best prepare for any situation that may occur.  Michael is a perfect fit to evangelize this product as he had a lot of examples from other companies he has worked for over the years, outlining how he was prepared, or at times unprepared, for disasters that hit.  In all honesty it was a great way to start the day, getting a little bit of education rather than just immediately diving into product specifics!

Veeam Availability Console – Clint Wyckoff

Directly after Michael we had Veeam evangelist Clint Wyckoff come in and give us a breakdown on the new release candidate of Veeam Availability Console.  I’ve seen the product before, but like anything Veeam there is always a number of changes in a short time – and it was nice to see the product as it moves into tech preview.  For those that don’t know, VAC is Veeam’s answer to a centralized management solution for those large, dispersed enterprises as well as Veeam Service Providers to manage, deploy, and configure both their Veeam Backup & Replication servers as well as the newly minted Veeam Agents for Microsoft Windows and Linux.

Vanguard support from the top down

One great thing that I like about the Veeam Vanguard program is that it’s not just a “pet project” for the company.  During Vanguard day we were introduced to Danny Allan, VP of Cloud and Alliance Strategy at Veeam.  Danny is our new executive sponsor at Veeam – meaning we have support at the highest levels of the company.  It’s really nice to see a company sink so much support and resources from all roles into a recognition program – one of the many reasons why I feel the Vanguard program is so successful.

Nimble

After lunch we had Nimble come in and brief us on their Secondary Flash Array and the interesting performance enhancements it has when being used with Veeam.  Last year during our Vanguard day we didn’t have any vendor other than Veeam present.  It’s nice to see some of Veeam’s partners and ecosystem vendors reaching out to find some time to talk with us.  Nimble certainly has a great product – and since I’m not sure what all was covered under NDA I’ll simply leave it at that!

AMA with Veeam R&D

Earlier, when I mentioned that one of the biggest benefits of the Vanguard program was access, this is basically what I was referring to.  For the rest of the afternoon we had a no-holds-barred, ask-me-anything session with Anton Gostev, Mike Resseler, Alexy Vasilev, Alec King, Vladimir Eremin, Dmitry Popov, and Andreas Newfert – all Veeam employees who manage, or work very closely with, R&D – deciding which features are implemented, when they get implemented, and basically defining a road map for when these features get inserted into products.  Now this session was definitely NDA as a lot was talked about – but just let me say this was the best and most interesting portion of the whole day!

With so much being under NDA and embargo there isn’t a lot I can tell you about the content – but for those wondering, this is just a brief glimpse of how much access you get into Veeam under the Vanguard label.  Certainly, I encourage you to apply for the program if you wish – you won’t regret it!

Veeam Availability Suite v10 – what we know so far…

Although we got a hint at some of the announcements coming out of VeeamON during partner day on Tuesday, it was really the general session Wednesday morning which brought forth the details surrounding what Veeam has in store for the future.  In true Veeam fashion we see yet more innovation and expansion in their flagship Veeam Availability Suite – covering your data protection needs across all things virtual, physical, and cloud.  So without further ado let’s round up some of what we saw during the Wednesday keynote at VeeamON 2017.

 

Veeam Agent Management

It’s no surprise that as soon as Veeam released their support for protecting Windows and Linux physical workloads, customers and partners all begged for integration into VBR.  Today we are seeing just that, as Veeam has wrapped a very nice management interface around managing backups for both our virtual machines and our physical Windows and Linux workloads.  This not only gives us the ability to manage those physical backups within VBR, but also the ability to remotely discover, deploy, and configure the agents for the physical endpoints as well!

Backup and restore for file shares

Veeam Availability Suite v10 brings with it the ability to back up and restore directly from our file shares.  Basically, those SMB shares can be accessed via a UNC path and their files backed up and protected by Veeam.  Different from Veeam’s traditional restore points though, backup and restore for file shares doesn’t necessarily store restore points, but acts almost like a versioning system instead – allowing administrators to state how many days they would like to version the files, whether or not to keep deleted files, and also specify some long term retention around the files.  This is a pretty cool feature set to be added to v10 and I can’t wait to see where this goes – whether the file share functionality can somehow be mapped to the image level backup and work together to restore complete restore points as well as apply any newer file versions that may exist.

Continuous Data Protection for all

Perhaps some of the most exciting news of all is Veeam’s announcement of support for Continuous Data Protection, allowing enterprises and organizations to drastically lower their RPO to a whopping 15 seconds.  Ever since Veeam hit the market their replication strategy has been to snapshot VMs in order to gain access to CBT data and replicate that across.  That said, we all recognize the pain points of running our infrastructure with the impact of snapshots.  That’s why, with the new CDP strategy set forth by Veeam today, they will utilize VMware vSphere’s APIs for I/O filtering in order to intercept and capture the I/O streams to our VMs and immediately replicate the data to another location.  This to me is a huge improvement on an already outstanding RTPO that organizations can leverage Veeam to achieve.  This is truly groundbreaking for Veeam as we can now, say, have 4 hours of 15-second restore points to choose from.  It’s nice to see a vendor finally take advantage of the APIs set forth by VMware.

vCloud Director Integration into Cloud Connect

Veeam service providers have been providing many customers the ability to consume both backup and replication as a service – allowing customers to essentially ship off their data to them, with the SP becoming the DR site.  That said, it’s always been limited to VMs that live within vCenter and vSphere.  Today Veeam announced support for vCloud Director organizations to also take advantage of the Cloud Connect offering – allowing those running vCloud Director to consume the DR as a Service that Veeam partners have been providing, keeping their virtual datacenters and hardware plans while failing over their environments.

Veeam Availability for AWS

Yeah, you heard that right!  We have seen Veeam hit the market focusing solely on virtualized workloads, slowly moving into the support of physical workloads – and now, supporting the most well-known public cloud – Amazon AWS.  Cloud always introduces risk into an environment, which in turn means that we need something exactly like Veeam Availability for AWS to protect those cloud workloads and ensure our data is always recoverable and available if need be.  In true Veeam fashion, the solution will be agentless.

Ability to archive older backup files

Veeam v10 now brings with it the ability for us to essentially archive off backup files as they age out of our backup policies to some cheaper storage.  Now, we all know that cloud and archive storage is a great solution for this, so guess what – yeah, we now have the ability to create what is called an “Archive Storage” repository, which can live on any type of native object storage, be it Amazon or even your own Swift integration.  This frees up your primary backup storage to handle things such as restores, etc. – while the archive storage can do what it does best – hold those large, lesser-accessed backup files.

Universal Storage Integration API

For the last few VeeamON events the question of who the next storage vendor to integrate into Veeam would be was always on everyone’s mind.  With the announcement of the new Universal Storage Integration APIs the next storage vendor could literally be anyone.  This is basically an API set that allows storage vendors to integrate into Veeam – giving Veeam the ability to control the array, creating, deleting, and managing storage snapshots, allowing customers to lower RTO and RPO without ever leaving the familiar Veeam console.

This honestly just scratches the surface on some of the announcements Veeam has in store for us this week, so stay tuned as there is another keynote tomorrow where I’m sure we will hear more about VBR v10 and also, possibly, some NEW product announcements.  For now, it’s off to some deep dives to learn more about some of these great features!  Thanks for reading!

Veeam Availability Orchestrator – Automation for your Disaster Recovery

As a member of the Veeam Vanguards here at VeeamON 2017 we got to spend a couple of hours with Michael White (@mwVme), who gave us an update on Veeam Availability Orchestrator – Veeam’s answer to orchestrating and automating failover to their replicated VMs.  Michael certainly is a great choice when looking for someone to evangelize this product as he had a number of examples of DR situations he has either helped with, or orchestrated companies through – with both good and bad outcomes!  But back to topic – VAO was announced a while back; in fact, over a year ago Veeam announced their plans for VAO during their “Next big thing” event in April of 2016.  Since then I’ve gotten to see the application move along through various beta stages and was pleasantly surprised to see how the product has matured as they gear up for their 1.0 release (no, I don’t know when that is).

For those not familiar with VAO let me give you a little bit of a breakdown.  VAO is essentially a wrapper, or an engine, that simply interacts with other Veeam products via API calls.  Think Veeam ONE, Veeam Business View, and Veeam Backup & Replication all talking together to one centralized Disaster Recovery orchestration machine.  As far as the architecture goes there really isn’t anything special – it’s a web interface with a SQL backend.  As far as I know the only limitations associated with Veeam Availability Orchestrator are that it is only supported within a VMware environment and that an Enterprise Plus license must be applied to the VBR instance VAO connects to.

So what does VAO do that VBR doesn’t?

Hearing phrases like “testing our replicas” and “using the Virtual Labs” you might be wondering what exactly VAO does that VBR doesn’t.  I mean, we have the SureReplica technology within VBR and it works great at testing whether or not we can recover, so why would we need VAO?  The answer here is really about the details.  Sure, VAO doesn’t re-invent the wheel when it comes to DR testing – why would they force you to reconfigure all of those Virtual Labs again?  They simply import them, along with a lot of information from VBR, to use within VAO.  That said, VAO does much, much more.  From what I’ve seen we can basically break VAO down into three separate components.

Orchestration

VAO takes what you have already set up within VBR and allows you to automate and orchestrate around it.  Meaning, we have already replicated our VMs to a DR location, set up our failover plans and virtual labs, and completed configuration around re-IPing and post-failover scripts to handle our recovery.  VAO takes all of this and adds flexibility into our recovery plans to execute and trigger pre- and post-failover scripts, along with per-VM testing scripts as well.  At the moment we are limited to just PowerShell, however we may see more scripting languages supported come GA time.  Essentially VAO gives us more flexibility in running and triggering external processes during a failover event than what VBR provides on its own.
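Since those per-VM testing scripts are just PowerShell, here is a rough idea of what one could look like.  To be clear, this is only a minimal sketch of my own – the IP address, port, and the pass/fail exit-code convention are assumptions for illustration, not something pulled from the VAO documentation.

# Hypothetical per-VM test script for a failed-over web server.
# The address, port, and exit-code convention below are placeholders/assumptions.
param(
    [string]$VmIp = "192.168.100.25",   # address of the failed-over VM (placeholder)
    [int]$Port    = 443                 # service port we expect to be listening
)

$result = Test-NetConnection -ComputerName $VmIp -Port $Port -WarningAction SilentlyContinue

if ($result.TcpTestSucceeded) {
    Write-Output "Service on ${VmIp}:${Port} is responding - test passed."
    exit 0
}
else {
    Write-Error "Service on ${VmIp}:${Port} did not respond - test failed."
    exit 1
}

The idea being that the orchestration engine can flag the individual VM as failing its test rather than just assuming that “powered on” means “recovered”.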

Automated DR Testing

VAO takes all of this failover orchestration and applies it to our testing environments as well.  By giving us the ability to test, and test often, we as organizations can drastically increase our success rate when a true disaster occurs.  Certainly virtualization has really impacted our ability to test DR plans, in a good way – but there are still a lot of challenges when it comes to performing a true test – VAO closes that gap even more.

Dynamic Documentation

Probably the biggest feature of VAO, in my opinion, is its ability to automatically and dynamically create Disaster Recovery documentation.  DR documentation is often overlooked, left sitting on some file server, stale and never updated.  Environments today are under constant change, and when our production environments change so do our DR requirements.  VAO does a good job at dynamically pulling in any new VMs added, or old VMs removed, and adjusting its documentation accordingly.  In the end we are left with some nicely updated documentation and run books to reference when the time comes that we need them.

All of this said though, to me the true value of VAO really is its ability to focus on the details.  From what I’ve seen VAO does a great job at reporting any warnings, errors or failures as they apply to any DR test or failover event.  Not just on its canned testing scripts (for instance connecting to a mailbox on a failed-over Exchange server), but on our custom-built PowerShell scripts as well.  Without this attention to detail a lot of false positives can be “assumed” during a DR test – leaving us with an inconsistent state during an actual failover event.  VAO, in all of its reporting and messaging, certainly provides a nice mechanism for visibility into each and every VM, and each and every task associated with that VM inside of a failover plan.

We still don’t have a solid release date on VAO but in true Veeam fashion let me give you this estimate – “When it’s ready” 🙂

No vMotion for you! – A general system error occurred: vim.faultNotFound

vMotion is pretty awesome, am I right?  Ever since I saw my first VM migrate from one host to another without losing a beat I was pretty blown away – you always remember your first 🙂  In my opinion it’s the vMotion feature that truly brought VMware to where they are today – it laid the groundwork for all of the amazing features you see in the current release.  It’s something I’ve taken for granted as of late – which is why I was a little perplexed when all of a sudden, for only a few VMs, it just stopped working…

vMotionError

You can see above one of my VMs that just didn’t seem to want to budge!  Thankfully we get a very descriptive and helpful error message of “A general system error occurred: vim.faultNotFound” – you know, because that really helps a lot!  With my Google-Fu turning up no results and coming up empty-handed in forum scouring, I decided to take a step back to the VCP days and look at what the actual requirements of vMotion are – surely this VM is not meeting one of them!  So with that, a simplified version of the requirements for vMotion…

  • Proper vSphere licensing
  • Compatible CPUs
  • Shared Storage (for normal vMotion)
  • vMotion portgroups on the hosts (min 1GbE)
  • Sufficient Resources on target hosts
  • Same names for port groups

Licensing – check!  vCloud Suite

CPU Compatibility – check! Cluster of blades all identical

Shared Storage – check!  LUNs available on all hosts

vMotion interface – check!  Other VMs moved no problem

Sufficient Resources – check!  Lots of resources free!

Same names for port groups – check!  Using a distributed switch.
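If you want to run through that checklist a little faster, a quick PowerCLI sketch along these lines can do some of the spot-checking for you – the “Production” cluster name below is a placeholder, and this assumes you are already connected to vCenter with Connect-VIServer.

# Check each host in the cluster for a vMotion-enabled VMkernel adapter
Get-Cluster "Production" | Get-VMHost | ForEach-Object {
    $vmk = Get-VMHostNetworkAdapter -VMHost $_ -VMKernel | Where-Object { $_.VMotionEnabled }
    "{0}: vMotion-enabled vmkernel ports = {1}" -f $_.Name, ($vmk | Measure-Object).Count
}

# Compare standard vSwitch port group names across hosts (a dvSwitch keeps these consistent for you)
Get-Cluster "Production" | Get-VMHost | ForEach-Object {
    $pgs = (Get-VirtualPortGroup -VMHost $_ -Standard | Select-Object -ExpandProperty Name | Sort-Object) -join ", "
    "{0}: {1}" -f $_.Name, $pgs
}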

So, yeah, huh?

Since I’d already moved a couple dozen other VMs, and the fact that this single VM was failing no matter what host I tried to move it to, I ruled out anything host-related and focused my attention on the single VM.  Firstly I thought maybe the VM was tied to the host somehow, using local resources of some sort – but the VM had no local storage attached to it, no CD-ROMs mounted, nothing – it was the perfect candidate for vMotion, but no matter what I tried I couldn’t get this VM to move!  I then turned my attention to networking – maybe there was an issue with the ports on the distributed switch, possibly having none available.

After a quick glance, there were lots of ports available, but there was another abnormality that reared its ugly head!  The VM was listed as being connected to the switch on the ‘VMs’ tab – however on the ‘Ports’ tab it was nowhere to be found!  So what port was this VM connected to?  Well, let’s SSH directly to the host to figure this one out…

To figure this out we need to run the “esxcli network vm port list” command and pass it the VM’s world ID – to get that, we can simply execute the following

esxcli network vm list

From there, we can grab the world ID of our VM in question and run the following

esxcli network vm port list -w world_id

In my case, I came up with the following…

vmportid

Port 317!  Sounds normal right?  Not in my case.  In fact, I knew for certain from my documentation that the ports on this port group only went up to 309!  So, I had a VM, connected to the port group, on a port that essentially didn’t exist!

How about a TL;DR version?

The problem stemmed from the VM being connected to an essentially non-existent port!  Since I couldn’t have any downtime on this VM, my fix was to simply create another port group on the dvSwitch, mimicking the settings from the first.  After attaching the VM to the newly built port group, then re-attaching it back to the existing one, I was finally attached to what I saw as a valid port, Port #271.

port-fixed

After doing this, guess what finally started working again – that’s right, the wonderful and amazing vMotion 🙂.  I’m sure you could achieve the same result by simply disconnecting and reconnecting the network adapter, however you will experience downtime with that method – so I went the duplicate port group route.
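For anyone who would rather script that duplicate-port-group shuffle, here is a rough PowerCLI sketch of the same idea – the dvSwitch, port group, and VM names are placeholders, and depending on your setup you may need to copy more settings than just the VLAN ID and port count.

# Placeholders - substitute your own dvSwitch, port group, and VM names
$vds    = Get-VDSwitch -Name "dvSwitch01"
$source = Get-VDPortgroup -VDSwitch $vds -Name "VM-Network"

# Create a temporary port group mimicking the original's VLAN configuration (assumes a simple tagged VLAN)
$temp = New-VDPortgroup -VDSwitch $vds -Name "VM-Network-temp" -VlanId $source.VlanConfiguration.VlanId -NumPorts $source.NumPorts

# Flip the VM to the temporary port group, then back to the original to land on a valid port
Get-VM "problem-vm" | Get-NetworkAdapter | Set-NetworkAdapter -Portgroup $temp -Confirm:$false
Get-VM "problem-vm" | Get-NetworkAdapter | Set-NetworkAdapter -Portgroup $source -Confirm:$false

# Clean up the temporary port group once the VM is back where it belongs
Remove-VDPortgroup -VDPortgroup $temp -Confirm:$false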

Where there is one there’s many

All of this got me thinking – this can’t be the only VM that’s experiencing this issue, is it?  I started looking around trying to find some PowerCLI scripts that I could piece together and, as it turns out, knowing what the specific problem is certainly helps with the Google-Fu – I found a blog by Jason Coleman dealing with this exact same issue!  Wish I could’ve found that earlier 🙂.  Anyways, Jason has a great PowerCLI script attached to his post that peels through and detects which VMs in your environment are experiencing this exact problem!  He has even automated the creation of the temporary port groups as well!  Good work Jason!  After running it my suspicions were confirmed – there were about a dozen VMs that needed fixing in my environment.

How or why this occurred I have no idea – I’m just glad I found a way around it and as always, thought I’d share with intention of maybe helping others!  Also – it gave me a chance to throw in some Seinfeld action on the blog!  Thanks for reading!

VCSA 6.5 Migration deployment sizes limited!

Recently I finally bit the bullet and decided to bring the vCenter portion of a vSphere environment up to version 6.5.  Since the migration from a Windows-based vCenter to the VCSA is now a supported path, I thought it would also be a good time to migrate to the appliance as well.  So with that I ran through a few blogs I found regarding the migration, checked out the vSphere Upgrade Guide, and peeled through a number of KBs looking for gotchas.  With my knowledge in hand I headed into the migration.

At this point I had already migrated my external Windows-based PSC to version 6.5 and got started on the migration of the Windows-based vCenter Server.  Following the wizard I was prompted for the typical SSO information along with where I would like to place the appliance.  The problem, though, came when I was prompted to select a deployment size for my new VCSA.  My only options available were Large and X-Large.  That might not be a big deal if this environment actually required that amount of resources – but looking at the table below, those deployment sizes are scoped for 1,000 hosts and above.

DeploymentSize

Did this environment have 1,000+ hosts and 10,000+ VMs?  Absolutely not!  At its largest it contained maybe 70 hosts and a few hundred VMs running on them – a Small configuration at best, Medium if you want to be conservative!  At first I thought maybe I was over-provisioned in terms of resources on my current vCenter Server – but again, it only had 8 vCPUs and 16GB of RAM.  With nothing out of the ordinary with vCenter itself I turned my attention to the database – and that’s where my attention stayed, as it was currently sitting at a size of 200GB.  Honestly, this seemed super big to me, and knowing that it had been through a number of upgrades over the years I figured I would make it my goal to shrink this down as small as possible before trying again!  TL;DR version – the database was the culprit and I did end up with the “Small” option – but I did a number of things after a frenzy of Googling and searching – all listed below…

WAIT!!!!  Don’t be that guy!  Make sure you have solid backups and can restore if things here go sideways – engage VMware GSS if needed – don’t just “do what I do” 🙂

 

Reset the vpx provider

The vpx data provider basically supplies the object cache for vCenter – caching all inventory objects such as hosts, clusters, VMs, etc. in order to provide that super-snappy response time in the vSphere Web Client 6.0 (is this sarcasm?).  Anyways, resetting this will essentially reduce the size of our inventory database.  Now, the problem in versions prior to 5.5 Update 3 is that there was no way to reset individual data providers – in order to do one you had to do them all – and that meant losing all of your tags, storage profiles/policies, etc.  Thankfully, 5.5 U3 and 6.0 allow us to simply reset just vpx, leaving the rest of our environment intact.  In order to do so we must first get into the vSphere Inventory Service Managed Object Browser (MOB) and get the UUID of the vpx provider.  **NOTE: this is a different MOB than the one you may be used to logging into, see below**

First, log into the Inventory Service MOB by pointing your browser to https://vCenterIP/invsvc/mob1/    From there, simply click the ‘RetrieveAllProviderConfigs’ link within the Methods section as shown below

invsvcprovider

In the pop up dialog, click ‘Invoke Method’, then run a search for vpx

vpxprovider

It’s the providerUuid string that we are looking for – go ahead and copy that string to your clipboard and return to https://vCenterIP/InvSvc/mob1/ – this time, clicking the ‘ResetProviderContent’ link under Methods.  In the pop up dialog, paste in your copied UUID and click ‘Invoke Method’ as shown below…

resetcontent

After a little while the window should refresh and hopefully you see no errors!  The reset process took roughly 5 minutes to complete for me…

Getting rid of logs

Although vCenter does its own log rotation, you may want to check just how much space your logs are taking up on your current vCenter Server before migrating, as some of this data is processed during the migration/upgrade.  I freed up around 30GB of disk by purging some old logs – not a lot, but 30GB that didn’t need to be copied across the wire during the migration.  There is a great KB article here outlining the location and purpose of all of the vCenter Server log files – have a look at it, then peruse through your install and see what you may be able to get rid of.  For the Windows version of vCenter you can find all of the logs in the %ALLUSERSPROFILE%\VMware\vCenterServer\logs\ folder.  I mostly purged anything that was gzipped and archived from most of the subfolders within this directory.  Again, not a difference-maker in terms of unlocking my “Small” deployment option – but certainly a time-saver during the migration, and there’s a quick sketch below for sizing up and clearing out those archives.
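A small PowerShell sketch like the one below does the trick – it assumes the default Windows vCenter log location mentioned above and does a dry run first, so only remove -WhatIf once you are happy with what it would delete.

# Default log location for a Windows-based vCenter Server - adjust if yours was relocated
$logRoot = Join-Path $env:ALLUSERSPROFILE "VMware\vCenterServer\logs"

# Report how much space the gzipped/archived logs are consuming
$archived = Get-ChildItem -Path $logRoot -Recurse -Include *.gz -File
"{0:N1} GB in {1} archived log files" -f (($archived | Measure-Object Length -Sum).Sum / 1GB), $archived.Count

# Dry run first - drop -WhatIf to actually delete the archives
$archived | Remove-Item -WhatIf

So what was the culprit that was not allowing me to select “Small”?  Yeah, let’s get to that right now…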

My Bloated vCenter Database

Yeah, 200GB is a little much, right?  Even after resetting the vpx provider and shrinking the database files I was still sitting pretty high!  So, since I had no intention of migrating historical events, tasks and performance data, I thought I’d look at purging it beforehand!  Now if you have ever looked at the tables within your vCenter Server database you will find that VMware seems to create a lot of tables by appending a number to the VPX_HIST_STAT table.  I had a lot of these – and going through them one by one wasn’t an option I felt like pursuing.  Thankfully, there’s a KB that provides a script to clean all of this up – you can find that here!  Go and get the MSSQL script in that KB and copy it over to your SQL Server.  Once you stop the vCenter Server service we can simply run the following command via the command prompt on our SQL Server to peel through and purge our data.

sqlcmd -S IP-address-or-FQDN-of-the-database-machine\instance_name -U vCenter-Server-database-user -P password -d database-name -v TaskMaxAgeInDays=task-days -v EventMaxAgeInDays=event-days -v StatMaxAgeInDays=stat-days -i download-path\2110031_MS_SQL_task_event_stat.sql

Obviously you will need to assign some values to the parameters passed (TaskMaxAgeInDays, EventMaxAgeInDays, & StatMaxAgeInDays).  For these you have a few options.

  • -1 – skips the respective parameter and deletes no data
  • 1 or more – specifies that the data older than that amount of days will be purged
  • 0 – deletes it all!

For instance, I went with 0, making my command look like the following…

sqlcmd -S IP-address-or-FQDN-of-the-database-machine\instance_name -U vCenter-Server-database-user -P password -d database-name -v TaskMaxAgeInDays=0 -v EventMaxAgeInDays=0 -v StatMaxAgeInDays=0 -i download-path\2110031_MS_SQL_task_event_stat.sql

After purging this data, and running a shrink on both my data and log files, I finally had my vCenter database reduced in size – but only to 30GB.  Which, in all honesty, still seemed a bit large to me – and after running the migration process again I still didn’t see my “Small” deployment option.  So I went looking for other large tables within the database and…

Hello VPX_TEXT_ARRAY

It’s not very nice to meet you at all!!!  After finally getting down to this table – and running “sp_spaceused ‘VPX_TEXT_ARRAY’” – I found that it was sitting at a whopping 27GB.  Again, a flurry of Googling!  What is VPX_TEXT_ARRAY and what data does it hold?  Can I purge it?  Well, yes…and no.  VPX_TEXT_ARRAY, from what I can gather, keeps track of VM/host/datastore information – including information regarding snapshots being performed on your VMs.  Also from what I can gather, from my environment anyways, this data exists within this table from, well, the beginning of time!  So, think about backup/replication products which constantly perform snapshots on VMs in order to protect them – yeah, that could cause this table to grow.  Also, if you are like me, and have a database that has been through a number of upgrades over the years, you may end up with quite a bit of data within this table as it doesn’t seem to be processed by any sort of maintenance job.  In my case, 7 million records resided within VPX_TEXT_ARRAY.  Now, don’t just go and truncate that table as it most likely has current data residing in it – data vCenter needs in order to work – there’s a reason it tracks it all in the first place, right?  Instead, we have to parse through the table, comparing the records with those that are in the VPX_ENTITY table, ensuring we only delete items which do not exist.  The SQL you can use to do so is below…

DELETE FROM VPX_TEXT_ARRAY
WHERE NOT EXISTS(SELECT 1 FROM VPX_ENTITY WHERE ID=VPX_TEXT_ARRAY.MO_ID)

A long and boring process – 18 hours later I was left with a mere 9,000 records in my VPX_TEXT_ARRAY table.  Almost 7 million removed.  Just a note: there is a KB outlining this information as well, in which it says to drop to SINGLE_USER mode – you can if you wish, but I simply stopped my vCenter Server service and stayed in MULTI_USER so I could check in from time to time to ensure I was still actually removing records.  An sp_spaceused ‘VPX_TEXT_ARRAY’ in another query window will let you track just that.  Also, it might be easier, if you have the space, to set the initial size of your transaction logs to something bigger than the amount of data in this table.  This allows SQL to not have to worry about growing them as it deletes records – you can always go back at the end and reset the initial size of the tlogs to shrink them back down.
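One more thought – if the transaction log growth worries you, the same cleanup can be done in batches so the log never has to hold the entire multi-million row delete at once.  Below is a rough sketch using Invoke-Sqlcmd from the SqlServer PowerShell module; the server/instance and database names are placeholders, and as above the vCenter Server service should be stopped first.

# Placeholders - point these at your own SQL Server instance and vCenter database
# (Invoke-Sqlcmd uses Windows authentication by default; add -Username/-Password for SQL auth)
$params = @{
    ServerInstance = "SQLSERVER01\VIM_SQL"
    Database       = "VCDB"
    QueryTimeout   = 600
}

# Delete orphaned rows in chunks of 50,000 so the transaction log stays manageable
$batch = "DELETE TOP (50000) FROM VPX_TEXT_ARRAY WHERE NOT EXISTS (SELECT 1 FROM VPX_ENTITY WHERE ID = VPX_TEXT_ARRAY.MO_ID); SELECT @@ROWCOUNT AS Removed"

do {
    $result = Invoke-Sqlcmd @params -Query $batch
    Write-Output ("Removed {0} rows" -f $result.Removed)
} while ($result.Removed -gt 0)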

So – a dozen coffees and a few days later I finally ran another shrink on both the data and log files, setting their initial sizes to 0, and voila – a 3GB database.  Another run at the migration and upgrade and there it was – the option to be “Small”!  Again, this worked in my environment – it may not work in yours – but it might help get you pointed in the right direction!  Do reach out if you have any questions, and do ensure you have solid backups before you attempt any of this, or anything you read on the net really 🙂  Also, there’s always that Global Support Services thing that VMware provides if you want some help!  Thanks for reading!

Spring forward to the Toronto VMUG UserCon

Ahh, Spring – most people describe this as a time where the rain falls and cleans everything up around us – flowers blooming, grass growing – a sign of warmth to come!  In Canada though, it’s a sign of giant muddy snow piles full of gravel, salt and sand from all of the plowing and shoveling performed all winter long – for me, it’s a muddy white dog and two little munchkins tracking muck all over the house.  All that said, there is some hope for Spring this year!  March 23rd marks the date for our next Toronto VMUG UserCon – so, if you want to escape the mud and the muck, come on down to the Metro Toronto Convention Centre this Thursday and join 600+ of your peers for some great learning, technical sessions and some awesome keynotes!  We’ve got a great one planned this year and I just wanted to highlight some of the keynotes and sponsors we have lined up for Thursday!

First up – Mr. Frank Denneman

Over the years we have been lucky enough to have some awesome keynote speakers for our UserCon – this year is no exception!  I’m super excited to hear from Frank Denneman!  If you don’t know who Frank is, let me try and enlighten you a little – this man literally wrote the book on DRS – three times!  The “HA and DRS/Clustering Deepdive” books – written by Frank and his co-author Duncan Epping – are honestly some of the greatest tech books ever.  They are written in a style that is easy to read, and have literally taught me so much about HA and DRS I can’t even begin to explain it all!  Certainly a must-read for any VMware admin.  Frank moved on from VMware for a little while to work at PernixData as CTO and has just recently returned to VMware, taking on the role of Senior Staff Architect within their SDDCaaS Cloud Platform Business Unit.  Frank will be giving a talk titled “A Closer Look at VMware Cloud on AWS”.  With VMware and Amazon recently announcing a partnership allowing us to consume bare-metal ESXi from within the wide range of Amazon’s data centers, this will most certainly be an interesting keynote explaining just how it works – and what we can expect from it in terms of unified management between our on-premises and AWS infrastructure.

The Breakouts and Panels!

After Frank, the morning breakout sessions will kick off – here we will have sessions from a variety of partners and vendors who provide everything from hardware to storage to backup to monitoring.  You will see all of the familiar names here with 30 minute breakout sessions covering their technologies.  Take a look at our sponsors below – without these companies these events wouldn’t be possible!  A round of sessions from VMware follows a couple of rounds of sessions from third-party vendors, then lunch, and an aspiring/VCDX panel talk where you can be sure to get some in-depth answers to any questions you may have about design, architecture, or everyday management of your VMware infrastructure.

Drinks, Food, and DiscoPosse’s

After lunch we have another couple of rounds of breakout sessions by VMware and our sponsors – with a reception following immediately thereafter.  vSphere with Operations Management will sponsor our networking reception, complete with drinks and appetizers – a perfect way to end what I’m sure will be a jam-packed day!  That said, what’s a beer without entertainment, right?  We are super happy to have our own VMUG co-leader Eric Wright (@discoposse) giving our closing keynote for the day!  Think of this a little like the technology version of CBC’s Hometown Heroes segment that they offer on Hockey Night in Canada!  Eric, our own hometown hero, will deliver a jam-packed hour of all things VMware and Terraform, showing us just how easy it is to start automating our infrastructure with the open source software!  I got a sneak peek of this at our last local VMUG meeting and this is something you won’t want to miss!

Free Stuff!

Then, yes, of course, giveaways!  We have some pretty cool prizes this year including cold hard cash (VISA gift cards), GoPros, and the ever popular grand prize of a complete vSphere homelab!  This is on top of all the great giveaways we see from our sponsors!

So if you aren’t busy this Thursday, register now and drop in – we’d love to see you there!  Even if you are busy, cancel everything and come on down!  Can’t make it?  Follow along via Twitter with the hashtag #tovmug.  And hey, we have more meetings coming up as well to help you all get the Toronto VMUG experience.  Our Q2 meeting is May 31st, sponsored by Veeam and Mid-Range, and our Q3 meeting is tentatively September 19th with sponsors Zerto and Tanium (still in development) – come and check us out.  As always, stay connected.  You can follow us on Twitter, connect on LinkedIn, watch our website, or become a member of the Toronto VMUG Community in order to stay up to date on all things tovmug!  See you Thursday!

 

Don’t delay!  Register now for the March 23rd Toronto VMUG UserCon!

 

What to expect from VeeamON 2017

I’ve had the opportunity to attend both the previous VeeamON conferences in Vegas as well as the mini VeeamON forum last year in the UK, and since it’s still a relatively new conference on the scene I thought I’d give everyone a bit of an overview and heads up as to what to expect from the event!  Before going too far into how the event is laid out, let’s first take a look at the logistics.  While I do like Vegas, it tends to get a bit monotonous when it comes to conferences – making them all kind of feel like the same event.  That’s why I was ecstatic to hear that VeeamON 2017 will be held in New Orleans this year, from May 16th through the 18th!  So, as Veeam embarks on its third VeeamON event I thought I might go over a bit of what to expect for those that may be unfamiliar with the backup vendor’s availability event.

Expect A LOT of technical information

With over 80 breakout sessions you can most certainly expect to learn something!  The thing about the breakouts at VeeamON though is their level of technicality.  I’ve been to many breakout sessions at other conferences that tend to be pretty marketing heavy – while VeeamON most certainly has a marketing agenda, the sessions themselves are very technical – with a 100 level being the least technical and a 400 level introducing you to things you never even knew existed!  I can honestly say that I was skeptical when attending my first VeeamON – wondering how they could have so many breakout sessions dealing solely with backup – man, was I wrong!  Veeam B&R is a big application that touches a lot of different aspects of your infrastructure – think repository best practices, proxy sizing, automation, and so on.  This year, with the addition of new products such as O365 backup, the Agents for Linux/Windows, and the many storage integrations with partners, you can bet that there will be plenty of content to be shared.

Expect a smaller, more intimate conference

VeeamON, compared to the bigger conferences, is relatively small.  With roughly 2500 people in attendance last year and over 3000 expected this year, the conference is not as spread out as what you may be used to – which is a good thing!  Honestly, it’s nice being able to keep everything relatively confined to the same space, and even nicer to have no crazy lineups to cross the street at the Moscone.  I found that VeeamON made it very easy to find people – whether you are looking for that person or not.  Meaning, don’t be surprised to accidentally run into some Veeam executives in the hallways – or even the CEO in the elevator 🙂  The atmosphere during the conference days at VeeamON is nice – not so loud that you can’t have a conversation – the solution exchange isn’t overrun with vendors competing to see who has the loudest mic.  It’s a nice, low-key conference which makes it easy to have those valuable hallway conversations that are usually the biggest benefit of any conference.

Expect to learn a little more about the “other hypervisor”

VMworld – the place you go to learn all there is to know about vSphere.  MS Ignite – the place you go to get all your Hyper-V knowledge!  VeeamON – since Veeam B&R supports both vSphere and Hyper-V, you are going to hear a lot about both hypervisors.  You’ll see your typical VMware crowd intermingling with…you know, the other guys, all in support of the product that is protecting their infrastructure.  I’ve written about how the Vanguard program bridges this gap before – and the VeeamON conference is fairly similar in how it brings together the best of both the vSphere and Hyper-V worlds.  As my good friend Angelo Luciani always says, “We are all in this together!”

Expect announcements!

This is a given, right – every vendor-organized conference is always organized around some sort of announcement or product release!  VeeamON 2014 saw the introduction of Endpoint Backup Free Edition, while VeeamON 2015 saw its OS counterpart announced with Veeam Backup for Linux!  All the while lifting the lid on some major enhancements and features in their core product, Veeam Backup & Replication.  So what will we see this year in New Orleans – your guess is as good as mine.  Veeam just recently had a major event where they announced the evolution of the physical Windows/Linux backup products (Veeam Agent for Windows/Linux) into paid versions coupled with the Veeam Backup Console for centralized management of our endpoints – as well, we saw the release of Veeam Backup for O365.  What else is left to announce?  I’m sure we will hear more about v10 and some top secret features from it, but with all of the other new product announcements one might think there is nothing left to release – but, a wise man who worked for Veeam once told me that they have this shelf containing a lot of products and ideas – you never know when they will take something down off of it 🙂

Expect to have ALL your questions answered

Veeam sends a lot of employees, engineers, and tech marketing folks to this conference – and I mean A LOT.  Last VeeamON you couldn’t even walk through the Aria casino without running into at least a half dozen Veeam engineers.  What this means is, if you have questions, VeeamON is the perfect venue to ask them.  I can pretty much guarantee you that they will all be answered – there will be an SME on site dealing in the areas you are having trouble with.  So don’t just make VeeamON all about learning – try and get some of those pain points that have been bugging you for a while firmed up while at the conference.  Everyone is approachable and more than willing to give you a few minutes.

Expect an EPIC party

Sometimes you just have to let go, right?  If you have ever been to a Veeam party at any of the VMworlds, you know that Veeam knows how to do just that!  In fact, I’ve heard more than once Veeam being described as a “drinking company with a backup problem” 🙂  I don’t quite see it as being like that, but certainly you have to agree that Veeam knows how to throw a party and make you feel welcome.  Whether you are just arriving and hitting up the welcome reception or you are attending the main VeeamON party, I know you will have a good time, with good food and good drinks!  Veeam understands that it can’t be all about business all the time – so take the opportunity at the parties to let loose a little and meet someone new!  I’ve made many lifelong friends doing just that!

So there you have it!  Hopefully I’ve helped paint the picture of what VeeamON is like for me and maybe helped you understand it a little more!  I’m super excited for VeeamON in New Orleans this May and I hope to see you there!

Runecast – Proactive performance for your VMware environment – Part 2 – Knowledge Base Articles, Best Practices, and Hardening Guidelines

In part 1 of our Runecast review we took a look at just how quickly we can get Runecast installed and configured within our environment.  We had a brief look at the Runecast dashboard, which highlights any misconfigurations, unapplied knowledge base articles, or non-compliant security settings.  We saw that within just a few minutes we were reporting on all this information from within our environment, and comparing it to up-to-date lists of best practices and hardening guidelines.  With KBs, best practices, and hardening guidelines being at the heart of Runecast, it’s best we take a more in-depth look at how we report on, manage, and resolve them within our environment.  That is exactly what this final part of the review will focus on.

So with all that said, let’s start diving deeper into our test environment to see if we can solve any problems!  As we can see above, I currently have 38 issues that were already detected within my small little lab setup here, broken down into 5 critical, 19 major, and 14 medium.  Clicking on any severity item within the dashboard display will take us directly to a filtered view of our issues list, or we can view all issues by selecting Issues List along the left-hand navigational menu.

runecastissues

By default, our issues appear rolled up – to get more information in regards to the knowledge base article, best practice, or security setting we can click the ‘+’ icon next to our issue as shown above.  As we can see here, Runecast is reporting that we don’t have NTP configured on our ESXi host, falling under the Best Practice category.  Certainly time is an important thing in the world of computing, so I can see why they would flag this as a critical issue.  We can also see after expanding the issue that we have a lot of other information available to us – a more detailed description of the problem, as well as ratings, impact, and a link to any reference material, knowledge base article, or security hardening guide to further explain the issue and how to fix it.  This is very handy to have.  Right from within Runecast we can discover our issues and immediately jump into a document, user guide, or KB article outlining the problems and resolutions.

The ‘Findings’ tab within the expanded issue allows us to view the inventory objects within our environment that the issue applies to – in this case, both of our ESXi hosts.  I should note here that we do not need to first click on an issue to view its associated objects – we can do this in the reverse direction as well by using the Inventory item on the left-hand navigation.  Inventory essentially gets us to the same place, but allows us to browse through our vCenter inventory, selecting a host, cluster, datastore, VM, etc. and displaying just its associated issues.  Either way we get to the same information, just via a couple of different routes.

Another useful tab on this screen is the ‘Note’ tab.  As shown below, we are able to input any notes or information that applies to this issue (or KB/security setting for that matter) that we want.  This can be extremely useful if we have multiple people working within the Runecast environment, or even just as documentation for yourself as to why you are making or not making a certain configuration change.

runecast-issue-notes

In order to clear issues within Runecast we have a couple of options – firstly, and probably the most preferred method, is to simply fix your issue – I’ve since set up NTP on my hosts and no longer see this issue being reported.  That said, as mentioned above there may be times when we have an issue present for a certain reason, especially dealing with the best practices category, like the forged transmits setting above.  For this, we can simply click the ‘Ignore’ link next to an issue and create an object filter as shown below, giving it a name and selecting the objects it applies to.

runecast-issues-ignore

After applying the filter the issue in question will no longer be reported in Runecast.  We can edit or remove this filter at any time by selecting the ‘Filter’ tab from within Runecast’s settings in order to reset anything we may want to.

From within the ‘Configuration Analysis’ section we are able to view our issues in a different fashion.

First up, ‘KBs Discovered’ will show us all of the KBs that have been discovered that apply to our environment.  It does this by parsing the VMware Knowledge Base and pulling down only those KBs which apply to the hardware and software versions we have running within our virtual infrastructure.  As we can see below, we still have the same options as we did within the Issue List screen – we have our link out to the actual VMware KB article, the article is also embedded into Runecast, and we can add notes and choose to ‘Ignore’ certain KBs that may not apply.

runecast-kb

The ‘Best Practices’ and ‘Security Hardening’ sections take somewhat of a different approach as to how they are displayed.  Since best practices and security settings are actual configurations that we can choose to make in our environment, they are displayed in a simple Pass/Fail fashion – passing if we meet the criteria of the practice or security setting, and failing if we do not.  This gives us the ability to quickly answer things such as “How many major items from the security guideline have we implemented?” or “Have we applied all of the ‘critical’ best practices to our environment?”

runecast-bp

As we can see above, we are getting a pass on our NTP settings, as we have already tackled them from the Issues screen.  We are however receiving a fail in terms of Remote TSM, which is essentially having SSH enabled on our hosts.  In my environment this is a known configuration setting, so I would most likely choose to create a filter to ignore this security setting.

The last section of Runecast I want to go over is the Log Analysis section.  Within here we can see that we have another couple of screens we can access – KBs Discovered and Verbose dashboards.  The KBs Discovered section here deals solely with those KBs that specify certain patterns which are visible in the logs, such as KB 2144934, where you can see below the “you see entries similar to…”

runecast-vmware-kb

Nobody likes searching through log files – it’s a long and tedious task.  In this situation, since we are already shipping our logs to Runecast, why not let the analyzer go ahead and comb them for you?  If it finds a pattern that applies to any specific KB article, it will be flagged here.  This allows us to be quite pro-active in nature – alerting us of a KB issue that we may not even know we have.

As far as ‘Verbose Dashboards’ goes, this allows us to quickly get a grasp on all of the events occurring within our log files.  Again, the task of combing through log files and grepping out certain items such as SCSI aborts on the command line can be daunting, not to mention very time consuming.  Here, as shown below, we can do this directly from within the Runecast UI.

runecast-verbose

As you can see, we have a lot of options to filter out the events within the logs to get just the data we are looking for.  For instance, we can specify that we only want to see those log entries flagged as an error and applying only to a certain ESXi host.  We can also define a time period of logs to parse – from predefined settings of the last 1/3/7/30 days to a custom period set up by us if we need to audit a certain event at a certain time.  This is a very useful feature to have within the UI.  Since Runecast already has the log data in order to determine issues, why not give us a screen to analyze the raw data?  I can see this being super useful for things such as searching for certain logins during a specific time period – something that isn’t easy to do sitting within the CLI of an ESXi host.

Runecast really has a very nice product here – it brings a lot of information out of our environment and puts it front and center in a very easy, simple UI.  It’s so easy to set up as well – simply deploy the OVA, point it at our vCenter, and right away we know how our environment stacks up in terms of best practices and security guidelines – as well as having discovered any potential issues we may have, with all of the information on how to fix them.  All of this, in about 5 minutes.  Think about the flip side of this: downloading best practices and the hardening guide and going through each line item one by one, looking up build numbers and then searching through mountains of VMware KBs – not something I want to do.

While other products providing some similar functionality, such as vROps and Log Insight, may bring us more metrics, Runecast instead displays only what we need to see to properly troubleshoot our environment, keeping the UI clean, crisp, and easy to use – aside from that, when compared to vROps, Runecast doesn’t come with the install footprint, nor the price tag, and as far as I know it is the only product on the market which parses and filters out VMware KBs for us.  As far as development goes Runecast isn’t holding back – with a beta version set to be released soon we can see features such as multitenancy being added to the product, as well as a few more undisclosed features set to be released in Q1/Q2 of this year.

Runecast comes with a fully featured, free 30 day trial, but honestly the product gives you valuable information in the first 15 minutes – so 30 days is more than long enough to get your environment up to snuff.  That said, in order to keep your environment running at its peak you will want to consult Runecast often, as we all know how fast best practices and security guidelines can change in our industry.  Runecast automatically adjusts to these changes – ensuring your environment is ALWAYS compliant.  The amount of time Runecast saves you is instantly recognized, and the fact that they are constantly connected to the VMware knowledge base and hardening guides means you are always “in the know” about how your environment is configured according to the “preferred” way – even if your environment changes, or the “preferred” way changes!  If you want to try out Runecast and what it has to offer for yourself, you can do so by signing up for their 30 day trial!  I guarantee you will find something in need of some attention in your environment!

Runecast – Proactive performance for your VMware environment! – Part 1 – Configuration

Have you ever opened up the VMware Hardening Guide and checked your environment against every single item listed?  How about combed through the VMware Knowledge Base looking for all the KB articles that apply to the exact software builds and hardware you have?  No?  How about taken a list of industry best practices and ensured that you are indeed configured in the best possible way?  Of course we haven’t – that would certainly take a lot of time, and most organizations simply don’t have the resources to throw at those types of tasks.  All that said, what if I told you that there was a piece of software that could pretty much instantly tell you whether or not you are compliant in those exact three scenarios?  Interested yet?  I thought you might be…

Enter Runecast

Before writing this review I'd never heard of Runecast, so first, a little bit about the company.  Runecast was founded in 2014 in the quaint ol' city of London in the UK.  Their goal: to provide proactive monitoring of our vSphere environments in order to save us time, prevent outages before they happen, ensure compliance at all times, and simply make our environments more secure.  Now there are only four things listed there – but they are four things that Runecast does really, really well.  With that said, I could talk about how much I enjoyed doing this review forever, but it's best just to jump right in and get monitoring…

Configuration

runecast-addvcenter

As far as installation goes, Runecast comes bundled as a virtual appliance, so it's just a matter of deploying the analyzer into our environment.  To help you get started Runecast offers a 30-day, full-featured free trial that you can try out!  Configuration-wise we really only have a couple of steps to perform: pointing the Runecast Analyzer at our vCenter Server and configuring our ESXi hosts to forward their logs.  After deployment you should be brought to a screen similar to the one shown to the left.  Simply follow the 'Settings' link and enter your required vCenter Server information into Runecast as shown below.

runecast-vcenteradditiondetails

Remember how we mentioned that configuration is divided into two steps?  The first, connecting to our vCenter environment, is now complete.  The second, setting up the forwarding of logs, is completely optional and can be completed at any time.  We can still get valuable data from Runecast without having log forwarding set up, however in order to achieve a more holistic view of our environment we will continue on and set up log forwarding.

There are many ways to set up our ESXi hosts to send their logs to Runecast.  We can set them up manually, use a PowerCLI script, or enter the Runecast Analyzer information into our Host Profile.  The Runecast interface has the smarts to configure this for us as well.  This review will follow the steps to set up log forwarding from within the Runecast Analyzer UI.

Selecting the "Status" section from the Log Analysis group, and then clicking on the 'wrench' icon, will allow us to configure one or many of our hosts to send their log files to Runecast.  This process provides the same results as if we were to go and set the syslog advanced setting directly on the host's configuration – that said, utilizing Runecast for this seems like a much more automated and easier process.   As you can see below, we also have the option to send our VM log files as well, which is a good idea if you are looking for complete visibility into your virtualization stack.

runecast-logging
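
For reference, what Runecast is setting here is essentially the ESXi Syslog.global.logHost advanced option.  If you did want to go the manual or scripted route mentioned above, here's a very rough sketch of setting that option from a vRO JavaScript scriptable task – note that the host input, the Runecast address and port, and the updateOptions call are assumptions on my part, so treat this as a starting point rather than a finished script.

// Rough sketch only – set the syslog target on a single host by hand.
// 'host' is assumed to be a VC:HostSystem input parameter; address and port are placeholders.
var syslogTarget = new VcOptionValue();
syslogTarget.key = "Syslog.global.logHost";
syslogTarget.value = "udp://ip_of_runecast:514";    // assumption: Runecast listening on UDP 514
host.configManager.advancedOption.updateOptions([syslogTarget]);    // assumed vRO wrapper for OptionManager.UpdateOptions
System.log("Syslog target set on " + host.name);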

As far as configuration goes we are now done!  That's it!  Two simple steps and we are ready to start detecting problems within our environment.  The process of going out and collecting data from our vCenter Server is called 'Analyze' within Runecast.  Our analysis can be configured to occur on a schedule by navigating to the settings page (gear icon in the top right) or can be run on-demand by clicking the 'Analyze Now' button from any screen within the application.

runecast-analyze

How long this process takes greatly depends on the size of your environment.  My test environment, simple and small as it is, only took a couple of minutes to gather the data.  I'm sure this time would increase in a 32-host cluster with 1000 or so VMs though.  That said, for the amount of data it gathers and the amount of comparisons going on behind the scenes, Runecast does a very efficient job at processing everything.

Navigating back to the 'Dashboard' as shown below immediately lets us start to explore the results of this analysis process.  Almost instantaneously we can see many issues and best practices that can be applied within our environment.  As you can see below I had a number of issues discovered – and I'd only had Runecast up and running for less than 5 minutes.

runecast-dashboard

Runecast Terminology

Let's take a minute and dig a little into the data that is displayed on the 'Dashboard' screen.  Mostly everything that Runecast monitors and does is rolled up here, giving us an at-a-glance view of everything we need to know.  Let's break down the items that we are seeing here…

Issues – The term “issue” within Runecast basically represents a detected problem in our infrastructure – this can come from any single or combined instance of configuration settings, log file analysis, or software and hardware versions.  Although the source of discovering issues could be from configuration settings or log files, all issues belong to one of three categories within Runecast; Knowledge Base articles, Security Guidelines, or Best Practices, explained below…

KB's – Runecast actively plows through the vast number of VMware Knowledge Base articles and displays to us any that may apply to our environment based on the hardware and software versions and configuration we are running.

Best Practices – All of our inventory objects and configuration items are routinely scanned to determine whether or not they meet any best practices related to VMware.  This allows us to see whether we simply Pass or Fail in terms of having our environment running in its best possible configuration.

Security Compliance – Security Compliance takes all of the items within the official VMware Security Hardening guides and compares them to the configuration of our infrastructure.  At a glance we are able to see how we stack up against the recommended security practices provided by VMware.

It's these four items – Issues, KB's, Best Practices, and Security Compliance – that are at the core of the Runecast analytical engine.  Runecast automatically combs through all of these items and determines which ones apply to our environment, then reports back in a slick, clean UI, allowing us to see whether we are in compliance or not!  In the next part of our review we will go into each of these items in a lot more detail – explaining how to drill down, resolve, and exclude certain metrics from our dashboards.  For now, I certainly recommend checking out Runecast for yourself – as you saw, it's a simple install that can be up and running in your environment very quickly.  So, while you wait for part 2 of the review head on over to the Runecast page and grab yourself a free 30-day trial to start reporting on your environment.  I'm sure you will be surprised at all of the abnormalities and non-compliant configurations you find right off the hop – I know I was!  Stay tuned for part 2.

Automation using the Nakivo API

The Software Defined Data Center – it's everywhere.  You can't go to any big trade show in the IT industry without hearing the phrase "Software Defined X" being tossed around at all of the booths.  Over the last decade or so we have seen software take center stage in our data centers – being the glue that holds everything together.  With this focus on software it's extremely important that companies develop and support APIs within their products.  First, it's our way of taking application x and integrating it with application y.  Second, it's important for the success of the company – without an API organizations may look elsewhere for a solution that provides one, and without an API vendors cannot securely control access into their solutions, leaving customers developing unsupported and faulty applications to get around it.

One big example that shows the benefit of API integrations, and that I always like to use, is the deployment of a VM.   Sure, we use our hypervisor of choice to take our templates and clone VMs from them, providing some sort of automation and orchestration around the configuration of said VM – but the job doesn't simply end there.  We have monitoring solutions we may need to add our VM into, we have IP management tools to integrate with in order to retrieve IPs and DNS information, and most importantly, we have to ensure that our newly created VM is adequately protected in terms of backup and recovery.   With so many hands inside of the data center creating VMs, our backup administrators might not always know a certain VM has been created – and when a failure occurs, there's a pretty good chance we won't be able to recover without any backups – so it's this situation we will look at today…

Automatically protecting our VMs

Our software of choice today will be Nakivo Backup and Replication – a company based out of Silicon Valley providing data protection solutions.   Nakivo provides full API integration into their backup suite, allowing administrators and developers to create automation around the creation, modification, and removal of jobs.  The scope of our integration will be as follows – let's create a simple vRealize Orchestrator workflow that will allow us to right-click a VM from within the vSphere Web Client and add that VM into an already existing backup job.  From here I'll let your imagination run wild – maybe you integrate this code into your VM deployment workflow to automatically protect it on creation – the point is that we have a starting point to look at the possibilities of consuming Nakivo's API and creating some automation within your environment for backup and recovery.

nakivoapi-apidoc

A little about the Nakivo API

Before we get into the actual creation of the vRO workflow it's best we understand a little bit about the Nakivo API itself.  Nakivo provides an API based around JSON content – so all of our requests and responses will be formatted as JSON.  These requests all go through using POST, and are always sent to the /c/router endpoint (i.e. https://ip_of_nakivo:4443/c/router).  As far as authentication goes Nakivo utilizes cookie-based authentication – what this means is that our first request will be sent to the login method, upon which we will receive a JSESSIONID which we will have to pass with every subsequent request in order to secure our connection.  As we can see from the example request below, they need to be formatted in such a way that we first specify an instance (e.g. AuthenticationManagement, BackupManagement, InventoryManagement, etc.) and a method (e.g. login, saveJob, getJob, etc.).  From there we attach the data associated with the method and instance, as well as a transaction id (tid).  The transaction id can utilize an auto-incrementing integer if you like, or can simply be set to any integer – its main purpose is to group multiple method calls into a single POST, which we won't be doing anyways, so you will see I always use 1.

var requestJSON = "{'action': 'AuthenticationManagement','method':'login','data': ['admin','VMware1!',true],'type': 'rpc','tid': 1}";

Above we show an example of a login request in JavaScript, because this is the language of choice for vRealize Orchestrator, which we will be using – but do remember that you could use PHP, Java, PowerShell – whatever language you want, so long as you can form an HTTP request and send JSON along with it.
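
Just to illustrate that point, here's a minimal sketch of the very same login call made outside of vRO using plain Node.js – the appliance address and credentials are placeholders, and certificate verification is disabled purely because a lab appliance typically presents a self-signed certificate.

// Minimal Node.js sketch of the Nakivo login call – address/credentials are placeholders.
const https = require("https");

const body = JSON.stringify({
  action: "AuthenticationManagement",
  method: "login",
  data: ["admin", "VMware1!", true],
  type: "rpc",
  tid: 1
});

const req = https.request({
  host: "ip_of_nakivo",
  port: 4443,
  path: "/c/router",
  method: "POST",
  headers: { "Content-Type": "application/json", "Content-Length": Buffer.byteLength(body) },
  rejectUnauthorized: false   // self-signed certificate on the appliance
}, (res) => {
  // The Set-Cookie header carries the JSESSIONID needed on every subsequent request
  console.log("Session cookie:", res.headers["set-cookie"]);
});

req.write(body);
req.end();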

On with the workflow

Before diving right into the code it’s best to take a look at the different sections or components that we will need to run through in order to add a given VM to a Nakivo job through vRealize Orchestrator.  With that said we can break the process down into the following sections…

  • Add Nakivo as an HTTPRest object within vRO
  • Create workflow w/ VM as an input object and the Nakivo HTTPREST as an argument
  • Create some variables in regards to our VM (IE Name, Cluster, etc)
  • Login to Nakivo to retrieve session
  • Retrieve our target job
  • Find the VM's Cluster ID within Nakivo (the Cluster ID is required in order to find the actual VM within Nakivo)
  • Gather VM information from within Nakivo
  • Gather information about our repository from within Nakivo
  • Build JSON request and add VM to job

With our workflow broken down into manageable chunks let’s go ahead and start coding

Add Nakivo as an HTTPRest object.

If you have ever worked with the HTTP-REST plugin within vRO then this will seem like review – however for those that haven't, let's take a look at the process of getting this set up.  From within the workflow view simply run the 'Add a REST host' workflow located under the HTTP-REST/Configuration folders.  As far as parameters go, simply give the host a name, use https://ip_of_nakivo:4443 as the URL, and be sure to select 'Yes' for certificate acceptance as shown below

nakivoapi-addhost

The remaining steps are somewhat irrelevant as far as adding Nakivo as a REST host within vRO goes – for authentication I selected Basic and provided the credentials for Nakivo.  This really doesn't matter as we are going to use cookie/header-based authentication through our code anyways, however something needs to be selected and inputted within vRO.  After clicking submit the NakivoAPI REST host should be added to our vRO inventory.
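
As a side note, if you would rather script this registration than run the configuration workflow, the HTTP-REST plugin also exposes scripting objects that can do the same thing.  The rough sketch below is based on that scripting API – the object and method names (RESTHostManager, RESTAuthenticationManager and friends) are assumptions worth double-checking against your plugin version before relying on them.

// Sketch: register Nakivo as a REST host from a scriptable task instead of the
// 'Add a REST host' workflow.  Object/method names assumed from the HTTP-REST plugin.
var restHost = RESTHostManager.createHost("NakivoAPI");
restHost.url = "https://ip_of_nakivo:4443";
// Basic credentials are only placeholders here – the workflow itself authenticates via cookies later on.
restHost.authentication = RESTAuthenticationManager.createAuthentication("Basic", ["Shared Session", "admin", "VMware1!"]);
var added = RESTHostManager.addHost(restHost);
System.log("Added REST host: " + added.name);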

Workflow creation

As far as the workflow goes I’ve tried to keep it as simple as possible, requiring only 1 input attribute and 1 input parameter as follows

  • Input Attribute (Name: NakivoAPI – Type: RESTHost – Value: set to the Nakivo object created earlier)

Nakivoapi-attribute

  • Input Parameter (Name: sourceVM – Type: VC:VirtualMachine )

nakivoapi-parameter

Code time!

After this simply drag and drop a scriptable task into the Schema and we can get started with the code!  I've always found it easier to simply display all the code and then go through the main sections by line afterwards.  As far as the JavaScript we need, you can find it below…

var vmName = sourceVM.name
var cluster = System.getModule("com.vmware.library.vc.cluster").getComputeResourceOfVm(sourceVM);
var clusterName = cluster.name;
 
// login and retrieve sessionID
var requestJSON = "{'action': 'AuthenticationManagement','method':'login','data': ['admin','VMware1!',true],'type': 'rpc','tid': 1}";
var request = NakivoAPI.createRequest("POST", "/c/router", requestJSON);
request.setHeader("Content-Type","application/json");
var response = request.execute();
var headers = response.getAllHeaders();
var cookie = headers.get("Set-Cookie");
 
// retrieve target job
requestJSON = "{'action': 'JobManagement','method':'getJob','data': [1],'type': 'rpc','tid': 1}";
request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
request.setHeader("Content-Type","application/json");
request.setHeader("Cookie", cookie);
response = request.execute();
var jsonResponse = JSON.parse(response.contentAsString);
var job = jsonResponse.data;
 
// find clusterID
requestJSON = "{'action': 'InventoryManagement','method':'collect','data': [{'viewType':'VIRTUAL_ENVIRONMENT'}],'type': 'rpc','tid': 1}";
request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
request.setHeader("Content-Type","application/json");
request.setHeader("Cookie", cookie);
response = request.execute();
jsonResponse = JSON.parse(response.contentAsString);
 
// reduce to datacenters
var vcenter = jsonResponse.data.children[0];
var datacenters = vcenter.children;
var datacenter;
var cluster;
for ( var p in datacenters)
{
	for (var c in datacenters[p].children)
	{
		if (datacenters[p].children[c].name == clusterName)
		{
			cluster = datacenters[p].children[c];
		}
	}
}
var clusterid = cluster.identifier;
 
// look in cluster for VM info...
requestJSON = "{'action': 'InventoryManagement','method':'list','data': [{'nodeType':'VMWARE_CLUSTER','nodeId': '" + clusterid + "','includeTypes': ['VM'] }],'type': 'rpc','tid': 1}";
request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
request.setHeader("Content-Type","application/json");
request.setHeader("Cookie", cookie);
response = request.execute();
jsonResponse = JSON.parse(response.contentAsString);
var vms = JSON.parse(response.contentAsString);
vms = vms.data.children;
var vm;
for (var p in vms)
{
	if (vms[p].name == vmName)
	{
		vm = vms[p];
	}
}
 
// get more info on VM
requestJSON = "{'action': 'InventoryManagement','method':'getNodes','data': [true, ['"+ vm.vid + "']],'type': 'rpc','tid': 1}";
request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
request.setHeader("Content-Type","application/json");
request.setHeader("Cookie", cookie);
response = request.execute();
var vminfo = JSON.parse(response.contentAsString);
vminfo = vminfo.data.children[0];
var vmdisk = vminfo.extendedInfo.disks[0].vid;
 
// get target storage
requestJSON = "{'action': 'InventoryManagement','method':'list','data': [{'includeTypes': ['BACKUP_REPOSITORY'] }],'type': 'rpc','tid': 1}";
request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
request.setHeader("Content-Type","application/json");
request.setHeader("Cookie", cookie);
response = request.execute();
jsonResponse = JSON.parse(response.contentAsString);
var targetVid = jsonResponse.data.children[0].vid;
 
//build data portion of JSON to add VM to job
var jsonSTR = '{ "sourceVid": "' + vminfo.vid + '","targetStorageVid": "' + targetVid + '","mappings": [{"type": "NORMAL","sourceVid": "' + vmdisk + '"}], "appAwareEnabled": false}';
var json = JSON.parse(jsonSTR);
 
//push new object to original job
job.objects.push(json);
System.log(JSON.stringify(job));
 
// let's try and push this back in now....
requestJSON = "{'action': 'JobManagement','method': 'saveJob', 'data': [" + JSON.stringify(job) + "],'type': 'rpc','tid': 1}";
request = NakivoAPI.createRequest("POST","/c/router",requestJSON);
request.setHeader("Content-Type","application/json");
request.setHeader("Cookie", cookie);
response = request.execute();
 
// done!!!!

Lines 1-3 – Here we simply set up a few variables we will need later on within the script: vmName, which is assigned the name attribute of our input parameter sourceVM, as well as going out and running a built-in action to get the name of the cluster that the VM belongs to.  Both of these variables will be needed when we attempt to gather all of the information we need to add the VM to the backup job.

Lines 5-11 – This is our request to log in to Nakivo.  As you can see we simply create our request and send the login method, along with the associated login data, to the AuthenticationManagement interface.  This request basically authenticates us and sends back the JSESSIONID that we need in order to make subsequent requests, which we store in a cookie variable on line 11.

Lines 13-20 – Again, we make a request to Nakivo, this time to get the job that we want to add the VM to.  I only have one job within my environment so I've simply utilized the getJob method and sent a data value of 1 (the job ID), since I know that is my one and only job ID in the system.  If you don't know the job ID you may need to write a similar request to find it first – Nakivo does provide methods within their API to search for a job ID by the job name.  Also note, since this is a subsequent request after a login we are sending our cookie authentication data on line 17 – and we are taking our response data and storing it in a variable named job on line 20, which we will need later when we update the job.

Lines 22-45 – This is basically a request to the InventoryManagement interface that we can use to find out the ID (as it exists within Nakivo) of the cluster housing the virtual machine.  First, on line 23 we build a request that basically returns our complete virtual infrastructure inventory, which we then parse through on lines 35-44 looking for a match on our cluster name.  I've had to loop through data centers as my test environment contains more than one virtual data center.  Finally, on line 45 we simply assign the clusterid variable the Nakivo identifier of the cluster.

Lines 47-73 – Here we use our cluster identifier and basically list out the VM inventory within it.  After looping through, when we find a match on our VM name we simply assign it to a vm variable.  We then, on line 66, send a request to the InventoryManagement interface again, this time looking at the virtual machine level and sending the identifier of our newly discovered VM.  Once we have the response we assign the identifier of the VM's disk(s) to a variable on line 73.  Again, I know this environment and I know the VM only contains one disk, so I've hard-coded my index – if it was unknown, or truly automated, you would most likely have to loop through the disks here to get your desired output; a quick sketch of what that loop might look like follows below.
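
As mentioned above, here's a rough sketch of looping through the disks rather than hard-coding disks[0] – it simply reuses the extendedInfo.disks structure and the mapping format we already use further down in the script.

// Sketch: build one mapping entry per disk instead of hard-coding disks[0]
var mappings = [];
for (var d in vminfo.extendedInfo.disks)
{
	mappings.push({
		"type": "NORMAL",
		"sourceVid": vminfo.extendedInfo.disks[d].vid
	});
}
// 'mappings' can then be dropped into the job object in place of the single-disk entry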

Lines 75 – 82 – This block of code is used to get the identifier of the target storage, or repository within Nakivo.  Again we need this information for our final request that will add the VM to the job – and again, this is a known environment so I could simply hard code my array index on line 82 to return the proper repository (as there is only one).
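
Along the same lines, if you had more than one repository you could select the right one by name rather than grabbing index 0 – the sketch below assumes that repository nodes expose a name field like the other inventory nodes do, so verify that against your own response data first.

// Sketch: pick a repository by name instead of hard-coding index 0
var repoName = "Onboard repository";	// hypothetical repository name – use your own
var repos = jsonResponse.data.children;
var targetVid;
for (var r in repos)
{
	if (repos[r].name == repoName)
	{
		targetVid = repos[r].vid;
	}
}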

Lines 84 – 90 – Here we are simply building out the JSON variable that we need in order to push all of the information we have previously gathered above.  We basically form our string on line 85, convert it to JSON directly after, and push it into the original job variable we set on line 20.

Lines 92-99 – Ah, finally – this block takes all of our hard work and pushes the job back into the saveJob method of the Nakivo JobManagement interface.  Once executed, you should see your job info within Nakivo update, reflecting the new VM added to the job.

So there you have it! A completely automated way of selecting a VM within vRealize Orchestrator and adding it to a Nakivo backup job – all without having to open up the Nakivo UI at all!

But wait, there’s more!

Ahh – we eliminated the need to open up the Nakivo UI, but how about eliminating the Orchestrator client as well and simply executing this job directly from within the vSphere Web Client – sounds like a good idea to me!  If you have properly integrated vRO and vSphere – and I say properly because it can sometimes be difficult – then doing this is a pretty easy task.

Within the ‘Context Actions’ tab on our vRO configuration within the web client simply click ‘+’ to add a new action.  As shown below we can simply browse our workflow library and select our newly created Nakivo workflow and associate that with the right-click context menu of a virtual machine.

nakivoapi-context

What we have essentially done now is allow our administrators to simply right-click on a VM, browse to 'All vRealize Orchestrator Actions', and click on our workflow name.  From there the vRO workflow will take the associated VM (the one we right-clicked on) and assign it to our sourceVM parameter – meaning we've taken the complete process of logging into Nakivo, editing our backup job, adding a new VM, and saving it, and converted it into a simple right-click followed by a left-click – without having to leave the vSphere Web Client!

nakivoapi-rightclick

All in all this is a pretty basic example of some of the things we can do with the Nakivo API – and it followed a pretty simple and stripped-down workflow – but the point is that Nakivo offers a wide variety of methods and integration points into their product.  Pretty much anything you can do within the GUI can be performed by making calls to the API.  This is what helps a product integrate into the Software Defined Data Center – and what allows administrators to save time and provide consistency, all the while ensuring our data is protected.  Nakivo also has a wide variety of documentation, as well as a Java SDK built around their API, complete with documents and explanations around all of the interfaces provided.  If you are interested in learning more about Nakivo's API, or Nakivo's products in general, head on over to their site here – you can get started for the low cost of free!  Until next time, happy automating!