Monthly Archives: November 2016

Lessons learned from #vDM30in30

Phew!  I’m not sorry to say that #vDM30in30 is over with!  Not to say it wasn’t a lot of fun, but honestly, it’s a lot of work – especially when juggling family, travel, the day job and all!  One might think that simply blasting out 30 pieces of content in 30 days would be relatively easy – but it’s not!  That said, I learned a lot about my writing process and style during this challenge, and as my final – and, unfortunately, only 28th – post of the month I’d like to share those lessons with you…

The challenge of topics

It’s not easy coming up with topics to write about, especially when writing so often.  I was lucky enough to have had a handful of ideas already sitting in my draft folders – and #vDM30in30 finally gave me the opportunity to write about them.  That said, I know I thought of more throughout the month and simply forgot to write them down.  So whatever your means of tracking ideas is (drafts, post-its, bullet journals), write them down!  I found that if I didn’t commit an idea to something I would forget it.  Needless to say I have a dozen or so topics just sitting in my drafts now – which leads me to the next challenge…

The challenge of time

This is probably the biggest hurdle of all – finding the time to articulate yourself and get a blog post written.  I find that this varies for me – for some topics I’ll simply start writing and have a complete post hashed out in an hour or so.  For others I find myself having to go do research, read other blogs and whitepapers, trying to fully understand what I’m writing about 🙂  Those are the ones that sometimes take days – 10 minutes here and there, revisiting the same ol’ things.  For me it’s best to dedicate all the time I need to write the post in one sitting – otherwise I have a hard time reading my own writing once I revisit the post.  That said, time is a tricky thing to find – we have families, commitments, other things we need to take care of – so I was always critiquing how I spent my time.  If I was watching a Habs game I would try to at least do something “blog productive” while doing so.  Those endless hours on an airplane – perfect for editing and getting things ready!  My advice here: just use your time wisely and don’t sacrifice the things you love the most just to write a blog post – the kids will eventually go to sleep – do it then 🙂

The challenge of writing

Perhaps this is the oddest hurdle to overcome.  Sometimes the words just come; other times I struggle trying to explain myself.  There were times where, even though I knew I would have a hard time coming back to complete a post, I simply had to walk away.  If you are burnt out, nothing will make sense.  Take breaks, either small or large – we are all different, so just find what works for you.  For me, that was walking…

So I’m happy to say that even though I was two shy of the infamous thirty – I did learn some things about my writing process and styles.  With that said, here’s a look at what I accomplished throughout the month of November on mwpreston.net.

Tech Field Day 12 Stuff

My favorite Veeamy things…

Other vendor stuff

My Friday Shorts

Randoms

So there you have it!  Thanks all for following along and reading and I hope to participate next year as well.  All that said, don’t expect a post per day to continue here – I need some sleep!

The Atlas File System – The foundation of the Rubrik Platform

One of the core selling points of the Rubrik platform is the notion of something called “unlimited scale” – the ability to start small and scale as large as you need, all the while maintaining their masterless deployment!  Up until a few weeks ago I was unaware of how they actually achieved this, but after witnessing Adam Gee and Roland Miller present at Tech Field Day 12 in San Jose I have no doubts that the Atlas file system is the foundation upon which all of Rubrik is built.


Rubrik’s architecture diagram shows how the platform is laid out – with the Atlas file system sitting at the core of the product and communicating with nearly every other component in the Rubrik platform.  Now picture each node containing exactly this same layout, scaling up to whatever number of nodes you might have – each node containing its own Atlas file system, with its own local applications accessing it – yet the storage is distributed and treated as one scalable blob of storage addressable by a single global namespace.

Disclaimer: As a Tech Field Day 12 delegate all of my flight, travel, accommodations, eats, and drinks are paid for. However I did not receive any compensation nor am I required to write anything in regards to the event or the presenting companies. All that said, this is done at my own discretion.

Atlas – a distributed scalable file system.

Other core modules such as Callisto, Rubrik’s distributed metadata store, and the Cluster Management System all leverage Atlas under the hood – and in turn Atlas utilizes them for some of its own functions.  For instance, to make Atlas scalable it leverages data from the Cluster Management System to grow and shrink – when a new brik is added, Atlas is notified via the CMS, at which point the capacity from the new nodes is added to the global namespace, increasing the total capacity available as well as the flash resources to consume for things such as ingest and cache.  It should also be noted that Atlas takes care of data placement as well, so adding a new node to the cluster will trigger a re-balance.  However, it’s got the “smarts” to process this as a background task and take into account all of the other activity occurring within the cluster, which it learns about from the Distributed Task Framework – meaning we won’t see a giant performance hit directly after adding new nodes or briks, thanks to the tight integration between all of the core components.

Adding disk and scaling is great; however, the challenge for any distributed file system is how to react when failures occur, especially when dealing with low-cost commodity hardware.  Atlas replicates data in a way that provides for failure at both the disk level and the node level, allowing 2 disks, or 1 full node, to fail without experiencing data loss.  How Atlas handles this replication depends on the version of Rubrik in your datacenter today.  Pre-3.0 releases used mirroring, which essentially triple-replicated our data across nodes.  Although triple replication is a great way to ensure we don’t experience any loss of data, it does so at the expense of capacity.  The Firefly release, 3.0 or higher, implements a different replication strategy via erasure coding.  By its nature, erasure coding takes the same data that we once would have replicated three times and splits it into chunks – the chunks are then processed and additional coded chunks are created which can be used to rebuild the data if need be.  It’s these chunks that are intelligently placed across disks and nodes within our cluster to provide availability.  The short of the story here is that erasure coding gives us the same benefit as triple replication without the cost of consuming triple the capacity – therefore more space will be available within Rubrik for what matters most, our data.
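To put some rough numbers on that: with triple mirroring, 1 TB of backup data eats roughly 3 TB of raw capacity.  With a hypothetical 4+2 erasure coding layout (Rubrik didn’t share their exact chunk geometry, so treat these numbers purely as an illustration), that same 1 TB gets split into four data chunks plus two coded chunks spread across disks and nodes – roughly 1.5 TB of raw capacity consumed, while still tolerating the loss of any two chunks.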

Aside from replication, Atlas employs other techniques to keep our data available as well – items such as self-healing and CRC detection allow Atlas to throw away and repair data as it becomes corrupt.  Now these are features we expect to see within file systems, but Atlas can handle them a little differently due to its distributed architecture.  The example given was with three briks, each containing four nodes – when a node fails, or data becomes corrupt, Atlas actually repairs the data on a surviving node within the same brik, ensuring we are still spread out across briks.  If a brik happens to fail, a chunk of data may then be required to live on the same brik as another, but it would be placed on another node, still allowing for node failure.  It’s this topology-aware placement that really allows Rubrik to maximize data availability and provide protection not only across nodes within a brik, but across brik failures as well, maximizing the failure tolerance guarantees they are providing.

Perhaps the most interesting parts of Atlas, though, are how it exposes its underlying functions and integration points to the applications running on top of it, the Rubrik applications.  First up, the meat of Rubrik’s solution: mounting snapshots for restore/test purposes.  While all of our backup data is immutable, meaning it by no means can be changed in any way, Atlas does leverage a “redirect on write” technique in order to mount these backups for test/dev/restore purposes.  What this means is that when a snapshot is requested for mount, Atlas can immediately assemble the point in time using incremental pointers – no merging of incrementals into full backups or data creation of any kind – the full VM at that point in time is simply presented.  Any writes issued to this VM are redirected, or written elsewhere and logged – thus not affecting the original source data whatsoever, all the while allowing the snapshot to be written to.

Atlas also exposes a lot of its underlying functionality to applications in order to improve performance.  Take for instance the creation of a scratch or temporary partition – if Rubrik needs to instantiate one of these it can tell Atlas that the data is indeed temporary – thus, Atlas doesn’t need to replicate the file making up the partition at all, as it doesn’t require protection and can simply be tossed away when we are done with it.  And that tossing away, the cleaning up after itself, can also be set from an application level.  In that same example we could simply set a TTL or expiry on our scratch file and let the normal garbage collection maintenance job clean it up during its regular run, rather than wasting time and resources having the application make second or third calls to do it.  Applications can also leverage Atlas’s placement policies, specifying whether files or data should be placed on SSD or spinning disk, or even specifying that said data should be located as close as possible to other data.

So as you can see, although Rubrik is a very simple and easy policy-based, set-and-forget type of product, there is a lot of complexity under the hood.  Complexity that is essentially abstracted away from the end-user, but available to the underlying applications making up the product.  In my mind this paves the way for a quick development cycle – being able to leverage the file system for all it’s worth while not having to worry about “crazy” configurations customers may have.  We have certainly seen a major influx of custom-built file systems entering our data centers today – and this is not a bad thing.  While the “off the shelf”, commodity-type play may fit well for hardware, the software is evolving – and this is evident in the Rubrik Atlas file system.  If you want to learn more definitely check out their Tech Field Day 12 videos here – they had a lot more to talk about than just Atlas!

VembuHIVE – A custom built file system for data protection

Virtualization has opened many doors in terms of how we treat our production environments.  We are now vMotioning or Live Migrating our workloads across a cluster of hosts, cloning workloads with ease, and deploying new servers into our environments at a very rapid rate.  We have seen many advantages and benefits to the portability and encapsulation that virtualization provides.  For a while, though, our backups were treated the same as always – simply copies of our data sitting somewhere else, only being utilized during those situations when a restore was required.  That said, over the past 5 years or so we have seen a shift in what we do with our backup data as well.  Sure, it’s still primarily used for restores, both on a file and image level – but backup companies have begun to leverage that otherwise stale data in ways we could only imagine.  We see backups being used for analytics, compliance, and audit scans.  We see backups now being used in a devops nature – allowing us to spin up isolated, duplicate copies of our data for testing and development purposes.  We have also seen the ‘restore’ process dwindling away, with the “instant” recovery feature taking its place, powering up VMs immediately from within the deduplicated and compressed backup files, drastically decreasing our organizations’ RTO.

So with all of this action being performed on our backup files, a question of performance comes into play.  No longer is it OK to simply store our backups on a USB drive formatted with a traditional file system such as FAT or NTFS.  The type of data we are backing up – modern virtualization disk images such as VHDX and VMDK – demands something more from the file system it’s living on, which is why Vembu, a data protection company out of India, has developed its own file system for storing backups: the VembuHIVE.

Backups in the HIVE

When we hear the name VembuHIVE we can’t help but turn our attention towards bees – and honestly, they make the perfect comparison for how the proprietary file system from Vembu performs.  A bee hive at its most basic is the control center for bees – a place where they all work collectively to support themselves and each other – the hive is where the bees harvest their magic, organizing food, eggs, and honey.  The VembuHIVE is the central point of storage for Vembu’s magic, storing the bits and controlling how files are written, read and pieced together.  While VembuHIVE can’t produce honey (yet), it does produce data.  And it’s because of the way that VembuHIVE writes and reads our source data that we are able to mount and extract our backups in multiple file formats such as ISO, IMG, VMDK and VHDX – in a near-instant fashion.

In essence, VembuHIVE is like a virtualized file system overlaid on top of your existing file system that can utilize utilities that mimic other OS file systems – I know that’s a mouthful but let’s explore that some more.

Version Control is key

In my opinion the key characteristic that makes VembuHIVE run is version control – where each and every file produced is accompanied by metadata controlling what version, or point in time, the file is from.  Probably the easiest comparison is to that of GIT.

We all know of GIT – the version control system that keeps track of changes to our code.  GIT solved a number of issues within the software development ecosystem.  For instance, instead of copying complete projects before making changes we could simply branch on GIT – which would basically track changes to source code and store only those lines which have changed – allowing us to easily roll back to any point in time within our code, reverting and redoing any changes that were made.  This is all done by only storing changes and creating metadata to explain those changes – which in the end gives us a very fast way to revert to different points and fork off new ones, all the while utilizing our storage capacity in the most efficient way possible.

VembuHIVE works much in the same way as GIT, however instead of tracking source code we are tracking changed blocks within our backup files – allowing us to roll back and forward within our backup chain.  Like most backup products Vembu will create a full backup during the first run, and subsequently utilize CBT within VMware to copy only changed blocks during incremental backups.  That said, the way it handles and intelligently stores the metadata of those incremental backups allows Vembu to present any incremental backup as what they call a virtual full backup.  Basically, this is what allows Vembu BDR to expose our backups, be they full or incremental, in various file formats such as VMDK and VHDX.  This is done without performing any conversion on the underlying backup content, and in the case of incremental backups there is no merging of changes into the previous full backup beforehand.  It’s simply an instant export of our backups in whatever file format we choose.  I mention that we can instantly export these files, but it should be noted that these point-in-time backups can be instantly booted and mounted as well – again, no merge, no wait time.

VembuHIVE also contains most of the features you expect to see in a modern file system.  Features such as deduplication, compression and encryption are all available within VembuHIVE.  As well, VembuHIVE contains built-in error correction on top of all of this.  Every data chunk within the VembuHIVE file system has its own parity file – meaning when data corruption occurs, VembuHIVE can reference the parity file in order to rebuild or repair the data in question.  Error correction within VembuHIVE can be performed at many levels as well, protecting data on a disk-image, file, chunk or backup-file basis – I think we are covered pretty well here.

Finally, we’ve mentioned a lot that we can instantly mount and export our VMs on a per-VM basis, however the intelligence and metadata within the VembuHIVE file system go way beyond that.  Aside from exporting as VMDKs or VHDXs, VembuHIVE understands how content is organized within the backup file itself – paving the way for instant restores at an application level – think Exchange and Active Directory objects here.  Again, this can be done instantly, from any restore point at any point in time, without performing any kind of merge process.

In the end VembuHIVE is really the foundation of almost all the functionality that Vembu BDR provides.  In my opinion Vembu has made the correct decision by architecting everything around VembuHIVE and by first developing a purpose-built, modern file system geared solely at data protection.  A strong foundation always makes for a strong product and Vembu has certainly embraced that with their implementation of VembuHIVE.

Friday Shorts – VeeamON, Storage Protocols, REST, and Murica!

“If that puck would’ve crossed the line Gord, that would’ve been a goal!” – Pierre McGuire – A Mr Obvious, annoying hockey commentator that drives me absolutely insane! (Sorry, watching the Habs game as I put all this together :))

Jambalaya and Backups – Get there!

Veeam had some big announcements this year along with a slew of releases of new products, betas and big updates to existing products.  With all that, we can only assume that VeeamON, the availability conference focused on the green, is going to be a big one!  This year it takes place May 16-18 in New Orleans – a nice break from the standard Vegas conferences!  I’ve been to both VeeamON conferences thus far and I can tell you that they are certainly worth it – all of Veeam’s engineers and support are there, so if you have a question, yeah, it’ll get answered and then some!  So, if you can go, go!  If you can’t, if it’s a money thing – guess what???  Veeam’s raffling off 10, yes 10, fully paid (airfare, hotel, conference) trips over the holidays – so yeah, go sign up!

But we have a REST API?

Although this post by John Hidlebrand may be a month old, I just read it this week and it sparked some of my own inner frustrations that simmer around deep inside me 🙂  John talks about how having a REST API is just not enough at times – and I completely agree!  I’m seeing more and more companies simply state, oh yeah, we have a REST API, we are our first customer!  That’s all well and good – but guess what, you wrote it and you know how to use it!  All too often companies are simply developing the API and releasing it without any documentation or code examples on how to consume it!  John brings up a good point about, hey, how about having some PowerShell cmdlets built around it?  How about having an SDK we can consume?  Building your application off of a REST API is a great start, don’t get me wrong, but if you want people to automate around your product – help us out a little please 🙂

In through iSCSI, out through SMB, in through SWIFT, out through REST

Fellow Veeam Vanguard and TFD12 delegate Tim Smith has a post over on his blog describing a lot of the different storage protocols on the market today and how EMC, sorry, Dell-EMC Isilon is working to support them all without locking specific data into a specific protocol.  If you have some time I’d certainly check it out!

Happy Thanksgiving Murica!

I’ve always found it odd that Canadians and Americans not only celebrate Thanksgiving on different days, but in different months as well!  Come to find out there are quite a few other differences too.  You can see the two holidays compared on the diffen.com site.  It makes sense that we here in Canada celebrate a bit earlier – especially if our thanks revolves around the harvest.  I mean, no one wants to wait till November in Canada to harvest their gardens and crops – you’d be shoveling snow off of everything!  Either way – Happy Thanksgiving to all my American friends – may your turkey comas be long-lasting!

A VMware guy’s perspective on containers

Recently I attended Tech Field Day 12 in San Jose and was lucky enough to sit down with Docker for a couple of hours.  Docker talked about a number of things including Containers as a Service, security, networking, cloud and the recent integration points on Microsoft Server 2016.  Now I’m not going to pretend here – Docker, or more specifically containers, are something that I’d heard of before (how could you not have?) but I’d never really gone too deep into what they do, how they perform, or what use cases they fit well into.  I knew they had something to do with development – but that’s as far as I’d really gone with them.  Listening to Docker and the other delegates’ questions during the presentation got me thinking that I should really start learning some of this stuff – and it’s that thought right there which sent me down a rabbit hole for the last few days, reading countless blogs and articles, watching numerous videos and keynotes, and scratching my head more often than I would’ve liked to – in the end I’m left with the conclusion that there are a lot of misconceptions in regards to containers, and I was falling right into most of them…

VMware vs Docker

Here’s the first misconception I was reading a lot about.  Quite a lot of chatter out there on the interwebs is about the downfall of the VM and the rise of the container.  And for some environments this may hold true, but, even according to Docker, these two technologies are not necessarily competitors.  You see, VMs by their nature encapsulate a complete running machine – the OS, applications, libraries, and data are all encapsulated into a VM, with hardware emulation and a BIOS.  A container on the other hand is application focused – more of an application delivery construct that shares the kernel and operating system it’s running on.  Still confused?  Don’t worry – so was(am) I.  There’s an analogy that Docker uses quite often that might help: houses vs apartments.  Think of a VM as a house, complete with all the different living spaces and its own self-contained services such as heat, electricity, and plumbing.  On the flip-side, containers are apartments – sure, each one may be a little different, but they share common services in the building – electricity and plumbing are shared and all come from the same source.  So in essence there is room for both in the market; in fact, they really provide quite different platforms for running our applications – while Docker focuses on stateless, scalable, non-persistent apps, mostly providing advantages around development and portability, our VMs give us the “warm and fuzzy” feeling of having separate OS instances for our applications, with their front doors shut and locked.

Docker is just for developers

Another pretty big misconception if you ask me!  Sure, Docker is getting huge adoption in the developer space because of the consistency it provides – a developer can begin by pulling down a Docker image and have the libraries and components set up on their laptop exactly how they want.  They can then share this image out to be forked by others, meaning we have a consistent environment no matter where the application is being developed.  When the time comes to move to test, or production, we are still running within that same, consistent environment – no more patch or library conflicts – a true developer’s nirvana!  But after reading so much about this I have come to the realization that Docker is not just a “developer” thing, it’s for all of us, even us crazy operations guys!  The sheer nature of having a container limited to one service – micro-services if you will – allows us as administrators to deploy applications in our data center in the same way – think a container for Apache, a container for MySQL, each its own separate entity, each working together to provide a full application to our end users – and with the maturity and availability of images out there today, take a guess who doesn’t have to go through all of the headaches and processes of setting all of this stuff up – operations doesn’t!  And spawning multiple instances of all of these is just one command line away (see the sketch below)!  It just feels right to me, and just as we saw the adoption of virtualization and of companies shipping software bundled as virtual appliances, I can see a day where we will see those same services packaged and shipped as containers.
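To give a feel for what “one command line away” means, here’s a rough sketch of standing up that Apache + MySQL pair using the stock httpd and mysql images from Docker Hub – the container names, ports and password below are placeholders I’ve made up for illustration:

# Hypothetical example - names, ports and the root password are placeholders
# An Apache web server, published on port 80
docker run -d --name web -p 80:80 httpd

# A MySQL database for it to talk to
docker run -d --name db -e MYSQL_ROOT_PASSWORD=ChangeMe123 mysql

# And "spawning multiple instances" really is just more of the same
docker run -d --name web2 -p 8080:80 httpd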

But Docker is just for Linux nerds

Not anymore…  Have yourself a copy of Windows 10 or Server 2016, simply install the feature called Containers, grab the Docker engine and away you go!  Microsoft and Docker have made a huge partnership and as of right now you can even pull down some Microsoft applications right off of the “App Store” if you will.  Need yourself a SQL Server?  docker run -d -p 1433:1433 -e sa_password=password -e ACCEPT_EULA=Y microsoft/mssql-server-windows-express – yeah, that’s all – you’re done!  Still think Docker is just for developers???  Microsoft has been doing some really out-of-character things as of late – think bash on Windows, open sourcing .NET, SQL Server on Linux – just super weird, non-traditional Microsoft things – but in a good way!  Don’t be surprised if we see Microsoft going all in with containers and Docker in the future!!!  Let the benefits of continuous integration and deployment be spread among all the nerds!!!
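For the curious, here’s roughly what that whole flow looks like strung together in PowerShell on Server 2016 – treat it as a sketch rather than a step-by-step guide, since the Docker engine install itself can be done a few different ways and the sa password here is just a placeholder:

# Enable the Windows Containers feature (a reboot is needed afterwards)
Install-WindowsFeature -Name Containers

# With the Docker engine installed and the service running, pull and start SQL Server Express
docker pull microsoft/mssql-server-windows-express
docker run -d -p 1433:1433 -e sa_password=P@ssw0rd! -e ACCEPT_EULA=Y microsoft/mssql-server-windows-express

# Confirm the container is up and listening
docker ps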

So I can deliver all my Windows apps through containers now!  Awesome!

Yes…but no!  Docker is not ThinApp/XenApp/App-V.  It doesn’t capture changes and compile things into an executable to be run off a desktop or deployed through group policy.  In fact it’s just server-side applications that are supported in a Windows container.  We can’t, for instance, try to run Internet Explorer 6 with a certain version of the Java plugin, nor can we run Microsoft Word within a container.  The purpose is to provide a portable, scalable, consistent environment to run our server-side, non-GUI Windows applications – think SQL Server, IIS, .NET, etc…  Now I can’t say where the technology will go in the future – a world in which we containerize desktop applications with Docker doesn’t sound too far-fetched to me :).

So with all that I think I have a little better handle on containers and Docker since my Tech Field Day adventures – and wanted to simply lay it out the way I see it in the event that someone else may be struggling with the mountains of content out there.  If you want to learn more and dig deeper certainly check out all of the TFD videos that Docker has.  Also, Stephen Foskett has a great keynote – “What’s the deal with containers?” – which I would certainly recommend you watch!  I’m still sort of discovering all of this but plan to really invest some time in the container world come next year – there are a lot of components that I want and need to understand a bit more, such as persistent storage and networking.  Also, if I’m wrong or misinformed on any of this – do call me out 🙂 – that’s how we all learn!  Thanks for reading!

Disclaimer: As a Tech Field Day 12 delegate all of my flight, travel, accommodations, eats, and drinks are paid for. However I did not receive any compensation nor am I required to write anything in regards to the event or the presenting companies. All that said, this is done at my own discretion.

Did you know there is a Veeam User Group?

Like most of you I’ve been attending VMUGs for quite a while now, and over the last few years I’ve been helping out by co-leading the Toronto chapter.  Each and every one I attend I always get some value out of it – whether it’s from presenting sponsors, talking with peers, or just creepily listening to conversations from the corner – but one of the challenges we seem to have is getting the “conversation” going – getting those customers and community members sitting in the audience to voice their opinions or even, at times, get up and do a presentation on something.  For our last meeting I reached out to Matt Crape (@MattThatITGuy) to see if he might be interested in presenting – Matt was quick to say yes – yes, but on one condition – would I come and present at his Veeam User Group?  So, with that a deal was cut and I headed out this morning to my first Veeam User Group.

Veeam User Group – VMUG without the ‘M’

Matt runs the Southwest Ontario Veeam User Group (SWOVUG) – I’ve seen the tweets and blogs around the SWOVUG events taking place, and have always wanted to attend but something always seemed to get in the way – for those that know me I’m a huge Veeam user and fan – so these events are right up my alley.  So, I did the early morning thing again, battled the dreaded Toronto traffic and headed up to Mississauga for the day to check it out.

The layout of the meeting is somewhat similar to a VMUG meeting: two companies kindly supported the event, HPE and Mid-Range, and in return got the chance to speak.  HPE started with a short but good talk around their products that integrate with Veeam, mainly 3PAR, StoreOnce and StoreVirtual.  They also touched on HP OneView and the fact that they are laser-focused on providing API entry points into all their products.

I’m glad HPE didn’t go too deep into the 3PAR integrations as I was up next and my talking points were around just that.  I simply outlined how my day job is benefiting from said integrations, more specifically the Backup from Storage Snapshot, Restore from Storage Snapshot and On-Demand Sandbox for Storage Snapshots features.

After a quick but super tasty lunch (insert Justin Warren disclaimer post here), Mid-Range took the stage.  Mid-Range is a local Veeam Cloud Connect partner offering DRaaS and a ton of other services around that.  They did more than simply talk about the services they provide – they went more into the challenges and roadblocks of consuming disaster recovery as a service, then touched briefly on how they and Veeam could help solve some of those…

Finally, to cap the day off we had David Sayavong, a local Veeam SE, take the stage to talk to us about “What’s new in version 9.5?”.  David’s presentation was not just him up there flipping through slides of features, but more of a conversation around certain features such as ReFS integration and how all of the new Veeam Agents will come into play.  Just a fun fact for the day – the audience was asked who had already upgraded to 9.5, and honestly around 1/3 of the room raised their hands.  That’s 33% who had already upgraded to a product that GA’ed only 7 days ago – talk about instilling confidence in your customers.

Anyways, I wanted to briefly outline the day for those that may be thinking of attending like I was, but haven’t yet set aside the time to do so.

But there’s more…

I mentioned at the beginning of the post that there are always struggles with getting people to “speak up” – this didn’t seem to be the case at the Veeam User Group.  I’m not sure what it was, but conversations seemed to be flying all over the place – for instance, after I was done talking about the integration with 3PAR a big conversation started up around ransomware and security.  Each presentation seemed more like a round-table discussion than a sales pitch.  It truly was a great day with lots of interaction from both the presenting companies and the audience – everything you want from a user group.

The user group intrigued me – and maybe some day I’ll throw my name in to try and get something started up on “my side of Toronto” – it’s Canada, right – there’s only a handful of IT guys here, so everything east of Toronto is mine 🙂  For more information about the Veeam User Groups keep an eye on the Veeam Events page and @veeamug on Twitter!  And to keep track of the SWOVUG dates I suggest following @MattThatITGuy and watching the swovug.ca site!  Good job Matt and team on a great day for all!

Scaling storage with an army of ARM!

It’s no surprise to anyone that storage is growing at an incredible rate – rich media, sensor devices, IoT – these are all affecting the amount of storage capacity that organizations need today, and it’s only going to get worse in the future!  Organizations need somewhere to put this data, somewhere safe and protected, somewhere where availability is key.  For most, that somewhere ends up being the cloud!  Public cloud services such as Amazon S3 give us access to oodles of storage on a pay-as-you-go basis – and they remove the burden of having to manage it.  SLAs are agreed upon and our data is just available when we need it!  That said, public cloud simply may not be an option for a lot of companies – the businesses that can’t, or sometimes won’t, move to cloud, yet still want the agility and availability that cloud provides.  These organizations tend to move to an on-premises solution – SANs and storage crammed into their own data centers – but with that comes a whole new bucket of challenges around scaling and availability…

How do we scale a SAN?

Most storage out there today is designed in much the same way.  We have a controller of sorts, providing network and compute resources to move our data in and out of a number of drives sitting behind it.  But what if that controller goes down?  Well, there goes access to all of our storage!  To alleviate this we add more controllers and more disk – this seems like a pretty common storage solution today: 2 controllers, each hosting a number of shelves full of drives, with dual-path interconnects connected to the rest of our data center.  In this situation if we lose a controller we don’t necessarily lose access to our data, but we most certainly lose half of the bandwidth into it.  So, we yet again add more controllers and more disk – sitting with 4 controllers now – at which point the back of our racks and our interconnect infrastructure is getting so complex and complicated that we will most certainly hit struggles when the time comes to scale out even more.

So what is the perfect ratio of controller to disk, or CPU to disk?  How do we minimize complexity while maximizing performance?  And how do we accomplish all of this within our own data center?  Lower ratios, such as 1 CPU for every 8 disks, introduce complexity with connectivity – higher ratios, such as 1 CPU for 60 disks, create a huge fault domain.  Is it somewhere in the middle?  Igneous Systems has an answer that may surprise you!

Disclaimer: As a Tech Field Day 12 delegate all of my flight, travel, accommodations, eats, and drinks are paid for. However I did not receive any compensation nor am I required to write anything in regards to the event or the presenting companies. All that said, this is done at my own discretion.

RatioPerfect – 1:1 – Compute : Disk

Igneous presented at Tech Field Day 12 in November showcasing their managed, on-premises, cloud-like solution.  It looks much like a traditional JBOD – a 4U box containing 60 drives – but under the hood things are certainly different.  Igneous, calling it their RatioPerfect architecture, takes a 1:1 approach in terms of CPU to disk.  Throwing out expensive Xeon CPUs and the controller methodology, RatioPerfect is essentially an army of nano servers, each equipped with its own ARM CPU, memory, and networking, attached directly to each and every disk – essentially giving each disk its own controller!


These “server drives” are then crammed inside a JBOD – however, instead of dual SAS controllers within the JBOD, there are dual Ethernet switches.  Each nano server then has two addressable MACs and two paths out to your infrastructure’s 10GbE uplinks – you can almost picture this as a rack of infrastructure condensed down into a 4U unit, with 60 network-addressable server/storage devices sitting inside of it, and 60 individual fault domains.  Don’t worry – it’s IPv6 – no need to free up 120 addresses 🙂

Why the need?

As an everyday storage administrator working in a data center you might not see the need for this – 60 fault domains seems a little excessive, right?  The thing is, Igneous is not something that’s managed by your everyday storage administrator – in fact, the “human” element is something Igneous would love to eliminate entirely.  Igneous set out to provide the benefits of public cloud, on premises, complete with flexible pricing and S3-compatible APIs.  The sheer nature of public cloud is that we don’t have to manage it – it’s simply a service, right?  The same goes for Igneous – all management, including installation, configuration, troubleshooting and upgrades, is handled centrally by Igneous – you simply consume the storage – when you need more, you call, and another shelf shows up!

The design of Igneous’s management plane is key to their success.  With the “fleet” model in mind, Igneous built a management plane that proactively monitors all of their deployed systems – able to contrast and compare events and metrics to detect possible failure scenarios, and relying heavily on automation to fix these issues before they are indeed issues.  That said, no matter the amount of predictive analysis and automation, the time will come when drives physically fail – and the nano server design of Igneous, coupled with the custom-built data path, allows a single Igneous box to sustain up to 8 concurrent drive failures without affecting performance – certainly buying them enough time to react to the situation.  The on-premises management plane is simply a group of micro-services running on commodity x86 servers – meaning software refreshes and upgrades are a breeze, and non-disruptive at that.  It’s this design and architecture that allows Igneous to move fast and implement rapid code changes just as we would see within a cloud environment.

In the end Igneous certainly does contain an army of ARM processors working to bring the benefits and agility of public cloud to those who simply can’t move their data to cloud due to volume, or won’t due to security reasons.  Yeah, it’s a hardware appliance, but you don’t manage it – in fact, you don’t even buy it – just as we “rent” cloud, the Igneous service is a true operating expense – no capital costs whatsoever.  It’s funny – they sell a service, essentially software and storage that you consume, but it’s the hardware that left the lasting impression on me – it’s not too often hardware steals the show at a Tech Field Day event.  If you are interested in learning more certainly take a look at their Tech Field Day videos – they cover all of this and A LOT more!  Thanks for reading!

Does Veeam Backup from Storage Snapshot make things faster?

Later on this week I’m getting the chance to present at my first ever Veeam User Group.  My topic – Veeam & 3PAR.  As I was preparing some slides and asking around/researching the community I came to the realization that some people may be under some false assumptions as they pertain to the Veeam Backup from Storage Snapshot feature.  For the most part I see people under the assumption that backing up from a storage snapshot is all about speed – however, in my experience it really hasn’t been.  Now that’s not to say it isn’t faster – it most certainly could be – but not because it actually copies the data out any faster; it’s because it speeds up other functions of the backup process.  To help with this, let’s take a look at both the “traditional” Veeam backup process and the Backup from Storage Snapshot process.

Traditional Veeam Backups

Veeam performs a lot of tasks when it completes a backup of a vSphere VM – but for the sake of this post, let’s just take a look at how it handles snapshots.  The traditional Veeam backup process can essentially be broken down into three major steps

  1. Veeam instructs vCenter to take a snapshot of the source VM
  2. Veeam utilizes CBT data and copies those blocks which have changed to the target.
  3. Veeam instructs vCenter to delete the snapshot of the source VM.

Looking at it like this it appears to be quite simple, but there are a lot of challenges here.  First up, the copying of data in step 2 – this could potentially take a long time depending on the size and change rate of your virtual machines.  During this time, the VMware snapshot will continue to grow – possibly even doubling in size.  When Veeam finally gets to step 3, the snapshot deletion, VMware is forced to copy all of those changed blocks that were written while the backup was running back into the original VMDK – this process can, and usually does, involve a large amount of reads and writes, which most certainly affects the performance of our VM.  On top of this, VMware attempts to ‘stun’ the virtual machine by creating yet another snapshot to help with the removal – and if our VMs are generating data fast enough we could experience an overall loss of responsiveness as our storage tries to catch up.  Now VMware has made a lot of changes as to how they consolidate and remove snapshots in vSphere 6, which I suggest you read about – but the issue of having to keep a VMware snapshot open for the duration of the backup remains…
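To picture where that pain lands, here’s a rough PowerCLI sketch of the snapshot lifecycle a traditional image-level backup wraps around – purely illustrative (Veeam drives this through the vSphere APIs itself, not PowerCLI), and the vCenter and VM names are made up:

# Illustrative only - the create/copy/remove cycle described in the three steps above
Connect-VIServer vcenter.lab.local

$vm = Get-VM -Name "AppServer01"

# Step 1: take a VM snapshot so the backup reads from a consistent point in time
$snap = New-Snapshot -VM $vm -Name "Backup-$(Get-Date -Format yyyyMMdd-HHmmss)" -Quiesce

# Step 2: the backup software copies changed blocks here - the longer this runs,
#         the larger the snapshot delta grows

# Step 3: remove the snapshot, forcing the accumulated delta to be committed back
#         into the base VMDK - this is where the read/write storm and the 'stun' happen
Remove-Snapshot -Snapshot $snap -Confirm:$false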

Backup from Storage Snapshot

When Veeam performs a backup from a storage snapshot it is able to dispose of the vSphere snapshot in a much more efficient way – we can break down the steps below…

  1. Veeam instructs vCenter to take a snapshot of the source VM
  2. Veeam instructs the storage array to take a SAN snapshot of the LUN containing the source VM
  3. Veeam instructs vCenter to delete the snapshot of the source VM. (Happens on production LUN)
  4. Veeam utilizes CBT and the VM snapshot data that still exists on the Storage Snapshot to copy out changed blocks to the target
  5. Veeam instructs the storage array to discard the SAN Snapshot

So how does this help you ask?  We still see the VMware snapshot being created.  The difference here is that Steps 1-3 take literally seconds.  The vSphere Snapshot is created, SAN Snapshot is created, then the vSphere snapshot is discarded immediately after.  Due to the nature of SAN Snapshots this essentially redirects all changed writes to our production LUN while leaving our snapshot LUN available for backup processing – all the while removing the requirement of having the VMware snapshot open for the backup duration.  At the end of the backup process, the SAN Snapshot is simply discarded…

So, is Backup from Storage Snapshot faster?  Well, it depends.  Sure, it does speed things up, but not in a data copy/processing way – more so in a snapshot commit way.  In a Backup from Storage Snapshot job our VMware snapshot is only open for a short time, therefore it shouldn’t take nearly as long to consolidate as if it had been open for the complete backup job.  That said, the real value of Backup from Storage Snapshot comes from the short VMware snapshot commit time – and the avoidance of the stun – more so than any decrease in the backup time itself.

So there you have it – just my two cents for the day!  And hey, if you are in or around Toronto/Mississauga on Wednesday be sure to register and stop by the Veeam User Group!

Thanks for reading!

She said I’m Tragically Hip

Come on, just lets go!

She kinda bit her lip…geez I dont know

I’ve written non-tech related posts here before – usually it takes something near and dear to my heart to spark them.  Hockey and music have kinda been my crutch throughout my life – in the spirit of sharing, and the spirit of getting 30 posts out in 30 days, this one will deal with the latter.


For those that don’t know, there’s an iconic Canadian band called The Tragically Hip.  Their lead singer, Gordon Downie, announced earlier this year that he had incurable brain cancer – you know, the shitty kind, as if there was a kind that wasn’t.  Instead of undergoing treatment and calling it quits The Hip announced a country-wide tour, starting in Vancouver and, 15 shows later, ending in their hometown of Kingston, Ontario – a small, quaint city not far from where I live now and a place I called home for some time during college!

God knows I tried to get tickets – I was online right at 10 o’clock the day they were released and I had four in my shopping cart, but during checkout something happened – I, along with many, many other people, tremendous other people (see what I did there :)) doing the same thing, was denied and the tickets were gone within seconds.  Sheesh, you’d think these ticket companies could get some better IT behind these types of things!  And this wasn’t just the Kingston show but every show they had planned!  Within minutes you could see tickets popping up on StubHub at more than 500% of their face value!  Needless to say I wasn’t going to this one.  As the tour started the band tried to accommodate fans by adding new shows and holding lotteries for new tickets, available only to people who could pick them up at the door of the venue!  Again I tried, but was left disappointed again!

So, in the end we were left with a country trying to mourn someone they grew up with, and couldn’t get in to see what some were calling his last show!  Then to my surprise CBC picked up on this and worked out a deal that would have the final show, August 20th in Kingston, live-streamed across the country.  Satellite, Internet, CBC Mobile, Facebook, YouTube – it was going to be on every media channel across Canada.  So with all that said, projectors went up in backyards across Canada!  Market Square in Kingston packed in 30,000 people, the most they have ever had gather downtown, to see what many were calling the final concert.  One of those 30,000 – our newly minted Prime Minister, Justin Trudeau.


Personally, I gathered in one of those backyards with some close friends to watch the concert, have some drinks, and celebrate the life of a man who had entertained me my entire life.  The music was great, the drinks were great (as always), but the biggest thing I took away from that warm night in August was the memories.  The whole night was full of friends reliving memories together – “Remember when?”  “This song reminds me of that time…”  Friends reminiscing, replaying times in their minds.  Seeing people whom I hadn’t seen in quite a while, remembering times with some who are unfortunately no longer here.  This is what I took away from that night – memories, friendships, recollections of the handful of times I’ve seen them perform.

Before the show there was so much chatter about what he might be like – will he look OK?  And 15 seconds in I think we all saw the same ol’ Gord we have seen so many times before.  For nearly three hours he took us on a journey from that bar band that started out in Kingston to the national treasure they are now.  Even in his condition he used his time wisely, using the viewership of 11.7 million people to help spread the word about Canada’s struggles and ignorance towards the residential school issues of the past – that’s a whole other shameful post to be written though…

Though entertaining, it was nothing but bittersweet seeing his interview a month or so later with Peter Mansbridge on CBC.  There, Downie declared he is indeed suffering – requiring 6 prompters on the stage in order to remember the lyrics of his own songs, having Mansbridge’s name written on his hand just so he can remember who he is talking to even though they have known each other for some time… having to struggle to remember his kids’ names…

“For some reason every line, I just couldn’t, it’s the worst kind of punishment”, he said, “It was one savage kick in the pants, can’t remember peoples names and can’t remember lyrics”

A musician, an artist, a poet – who can’t remember lyrics.  A father, a husband, a friend – who can’t remember your name.  Thinking of all this reminds me to cherish – cherish what you have now, celebrate what you have now – someday it may no longer be there.

I mention “the country” a lot in this post – don’t get me wrong, it’s not all of Canada.  As with any music, people have personal tastes, and for some The Tragically Hip just isn’t their flavor of choice.  But no one, no one, can argue that something happened in our country the night of August 20th – this group, this man, united our country, and during that time there was not a soul who could claim they weren’t touched somehow by having been a part of it!  Personally, I feel more thankful having been part of that night – thankful for what I have and holding it tight – for someday it will all end and I’ll just be floating through Fiddler’s Green.

Friday Shorts – VMware, Veeam, Docker and Community

Dreams are important!  Otherwise, sleep is just 8 hours of nothing. – MacGyver

vSphere 6.5 is here!

NDAs have been lifted and the flood gates have opened in terms of bloggers around the world talking about vSphere 6.5!  Now that the product has finally been declared GA we can all go ahead and download it today!  Certainly the new HTML5 client, the addition of a more functional vCenter Server Appliance and built-in disk-level encryption are enough to make me want to make the jump…eventually 🙂  That said, there is a lot that is new with vCenter and ESXi – you can check it all out here.

Veeam Backup & Replication 9.5 is here!

And just as VMware releases its flagship hypervisor software to the masses, Veeam follows one day later with the support and availability to back it all up in version 9.5 of Backup & Replication.  There is a lot that’s new within this release – check it all out here!  With Nimble support and a ton of integration with other Veeam products such as the Veeam Agents, this release has a lot – but perhaps my favourite enhancements will be the ones that will probably be written about the least: all of the engine enhancements and things like the vSphere inventory cache.  As with most Veeam releases I’m sure this one will be well adopted!

Tech Field Day 12 – that’s a wrap!

I’ve just returned home from Tech Field Day 12 in beautiful San Jose!  I say beautiful, but I was told it was cold by multiple people there – that said, I’m Canadian, and 18C in the middle of November is not my idea of cold 🙂  Either way, I sat with 11 of my peers around the TFD table for a couple of days and got blasted by the fire hose.  There were some great presenting companies there; think Dell-EMC, DriveScale, Rubrik, Cohesity, StorageOS, Docker, and Igneous.  Expect more in this space about what these guys had to talk about, but for now if you want to check out the videos you can do so – links are all over here.

Fancy some community recognition!

Ah, November – cooler air, the transition from Halloween to Christmas – and of course, a flurry of forms to fill out if you’d like to be included in any vendor/community recognition programs.  Most of these things require you to nominate yourself – so swallow any pride, as you may have to get a little self-absorbed while you try and brag yourself up.  Recognition programs are a funny thing – some are great, some are so-so – I’m not going to go into the details of each.  If you want a great spot to see a list of them all, Jim Jones has a great post outlining everything here.  And to get a glimpse into one of my favourites, the Veeam Vanguard program – check out Matt Crape’s post here.  I’ve also been invited to the KEMP VIP program – which is new to me, so expect to see more about that in the future as well.

Beetlejuice, Beetlejuice, Docker

I feel like I can’t attend any presentation anymore without hearing the word Docker – honestly, containers and Docker are just popping up everywhere, with the oddest of technologies claiming they have support for Docker.  So, with all this, my problem is: what the eff is Docker?  I’ve never had a use-case to even start looking at containers within my day job – therefore, I don’t really have that much knowledge around the challenges and benefits of them.  After seeing them at TFD I can now say that I need to explore this further – and jump on the container bandwagon to learn what they are all about.  First stop, Mr Stephen Foskett’s blog, where he tells us just what the deal with containers is.  If you are just learning, or just love containers – check it out!

Setting yourself up for success with Veeam Pre-Job Scripts

For a while Veeam has been able to execute scripts post-job, or after the job completes – but it wasn’t until version 8 of their flagship Backup & Replication product that they added the ability to run a pre-job script, a script that executes before the job starts.  When v8 first came out with this ability I tried to figure out what in the world I would need a pre-job script for – and for the longest time I never used it in any of my environments.  If a job failed I would execute post-job scripts to run and hopefully correct the reason for the failure – but a while back it kind of dawned on me – with a bit of a change in mindset I realized something – why fail first?


Why fail when success is possible?

As I mentioned above, I’d grown accustomed to using post-job scripts to correct failing jobs.  For instance, there were times when, for whatever reason, a proxy would hold on to a disk of one of my replicas – subsequently, the next run of the job would fail trying to access this disk – and even more importantly, consolidation of any VMs requiring it would fail as the original replica couldn’t access the disk mounted to the proxy.  What did I do to fix this?  Well, I added a script that executed post-job, looking to simply unmount any disks from my Veeam proxies that shouldn’t be mounted.

Another scenario – I had some issues a while back with some NFS datastores simply becoming inaccessible.  The fix – simply remove and re-add them to the ESXi host.  The solution at the time was to run a post-job script in Veeam.  If the job failed with the error of not being able to find the datastore, I ran a script that would automatically remove and re-add the datastore for me – next job run everything would be great!
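For illustration, something along these lines in PowerCLI does the remove and re-add – the host, datastore and NFS export names here are placeholders rather than the ones from my environment:

# Hypothetical names - adjust for your own host, datastore and NFS export
Connect-VIServer vcenter.lab.local

$esxHost = Get-VMHost -Name "esx01.lab.local"
$ds = Get-Datastore -Name "NFS-Backups"

# Drop the inaccessible datastore from the host...
Remove-Datastore -Datastore $ds -VMHost $esxHost -Confirm:$false

# ...and mount it right back
New-Datastore -Nfs -VMHost $esxHost -Name "NFS-Backups" -NfsHost "nas01.lab.local" -Path "/volumes/backups"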

“Fail and Fix” or “Fix and Pass”

So, the two solutions above, while they do fix the issues, do so after the fact – after we have already failed.  Even though they fixed everything up for the next run of the job, I’d still lose that one restore point – and sure enough, the time WILL come when it’s that exact point in time you need to recover from!  The answer to all this is pretty simple – migrate your post-job scripts to pre-job scripts.  Let’s set ourselves up for success before we even start our job!  Although this may seem like common sense, for whatever reason it took a while before I saw it that way.

So with all that – hey, let’s add some code to this post.  Below you will find one of my scripts that runs before each Veeam job – my proactive approach to removing foreign replica disks from my Veeam proxies!

Add-PSSnapin VeeamPSSnapIn
Add-PSSnapin VMware.VimAutomation.Core

Connect-VIServer vcenter.mwpreston.local -User username -Password password

# figure out which job called us - the Veeam job manager process that launched
# this script carries the job ID in its command line (position is version-specific)
$parentpid = (Get-WmiObject Win32_Process -Filter "processid='$pid'").parentprocessid.ToString()
$parentcmd = (Get-WmiObject Win32_Process -Filter "processid='$parentpid'").CommandLine
$jobid = $parentcmd.split('" "')[16]
$vbrjob = Get-VBRJob | Where-Object { $_.Id -eq "$jobid" }

# get some info to build the replica VM names
$suffix = $vbrjob.Options.ViReplicaTargetOptions.ReplicaNameSuffix
$vms = $vbrjob.GetObjectsInJob()

# create an array of replica names
$replicasinjob = @()
foreach ($vm in $vms)
{
    $replica = $vm.Name + $suffix
    $replicasinjob += $replica
}

# loop through each replica and check the Veeam proxies (named VBR* in my environment)
# for foreign disks, removing any replica disk still mounted before the job starts
foreach ($replicaitem in $replicasinjob)
{
    $replica = $replicaitem.ToString()
    Get-VM -Location ESXCluster -Name VBR* | Get-HardDisk | Where-Object { $_.FileName -like "*$replica*" } | Remove-HardDisk -Confirm:$false
}

exit

So as you can see this is a simple script.  It first retrieves the job it was called from by walking up to its parent Veeam process and pulling the job ID out of that process’s command line – by doing it this way we can reuse this block of code in any of our jobs.  It then searches through all of the disks mounted to our Veeam proxies – if it finds one that belongs to one of the replicas we are about to process, it removes it.  Simple as that!  Now, rather than failing our job because a certain file has been locked, we have set ourselves up for a successful job run – without having to do a thing!  Which is the way I normally like it 🙂  Thanks for reading!
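One quick note if you want to wire this up yourself: the pre- and post-job script hooks live in the advanced settings of the backup or replication job (on the Scripts tab, if memory serves – double-check in your version of Backup & Replication), and the script runs on the backup server itself, so it needs PowerCLI and the Veeam snap-in installed there.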