Tag Archives: TFD12
One of the core selling points of the Rubrik platform is the notion of “unlimited scale” – the ability to start small and scale as large as you need, all the while maintaining its masterless deployment! Up until a few weeks ago I was unaware of how they actually achieved this, but after witnessing Adam Gee and Roland Miller present at Tech Field Day 12 in San Jose I have no doubts that the Atlas file system is the foundation upon which all of Rubrik is built.
As shown above we can see how the platform is laid out by Rubrik – with the Atlas file system sitting at the core of the product and communicating with nearly every other component in the Rubrik platform. Now picture each node containing exactly this same picture, scaled up to however many nodes you might have – each node containing its own Atlas file system, with its own local applications accessing it – yet the storage is distributed and treated as one scalable blob of storage, addressable by a single global namespace.
Atlas – a distributed scalable file system.
As shown above, other core modules such as Callisto, Rubrik's distributed metadata store, and the Cluster Management system all leverage Atlas under the hood – and in turn Atlas utilizes these for some of its functions. For instance, to make Atlas scalable it leverages data from the Cluster Management system to grow and shrink – when a new brik is added, Atlas is notified via the CMS, at which point the capacity from the new nodes is added to the global namespace, increasing the total capacity available, as well as the flash resources to consume for things such as ingest and cache. It should also be noted that Atlas takes care of data placement as well, so adding a new node to the cluster will trigger it to re-balance. However, it's got the “smarts” to process this as a background task and take into account all of the other activity occurring within the cluster, which it gets from the Distributed Task Framework – meaning we won't see a giant performance hit directly after adding new nodes or briks, thanks to the tight integration between all of the core components.
Adding disk and scaling is great, however the challenge of any distributed file system is how to react when failures occur, especially when dealing with low-cost commodity hardware. Atlas performs file system replication in a way that provides for failure at both the disk level and the node level, allowing 2 disks, or 1 full node, to fail without experiencing data loss. How Atlas handles this replication depends solely on the version of Rubrik in your datacenter today. Pre-3.0 releases used a technology called mirroring, which essentially triple-replicated our data across nodes. Although triple replication is a great way to ensure we don't experience any loss of data, it does so at the expense of capacity. The Firefly release, 3.0 or higher, implements a different replication strategy: erasure coding. By its nature, erasure coding takes the same data that we once would have replicated three times and splits it into chunks – the chunks are then processed and additional encoded chunks are created which can be used to rebuild the data if need be. It's these chunks that are intelligently placed across disks and nodes within our cluster to provide availability. The short of the story here is that erasure coding gives us the same benefit as triple replication, without the cost of having triple the capacity – therefore more space will be available within Rubrik for what matters most, our data.
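Rubrik hasn't published the details of Atlas's coding scheme, but the core idea of trading parity math for raw copies can be sketched with the simplest possible case: single-parity XOR. This is purely illustrative (real systems like Atlas use Reed-Solomon-style codes that tolerate multiple simultaneous failures), but it shows how an encoded chunk lets you rebuild lost data without storing full replicas:

```python
# Toy sketch of erasure coding's core idea using single-parity XOR.
# NOT Rubrik's actual scheme -- just the intuition: 4 data chunks plus
# 1 parity chunk costs 1.25x capacity, versus 3x for triple replication.

def make_chunks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-sized chunks (zero-padded at the end)."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(k)]

def xor_parity(chunks: list[bytes]) -> bytes:
    """Compute a parity chunk as the byte-wise XOR of all chunks."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

def rebuild(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild a single lost chunk by XORing parity with the survivors."""
    return xor_parity(surviving + [parity])

data = b"backup data that must survive a disk failure"
chunks = make_chunks(data, 4)           # 4 data chunks...
parity = xor_parity(chunks)             # ...plus 1 parity chunk

lost = chunks.pop(1)                    # simulate losing a disk
assert rebuild(chunks, parity) == lost  # reconstructed from survivors
```

XOR parity only survives one lost chunk; to match the two-disk / one-node guarantee described above you need a stronger code, which is exactly why production systems reach for Reed-Solomon.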
Aside from replicating our data, Atlas employs other techniques to keep our data available as well – features such as self-healing and CRC detection allow Atlas to throw away and repair data as it becomes corrupt. Now these are features we expect to see in a file system, but Atlas can handle them a little differently due to its distributed architecture. The example given was with three briks, each containing four nodes – when a node fails, or data becomes corrupt, Atlas repairs the data on a surviving node within the same brik, ensuring we are still spread out across briks. If a brik happens to fail, the chunk of data would then have to reside on the same brik as another chunk, but would be placed on another node, still allowing for node failure. It's this topology-aware placement that really allows Rubrik to maximize data availability and provide protection not only across nodes within a brik, but across brik failures as well, maximizing the failure tolerance guarantees they are providing.
Perhaps the most interesting aspects of Atlas, though, are how it exposes its underlying functions and integration points to the applications running on top of it, the Rubrik applications. First up, the meat of Rubrik's solution: mounting snapshots for restore/test purposes. While all of our backup data is immutable, meaning it can by no means be changed in any way, Atlas does leverage a “Redirect on Write” technology in order to mount these backups for test/dev/restore purposes. What this means is that when a snapshot is requested for mount, Atlas can immediately assemble the point in time using incremental pointers – no merging of incrementals into full backups or data creation of any kind – the full VM at that point in time is simply presented. Any writes issued to this VM are redirected, or written elsewhere and logged – thus not affecting the original source data whatsoever, all the while allowing the snapshot to be written to.
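The redirect-on-write idea is easy to sketch: the mounted snapshot reads from immutable backup blocks, and any new writes land in a separate overlay that is checked first. The block/overlay structures below are my own illustration, not Atlas internals:

```python
# Hedged sketch of "Redirect on Write": reads fall through to the
# immutable point-in-time data; writes are redirected to an overlay,
# so the backup itself is never modified.

class MountedSnapshot:
    def __init__(self, base_blocks: dict[int, bytes]):
        self.base = base_blocks      # immutable point-in-time data
        self.overlay = {}            # redirected writes live here

    def read(self, block: int) -> bytes:
        # Prefer redirected writes; fall back to the immutable base.
        return self.overlay.get(block, self.base.get(block, b"\x00"))

    def write(self, block: int, data: bytes) -> None:
        # Never touch self.base -- redirect the write to the overlay.
        self.overlay[block] = data

base = {0: b"boot", 1: b"app", 2: b"data"}
vm = MountedSnapshot(base)
vm.write(2, b"new!")                 # a test/dev write, redirected
assert vm.read(2) == b"new!"         # the mount sees the new data
assert base[2] == b"data"            # the backup itself is unchanged
```

Discarding the mount is then just throwing away the overlay – the immutable source data never needed protecting from the writes in the first place.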
Atlas also exposes a lot of its underlying functionality to applications in order to improve performance as well. Take for instance the creation of a scratch or temporary partition – if Rubrik needs to instantiate one of these it can tell Atlas that this is indeed temporary – thus, Atlas doesn't need to replicate the file making up the partition at all, as it doesn't necessarily require protection and can simply be tossed away when we are done with it. And that tossing away, the cleaning up after itself, can also be set from the application level. In that same example we could simply set a TTL or expiry on our scratch file, and let the normal garbage collection maintenance job clean it up during its normal run, rather than wasting time and resources having the application make second or third calls to do it. Applications can also leverage Atlas's placement policies, specifying whether files or data should be placed on SSD or spinning disk, or even specify whether said data should be located as close as possible to other data.
So as you can see, although Rubrik is a very simple and easy policy-based, set-and-forget type of product, there is a lot of complexity under the hood. Complexity that is essentially abstracted away from the end-user, but available to the underlying applications making up the product. In my mind this paves the way for a quick development cycle – being able to leverage the file system for all it's worth while not having to worry about “crazy” configurations customers may have. We have certainly seen a major influx of custom-built file systems entering our data centers today – and this is not a bad thing. While the “off the shelf”, commodity-type play may fit well for hardware, the software is evolving – and this is evident in the Rubrik Atlas file system. If you want to learn more definitely check out their Tech Field Day 12 videos here – they had a lot more to talk about than just Atlas!
Recently I attended Tech Field Day 12 in San Jose and was lucky enough to sit down with Docker for a couple of hours. Docker talked about a number of things including Containers as a Service, security, networking, cloud and the recent integration points on Microsoft Server 2016. Now I'm not going to pretend here – Docker, more specifically containers, are something that I've heard of before (How could you not have?) but I've never really gone too deep into what they do, how they perform, or what use cases they fit well into. I knew they had something to do with development – but that's as far as I've really gone with them. Listening to Docker and other delegates' questions during the presentation got me thinking that I should really start learning some of this stuff – and it's that thought right there which sent me down a rabbit hole for the last few days, reading countless blogs and articles, watching numerous videos and keynotes, and scratching my head more often than I would've liked to – in the end I'm left with the conclusion that there are a lot of misconceptions in regards to containers, and I was falling right into most of them…
VMware vs Docker
Here's the first misconception I was reading a lot about. Quite a lot of chatter out there on the interwebs is happening about the downfall of the VM and the rise of the container. And for some environments this may hold true, but, even according to Docker, these two technologies are not necessarily competitors. You see, VMs by their nature encapsulate a complete running machine – all of the OS, applications, libraries, and data are encapsulated into a VM, with hardware emulation and a BIOS. A container on the other hand is application focused – being more an application delivery construct while sharing processes related to the Linux kernel and operating system it's running on. Still confused? Don't worry – so was(am) I. There's an analogy that Docker uses quite often that might help: houses vs apartments. Think of a VM as a house, complete with all the different living spaces and its self-contained services such as heat, electricity, and plumbing. On the flip side, containers could be apartments – sure, each one may be a little different but they share common services in the building – electricity and plumbing are shared and all come from the same source. So in essence there is room for both in the market; in fact, they really provide quite different platforms for running our applications – while Docker focuses on stateless, scalable, non-persistent apps, mostly providing advantages around development and portability, our VMs give us the “warm and fuzzy” feeling of having separate OS instances for our applications, with their front doors shut and locked.
Docker is just for developers
Another pretty big misconception if you ask me! Sure, Docker is getting huge adoption in the developer space because of the consistency it provides – a developer can begin by pulling down a Docker image and have the libraries and components set up on their laptop exactly how they want. They can then share this image out to be forked by others, meaning we have a consistent environment no matter where the application is being developed. When the time comes to move to test, or production, we are still running within that same, consistent environment – no more patch or library conflicts – a true developer's nirvana! But after reading so much about this I have come to the realization that Docker is not just a “developer” thing, it's for all of us, even us crazy operations guys! The sheer nature of having a container limited to one service, or micro-services if you will, allows us as administrators to deploy applications in our data center in the same way – think a container for Apache, a container for MySQL, each its own separate entity, each working together to provide a full application to our end users – and with the maturity and availability of images out there today take a guess who doesn't have to go through all of the headaches and processes of setting all of this stuff up – operations doesn't! And spawning multiple instances of all of these is just one command line away! It just feels right to me, and just as we have seen the adoption of virtualization and the adoption of companies shipping software bundled in virtual appliances, I can see a day where we will soon see those same services packaged and shipped as containers.
But Docker is just for Linux nerds
Not anymore… Have yourself a copy of Windows 10 or Server 2016, simply install the feature called Containers, grab the Docker engine and away you go! Microsoft and Docker have formed a huge partnership and as of right now you can even pull down some Microsoft applications right off of the “App Store” if you will. Need yourself a SQL Server? docker run -d -p 1433:1433 -e sa_password=password -e ACCEPT_EULA=Y microsoft/mssql-server-windows-express – yeah, that's all – you're done! Still think Docker is just for developers??? Microsoft has been doing some really out-of-character things as of late – think bash on Windows, open sourcing .NET, SQL Server on Linux – just super weird non-traditional Microsoft things – but in a good way! Don't be surprised if we see Microsoft going all in with containers and Docker in the future!!! Let the benefits of continuous integration and deployment be spread among all the nerds!!!
So I can deliver all my Windows apps through containers now! Awesome!
Yes…but no! Docker is not ThinApp/XenApp/App-V. It doesn't capture changes and compile things into an executable to be run off a desktop or deployed through group policy. In fact, it's just server-side applications that are supported in a Windows container. We can't, for instance, try to run Internet Explorer 6 with a certain version of the Java plugin, nor can we run Microsoft Word within a container. The purpose of this is to provide a portable, scalable, consistent environment to run our server-side, non-GUI Windows applications – think SQL Server, IIS, .NET, etc… Now I can't say where the technology will go in the future – a world in which we can all containerize desktop applications with Docker doesn't sound too far fetched to me :).
So with all that I think I have a little better handle on containers and Docker since my Tech Field Day adventures – and wanted to simply lay it out the way I see it in the event that someone else may be struggling with the mountains of content out there. If you want to learn more and dig deeper certainly check out all of the TFD videos that Docker has. Also, Stephen Foskett has a great keynote that he has done – “What's the deal with containers?” – which I would certainly recommend you watch! I'm still sort of discovering all of this but plan to really invest some time in the container world come next year – there are a lot of components that I want and need to understand a bit more, such as persistent storage and networking – also, if I'm wrong or misinformed on any of this – do call me out 🙂 – that's how we all learn! Thanks for reading!
It's no surprise to anyone that storage is growing at an incredible rate – rich media, sensor devices, IoT – these are all affecting the amount of storage capacity that organizations need today and it's only going to get worse in the future! Organizations need somewhere to put this data, somewhere safe and protected, somewhere where availability is key. For most, that somewhere ends up being the cloud! Public cloud services such as Amazon S3 give us access to oodles of storage on a pay-as-you-go basis – and they remove the burden of having to manage it. SLAs are agreed upon and our data is just available when we need it! That said, public cloud simply may not be an option for a lot of companies – the businesses that simply can't, or sometimes won't, move to cloud, yet still want the agility and availability that cloud provides. These organizations tend to move to an on-premises solution – SANs and storage crammed into their own data centers – but with that comes a whole new bucket of challenges around scaling and availability…
How do we scale a SAN?
Almost all storage out there today is designed in much the same way. We have a controller of sorts, providing network and compute resources to move our data in and out of a number of drives sitting behind it. But what if that controller goes down? Well, there goes all of our infrastructure! To alleviate this we add more controllers and more disk – this seems like a pretty common storage solution today – 2 controllers, each hosting a number of shelves full of drives, with dual-path interconnects connected to the rest of our data center. In this situation if we lose a controller we don't necessarily lose access to our data, but we most certainly lose half of the bandwidth into it. So, we yet again add more controllers and more disk – sitting with 4 controllers now – at which point the back of our racks and our interconnect infrastructure is getting so complex and complicated that we will most certainly hit struggles when the time comes to scale out even more.
So what is the perfect ratio of controller to disk, or CPU to disk? How do we minimize complexity while maximizing performance? And how do we accomplish all of this within our own data center? Lower ratios such as 1 CPU for every 8 disks introduce complexity with connectivity – higher ratios such as 1 CPU for 60 disks create a huge fault domain. Is it somewhere in the middle? Igneous Systems has an answer that may surprise you!
RatioPerfect – 1:1 – Compute : Disk
Igneous presented at Tech Field Day 12 in November showcasing their managed on-premises cloud solution – it looks much like a traditional JBOD – a 4U box containing 60 drives – but under the hood things are certainly different. Igneous, calling it their RatioPerfect architecture, takes a 1:1 approach in terms of CPU to disk. Throwing out expensive Xeon CPUs and the controller methodology, RatioPerfect is essentially an army of nano servers, each equipped with its own ARM CPU, memory, and networking attached directly to each and every disk – essentially giving each disk its own controller!
These “server drives” are then crammed inside a JBOD – however instead of having dual SAS controllers within the JBOD, they are replaced by dual Ethernet switches. Each nano server then has two addressable MACs and two paths out to your infrastructure's 10GbE uplinks – you can almost picture this as a rack of infrastructure condensed down into a 4U unit, with 60 network-addressable server/storage devices sitting inside of it, and 60 individual fault domains. Don't worry – it's IPv6 – no need to free up 120 addresses!
Why the need?
To your everyday storage administrator working in a data center, you might not see the need for this – 60 fault domains – seems a little excessive, right? The thing is, Igneous is not something that's managed by your everyday storage administrator – in fact, the “human” element is something Igneous would love to eliminate entirely. Igneous set out to provide the benefits of public cloud, on premises, complete with flexible pricing and S3-compatible APIs. The sheer nature of public cloud is that we don't have to manage it – it's simply a service, right? The same goes for Igneous – all management including installation, configuration, troubleshooting, and upgrades is handled centrally by Igneous – you simply consume the storage – when you need more, you call, and another shelf shows up!
The design of Igneous's management plane is key to their success. With the “fleet” model in mind, Igneous built a management plane that proactively monitors all their deployed systems – able to contrast and compare events and metrics to detect possible failure scenarios, and relying heavily on automation to fix these issues before they are, indeed, issues. That said, no matter the amount of predictive analysis and automation, the time will come when drives physically fail – and the nano server design of Igneous, coupled with the custom-built data path deployed, allows a single Igneous box to sustain up to 8 concurrent drive failures without affecting performance – certainly buying them enough time to react to the situation. The on-premises management plane is simply a group of micro-services running on commodity x86 servers – meaning software refreshes and upgrades are a breeze, and non-disruptive at that. It's this design and architecture that allows Igneous to move fast and implement rapid code changes just as we would see within a cloud environment.
In the end Igneous certainly does contain an army of ARM processors working to bring the benefits and agility of public cloud to those who simply can't move their data to cloud due to volume, or won't due to security reasons. Yeah, it's a hardware appliance but you don't manage it – in fact, you don't even buy it – just as we “rent” cloud, the Igneous service is a true operating expense – no capital costs whatsoever. It's funny – they sell a service, essentially software and storage that you consume, but it's the hardware that left the lasting impression on me – it's not too often hardware steals the show at a Tech Field Day event. If you are interested in learning more certainly take a look at their Tech Field Day videos – they cover all of this and A LOT more! Thanks for reading!
Dreams are important! Otherwise, sleep is just 8 hours of nothing. – MacGyver
vSphere 6.5 is here!
NDAs have been lifted and the flood gates have opened in terms of bloggers around the world talking about vSphere 6.5! Now that the product has finally been declared GA we can all go ahead and download it today! Certainly the new HTML5 client, the addition of a more functional vCenter Server Appliance and built-in disk-level encryption are enough to make me want to make the jump…eventually 🙂 That said there is a lot that is new with vCenter and ESXi – you can check it all out here.
Veeam Backup & Replication 9.5 is here!
And just as VMware releases its flagship hypervisor software to the masses, Veeam follows one day later with the support and availability to back it all up in their updated release, version 9.5 of Backup & Replication. There is a lot that's new within this release – check it all out here! With Nimble support and a ton of integration with other Veeam products such as the Veeam Agents this release has a lot – but perhaps some of my favourite enhancements will be the ones that will probably not be written about the most, and that's all of the engine enhancements and things like the vSphere inventory cache. As with most Veeam releases I'm sure this one will be well adopted!
Tech Field Day 12 – that’s a wrap!
I've just returned home from Tech Field Day 12 in beautiful San Jose! I say beautiful but I was told it was cold by multiple people there – that said, I'm Canadian, and 18C in the middle of November is not my idea of cold 🙂 Either way I sat with 11 of my peers around the TFD table for a couple of days and got blasted by the fire hose. There were some great companies presenting; think Dell-EMC, DriveScale, Rubrik, Cohesity, StorageOS, Docker, and Igneous. Expect more in this space about what these guys had to talk about, but for now if you want to check out the videos you can do so – links are all over here.
Fancy some community recognition!
Ah, November – cooler air, the transition from Halloween to Christmas – and of course, a flurry of forms to fill out if you'd like to be included in any vendor/community recognition programs. Most of these things require you to nominate yourself – so swallow any pride, as you may have to get a little self-absorbed while you try to brag yourself up. Recognition programs are a funny thing – some are great, some are so-so – I'm not going to go into the details of each. If you want a great spot to see a list of them all, Jim Jones has a great post outlining everything here. And to get a glimpse into one of my favourites, the Veeam Vanguard program – check out Matt Crape's post here. I've also been invited to the KEMP VIP program – which is new to me, so expect to see more about that as well in the future.
Beetlejuice, Beetlejuice, Docker
I feel like I can't attend any presentation anymore without hearing the word Docker – honestly, containers and Docker are just popping up everywhere – with the oddest of technologies claiming they have support for Docker. So, with all this my problem is: what the eff is Docker? I've never had a use-case to even start looking at containers within my day job – therefore, I don't really have that much knowledge around the challenges and benefits of them. After seeing them at TFD I can now say that I need to explore this further – and jump on this container bandwagon to learn what they are all about. First stop, Mr. Stephen Foskett's blog, where he tells us just “What's the deal with containers“. If you are just learning, or just love containers – check it out!
Cohesity is next up in my flurry of Tech Field Day 12 previews with their secondary storage play. I just recently got to see Cohesity present as they were a sponsor at our Toronto VMUG which took place at the legendary Hockey Hall of Fame, so I guess you could say that Cohesity is the only vendor I’ve seen present in the same building as the Stanley Cup. Okay, I’ll try and get the Canadian out of me here and continue on with the post…
Who is Cohesity?
Cohesity was founded in 2013 (I'm detecting somewhat of a Tech Field Day 12 pattern here) by Mohit Aron, former CTO and co-founder of Nutanix. You can certainly see Mohit's previous experience at Google and Nutanix shining through in Cohesity's offering – complete visibility into an organization's “dark data” on their secondary storage appliance.
Cohesity's appliance in itself doesn't claim to be a primary storage array – they aim at the secondary storage market. Think of non-mission-critical data – data such as backups, file shares and test/dev copies. All this data is a perfect fit for a Cohesity appliance. How this data gets there, and what we can do with it, all lies within Cohesity's DataProtect and DataPlatform software!
DataProtect and DataPlatform
For the most part, the onboarding of all this data onto their appliance is done through backups – Cohesity's DataProtect platform to be more specific. DataProtect seamlessly integrates into your vSphere environment and begins to back up your infrastructure using a set of predefined and custom policies, or SLAs if you will. Policies are set up to define things such as RPO – how often we want to back up – as well as retention policies for archival – e.g., backups over 30 days old shall be archived to Azure/Amazon/Google.
Once the data resides within Cohesity's appliance, another technology, DataPlatform, takes over – DataPlatform provides a Google-esque search across all the data, be it on premises or archived to cloud. Here is where we can do some risk management, searching for patterns such as credit card numbers or social insurance numbers. DataPlatform also allows us to leverage our backups for items such as test/dev, creating a complete copy of our environments very quickly – isolated from our actual production networks.
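Cohesity hasn't detailed how their matching works, but the kind of pattern search described above can be sketched with a regex for card-shaped digit runs plus the standard Luhn checksum to weed out false positives. Everything here is illustrative, not DataPlatform internals:

```python
# Toy version of risk-management pattern search: find 16-digit,
# card-shaped numbers in text and keep only those that pass the
# Luhn checksum (the check digit scheme real card numbers use).
import re

def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum used to validate card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    # Four groups of four digits, optionally separated by space or dash.
    hits = re.findall(r"\b(?:\d{4}[ -]?){3}\d{4}\b", text)
    return [h for h in hits if luhn_ok(re.sub(r"[ -]", "", h))]

doc = "invoice 1234, card 4111 1111 1111 1111, order 9999-0000-1111-2222"
print(find_card_numbers(doc))  # → ['4111 1111 1111 1111']
```

Only the well-known Visa test number survives: the order number is 16 digits but fails the checksum, which is exactly the kind of filtering that keeps a compliance search from drowning in noise.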
With the release of 3.0, we have also seen physical Windows and Linux support added into the platform – so just as we protect our VMs, we can protect our physical servers, along with the applications such as SQL/Exchange/SharePoint that are running on them.
With a Best of VMworld 2016 award under their belts I'm pretty excited to go deeper into Cohesity – and expect to hear a lot more as to what their next steps might be! Stay up to date on Cohesity and all things mwpreston/Tech Field Day by watching my page here – and see all there is to know about Tech Field Day 12 on the main landing page here! Thanks for reading and see yah in November 🙂
Next in the long list of previews for Tech Field Day 12 is DellEMC – you know, that small company previously known as EMC that provides a slew of products primarily based on storage, backup, cloud and security. Yeah, well, apparently 67 billion dollars and the largest acquisition in the tech industry ever allows you to throw Dell in front of their name 🙂 November 16th will be DellEMC's first Tech Field Day presentation under the actual DellEMC name – split out, we have seen Dell at 7 events and EMC at 5 events. So let's call this their first rather than combining them both for that dreaded number 13…
We all got a look at just what these two companies look like combined, as the newly minted DellEMC World just wrapped up! We saw a number of announcements around how things will play out now that these two companies are sharing the same playground, summarized as best I can as follows…
- Hyper-converged – big announcements around how PowerEdge servers will now be a flavor of choice for VxRail/VxRack deployments. Certainly this brings an element of choice, in terms of the customization of performance and capacity provided by Dell, to the hyper-converged solution once provided by EMC. The same goes for VxRail's big brother, VxRack.
- DataDomain – the former EMC backup storage solution will also be available on DellEMC PowerEdge servers. What was once a hardware appliance is now a piece of software bundled on top of your favourite PowerEdge servers. On top of that, some updates allowing data to be archived to cloud, and multi-tenancy for service providers.
- Updates to the Isilon series, including a new All Flash version being added to the scale-out NAS system.
Dell has not been shy as of late about making BIG moves – going private, then buying out EMC. Certainly this transition is far from over – there is a lot of change that still has to take place in order to really merge the two companies together. From the outside things appear on the upside (except for the fact that I'm getting a ton of calls from both companies looking to explain everything now), however there are still many unanswered questions as to what will happen with overlapping product lines… From the inside I can't really say – I have no idea – all I know is I'm sure it's not an easy thing for anyone when you take 70,000 EMC employees and throw them in with Dell's 100,000+ – there will definitely be some growing pains there…
Only time will tell how DellEMC changes the story, if at all at Tech Field Day 12. DellEMC are up first thing on November 16th – follow along with the live-stream, keep up with all things mwpreston @ Tech Field Day 12 here, and stay tuned on the official landing page for more info! This is destined to be a good one! Thanks for reading!