VMworld 2012: VM granular storage aka vvols – Bridging the gap between VMs and storage

The second session at VMworld that in my mind introduced another game changer was the tech preview session introducing VM granular storage.

This post is a circuitous one that starts with a problem (especially if you are an ESX administrator who prefers to deal with block storage). Let me set this up a bit.

You love what VMware is offering in terms of features and function. You are leading the crusade towards more and more virtualization. Heck, some days it even feels like you are winning the battle. Through your evangelization you have been able to get your organization to adopt a `virtual first` stance (in other words, as new applications are provisioned, they are deployed on virtual instances by default). The burden of proof is on the application admin to justify why they would like to deploy on a non-virtualized environment. You have been able to drive tremendous efficiency within your organization and now have an environment that is 60+% virtualized. So where is the problem?

When you reach this level of virtualization, an interesting thing happens. You start to think of your basic unit of operation as a virtual machine. Since you are now at a common platform across the majority of your data center, you would like to operate at that level since it introduces a lot of operational efficiency for you. It works great until you think about your storage and network. In other words, of the three pillars of your data center infrastructure, you have solved compute and are now raring to solve the other two.

If this describes you, then the rest of the post is for you (at least for the storage piece anyways).

Let me next walk through your provisioning process today. If you need to allocate a virtual machine, the first task is to check whether there is space available on your datastore. Let us assume there isn’t. You then have to work with the application admin to figure out what their application characteristics are and what SLA levels they require for this app. Next you go to the storage admin and start to translate those into LUNs, pools and the ESX servers that need access to them. Deprovisioning is a similar story. Multiply this by the 500+ VMs (often far more) that you are provisioning and deprovisioning in the course of a year. Doesn’t sound like fun, does it? Other things that are problematic in this world view:
1. There is an impedance mismatch between the storage characteristics and the VM requirements. Think about an operation that creates some sort of copy of the datastore for a VM. Unless you do it as a clone from the VM, you are creating a copy of the entire LUN.
2. Application requirements are more dynamic than a one-time allocation. A change in those requirements does not get expressed without another round of coordination between the application admin, you and the storage admin.

So how do you solve this? In comes VM granular storage. Imagine a world in which the VM was directly expressed to the storage. So, the application admin says `Create a virtual machine with 200 GB of storage with permission to grow up to 500 GB and provisioned on silver storage`. In your organization, silver translates to 5% flash, 45% fibre channel and 50% SATA, with a local snap taken once every two hours and DR protection with an RPO of 8 hours. Based on this, there is a chargeback to the line of business. This request (through the magic of VMware and storage platforms) is passed directly on to the storage, which then allocates the corresponding capacity. As the application requirements change, those characteristics are passed through to the storage layer and the adjustment is automatic. When the VM is deprovisioned, the storage is AUTOMATICALLY freed up and returned to the pool. Who would have thunk that?
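To make the idea a bit more concrete, here is a minimal sketch of that request in Python. Everything in it is hypothetical and for illustration only: `StoragePolicy`, `StoragePool` and their methods are names I made up, not part of any VMware or storage vendor API, and the 1000 GB pool size is an arbitrary number. It only models the bookkeeping: a policy describing the silver tier, a VM provisioned with a starting size and a growth limit, and capacity flowing back to the pool when the VM goes away.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical service tier; the field names are illustrative only."""
    name: str
    tier_mix: dict                # media type -> fraction of capacity
    snapshot_interval_hours: int  # how often a local snap is taken
    dr_rpo_hours: int             # recovery point objective for DR


# The "silver" tier as described in the post.
SILVER = StoragePolicy(
    name="silver",
    tier_mix={"flash": 0.05, "fibre_channel": 0.45, "sata": 0.50},
    snapshot_interval_hours=2,
    dr_rpo_hours=8,
)


@dataclass
class StoragePool:
    """Toy stand-in for an array that tracks capacity per VM, not per LUN."""
    capacity_gb: int
    # VM name -> (current size in GB, growth limit in GB, policy)
    allocations: dict = field(default_factory=dict)

    @property
    def free_gb(self):
        used = sum(size for size, _, _ in self.allocations.values())
        return self.capacity_gb - used

    def provision(self, vm, size_gb, max_gb, policy):
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity")
        self.allocations[vm] = (size_gb, max_gb, policy)

    def grow(self, vm, new_size_gb):
        size_gb, max_gb, policy = self.allocations[vm]
        if new_size_gb > max_gb:
            raise ValueError("request exceeds the VM's growth limit")
        if new_size_gb - size_gb > self.free_gb:
            raise ValueError("not enough free capacity")
        self.allocations[vm] = (new_size_gb, max_gb, policy)

    def deprovision(self, vm):
        # Removing the VM returns its capacity to the pool automatically.
        del self.allocations[vm]


if __name__ == "__main__":
    pool = StoragePool(capacity_gb=1000)          # arbitrary pool size
    pool.provision("app01", 200, 500, SILVER)     # 200 GB now, up to 500 GB
    print(pool.free_gb)
    pool.grow("app01", 350)                       # requirements changed
    print(pool.free_gb)
    pool.deprovision("app01")                     # capacity comes back
    print(pool.free_gb)
```

The point of the sketch is that the VM, not the LUN, is the key in the allocation table, so growth limits and the service tier travel with the VM for its whole life.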

So how can this be achieved? Last year, VMware introduced the vStorage APIs for Storage Awareness (VASA). This is an out-of-band management interface which creates a standard protocol through which vCenter can talk to the storage platform. This layer is being expanded to allow precisely the kind of communication that I described here. On the storage side, what this allows us to do is to realize what the bounds of the VM are (without truly changing the SCSI characteristics of the platform). Being aware of the bounds of the VM allows the services that the storage platform provides to be delivered at the level of a VM. The snap or clone is no longer at the level of a LUN, it is at the level of a VM. In case you are wondering, the same paradigm applies to other services – backup, encryption, dedupe, etc. Have I whetted your appetite? Want to learn more – here are ways that you can:

  1. Session from VMworld 2011: Link here
  2. Attend session INF-STO2223 at VMworld 2012: Link here

What do you think? Does this solve problems you are encountering? What does implementing something like this mean for you?
