
VMworld 2012: They did WHAT?!?

[DISCLAIMER] This is about the future – everything here is being looked at / worked on, but there is no guarantee of if or when this capability will become available. This does not impact any existing support statements. So there, you have been warned. On to the coolness…

When one of my prior posts talked about VM granular storage, this was what I could not talk about. But, now that the curtains are off at Barcelona, I am able to post this. Chad has posted the demo on his blog here (“VMworld 2012 – Psst… Want to see the future of storage with VMware and EMC?”).

Here is the demo itself.

What EMC and VMware demonstrated at VMworld Barcelona is a proof of concept displaying virtual machines moving non-disruptively across asynchronous latencies and under load, using VM granular storage from VMware and VPLEX Geo from EMC.

This demo won the partner demo challenge in the Steve Herrod keynote:

(Chad presented this at the 47-minute mark – thank you to all those who voted!)

Can’t I do this today? aka What’s the big deal?

In a word, NO. You can move VMs from one side to another with VPLEX Metro (synchronous latencies now up to 10 msec). However, going asynchronous is a whole different ball of wax. Why is that? (By the way, this is a topic of discussion in ~100% of my VPLEX Geo conversations so this post is long overdue).

The answer lies in the interaction between vmfs and the asynchronous behavior of VPLEX Geo.

When vmfs was originally designed, it was a file system expecting disk attached to a server. It was then extended to storage coming from the SAN. Then a technology like VPLEX Metro extended vmfs across data centers. However, the common thread running through all of this is that the disk underlying vmfs is ‘synchronous’. In other words, when a write is issued from the host, by the time success is returned to the host, the write is on the media (yes, I understand that it is in the cache on the array, but it is ‘on the box’ and will be on media should a failure happen).

This paradigm breaks when you go to disks that are asynchronously replicated. In this case, the big difference is that when a write is acknowledged on one side, the peer (asynchronously replicated) leg(s) of the disk will not have access to the data until the write is flushed from one side to the other. This should have made active / active on asynchronous disks impossible (after all, you should not be able to maintain a single consistent disk image and read on the second side the data that you just wrote on the first side until the flush has completed).
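To make the difference concrete, here is a minimal toy sketch (my own illustration, not EMC or VMware code; all names are made up) contrasting the two write paths: with synchronous replication the host is acknowledged only after both legs have the data, while with asynchronous replication the acknowledgement comes back before the peer leg has caught up.

```python
from collections import deque


class Site:
    """A toy model of one storage leg: a block address -> data map."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}


def write_synchronous(local: Site, remote: Site, lba: int, data: bytes) -> str:
    # Metro style: the write lands on BOTH legs before the host sees success,
    # so either leg can serve a consistent read afterwards.
    local.blocks[lba] = data
    remote.blocks[lba] = data
    return "ACK"   # acknowledged only after both legs have the data


def write_asynchronous(local: Site, remote: Site, lba: int, data: bytes,
                       flush_queue: deque) -> str:
    # Geo style: acknowledge as soon as the local leg has the data and queue
    # the update for a later, write-order-consistent flush to the peer.
    local.blocks[lba] = data
    flush_queue.append((lba, data))   # the remote leg is behind until flushed
    return "ACK"


def flush(remote: Site, flush_queue: deque) -> None:
    # Deliver queued writes in order; until this runs, a read of the remote
    # leg's disk returns stale data: exactly the gap described above.
    while flush_queue:
        lba, data = flush_queue.popleft()
        remote.blocks[lba] = data


if __name__ == "__main__":
    a, b = Site("site-1"), Site("site-2")
    q = deque()
    write_asynchronous(a, b, lba=7, data=b"new", flush_queue=q)
    print(b.blocks.get(7))   # None: acknowledged at site-1, not yet at site-2
    flush(b, q)
    print(b.blocks.get(7))   # b'new' only after the flush
```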

VPLEX Geo solves this by creating an intelligent distributed multi-site coherent cache (AccessAnywhere™) which is able to fetch the most current data even if the underlying disk is asynchronous. The data on the disk can come later (with the real flush of the data from site 1) while maintaining write order consistency.
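Here is a small sketch of the general idea of a distributed coherent cache (my own simplification, not the AccessAnywhere implementation): a shared directory records which site's cache owns the most recent copy of each block, so a read can be served with current data even while the backing disks are still catching up.

```python
class CacheSite:
    def __init__(self, name, disk):
        self.name = name
        self.disk = disk        # possibly-stale backing store
        self.cache = {}         # blocks this site has written recently


def coherent_write(lba, data, writer: CacheSite, owner_directory: dict):
    writer.cache[lba] = data        # data lands in the writer's cache
    owner_directory[lba] = writer   # the directory now points at the writer
    # the flush to both sites' disks happens later, in write order


def coherent_read(lba, reader: CacheSite, peer: CacheSite, owner_directory: dict):
    """Return the most recent data for `lba`, no matter which site wrote it."""
    owner = owner_directory.get(lba)
    if owner is reader and lba in reader.cache:
        return reader.cache[lba]    # we wrote it last: serve it locally
    if owner is peer and lba in peer.cache:
        return peer.cache[lba]      # fetch the current copy from the peer's cache
    return reader.disk.get(lba)     # no dirty copy anywhere: the disk is current


if __name__ == "__main__":
    directory = {}
    site1 = CacheSite("site-1", disk={7: b"old"})
    site2 = CacheSite("site-2", disk={7: b"old"})
    coherent_write(7, b"new", site1, directory)
    # Site 2's disk still says "old", but a coherent read returns "new":
    print(coherent_read(7, site2, site1, directory))
```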

With me so far?

The problem happens when there are failures in this scenario (either a site goes down or sites partition). Now, the ESX Cluster on the second side is expecting data on the disk to match what was acknowledged (i.e., synchronous) but the underlying disk data has not reached the second site (i.e., asynchronous). This risk is what caused both EMC and VMware to back away from supporting the combination of vSphere and VPLEX Geo.

A second layer to the problem

If you imagine VMs working with shared storage and now stretch that across data centers over asynchronous latencies, one potential way you could solve the above problem is by having knowledge of which VM is accessing which portions of the data (you can already see VM granular concepts starting to creep in here). If one is able to make that determination, you can now allow the partition scenario to play out in very interesting ways. So long as you ensure that the data remains current for a given VM on the side where it is active, you have the inside track to avoiding the situation above.
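A minimal sketch of that idea, purely my own illustration: if the storage knows which site each VM is active on, then when a partition happens each side only has to suspend the VMs whose current data lives on the other side, rather than treating the whole distributed volume as one all-or-nothing unit.

```python
from enum import Enum


class Verdict(Enum):
    CONTINUE = "keep serving I/O"
    SUSPEND = "suspend until the partition heals"


def partition_verdicts(local_site: str, vm_active_site: dict) -> dict:
    """Decide, per VM, what this site does when it loses contact with its peer."""
    verdicts = {}
    for vm, active_site in vm_active_site.items():
        if active_site == local_site:
            # This VM's freshest data was written here, so it can keep running.
            verdicts[vm] = Verdict.CONTINUE
        else:
            # The peer holds writes we have not received yet; don't serve stale blocks.
            verdicts[vm] = Verdict.SUSPEND
    return verdicts


if __name__ == "__main__":
    activity = {"vm-web": "site-1", "vm-db": "site-2", "vm-batch": "site-1"}
    for vm, verdict in partition_verdicts("site-1", activity).items():
        print(f"{vm}: {verdict.value}")
```

Without that per-VM knowledge (which is where things stand today, as the next paragraph describes), there is only one verdict for the entire distributed volume.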

As it stands (in the world of the here and now), vmfs and VMware HA use heartbeat timeouts to help determine the health of the vmdk (even when VMs might not be active on the ESX server). Switching to the VPLEX perspective, those heartbeats make it appear to the VPLEX Geo instance as if both sides are writing, and therefore both sides of the VPLEX Geo instance are active. Furthermore, the VM boundaries are not known at the storage layer. This prevents the storage from doing anything intelligent with the writes it receives.

Bottom line: when site failures or partitions happen, the impact cannot be limited to the VMs on the failing side (in a site failure scenario) or to the VMs on the non-preferred side (as would be the case with VPLEX Metro, for instance). Rather, all VMs are impacted.

Okay, I get it – VPLEX Geo is not supported with VMware. What are you doing to fix that?

That is probably the immediate follow-on question after the details above are reluctantly accepted by most customers I interact with. As you can imagine, prior to VMworld 2012 Barcelona, a lot of the answer was ‘yes, we are working on it’. But now that VMware has gone public with VM granular storage as a tech preview, partners such as EMC can be a bit more open about what we are cooking.

Both VMware and EMC recognized this gap a while back. A team of product managers, architects and developers from both companies has been working very closely over the last two years vetting the use cases, understanding the potential technical options and, finally, figuring out what is needed to bring this solution to market. (To all the customers and partners who participated in giving us input and answering our annoying questions and ‘what if’ scenarios, THANK YOU!)

The solution is built on the VM granular storage infrastructure, which was designed to resolve other problems with a similar symptom (i.e., an impedance mismatch between a LUN and the storage needed by a VM). Spelling out where a VM lies via vvols allows VPLEX Geo to understand where a particular VM is active. Even if the volume is distributed, since the vvol will be uniquely used by a particular VM, only one side of the vvol will continue to be accessed. As a Geo vMotion gets initiated, VPLEX Geo can start to optimize the availability of the complete data on the disk on the other side. What this means is that a vvol-based solution for Geo vMotion is no longer subject to the failure conditions described above. Before the engineers jump all over this post – naturally, I am oversimplifying. There is a TON of work that needs to happen on both the VMware and EMC sides to deliver this.
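To make the oversimplified version even more concrete, here is a hypothetical sketch (class and method names are mine, not Project Baltimore code) of the per-vvol optimization: because a vvol belongs to exactly one VM, a Geo vMotion can trigger a priority flush of just that vvol's outstanding data so the destination side is complete by the time the VM cuts over.

```python
class GeoDistributedVvol:
    def __init__(self, vm_name, active_site):
        self.vm_name = vm_name
        self.active_site = active_site
        self.dirty_blocks = {}          # written locally, not yet on the peer's disk

    def write(self, lba, data):
        self.dirty_blocks[lba] = data   # acknowledged locally, flushed to the peer later

    def begin_geo_vmotion(self, destination_site, flush):
        # Drain this one vvol's dirty data ahead of everything else so the
        # destination has complete data when the VM is about to resume there.
        for lba, data in sorted(self.dirty_blocks.items()):
            flush(destination_site, lba, data)
        self.dirty_blocks.clear()
        # Only one side accesses a vvol at a time, so ownership simply moves.
        self.active_site = destination_site


if __name__ == "__main__":
    sent = []
    vvol = GeoDistributedVvol("vm-web", active_site="site-1")
    vvol.write(10, b"a")
    vvol.write(11, b"b")
    vvol.begin_geo_vmotion("site-2", flush=lambda site, lba, d: sent.append((site, lba)))
    print(sent, vvol.active_site)   # all dirty blocks pushed to site-2 before cutover
```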

The coolest part of the demo, for me, is decidedly the least ‘sexy’ part of it. If you have used vMotion before, doing this over Geo latencies is pretty underwhelming. You do EXACTLY what you did before: you right-click and migrate the VM, and underneath, vSphere and VPLEX Geo weave their magic and the VM is transported live to the remote side. Good stuff!

Finally, a BIG shout out to the vMotion and vvol team at VMware (Jennifer, Patrick, Haripriya, Gabe and the rest) and the VPLEX Project Baltimore team at EMC (Mike, Brad, Roel, Ranjit, Bill, Amir, Brian, Kurt, Thomas, Justin, Rob and several others). Great job guys in being able to pull the demo off!

As one of the VMware PMs remarked at VMworld, ‘If and when this GAs, it will be awesome!’ 😉

VMworld 2012: VM granular storage aka vvols – Bridging the gap between VMs and storage

The second session at VMworld that, in my mind, introduced another game changer was the tech preview on VM granular storage.

This post is a circuitous one that starts with a problem (especially if you are an ESX administrator who prefers to deal with block storage). Let me set this up a bit.

You love what VMware is offering in terms of features and function. You are leading the crusade towards more and more virtualization. Heck, some days it even feels like you are winning the battle. Through your evangelization you have been able to get your organization to adopt a `virtual first` stance (in other words, as new applications are provisioned, they are deployed on virtual instances by default). The burden of proof is on the application admin to explain why they would like to deploy in a non-virtualized environment. You have been able to drive tremendous efficiency within your organization and now have an environment that is 60+% virtualized. So where is the problem?

When you reach this level of virtualization, an interesting thing happens. You start to think of your basic unit of operation as a virtual machine. Since you are now at a common platform across the majority of your data center, you would like to operate at that level since it introduces a lot of operational efficiency for you. It works great until you think about your storage and network. In other words, of the three pillars of your data center infrastructure, you have solved compute and are now raring to solve the other two.

If this describes you, then the rest of the post is for you (at least for the storage piece anyways).

Let me next walk through your provisioning process today. If you need to allocate a virtual machine, the first task is to check whether there is space available on your datastore. Let us assume there isn’t. You then have to work with the application admin to figure out what their application characteristics are and what SLA levels they require for this app. Next you go to the storage admin and start to translate those into LUNs, pools and the ESX servers that need access to them. Deprovisioning is a similar story. Multiply this by the 500+ VMs (often many more) that you are provisioning and deprovisioning in the course of a year. Doesn’t sound like fun, does it? Other things that are problematic in this world view:
1. There is an impedance mismatch between the storage characteristics and the VM requirements. Think about operations that create some sort of copy of the datastore for a VM. Unless you do it as a clone of the VM itself, you are creating a copy of the entire LUN.
2. Application requirements are more dynamic than a one time allocation. That change does not get expressed without another round of coordination between the application admin, you and the storage admin.

So how do you solve this? In comes VM granular storage. Imagine a world in which the VM was directly expressed to the storage. So, the application admin says `Create a virtual machine with 200 GB of storage, with permission to grow up to 500 GB, provisioned on silver storage`. In your organization, silver translates to 5% flash, 45% fibre channel and 50% SATA, with a local snap taken once every two hours and DR protection with an RPO of 8 hours. Based on this, there is a chargeback to the line of business. This (through the magic of VMware and storage platforms) is passed directly to the storage, which then allocates the corresponding capacity. As the application requirements change, those characteristics are passed through to the storage layer and the adjustment is automatic. When the VM is deprovisioned, the storage is AUTOMATICALLY freed up and returned back to the pool. Who would have thunk that?
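Here is a minimal sketch of that flow. The profile contents come straight from the silver example above; the function names and fields are hypothetical, not any vendor's actual API, and simply show how a per-VM request plus a named profile could translate automatically into an allocation (and back into the pool on deprovisioning).

```python
SILVER_PROFILE = {
    "tiers": {"flash": 0.05, "fibre_channel": 0.45, "sata": 0.50},
    "local_snap_interval_hours": 2,
    "dr_rpo_hours": 8,
}


def provision_vm_storage(vm_name: str, size_gb: int, max_size_gb: int, profile: dict) -> dict:
    """Translate an application admin's request into a per-VM storage allocation."""
    return {
        "vm": vm_name,
        "size_gb": size_gb,
        "grow_limit_gb": max_size_gb,
        # The tier split is derived from the profile instead of a manual
        # conversation between the app admin, the VM admin and the storage admin.
        "tier_gb": {tier: round(size_gb * pct, 1) for tier, pct in profile["tiers"].items()},
        "snap_every_hours": profile["local_snap_interval_hours"],
        "dr_rpo_hours": profile["dr_rpo_hours"],
    }


def deprovision_vm_storage(allocation: dict, free_pool_gb: float) -> float:
    # When the VM goes away, its storage is automatically returned to the pool.
    return free_pool_gb + allocation["size_gb"]


if __name__ == "__main__":
    alloc = provision_vm_storage("app-42", size_gb=200, max_size_gb=500, profile=SILVER_PROFILE)
    print(alloc["tier_gb"])   # {'flash': 10.0, 'fibre_channel': 90.0, 'sata': 100.0}
    print(deprovision_vm_storage(alloc, free_pool_gb=1000))   # 1200.0
```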

So how can this be achieved? Last year, VMware introduced the vStorage APIs for Storage Awareness (VASA). This is an out-of-band management interface that creates a standard protocol through which vCenter can talk to the storage platform. This layer is being expanded to allow precisely the kind of communication that I described here. On the storage side, what this allows us to do is to realize what the bounds of the VM are (without truly changing the SCSI characteristics of the platform). Being aware of the bounds of the VM allows the services of the storage platform to be provided at the level of a VM. The snap or clone is no longer at the level of a LUN; it is at the level of a VM. In case you are wondering, the same paradigm applies to other services – backup, encryption, dedupe, etc. Have I whetted your appetite? Want to learn more? Here are ways that you can (a small sketch of the per-VM snapshot idea follows the list):

  1. Session from VMworld 2011: Link here
  2. Attend session INF-STO2223 at VMworld 2012: Link here
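As promised, here is a small sketch of the VM-level services point: once the storage knows which objects make up which VM (reported out of band, in the spirit of VASA), a snapshot can cover just that VM's objects instead of the whole LUN. All the names here are illustrative; this is not the VASA API.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    # LUN -> set of vvol identifiers stored on it
    luns: dict = field(default_factory=dict)
    # vvol identifier -> owning VM, reported out of band by the vCenter side
    vvol_owner: dict = field(default_factory=dict)

    def snapshot_lun(self, lun: str) -> list:
        """LUN-granular world: everything on the LUN gets copied, related or not."""
        return sorted(self.luns[lun])

    def snapshot_vm(self, vm: str) -> list:
        """VM-granular world: only the objects belonging to this VM get copied."""
        return sorted(v for v, owner in self.vvol_owner.items() if owner == vm)


if __name__ == "__main__":
    array = Array(
        luns={"lun-1": {"vvol-a", "vvol-b", "vvol-c"}},
        vvol_owner={"vvol-a": "vm-web", "vvol-b": "vm-web", "vvol-c": "vm-db"},
    )
    print(array.snapshot_lun("lun-1"))   # ['vvol-a', 'vvol-b', 'vvol-c']  (everything)
    print(array.snapshot_vm("vm-web"))   # ['vvol-a', 'vvol-b']            (just the VM)
```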

What do you think? Does this solve problems you are encountering? What does implementing something like this mean for you?

VMworld 2012: Software defined Storage Technology aka VMware Distributed Storage

Another fascinating VMworld closed out last week. As always, I met a ton of customers, partners and field folks. To all those that took the time to stop by and say hi, thank you! It was amazing making new friends and touching base with old friends once again. A lot of bloggers have done a thorough job of capturing a summary of VMworld (Duncan Epping captures it here, Josh Coen captures it here, Stu Miniman captures it here). I am going to focus instead on three sessions that piqued my interest, were well delivered and in my opinion, have the potential to be game changing. (By the way, you can e-attend the top 10 VMworld sessions here. None of the sessions I am going to talk about are up there FWIW).

A disclaimer: The descriptions below are a combination of what was presented plus my take. Where it is the latter, I will try my best to call it out. As the official sessions / presentations become available, always use those as your official reference.

The first of these was a Tech Preview titled ‘Software defined Storage Technology’. As the speakers disclosed, the internal VMware name for this is ‘VSAN’. TO BE CLEAR, this is a TECH PREVIEW with no guarantee of if and when this will become available.

In case you have been hiding under a rock, the big focus at this year’s VMworld is around the software-defined data center. The whole idea of looking at all aspects within a data center and offering them within the software stack is a very powerful concept. In my opinion, VSAN seems to be a core part of that strategy.

The basic idea is this – you have a bunch of servers with disks connected to them (or not). You can now create shared storage across these servers using that direct-attached storage. Pretty neat stuff!

In effect, this creates virtual shared storage out of direct-attached disks for your VMware environment. In my opinion, depending on the performance you need, all classic virtual storage and shared storage use cases are fair game. VSAN is, in effect, creating another tier of virtual storage.

What are the problems in today’s data center?

There is a basic gap between how customers provision for VMs and how storage is provisioned underneath (by the way, you will see this loss-of-fidelity-in-provisioning theme repeated in a subsequent post about VM granular storage, or vvols). Storage provisioning is inconsistent and results in waste, with storage often being provisioned at a higher tier than the VM needs (to guarantee operational SLAs).

A second trend that is becoming compelling is that servers can be loaded with flash storage. In other words, some really fast storage is now available right next to the CPUs.

Lastly, with server proliferation, there is a fair bit of storage that is now attached directly to the server (DAS). So the question becomes: what if vSphere was able to virtualize the disks in the servers and make them appear as a common datastore to a cluster of hosts?

If you have ever asked yourself that question then VMware Distributed Storage is looking to provide an answer.
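Here is a toy sketch of that ‘what if’ (my own illustration only, not the VSAN design): take the local disks on each host in a cluster and expose them as one pooled datastore whose capacity grows as hosts with disks join, while diskless hosts still get to consume the pool.

```python
class Host:
    def __init__(self, name, local_disks_gb):
        self.name = name
        self.local_disks_gb = list(local_disks_gb)   # may be empty for "thin" hosts


class ClusterDatastore:
    def __init__(self):
        self.hosts = []

    def add_host(self, host: Host):
        # Any host can join the cluster; only hosts that bring disks add capacity.
        self.hosts.append(host)

    def capacity_gb(self) -> int:
        return sum(sum(h.local_disks_gb) for h in self.hosts)


if __name__ == "__main__":
    ds = ClusterDatastore()
    ds.add_host(Host("esx-01", [600, 600]))
    ds.add_host(Host("esx-02", [900]))
    ds.add_host(Host("esx-03", []))   # diskless host still sees the shared pool
    print(ds.capacity_gb())           # 2100: grows as servers with disks join
```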

What does this help you do?

So what would you be able to do with this that you previously could not, and what use cases is VMware targeting with this technology?

At an abstract level, VSAN is converting a set of servers into a converged platform. The layer that ‘connects’ or more correctly ‘clusters’ the storage is directly built into the vmkernel. From there, stuff gets even cooler.

You can now have hosts with no local storage participating in the Distributed Storage cluster. In other words, you can have thin servers connecting to a pool of storage. And now you can see why this is referred to as VSAN. It is creating a software-defined SAN by making the storage from a cluster of servers appear to be shared storage (dare I say federation at the DAS level?).

As you add servers with storage to the cluster, the shared datastore grows. This allows you to grow dynamically as your needs change over time.

One of the big pushes that VMware is making is policy-based management. The basic idea is that VM allocation, all the way through the stack, should be driven by the requirements placed on the infrastructure. From a storage standpoint, examples of policies would be capacity, latency, IOPS, availability, level of disaster recovery and operational recovery. Each of these policies can be rolled into profile templates (the classic gold, silver, bronze). When a VM is created, you have to specify the profile. Based on the profile, storage can then be allocated. The VM and the profile are monitored for compliance with the SLA. If it does not meet the SLA, changes can be made to bring the VM back into compliance. (By the way, in case you are wondering, this is similar to the approach being thought about with VM granular storage.)
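A minimal sketch of what profile-driven compliance could look like; the policy fields, thresholds and remediation actions here are illustrative assumptions of mine, not VMware's actual policy model.

```python
GOLD = {"max_latency_ms": 5, "min_iops": 10000, "failures_to_tolerate": 1}
BRONZE = {"max_latency_ms": 30, "min_iops": 500, "failures_to_tolerate": 0}


def is_compliant(observed: dict, profile: dict) -> bool:
    # A VM is compliant if its observed behavior meets every policy in its profile.
    return (observed["latency_ms"] <= profile["max_latency_ms"]
            and observed["iops"] >= profile["min_iops"]
            and observed["replicas"] >= profile["failures_to_tolerate"] + 1)


def remediate(vm: str, observed: dict, profile: dict) -> list:
    """Return the actions needed to bring a VM back within its profile."""
    actions = []
    if observed["replicas"] < profile["failures_to_tolerate"] + 1:
        actions.append(f"{vm}: create an additional replica of its storage objects")
    if observed["latency_ms"] > profile["max_latency_ms"]:
        actions.append(f"{vm}: move hot data to a faster tier or another host")
    return actions


if __name__ == "__main__":
    observed = {"latency_ms": 12, "iops": 15000, "replicas": 1}
    print(is_compliant(observed, GOLD))         # False: latency and replicas miss the SLA
    print(remediate("vm-web", observed, GOLD))  # two remediation actions
```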

From this point on, VMware Distributed Storage behaves identically to regular storage. And that sounds like a ‘meh’ until you realize that this means things needing shared storage (VMware HA, normal vMotion (without the enhanced bit in vSphere 5.1), DRS, Storage I/O Control, sDRS) should ‘just work’ (translation: there is tons of engineering work behind the scenes to make this happen). An example: if a component server fails, then the VMs on that host (and the associated storage) will be subject to restart based on the original policies specified.

Use cases being targeted

Here are the four use cases VMware said they were targeting.

  1. VDI deployments
  2. Reducing TCO and provisioning time in test and dev environments
  3. To enable high bandwidth / scaleout big data environments
  4. As a target of DR activities to reduce HW requirements at target site using vSphere replication

My take

Simply put, this has huge potential. To a large extent, it removes the barrier between DAS and SAN and creates a uniform way to address storage no matter where it is located. By creating this DAS tier, VMware is giving more customers the ability to use the capabilities provided by shared storage. I also see this as being a stepping stone to allow customers to move seamlessly between DAS storage and SAN storage. For example, a customer could start out using DR at the DAS tier and over time (for various reasons) move to a SAN tier DR or even vice versa. If you now combine this with the promise of VM granular storage, VMware is creating a very fluid end-to-end storage environment independent of where the storage is.

And that brings me back full circle to where we started. Virtualization is changing the bounds of the traditional stack. With VSAN, VMware is blurring the boundary between DAS and SAN. One of the big changes happening right in front of our eyes is that the traditional application, compute, network and storage stack is collapsing, becoming more seamless, independent of where the resources are located. Interestingly, VCE had a badge that they handed out at this VMworld.

Frankly, nothing says it better. It truly does eventually start to look like one collapsed stack, be it one server, a cluster of servers, converged infrastructure (Vblocks, FlexPods and their ilk) or even data centers (think VPLEX). And to me, the possibilities with that paradigm are exciting and numerous!

So what do you think? Game changing or not?

[If you are attending VMworld Barcelona, I would definitely recommend attending this session. It is being offered on Oct 10. Here is the link].

UPDATE: 09/29/2012: Here are some other blog posts related to this subject that might be illuminating: