
How To: Expand VPLEX Virtual Volumes

One of the important yet lesser-known additions to the VPLEX 5.2 release was the ability to expand virtual volumes non-disruptively.

Actually, let me step back a bit. VPLEX has always been able to expand local-only (i.e. non-distributed) virtual volumes. We accomplished this by concatenating another device segment onto the virtual volume. In GeoSynchrony 5.2, we introduced a newer method that complements the existing one, and in general this newer method is now our preferred way to expand virtual volumes.

Why the new method?

To understand why we embarked on developing a new virtual volume expansion method, you need to understand the limitations of the concatenation approach:

  1. Only local volumes can be expanded using concatenated volume expansion. For distributed volumes, there was no convenient way to expand the virtual volume: you would have to break the distributed volume, expand the underlying device, and rejoin the distributed virtual volume, incurring a complete resync.
  2. If you are using array-based local replication functionality (aka snaps and clones; look at ViPR with VPLEX for an example of how you can do this), then once you concatenate different volumes to create a larger volume, those local replicas are no longer useful.
  3. Concatenating volumes (especially volumes coming from different arrays or from different performance tiers) is generally not a good practice from a performance standpoint, for two reasons. First, the two portions of the same volume can have different performance characteristics. Second, I/Os that cross the volume boundary have to be broken into multiple smaller parts, which usually (the 80:20 rule in play here) leads to poorer relative performance.

Alright – tell me more about the new method

We call this the storage-volume based virtual volume expansion method. Given the constraints established above, preserving the volume map geometry is crucial to addressing those goals. This method works for local as well as distributed virtual volumes.

Supported geometries for storage-volume based virtual volume expansion

Here are the supported geometries for this method.

[Figure: Supported geometries for virtual volume expansion]

Supported virtual volumes can be:

  1. A local virtual volume that is fully mapped to a single storage volume (1:1 mapped, RAID-0)
  2. A local virtual volume that is mirrored across two storage volumes (RAID-1, R1)
  3. A distributed virtual volume that is mirrored completely across two storage volumes (Distributed RAID-1, DR1)

If you confirm that the virtual volume you want to grow meets the criteria above, you are ready to expand!

Step 1: Expanding the storage-volume

The first step is array specific. You now need to expand the storage volume on the array to the capacity that you need (e.g. let’s say you have a 500GB storage volume. You would now expand it to 750GB on the backend if you need to add 250GB). Remember that if you have a mirrored VPLEX virtual volume, then you will need to do this for every leg of that mirror.

[Figure: Virtual Volume Expansion – Step 1]

You now need to get VPLEX to detect the increased capacity of the storage volume. If I/O is running to the virtual volume (and therefore to the storage volume), then upon expansion the array will generate a Unit Attention; VPLEX detects this and probes the storage volume for the additional capacity. If I/O is not running to the storage volume, you can run the rediscover command on VPLEX to reprobe the array and pick up the added capacity.
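
As a quick sanity check, here is what a hypothetical CLI session might look like (the context path and values are made up, the output is trimmed and purely illustrative, and the expansion attributes shown here are described in more detail later in this post):

  VPlexcli:/> ll /clusters/cluster-1/virtual-volumes/app_vol_1_vol
  ...
  expandable             true
  expandable-capacity    250G
  expansion-method       storage-volume
  ...

If expandable-capacity still shows zero after the array-side expansion, VPLEX has not yet picked up the change and a rediscover of the array is in order.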

Step 2: Expanding the virtual volume

The next step is to expand the virtual volume so that it uses the additional capacity.

[Figure: Virtual Volume Expansion – Step 2]

You need to run the virtual-volume expand command on VPLEX. Here is what the command looks like:
virtual-volume expand
[-v | --virtual-volume] context path
[-e | --extent] extent
[-f | --force]

NOTE: I have listed the optional extent parameter above for completeness. It is used by the concatenation expansion method, not by the storage-volume expansion method.

To expand the volume, you issue the above command against the specific virtual volume that you need to expand. The command performs some checks (more on that later) and, lo and behold, you have expanded the virtual volume without ever stopping I/O.
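
For instance, a minimal session might look like the following (the volume name and context path are hypothetical, and the exact prompt and output vary by release):

  VPlexcli:/> virtual-volume expand --virtual-volume /clusters/cluster-1/virtual-volumes/app_vol_1_vol

Once the command returns, re-running ll on the virtual volume should show the larger capacity, with expandable-capacity back at zero and expansion-status reflecting the outcome.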

Things to remember

This section captures an assortment of details and tips about the command that I find useful.

  1. While VPLEX supports non-disruptive expansion of virtual volumes, whether a host-mounted volume can be expanded depends on the OS, the file system and, in some cases, the application. Windows, for example, allows non-disruptive volume expansion with a host rescan; older UNIX variants do not. Check your host OS, filesystem or application documentation for clarification on this. From a SCSI standpoint, once the additional capacity is available, VPLEX will report a Unit Attention indicating that the LUN capacity has changed, and host rescans will also show the added capacity.
  2. We have added four new attributes to help you figure out whether a volume can be expanded and what its current status is. If you run an ll on a VPLEX virtual volume, you can now see:
    • expandable (boolean denoting whether a virtual volume can be expanded or not)
    • expandable-capacity (how much capacity is available to expand)
    • expansion-method (what method needs to be used for volume expansion)
    • expansion-status (if a volume is being expanded, what is the current status)
  3. What if my volume is not one of the supported geometries? If your volumes are not mapped 1:1, then you have two choices:
    • Perform an extent migration to move the extent to a storage volume that is 1:1 mapped
    • Migrate the virtual volume to a larger storage volume so that it becomes 1:1 mapped
    From there, you should be able to perform the virtual volume expansion as above.

  4. The newly added capacity of the virtual volume will be zero-initialized (i.e. VPLEX will write zeroes to the new capacity) before the additional capacity is exposed to the host. This matters especially for mirrored volumes (R1s or DR1s), since from a host perspective a read of the added capacity should return the *same* data from either leg. In other words, as with everything else, VPLEX ensures single-disk consistency even for distributed virtual volumes when the capacity is added.
  5. Today, RecoverPoint-protected virtual volumes cannot be expanded while the protection is in effect. This is something we are looking at for future releases. For now, you can turn off the RP protection, expand the virtual volume, and then re-engage the RP protection for that virtual volume.
  6. If a virtual volume is undergoing a migration, if the system is undergoing a non-disruptive upgrade, or if the system or the virtual volume has a failing health check, then VPLEX will block expansion of the virtual volume.

Can engines in a VPLEX Cluster be split?

UPDATE Feb 1st, 2014: I had captured some details incorrectly about the Director Witness which I have corrected below.

This question has been asked on the EMC Community Network and comes up multiple times in various contexts.

The goal is to allow multiple engines within a VPLEX cluster to be deployed across multiple racks instead of a single rack.

There are two primary reasons that this request comes up:

  • The customer intends to add VPLEX engines; however, in the time between the original deployment and the purchase of the new engines, the space in the rack where VPLEX is deployed has been repurposed for other equipment.
  • The customer considers a single rack a single point of failure. More on that later.

Our usual (only?) answer to this is that we do not support this configuration. That usually leads to some perplexed looks followed by a long explanation.

Let’s start with how the VPLEX HW is built.

VPLEX hardware has been built with redundancy all the way through for a high-availability infrastructure. Every component in the platform is redundant. The basic building block is a VPLEX Engine that has two directors. As engines are added, each engine is connected to the others through an intra-cluster communication channel (colloquially called the Local COM). Again, with redundancy in mind, the Local COM consists of two physically independent networks.

The plot thickens. Some more platform details: The VPLEX directors share the responsibility of monitoring the Local COM for any failures so that partitions (severing of Local COM links between VPLEX engines) can be handled if appropriate. In fact, each VPLEX cluster has another witness we internally refer to as the Director Witness (not to be confused with its more illustrious and well known sibling – the VPLEX Witness which is responsible for monitoring across VPLEX Clusters).

Now, given the variability of potential customer deployments, it was critical that we find a scalable way of maintaining four physically redundant networks to enable delivery of the high availability that our customers expect.

The way we accomplish this is by requiring that the engines be collocated in the same rack and configured in exactly the same way. Without this requirement, the level of redundancy becomes difficult to ensure: deployments become highly variable, and the core platform requirements that I described above get compromised. Not to mention the challenges to our services organization of working through these variable configuration details. The bottom line is that without the strict configuration and deployment requirements outlined above, the probability of multiple simultaneous failures increases, leading to compromised availability.

If this explanation works for you, you can skip the next two paragraphs.

======== [gory tech detail alert begin] ===========
[For those who want the next level of detail, we dream up quadruple failures and argue about the probability of failures before determining how failure handling should take place within the system. If that sounds like a whole lot of fun, it is! With a co-racked system, the perturbations to the system are dramatically reduced, changing the equation for what assumptions the director witness can make.

The VPLEX Witness has completely different failure handling characteristics since it has to account for two separate racks, WAN links, two data centers … You get the idea].
======== [gory tech detail alert end] ============

There was one additional question above – about a rack being considered as a single point of failure. There are multiple things to consider:

  • First and foremost, VPLEX has hardware availability built in from the ground up. Everything in the basic platform building block comes in multiples of two, so the classic reasons for rack separation (fault-domain separation such as power phases, etc.) are accounted for in the HW deployment architecture.
  • As this conversation goes further, what usually emerges is that the customer is concerned about fault-domain redundancy (e.g. protecting across sprinkler heads). VPLEX Metro with the Witness is designed precisely to enable this particular use case.

We are always open to feedback from you about new use cases we can build our products for. And as I have seen, customers provide insights that constantly confound our assumptions (and that is GOOD!). This is one use case with some interesting possibilities that we continue to explore. So in case you want to talk, you know how to reach me!

How to: Invalidate the VPLEX Cache

This is a brief note about a new capability we introduced starting with GeoSynchrony 5.2. I have been seeing a number of questions about it on our internal mailing lists; hopefully, this note addresses most of them.

Why do you need it?

Quite a few VPLEX Local and Metro customers use local replication on backend arrays – as an example, if the backend array is a VMAX, this would be using TimeFinder Snaps and Clones. The same is true for all supported EMC and non-EMC arrays on VPLEX. In the lifecycle management of these local replicas, you often need to expose a copy of the data as the original volume itself (when recovering from operational errors or going back to a different point in time). From a host standpoint, this volume would be indistinguishable from the original volume – it has the same volume identifier etc.
Let me now draw up the problem statement for you.

VPLEX Local and Metro have a read cache. In other words, for a given volume / consistency group, VPLEX will store data so as to enable cache access for that volume / consistency group without having to go to the backend. This cached data is accurate as long as the media underneath (i.e. the storage volume from the backend array) has not changed.

For such a volume exposed to the host, when you decide to mount the replica for that volume and present it to the host, the underlying media has effectively been altered. This unavoidably leaves the VPLEX cache out of sync with the underlying volume (since the cache may hold data written after the replica timestamp). Any host that accesses the volume is likely to detect an inconsistency in the data. If you like analogies, this is equivalent to the disk drives being swapped on your computer while the applications are running. Yeah, BAD idea.

Up until GeoSynchrony 5.2, the way to address this was to remove the virtual volume for which the replica was mounted from the storage view and then re-add that volume back in. This forced VPLEX to discard any cached data associated with that volume, allowing the volume to be consistent with the backend storage. And, in case you are wondering, yeah – not our favorite procedure either :-).

So, starting with GeoSynchrony 5.2 (released in May this year), we introduced the cache invalidate command, which performs the invalidation described above without having to remove and re-add the volume from the storage view. Let's look at the command in more detail.

How the command works

The cache-invalidate command operates in the virtual volume context as well as the consistency group context. In the virtual volume context, the command clears the cache for that virtual volume on _all_ directors where the volume is visible within that cluster. In the consistency group context, the same operation is performed for all volumes in the consistency group, across all VPLEX directors within the cluster where the consistency group (or volumes from it) is visible. Once you do this, any subsequent read for the volume / consistency group will result in a cache miss and will be served from the backend array.
So, what does the command look like?

  • virtual-volume cache-invalidate [-v | --volume] virtual-volume --force
  • consistency-group cache-invalidate [-g | --consistency-group] consistency-group --force

Things to remember

There are a few things to remember while executing this command:

  • This command will suspend host I/O while the cache is being invalidated. Ideally, host I/O should be stopped prior to invalidating the cache. Remember that suspending I/O on a live system could cause the host to detect a data unavailability (depending on how timing sensitive it is).
  • This command performs pre-checks to see whether the volume is healthy on all directors before executing. If the volume is not healthy, you will get warnings about this.
  • The VPLEX cluster should not be undergoing an NDU while this command is being executed.
  • There should be no mirror rebuilds or migrations in progress for virtual volumes affected by this command.
  • No volume expansions (more about this in a later post) should be in progress when this command is executed.

Checking the status of the command

We have introduced a supplemental *get* style command to help you understand the current status of the cache-invalidate command once it has been executed (a short example session follows the status values below):

  • cache-invalidate-status [-h | --help] [--verbose] [-v | --virtual-volume=]

The output from the command can have the following values:
  • Status
    • in-progress: cache-invalidate is running.
    • completed: cache-invalidate has completed execution successfully.
    • error: cache-invalidate has finished, but with an error.
    • unknown: reported when a particular director where the command needed to be executed cannot be reached.
  • Result
    • successful: cache-invalidate completed successfully.
    • failure: cache-invalidate finished with a failure; the Cause field indicates the reason.
    • error: cache-invalidate is in an error state.
  • Cause
    • A text string indicating the reason for any failure / error during execution.
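
Putting the two commands together, a hypothetical session might look like this (the consistency group and volume names are made up, and the exact prompt and output vary by release):

  VPlexcli:/> consistency-group cache-invalidate --consistency-group /clusters/cluster-1/consistency-groups/app_cg --force
  VPlexcli:/> cache-invalidate-status --virtual-volume /clusters/cluster-1/virtual-volumes/app_vol_1_vol

You would repeat the status check (adding --verbose for more detail) until it reports completed / successful; an unknown status is your cue to check director reachability before retrying.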

And from there you are good to go! Happy cache-invalidating! If you have any questions, please post them in the comments section.

Update:

Don Kirouac (@kirostor) made me aware via twitter of a whitepaper that describes the usage of cache invalidate. Thanks Don!

Finally, to all readers of this blog, a very Happy Holiday Season to you and your near and dear ones and Best wishes for 2014! We have some exciting stuff coming your way courtesy of the engineers working on VPLEX and RecoverPoint. And I cannot wait to tell you all about it … Stay tuned!