How to: Invalidate the VPLEX Cache

This is a brief note about a new capability we have introduced starting with GeoSynchrony 5.2. I have been seeing a bunch of questions on our internal mailing lists about this. Hopefully, this note addresses most of them.

Why do you need it?

Quite a few VPLEX Local and Metro customers use local replication on backend arrays – as an example, if the backend array is a VMAX, this would be using TimeFinder Snaps and Clones. The same is true for all supported EMC and non-EMC arrays on VPLEX. In the lifecycle management of these local replicas, you often need to expose a copy of the data as the original volume itself (when recovering from operational errors or going back to a different point in time). From a host standpoint, this volume would be indistinguishable from the original volume – it has the same volume identifier etc.
Let me now draw up the problem statement for you.
VPLEX Local and Metro have a read cache. In other words, for a given volume / consistency group, VPLEX will store data so as to enable cache access for that volume / consistency group without having to access the backend. This cache access is accurate assuming that the media underneath (i.e. the storage volume from the backend array) is not changed.
For such a volume exposed to the host, when you decide to mount the replica for that volume and present it to the host, in effect, the underlying media has been altered. This unavoidably leaves the VPLEX cache out of sync with the underlying volume (since the cache may have been written later than the replica timestamp). Any host that accesses the volume is likely to detect an inconsistency in the data. If you like analogies, this is equivalent to the disk drives being changed on your computer while the applications are running. Yeah, BAD idea.
Up until GeoSynchrony 5.2, the way to address this was to remove the virtual volumes for which the replica was mounted from the storage view and then re-add them. This would force VPLEX to delete any cached data associated with those volumes – allowing them to be consistent with the backend storage. And, in case you are wondering, yeah – not our favorite procedure either :-).
So, starting with GeoSynchrony 5.2 (released in May this year), we introduced the cache-invalidate command, which performs the invalidation function described above without having to remove and re-add the volume into the storage view. Let’s look at the command in more detail.

How the command works

The cache-invalidate command operates in the virtual volume as well as the consistency group context. In the virtual volume context, the command clears the cache for that virtual volume on _all_ directors where the volume is made visible within that cluster. In the consistency group context, the same operation as above is accomplished except for all volumes in the consistency group across all VPLEX directors within the cluster where the consistency group (or volumes from it) are made visible. Once you do this, any subsequent read for the volume / consistency group will result in a cache miss and will be performed off the backend array.
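If it helps to picture this, here is a toy Python model of the behavior described above. It is a sketch only, not VPLEX code: the `Director` class, the `cache_invalidate` function and the dictionary used as a "backend" are all made up for illustration. It shows a warm read cache serving stale data after the backend is swapped for a replica, and the first read after invalidation going back to the backend.

```python
class Director:
    """Toy director holding a per-volume read cache (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # {volume: {block: data}}

    def read(self, volume, block, backend):
        vol_cache = self.cache.setdefault(volume, {})
        if block in vol_cache:          # cache hit: the backend is not consulted
            return vol_cache[block]
        data = backend[volume][block]   # cache miss: read from the backend array
        vol_cache[block] = data
        return data


def cache_invalidate(directors, volume):
    """Drop cached data for one volume on every director in the cluster."""
    for director in directors:
        director.cache.pop(volume, None)


backend = {"vol1": {0: "current"}}
directors = [Director("A"), Director("B")]
for d in directors:
    d.read("vol1", 0, backend)          # warm the caches

backend["vol1"] = {0: "replica"}        # replica presented as the original volume
stale = [d.read("vol1", 0, backend) for d in directors]   # served from cache
cache_invalidate(directors, "vol1")
fresh = [d.read("vol1", 0, backend) for d in directors]   # cache miss, backend read
print(stale, fresh)
```

The consistency group case would simply loop the same invalidation over every volume in the group.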
So, what does the command look like?
  • virtual-volume cache-invalidate [-v | --volume] virtual-volume --force
  • consistency-group cache-invalidate [-g | --consistency-group] consistency-group --force

Things to remember

    There are a few things to remember while executing this command:

    • This command will suspend host I/Os while the cache is being invalidated. Ideally, host I/Os should be stopped prior to invalidating cache. Remember that suspending I/Os on a live system could cause the host to detect a data unavailability (depending on how timing sensitive it is).
    • This command performs pre-checks to see if the volume is healthy on all directors before executing. If the volume is not healthy, you will get warnings about this.
    • The VPLEX cluster should not be undergoing an NDU while this command is being executed.
    • There should be no mirror rebuilds or migrations in progress for virtual volumes affected by this command.
    • There should be no volume expansions (more about this in a later post) in progress when this command is executed.
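
The checklist above can be captured as a small pre-flight helper. This is a minimal sketch with assumed names: the `preflight_issues` function and the keys of the `state` dictionary are hypothetical, and you would populate them from your own health checks rather than from any VPLEX API.

```python
def preflight_issues(state):
    """Return reasons to hold off on cache-invalidate, given a hypothetical state dict."""
    checks = [
        ("ndu_in_progress", "cluster is undergoing an NDU"),
        ("rebuild_or_migration_in_progress", "mirror rebuild or migration in progress"),
        ("expansion_in_progress", "volume expansion in progress"),
    ]
    issues = [message for key, message in checks if state.get(key, False)]
    if not state.get("volume_healthy", True):
        issues.append("volume is not healthy on all directors")
    if not state.get("host_io_stopped", False):
        issues.append("host I/O is still running and will be suspended")
    return issues


print(preflight_issues({"host_io_stopped": True}))
print(preflight_issues({"ndu_in_progress": True}))
```

An empty list means none of the conditions from the list above were flagged in the state you supplied.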

    Checking the status of the command

    We have introduced a supplemental *get* style command to help customers understand what the current status of the cache-invalidate command is once it has been executed.
  • cache-invalidate-status [-h | --help] [--verbose] [-v | --virtual-volume=]

    The output from the command will have the following potential values:
    • Status
      • in-progress: cache-invalidate is running.
      • completed: cache-invalidate has completed execution successfully.
      • error: cache-invalidate has finished but with error.
      • unknown: This status is reported when a particular director where the command needed to be executed cannot be reached.
    • Result
      • successful: cache-invalidate completed successfully.
      • failure: cache-invalidate finished with failure. The Cause field indicates the appropriate reason.
      • error: cache-invalidate is in error state.
    • Cause
      • This is a text string indicating the reason for any failure / error during execution.
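
If you script around this, the three fields can be folded into a single next step. The sketch below assumes you have already extracted the Status, Result and Cause values as strings; the `summarize` function and its return messages are my own invention, not part of the CLI.

```python
def summarize(status, result=None, cause=""):
    """Turn the Status / Result / Cause values into a one-line next step."""
    if status == "in-progress":
        return "wait: invalidation still running"
    if status == "completed" and result == "successful":
        return "done: cache invalidated"
    if status == "unknown":
        return "check: a director could not be reached"
    detail = f" ({cause})" if cause else ""
    return f"failed: status={status}, result={result}{detail}"


print(summarize("in-progress"))
print(summarize("completed", "successful"))
print(summarize("error", "failure", "example cause text"))
```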

    And from there you are good to go! Happy cache-invalidating! If you have any questions, please post them in the comments section.

      Update:

    Don Kirouac (@kirostor) made me aware via twitter of a whitepaper that describes the usage of cache invalidate. Thanks Don!

    Finally, to all readers of this blog, a very Happy Holiday Season to you and your near and dear ones and Best wishes for 2014! We have some exciting stuff coming your way courtesy of the engineers working on VPLEX and RecoverPoint. And I cannot wait to tell you all about it … Stay tuned!

    Mobility and Availability Go Xtrem

    Have you recently heard about this new all flash array from EMC? Might have gone past your RSS feeds, your twitter timelines, your blog rolls and your press release markers.

    Of course I am kidding. Unless you have been hiding under a rock or data storage is not a meaningful technology category for you, you could not possibly have missed the tremendous launch that XtremIO just had. On second thoughts, even if you were hiding under a rock, the EMC marketing team would have found a way to get to you. The reception from customers, partners and competitors to the XtremIO launch has been overwhelming. Customers have been raving about the XtremIO technology, partners are excited to sell XtremIO. Competitors – well, let’s just say that it has been interesting to say the least – there were twitter feuds, ad wars, positioning conversations, good natured ribbing and some downright FUD. And so I am above board, I am sure we have done our fair share of all of the above. Keeps life fun and interesting in the tech space for all of us.
    Itzik covers XtremIO in all its gory glory in his blog posts
    All of these are highly recommended reads if you want to learn about XtremIO.
    For my part, I will focus this blog on the intersection between VPLEX and XtremIO. I am already seeing a ton of interest in our customer base and our field for this combination. Part of my focus is going to be to clarify what use-cases we are seeing customers use VPLEX in front of XtremIO for as well as to answer some of the questions we are getting.

    Use Cases for XtremIO and VPLEX

    Load balancing / Operational Flexibility

    This is one we have seen a lot of customers use VPLEX with XtremIO for.
    I will be the first one to admit – there are customers who put all their workloads on all flash arrays and there are customers that do not. If you are in the first category, this use-case does not apply that much to you. If you are in the second bucket (which is the overwhelming majority of the customers I talk to), then you are deploying some of your workloads on all flash arrays and most on hybrid or non-flash arrays. In this mode, customers have workloads that belong on flash but not all the time: they are temporarily resident on flash before moving back to the hybrid array tiers. In other words, these workloads have a temporal performance need. Because of this, customers put VPLEX in front of XtremIO combined with other arrays (we OBVIOUSLY love it when these other arrays are VMAX/VNX but they do not have to be). The workload largely resides on the non-all-flash array, moves to the all-flash array temporarily, and is moved back to the non-all-flash array once the associated performance need diminishes. We typically see this with IT shops that operate on storage charge back models based on SLAs. This way, the charge back costs are kept as low as possible.

    Cross Array Availability

    This use-case is certainly not unique to all flash arrays by any means. However, customers are increasingly using VPLEX as a cross array data protection tool. The value of doing this is that if you happen to need downtime on one array (either planned or unplanned), then with VPLEX you can mirror volumes across two arrays and in that way accomplish a higher level of protection. With flash arrays, we see customers typically protect between multiple flash arrays. An interesting variant of this use-case is where customers are deploying VPLEX Metro within the data center and are using cross array availability as a means to protect across two entirely different failure domains within the same data center (e.g. two fire cells, two different floors). One note of caution for customers: Since the protection mirrors in this case (RAID-1s) are synchronous mirrors, for flash array customers especially, it is worth remembering that the latency of your I/O will be determined by the slowest array that is a part of the R1 volume map. Given this, it is beneficial for the arrays forming the R1 legs to have similar latency characteristics. In the case of all flash arrays, this means that in a typical flash R1, all legs should be all flash.
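
To make the note of caution concrete, here is a tiny sketch of the arithmetic. The latency numbers are invented for illustration, the function name is hypothetical, and the model ignores any latency VPLEX itself adds; it only captures the idea that a synchronous mirror write is as slow as its slowest leg.

```python
def r1_write_latency_ms(leg_latencies_ms):
    """A synchronous mirror write completes when the slowest leg has acknowledged."""
    if not leg_latencies_ms:
        raise ValueError("a mirror needs at least one leg")
    return max(leg_latencies_ms)


print(r1_write_latency_ms([0.5, 0.6]))  # two legs with similar latency
print(r1_write_latency_ms([0.5, 5.0]))  # a fast leg mirrored with a much slower leg
```

In the second case the fast leg buys you nothing on writes, which is why legs with similar latency characteristics make sense.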

    Non-disruptive Tech Refresh

    Another big use-case that we hear from customers in the all-flash-array space is future flexibility. A lot of the all flash array platforms are continuing to evolve rapidly with newer versions being available sooner rather than later. This has implied that customers have felt the need to future proof themselves and enable migrations to those newer platforms more seamlessly. VPLEX because of its ability to migrate non-disruptively (Does VPLEX do migrations?) becomes a logical choice for customers looking for this option. The same also applies for customers who anticipate their future flash needs growing. VPLEX provides a means to present and aggregate flash storage.

    Long distance DR protection with RecoverPoint

    Here is a two-fer. With the RecoverPoint splitter in-built into the VPLEX platform, it can be used for DR protection for XtremIO. Given the heterogeneous support of VPLEX and RecoverPoint, you can copy the DR protection leg to any EMC or non-EMC array. RecoverPoint will also give you continuous data protection in addition to the continuous remote replication. This means that a combination of VPLEX and RecoverPoint will give you HA and DR in combination with XtremIO.

    On boarding

    Another form of migration – this one to move data onto the flash array. Too often customers have more than one array type and are looking to move a portion of their workload from those disparate arrays onto XtremIO. Traditionally, this would mean figuring out a way to copy the data over (which is likely different between every combination of arrays). Putting VPLEX in front of the non flash arrays as well as XtremIO will enable a seamless and uniform migration experience between the source array (any one of the 45+ EMC and non-EMC arrays that are supported by VPLEX) and XtremIO. By the way, if you are moving lock, stock and barrel to XtremIO from existing arrays, you can use the free 180 day VPLEX Migration license (described here) available with the purchase of a new EMC array.

    Heterogeneous Host connectivity

    Another relative freebie – While I do not foresee this as being the primary reason to put a VPLEX in front of an XtremIO, because of the vast host side interoperability that has been built over the years with VPLEX, you get host connectivity for all the hosts supported by VPLEX in the VPLEX Support Matrix (AIX and HPUX anyone?). I am sure over time XtremIO will build this support natively. Until then, this can tide you over if you happen to have hosts and clustering infrastructure that are not on the XtremIO support matrix.
    Josh Goldstein, VP of Product Management/Marketing for the XtremIO team does an exceptional job describing the interplay between XtremIO and VPLEX here:

    Questions we have seen thus far

    What latency does VPLEX add to my all flash workloads?

    The most direct answer to this question is that it depends.
    However, the real question behind the question is ‘Will I lose the benefit of the latency reduction I got from my purchase of XtremIO?’. Again, the straight answer is that VPLEX will add latency to the mix. So, the combination of VPLEX and XtremIO will, for most workloads (those that are not read cache hit intensive), have higher latency than XtremIO alone. So, if you have workloads that need the absolute lowest latency of XtremIO, then you should direct connect to XtremIO. However, these workloads are few and far between. If you are a typical customer with a typical workload, the more appropriate comparison is the latency incurred with non-all-flash arrays. Here, for most workloads, VPLEX + XtremIO will come out ahead in terms of total latency. Now, the real answer will depend on the latency that your application needs, your workload mix and the use-cases that are important to you from the list above. And from there, it becomes a conversation about the relative priorities between them, which will help you understand which workloads are suited for the VPLEX/XtremIO combination.
    As we get more questions, I will post them to this blog. If you are a VPLEX/XtremIO customer, we would love to hear from you!

    Does VPLEX do migrations?

    One of the fun parts of my job is interacting regularly with customers at various forums. And often, these discussions result in insights about what the product needs to do and where we need to focus. Once in a while, they tell us about what we are communicating out and what our customers and field hear. Here was one such exchange at an EBC this past week.

    ME: VPLEX is the best thing since sliced bread (paraphrasing my hyperbole here :-))
    CUSTOMER: Does VPLEX do migrations?
    ME: Yes, VPLEX does Mobility and Availability
    CUSTOMER: Understand and we are really excited about that. However, can VPLEX allow me to refresh my storage?
    ME: (confused) Yes, it can
    CUSTOMER: Meaning if I have VPLEX in place, I can bring a new storage array in and migrate my current array to that new array without any disruption to the host?
    ME: (Getting the hang of what’s going on) Yes
    CUSTOMER: And do the arrays need to be of the same type?
    ME: No. They can be different.
    CUSTOMER: Can VPLEX do this for non EMC arrays as well?
    ME: Yes. We have over 45 different array families supported and more families are being added every month.
    CUSTOMER: Nothing in the product details (white papers / collateral) describes this use-case …
    ME: (Sheepishly) Valid feedback – I will take that back to my team to figure out what we can do about this.

    For the customer who helped bring this to my attention (and you know who you are) – many thanks. You were TOTALLY right. Because here is what happened that very day. I got a note from our field team asking about which products to use for a migration activity. And the same discussion that we had in the morning happened as bits on the wire later that day. So, independent of what we say or don't say about migrations, it is clear that the message is not being heard as much as it should be. This post is to at least begin to set the record straight on VPLEX and Migrations.

    So, VPLEX definitely does do migrations. There are two variants of the use-case.

    1. Tech-refresh of an array
      Here a new array is brought in (either because a lease on an existing array has run out or because a new array has been purchased). Volumes from an existing array or arrays are migrated onto the new arrays. Typically, the older arrays are then retired or repurposed for other usage.
    2. Load balancing across arrays
      Here there are multiple arrays behind VPLEX. Either because of capacity reasons or performance reasons or the need for some specific capability, volumes are moved from one array to another. Both arrays continue to be kept in service.

    VPLEX Local can be used to accomplish both use-cases above. VPLEX Metro adds one more variant to the above use-case(s) – Migrating across arrays across data centers. In other words, VPLEX Metro extends the pool of arrays that you can manage beyond the confines of your data center.

    Specifically, here are things to remember about VPLEX migrations:

    1. VPLEX migrations are non-disruptive. In other words, the application does not need to be stopped in order to migrate storage.
    2. VPLEX is fully heterogeneous. It supports both EMC and non-EMC arrays. My standard note to customers is to always refer to the VPLEX Simple Support Matrix on Powerlink.
    3. The source array and target array of the migration can be any of the supported set of arrays. In other words, you do _NOT_ need to migrate from like to like.

    How do migrations in VPLEX work?

    Initial state

    Here is the basic process of migrations within VPLEX:

    • The new array is connected to VPLEX and volumes from the new array are exposed to VPLEX
    Step 1: Add New Array
    • From here, you have two options (really dependent on the scale of the operations):
      • Migrate on a volume by volume basis
      • Migrate as a batch (especially useful for the tech refresh piece)
    Step 2: Create a Mirror
    • From then on, VPLEX does its thing and ensures that the volumes on the two arrays are in sync. During this time, I/Os from the host continue. As far as the host is concerned, it continues to see volumes from VPLEX. Host READ I/Os are directed to the source leg. Host WRITE I/Os (if the section has been copied over onto both legs) are sent to both legs of the mirror. After both volumes are in complete sync, I/Os continue until you decide to disconnect the source volume. It is worth pointing out that even after the volumes are in sync, you have the option to remove the destination volume and go back to the source. You make the call on when to disconnect the source volume.
    Step 3: Disconnect the source array

    From the host standpoint, quite literally, it does not know that anything has changed.
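
For those who like to see the mechanics, the mirror-and-sync step can be sketched as a toy simulation. This is not how VPLEX is implemented: the `migrate` function, the block-at-a-time copy and the `host_writes` schedule are all assumptions made for illustration. In particular, I am assuming that a write to a region that has not been copied yet goes to the source leg only and is picked up later by the background copy.

```python
def migrate(source, target, host_writes):
    """Copy source to target block by block while host writes keep arriving.

    host_writes maps a copy step to a (block, data) write that arrives just
    before that step's block is copied. Purely illustrative.
    """
    copied = 0  # blocks [0, copied) are already in sync on both legs
    for step in range(len(source)):
        if step in host_writes:
            block, data = host_writes[step]
            source[block] = data        # the source leg always receives the write
            if block < copied:
                target[block] = data    # region already copied: write both legs
        target[step] = source[step]     # background copy of the next block
        copied = step + 1
    return target


source = ["a", "b", "c", "d"]
target = [None] * 4
# One write lands on a not-yet-copied block, one on an already-copied block.
migrate(source, target, {1: (3, "D"), 2: (0, "A")})
print(source == target, target)
```

Both legs end up identical even though the host kept writing during the copy, which is the property that lets you disconnect the source whenever you choose.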

    More questions that I get about migrations:

    1. Can I control the amount of impact on my host I/Os?
      Before answering this question, it is important to understand why there may be impact (if any). FWIW, this explanation is true of all storage virtualization solutions doing migrations. Anyone that tells you otherwise is factually incorrect.
      The host connected to VPLEX has a fixed set of paths to the virtual target presented by VPLEX. The same is true for the target arrays connected to VPLEX on the backend. Think of these as fixed capacity pipes carrying your I/Os from the front-end to the back-end of VPLEX. Along these same pipes, VPLEX needs to perform copy I/Os (read from the source leg and write to the target leg). So, in a fixed pipe, a migration adds additional I/Os. In other words, some of the capacity in that I/O pipe gets consumed for migrations. How much of that can impact the host depends on how full the I/O pipe was in the first place.
      To account for the case when the pipe is completely full, VPLEX gives you a knob that lets you select the rate of migrations (ASAP / High / Medium / Low). As you can imagine, the higher the copy rate, the higher the impact on host I/Os. So, if you are concerned about host I/O impact, then you should start with the copy rate set to Low and increase the rate from there.
    2. What should my licensing model be if I have to migrate from old storage to new storage?
      This is more relevant to the tech refresh variant of the use-case. We heard feedback from a _LOT_ of customers about them wanting to use VPLEX for migrations. However, they were balking at having to license the storage that they were going to migrate from (i.e. the source array). To help with this, we have introduced a free 180-day migration license for VPLEX. This migration license is available with the purchase of an EMC array. So long as VPLEX is licensed on this new array, you have a 180-day license for unlimited capacity to migrate onto the array behind VPLEX. This is especially compelling if you are going through a storage consolidation phase in your data center.
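
The fixed-pipe argument from the first question can be put into a few lines of arithmetic. This is a deliberately crude sketch: the `host_shortfall` function, the units and the numbers are all invented, and it assumes copy I/O simply takes whatever headroom it needs. It says nothing about what the ASAP / High / Medium / Low settings translate to in practice.

```python
def host_shortfall(pipe_capacity, host_load, copy_load):
    """Return how much host I/O no longer fits once copy I/O shares the pipe."""
    headroom = pipe_capacity - host_load
    return max(0, copy_load - headroom)


print(host_shortfall(100, 60, 30))   # plenty of headroom
print(host_shortfall(100, 90, 30))   # nearly full pipe
print(host_shortfall(100, 100, 10))  # completely full pipe
```

The fuller the pipe was to begin with, the more any given copy rate takes away from the host, which is the reason to start low and work up.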

    Along the way, we have had some tremendous customer success stories with respect to migrations – right from customers who have reduced their migration times by 90+% to customers who no longer schedule maintenance time for migrations nor do they involve external professional services. We clearly have a lot of work to do with respect to educating everyone about VPLEX and its role in migrations. But this should be a good starting point for the conversation.

    Virtual Storage Integrator (VSI) Support for VPLEX

    Here is a FREE plugin for VPLEX that you should be using if you aren’t already.

    What is Virtual Storage Integrator aka VSI?

    VSI is a standard EMC plugin for VMware vCenter. Using VSI, you can drill down into specifics for EMC platforms. VPLEX was supported with the Virtual Storage Integrator pretty much from the get go for the storage visibility use-case. In other words, for VPLEX, VSI would allow you to drill down from vCenter to your vmdk to the associated VPLEX storage and then tell you which particular backend storage it was.

    So what’s new

    With the release of VSI 5.6, the above functionality for VSI was enhanced to now allow provisioning VPLEX virtual volumes directly from within vCenter. You can now go into vCenter and select the VPLEX cluster that you want to provision and then create virtual volumes (local or distributed), choose what consistency groups and storage views you want to expose them out of and off you go – ALL directly from within vCenter. Pretty awesome stuff!!

    Check out this video by Drew Tonnesen (@drewtonnesen) which captures the steps you need to perform these operations:

    Many thanks to the VSI team for driving these sets of changes. Drew’s original guest post is available here: http://codyhosterman.com/2013/09/23/virtual-storage-integrator-5-6-vplex-provisioning/