PowerPath: Auto standby for VPLEX

Autostandby has been available in PowerPath for over a year and a half. There must be something in the zeitgeist, because all of a sudden I have seen a couple of threads from customers and the field. These threads cover the entire range: from customers who are positively gushing about the capability, to questions about how it works, to operational questions such as which tweaks are and are not possible.

The background behind autostandby

We started down the autostandby road with some crucial observations:

    Most host I/O operations in a sequence are correlated to each other. In other words, random I/O workloads, while they do exist, are rare during customer operations.

(And yes, I realize that any generalization is dangerous territory. So, remember, we are following the 80:20 rule here).

    VPLEX has a read cache. To take advantage of this, you want to maximize the likelihood that read-type I/Os encounter cache hits, thereby reducing the latency for these I/Os.

Translation: If you combine the two observations above, then, for better performance, you want I/Os from a given host to a given volume to be directed to a given set of directors as much as possible.

Finally, let’s bring the distance component into this. The focus here is on the cross connected configuration (see Additional vMSC Qualifications for VPLEX Metro). In a cross connected configuration, there is a latency advantage to directing I/Os to the local cluster. Otherwise, I/Os are subject to the cross-site round trip latency penalty. By the way, this is one of the reasons we have chosen to restrict the supported latency envelope for cross connected configurations to 1 msec RTT.
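To make the reasoning above concrete, here is a rough back-of-the-envelope sketch in Python. It is not how VPLEX or PowerPath computes anything. The function name, the sample latencies, and the simplification that a remote I/O pays exactly one extra cross-site round trip are all assumptions made for illustration.

```python
def average_read_latency_ms(hit_ratio, cache_hit_ms, cache_miss_ms,
                            cross_site_rtt_ms=0.0):
    """Estimate the average read latency seen by a host.

    hit_ratio         -- fraction of reads served from the read cache (0..1)
    cache_hit_ms      -- latency of a read that hits the cache (hypothetical)
    cache_miss_ms     -- latency of a read that misses the cache (hypothetical)
    cross_site_rtt_ms -- extra round trip added when the I/O is sent to the
                         remote cluster (assumed to be one RTT per I/O)
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    local = hit_ratio * cache_hit_ms + (1.0 - hit_ratio) * cache_miss_ms
    return local + cross_site_rtt_ms


# Same workload sent to the local cluster and to the remote cluster
# at the 1 msec RTT limit of the supported envelope.
local_ms = average_read_latency_ms(0.5, 0.5, 5.0)
remote_ms = average_read_latency_ms(0.5, 0.5, 5.0, cross_site_rtt_ms=1.0)
print(local_ms, remote_ms)
```

Under these made-up numbers, the remote path is slower by the full round trip on every read, and a higher hit ratio lowers the local figure further. That is the intuition behind keeping a host's I/Os on a consistent, local set of directors.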

The solution

Working with the PowerPath team, we set out to address the design goals outlined above. PowerPath already has a mechanism for paths that should not be used for multipathing purposes: such paths can be set to manual standby. That designates these paths (if alive) as usable only once all the primary (non-standby) paths have failed.

For VPLEX Metro cross connected environments, the designation of which paths are on standby depends on where the host is located relative to the VPLEX clusters. The host paths connected to the local VPLEX cluster will be the *active* paths, whereas those connected to the remote VPLEX cluster will be the *standby* paths. Because this designation differs from host to host, the path setting needs to be automatic and to work at scale across all hosts.

How does the solution work?

A lot of the recent questions have been focused on how the algorithm for path selection works. So at a high level, here goes:

  • PowerPath measures the latency of SCSI INQ commands issued on each path.
  • It determines the minimum path latency associated with each VPLEX cluster / frame.
  • The VPLEX cluster / frame with the lowest minimum latency is designated as the preferred cluster.
    1. Each host sets the preferred cluster independently, so each host affinitizes to the appropriate VPLEX cluster.
    2. If the difference between the clusters' minimum latencies is zero, the preferred designation is applied to one cluster or the other.
  • The paths associated with the preferred VPLEX cluster are set to active mode. User-set active/standby always takes precedence over auto selection, so if those paths were previously set to standby manually, those settings will not be overruled.
  • The paths associated with the non-preferred VPLEX cluster are set to autostandby. The same caveat as the previous bullet applies.
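The steps above can be sketched in a few lines of Python. This is a simplified model for illustration, not PowerPath's actual implementation. The `Path` class, its field names, and the alphabetical tie-break are assumptions (the description above only says that a tie goes to one cluster or the other).

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    cluster: str                # VPLEX cluster / frame this path leads to
    inquiry_latency_ms: float   # measured latency of a SCSI INQ on this path
    mode: str = "active"        # "active", "standby" or "autostandby"
    user_set: bool = False      # True if an administrator set the mode manually


def apply_proximity_autostandby(paths):
    """Pick the preferred cluster and update path modes in place."""
    # Minimum path latency per cluster.
    min_latency = {}
    for p in paths:
        current = min_latency.get(p.cluster)
        if current is None or p.inquiry_latency_ms < current:
            min_latency[p.cluster] = p.inquiry_latency_ms

    # The cluster with the lowest minimum latency is preferred.
    # Ties fall to the alphabetically first cluster (an assumption).
    preferred = min(sorted(min_latency), key=lambda c: min_latency[c])

    for p in paths:
        if p.user_set:
            continue  # manual active/standby settings are never overruled
        p.mode = "active" if p.cluster == preferred else "autostandby"
    return preferred


paths = [
    Path("p1", "cluster-1", 0.20),
    Path("p2", "cluster-1", 0.30),
    Path("p3", "cluster-2", 1.10),
    Path("p4", "cluster-2", 1.00, mode="standby", user_set=True),
]
print(apply_proximity_autostandby(paths))
print([(p.name, p.mode) for p in paths])
```

Because every host runs this selection on its own measurements, hosts at different sites arrive at different preferred clusters without any central coordination.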

PowerPath versions where autostandby for VPLEX is supported

Here are the minimum versions where autostandby for VPLEX is supported:

  • VMware: PP/VE 5.8
  • Linux: PP 5.7
  • Windows: PP 5.7
  • AIX: PP 5.7
  • HPUX: PP 5.2
  • Solaris: PP 5.5
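If you want to check a fleet of hosts against this list, a small helper along these lines may be handy. It is only a sketch: the dictionary simply restates the minimum versions above, while the function name, the platform keys, and the assumption that versions are plain dotted numbers such as "5.7" or "5.7.1" are mine.

```python
# Minimum PowerPath versions for VPLEX autostandby, as listed above.
MIN_AUTOSTANDBY_VERSION = {
    "vmware": (5, 8),    # PP/VE
    "linux": (5, 7),
    "windows": (5, 7),
    "aix": (5, 7),
    "hpux": (5, 2),
    "solaris": (5, 5),
}


def supports_vplex_autostandby(platform, version):
    """Return True if `version` meets the minimum for `platform`.

    `version` is assumed to be a dotted numeric string; service-pack
    suffixes and other decorations are not handled in this sketch.
    """
    minimum = MIN_AUTOSTANDBY_VERSION[platform.lower()]
    parts = tuple(int(piece) for piece in version.split("."))
    return parts >= minimum


print(supports_vplex_autostandby("Linux", "5.7.1"))
print(supports_vplex_autostandby("VMware", "5.7"))
```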

Frequently Asked Questions

    For a given distributed volume, if there are multiple paths to the cluster chosen as the preferred cluster, do all of those paths get utilized?

  • Yes.

    What is the frequency of the path latency test? What is the trigger for the path latency test?

  • Path latency is evaluated for autostandby at boot time (if autostandby is enabled), at runtime when the feature is switched from off to on, or when a user issues a reinitialize from the command line.

    What is the minimum latency difference between two paths before one of them is set to autostandby? What is the default, and is it settable?

  • The granularity varies from platform to platform (it depends on the tick granularity of the OS). However, the granularity is really, really small and is not settable.

    I have a VPLEX Metro cluster deployment in which the cross connect latency is extremely small. I do not need the autostandby algorithm. Can I turn it off?

  • Yes, you can turn it off. Refer to the PowerPath administration guide for how to do so. Now, here is the counter argument: if you expect your I/Os to have any level of read cache hits, then it is still a good idea to leave the autostandby algorithm turned on.

    On failure of all active paths, the standby paths are made active. When the original paths return, does the user have to take any steps to return the configuration to its original state, or does the pathing revert to the original state on its own?

  • The pathing will automatically revert back to the original state as soon as an active path comes back alive.
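The failover and failback behavior described in this answer can be modeled roughly as follows. This is a toy sketch of the rule, not PowerPath code. The `Path` class and the `paths_for_io` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    mode: str     # "active" or "autostandby"
    alive: bool


def paths_for_io(paths):
    """Return the names of the paths that would carry I/O right now.

    Alive active paths are always preferred. Standby paths are used only
    when no active path is alive, so I/O moves back automatically as soon
    as an active path returns.
    """
    alive = [p for p in paths if p.alive]
    primary = [p for p in alive if p.mode == "active"]
    chosen = primary if primary else alive
    return [p.name for p in chosen]


paths = [
    Path("local-1", "active", True),
    Path("local-2", "active", True),
    Path("remote-1", "autostandby", True),
]
print(paths_for_io(paths))      # normal operation: local paths only

paths[0].alive = paths[1].alive = False
print(paths_for_io(paths))      # all active paths failed: standby takes over

paths[1].alive = True
print(paths_for_io(paths))      # one active path is back: I/O reverts to it
```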

Note

PowerPath also has an autostandby mode that has been introduced to enable handling of flaky paths (IOs-Per-Failure autostandby). This blog is focused on the VPLEX portion of auto standby (referred to as the proximity based autostandby).
