If you are reading this, you have undoubtedly battled through the googlizer looking for information on iSCSI multipathing with VMware and NetApp storage arrays. I feel your pain, and to that end, I want to share some guidance learned in the field.
To discuss proper multipathing, let us first consider the goal: multiple paths from our ESXi servers to a storage platform, for both redundancy and load. iSCSI gives you a few choices when it comes to configuration, and I'll try to cover the decision points. For the curious, I used a combination of technology for this post, from the NetApp simulator and a virtual ESXi host to a physical ESXi host and a NetApp 3240. The goal is to show the different methods for accomplishing the task of multipath configuration. What I have learned is that there are several ways to meet this goal, and they are driven by your (customer) requirements. These decisions include your choice of iSCSI adapter (hardware or software), the switches in your environment and the link policy you select, and your connectivity from host to storage array.
Understanding Design Choices
First and foremost, it's very important to understand the choices that can be made with iSCSI as a storage protocol for ESXi. Best practices encourage you to keep your iSCSI traffic on its own VLAN(s), with adapters dedicated to iSCSI traffic. This level of separation ensures the bandwidth is dedicated and also helps from a security standpoint. iSCSI supports CHAP authentication, which helps, but ideally you are keeping this traffic separated on your network. With this separation in mind, you can also benefit from using jumbo frames.
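As a rough sketch of the jumbo frame plumbing on the ESXi side (the vSwitch name, VMkernel port, and target IP below are assumptions, not from my lab), you set the MTU on both the vSwitch and the VMkernel port and then verify end to end with vmkping:

```
# set MTU 9000 on the iSCSI vSwitch and its VMkernel port (names are assumptions)
esxcli network vswitch standard set -v vSwitch1 -m 9000
esxcli network ip interface set -i vmk1 -m 9000

# verify jumbo frames end to end: 8972 bytes of payload plus headers = 9000
# -d sets the don't-fragment bit so the ping fails if anything in the path can't carry it
vmkping -d -s 8972 192.168.50.20
```

Remember that the physical switch ports and the NetApp interfaces have to carry the larger MTU as well, or that vmkping test will fail.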
Next, the choice of iSCSI adapter: VMware has continually revamped its iSCSI software initiator, and the improvements over time make it a real option. I would still lean toward a hardware adapter that supports offload, but the software initiator is a viable alternative.
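If you do go the software initiator route, enabling it is a one-liner; a minimal sketch (the vmhba it registers as will vary per host):

```
# enable the software iSCSI initiator and confirm which vmhba it shows up as
esxcli iscsi software set --enabled=true
esxcli iscsi adapter list
```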
From an addressing standpoint, you can use a private IP addressing scheme, since this traffic will not be on an external, routable network. Make sure you reserve enough IP addresses for all your hosts to have two IP addresses for their port groups. Also, on the NetApp controllers, you'll want to set an IP for iSCSI and an alias to allow for a second connection. Data ONTAP supports multiple TCP sessions, allowing for active I/O on multiple connections, similar to what you see with Fibre Channel.
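Purely as an illustration of the math (the subnet, VLAN, and addresses here are made up), a small environment might lay out like this:

```
# hypothetical addressing plan for a non-routed iSCSI VLAN
# VLAN 50, 192.168.50.0/24
#   esx01:  vmk1 192.168.50.11   vmk2 192.168.50.12
#   esx02:  vmk1 192.168.50.13   vmk2 192.168.50.14
#   netapp: ifgrp1 192.168.50.20  alias 192.168.50.21
```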
Native Multipathing with VMware
VMware has the capability to do native multipathing for iSCSI. This requires two port groups to allow for the connections. There are two ways to go about configuring this, and the right one partly depends on your networking configuration. If you are not using port channeling on your switches, VMware recommends using no more than two physical NICs on a vSwitch that will be used for iSCSI. On that vSwitch, you should create two VMkernel port groups and give each one an IP address, as sketched below.
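Here is what that layout can look like from the command line (vSwitch, port group, vmnic, and vmk names plus IPs are assumptions; the vSphere Client works just as well):

```
# two-uplink vSwitch dedicated to iSCSI
esxcli network vswitch standard add -v vSwitch1
esxcli network vswitch standard uplink add -v vSwitch1 -u vmnic2
esxcli network vswitch standard uplink add -v vSwitch1 -u vmnic3

# two port groups, each backing one VMkernel port with its own IP
esxcli network vswitch standard portgroup add -v vSwitch1 -p iSCSI-1
esxcli network vswitch standard portgroup add -v vSwitch1 -p iSCSI-2
esxcli network ip interface add -i vmk1 -p iSCSI-1
esxcli network ip interface add -i vmk2 -p iSCSI-2
esxcli network ip interface ipv4 set -i vmk1 -t static -I 192.168.50.11 -N 255.255.255.0
esxcli network ip interface ipv4 set -i vmk2 -t static -I 192.168.50.12 -N 255.255.255.0
```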
The vSwitch NIC teaming policy is set to route based on the originating virtual port ID and is inherited by the iSCSI port groups. For this to work properly, each port group overrides that order with one adapter set to active and the other set to unused. This manually places iSCSI traffic onto each NIC, providing a form of redundancy and load balancing. Note that a NIC failure will cause any connections using that port group to fail (since there is no failover within the port group).
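In esxcli terms, that per-port-group override looks something like the following (NIC names and the vmhba are assumptions); binding both VMkernel ports to the software iSCSI adapter is the usual step that gives you the two sessions:

```
# hedged sketch: give each iSCSI port group exactly one active uplink
# (double-check in the GUI that the other uplink shows as unused, not standby)
esxcli network vswitch standard portgroup policy failover set -p iSCSI-1 -a vmnic2
esxcli network vswitch standard portgroup policy failover set -p iSCSI-2 -a vmnic3

# bind both VMkernel ports to the software iSCSI adapter (vmhba33 is an assumption)
esxcli iscsi networkportal add -A vmhba33 -n vmk1
esxcli iscsi networkportal add -A vmhba33 -n vmk2
```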
In my view, this isn’t an ideal way to handle traffic. VPID, as we know, will make the initial decision for where traffic will flow and from then on, will not deviate unless there is a failure of that link. This works fine for the connection if we are not using any kind of link aggregation at the switch level.
But what if I was using a port channel, or perhaps a new Nexus switch with virtual port channels (vPC)? That brings a new set of rules into play. One thing of note: with the vPC connections from the Nexus to the host, you cannot use mode active for the port channel on the Nexus. That will cause the vPC connections to not come up, as ESXi does not support LACP. Using the channel-group command with mode on instead builds a static port channel, which ESXi can handle, as in the sketch below.
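For illustration, a static (non-LACP) port channel that is part of a vPC on the Nexus side might look roughly like this; the interface numbers, VLAN, and vPC ID are made up, and the vPC domain/peer-link setup is assumed to already exist:

```
! hypothetical NX-OS sketch - numbers and VLANs are assumptions
interface port-channel20
  switchport mode trunk
  switchport trunk allowed vlan 50
  vpc 20

interface Ethernet1/10
  switchport mode trunk
  switchport trunk allowed vlan 50
  channel-group 20 mode on
```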
One thing I struggled with in my initial assessment was how traffic and load would be distributed with a port channel (virtual or not) and two iSCSI port groups. The answer comes from the way ESXi handles EtherChannel (port channel) connections. To make this work correctly, we create a vSwitch with two NICs (just like before), but this time we set the load balancing policy to IP hash.
Now, for each port group used for iSCSI, we do not override the failover order; we use both NICs actively. The benefit here is that when one link is lost, traffic moves over onto the other NIC gracefully and the connections keep working. I view this as a more seamless failover.
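Setting that policy at the vSwitch level from the command line is a one-liner (the vSwitch name is an assumption); the port groups then inherit it since we are not overriding anything:

```
# route based on IP hash at the vSwitch level; the iSCSI port groups inherit this policy
esxcli network vswitch standard policy failover set -v vSwitch1 -l iphash
```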
I’ve come across a number of folks who don’t seem to think there is anything to be gained from IP hash, and that virtual port ID is the way to go. What they fail to see is the even balancing of traffic over the IP-hash link connections. A great way to test this (and I encourage you to go out and try it) is to fire up IOmeter on a VM and push some I/O to your array. Watch your host using esxtop (enable SSH) and you can actively see the traffic being balanced (fairly evenly, I might add) across both links. If you pull a link, the traffic fails over and keeps on going.
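If you have not driven esxtop before, the network view is the one you want; a quick sketch of the workflow (device names will differ in your environment):

```
# from an SSH session on the host, while IOmeter is pushing I/O from a VM
esxtop
# press 'n' for the network view, then watch the MbTX/s and MbRX/s columns
# for the two vmnics on the iSCSI vSwitch - with IP hash and multiple sessions
# you should see traffic on both uplinks
```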
I believe that if you set virtual port ID and use the individual NICs, you will get similar results to some extent. In cases where you have to share the iSCSI links with other types of traffic, like VM data traffic, link aggregation allows that traffic to be balanced across the links alongside the iSCSI traffic. I prefer not to do that, but customer requirements do drive a project and set its goals. With this in mind, there can be a need to step slightly outside best practices and still accomplish the goals of a project.
In a situation where link aggregation is not being used, VPID is the way to go. It reminds me of my old days working on fibre channel HP EVA arrays with VI 3.5 and Fixed paths to the LUNs.
Now, on the NetApp array side, it is fairly easy to allow multiple paths to the datastores. This lets you set Round Robin and actively send I/O down all available paths. You need to configure an IP on the interface (an ifgrp in ONTAP 8+, a vif in 7), enable iSCSI on that interface, and don’t forget to turn on the iSCSI protocol itself. Then you set an alias on the interface with the second IP to allow for complete multipathing. This gives you multiple unique source-destination IP pairs, which the switching will distribute properly via IP hash.
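A rough 7-Mode sketch of that controller-side setup (the interface group name, member ports, and addresses are assumptions, and the older vif syntax in ONTAP 7 differs slightly): it comes down to an IP, an alias, and the iSCSI service.

```
# Data ONTAP 8 7-Mode sketch - names and IPs are assumptions
ifgrp create multi ifgrp1 -b ip e0a e0b
ifconfig ifgrp1 192.168.50.20 netmask 255.255.255.0 mtusize 9000
ifconfig ifgrp1 alias 192.168.50.21 netmask 255.255.255.0

# turn on the iSCSI service and make sure it is enabled on this interface
iscsi start
iscsi interface enable ifgrp1
```

Remember that the ifconfig lines need to go into /etc/rc as well, or they will not survive a reboot of the controller.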
A fantastic reference point for me continues to be NetApp TR-3749, section 3.6. That section focuses on using System Manager, but I prefer to set things up myself from the command line. Regardless, that TR is a great complement to the discussion I have begun here.
Let it be known that using Cisco Nexus switches with vPC to NetApp storage and ESXi hosts is a pretty great thing, and I count myself lucky to be implementing this kind of solution for likely another year.
Links of note:
Scott Lowe discussing NIC utilization with and without link aggregation