Decoding the mysteries of MPIO

  • Decoding the mysteries of MPIO

    All the articles/blogs/white papers I've come across assume that the hardware you use for your iSCSI storage solution is a 3rd-party device, with accompanying drivers etc. to govern its side of the iSCSI MPIO setup. But what about when your storage is a Server 2012 Datacenter box running as a File & Storage server?
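
    (For reference, the storage boxes were stood up with the in-box iSCSI Target role, along these lines; this is only a sketch, and the target name, IQNs and paths below are placeholders rather than our actual config:)

        # On the storage server: install the Microsoft iSCSI Target Server role
        Install-WindowsFeature FS-iSCSITarget-Server

        # Create a target and allow the hypervisors' initiator IQNs to connect
        # (the IQNs below are hypothetical)
        $initiators = "IQN:iqn.1991-05.com.microsoft:h1.dev.local",
                      "IQN:iqn.1991-05.com.microsoft:h2.dev.local"
        New-IscsiServerTarget -TargetName "Data" -InitiatorIds $initiators

        # Back the target with a virtual disk and map it (Server 2012 uses VHD files)
        New-IscsiVirtualDisk -Path "D:\iSCSIVirtualDisks\Data.vhd" -SizeBytes 500GB
        Add-IscsiVirtualDiskTargetMapping -TargetName "Data" -Path "D:\iSCSIVirtualDisks\Data.vhd"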

    Our development system mirrors our Live system as closely as possible. 'Live' uses a Dell SAN shelf, and MPIO between our clustered Hypervisors works a treat. Our Dev setup uses older, lower-spec servers as the Hypervisors, with two more acting as the iSCSI storage for that cluster. What we're trying to work out is the difference between MPIO and LACP at each end of an iSCSI target link.

    Scenario: two clustered servers (call them H1 & H2) running 2012 Datacenter, each with 3 legs in the VLAN for iSCSI traffic, on a dedicated Cisco 3750G switch to keep that traffic separate; MPIO is running on them as well. Our storage servers (call them S1 & S2) are also 2012 Datacenter, but run only as iSCSI storage providers. Each one has 4 legs in the iSCSI VLAN, originally running as a team using 2012's built-in teaming with LACP active mode on the switch ports (one team per server, obviously!).
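
    The team on each storage server was built with the in-box teaming cmdlets, roughly like this (just a sketch; the NIC and team names are placeholders, and the matching 3750G ports were set to LACP active):

        # Original setup on S1/S2: one LACP team across the four iSCSI NICs
        # (adapter names are placeholders - check yours with Get-NetAdapter)
        New-NetLbfoTeam -Name "iSCSI-Team" `
            -TeamMembers "NIC1","NIC2","NIC3","NIC4" `
            -TeamingMode Lacp `
            -LoadBalancingAlgorithm TransportPorts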

    We've been having no end of stability and time-lag issues with this, and we suspect we should not be using LACP on the storage-server side at all, but MPIO throughout. The trouble is that we're having just as much difficulty finding anything definitive about an MS-based iSCSI target server used with either MPIO or LACP, and our own experiments have nearly cost us our Dev domain on more than one occasion.

    Anyone got any hands-on they can share, or point me in a direction that doesn't go off a cliff?
    MCSA (2003/XP), Security+, CCNA

    ** Remember: credit where credit is due, and reputation points as appropriate **

  • #2
    Re: Decoding the mysteries of MPIO

    Disregard the question: a colleague and I went into work today (Sun) and did loads of experimentation to work out what we think is the solution.

    *-The storage (target) servers keep however many legs they have in the iSCSI VLAN; MPIO is not needed on them. Do NOT aggregate those legs with LACP; it won't help with bandwidth, only failover.
    *-The initiator servers (hypervisors) run the MPIO feature (reboot the server after adding the feature, even though the wizard doesn't ask you to), with multiple legs in the iSCSI VLAN. Again, no LACP.
    *-Set up a 'full mesh' MPIO config, meaning you explicitly connect every IP on the hypervisor to every IP on the target server (don't leave things at 'default'); see the PowerShell sketch after this list. For each connection you define, save a 'favorite target', or your multiple connections won't be recreated at the next startup.
    *-For failover clusters, repeat this full-mesh setup on each cluster member.
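
    In PowerShell, the initiator-side setup comes out roughly like the sketch below. The IP addresses are placeholders for the hypervisor's and storage server's iSCSI legs; treat it as a sketch of what we ended up with, not a definitive recipe:

        # On each hypervisor: install MPIO, then reboot before doing anything else
        Install-WindowsFeature Multipath-IO
        Restart-Computer

        # After the reboot: let the Microsoft DSM claim iSCSI disks and
        # load-balance round-robin across the paths
        Enable-MSDSMAutomaticClaim -BusType iSCSI
        Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR

        # This hypervisor's 3 legs and the storage server's 4 legs (placeholders)
        $initiatorIPs = "10.10.10.11","10.10.10.12","10.10.10.13"
        $targetIPs    = "10.10.10.21","10.10.10.22","10.10.10.23","10.10.10.24"

        # Register each target portal so its targets show up in Get-IscsiTarget
        foreach ($tIP in $targetIPs) {
            New-IscsiTargetPortal -TargetPortalAddress $tIP | Out-Null
        }

        # Full mesh: connect every discovered target over every initiator/target
        # IP pair. -IsPersistent saves each connection as a 'favorite target' so
        # it is recreated at the next startup.
        foreach ($target in Get-IscsiTarget) {
            foreach ($tIP in $targetIPs) {
                foreach ($iIP in $initiatorIPs) {
                    Connect-IscsiTarget -NodeAddress $target.NodeAddress `
                        -TargetPortalAddress $tIP -InitiatorPortalAddress $iIP `
                        -IsMultipathEnabled $true -IsPersistent $true | Out-Null
                }
            }
        }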

    Example: for a target server with 4 legs and an initiator server with 3 legs, you'll have 12 connections defined for every target on that server. If you have a target named 'Data' and another named 'Profiles', that's 24 connections defined in total and 24 'favorites' listed. If you don't define this manually, you won't have a fully functioning MPIO solution.
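
    A quick way to sanity-check the count from PowerShell (again, just a sketch):

        # For the example above you'd expect 12 sessions per target,
        # 24 across 'Data' and 'Profiles'
        Get-IscsiSession | Group-Object TargetNodeAddress | Select-Object Name, Count
        Get-IscsiConnection | Measure-Object

        # mpclaim -s -d also lists the MPIO-claimed disks and their path counts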

    Where your storage solution is a 3rd-party purchase, there should be vendor support to aid your setup. Having said that, as a result of today's work we identified a couple of points in our Production system that needed alteration, and which should have been caught by the button-monkey the vendor sent when the installation first took place, but weren't. Ever since rolling out the new system we've had issues we didn't understand and couldn't trace, because we were relying on their all-in-one solution, which depended on settings that weren't handled properly and that their own documentation didn't cover. Never again!
    MCSA (2003/XP), Security+, CCNA

    ** Remember: credit where credit is due, and reputation points as appropriate **