Multipathing iSCSI devices can be implemented at different levels in the Solaris storage protocol stack.
The following figure shows the Solaris block I/O stack.
Note – iSCSI Multiple Connections per Session (MC/S) is currently not supported in Solaris but might be available in a future release.
iSCSI is built on the Solaris IP Stack, which includes:
• IP multipathing (IPMP) over TCP/IP
• Above IPMP, iSCSI provides native multipathing using MC/S
• At a higher level, independent of the transport layer, Solaris provides multipathing software (MPxIO). Because MPxIO is independent of the transport, it can multipath a target that is visible on both iSCSI and FC ports.
Because of their location in the network protocol stack, each multipath solution is useful for different purposes.
IP Multipathing (IPMP)
IP Multipathing (IPMP) is a native Solaris facility for network multipathing. Operating at the IP layer in the networking stack, IPMP provides failover and aggregation across two or more NICs. For more information about IPMP, see the Solaris 10 System Administration Guide: IP Services, at http://docs.sun.com/app/docs/doc/816-4554.
To implement IPMP, a system administrator selects NICs that are on the same subnet and places them in logical IPMP groups. A daemon (part of the IPMP system) monitors the health of the ports and can be configured to monitor connections to specific iSCSI targets. In the event of a port failure, the other port on the same subnet assumes the same Media Access Control (MAC) address as the failed NIC, and the iSCSI connection continues uninterrupted.
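The failover behavior just described can be modeled in a few lines. The sketch below is purely illustrative (interface names and MAC addresses are hypothetical), not the actual IPMP daemon:

```python
# Illustrative model of IPMP port failover (not the actual IPMP daemon).
# Two NICs on the same subnet form one IPMP group; when one fails, a
# surviving NIC assumes the failed NIC's MAC address so the iSCSI
# connection continues uninterrupted.

class Nic:
    def __init__(self, name, mac):
        self.name = name
        self.mac = mac          # MAC address currently presented on the wire
        self.healthy = True

class IpmpGroup:
    def __init__(self, nics):
        self.nics = nics

    def fail(self, name):
        """Mark a NIC failed and move its MAC to a healthy group member."""
        failed = next(n for n in self.nics if n.name == name)
        failed.healthy = False
        survivor = next(n for n in self.nics if n.healthy)
        survivor.mac = failed.mac   # survivor takes over the failed MAC
        return survivor

# Hypothetical interface names and MACs for illustration.
group = IpmpGroup([Nic("e1000g0", "0:3:ba:1:2:3"),
                   Nic("e1000g1", "0:3:ba:4:5:6")])
survivor = group.fail("e1000g0")
# e1000g1 now answers for the failed NIC's MAC address.
```

Because the surviving port presents the failed port's MAC address, the target sees no change and the iSCSI session never notices the failure.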
The following figure shows a sample configuration for IPMP.
Figure 2. IP Multipathing (IPMP)
IPMP participates in dynamic reconfiguration (DR). On systems that support DR, administrators can replace NICs without disrupting networking traffic. When a NIC is replaced, it is added back to the IPMP group and used thereafter for I/O.
When used in combination with iSCSI, the major limitation of IPMP is that it does not multipath across multiple target ports. IPMP provides redundancy between host ports but cannot fail over to a different target port.
iSCSI Native Multipathing
The iSCSI specification addresses the requirement for redundant physical connections. While FC SANs also support multiple paths, iSCSI differs in that the specification itself defines what multipathing is supported.
In TCP/IP, connections describe communication between two portals. A session is the association between an initiator and a target, either of which may have one or more portals. Multiple Connections per Session (MC/S) allows initiator portals to communicate with target portals in a coordinated manner. Both target portal and initiator portal redundancy are supported, as is link aggregation. The following figure shows one configuration that supports MC/S.
Figure 3. Multiple Connection/Session (MC/S)
MC/S also allows (but does not require) more sophisticated error handling than simply retrying a command. This error recovery allows commands from a failed connection to be recovered quickly by other good connections in the same session. The SCSI layer is not aware of the error.
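As a rough illustration of this recovery model (not the specification's actual error-recovery machinery), commands in flight on a failed connection can be reassigned to a surviving connection in the same session:

```python
# Sketch of MC/S-style connection recovery within one session (illustrative
# only; real error recovery is defined by the iSCSI specification).
# Commands in flight on a failed connection are moved to a surviving
# connection in the same session, so the SCSI layer never sees the error.

class Session:
    def __init__(self, connections):
        # Map each connection name to its list of in-flight commands.
        self.in_flight = {c: [] for c in connections}

    def issue(self, conn, cmd):
        self.in_flight[conn].append(cmd)

    def recover(self, failed_conn):
        """Move commands from a failed connection to a surviving one."""
        orphaned = self.in_flight.pop(failed_conn)
        survivor = next(iter(self.in_flight))
        self.in_flight[survivor].extend(orphaned)  # retried transparently
        return survivor, orphaned

# Hypothetical connection names and commands.
session = Session(["conn0", "conn1"])
session.issue("conn0", "READ lba=0")
session.issue("conn1", "WRITE lba=8")
survivor, moved = session.recover("conn0")
```

The key point is that recovery happens entirely within the session, below the SCSI layer, so no SCSI-level retry is required.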
In general, iSCSI vendors do not yet support MC/S. Therefore, MC/S is not supported in the Solaris 10 Update 1 release of the Solaris software initiator, but it might be supported in a future release.
Sun Multipathing Software (MPxIO)
MPxIO is a Solaris component that supports multiple physical paths to storage. MPxIO is the current Solaris functionality that supports multiple physical FC connections. Because MPxIO operates above the transport layer (at the SCSI protocol layer), it can support FC, InfiniBand (IB), and iSCSI in certain configurations. For more information about MPxIO, see http://www.sun.com/products-n-solutions/hardware/docs/Software/Storage_Software/Sun_StorEdge_Traffic_Manager/.
FC and iSCSI drivers register logical units (LUNs) with MPxIO. MPxIO matches paths to the same logical unit at the SCSI protocol layer by querying each device's unique per-LUN SCSI identifier. MPxIO collapses duplicate paths to one device, so the target driver and the layers above it see only a single device.
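The matching step can be sketched as follows; the port names and LUN identifier value below are invented for illustration:

```python
# Sketch of how MPxIO collapses duplicate paths (illustrative; the port
# names and LUN GUID are made up). Paths whose LUNs report the same
# unique per-LUN SCSI identifier are folded into one logical device.

def collapse(paths):
    """Group paths by the LUN's unique identifier: one device per LUN."""
    devices = {}
    for path in paths:
        devices.setdefault(path["lun_guid"], []).append(path["port"])
    return devices

# The same LUN reported over an FC port and an iSCSI port.
paths = [
    {"port": "fc:wwn-210000e08b000000",
     "lun_guid": "600A0B80-example"},
    {"port": "iscsi:iqn.1986-03.com.sun-example,1",
     "lun_guid": "600A0B80-example"},
]
devices = collapse(paths)
# One logical device remains, with two underlying paths.
```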
The iSCSI initiator driver determines which device(s) to register by examining the SCSI target port identifier of the target. The target port identifier consists of two parts:
• target node name
• target portal group tag (TPGT)
These two parts are concatenated, as shown in the following example target port identifier:
iqn.1921-02.com.sun.12432,1
where the target node name is iqn.1921-02.com.sun.12432 and the TPGT is 1.
The iSCSI initiator registers an instance with MPxIO for each LUN for every unique target port identifier.
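A minimal sketch of this registration rule follows; the comma separator between node name and TPGT, and the LUN names, are assumptions made for illustration:

```python
# Sketch of per-LUN registration with MPxIO (illustrative). The target
# port identifier is the target node name concatenated with the target
# portal group tag (TPGT); the comma separator shown here is an assumption.

def target_port_id(node_name, tpgt):
    return f"{node_name},{tpgt}"

def register_instances(node_name, tpgts, luns):
    """One MPxIO instance per LUN for every unique target port identifier."""
    return [(target_port_id(node_name, t), lun)
            for t in tpgts
            for lun in luns]

# Two portal group tags for the same target node and one LUN yield two
# registered instances, which MPxIO can then match and collapse.
instances = register_instances("iqn.1921-02.com.sun.12432", [1, 2], ["lun0"])
```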
MPxIO and Multiple SCSI Target Port Identifiers
MPxIO might seem to be the ideal solution to the current lack of native iSCSI multipathing support in the Solaris initiator. However, in order for MPxIO to support an iSCSI target, the target must support configuring different SCSI target port identifiers for each portal. One method of doing this, as shown in the following figure, is to allocate portals into multiple target portal groups so that the TPGT makes the target port identifier unique.
Figure 4. MPxIO with Multiple Target Port Identifiers
Another method is simply to use a different iSCSI target name per portal. Array vendors can choose either of these approaches to create unique target port identifiers.
The target’s port configuration determines whether MC/S or MPxIO can be used for multipathing.
• If an iSCSI target supports MC/S, it will present all of its target portals in a single target portal group. With such a target, all target portals form one logical SCSI target port, and the Solaris iSCSI driver therefore registers only one instance of a LUN with MPxIO.
• If an iSCSI target supports MPxIO, it will present its portals in different target portal groups. Different target portal groups force different sessions, so MC/S cannot be used for target port redundancy.
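The two cases above reduce to a simple decision on the target's portal-group layout; the sketch below is illustrative, with hypothetical portal addresses:

```python
# Sketch of how a target's portal-group layout determines the multipathing
# model (illustrative). tpg_map maps each TPGT to its list of portals.

def multipath_model(tpg_map):
    if len(tpg_map) == 1:
        # All portals form one logical SCSI target port: MC/S is possible,
        # and the initiator registers only one instance of each LUN.
        return ("MC/S", 1)
    # Different portal groups force different sessions: one registered
    # instance per group, multipathed by MPxIO.
    return ("MPxIO", len(tpg_map))

# Hypothetical portal layouts.
mcs_target   = {1: ["10.0.0.1:3260", "10.0.0.2:3260"]}
mpxio_target = {1: ["10.0.0.1:3260"], 2: ["10.0.0.2:3260"]}
```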
MPxIO with Dual SCSI/FC Bridges
MPxIO can also be used when there are dual iSCSI-to-FC bridges to a Fibre Channel SAN, as shown in the following figure.
As in the previous example, each LUN has a different target identifier because the iSCSI specification requires unique names for different devices. iSCSI presents both instances to MPxIO, and then MPxIO matches the unique SCSI per LUN identifier, finds that they are identical, and presents one target to the target driver.
MPxIO with Different Transports to the Same Device
Because MPxIO is above the transport layer, MPxIO can support different transports to the same device. In the example configuration shown in the following figure, one LUN appears to the host via FC and iSCSI paths. In this configuration, MPxIO will utilize both paths.
Figure 6. MPxIO with IP/FC Bridge
LUN0 at the disk array appears to both the IP NIC and the FC HBA in the host. MPxIO consolidates the two paths into one and presents it to the target drivers. This is how bridges work today; arrays that support both FC and iSCSI connections natively can use the same mechanism.
Note that, in this configuration, MPxIO performs its default load balancing. For a symmetric access device, this is generally round-robin load balancing, so I/O requests alternate between active links regardless of the links' relative performance. Because load balancing is round robin, MPxIO is most useful in configurations in which all links between initiator and target have equal bandwidth and latency.
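A round-robin selector of the kind described can be sketched in a few lines; the path names are illustrative, and this is a model of the policy, not MPxIO's implementation:

```python
# Sketch of round-robin path selection, MPxIO's default policy for a
# symmetric access device (illustrative). I/O requests simply alternate
# across the active paths, regardless of each link's relative performance.
import itertools

def round_robin(paths):
    """Return an iterator that cycles through the active paths forever."""
    return itertools.cycle(paths)

# Hypothetical FC and iSCSI paths to the same LUN.
selector = round_robin(["fc-path", "iscsi-path"])
order = [next(selector) for _ in range(4)]
# order == ["fc-path", "iscsi-path", "fc-path", "iscsi-path"]
```

The strict alternation is why equal bandwidth and latency on all links matters: a slow link receives the same share of I/O as a fast one.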
References
• Sun Microsystems, Inc. "Configuring iSCSI Initiators," in System Administration Guide: Devices and File Systems, Solaris 10 Product Documentation.
• Sun Microsystems, Inc. Solaris Fibre Channel and Storage Multipathing Administration Guide, Solaris 10 Product Documentation.
• Sun Microsystems, Inc. System Administration Guide: IP Services, Solaris 10 Product Documentation.
• Mark Garner, Internet Protocol Network Multipathing (Updated), Sun BluePrints™ OnLine, November 2002.
• Enterprise Network Design Patterns: High Availability, Sun BluePrints OnLine, December 2003. http://www.sun.com/blueprints/1203/817-4683.pdf