EMC and MPIO in AIX

You can run into an issue with EMC storage on AIX systems using MPIO (without PowerPath) for the boot disks:

After installing the EMC Symmetrix ODM definitions on your client system, the system won’t boot any more and will hang with LED 554 (unable to find boot disk).

The boot hang (LED 554) is not caused by the EMC ODM package itself, but by the boot process failing to detect a path to the boot disk when the first MPIO path does not correspond to the fscsiX driver instance where all the hdisks are configured. Let me explain in more detail:

Let’s say we have an AIX system with four HBAs configured in the following order:

# lscfg -v | grep fcs
fcs2 (wwn 71ca) -> no devices configured behind this fscsi2 driver instance (path only configured in CuPath ODM table)
fcs3 (wwn 71cb) -> no devices configured behind this fscsi3 driver instance (path only configured in CuPath ODM table)
fcs0 (wwn 71e4) -> no devices configured behind this fscsi0 driver instance (path only configured in CuPath ODM table)
fcs1 (wwn 71e5) -> ALL devices configured behind this fscsi1 driver instance

Looking at the MPIO path configuration, here is what we have for the rootvg disk:

# lspath -l hdisk2 -H -F"name parent path_id connection status"
name   parent path_id connection                      status
hdisk2 fscsi0 0       5006048452a83987,33000000000000 Enabled
hdisk2 fscsi1 1       5006048c52a83998,33000000000000 Enabled
hdisk2 fscsi2 2       5006048452a83986,33000000000000 Enabled
hdisk2 fscsi3 3       5006048c52a83999,33000000000000 Enabled

The fscsi1 driver instance is the second path (path_id 1), so remove the three other paths, keeping only the path corresponding to fscsi1:

# rmpath -l hdisk2 -p fscsi0 -d
# rmpath -l hdisk2 -p fscsi2 -d
# rmpath -l hdisk2 -p fscsi3 -d
# lspath -l hdisk2 -H -F"name parent path_id connection status"

Afterwards, run savebase to update the boot logical volume (hd5), set the bootlist to hdisk2, and reboot the host.
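For reference, that sequence could look like this (a sketch, assuming hdisk2 is the boot disk):

# savebase -v
# bootlist -m normal hdisk2
# shutdown -Fr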

It will come up successfully, with no more LED 554 hang.

When checking the status of the rootvg disk, a new hdisk10 has been configured with the correct ODM definitions as shown below:

# lspv
hdisk10 0003027f7f7ca7e2 rootvg active
# lsdev -Cc disk
hdisk2 Defined   00-09-01 MPIO Other FC SCSI Disk Drive
hdisk10 Available 00-08-01 EMC Symmetrix FCP MPIO Raid6

To summarize, it is recommended to set up ONLY ONE path when installing AIX to a SAN disk, then install the EMC ODM package, reboot the host, and only after that is complete, add the other paths. By doing that, we ensure that the fscsiX driver instance used for the boot process has the hdisk configured behind it.
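Re-adding the remaining paths afterwards is just a device rescan; a minimal sketch, assuming the rootvg disk is hdisk2:

# cfgmgr
# lspath -l hdisk2 -H -F"name parent path_id connection status"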

Configuring MPIO

Use the following steps to set up the scenario:

  1. Create two Virtual I/O Server partitions and name them VIO_Server1 and VIO_Server2. When creating each Virtual I/O Server partition, select one Fibre Channel adapter in addition to the physical adapter.
  2. Install both VIO Servers using CD or a NIM server.
  3. Change the fc_err_recov attribute to fast_fail and the dyntrk attribute to yes on the Fibre Channel adapters.

Use the lsdev -type adapter command to find the Fibre Channel adapters.
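Illustrative output (adapter names and descriptions will vary by system):

$ lsdev -type adapter
name   status      description
ent0   Available   2-Port 10/100/1000 Base-TX PCI-X Adapter
fcs0   Available   FC Adapter
fcs1   Available   FC Adapter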

$ chdev -dev fscsi0 -attr fc_err_recov=fast_fail dyntrk=yes -perm

fscsi0 changed

$ lsdev -dev fscsi0 -attr
attribute     value      description                             user_settable
attach        switch     How this adapter is CONNECTED           False
dyntrk        yes        Dynamic Tracking of FC Devices          True
fc_err_recov  fast_fail  FC Fabric Event Error RECOVERY Policy   True
scsi_id       0x660c00   Adapter SCSI ID                         False
sw_fc_class   3          FC Class for Fabric                     True

Important: If you have two or more Fibre Channel adapters per Virtual I/O Server, you have to change the attributes for each of them.
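For example, the same change for a second adapter would be:

$ chdev -dev fscsi1 -attr fc_err_recov=fast_fail dyntrk=yes -perm
fscsi1 changed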

  4. Reboot the VIO Servers for the changes to the Fibre Channel devices to take effect.
  5. Create the client partitions. The following chart shows the required virtual SCSI adapters:
VIO Server    VIO Server Slot   Client Partition   Client Slot
VIO_Server1   30                DB_Server          21
VIO_Server1   40                Apps_Server        21
VIO_Server2   30                DB_Server          22
VIO_Server2   40                Apps_Server        22
  6. Also add two virtual Ethernet adapters to each client to provide highly available network access (one adapter is enough if you plan on using SEA failover for network redundancy).
  7. On VIO_Server1 and VIO_Server2, use the fget_config command to get the LUN-to-hdisk mappings.

# fget_config -vA

---dar0---

User array name = 'FAST200'
dac0 ACTIVE dac1 ACTIVE

Disk     DAC    LUN   Logical Drive
utm             1
hdisk0   dac1   0     1
hdisk1   dac0   2     2
hdisk2   dac0   3     4
hdisk3   dac1   4     3
hdisk4   dac1   5     5
hdisk5   dac0   6     6

You can also use the lsdev -dev hdiskN -vpd command, where N is the hdisk number, to retrieve this information.

  8. The disks are to be accessed through both VIO Servers, so the reserve_policy attribute for each disk must be set to no_reserve on VIO_Server1 and VIO_Server2.

$ chdev -dev hdisk2 -attr reserve_policy=no_reserve

hdisk2 changed

$ chdev -dev hdisk3 -attr reserve_policy=no_reserve

hdisk3 changed

9. Check, using the lsdev command, that the reserve_policy attribute is now set to no_reserve:

$ lsdev -dev hdisk2 -attr
attribute       value                             description                              user_settable
PR_key_value    none                              Persistant Reserve Key Value             True
cache_method    fast_write                        Write Caching method                     False
ieee_volname    600A0B8000110D0E0000000E47436859  IEEE Unique volume name                  False
lun_id          0x0003000000000000                Logical Unit Number                      False
max_transfer    0x100000                          Maximum TRANSFER Size                    True
prefetch_mult   1                                 Multiple of blocks to prefetch on read   False
pvid            none                              Physical volume identifier               False
q_type          simple                            Queuing Type                             False
queue_depth     10                                Queue Depth                              True
raid_level      5                                 RAID Level                               False
reassign_to     120                               Reassign Timeout value                   True
reserve_policy  no_reserve                        Reserve Policy                           True
rw_timeout      30                                Read/Write Timeout value                 True
scsi_id         0x660a00                          SCSI ID                                  False
size            20480                             Size in Mbytes                           False
write_cache     yes                               Write Caching enabled                    False

10. Double-check on both Virtual I/O Servers that the vhost adapters have the correct slot numbers by running the lsmap -all command, as in the sketch below.
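Truncated illustrative output (the Physloc value is hypothetical; the point is the -C30 suffix matching the slot numbers in the chart above):

$ lsmap -all
SVSA            Physloc                                      Client Partition ID
--------------- -------------------------------------------- ------------------
vhost0          U9117.MMA.XXXXXXX-V1-C30                     0x00000002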

11. Map the hdisks to the vhost adapters using the mkvdev command:

$ mkvdev -vdev hdisk2 -vadapter vhost0 -dev app_server

app_server Available

$ mkvdev -vdev hdisk3 -vadapter vhost1 -dev db_server

db_server Available

12. Install the AIX OS in client partitions.

Configuring MPIO in the client partitions

  1. Check the MPIO configuration by running the following commands:

# lspv

# lsdev -Cc disk

hdisk0 Available Virtual SCSI Disk Drive

  2. Run the lspath command to verify that the disk is attached using two different paths. The output below shows that hdisk0 is attached through the vscsi0 and vscsi1 adapters, which point to different Virtual I/O Servers. Both Virtual I/O Servers are up and running, and both paths are enabled.

# lspath

Enabled hdisk0 vscsi0

Enabled hdisk0 vscsi1

  3. Enable the health check mode for the disk so that the status of the disks is automatically updated:

# chdev -l hdisk0 -a hcheck_interval=20 -P

hdisk0 changed
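Because the change was made with -P, it is written to the ODM and takes effect at the next reboot; the setting can be verified with lsattr:

# lsattr -El hdisk0 -a hcheck_interval
hcheck_interval 20 Health Check Interval True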

How to Set the date and time of the VNX?

Set Control Station date and time

You must log in as root to perform this operation.

To set the date and time for a Control Station, use this command syntax:
# date -s "<hh:mm mm/dd/yy>"
where:
<hh:mm mm/dd/yy> = time and date format
Example:
To set the date and time to 5:40 P.M. on August 8, 2012, type:
# date -s "17:40 08/08/12"

Set Data Mover or blade date and time

You can customize the display of the date and time on a Data Mover or blade by
using the server_date command. Configuring Time Services on VNX provides
additional information on time services.

To set the current date and time for a Data Mover or blade, use this command syntax:
$ server_date <movername> <yymmddhhmm> [<ss>]
where:
<movername> = name of the Data Mover or blade
<yymmddhhmm> [<ss>] = <yy> is the year; the first <mm> is the month; <dd> is the day;
<hh> is the hour (in the 24-hour system); the second <mm> is the minute; and <ss> is the second.
Example:
To set the date and time on server_2 to July 4, 2012, 10:30 A.M., type:
$ server_date server_2 1207041030
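To verify, run server_date without a time argument to display the Data Mover’s current date and time (output format is illustrative):

$ server_date server_2
server_2 : Wed Jul  4 10:30:00 IST 2012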

How to Configure NTP service using the CLI in VNX NAS?

1. Log in to the Control Station as root.

2. Check the status of the NTP daemon by typing:
# ps -ef |grep ntpd
Output:
ntp      30818     1  0 Aug07 ?        00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid

3. Display information about the ntpd status by typing:
# /sbin/service ntpd status
Output:
ntpd is stopped

4. Display information about the ntpd configuration by typing:
# /sbin/chkconfig ntpd --list
Output:
ntpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

5. Open the /etc/ntp.conf file for editing.

6. Add the NTP server IP address to the file by typing:
server 10.xx.xx.xx

7. Save the file and exit.

8. Open the /etc/ntp/step-tickers file for editing.

9. Add the NTP server IP address to the file by typing (the step-tickers file takes just the address, without the server keyword):
10.xx.xx.xx

10. Save the file and exit.

11. Set up the NTP daemon for run-levels 3, 4, and 5 by typing:
# /sbin/chkconfig --level 345 ntpd on

12. Display information about the ntpd configuration by typing:
# /sbin/chkconfig ntpd --list
Output:
ntpd 0:off 1:off 2:off 3:on 4:on 5:on 6:off

13. Start or restart the NTP daemon by typing:
# /sbin/service ntpd start
Output:
ntpd: Synchronizing with time server: [ OK ]
Starting ntpd: [ OK ]
# /sbin/service ntpd restart
Output:
Shutting down ntpd: [ OK ]
ntpd: Synchronizing with time server: [ OK ]
Starting ntpd: [ OK ]

Note: If the response for synchronizing with the time server is positive, the NTP client
was able to communicate with the NTP server.

14. Check the status of the NTP daemon by typing:
# ps -ef |grep ntp
Output:
ntp 25048 1 0 13:09 ? 00:00:00 ntpd -u ntp:ntp -p
/var/run/ntpd.pid

15. Display information about the ntpd status by typing:
# /sbin/service ntpd status
Output:
ntpd (pid 25346) is running…

16. Display the list and status of the peers for the NTP server by typing:
# /usr/sbin/ntpq -p
Output:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*10.0.50.xx      212.26.18.41     2 u  655 1024  377    1.755  -17.314   1.915

How to Change VNX Control Station time zone using the CLI?

Steps:
1. Log in to the Control Station as root.

2. To verify the current environment, type:
# date
Output:
Wed Aug  8 16:22:54 IST 2012

3. Display information about the current time zone of the Control Station by typing:
# ls -la /etc/localtime
Output:
lrwxrwxrwx 1 root root 32 Aug  7 22:48 /etc/localtime -> /usr/share/zoneinfo/Asia/Kolkata

4. Set the hardware clock to the current time zone of the Control Station by typing:
# vi /etc/sysconfig/clock
When the file opens, type:
ZONE="America/New_York"
UTC=false
ARC=false

5. Save the file and exit.

6. Change the current time zone, New York, to Asia/Kolkata, by typing:
# /usr/bin/perl /nas/http/webui/bin/timezone.pl -s Asia/Kolkata
Note: A list of valid Linux time zones is located in the /usr/share/zoneinfo directory.

7. To verify the current environment, type:
# date
Output:
Wed Aug  8 16:22:54 IST 2012

8. Display information about the current time zone of the Control Station by typing:
# ls -la /etc/localtime
Output:
lrwxrwxrwx 1 root root 32 Aug  7 22:48 /etc/localtime -> /usr/share/zoneinfo/Asia/Kolkata

9. Set the hardware clock to the current time zone of the Control Station by typing:
# vi /etc/sysconfig/clock
When the file opens, type:
ZONE="Asia/Kolkata"
UTC=false
ARC=false

10. Save the file and exit.

11. The time zone of the Control Station is changed to the new location specified in step 6.

How to Set the time zone of the VNX Data Mover?

You can update the time zone information on the Data Mover by using simple and decipherable strings that correspond to the time zones available on the Control Station. You can also update the daylight saving time on the Data Mover for the specified time zone.

Set Data Mover or blade time zone manually

To set the time zone on a Data Mover using the Linux time zone method, use this command
syntax:
$ server_date <movername> timezone -name <timezonename>
where:
<movername> = name of the Data Mover
<timezonename> = a Linux style time zone specification
Note: A list of valid Linux time zones is located in the /usr/share/zoneinfo directory.
Example:
To set the time zone to Asia/Kolkata and adjust the daylight saving time for a Data Mover by using the Linux method, type:
$ server_date server_2 timezone -name Asia/Kolkata

EMC NAS / VNX Health Checkup using command line

Log in as nasadmin and verify the system’s health by typing:
$ /nas/bin/nas_checkup
The checkup command reports back on the state of the Control Station, Data Movers, and storage system.
Note: This health check ensures that there are no major errors in the system that would prevent the system from being turned on during the power up process.

[nasadmin@VNXCS01 ~]$ /nas/bin/nas_checkup
Check Version: 7.0.51.3
Check Command: /nas/bin/nas_checkup
Check Log    : /nas/log/checkup-run.120807-113919.log

-------------------------------------Checks-------------------------------------
Control Station: Checking statistics groups database………………….. Pass
Control Station: Checking if file system usage is under limit………….. Pass
Control Station: Checking if NAS Storage API is installed correctly…….. Pass
Control Station: Checking if NAS Storage APIs match…………………… Pass
Control Station: Checking if NBS clients are started………………….. Pass
Control Station: Checking if NBS configuration exists…………………. Pass
Control Station: Checking if NBS devices are accessible……………….. Pass
Control Station: Checking if NBS service is started…………………… Pass
Control Station: Checking if PXE service is stopped…………………… Pass
Control Station: Checking if standby is up…………………………… Pass
Control Station: Checking integrity of NASDB…………………………. Pass
Control Station: Checking if primary is active……………………….. Pass
Control Station: Checking all callhome files delivered………………… Warn
Control Station: Checking resolv conf……………………………….. Pass
Control Station: Checking if NAS partitions are mounted……………….. Pass
Control Station: Checking ipmi connection……………………………. Pass
Control Station: Checking nas site eventlog configuration……………… Pass
Control Station: Checking nas sys mcd configuration…………………… Pass
Control Station: Checking nas sys eventlog configuration………………. Pass
Control Station: Checking logical volume status………………………. Pass
Control Station: Checking valid nasdb backup files……………………. Pass
Control Station: Checking root disk reserved region…………………… Pass
Control Station: Checking if RDF configuration is valid………………..  N/A
Control Station: Checking if fstab contains duplicate entries………….. Pass
Control Station: Checking if sufficient swap memory available………….. Pass
Control Station: Checking for IP and subnet configuration……………… Pass
Control Station: Checking auto transfer status……………………….. Warn
Control Station: Checking for invalid entries in etc hosts…………….. Pass
Control Station: Checking the hard drive in the control station………… Pass
Control Station: Checking if Symapi data is present…………………… Pass
Control Station: Checking if Symapi is synced with Storage System………. Pass
Blades         : Checking boot files………………………………… Pass
Blades         : Checking if primary is active……………………….. Pass
Blades         : Checking if root filesystem is too large……………… Pass
Blades         : Checking if root filesystem has enough free space……… Pass
Blades         : Checking network connectivity……………………….. Pass
Blades         : Checking status……………………………………. Pass
Blades         : Checking dart release compatibility………………….. Pass
Blades         : Checking dart version compatibility………………….. Pass
Blades         : Checking server name……………………………….. Pass
Blades         : Checking unique id…………………………………. Pass
Blades         : Checking CIFS file server configuration………………. Pass
Blades         : Checking domain controller connectivity and configuration. Pass
Blades         : Checking DNS connectivity and configuration…………… Pass
Blades         : Checking connectivity to WINS servers………………… Pass
Blades         : Checking I18N mode and unicode translation tables……… Pass
Blades         : Checking connectivity to NTP servers…………………. Warn
Blades         : Checking connectivity to NIS servers…………………. Pass
Blades         : Checking virus checker server configuration…………… Pass
Blades         : Checking if workpart is OK………………………….. Pass
Blades         : Checking if free full dump is available………………. Pass
Blades         : Checking if each primary Blade has standby……………. Pass
Blades         : Checking if Blade parameters use EMC default values……. Pass
Blades         : Checking VDM root filesystem space usage………………  N/A
Blades         : Checking if file system usage is under limit………….. Pass
Blades         : Checking slic signature…………………………….. Pass
Storage System : Checking disk emulation type………………………… Pass
Storage System : Checking disk high availability access……………….. Pass
Storage System : Checking disks read cache enabled……………………. Pass
Storage System : Checking disks and storage processors write cache enabled. Pass
Storage System : Checking if FLARE is committed………………………. Pass
Storage System : Checking if FLARE is supported………………………. Pass
Storage System : Checking array model……………………………….. Pass
Storage System : Checking if microcode is supported……………………  N/A
Storage System : Checking no disks or storage processors are failed over… Pass
Storage System : Checking that no disks or storage processors are faulted.. Pass
Storage System : Checking that no hot spares are in use……………….. Pass
Storage System : Checking that no hot spares are rebuilding……………. Pass
Storage System : Checking minimum control lun size……………………. Pass
Storage System : Checking maximum control lun size…………………….  N/A
Storage System : Checking maximum lun address limit…………………… Pass
Storage System : Checking system lun configuration……………………. Pass
Storage System : Checking if storage processors are read cache enabled….. Warn
Storage System : Checking if auto assign are disabled for all luns………  N/A
Storage System : Checking if auto trespass are disabled for all luns…….  N/A
Storage System : Checking storage processor connectivity………………. Pass
Storage System : Checking control lun ownership……………………….  N/A
Storage System : Checking if Fibre Channel zone checker is set up……….  N/A
Storage System : Checking if Fibre Channel zoning is OK………………..  N/A
Storage System : Checking if proxy arp is setup………………………. Pass
Storage System : Checking if Product Serial Number is Correct………….. Pass
Storage System : Checking SPA SPB communication………………………. Pass
Storage System : Checking if secure communications is enabled………….. Pass
Storage System : Checking if backend has mixed disk types……………… Pass
Storage System : Checking for file and block enabler………………….. Pass
Storage System : Checking if nas storage command generates discrepancies… Pass
Storage System : Checking if Repset and CG configuration are consistent…. Pass
Storage System : Checking block operating environment…………………. Pass
Storage System : Checking thin pool usage…………………………….  N/A
Storage System : Checking for domain and federations health on VNX……… Pass
--------------------------------------------------------------------------------

One or more warnings have occurred. It is recommended that you follow the
instructions provided to correct the problem then try again.

-----------------------------------Information----------------------------------
Control Station: Check if standby is up
Information HC_CS_27389984778: The standby Control Station is
currently powered on. It will be powered off during upgrade, and then
later restarted and upgraded.

--------------------------------------------------------------------------------

------------------------------------Warnings------------------------------------
Control Station: Check all callhome files delivered
Warning HC_CS_18800050328: There are 36 undelivered Call Home
incidents and 3 scheduled Call Home files left in the
/nas/log/ConnectHome directory(es)
Action :

Check the /nas/log/connectemc/ConnectEMC log to ensure the connection
is established correctly. To test your Callhome configuration, you can
run the /nas/sbin/nas_connecthome -test { -email_1 | -email_2 | -ftp_1 |
-ftp_2 | -modem_1 | -modem_2 } command. View the RSC*.xml files under
the /nas/log/ConnectHome directory(ies) and inspect the CDATA content
to find out and possibly resolve the problem. To remove the call home
incidents and files, run the command "/nas/sbin/nas_connecthome
-service clear". Otherwise escalate this issue through your support
organization.

Control Station: Check auto transfer status
Warning HC_CS_18800050417: The automatic transfer feature is disabled.
Action :

EMC recommends the automatic transfer feature to be enabled via
command:

/nas/tools/automaticcollection -enable

or from Unisphere:

1. Select VNX > [VNX_name] > System. Click the link for "Manage Log
Collection for File" under Service Tasks.
2. Select Enable Automatic Transfer.
3. Click Apply.

By default, support materials will be transferred to ftp.emc.com,
but you can modify the location in the
/nas/site/automaticcollection.cfg file. For more information, search
the Knowledgebase on Powerlink as follows:
1. Log in to http://powerlink.emc.com and go to Support >
Knowledgebase Search> Support Solutions Search.
2. Use ID emc221733 to search.

Blades : Check connectivity to NTP servers
Warning HC_DM_18800115743:
* server_2: Only one NTP server is configured. It is recommended to
define at least two different NTP servers for a high availability.
If the clock of the Data Mover is not correct, potential errors
during Kerberos authentication may happen (timeskew).
Action : Use the server_date command to define another NTP server on
the Data Mover. Read the man pages for details and examples.

Storage System : Check if storage processors are read cache enabled
Warning HC_BE_18799984735: SPA Read Cache State on VNX FCN0xxxxxxxx5
is not enabled
Action : Please contact EMC Customer Service for assistance. Include
this log with your support request.

Storage System : Check if storage processors are read cache enabled
Warning HC_BE_18799984735: SPB Read Cache State on VNX FCNxxxxxxxxx5
is not enabled
Action : Please contact EMC Customer Service for assistance. Include
this log with your support request.

--------------------------------------------------------------------------------

[nasadmin@VNXCS01 ~]$

EMC Client installation and checking

This page is a quick guide on what to install and how to check that an EMC SAN is attached and working.

Solaris

Installing
==========================================================
Install Emulex driver/firmware, san packages (SANinfo, HBAinfo, lputil), EMC powerpath
Use lputil to update firmware
Use lputil to disable boot bios
Update /kernel/drv/lpfc.conf
Update /kernel/drv/sd.conf
Reboot
Install ECC agent

Note: when adding disks on a different FA, a server reboot may be required.

List HBAs                 /usr/sbin/hbanyware/hbacmd listHBAS   (use to get WWNs)
                          /opt/HBAinfo/bin/gethbainfo           (script wrapped around hbainfo)
                          grep 'WWN' /var/adm/messages
HBA attributes            /opt/EMLXemlxu/bin/emlxadm
                          /usr/sbin/hbanyware/hbacmd HBAAttrib 10:00:00:00:c9:49:28:47
HBA port                  /opt/EMLXemlxu/bin/emlxadm
                          /usr/sbin/hbanyware/hbacmd PortAttrib 10:00:00:00:c9:49:28:47
HBA firmware              /opt/EMLXemlxu/bin/emlxadm
Fabric login              /opt/HBAinfo/bin/gethbainfo           (script wrapped around hbainfo)
Adding additional disks   cfgadm -c configure c2
Disk available            cfgadm -al -o show_SCSI_lun
                          echo | format
                          inq                                   (use to get serial numbers)
Labelling                 format
Partitioning              vxdiskadm
                          format
Filesystem                newfs or mkfs

Linux

Installing
==========================================================

Install Emulex driver, san packages (saninfo, hbanyware), firmware (lputil)
Configure /etc/modprobe.conf
Use lputil to update firmware
Use lputil to disable boot bios
Create a new ram disk so changes to modprobe.conf can take effect.
Reboot
Install ECC agent

List HBAs          /usr/sbin/hbanyware/hbacmd listHBAS   (use to get WWNs)
                   cat /proc/scsi/lpfc/*
HBA attributes     /usr/sbin/hbanyware/hbacmd HBAAttrib 10:00:00:00:c9:49:28:47
                   cat /sys/class/scsi_host/host*/info
HBA port           /usr/sbin/hbanyware/hbacmd PortAttrib 10:00:00:00:c9:49:28:47
HBA firmware       lputil
Fabric login       cat /sys/class/scsi_host/host*/state
Disk available     cat /proc/scsi/scsi
                   fdisk -l | grep -i Disk | grep sd
                   inq                                   (use to get serial numbers)
Labelling          parted -s /dev/sda mklabel msdos      (like labelling in Solaris)
                   parted -s /dev/sda print
Partitioning       fdisk
                   parted
Filesystem         mkfs -j -L <disk label> /dev/vx/dsk/datadg/vol01

PowerPath

HBA info                                /etc/powermt display
Disk info                               /etc/powermt display dev=all
Rebuild /kernel/drv/emcp.conf           /etc/powercf -q
Reconfigure PowerPath using emcp.conf   /etc/powermt config
Save the configuration                  /etc/powermt save
Enable and disable HBA cards (testing)  /etc/powermt display   (get card ID)
                                        /etc/powermt disable hba=3072
                                        /etc/powermt enable hba=3072

EMC Symmetrix Architecture

This document describes the EMC Symmetrix configuration. There are a number of EMC Symmetrix configurations, but they all use the same architecture, as detailed below.

Front End Director Ports (SA-16b:1)
Front End Director (SA-16b)
Cache
Back End Director (DA-02b)
Back End Director Ports (DA-02b:c)
Disk Devices

Front End Director
A channel director (front-end director) is a card that connects a host to the Symmetrix; each card can have up to four ports.

Cache
Symmetrix cache memory buffers I/O transfers between the director channels and the storage devices. The cache is divided up into regions to eliminate contention.

Back End Director
A disk director (back-end director) transfers data from disk to cache. Each back-end director can have up to four interfaces (C, D, E and F). Each back-end director interface can handle seven SCSI IDs (0-6).

Disk Devices
The disk devices that are attached to the back-end directors could be either SCSI or FC-AL.

Interconnect
The direct matrix interconnect is a matrix of high-speed connections to all components, with bandwidth up to 64 Gb/s.

SAN Components

There are many components to a SAN architecture. A host can connect to a SAN via direct connection or via a SAN switch.

Host HBA Host bus adapter cards are used to access SAN storage systems.
SAN Cables There are many types of cables and connectors:

Types: Multimode (<500m), single mode (>500m) and copper
Connectors: ST, SC (1Gb), LC (2Gb)

SAN Switches The primary function of a switch is to provide a physical connection and logical routing of data frames between the attached devices.

Support multiple protocols: Fibre channel, iSCSI, FCIP, iFCP
Type of switch: Workgroup, Directors

SAN Zoning Zoning is used to partition a fibre channel switched fabric into subsets of logical devices. Each zone contains a set of members that are permitted to access each other. Members are HBAs, switch ports and SAN ports (a hypothetical switch-side zoning sketch follows this list).

Types of zoning: hard, soft and mixed

Zone sets This is a group of zones that relate to one another; only one zone set can be active at any one time.
Storage arrays The storage array is where all the disk devices are located.
Volume access control This is also known as LUN masking. The storage array maintains a database that contains a map of the storage volumes and the WWNs that are allowed to access them. The VCM database in a Symmetrix contains the LUN masking information.
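As an illustration of soft (WWN-based) zoning, a Brocade-style CLI sequence might look like the sketch below; the zone name, config name and WWNs are hypothetical:

zonecreate "host1_sym", "10:00:00:00:c9:49:28:47; 50:06:04:84:52:a8:39:87"
cfgcreate "prod_cfg", "host1_sym"
cfgsave
cfgenable "prod_cfg"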

SAN Login

The following outlines the various processes that occur when a fibre channel device is connected to a SAN:

FLOGI (fabric login)
  What is needed?       Link initialization, cable, HBA and driver, switch port
  Information passed    WWN, S_ID, Protocol, Class
  Who communicates?     N_port to F_port
  Where to find it      Unix: syslog, switch utilities
                        Windows: Event Viewer, switch viewer

PLOGI (port login)
  What is needed?       FLOGI, zoning, persistent binding, driver setting
  Information passed    Zoning, WWN, S_ID, ULP, Class, BB Credit
  Who communicates?     N_port to N_port
  Where to find it      Unix: syslog, driver utilities
                        Windows: driver utilities

PRLI (process login)
  What is needed?       PLOGI, device masking (target), device mapping (initiator), driver setting (initiator)
  Information passed    LUN
  Who communicates?     ULP (SCSI-3 to SCSI-3)
  Where to find it      Unix: syslog, host-based volume management
                        Windows: driver utilities, host-based volume management, Device Manager
If any one of the above were to fail then the host will not be allowed to access the disks on the SAN.

VCM Database

The Symmetrix Volume Configuration Management (VCM) database stores access configurations that are used to grant host access to logical devices in a Symmetrix storage array.

The VCM database resides on a special system resource logical device, referred to as the VCMDB device, on each Symmetrix storage array.

Information stored in the VCM database includes, but is not limited to:

  • Host and storage World Wide Names
  • SID Lock and Volume Visibility settings
  • Native logical device data, such as the front-end directors and storage ports to which they are mapped

Masking operations performed on Symmetrix storage devices result in modifications to the VCM database in the Symmetrix array. The VCM database can be backed up, restored, initialized and activated. The Symmetrix SDM Agent must be running in order to perform VCM database operations (except deleting backup files).
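For reference, a VCM database backup could be taken with SYMCLI along these lines (assumed syntax; the SID and file name are hypothetical):

# symmaskdb -sid 1234 backup -file /tmp/vcmdb_backup.bin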

Switches

There are three models of switches: M-series (McData), B-series (Brocade) and the MDS-series (Cisco). Each switch offers a web interface and a CLI. The following tasks can be set on most switches:

  • Configure network params
  • Configure fabric params (BB Credit, R_A_TOV, E_D_TOV, switch PID format, Domain ID)
  • Enable/Disable ports
  • Configure port speeds
  • Configure Zoning
BB Credit Configure the number of buffers that are available to attached devices for frame receipt (default 16; values range 1-16).
R_A_TOV Resource allocation time-out value. This works with the E_D_TOV to determine switch actions when presented with an error condition.
E_D_TOV Error detect time-out value. This timer is used to flag a potential error condition when an expected response is not received within the set time.

Host HBA’s

The table below outlines which card will work with a particular O/S

Solaris   Emulex PCI (lputil), Qlogic
HPUX      PCI-X gigabit fibre channel and ethernet card
AIX       FC6227/6228/6239 using IBM native drivers
Windows   Emulex (HBAnyware or lputilnt)
Linux     Emulex PCI (lputil)

EMC PowerPath Migration Enabler – Host Copy (Windows)

Requirements:

  • PowerPath, at least version 5.1 SP1 (I would suggest at least 5.3 if running Windows 2003/2008)
  • PowerPath Migration Enabler key (host-copy)
  • Supported storage array for target and destination

Details:

To explain this migration technique a little more, PPME host copy is a block-level migration. It migrates the blocks on the LUN using the host to process the migration. This can cause performance degradation depending on how active the host is and the throttle you specify during the synchronization. Since the migration is at the block level, whatever the filesystem/alignment offset on the source LUN is, that is exactly what gets migrated. You can migrate to a larger LUN but NOT to a smaller LUN.
If you need to make a configuration change to the filesystem, such as correcting an alignment offset issue, PPME is NOT the tool. EMC Open Migrator (OM), however, can do that. OM can migrate LUNs online, but the setup and takedown require reboots (no fewer than three). I digress; let’s get on with the PPME migration.

Process:

The high-level process to complete a migration with PPME is this:

  • powermig setup -src <pseudoname> -tgt <pseudoname> -techtype hostcopy
  • powermig sync -handle <x>
  • powermig query -handle <x>
  • powermig throttle -handle <x> -throttlevalue <y>
  • powermig selecttarget -handle <x>
  • powermig commit -handle <x>
  • powermig cleanup -handle <x>

Note: Adding -noprompt to many of these commands, specifically those that take action, prevents the yes/no prompt.
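For example, using the session handle from the walkthrough below:

powermig sync -handle 1 -noprompt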

A live example of this migration follows.

C:\Documents and Settings\Administrator>powermt set policy=co dev=all

C:\Documents and Settings\Administrator>powermt save

C:\Documents and Settings\Administrator>powermt display dev=all
Pseudo name=harddisk2
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=600601607BD01A00F68F2AAC48C3DB11 [LUN 77]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun0     c3t0d0    SP A5     active  alive      0      7
3 port3\path0\tgt1\lun0     c3t1d0    SP B4     active  alive      0      1

Pseudo name=harddisk3
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=60060160C2021B00D8807DC433F7DD11 [LUN 72]
state=alive; policy=CLAROpt; priority=0; queued-IOs=1
Owner: default=SP A, current=SP A       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun1     c3t0d1    SP A5     active  alive      1      7
3 port3\path0\tgt1\lun1     c3t1d1    SP B4     active  alive      0      1

Pseudo name=harddisk4
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=60060160416121000560A5593994E111 [LUN 301]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun2     c3t0d2    SP A5     active  alive      0      1
3 port3\path0\tgt1\lun2     c3t1d2    SP B4     active  alive      0      1

Pseudo name=harddisk5
CLARiiON ID=FCNxxxxxxxx055 [SG_NOZOMI-SUP]
Logical device ID=6006016010A030009C49FE259E94E111 [LUN 300]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 4
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt2\lun0     c3t2d0    SP B1     active  alive      0      1

C:\Documents and Settings\Administrator>powermig setup -src harddisk4 -tgt harddisk5 -techtype hostcopy

Setup migration? [yes]/no: y

Migration Handle = 1

C:\Documents and Settings\Administrator>powermig info -all
==========================================
Hnd  Source     Target         Tech  State
===  =========  =========  ========  =====
1  harddisk4  harddisk5  HostCopy  setup

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: setup

C:\Documents and Settings\Administrator>powermig sync -handle 1

Start sync? [yes]/no: y

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: syncing
Percent InSync: 0%
Throttle Value: 2
Suspend time: 5

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: syncing
Percent InSync: 67%
Throttle Value: 2
Suspend time: 5

C:\Documents and Settings\Administrator>powermig pause -handle 1

Pause migration? [yes]/no: y

PPME error(4): Not in proper state to perform the requested operation

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: sourceSelected

C:\Documents and Settings\Administrator>powermt display dev=all
Pseudo name=harddisk2
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=600601607BD01A00F68F2AAC48C3DB11 [LUN 77]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun0     c3t0d0    SP A5     active  alive      0      7
3 port3\path0\tgt1\lun0     c3t1d0    SP B4     active  alive      0      1

Pseudo name=harddisk3
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=60060160C2021B00D8807DC433F7DD11 [LUN 72]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP A, current=SP A       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun1     c3t0d1    SP A5     active  alive      0      7
3 port3\path0\tgt1\lun1     c3t1d1    SP B4     active  alive      0      1

Pseudo name=harddisk4
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=60060160416121000560A5593994E111 [LUN 301]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun2     c3t0d2    SP A5     active  alive      0      1
3 port3\path0\tgt1\lun2     c3t1d2    SP B4     active  alive      0      1

Pseudo name=harddisk5
CLARiiON ID=FCNxxxxxxxx055 [SG_NOZOMI-SUP]
Logical device ID=6006016010A030009C49FE259E94E111 [LUN 300]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 4
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt2\lun0     c3t2d0    SP B1     active  alive      0      1

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: sourceSelected

C:\Documents and Settings\Administrator>powermig selectTarget -handle 1

Transition to targetSelected state? [yes]/no: y

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: targetSelected

C:\Documents and Settings\Administrator>powermig commit -handle 1

Commit migration? [yes]/no: y

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: committed

C:\Documents and Settings\Administrator>powermt display dev=all
Pseudo name=harddisk2
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=600601607BD01A00F68F2AAC48C3DB11 [LUN 77]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun0     c3t0d0    SP A5     active  alive      0      7
3 port3\path0\tgt1\lun0     c3t1d0    SP B4     active  alive      0      1

Pseudo name=harddisk3
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=60060160C2021B00D8807DC433F7DD11 [LUN 72]
state=alive; policy=CLAROpt; priority=0; queued-IOs=1
Owner: default=SP A, current=SP A       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun1     c3t0d1    SP A5     active  alive      1      7
3 port3\path0\tgt1\lun1     c3t1d1    SP B4     active  alive      0      1

Pseudo name=harddisk4
CLARiiON ID=FCNxxxxxxxx055 [SG_NOZOMI-SUP]
Logical device ID=6006016010A030009C49FE259E94E111 [LUN 300]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 4
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt2\lun0     c3t2d0    SP B1     active  alive      0      1

Pseudo name=harddisk5
CLARiiON ID=CKxxxxxxxxx482 [NOZOMI_SUP_SG]
Logical device ID=60060160416121000560A5593994E111 [LUN 301]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP B, current=SP B       Array failover mode: 1
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
3 port3\path0\tgt0\lun2     c3t0d2    SP A5     active  alive      0      1
3 port3\path0\tgt1\lun2     c3t1d2    SP B4     active  alive      0      1

C:\Documents and Settings\Administrator>powermig query -handle 1

Handle: 1
Source: harddisk4
Target: harddisk5
Technology: HostCopy
Migration state: committed

C:\Documents and Settings\Administrator>powermig cleanup -handle 1

Cleanup migration? [yes]/no: y

C:\Documents and Settings\Administrator>powermig query -handle 1

PPME error(6): Handle not found

C:\Documents and Settings\Administrator>

The detailed process to complete a migration with PPME is this:

  1. Install PowerPath and/or PPME
    • Be sure to use a custom install and choose Migration Enabler as an option.  This is a default installation I use because it doesn’t cost anything to install and does not interfere with any other functionality.
    • Supposedly you can install PPME after PowerPath without a reboot, but the few times I did that, I had to reboot. This would be the ONLY reboot during the entire process.
  2. Prepare/present the target LUN.
    • Once the LUN is accessible to the host, this is all that is necessary. There is no need to prepare the filesystem or alignment offset.  In fact, whatever is done to the destination LUN will be overwritten by the storage migration.
  3. License PPME using the PowerPath Licensing Tool.
  4. Setup the migration session using the command:
    powermig setup -src <source_pseudoname> -tgt <target_pseudoname> -techtype hostcopy
    Where source_pseudoname is whatever PowerPath lists as the host’s LUN name, for instance harddisk1. The same goes for target_pseudoname. This creates the relationship of the migration, using a session ID for all future tasks. Keep note of the session ID that is provided by PPME.
  5. Next is to start the synchronization.  If there is going to be a negative performance impact for the migration, it will be during the synchronization.
    powermig sync -handle <x>
    Where <x> is the session ID. This starts the synchronization of the migration at a throttle level of 2 on a scale of 0-9, where 0 is the fastest and 9 is the slowest. I found that using 2 was acceptable for most of the migrations I completed. If I thought that was going to be too aggressive, I set the throttle to a value of 5. The migration takes longer to complete, but the IO contention was far less.
    At this point, data is copied from source to target while the read requests are serviced and the writes are mirrored to source and target.
  6. Since the synchronization starts at a throttle of 2, there may be times when you need to slow it down or speed it up for whatever reason. This is NOT required, but if you need to, the command is:
    powermig throttle -handle <x> -throttlevalue <y>
    Where <y> is a value from 0-9. Again, 0 is the fastest and 9 is the slowest.
  7. To check the status of the migration, enter the command:
    powermig query -handle <x>
    The output will give you the percentage complete and give you an idea as to when the synchronization will be complete.
  8. Once the status of the migration is listed as "sourceSelected", it is time to finish the migration. One thing to note is that it is still possible to back out of this migration. Enter the following command to select the target LUN:
    powermig selecttarget -handle <x>
    At this point, the read requests are serviced by the target and the writes continue to be mirrored.
  9. Now is the time to have the owner validate that the migration is acceptable. If performance is not as expected, you can still back out of the migration by selecting the source (selectsource) and cleaning up the migration as specified below; however, that is not what we are after, is it?! If validation is successful, enter the following command to commit the change:
    powermig commit -handle <x>
    The reads and writes are solely serviced by the target LUN and the source LUN is no longer being used.
  10. At this point, you can no longer back out of the change. The source LUN, however, is still presented to the host and the underlying filesystem is still valid. You have two choices on how to proceed:
    • Cleaning up the migration as documented by the PowerPath guide
      • This is destructive to the source LUN, as the guide says that it removes some data. What that data contains is beyond me, but the end result is that the data on the LUN is no longer accessible by any host.
        powermig cleanup -handle <x>
    • My preferred option is to remove the LUN from host access and then force the session cleanup. That provides the ability to mount that LUN on a different host, or back on the original host if there was a need. I would rename the LUN with the date of the migration and the name of the host. I would then leave that LUN bound for a week before destroying it; call me paranoid or overcautious.
      powermig cleanup -handle <x> -force
  11. This completes the migration.

Thanks to Mike