UNIX_HPUX_AIX_LINUX putting all together to make an allrounder System Admin: 2013

Friday, August 16, 2013

VM migration issue: Frozen phase timeout

To migrate an HP virtual machine from one host to another, the following is needed:

1. Both hosts on the same network.
2. A virtual switch with the same name configured on both hosts.
3. Both servers physically connected to the same SAN.
4. All disks of the guest VM on the SAN.
5. All disks zoned directly to the WWNs of the guest VM, not to the host.

Once all of this is in place, set up SSH authentication between the two hosts.

From HOST1
secsetup host2

And from HOST2
secsetup host1

Now it is quite simple.

Check the current status of the running VM with the hpvmstatus command.

#-> hpvmstatus
[Virtual Machines]
Virtual Machine Name VM # Type OS Type State     #VCPUs #Devs #Nets Memory
==================== ===== ==== ======= ========= ====== ===== ===== =======
tavmva                   1 SH   HPUX    On (OS)        4     2     2   10 GB

Then run the hpvmmigrate command.

[root@tahux1:/root]#
#-> hpvmmigrate -o -P tavmva -h tahux2
hpvmmigrate: Connected to target VSP using 'tahux2'
hpvmmigrate: WARNING (tavmva): Remote message: GUID service warning: The WWNs (0x50014C2000000000, 0x50014C2800000000) are in use by another vpar or VM. Data loss may occur if not changed.
hpvmmigrate: WARNING (tavmva): Remote message: GUID service warning: The WWNs (0x50014C2000000001, 0x50014C2800000001) are in use by another vpar or VM. Data loss may occur if not changed.
hpvmmigrate: Starting vPar/VM 'tavmva' on target VSP host 'tahux2'
    (C) Copyright 2000 - 2012 Hewlett-Packard Development Company, L.P.
    Creation of VM minor device 1
    Device file = /var/opt/hpvm/uuids/912889d6-7d8a-11e1-8f00-ec9a746a1c22/vm_dev
    guestStatsStartThread: Started guestStatsCollectLoop - thread = 6
      allocating datalogger memory: FF800000-FF900000 (1024KB)
      allocating firmware RAM (fff00000-100000000, 1024KB)
    Starting event polling thread

    Online migration initiated by source 'tahux1.upmraflatac.upm-kymmene.com' (10.96.34.100)
    Target: online migration started with encryption algorithm AES-128-CBC.

hpvmmigrate: Init phase (step 5) - progress 0%
hpvmmigrate: Init phase completed successfully.
hpvmmigrate: Copy phase - progress 0%
hpvmmigrate: Copy phase - progress 100%
hpvmmigrate: Copy phase completed successfully.
hpvmmigrate: I/O quiesce phase (step 12) - progress 0%
hpvmmigrate: I/O quiesce phase completed successfully.
hpvmmigrate: Frozen phase (step 1) - progress 0%
hpvmmigrate: Frozen phase (step 4) - progress 86%
    Target: transfer aborted by source
hpvmmigrate: ERROR (tavmva): Remote message: Target vPar or VM exited. Status 2.
hpvmmigrate: ERROR (tavmva): Remote message: Unable to start vPar/VM on target.
hpvmmigrate: ERROR (tavmva): Migration was aborted by timeout in Frozen phase step 4.
[root@tahux1:/root]#

So in this case the migration was aborted because of a timeout.

We can also check the logs.
The syslog says:

Aug 16 09:20:37 tahux1 hpvmapp tavmva[4210]: Source: transfer aborted by 60 second timeout in Frozen phase step 4
Aug 16 09:20:43 tahux1 vmunix: hvsd: HPVM online migration failed, restarting guest on original host

The guest-specific log, /var/opt/hpvm/guests/tavmva/log, says:
Source: online migration started at Fri Aug 16 09:17:27 2013

Source: online migration started with encryption algorithm AES-128-CBC.
Source: copyPages returning early because of phase timeout
Source: transfer aborted by 60 second timeout in Frozen phase step 4
Source: online migration abort - severity 3, status 1012
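
The abort message names both the phase and the timeout that was hit, so it can be pulled out of a log automatically. The following is only a minimal sketch, assuming the message wording shown in the logs above; the function name migration_timeout_summary is made up for illustration.

```shell
# Read log lines on stdin and print "<phase> <seconds>" for every
# "transfer aborted by N second timeout in <phase> phase step M" message.
# Assumes the message wording seen in the syslog and guest log above.
migration_timeout_summary() {
    sed -n 's/.*transfer aborted by \([0-9][0-9]*\) second timeout in \(.*\) phase step [0-9][0-9]*.*/\2 \1/p'
}
```

For example, `migration_timeout_summary < /var/opt/hpvm/guests/tavmva/log` would print one line per timeout abort found in the guest log.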

So it is now clear that the migration failed because the Frozen phase timed out.

We will therefore raise the Frozen phase timeout from the default (60 seconds here) to 90 seconds.

#-> hpvmstatus -P tavmva -V | grep -i time
Graceful stop timeout   : 30
Init phase timeout      : 90 seconds
Copy phase timeout      : Infinite
I/O quiesce phase timeout: 15 seconds
Frozen phase timeout    : 60 seconds
[Boot-time Information]
[root@tahux1:/root]#
#-> hpvmmodify -P tavmva -x graceful_stop_timeout=60 -x migrate_frozen_phase_timeout=90
[root@tahux1:/root]#
#-> hpvmstatus -P tavmva -V | grep -i time
Graceful stop timeout   : 60
Init phase timeout      : 90 seconds
Copy phase timeout      : Infinite
I/O quiesce phase timeout: 30 seconds
Frozen phase timeout    : 90 seconds
[Boot-time Information]
[root@tahux1:/root]#
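
To confirm the value from a script rather than by eye, the Frozen phase timeout can be extracted from the hpvmstatus -V output. This is a rough sketch that assumes the "Frozen phase timeout : N seconds" line format shown above; the function name frozen_phase_timeout is hypothetical.

```shell
# Read `hpvmstatus -P <vm> -V` output on stdin and print the
# Frozen phase timeout as a bare number of seconds.
# Assumes a line of the form "Frozen phase timeout    : 90 seconds".
frozen_phase_timeout() {
    awk -F: '/^Frozen phase timeout/ { gsub(/[^0-9]/, "", $2); print $2 }'
}
```

Typical use would be `hpvmstatus -P tavmva -V | frozen_phase_timeout`, which should print 90 after the change above.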

After that we run the migration again, and this time the VM is migrated successfully.

hpvmmigrate: Frozen phase (step 4) - progress 81%
hpvmmigrate: Frozen phase (step 4) - progress 84%
hpvmmigrate: Frozen phase (step 23) - progress 96%
hpvmmigrate: Frozen phase (step 24) - progress 96%
    Event: configuration file renamed to /var/opt/hpvm/uuids/912889d6-7d8a-11e1-8f00-ec9a746a1c22/vmm_config.current
hpvmmigrate: Frozen phase completed successfully.
hpvmmigrate: vPar/VM migrated successfully.
[root@tahux1:/var/adm/RFC]#

The status of the VM now looks like this on both hosts.

On tahux1, the source host:
Virtual Machine Name VM # Type OS Type State     #VCPUs #Devs #Nets Memory
==================== ===== ==== ======= ========= ====== ===== ===== =======
tavmva                   1 SH   HPUX    Off (NR)       4     2     2   10 GB

[root@tahux2:/root]#
#-> hpvmstatus
[Virtual Machines]
Virtual Machine Name VM # Type OS Type State     #VCPUs #Devs #Nets Memory
==================== ===== ==== ======= ========= ====== ===== ===== =======
tavmtest                 1 SH   HPUX    On (OS)        1     2     2    8 GB
tavmva                   2 SH   HPUX    On (OS)        4     2     2   10 GB

Friday, June 28, 2013

Linux user management: unlock, passwd reset, passwd checking, pam_tally2

In this post I will describe an issue I faced recently. A user was locked out after too many wrong password attempts.

The usual way to re-enable such a user in Linux is simply the command

passwd -u user_name

But that gave an error:

l00lnx1001:/etc/pam.d # passwd -u vspadm
Cannot unlock the password for `vspadm'!

We even tried resetting the password with the passwd command.

                l00lnx1001:/etc/pam.d # passwd vspadm
                Changing password for vspadm.
                New Password:
                Reenter New Password:
                Password changed.


Then we tried to log in, but it still failed.

                l00lnx1001:~ # ssh vspadm@0
                The authenticity of host '0 (0.0.0.0)' can't be established.
                RSA key fingerprint is 14:a8:9b:da:bd:f5:48:85:89:72:17:35:5f:d9:0b:f0.
                Are you sure you want to continue connecting (yes/no)? yes
                Warning: Permanently added '0,0.0.0.0' (RSA) to the list of known hosts.
                Password:
                Account locked due to 21 failed logins

                Password:
                Account locked due to 22 failed logins

                Password:
                Account locked due to 23 failed logins

                Permission denied (publickey,keyboard-interactive).

We even checked the passwd and shadow files for problems, but that gave no clue about this user either.

We also checked faillog, but it did not help.

All other users could access their accounts without any problem.

                l00lnx1001:~ # pwck
                Checking `/etc/passwd'
                User `pulse': directory `/var/lib/pulseaudio' does not exist.
                User `suse-ncc': directory `/var/lib/YaST2/suse-ncc-fakehome' does not exist.
                User `v759500': directory `/home/v759500' does not exist.
                User `upm': directory `/home/upm' does not exist.
                Checking `/etc/shadow'.

The clue was found in the system log, /var/log/messages:

            Jun 28 14:03:12 l00lnx1001 passwd[66203]: password changed - account=vspadm, uid=1007, by=0
            Jun 28 14:03:32 l00lnx1001 sshd[66234]: pam_tally2(sshd:auth): user vspadm (1007) tally 21, deny 6
            Jun 28 14:03:36 l00lnx1001 sshd[66243]: pam_tally2(sshd:auth): user vspadm (1007) tally 22, deny 6
            Jun 28 14:03:39 l00lnx1001 sshd[66252]: pam_tally2(sshd:auth): user vspadm (1007) tally 23, deny 6
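
Each pam_tally2 line carries the user name, the current tally and the deny limit, so locked users can be listed straight from the log. The following is only a sketch, assuming the message format shown above; the function name locked_users is made up for illustration.

```shell
# Read syslog lines on stdin and print "<user> <tally> <deny>" for every
# user whose most recent pam_tally2 tally exceeds the deny limit.
# Assumes messages like:
#   pam_tally2(sshd:auth): user vspadm (1007) tally 21, deny 6
locked_users() {
    sed -n 's/.*pam_tally2([^)]*): user \([^ ]*\) ([0-9]*) tally \([0-9]*\), deny \([0-9]*\).*/\1 \2 \3/p' |
    awk '$2 > $3 { last[$1] = $2 " " $3 } END { for (u in last) print u, last[u] }'
}
```

For example, `locked_users < /var/log/messages` lists the candidates to reset. The order of users in the output is not guaranteed.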

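So I then looked up pam_tally2, and that turned out to be the key. Running it with -r resets the failure counter (the output still shows the count as it was before the reset):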

                l00lnx1001:/etc/pam.d # pam_tally2 -u vspadm
                Login           Failures Latest failure     From
                vspadm             23    06/28/13 14:03:39 localhost
                l00lnx1001:/etc/pam.d # pam_tally2 -r -u vspadm
                Login           Failures Latest failure     From
                vspadm             23    06/28/13 14:03:39 localhost
                l00lnx1001:/etc/pam.d # pam_tally2
                Login           Failures Latest failure     From
                root               12    06/28/13 12:43:16 10.106.66.6
                upm                 2    06/20/13 13:00:49 l00lnx1001.group.upm.com
                vspmiadmin         11    06/21/13 16:47:29 193.24.70.199
                l00lnx1001:/etc/pam.d # pam_tally2 -r
                Login           Failures Latest failure     From
                root               12    06/28/13 12:43:16 10.106.66.6
                upm                 2    06/20/13 13:00:49 l00lnx1001.group.upm.com
                vspmiadmin         11    06/21/13 16:47:29 193.24.70.199
                l00lnx1001:/etc/pam.d # pam_tally2


After that the problem was solved, and we were able to log in to the system as that user.

                l00lnx1001:/etc/pam.d # ssh vspadm@l00lnx1001
                Password:
                vspadm@l00lnx1001:~> id
                uid=1007(vspadm) gid=100(users) groups=100(users),16(dialout),33(video)

That's it.

If you run into a similar issue, you can use this as a reference.

Happy learning...

Disk performance improvement on HP-UX 11.31 servers with the max_q_depth parameter

Configure max_q_depth on HP-UX 11.31 servers for better disk performance.

Detailed plan:

1.) Log in to the server as root and start script logging.
script /var/adm/install-logs/CRQ.scriptlog
2.) Verify that the last Ignite backup was successful.
#-> tail /usr/local/log/ignite.txt

3.) Check the current disk utilization.

sar -d 1 8

4.) Check the current value of the disk tunable.

scsimgr get_attr -D /dev/rdisk/disk241 -a max_q_depth
scsimgr get_attr -D /dev/rdisk/disk242 -a max_q_depth
scsimgr get_attr -D /dev/rdisk/disk151 -a max_q_depth
scsimgr get_attr -D /dev/rdisk/disk152 -a max_q_depth

5.) Update the value of the max_q_depth tunable to 16.

scsimgr save_attr -D /dev/rdisk/disk241 -a max_q_depth=16
scsimgr save_attr -D /dev/rdisk/disk242 -a max_q_depth=16
scsimgr save_attr -D /dev/rdisk/disk151 -a max_q_depth=16
scsimgr save_attr -D /dev/rdisk/disk152 -a max_q_depth=16

6.) Verify the new values after the change.
scsimgr get_attr -D /dev/rdisk/disk241 -a max_q_depth
scsimgr get_attr -D /dev/rdisk/disk242 -a max_q_depth
scsimgr get_attr -D /dev/rdisk/disk151 -a max_q_depth
scsimgr get_attr -D /dev/rdisk/disk152 -a max_q_depth

7.) Check the disk utilization again.

sar -d 1 9

8.) exit
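
With more than a handful of disks, step 5 is easier as a loop. The following is a minimal sketch, not a tested procedure: the function name set_q_depth and the RUN switch are made up for illustration. By default it only prints the scsimgr commands so they can be reviewed first.

```shell
# Usage: set_q_depth <depth> <disk> [<disk> ...]
# Prints the scsimgr save_attr command for each disk (dry run).
# Set the hypothetical switch RUN=1 to execute the commands instead.
set_q_depth() {
    depth=$1
    shift
    for d in "$@"; do
        if [ -n "${RUN:-}" ]; then
            scsimgr save_attr -D "/dev/rdisk/$d" -a "max_q_depth=$depth"
        else
            echo "scsimgr save_attr -D /dev/rdisk/$d -a max_q_depth=$depth"
        fi
    done
}
```

For the disks in this plan the call would be `set_q_depth 16 disk241 disk242 disk151 disk152`. The dry-run default is a deliberate choice, since the change is saved persistently.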

Friday, June 21, 2013

All about UNIX flavours

In this blog I will discuss performing different tasks in different flavours of UNIX.

I will use a separate post for each task.

Keep watching...