Saturday, June 30, 2018

Fork function failed on AIX versioned WPAR.

Hello everyone. Today I am going to share what I learned about the error "fork function failed".
I faced this issue on an AIX 5.3 versioned WPAR hosted on AIX 7.1 TL03 SP4.

fork function - "it creates a new process"

On IBM POWER8 we had created a versioned WPAR of AIX 5.3. You must be wondering why this guy didn't install AIX 5.3 directly on the POWER8 box??
The answer is that we can't install AIX 5.3 directly on POWER8. We first need to install AIX 7.1 TL03 SP4 and then create the AIX 5.3 versioned WPAR on top of it from an AIX 5.3 MKSYSB image. Also, because of a database dependency, we were not able to move to the latest version of AIX on POWER8.

Let me answer why we created a versioned WPAR of AIX 5.3 on AIX 7.1:
1st, we can't install AIX 5.3 directly on POWER8.
2nd, we need IBM support in case of any critical issue, and IBM supports an AIX 5.3 versioned WPAR on AIX 7.1.

Overall setup was like below.

BASE OS
AIX 7.1 TL03 SP4.
On top of this we created the AIX 5.3 versioned WPAR, which hosts an Oracle 9i database for a critical application. After the database was restored on the versioned WPAR, the Oracle team started it, and
after 30-40 minutes they complained that they were getting this error when trying to log in to the oracle account:

"The fork function failed. Too many processes already exist"

After analysing this error we increased the memory of the AIX LPAR and asked the DBA team to start the database again and check whether the error was gone. No success, the same error again.
So what next??

From the OS side our thinking was that if we restore the OS from MKSYSB, there is no need to change any OS tuning parameter. But when we checked the "MAXUPROC" parameter, we found the culprit. "MAXUPROC" wasn't set correctly on the AIX global LPAR and the versioned WPAR, and that was causing the fork function failed error. On the source AIX server it was 2048, and on the target it was 128 on both the base OS and the versioned WPAR.

MAXUPROC = maximum number of processes (Oracle, application or any other) that a single non-root user can run on AIX.
Because of the wrong setting on the target after migration, the database hit the limit of 128 processes on the versioned WPAR and threw the error.
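To see how close each user is to the limit before the error shows up, the per-user process count can be compared with maxuproc. This is only a minimal sketch in POSIX sh; the function name procs_near_limit and the 80% default threshold are made up for illustration.

```shell
# procs_near_limit LIMIT [PERCENT]
# Reads one process owner per line on stdin and prints "user count" for every
# user whose process count is at or above PERCENT (default 80) of LIMIT.
# The function name and the default threshold are illustrative assumptions.
procs_near_limit() {
    limit=$1
    pct=${2:-80}
    sort | uniq -c | awk -v limit="$limit" -v pct="$pct" \
        '$1 * 100 >= limit * pct { print $2, $1 }'
}

# Typical use on AIX (not run here):
#   limit=$(lsattr -El sys0 -a maxuproc -F value)
#   ps -e -o user= | procs_near_limit "$limit"
```

With a limit of 128 and the default threshold, any user owning 103 or more processes would be listed.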

Our solution was to change maxuproc on the AIX 7.1 global LPAR first and then make the change on the AIX 5.3 versioned WPAR. Note one thing here: if the AIX admin didn't set a process limit for the WPAR, non-root users inherit the maximum process limit from the global AIX 7.1 LPAR.


Command to change "MAXUPROC"

# chdev -l sys0 -a maxuproc=2048 
Then we rebooted the AIX 7.1 global LPAR,
and after the reboot changed "totalProcesses" on the WPAR with the command
# chwpar -R totalProcesses=2048 wparname
Then run stopwpar and startwpar once.

After setting the correct MAXUPROC limit, the fork function failed error was gone.

Finally, the right maxuproc limit depends on the database and application requirements, so
it is always wise to refer to the database and application tuning documents for performance related issues.

Thanks !!!!









Friday, June 29, 2018

Passwordless ssh configuration for ORACLE RAC

                                         



While doing an Oracle RAC configuration on AIX 6.1 TL 08, the Oracle RAC admin team requested some prerequisites, and one of them was to configure passwordless SSH between the 2 Oracle RAC users. These 2 users exist on 2 different AIX LPARs.

Here is the scenario:
There are 2 AIX LPARs where the Oracle RAC DBA is going to configure the Oracle RAC cluster.
The hostnames are
1.aixorarac1
2.aixorarac2
and the user is "racora" on both AIX nodes. The AIX OS team is going to configure passwordless ssh.
You must be wondering why passwordless configuration is needed for Oracle RAC???
The answer is that it is one of the mandatory prerequisites before doing an Oracle RAC configuration on an AIX or Linux server.

While doing the Oracle RAC configuration, the RAC admin got stuck with an error: the passwordless connection was not working for user "racora". He contacted our team and asked us to sort out the issue.
We immediately copied the public key of user "racora" from aixorarac1 into the .ssh directory of user "racora" on aixorarac2. When the Oracle RAC installation was started again, it failed again: the passwordless connection from aixorarac1 to aixorarac2 was OK, but from aixorarac2 to aixorarac1 it was not.
So here we understood that passwordless communication must be configured both ways for the Oracle RAC user "racora".

Let's see how to configure two-way passwordless communication for the "racora" user on the 2 AIX nodes which are going to host the Oracle RAC cluster.
Passwordless ssh configuration for Oracle nodes aixorarac1 and aixorarac2
=============================================================

Steps :
Log in to aixorarac1 as user "racora" and execute the following command,
aixorarac1#ssh-keygen -t rsa    //perform this step on both nodes (aixorarac1 and aixorarac2), then proceed to the next part.

aixorarac1#cd .ssh (.ssh permission must be 700)
aixorarac1#cat id_rsa.pub >> authorized_keys
aixorarac1#scp authorized_keys aixorarac2:/home/racora/.ssh

============================================================
Now switch to aixorarac2

steps:
aixorarac2#cd .ssh (.ssh permission must be 700)
aixorarac2#cat id_rsa.pub >> authorized_keys
aixorarac2#scp authorized_keys aixorarac1:/home/racora/.ssh

Here is what we did. First we generated an RSA key on both nodes.
Then we added id_rsa.pub to the authorized_keys file on aixorarac1.
Then we copied the authorized_keys file with scp to the .ssh directory on aixorarac2 ("aixorarac2:/home/racora/.ssh").

Then on node aixorarac2 we appended its public key to authorized_keys with cat id_rsa.pub >> authorized_keys and copied that file back to aixorarac1 with scp.
In this way the authorized_keys file in aixorarac2:/home/racora/.ssh contains the public keys of both AIX nodes, and the same file is now on aixorarac1.
After this we simply tested from both ends whether the passwordless connection works, using the following commands,

aixorarac1# ssh aixorarac2 date
Fri Jun 29 11:11:31 UTC 2018

================================
aixorarac2#ssh aixorarac1 date
Fri Jun 29 11:11:50 UTC 2018
After executing both commands above on the Oracle RAC cluster, the output must show the date without asking for a password.
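The same test can be scripted so that it fails cleanly instead of stopping at a password prompt. This is a sketch only, assuming OpenSSH on both nodes; the function name check_passwordless and the SSH variable are illustrative and not part of the original setup.

```shell
# SSH can be pointed at another client (or a stub) for testing; defaults to ssh.
SSH=${SSH:-ssh}

# check_passwordless HOST
# BatchMode=yes makes ssh fail instead of prompting for a password, so a
# non-zero exit status means key-based login is not working for HOST.
check_passwordless() {
    if "$SSH" -o BatchMode=yes -o ConnectTimeout=5 "$1" date >/dev/null 2>&1; then
        echo "$1: passwordless login OK"
    else
        echo "$1: passwordless login FAILED"
        return 1
    fi
}

# Run as racora on aixorarac1, then repeat from aixorarac2 for the other direction:
#   check_passwordless aixorarac2
```

Because Oracle RAC needs both directions, the check has to pass on each node against the other one.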

That's all that is needed to configure a passwordless connection between 2 Oracle RAC nodes on AIX.


Thanks !!!

Thursday, June 28, 2018

Recover niminfo on AIX NIM MASTER


How to recover the niminfo file on the NIM master in an AIX environment??

What if someone from your admin team removes /etc/niminfo on the NIM MASTER by mistake? How will you recover that file? What will be your approach??

Let's see first, what is /etc/niminfo on the AIX NIM MASTER??
Why is it so important??
The answer is that it is the NIM configuration file. It holds the NIM configuration information and is meant to be written only by NIM itself, not edited by hand. It contains information like the NIM MASTER name and the NIM MASTER ports, 1058 (nim) and 1059 (nimreg). Port 1059 is used to register clients on the NIM MASTER, and port 1058 is used by the
"nimesis" daemon, which is responsible for accepting connections from NIM clients and responding to them. nimesis always runs on the NIM MASTER.

This file also contains the BOS image path on the NIM MASTER and, apart from this, NIM client route information and the host IP address.

Here is the short answer for recovering niminfo.

1.Using smitty, the AIX admin can recover the file by following the steps below.

1.     SMIT
2.     Perform NIM Administration Tasks
3.     Rebuild the niminfo File on the Master

2.The command line option is as below.

Rebuild niminfo on the MASTER server with the command
# nimconfig -r
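After the rebuild it is worth confirming that the file is back and carries the basics mentioned above (master name and the two ports). A rough sketch follows. The variable names NIM_NAME, NIM_MASTER_PORT and NIM_REGISTRATION_PORT are assumed from a typical NIM master niminfo, and check_niminfo is a made-up helper name, so compare against your own file before relying on it.

```shell
# check_niminfo [FILE]
# Reports which of the expected variables are missing from a niminfo file.
# The variable list is an assumption based on a typical NIM master niminfo.
check_niminfo() {
    file=${1:-/etc/niminfo}
    if [ ! -r "$file" ]; then
        echo "$file: missing or unreadable"
        return 1
    fi
    rc=0
    for var in NIM_NAME NIM_MASTER_PORT NIM_REGISTRATION_PORT; do
        # Accept both "export VAR=value" and plain "VAR=value" lines
        if ! grep -Eq "^(export[[:space:]]+)?$var=" "$file"; then
            echo "$file: $var not set"
            rc=1
        fi
    done
    return $rc
}

# On the NIM master after "nimconfig -r" (not run here): check_niminfo
```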



Thanks !!!

Change Timezone on SUSE Linux


While doing administration, a Linux admin may come across a situation where the time zone needs to be changed.
This situation arises for the following reasons:
1. Application/database requirement for a time zone change.
2. Time zone wrongly configured by the admin.
So here is the procedure for changing the time zone.
You must be wondering why I am changing the time zone. In my case, while doing a new installation of one of the application servers, I forgot to change the time zone.

Let’s see ...........

Before changing the time zone of a server, please take a proper backup of the setting you are changing.
Example: note the time zone setting before the change. In case you have to revert, this will help you.

Generally all time zone files are under the path /usr/share/zoneinfo/.
Here we need to change the time zone from UTC to Pacific Time.

Step 1. Check the current time zone with the date command before changing the time zone file.
# date
Mon Sep 17 22:59:24 UTC 2017

Here we can see that the time zone is UTC.

Step 2: Change directory to /etc and remove the file "localtime" (rename it instead if you want to keep a copy for reverting).

# cd /etc
# rm localtime

Step 3:
To change the time zone from UTC to Pacific time, execute the following commands
# cd /etc
# ln -s /usr/share/zoneinfo/US/Pacific localtime

To confirm the change, execute the date command again.
# date
Mon Sep 17 23:20:14 PDT 2017
Now we have changed the time zone from UTC to PDT.
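Steps 2 and 3 can be combined so that the old file is kept for reverting and a mistyped zone name is caught before anything is removed. This is a sketch only. The function name set_localtime, the .bak suffix and the optional directory arguments are illustrative choices, added so the function can be tried on a scratch directory first.

```shell
# set_localtime ZONE [ETC_DIR] [ZONEINFO_DIR]
# Keeps the old localtime as localtime.bak, then links the new zone file.
# The directory arguments exist only so the function can be tried on a
# scratch directory; the .bak name is an illustrative choice.
set_localtime() {
    zone=$1
    etc_dir=${2:-/etc}
    zoneinfo_dir=${3:-/usr/share/zoneinfo}

    if [ ! -f "$zoneinfo_dir/$zone" ]; then
        echo "no such time zone file: $zoneinfo_dir/$zone" >&2
        return 1
    fi
    # -e misses a dangling symlink, so test -L as well
    if [ -e "$etc_dir/localtime" ] || [ -L "$etc_dir/localtime" ]; then
        mv "$etc_dir/localtime" "$etc_dir/localtime.bak" || return 1
    fi
    ln -s "$zoneinfo_dir/$zone" "$etc_dir/localtime"
}

# As root (not run here): set_localtime US/Pacific && date
```

To revert, move localtime.bak back over localtime.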

Thanks !!!!

Wednesday, June 27, 2018

How to take MKSYSB of AIX LPAR from NIM MASTER


During my struggling days of job hunting, an interviewer asked me how to take a backup from the NIM master, and I went completely blank. After this incident I collected all the knowledge I could from my friends and team-mates on how to take a backup from the NIM MASTER. So this blog is for those who want to know how to take a backup of an AIX LPAR from the NIM MASTER.
Generally the command for taking an AIX LPAR backup locally is like below.
# mksysb -iX /tmp/aixbackup/hostname_backup

But what is the procedure if the AIX admin wants to take the backup from the NIM MASTER?
Step by step procedure to take a MKSYSB backup of an AIX LPAR from the NIM MASTER.
1. On the NIM MASTER, add the AIX LPAR hostname/IP entry to the /etc/hosts file.
2. Enable rshd on the AIX LPAR with the following commands.
          # chsubserver -a -v shell -p tcp6 -r inetd
          # refresh -s inetd
          # cd  /
          # rm .rhosts
          # vi .rhosts
                +
          # chmod 600 .rhosts
A single + in the .rhosts file lets any remote host in, so use it only on a trusted network.
To restrict access, put the hostname of the NIM MASTER in the .rhosts file instead of +.

3. Copy the /etc/niminfo file from the NIM MASTER to /etc on the client.

4. Execute the following command on the NIM MASTER to take the backup of the AIX LPAR.
# nim -o define -t mksysb -a server=master \
-a location=/backup/AIXLpar -a mk_image=yes \
-a source=AIX_Lpar_name  mksysb_AIX71TL04


Flags
-t                  Type of resource, here mksysb.
server              Specify "master".
location            Path on the NIM master where the AIX LPAR image will be stored.
source              Name of the AIX client (hostname of the AIX LPAR).
mksysb_AIX71TL04    MKSYSB resource name.

5. Create a SPOT from the existing MKSYSB resource mksysb_AIX71TL04.

# nim -o define -t spot -a source=mksysb_AIX71TL04 -a server=master -a location=/export/spot spot_AIX71TL04

By using the MKSYSB resource and the SPOT, the AIX admin can restore the AIX client to the same AIX level.
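When several LPARs have to be backed up, the command from step 4 can be generated once per client and reviewed before running. The sketch below only prints the commands. The helper name nim_mksysb_cmds, the per-client image file name (<client>.mksysb) and the resource name (mksysb_<client>) are naming assumptions for illustration.

```shell
# nim_mksysb_cmds LOCATION_DIR
# Reads client names on stdin and prints one "nim -o define" command per
# client. Nothing is executed. The per-client file name (<client>.mksysb) and
# resource name (mksysb_<client>) are naming assumptions for this sketch.
nim_mksysb_cmds() {
    dir=$1
    while read -r client; do
        # Skip blank lines in the client list
        [ -n "$client" ] || continue
        printf 'nim -o define -t mksysb -a server=master -a location=%s/%s.mksysb -a mk_image=yes -a source=%s mksysb_%s\n' \
            "$dir" "$client" "$client" "$client"
    done
}

# Example: printf 'aixlpar1\naixlpar2\n' | nim_mksysb_cmds /backup/AIXLpar
```

Each client named here must already be defined as a NIM client on the master, otherwise the generated command will fail.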

Thanks!!!





Tuesday, June 26, 2018

Permission denied error while executing Perl script from remote jump server

While I was performing OS admin tasks, one of the application guys came to me complaining about a permission issue on one of the Linux servers.

Error: "permission denied" for the user while logging in from a remote server.
After seeing this error, my first thought was that the user account might be locked. But the login method here was passwordless, and I also checked whether the user account was locked or not.
# passwd -S username
The user account wasn't locked.
So what's next..........

After that I checked /etc/passwd to see whether the user exists, and found that the user was not present in /etc/passwd, so I thought this was the issue. But the application guy said that he had been able to log in with the same command a few days back. Now I was really confused about what the issue was and what was denying the user access on the Linux server. I checked again which command the user was executing and found that he was running a Perl script with some parameters, and that the Perl script was trying to log in to the application user account on the Linux server.

Then I thought, let's check the logs under /var/log. I checked messages but found nothing. Then I turned to the /var/log/authlog file, asked the app guy to execute the command again, and kept watching authlog with the following command. Here the error "Failed publickey" gave me a hint that the problem was in the authorized_keys file.

tail -f /var/log/authlog

The entry was like below:
Failed publickey for "appuser" from "XX.XX.XX.XXX"(IP address).

After reading this message I decided to check the authorized_keys file.
I executed the below sequence of commands
#su - appuser
#cat /apphome/.ssh/authorized_keys

Here I got the message "permission denied". I could not understand why the authorized_keys file in appuser's home directory was denying permission to its own owner. Then I checked the permissions of the authorized_keys file and found that the file ownership was not correct: the owner of the file was a different user, somebody had changed the ownership.
After a proper change request I corrected the ownership and asked the app team guy to check again.
This time the user was finally able to log in to the remote server.

From the above incident we conclude: first understand the problem, then do a proper analysis, such as checking the logs on the server and trying to extract some indication from them. Sometimes the app team guy also doesn't know what the issue is. In this incident the app guy was asking me to create a new user, and that wasn't the right fix.

For this type of error please check the following points.
1. Check the log files related to the error, like authlog and messages.
2. Find an indication in those log files.
3. Check the permission and ownership of authorized_keys. The file permission should be 600, and the file must not be group or world writable.
4. Check the permission and ownership of the home directory.
5. Also check whether the required user exists.
6. Check that the correct public key is in authorized_keys.
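Points 3 and 4 can be checked in one go. The sketch below relies on the fact that sshd, with its default StrictModes setting, refuses key login when the home directory, the .ssh directory or authorized_keys is owned by someone else or is writable by group or others. The function name check_ssh_path is made up for illustration.

```shell
# check_ssh_path OWNER PATH
# Prints a message and returns non-zero if PATH is missing, not owned by
# OWNER, or writable by group or others.
check_ssh_path() {
    if [ ! -e "$2" ]; then
        echo "$2: not found"
        return 1
    fi
    ls -ld "$2" | awk -v want="$1" -v p="$2" '
        {
            mode = $1; owner = $3
            if (owner != want) {
                print p ": owner is " owner ", expected " want
                bad = 1
            }
            # Character 6 is the group write bit, character 9 the other write bit
            if (substr(mode, 6, 1) == "w" || substr(mode, 9, 1) == "w") {
                print p ": writable by group or others (" mode ")"
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}

# Example for the account in this post (as root, not run here):
#   for p in /apphome /apphome/.ssh /apphome/.ssh/authorized_keys; do
#       check_ssh_path appuser "$p"
#   done
```

In the incident above, this check would have reported the wrong owner on authorized_keys straight away.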


Thanks!!!!!















Virtual Media Library on Virtual I/O server


In this post I am going to share how to create an ISO image from an AIX mksysb and restore it to an AIX LPAR
using the virtual media library on the VIO server. It is also called the media repository. The media repository contains ISO images.

Let's see how to create an ISO image from an AIX mksysb.
mkcd -L -S -I /mksysb/AIX_ISO -m /mksysb/AIX_mksysb_bkp

-L Create a DVD-sized image (up to 4.38 GB) instead of CD-sized images
-I Directory in which to store the final ISO image
-m Previously created mksysb image
-S Stop mkcd before writing the image to CD/DVD, leaving the final image files in place
Once the ISO image is created, the next task is to make sure that the following conditions are satisfied.
1.VIOS version 1.5 or later.
2.Virtual SCSI adapter mapping between the VIO server and the AIX LPAR.
3.A healthy ISO image. Here we already created one (AIX_ISO).

So let's see the step by step process to create the media library and map it to the AIX LPAR.

1.First check whether any media repository exists with the following command.
$lsrep
If a repository exists, the command output will show its size. If not, proceed to step 2.
2.Create the media repository using the command "mkrep"

$mkrep -sp rootvg -size 10G
Confirm that the media repository is created by executing the following command.
$lsrep
Size(mb) Free(mb) Parent Pool Parent Size Parent Free
10198    10198    rootvg     139776     22528

If the AIX admin wants to store multiple AIX ISO images, he might need to increase the size of the media repository. The command below grows the repository by the given amount:

$chrep -size 6G

3.Create the virtual media disk by executing the following command.
$mkvopt -name AIX_ISO -file /tmp/AIX_ISO

Here we had already copied the "AIX_ISO" image created earlier to /tmp on the VIO server.

Confirm that the virtual media disk is created by executing the command lsrep.

padmin$ lsrep
 Size(mb) Free(mb) Parent Pool Parent Size Parent Free
10198    10198    rootvg     139776     22528
 Name                                    File Size  Optical         Access

 AIX_ISO                                  3679    None             rw

4.Create a virtual optical device on the virtual SCSI adapter of the AIX LPAR using the following command.

$mkvdev -fbo -vadapter vhost6
vtopt0 Available

vhost6 is the virtual SCSI adapter connected to the AIX LPAR.

5.Load the ISO into the virtual drive with the command loadopt.

$ loadopt -disk AIX_ISO -vtd vtopt0

Check the mapping on the VIO server with the command lsmap.

$lsmap -vadapter vhost6


6.Once the ISO is loaded in the virtual drive, the next step is to boot the AIX LPAR in SMS mode.
In SMS mode the virtual drive looks the same as a CD/DVD drive. Choose the correct adapter and boot from the CD/DVD device which holds AIX_ISO.

7.Once the installation is done, unload the image from the drive with the command unloadopt.
$unloadopt -release -vtd vtopt0
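Before step 3 it helps to know whether the ISO will fit in the repository at all. The sketch below reads the lsrep output shown in step 2 and compares its Free(mb) value with the ISO size. The helper names rep_free_mb and iso_fits are illustrative, and the parsing assumes the two-line layout (header, then repository line) seen in this post.

```shell
# rep_free_mb: print the Free(mb) column from lsrep output read on stdin.
# Assumes the layout shown in step 2: a header line, then the repository line.
rep_free_mb() {
    awk 'NR == 2 { print $2 }'
}

# iso_fits SIZE_MB: succeed if SIZE_MB fits in the repository described by
# the lsrep output on stdin.
iso_fits() {
    free=$(rep_free_mb)
    [ -n "$free" ] && [ "$1" -le "$free" ]
}

# On the VIO server (not run here):
#   lsrep | iso_fits 3679 || echo "grow the repository with chrep first"
```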

Please share your comments.

Thanks




Sunday, June 24, 2018

ASM disk error synchronous I/O operation to a disk failed oracle RAC on AIX.


Oracle RAC cluster on AIX 6.1 TL08 : Error while restoring RMAN backup on ASM disk.
ORA-15080: synchronous I/O operation to a disk failed


While restoring an Oracle RMAN backup, we faced the error "synchronous I/O operation to a disk failed".

The database team checked on their end and confirmed that the backup was OK and everything was fine from their side. Still, we thought, why not take the Oracle RMAN backup one more time and try with a fresh backup.
We took the backup once again and tried to restore it on the ASM Oracle RAC setup, but no success :(.
So now it was the OS team's turn. After analysis on the OS side we found that the ASM disk reserve policy was not set correctly, and this was the main reason for the error "synchronous I/O operation to a disk failed".
So we changed the reserve_lock policy of the ASM disks with the following command.
The ASM disks on this server came from EMC storage.

command :

# chdev -l hdiskpowerX -a reserve_lock=no

Check the attribute after the change with the command

# lsattr -El hdiskpowerX | grep -i reserve_lock

After the reserve_lock change, the database team tried the backup restore on Oracle RAC again, and guess what, this time it worked perfectly!!!
So the error was at the OS end, because the pre-requisite check had not been done properly.

Pre-requisite for oracle ASM disk
==========================
1. reserve_lock must be set to "no" on all ASM disks.
Check the attribute with the command # lsattr -El hdiskpowerX.
2. Check and confirm that all ASM disks have the correct ownership set.

If the ownership is not set correctly, the Oracle RAC side will not be able to detect the ASM disks.
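The first pre-requisite is easy to check for many disks at once. This is a minimal sketch that parses "lsattr -El" output, where each line starts with the attribute name followed by its value. The function name reserve_is_off is illustrative, and the disk names in the usage comment are placeholders.

```shell
# reserve_is_off: read "lsattr -El <disk>" output on stdin and succeed only
# if the reservation attribute is disabled. Handles reserve_lock (EMC
# PowerPath hdiskpower devices) and reserve_policy (MPIO hdisk devices).
reserve_is_off() {
    awk '
        $1 == "reserve_lock"   { seen = 1; if ($2 != "no") bad = 1 }
        $1 == "reserve_policy" { seen = 1; if ($2 != "no_reserve") bad = 1 }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Example loop on the LPAR (disk names are placeholders, not run here):
#   for d in hdiskpower1 hdiskpower2; do
#       lsattr -El "$d" | reserve_is_off || echo "$d: reservation still enabled"
#   done
```

A disk that has neither attribute is reported as a failure too, so an unexpected device type does not pass silently.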

I would also like to share how to assign a RAW/ASM disk to an Oracle RAC cluster. There are some simple steps.

1.Ask the storage guy to assign the disk from storage to the AIX LPAR.
2.Detect the disk on the AIX LPAR by executing the "cfgmgr" command.
3.After detecting the disk, don't assign any PVID to it and don't create any file system on it.
4.Simply disable the reservation on the disk (reserve_lock=no on EMC hdiskpower devices, reserve_policy=no_reserve on MPIO hdisk devices) and change the ownership to the oracle user.
5.After this, inform the Oracle admin, who will then detect the disk and assign it to an ASM disk group.

A RAW/ASM disk does not contain any file system. Its purpose is better performance.

Thanks !!!

Thursday, June 21, 2018

AIX command


Commonly used AIX commands
AIX LVM commands
lsvg                List all VGs on the server
lsvg -o             List active VGs on the server
lsvg -l vgname      List all logical volumes in the volume group
Examples of the mkvg command.
mkvg -y vgname hdisk0 hdisk1
mkvg -y my_vg -s 128 hdisk0 hdisk1
mkvg -s 2 -t 2 -y vgname hdisk1
# mkvg -B -y vgname -s 128 -f -n -V 101 hdisk1
# mkvg -S -y vgname -s 128 -f -n -V 101 hdisk1
Flags of mkvg
-s    Physical partition size in MB
-y    Name of the new VG
-f    Forcefully create the volume group
-t    Factor that raises the maximum number of physical partitions per physical volume
-V    Major number of the volume group
-B    Create a big volume group
-S    Create a scalable volume group
-n    Do not activate the volume group automatically at system restart
LV create command
# mklv -t jfs2 -y testlv datavg 60
Flags
-t Type of LV
-y LV name
File system creation command
# crfs -v jfs2 -d testlv -m /datamnt -A yes
-v  File system type
-d  Device (logical volume) on which the file system is created
-m  Mount point
-A  Mount automatically at system restart

# mount /mount-point    File system mount command.

AIX system resource controller commands
lssrc -a                 Show all subsystems under the system resource controller and their status
lssrc -s sshd            Show sshd status on the server
lssrc -g nfs             Show status of all services in the nfs group

lssrc -g groupname       Show service status for a specific group
lssrc -s service_name    Show status of a specific service

startsrc -g groupname    Start all services in a specific group
stopsrc -g groupname     Stop all services in a specific group
refresh -s service_name  Refresh a specific service
refresh -g groupname     Refresh services in a specific group

Commands for changing and showing the bootlist on AIX

bootlist -m normal -o    Display the bootlist of the AIX server.
example

#bootlist -m normal -o
output :
hdisk0 blv=hd5 pathid=0

bootinfo -b     Show the last boot device from which the AIX LPAR booted.

bootlist -m normal hdisk0 blv=hd5 hdisk1 blv=hd5    Set the bootlist to hdisk0 and hdisk1

AIX file system utilization commands

df -gt     Display utilization of all file systems mounted on the AIX server.
df -gt /fsname   Display disk utilization of a specific file system.

-g  Show file system utilization in GB
-m  Show file system utilization in MB
-k  Show file system utilization in 1024-byte blocks
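The df output can be filtered to show only the file systems that are filling up. This is a sketch only. It takes the first "NN%" field on each line as %Used and the last field as the mount point, which matches the column order of both df -g and df -gt. The function name fs_over is made up for illustration.

```shell
# fs_over THRESHOLD
# Reads df output on stdin and prints "mountpoint percent" for every file
# system whose %Used is at or above THRESHOLD. The first field that looks
# like "NN%" is taken as %Used and the last field as the mount point.
fs_over() {
    awk -v max="$1" '
        NR > 1 {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+%$/) {
                    used = $i
                    sub(/%/, "", used)
                    if (used + 0 >= max + 0) print $NF, used "%"
                    break
                }
            }
        }
    '
}

# Example (not run here): df -gt | fs_over 90
```

Lines without a percentage, such as /proc, are skipped.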


Commands to take an OS backup on AIX

mksysb -ieX /path_to_take_backup     Create an AIX mksysb backup, excluding the content listed in /etc/exclude.rootvg.

mksysb -iX /path_to_take_backup      Create a full backup of the AIX OS.

-i Call the mkszfile command to create the image.data file.
-e Exclude the files listed in /etc/exclude.rootvg from the backup
-X Expand /tmp automatically if needed


Mirroring AIX rootvg to hdisk1
extendvg rootvg hdisk1
mirrorvg rootvg
bosboot -ad hdisk0
bosboot -ad hdisk1
bootlist -m normal hdisk0 hdisk1



Commands to list devices on AIX
cfgmgr  Detect new devices on AIX
lsdev  List all devices on the AIX LPAR
lsdev -Cc disk    Show disk information
lsdev -Cc adapter | grep -i fcs  Show only FC adapter information
lsdev -Cc adapter | grep -i ent  Show only Ethernet adapter information
lscfg -vl fcs0 | grep Network  Display the WWN of the fcs0 adapter


Performance Monitoring
To monitor the performance of an AIX LPAR, the admin can use commands like topas, nmon, svmon, ipcs, vmstat and df.

topas    Monitor the top processes consuming system resources. It can also monitor paging space, memory and CPU utilization.

While topas is running, pressing ~ (tilde) switches topas to nmon mode.

nmon   This command also shows information like CPU, memory and paging space.

lsps -s  Show paging space utilization.

# svmon -P -O summary=basic,unit=MB    Virtual memory stats per process in MB

# vmstat 2 5   Show five reports at an interval of 2 seconds, containing CPU and memory utilization.

ipcs     Report interprocess communication information such as shared memory, semaphore and message queue details.

-s semaphores
-m shared memory
-q message queues

More commands will be added .............