Friday, September 29, 2023

SAN disk path health check script using diff command UNIX


The following script is helpful during SAN switch activity, when the AIX admin has to run a health check of the disk paths before and after the activity.

=============================================

Step 1: create a directory named "health_check_disk" in /tmp

#cd /tmp

#mkdir health_check_disk

Step 2:

Create 2 files with the following names

#cd /tmp/health_check_disk

#touch s1 s2

s1 ===> will contain path information for all physical volumes on the AIX LPAR before the SAN activity

s2 ===> will contain path information for all physical volumes on the AIX LPAR after the SAN activity


step 3:

Check the content of the s1 file (sample content once the pre-activity script has populated it)

#cat s1

Enabled hdisk0 fscsi0

Enabled hdisk0 fscsi1

Enabled hdisk1 fscsi0

Enabled hdisk1 fscsi1

Enabled hdisk2 fscsi0

Enabled hdisk2 fscsi1

Enabled hdisk3 fscsi0

Enabled hdisk3 fscsi1


Now let us look at the actual script that collects the lspath data before the SAN activity.

------------------------------------

for i in `lspv | awk '{print $1}'`

do

lspath -l $i >> s1        # append the path information of each PV to s1

done

----------------------------------

==========================

Script that collects the lspath data after the SAN activity

----------------------------------

for i in `lspv | awk '{print $1}'`

do

lspath -l $i >> s2        # append the path information of each PV to s2

done

----------------------------------

Now the AIX admin needs to run the "diff" command on both files.

#diff s1 s2

If this command returns nothing, then no path has moved to a Failed, Missing or Defined state.

#echo $?

A return code of 0 means that no path change happened and that the s1 and s2 files are identical.

If the diff command returns lines with disk and path information, then those paths changed after the SAN activity, so the AIX admin needs to correct them using "chpath".
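
The diff check and its exit status can be wrapped in a small function. This is only a minimal sketch and not part of the procedure above: the function name compare_snapshots is an assumption, and it relies on nothing more than the standard diff exit codes (0 for identical files, 1 for differences, greater than 1 for an error such as a missing file).

```shell
# compare_snapshots BEFORE AFTER
# Hypothetical helper: compares two lspath snapshot files with diff
# and reports the result based on diff's exit status.
compare_snapshots() {
    before=$1
    after=$2
    # Capture the diff output and its exit status without aborting under "set -e".
    changes=$(diff "$before" "$after" 2>/dev/null) && status=0 || status=$?
    case $status in
        0) echo "OK: no path changes between $before and $after" ;;
        1) echo "CHANGED: paths differ between $before and $after"
           printf '%s\n' "$changes" ;;
        *) echo "ERROR: could not compare $before and $after" ;;
    esac
    return $status
}
```

Typical use from the health check directory would be "compare_snapshots s1 s2"; the return code can then drive a follow-up step in a larger script.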


For example, if some paths became faulty after the SAN team's switch migration, then the s2 file will contain the following failed paths:

#cat s2

Enabled hdisk0 fscsi0

Failed  hdisk0 fscsi1

Enabled hdisk1 fscsi0

Failed  hdisk1 fscsi1

Enabled hdisk2 fscsi0

Failed  hdisk2 fscsi1

Enabled hdisk3 fscsi0

Failed  hdisk3 fscsi1

When the AIX admin runs the "diff" command on both files, it prints the line numbers and the content that differs. It shows what the s1 file contained and what s2 contains now, so the AIX admin can easily identify the affected hdisks and correct their paths.

#diff s1 s2

Output:

It shows that the s1 and s2 files differ and what the difference is, for example that the hdisk0 fscsi1 path is now failed and was in enabled state before the SAN team's activity.
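
As a complement to diff, the post-activity snapshot can also be filtered directly for paths that are not in the Enabled state. The following is a minimal sketch, assuming the three-column "status device parent" layout shown in the s1 and s2 examples; the function name list_unhealthy_paths is hypothetical.

```shell
# list_unhealthy_paths SNAPSHOT
# Hypothetical helper: prints "device parent status" for every path in an
# lspath snapshot file whose status is anything other than Enabled.
list_unhealthy_paths() {
    # NF >= 3 skips blank or incomplete lines.
    awk 'NF >= 3 && $1 != "Enabled" { print $2, $3, $1 }' "$1"
}
```

Running "list_unhealthy_paths s2" gives a short list of the disks and adapters that need attention, without the line-number markers that diff adds.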


Thanks !!

Script to find stale physical partition on AIX LPAR

The following script finds stale partitions on an AIX LPAR:

for i in `lsvg -o |awk '{print $1}'`
do
echo $i
lsvg $i|grep -i "stale pps"
sleep 2
done

Output: it displays the stale PPs, if any. This script is useful for monitoring stale PPs in a volume group while a VG sync is in progress.
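
For monitoring it can help to extract just the number instead of the whole line. The sketch below assumes that the lsvg output contains a line of the form "STALE PVs: n  STALE PPs: n"; the function name stale_pp_count is hypothetical, and the function reads the lsvg output from standard input so that it can be tried on saved output as well.

```shell
# stale_pp_count
# Hypothetical helper: reads "lsvg <vgname>" output on stdin and prints the
# number that follows the "STALE PPs:" label.
stale_pp_count() {
    awk '{
        for (i = 2; i < NF; i++)
            if ($(i - 1) == "STALE" && $i == "PPs:") print $(i + 1)
    }'
}
```

On an AIX LPAR this would be used as "lsvg rootvg | stale_pp_count"; a value other than 0 means the volume group still has stale partitions.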

Thanks

Wednesday, September 27, 2023

/bin/rm: argument list too long


What is the solution to this error on UNIX or Linux?

Solution:

When a UNIX admin tries to delete too many files with a single rm command, for example through a wildcard that expands to thousands of file names, he comes across this error. To address it he can follow either of the solutions below.

1.

find /tmp/logs -mtime +90 -exec rm -f {} \;

The above command runs rm once for each file older than 90 days and removes it.

2.

find /tmp/logs -mtime +90  -print |xargs rm -f

This also removes files older than 90 days; xargs passes the file names to rm in groups that fit within the argument limit.

How to pass the -n option to xargs so that each rm call receives at most n files:

find /tmp/logs -mtime +90  -print |xargs -n 20 rm -f

It removes the files in sets of 20.
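
A further variant, not among the solutions listed above, lets find do the batching itself with the "-exec ... {} +" form defined by POSIX. This is a minimal sketch: the function name purge_old_files is hypothetical, "-type f" is added so that only regular files are removed, and support for "{} +" should be confirmed on the find version in use.

```shell
# purge_old_files DIR DAYS
# Hypothetical helper: removes regular files under DIR that were last
# modified more than DAYS days ago.
purge_old_files() {
    dir=$1
    days=$2
    # "{} +" makes find append as many file names as fit to each rm call,
    # so the argument list never grows too long.
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}
```

For the example in this post the call would be "purge_old_files /tmp/logs 90".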


Thanks


Dump device too small to store system dump ,warning message is AIX errpt

Hello everyone, today I am writing about the following warning message in the AIX errpt:

Dump device too small to store system dump

How does an AIX admin resolve this error, and why is it important to address it?

The dump device stores the memory image that is captured when the AIX system crashes. That image is very useful to an expert who analyses the crash and looks for its root cause.

Useful commands on an AIX system to get information about the system dump device are the following:

To estimate the size of the dump that AIX would generate:

#sysdumpdev -e

To set the primary dump device to sysdumpnull:

#sysdumpdev -p /dev/sysdumpnull

To set the secondary dump device to sysdumpnull:

#sysdumpdev -s /dev/sysdumpnull

To change the primary device permanently:

#sysdumpdev -P -p <device_name>

To change the secondary device permanently:

#sysdumpdev -P -s <device_name>

To create the secondary dump device:

Create the dump device with the mklv command, and make sure that the volume group has sufficient free space.

#mklv -t sysdump -y <dumplv_name> <vg_name> <no of PPs>

#mklv -t sysdump -y lg_dumplv rootvg 6 hdisk0


The AIX admin can also create the dump device using

smitty mklv => create LV => mention the LV name, the type as sysdump and the number of PPs


How to calculate the AIX dump device size and create a dump device of the required size?

Check the following screenshot, which has the details.

Once the AIX admin has confirmed the size, he can create the dump LV and make it primary using the commands above.
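
The arithmetic behind the sizing can be sketched as follows. It assumes that the estimated dump size is taken in bytes from the "sysdumpdev -e" output and that the PP size of the volume group is known in megabytes; the function name pps_needed and the sample figures are illustrative only and do not come from a real system.

```shell
# pps_needed DUMP_BYTES PP_SIZE_MB
# Hypothetical helper: prints how many physical partitions a dump logical
# volume needs in order to hold DUMP_BYTES.
pps_needed() {
    dump_bytes=$1
    pp_size_mb=$2
    pp_bytes=$((pp_size_mb * 1024 * 1024))
    # Round up so that the logical volume is never smaller than the estimate.
    echo $(( (dump_bytes + pp_bytes - 1) / pp_bytes ))
}

pps_needed 500000000 128   # prints 4
```

The result is the value to use for <no of PPs> in the mklv command; adding some headroom on top of it is a judgment call, because the estimate changes with system load.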



Thanks !!!


mksysb backup failing with error /tmp/backupxxaaah8pqaa there is not enough space in file system


Solution:
Increased /tmp by 1 GB and performed a health check of the other file systems such as /, /home, /var and /opt.

Then executed the VIO backup and it was successful.
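
A quick free-space check before starting the backup can catch this condition early. The sketch below assumes that df accepts the POSIX options -P and -k; the function names free_kb and has_free_kb are hypothetical, and the 1 GB threshold in the usage note simply mirrors the increase mentioned above.

```shell
# free_kb PATH
# Hypothetical helper: prints the free space, in KB, of the file system
# that holds PATH.
free_kb() {
    # With -P the second line has a fixed layout; column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_kb PATH MIN_KB
# Succeeds when the file system holding PATH has at least MIN_KB free.
has_free_kb() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

For example, has_free_kb /tmp 1048576 || echo "/tmp has less than 1 GB free" could run ahead of the backup, and the same check can be repeated for /, /home, /var and /opt.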




Thanks .

Monday, December 5, 2022

Boot AIX lpar from single user mode to multiuser

An AIX LPAR was stuck in single user mode, and the AIX admin was unable to access it over the network using its IP address or CyberArk.

Solution: Tried to access it using the NIM jump server, PuTTY and CyberArk, but every attempt ended in a network connection timeout, so we decided to access it from the HMC console and found that it was stuck in single user mode.

Thanks!!!!
:)
Happy reading

LDAP user authentication issue on AIX LDAP client

Login issue for multiple LDAP users on AIX client server.

Hello all, 

Today I am going to discuss an incident where a user on an AIX server was unable to switch to his home directory, and when the AIX admin tried to list the user properties with the lsuser command, that command did not show any details either.


 #lsuser -R LDAP username 

Error message: 
3004-687 user abcd does not exist.

The initial check showed that this issue was present on all LDAP client servers.
 
Steps the AIX admin must perform for troubleshooting:
  • Check whether the NFS-mounted LDAP mount point is mounted and in a healthy state. The user home directories are accessed through this NFS mount point.
  • Make sure that slapd (the server daemon) is active on the LDAP server and secldapclntd (the client daemon) is active on the LDAP client.
  • Also make sure that there is no CPU or memory bottleneck on this server.
✓ Checked and verified that the LDAP daemons are working on the AIX LDAP server and client using the following commands:
#ps -eaf |grep -i slapd
#ps -eaf |grep -i secldapclntd
#ps -eaf |grep -i LDAP 

AIX LDAP server daemon: slapd
 Runs on the LDAP server and processes the requests from the LDAP client servers.

AIX LDAP client daemon: secldapclntd

#lsuser -R LDAP username
 Displays the LDAP user information.

If the LDAP client daemon needs a restart, use the command:
# /usr/sbin/restart-secldapclntd

The command below displays the LDAP server that is currently active.
#ls-secldapclntd
#ps -eaf |grep -i LDAP
This shows the running daemon processes on the AIX LDAP client and server.
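
One pitfall with "ps -eaf | grep" is that the grep process itself can show up in the result. The following minimal sketch filters it out and prints one status line per daemon; the function name report_daemons is hypothetical, and the function reads a process listing from standard input so that it does not depend on any particular ps option.

```shell
# report_daemons NAME...
# Hypothetical helper: reads a process listing (for example the output of
# "ps -eaf") on stdin and reports whether each NAME appears in it.
report_daemons() {
    # Drop grep's own entries so they are not mistaken for a daemon.
    listing=$(grep -v 'grep' || true)
    for name in "$@"; do
        if printf '%s\n' "$listing" | grep -q -- "$name"; then
            echo "$name: running"
        else
            echo "$name: NOT running"
        fi
    done
}
```

On the LDAP server or client this would be called as "ps -eaf | report_daemons slapd secldapclntd". The match is a plain substring match, so a process whose name merely contains the given string is also counted.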

The second thing we verified was the LDAP log file on the AIX LDAP server, and this is where we got the main breakthrough. We also engaged the database team, and they reported an authentication denied error when they tried to log in. This information was sufficient to move towards the solution. IBM software support also suggested resetting the ldapdb2 user password.

The problem was that the ldapdb2 password had expired, which caused the LDAP client requests to be rejected.
Log file on the LDAP server where the ldapdb2 password expiry was found:
----------------------------------
"db2cli.log" had the following error message
"Sql30082N security processing failed with reason 1 password expired sqlstate=08001"


IBM software support suggested the following command to reset the password of the ldapdb2 user.
Command for the password reset on the AIX LDAP server
----------------------------------
#idscfgdb -l ldapdb2 -w <new password>

Make sure that ibmslapd is stopped when the AIX admin executes this command.


✓ Executed the following command and rebooted the AIX LDAP server.
idscfgdb -l ldapdb2 -w <new password>

One more thing worth mentioning: when this issue occurred, our AIX patching team had just completed patching, and our first suspicion was that the patching had caused it. The main culprit, however, was the expired ldapdb2 password. After the password reset and the AIX LDAP server reboot, all AIX LDAP client users were able to access their home directories and switch to their accounts.
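
Since the decisive clue in this incident was the password expiry entry in db2cli.log, a quick scan of that log is a reasonable early step next time. This is a minimal sketch: the function name find_password_expiry is hypothetical, the log path has to be supplied by the admin, and the match is based only on the message text quoted in this post.

```shell
# find_password_expiry LOGFILE
# Hypothetical helper: prints the lines of LOGFILE that carry the SQL30082N
# message together with the "password expired" reason.
find_password_expiry() {
    # -i: the message code is quoted in mixed case in this post, so ignore case.
    grep -i 'sql30082n' "$1" | grep -i 'password expired'
}
```

The function returns a non-zero status when nothing matches, so it can be used in a condition such as "if find_password_expiry <path to db2cli.log>; then ...".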


To conclude, these are the points to check for the error "3004-687 user does not exist" on an LDAP client:
1. Check that secldapclntd is active and running on the AIX client.
2. Check whether the NFS-mounted LDAP mount point is mounted and in a healthy state.
3. slapd runs on the LDAP server and processes the requests from the LDAP client servers. Make sure it is active.
4. Last and most important, make sure that the ldapdb2 user password has not expired.

Useful commands while troubleshooting an LDAP issue on AIX

#lsuser -R LDAP username                      Display LDAP user information

#idscfgdb -l ldapdb2 -w <new password>        Reset ldapdb2 password

#ps -eaf |grep -i LDAP                        Display running LDAP processes

# /usr/sbin/restart-secldapclntd              Restart LDAP client daemon

# /usr/sbin/start-secldapclntd                Start LDAP client daemon

# /usr/sbin/stop-secldapclntd                 Stop LDAP client daemon


Thanks  :)
Happy Reading !!!!