Friday, October 14, 2022

Trial Capacity on Demand (TCOD) for AIX LPAR on POWER system frame

 Trial/Temporary Capacity on Demand (CoD) -- AIX



What is Trial Capacity on Demand?

Answer:

You can evaluate the use of inactive processors, memory, or both, at no charge using Trial CoD. 

After it is started, the trial period is available for 30 power-on days.


How to enable temporary CoD on an AIX frame:

In our environment we enabled it temporarily, for 24 hours only.

=========================================================================


AIX LPAR        Current PU   Current VP   Target PU   Target VP

gold_LPAR       2            4            8           10
Silver_LPAR     1            4            8           10
Diamond_LPAR    2            4            8           10
Marble_LPAR     2            4            8           10

(PU = processing units, VP = virtual processors)


Before proceeding with the TCOD and DLPAR operations, make sure that enough free processing units are available on the AIX frame.

If not, the AIX admin needs to obtain them first.


Step 1

Log in to the HMC.

Step 2

Select the frame on which the AIX admin wants to perform TCOD.

Step 3

Select COD Functions --->

Under On/Off Processor choose -----> Manage

Here enter:

1. Number of On/Off COD processor: 25

2. Number of Days: 1


How do we calculate the "Number of On/Off COD processor"?

Answer:

Total current processing units: 2+1+2+2 = 7

Total target processing units: 8+8+8+8 = 32

Additional processing units required: 32 - 7 = 25


Two cases need to be considered before entering the "Number of On/Off COD processor":

Case 1: no free processing units available

If there are no free processing units on the AIX frame, the AIX admin needs to add all 25 processing units in step 3.

--------------------------------------------------------------------------------------------------------------------------------

Case 2: some free processing units available

If there are 10 processing units free on the AIX frame, the AIX admin needs to add only 15 processing units.


Additional requirement is 25 and free on the AIX frame is 10:


Required processing units - free processing units on the AIX frame = 25 - 10 = 15


So only 15 processors need to be entered in step 3 under "Number of On/Off COD processor".
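The arithmetic for both cases can be sketched as a small shell function. This is only an illustration: the function name cod_to_request is made up for this post, and it assumes whole-number processing units as in the table above.

```shell
#!/bin/sh
# Sketch: how many On/Off CoD processors to enter in step 3.
# cod_to_request is a hypothetical helper, not an HMC command.
# Usage: cod_to_request TARGET_TOTAL CURRENT_TOTAL FREE_ON_FRAME
cod_to_request() {
    target=$1
    current=$2
    free=$3
    extra=$(( target - current ))   # additional units the LPARs need
    need=$(( extra - free ))        # part not covered by free units on the frame
    if [ "$need" -lt 0 ]; then
        need=0                      # frame already has enough free units
    fi
    echo "$need"
}

cod_to_request 32 7 0     # Case 1: no free units, prints 25
cod_to_request 32 7 10    # Case 2: 10 free units, prints 15
```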


Step 4:

The business requirement is for 24 hours only, so we enter 1 day here:

Number of Days: 1


The AIX admin needs to deactivate/remove the CPU resources using DLPAR before the specified time expires.

If this is forgotten, an extra day will be charged.


Step 5:

The AIX admin then needs to make sure that these units are showing as free on the AIX frame.

Once sure that 25 processing units are free, the admin can proceed with the DLPAR operation for "Entitled Processing units" and "Virtual Processing units".



Step 6:


After the AIX admin has performed the DLPAR operation and assigned the "Target Processing Units" and "Target Virtual processor", open a command-line session on the HMC as the hscroot user and run the optimizer commands below.


#lsmemopt -m <frame name> -o currscore -r lpar


The above command shows the score value. Until the score reaches 95, we have to run the following optimizer command.

After running the optimizer command, wait 15-20 minutes and check the score again using the "lsmemopt" command.


Optimize command:


#optmem -m <frame name> -t affinity -o start


Example of the commands if my frame name is p9117-pmmd-hmc10:


 To list score 

----------------

lsmemopt -m p9117-pmmd-hmc10 -o currscore -r lpar


To optimize on specific frame

==============================


optmem -m p9117-pmmd-hmc10 -t affinity -o start


**Once the AIX admin gets a score equal to or greater than 95, there is no need to execute the optimizer command.
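The "run the optimizer only while a score is below 95" rule can be written down as a small check. This is just a sketch: needs_optmem is a made-up helper name, and the scores are passed in by hand after reading them from the lsmemopt output, because parsing that output is not covered here.

```shell
#!/bin/sh
# Sketch: decide whether optmem still needs to be run.
# needs_optmem is a hypothetical helper, not an HMC command.
# Usage: needs_optmem THRESHOLD SCORE [SCORE ...]
# Returns 0 (true) if at least one score is below the threshold.
needs_optmem() {
    threshold=$1
    shift
    for score in "$@"; do
        if [ "$score" -lt "$threshold" ]; then
            return 0
        fi
    done
    return 1
}

# Example with three scores read manually from lsmemopt
if needs_optmem 95 97 92 100; then
    echo "run optmem, wait 15-20 minutes, then check the score again"
else
    echo "all scores are 95 or higher, no optimizer run needed"
fi
```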

====================================================================================================


How the AIX admin deactivates TCOD before the time expires:


Step 1:


Go to the HMC, perform DLPAR, and assign the "processing units" and virtual processors that were set before the TCOD operation.


Step 2: Once the "processing units" are free, the AIX admin confirms this by checking at frame level.


Step 3: Once sure that the processors are free, the admin needs to follow the sequence below:


a. 

Select COD Functions --->

Under On/Off Processor choose -----> Manage

Here enter:

1. Number of On/Off COD processor: 0

2. Number of Days: 0


Then press the OK button. The AIX admin confirms the change by checking the free processing units available at frame level.


b.

The AIX admin must run the following 2 commands:


To list score 

----------------

lsmemopt -m p9117-pmmd-hmc10 -o currscore -r lpar


To optimize on specific frame

==============================


#optmem -m p9117-pmmd-hmc10 -t affinity -o start


**Once the AIX admin gets a score equal to or greater than 95, there is no need to execute the optimizer command.




Thanks  !!!!


Friday, October 7, 2022

Run full fsck on AIX lpar for superblock corruption

Run full fsck on AIX LPAR for superblock corruption incident
--------------------------------------------------------------------------

One of the file systems on AIX was not mounted, and an attempt to mount it threw the error listed under "Error details" below.

----------------------------------------------------------------------------
Before running fsck, the AIX admin needs to check the following:

1. Check with the backup team for the latest backup of the affected filesystem.
2. If it is an application/DB filesystem, arrange the necessary downtime and approval before running fsck to fix the dirty superblock.
-------------------------------------------------------------------------

----------------$$$$$$$$$$$$$-------------------------------------
Error details:
mount:0506:324 can not mount

Superblock is dirty on /dev/oraclelv .
Run full fsck to fix this.
----------------$$$$$$$$$$$$$-------------------------------------

Solution the AIX admin used:

#fsck -yvv /dev/oraclelv      //Full fsck command to fix superblock dirty issue

After fsck has fixed the superblock, execute the following command to mount the file system:

mount /orafs




Thanks !!!

Sunday, May 29, 2022

How to delete old file using find command on AIX operating system

Delete log files on AIX operating system which are older than 365 days.

Command to list these files
#find /oracle/log -name "*.log" -mtime +365 -exec ls -lrt {} \;

#find /oracle/log -name "*.txt" -mtime +365 -exec ls -lrt {} \;

The above commands list the files that are older than 365 days.

If the AIX admin wants to remove them after approval from the Oracle team, they can be removed with the following commands:

#find /oracle/log -name "*.log" -mtime +365 -exec rm {} \;

#find /oracle/log -name "*.txt" -mtime +365 -exec rm {} \;

The AIX admin can use the same commands with different criteria, such as housekeeping of files older than 100 or 200 days, by modifying the -mtime value.
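The list-first, delete-after-approval flow can also be wrapped in two small functions so that the directory, file pattern, and age are passed in once. This is a sketch only: the function names are made up for this post, and the extra "-type f" (match regular files only) is my addition, not part of the commands above.

```shell
#!/bin/sh
# Sketch: list old files first, remove them only after review/approval.
# list_old_files and remove_old_files are hypothetical helper names.
# Usage: list_old_files DIR PATTERN DAYS
list_old_files() {
    find "$1" -type f -name "$2" -mtime +"$3" -print
}

# Usage: remove_old_files DIR PATTERN DAYS
remove_old_files() {
    find "$1" -type f -name "$2" -mtime +"$3" -exec rm {} \;
}

# Example with the values from this post (commented out on purpose):
# list_old_files /oracle/log "*.log" 365
# remove_old_files /oracle/log "*.log" 365
```

Running list_old_files and reviewing its output before calling remove_old_files with the same three arguments keeps the list and the delete in step.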

Thanks.

Thursday, May 19, 2022

create file backup using script on AIX operating system.

Script

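for inputfile in test1 test2 test3
do
    # copy each file to <name>.bkp and report success only if cp worked
    if cp "$inputfile" "${inputfile}.bkp"; then
        echo "$inputfile backup created successfully"
    else
        echo "$inputfile backup FAILED" >&2
    fi
    sleep 2
done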

Monday, May 2, 2022

How To replace faulty ethernet adapter Which is part of shared ethernet on AIX VIO, without reconfiguring SEA

 Replace faulty ethernet adapter Which is part of shared ethernet on AIX VIO, without reconfiguring SEA.

Lets see....

Scenario:

On a VIO server running version 2.2, one of the 4-port ethernet adapters was faulty, which put the SEA into an error state.

We provided a snap from the VIO server to the IBM CE, and IBM confirmed that the ethernet adapter was faulty and needed to be replaced.

IBM recommended the following steps:


Step 1:

Complete all prechecks for the VIO server, such as the latest mksysb, viosbr backup, and all necessary health checks.

Step 2:

Make sure that the AIX admin has the location details of the faulty ethernet adapter:

lsslot -c pci // Display all hot-plug PCI slots in the system unit.

example 

ent0  10/100 Base-TX Ethernet PCI Adapter  // port 1

ent1  10/100 Base-TX Ethernet PCI Adapter  // port 2

ent2  10/100 Base-TX Ethernet PCI Adapter  // port 3

ent3  10/100 Base-TX Ethernet PCI Adapter  // port 4


#lscfg -vpl ent0 // shows the location of the faulty ethernet adapter


Step 3:

Once the location is identified and the IBM CE is in the data center, identify the faulty ethernet adapter using the "diag" utility.


#diag-->press enter-->task selection-->hot plug task-->PCI hot plug task-->identify PCI hot plug slot--->select and set to identify mode .


Once the CE confirms this, the admin exits from this window.


Step 4:

Fail over all SEAs to the secondary VIO server, then shut down this VIO server.

for i in `lsdev -Cc adapter | grep -i shared |awk '{print $1}'`
do
echo "-------shared SEA $i  failover to partner VIO------ "
chdev -l $i -a ha_mode=standby
echo "---------------------"
sleep 2
done

Step 5:

Ask the IBM CE to replace the faulty ethernet adapter.

Once the ethernet adapter is replaced, activate the VIO server and do a health check.

Step 6:

Check that the ethernet adapter is available, then fail back the SEA.

for i in `lsdev -Cc adapter | grep -i shared |awk '{print $1}'`
do
echo "-------shared SEA $i  failback (ha_mode=auto)------ "
chdev -l $i -a ha_mode=auto
echo "---------------------"
sleep 2
done

Check the SEA link using the following script:

for i in `lsdev -Cc adapter | grep -i shared |awk '{print $1}'`
do
echo "-------shared SEA $i  state priority and active status ------ "
entstat -d $i | grep -i link
echo "---------------------"
sleep 2
done

This way we can replace a faulty ethernet adapter without reconfiguring the SEA.


Thanks

:)






Change Hitachi disk attribute using shell script on multiple disk on AIX Lpar

 Script:

Step 1: Create a list of the servers on which you want to change the disk parameters. Create this file

on the NIM server, from which passwordless ssh is working.

Example


#vi list

test1

test2

test3

test4

test5


:wq


Step 2:


Log in to each AIX LPAR using the following script:

===========================

for i in `cat list`

do

ssh $i

done

===========================


The above script connects to each server in turn. Once you are sure that you are on a host where the Hitachi disk

parameters need to change, execute the following script.

=======================================================


for i in `lsdev -c disk |grep -i Hitachi |awk '{print $1}'`

do

echo $i

chdev -Pl $i -a hcheck_interval=90 -a timeout_policy=fail_path

done

==============================================================


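Once the above script has executed, check whether the parameters changed using the following script. Because chdev was run with -P, only the ODM is updated and the new values take effect at the next reboot.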


for i in `lsdev -c disk |grep -i Hitachi |awk '{print $1}'`

do

echo $i

lsattr -El $i -a hcheck_interval,timeout_policy -F attribute=value |xargs

done


Once done, type "exit" on the AIX LPAR, and the loop will connect to the next AIX LPAR.

This way the AIX admin can interactively change the disk parameters on each AIX LPAR.
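If the interactive login is not needed, the same host list can drive a single command per LPAR. The following is only a sketch under a few assumptions: run_on_hosts and the SSH override variable are made up for this post, blank lines in the list file are skipped, and passwordless ssh from the NIM server works as described in step 1.

```shell
#!/bin/sh
# Sketch: run one command on every host in a list file, non-interactively.
# run_on_hosts and the SSH variable are hypothetical, for illustration only.
# Usage: run_on_hosts LISTFILE COMMAND [ARGS ...]
run_on_hosts() {
    listfile=$1
    shift
    while IFS= read -r host; do
        [ -z "$host" ] && continue          # skip blank lines in the list
        echo "== $host =="
        # stdin comes from /dev/null so ssh does not swallow the rest of the list
        ${SSH:-ssh} "$host" "$@" < /dev/null
    done < "$listfile"
}

# Dry run: setting SSH=echo prints each host and command instead of connecting.
# SSH=echo; export SSH
# run_on_hosts list lsattr -El hdisk0 -a hcheck_interval
```

The dry run with SSH=echo is a cheap way to confirm the host list and the command before touching any LPAR.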

Thanks :)










Sunday, May 1, 2022

Troubleshoot SAS disk drive name using rmdev and cfgmgr on AIX VIO Server

Today I am going to share my experience of fixing the device name of a hot-swappable SAS disk drive after replacement.

Details of scenario:

One of the VIO servers had a disk drive failure, which showed up in the AIX errpt as a permanent hardware error. As per IBM's suggestion, the faulty disk drive needed to be replaced, so from the VIO end we performed the diag procedure for replacing it.

But when the new disk was detected, it came up with a new disk name. That was unexpected, because the SAS disk was named "hdisk0" before replacement, so the disk replaced by the IBM CE should also have come up as hdisk0.

#diag      //Diag procedure for SAS disk replacement

-->Select Hot Plug Task.

-->Certify media task

-->Select RAID Hot Plug Devices.

--->Select the SAS disk drive location and press enter to set the faulty SAS disk LED to identify mode,

and ask the IBM CE to confirm.


--->Once the IBM CE has confirmed, press enter to return to the previous screen.


---->Choose remove and replace, select the faulty SAS disk, and make the disk ready for replacement.


---->Once the SAS disk is ready for replacement, ask the IBM CE to replace it and confirm.

---->Once the disk is replaced, detect the new disk by selecting the following menu:


"Configure newly detected Device"


How to check SAS disk after replacement:

======================================


#lspv |grep -w "hdisk0"

#lsdev -Cc disk |grep -w "hdisk0"

#bootinfo -s hdisk0

#lscfg -vpl hdisk0


Make sure that the newly replaced hdisk0 shows the new serial number.


Troubleshooting we did to solve this issue:

SAS disk name which was faulty: hdisk0

New disk name after hot-swap replacement using diag: hdisk400

Solution: when we checked where exactly hdisk0 had disappeared to, we found that hdisk0 was in the Defined state.

We ran the following commands:

#rmdev -dl hdisk0

#rmdev -dl hdisk400   // before deleting, we made sure that the location is correct and that it is the SAS disk

#cfgmgr 


After running the above procedure we were able to see "hdisk0" in the Available state on the VIO server.


Thanks !!!! :)