Category Archives: Oracle

AD Patch Worker Hangs on XDOLoader Process

Have you run an e-Business Suite R12 patch that slowed down or hung at the Java Loader steps for no apparent reason? I first encountered this issue in January, and finding a workable solution took several hours of research. No Oracle Support notes pointed directly to the issue at the time, although several more recent notes make the issue easier to identify and solve. Hopefully this post will be useful to someone else.

Platform: Red Hat Enterprise Linux Server
Application Version: e-Business Suite 12.1+

Symptoms:

The patch runs normally, then slows down and hangs partway through the Java loader (e.g., XDOLoader) steps. Nothing indicates that a database performance or locking issue is causing the hang.

Troubleshooting:

AD patch worker log error:

Error: Error connecting to database "jdbc:oracle:thin:APPS/xxxxxx@(DESCRIPTION=(LOAD_BALANCE=YES)(FAILOVER=YES)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=YOUR_HOST)(PORT=1521)))(CONNECT_DATA=(SID=YOUR_SID)))"
Io exception: Connection reset

Run jstack on the hanging java process:

"main" prio=10 tid=0x08937000 nid=0x22ea runnable [0xf73e1000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:199)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <0xf29b25a0> (a java.io.BufferedInputStream)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <0xf29b2370> (a java.io.BufferedInputStream)
at sun.security.provider.SeedGenerator$URLSeedGenerator.getSeedByte(SeedGenerator.java:453)
at sun.security.provider.SeedGenerator.getSeedBytes(SeedGenerator.java:123)
at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:118)
at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
- locked <0xf29b1fd0> (a sun.security.provider.SecureRandom)
at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
- locked <0xf29b2250> (a java.security.SecureRandom)
at oracle.security.o5logon.O5LoginClientHelper.generateOAuthResponse(Unknown Source)
at oracle.jdbc.driver.T4CTTIoauthenticate.marshalOauth(T4CTTIoauthenticate.java:457)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:367)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:510)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:203)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:33)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:510)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at oracle.apps.xdo.oa.util.XDOLoader.initAppsContext(XDOLoader.java:558)
at oracle.apps.xdo.oa.util.XDOLoader.init(XDOLoader.java:455)
at oracle.apps.xdo.oa.util.XDOLoader.<init>(XDOLoader.java:413)
at oracle.apps.xdo.oa.util.XDOLoader.main(XDOLoader.java:2250)

Check /dev/random entropy:

cat /proc/sys/kernel/random/entropy_avail
NOTE: Higher numbers are better. The patch will begin to slow down or hang whenever entropy is ~50 or less.
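
To watch the entropy pool while a patch is running, the check above can be wrapped in a small helper. This is a minimal sketch, not part of the original troubleshooting: the function name entropy_status is made up for illustration, and the default threshold of 50 is taken from the note above.

```shell
#!/bin/sh
# Sketch: report whether the kernel entropy pool is at or below a threshold.
# entropy_status is a hypothetical helper name; the default threshold (50)
# comes from the note above.

entropy_status() {
    # $1 = file holding the entropy count, $2 = warning threshold
    entropy_file=${1:-/proc/sys/kernel/random/entropy_avail}
    threshold=${2:-50}
    if [ ! -r "$entropy_file" ]; then
        echo "UNKNOWN: cannot read $entropy_file"
        return 2
    fi
    avail=$(cat "$entropy_file")
    if [ "$avail" -le "$threshold" ]; then
        echo "LOW: entropy_avail=$avail (threshold $threshold)"
        return 1
    fi
    echo "OK: entropy_avail=$avail"
    return 0
}
```

Calling entropy_status with no arguments reads the live value; running it every few seconds in a second terminal should show whether the slowdown lines up with a drained pool.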

Explanation:

The Java process relies on the /dev/random device to seed the SecureRandom class; the stack trace above shows the JDBC logon waiting in SeedGenerator while it reads from that device. When the kernel entropy pool behind /dev/random is depleted, reads block, so the patch workers calling SecureRandom hang until enough entropy is available again.

Solutions:
NOTE: Pick one of the solutions below. Solution number 1 is my preferred solution, since it is specific to the e-Business Suite and should not affect other processes on the server.

  1. Search for all jre/lib/security/java.security files and replace:

    securerandom.source=file:/dev/random
    with
    securerandom.source=file:/dev/urandom

  2. Run the rngd daemon to seed /dev/random with random numbers:
    Install the rng-utils package on Red Hat 5 or kernel-utils on Red Hat 4.
    rngd -r /dev/urandom -o /dev/random -f -t 1
  3. Replace the /dev/random device with /dev/urandom. (Not recommended for security reasons.)

    sudo mv /dev/random /dev/random.bak
    sudo ln -s /dev/urandom /dev/random
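
Solution 1 can be scripted when there are many JRE copies to update. The following is a rough sketch only: the function name switch_to_urandom and the .bak backup suffix are my own choices, and it assumes the line to change appears exactly as shown in solution 1.

```shell
#!/bin/sh
# Sketch of solution 1: point securerandom.source at /dev/urandom in every
# jre/lib/security/java.security file under a directory tree.
# switch_to_urandom and the .bak suffix are illustrative, not from the post.

switch_to_urandom() {
    # $1 = top-level directory to search
    search_root=$1
    find "$search_root" -type f -path '*/jre/lib/security/java.security' |
    while read -r cfg; do
        # Only touch files that still reference /dev/random
        if grep -q '^securerandom\.source=file:/dev/random$' "$cfg"; then
            cp -p "$cfg" "$cfg.bak"
            sed 's|^securerandom\.source=file:/dev/random$|securerandom.source=file:/dev/urandom|' \
                "$cfg.bak" > "$cfg"
            echo "updated $cfg"
        fi
    done
}
```

Running it against a test copy of the directory tree first, and comparing each file with its .bak copy, is a cheap way to confirm that only the one line changed.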

Oracle Log and Trace File Cleanup

UPDATE: Several script bugs brought to my attention by a comment posted below have been fixed. The script should now be compatible with Linux and Solaris. Please let me know if any additional bugs are identified.

Every running Oracle installation has several directories and files that need to be rotated and/or purged. Surprisingly, or not, Oracle has not included this basic maintenance in their software. I have come across the oraclean utility in the past, but the script does not do everything I need.

To achieve what I required, I recently hacked together a single script that does the following things:

  • Cleans audit_dump_dest.
  • Cleans background_dump_dest.
  • Cleans core_dump_dest.
  • Cleans user_dump_dest.
  • Cleans Oracle Clusterware log files.
  • Rotates and purges alert log files.
  • Rotates and purges listener log files.

The script has been tested on Solaris 9 and 10 with Oracle database versions 9i and 10g. It has also been tested with Oracle Clusterware and ASM 11g. The script can be scheduled on each server having one or more Oracle homes installed, and it will clean all of them up using the retention policy specified. The limitation is that log file retention is specified per server, not per instance. However, I find that placing a single crontab entry on each database server is easier than setting up separate log purge processes for each one.

The script finds all unique Oracle Homes listed in the oratab file and retrieves the list of running Oracle instances and listeners. Once the script knows that information, it rotates and cleans the trace, dump, and log files.

Download: cleanhouse.sh

Usage: cleanhouse.sh -d DAYS [-a DAYS] [-b DAYS] [-c DAYS] [-n DAYS] [-r DAYS] [-u DAYS] [-t] [-h]
   -d = Mandatory default number of days to keep log files that are not explicitly passed as parameters.
   -a = Optional number of days to keep audit logs.
   -b = Optional number of days to keep background dumps.
   -c = Optional number of days to keep core dumps.
   -n = Optional number of days to keep network log files.
   -r = Optional number of days to keep clusterware log files.
   -u = Optional number of days to keep user dumps.
   -h = Optional help mode.
   -t = Optional test mode. Does not delete any files.
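
The downloadable script is the real thing; the fragment below is not taken from it. It is only a minimal sketch of the age-based cleanup idea with a test mode in the spirit of the -t flag, and the helper name purge_old_files is hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only; this is not cleanhouse.sh.
# purge_old_files is a hypothetical helper showing age-based cleanup
# with a "test" mode that lists files instead of deleting them.

purge_old_files() {
    # $1 = directory, $2 = days to keep, $3 = "test" to list without deleting
    dir=$1
    days=$2
    mode=${3:-delete}
    # Nothing to do if the directory does not exist
    [ -d "$dir" ] || return 0
    if [ "$mode" = "test" ]; then
        find "$dir" -type f -mtime +"$days" -print
    else
        find "$dir" -type f -mtime +"$days" -exec rm -f {} \;
    fi
}
```

For example, purge_old_files /path/to/udump 30 test would list files older than 30 days without removing anything, which is a sensible first run on any new server.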

Copy Tables From DB2 to Oracle – The Free Way

Part of a recent project I was working on involved the decommissioning of an old DB2 database on an IBM z/OS mainframe. As part of the decommissioning process, the business wanted to keep the data available for potential audit reporting. The Oracle Migration Workbench for DB2 sounded like the best option, but it turned out to not be supported on z/OS.

After several attempts at using SQL*Loader to move the 350 tables, a colleague suggested Oracle’s Generic Connectivity. After coordinating with several other groups, this is the process that finally worked:

  1. Have a DB2 account created, so that the data can be queried.
  2. Install the DB2 Connect client on the UNIX server on which the Oracle database resides.
  3. Configure the DB2 Connect client.
    – The DB2 administrator and UNIX administrator coordinated on this, so I do not have the specifics.
  4. Test the DB2 connection
    . /export/home/db2inst1/sqllib/cfg/db2profile
    db2 connect to MYDB2DATABASE user <username>
    db2 => select current time as DB2_TIME from sysibm.sysdummy1
    db2 => terminate
  5. Install the unixODBC package on the Oracle database server.
  6. Configure the odbc.ini file (usually located in /usr/local/etc/odbc.ini).
    Example:
    [MYDB2DATABASE]
    Description = DB2 Driver
    Driver = /export/home/db2inst1/sqllib/lib/libdb2.so
  7. Test the unixODBC connection.
    isql -v MYDB2DATABASE username password
    SQL> select current time as DB2_TIME from sysibm.sysdummy1
    SQL> quit
  8. Create an initialization file for Oracle Generic Connectivity.
    Example:
    cd $ORACLE_HOME/hs/admin
    vi initMYDB2DATABASE.ora
    #
    # HS init parameters
    #
    HS_FDS_CONNECT_INFO = MYDB2DATABASE
    HS_FDS_TRACE_LEVEL = debug
    HS_FDS_SHAREABLE_NAME = /usr/local/lib/libodbc.so
     
    #
    # ODBC specific environment variables
    #
    set ODBCINI=/usr/local/etc/odbc.ini
     
    #
    # Environment variables required for the non-Oracle system
    #
    set DB2INSTANCE=db2inst1
  9. Create a listener entry in the Oracle listener.ora.
    Example:
    (SID_DESC =
    (ORACLE_HOME = /path/to/your/oracle/home)
    (SID_NAME = MYDB2DATABASE)
    (PROGRAM = hsodbc)
    (ENVS=LD_LIBRARY_PATH=/path/to/your/oracle/home/lib:/export/home/db2inst1/sqllib/lib:/usr/lib)
    )
  10. Ensure the listener connection timeout is unlimited in the listener.ora.
    Example:
    INBOUND_CONNECT_TIMEOUT_YOUR_LISTENER=0
  11. Ensure the connection timeout is unlimited in the sqlnet.ora.
    Example:
    SQLNET.INBOUND_CONNECT_TIMEOUT = 0
  12. Restart the database listener.
    lsnrctl stop listener_name; lsnrctl start listener_name
  13. Add a tnsnames.ora entry for the HS listener.
    Example:
    MYDB2DATABASE =
    (DESCRIPTION =
    (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = hostname)(PORT = 1521))
    )
    (CONNECT_DATA =
    (SERVICE_NAME = MYDB2DATABASE)
    )
    (HS = OK)
    )
  14. Log into the Oracle database as a user that has the CREATE DATABASE LINK privilege.
  15. Create a database link to the DB2 database.
    CREATE DATABASE LINK "MYDB2DATABASE" CONNECT TO "DB2_USERNAME" IDENTIFIED by "DB2_PASSWORD" USING 'MYDB2DATABASE';
  16. Test the database link.
    select * from sysibm.sysdummy1@MYDB2DATABASE;
  17. Move as many tables as possible using:
    create table table_name as select * from db2_schema.db2_table_name@MYDB2DATABASE;
  18. Some tables will fall out due to “ORA-00997: illegal use of LONG datatype”.
    Workaround:
    SET ARRAYSIZE 1000
    SET COPYCOMMIT 1
    COPY FROM username/password@ORACLE_SID TO username/password@ORACLE_SID -
    CREATE table_name USING SELECT * from db2_schema.db2_table_name@MYDB2DATABASE;
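
With roughly 350 tables, typing the statement from step 17 by hand gets tedious. One way to generate the statements is sketched below; the function name generate_ctas and the one-table-name-per-line input are assumptions of mine, not part of the original process.

```shell
#!/bin/sh
# Sketch: turn a list of DB2 table names into the CREATE TABLE ... AS SELECT
# statements from step 17. generate_ctas and the input format (one table
# name per line on stdin) are assumptions for illustration.

generate_ctas() {
    # $1 = DB2 schema, $2 = database link name
    db2_schema=$1
    db_link=$2
    while read -r table_name; do
        # Skip blank lines in the input list
        [ -n "$table_name" ] || continue
        echo "create table $table_name as select * from $db2_schema.$table_name@$db_link;"
    done
}
```

Something like generate_ctas db2_schema MYDB2DATABASE < tables.txt > ctas.sql produces a script to run in SQL*Plus; any table that fails with ORA-00997 can then be moved with the COPY workaround from step 18.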

Known Issues:

  1. ORA-28511: lost RPC connection to heterogeneous remote agent using
    Solution: Configure the connections so that they do not time out.
    listener.ora: INBOUND_CONNECT_TIMEOUT_YOUR_LISTENER=0
    sqlnet.ora: SQLNET.INBOUND_CONNECT_TIMEOUT=0
  2. ORA-00997: illegal use of LONG datatype
    Solution: Use the SQL*Plus COPY command.
  3. Error when running SQL*Plus COPY command.
    ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
    [Generic Connectivity Using ODBC]DRV_BlobRead: DB_ODBC_ENGINE (1489): ;
    [unixODBC][IBM][CLI Driver][DB2] SQL0805N Package
    “MYDB2DATABASE.NULLID.SYSLH203.5359534C564C3031” was not found. SQLSTATE=51002
    (SQL State: 51002; SQL Code: -805)
    Solution: This error is due to packages missing on the DB2 side. I had the DB2 database admin create the missing package.

  4. ORA-01400: cannot insert NULL into ("oracle_schema"."table_name"."column_name")
    Solution: Create an empty table and alter the column to accept NULL.
    COPY FROM username/password@ORACLE_SID TO username/password@ORACLE_SID -
    CREATE table_name USING SELECT * from db2_schema.db2_table_name@MYDB2DATABASE WHERE 1=2;
    ALTER TABLE table_name MODIFY column_name NULL;
    COPY FROM username/password@ORACLE_SID TO username/password@ORACLE_SID -
    APPEND table_name USING SELECT * from db2_schema.db2_table_name@MYDB2DATABASE;
  5. Enable DB2 ODBC driver tracing.
    Solution: Edit the db2cli.ini file.
    [COMMON]
    Trace=1
    TraceFileName=/tmp/db2cli_trace.log

References:
Metalink Note:375624.1 – How to Configure Generic Connectivity (HSODBC) on Linux 32 bit using DB2Connect

Solaris 10 + IPMP + Oracle RAC

I recently installed a 2-node RAC cluster using the following configuration:

Operating System: Solaris 10 (SPARC-64)
Oracle Clusterware: 11.1.0.6
Oracle ASM: 11.1.0.6
Oracle RDBMS: 10.2.0.3

Because the servers had 4 network interface cards, I asked the system administrators to configure IPMP on the Virtual IP and Private Interconnect interfaces.

The Clusterware, ASM, and RDBMS installations went as planned. However, when we tried restarting the ASM instance, it took several minutes to come up. While it was starting, I ran ptree on the racgimon process and found that it was hanging on the “sh -c /usr/sbin/arp -a | /usr/xpg4/bin/grep SP” command. It took a while to sort out, but I was finally able to put together enough blog posts and Metalink notes to figure out what needed to be done.

  1. Collect the hostname, VIP, and private interconnect aliases and IP addresses for each RAC node from /etc/hosts.
  2. Collect network interface information on each node, identifying which interfaces are part of each IPMP group.
    ifconfig -a
  3. Identify which interfaces nodeapps is using on each node.
    srvctl config nodeapps -n <hostname>
  4. Update the nodeapps interfaces as necessary.
    srvctl modify nodeapps -n <hostname> -A <ip_address>/<subnet_mask>/<ipmp_interface1>\|<ipmp_interface2>
  5. Identify the OCR private interface(s).
    oifcfg getif
  6. Delete the OCR private interface(s).
    oifcfg delif -global <if_name>
  7. Set the CLUSTER_INTERCONNECTS parameter in each ASM and database instance pfile/spfile.
    CLUSTER_INTERCONNECTS='ip_address'
    or
    alter system set cluster_interconnects='ip_address' scope=spfile sid='SID1';
  8. Restart all services on each node.
  9. Verify that each database and ASM instance is using the appropriate Private Interconnect.
    select * from gv$cluster_interconnects;

The ASM startup will now take a fraction of the time it was taking before, and the correct interconnect IP address will be used.
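
Because step 4 has to be repeated on every node and the pipe character needs protecting from the shell, it can help to print the commands for review before running them. The helper below is a sketch under that assumption; print_nodeapps_cmd is a hypothetical name, and it only echoes the command from step 4 with the interface list in single quotes instead of escaping the pipe.

```shell
#!/bin/sh
# Sketch: build the "srvctl modify nodeapps" command from step 4 for review.
# print_nodeapps_cmd is a hypothetical helper; it prints the command and
# does not run srvctl.

print_nodeapps_cmd() {
    # $1 = node name, $2 = VIP address, $3 = subnet mask,
    # remaining arguments = interfaces in the IPMP group
    node=$1
    vip=$2
    mask=$3
    shift 3
    if_list=
    for nic in "$@"; do
        # Join the interface names with a pipe character
        if_list=${if_list:+$if_list|}$nic
    done
    echo "srvctl modify nodeapps -n $node -A '$vip/$mask/$if_list'"
}
```

Printing rather than executing keeps the change deliberate: the output can be compared against the ifconfig and srvctl config results from steps 2 and 3 before anything is modified.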

References
Metalink Note 283107.1 – Configuring Solaris IP Multipathing (IPMP) for the Oracle 10g VIP
Metalink Note 368464.1 – How to Setup IPMP as Cluster Interconnect

Spinning Forms Processes

Do you happen to have f60webmx processes that run for hours or days on end, consuming up to 100% of your forms tier CPUs? I do, and I decided to do something about it.

The following script will kill forms processes that hog the CPU past a defined time limit. I have set the process time threshold at 4 hours, and it has been running in a production environment for ~8 months. No users have complained so far, but I recently updated the script to capture the user, responsibility, and form information if available. I plan on calling some of the users to see if their forms sessions were valid or not. Use the script at your own risk, and if you improve upon it, please forward me the code or a link to it, so that we can all benefit from the update.

#!/bin/ksh
#
# Script used to kill forms processes running 4 hours or more.
#
# Usage: kill_runaway_forms.sh
#
######################################################
#
# History: Created 2007-05-09 by Shad
# Update 2007-12-14 by Shad - Added SQL to get form info.
#
######################################################

PROCESS="f60webmx webfile" # Process name to check
HOSTNAME=`hostname`
NOTIFY="you@email.com"

if [ -z "$TWO_TASK" ]
then
echo "Exiting. TWO_TASK is not set."
exit
fi

ps -ef -o pid,time,args | \
grep -v grep | \
grep "$PROCESS" | \
awk '$2 ~ /[0-9]-[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ || $2 ~ /0[4-9]:[0-9][0-9]:[0-9][0-9]/ || $2 ~ /[1-9][0-9]:[0-9][0-9]:[0-9][0-9]/' | \
while read LINE
do

PROCESS_ID=`echo $LINE | awk '{print $1}'`
PROCESS_TIME=`echo $LINE | awk '{print $2}'`
PROCESS_NAME=`echo $LINE | awk '{print $3}'`
echo "PROCESS_ID: $PROCESS_ID"
echo "PROCESS_TIME: $PROCESS_TIME"
echo "PROCESS_NAME: $PROCESS_NAME"

mailx -s "Killed $PROCESS_NAME $PROCESS_ID on $HOSTNAME" $NOTIFY < exit 0
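
The script selects processes by matching the TIME column against text patterns. An alternative, sketched here and not part of the original script, is to convert the TIME value to seconds and compare it numerically, which makes the 4-hour limit easy to change. The function names are hypothetical, and the sketch assumes ps prints TIME as [dd-]hh:mm:ss.

```shell
#!/bin/sh
# Sketch: numeric CPU-time check as an alternative to pattern matching.
# cpu_time_to_seconds and exceeds_hours are hypothetical helper names;
# the TIME value is assumed to look like [dd-]hh:mm:ss.

cpu_time_to_seconds() {
    # $1 = TIME value from ps
    echo "$1" | awk '{
        days = 0
        t = $1
        # Split off an optional leading day count ("dd-")
        if (index(t, "-") > 0) {
            split(t, d, "-")
            days = d[1]
            t = d[2]
        }
        n = split(t, p, ":")
        if (n == 3)      secs = p[1] * 3600 + p[2] * 60 + p[3]
        else if (n == 2) secs = p[1] * 60 + p[2]
        else             secs = p[1]
        print days * 86400 + secs
    }'
}

exceeds_hours() {
    # $1 = TIME value, $2 = limit in hours; succeeds when TIME >= limit
    [ "$(cpu_time_to_seconds "$1")" -ge $(( $2 * 3600 )) ]
}
```

Inside the read loop, a test such as exceeds_hours "$PROCESS_TIME" 4 could then decide whether a process is past the limit.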