IBM Sterling products unofficial blog

IBM Sterling B2B Integrator, IBM Sterling Filegateway, Performance, troubleshooting

Why is my Sterling B2B Integrator slow?

This post explains how to diagnose and troubleshoot a slow-performing B2B Integrator.

A slow-performing application can show up in different ways: unusual queueing, a slow or unresponsive user interface (UI), application freezes, failing probes and robots that connect to the application, high CPU usage, long-running processes, and so on.

Queue Watcher:

The first thing I would check in such a situation is whether the Queue Watcher UI (the monitoring interface shipped with Sterling Integrator) is accessible.

If Queue Watcher is responding, it can help find the cause of the problem.

The QueueWatch interface shows the current internal activity of the main B2B processing engine. It is used to identify problematic code and provides the insight needed to tune the application correctly.

Open Queue Watcher from https://yourB2BServerHostname:Port/QueueWatch
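Since the first check is simply whether Queue Watcher answers, that check can also be scripted. The snippet below is a minimal sketch in Python, not an official tool: the function name and the 10-second timeout are assumptions, and the URL you pass in is the placeholder shown above with your own hostname and port. It only reports whether the server answered at all and how long that took; it does not log in or read the page.

```python
import time
import urllib.error
import urllib.request


def probe_queue_watcher(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return (reachable, seconds) for a single GET against the given URL.

    `opener` exists only so the function can be exercised without a network.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout):
            reachable = True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status (for example 401).
        reachable = True
    except OSError:
        # Covers connection refused, DNS failures, TLS errors and timeouts.
        reachable = False
    return reachable, time.monotonic() - start
```

Calling probe_queue_watcher("https://yourB2BServerHostname:Port/QueueWatch") returns a pair such as (True, 0.4). A certificate that the client does not trust is reported as unreachable by this sketch, so a False result is a reason to look closer rather than proof that the engine is down.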

To understand what’s going on, click on the first link, View Active Threads for All Queues:

From here we can see all of the currently executing processes (shown at the bottom under “List of Working Threads”) and the processing backlog (“Queue Depth”).

In the simple example below, 19 Business Processes (BPs) are waiting for an execution thread in queue 1, which has a maximum capacity of 1 thread. The BP being executed is slow and has a considerable number of steps (48034):

Another example below shows 100k+ business processes waiting in queue 7, which is using 80 execution threads out of a maximum of 120:

Some queueing in the application is normal most of the time. It ensures that the load is processed within the current system resources and the configured maximum number of global threads. The application queues can be tuned and adjusted to prioritize work.

It is also possible to change the queue configuration dynamically in Queue Watcher without restarting the application (a tip unknown to many users). These dynamic changes are, of course, lost after a restart.

To make queue configuration changes that persist between restarts, use the Performance Tuning Wizard from the Dashboard interface (Operations / System / Performance / Tuning). The settings can also be put in customer_overrides.properties to make them permanent.

Adjusting the queues dynamically can help drain them quickly in some situations.
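To get a feel for how much a thread change can help, a back-of-the-envelope estimate is enough. The Python sketch below is only a rough model with assumed names: it supposes that no new BPs arrive, that every thread stays busy, and that each BP takes about the same time. The 60-second BP duration in the usage note is invented for illustration.

```python
def estimate_drain_seconds(queue_depth, threads, avg_bp_seconds):
    """Rough time to empty a queue under the assumptions stated above."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    # Each thread works through its share of the backlog one BP at a time.
    return queue_depth * avg_bp_seconds / threads
```

With 19 waiting BPs of about 60 seconds each, estimate_drain_seconds(19, 1, 60) gives 1140 seconds on a single thread, while estimate_drain_seconds(19, 10, 60) gives 114 seconds on 10 threads. The real gain is smaller if the extra threads end up waiting on the same database, file system or CPU.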

This is how to change the queues configuration dynamically:

Go to the secure QueueWatcher main menu. At the bottom of the page, type “1” in Configure Queue to change the configuration of queue 1, then hit Enter:

We need to change the queue configuration to run more threads in parallel. In this example, we changed the maximum number of threads for queue 1 from 1 to 10:

Return to the Active Threads for All Queues page.

You should observe that there are now multiple threads running in parallel:

This post does not cover how to tune the system queues, but I thought this tip could be very helpful in some situations.

Now, let’s go back to the original question: why is my application slow?

If you see that the BPs are taking longer than usual, you can check the thread execution stack trace (second tip):

On the Queue Watcher main page, look for View Queue Threads:

Then click on the stacksize link (the number 13 link in blue below):

Now you can see exactly what your BP is doing.

In the example below, the thread is in a TIMED_WAITING state and, looking at the method detail, the code is in the SleepyService.

The second example below shows that the BP is performing database operations:

You can then refresh the stack trace page to see whether the BP is moving or is stuck in the same operation. This gives you an initial idea of where to look next.
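The "refresh and compare" step can be done by eye, but if you copy the stack trace text before and after a refresh, a few lines of Python can compare them for you. This is a hedged sketch: the function names and the depth of 5 frames are assumptions, and the frame text in the usage note (apart from java.lang.Thread.sleep) is made up for illustration.

```python
def top_frames(stack_text, depth=5):
    """Return the first `depth` non-empty lines of a pasted stack trace."""
    frames = [line.strip() for line in stack_text.splitlines() if line.strip()]
    return frames[:depth]


def looks_stuck(earlier, later, depth=5):
    """True when two snapshots of the same thread share the same top frames."""
    return top_frames(earlier, depth) == top_frames(later, depth)
```

For example, if two snapshots both start with java.lang.Thread.sleep(Native Method) followed by a hypothetical com.example.SleepyService.processData frame, looks_stuck returns True. Identical frames across several refreshes suggest the BP is waiting in the same place, while changing frames suggest it is slow but moving.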

Slowness can have different causes, such as:

  • Slow database queries.
  • Slow shared file system.
  • Network, DNS, or slow partner problems.
  • Resource problems: maximum number of open files, max DB processes, waiting for DB connections…
  • High CPU.
  • GC overhead, high memory usage and OOM.
  • etc.

If the problem is general and affects all aspects of the application, including the UI and not only BPs and queueing, always remember to take a few thread dumps of the application from the command line during the slowness.

An external page on the IBM website explains how to take thread dumps.
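As a rough illustration of "a few thread dumps during the slowness": on Linux and Unix, sending SIGQUIT to a JVM (the same signal as kill -3) asks it to write a thread dump without stopping. The Python sketch below spaces out several such requests. The function name, the count of 3 and the 10-second interval are assumptions, and where the dump ends up depends on the JVM and on how it was started, so the IBM page remains the reference for Sterling.

```python
import os
import signal
import time


def request_thread_dumps(pid, count=3, interval_seconds=10.0,
                         send=os.kill, sleep=time.sleep):
    """Send SIGQUIT to `pid` `count` times, pausing between requests.

    Only use this on a JVM process: SIGQUIT terminates most other programs.
    `send` and `sleep` are parameters only so the function can be tested.
    """
    for attempt in range(count):
        send(pid, signal.SIGQUIT)
        if attempt < count - 1:
            sleep(interval_seconds)
```

Several dumps taken a few seconds apart are more useful than one, because they show which threads stay in the same place over time.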

Top tip: Always check the application logs for system errors!

Look for errors such as heap space, OOM, JVM errors, and database, file system (FS), DNS, connectivity and network errors.

If the slowness is general, start with the noapp, system, sci and perimeter logs.
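A first pass over the logs can be automated by counting well-known error signatures. The Python sketch below is one hedged way to do it: the patterns are common Java and OS error strings rather than an official Sterling list, the category labels and function name are assumptions, and you need to supply the log lines yourself because log locations differ between installations.

```python
import re
from collections import Counter

# Common error strings; extend this with whatever your own logs show.
ERROR_PATTERNS = {
    "heap / OOM": re.compile(r"OutOfMemoryError|Java heap space|GC overhead limit exceeded"),
    "open files": re.compile(r"Too many open files"),
    "database": re.compile(r"SQLException"),
    "DNS": re.compile(r"UnknownHostException"),
    "connectivity": re.compile(r"Connection refused|Connection reset|SocketTimeoutException"),
}


def count_error_signatures(lines):
    """Count how many lines match each error category."""
    counts = Counter()
    for line in lines:
        for label, pattern in ERROR_PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

Passing an open log file, for example count_error_signatures(open(path, errors="replace")), returns a Counter keyed by category. A sudden rise in one category around the time of the slowness is a good hint about where to look next.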

Checking the running threads from the OS utilities:

As well as using the administration interfaces, you can identify executing processes directly from standard operating system utilities.

On Linux, you can use the top process monitor by entering the command “top -H”.

The -H switch changes the default view from top processes to top threads.

In the screenshot below, the thread “WFE.37.Thread” is using a high proportion of CPU.

Using OS tools like top provides a quick way to identify processes that are impacting your system. Did you notice the workflow ID in the thread names?

top is available on most Linux environments. For other operating systems, equivalents are normally available, although not always as part of standard builds.

  • AIX – topas is available, typically installed by default. topas is integrated with the IBM nmon tool in AIX.
  • Windows – Microsoft Windows Process Monitor, part of the Sysinternals suite, can be downloaded from the Microsoft website.
  • HP-UX – top is typically available.
  • Solaris – prstat -L shows similar information.
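On Linux, the per-thread view can also be captured as text and sorted by a script, which is handy when you want to keep evidence of a slow period. The Python sketch below parses the output of top run in batch mode (for example top -H -b -n 1 -p followed by the Java process ID). The function name is an assumption, and it expects the default column layout with COMMAND as the last column and a dot as the decimal separator.

```python
def busiest_threads(top_output, limit=5):
    """Return up to `limit` (cpu_percent, thread_id, name) tuples, busiest first."""
    lines = top_output.splitlines()
    header_index = None
    for index, line in enumerate(lines):
        if line.split()[:1] == ["PID"]:
            header_index = index
            break
    if header_index is None:
        return []  # no column header found, nothing to parse

    columns = lines[header_index].split()
    cpu_col = columns.index("%CPU")
    name_col = columns.index("COMMAND")  # assumed to be the last column

    rows = []
    for line in lines[header_index + 1:]:
        # Limit the split so thread names containing spaces stay in one field.
        fields = line.split(None, len(columns) - 1)
        if len(fields) < len(columns):
            continue
        rows.append((float(fields[cpu_col]), int(fields[0]), fields[name_col].strip()))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]
```

With -H, the PID column holds the thread ID. Converted to hexadecimal (for example, hex(12345) is 0x3039), that ID can usually be matched against the native thread ID recorded in a thread dump, which links a busy thread in top to its Java stack.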

If the top output shows system threads rather than BP threads, always consider taking a few thread dumps of the application to share with customer support. And never forget to look at the application logs!

JMX monitoring:

Another efficient way to find the top-consuming application methods is JMX monitoring. I detailed this in a previous post.

Using SQL queries:

SQL queries can help detect long-running processes and steps. I have shared some queries in a previous post.

5 Comments

  1. Madhu

    Nice one.!

  2. CAF

    In most recent cases – we have found that if you can speed up the SQL database server the system will process very quickly.
    All the fine tuning (eg, Performance Tuning) on the local SI B2B app server sometimes makes little difference, as all main interaction of BP steps processing is with the database.
    It is evidently noticeable, that the faster the Database can transact, the faster SI B2B will run !
    We ran the Database on Flash storage – everyone on SI B2B absolutely flies !!

  3. Luis

    Is there a way to extract the same information from queue watcher with SQL queries? this to centralize monitoring on defined tool and have the ability of alerting depending on defined thresholds

  4. Carina Reyes

    Anyone running Sterling on OCP and encountering problems with auto resume and recovering your BPs without manual intervention?
