This page last changed on Aug 14, 2009 by ritchiem.

Analysis Tools

This page contains details of the broker analysis tools available as part of the [Performance Test] package. The design for this work is located here.

Overview

To better understand the performance of the Java broker, this collection of tools has been gathered to perform analysis on the variety of logging that the broker can produce. Looking solely at the throughput values from our performance suite is not sufficient to tell us whether the broker's performance has improved.

Currently the scripts monitor:

  • Heap Usage via verbose GC logging
  • GC Duration via verbose GC logging
  • CPU Usage via batch mode top

Additional logging can be added to gather data as required. The resulting log files from a broker run can then be processed, and graphs of the data are generated using GnuPlot.

Monitoring

To better understand how the broker is performing there are some easy things we can start monitoring.

  • Verbose GC/Heap Usage
  • CPU Usage

GC / Heap Usage

Enabling verbose GC will allow the broker to provide us with a log file that details GC operation, so we can get a better handle on the impact of GC on the performance of the broker. It is enabled by providing a few additional values via QPID_OPTS:

-Xloggc:<gc log file> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
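These flags are typically supplied through QPID_OPTS before the broker is started. A minimal sketch (the QPID_WORK default and the gc.log file name here are illustrative assumptions, not fixed by the tools):

```shell
# Sketch only: supply the GC logging flags via QPID_OPTS so that the
# broker startup script picks them up.  QPID_WORK and the gc.log name
# are illustrative assumptions.
QPID_WORK="${QPID_WORK:-/tmp/qpid-monitor}"
GC_LOG="$QPID_WORK/gc.log"
export QPID_OPTS="-Xloggc:$GC_LOG -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
```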

This will result in a GC log file that shows all GCs performed. Of initial interest is the extraction of:

  • Heap Usage ( Max Allocated, Pre/Post GC Usage)
  • GC Count (Incremental, Full)
  • GC Duration (Incremental, Full, Total)
As we gather this information a better internal view of the broker in operation can be built. When changes are made to the broker, this data should allow us to determine how the changes have affected the GC and memory profile.
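As a rough illustration of the kind of extraction involved, the per-collection pause times can be pulled out of such a log with a one-line filter. This is a sketch only: the exact line format depends on the JVM version and collector, so the pattern below is an assumption based on typical -XX:+PrintGCTimeStamps output.

```shell
# Sketch: print "<timestamp> <pause-seconds>" for each GC entry in a
# verbose GC log produced with -XX:+PrintGCTimeStamps/-XX:+PrintGCDetails.
# The line format is JVM-version dependent; this pattern is an assumption.
gc_pauses() {
  sed -n 's/^\([0-9.]*\): \[.*, \([0-9.]*\) secs.*$/\1 \2/p' "$1"
}
```

A call such as `gc_pauses gc.log` would then yield the duration series used for graphing.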

CPU Usage

In addition to the GC and memory profiling available via the verbose GC settings, monitoring the CPU usage via top is a quick and easy way to view the CPU utilisation of the broker.

Using the following top command we can monitor a given broker and adjust the interval at which the CPU usage is recorded.

$ top -d $monitor_rate -S -c -p $broker_pid -b > broker_cpu.log
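The resulting broker_cpu.log can later be reduced to a simple series of CPU samples. As a sketch (field positions vary between top versions, so treating field 9 as %CPU is an assumption):

```shell
# Sketch: pull the %CPU samples for the monitored pid out of top's
# batch-mode output.  Field 9 as %CPU is typical of procps top but is
# an assumption here; adjust for your top version.
broker_cpu() {
  pid="$1"
  log="$2"
  awk -v pid="$pid" '$1 == pid { print $9 }' "$log"
}
```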

Scripting

To make life easier and to allow for future automated testing the following monitoring scripts have been added to the 'perftest' package:

  • monitor-broker.sh
  • stop-monitored-broker.sh

monitor-broker.sh

The monitor-broker.sh script currently starts the following processes:

  • The broker, with additional QPID_OPTS for GC logging
  • top, to monitor the CPU usage

To run this script a number of parameters are required:

Usage ./monitor-broker.sh <Path to Test Broker> <LOG DIR> <CPU Monitor Rate (s)> [Additional options to
  pass to Qpid broker startup]

The first parameter is the path to a qpid-broker package, or at least a directory that contains an executable broker.
The <LOG DIR> is a path that must not exist (this is checked); all the log files for this monitored run are collected there. The verbose GC log, broker log, std out/err and PID files will be written to this directory.
The rate at which top will run to monitor the broker is the third value; any value that is valid for top's '-d' parameter is valid here.
Finally, the remainder of the command line is passed directly to the qpid-server startup, so additional logging or broker configuration can be provided here. NOTE: if providing a custom log4j setup, please ensure that the log file is written to QPID_WORK, as this is set to the <LOG DIR> value. This will ensure that all the log files for the test run are located in a single directory for easy processing later.

The PIDs of the broker and top are written to *.pid files in the LOG_DIR. These are also used by the stop-monitored-broker.sh script to ensure a clean shutdown.

If additional processes are to be run, they will also be shut down by the stop script, provided they write a PID file into LOG_DIR/*.pid.
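For example, an extra monitoring process could opt in to the managed shutdown like this (a sketch: the directory and file names are illustrative, and 'sleep 60' stands in for a real monitor):

```shell
# Sketch: register an extra process for the stop script by writing its
# pid into LOG_DIR as a *.pid file.  'sleep 60' is a stand-in for a real
# monitoring process, and the paths are illustrative.
LOG_DIR="${LOG_DIR:-/tmp/qpid-monitor-demo}"
mkdir -p "$LOG_DIR"
sleep 60 &                          # hypothetical extra monitor process
echo $! > "$LOG_DIR/extra-monitor.pid"
```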

stop-monitored-broker.sh

This is a simple script that takes the <LOG DIR> as its only parameter.

It then looks in this directory for all '.pid' files and uses the PID value contained in each to execute a 'kill' command.

If all processes have not stopped within 3 seconds then a 'kill -9' is executed.
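The shutdown pattern described above can be sketched as follows (the three-second timeout matches the behaviour described; the function name and LOG_DIR handling are illustrative):

```shell
# Sketch of the stop script's behaviour: plain kill for every recorded
# pid, a short wait, then kill -9 for anything still running.  The
# function name is illustrative.
stop_monitored() {
  log_dir="$1"
  pids=$(cat "$log_dir"/*.pid 2>/dev/null)
  [ -n "$pids" ] || return 0
  kill $pids 2>/dev/null
  sleep 3
  for p in $pids; do
    kill -0 "$p" 2>/dev/null && kill -9 "$p" 2>/dev/null
  done
  return 0
}
```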

runTests.sh

The final script currently in the monitoring package is 'runTests.sh', which simplifies the execution of a suite of tests.

The script takes three parameters:

Usage ./runTests.sh <Path to Test Pack> <LOG DIR> <TEST LIST FILE>

As with monitor-broker.sh, the first parameter is the path to the qpid-test package, which can be built using the Build Creator. The script simply looks for a path that has a 'bin' directory, which will be used to execute your test lists.

The LOG_DIR path is again required not to exist on startup, to ensure we have a clean result set. The results of each test run are written there by adding '-o <LOG_DIR> --csv' to the executed test.

Finally the 'Test List File' is a plain text file containing a single command per line to execute. The entries here should at least respond to the '-o' parameter to ensure that their output is collected in the LOG_DIR.

An example test file might contain something like:

TQBT-AA-Qpid-01.sh -d10M
TTBT-AA-Qpid-01.sh -d10M

The two test scripts are assumed to exist in the <path to test pack>/bin directory.
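The execution loop described above might look something like this sketch (function and variable names are illustrative; the real runTests.sh may differ):

```shell
# Sketch of driving a test list: run each line from the list file via
# the test pack's bin directory, appending '-o <LOG_DIR> --csv' and
# collecting standard out in TestRun.log.  Names are illustrative.
run_test_list() {
  test_pack_bin="$1"
  log_dir="$2"
  test_list="$3"
  while read -r line; do
    [ -n "$line" ] || continue
    (cd "$test_pack_bin" && ./$line -o "$log_dir" --csv) \
      >> "$log_dir/TestRun.log" 2>&1
  done < "$test_list"
}
```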

Monitoring

If you want to check that the tests are running, the standard out of the individual tests is redirected to <log dir>/TestRun.log.

Processing

Monitoring is only the first stage, which gathers the data. The collection of processing tools is responsible for turning the raw data into something more easily understood.

There are three scripts here that take the raw data from the monitoring phase and turn it into three graphs; examples of these graphs were produced during the design of the scripts.

The three scripts are:

  • processTests.py
  • processAll.sh
  • process.sh

processTests.py

This is the first of the processing scripts, written in Python; it takes the raw output from the monitoring stage and generates test packs.

The script takes two parameters, the two output directories (broker and test) from the monitoring phase:

Usage: processTests.py [-b|--broker-log-dir] <dir> [-t|--test-dir] <dir>

The tool currently looks for all the *.csv files the individual tests have generated and uses the gathered metadata to create a slice of each of the broker log files (gc, logging, cpu). In addition, the script will gather details about the test run and the broker used, to form the title and filename for the graph.

processAll.sh

The processAll.sh script searches for 'graph.data' files and then runs the process.sh script on each of them to generate the graphs for that data.
The script takes a single argument, a directory to start searching in.

processAll.sh <search dir>

The graph images are then copied to a 'results' directory that is created in the current working directory.
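The search-and-dispatch step can be sketched like this (passing the processing command as an argument is an illustrative generalisation; the real script invokes process.sh directly):

```shell
# Sketch of processAll.sh's core loop: find every graph.data file under
# the search directory and run a processing command on each.  Passing
# the processor as an argument is an illustrative generalisation.
process_all() {
  search_dir="$1"
  processor="$2"
  find "$search_dir" -name graph.data | while read -r data_file; do
    "$processor" "$data_file"
  done
}
```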

process.sh

This is the main processing script for the collected data. It has been updated to work in conjunction with the processTests.py script. Further development of this script should be performed to allow the explicit naming of the various log file and parameter inputs that it uses.

This script is currently expected to be called from processAll.sh, and as a result takes a single argument: a graph.data file.

This graph.data file contains two text lines. The first is the title to give the graph; the second is the name of the output file.
This is an example of the automatic output from processTests.py:

0.5:TQBT-AA-Qpid-01:256kb x 962 msg/sec using AutoAck
0.5-TQBT-AA-Qpid-01-2009-06-19-17.04.25-timings

In generic terms it creates the following graph.data file:

<broker version>:<test name>:<messageSize>kb x <test volume as measured> msg/sec using <ackMode of test>
<broker version>-<test name>
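A graph.data file in that generic form could be assembled like this (all field values here are illustrative placeholders):

```shell
# Sketch: build a two-line graph.data file in the format shown above.
# Every value here is an illustrative placeholder.
broker_version="0.5"
test_name="TQBT-AA-Qpid-01"
message_size_kb=256
volume=962
ack_mode="AutoAck"
{
  echo "$broker_version:$test_name:${message_size_kb}kb x $volume msg/sec using $ack_mode"
  echo "$broker_version-$test_name"
} > graph.data
```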

The process.sh script produces three graphs:

  • GC Heap Usage
  • GC Duration
  • CPU Utilisation

NOTE: GnuPlot is used to generate the graphs.

To generate these graphs it performs a lot of data manipulation and extraction on the GC log file. Currently the script will process a Concurrent Mark and Sweep GC log file and the format used by the new G1 collector. The processing of these files extracts the recorded time for each GC and the instantaneous count of minor and full GCs. This information is graphed on the GC Duration graph.

The GC log file also highlights the Allocated Heap, Pre and Post GC heap sizes. This is the data that is then graphed in the GC Heap Size graph.

The final graph, 'CPU Utilisation', is generated from the CPU data gathered using top. Here the script has been updated to work with the data output from processTests.py, where the broker_cpu.log file contains a list of time-stamped entries. These timestamps are used to show time on the x-axis.

This improvement is also due to be applied to the other two graphs. In addition to standardising the x-axis, the y-axis scale for a given batch of tests, as processed by processAll.sh, will be standardised to allow easy image comparison.

Document generated by Confluence on May 26, 2010 10:33