20. Regular Expressions
20.1 Overview

JMeter includes the pattern-matching software Apache Jakarta ORO.

There is some documentation for this on the Jakarta web-site, for example a summary of the pattern-matching characters.

There is also documentation on an older incarnation of the product in the OROMatcher User's Guide, which might prove useful.

The pattern matching is very similar to the pattern matching in Perl. A full installation of Perl will include plenty of documentation on regular expressions - look for perlrequick, perlretut, perlre, perlreref.

It is worth stressing the difference between "contains" and "matches", as used on the Response Assertion test element:

  • "contains" means that the regular expression matched at least some part of the target, so 'alphabet' "contains" 'ph.b.' because the regular expression matches the substring 'phabe'.
  • "matches" means that the regular expression matched the whole target. So 'alphabet' is "matched" by 'al.*t'.

In this case, it is equivalent to wrapping the regular expression in ^ and $, viz '^al.*t$'.

However, this is not always the case. For example, the regular expression 'alp|.lp.*' is "contained" in 'alphabet', but does not match 'alphabet'.

Why? Because when the pattern matcher finds the sequence 'alp' in 'alphabet', it stops trying any other combinations - and 'alp' is not the same as 'alphabet', as it does not include 'habet'.
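The distinction can be sketched with Python's re module. This is only an emulation, not ORO itself: the helper names contains and matches are hypothetical, and matches is written so that it accepts only the first match found at the start of the target, mirroring the behaviour described above.

```python
import re

def contains(pattern, target):
    """True if the pattern matches at least some part of the target."""
    return re.search(pattern, target) is not None

def matches(pattern, target):
    """True only if the first match found at the start covers the whole target.

    Other alternatives are deliberately not retried, to mirror the
    behaviour described in the text (an assumption about ORO, emulated here).
    """
    m = re.match(pattern, target)
    return m is not None and m.end() == len(target)

print(contains("ph.b.", "alphabet"))      # True: matches the substring 'phabe'
print(matches("al.*t", "alphabet"))       # True: the match spans the whole target
print(contains("alp|.lp.*", "alphabet"))  # True: 'alp' is found
print(matches("alp|.lp.*", "alphabet"))   # False: the first match is only 'alp'

# In Python's re, anchoring forces the second alternative to be tried,
# which is why wrapping in ^ and $ is not always equivalent to "matches".
anchored = re.search(r"^(?:alp|.lp.*)$", "alphabet") is not None
print(anchored)                           # True
```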

Note: unlike Perl, do not enclose the regular expression in //. Without a trailing /, the Perl modifiers i, s, m and x cannot be appended in the usual way; instead, use the Perl5 extended regular expression syntax and put the modifier inline at the start of the pattern. For example, /abc/i becomes (?i)abc.
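As a rough illustration, Python's re module accepts the same inline modifier syntax, so the effect of (?i) can be tried outside JMeter. The sample text is made up for the example, and ORO may differ from Python in other details.

```python
import re

text = "xxABCxx"

# Without a modifier the pattern is case-sensitive and does not match.
plain = re.search("abc", text)

# With the inline (?i) modifier the same pattern ignores case.
ignore_case = re.search("(?i)abc", text)

print(plain)                 # None
print(ignore_case.group(0))  # ABC
```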


20.2 Examples

Extract single string

Suppose you want to match the following portion of a web-page:

name="file" value="readme.txt"

and you want to extract readme.txt.

A suitable regular expression would be:

name="file" value="(.+?)"

The special characters above are:

  • ( and ) - these enclose the portion of the match string to be returned
  • . - match any character
  • + - one or more times
  • ? - don't be greedy, i.e. stop when the first match succeeds

Note: without the ?, the .+ would continue past the first " until it found the last possible " - probably not what was intended.
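The effect of the ? can be checked with a short sketch in Python's re module, which handles greedy and non-greedy quantifiers in the same Perl-like way. The extra size="20" attribute in the sample text is an assumption, added only so that a later " exists for the greedy version to run on to.

```python
import re

# Sample text; the trailing size attribute is invented for illustration.
html = 'name="file" value="readme.txt" size="20"'

# Non-greedy: stops at the first closing quote after the value.
lazy = re.search(r'name="file" value="(.+?)"', html).group(1)

# Greedy: runs on to the last possible closing quote.
greedy = re.search(r'name="file" value="(.+)"', html).group(1)

print(lazy)    # readme.txt
print(greedy)  # readme.txt" size="20
```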

Extract multiple strings

Suppose you want to match the following portion of a web-page:

name="file.name" value="readme.txt"

and you want to extract file.name and readme.txt.

A suitable regular expression would be:

name="(.+?)" value="(.+?)"

This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as $1$ and $2$.

The JMeter Regex Extractor saves the values of the groups in additional variables.

For example, assume:

  • Reference Name: MYREF
  • Regex: name="(.+?)" value="(.+?)"
  • Template: $1$$2$

Note: do not enclose the regular expression in / /.

The following variables would be set:

  • MYREF: file.namereadme.txt
  • MYREF_g0: name="file.name" value="readme.txt"
  • MYREF_g1: file.name
  • MYREF_g2: readme.txt

These variables can be referred to later in the JMeter test plan as ${MYREF}, ${MYREF_g1}, etc.
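The way the template and group variables fit together can be sketched in Python. This is a simplified emulation, not JMeter's implementation: the function name extract and the sample input line are hypothetical, and only the variables listed above are produced.

```python
import re

def extract(ref_name, regex, template, text):
    """Emulate the variables listed above for the first match in text."""
    m = re.search(regex, text)
    if m is None:
        return {}
    # Replace each $n$ placeholder in the template with the text of group n.
    value = re.sub(r"\$(\d+)\$",
                   lambda p: m.group(int(p.group(1))) or "",
                   template)
    variables = {ref_name: value}
    # Group 0 is the whole match; groups 1..n are the parenthesised parts.
    for i in range(m.re.groups + 1):
        variables[f"{ref_name}_g{i}"] = m.group(i)
    return variables

page = '<input name="file.name" value="readme.txt">'
result = extract("MYREF", r'name="(.+?)" value="(.+?)"', "$1$$2$", page)
for name, value in result.items():
    print(name, "=", value)
```

Changing the template to $2$ alone would set MYREF to readme.txt while leaving the _g variables unchanged.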


20.3 Line mode

The pattern matching behaves in slightly different ways, depending on the setting of the multi-line and single-line modifiers.

There are four possible combinations:

  • Default behavior. '.' matches any character except "\n". ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.
  • Single-line modifier (?s): Treat string as a single long line. '.' matches any character, even "\n". ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.
  • Multi-line modifier (?m): Treat string as a set of multiple lines. '.' matches any character except "\n". ^ and $ are able to match at the start or end of any line within the string.
  • Both modifiers (?sm): Treat string as a single long line, but detect multiple lines. '.' matches any character, even "\n". ^ and $, however, are able to match at the start or end of any line within the string.
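The four combinations can be explored with Python's re module, which gives these modifiers the same Perl-style meaning. The sample text is made up for the example, and ORO may differ from Python in edge cases, so treat this as a sketch rather than a reference.

```python
import re

text = "one\ntwo\nthree"

# Default: '.' does not match "\n", and ^ / $ only anchor the whole string.
default_dot = re.search(r"one.two", text)        # None
default_lines = re.findall(r"^\w+$", text)       # []

# Single-line (?s): '.' also matches "\n".
single_dot = re.search(r"(?s)one.two", text)     # matches 'one\ntwo'

# Multi-line (?m): ^ and $ match at the start and end of every line.
multi_lines = re.findall(r"(?m)^\w+$", text)     # ['one', 'two', 'three']

# This pattern needs both: '.' must cross a newline and $ must match mid-string.
only_s = re.search(r"(?s)^one.two$", text)       # None
only_m = re.search(r"(?m)^one.two$", text)       # None
both = re.search(r"(?sm)^one.two$", text)        # matches 'one\ntwo'

print(default_lines, multi_lines, both.group(0) == "one\ntwo")
```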




Copyright © 1999-2007, Apache Software Foundation