The Hyades Adapter Configuration Editor allows you to use regular expressions to describe how log files should be transformed into Common Base Event records. The following tables are a guideline to regular expression usage.
Expression | Matches |
---|---|
{n,m} | at least n but not more than m times |
{n,} | at least n times |
{n} | exactly n times |
* | 0 or more times |
+ | 1 or more times |
? | 0 or 1 times |
. | any character except \n |
^ | a null token matching the beginning of a string or line (that is, the position right after a newline or right before the beginning of a string) |
$ | a null token matching the end of a string or line (that is, the position right before a newline or right after the end of a string) |
\b | backspace, when it appears inside a character class (a bracketed set such as [abcd]) |
\b | outside a character class, a null token matching a word boundary (\w on one side and \W on the other) |
\B | null token matching a boundary that isn't a word boundary |
\A | only at the beginning of the string |
\Z | only at the end of the string (or before a newline at the end) |
\n | newline |
\r | carriage return |
\t | tab |
\f | formfeed |
\d | digit [0-9] |
\D | non-digit [^0-9] |
\w | word character [0-9a-z_A-Z] |
\W | non-word character [^0-9a-z_A-Z] |
\s | a whitespace character [ \t\n\r\f] |
\S | a non-whitespace character [^ \t\n\r\f] |
\xnn | the character whose hexadecimal code is nn |
\cD | the corresponding control character (for example, \cD is Control-D) |
\nn or \nnn | the character whose octal code is nn or nnn, unless the sequence is a backreference |
\1, \2, \3 ... | whatever the first, second, third, and so on, parenthesized group matched. This is called a backreference. If there is no corresponding group, the number is interpreted as the octal representation of a character. |
\0 | the null character. Any other backslashed character matches itself. |
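*? | 0 or more times, as few times as possible (non-greedy) |
+? | 1 or more times, as few times as possible (non-greedy) |
?? | 0 or 1 times, preferring 0 (non-greedy) |
{n}? | exactly n times (the trailing ? has no effect on the match) |
{n,}? | at least n times, as few times as possible (non-greedy) |
{n,m}? | at least n but not more than m times, as few times as possible (non-greedy) |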
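The difference between the greedy quantifiers and their non-greedy `?` forms is easiest to see side by side. The following is a minimal sketch that uses Python's standard `re` module as a stand-in for the Perl-style syntax in the table. The sample log line and the variable names are assumptions made for illustration, and the engine used by the Adapter Configuration Editor may differ in details.

```python
import re

# Hypothetical log line, used only for illustration.
line = 'msg="disk full" host="alpha"'

# Greedy: .* takes as much as it can, so the group runs to the LAST quote.
greedy = re.search(r'"(.*)"', line).group(1)

# Non-greedy: .*? takes as little as it can, so the group stops at the
# FIRST closing quote.
lazy = re.search(r'"(.*?)"', line).group(1)

# Counted quantifier: \d{2,4} matches at least 2 and at most 4 digits.
digits = re.search(r'\d{2,4}', 'code 123456').group(0)

print(greedy)  # disk full" host="alpha
print(lazy)    # disk full
print(digits)  # 1234
```

When a field in a log record is delimited by a character that can appear again later on the same line, the non-greedy form is usually the safer choice.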
To group parts of an expression, use the metacharacters ( ). This allows the regular expression in the parentheses to be treated as a single unit. For example, the regular expression
severity:(1|2) matches either severity:1 or severity:2.
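As a quick check of the grouping example, the sketch below uses Python's `re` module, which is an assumption here and not the editor's own engine. It also shows what changes when the parentheses are left out: the alternation then applies to the whole expression. The sample strings are hypothetical.

```python
import re

# Grouped alternation: "severity:" followed by either 1 or 2.
grouped = re.compile(r'severity:(1|2)')

samples = ['severity:1', 'severity:2', 'severity:3']
matched = [s for s in samples if grouped.fullmatch(s)]

# Without parentheses the alternatives are "severity:1" and a bare "2",
# so the string "2" on its own is accepted.
ungrouped_accepts_bare_2 = re.fullmatch(r'severity:1|2', '2') is not None
grouped_accepts_bare_2 = grouped.fullmatch('2') is not None

print(matched)                   # ['severity:1', 'severity:2']
print(ungrouped_accepts_bare_2)  # True
print(grouped_accepts_bare_2)    # False
```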
To extract parts of a string that have been matched using the grouping metacharacters, use the special variables $1, $2, etc.
    # Extract the URL and the page name from an HTML anchor
    $pattern = '<a href="secure_logon.html">Logon form</a>';
    $pattern =~ m{<a href="(.*)">(.*)</a>};   # match using grouping
    $url = $1;        # $1 equals secure_logon.html
    $pagename = $2;   # $2 equals Logon form
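For readers more familiar with a group-accessor API than with the `$1`, `$2` variables, the same extraction can be sketched in Python, where `group(1)` and `group(2)` play the role of `$1` and `$2`. Python's `re` module is an assumption used for illustration only; it is not the engine behind the Adapter Configuration Editor.

```python
import re

text = '<a href="secure_logon.html">Logon form</a>'

# Two parenthesized groups: the first captures the URL, the second the link text.
m = re.search(r'<a href="(.*)">(.*)</a>', text)

url = m.group(1)       # counterpart of $1
pagename = m.group(2)  # counterpart of $2

print(url)       # secure_logon.html
print(pagename)  # Logon form
```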
Expression | Matches |
---|---|
(?#text) | An embedded comment causing text to be ignored. |
(?:regexp) | Groups like ( ) but does not save the text matched by the group. |
(?=regexp) | A zero-width positive lookahead assertion. For example, \w+(?=\s) matches a word followed by whitespace, without including the whitespace in the MatchResult. |
(?!regexp) | A zero-width negative lookahead assertion. For example, foo(?!bar) matches any occurrence of foo that isn't followed by bar. Because the assertion is zero-width, a(?!b)d matches ad: a is followed by a character that is not b (the d), and d follows the zero-width assertion. |
(?imsx) | One or more embedded pattern-match modifiers: i enables case insensitivity; m enables multiline treatment of the input; s enables single-line treatment of the input; x enables extended syntax, which allows whitespace and comments in the pattern. |
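The extended constructs in the table above can be tried out with the short sketch below. It uses Python's `re` module, which accepts the same `(?:...)`, `(?=...)`, `(?!...)` and `(?i)` syntax. This is an assumption for illustration, and the sample strings are hypothetical.

```python
import re

# Positive lookahead: the word is matched, the whitespace after it is not consumed.
word = re.search(r'\w+(?=\s)', 'error code').group(0)

# Negative lookahead: only the "foo" that is NOT followed by "bar" matches.
foo_positions = [m.start() for m in re.finditer(r'foo(?!bar)', 'foobar foobaz')]

# The assertion is zero-width, so a(?!b)d matches "ad".
ad = re.search(r'a(?!b)d', 'ad').group(0)

# Non-capturing group: (?:...) groups the alternatives but saves nothing,
# so the level is not captured and the only saved group is the word after it.
saved = re.search(r'(?:WARN|ERROR): (\w+)', 'ERROR: timeout').groups()

# Embedded modifier: (?i) makes the match case-insensitive.
ci = re.search(r'(?i)severity', 'SEVERITY:1').group(0)

print(word)           # error
print(foo_positions)  # [7]
print(ad)             # ad
print(saved)          # ('timeout',)
print(ci)             # SEVERITY
```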
Related concepts
Overview of the Hyades Generic Log Adapter
Common Base Event format specification
Related tasks
Creating a log parser
Creating a rules-based adapter
Creating a static adapter
Related references
Adapter Configuration File structure
Common Base Event format specification
Adapter Configuration Editor
Regular expression grammar
(C) Copyright IBM Corporation 2000, 2005. All Rights Reserved.