
RegexMatcher Class Reference

class RegexMatcher bundles together a regular expression pattern and input text to which the expression can be applied.

#include <regex.h>

Inheritance: RegexMatcher derives from UObject, which derives from UMemory.

Public Methods

virtual ~RegexMatcher ()
 Destructor.

virtual UBool matches (UErrorCode &status)
 Attempts to match the entire input string against the pattern.

virtual UBool lookingAt (UErrorCode &status)
 Attempts to match the input string, starting from the beginning, against the pattern.

virtual UBool find ()
 Find the next pattern match in the input string.

virtual UBool find (int32_t start, UErrorCode &status)
 Resets this RegexMatcher and then attempts to find the next substring of the input string that matches the pattern, starting at the specified index.

virtual UnicodeString group (UErrorCode &status) const
virtual UnicodeString group (int32_t groupNum, UErrorCode &status) const
 Returns a string containing the text captured by the given group during the previous match operation.

virtual int32_t groupCount () const
 Returns the number of capturing groups in this matcher's pattern.

virtual int32_t start (UErrorCode &status) const
 Returns the index in the input string of the start of the text matched during the previous match operation.

virtual int32_t start (int group, UErrorCode &status) const
 Returns the index in the input string of the start of the text matched by the specified capture group during the previous match operation.

virtual int32_t end (UErrorCode &status) const
 Returns the index in the input string of the character following the text matched during the previous match operation.

virtual int32_t end (int group, UErrorCode &status) const
 Returns the index in the input string of the character following the text matched by the specified capture group during the previous match operation.

virtual RegexMatcher & reset ()
 Resets this matcher.

virtual RegexMatcher & reset (const UnicodeString &input)
 Resets this matcher with a new input string.

virtual const UnicodeString & input () const
 Returns the input string being matched.

virtual const RegexPattern & pattern () const
 Returns the pattern that is interpreted by this matcher.

virtual UnicodeString replaceAll (const UnicodeString &replacement, UErrorCode &status)
 Replaces every substring of the input that matches the pattern with the given replacement string.

virtual UnicodeString replaceFirst (const UnicodeString &replacement, UErrorCode &status)
 Replaces the first substring of the input that matches the pattern with the replacement string.

virtual RegexMatcher & appendReplacement (UnicodeString &dest, const UnicodeString &replacement, UErrorCode &status)
 Implements a replace operation intended to be used as part of an incremental find-and-replace.

virtual UnicodeString & appendTail (UnicodeString &dest)
 As the final step in a find-and-replace operation, append the remainder of the input string, starting at the position following the last match, to the destination string.

virtual UClassID getDynamicClassID () const
 ICU "poor man's RTTI", returns a UClassID for the actual class.


Static Public Methods

UClassID getStaticClassID ()
 ICU "poor man's RTTI", returns a UClassID for this class.


Private Methods

 RegexMatcher (const RegexPattern *pat)
 RegexMatcher (const RegexMatcher &other)
RegexMatcher & operator= (const RegexMatcher &rhs)
void MatchAt (int32_t startIdx, UErrorCode &status)
void backTrack (int32_t &inputIdx, int32_t &patIdx)
UBool isWordBoundary (int32_t pos)

Private Attributes

const RegexPattern * fPattern
const UnicodeString * fInput
int32_t fInputLength
UBool fMatch
int32_t fMatchStart
int32_t fMatchEnd
int32_t fLastMatchEnd
UStack * fBackTrackStack
UVector * fCaptureStarts
UVector * fCaptureEnds

Static Private Attributes

const char fgClassID
 The address of this static class variable serves as this class's ID for ICU "poor man's RTTI".


Friends

class RegexPattern

Detailed Description

class RegexMatcher bundles together a regular expression pattern and input text to which the expression can be applied.

It includes methods for testing for matches, and for find and replace operations.

Class RegexMatcher is not intended to be subclassed.

Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Definition at line 358 of file regex.h.


Constructor & Destructor Documentation

virtual RegexMatcher::~RegexMatcher () [virtual]
 

Destructor.

Note that there are no public constructors; creation is done with RegexPattern::matcher().

Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

RegexMatcher::RegexMatcher (const RegexPattern *pat) [private]
 

RegexMatcher::RegexMatcher (const RegexMatcher &other) [private]
 


Member Function Documentation

void RegexMatcher::MatchAt (int32_t startIdx, UErrorCode &status) [private]
 

virtual RegexMatcher& RegexMatcher::appendReplacement (UnicodeString &dest, const UnicodeString &replacement, UErrorCode &status) [virtual]
 

Implements a replace operation intended to be used as part of an incremental find-and-replace.

The input string, starting from the end of the previous match and ending at the start of the current match, is appended to the destination string. Then the replacement string is appended to the output string, including handling any substitutions of captured text.

For simple, prepackaged, non-incremental find-and-replace operations, see replaceFirst() or replaceAll().

Parameters:
dest  A UnicodeString to which the results of the find-and-replace are appended.
replacement  A UnicodeString that provides the text to be substituted for the input text that matched the regexp pattern. The replacement text may contain references to captured text from the input.
status  A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed, and U_INDEX_OUTOFBOUNDS_ERROR if the replacement text specifies a capture group that does not exist in the pattern.
Returns:
this RegexMatcher
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual UnicodeString& RegexMatcher::appendTail (UnicodeString &dest) [virtual]
 

As the final step in a find-and-replace operation, append the remainder of the input string, starting at the position following the last match, to the destination string.

appendTail() is intended to be invoked after one or more invocations of RegexMatcher::appendReplacement().

Parameters:
dest  A UnicodeString to which the results of the find-and-replace are appended.
Returns:
the destination string.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
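
The incremental pattern that appendReplacement() and appendTail() describe can be sketched as follows. This is an illustration written against the C++ standard library's std::regex, not ICU code: the function name replaceIncrementally is hypothetical, and std::regex works on char strings where ICU works on UnicodeString. The two steps inside the loop correspond to what appendReplacement() is documented to do, and the final append corresponds to appendTail().

```cpp
#include <regex>
#include <string>

// Stand-in for the appendReplacement()/appendTail() loop, using std::regex
// instead of ICU. The name replaceIncrementally is hypothetical.
std::string replaceIncrementally(const std::string& input,
                                 const std::regex& pattern,
                                 const std::string& replacement) {
    std::string dest;
    auto lastMatchEnd = input.cbegin();
    for (std::sregex_iterator it(input.cbegin(), input.cend(), pattern), done;
         it != done; ++it) {
        const std::smatch& m = *it;
        // Step 1: copy the input between the end of the previous match
        // and the start of the current match.
        dest.append(lastMatchEnd, m[0].first);
        // Step 2: append the replacement, expanding $1, $2, ... references
        // to captured text.
        dest += m.format(replacement);
        lastMatchEnd = m[0].second;
    }
    // Final step (the appendTail() role): copy whatever follows the last match.
    dest.append(lastMatchEnd, input.cend());
    return dest;
}
```

With the pattern (\d+)-(\d+)-(\d+) and the replacement $3/$2/$1, the input "2002-12-18 and 2003-01-05" comes out as "18/12/2002 and 05/01/2003". An input with no match is returned unchanged, because only the final step runs.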

void RegexMatcher::backTrack (int32_t &inputIdx, int32_t &patIdx) [inline, private]
 

virtual int32_t RegexMatcher::end (int group, UErrorCode &status) const [virtual]
 

Returns the index in the input string of the character following the text matched by the specified capture group during the previous match operation.

Parameters:
group  the capture group number
status  A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed, and U_INDEX_OUTOFBOUNDS_ERROR for a bad capture group number.
Returns:
the index of the last character, plus one, of the text captured by the specified group during the previous match operation. Returns -1 if the capture group was not part of the match.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
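
The start/end convention documented here is a half-open range: the start index is the first captured character, the end index is one past the last, and a group that did not take part in the match reports -1. The sketch below reproduces that convention on top of std::regex as a stand-in; it is not ICU code, groupSpan is a hypothetical helper, and unlike ICU it does not report a separate error for a group number that does not exist in the pattern.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: returns {start, end} for a capture group of a
// std::smatch, following the convention documented above. Both values are
// -1 when the group did not take part in the match.
std::pair<int, int> groupSpan(const std::smatch& m, std::size_t group) {
    if (!m[group].matched) {
        return {-1, -1};
    }
    int start = static_cast<int>(m.position(group));
    // end is the index of the last captured character, plus one.
    return {start, start + static_cast<int>(m.length(group))};
}
```

For the input "key=value" and the pattern (\w+)=(\w+)|(#.*), group 0 spans {0, 9}, group 1 spans {0, 3}, group 2 spans {4, 9}, and group 3, which belongs to the alternative that did not match, yields {-1, -1}.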

virtual int32_t RegexMatcher::end (UErrorCode &status) const [virtual]
 

Returns the index in the input string of the character following the text matched during the previous match operation.

Parameters:
status  A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed.
Returns:
the index of the last character matched, plus one.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual UBool RegexMatcher::find (int32_t start, UErrorCode &status) [virtual]
 

Resets this RegexMatcher and then attempts to find the next substring of the input string that matches the pattern, starting at the specified index.

Parameters:
start  the position in the input string to begin the search
status  A reference to a UErrorCode to receive any errors.
Returns:
TRUE if a match is found.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual UBool RegexMatcher::find () [virtual]
 

Find the next pattern match in the input string.

The find begins searching the input at the location following the end of the previous match, or at the start of the string if there is no previous match. If a match is found, start(), end() and group() will provide more information regarding the match.

Note that if the input string is changed by the application, use find(startPos, status) instead of find(), because the saved starting position may not be valid with the altered input string.

Returns:
TRUE if a match is found.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
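
The search order described for find() (each search begins where the previous match ended, or at a caller-supplied index) can be sketched as a loop. The code uses std::regex as a stand-in and is not ICU code; Span and findAll are hypothetical names. As a simplification, a search that starts in the middle of the input treats that point as the start of the text for anchors such as ^.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical record of one match, as a half-open index range.
struct Span {
    int start;  // index of the first matched character
    int end;    // index one past the last matched character
};

// Collects every match at or after index `from`. Each iteration resumes
// the search at the end of the previous match.
std::vector<Span> findAll(const std::string& input, const std::regex& pattern,
                          std::size_t from = 0) {
    std::vector<Span> spans;
    if (from > input.size()) {
        return spans;
    }
    auto searchStart = input.cbegin() + static_cast<std::ptrdiff_t>(from);
    std::smatch m;
    while (std::regex_search(searchStart, input.cend(), m, pattern)) {
        spans.push_back({static_cast<int>(m[0].first - input.cbegin()),
                         static_cast<int>(m[0].second - input.cbegin())});
        if (m[0].first == m[0].second) {
            // Empty match: step forward one character so the loop advances.
            if (m[0].second == input.cend()) {
                break;
            }
            searchStart = m[0].second + 1;
        } else {
            searchStart = m[0].second;
        }
    }
    return spans;
}
```

For the input "a1b22c333" and the pattern \d+, the loop reports the spans [1, 2), [3, 5) and [6, 9). Starting at index 4 instead reports [4, 5) and [6, 9), since the search begins inside the second run of digits.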

virtual UClassID RegexMatcher::getDynamicClassID (void) const [inline, virtual]
 

ICU "poor man's RTTI", returns a UClassID for the actual class.

Draft:
This API has been introduced in ICU 2.2. It is still in draft state and may be modified in a future release.

Reimplemented from UObject.

Definition at line 638 of file regex.h.

UClassID RegexMatcher::getStaticClassID (void) [inline, static]
 

ICU "poor man's RTTI", returns a UClassID for this class.

Draft:
This API has been introduced in ICU 2.2. It is still in draft state and may be modified in a future release.

Definition at line 645 of file regex.h.

virtual UnicodeString RegexMatcher::group (int32_t groupNum, UErrorCode &status) const [virtual]
 

Returns a string containing the text captured by the given group during the previous match operation.

Group(0) is the entire match.

Parameters:
groupNum  the capture group number
status  A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed and U_INDEX_OUTOFBOUNDS_ERROR for a bad capture group number.
Returns:
the captured text
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
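
The group numbering described above, where group 0 is the entire match and groups 1 and up are the capture groups, is shared by most regular expression engines. The sketch below shows it with std::regex as a stand-in; it is not ICU code and capturedGroups is a hypothetical helper. Here std::regex::mark_count() plays the role that groupCount() plays in this class.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of group 0 (the entire match)
// followed by the text of each capture group, for the first match found.
// Returns an empty vector when the pattern does not match.
std::vector<std::string> capturedGroups(const std::string& input,
                                        const std::regex& pattern) {
    std::vector<std::string> groups;
    std::smatch m;
    if (std::regex_search(input, m, pattern)) {
        // mark_count() is the number of capture groups in the pattern.
        for (std::size_t i = 0; i <= pattern.mark_count(); ++i) {
            groups.push_back(m.str(i));
        }
    }
    return groups;
}
```

For the input "Wed Dec 18 2002" and the pattern (\w+) (\d+) (\d{4}), the result holds four strings: "Dec 18 2002" for group 0, then "Dec", "18" and "2002".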

virtual UnicodeString RegexMatcher::group (UErrorCode &status) const [virtual]
 

virtual int32_t RegexMatcher::groupCount () const [virtual]
 

Returns the number of capturing groups in this matcher's pattern.

Returns:
the number of capture groups
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual const UnicodeString& RegexMatcher::input () const [virtual]
 

Returns the input string being matched.

The returned string is not a copy, but the live input string. It should not be altered or deleted.

Returns:
the input string
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

UBool RegexMatcher::isWordBoundary (int32_t pos) [private]
 

virtual UBool RegexMatcher::lookingAt (UErrorCode &status) [virtual]
 

Attempts to match the input string, starting from the beginning, against the pattern.

Like the matches() method, this function always starts at the beginning of the input string; unlike that function, it does not require that the entire input string be matched.

If the match succeeds then more information can be obtained via the start(), end(), and group() functions.

Parameters:
status  A reference to a UErrorCode to receive any errors.
Returns:
TRUE if there is a match at the start of the input string.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
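
The difference between matching the entire input, matching only at the start, and finding a match anywhere can be shown side by side. The sketch uses std::regex as a stand-in and is not ICU code; the three function names are hypothetical. The assumed correspondence is: std::regex_match for matches(), std::regex_search with match_continuous for lookingAt(), and plain std::regex_search for a first find().

```cpp
#include <regex>
#include <string>

// Hypothetical analogue of matches(): the whole input must match.
bool matchesWhole(const std::string& input, const std::regex& pattern) {
    return std::regex_match(input, pattern);
}

// Hypothetical analogue of lookingAt(): the match must begin at index 0
// but need not consume the whole input.
bool matchesAtStart(const std::string& input, const std::regex& pattern) {
    return std::regex_search(input, pattern,
                             std::regex_constants::match_continuous);
}

// Hypothetical analogue of a first find(): the match may begin anywhere.
bool matchesAnywhere(const std::string& input, const std::regex& pattern) {
    return std::regex_search(input, pattern);
}
```

With the pattern \d+, the input "123" satisfies all three functions, "123abc" satisfies only the last two, and "abc123" satisfies only the last one.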

virtual UBool RegexMatcher::matches (UErrorCode &status) [virtual]
 

Attempts to match the entire input string against the pattern.

Parameters:
status  A reference to a UErrorCode to receive any errors.
Returns:
TRUE if the entire input string matches the pattern.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

RegexMatcher& RegexMatcher::operator= (const RegexMatcher &rhs) [private]
 

virtual const RegexPattern& RegexMatcher::pattern () const [virtual]
 

Returns the pattern that is interpreted by this matcher.

Returns:
the RegexPattern for this RegexMatcher
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual UnicodeString RegexMatcher::replaceAll (const UnicodeString &replacement, UErrorCode &status) [virtual]
 

Replaces every substring of the input that matches the pattern with the given replacement string.

This is a convenience function that provides a complete find-and-replace-all operation.

This method first resets this matcher. It then scans the input string looking for matches of the pattern. Input that is not part of any match is left unchanged; each match is replaced in the result by the replacement string. The replacement string may contain references to capture groups.

Parameters:
replacement  a string containing the replacement text.
status  a reference to a UErrorCode to receive any errors.
Returns:
a string containing the results of the find and replace.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual UnicodeString RegexMatcher::replaceFirst (const UnicodeString &replacement, UErrorCode &status) [virtual]
 

Replaces the first substring of the input that matches the pattern with the replacement string.

This is a convenience function that provides a complete find-and-replace operation.

This function first resets this RegexMatcher. It then scans the input string looking for a match of the pattern. Input that is not part of the match is appended directly to the result string; the match is replaced in the result by the replacement string. The replacement string may contain references to captured groups.

The state of the matcher (the position at which a subsequent find() would begin) after completing a replaceFirst() is not specified. The RegexMatcher should be reset before doing additional find() operations.

Parameters:
replacement  a string containing the replacement text.
status  a reference to a UErrorCode to receive any errors.
Returns:
a string containing the results of the find and replace.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
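
The contrast between replacing the first match and replacing every match, including references to capture groups in the replacement text, can be sketched with std::regex as a stand-in. This is not ICU code, and replaceEvery and replaceOnce are hypothetical names. In both cases, input outside the matches is copied to the result unchanged.

```cpp
#include <regex>
#include <string>

// Hypothetical analogue of replaceAll(): every match is replaced.
std::string replaceEvery(const std::string& input, const std::regex& pattern,
                         const std::string& replacement) {
    return std::regex_replace(input, pattern, replacement);
}

// Hypothetical analogue of replaceFirst(): only the first match is replaced.
std::string replaceOnce(const std::string& input, const std::regex& pattern,
                        const std::string& replacement) {
    return std::regex_replace(input, pattern, replacement,
                              std::regex_constants::format_first_only);
}
```

Given the input "one  two   three", the pattern \s+ and a single space as the replacement, replaceEvery returns "one two three" while replaceOnce returns "one two   three". With the pattern (\w+) (\w+) and the replacement "$2, $1", the input "john smith" becomes "smith, john".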

virtual RegexMatcher& RegexMatcher::reset (const UnicodeString &input) [virtual]
 

Resets this matcher with a new input string.

This allows instances of RegexMatcher to be reused, which is more efficient than creating a new RegexMatcher for each input string to be processed.

Returns:
this RegexMatcher.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual RegexMatcher& RegexMatcher::reset (void) [virtual]
 

Resets this matcher.

The effect is to remove any memory of previous matches, and to cause subsequent find() operations to begin at the beginning of the input string.

Returns:
this RegexMatcher.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
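
The state that the two reset() functions clear can be made concrete with a small model. The class below is a hypothetical illustration built on std::regex. It is not ICU code and says nothing about how RegexMatcher is implemented. It keeps a compiled pattern, an input string and a saved search position, so that reset() rewinds the position and reset(input) also swaps the text while reusing the compiled pattern.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>

// Hypothetical model of a stateful matcher; MatchCursor is not an ICU class.
class MatchCursor {
public:
    MatchCursor(std::regex pattern, std::string input)
        : pattern_(std::move(pattern)), input_(std::move(input)) {}

    // Finds the next match, starting after the previous one.
    bool find() {
        std::smatch m;
        if (exhausted_ ||
            !std::regex_search(input_.cbegin() + next_, input_.cend(), m,
                               pattern_)) {
            exhausted_ = true;
            return false;
        }
        start_ = m[0].first - input_.cbegin();
        end_ = m[0].second - input_.cbegin();
        if (end_ > start_) {
            next_ = end_;
        } else if (end_ < static_cast<std::ptrdiff_t>(input_.size())) {
            next_ = end_ + 1;  // step past an empty match
        } else {
            exhausted_ = true;  // empty match at the very end of the input
        }
        return true;
    }

    // Forgets previous matches; the next find() starts at index 0.
    MatchCursor& reset() {
        next_ = 0;
        start_ = end_ = -1;
        exhausted_ = false;
        return *this;
    }

    // Same, but also replaces the input; the compiled pattern is reused.
    MatchCursor& reset(std::string input) {
        input_ = std::move(input);
        return reset();
    }

    std::ptrdiff_t start() const { return start_; }
    std::ptrdiff_t end() const { return end_; }

private:
    std::regex pattern_;
    std::string input_;
    std::ptrdiff_t next_ = 0;
    std::ptrdiff_t start_ = -1;
    std::ptrdiff_t end_ = -1;
    bool exhausted_ = false;
};
```

A cursor over "a1b22" with the pattern \d+ finds [1, 2), then [3, 5), then nothing. After reset() the first find() reports [1, 2) again, and after reset("x42") it reports [1, 3) in the new input. Both reset functions return the object itself, so calls can be chained as in cursor.reset().find().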

virtual int32_t RegexMatcher::start (int group, UErrorCode &status) const [virtual]
 

Returns the index in the input string of the start of the text matched by the specified capture group during the previous match operation.

Return -1 if the capture group exists in the pattern, but was not part of the last match.

Parameters:
group  the capture group number
status  A reference to a UErrorCode to receive any errors. Possible errors are U_REGEX_INVALID_STATE if no match has been attempted or the last match failed, and U_INDEX_OUTOFBOUNDS_ERROR for a bad capture group number.
Returns:
the start position of substring matched by the specified group.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual int32_t RegexMatcher::start (UErrorCode &status) const [virtual]
 

Returns the index in the input string of the start of the text matched during the previous match operation.

Parameters:
status  a reference to a UErrorCode to receive any errors.
Returns:
The position in the input string of the start of the last match.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.


Friends And Related Function Documentation

friend class RegexPattern [friend]
 

Definition at line 654 of file regex.h.


Member Data Documentation

UStack* RegexMatcher::fBackTrackStack [private]
 

Definition at line 673 of file regex.h.

UVector* RegexMatcher::fCaptureEnds [private]
 

Definition at line 675 of file regex.h.

UVector* RegexMatcher::fCaptureStarts [private]
 

Definition at line 674 of file regex.h.

const UnicodeString* RegexMatcher::fInput [private]
 

Definition at line 667 of file regex.h.

int32_t RegexMatcher::fInputLength [private]
 

Definition at line 668 of file regex.h.

int32_t RegexMatcher::fLastMatchEnd [private]
 

Definition at line 672 of file regex.h.

UBool RegexMatcher::fMatch [private]
 

Definition at line 669 of file regex.h.

int32_t RegexMatcher::fMatchEnd [private]
 

Definition at line 671 of file regex.h.

int32_t RegexMatcher::fMatchStart [private]
 

Definition at line 670 of file regex.h.

const RegexPattern* RegexMatcher::fPattern [private]
 

Definition at line 666 of file regex.h.

const char RegexMatcher::fgClassID [static, private]
 

The address of this static class variable serves as this class's ID for ICU "poor man's RTTI".

Definition at line 681 of file regex.h.


The documentation for this class was generated from the following file: regex.h
Generated on Wed Dec 18 16:51:54 2002 for ICU 2.4 by doxygen 1.2.11.1 written by Dimitri van Heesch, © 1997-2001