RegExpr_FindPatternInText

int RegExpr_FindPatternInText (const char *regularExpressionText, int caseSensitive, const char *textToSearch, int textLength, int direction, int matchPolicy, int *matched, int *matchPosition, int *matchLength);

Purpose

This function searches a text buffer for a match to a regular expression.

See the Instrument Help for example source code showing how to use this function.

Parameters

Input
Name Type Description
regularExpressionText const char * A nul–terminated string containing a regular expression.

A regular expression consists of the following tokens:
. (period): matches any single character. Example: a.t matches act and apt, but not abort or at.
* (asterisk): matches 0 or more occurrences of the preceding character or {expression}. Examples: 0*1 matches 1, 01, 001, etc.; a.* matches act, apt, abort, and at.
+ (plus sign): matches 1 or more occurrences of the preceding character or {expression}. Examples: 0+1 matches 01, 001, 0001, etc.; {ab}+c matches abc and ababc, but not c.
? (question mark): matches 0 or 1 occurrences of the preceding character or {expression}. Example: 0?1 matches 1 and 01, but not 001.
| (pipe): matches either the preceding or the following character or {expression}. Example: a3|4b matches a3b or a4b.
^ (caret): matches the beginning of a line. Example: ^int matches any line that begins with int.
$ (dollar sign): matches the end of a line. Example: done$ matches any line that ends with done.
{} (curly braces): group characters or expressions. Example: {a3}|{4b} matches a3 or 4b.
[] (brackets): match any one character or range listed within the brackets. Examples: [a-z] matches every occurrence of a lowercase letter; [abc] matches every occurrence of a, b, or c.
~ (tilde): when appearing immediately after the left bracket, negates the contents of the set. Examples: [~a-z] matches anything except lowercase letters; [a-z~A-Z] matches all letters and the '~' character.
\t (backslash t): matches a tab character. Example: \t3 matches every occurrence of a tab followed by a 3.
\x (backslash x): matches a character specified in hex. Example: \x2a matches every occurrence of the '*' character, specified in hex.
\ (backslash): used if any of the above characters themselves are to be included in the search. Example: \-\?\\ matches every occurrence of '-' followed by '?' and '\'.
caseSensitive integer Specifies whether characters are matched on a case-sensitive or case-insensitive basis.

A non-zero value specifies that characters are to be matched on a case-sensitive basis. For example, "chr" would match only "chr" and not "CHR".

A zero value specifies that characters are to be matched on a case-insensitive basis. For example, "chr" would match "chr", "CHR", and "Chr".

This parameter also applies to ranges. For example, if this parameter is non-zero, then "[a-z]" in the regular expression string would match any lowercase letter. If this parameter is zero, then "[a-z]" would match any letter.
textToSearch const char * The text to match to the regular expression.

The match does not have to begin at the start of the text. This function searches for a match anywhere in the text, so you do not need to call a matching function iteratively, each time passing the address of the next character in the text. See the Instrument Help for example source code.
textLength integer The number of bytes of the text to include in the search for a match to the regular expression.

If you pass a value less than zero, the text is assumed to be nul–terminated, and the full length is used.
direction integer Specifies whether the text is searched starting at the beginning and working forwards or starting at the end and working backwards.

Values:
RegExpr_SearchForwards 1: Start at the beginning (any non-zero value can be used)
RegExpr_SearchBackwards 0: Start at the end
matchPolicy integer Specifies whether the maximum or minimum number of characters are matched to the pattern.

Examples:

If the pattern is "a+" and the text to search is "aaaaab", the match could be of any length from 1 to 5. If you specify RegExpr_MatchLargestNumChars, the match length will be 5. Otherwise, it will be 1.

If the pattern is "{ab}+" and the text to search is "ababab", the match could be of length 2, 4, or 6. If you specify RegExpr_MatchLargestNumChars, the match length will be 6. Otherwise, it will be 2.

Values:
RegExpr_MatchLargestNumChars 1 (Any non–zero value can be used)
RegExpr_MatchSmallestNumChars 0
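
When the text to find comes from user input or another source that may contain the token characters listed under regularExpressionText, each of those characters needs a leading backslash. The following is a minimal sketch of such an escaping step. The helper name escape_for_regexpr is hypothetical and not part of the library, and the sketch assumes that a backslash is accepted in front of every listed token character (including '~' and '-') anywhere in the expression.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the library: copies `literal` into `out`,
   placing a backslash in front of every character that the token list treats
   as special. Returns the escaped length, or -1 if `out` is too small.
   Assumption: a backslash may precede any of these characters anywhere. */
static int escape_for_regexpr(const char *literal, char *out, size_t outSize)
{
    static const char special[] = ".*+?|^${}[]~\\-";
    size_t used = 0;

    if (outSize == 0)
        return -1;

    for (; *literal != '\0'; ++literal) {
        size_t needed = (strchr(special, *literal) != NULL) ? 2 : 1;

        if (used + needed + 1 > outSize)   /* +1 keeps room for the NUL */
            return -1;
        if (needed == 2)
            out[used++] = '\\';
        out[used++] = *literal;
    }
    out[used] = '\0';
    return (int)used;
}
```

The escaped string can then be passed as regularExpressionText, either alone or combined with other tokens.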
Output
Name Type Description
matched integer * If there was a match, this parameter is set to 1. Otherwise, it is set to 0.
matchPosition integer * The zero–based index of the position within the text where the match to the regular expression begins.

If there was no match, then the parameter is set to -1.
matchLength integer * This parameter is set to the number of characters in the text that actually matched to the regular expression.

If there was no match, then the parameter is set to -1.
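
The three output values are typically used together to extract the matched text. The following is a minimal sketch of that step. The helper name copy_match is hypothetical and not part of the library. It assumes that matchPosition is an index from the start of textToSearch for both search directions. The call in the closing comment follows the prototype at the top of this page and is shown only as an illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the library. Copies the matched substring
   into `out` using the three output values of the search. Returns 1 on
   success, 0 if there was no match or `out` is too small. */
static int copy_match(const char *textToSearch, int matched, int matchPosition,
                      int matchLength, char *out, size_t outSize)
{
    /* On no match, matched is 0 and position and length are -1. */
    if (!matched || matchPosition < 0 || matchLength < 0)
        return 0;
    if ((size_t)matchLength + 1 > outSize)   /* +1 for the terminating NUL */
        return 0;

    memcpy(out, textToSearch + matchPosition, (size_t)matchLength);
    out[matchLength] = '\0';
    return 1;
}

/* Intended use (illustrative only):
 *
 *   char buffer[64];
 *   int matched, matchPosition, matchLength;
 *   int status = RegExpr_FindPatternInText("a+", 1, "xxaaaaab", -1,
 *                    RegExpr_SearchForwards, RegExpr_MatchLargestNumChars,
 *                    &matched, &matchPosition, &matchLength);
 *   if (status == 0 &&
 *       copy_match("xxaaaaab", matched, matchPosition, matchLength,
 *                  buffer, sizeof buffer)) {
 *       // buffer now holds the matched text
 *   }
 */
```

Checking the return value before reading the outputs keeps a parse failure from being mistaken for "no match".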

Return Value

Name Type Description
parseStatus integer Indicates if the regular expression was parsed successfully.

If the string is not a valid regular expression, a negative error number is returned. You can pass this error number to the RegExpr_GetErrorString function. However, in some cases there is more error information than can be encoded in the error number. You can get more detailed information about the result of the last call to this function by calling RegExpr_GetErrorElaboration.

The error numbers are:
0 success
-12 out of memory
-7900 unmatched character
-7901 invalid character in range
-7902 regular expression ends with backslash
-7903 invalid hex character (after \x)
-7904 operator applied to an empty pattern
-7905 empty left side of '|'
-7906 empty right side of '|'
-7907 empty group
-7908 invalid range
-7909 empty set ([])
-7910 empty input string
-7911 NULL input string
-7912 multibyte characters not allowed in range
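
In normal use, RegExpr_GetErrorString and RegExpr_GetErrorElaboration are the way to turn a status into text. Where the library is not linked (for example, in a log viewer or a unit test), the table above can be mirrored in a small lookup. The following is a minimal sketch. The names status_entry and describe_parse_status are hypothetical, and the strings are copied from the list above rather than taken from the library.

```c
#include <stddef.h>

/* Hypothetical table entry: one documented status code and its description. */
struct status_entry {
    int code;
    const char *text;
};

/* Hypothetical helper, not part of the library: returns the description
   listed above for a status code, or "unknown status" for any other value. */
static const char *describe_parse_status(int status)
{
    static const struct status_entry table[] = {
        {     0, "success" },
        {   -12, "out of memory" },
        { -7900, "unmatched character" },
        { -7901, "invalid character in range" },
        { -7902, "regular expression ends with backslash" },
        { -7903, "invalid hex character (after \\x)" },
        { -7904, "operator applied to an empty pattern" },
        { -7905, "empty left side of '|'" },
        { -7906, "empty right side of '|'" },
        { -7907, "empty group" },
        { -7908, "invalid range" },
        { -7909, "empty set ([])" },
        { -7910, "empty input string" },
        { -7911, "NULL input string" },
        { -7912, "multibyte characters not allowed in range" }
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; ++i) {
        if (table[i].code == status)
            return table[i].text;
    }
    return "unknown status";
}
```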