
Match Regular Expression (G Dataflow)

Version:
    Last Modified: March 15, 2017

    Searches for a pattern of characters in a string as specified by a regular expression. If this node finds a match, it splits the string into three substrings and any number of submatches. Resize the node to view additional submatches found in the string.

    This node does not support null characters in strings. If you include null characters in strings you wire to this node, the node returns an error and may return unexpected results.

    This node performs more slowly than Match Pattern but gives you more options for matching strings. For example, Match Regular Expression supports the parenthesis and vertical bar (|) characters.


    ignore case

    A Boolean value that determines whether the search ignores letter case.

    True The search ignores the letter case of the input.
    False The search matches the letter case of the input.

    Default: False
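
    The effect of ignore case can be sketched in a text language. The following example uses Python's re module as a stand-in for the node, which is an assumption made for illustration only; the flag re.IGNORECASE plays the role of setting ignore case to True.

```python
import re

text = "Welcome to LabVIEW."

# ignore case = False: the lowercase pattern does not match "LabVIEW".
case_sensitive = re.search(r"labview", text)

# ignore case = True: letter case is ignored during the search.
case_insensitive = re.search(r"labview", text, flags=re.IGNORECASE)

print(case_sensitive)              # None
print(case_insensitive.group(0))   # LabVIEW
```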


    multiline?

    A Boolean value that determines whether to treat the text in the input string as a multiple-line string. This affects how the ^ and $ characters handle matches.

    True ^ matches the beginning of any line in the input string. $ matches the end of any line in the input string.
    False ^ matches only the beginning of the input string. $ matches only the end of the input string.

    Default: False
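
    As a rough illustration of the two settings, the following sketch uses Python's re module, where re.MULTILINE approximates setting multiline? to True. Python is not the engine this node uses, and the sketch assumes lines are separated by a single newline character rather than the line endings of a particular platform.

```python
import re

text = "Hello\nLabVIEW"

# multiline? = False: ^ and $ anchor only to the whole string.
start_single = re.search(r"^LabVIEW", text)
end_single = re.search(r"Hello$", text)

# multiline? = True: ^ and $ also anchor at the start and end of each line.
start_multi = re.search(r"^LabVIEW", text, flags=re.MULTILINE)
end_multi = re.search(r"Hello$", text, flags=re.MULTILINE)

print(start_single, end_single)                   # None None
print(start_multi.group(0), end_multi.group(0))   # LabVIEW Hello
```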


    input string

    The string that this node searches. This string cannot contain null characters.


    regular expression

    The pattern you want to search for in the input string. This string cannot contain null characters. Use Perl Compatible Regular Expressions to refine searches.

    Definitions of Regular Expressions

    Use the following regular expressions to search the input string.

    Regular Expression Description Examples
    . (period)

    Matches any single character except a newline character. Within square brackets, . is literal.

    Input String: Welcome to LabVIEW.

    Regular Expression: t....

    Match: to La

    If you use [z.] as the regular expression, the period is literal and matches either . or z. In this example, [z.] returns . as the match.
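
    The period example above can be reproduced in Python's re module, used here only as an approximation of the node's regular expression engine.

```python
import re

text = "Welcome to LabVIEW."

# Outside square brackets, each period matches any single character.
wildcard = re.search(r"t....", text).group(0)

# Inside square brackets, the period is literal, so [z.] matches "z" or ".".
literal = re.search(r"[z.]", text).group(0)

print(repr(wildcard))  # 'to La'
print(repr(literal))   # '.'
```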

    *

    Marks the single preceding character, character group, or character class as one that can appear zero or more times in the input. Because an asterisk can mark a pattern as one that appears zero times, regular expressions that include an asterisk can return an empty string if the whole pattern is marked with an asterisk. This quantifier applies to as many characters as possible.

    Input String: Hello LabVIEW!

    Regular Expression: el*

    Match: ell

    Expressions such as w* or (welcome)* match an empty string if the node finds no other matches.

    +

    Marks the single preceding character, character group, or character class as one that can appear one or more times in the input. This quantifier applies to as many characters as possible.

    Input String: Hello LabVIEW!

    Regular Expression: el+

    Match: ell
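
    The difference between * and + shows up when the quantified character is absent. The sketch below uses Python's re module as an assumed stand-in for the node; the input "Hey" is a hypothetical string added for contrast.

```python
import re

text = "Hello LabVIEW!"

# Both quantifiers are greedy, so both consume the two l characters.
star = re.search(r"el*", text).group(0)
plus = re.search(r"el+", text).group(0)

# With no l after the e, * still matches (zero times) but + does not.
star_short = re.search(r"el*", "Hey").group(0)
plus_short = re.search(r"el+", "Hey")

# A pattern that is entirely optional can match the empty string.
empty = re.search(r"w*", text).group(0)

print(star, plus, star_short, plus_short, repr(empty))
```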

    ?

    Marks the single preceding character, character group, or character class as one that can appear zero or one time in the input. This quantifier applies to as many characters as possible.

    When used immediately after a quantifier, ? modifies the quantifier to match as few times as possible. Modifiable quantifiers include *, +, and {}.

    Input String: Hello LabVIEW!

    Regular Expression: el?

    Match: el

    Input String: <ul><li>Hello</li><li>LabVIEW</li></ul>

    Regular Expression: <li>.+?</li>

    Match: <li>Hello</li>

    In the second example, if you remove ? from the regular expression, the new match becomes <li>Hello</li><li>LabVIEW</li> because + matches as many characters as possible unless you include ? immediately after +. You can use this regular expression to match any string within <li></li> tags.
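
    The greedy and non-greedy behavior described above can be compared side by side. This is a minimal sketch in Python's re module, which is assumed here to behave like the node for these patterns.

```python
import re

text = "<ul><li>Hello</li><li>LabVIEW</li></ul>"

# +? matches as few characters as possible, so the match stops at the first </li>.
lazy = re.search(r"<li>.+?</li>", text).group(0)

# + alone matches as many characters as possible, so the match runs to the last </li>.
greedy = re.search(r"<li>.+</li>", text).group(0)

print(lazy)    # <li>Hello</li>
print(greedy)  # <li>Hello</li><li>LabVIEW</li>
```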

    {n,N}

    Marks the single preceding character, character group, or character class as one that can appear the number of times you specify, where n is the minimum and N is the maximum. You also can specify a single number. If you specify a range, this quantifier matches as many times as possible.

    Input String: <ul><li>Hello</li><li>Lab</li><li>VIEW</li><li>!</li></ul>

    Regular Expression: (<li>.+?</li>){2}

    Match: <li>Hello</li><li>Lab</li>

    Input String: <ul><li>Hello</li><li>Lab</li><li>VIEW</li><li>!</li></ul>

    Regular Expression: (<li>.+?</li>){1,3}

    Match: <li>Hello</li><li>Lab</li><li>VIEW</li>

    In the second example, the minimum match limit is one and the maximum is three. Because the regular expression matches as many times as possible within the limit you specify, the regular expression returns three.
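
    Both counted-repetition examples can be checked with a short script. Python's re module is used as an assumed approximation of the node's behavior.

```python
import re

text = "<ul><li>Hello</li><li>Lab</li><li>VIEW</li><li>!</li></ul>"

# {2} repeats the character group exactly twice.
exactly_two = re.search(r"(<li>.+?</li>){2}", text).group(0)

# {1,3} repeats the group between one and three times, preferring three.
up_to_three = re.search(r"(<li>.+?</li>){1,3}", text).group(0)

print(exactly_two)   # <li>Hello</li><li>Lab</li>
print(up_to_three)   # <li>Hello</li><li>Lab</li><li>VIEW</li>
```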

    []

    Creates a character class, which allows you to match any one of a set of characters that you specify. For example, [abc] matches a, b, or c.

    You can use - to specify a range of characters. For example, [a-z] matches any single lowercase letter.

    This node interprets special characters inside square brackets literally, with the exception of ^, -, and \.

    Input String: version=14.0.1

    Regular Expression: [0-9]+(\.[0-9]+)*

    Match: 14.0.1

    The expression [0-9] matches any digit, and the plus sign that follows matches that character class one or more times, as many times as possible. The parentheses create a character group, which returns a submatch that consists of a . character and the digits that follow it. Within the group, \. matches a literal . character and [0-9]+ again matches one or more digits. The asterisk matches the preceding character group, (\.[0-9]+), zero or more times, so the regular expression still matches integers that contain no . character. You can use this regular expression to match any integer, decimal number, version number, IPv4 address, or other number sequence separated by . characters.
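
    To see how the same expression handles several kinds of number sequences, consider the following sketch. It uses Python's re module as an assumed stand-in for the node, and every input other than version=14.0.1 is a hypothetical example.

```python
import re

number_sequence = re.compile(r"[0-9]+(\.[0-9]+)*")

# Hypothetical inputs, apart from the first one, chosen to cover each case.
inputs = ["version=14.0.1", "count=42", "pi=3.14", "host=192.168.0.1"]

# Collect the first match found in each input string.
matches = [number_sequence.search(s).group(0) for s in inputs]

print(matches)  # ['14.0.1', '42', '3.14', '192.168.0.1']
```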

    ()

    Creates a character group, which allows you to match an entire set of characters that you specify. A quantifier that immediately follows a character group quantifies the entire group.

    Parentheses also create submatches where each individual character group returns a submatch. If you nest a character group within another character group, the regular expression creates a submatch for the outer group before the inner group. Expand Match Regular Expression to access submatch outputs.

    You also can refer back to submatches later in an expression using backreferences. Refer to the Backreferences section for more information about using backreferences in regular expressions.

    Input String: Hello LabVIEW!

    Regular Expression: (el.)..(L..)

    Match: ello Lab

    Submatch 1: ell

    Submatch 2: Lab

    Input String: Hello LabVIEW!

    Regular Expression: (.(el.).).(L..)

    Match: Hello Lab

    Submatch 1: Hello

    Submatch 2: ell

    Submatch 3: Lab
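
    The submatch ordering in the two examples above can be sketched with Python's re module, where groups() returns the captured groups in the order of their opening parentheses. Treating this as equivalent to the node's submatch outputs is an assumption.

```python
import re

text = "Hello LabVIEW!"

# Two sibling character groups produce two submatches.
flat = re.search(r"(el.)..(L..)", text)

# The outer group opens first, so it becomes submatch 1 and the inner group submatch 2.
nested = re.search(r"(.(el.).).(L..)", text)

print(flat.group(0), flat.groups())      # ello Lab ('ell', 'Lab')
print(nested.group(0), nested.groups())  # Hello Lab ('Hello', 'ell', 'Lab')
```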

    |

    Separates alternate possible matches. This character is useful when you want to match any of a number of character groups. A regular expression that contains | returns the first match that the node finds in the input string regardless of the order of your possible matches. For example, both regular expressions dog|cat and cat|dog match dog in the dog chased the cat.

    Input String: value=FALSE total=12.34 token=TRUE

    Regular Expression: (value|token)=(TRUE|FALSE)

    Match: value=FALSE

    Submatch 1: value

    Submatch 2: FALSE

    The regular expression returns the first possible match in the input string. If token=TRUE appeared before value=FALSE in the input string, the regular expression would match token=TRUE instead.
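
    The position-first behavior of | can be demonstrated as follows. The sketch uses Python's re module as an assumed approximation of the node; the reordered input string is a hypothetical variation of the example above.

```python
import re

pattern = re.compile(r"(value|token)=(TRUE|FALSE)")

first = pattern.search("value=FALSE total=12.34 token=TRUE")

# Hypothetical reordering: token=TRUE now appears earlier in the input.
swapped = pattern.search("token=TRUE total=12.34 value=FALSE")

# The order of the alternatives does not change which position matches first.
dog_cat = re.search(r"dog|cat", "the dog chased the cat").group(0)
cat_dog = re.search(r"cat|dog", "the dog chased the cat").group(0)

print(first.group(0), first.groups())  # value=FALSE ('value', 'FALSE')
print(swapped.group(0))                # token=TRUE
print(dog_cat, cat_dog)                # dog dog
```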

    ^

    Anchors a match to the beginning of a string when used as the first character of a pattern.

    If you set the multiline? input to True on this node, ^ matches the beginning of any line within the string using the line endings of the current platform.

    You also can match any character not in a given character class by adding ^ to the beginning of a character class. For example, [^0-9] matches any character that is not a digit. [^a-zA-Z0-9] matches any character that is not a lowercase or uppercase letter and also not a digit.

    Input String: Hello LabVIEW!

    Regular Expression: ^[^ ]+

    Match: Hello

    The regular expression matches as many characters as possible – other than a space character – from the beginning of the input string. You can use this regular expression to isolate the first word, numeral, or other character combination of a string.

    Input String:
    Hello
    LabVIEW

    Regular Expression: ^LabVIEW

    Match: LabVIEW

    The regular expression matches LabVIEW only if you set multiline? to True.

    $

    Anchors a match at the end of a string when used as the last character of a pattern.

    If you set the multiline? input to True, $ matches the end of any line within the string using the line endings of the current platform.

    Reference a parenthesized item with $n, where n is the index of the parenthesized item. Explicitly insert dollar signs ($) by prefixing them with a backslash (\). For example, \$5 represents $5 and $5 represents the 5th parenthesized item.

    Input String: Hello LabVIEW!

    Regular Expression: [^ ]+$

    Match: LabVIEW!

    The regular expression matches as many characters as possible – other than a space character – from the end of the input string. You can use this regular expression to isolate the last word, numeral, or other character combination of a string.

    Input String:
    Hello
    LabVIEW

    Regular Expression: Hello$

    Match: Hello

    The regular expression matches Hello only if you set multiline? to True.

    \

    Cancels the special meaning of any special character in this list that immediately follows the backslash and instead matches the literal character.

    The following escaped expressions have special meanings:

    • \b—Represents a word boundary. A word boundary is a character that is not a word character adjacent to a character that is a word character and vice versa. A word character is an alphanumeric character or an underscore (_). For example, \bhat matches hat in hatchet but not in that. hat\b matches hat in that but not in hatchet. \bhat\b matches hat in hat but not in that or hatchet.
    • \c—Matches any control or non-printing character; includes any code point in the character set that does not represent a written symbol
    • \w—Matches any word character; equivalent to [a-zA-Z0-9_]
    • \W—Matches any non-word character; equivalent to [^a-zA-Z0-9_]
    • \d—Matches any digit character; equivalent to [0-9]
    • \D—Matches any non-digit character; equivalent to [^0-9]
    • \N—Matches a previous submatch within the same regular expression where N is a digit; refer to the Backreferences section for more information about using \N
    • \s—Matches any whitespace character; includes space, newline, tab, carriage return, and so on
    • \S—Matches any non-whitespace character
    • \n—Matches a newline character
    • \t—Matches a tab character
    • \r—Matches a carriage return character
    • \f—Matches a formfeed character
    • \031—Matches an octal character (31 octal in this case)
    • \x3F—Matches a hexadecimal character (3F hexadecimal in this case)

    Input String: total=$12.34

    Regular Expression: \$\d+(\.\d{2})?

    Match: $12.34

    The expression \$ matches a literal dollar sign because the backslash cancels the special meaning. The expression \d+ matches as many digits as possible and must match at least one digit. The expression (\.\d{2})? matches . and two digits, but ? makes this portion of the regular expression optional to match. You can use this regular expression to match dollar values that use a . character as a decimal separator. Locales that use a different character as a decimal separator must adapt the regular expression.
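
    A short sketch of the dollar-value expression follows, using Python's re module as an assumed stand-in for the node. The second input is a hypothetical string that shows the optional decimal portion being skipped.

```python
import re

dollars = re.compile(r"\$\d+(\.\d{2})?")

# The escaped \$ matches a literal dollar sign; the cents group is optional.
with_cents = dollars.search("total=$12.34").group(0)
without_cents = dollars.search("fee=$5 flat").group(0)

print(with_cents)     # $12.34
print(without_cents)  # $5
```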

    Input String: NEWtoken=FALSE token=TRUE checkFile=TRUE total=12.34

    Regular Expression: \btoken=\w+\s\b\S*

    Match: token=TRUE checkFile=TRUE

    The regular expression does not match token=FALSE in NEWtoken=FALSE because \b makes the regular expression match token= only at the beginning of a word. The expression \w+ matches as many word characters as possible and must match at least one. In this example, \w+ matches TRUE. The expression \s matches a space character. The expression \b\S* matches all non-whitespace characters that begin a word, up to the next whitespace character. In this example, \b\S* matches checkFile=TRUE.
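
    The word-boundary behavior can be verified with a few searches. This sketch assumes Python's re module treats \b, \w, \s, and \S the same way as the node for ASCII input.

```python
import re

text = "NEWtoken=FALSE token=TRUE checkFile=TRUE total=12.34"

# \b rejects the token inside NEWtoken because W and t are both word characters.
match = re.search(r"\btoken=\w+\s\b\S*", text).group(0)

# The hat examples from the \b description.
starts_word = [bool(re.search(r"\bhat", s)) for s in ("hatchet", "that")]
ends_word = [bool(re.search(r"hat\b", s)) for s in ("that", "hatchet")]

print(match)        # token=TRUE checkFile=TRUE
print(starts_word)  # [True, False]
print(ends_word)    # [True, False]
```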

    Input String:
    Welcome
    to   LabVIEW!

    Regular Expression: come\n\S*\t\w*\x21

    Match:
    come
    to   LabVIEW!

    The expression come\n matches the literal letters followed by a newline character. The expression \S* matches as many non-whitespace characters as possible, which is the word to in this case. The expression \t matches the tab in between to and LabVIEW!. The expression \w* matches as many word characters as possible, which is LabVIEW in this case. The expression \x21 matches the exclamation point because 21 is the hexadecimal code for an exclamation point.
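
    The escaped newline, tab, and hexadecimal characters in this example can be tried out as follows. The sketch uses Python's re module as an assumed approximation and writes the input with explicit \n and \t escapes so that the whitespace is unambiguous.

```python
import re

# A newline separates the two lines and a tab separates "to" from "LabVIEW!".
text = "Welcome\nto\tLabVIEW!"

# The raw string passes \n, \t, and \x21 to the regular expression engine unchanged.
match = re.search(r"come\n\S*\t\w*\x21", text).group(0)

print(repr(match))  # 'come\nto\tLabVIEW!'
```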

    Tip  

    To anchor a match at the beginning and end of a string, use ^ as the first character in a pattern and $ as the last character of a pattern. For example, ^LabVIEW$ matches LabVIEW in LabVIEW but not in LabVIEW! or Hello LabVIEW. Anchoring the match at the beginning and end of the string requires the whole string to match.
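
    The whole-string anchoring in this tip can be sketched with Python's re module, used here as an assumed stand-in for the node. One difference to keep in mind is that Python's $ also matches just before a trailing newline, which may not apply to the node.

```python
import re

anchored = re.compile(r"^LabVIEW$")

# Only the input that consists entirely of LabVIEW matches.
results = {s: anchored.search(s) is not None
           for s in ("LabVIEW", "LabVIEW!", "Hello LabVIEW")}

print(results)  # {'LabVIEW': True, 'LabVIEW!': False, 'Hello LabVIEW': False}
```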

    Specifying Backreferences in the Search String

    Use backreferences to refer to previous submatches within the same regular expression. You can use backreferences to create a submatch using a character group in one part of an expression and then match that exact submatch in a later part of the expression.

    To specify a backreference, use \1 to refer to the first submatch, \2 to refer to the second, and so on. For example, consider the following regular expression:

    (["*$])(\w+)\1\2\1

    The first character group contains a character class that matches ", *, or $. The second character group matches one or more word characters. The first backreference, \1, matches the same submatch as the first character group, (["*$]). The second backreference, \2, matches the same submatch as the second character group, (\w+). The third backreference, \1, is identical to the first backreference and matches the same submatch as the first character group.

    This example matches strings such as "foo"foo", *bar*bar*, and $baz$baz$, but does not match strings such as "foo$foo" or "foo*bar".
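
    The backreference example can be checked against each of the strings listed above. This is a minimal sketch in Python's re module, assumed to handle \1 and \2 the same way as the node; fullmatch is used so that the whole string must match.

```python
import re

# \1 must repeat the delimiter captured first; \2 must repeat the captured word.
pattern = re.compile(r'(["*$])(\w+)\1\2\1')

candidates = ['"foo"foo"', "*bar*bar*", "$baz$baz$", '"foo$foo"', '"foo*bar"']
results = [pattern.fullmatch(s) is not None for s in candidates]

print(results)  # [True, True, True, False, False]
```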


    offset

    The number of bytes into the input string at which this node starts searching.

    The offset of the first byte in the input string is 0. If offset is beyond the end of the input string, this node returns an empty string.

    Note  

    Strings are encoded in UTF-8. For strings containing characters in the U+0000 through U+007F range, the number of bytes in a string is equivalent to the number of characters. However, for strings containing the characters U+0080 through U+7FFFFFFF, the number of bytes is greater than the number of characters.
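
    The difference between byte offsets and character positions can be illustrated outside the node. The following sketch uses Python, where searching the UTF-8 encoded bytes is an assumed approximation of how the node counts offset; the input string is hypothetical.

```python
import re

text = "h\u00e9llo LabVIEW"   # U+00E9 is encoded as two bytes in UTF-8
data = text.encode("utf-8")

char_count = len(text)        # number of characters
byte_count = len(data)        # number of bytes

char_index = text.find("LabVIEW")    # position counted in characters
byte_index = data.find(b"LabVIEW")   # position counted in bytes

# Searching the bytes from a byte offset, as the offset input is described.
match = re.compile(rb"LabVIEW").search(data, byte_index)

print(char_count, byte_count)        # 13 14
print(char_index, byte_index)        # 6 7
print(match.start(), match.end())    # 7 14
```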


    error in

    Error conditions that occur before this node runs.

    The node responds to this input according to standard error behavior.

    Standard Error Behavior

    Many nodes provide an error in input and an error out output so that the node can respond to and communicate errors that occur while code is running. The value of error in specifies whether an error occurred before the node runs. Most nodes respond to values of error in in a standard, predictable way.

    error in does not contain an error: If no error occurred before the node runs, the node begins execution normally. If no error occurs while the node runs, it returns no error. If an error does occur while the node runs, it returns that error information as error out.

    error in contains an error: If an error occurred before the node runs, the node does not execute. Instead, it returns the error in value as error out.

    Default: No error


    before match

    A string that contains all the characters in input string that occur before the match. If the node does not find a match, this string contains all of the characters in the input string.


    whole match

    All the characters that match the expression entered in regular expression. Any substring matches the node finds appear in the submatch outputs. If this node does not find a match, whole match contains an empty string.


    after match

    All the characters after the match. If the node does not find a match, after match contains an empty string.


    offset past match

    Index of the first character after the last match the node finds in the input string. If the node does not find a match, offset past match returns -1.


    error out

    Error information.

    The node produces this output according to standard error behavior.


    submatch

    Portion of the whole match you capture using character grouping in the regular expression. Capture a submatch by placing parentheses around the portion of a regular expression you want the node to return as a submatch.

    For example, the regular expression (el.)..(L..) returns two submatches in the input string Hello LabVIEW!. Each submatch corresponds to a character group in the order that the character group appears in the regular expression. In this example, submatch 1 is ell and submatch 2 is Lab.

    Submatch Behavior with Nested Character Groups

    If you nest a character group within another character group, the regular expression creates a submatch for the outer group before the inner group. For example, the regular expression (.(el.).).(L..) returns three submatches in the input Hello LabVIEW!: Hello, ell, and Lab. In this example, submatch 1 is Hello because the regular expression matches the outer character group before the inner group.

    Repeated Grouped Expressions and Stack Overflow

    Certain regular expressions that use repeated grouped expressions require significant resources to search large input strings. If these regular expressions recurse repeatedly while attempting to match a large string, they may eventually overflow the stack. You can rewrite expressions like these to avoid overflowing the stack. Refer to the following table for example alternatives.
    Grouped Regular Expression: (.|\s)*A
    Rewritten Expression: (?s).*A or [^A]*A

    Grouped Regular Expression: (a*)*
    Rewritten Expression: a*
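
    To confirm that a rewritten expression matches the same text as the grouped original, you can compare them on a small input. The sketch below uses Python's re module, which is an assumption: its engine is not the one this node uses and does not reproduce the stack behavior described above, so it only checks that the matches agree. The input string is hypothetical and contains a single A, because [^A]*A stops at the first A while the other two expressions run to the last one.

```python
import re

text = "first line\nsecond line ends with A"

# The grouped form and its two suggested rewrites.
grouped = re.search(r"(.|\s)*A", text).group(0)
dotall = re.search(r"(?s).*A", text).group(0)      # (?s) lets . match newlines
negated = re.search(r"[^A]*A", text).group(0)      # everything up to the first A

print(grouped == dotall == negated == text)  # True
```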

    What Happens When There Is No Match?

    If the node does not find a match, whole match and after match contain empty strings, before match contains the entire input string, offset past match returns -1, and all submatch outputs return empty strings. Place any substrings you want to search for in parentheses. This node returns any substring expressions it finds in submatch 1..n.
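
    The outputs described above can be modeled in a few lines. The following is a sketch only: match_regular_expression and MatchResult are hypothetical names, Python's re module stands in for the node's engine, and offsets are counted in characters rather than bytes.

```python
import re
from typing import NamedTuple, Tuple


class MatchResult(NamedTuple):
    """Hypothetical container that mirrors the node's outputs."""
    before_match: str
    whole_match: str
    after_match: str
    offset_past_match: int
    submatches: Tuple[str, ...]


def match_regular_expression(input_string, pattern, offset=0,
                             ignore_case=False, multiline=False):
    """Approximate the node's outputs; offsets here are character indices."""
    flags = 0
    if ignore_case:
        flags |= re.IGNORECASE
    if multiline:
        flags |= re.MULTILINE
    compiled = re.compile(pattern, flags)
    m = compiled.search(input_string, offset)
    if m is None:
        # No match: before match holds the whole input and everything else is empty.
        return MatchResult(input_string, "", "", -1, ("",) * compiled.groups)
    # Groups that did not participate in the match are reported as empty strings.
    subs = tuple(g if g is not None else "" for g in m.groups())
    return MatchResult(input_string[:m.start()], m.group(0),
                       input_string[m.end():], m.end(), subs)


found = match_regular_expression("Hello LabVIEW!", r"(el.)..(L..)")
missing = match_regular_expression("Hello LabVIEW!", r"(x)(y)")
print(found)
print(missing)
```

    In this sketch, a failed search returns the entire input as before_match, empty strings for the other string outputs, and -1 for offset_past_match, which follows the description above.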

    Where This Node Can Run:

    Desktop OS: Windows

    FPGA: This product does not support FPGA devices

