Scan String for Tokens (G Dataflow)

Version:
    Last Modified: March 2, 2017

    Searches a string for the next token, where a token is defined as either the next set of characters that appears before a specified delimiter or as one of a specified set of operators.

    allow empty tokens?

    A Boolean value that determines whether an empty token exists between consecutive delimiters.

    True: The node returns an empty string as the token string when it encounters a pair of consecutive delimiters.
    False: The node considers consecutive delimiters to be a single delimiter and therefore never returns an empty string as a token.

    Default: False
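
    The effect of this input can be sketched in Python. The helper below, split_tokens, is a hypothetical illustration and not part of the node: it returns all tokens at once rather than one per call, and it assumes a non-empty list of non-empty delimiters. It only shows how consecutive delimiters inside a string are treated under each setting.

```python
import re

def split_tokens(text, delimiters, allow_empty_tokens=False):
    """Split text on any delimiter; optionally keep empty tokens.

    Hypothetical helper for illustration only, not the node's implementation.
    Assumes delimiters is a non-empty list of non-empty strings.
    """
    # Try longer delimiters first so overlapping delimiters behave predictably.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = "|".join(re.escape(d) for d in ordered)
    pieces = re.split(pattern, text)
    if allow_empty_tokens:
        # Consecutive delimiters produce an empty string between them.
        return pieces
    # Consecutive delimiters collapse into one: drop the empty pieces.
    return [piece for piece in pieces if piece]

print(split_tokens("a,,b", [","], allow_empty_tokens=True))   # ['a', '', 'b']
print(split_tokens("a,,b", [","], allow_empty_tokens=False))  # ['a', 'b']
```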

    input string

    The string to scan for tokens.

    offset

    The number of bytes into the input string at which this node begins its operation.

    The offset of the first byte in the input string is 0. If offset is beyond the end of the input string, this node returns an empty string.

    Note: Strings are encoded in UTF-8. For strings containing only characters in the U+0000 through U+007F range, the number of bytes in a string equals the number of characters. For strings containing characters above U+007F, the number of bytes is greater than the number of characters because each such character occupies more than one byte.

    Default: 0
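
    The difference between byte offsets and character positions can be checked with a short Python sketch. The helper name byte_offset is hypothetical; it converts a character index into the UTF-8 byte offset that an input such as offset would expect.

```python
def byte_offset(text, char_index):
    """Return the UTF-8 byte offset of the character at char_index."""
    return len(text[:char_index].encode("utf-8"))

ascii_text = "abc def"
accented_text = "\u00e9=1 \u00fc=2"  # the letters e-acute and u-umlaut take 2 bytes each

# Character count versus byte count.
print(len(ascii_text), len(ascii_text.encode("utf-8")))        # 7 7
print(len(accented_text), len(accented_text.encode("utf-8")))  # 7 9

# Byte offset of the fifth character (character index 4) in each string.
print(byte_offset(ascii_text, 4))     # 4
print(byte_offset(accented_text, 4))  # 5
```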

    operators

    An array of strings that this node identifies as tokens when they appear in input string, even if they are not surrounded by delimiters.

    Available Format Specifiers

    In addition to specifying literal strings as tokens, you can use certain format specifiers to interpret a series of digits as a token.

    %d: matches a decimal integer
    %o: matches an octal integer
    %x: matches a hexadecimal integer
    %b: matches a binary integer
    %e, %f, %g: match a floating-point or scientific real number
    %%: matches a single % character

    delimiters

    An array of strings that act as separators between tokens. The node does not return these strings as tokens but instead uses these strings to determine where tokens begin and end.

    Default: White space characters (space, tab, linefeed, and carriage return)

    use cached delimiter/operator data?

    A Boolean value that determines whether the node uses saved values for delimiters and operators, thereby improving string parsing performance.

    Set this input to True only if delimiters and operators have not changed since the last time this node executed.

    True: The node uses the delimiters and operators from the most recent time this node executed.
    False: The node uses the values wired to delimiters and operators.

    Default: False

    string out

    The same string as input string, unchanged.

    offset past token

    The index in the input string of the first byte following the token and any trailing delimiters.

    If the offset is less than 0 or greater than the number of bytes in the input string, or if the end of the string was reached, this output is -1.

    To continue searching for more tokens in the input string, use this value as the offset the next time you call this node.

    Note: Strings are encoded in UTF-8. For strings containing only characters in the U+0000 through U+007F range, the number of bytes in a string equals the number of characters. For strings containing characters above U+007F, the number of bytes is greater than the number of characters because each such character occupies more than one byte.

    token string

    The first token in the input string following the offset. This output is either all text that appears between two delimiters or one of the strings specified by operators.

    token index

    The index of the token string in operators if the token string matches any of the elements in operators.

    If token string is any other string, this output returns -1. If the node reaches the end of the input string without finding any valid operator, this output returns -2.

    Token Definition

    Tokens are text segments that typically represent individual keywords, numeric values, or operators found when parsing a configuration file or other text-based data format. You specify what counts as a token through the data you pass to the delimiters and operators inputs. For example, because the space character is a delimiter by default, each word of "This is a string" is a token, and you can parse the sentence into its component words.

    Scanning Multiple Tokens

    You can use this node in a While Loop to scan multiple tokens. Refer to Parsing a String into Smaller Pieces for more information.
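
    The loop pattern can be sketched in Python as a rough emulation. The functions scan_token and scan_all are hypothetical and only approximate the behavior described on this page: the sketch treats offsets as character indices (identical to byte offsets for ASCII text), ignores format specifiers and the allow empty tokens? input, and returns -2 as the token index when no token is left, which is one reading of the description of that output.

```python
DEFAULT_DELIMITERS = (" ", "\t", "\n", "\r")

def longest_match(text, pos, candidates):
    """Return the longest candidate string that starts at pos, or None."""
    best = None
    for candidate in candidates:
        if candidate and text.startswith(candidate, pos):
            if best is None or len(candidate) > len(best):
                best = candidate
    return best

def skip_delimiters(text, pos, delimiters):
    """Advance pos past any run of delimiters."""
    while pos < len(text):
        delimiter = longest_match(text, pos, delimiters)
        if delimiter is None:
            break
        pos += len(delimiter)
    return pos

def scan_token(text, offset=0, operators=(), delimiters=DEFAULT_DELIMITERS):
    """Return (offset_past_token, token_string, token_index) for the next token."""
    if offset < 0 or offset > len(text):
        return -1, "", -2
    start = skip_delimiters(text, offset, delimiters)
    if start >= len(text):
        return -1, "", -2  # assumption: no token left to scan
    operator = longest_match(text, start, operators)
    if operator is not None:
        end = start + len(operator)
        token, index = operator, list(operators).index(operator)
    else:
        # Ordinary token: runs until the next delimiter or operator.
        end = start
        while (end < len(text)
               and longest_match(text, end, delimiters) is None
               and longest_match(text, end, operators) is None):
            end += 1
        token, index = text[start:end], -1
    past = skip_delimiters(text, end, delimiters)
    return (past if past < len(text) else -1), token, index

def scan_all(text, operators=(), delimiters=DEFAULT_DELIMITERS):
    """Call scan_token repeatedly, feeding each returned offset back in."""
    tokens = []
    offset = 0
    while offset != -1:
        offset, token, _index = scan_token(text, offset, operators, delimiters)
        if token:
            tokens.append(token)
    return tokens

print(scan_all("This is a string"))  # ['This', 'is', 'a', 'string']
```

    The loop stops when the returned offset is -1, mirroring the way offset past token signals the end of the input string.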

    Behavior When Matching Multiple Operators

    If a portion of the input string matches more than one defined operator, the node chooses the longest match as a token. For example, if >, =, and >= are defined operators, scanning the input string 4>=0 from an offset of 1 produces >= as the token string rather than >.
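
    To see why the longest-match rule matters, the following Python sketch contrasts it with a naive strategy that takes the first listed operator. Both functions are hypothetical illustrations, not the node's implementation.

```python
def first_match(text, pos, operators):
    """Naive strategy: return the first listed operator that starts at pos."""
    for op in operators:
        if op and text.startswith(op, pos):
            return op
    return None

def longest_match(text, pos, operators):
    """Longest-match strategy: prefer the longest operator that starts at pos."""
    matches = [op for op in operators if op and text.startswith(op, pos)]
    return max(matches, key=len) if matches else None

operators = [">", "=", ">="]
print(first_match("4>=0", 1, operators))    # >
print(longest_match("4>=0", 1, operators))  # >=
```

    With the naive strategy, the result would depend on the order of the elements in operators; the longest-match rule removes that dependency.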

    Interpreting Numbers as Tokens

    If you want to interpret a series of digits as a token that represents a number, include a format specifier in the list of values for the operators input. For example, including %b as one of the elements in operators causes the node to interpret a string of 1s and 0s as a binary number and return it as a token after encountering any character that is not a 1 or 0.

    If you include a format specifier in operators along with the strings + or -, the node does not recognize leading, or unary, + and - signs. The node always returns them as separate tokens. For example, if input string contains -5 and operators includes [%d, -], token string returns [-, 5] instead of [-5]. This is an exception to the "longest match" rule.
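
    As a rough illustration, the integer format specifiers can be modeled in Python as patterns that accept a run of digits in the corresponding base. The mapping SPECIFIER_PATTERNS and the function match_number are hypothetical, and the exact grammar the node accepts (for example, prefixes or digit grouping) is an assumption here. Because none of the patterns includes a sign, a leading - is left for a separate token, consistent with the exception described above.

```python
import re

# Hypothetical mapping from a format specifier to the digit run it accepts.
SPECIFIER_PATTERNS = {
    "%b": re.compile(r"[01]+"),
    "%o": re.compile(r"[0-7]+"),
    "%d": re.compile(r"[0-9]+"),
    "%x": re.compile(r"[0-9A-Fa-f]+"),
}

def match_number(text, pos, specifier):
    """Return the digit run matching specifier at pos, or None if there is none."""
    match = SPECIFIER_PATTERNS[specifier].match(text, pos)
    return match.group(0) if match else None

print(match_number("1011+2", 0, "%b"))  # 1011 (stops at the first character that is not 0 or 1)
print(match_number("-5", 0, "%d"))      # None (the sign is not part of the number)
print(match_number("-5", 1, "%d"))      # 5
```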

    If you place the node in a While Loop, the node returns the following values.

    input string:   4>=0
    operators:      [>, =, >=]
    delimiters:     \s, \t, \r, \n (default)
    token string:   [4, >=, 0]
    Comments:       If a portion of the input string matches more than one defined operator, Scan String for Tokens chooses the longest match as a token.

    input string:   a==b
                    c!=d
    operators:      [==, !=]
    delimiters:     \s, \t, \r, \n (default)
    token string:   [a, ==, b, c, !=, d]

    input string:   G2 X0.5Y1.0 i0.5j0 z-0.05
    operators:      [X, Y, Z, i, j, z]
    delimiters:     \s, \t, \r, \n (default)
    token string:   [G2, X, 0.5, Y, 1.0, i, 0.5, j, 0, z, -0.05]
    Comments:       This is an example of a string of G-code, a language commonly used for machine control. This string describes a circle.

    input string:   C1_1.11C2_2.22C3_3.33
    operators:      None
    delimiters:     C, _ (add to delimiters array), \s, \t, \r, \n (default)
    token string:   [1, 1.11, 2, 2.22, 3, 3.33]
    Comments:       This is an example of a string from a DAQ log with three channels.

    Where This Node Can Run:

    Desktop OS: Windows

    FPGA: DAQExpress does not support FPGA devices

