:GREP_PROCESS, GREP_PROCESS
Use GREP_PROCESS to filter the lines of a text container, similar to the grep command. This scripting element operates on process handles, supports regular expressions, and can restrict the search to a specific column.
Note: The terms Data Sequences and Text Containers are used interchangeably.
This scripting element can be used both as a script statement and as a script function:
-
When used as a script statement, it modifies the text container of the input handle and provides the result in that same text container. This is useful for stacking commands that modify a text container, because no new process handles have to be allocated.
-
When used as a script function, the text container of the process handle is provided read-only, and the result is provided in a new text container as a return value of the function.
Syntax
When used as a script statement:
:GREP_PROCESS Handle[, Regex[, Column[, Flags]]]
When used as a script function:
GREP_PROCESS (Handle[, Regex[, Column[, Flags]]])
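The difference between the two forms can be illustrated with a minimal sketch. It assumes that &HND_TEXT# is a process handle that already exists (for instance, one created with PREP_PROCESS_FILE); the variable names and the pattern "ERROR" are purely illustrative.

```
!* Function form: &HND_TEXT# is left unchanged, the matching lines go to a new handle
:SET &HND_ERRORS# = GREP_PROCESS(&HND_TEXT#, "ERROR")
:SET &ERROR_ROWS# = GET_PROCESS_INFO(&HND_ERRORS#, ROWS)
:PRINT &ERROR_ROWS# MATCHING LINES
!* Statement form: &HND_TEXT# itself is reduced to the matching lines
:GREP_PROCESS &HND_TEXT#, "ERROR"
:SET &REMAINING_ROWS# = GET_PROCESS_INFO(&HND_TEXT#, ROWS)
:PRINT &REMAINING_ROWS# LINES LEFT
```

The function form is the safer choice when the original text container is still needed later in the script; the statement form suits a chain of successive filters on the same handle.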
Parameters
-
Handle
Process handle of a text container
Format: script literal or script variable
-
Regex
(Optional) Regular expression to be used.
The regular expression uses the Boost C++ Libraries syntax. For more information, see the official Boost C++ Libraries documentation.
-
Column
(Optional) Index of the column to search, beginning with 1. Use 0 to search the entire line.
If columns were defined when the text container (data sequence) was loaded, you can select a specific column and have this script element work only within that column. To do so, make sure you know which column delimiter was used when the text container (data sequence) was loaded. Otherwise, you can define the delimiter using the "d" flag.
Format: script literal, script variable, or number without quotation marks. Specify the column number, or 0 for the entire line.
Default value: 0
Note: When you specify a column to be used for grepping, the grepped result still contains the full lines.
-
Flags
(Optional) Use flags to modify the operation. The flags available are:
-
i: Case-insensitive
By default, the search is case sensitive.
-
n: Do not match null (for example, with "\d*")
The "*" quantifier matches zero or more occurrences of a pattern and tries to find as many occurrences as possible. As a result, the expression can also match zero occurrences, that is, an empty string. Setting this flag avoids that behavior.
-
v: Inverted selection
Inverts the result. When provided, the output contains all lines that do not match.
-
x: Match whole line
When provided, the regular expression must match the whole line or string.
-
l: Print line numbers
When provided, the line number is printed before each line.
-
d: Delimiter that separates columns
Allows you to define a specific delimiter, which can be any text of at least one character, to virtually break up text into columns. The delimiter is enclosed by the character immediately following the "d". This overrides any default delimiter that the text container might have. For example:
d' ' for a single blank space
d'$$' for a sequence of two dollar signs
d+,+ for a comma
d!foo! for the word "foo"
Flags can be combined and provided in any order. For example, to combine flag i with flag n, specify "in" or "ni", either directly or through a script variable such as &FLAG#. The flag "d" differs from the other flags in that it takes an argument: the delimiter, which can be a sequence of several characters, used to separate columns in a text.
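The interplay of the Column parameter and the flags can be sketched as follows. This is only an illustration under stated assumptions: &HND_TEXT# is an existing process handle, the two sample lines in the comments are hypothetical, and it is assumed that the "d" flag may share one flag string with other flags when it is placed last.

```
!* Hypothetical content of &HND_TEXT#:
!*   srv01,ERROR,disk full
!*   srv02,ok,error count is zero
!* Search only column 2, split at commas (d+,+), ignoring case (i)
:SET &HND_ERRORS# = GREP_PROCESS(&HND_TEXT#, "error", 2, "id+,+")
:PROCESS &HND_ERRORS#
: SET &LINE# = GET_PROCESS_LINE(&HND_ERRORS#)
: PRINT &LINE#
:ENDPROCESS
```

Under these assumptions, only the first sample line would be expected to remain, because the word "error" in the second line appears in column 3 and not in column 2. As noted for the Column parameter, the result still contains the full line, not just the searched column.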
Examples
Grepping and modifying the content of a file:
!* Load text container from a file
:SET &HND_TEXT# = PREP_PROCESS_FILE(AR_MAIN_JWX6_VFRKARWIN01_01,"C:\Temp\ContentFile.txt",,,'UC_LOGIN=AR.LOGIN.OK')
!* 1.) Number all lines of the text
:SET &HND_NUMBERED_LINES# = GREP_PROCESS(&HND_TEXT#, "", , "l")
!* Output lines
:SET &NUMBER_LINES# = GET_PROCESS_INFO(&HND_NUMBERED_LINES#, ROWS)
:PRINT &NUMBER_LINES# LINES
:PROCESS &HND_NUMBERED_LINES#
: SET &LINE# = GET_PROCESS_LINE(&HND_NUMBERED_LINES#)
: PRINT &LINE#
:ENDPROCESS
!* 2.) Get rid of empty lines
:GREP_PROCESS &HND_TEXT#, "^[ ]*$", , "v"
!* Output lines
:SET &NUMBER_LINES# = GET_PROCESS_INFO(&HND_TEXT#, ROWS)
:PRINT &NUMBER_LINES# LINES
:PROCESS &HND_TEXT#
: SET &LINE# = GET_PROCESS_LINE(&HND_TEXT#)
: PRINT &LINE#
:ENDPROCESS
!* 3.) Keep only lines that start with an uppercase letter followed by at least one more letter
:GREP_PROCESS &HND_TEXT#, "^[[:upper:]][[:alpha:]]"
!* Output lines
:SET &NUMBER_LINES# = GET_PROCESS_INFO(&HND_TEXT#, ROWS)
:PRINT &NUMBER_LINES# LINES
:PROCESS &HND_TEXT#
: SET &LINE# = GET_PROCESS_LINE(&HND_TEXT#)
: PRINT &LINE#
:ENDPROCESS
!* 4.) Remove all words that follow a space and start with a lowercase letter
:MODIFY_PROCESS &HND_TEXT#, " [[:lower:]][[:alpha:]]*", "", , "g"
!* Output lines
:SET &NUMBER_LINES# = GET_PROCESS_INFO(&HND_TEXT#, ROWS)
:PRINT &NUMBER_LINES# LINES
:PROCESS &HND_TEXT#
: SET &LINE# = GET_PROCESS_LINE(&HND_TEXT#)
: PRINT &LINE#
:ENDPROCESS
!* 5.) Remove all punctuation characters (letters, digits, and spaces are kept)
:MODIFY_PROCESS &HND_TEXT#, "[[:punct:]]", "", , "g"
!* Output lines
:SET &NUMBER_LINES# = GET_PROCESS_INFO(&HND_TEXT#, ROWS)
:PRINT &NUMBER_LINES# LINES
:PROCESS &HND_TEXT#
: SET &LINE# = GET_PROCESS_LINE(&HND_TEXT#)
: PRINT &LINE#
:ENDPROCESS
!* 6.) Swap the first and the last word in each line of that text container
:MODIFY_PROCESS &HND_TEXT#, "^([[:alpha:]]+)( (.* )?)([[:alpha:]]+)$", "$4$2$1"
!* Output lines
:SET &NUMBER_LINES# = GET_PROCESS_INFO(&HND_TEXT#, ROWS)
:PRINT &NUMBER_LINES# LINES
:PROCESS &HND_TEXT#
: SET &LINE# = GET_PROCESS_LINE(&HND_TEXT#)
: PRINT &LINE#
:ENDPROCESS