Class RegEx() Foundation
Class for regular expression processing with PCRE compatibility.
The RegEx class implements PCRE (Perl Compatible Regular Expressions) for text processing tasks. It supports pattern matching, text searching and replacement, string tokenization, and data extraction using standard PCRE syntax.
Pattern matching is provided by :match() for detailed results including capture groups, :test() for a quick check whether a pattern matches, and :matchAll() for extracting all occurrences from a text. Matching behavior is controlled through configuration methods such as :setIgnoreCase(), :setMultiline(), and :setUnicode(), which adjust case sensitivity, anchor handling, and character encoding, respectively. Helper methods such as :getMatchText() and :getMatchPos() simplify the extraction of the matched text and its position from a match result.
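As a minimal sketch of how capture groups might be read from a :match() result, the following example splits an ISO date into its parts. It uses only calls that appear in the examples on this page (:create(), :test(), :match(), and :getMatchText() with an optional group index). The pattern, the subject string, and the variable names are illustrative.

```xbase
PROCEDURE Main()
   LOCAL oRegEx, aMatch, cSubject

   cSubject := "Released on 2024-03-15"

   // Group 1: year, group 2: month, group 3: day
   oRegEx := RegEx():create("(\d{4})-(\d{2})-(\d{2})")

   IF oRegEx:test( cSubject )
      aMatch := oRegEx:match( cSubject )

      // Without a group index, the entire match is returned
      ? "Date :", RegEx():getMatchText( aMatch )

      // With a group index, the text of that capture group
      ? "Year :", RegEx():getMatchText( aMatch, 1 )
      ? "Month:", RegEx():getMatchText( aMatch, 2 )
      ? "Day  :", RegEx():getMatchText( aMatch, 3 )
   ENDIF

   // Free the compiled pattern
   oRegEx:destroy()
RETURN
```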
Text manipulation is accomplished through :replace() for literal substitutions, :replaceCallback() for dynamic replacements using callback functions, and :split() for dividing strings at match locations.
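The following sketch illustrates how :split() could be used to tokenize a list with inconsistent separators. The signature is an assumption: it is taken for granted here that :split() accepts the subject string and returns an array of the pieces between the matches. The exact parameters and return value should be checked in the method reference.

```xbase
PROCEDURE Main()
   LOCAL oRegEx, aParts, i

   // Separator: a comma or semicolon with optional surrounding blanks
   oRegEx := RegEx():create("\s*[,;]\s*")

   // ASSUMPTION: :split() takes the subject string and returns
   // an array with the substrings between the matches
   aParts := oRegEx:split("red, green;blue ,  yellow")

   FOR i := 1 TO Len(aParts)
      ? aParts[i]
   NEXT

   // Free the compiled pattern
   oRegEx:destroy()
RETURN
```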
Batch processing is available through the methods :batchMatch(), :batchTest(), and :batchReplace(), which compile a pattern once and then apply it to multiple subject (input) strings.
By default, the RegEx class compiles a pattern automatically on first use. If this lazy compilation is not desired, either because the validity of the pattern must be ensured at a certain point in time or because the compilation cost should be moved out of the code that executes the pattern, the method :precompile() can be used.
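A minimal sketch of moving the compilation ahead of a loop might look as follows. It assumes that :precompile() is called without parameters; its return value is not used here because it is not described on this page. The pattern and the sample data are illustrative.

```xbase
PROCEDURE Main()
   LOCAL oRegEx, aLines, i

   aLines := { "id=17", "name=test", "id=204" }

   oRegEx := RegEx():create("^id=(\d+)$")

   // ASSUMPTION: :precompile() is called without parameters.
   // The pattern is compiled here instead of on the first
   // call to :test() inside the loop.
   oRegEx:precompile()

   FOR i := 1 TO Len(aLines)
      IF oRegEx:test( aLines[i] )
         // Print the digits captured by group 1
         ? RegEx():getMatchText( oRegEx:match( aLines[i] ), 1 )
      ENDIF
   NEXT

   // Free the compiled pattern
   oRegEx:destroy()
RETURN
```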
Execution speed can be improved with the :optimize() method. Pattern execution then occurs either using bytecode or native machine code depending on the optimization method that is used.
Dedicated constructor methods provide pre-configured RegEx instances for common validation scenarios including email addresses, URLs, and IP addresses. For single-use operations, convenience class methods like :quickMatch() and :quickTest() perform pattern operations without requiring object creation.
Error information from compilation and matching operations is accessible through :getLastError() and related methods.
When a RegEx instance is no longer needed, the method :destroy() must be called to clear the pattern and free the compiled data. Otherwise, this memory remains allocated, which results in a memory leak.
//
// Use the method :match() to find a number in a string
//
PROCEDURE Main()
   LOCAL oRegEx, aMatch

   // Match numbers
   oRegEx := RegEx():create("[0-9]+")
   aMatch := oRegEx:match("The answer is 42")

   // Output:
   // 42
   ? RegEx():getMatchText(aMatch)

   // Free the compiled pattern
   oRegEx:destroy()
RETURN
//
// Use the class method :getMatchText() to extract the
// text from a match result
//
PROCEDURE Main()
   LOCAL oRegEx, cSubject, cText

   cSubject := "Contact: john.doe@example.com"
   oRegEx := RegEx():createEmail()

   // Output:
   // Valid email: john.doe@example.com
   IF oRegEx:test( cSubject )
      cText := RegEx():getMatchText( oRegEx:match( cSubject ) )
      ? "Valid email:", cText
   ELSE
      ? "No valid email"
   ENDIF

   // Free the compiled pattern
   oRegEx:destroy()
RETURN
//
// Use two RegEx() objects to extract information
// from a string
//
PROCEDURE Main()
   LOCAL i, aRaw, aCustomers

   aRaw := { ;
      "John (555) 123-4567 john@example.com", ;
      "Jane 001 555 987-6543 jane@example.com", ;
      "Bob invalid-phone bob@test.com", ;
      "Scott +1 (555) 223-6543 scott@example.com" ;
   }

   aCustomers := ImportCustomerData(aRaw)

   // Output:
   // Imported 3 of 4 records
   // John 5551234567 john@example.com
   // Jane +15559876543 jane@example.com
   // Scott +15552236543 scott@example.com
   FOR i := 1 TO Len(aCustomers)
      ? aCustomers[i][1], aCustomers[i][2], aCustomers[i][3]
   NEXT
RETURN

FUNCTION ImportCustomerData(aRawData)
   LOCAL oRXPhone, oRXEmail, aMatch
   LOCAL nRaw, cLine, cName, cEmail
   LOCAL cNormPhone, cPhonePattern
   LOCAL aCustomers

   // Simple US and Canada phone number expression
   cPhonePattern := "(?<!\d)"    + ; // # No digit comes before
                    "(\+1|001)?" + ; // # Group 1: +1 or 001
                    "[\s.-]?"    + ; // # Optional separator
                    "\(?"        + ; // # Optional opening parenthesis
                    "(\d{3})"    + ; // # Group 2: Three digits
                    "\)?"        + ; // # Optional closing parenthesis
                    "[\s.-]?"    + ; // # Optional separator
                    "(\d{3})"    + ; // # Group 3: Three digits
                    "[\s.-]?"    + ; // # Optional separator
                    "(\d{4})"    + ; // # Group 4: Four digits
                    "(?!\d)"         // # No digit follows

   // Factory methods
   oRXEmail := RegEx():createEmail()
   oRXPhone := RegEx():create(cPhonePattern)

   aCustomers := {}

   FOR nRaw := 1 TO Len(aRawData)
      cLine := aRawData[nRaw]

      // Reset per record so that no value is carried over
      // from the previous line
      cNormPhone := ""
      cEmail     := ""

      // Extract name (first word)
      cName := Left(cLine, At(" ", cLine) - 1)

      // Extract phone and normalize the country prefix to +1
      IF oRXPhone:test( cLine )
         aMatch := oRXPhone:match( cLine )
         cNormPhone := RegEx():getMatchText( aMatch, 1 )
         cNormPhone := StrTran( cNormPhone, "001", "+1" )
         cNormPhone += RegEx():getMatchText( aMatch, 2 )
         cNormPhone += RegEx():getMatchText( aMatch, 3 )
         cNormPhone += RegEx():getMatchText( aMatch, 4 )
      ENDIF

      // Extract email
      IF oRXEmail:test( cLine )
         cEmail := RegEx():getMatchText( oRXEmail:match(cLine) )
      ENDIF

      // Add customer if valid
      IF !Empty(cNormPhone) .AND. !Empty(cEmail)
         AAdd(aCustomers, {cName, cNormPhone, cEmail})
      ENDIF
   NEXT

   ? "Imported", Var2Char(Len(aCustomers)), "of", ;
     Var2Char(Len(aRawData)), "records"

   // Free the compiled patterns
   oRXPhone:destroy()
   oRXEmail:destroy()
RETURN aCustomers