Perl: RegEX: Capture group multiple times - regex

I'm developing a piece of code to filter a text as follows:
<DATA>
.SUBCKT SVI A B C D E F
+ G H I
+ J K L
.....
+ X Y Z
*.PININFO AA BB CC
*.PININFO DD EE FF
<DATA>
I need the output to be
A B C D E F
G H I
J K L
.....
X Y Z
I already made a regular expression to do so:
m/\.SUBCKT\s+SVI\s(.*)|\+(.*)/gm
The problem is that the input contains many similar sections, and I only need the + lines that follow the .SUBCKT SVI header, not those under any other header.
How can I match a repeated group such as (\+\s+(.*)) and capture every repetition, however many times it occurs?
Any advice on how to write this expression would be appreciated.

Perhaps this is closer to what you need.
m/\.SUBCKT\s+SVI\s(.*)\n(\+\s+(.*)\n)*/gm
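One caveat with a repeated capturing group such as (\+\s+(.*)\n)* is that the group keeps only the text of its last repetition. A common workaround is to capture the whole run of + lines as one group and split it afterwards. The sketch below shows that idea in Python 3 rather than Perl; the sample text (including the made-up .SUBCKT OTHER block) and the names BLOCK and svi_pins are assumptions for illustration only.

```python
import re

# Sample input modelled on the question. The ".SUBCKT OTHER" block is a
# hypothetical extra section, added to show that its "+" lines are ignored.
TEXT = """.SUBCKT SVI A B C D E F
+ G H I
+ J K L
*.PININFO AA BB CC
.SUBCKT OTHER P Q
+ R S
"""

# Group 1: the rest of the header line. Group 2: the whole run of "+" lines.
BLOCK = re.compile(r"^\.SUBCKT[ \t]+SVI[ \t]+(.*)\n((?:\+.*\n)*)", re.MULTILINE)


def svi_pins(text):
    """Return one pin string per header or continuation line of each SVI block."""
    rows = []
    for match in BLOCK.finditer(text):
        rows.append(match.group(1).strip())
        # Split the captured run into lines and drop the leading "+".
        for line in match.group(2).splitlines():
            rows.append(line.lstrip("+").strip())
    return rows


print(svi_pins(TEXT))
```

The same two-step approach (capture the run, then split it) carries over to Perl, where the split can be done on the captured block after the match.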

Does this do what you want? Note that it stops at the ..... because it doesn't begin with a + or .SUBCKT
It won't handle the case where a range of + lines is immediately followed by another .SUBCKT line; is that a problem?
use strict;
use warnings;
while ( <DATA> ) {
    next unless my $in_range = s/^\.SUBCKT\s+// ... /^[^+]/;
    next if $in_range =~ /E/;
    s/^\S+\s+//;
    print;
}
__DATA__
<DATA>
.SUBCKT SVI A B C D E F
+ G H I
+ J K L
.....
+ X Y Z
*.PININFO AA BB CC
*.PININFO DD EE FF
<DATA>
output
A B C D E F
G H I
J K L
Update
Here's a state machine version that deals with the special case described above
use strict;
use warnings;
my $state;
while ( <DATA> ) {
    if ( /^\.SUBCKT\s+\S+\s+(.+)/ ) {
        $state = 1;
        print $1, "\n";
    }
    elsif ( /^\+\s+(.+)/ ) {
        print $1, "\n" if $state;
    }
    else {
        $state = 0;
    }
}
__DATA__
<DATA>
.SUBCKT SVI A B C D E F
+ G H I
+ J K L
.SUBCKT SVI A B C D E F
+ M N O
+ P Q R
*.PININFO AA BB CC
*.PININFO DD EE FF
<DATA>
output
A B C D E F
G H I
J K L
A B C D E F
M N O
P Q R
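For comparison, here is a minimal sketch of the same state machine in Python 3. Unlike the Perl version above, which accepts any subcircuit name, this sketch also checks the name after .SUBCKT, since the question only wants SVI blocks; that check and the function name subckt_pins are assumptions added for illustration.

```python
def subckt_pins(lines, name="SVI"):
    """Collect pins from each '.SUBCKT <name>' header and the '+' lines after it."""
    rows = []
    in_block = False
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == ".SUBCKT":
            # A header line: enter the block only if the name matches.
            in_block = fields[1] == name
            if in_block:
                rows.append(" ".join(fields[2:]))
        elif fields and fields[0] == "+":
            # A continuation line counts only while inside a wanted block.
            if in_block:
                rows.append(" ".join(fields[1:]))
        else:
            # Any other line ends the current block.
            in_block = False
    return rows


sample = [
    ".SUBCKT SVI A B C D E F",
    "+ G H I",
    "+ J K L",
    ".SUBCKT SVI A B C D E F",
    "+ M N O",
    "+ P Q R",
    "*.PININFO AA BB CC",
]
for row in subckt_pins(sample):
    print(row)
```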

I made use of @shawnt00's answer and modified the regular expression, and it did the job.
\.SUBCKT\s+SVI_TRX201TH\s(.*\n(\+\s+.*\n)*)

Related

Excel nesting - IF / AND Query part two?

Hi, I had a query earlier and thought I had cracked it with Richard's help, but it appears I haven't.
I have attached an image and described what I am trying to achieve to make my query clearer.
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 111 then G will populate with the contents of C
* If E is no and F is set to anything but 111 then it will return 0
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 112 then H will populate with the contents of C
* If E is no and F is set to anything but 112 then it will return 0
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 118 then I will populate with the contents of C
* If E is no and F is set to anything but 118 then it will return 0
* If E is correct then cell F will be set to match D manually
* If E is yes and F is set to 119 then J will populate with the contents of C
* If E is no and F is set to anything but 119 then it will return 0
It's not 100% clear, but sounds like this is what you're after:
F2 = =IF(E2="Yes",IF(OR(D2=111,D2=112,D2=118,D2=119)=TRUE,D2,""),"")
G2 = =IF(AND(E2="Yes",F2=111)=TRUE,C2,"")
H2 = =IF(AND(E2="Yes",F2=112)=TRUE,C2,"")
I2 = =IF(AND(E2="Yes",F2=118)=TRUE,C2,"")
J2 = =IF(AND(E2="Yes",F2=119)=TRUE,C2,"")
Then just fill down. I've put "" instead of 0, because it's a lot easier to see what's going on without zeros everywhere. You can change them back once you're happy with the outcome.
Incidentally, sometimes a formula is easier to read when it is split out. Excel works fine if you have the formula on different lines, like the following for F2:
=
IF(
    E2="Yes",
    IF(
        OR(
            D2=111,D2=112,D2=118,D2=119
        )=TRUE,
        D2,
        ""
    ),
    ""
)

Regex - Summing values in a string

We get data from another company in the following formats
374-KH-ON-PEAK|807-KH-OFF-PEAK
82.5-KH-TOTAL|8-K1-CURRENT
44.5-KH-TOTAL
65-KH-ON-PEAK|2.1-K1-ON-PEAK|164-KH-OFF-PEAK|27-K1
These values go into a SQL Server table. The numbers represent electricity usages. I'm working on finding a way to extract the numbers and sum them together.
There is only one condition: the number must be followed by "-KH". If it is followed by "-K1" we don't need to do anything with it.
Given the input "65-KH-ON-PEAK|2.1-K1-ON-PEAK|164-KH-OFF-PEAK|27-K1", I need to output 229, which is 65 + 164.
I'd prefer a solution using VBA for Access (for reasons related to the business's current software), but I'm open to other solutions as well.
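To make the requirement concrete, here is a minimal sketch of the rule in Python 3 (not VBA, so it only illustrates the logic): take every number that is immediately followed by -KH and add them up. The names KH_VALUE and sum_kh are made up for this example.

```python
import re

# A number (integer or decimal) immediately followed by "-KH".
KH_VALUE = re.compile(r"(\d+(?:\.\d+)?)-KH")


def sum_kh(record):
    """Sum every usage value in `record` that is tagged with -KH."""
    return sum(float(value) for value in KH_VALUE.findall(record))


records = [
    "374-KH-ON-PEAK|807-KH-OFF-PEAK",
    "82.5-KH-TOTAL|8-K1-CURRENT",
    "44.5-KH-TOTAL",
    "65-KH-ON-PEAK|2.1-K1-ON-PEAK|164-KH-OFF-PEAK|27-K1",
]
for record in records:
    print(record, "->", sum_kh(record))
```

Values tagged -K1 never match the pattern, so they are skipped without a separate check.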
In Excel it can be done like this:
code:
Sub test()
    Dim cl As Range, z!, x As Variant, x2 As Variant
    For Each cl In [A1:A4]
        z = 0
        For Each x In Split(cl.Value2, "|")
            If x Like "*-KH-*" Then
                For Each x2 In Split(x, "-")
                    If IsNumeric(x2) Then z = z + x2
                Next x2
            End If
        Next x
        cl.Offset(, 1).Value = z
    Next cl
End Sub
Another variant, without the second loop (using @shawnt00's comment below the question):
Sub test()
    Dim cl As Range, z!, x As Variant
    For Each cl In [A1:A4]
        z = 0
        For Each x In Split(cl.Value2, "|")
            If x Like "*-KH-*" Then z = z + Left(x, InStr(1, x, "-") - 1)
        Next x
        cl.Offset(, 1).Value = z
    Next cl
End Sub
In Access it can be something like this:
Sub test2()
    Dim z!, x As Variant
    Dim rs As DAO.Recordset
    Set rs = CurrentDb.OpenRecordset("SELECT * FROM Table1")
    Do Until rs.EOF = True
        z = 0
        For Each x In Split(rs!Field1, "|")
            If x Like "*-KH-*" Then z = z + Left(x, InStr(1, x, "-") - 1)
        Next x
        Debug.Print rs!Field1, z
        rs.MoveNext
    Loop
End Sub
You would do a single bulk insert into an SQL Server table using | as the field terminator, so you would have fields like f1,f2,f3,f4. Then you can use an expression like:
WITH numerics
AS ( SELECT CASE
                WHEN PATINDEX('%-KH-%', f1) > 0
                -- DECIMAL rather than INT, because values such as 82.5 are not integers
                THEN CAST(SUBSTRING(f1, 1, PATINDEX('%-KH-%', f1) - 1) AS DECIMAL(18, 2))
                ELSE 0
            END AS f1,
            CASE
                WHEN PATINDEX('%-KH-%', f2) > 0
                THEN CAST(SUBSTRING(f2, 1, PATINDEX('%-KH-%', f2) - 1) AS DECIMAL(18, 2))
                ELSE 0
            END AS f2,
            CASE
                WHEN PATINDEX('%-KH-%', f3) > 0
                THEN CAST(SUBSTRING(f3, 1, PATINDEX('%-KH-%', f3) - 1) AS DECIMAL(18, 2))
                ELSE 0
            END AS f3,
            CASE
                WHEN PATINDEX('%-KH-%', f4) > 0
                THEN CAST(SUBSTRING(f4, 1, PATINDEX('%-KH-%', f4) - 1) AS DECIMAL(18, 2))
                ELSE 0
            END AS f4
     FROM myTable )
SELECT f1 + f2 + f3 + f4 AS rowTotal
FROM numerics;
You could do it with a PowerShell script, which gives you the power of regex to extract and sum the numbers. Something like the example below (I have tested the part that extracts from the file but not the Access parts, so they may need some tweaking):
$conn = New-Object -ComObject ADODB.Connection
# Microsoft.Jet.OLEDB.4.0 for older versions of Access
$conn.Open("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\\path_to\database.accdb")
(Select-String file.txt -Pattern '[\d.]+(?=-KH)' -AllMatches) | % {
    # Sum every -KH value found on this line
    ($_.Matches | % {
        [double]$_.Value
    } | Measure-Object -Sum).Sum
} | % {
    # TABLE is a placeholder for the target table name
    [void]$conn.Execute("INSERT INTO TABLE VALUES($($_))")
}
$conn.Close()

Extracting columns with a difference in aligned data

I have some aligned data (bioinformatics related) like so:
reference_string = 'yearning'
string2 = 'learning'
string3 = 'aligning'
I need to extract only columns showing differences in relation to the reference data.
The output should show only positional information of the columns containing differences in relation to the reference string and the corresponding reference item.
1 2 3 4
y e a r
l
a l i g
My current code does most things okay except that it also reports columns with no difference.
string1 = 'yearning'
string2 = 'learning'
string3 = 'aligning'
string_list = [string1, string2, string3]
reference = reference_string
diffs_top, diffs = [], []
all_diffs = set()
for s in string_list:
    diffs = []
    for i, c in enumerate(s):
        if s[i] != reference[i]:
            diffs.append(i)
            all_diffs.add(i)
    diffs_top.append(diffs)
for d in all_diffs:
    print str(int(d+1)),
print
for c in reference:
    print str(c),
print
for i, s in enumerate(string_list):
    for j, c in enumerate(s):
        if j in diffs_top[i]:
            print str(c),
        else:
            print str(' '),
    print
This code would give:
1 2 3 4
y e a r n i n g
l
a l i g
Any help appreciated.
EDIT: I have picked a section of real data to make the problem as clear as possible, along with my attempt at solving it so far:
reference_string = 'MAHEWGPQRLAGGQPQAS'
string1 = 'MAQQWSLQRLAGRHPQDS'
string2 = 'MAQRWGAHRLTGGQLQDT'
string3 = 'MAQRWGPHALSGVQAQDA'
string_list = [string1, string2, string3]
reference = reference_string
diffs_top, diffs = [], []
all_diffs = set()
for s in string_list:
    diffs = []
    for i, c in enumerate(s):
        if s[i] != reference[i]:
            diffs.append(i)
            all_diffs.add(i)
    diffs_top.append(diffs)
#print diffs_top
#print all_diffs
for d in all_diffs:
    print str(int(d+1)), # retains natural positions of the reference residues
print
for d in all_diffs:
    for i, c in enumerate(reference):
        if i == d:
            print c,
print
The print out will be an output showing the position at which there is any difference to other non-reference strings and the corresponding reference letter.
3 4 6 7 8 9 11 13 14 15 17 18
H E G P Q R A G Q P A S
The next step is to write code that processes the non-reference strings by printing each difference from the reference at that position. Where there is no difference, it leaves a blank (' ').
Doing it manually the output will be:
3 4 6 7 8 9 11 13 14 15 17 18
H E G P Q R A G Q P A S
Q Q S L R H D
Q R A H T L D T
Q R H A S V A D A
My entire code, as an attempt to get to the solution above, has been messy to say the least:
reference_string = 'MAHEWGPQRLAGGQPQAS'
string1 = 'MAQQWSLQRLAGRHPQDS'
string2 = 'MAQRWGAHRLTGGQLQDT'
string3 = 'MAQRWGPHALSGVQAQDA'
string_list = [string1, string2, string3]
reference = reference_string
diffs_top, diffs = [], []
all_diffs = set()
for s in string_list:
    diffs = []
    for i, c in enumerate(s):
        if s[i] != reference[i]:
            diffs.append(i)
            all_diffs.add(i)
    diffs_top.append(diffs)
#print diffs_top
#print all_diffs
for d in all_diffs:
    print str(int(d+1)),
print
for d in all_diffs:
    for i, c in enumerate(reference):
        if i == d:
            print c,
print
# this is my attempt to look into non-reference strings
# to check for the difference with the reference, and print an output.
for d in all_diffs:
    for i, s in enumerate(string_list):
        for j, c in enumerate(s):
            if j == d:
                print c,
            else:
                print str(' '),
        print
Your code is working exactly as written (as per your logic).
What happens is this: while printing the rows, Python looks up each string's entry in the diffs_top list. The first string in string_list is identical to the reference string, so nothing was stored for it in diffs_top, and Python prints only blank spaces for that row.
1 2 3 4
y e a r n i n g #prints the reference string, because you've coded in that way
#prints blank as string_list[0] and reference string are the same
l
a l i g
The question here is how exactly you define the difference for the reference string.
Besides, I also found a fundamental flaw in your implementation. If you run your code with string_list[1] as the reference string, you get this output:
1 2 3 4
l e a r n i n g
y
a l i g
Is this what you need? Please spend some time properly defining the difference for all cases, and then try to implement your code.
EDIT:
As per your updated requirements, replace the last block in your code with this:
for i, s in enumerate(string_list):
    for d in all_diffs:
        if d in diffs_top[i]:
            print s[d],
        else:
            print ' ',
    print
Cheers!
I think there is a general problem in your logic. If you need to extract only columns showing difference in relation to the reference data and string1 is the reference the output should be:
1 2 3 4
l
a l i g
So, 'yearning' shouldn't show any character because it has no difference to string1.
If you delete or comment out the following lines, you will get exactly what I expect is the right answer:
#for c in reference:
# print str(c),
#print
Consider reviewing your logic if this solution is not what you actually want.
Update
Here is a shorter solution which solves your task:
from itertools import compress, izip_longest
def delta(reference, string):
    return ['' if a == b else b for a, b in izip_longest(reference, string)]
ref_string = 'MAHEWGPQRLAGGQPQAS'
strings = ['MAQQWSLQRLAGRHPQDS',
           'MAQRWGAHRLTGGQLQDT',
           'MAQRWGPHALSGVQAQDA']
delta_strings = [delta(ref_string, string) for string in strings]
selectors = [1 if any(tup) else 0 for tup in izip_longest(*delta_strings)]
indices = [str(i+1) for i in range(len(selectors))]
output_data = [indices, ref_string] + delta_strings
for line in output_data:
    print ''.join(x.rjust(3) for x in compress(line, selectors))
Explanation:
I defined a function delta(reference, string) which returns the delta between the string and the reference string. For example, delta("ABFF", "AECF") returns the list ['', 'E', 'C', ''].
The variable delta_strings holds the deltas between each string in the list strings and the reference string ref_string.
The variable selectors is a list containing only 1 and 0 values, where 1 marks a column that should be printed and 0 marks a column that should be skipped.
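The code above targets Python 2 (izip_longest, the print statement). Here is a minimal Python 3 sketch of the same delta-and-selectors idea. It assumes all aligned strings have the same length, so plain zip is used instead of izip_longest; the function name diff_rows is made up for this example.

```python
from itertools import compress


def delta(reference, string):
    """Per column: '' where string agrees with reference, else string's character."""
    return ["" if a == b else b for a, b in zip(reference, string)]


def diff_rows(reference, strings):
    """Build the position, reference and per-string rows, keeping only differing columns."""
    deltas = [delta(reference, s) for s in strings]
    # A column is selected when at least one string differs from the reference there.
    selectors = [any(column) for column in zip(*deltas)]
    positions = [str(i + 1) for i in range(len(selectors))]
    rows = [positions, list(reference)] + deltas
    return ["".join(cell.rjust(3) for cell in compress(row, selectors)) for row in rows]


reference = "MAHEWGPQRLAGGQPQAS"
strings = ["MAQQWSLQRLAGRHPQDS", "MAQRWGAHRLTGGQLQDT", "MAQRWGPHALSGVQAQDA"]
for row in diff_rows(reference, strings):
    print(row)
```

Returning the rows as strings instead of printing inside the function keeps the formatting step separate, which makes the result easier to check or write to a file.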

G++ Warning: extra tokens at end of #include directive [enabled by default]

I can't find the problem. Does anyone know how to solve it?
Code
#include <algorithm>‎
int main(int argc, char* argv[]) {
return 0;
}
Warning
extra tokens at end of #include directive [enabled by default]
Looking at the code quoted above using od -c gives this output:
0000000 # i n c l u d e < a l g o r i
0000020 t h m > 342 200 216 \n i n t m a i n
0000040 ( i n t a r g c , c h a r *
0000060 a r g v [ ] ) { \n r
0000100 e t u r n 0 ; \n } \n
Note the bytes between the > and the \n: You probably want to get rid of them.
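The octal bytes 342 200 216 are E2 80 8E in hex, which is the UTF-8 encoding of U+200E (LEFT-TO-RIGHT MARK), an invisible character that often arrives with text copied from a web page. As a rough sketch, a few lines of Python 3 can locate such characters in a source file; the function name hidden_characters is made up for this example.

```python
def hidden_characters(source):
    """Return (line, column, code point) for every non-ASCII character in source."""
    found = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:
                found.append((line_number, column, "U+%04X" % ord(char)))
    return found


# The include line from the question, with the invisible character written
# as an escape sequence so that it can be seen here.
sample = "#include <algorithm>\u200e\nint main(int argc, char* argv[]) {\n"
print(hidden_characters(sample))
```

Deleting the reported character (or retyping the #include line by hand) should make the warning go away.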

Regex Replace on IBMi

I am looking for a way to use regex replace functions on IBM iSeries.
As far as I know, I can use the C/C++ library (regex.h).
With this, I can only match a regex, not replace
(using regcomp() to compile and regexec() to match the regex).
Does anyone know a way to do it?
It's true that the C/C++ POSIX regular expression library doesn't have a built in regexp replace function, but you can accomplish the same thing using positional information from regexec() and the RPGLE %replace() built in function. (I'm assuming you're going to use RPGLE but you could use another language.)
For example, if you wanted to mask all but the last four digits of a phone number you could do this:
/include qcpysrc,regex_h
d regex_phone_number...
d ds inz likeds(regex_t)
d dsrm ds inz likeds(regmatch_t) dim(20)
d data s 52a inz varying
d pattern s 256a inz varying
d rc s 10i 0 inz(0)
/FREE
*inlr = *on ;
data = 'My phone #''s are: (444) 555 - 6666 and 777.888.9999' ;
dsply data ;
pattern = '\(?([0-9]{3})[ .)]*([0-9]{3})[ .-]*([0-9]{4})' ;
rc = regcomp(regex_phone_number :pattern :REG_EXTENDED) ;
if rc = 0 ;
dow '1' ;
rc = regexec(regex_phone_number :data
:regex_phone_number.re_nsub :%addr(dsrm) :0) ;
if rc <> 0 ;
leave ;
endif ;
data = %replace('***': data :dsrm(2).rm_so+1
:dsrm(2).rm_eo - dsrm(2).rm_so) ;
data = %replace('***': data :dsrm(3).rm_so+1
:dsrm(3).rm_eo - dsrm(3).rm_so) ;
enddo ;
endif ;
dsply data ;
regfree(regex_phone_number) ;
/END-FREE
Here's what the copy book regex_h looks like:
** Header file for calling the "Regular Expression" functions
** provided by the ILE C Runtime Library from an RPG IV
** program. Scott Klement, 2001-05-04
** Converted to qualified DS 2003-11-29
** Modified by Jarrett Gilliam 2014-11-05
**
** This copy book is for using the C regular expression library, regex.h, in RPG.
** You can go to http://www.regular-expressions.info/ to learn more about
** regular expressions. This regex flavor is POSIX ERE. You can go to
** http://www-01.ibm.com/support/knowledgecenter/ssw_ibm_i_71/rtref/regexec.htm
** to learn more about how the C functions work.
d/if defined(REGEX_H)
d/eof
d/endif
d/define REGEX_H
**------------------------------------------------------------
* cflags for regcomp()
**------------------------------------------------------------
d REG_BASIC c CONST(0)
d REG_EXTENDED c CONST(1)
d REG_ICASE c CONST(2)
d REG_NEWLINE c CONST(4)
d REG_NOSUB c CONST(8)
**------------------------------------------------------------
* eflags for regexec()
**------------------------------------------------------------
d REG_NOTBOL c CONST(256)
d REG_NOTEOL c CONST(512)
**------------------------------------------------------------
* errors returned
**------------------------------------------------------------
* RE pattern not found
d REG_NOMATCH c CONST(1)
* Invalid Regular Expression
d REG_BADPAT c CONST(2)
* Invalid collating element
d REG_ECOLLATE c CONST(3)
* Invalid character class
d REG_ECTYPE c CONST(4)
* Last character is \
d REG_EESCAPE c CONST(5)
* Invalid number in \digit
d REG_ESUBREG c CONST(6)
* [ ] imbalance
d REG_EBRACK c CONST(7)
* \( \) or () imbalance
d REG_EPAREN c CONST(8)
* \{ \} or { } imbalance
d REG_EBRACE c CONST(9)
* Invalid \{ \} range exp
d REG_BADBR c CONST(10)
* Invalid range exp endpoint
d REG_ERANGE c CONST(11)
* Out of memory
d REG_ESPACE c CONST(12)
* ?*+ not preceded by valid RE
d REG_BADRPT c CONST(13)
* invalid multibyte character
d REG_ECHAR c CONST(14)
* (shift 6 caret or not) anchor and not BOL
d REG_EBOL c CONST(15)
* $ anchor and not EOL
d REG_EEOL c CONST(16)
* Unknown error in regcomp() call
d REG_ECOMP c CONST(17)
* Unknown error in regexec() call
d REG_EEXEC c CONST(18)
**------------------------------------------------------------
* Structure of a compiled regular expression:
**------------------------------------------------------------
d REG_SUBEXP_MAX c 20
d regex_t ds qualified align based(template)
d re_nsub 10i 0
d re_comp *
d re_cflags 10i 0
d re_erroff 10i 0
d re_len 10i 0
d re_ucoll 10i 0 dim(2)
d re_lsub * DIM(REG_SUBEXP_MAX)
d re_esub * DIM(REG_SUBEXP_MAX)
d re_map 256a
d re_shift 5i 0
d re_dbcs 5i 0
**------------------------------------------------------------
* structure used to report matches found by regexec()
**------------------------------------------------------------
d regmatch_t ds qualified align based(template)
d rm_so 10i 0
d rm_ss 5i 0
d rm_eo 10i 0
d rm_es 5i 0
**------------------------------------------------------------
* regcomp() -- Compile a Regular Expression ("RE")
*
* int regcomp(regex_t *preg, const char *pattern,
* int cflags);
*
* where:
* preg (output) = the compiled regular expression.
* pattern (input) = the RE to be compiled.
* cflags (input) = the sum of the cflag constants
* (listed above) for this RE.
*
* Returns 0 = success, otherwise an error number.
**------------------------------------------------------------
d regcomp pr 10i 0 extproc('regcomp')
d preg like(regex_t)
d pattern * value options(*string)
d cflags 10i 0 value
**------------------------------------------------------------
* regexec() -- Execute a compiled Regular Expression ("RE")
*
* int regexec(const regex_t *preg, const char *string,
* size_t nmatch, regmatch_t *pmatch, int eflags);
*
* where:
* preg (input) = the compiled regular expression
* (the output of regcomp())
* string (input) = string to run the RE upon
* nmatch (input) = the number of matches to return.
* pmatch (output) = array of regmatch_t DS's
* showing what matches were found.
* eflags (input) = the sum of the flags (constants
* provided above) modifying the RE
*
* Returns 0 = success, otherwise an error number.
**------------------------------------------------------------
d regexec pr 10i 0 extproc('regexec')
d preg like(regex_t) const
d string * value options(*string)
d nmatch 10u 0 value
d pmatch * value
d eflags 10i 0 value
**------------------------------------------------------------
* regerror() -- return error information from regcomp/regexec
*
* size_t regerror(int errcode, const regex_t *preg,
* char *errbuf, size_t errbuf_size);
*
* where:
* errcode (input) = the error code to return info on
* (obtained as the return value from
* either regcomp() or regexec())
* preg (input) = the (compiled) RE to return the
* error for.
* errbuf (output) = buffer containing human-readable
* error message.
* errbuf_size (input) = size of errbuf (max length of msg
* that will be returned)
*
* returns: length of buffer needed to get entire error msg
**------------------------------------------------------------
d regerror pr 10u 0 extproc('regerror')
d errcode 10i 0 value
d preg like(regex_t) const
d errbuf * value
d errbuf_size 10i 0 value
**------------------------------------------------------------
* regfree() -- free memory locked by Regular Expression
*
* void regfree(regex_t *preg);
*
* where:
* preg (input) = regular expression to free mem for.
*
* NOTE: regcomp() will always allocate extra memory
* to be pointed to by the various pointers in
* the regex_t structure. if you don't call this,
* that memory will never be returned to the system!
**------------------------------------------------------------
d regfree pr extproc('regfree')
d preg like(regex_t)
Here's the output:
DSPLY My phone #'s are: (444) 555 - 6666 and 777.888.9999
DSPLY My phone #'s are: (***) *** - 6666 and ***.***.9999
The code could be improved by extracting the replace logic into a procedure of its own, creating a custom regexp replace function based on the POSIX library, but it's not absolutely necessary.
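The underlying technique (match, read the start and end offsets of each subexpression, then splice in the replacement) is not specific to RPG. Below is a minimal sketch of it in Python 3, where match.span(n) plays the role of rm_so and rm_eo. It is an illustration of the idea only, not a translation of the program above, and the function name mask_phone_numbers is made up.

```python
import re

# Same pattern as in the RPG example.
PHONE = re.compile(r"\(?([0-9]{3})[ .)]*([0-9]{3})[ .-]*([0-9]{4})")


def mask_phone_numbers(text, mask="***"):
    """Replace the first two digit groups of each phone number with mask."""
    pieces = []
    cursor = 0  # end of the last region already copied to pieces
    for match in PHONE.finditer(text):
        for group in (1, 2):
            start, end = match.span(group)  # offsets of this subexpression
            pieces.append(text[cursor:start])
            pieces.append(mask)
            cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


data = "My phone #'s are: (444) 555 - 6666 and 777.888.9999"
print(mask_phone_numbers(data))
```

Because the result is built from slices in a single pass, this variant also works when the mask is longer or shorter than the text it replaces.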
The ILE C/C++ runtime library does not have a regex replace function available.
Java, however, has excellent support for regular expressions and integrates easily with RPGLE.
Introduction to Java and RPG
Using Regular Expressions in Java
I succeeded in using regex with Java.
I was inspired by this code from Scott Klement and that code from IBM.
The mix works well. I just added the replace function.
H
/include QSYSINC/QRPGLESRC,JNI
D newString pr O CLASS(*JAVA:'java.lang.String')
D EXTPROC(*JAVA:'java.lang.String':
D *CONSTRUCTOR)
D bytearray 32767A VARYING CONST
D getBytes PR 65535A VARYING
D EXTPROC(*JAVA:
D 'java.lang.String':
D 'getBytes')
D PatternCompile pr O CLASS(*JAVA:
D 'java.util.regex.Pattern')
D EXTPROC(*JAVA:
D 'java.util.regex.Pattern':
D 'compile') STATIC
D pattern O CLASS(*JAVA:'java.lang.String')
D PatternMatcher pr O CLASS(*JAVA:
D 'java.util.regex.Matcher')
D EXTPROC(*JAVA:
D 'java.util.regex.Pattern':
D 'matcher')
D comparestr O CLASS(*JAVA
D :'java.lang.CharSequence')
D CheckMatches pr 1N EXTPROC(*JAVA
D :'java.util.regex.Matcher'
D :'matches')
D DoReplace pr O CLASS(*JAVA:'java.lang.String')
D EXTPROC(*JAVA
D :'java.util.regex.Matcher'
D :'replaceAll')
D replacement O CLASS(*JAVA
D :'java.lang.String')
D RegExPattern s O CLASS(*JAVA:
D 'java.util.regex.Pattern')
D RegExMatcher s O CLASS(*JAVA:
D 'java.util.regex.Matcher')
D jstrStmt s like(jstring)
D jPatStr s like(jstring)
D jRepStr s like(jstring)
D jRepStr2 s like(jstring)
D result S 30A
/free
jPatStr = newString('^(\+33|0)([1-9][0-9]{8})$');
jstrStmt = newString('+33123456789');
jRepStr = newString('0$2');
RegExPattern = PatternCompile(jPatStr);
RegExMatcher = PatternMatcher(RegExPattern : jstrStmt);
if (CheckMatches(RegExMatcher) = *ON);
dsply ('it matches');
else;
dsply ('it doesn''t match');
endif;
jRepStr2 = DoReplace(RegExMatcher : jRepStr);
result = getBytes(jRepStr2);
dsply (%subst(result : 1 : 30));
*inlr = *on;
/end-free
It works, but with Java. I still work on the PASE Solution WarrenT suggested, but using PASE in an ILE program is such a pain...
The Young i Professionals Wiki has a page of Open Source Binaries. In the list is the PCRE Library (Perl Compatible Regular Expressions).
Let us know how this works out. I may try it myself ;-)
For an excellent SQLRPGLE example and explanation, refer to:
https://www.rpgpgm.com/2017/10/replacing-parts-of-strings-using-regexp.html
REGEXP_REPLACE(source-string, pattern-expression, replacement-string, start, occurrence, flags)