Suppose I have the following function and input:
fx(label, text) = match(Regex("\\W*\\Q$label\\E\\s*(.*)"), text).captures[]
text1 = "IAREA aoi/IA_1.ias"
The result of the first function call
fx("IAREA (?!FILE)", text1)
is as expected (though possibly for the wrong reason), because the string "IAREA" is not followed by "FILE" in text1.
But the result of the second function call
fx("IAREA (?!MMM)", text1)
is unexpected: because "IAREA" in text1 is NOT followed by "MMM", this function call should return aoi/IA_1.ias, but it returns nothing.
I'm wondering: Is it possible to achieve this by changing the label argument, not by changing the function body?
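Since you pass a regex pattern to the Regex constructor, you should not wrap it in the \Q and \E operators: every character between the two operators is treated as a literal symbol, so a lookahead such as (?!MMM) is matched as literal text instead of being applied as a lookahead.
Change the pattern as follows: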
Regex("\\W*$label\\s*(.*)")
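The effect of the quoting can be reproduced outside Julia. The following is a minimal Python sketch, not the asker's code: Python's re module has no \Q...\E syntax, so re.escape stands in for it here (an assumption made for illustration), and the helper names fx_quoted and fx_unquoted are hypothetical.

```python
import re

text1 = "IAREA aoi/IA_1.ias"

def fx_quoted(label, text):
    # re.escape stands in for \Q...\E: every character of label is literal
    m = re.match(r"\W*" + re.escape(label) + r"\s*(.*)", text)
    return m.group(1) if m else None

def fx_unquoted(label, text):
    # label is interpreted as a regex, so (?!MMM) is a real negative lookahead
    m = re.match(r"\W*" + label + r"\s*(.*)", text)
    return m.group(1) if m else None

print(fx_quoted("IAREA (?!MMM)", text1))    # None
print(fx_unquoted("IAREA (?!MMM)", text1))  # aoi/IA_1.ias
```

With the quoting in place the pattern looks for the literal text "IAREA (?!MMM)", which never occurs in text1; without it the lookahead is applied and the rest of the line is captured.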
I am writing a C# application and want to use a regex to match the function arguments in a function declaration. For example, I have
ReturnType Condition_Check(NegativeResponseCode *ErrorCode_pt, uint8 Sid_u8)
FUNC(Std_ReturnType, DiaDcmAdapter_CODE) ProductionMode_Write
(P2CONST(uint8, AUTOMATIC, RTE_APPL_DATA) p_Data,
P2VAR(Dcm_NegativeResponseCodeType, AUTOMATIC, RTE_APPL_DATA) ErrorCode)
void DSDL_V(const UBYTE* abc)
I want to match their arguments. For example:
for the first function, it should return ErrorCode_pt and Sid_u8
for the second function, it should return p_Data and ErrorCode
for the third function, it should return abc
There should be no match if the function has no arguments.
This is easy for the first function using the regex below:
Condition_Check\(\w+\s+\*?(\w+)\s*,\s*\w+\s*(\w+)\s*\)
My test: https://regex101.com/r/thD5bR/1
and I get $1 and $2. But the second and third functions are too complicated for this approach, and moreover I don't know in advance how many arguments a function has (sometimes one, sometimes two, and so on).
Is there a good way to match function arguments from a function declaration?
You can match your function parameters by imposing two conditions:
(?<=[^,][ *]) - it is preceded by a character that is not a comma, followed by either a space or a star
(?=\)|,) - it is followed by either a closed parenthesis or a comma
Here's the full regex:
(?<=[^,][ *])\w+(?=\)|,)
Check the demo here.
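As a quick check of the pattern against the three declarations from the question, here is a small Python sketch (Python rather than C# is used purely for illustration; the names PARAM, declarations and results are hypothetical). Python accepts the pattern because the lookbehind has a fixed width of two characters.

```python
import re

# Pattern from the answer above: a word preceded by "non-comma + space/star"
# and followed by a closing parenthesis or a comma
PARAM = re.compile(r"(?<=[^,][ *])\w+(?=\)|,)")

declarations = [
    "ReturnType Condition_Check(NegativeResponseCode *ErrorCode_pt, uint8 Sid_u8)",
    "FUNC(Std_ReturnType, DiaDcmAdapter_CODE) ProductionMode_Write\n"
    "(P2CONST(uint8, AUTOMATIC, RTE_APPL_DATA) p_Data,\n"
    "P2VAR(Dcm_NegativeResponseCodeType, AUTOMATIC, RTE_APPL_DATA) ErrorCode)",
    "void DSDL_V(const UBYTE* abc)",
]

# One list of parameter names per declaration
results = [PARAM.findall(d) for d in declarations]
for names in results:
    print(names)
```

Note that the pattern depends on the spacing conventions used in these declarations (a space after each comma between macro arguments), so it should be treated as a heuristic rather than a parser.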
I'm writing a small JavaScript method which receives a list of points, and I have to read those points to create a Polygon on a Google map.
I receive those points in the form:
(lat, long), (lat, long),(lat, long)
So I've done the following regex:
\(\s*([0-9.-]+)\s*,\s([0-9.-]+)\s*\)
I've tested it with RegexPal and the exact data I receive:
(25.774252, -80.190262),(18.466465, -66.118292),(32.321384, -64.75737),(25.774252, -80.190262)
and it works, so why do I get null as the result when I use this code in my JavaScript?
var polygons="(25.774252, -80.190262),(18.466465, -66.118292),(32.321384, -64.75737),(25.774252, -80.190262)";
var reg = new RegExp("/\(\s*([0-9.-]+)\s*,\s([0-9.-]+)\s*\)/g");
var result = polygons.match(reg);
I get no JavaScript error when executing it (in Google Chrome's debug mode). This code lives in a JavaScript function in an included JS file. The method is called from the OnLoad handler.
I've searched a lot, but I can't find why this isn't working. Thank you very much!
Use a regex literal [MDN]:
var reg = /\(\s*([0-9.-]+)\s*,\s([0-9.-]+)\s*\)/g;
You are making two errors when you use RegExp [MDN]:
The "delimiters" / should not be part of the expression
If you define an expression as string, you have to escape the backslash, because it is the escape character in strings
Furthermore, modifiers are passed as second argument to the function.
So if you wanted to use RegExp (which you don't have to in this case), the equivalent would be:
var reg = new RegExp("\\(\\s*([0-9.-]+)\\s*,\\s([0-9.-]+)\\s*\\)", "g");
(and I think now you see why regex literals are more convenient)
I always find it helpful to copy and paste a RegExp expression into the console and look at its output. Taking your original expression, we get:
/(s*([0-9.-]+)s*,s([0-9.-]+)s*)/g
which means that the expression tries to match /, s and g literally, and the parens () are still treated as special characters.
Update: .match() returns an array:
["(25.774252, -80.190262)", "(18.466465, -66.118292)", ... ]
which does not seem to be very useful.
You have to use .exec() [MDN] to extract the numbers:
["(25.774252, -80.190262)", "25.774252", "-80.190262"]
This has to be called repeatedly until the whole string has been processed.
Example:
var reg = /\(\s*([0-9.-]+)\s*,\s([0-9.-]+)\s*\)/g;
var result, points = [];
while((result = reg.exec(polygons)) !== null) {
    points.push([+result[1], +result[2]]);
}
This creates an array of arrays and the unary plus (+) will convert the strings into numbers:
[
    [25.774252, -80.190262],
    [18.466465, -66.118292],
    ...
]
Of course if you want the values as strings and not as numbers, you can just omit the +.
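For comparison, here is a rough Python sketch of the same extraction. Python is not what the question uses; it is shown only to illustrate capture groups being collected once per match. re.findall returns the groups of every match directly, so the explicit exec-style loop is not needed there.

```python
import re

polygons = ("(25.774252, -80.190262),(18.466465, -66.118292),"
            "(32.321384, -64.75737),(25.774252, -80.190262)")

# Same pattern as in the question, written as a raw string so the
# backslashes reach the regex engine unchanged
pattern = re.compile(r"\(\s*([0-9.-]+)\s*,\s([0-9.-]+)\s*\)")

# findall yields one (lat, lng) tuple of strings per match; convert to floats
points = [[float(lat), float(lng)] for lat, lng in pattern.findall(polygons)]
print(points)
```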
const stringWithDate: string = "4/7/20 This is a date!";
const reg: RegExp = new RegExp("^(\d{1,2}\/\d{1,2}\/\d{1,2})").compile();
const exist: boolean = reg.test(stringWithDate)
const matches: RegExpExecArray | null = reg.exec(stringWithDate);
console.log(exist);
console.log(matches);
I am trying to extract the date (4/7/20) from stringWithDate. When I log the value of 'exist' it says true, but the matches array is [""]. I'm not sure what I'm doing wrong here. I know the regex isn't that good, but I know it works because I tried the same in Python and here. As far as I can tell it should give me "4/7/20" from stringWithDate, but that isn't happening.
There are two problems:
You're not allowing for the fact your backslashes are in a string literal.
You're not passing anything into compile.
1. Backslashes
Remember that in a string literal, a backslash is an escape character, so the \d in your string is an unnecessary escape of d, which results in just d. So your actual regular expression is:
^(d{1,2}/d{1,2}/d{1,2})
Use the literal form instead:
const reg: RegExp = /^(\d{1,2}\/\d{1,2}\/\d{1,2})/; // No `compile`, see next point
Live Example:
const stringWithDate/*: string*/ = "4/7/20 This is a date!";
const reg/*: RegExp*/ = /^(\d{1,2}\/\d{1,2}\/\d{1,2})/; // No `compile`, see next point
const exist/*: boolean*/ = reg.test(stringWithDate)
const matches/*: RegExpExecArray | null*/ = reg.exec(stringWithDate);
console.log(exist);
console.log(matches);
2. compile
compile accepts a new expression to compile, replacing the existing expression. By not passing an expression in as an argument, you're getting the expression (?:), which matches the blank at the beginning of your string.
You don't need compile (spec | MDN). It's an Annex B feature (supposedly only in JavaScript engines in web browsers). Here's what the spec has to say in a note about it:
The compile method completely reinitializes the this object RegExp with a new pattern and flags. An implementation may interpret use of this method as an assertion that the resulting RegExp object will be used multiple times and hence is a candidate for extra optimization.
...but JavaScript engines can figure out whether a regular expression needs optimization without your telling them.
If you wanted to use compile, you'd do it like this:
const reg: RegExp = /x/.compile(/^(\d{1,2}\/\d{1,2}\/\d{1,2})/);
The contents of the initial regular expression are completely replaced with the pattern and flags from the one passed into compile.
Side note: There's no reason for the type annotations on any of those consts. TypeScript will correctly infer them.
I'm trying to get a better understanding of how lambda functions and regex matches work in Python. For this purpose I'm replacing a lambda with a named function.
Even though I've found a way to make it work, I'm not able to understand why it works.
The lambda/regex I'm working on are the one mentioned in the following posts:
How to replace multiple substrings of a string?
Python - Replace regular expression match with a matching pair value
This is the main piece of code:
import re
# define desired replacements here
rep = {"condition1": "", "condition2": "text"}
text = "(condition1) and --condition2--"
# use these three lines to do the replacement
rep = dict((re.escape(k), v) for k, v in rep.items())
pattern = re.compile("|".join(rep.keys()))
output = pattern.sub(lambda m: rep[re.escape(m.group(0))], text)
print(output)
>>> '() and --text--'
If I replace the lambda function with:
def replace_conditions(match, rep):
    return rep[re.escape(match.group(0))]
output = pattern.sub(replace_conditions(m, rep), text)
I get the following exception:
NameError: name 'm' is not defined
And I'm able to make it work only using this syntax:
def replace_conditions(match, rep=rep):
    return rep[re.escape(match.group(0))]
output = pattern.sub(replace_conditions, line)
NOTE: I had to pre-assign a value to the second argument "rep" and use the function's name without actually calling it.
I can't understand why the match returned by the regex expression is properly passed on to the function if called with no arguments, while it's not passed to its first argument when called with the usual syntax.
I can't understand why the match returned by the regex expression is properly passed on to the function if called with no arguments
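That's not what's happening. pattern.sub(replace_conditions, line) doesn't call the replace_conditions function; it passes the function object itself to sub, which then calls it once per match with the match object as its only argument. By contrast, replace_conditions(m, rep) is evaluated immediately, before sub ever runs, and since no variable m exists at that point you get the NameError.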
From the docs for:
re.sub(pattern, repl, string, count=0, flags=0)
which is the same as:
pattern.sub(repl, string)
If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
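If you would rather keep rep as an explicit second parameter instead of a default argument, one option is to bind it before handing the function to sub. The following is a minimal sketch using functools.partial; it simplifies the original code by looking up the matched text directly instead of escaping the dictionary keys, which is an assumption that holds here because the keys contain no regex metacharacters.

```python
import re
from functools import partial

rep = {"condition1": "", "condition2": "text"}
text = "(condition1) and --condition2--"

# Escape the keys only when building the pattern; the dict itself is untouched
pattern = re.compile("|".join(re.escape(k) for k in rep))

def replace_conditions(match, rep):
    # Called by sub once per match; match.group(0) is the matched key
    return rep[match.group(0)]

# partial binds rep now; sub later supplies the match object as the first argument
output = pattern.sub(partial(replace_conditions, rep=rep), text)
print(output)  # () and --text--
```

A lambda such as lambda m: replace_conditions(m, rep) would do the same job; in both cases sub receives a callable and not the result of a call.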
I want to check if a variable is in pascal case, in OpenEdge.
I found the matches operator, and I wrote the following code:
define variable cVariable as character no-undo.
cVariable = "cPascalCase":U.
message cVariable matches 'c[A-Z]*':U.
But it doesn't work; it shows "no". Is there a way to specify in OpenEdge that the second character should be upper case?
Also, is there a way to check whether the variable contains groups of words that each start with an upper-case letter?
Thanks in advance!
MATCHES does not support regular expressions. The documentation says it only takes simple wildcards like . and *. If you know your code will always run on Windows, you can use the CLR bridge to run .NET code:
USING System.Text.RegularExpressions.*.
DEF VAR cVariable AS CHAR NO-UNDO INITIAL "cPascalCase".
DEF VAR regexp AS CLASS Regex NO-UNDO.
regexp = NEW Regex("^c[A-Z]"). /* anchored: "c" followed by one upper-case letter */
MESSAGE regexp:IsMatch(cVariable).
FINALLY:
DELETE OBJECT regexp.
END.
Progress does not directly support regular expressions.
For some examples of using regular expressions: using System.Text.RegularExpressions within OpenEdge ABL
Progress variables are not case sensitive. To work with a case sensitive string you can declare a variable to be case-sensitive like so:
define variable s as character no-undo case-sensitive.
display "aBc" matches "abc".
s = "aBc".
display s matches "abc".
display s matches "a*c".
Or you can use the UPPER() and LOWER(), ASC() and CHR() functions to make character by character comparisons.
You can't use regular expressions with Progress unless you use .NET classes, but your requirement is easily implemented with a simple function.
FUNCTION isPascalCase RETURNS LOGICAL
(cString AS CHARACTER):
IF LENGTH(cString) < 2 THEN
RETURN FALSE.
RETURN SUBSTRING(cString,1,1) = "c" AND
ASC(SUBSTRING(cString,2,1)) = ASC(UPPER(SUBSTRING(cString,2,1))).
END FUNCTION.
MESSAGE isPascalCase("cpascalCase").
You can use a class that I developed. It's available at https://github.com/gabsoftware/Progress-ABL-4GL-Regex. This class adds support for Perl regular expressions on Windows and HP-UX 11.31 ia64.
It's very easy to use. Just do the following:
DEFINE VARIABLE cl_regex AS CLASS Regex NO-UNDO.
DEFINE VARIABLE ch_pattern AS CHARACTER NO-UNDO CASE-SENSITIVE.
ASSIGN
ch_pattern = "c[A-Z]*"
cl_regex = NEW Regex().
/* should display: "No" */
MESSAGE cl_regex:mMatch( "ctest", ch_pattern, "" )
VIEW-AS ALERT-BOX.
Note that you have to escape Progress special characters in your pattern, as described here: http://knowledgebase.progress.com/articles/Article/P27229 or it will not work as expected.