C++ Regex for 0=250|18000=300|26000=0.86M - c++

I am trying to write a regex in C++ that would match any of the following statements:
0=250
26000=0.86M
0=250|18000=300
0=250|18000=300|26000=0.86M
I wrote the following and checked with regexr.com:
(([0-9]+=[0-9]+)|((\|[0-9]+=[0-9]|.[0-9]+)[Mm]))((\|[0-9]+=[0-9]+)|((\|[0-9]+=[0-9]|.[0-9]+)[Mm])?)+
It looks like it is working, but I do not understand one thing. I thought I needed double backslashes before ".", like:
(([0-9]+=[0-9]+)|((\|[0-9]+=[0-9]\\.[0-9]+)[Mm]))((\|[0-9]+=[0-9]+)|((\|[0-9]+=[0-9]\\.[0-9]+)[Mm])?)+
as I was advised in my other post. However, according to the online tester, this is not correct.
Can someone explain please?
Thanks a lot!

This depends on the regex flavor you are using. Regex has no universal rules, and one of the main differences between flavors is which characters need to be escaped, and when.
I don't know which flavor your online tester uses, but here is what C++11 uses by default: http://www.cplusplus.com/reference/regex/ECMAScript/
According to this, you need to escape every literal . and | in your regex. Note that the double backslash belongs to the C++ string literal, not to the regex: "\\." in source code reaches the regex engine as \. , which is what you would type directly into an online tester.
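To illustrate the point about string-literal escaping, here is a minimal sketch in Python, whose ordinary string literals treat the backslash the same way C++ ones do. The names ENTRY, SPEC and matches_spec are hypothetical, and the simplified pattern is an assumption about the intended format, not the asker's original regex.

```python
import re

# In an ordinary string literal the backslash is itself an escape character,
# so "\\." in source code is the two-character regex \. (a literal dot).
escaped = "\\."
raw = r"\."  # a raw literal spells the same regex without doubling
assert escaped == raw and len(escaped) == 2

# Hypothetical simplified pattern: key=value entries separated by |,
# where a value is an integer or a decimal with an optional M/m suffix.
ENTRY = r"[0-9]+=[0-9]+(?:\.[0-9]+)?[Mm]?"
SPEC = re.compile(ENTRY + r"(?:\|" + ENTRY + r")*")

def matches_spec(text):
    """Return True if the whole string is a |-separated list of entries."""
    return SPEC.fullmatch(text) is not None

for sample in ["0=250", "26000=0.86M", "0=250|18000=300",
               "0=250|18000=300|26000=0.86M", "0=250|"]:
    print(sample, matches_spec(sample))
```

The four sample strings from the question are accepted, while the last one, with a trailing |, is rejected.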

Related

Regex find a specific character zero or one or multiple time in a string

I'm upgrading a Symfony app with VSCode and I have numerous occurrences of this kind of string:
#Template("Area:Local:delete.html.twig")
or
#Template("Group:add.html.twig")
In this case, I want to replace all the : with / to have :
#Template("Area/Local/delete.html.twig")
Doing this manually is not realistic, so I am looking for a regular expression to use in the editor's search/replace.
I've been toying with this fearsome beast without luck (I'm really bad at regexes):
#Template\("([:]*)"
#Template\("(.*?)"
#Template\("[a-zA-Z.-]{0,}[:]")
Still, I think there should be a "simple" regexp for this kind of standard replacement.
Does anyone have a clue? Thank you for any hint.
You can use this regex with a capture group: (#Template.*):
And replace with: $1/
But you'll have to run replace all repeatedly until there is no : left; that won't take long.
To explain a little more: everything between the parentheses is a capture group that we can reference later in the replace field. If we had (tem)(pla)te, $1 would be tem and $2 would be pla.
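If running the replacement repeatedly in the editor is inconvenient, the same idea can be done in one pass with a small script. This is a minimal sketch in Python, assuming the files can be processed outside VSCode; slashify and TEMPLATE are hypothetical names.

```python
import re

# Matches #Template("...") and captures the text between the quotes.
TEMPLATE = re.compile(r'#Template\("([^"]*)"\)')

def slashify(source):
    """Replace every : with / inside each #Template("...") argument."""
    def fix(match):
        return '#Template("' + match.group(1).replace(":", "/") + '")'
    return TEMPLATE.sub(fix, source)

print(slashify('#Template("Area:Local:delete.html.twig")'))
print(slashify('#Template("Group:add.html.twig")'))
```

Because the replacement is computed by a function, it handles any number of colons without needing one capture group per segment.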
Regex!
You can use this regex #Template\("(.[^\(\:]*)?(?:\:)(.[^\(\:]*)?(?:\:)?(.[^\(\:]*)?"\) and replacement would simply be #Template\("$1/$2/$3
You can test it out at https://regex101.com/r/VfZHFa/2
Explanation: The linked site will give a better explanation than I can write here, and has test cases you can use.

regular expression matching filename with multiple extensions

Is there a regular expression to match the some.prefix part of both of the following filenames?
xyz can be any character of [a-z0-9-_\ ]
some.prefix part can be any character in [a-zA-Z0-9-_\.\ ].
I intentionally included a . in some.prefix.
some.prefix.xyz.xyz
some.prefix.xyz
I have tried many combinations. For example:
(?P<prefix>[a-zA-Z0-9-_\.]+)(?:\.[a-z0-9]+\.gz|\.[a-z0-9]+)
It works with abc.def.csv by catching abc.def, but fails to catch it in abc.def.csv.gz.
I primarily use Python, but I thought the regex itself should apply to many languages.
Update: It's not possible, see discussion with #nowox below.
I think your regex works pretty well. I recommend trying regex101 with your example:
https://regex101.com/r/dV6cE8/3
The expression
^(?i)[ \w-]+\.[ \w-]+
Should work in your case:
som e.prefix.xyz.xyz
^^^^^^^^^^^
some.prefix.xyz
^^^^^^^^^^^
abc.def.csv.gz
^^^^^^^
And in Python you can use:
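import re
# Sample filenames, one per line (the first contains a space on purpose).
text = """som e.prefix.xyz.xyz
some.prefix.xyz
abc.def.csv.gz"""
# Raw string; (?i) comes first because recent Python versions reject inline flags placed mid-pattern.
print(re.findall(r'(?i)^[ \w-]+\.[ \w-]+', text, re.MULTILINE))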
Which will display:
['som e.prefix', 'some.prefix', 'abc.def']
I think you may be a bit confused about your requirement. To summarize, you have a pathname made of characters and dots, such as:
foo.bar.baz.0
foobar.tar.gz
f.o.o.b.a.r
How would you separate these strings into a basename and an extension? Here we recognize some known patterns: .tar.gz is definitely an extension, but is .bar.baz.0 the extension, or is it only .0?
The answer is not easy, and no regex in the world can guess the correct answer 100% of the time without some hints.
For example, you can list the acceptable extensions or define some criteria:
An extension matches the regex \.\w{1,4}$
Several extensions may be concatenated together: (\.\w{1,4}){1,4}$
The remainder is called the basename
From this you can build this regular expression:
(?P<basename>.*?)(?P<extension>(?:\.\w{1,4}){1,4})$
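As a rough check of how these criteria behave, here is a minimal Python sketch that applies the expression above. The helper name split_name and the choice to return an empty extension when nothing matches are assumptions made for illustration.

```python
import re

# The basename/extension expression from the answer above.
NAME = re.compile(r"(?P<basename>.*?)(?P<extension>(?:\.\w{1,4}){1,4})$")

def split_name(filename):
    """Split a filename into (basename, extension) using the criteria above."""
    m = NAME.match(filename)
    if m is None:
        # No dot-separated suffix of 1 to 4 word characters was found.
        return filename, ""
    return m.group("basename"), m.group("extension")

for name in ["some.prefix.xyz.xyz", "some.prefix.xyz",
             "foobar.tar.gz", "abc.def.csv.gz"]:
    print(name, split_name(name))
```

With these criteria some.prefix.xyz.xyz splits into some.prefix and .xyz.xyz only because "prefix" is longer than four characters, while abc.def.csv.gz splits into abc and .def.csv.gz. That is exactly the kind of ambiguity that cannot be resolved without extra hints.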
Try this: [a-z0-9-_\\]+\.[a-z0-9-_\\]+[a-zA-Z0-9-_\.\\]+

Regex to match everything."LettersNumbers"."extension" and forum searching tip

I need a regex to match my files named "something".Title"numberFrom1to99".mp4 in Windows File Explorer. My first approach, as a regex newbie, was something like
"..mp4"
but it didn't work, so I tried
"*.Title[1-9][0-9].mp4"
which also did not work.
I would also like a tip on how to search for regex-related advice in the Stack Overflow archive and on the web, so that I can be specific without the regex itself interfering with the search bar.
Thank you!
EDIT
About the second part of the question: the question itself shows "..mp4", but I wrote "asterisk"."asterisk".mp4. Is there any universal way to write a regex on the web without it taking effect and without escaping the characters? (With escaping, the backslash shows up inside the regex, which could be misunderstood.)
Try something like this:
(.*)\.[A-Za-z]+\d+\.mp4
See this Regex Demo to get an explanation of the regex.
Use regex101.com to test your regexes.
Here it is:
^[\s\S]*\.Title[1-9][0-9]?\.mp4$
I suggest regexr.com, which has many interesting regexes (Favourites tab) and a simple tutorial.
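To see which names the second expression accepts, here is a minimal Python sketch. The function name is_title_file and the sample filenames are made up for illustration.

```python
import re

# Anything, then ".Title", a number from 1 to 99 without a leading zero, then ".mp4".
TITLE = re.compile(r"^[\s\S]*\.Title[1-9][0-9]?\.mp4$")

def is_title_file(name):
    """Return True if the filename ends in .Title<1-99>.mp4."""
    return TITLE.match(name) is not None

for name in ["movie.Title7.mp4", "my.show.Title42.mp4",
             "movie.Title0.mp4", "movie.Title100.mp4"]:
    print(name, is_title_file(name))
```

The optional second digit [0-9]? is what allows single-digit numbers, which the asker's [1-9][0-9] attempt would have missed.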

Regex Expression

Please help me find a regular expression for strings like these:
john/niel
Stephanie/Arnold
I wrote:
^[a-zA-Z/][a-zA-z]+$
But it also accepts multiple slashes.
This is really basic. If you Googled for a minute or two, I'm sure you'd come up with something like
^[a-zA-Z]+\/[a-zA-Z]+$
The quoting of the / might not be necessary - depends on regex flavor.
Regards
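A quick way to check the suggested expression is a minimal Python sketch like the following, where is_pair is a hypothetical helper name. In Python the / needs no escaping, and fullmatch plays the role of the ^ and $ anchors.

```python
import re

# One or more letters, exactly one slash, one or more letters.
PAIR = re.compile(r"[a-zA-Z]+/[a-zA-Z]+")

def is_pair(text):
    """Return True if text is two alphabetic names separated by a single slash."""
    return PAIR.fullmatch(text) is not None

for sample in ["john/niel", "Stephanie/Arnold", "john//niel", "a/b/c"]:
    print(sample, is_pair(sample))
```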

How does writing a string using only a regex work

On a recreational coding website (https://dmoj.ca/user/quantum) I came across a piece of code which prints out the following string:
If a problem can't be solved with *regex*, it's a bad problem.
I was surprised when I saw the code, because it seems to use only a regular expression to accomplish this:
''=~('('.'?'.'{'.('['^'+').('['^')').('`'|')').('`'|'.').('['^'/').'"'.('`'^')')
.('`'|'&').('{'^'[').('`'|'!').('{'^'[').('['^'+').('['^')').('`'|'/').('`'|'"')
.('`'|',').('`'|'%').('`'|'-').('{'^'[').('`'|'#').('`'|'!').('`'|'.')."'".('['^
'/').('{'^'[').('`'|'"').('`'|'%').('{'^'[').('['^'(').('`'|'/').('`'|',').('['^
'-').('`'|'%').('`'|'$').('{'^'[').('['^',').('`'|')').('['^'/').('`'|'(').('{'^
'[').'*'.('['^')').('`'|'%').('`'|"'").('`'|'%').('['^'#').'*'.','.('{'^'[').('`'
|')').('['^'/')."'".('['^'(').('{'^'[').('`'|'!').('{'^'[').('`'|'"').('`'|'!').
('`'|'$').('{'^'[').('['^'+').('['^')').('`'|'/').('`'|'"').('`'|',').('`'|'%').
('`'|'-').'.'.'"'.'}'.')');
(all credits for this piece of art go to the original author from the website I mentioned above: Quantum)
I really want to know how this works exactly but I couldn't find anything on Google, can someone explain this to me? Oh, and it's written in Perl.
The code uses an eval group inside the regex to execute arbitrary code. You have to enable this behavior with the pragma use re 'eval'.
Eval groups look like (?{...}), with the part inside the curly braces being evaluated.
The rest of the regex is built by OR'ing and XOR'ing characters. For instance, '['^'+' is equivalent to 'p'. The . operator simply concatenates all those characters.
You can paste the part after the =~ matching operator into your perl shell and see the final regex that is being matched/executed.
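The character arithmetic can be checked outside Perl as well. Here is a minimal Python sketch that decodes the first few terms inside the eval group; the helper names xor and bor are made up for this illustration, and only the first five terms are transcribed.

```python
def xor(a, b):
    """Bitwise XOR of two single characters, like Perl's 'a'^'b' on strings."""
    return chr(ord(a) ^ ord(b))

def bor(a, b):
    """Bitwise OR of two single characters, like Perl's 'a'|'b' on strings."""
    return chr(ord(a) | ord(b))

# ('['^'+').('['^')').('`'|')').('`'|'.').('['^'/') from the first line of the code
word = xor("[", "+") + xor("[", ")") + bor("`", ")") + bor("`", ".") + xor("[", "/")
print(word)                 # print
print(repr(xor("{", "[")))  # ' '
```

So the obfuscated string starts by spelling out print inside (?{ ... }), and ('{'^'[') is how the spaces are produced.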