inclusive exclusion in regular expression - regex

I'm trying to create an inclusive exclusion in a regular expression using the following syntax. I'm not having much success, so I figured I'd try my luck with Stack Overflow.
An example of the URL I'm trying to run the exclusion on is:
https://somesite.domain.com:port/folder1/subfolder1/subfolder2/18
The regex I have for it is:
\d{2,3}\/folder1\/subfolder1\/subfolder2(?!18)\d
The above regex also covers everything from 181 to 189. I only want it to apply to 18.

You need to use the end of the string anchor:
\d{2,3}\/folder1\/subfolder1\/subfolder2\/(?!18$)\d
or a slash or another delimiter if the number is not at the end of the string (your string is only a substring):
\d{2,3}\/folder1\/subfolder1\/subfolder2\/(?!18\/)\d
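A minimal Python sketch of the first pattern above, to show that anchoring the lookahead with $ excludes only 18 and still lets 181 to 189 through. The port 8080 and the helper name is_allowed are assumptions made for illustration; in Python the forward slashes need no escaping.

```python
import re

# Pattern from the answer; the trailing $ limits the exclusion to exactly "18".
pattern = re.compile(r"\d{2,3}/folder1/subfolder1/subfolder2/(?!18$)\d")

# Hypothetical base URL: port 8080 stands in for the "port" placeholder.
base = "https://somesite.domain.com:8080/folder1/subfolder1/subfolder2/"

def is_allowed(url):
    """Return True when the URL is not rejected by the negative lookahead."""
    return pattern.search(url) is not None

for suffix in ("18", "181", "189", "19"):
    print(suffix, is_allowed(base + suffix))
```

Only the URL ending in exactly 18 is rejected; the other three are accepted.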

Related

Regular Expression - Starting and ending with, and contains specific string in the middle

I would like to generate a regex with the following condition:
The string "EVENT" is contained within an XML tag called "SHEM-HAKOVETZ".
For example, the following string should be a match:
<SHEM-HAKOVETZ>104000514813450EVENTS0001dfd0.DAT</SHEM-HAKOVETZ>
I think you want something like this: ^<SHEM-HAKOVETZ>.*EVENT.*<\/SHEM-HAKOVETZ>$
Regular expression
^<SHEM-HAKOVETZ>.*EVENT.*<\/SHEM-HAKOVETZ>$
Parts of the regular expression
^ From the beginning of the line
<SHEM-HAKOVETZ> Starting tag
.* Any character, zero or more times
EVENT Middle part
.* Any character, zero or more times
<\/SHEM-HAKOVETZ>$ Ending tag, followed by the end of the line
Here is the working regex.
If you want to match this line, you could use this regex:
<SHEM-HAKOVETZ>.*EVENTS.*(?=<\/SHEM-HAKOVETZ>)
However, I would not recommend using regex on XML-based data, because there may be problems with whitespace handling in XML (see this article for more information). I would suggest using an actual XML parser (and then applying the regex) to be sure about your results.
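As a rough illustration of the parser-first suggestion, the sketch below uses Python's standard xml.etree.ElementTree to read the element and only then applies a regex to its text. The variable names are made up for the example, and a real document would contain many elements rather than a single line.

```python
import re
import xml.etree.ElementTree as ET

line = "<SHEM-HAKOVETZ>104000514813450EVENTS0001dfd0.DAT</SHEM-HAKOVETZ>"

# Let the XML parser deal with the tags (and any whitespace around them).
element = ET.fromstring(line)

# Apply the regex to the element text only, not to the raw markup.
has_event = (
    element.tag == "SHEM-HAKOVETZ"
    and re.search(r"EVENT", element.text or "") is not None
)
print(has_event)
```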
Here is a solution to only match the "value" part ignoring the XML tags:
(?<=<SHEM-HAKOVETZ>)(?:.*EVENTS.*)(?=<\/SHEM-HAKOVETZ>)
You can check it out in action at: https://regex101.com/r/4XiRch/1
It uses a lookbehind and a lookahead to make sure it only matches when the tags are correct, while the match itself contains only the content, which is convenient for further processing.
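The lookaround pattern above can be tried in Python as follows. This is only a sketch: Python's re module requires fixed-width lookbehinds, which this pattern satisfies, and the variable names are illustrative.

```python
import re

line = "<SHEM-HAKOVETZ>104000514813450EVENTS0001dfd0.DAT</SHEM-HAKOVETZ>"

# The lookbehind and lookahead check the tags without consuming them,
# so the match contains only the text between the tags.
content_pattern = re.compile(
    r"(?<=<SHEM-HAKOVETZ>)(?:.*EVENTS.*)(?=</SHEM-HAKOVETZ>)"
)

match = content_pattern.search(line)
content = match.group(0) if match else None
print(content)
```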

Does mongo regex query have character limit, if the regex search string is more than that limit it throws error

I am seeing a MongoDB regex query fail to return results when the regex search string is very long; instead, it throws an error. I have a scenario where I append a lot of names to build a regex, so my regex search string goes beyond 40000 characters.
eg:
db.getCollection('collection').find(
{"name":{"$regex" :"name1 | name2 | name3", "$options":"-i"}}
)
Can you explain why you are doing this?
The idea of a regex is to create an expression with which you match (multiple) value(s).
Example expression:
name\d+
This will match all "nameX" values where X is one or more decimal digits.
The idea is to create a single expression that fulfills your query requirement.
When you want to match on multiple alternative string values, you can use the $or operator.
Yes, MongoDB regex has a character limit, which originates from the Perl-compatible regex (PCRE) limit.
This is because "MongoDB uses Perl compatible regular expressions (i.e. “PCRE” ) version 8.41 with UTF-8 support." (MongoDB v3.2)
You can see that the limit is 32764 characters: MongoDB add assert of regular expression length
I recently ran into this issue, and the solution was to use the $in query operator instead. $in does not have a character limit. This suited my problem, since I needed exact matching rather than pattern matching for such long input.
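To make the difference concrete, here is a small Python sketch that builds both filter documents as plain dictionaries, without connecting to a database. The helper build_filters and the generated names are hypothetical; the 32764 figure is the limit quoted in the answer above. Note that the two filters are not equivalent: the regex matches substrings case-insensitively, while $in matches exact values only.

```python
import re

REGEX_LIMIT = 32764  # limit quoted in the answer above

def build_filters(names):
    """Build a regex-alternation filter and an equivalent-intent $in filter."""
    alternation = "|".join(re.escape(name) for name in names)
    regex_filter = {"name": {"$regex": alternation, "$options": "i"}}
    in_filter = {"name": {"$in": list(names)}}
    return regex_filter, in_filter

# Hypothetical input: 5000 generated names.
names = [f"name{i}" for i in range(5000)]
regex_filter, in_filter = build_filters(names)

pattern_length = len(regex_filter["name"]["$regex"])
print(pattern_length > REGEX_LIMIT)  # the alternation exceeds the quoted limit
```

Either dictionary could then be passed as the filter to a find call; only the regex version runs into the pattern length limit.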

regex expression for selecting a value

I want to write a regexp that extracts a number from the SIP message below:
< sip:callpark#as1sip1.com:5060;user=callpark;service=callpark;preason=park;paction=park;ptoken=150009;pautortrv=180;nt_server_host=47.168.105.100:5060 >
(Actually there are "<" and ">" signs in the message, but the site does not let me write)
In this case, I want to select the ptoken value. I wrote an expression such as ptoken=(.*);p, but it returns ptoken=150009;p, whereas I just need the number: 150009
How do I write a regexp for this case?
PS: I am writing this for an XML script.
Thanks,
I solved the problem by using two regexes:
<ereg assign_to="token" check_it="true" header="Refer-To:" regexp="(ptoken=([\d]*))" search_in="hdr"/>
<ereg assign_to="callParkToken" search_in="var" variable="token" check_it="true" regexp="([\d].*)" />
You could use the following regex:
ptoken=(\d+)
# searches for ptoken= literally
# captures every digit found in the first group
The number you want is then in the first group. Take a look at this demo on regex101.com. Depending on your actual needs, there could be better approaches though (XPath, since this is tagged as XML?).
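For illustration, the same capture-group pattern in Python, showing the difference between the whole match and the first group. The message is copied from the question with the surrounding spaces dropped; the variable names are assumptions.

```python
import re

sip_uri = (
    "<sip:callpark#as1sip1.com:5060;user=callpark;service=callpark;"
    "preason=park;paction=park;ptoken=150009;pautortrv=180;"
    "nt_server_host=47.168.105.100:5060>"
)

match = re.search(r"ptoken=(\d+)", sip_uri)

whole_match = match.group(0)  # includes the literal "ptoken=" prefix
token = match.group(1)        # only the digits captured by the group
print(whole_match, token)
```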
You should use lookahead and lookbehind:
(?<=ptoken=)(.+?)(?=;)
It captures any characters (.+?) that are preceded by ptoken= and followed by ;
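A short Python sketch of the lookaround variant, using a shortened sample of the parameter string for brevity. Here the whole match is already the bare value, so no group index is needed.

```python
import re

# Shortened sample of the parameters from the question.
params = "paction=park;ptoken=150009;pautortrv=180"

# The lookbehind requires "ptoken=" before the match and the lookahead
# requires ";" after it; neither is part of the matched text.
match = re.search(r"(?<=ptoken=)(.+?)(?=;)", params)

token = match.group(0) if match else None
print(token)
```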
The <ereg ... > action has the assign_to parameter. In your case assign_to="token". In fact, the parameter can receive several variable names. The first is assigned the whole string matching the regular expression, and the following are assigned the "capture groups" of the regular expression.
If your regexp is ptoken=([\d]*), the whole match includes ptoken which is bad. The first capture group is ([\d]*) which is the required value. Thus, use <ereg regexp="ptoken=([\d]*)" assign_to="dummyvar,token" ..other parameters here.. >.
Is it working?

Finding a pattern with optional end using regular expression

I am looking for one single regular expression to extract a block of text that may be followed by an optional end marker. The challenge here is to use just a single regular expression.
The input is as follows:
Anchor: This is the text I want to extract A/C : 2015-5-20
Anchor: This is the text I want to extract
I am currently using the following regular expression
Anchor:(?<extact>.*)(A\/C)
With this, only the line that contains A/C matches.
If I make the A/C block optional using a ?, as in Anchor:(?<extact>.*)(A\/C)?, the match gets too long, because the greedy .* also consumes the A/C part.
Any ideas how to elegantly solve this with a single regex? An additional constraint is that I want to have a named group in the regex (here extact).
You can find the sample code on regex101: https://regex101.com/r/wH5iQ4/1
Anchor:(?<extact>.*?)\s*(?=A\/C|$)
You can make use of a lookahead here. See the demo:
https://regex101.com/r/wH5iQ4/3
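The lookahead solution can be checked against both input lines with the Python sketch below. One adaptation is assumed: Python's re module writes named groups as (?P<name>...) rather than (?<name>...). The helper extract is made up for the example, and it strips the leading space that follows the colon.

```python
import re

# Lazy .*? stops as soon as the lookahead sees "A/C" or the end of the line.
pattern = re.compile(r"Anchor:(?P<extact>.*?)\s*(?=A/C|$)")

def extract(line):
    """Return the named group without surrounding whitespace, or None."""
    match = pattern.search(line)
    return match.group("extact").strip() if match else None

lines = [
    "Anchor: This is the text I want to extract A/C : 2015-5-20",
    "Anchor: This is the text I want to extract",
]
for line in lines:
    print(extract(line))
```

Both lines yield the same extracted text, with or without the optional A/C part.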

php regex to match three words if not then two and then one

Q1: I'm writing a regex in PHP without success. I want to match the following:
so i would
if not then match:
so i
and then:
i would
and
so
i
would
Here is my code:
\b(so i|i would|so i would|(so|i|would))\b
It's only matching so, i, would, so i, and i would, but it's not matching so i would. Why?
Order your regex correctly.
\b(so i would|so i|i would|(so|i|would))\b
Put the longest string to match on the left.
Alternatives separated by | are tried from left to right, and the first one that matches wins. Hence, your version of the regex matches the shorter string.
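The effect of alternative order can be reproduced outside PHP. The sketch below uses Python's re module, which, like PCRE, tries alternatives from left to right; the variable names are illustrative.

```python
import re

text = "so i would"

# Original order from the question: "so i" is tried before "so i would".
original = re.compile(r"\b(so i|i would|so i would|(so|i|would))\b")

# Reordered version: the longest alternative comes first.
reordered = re.compile(r"\b(so i would|so i|i would|(so|i|would))\b")

original_match = original.search(text).group(0)
reordered_match = reordered.search(text).group(0)
print(repr(original_match), repr(reordered_match))
```

With the original order the engine stops at so i, while the reordered pattern matches the full three-word phrase.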
Just put the longest alternative at the beginning:
\b(so i would|so i|i would|(so|i|would))\b
Put the longest pattern on the left in the group: \b(long|...|short)\b
Another solution: \b(so i would|i would|would|so i|so|i)\b
P.S. This is a feature of NFA regex engines; please refer to "Mastering Regular Expressions".