Regex to ignore strings that end with a specific pattern

I am trying to write a regex that ignores any string ending with an underscore followed by digits (for example, _1234), as shown below:
abc_def_1234 - should not match
abc_fgh - should match
abc_ghj - should match
abc_ijk_2345 - should not match
I tried the negative lookahead below, but it matches everything. How can I achieve this?
\w+(?!_\d+)

Match words separated by underscores, but use a negative look ahead to exclude input that has the unwanted tail:
^(?!.*_\d+$)\w+(?<!_)$
The final lookbehind (which you can remove) requires that the last character is not an underscore, i.e. that the input is, as far as I can tell, well formed.
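To see the difference between the two patterns, here is a minimal Python sketch (Python's `re` module is an assumption; the question does not name a regex engine). It runs both the original attempt and the anchored pattern over the four sample strings. The original attempt matches everything because `\w` also covers digits and underscores, so `\w+` swallows the `_1234` tail and the lookahead is then tested at the end of the string, where nothing follows.

```python
import re

# Pattern from the answer: reject input ending in _<digits>, then require
# word characters only, with the last character not being an underscore.
ANCHORED = re.compile(r"^(?!.*_\d+$)\w+(?<!_)$")

# The original attempt from the question, kept for comparison.
ORIGINAL = re.compile(r"\w+(?!_\d+)")

samples = ["abc_def_1234", "abc_fgh", "abc_ghj", "abc_ijk_2345"]

accepted = [s for s in samples if ANCHORED.search(s)]
original_hits = [s for s in samples if ORIGINAL.search(s)]

print(accepted)       # ['abc_fgh', 'abc_ghj']
print(original_hits)  # all four samples: the lookahead never rejects anything
```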

Related

How can I match all lines with a certain pattern, except when a certain substring is present?

I have multiple lines containing code that follows a very simple pattern: &G3FRM.GetRecord("<TAG>").GetField("<TAG>").Value. For example, I might have the following:
&G3FRM.GetRecord("PAGEREC").GetField("GSHOURS").Value
&G3FRM.GetRecord("RSCH_SETUP").GetField("Y_NIH_MNTHLY_CAP").Value
&G3FRM.GetRecord("PAYMENT").GetField("Y_HRS_TOTAL").Value
I need to match anything that has &G3FRM.GetRecord, that doesn't have PAGEREC as the first string/tag, and is then followed by the rest of the pattern. These statements can appear at the beginning, middle or end of any given line, and there could even be multiple matches in a single line.
This is the Regex pattern that I have tried:
&G3FRM\.GetRecord\("(?!PAGEREC)"\)\.GetField\("\w+"\)\.Value
As far as I understand, this is matching some literals (&G3FRM.GetRecord(") and is then looking for any string that doesn't match PAGEREC, using a negative lookahead. It certainly excludes any of the matches that have PAGEREC, but it also excludes everything else, so I know that I'm missing something.
So, I have a bunch of lines that I've cherry-picked that could look something like this:
Local string &rqst_dept_descr = %This.GetDepartmentDescription(&G3FRM.GetRecord("PAGEREC").GetField("GSREQUESTING_DEPT").Value);
Local string &hoursHTML = GetHTMLText(HTML.G_FORM_ROW_VALUE, "Hours", &G3FRM.GetRecord("PAYMENT").GetField("GSHOURS").Value);
Local string &off_cycle_deposit = &G3FRM.GetRecord("PAGEREC").GetField("GSOFFCYCLE_DIR_DEP").Value;
&G3FRM.GetRecord("POSITION").GetField("GSCOMMISSIONTIPS").Value = "Y";
SQLExec(SQL.Y_HAS_CONTRACT_DATA_IN_RANGE, &G3FRM.GetRecord("PAGEREC").GetField("EMPLID").Value, &G3FRM.GetRecord("PAYMENT").GetField("CONTRACT_NUM").Value, &G3FRM.GetRecord("PAYMENT").GetField("EFFDT").Value, &G3FRM.GetRecord("PAYMENT").GetField("EFFDT").Value, &HasContractData);
In this example, it should exclude the first line, since it only has the pattern I don't want. It should include the second line, exclude the third, include the fourth, and include the fifth (even though it does have one example of the excluded pattern, it has multiples that I do want).
You may use this regex:
&G3FRM\.GetRecord\("(?!PAGEREC\b)\w+"\)\.GetField\("\w+"\)\.Value
Note the use of \w+ after the negative lookahead, so that the pattern actually matches a tag, which must not be PAGEREC. I have added \b to the lookahead so that only the whole word PAGEREC is excluded; a longer tag that merely starts with PAGEREC can still match.
In your regex &G3FRM\.GetRecord\("(?!PAGEREC)"\)\.GetField\("\w+"\)\.Value the negative lookahead condition is correct, but nothing in the pattern consumes the text between the two double quotes, so it can only match an empty tag, e.g. &G3FRM.GetRecord("").GetField("GSHOURS").Value.
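As a quick check, the corrected pattern can be run over the five sample lines from the question. This is only a sketch in Python (an assumption, since the question does not say which tool performs the search); `findall` returns every non-PAGEREC occurrence per line, so a line is kept when its list is non-empty.

```python
import re

# Corrected pattern: the lookahead rejects PAGEREC, then \w+ consumes the tag.
PATTERN = re.compile(
    r'&G3FRM\.GetRecord\("(?!PAGEREC\b)\w+"\)\.GetField\("\w+"\)\.Value'
)

lines = [
    'Local string &rqst_dept_descr = %This.GetDepartmentDescription(&G3FRM.GetRecord("PAGEREC").GetField("GSREQUESTING_DEPT").Value);',
    'Local string &hoursHTML = GetHTMLText(HTML.G_FORM_ROW_VALUE, "Hours", &G3FRM.GetRecord("PAYMENT").GetField("GSHOURS").Value);',
    'Local string &off_cycle_deposit = &G3FRM.GetRecord("PAGEREC").GetField("GSOFFCYCLE_DIR_DEP").Value;',
    '&G3FRM.GetRecord("POSITION").GetField("GSCOMMISSIONTIPS").Value = "Y";',
    'SQLExec(SQL.Y_HAS_CONTRACT_DATA_IN_RANGE, &G3FRM.GetRecord("PAGEREC").GetField("EMPLID").Value, &G3FRM.GetRecord("PAYMENT").GetField("CONTRACT_NUM").Value, &G3FRM.GetRecord("PAYMENT").GetField("EFFDT").Value, &G3FRM.GetRecord("PAYMENT").GetField("EFFDT").Value, &HasContractData);',
]

# One list of matches per line; PAGEREC occurrences are skipped.
matches_per_line = [PATTERN.findall(line) for line in lines]

# 1-based numbers of the lines that contain at least one wanted match.
kept = [i + 1 for i, found in enumerate(matches_per_line) if found]

print(kept)                                # [2, 4, 5]
print([len(m) for m in matches_per_line])  # [0, 1, 0, 1, 3]
```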

REGEX - How to find two hyphens in a filename?

I'd like to search for filenames that contain exactly two hyphens. Some filenames have one hyphen; I just want the ones with two hyphens in the name:
THIS: some text - more text - yet more.txt
NOT THIS: some text - more text.txt
The hyphens are always surrounded by a space, FWIW.
I tried using (.*) - (.*) - (.*) and a couple variants, but the results aren't what I am looking for. I either get nothing or filenames with just one hyphen when I try various combinations.
I know this is an obvious one, but I have tried wading through regex tutorials concerning greedy, look aheads, etc. but can't for the life of me solve this. Can anyone help? I'm not looking for just the solution--I'd like to understand what I'm doing wrong in the regex syntax.
You can use this regex,
^[^-]*(?:-[^-]*){2}$
This when written in expanded form will look like this,
^[^-]*-[^-]*-[^-]*$
This is the form you were aiming for; I've compacted it with a quantifier that restricts the number of hyphens to exactly two.
If you would rather extend your own regex, change .* to [^-]*; otherwise .* will also match additional hyphens, leading to unexpected results. Your regex then becomes:
^([^-]*) - ([^-]*) - ([^-]*)$
Note that you should use the start ^ and end $ anchors so that the whole filename has to match the regex.
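The effect of replacing .* with [^-]* can be illustrated with a short Python sketch (Python is an assumption here, and the third filename is a made-up example with three hyphens that does not appear in the question). The unanchored (.*) version accepts the three-hyphen name because .* is free to absorb a hyphen; the anchored [^-]* version does not.

```python
import re

# Anchored pattern from the answer: exactly two hyphens in the whole name.
TWO_HYPHENS = re.compile(r"^[^-]*(?:-[^-]*){2}$")

# The pattern from the question: .* can also match hyphens.
LOOSE = re.compile(r"(.*) - (.*) - (.*)")

names = [
    "some text - more text - yet more.txt",
    "some text - more text.txt",
    "a - b - c - d.txt",  # hypothetical three-hyphen name, for illustration
]

exact = [n for n in names if TWO_HYPHENS.match(n)]
loose = [n for n in names if LOOSE.search(n)]

print(exact)  # only the first name
print(loose)  # the first and the third name

# In the three-hyphen name the first greedy group swallows a hyphen.
print(LOOSE.search(names[2]).groups())  # ('a - b', 'c', 'd.txt')
```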

Regex to match a pattern but not include another pattern

I'd like a regex to match thank-you but exclude it when the string contains the word removals.
So contact-thank-you should return a positive but removals/contact-thank-you should return a negative
I don't know much about regex and found a couple of posts referring to negative lookaheads. The best I could come up with was
(?!(?:removals)).*thank-you
which is clearly rubbish. Could anyone help?
Thanks
What you ideally want in this case is a negative lookbehind, since you're looking "behind" (to the left) of the word you're matching to make sure something's not there.
A complication here is that many regex engines don't permit variable-width negative-lookbehinds.
But if you can anchor to the start of the string you want to match somehow, then you can use lookahead from that anchor, instead.
(?:\s|^)((?!removals)\S)+thank-you(?:\s|$)
bananas/fred-thank-you - MATCH.
bananas/fred-no-thank-you - MATCH.
bananas/thank-you-with-words-after - no match.
removals/fred-thank-you - no match.
non-removals/fred-thank-you - no match.
bananas/removals-thank-you - no match.
bananas/thank-you-supremovalsale - no match.
bananas/fred-sorry - no match.
I am presuming that the characters permitted in the string are "anything but whitespace".
So it starts out by looking for either the beginning of the string, or some whitespace; then any number of non-whitespace \S characters that aren't the beginning of the string "removals"; then the string "thank-you".
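The per-character lookahead described above can be tried against the sample strings with a small Python sketch (Python's `re` module is an assumption; the question does not name an engine). Each non-whitespace character is consumed only if the text at that position does not start with "removals".

```python
import re

# (?!removals)\S consumes one non-whitespace character at a time, but only
# where the text at that position does not begin with "removals".
PATTERN = re.compile(r"(?:\s|^)((?!removals)\S)+thank-you(?:\s|$)")

samples = [
    "bananas/fred-thank-you",
    "bananas/fred-no-thank-you",
    "bananas/thank-you-with-words-after",
    "removals/fred-thank-you",
    "non-removals/fred-thank-you",
    "bananas/removals-thank-you",
    "bananas/thank-you-supremovalsale",
    "bananas/fred-sorry",
]

matched = [s for s in samples if PATTERN.search(s)]
print(matched)  # ['bananas/fred-thank-you', 'bananas/fred-no-thank-you']
```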
But I suspect what you're actually looking for is something a little different, maybe something like:
^(?!removals\/)[-\w]+\/[-\w]*thank-you$
bananas/fred-thank-you - MATCH.
bananas/fred-no-thank-you - MATCH.
bananas/thank-you-with-words-after - no match.
removals/fred-thank-you - no match.
non-removals/fred-thank-you - MATCH.
bananas/removals-thank-you - MATCH.
bananas/thank-you-supremovalsale - no match.
bananas/fred-sorry - no match.
This assumes that the structure is very fixed: to include anything that ends "/blah-blah-thank-you", unless the first word is exactly "removals/". Without knowing the exact specification, though, the first seems the most likely to be helpful.
If you're not trying to extract this string from many others, but are just checking a URL to see if it matches this pattern, then you can simplify it a lot:
^(?!.*removals).*thank-you
bananas/fred-thank-you - MATCH.
bananas/fred-no-thank-you - MATCH.
bananas/thank-you-with-words-after - MATCH.
removals/fred-thank-you - no match.
non-removals/fred-thank-you - no match.
bananas/removals-thank-you - no match.
bananas/thank-you-supremovalsale - no match.
bananas/fred-sorry - no match.
This just matches any string that has "thank-you", and not "removals".
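For the simple yes/no check, a minimal Python sketch might look like the following. The function name `is_thank_you_page` is hypothetical and only there for illustration; `re.search` is used because the pattern carries its own ^ anchor.

```python
import re

# Lookahead at the start rejects any input containing "removals" anywhere;
# the rest only requires that "thank-you" appears somewhere.
PATTERN = re.compile(r"^(?!.*removals).*thank-you")


def is_thank_you_page(url: str) -> bool:
    """Hypothetical helper: True if url has 'thank-you' and no 'removals'."""
    return PATTERN.search(url) is not None


samples = [
    "bananas/fred-thank-you",
    "bananas/fred-no-thank-you",
    "bananas/thank-you-with-words-after",
    "removals/fred-thank-you",
    "non-removals/fred-thank-you",
    "bananas/removals-thank-you",
    "bananas/thank-you-supremovalsale",
    "bananas/fred-sorry",
]

results = [is_thank_you_page(s) for s in samples]
print(results)  # [True, True, True, False, False, False, False, False]
```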

regex for weird string

I need help writing a regex for the string below. I have tried lots of patterns, but all of them failed.
I have a string like
package1[module11,module12,module13],package2[module21,module22,module23,module24,module25],package3[module31]
and I want to split this string like
package1
module11,module12,module13
package2
module21,module22,module23,module24,module25
package3
module31
I know it is weird to ask for a regex here, but ...
You can match using the pattern:
(\w+)\[(\w+(?:,\w+)*)\]
Example: http://www.rubular.com/r/rPUEWBoU1d
The pattern is pretty simple, really:
(\w+) - captures the first word (package1)
\[ - a literal opening bracket
(\w+(?:,\w+)*) - captures a sequence of at least one word (module11), optionally followed by more comma-separated words (assuming they are well formed)
\] - a literal closing bracket
In all cases, you may want to change \w to your alphabet (maybe even [^,\[\]] - not comma or brackets). You also may want to check the whole string matches, as the above pattern may skip over unwanted parts (for example: a[b]$$$$c[d])
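Both points, extracting the pairs and checking that the whole string matches, can be sketched in Python (an assumption; the linked example uses Rubular, i.e. Ruby). The names `ITEM`, `PAIR` and `WHOLE` are made up for this illustration.

```python
import re

# One "package[module,...]" item, without capture groups, for reuse below.
ITEM = r"\w+\[\w+(?:,\w+)*\]"

# Pattern from the answer: group 1 is the package, group 2 the module list.
PAIR = re.compile(r"(\w+)\[(\w+(?:,\w+)*)\]")

# Whole-string check: comma-separated items and nothing else.
WHOLE = re.compile(rf"{ITEM}(?:,{ITEM})*")

text = (
    "package1[module11,module12,module13],"
    "package2[module21,module22,module23,module24,module25],"
    "package3[module31]"
)

# findall returns one (package, modules) tuple per item.
pairs = PAIR.findall(text)
for package, modules in pairs:
    print(package)
    print(modules)

# The malformed example from the answer is found by findall,
# but rejected by the whole-string check.
print(PAIR.findall("a[b]$$$$c[d]"))               # [('a', 'b'), ('c', 'd')]
print(WHOLE.fullmatch("a[b]$$$$c[d]") is None)    # True
print(WHOLE.fullmatch(text) is not None)          # True
```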

Antimatch with Regex

I am looking for a regex pattern that matches everything except a particular string.
The following pattern basically works:
index\.php\?page=(?:.*)&tagID=([0-9]+)$
But the .* should not match TaggedObjects.
Thanks for any advice.
(?:.*) is unnecessary - you're not grouping anything, so .* means exactly the same. But that's not the answer to your question.
To match any string that does not contain another predefined string (say TaggedObjects), use
(?:(?!TaggedObjects).)*
In your example,
index\.php\?page=(?:(?!TaggedObjects).)*&tagID=([0-9]+)$
will match
index.php?page=blahblah&tagID=1234
and will not match
index.php?page=blahTaggedObjectsblah&tagID=1234
If you do want to allow that match and only exclude the exact string TaggedObjects, then use
index\.php\?page=(?!TaggedObjects&tagID=([0-9]+)$).*&tagID=([0-9]+)$
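The two variants above behave differently on a page name that merely contains TaggedObjects. Here is a small Python sketch of that difference (Python is an assumption, and the third URL is a made-up example of the exact page name). Note that in the second pattern the first capture group sits inside the negative lookahead, so on a successful match it is never set and the tag ID ends up in group 2.

```python
import re

# Variant 1: page name must not contain "TaggedObjects" anywhere.
NO_SUBSTRING = re.compile(
    r"index\.php\?page=(?:(?!TaggedObjects).)*&tagID=([0-9]+)$"
)

# Variant 2: only the exact page name "TaggedObjects" is excluded.
NOT_EXACT = re.compile(
    r"index\.php\?page=(?!TaggedObjects&tagID=([0-9]+)$).*&tagID=([0-9]+)$"
)

urls = [
    "index.php?page=blahblah&tagID=1234",
    "index.php?page=blahTaggedObjectsblah&tagID=1234",
    "index.php?page=TaggedObjects&tagID=1234",  # hypothetical exact-name case
]

no_substring = [NO_SUBSTRING.search(u) is not None for u in urls]
not_exact = [NOT_EXACT.search(u) is not None for u in urls]

print(no_substring)  # [True, False, False]
print(not_exact)     # [True, True, False]

# Group 1 of NOT_EXACT lives inside the lookahead and stays unset.
m = NOT_EXACT.search(urls[0])
print(m.group(1), m.group(2))  # None 1234
```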
Try this. I think you mean that you want the match to fail if the string contains an occurrence of 'TaggedObjects':
index\.php\?page=(?!.*TaggedObjects).*&tagID=([0-9]+)$