Regex Match and exclude if contains word before match - regex

I have the following case.
A log contains multiple hashes that can be extracted with the following regex:
\b[a-fA-F\d]{32}\b
In this example, three hashes match that regex, but I want to exclude the ones in the fields named 'link' and 'value':
u'closed_by': {u'link': u'https://test.test.com/api/now/table/sys_user/175f7cc0d7989d87bc43e322c42c8da8', u'value': u'175f7cc0d7989d87bc43e322c42c8da8'}, u'sensor_name': u'175f7cc0d7989d87bc43e322c42c8da8'
I tried the following regex, but it didn't work. It should match only the last hash, the one in 'sensor_name':
(\b[a-fA-F\d]{32}\b)((.?!'link':\s\S+\')\,|(.?!'value':\s\S+\')\},)
**Note: this is only an extract of the original log. The regex should match anything that is a hash except in the 'link' field and the 'value' field that follows it; there can be multiple fields named 'value'.
Can someone help me understand what I'm doing wrong, please?

Use this pattern. Any other key names you want to exclude can be added to the lookbehind, and the hash itself is captured in group 1:
(?<!link|value)': u'([\da-zA-Z]{32})
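The pattern above relies on a lookbehind whose alternatives have different lengths, which some engines reject; Python's re module, for example, requires fixed-width lookbehinds. A minimal sketch for that case, assuming Python and splitting the check into one lookbehind per excluded key (the name extract_hashes, the tightened hex character class, and the closing quote in the pattern are illustrative choices, not part of the original answer):

```python
import re

# One fixed-width negative lookbehind per excluded key, because Python's
# re module rejects a single lookbehind with alternatives of different widths.
HASH_AFTER_ALLOWED_KEY = re.compile(
    r"(?<!link)(?<!value)': u'([a-fA-F\d]{32})'"
)

def extract_hashes(log_line):
    """Return 32-character hex values whose key is not 'link' or 'value'."""
    return HASH_AFTER_ALLOWED_KEY.findall(log_line)

# The log extract from the question, split across lines for readability.
sample = (
    "u'closed_by': {u'link': u'https://test.test.com/api/now/table/sys_user/"
    "175f7cc0d7989d87bc43e322c42c8da8', "
    "u'value': u'175f7cc0d7989d87bc43e322c42c8da8'}, "
    "u'sensor_name': u'175f7cc0d7989d87bc43e322c42c8da8'"
)
print(extract_hashes(sample))
```

On the sample line only the 'sensor_name' hash should be returned, since the 'link' value is a URL and the 'value' key is excluded by the lookbehind.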

Related

How to use Postgres Regex Replace with a capture group

As the title says, I am trying to reference a capture group in a regex replace in a Postgres query. I have read that regexp_replace does not support regex capture groups. The regex I am using is:
r"(?:[\s\(\)\=\)\,])(username)(?:[\s\(\)\=\)\,])?"gm
The above regex almost does what I need, but I need to find out how to allow a match only if the surrounding groups also match something. A "username" should never be matched if it just happens to be a substring of another word. By ensuring it's surrounded by one of the characters above, I can be much more confident that it's a username.
An example application of the regex would be something like this in postgres (of course I would be doing an update vs a select):
select *, REGEXP_REPLACE(reqcontent,'(?:[\s\(\)\=\)\,])(username)(?:[\s\(\)\=\)\,])?' ,'NEW-VALUE', 'gm') from table where column like '%username%' limit 100;
If there is any more context that can be provided, please let me know. I have also found similar posts (postgresql regexp_replace: how to replace captured group with evaluated expression (adding an integer value to capture group)), but that one is more about splicing values back in and I don't think it quite answers my question.
More context and example values for the regex to work against. The text below may look familiar: these are JQL filters in Jira. We are looking to update our usernames and all their occurrences in the table that contains the filters. Below are a few examples of filters. We originally just did a find and replace, but that doesn't work because some of our usernames are only two characters long and it matched on non-usernames (e.g. the username je would put the new value inside the word project, which completely malforms the JQL string, resulting in something like proNEW-VALUEct = blah blah).
type = bug AND status not in (Closed, Executed) AND assignee in (test, username)
assignee=username
assignee = username
Definition of Answered:
A regex that matches 'username' only if it's surrounded by one of the special characters.
A way to run that regex replace on the username in a Postgres query.
Capturing groups are used to keep the important bits of information matched with a regex.
Either put capturing groups around the parts of the match you want to keep and refer to them with backreferences in the replacement:
REGEXP_REPLACE(reqcontent,'([\s\(\)\=\)\,])username([\s\(\)\=\)\,])?' ,'\1NEW-VALUE\2', 'gm')
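Or use lookarounds. PostgreSQL does not allow a quantifier after a lookaround constraint, so the optional trailing delimiter from the previous pattern becomes an alternation with the end of the text:
REGEXP_REPLACE(reqcontent,'(?<=[\s\(\)\=\)\,])(username)(?=[\s\(\)\=\)\,]|$)' ,'NEW-VALUE', 'gm')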
Or, in this case, use word boundaries (\y in PostgreSQL) so that username is replaced only when it stands alone as a whole word:
REGEXP_REPLACE(reqcontent,'\yusername\y' ,'NEW-VALUE', 'g')
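The word-boundary approach is easy to check outside the database. A minimal sketch in Python, using its \b as a stand-in for PostgreSQL's \y; the helper name replace_username is hypothetical, and it assumes usernames consist only of word characters (letters, digits, underscore), since a boundary next to punctuation behaves differently:

```python
import re

def replace_username(jql, old, new):
    """Replace `old` only where it stands alone as a whole word."""
    pattern = r"\b" + re.escape(old) + r"\b"
    # A function replacement inserts `new` literally, so backslashes in it
    # are not interpreted as group references.
    return re.sub(pattern, lambda _match: new, jql)

filters = [
    "type = bug AND status not in (Closed, Executed) AND assignee in (test, username)",
    "assignee=username",
    "project = demo AND assignee = je",
]
for jql in filters:
    print(replace_username(jql, "username", "NEW-VALUE"))

# The two-character username no longer corrupts the word "project".
print(replace_username(filters[2], "je", "NEW-VALUE"))
```

The last call is the case that broke the plain find and replace: "je" inside "project" is not at a word boundary, so only the standalone username is replaced.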

In regex, how to get a single match just before the matched pattern?

I have a response like the one below:
{"id":9,"announcementName":"Test","announcementText":"<p>TestAssertion</p>\n","effectiveStartDate":"03/01/2016","effectiveEndDate":"03/02/2016","updatedDate":"02/29/2016","status":"Active","moduleName":"Individual Portal"}
{"id":103,"announcementName":"d3mgcwtqhdu8003","announcementText":"<p>This announcement is a test announcement","effectiveStartDate":"03/01/2016","effectiveEndDate":"03/02/2016","updatedDate":"02/29/2016","status":"Active","moduleName":"Individual Portal"}
{"id":113,"announcementName":"asdfrtwju3f5gh7f21","announcementText":"<p>This announcement is a test announcement","effectiveStartDate":"03/02/2016","effectiveEndDate":"03/03/2016","updatedDate":"02/29/2016","status":"InActive","moduleName":"Individual Portal"}
I am trying to get the id (103) of the announcement whose announcementName is d3mgcwtqhdu8003.
I am using the regex pattern below to get the id:
"id":(.*?),"announcementName":"${announcementName}","announcementText":"
But it matches everything from the first id up to the announcementName, and returns:
9,"announcementName":"Test","announcementText":"<p>TestAssertion</p>\n","effectiveStartDate":"03/01/2016","effectiveEndDate":"03/02/2016","updatedDate":"02/29/2016","status":"Active","moduleName":"Individual Portal"}
{"id":103,"announcementName":"d3mgcwtqhdu8003","announcementText":
But I want to match only the id just before the required announcementName.
How can I do this with a regex? Can someone please help me with this?
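Either use proper JSON functions or, failing that, capture digits only. Because the IDs are numeric, replacing (.*?) with (\d+) in your pattern should stop the capture from running across earlier records:
"id":(\d+),"announcementName":"${announcementName}"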
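Both suggestions can be sketched in Python. This assumes the response holds one JSON object per line and uses trimmed-down records with straight quotes; the helper names id_by_regex and id_by_json are hypothetical:

```python
import json
import re

# Shortened stand-ins for the records in the question.
response = "\n".join([
    '{"id":9,"announcementName":"Test","status":"Active"}',
    '{"id":103,"announcementName":"d3mgcwtqhdu8003","status":"Active"}',
    '{"id":113,"announcementName":"asdfrtwju3f5gh7f21","status":"InActive"}',
])

def id_by_regex(text, name):
    """Capture only the digits that sit directly before the wanted name."""
    pattern = r'"id":(\d+),"announcementName":"' + re.escape(name) + '"'
    match = re.search(pattern, text)
    return int(match.group(1)) if match else None

def id_by_json(text, name):
    """Parse each line as JSON and compare the field directly."""
    for line in text.splitlines():
        record = json.loads(line)
        if record.get("announcementName") == name:
            return record["id"]
    return None

print(id_by_regex(response, "d3mgcwtqhdu8003"))
print(id_by_json(response, "d3mgcwtqhdu8003"))
```

The JSON version does not depend on field order or spacing, which is why it is usually the safer choice when the response is well-formed.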

regex match multiple capture groups in any order

Given the sample string below I'm trying to capture the 'to', 'from', 'subject' and 'type' fields and spit them back out in a different format. The issue is that these fields (to, from, etc) can be in any order.
SAMPLE STRING TO REGEX ON
<cfmail to="#toAddr#" from="#fromAddress"
subject="#subject#" type="html">
#emailMsg#
</cfmail>
OUTPUT I'M LOOKING FOR
to:toAddr, from:fromAddress, subject:subject
If I knew that the fields I'm interested in always came in the same order, this would be pretty easy, but I'm stumped on how to do the matching if, for instance, 'from' comes before 'to'.
The Perl one-liner I have right now is (just testing with 'to' and 'subject'):
s/<cfmail.*?((to)="(.*?)")|((subject)="(.*?)").*<\/cfmail>/\1:\2, \3:\4/g
This ends up matching the 'to' value but stops there and I don't get anything for the 'subject' value. I've tried several variations on this where I change matching group setup etc but have had no luck on it.
Do you need to allow for missing fields (e.g. no type field)? What about other fields in addition to those four? If you answered no to both questions, this regex should do the trick:
s!<cfmail(?:\s+to="(?<to>[^"]+)"|\s+from="(?<from>[^"]+)"|\s+subject="(?<subject>[^"]+)"|\s+type="(?<type>[^"]+)")+>.*?</cfmail>!to:$+{to}, from:$+{from}, subject:$+{subject}!gs
Here's the regex alone in more readable form:
<cfmail
(?:
\s+to="(?<to>[^"]+)"
|
\s+from="(?<from>[^"]+)"
|
\s+subject="(?<subject>[^"]+)"
|
\s+type="(?<type>[^"]+)"
)+
>
.*?</cfmail>
You were actually pretty close; alternation was the key. You just needed to add a quantifier.
Notice that I removed the capturing groups from the field names. You already know the names, you just need to pair them with the correct values. The named groups make that much easier.
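The same idea (an alternation of named groups under a + quantifier) carries over to other engines. A rough Python equivalent, as a sketch only: the function name summarize_cfmail is hypothetical, Python spells named groups (?P<name>...), and the group is called from_ here simply to avoid the keyword from:

```python
import re

# Each alternative consumes one attribute; the + lets them repeat in any order.
CFMAIL = re.compile(
    r'''<cfmail
        (?:
            \s+to="(?P<to>[^"]+)"
          | \s+from="(?P<from_>[^"]+)"
          | \s+subject="(?P<subject>[^"]+)"
          | \s+type="(?P<type>[^"]+)"
        )+
        >
        .*?</cfmail>''',
    re.VERBOSE | re.DOTALL,
)

def summarize_cfmail(text):
    """Rewrite each cfmail block as 'to:..., from:..., subject:...'."""
    return CFMAIL.sub(r"to:\g<to>, from:\g<from_>, subject:\g<subject>", text)

sample = (
    '<cfmail to="#toAddr#" from="#fromAddress"\n'
    'subject="#subject#" type="html">\n'
    '#emailMsg#\n'
    '</cfmail>'
)
print(summarize_cfmail(sample))
```

As with the Perl version, the captured values keep whatever sits between the quotes, including the # characters from the sample.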

How to define the Stash issue regex to match only from the beginning?

The default regex Stash uses to match a JIRA ID is:
JVM_SUPPORT_RECOMMENDED_ARGS="-Dintegration.jira.key.pattern=\"((?<!([a-z]{1,10})-?)[a-z]+-\d+)\""
But it matches the JIRA ID regardless of where it appears.
I want it to match only at the beginning:
JIRA-1 what ever: match!
something JIRA-1 else: not match
How should I edit the regex?
The following don't work:
\"^((?<!([a-z]{1,10})-?)[a-z]+-\d+)\"
and
\"(^(?<!([a-z]{1,10})-?)[a-z]+-\d+)\"
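Solution:
^[a-z]+-\d+ will do. Once the pattern is anchored with ^, the negative lookbehind is redundant, because nothing can precede the start of the text.
If you only want to match JIRA-<id> at the beginning, try: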
\"^JIRA-(\d+)\"
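The effect of the anchor can be checked in isolation. A minimal sketch in Python, which only illustrates the anchoring and not Stash's own Java-based matching; the helper name leading_issue_key is hypothetical, and the case-insensitive flag is an assumption made so that [a-z] accepts the upper-case key from the question:

```python
import re

# re.IGNORECASE is an assumption so that [a-z] also accepts "JIRA".
ISSUE_KEY_AT_START = re.compile(r"^[a-z]+-\d+", re.IGNORECASE)

def leading_issue_key(message):
    """Return the issue key that starts the message, or None."""
    match = ISSUE_KEY_AT_START.match(message)
    return match.group(0) if match else None

print(leading_issue_key("JIRA-1 what ever"))
print(leading_issue_key("something JIRA-1 else"))
```

The first call should yield the key and the second should yield None, which mirrors the "match" and "not match" cases listed in the question.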

Matching anchor tag attribute regardless of order

I am trying to write a regex that matches an anchor tag by its data-username attribute, wherever that attribute appears among the tag's attributes.
Binod
and
<a data-username="username" class ="myclass" href="abac">Binod</a>
In general, parsing HTML is a much better solution than trying to match it with a regex. That said, give this a try
/<a.*data-username\s*=\s*\"(.*?)\"/g
That should match regardless of where data-username appears in the attribute list, and it leaves the actual username in the capturing group.
A regex solved my problem:
<a\ .*?data-username=.*?>(?<linktext>.*?)</a>
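The advice to parse the HTML instead of matching it with a regex can be sketched with Python's standard-library html.parser. The class and function names are hypothetical, and the first anchor in the sample markup is made up for illustration:

```python
from html.parser import HTMLParser

class UsernameLinks(HTMLParser):
    """Collect (data-username, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # [username, text] while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs, so order is irrelevant.
            username = dict(attrs).get("data-username")
            if username is not None:
                self._current = [username, ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[1] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(tuple(self._current))
            self._current = None

def username_links(html):
    parser = UsernameLinks()
    parser.feed(html)
    parser.close()
    return parser.links

html = (
    '<a href="abac" class="myclass" data-username="alice">Binod</a> and '
    '<a data-username="username" class ="myclass" href="abac">Binod</a>'
)
print(username_links(html))
```

Because the parser hands over attributes as name/value pairs, the position of data-username within the tag no longer matters, and anchors without the attribute are skipped.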