Groovy replaceLast() alternative for replaceFirst() - regex

I am using this Groovy code on a Jenkins Pipeline to replace the first match:
index_html_file = index_html_file.replaceFirst(/(?<=.*${dir_cut}.*${dir_cut}.*li>).*/, "\n<li>${f.name}</li>")
This adds a new line for the first match.
Now I want to do the same to the last match but there is no replaceLast() method. How can I achieve this?
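One common way to get "replace last" out of a "replace first" style API is to put a greedy prefix in front of the pattern, so the only place the pattern can match is its last possible position. Below is a minimal sketch of that idea in JavaScript. The helper name replaceLast and the sample list markup are made up for illustration. The same greedy-prefix trick should carry over to Groovy's replaceFirst (for example with a (?s)(.*) prefix and $1 in the replacement), but that is not tested here.

```javascript
// Replace only the LAST match of `pattern` (a regex source string) in `text`.
// The greedy ([\s\S]*) prefix swallows as much text as it can, so the
// pattern after it is forced to match at the last position where it fits.
function replaceLast(text, pattern, replacement) {
  const re = new RegExp("([\\s\\S]*)(?:" + pattern + ")");
  // A callback is used so that "$" in `replacement` is taken literally.
  return text.replace(re, (_, prefix) => prefix + replacement);
}

// Hypothetical sample input, not taken from the question.
const html = "<li>a</li>\n<li>b</li>\n<li>c</li>";
const result = replaceLast(html, "<li>[^<]*</li>", "<li>new</li>");
console.log(result);
```

If the pattern does not occur at all, the text is returned unchanged.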

Related

Based on a REGEX pattern set a value using Groovy in Jenkins pipeline

I am trying to set a value based on a regex check in a Groovy Jenkins pipeline, but it is not working as expected.
If the version looks like 1.1.1, 1.5.9 or 9.9.9, the bucket should be the managed store; otherwise (for example 1.22.9 or 1.99.9) it should be the tmp store. Below is what I tried:
String artefact_bucket_name
def artefact_version = "1.99.0"
if (artefact_version ==~ /[0-9]{1}\.[0-9]{1}\.[0-9]{1}/) {
artefact_bucket_name = "managed-artefact-store"
}
if (artefact_version ==~ /[0-9]{1}\.[0-9]{1,2}\.[0-9]{1}/) {
artefact_bucket_name = "tmp-artefact-store"
}
echo "Application version to deploy is ${artefact_version} from Artefact store ${artefact_bucket_name}"
It looks like the second regex overrides the first one. For example, when artefact_version = 1.1.1, it matches the first regex and the second regex as well, so the result is always tmp-artefact-store.
I would change the second regex to:
/[0-9]{1}\.[0-9]{2}\.[0-9]{1}/ - Notice I changed {1,2} to just {2}. This matches only strings of the form \d.\d\d.\d, so a version like 1.1.1 matches only the first regex and a version like 1.99.9 matches only the second.
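To illustrate the point about the two patterns being mutually exclusive, here is a rough sketch in JavaScript. The function name bucketFor is hypothetical. Groovy's ==~ requires the whole string to match, so the sketch adds ^ and $ anchors to get the same behaviour.

```javascript
// Pick a bucket name from a version string.
// Each pattern matches a different shape, so at most one branch can fire.
function bucketFor(version) {
  if (/^\d\.\d\.\d$/.test(version)) {
    return "managed-artefact-store"; // e.g. 1.1.1, 9.9.9
  }
  if (/^\d\.\d{2}\.\d$/.test(version)) {
    return "tmp-artefact-store"; // e.g. 1.22.9, 1.99.0
  }
  return null; // anything else is left undecided in this sketch
}

console.log(bucketFor("1.1.1"));
console.log(bucketFor("1.99.0"));
```

An else-if chain like this also removes the ordering problem from the question, because a later check can no longer overwrite an earlier result.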

Find Replace with RegEx with Google Script

I am trying to replace substrings that look like the following
Education|AAA|BBB|CCC|DDD|EEE
AAA|Educator|CCC|DDD|EEE
where Education or Educator could be in any position of the pipe delimited string. They are present at most only one time or zero times
I need to replace them with Educator/Education
What I want is
Educator/Education|AAA|BBB|CCC|DDD|EEE
AAA|Educator/Education|CCC|DDD|EEE
This works on the first pass with Education, I get
Educator/Education|AAA|BBB|CCC|DDD|EEE
AAA|Educator|CCC|DDD|EEE
But on the second pass replacing Educator, I get
Educator/Education/Education|AAA|BBB|CCC|DDD|EEE
AAA|Educator/Education|CCC|DDD|EEE
I think a different RegEx pattern for the ReplaceNth function would do it, but I cannot work out what it should be.
This is what I came up with, given my limited ability with Google Script; I mostly cobbled it together from searches.
Thanks
Edit: Updated with Wiktor's solution given in a comment
function FRCrit_n(){
FRCrit("Elements", "Combined-What Are The Primary Role(s) Being Play?");
}
function FRCrit(shtName,cheader){
var sh = SpreadsheetApp.getActiveSpreadsheet().getSheetByName(shtName);
var data = sh.getDataRange().getValues();
var col =HTN(shtName,cheader)-1; //Header to column index
for(var n=1;n<data.length;++n){
data[n][col] = data[n][col].replace(/\bEducat(?:or|ion)\b/g, 'Educator/Education');
}
sh.getRange(1,1,data.length,data[0].length).setValues(data);
}
In regular expression syntax, you can create an 'or' condition using a vertical bar to search for multiple things in one go. You can test this out in G Suite Sheets without needing any code.
I think you want to search for
Education|Educator
and replace it with
Educator/Education
When you are happy with your results you can put it back into code - the same regex syntax should work in javascript.
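The doubled "Educator/Education/Education" in the question comes from running a second pass over text that already contains the replacement. A possible sketch of a single-pass version is below; the function name normalizeRole is made up. As an extra assumption beyond what was asked, the pattern also matches the already-combined form first, so running it twice does no harm.

```javascript
// Replace either "Educator" or "Education" with "Educator/Education".
// The combined form is listed first in the alternation, so text that was
// already converted is matched as a whole and replaced with itself.
function normalizeRole(cell) {
  return cell.replace(/\bEducat(?:or\/Education|or|ion)\b/g, "Educator/Education");
}

console.log(normalizeRole("Education|AAA|BBB|CCC|DDD|EEE"));
console.log(normalizeRole("AAA|Educator|CCC|DDD|EEE"));
```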

Regex ignore first 12 characters from string

I'm trying to create a custom filter in Google Analytic to remove the query parts of the url which I don't want to see. The url has the following structure
[domain]/?p=899:2000:15018702722302::NO:::
I would like to create a regex which skips the first 12 characters (that is, up to /?p=899:2000) and replaces whatever comes after that with nothing.
So I made this one: https://regex101.com/r/Xgbfqz/1 (which could be simplified to .{0,12}), but I would actually like to skip those characters and have the regex match only whatever comes after them, so that I can tell Google Analytics to replace it with "".
The part in the url that is always the same is
?p=[3numbers]:[0-4numbers]
Thank you
Your regular expression:
\/\?p=\d{3}\:\d{0,4}(.*)
Tested in Golang RegEx 2 and RegEx101
It searches for /?p=###:#### (the second number can be 0 to 4 digits) and captures the rest of the string to the right.
(extra) JavaScript:
var paragraf = '[domain]/?p=899:2000:15018702722302::NO:::';
var regex= /\/\?p=\d{3}\:\d{0,4}(.*)/;
var match = regex.exec(paragraf);
alert('The rest of the right side of the string: ' + match[1]);
Or simply use "[domain]/?p=899:2000:15018702722302::NO:::".substr(12)
You can try this:
/\?p\=\d{3}:\d{0,4}
Which matches just this: ?p=[3numbers]:[0-4numbers]
Not sure about replacing though.
https://regex101.com/r/Xgbfqz/1
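On the replacing part: one option is to capture the fixed prefix and put it back, so that only the tail is dropped. Here is a small JavaScript sketch of that idea; the function name trimQuery is hypothetical. Whether Google Analytics filters accept a backreference in exactly this form is an assumption that would need checking there.

```javascript
// Keep "/?p=" + 3 digits + ":" + 0-4 digits and drop everything after it.
// Group 1 holds the part to keep; the trailing .* is discarded.
function trimQuery(url) {
  return url.replace(/(\/\?p=\d{3}:\d{0,4}).*/, "$1");
}

console.log(trimQuery("[domain]/?p=899:2000:15018702722302::NO:::"));
```

URLs that do not contain the /?p=... part are returned unchanged.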

Postgres: remove second occurrence of a string

I am trying to fix bad data in a Postgres DB where photo tags were appended twice.
The trip is wonderful.<photo=2-1-1601981-7-1.jpg><photo=2-1-1601981-5-2.jpg>We enjoyed it very much.<photo=2-1-1601981-5-2.jpg><photo=2-1-1601981-7-1.jpg>
As you can see in the string, the photo tags had already been added, but they were appended to the text again. I want to remove the second occurrence: <photo=2-1-1601981-5-2.jpg><photo=2-1-1601981-7-1.jpg>. The first occurrence has a certain order and I want to keep it.
I wrote a function that could construct a regex pattern:
CREATE OR REPLACE FUNCTION dd_trip_photo_tags(tagId int) RETURNS text
LANGUAGE sql IMMUTABLE
AS $$
SELECT string_agg(concat('<photo=',media_name,'>.*?(<photo=',media_name,'>)'),'|') FROM t_ddtrip_media WHERE tag_id=tagId $$;
This captures the second occurrence of a certain photo tag.
Then, I use regexp_replace to replace the second occurrence:
update t_ddtrip_content set content = regexp_replace(content,dd_trip_photo_tags(332761),'') from t_ddtrip_content where tag_id=332761;
Yet it removes all matched tags. I have searched online for days but still can't figure out a way to fix this. I'd appreciate any help.
This should work.
Regex 1:
<photo=.+?>
See: https://regex101.com/r/thHmlq/1
Regex 2:
<.+?>
See: https://regex101.com/r/thHmlq/2
Input:
The trip is wonderful.<photo=2-1-1601981-7-1.jpg><photo=2-1-1601981-5-2.jpg>We enjoyed it very much.<photo=2-1-1601981-5-2.jpg><photo=2-1-1601981-7-1.jpg>
Output:
<photo=2-1-1601981-7-1.jpg>
<photo=2-1-1601981-5-2.jpg>
<photo=2-1-1601981-5-2.jpg>
<photo=2-1-1601981-7-1.jpg>
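The regexes above only find the tags; they do not decide which ones to drop. As a rough illustration of the "keep the first occurrence, remove later repeats" logic, here is a JavaScript sketch. It is not a Postgres solution, and the function name dropRepeatedPhotoTags is made up for the example.

```javascript
// Remove every photo tag that has already been seen earlier in the text.
// The first occurrence of each tag stays exactly where it is.
function dropRepeatedPhotoTags(content) {
  const seen = new Set();
  return content.replace(/<photo=[^>]+>/g, (tag) => {
    if (seen.has(tag)) {
      return ""; // repeat: drop it
    }
    seen.add(tag);
    return tag; // first occurrence: keep it
  });
}

const input =
  "The trip is wonderful.<photo=2-1-1601981-7-1.jpg><photo=2-1-1601981-5-2.jpg>" +
  "We enjoyed it very much.<photo=2-1-1601981-5-2.jpg><photo=2-1-1601981-7-1.jpg>";
console.log(dropRepeatedPhotoTags(input));
```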

How do I extract a postcode from one column in SSIS using regular expression

I'm trying to use a custom regex clean transformation (information found here) to extract a post code from a mixed address column (Address3) and move it to a new column (Post Code).
Example of incoming data:
Address3: "London W12 9LZ"
Incoming data could be any combination of place names with a post code at the start, middle or end (or not at all).
Desired outcome:
Address3: "London"
Post Code: "W12 9LZ"
Essentially, in plain English, "move (not copy) any post code found from Address3 into Post Code".
My regex skills aren't brilliant but I've managed to get as far as extracting the post code and getting it into its own column using the following regex, matching from Address3 and replacing into Post Code:
Match Expression:
(?<stringOUT>([A-PR-UWYZa-pr-uwyz]([0-9]{1,2}|([A-HK-Ya-hk-y][0-9]|[A-HK-Ya-hk-y][0-9] ([0-9]|[ABEHMNPRV-Yabehmnprv-y]))|[0-9][A-HJKS-UWa-hjks-uw])\ {0,1}[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}|([Gg][Ii][Rr]\ 0[Aa][Aa])|([Ss][Aa][Nn]\ {0,1}[Tt][Aa]1)|([Bb][Ff][Pp][Oo]\ {0,1}([Cc]\/[Oo]\ )?[0-9]{1,4})|(([Aa][Ss][Cc][Nn]|[Bb][Bb][Nn][Dd]|[BFSbfs][Ii][Qq][Qq]|[Pp][Cc][Rr][Nn]|[Ss][Tt][Hh][Ll]|[Tt][Dd][Cc][Uu]|[Tt][Kk][Cc][Aa])\ {0,1}1[Zz][Zz])))
Replace Expression:
${stringOUT}
So this leaves me with:
Address3: "London W12 9LZ"
Post Code: "W12 9LZ"
My next thought is to keep the above match/replace, then add another to match anything that doesn't match the above regex. I think it might be a negative lookahead but I can't seem to make it work.
I'm using SSIS 2008 R2 and I think the regex clean transformation uses .net regex implementation.
Thanks.
Just solved this. As usual, it was simpler logic than I thought it should be. Instead of trying to match the non-post code strings and replace them with themselves, I have added another line matching the postcode again and replacing it with "".
So in total, I have:
Match the post code using the above regex and move it to the Post Code column
Match the post code using the above regex and replace it with "" in the Address3 column
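The two steps above can be sketched outside SSIS as well. The JavaScript below is only an illustration: the function name splitPostcode is made up, and the postcode pattern is a deliberately simplified stand-in for the full regex in the question, so it will not cover every special case that one handles.

```javascript
// Simplified UK postcode pattern (an assumption for this sketch only):
// 1-2 letters, a digit, an optional letter or digit, an optional space,
// then a digit and two letters.
const POSTCODE = /\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b/i;

// Step 1: copy the postcode out. Step 2: remove it from the address.
function splitPostcode(address) {
  const match = POSTCODE.exec(address);
  if (!match) {
    return { address: address.trim(), postCode: "" };
  }
  const rest =
    address.slice(0, match.index) +
    address.slice(match.index + match[0].length);
  return {
    address: rest.replace(/\s{2,}/g, " ").trim(), // tidy leftover spaces
    postCode: match[0],
  };
}

console.log(splitPostcode("London W12 9LZ"));
```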