regular expression to find a repeated code - regex

I am trying to write a regular expression to match certain strings / codes in a database.
Here are some samples of the codes / strings that I need to remove using the regular expression:
[b:1wkvatkt]
[/b:1wkvatkt]
[b:3qo0q63v]
[/b:3qo0q63v]
[b:2r2hso9d]
[/b:2r2hso9d]
Anything that matches [b:********] and [/b:********].
Anybody please help me out. Thanks in advance.

You can use the following pattern (as stated by LukStorms in the comments):
\[\/?b:[a-z0-9]+\]
If you want to replace [b:********] with <b> (and also the closing one), you can use the following snippet (here in JavaScript, other languages are similar):
var regex = /\[(\/)?b:[a-z0-9]+\]/g;
var testText = "There was once a guy called [b:12a345]Peter[/b:12a345]. He was very old.";
var result = testText.replace(regex, "<$1b>");
console.log(result);
It matches an optional / and puts it into the first group ($1). This group can then be used in the replacement string. If the slash is not found, it won't be added, but if it is found, it will be added to <b>.
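For the original goal of removing the tags rather than converting them, the same pattern can be paired with an empty replacement string. This is a minimal sketch, assuming the text has already been read from the database into a string; the helper name stripBTags is hypothetical and not part of the answer above.

```javascript
// Hypothetical helper: removes every [b:xxxxxxxx] and [/b:xxxxxxxx] tag.
function stripBTags(text) {
  // \[ and \] match literal brackets, \/? makes the slash optional,
  // and the g flag replaces every occurrence instead of only the first.
  return text.replace(/\[\/?b:[a-z0-9]+\]/g, "");
}

console.log(stripBTags("[b:1wkvatkt]Peter[/b:1wkvatkt] was very old."));
```

Text without any matching tag is returned unchanged.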

Related

RegEx remove part of string and replace another part

I have a challenge getting the desired result with RegEx (using C#) and I hope that the community can help.
I have a URL in the following format:
https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1
I want to make two modifications, specifically:
1) Remove everything after 'value' e.g. '&ida=0&idb=1'
2) Replace 'category' with e.g. 'newcategory'
So the result is:
https://somedomain.com/subfolder/newcategory/?abc=text:value
I can handle 1) with e.g. ^[^&]+, but I have been unable to figure out how to replace the 'category' substring.
Any help or guidance would be much appreciated.
Thank you in advance.
Use the following:
Find: /(category/.+?value)&.+
Replace: /new$1 or /new\1 depending on your regex flavor
Update according to comment.
If the new name is completely_different_name, use the following:
Find: /category(/.+?value)&.+
Replace: /completely_different_name$1
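The find/replace pair above can be tried out as follows. The question is about C#, so this JavaScript version is only a sketch of the same pattern; the helper name rewriteUrl and its newName parameter are hypothetical.

```javascript
// Hypothetical helper: renames the "category" path segment and drops
// everything from the "&" that follows "value" onwards.
function rewriteUrl(url, newName) {
  // Group 1 captures "/...value"; the trailing "&.+" is matched but not kept.
  // A replacer function is used so that newName is inserted literally.
  return url.replace(/\/category(\/.+?value)&.+/, function (whole, kept) {
    return "/" + newName + kept;
  });
}

console.log(rewriteUrl(
  "https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1",
  "newcategory"
));
```

Note that the pattern requires an & after value, so a URL without extra parameters is left untouched by this sketch.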
You haven't specified a language, and I mainly work in Python, so this solution is in Python (value holds the original URL):
import re
url = re.sub('category', 'newcategory', re.search('^https.*value', value).group(0))
Explanation
re.sub(a, b, c) replaces every match of pattern a with b in string c.
re.search looks for the first match of a pattern in a string and returns a match object; group(0) is the whole matched text. In the code above, that is everything from "https" up to and including the last "value".
Using Python and only built-in string methods (there is no need for regular expressions here):
url = r"https://somedomain.com/subfolder/category/?abc=text:value&ida=0&idb=1"
new_url = (url.split('value')[0] + "value").replace("category", 'newcategory')
print(new_url)
Outputs:
https://somedomain.com/subfolder/newcategory/?abc=text:value

Regex ignore first 12 characters from string

I'm trying to create a custom filter in Google Analytics to remove the query parts of the URL which I don't want to see. The URL has the following structure:
[domain]/?p=899:2000:15018702722302::NO:::
I would like to create a regex which skips the first 12 characters (that is, up to /?p=899:2000) and replaces whatever comes after that with nothing.
So I made this one: https://regex101.com/r/Xgbfqz/1 (which could be simplified to .{0,12}), but I would actually like to skip those characters and have the regex match only what comes after them, so that I can tell Google Analytics to replace it with "".
The part in the url that is always the same is
?p=[3numbers]:[0-4numbers]
Thank you
Your regular expression:
\/\?p=\d{3}\:\d{0,4}(.*)
Tested in Golang RegEx 2 and RegEx101
It searches for /?p=### followed by a colon and up to four digits, and captures the rest of the string to the right in group 1.
(extra) JavaScript:
var paragraf = '[domain]/?p=899:2000:15018702722302::NO:::';
var regex= /\/\?p=\d{3}\:\d{0,4}(.*)/;
var match = regex.exec(paragraf);
alert('The rest of the right side of the string: ' + match[1]);
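If the part to skip always has the same length, you can simply use "[domain]/?p=899:2000:15018702722302::NO:::".substr(12), which returns the string without its first 12 characters.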
You can try this:
/\?p\=\d{3}:\d{0,4}
Which matches just this: ?p=[3numbers]:[0-4numbers]
Not sure about replacing though.
https://regex101.com/r/Xgbfqz/1
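On the replacing part: one common way to delete only the tail is to capture the fixed prefix in a group and put that group back in the replacement. The sketch below shows this in JavaScript; the helper name trimQuery is hypothetical, and whether the Google Analytics filter accepts a group reference in its replacement field is an assumption that this sketch does not verify.

```javascript
// Hypothetical helper: keeps "/?p=###:####" and drops whatever follows it.
function trimQuery(url) {
  // Group 1 is the part to keep; ".*" swallows the rest of the string,
  // and "$1" writes only the kept part back.
  return url.replace(/(\/\?p=\d{3}:\d{0,4}).*/, "$1");
}

console.log(trimQuery("[domain]/?p=899:2000:15018702722302::NO:::"));
```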

Regular Expression for String without a "?" character to redirect to string with "?" character

On our website we occasionally experience an error where dynamic links aren't building correctly.
URLs like this
https://www.test.url.edu/collections/&edan_fq[]=p.edanmdm.indexedstructured.object_type:%22Financial+records%22&edan_fq[]=p.edanmdm.descriptivenonrepeating.record_id:item_*
Should actually be this:
https://www.test.url.edu/collections/search?edan_fq[]=p.edanmdm.indexedstructured.object_type:%22Financial+records%22&edan_fq[]=p.edanmdm.descriptivenonrepeating.record_id:item_*
We want to create a regular expression to redirect
/collections/&edan_fq[]=
to
/collections/search?edan_fq[]=
But everything after "edan_fq[]=" can change dynamically--there are thousands of permutations of the string after that point.
Does anyone know how this would be done?
If you use \& without the global flag, replace only substitutes the first match, so only the first & in the URL is turned into "search?". I've used JavaScript; please check this:
var data = "https://www.test.url.edu/collections/&edan_fq[]=p.edanmdm.indexedstructured.object_type:%22Financial+records%22&edan_fq[]=p.edanmdm.descriptivenonrepeating.record_id:item_*";
var regex = /\&/;
data = data.replace(regex,"search?");
console.log(data);
Please check Substitution example in Regex101.
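Replacing the first & anywhere in the URL would also change links that were built correctly and already contain "search?". A stricter variant, sketched below, only touches an & that comes directly after /collections/ and directly before edan_fq[]=. The helper name fixCollectionsLink is hypothetical, and a real redirect rule would need this pattern translated to the web server's own syntax.

```javascript
// Hypothetical helper: repairs "/collections/&edan_fq[]=" links only.
function fixCollectionsLink(url) {
  // The lookahead (?=edan_fq\[\]=) checks what follows the "&" without
  // consuming it, so everything after that point is preserved as is.
  return url.replace(/\/collections\/&(?=edan_fq\[\]=)/, "/collections/search?");
}

console.log(fixCollectionsLink(
  "https://www.test.url.edu/collections/&edan_fq[]=a&edan_fq[]=b"
));
```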

Regular expression that contains one expression yet doesn't contain the other

We are currently matching "service_hub*queue"
I want to ignore the case "service_hub_scout_dead_queue" and yet still match everything else.
What is the regular expression for that ?
This JavaScript solution gives an array with the matches:
var myText = 'service_hub_anything_queue Add service_hub_scout_dead_queue something service_hub_someting_queue else';
var myMatches = myText.match(/service_hub(?!_scout_dead_)\w+queue/g);
If you are more interested in what follows each match:
var mySplit = ('dummy'+myText).split(/service_hub(?!_scout_dead_)\w+queue/g).filter(function(txt,i) {return (i>0);});
I prepend 'dummy' and then filter away the first part so that it works both when the string starts with a valid tag and when it does not.
Using negative lookbehind: "service_hub_.*?(?<!_scout_dead)_queue"
This appears to be widely supported by popular regex engines; I've tested with Java (or Scala, rather) just to make sure it works.
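Both patterns can be compared on individual queue names. The sketch below adds ^ and $ anchors so that each name is tested as a whole; the anchors and the sample list are assumptions made for illustration, not part of the answers above. The anchors matter for the lookbehind version in particular, because .*? can run across spaces when the pattern is applied to a longer text.

```javascript
// Hypothetical list of queue names, used only for illustration.
const names = [
  "service_hub_anything_queue",
  "service_hub_scout_dead_queue",
  "service_hub_something_queue",
];

// Negative lookahead: reject names where "_scout_dead_" follows "service_hub".
const lookahead = /^service_hub(?!_scout_dead_)\w+queue$/;
// Negative lookbehind: reject names where "_scout_dead" precedes "_queue".
const lookbehind = /^service_hub_.*?(?<!_scout_dead)_queue$/;

function keep(pattern) {
  return names.filter(function (name) { return pattern.test(name); });
}

console.log(keep(lookahead));
console.log(keep(lookbehind));
```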

Regex URI portion: Remove hyphens

I have to split URIs on the second portion:
/directory/this-part/blah
The issue I'm facing is that I have 2 URIs which logically need to be one
/directory/house-&-home/blah
/directory/house-%26-home/blah
This comes back as:
house-&-home and house-%26-home
So logically I need a regex to retrieve the second portion but also remove everything between the hyphens.
I have this, so far:
/[^(/;\?)]*/([^(/;\?)]*).*
(?<=directory\/)(.+?)(?=\/)
Does this solve your issue? This returns:
house-&-home and house-%26-home
If you want to get the result:
house--home
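then you should use a replace method. Because I am not sure which language you are using, I will give my example in Java:
String regex = "(?<=directory/)(.+?)(?=/)";
String str = "/directory/house-&-home/blah";
// Pattern and Matcher come from java.util.regex
Matcher matcher = Pattern.compile(regex).matcher(str);
// Take the matched second portion and strip the & from it
String result = matcher.find() ? matcher.group(1).replace("&", "") : "";
The replace method lets you replace a given piece of text (here the & symbol) with nothing (""), so result is house--home.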
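To make both URIs from the question come out the same, the text between the hyphens can be dropped whatever it is, so that & and %26 are treated alike. The following is a sketch in JavaScript under that assumption; the helper name secondPortion is hypothetical, and it only collapses the first hyphen-delimited piece of the segment.

```javascript
// Hypothetical helper: returns the second path portion with whatever sits
// between its first pair of hyphens removed.
function secondPortion(uri) {
  // Skip the first portion, capture the second (no "/", ";" or "?" allowed).
  const match = uri.match(/^\/[^\/;?]+\/([^\/;?]+)/);
  if (!match) return null;
  // "-&-" and "-%26-" both become "--"; a segment with a single hyphen
  // has no such pair and is returned unchanged.
  return match[1].replace(/-[^-]*-/, "--");
}

console.log(secondPortion("/directory/house-&-home/blah"));
console.log(secondPortion("/directory/house-%26-home/blah"));
```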