For example, if a value matches "an email pattern", then the key associated with it should also be removed from the URL.
If url is
https://stackoverflow.com/?key1=test&key2=test#gmail.com&key3=something&key4=another#email.com,
then remove key2=test#gmail.com and key4=another#email.com, so that the new url will be
https://stackoverflow.com/?key1=test&key3=something
Here, key names are not fixed and can be anything, and the position of the keys is not fixed either.
So I want a regex that yields the entire string without those key-value pairs. I tried to write a regex that matches the unwanted key-value pairs, but I could not get the rest of the string, which does not match the regex.
I did it with a Java program, but I am looking for a regex that I can apply in the XML so that I can avoid a Java program.
This is mainly for use in urlrewritefilter (tuckey), where I want to remove certain query strings matching a regex.
Here is a simple solution in Java (I saw your question is tagged java). This is basically a pattern that matches ? or &, followed by a word, then =, and then an email. You can substitute the part [.\w]+#[\w]+\.\w+ with a better email regex. Matching emails with a regex can be tricky for unusual addresses, but this is the basic idea.
public class HelloWorld {
    public static void main(String[] args) {
        String url = "https://stackoverflow.com/?key1=test&key2=test#gmail.com&key3=something&key4=another#email.com";
        // Remove every "?key=email" or "&key=email" pair from the URL
        System.out.println(url.replaceAll("[?&]\\w+=[.\\w]+#[\\w]+\\.\\w+", ""));
    }
}
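One caveat with the single replaceAll call: when the first pair is the one removed, its leading ? is removed with it, so ?a=x#y.com&b=1 becomes &b=1. The following is a minimal sketch of one way around that, assuming it is acceptable to split the query string and rebuild it. The class name EmailParamStripper and the method strip are hypothetical, and the email pattern is the same deliberately simple one used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class EmailParamStripper {
    // Same simplistic "email" shape as in the answer; '#' is the separator used in the question's data.
    private static final Pattern EMAIL_VALUE = Pattern.compile("[.\\w]+#\\w+\\.\\w+");

    /** Returns the URL without the key=value pairs whose value looks like an email. */
    static String strip(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            return url; // no query string, nothing to remove
        }
        String base = url.substring(0, q);
        List<String> kept = new ArrayList<>();
        for (String pair : url.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            if (!EMAIL_VALUE.matcher(value).matches()) {
                kept.add(pair); // keep pairs whose value is not email-like
            }
        }
        // Rebuild the query so that it always starts with '?' when pairs remain
        return kept.isEmpty() ? base : base + "?" + String.join("&", kept);
    }

    public static void main(String[] args) {
        String url = "https://stackoverflow.com/?key1=test&key2=test#gmail.com&key3=something&key4=another#email.com";
        System.out.println(strip(url)); // prints https://stackoverflow.com/?key1=test&key3=something
    }
}
```

This is still Java rather than a pure regex for the XML configuration, so it only helps if a small helper class is acceptable.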
I have a string in this shape
State#Received#ID#e23d8926-1327-4fde-9ea7-d364af3325e0
I want to extract the State value via regex. So in the example above I only want to extract Received
I have tried the following ([^State#])([A-Za-z]) which matches Received but I am stuck at excluding the rest of the string #ID#e23d8926-1327-4fde-9ea7-d364af3325e0
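Do not put parentheses around a part you do not want to capture. Note also that [^State#] is a character class that excludes the single characters S, t, a, e and #, not the literal text State#. My solution is: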
State#(?'state'[^#]+)#
Sample: https://regex101.com/r/vAr65j/1
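The (?'state'...) named-group syntax is not accepted by every regex engine. As a hedged illustration, here is the same idea in Java, which writes named groups as (?<state>...). The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StateExtractor {
    // "State#" literally, then capture everything up to the next '#'
    private static final Pattern STATE = Pattern.compile("State#(?<state>[^#]+)#");

    /** Returns the captured state value, or null when the input has no State# field. */
    static String extractState(String input) {
        Matcher m = STATE.matcher(input);
        return m.find() ? m.group("state") : null;
    }

    public static void main(String[] args) {
        String input = "State#Received#ID#e23d8926-1327-4fde-9ea7-d364af3325e0";
        System.out.println(extractState(input)); // prints Received
    }
}
```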
I am using a data analysis package that exposes a Regex function for string parsing. I am trying to parse a response from a website that is in the format...
key1=val1&key2=val2&key3=val3 ...
[There is the possibility that the keys and values may be percent encoded, but the current return values are not, the current return values are tokens and other info that are alphanumeric].
I understand this data to be www-form-urlencoded, or alternatively it might be known as query string format.
The objective is to extract the value for a given key when the order of the keys cannot be relied upon. For example, I might know that one of the keys I should receive is "token", so what regex pattern can I use to extract the value for the key "token"? I have searched for this but cannot find anything that does what I need; if there is a duplicate question, apologies in advance.
In Alteryx, you may use Tokenize with a regex containing a capturing group around the part you need to extract:
The Tokenize Method allows you to specify a regular expression to match on and that part of the string is parsed into separate columns (or rows). When using the Tokenize method, you want to match to the whole token, and if you have a marked group, only that part is returned.
The last sentence of the method description shows that if there is a capturing group, only that part will be returned rather than the whole match.
Thus, you may use
(?:^|[?&])token=([^&]*)
where token can be replaced with any key whose value you want to extract.
See the regex demo.
Details
(?:^|[?&]) - the start of a string, ? or & (if the string is just a plain key-value pair string, you may omit ? and use (?:^|&) or (?<![^&]))
token - the key
= - an equal sign
([^&]*) - Group 1 (this will get extracted): 0 or more chars other than & (if you do not want to extract empty values, replace * with + quantifier).
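For readers who want to try the pattern outside Alteryx, here is a minimal Java sketch of the same regex. The class name QueryValue and the method valueOf are hypothetical. The key is passed through Pattern.quote so that any regex metacharacters in it are treated literally, and percent-decoding is left out on purpose.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QueryValue {
    /** Returns the value of the given key in a key1=val1&key2=val2 string, if the key is present. */
    static Optional<String> valueOf(String query, String key) {
        // (?:^|[?&]) anchors the key to the start of the string or to a '?' or '&' separator
        Pattern p = Pattern.compile("(?:^|[?&])" + Pattern.quote(key) + "=([^&]*)");
        Matcher m = p.matcher(query);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(valueOf("key1=val1&token=abc123&key3=val3", "token")); // prints Optional[abc123]
    }
}
```

Because of the leading anchor, a key such as mytoken is not mistaken for token.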
I have a text field which stores a list of email addresses e.g: x#demo.com; a.x#demo.com. I have another text field which stores the exact value matched from the list of emails i.e. if /x#demo.com/i is in x#demo.com;a.x#demo.com then it should return x#demo.com.
The issue I am having is that if I have /a.x#demo.com/i, I will get x#demo.com instead of a.x#demo.com
I know of the regex expression /^x#demo.com$/i, but this means I can only have one email in my list of email addresses which won't help.
I have tried a couple of other regex expressions with no luck.
Any ideas on how I can achieve this?
You can use this slightly changed regex:
/(^|;)x#demo.com($|;)/i
It will match starting either at the beginning of the string or after a semicolon, and ending either at the end of the string or at a semicolon.
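With this pattern the match includes the delimiting semicolons. As a hedged sketch of the same boundary idea in Java, a capturing group can return just the address. The class and method names are hypothetical, Pattern.quote is used so that the dots in the address are literal rather than "any character", and the tolerance for spaces around the semicolon is an added assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EmailListMatcher {
    /** Returns the list entry equal to the given address (ignoring case), or null if there is none. */
    static String findExact(String list, String email) {
        Pattern p = Pattern.compile(
                "(?:^|;)\\s*(" + Pattern.quote(email) + ")\\s*(?=;|$)",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(list);
        // group(1) is the address only, without the separators or surrounding spaces
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String list = "x#demo.com; a.x#demo.com";
        System.out.println(findExact(list, "a.x#demo.com")); // prints a.x#demo.com
        System.out.println(findExact(list, "x#demo.com"));   // prints x#demo.com
    }
}
```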
Edit:
Small change: this uses lookbehind and lookahead, so you will get only the match you want:
(?<=^|;)x#demo.com(?=$|;)
Edit2:
To allow spaces around the semicolon and at the start and end, use this (#-quoted):
#"(?<=^\s*|;\s*)x#demo.com(?=\s*$|\s*;)"
or use double escaping:
"(?<=^\\s*|;\\s*)x#demo.com(?=\\s*$|\\s*;)"
I am trying to get a URL from a string of the following format:
RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH
I have already tried a few things, especially lookbehind/lookahead, which I had used successfully on another URL format (starts with https... and ends with .html; that was working).
But it seems I'm too stupid to figure out the regex for the kind of string mentioned above. I just want the URL part from https.... to the end of the random last name. Is this even possible?
Any ideas?
If you can guarantee that randomfirstname_randomlastname is all lowercase and RANDOMRUBBISH is all uppercase, you can use character classes [a-z] and [A-Z]. The language the regex is for will determine how to use these.
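This example works in JavaScript (the underscore is included in the character class, because [a-z] alone would stop the match at the underscore):
var str = "RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH";
// match[0] holds the URL; exec returns null when nothing matches
var match = /https:\/\/www\.my-url\.com\/[a-z_]*/.exec(str);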
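The same idea can be sketched in Java, under the same assumption that both name parts are lowercase letters and the surrounding rubbish is uppercase. Here the underscore between the two parts is spelled out explicitly. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrlExtractor {
    // Fixed host, then lowercase first name, an underscore, and lowercase last name
    private static final Pattern URL =
            Pattern.compile("https://www\\.my-url\\.com/[a-z]+_[a-z]+");

    /** Returns the embedded URL, or null when the input contains none. */
    static String extractUrl(String input) {
        Matcher m = URL.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String str = "RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH";
        System.out.println(extractUrl(str)); // prints https://www.my-url.com/randomfirstname_randomlastname
    }
}
```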
I'm quite bad with regex, and I'm trying to match some criteria.
This is a regex that should be embedded into the URL rule for a firewall, so that it will block any URL that is not like the ones in the list at the end.
This is what I'm currently using, but it's not working:
http://www.youtube.com/(*.*)list=UUFwtOm4N5djdcuTAlNIWJaQ
This is the example url (to be blocked):
http://www.youtube.com/watch?NR=1&feature=fvwp&v=P1b5VY_Bp_o&list=UUFwtOm4N5djdcuTAlNIWJaQ
I'm trying to make a regex that will successfully match when NR=1 or feature=fvwp
are NOT present. I assume I can do it like this: (?!^feature=fvwp$), but the v= and list=UUFwtOm4N5djdcuTAlNIWJaQ parameters are allowed.
Also, the v= value should be limited to alphanumeric characters (uppercase and lowercase) and a length of 11; I assume that is: /^[a-z0-9]{11}$/
How can I put all of that together so that it allows and matches only URLs like these, while excluding the criteria I explained above:
http://www.youtube.com/watch?v=4eK_RWpTgcc&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ
http://www.youtube.com/watch?v=TLRl85TJwZM&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ
http://www.youtube.com/watch?v=QEV9yqrpxkc&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ
Can you block based on matching by regex? If so, just use
(.*)www\.youtube\.com/watch\?NR=1&feature=fvwp and block whatever matches that.
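If the firewall can only allow on a match rather than block on one, the inverse can be sketched with lookaheads, assuming its regex engine supports them (this is not confirmed for any particular firewall). The Java class below is only a hypothetical harness for trying the pattern against the URLs from the question. It also assumes that the v= value may contain _ and - in addition to letters and digits, since the sample IDs contain underscores.

```java
import java.util.regex.Pattern;

final class YoutubeAllowList {
    private static final Pattern ALLOWED = Pattern.compile(
            "^http://www\\.youtube\\.com/watch\\?"
            // reject the URL if NR=1 or feature=fvwp appears as a parameter
            + "(?!(?:.*&)?(?:NR=1|feature=fvwp)(?:&|$))"
            // require a v= parameter with exactly 11 allowed characters
            + "(?=(?:.*&)?v=[A-Za-z0-9_-]{11}(?:&|$))"
            // require the playlist parameter
            + "(?:.*&)?list=UUFwtOm4N5djdcuTAlNIWJaQ(?:&.*)?$");

    /** Returns true when the whole URL satisfies the allow pattern. */
    static boolean isAllowed(String url) {
        return ALLOWED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(
                "http://www.youtube.com/watch?v=4eK_RWpTgcc&feature=BFa&list=UUFwtOm4N5djdcuTAlNIWJaQ")); // prints true
        System.out.println(isAllowed(
                "http://www.youtube.com/watch?NR=1&feature=fvwp&v=P1b5VY_Bp_o&list=UUFwtOm4N5djdcuTAlNIWJaQ")); // prints false
    }
}
```

Each (?:.*&)? prefix ties a parameter name to the start of the query or to a preceding &, so that, for example, XNR=1 is not treated as NR=1.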