Regex to match a ? or the Unicode %3d

I have an expression which matches the question mark in a URL query string, and I need to extend it to handle the case where the URL I am trying to read contains the Unicode equivalent of the question mark, %3d.
The expression is:
var regexS = "[\\?&]" + name + "=([^&#]*)";
From what very little I know of regex, I thought this might work:
var regexS = "[\\?&]|[\\%3d&]" + name + "=([^&#]*)";
Thanks for the help.

Assuming JavaScript for the purposes of string escaping.
In "[\\?&]|[\\%3d&]" + name + "=([^&#]*)", note that:
concatenation (abc) has priority over alternation (a|b|c), so the pattern matches either a lone ? or & on its own, or [\\%3d&] followed by the name and value; and
[\\%3d&] means "percent or 3 or d or ampersand" (it is a character class), not the sequence %3d.
The escaped form of ? is %3F, not %3D; %3D means =. See Wikipedia: percent-encoding.
The ampersand in the first character class is there to match &q2= in www.example.com?q1=v1&q2=v2. Perhaps you want to allow an escaped ampersand as well; its escaped form is %26.
You probably mean "([\\?&]|\\%3f|\\%26)" + name + "=([^&#]*)" instead. Percent-encoded hex digits may be upper or lower case, so build the RegExp with the "i" flag if the URL might contain %3F.
Also note that ? has no special meaning inside a character class and doesn't need to be escaped, and % never needs escaping: "([?&]|%3f|%26)" + name + "=([^&#]*)"
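As a rough sketch of how the corrected pattern could be used, the snippet below wraps it in a small helper. The function name getParam and the escaping of the parameter name are assumptions added for illustration, not part of the original question; the "i" flag is there so that both %3f and %3F are accepted.

```javascript
// Hypothetical helper: reads a query parameter whose separator may be a
// literal ? or &, or the percent-encoded forms %3F or %26.
function getParam(name, url) {
  // Escape regex metacharacters in the name so it is matched literally.
  const safeName = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Group the alternatives so the alternation does not swallow the rest
  // of the pattern; "i" makes %3f and %3F equivalent.
  const re = new RegExp("(?:[?&]|%3f|%26)" + safeName + "=([^&#]*)", "i");
  const match = re.exec(url);
  return match ? match[1] : null;
}

console.log(getParam("q2", "www.example.com?q1=v1&q2=v2"));   // v2
console.log(getParam("q1", "www.example.com%3Fq1=v1&q2=v2")); // v1
```

The non-capturing group (?:...) keeps the value in capture group 1, which matches how the original expression was presumably being read.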


Swift regex for characters and empty spaces

I'm trying to write a regular expression that only allows letters and spaces for a full name field, e.g. Mr Bob Smith.
What I've tried so far:
let textRegex = "[A-Za-z+\\s]"
let textRegex = "[A-Za-z ]"
let textRegex = "[A-Za-z+ ]"
let textRegex = "([A-Za-z ])"
It doesn't appear to be working.
Thanks
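Your regular expression isn't working because the + symbol is misplaced: inside the brackets it is a literal plus sign, and without a quantifier after the class the pattern matches only a single character.
This one will work:
([A-Za-z ]+)
I don't know how Swift handles regex, however. Keep in mind that if you strictly want plain spaces only, it is better to put a literal " " in the class instead of \s, which also matches tabs, newlines, and other whitespace characters.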
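To illustrate the difference the quantifier makes, here is a small sketch in JavaScript rather than Swift (the character-class syntax is the same in both). The names NAME_PATTERN and isValidName are made up for the example, and the ^ and $ anchors are an added assumption so that the whole field, not just part of it, has to match.

```javascript
// Hypothetical validator for a full-name field: letters and plain spaces only.
// The + after the class requires one or more allowed characters, and the
// anchors force the entire string to consist of them.
const NAME_PATTERN = /^[A-Za-z ]+$/;

function isValidName(text) {
  return NAME_PATTERN.test(text);
}

console.log(isValidName("Mr Bob Smith")); // true
console.log(isValidName("Bob_Smith42"));  // false
console.log(isValidName(""));             // false
```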

AS3 Regex and the File.separator

I am on a Windows machine, and I am looking for a way to use Regex to count the number of occurrences of the File.separator characters in a path. Below is my code, and it outputs 0 every time.
var dummyPath:String = "C:" + File.separator + "A" + File.separator + "B.jpg";
var pattern:RegExp = new RegExp(File.separator,"g");
trace(dummyPath.match(pattern).length);
//Outputs 0
I'm not sure what else to do.
I wouldn't use a regex in a case like this: they're more confusing to work with (and I think less efficient as well) than ordinary string operations, and you aren't doing anything here that's complicated enough to make up for the difference.
Instead, I would just go about it this way:
var dummyPath:String = "C:" + File.separator + "A" + File.separator + "B.jpg";
trace(dummyPath.split(File.separator).length - 1);
As for what you're running into, though, remember that operating systems' file separators are generally either / or \. You say you're running this on Windows, which means you're passing "\" into the constructor for the regex. \ begins escape sequences in regexes the same way it does in strings.
So essentially you're not describing a regex that looks for instances of "\" on a Windows machine; you're describing a regex that starts an escape sequence and doesn't finish it. To use a regex in this case, you would need to escape \ with another \:
// This is technically untested, but the principle is the same.
var pattern:RegExp = new RegExp(File.separator.replace("\\", "\\\\"), "g");
It's not matching because the file separator you are using is a metacharacter: the escape character \.
The regex engine expects metacharacters that are used as literals to be escaped.
Try \\, which would be "\\\\" as a double-quoted string.
If you run into a forward slash separator, just escape it too; it does no harm.
So, prepend an escape to the variable as a string, Sep = "\\" + Sep; or something similar.
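Both approaches from the answers above can be sketched side by side. The example is in JavaScript rather than AS3, and the separator is hard-coded to a backslash as a stand-in for File.separator on Windows; escapeRegExp and countSeparators are hypothetical helper names.

```javascript
// Escape every regex metacharacter so the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Count occurrences of a separator using a regex built from the escaped text.
function countSeparators(path, separator) {
  const matches = path.match(new RegExp(escapeRegExp(separator), "g"));
  // In JavaScript, match() with the g flag returns null when nothing matches.
  return matches ? matches.length : 0;
}

const separator = "\\"; // assumption: stands in for File.separator on Windows
const dummyPath = "C:" + separator + "A" + separator + "B.jpg";

console.log(countSeparators(dummyPath, separator));  // 2
console.log(dummyPath.split(separator).length - 1);  // 2
```

Escaping every metacharacter, not just the backslash, means the same helper also works if the separator turns out to be / or some other character.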

Using regular expression to strip out every character that's not in a list

I want to strip out every character that isn't in a list of valid characters.
In this example, I want to strip out everything that is neither alphanumeric nor the accented e character é (Chr(233)):
Line = rereplace(Line,'[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789' + Chr(233) + ']','','all')
I think I just need a 'not' symbol or something.
You can use ranges as shortcuts for most of that:
Line = rereplace(Line,'[^A-Za-z0-9' + Chr(233) + ']','','all')
The ^ at the start of the bracket expression means 'not these characters'.
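The same negated character class can be tried out in JavaScript; this is only an illustrative sketch, with stripInvalid as a made-up function name and \u00e9 standing for the character that Chr(233) produces.

```javascript
// Remove every character that is NOT a letter, a digit, or é (U+00E9).
// The leading ^ negates the class; the g flag replaces all occurrences.
function stripInvalid(line) {
  return line.replace(/[^A-Za-z0-9\u00e9]/g, "");
}

console.log(stripInvalid("Caf\u00e9 #1!")); // Café1
```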

What delimiter could be used to parse string to list of regexps?

I need to parse an input string that is actually an array of regexps separated by some delimiter.
The output is a list of strings, where each string is one regexp from the input.
The question is which delimiter I should use to be sure that I receive the correct values.
A regexp string can seemingly contain any set of characters, so I need to decide what would be best to use as the delimiter.
Thanks.
Building on @Theox's answer: a triple + is not valid in a regular expression, so, assuming you expect the values to be valid regular expressions, it could be used as a delimiter.
regex1+++regex2+++regex3
If a regular expression ended with a + or a double +, you'd have 4 or 5 + characters in a row. But, since a regular expression cannot start with a +, you'd know that the last three + characters represent the delimiter. For example,
a+++++b
would represent two regular expressions: a++ and b.
Note that the double + is valid in a regular expression, with the second + being the possessive quantifier, so we cannot use only two + characters as the delimiter.
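The "last three + characters are the delimiter" rule can be sketched with a split on exactly three plus signs that are not followed by another plus. This is an illustrative JavaScript sketch; splitRegexList is a hypothetical name, and the pieces are returned as plain strings without being compiled, since whether a++ is accepted depends on the regex flavor that will eventually consume them.

```javascript
// Split a "+++"-delimited list of regex sources.
// \+{3} matches three plus signs; (?!\+) rejects the match unless those
// three are the LAST plus signs in a run, so trailing quantifiers such as
// a+ or a++ stay attached to the expression on the left.
function splitRegexList(input) {
  return input.split(/\+{3}(?!\+)/);
}

console.log(splitRegexList("regex1+++regex2+++regex3")); // [ 'regex1', 'regex2', 'regex3' ]
console.log(splitRegexList("a+++++b"));                  // [ 'a++', 'b' ]
```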
You say it's an input string, and I assume you are able to manipulate it.
Why don't you use a doubled character as the delimiter? For example, I don't think you will use a double (or triple) semicolon in your regex.
regex1;;regex2;;regex3
Then
regexString.split(";;");
I think you could use a double + as your delimiter.
It seems impossible to have a double + in a regexp, as + is a quantifier and must be escaped to match the character itself.
So regexp1++regexp2++regexp3 will work fine.
Edit: After seeing Rangi Keen's comment: two + characters are not enough, as ++ is still valid, but three + (or more) should do it!
regexp1+++regexp2+++regexp3 will solve your problem.

Problem keeping the minus sign (-) from being replaced with a blank using regex

I am using this regular expression to replace some characters with "".
I used it as
query=query.replace(/[^a-zA-Z 0-9 * ? : . + - ^ "" _]+/g,'');
When my query is +White+Diamond, I get +White+Diamond as the result, but when the query is -White+diamond I get White+diamond. The - is replaced by "", which I don't want.
Please tell me what the problem is.
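Inside a character class, - means "from ... to ...". In your pattern the - sits between two spaces, so it is read as the range from space to space and the literal - is never part of the class. Escape your - with a backslash: \-.
What SteeveDroz said:
query=query.replace(/[^a-zA-Z0-9*?:.+\-^"_ ]+/g,'');
I'm assuming you want to keep spaces as well. If not, remove the final space from the character class.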
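A quick way to see the difference is to run both patterns against the same input. The variable and function names below are made up for the comparison; the two regular expressions are the ones from the question and the answer.

```javascript
// Original pattern: " - " inside the class is a range from space to space,
// so a literal hyphen is not in the allowed set and gets stripped.
const originalPattern = /[^a-zA-Z 0-9 * ? : . + - ^ "" _]+/g;

// Corrected pattern: \- is a literal hyphen, so it is kept.
const fixedPattern = /[^a-zA-Z0-9*?:.+\-^"_ ]+/g;

function clean(query, pattern) {
  return query.replace(pattern, "");
}

console.log(clean("-White+diamond", originalPattern)); // White+diamond
console.log(clean("-White+diamond", fixedPattern));    // -White+diamond
```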