Regex to allow certain special characters

Currently I have the following regex, which I use to validate the name of a company/industry, and it's working fine:
/(?=[a-zA-Z0-9-]{5,25}$)^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$/
The above regex doesn't support special characters like & - . _ which are valid in my case.
I came up with this, but it wasn't working as expected:
/(?=[a-zA-Z0-9-\&\_\.]{5,25}$)^[a-zA-Z0-9\&\_\.]+(-[a-zA-Z0-9\&\_\.]+)*$/
Can someone point out where my regex goes wrong? A short explanation of the regex would also be greatly appreciated.
Thanks

I don't think you have to escape & as \&, and the same goes for _:
/(?=[a-zA-Z0-9-&_\.]{5,25}$)^[a-zA-Z0-9&_\.]+(-[a-zA-Z0-9&_\.]+)*$/

If I'm not wrong, you don't actually have to put a backslash before every special character inside a character class. The exceptions are the backslash itself, ], ^ (when it comes first) and - (when it sits between two characters and could be read as a range). So your regular expression would be
/(?=[a-zA-Z0-9-&_.]{5,25}$)^[a-zA-Z0-9&_.]+(-[a-zA-Z0-9&_.]+)*$/
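
As a short explanation of how the pattern works: the lookahead (?=[...]{5,25}$) checks that the whole name is 5 to 25 allowed characters long, and the rest requires groups of non-hyphen characters separated by single hyphens, so a name cannot start or end with a hyphen or contain two in a row. Here is a minimal sketch to check that behaviour, assuming Python's re module (the question doesn't name a language). The helper is_valid_name and the sample names are made up for illustration.

```python
import re

# Pattern copied from the answer above. The hyphen after 0-9 is read as a
# literal because it directly follows a completed range.
NAME_RE = re.compile(
    r"(?=[a-zA-Z0-9-&_.]{5,25}$)^[a-zA-Z0-9&_.]+(-[a-zA-Z0-9&_.]+)*$"
)


def is_valid_name(name: str) -> bool:
    """Return True if `name` passes the company-name pattern."""
    return NAME_RE.match(name) is not None


# Hypothetical sample inputs, not taken from the question.
for sample in ["Acme-Co", "A&B_C.D", "Acme", "-Acme1", "Acme--Co"]:
    print(sample, is_valid_name(sample))
```

The first two samples should pass. The others should fail because they are too short, start with a hyphen, or contain a double hyphen.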

Related

Brackets within a Regex string

I'm trying to use a regular expression to match on a string. Brackets are special characters within regex, and I'm unsure how I'd go about including them in my regex.
To provide more context, I want to find a string such as test[test]
My regex currently looks like this: ^*test[test]. My expression is built out much more than this, but this example is enough to understand the problem.
How can I search for brackets in my string without triggering a character class? I need to use a regex, so please don't recommend switching to something else.
You can escape a character with a backslash, so \[ matches a literal [.
I can highly recommend https://regex101.com/ to test your regex without having to code it.
Try: ^.*test\[test\] - This means {start of line}, {anything}, "test[test]".
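
To see the difference between the escaped and unescaped forms, here is a small sketch assuming Python's re module (the question doesn't say which engine is used). The sample text is made up. re.escape is a standard-library helper that adds the backslashes for you when the string to find is fully literal.

```python
import re

# Hypothetical sample text for illustration.
text = "prefix test[test]"

# Unescaped, [test] is a character class matching ONE of t, e or s,
# so this looks for "testt", "teste" or "tests" and finds nothing here.
unescaped = re.search(r"test[test]", text)

# With escaped brackets, the brackets are matched literally.
escaped = re.search(r"^.*test\[test\]", text)

# re.escape builds the escaped pattern from a literal string.
literal = re.escape("test[test]")

print(unescaped)
print(escaped.group(0))
print(literal)
```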

Regex to handle a dynamic set of delimiters

I'm writing a parser and need to handle escaped characters via regex, if possible.
Given a sample string with an escape character of '\' and a delimiter of '&':
TestSection1&TestSection2\\&TestSection3\&TestSection4
I would like to be able to split on a valid '&', that is to say not an & that is escaped. So the above example would come out something like this:
TestSection1
TestSection2\
TestSection3\&TestSection4
I've tried quite a few regexes that I muddled together, but no luck. Does anyone have any insight into how one can accomplish this, or whether it's even possible?
Thanks
You can use this double lookbehind based regex:
(.+?)(?:(?<!(?<!\\)\\)&|$)
(?:(?<!(?<!\\)\\)&|$) matches either an & that is not preceded by a single (unescaped) \, or the end of the string.
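
A minimal sketch of applying this pattern, assuming Python's re module (the question doesn't name a language, and the engine has to support lookbehind). The names SPLIT_RE, sample and parts are only for illustration.

```python
import re

# Pattern from the answer: capture lazily up to an '&' that is not preceded
# by a lone backslash, or up to the end of the string.
SPLIT_RE = re.compile(r"(.+?)(?:(?<!(?<!\\)\\)&|$)")

# The sample string from the question, written as a raw string so the
# backslashes are kept as-is.
sample = r"TestSection1&TestSection2\\&TestSection3\&TestSection4"

# With one capture group, findall returns the captured pieces.
parts = SPLIT_RE.findall(sample)

for part in parts:
    print(part)
```

Note that the pieces still contain their escape sequences (the second piece ends in two backslashes), so unescaping would be a separate step. The two-level lookbehind also only looks two characters back, so runs of three or more backslashes before an & are not handled by this pattern.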

Regex expression to match all char inside

I'm trying to mass update a web app, I need to create a regex that matches:
lang::id(ALLCHARACTERS]
Can someone assist me with this? I'm not good with regex. I'm pretty sure it can start like:
lang\:\:\(WHAT GOES HERE\]
Something like this would work:
lang::id\([^]]*]
This will match a literal lang::id(, followed by zero or more of any character other than ], followed by a literal ].
Note that the only character that really needs to be escaped is the open parenthesis.
lang::id\(.*]
The . means any single character, and * repeats it zero to N times. Make sure to escape the ( with \, since it is a special character in regex; otherwise the regex engine will probably complain about unbalanced parentheses.
If you don't want it to match all characters, you can put a narrower pattern in place of the .*. This way you can break the regex down into smaller chunks, which makes complex rules easier to understand and develop.
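
The two suggestions behave differently when a line contains more than one occurrence, because .* is greedy and runs to the last ] on the line. Here is a small sketch assuming Python's re module, with a made-up sample line:

```python
import re

# Hypothetical sample text containing two occurrences on one line.
text = "lang::id(first] and lang::id(second]"

# [^]]* stops at the first closing bracket after each occurrence.
negated = re.findall(r"lang::id\([^]]*]", text)

# .* is greedy and runs on to the last closing bracket on the line.
greedy = re.findall(r"lang::id\(.*]", text)

print(negated)
print(greedy)
```

Here the negated class finds the two occurrences separately, while the greedy version returns one match spanning both.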

Trouble rejecting a specific character in my RegEx

I'm running the following regular expression to check a username:
^(?=.*[a-zA-Z0-9])\w{2,25}\s*$
It works fine, but now I need to amend it to reject any underscores (_). I've tried wedging ^(?!_)$ in there, but it doesn't seem to work for me, in that it only checks at the beginning or the end.
I know a little about regular expressions but I'm hazy on all the classes. I've found a great resource for it at http://www.regular-expressions.info/reference.html
Thanks for the help, folks.
This should work for you:
^[a-zA-Z][a-zA-Z0-9.\-]{1,24}\s*$
What this regex will validate:
The first character is a letter
The input contains only alphanumeric characters (I also added . and -)
If you don't want . or -, just remove them from the character class
The input is 2-25 characters long
Well, you could always replace the \w with an explicit character class that excludes _.
^(?=.*[a-zA-Z0-9])[A-Za-z0-9]{2,25}\s*$
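
A quick way to check the explicit character class, sketched in Python's re module as an assumption (the question doesn't state the language). The helper is_valid_username and the sample names are hypothetical.

```python
import re

# Pattern from the answer above: \w replaced by a class without underscore.
USERNAME_RE = re.compile(r"^(?=.*[a-zA-Z0-9])[A-Za-z0-9]{2,25}\s*$")


def is_valid_username(name: str) -> bool:
    """Return True if `name` is 2-25 letters/digits, optionally followed by whitespace."""
    return USERNAME_RE.match(name) is not None


# Hypothetical sample inputs.
for sample in ["johndoe", "john_doe", "j", "john "]:
    print(repr(sample), is_valid_username(sample))
```

Since the class now only allows letters and digits, the lookahead no longer adds anything and could probably be dropped.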

RegExp extraction

Here's the input string:
loadMedia('mediacontainer1', 'http://www.something.com/videos/JohnsAwesomeVideo.flv', 'http://www.something.com/videos/JohnsAwesomeCaption.xml', '/videos/video-splash-image.gif)
With this RegExp: \'.+.xml\'
... we get this:
'mediacontainer1', 'http://www.something.com/videos/JohnsAwesomeVideo.flv', 'http://www.something.com/videos/JohnsAwesomeCaption.xml'
... but I want to extract only this:
http://www.something.com/videos/JohnsAwesomeCaption.xml
Any suggestions? I'm sure this problem has been asked before, but it's difficult to search for. I'll be happy to Accept a solution.
Thanks!
If you want to get everything within quotes that starts with http:
(?<=')http:[^']+(?=')
If you only want those ending with .xml
(?<=')http:[^']+\.xml(?=')
It doesn't select the quotation marks (as you asked)
It's fast!
Fair warning: it only works if the regex engine you're using can handle lookbehind
Knowing the language would be helpful. Basically, you are having a problem because the + quantifier is greedy, meaning it will match the largest part of the string that it can. You need to use a non-greedy quantifier, which will match as little as possible.
We will need to know the language you're using to know what the syntax for the non-greedy quantifier should be.
Here is a Perl recipe. As a side note, instead of .+, you probably want to match [^.]+.xml.
\'.+?.xml\'
should work if your language supports perl-like regexes.
This should work (tested in JavaScript, but I'm pretty sure it would work in most cases)
'[^']+?\.xml'
it looks for these rules
starts with '
is followed by anything but '
ends in .xml'
you can demo it at http://RegExr.com?2tp6q
In .NET this regex works for me:
\'[\w:/.]+\.xml\'
breaking it down:
a ' character
followed by a word character or ':' or '/' or '.' any number of times (which matches the url bit)
followed by '.xml' (which differentiates the sought string from the other urls which it will match without this)
followed by another ' character
I tested it here
Edit
I missed that you don't want the quotes in the result. In that case, as has been pointed out, you need to use lookbehind and lookahead to include the quotes in the search but not in the answer. Again in .NET:
(?<=')[\w:/.]+\.xml(?=')
but I think the best solution is a combination of those offered already:
(?<=')[^']+\.xml(?=')
which seems the simplest to read, at least to me.
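
To compare the suggestions on the input from the question, here is a minimal sketch assuming Python's re module (the question doesn't name a language; Python supports the fixed-width lookbehind used here). The variable names are only for illustration.

```python
import re

# The input string from the question, split across lines for readability.
source = (
    "loadMedia('mediacontainer1', "
    "'http://www.something.com/videos/JohnsAwesomeVideo.flv', "
    "'http://www.something.com/videos/JohnsAwesomeCaption.xml', "
    "'/videos/video-splash-image.gif)"
)

# Original greedy pattern and its lazy variant.
greedy = re.search(r"'.+.xml'", source).group(0)
lazy = re.search(r"'.+?.xml'", source).group(0)

# Lookbehind/lookahead with a negated class: the match cannot cross a quote.
exact = re.search(r"(?<=')[^']+\.xml(?=')", source).group(0)

print(greedy == lazy)
print(exact)
```

On this particular input the lazy quantifier alone returns the same span as the greedy one, because the search still starts at the first quote. It is the [^'] class that keeps the match inside a single quoted value.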