How to match all characters and check only desirable length?

How to match all characters and check only desirable length? - regex

I want to create a simple regex check to simply check the length of the text, nothing else. Closest I got was [^.]{1,6} however this will not allow . characters.
How can I make sure that only thing that regex expression does is checks length?
Example: http://regex101.com/r/eJ6pK5#pcre

Use this regex:
[\s\S]{1,6}
The above will work with multiline.
Otherwise . represents any character other than new line.
So, .{1,6} will also do the trick.

Related

Regular expression to check strings containing a set of words separated by a delimiter

As the title says, I'm trying to build up a regular expression that can recognize strings with this format:
word!!cat!!DOG!! ... Phone!!home!!
where !! is used as a delimiter. Each word must have a length between 1 and 5 characters. Empty words are not allowed, i.e. no strings like !!,!!!! etc.
A word can only contain alphabetical characters between a and z (case insensitive). After each word I expect to find the special delimiter !!.
I came up with the solution below but since I need to add other controls (e.g. words can contain spaces) I would like to know if I'm on the right way.
(([a-zA-Z]{1,5})([!]{2}))+
Also note that empty strings are not allowed, hence the use of +
Help and advices are very welcome since I just started learning how to build regular expressions. I run some tests using http://regexr.com/ and it seems to be okay but I want to be sure. Thank you!
Examples that shouldn't match:
a!!b!!aaaaaa!!
a123!!b!!c!!
aAaa!!bbb
aAaa!!bbb!

Splitting the string and using the values between the !!
It depends on what you want to do with the regular expression. If you want to match the values between the !!, here are two ways:
Matching with groups
([^!]+)!!
[^!]+ requires at least 1 character other than !
!! instead of [!]{2} because it is the same but much more readable
Matching with lookahead
If you only want to match the actual word (and not the two !), you can do this by using a positive lookahead:
[^!]+(?=!!)
(?=) is a positive lookahead. It requires everything inside, i.e. here !!, to be directly after the previous match. It however won't be in the resulting match.
Here is a live example.
Validating the string
If you however want to check the validity of the whole string, then you need something like this:
^([^!]+!!)+$
^ start of the string
$ end of the string
It requires the whole string to contain only ([^!]+!!) one or more than one times.
If [^!] does not fit your requirements, you can of course replace it with [a-zA-Z] or similar.

Regex for string that contains characters in unspecified order

When having strings like
helloworld
worldhello
ollehdlrow
Is there a regex that can match all those cases? So, basically a pattern that will match all strings that contain all characters, in unspecified order.
I tried using
/[helloworld]{10}/
but this doesn't work for obvious reasons, as it will also match eeeeeeeeee.

You definitely don't want to use regular expressions for this.
In order to check if a character exists in the string, in your case, you would have to use a positive lookahead. It would look something like this (?=a) to check for the character a. Thats fine. If we want to check for a string containing the character a and b we can do /^(?=.*a)(?=.*b)/. Problems arise if we want to check for multiple as.
View this example: http://regex101.com/r/iV2jC8
As you can see, the regex has been "told" to look two times for the letter 'a'. However, the first case still matches. This is because the engine does not save the position where it initially found the first 'a', and thus the next assertion finds the very same a. This is the case in all three of the examples. So in reality, none of them are really being validated.
You would have to do something like this: http://regex101.com/r/cR8eR4
Which as you probably can imagine will quickly get out of hand with larger patterns.
I hope this helps, best of luck.

What is wrong with my simple regex that accepts empty strings and apartment numbers?

So I wanted to limit a textbox which contains an apartment number which is optional.
Here is the regex in question:
([0-9]{1,4}[A-Z]?)|([A-Z])|(^$)
Simple enough eh?
I'm using these tools to test my regex:
Regex Analyzer
Regex Validator
Here are the expected results:
Valid
"1234A"
"Z"
"(Empty string)"
Invalid
"A1234"
"fhfdsahds527523832dvhsfdg"
Obviously if I'm here, the invalid ones are accepted by the regex. The goal of this regex is accept either 1 to 4 numbers with an optional letter, or a single letter or an empty string.
I just can't seem to figure out what's not working, I mean it is a simple enough regex we have here. I'm probably missing something as I'm not very good with regexes, but this syntax seems ok to my eyes. Hopefully someone here can point to my error.
Thanks for all help, it is greatly appreciated.

You need to use the ^ and $ anchors for your first two options as well. Also you can include the second option into the first one (which immediately matches the third variant as well):
^[0-9]{0,4}[A-Z]?$
Without the anchors your regular expression matches because it will just pick a single letter from anywhere within your string.

Depending on the language, you can also use a negative look ahead.
^[0-9]{0,4}[A-Za-z](?!.*[0-9])
Breakdown:
^[0-9]{0,4} = This look for any number 0 through 4 times at the beginning of the string
[A-Za-z] = This look for any characters (Both cases)
(?!.*[0-9]) = This will only allow the letters if there are no numbers anywhere after the letter.
I haven't quite figured out how to validate against a null character, but that might be easier done using tools from whatever language you are using. Something along this logic:
if String Doesn't equal $null Then check the Rexex
Something along those lines, just adjusted for however you would do it in your language.
I used RegEx Skinner to validate the answers.
Edit: Fixed error from comments

How to make a regular expression looking for a list of extensions separated by a space

I want to be able to take a string of text from the user that should be formated like this:
.ext1 .ext2 .ext3 ...
Basically, I am looking for a dot, a string of alphanumeric characters of any length a space, and rinse and repeat. I am a little confused on how to say " i need a period, string of characters and a space". But also, the last extension could either be followed by nothing, or a space, or a series of spaces. Also, I guess in between extensions could be followed by any number of spaces?
EDIT: I made it clearer what I was looking for.
Thanks!

Try this:
^(?:\.[A-Za-z0-9]+ +)*\.[A-Za-z0-9]+ *$
(Rubular)
In a Java string literal you need to escape the backslashes:
"^(?:\\.[A-Za-z0-9]+ +)*\\.[A-Za-z0-9]+ *$"

(\.\w+)\s* Match this and get your results.
^((\.\w+)\s*)*$ Check this and if it's true, your String is exactly what you want.
For the last pattern thing, you can't (AFAIK) do both getting all extensions (separated) and checking that the last is followed by other things. Either you check your string, or you extract the extensions from it.

I'd start with something like: ^.[a-z0-9]+([\t\n\v ]+.[a-z0-9]+)*$

Regular Expression to List accepted words

I need a regular expression to list accepted Version Numbers. ie. Say I wanted to accept "V1.00" and "V1.02". I've tried this "(V1.00)|(V1.01)" which almost works but then if I input "V1.002" (Which is likely due to the weird version numbers I am working with) I still get a match. I need to match the exact strings.
Can anyone help?

The reason you're getting a match on "V1.002" is because it is seeing the substring "V1.00", which is part of your regex. You need to specify that there is nothing more to match. So, you could do this:
^(V1\.00|V1\.01)$
A more compact way of getting the same result would be:
^(V1\.0[01])$

Do this:
^(V1\.00|V1\.01)$
(. needs to be escaped, ^ means must be on the beginning of the text and $ must be on the end of the text)

I would use the '^' and '$' to mark the beginning and end of the string, like this:
^(V1\.00|V1\.01)$
That way the entire string must match the regex.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

How to match all characters and check only desirable length? - regex

I want to create a simple regex check to simply check the length of the text, nothing else. Closest I got was [^.]{1,6} however this will not allow . characters. How can I make sure that only thing that regex expression does is checks length? Example: http://regex101.com/r/eJ6pK5#pcre

Use this regex: [\s\S]{1,6} The above will work with multiline. Otherwise . represents any character other than new line. So, .{1,6} will also do the trick.

Related

Regular expression to check strings containing a set of words separated by a delimiter

Regex for string that contains characters in unspecified order

What is wrong with my simple regex that accepts empty strings and apartment numbers?

How to make a regular expression looking for a list of extensions separated by a space

Regular Expression to List accepted words

Categories

Resources