Match exact string and random string length in between - regex

I hate regex and I really can't get my head around it properly. I'm trying to match the following example:
fwb fcb"><a href="https://www.facebook.com/random.length?
while random.length can be any word with upper/lowercase letters, a dot or a number. And it ends with the ? so the question mark indicates the end.
I came as far as:
/fwb fcb"><a href="https:\/\/www.facebook.com\/ missing bit ?/g
Any help?

[a-zA-Z0-9\.]+\? should do the trick.
a-z matches all lowercase letters.
A-Z matches all uppercase letters.
0-9 matches all digits.
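Inside a character class the dot is already literal, so the backslash before it is optional; outside a class it must be escaped, because an unescaped dot matches any character.
+ means one or more of the preceding item, so the length can be anything from 1 upwards.
\? is an escaped question mark, because a bare ? is a quantifier.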

The missing part could be (\w|\.)+ if underscores were accepted. Otherwise, as in your case, you have to list all the allowed characters: [A-Za-z0-9\.]+. Pay attention, because some characters in your regex need to be escaped (. and ? for example).
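For what it's worth, here is a minimal Python sketch of the completed pattern. The sample HTML and the name PROFILE_LINK are made up for illustration, and it assumes the page source is available as a plain string:

```python
import re

# Literal prefix from the question with the dots escaped, then one or more
# letters, digits or dots (captured), then a literal question mark.
PROFILE_LINK = re.compile(
    r'fwb fcb"><a href="https://www\.facebook\.com/([A-Za-z0-9.]+)\?'
)

# Hypothetical sample input, not taken from a real page.
html = '<div class="fwb fcb"><a href="https://www.facebook.com/John.Doe99?fref=nf">John</a>'

match = PROFILE_LINK.search(html)
name = match.group(1) if match else None
print(name)
```

The capture group is optional; it is only there so the variable part can be pulled out on its own.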

Related

Regex that only allows empty string, letters, numbers or spaces?

Need help coming up with a regex that only allows numbers, letters, empty string, or spaces.
^[_A-z0-9]*((-|\s)*[_A-z0-9])*$
This one is the closest I've found but it allows underscores and hyphen.
Only letters, numbers, space, or empty string?
Then 1 character class will do.
^[A-Za-z0-9 ]*$
^ : start of the string or line (depending on the flag)
[A-Za-z0-9 ]* : zero or more upper-case or lower-case letters, or digits, or spaces.
$ : end of the string or line (depending on the flag)
The A-z range contains more than just letters.
You can see that in the ASCII table.
And \s for whitespace also includes tabs and line breaks.
But if you also want those, then just use that instead of the space.
^[A-Za-z0-9\s]*$
Also, depending on the regex engine/dialect that your language/tool uses, you could use \p{L} for any Unicode letter, since [A-Za-z] only includes the plain ASCII letters.
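A quick way to try the single character class, sketched in Python (the helper name is_valid is just for illustration). It uses fullmatch, which anchors at both ends, so ^ and $ are left out:

```python
import re

# Letters, digits and the space character only; * also accepts the empty string.
ALLOWED = re.compile(r'[A-Za-z0-9 ]*')


def is_valid(text: str) -> bool:
    """Return True if text contains only letters, digits and spaces."""
    return ALLOWED.fullmatch(text) is not None


for sample in ["", "abc 123", "a_b", "a-b", "a\tb"]:
    print(repr(sample), is_valid(sample))
```

Only the first two samples are accepted; the underscore, the hyphen and the tab are rejected.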
Your regex is too complicated for what you need.
The first part is fine: you are allowing letters and numbers, and you could simply add the space character to it.
Then, if you use the * quantifier, which means zero or more, you take care of your empty-string problem.
/^[a-z0-9 ]*$/gmi
Notice that I'm not using A-z as you were, because that range covers every character from A (octal 101, decimal 65) to z (octal 172, decimal 122). It therefore also matches the characters in between (octal 133 to 140: [ \ ] ^ _ `), which are neither digits nor letters. I've used a-z instead, which allows lowercase letters, together with the i flag, which makes the match case-insensitive.
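If you want to see those extra characters for yourself, this small Python sketch lists everything between Z and a and checks it against both ranges (variable names are illustrative):

```python
import re

# Every code point strictly between 'Z' (90) and 'a' (97).
between = [chr(code) for code in range(ord('Z') + 1, ord('a'))]
print(between)  # ['[', '\\', ']', '^', '_', '`']

wide = re.compile(r'[A-z]')
narrow = re.compile(r'[a-z]', re.IGNORECASE)

# The A-z range accepts all six of them; a-z with the i flag accepts none.
wide_hits = [ch for ch in between if wide.fullmatch(ch)]
narrow_hits = [ch for ch in between if narrow.fullmatch(ch)]
print(len(wide_hits), len(narrow_hits))  # 6 0
```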
Matching only certain characters is equivalent to not matching any other character, so you could use the regex r = /[^a-z\d ]/i to determine if the string contains any character other than the ones permitted. In Ruby that would be implemented as follows.
"aBc d01e e$9" !~ r #=> false
"aBc d01e ex9" !~ r #=> true
In this situation there may not be much to choose between this approach and attempting to match /\A[a-z\d ]*\z/i (with * rather than + so that the empty string is still accepted), but in other situations the use of a negative match can simplify the regex considerably.
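The same negative-match idea, sketched in Python rather than Ruby (the names FORBIDDEN and only_permitted are made up for this example):

```python
import re

# Matches any single character that is NOT a letter, a digit or a space.
FORBIDDEN = re.compile(r'[^a-z\d ]', re.IGNORECASE)


def only_permitted(text: str) -> bool:
    """True when no forbidden character occurs anywhere in text."""
    return FORBIDDEN.search(text) is None


print(only_permitted("aBc d01e e$9"))  # False
print(only_permitted("aBc d01e ex9"))  # True
print(only_permitted(""))              # True
```

Note that the empty string passes, since it contains no forbidden character at all.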

Reg-ex with different symbols

I'm using the following regex, which is working great, but the only problem is
that you cannot mix symbols like
aaa-bbbb\ccc
It should always have the same separator,
like aaa-bbb-cccc
"^(?:(?:-?[A-z0-9]+)*|(?:_?[A-z0-9]+)*|(?:\/?[A-z0-9]+/?)*)\s*$"
How can I change it ?
The value may contain:
hyphen '-',
underscore '_',
slash '/'
First off, A-z gives you too wide a range. In ASCII (and Unicode) there are characters between uppercase 'Z' and lowercase 'a' that are not letters or numbers. Use A-Za-z for letters. The escape sequence \w is a close alternative, but it is not equivalent: it also matches digits and the underscore.
Also, it looks like you know you'll always have three sections, so the optional ? quantifiers are unnecessary.
[A-Za-z\d]+([-_\\/])[A-Za-z\d]+\1[A-Za-z\d]+\s*
This will ensure you have the same separator throughout, which can be a hyphen, underscore, or slash (as written, the class [-_\\/] also admits a backslash). Whatever the separator is, it will separate 3 groups of alphanumeric characters.
Is this what you're looking for?
To make sure the symbol is the same throughout, capture it in a group and use a back reference, e.g.
aaa([_/-])bbb\1ccc
             ^^
The \1 must match whatever symbol was captured by ([_/-]).
Also you are using [A-z] which almost certainly doesn't do what you think it does, the characters between uppercase A and lowercase z are:
ABCD...XYZ[\]^_`abcd...xyz
You probably want [A-Za-z]
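Here is a rough Python sketch of the back-reference idea. It generalises slightly beyond the answers above by assuming any number of sections is allowed, as long as every separator is the same one; the names are invented for the example:

```python
import re

# First section, then optionally: a captured separator, a section, and any
# number of further sections that must reuse the same separator via \1.
SAME_SEPARATOR = re.compile(
    r'[A-Za-z0-9]+(?:([-_/])[A-Za-z0-9]+(?:\1[A-Za-z0-9]+)*)?'
)


def same_separator(value: str) -> bool:
    return SAME_SEPARATOR.fullmatch(value) is not None


for sample in ["aaa-bbb-cccc", "aaa_bbb_ccc", "aaa/bbb/ccc", "aaa-bbbb/ccc"]:
    print(sample, same_separator(sample))
```

The first three samples are accepted and the mixed one is rejected, because \1 can only match the separator that the group captured first.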

regex included a-zA-Z, digit, and some symbol with length limit

I'm trying to create a regex to match lowercase and uppercase A-Z, digits and the ##$_ symbols, with a length limit of 4 to 16 for the whole string.
My useless regex:
/^([a-zA-Z])|(\d)|(##\$_){4,16}$/
I tested online regex generators like http://www.jslab.dk/tools.regex.php but didn't get a good result.
Your regex /^([a-zA-Z])|(\d)|(##\$_){4,16}$/ matches a single letter at the start of the string, OR a single digit anywhere, OR 4 to 16 repetitions of the literal sequence "##$_" at the end of the string.
The groups around the alternatives are useless.
One solution would be to make a group around the whole alternation
/^([a-zA-Z]|\d|##\$_){4,16}$/
but the better solution would be to add everything to one character class
/^[a-zA-Z##$_\d]{4,16}$/
See it here on Regexr
You can maybe simplify it further, since [a-zA-Z\d_] is the same as \w when \w is not Unicode-based:
/^[\w##$]{4,16}$/
\w includes lowercase and UPPERCASE letters, digits and the _ character
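A small Python sketch of the \w version, in case it helps. Python's \w is Unicode-aware by default, so this passes re.ASCII to get the [a-zA-Z0-9_] behaviour described above. The helper name is_valid is hypothetical, and the duplicated # is dropped because repeating a character inside a class changes nothing:

```python
import re

# 4 to 16 characters, each a letter, digit, underscore, '#' or '$'.
USERNAME = re.compile(r'[\w#$]{4,16}', re.ASCII)


def is_valid(text: str) -> bool:
    return USERNAME.fullmatch(text) is not None


for sample in ["ab#1", "user_name$1", "abc", "bad name", "a" * 17]:
    print(repr(sample), is_valid(sample))
```

The first two samples pass; the others fail on length or on the space.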
RegEx Pattern: ^[\w#\#\$]{4,16}$
Explained demo here: http://regex101.com/r/rK1yH2
The expression that you need is this one:
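^([a-zA-Z]|\d|[##$_]){4,16}$
The problem with yours is that the final {4,16} applies only to the last group of brackets, not to the whole expression. Note also that (##\$_) matches those four characters in sequence, not any one of them, which is why the symbols belong in a character class. Outside a character class, "^" anchors the match to the start of the string (it only means "not" as the first character inside [...]), so keep both "^" and "$" to force the whole string to match.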
You can also see it graphically here: http://www.debuggex.com/

Regex help with matching

Hello, I need help coming up with a valid regular expression. It should match any identifier name that starts with a letter or underscore and may contain any number of letters, underscores, and/or digits (all letters may be upper or lower case).
For example, your regular expression should match the following text strings: “_”, “x2”, and “This_is_valid”. It should not match these text strings: “2days” or “invalid_variable%”.
So far this is what I have come up with, but I don't think it is right:
/^[_\w][^\W]+/
The following will work:
/^[_a-zA-Z]\w*$/
Starts with (^) a letter (upper or lowercase) or underscore ([_a-zA-Z]), followed by any amount of letter, digit, or underscore (\w) to the end ($)
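To check it against the examples from the question, here is a short Python sketch (the original context looks like Perl; this just reuses the same pattern, with re.ASCII so that \w stays limited to ASCII letters, digits and underscore):

```python
import re

# A letter or underscore, followed by any number of word characters.
IDENTIFIER = re.compile(r'[_a-zA-Z]\w*', re.ASCII)


def is_identifier(text: str) -> bool:
    # fullmatch plays the role of the ^ ... $ anchors.
    return IDENTIFIER.fullmatch(text) is not None


should_match = ["_", "x2", "This_is_valid"]
should_not_match = ["2days", "invalid_variable%"]

print([is_identifier(s) for s in should_match])      # [True, True, True]
print([is_identifier(s) for s in should_not_match])  # [False, False]
```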
Read more about Regular Expressions in Perl
Maybe the below regex:
^[a-zA-Z_]\w*$
If the identifier is at the start of a string, then it's easy
/^(_|[a-zA-Z])\w*/
If it's embedded in a longer string, I guess it's not much worse, assuming it's the start of a word...
/\s(_|[a-zA-Z])\w*/

Regex for alphanumeric, but at least one letter

In my ASP.NET page, I have an input box that has to have the following validation on it:
Must be alphanumeric, with at least one letter (i.e. can't be ALL
numbers).
^\d*[a-zA-Z][a-zA-Z0-9]*$
Basically this means:
Zero or more ASCII digits;
One alphabetic ASCII character;
Zero or more alphanumeric ASCII characters.
Try a few tests and you'll see this passes any alphanumeric ASCII string that contains at least one ASCII letter.
The key to this is the \d* at the front. Without it the regex gets much more awkward to do.
Most answers to this question are correct, but there's an alternative, that (in some cases) offers more flexibility if you want to change the rules later on:
^(?=.*[a-zA-Z].*)([a-zA-Z0-9]+)$
This will match any sequence of alphanumeric characters, but only if the lookahead (?=.*[a-zA-Z].*) also succeeds, i.e. the string contains at least one letter. The lookahead checks its condition without consuming any input. It's a little-known trick in regular expressions that allows you to handle some very difficult validation problems.
For example, say you need to add another constraint: the string should be between 6 and 12 characters long. The obvious solutions posted here wouldn't work, but using the look-ahead trick, the regex simply becomes:
^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$
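A minimal Python sketch of the look-ahead version with the length constraint. The names are illustrative, and the trailing .* inside the look-ahead is left out since it does not change the result:

```python
import re

# Look-ahead: somewhere ahead there is a letter.
# Main part: 6 to 12 alphanumeric ASCII characters in total.
LETTER_AND_LENGTH = re.compile(r'(?=.*[a-zA-Z])[a-zA-Z0-9]{6,12}')


def is_valid(text: str) -> bool:
    return LETTER_AND_LENGTH.fullmatch(text) is not None


for sample in ["abc123", "12345a", "123456", "a1234", "abcdefghijklm"]:
    print(sample, is_valid(sample))
```

The first two pass. "123456" has no letter, "a1234" is too short and "abcdefghijklm" is too long, so those three fail.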
^[\p{L}\p{N}]*\p{L}[\p{L}\p{N}]*$
Explanation:
[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
\p{L} matches one letter
[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
^ and $ anchor the string, ensuring the regex matches the entire string. You may be able to omit these, depending on which regex matching function you call.
Result: you can have any alphanumeric string except there's got to be a letter in there somewhere.
\p{L} is similar to [A-Za-z] except it will include all letters from all alphabets, with or without accents and diacritical marks. It is much more inclusive, using a larger set of Unicode characters. If you don't want that flexibility substitute [A-Za-z]. A similar remark applies to \p{N} which could be replaced by [0-9] if you want to keep it simple. See the MSDN page on character classes for more information.
The less fancy non-Unicode version would be
^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
^[0-9]*[A-Za-z][0-9A-Za-z]*$
is the regex that will do what you're after. The ^ and $ match the start and end of the string to prevent other characters. You could replace the [0-9A-Za-z] block with \w (which also admits the underscore), but I prefer the more verbose form because it's easier to extend with other characters if you want.
Add a regular expression validator to your asp.net page as per the tutorial on MSDN: http://msdn.microsoft.com/en-us/library/ms998267.aspx.
^\w*[\p{L}]\w*$
This one's not that hard. The regular expression reads: match a line starting with any number of word characters (letters, numbers, and the underscore or other connector punctuation, which you might not want), that contains one letter character (that's the [\p{L}] part in the middle), followed by any number of word characters again.
If you want to exclude punctuation, you'll need a heftier expression:
^[\p{L}\p{N}]*[\p{L}][\p{L}\p{N}]*$
And if you don't care about Unicode you can use a boring expression:
^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
^[0-9]*[a-zA-Z][a-zA-Z0-9]*$
Can be
any number ending with a letter,
or an alphanumeric expression starting with a letter,
or an alphanumeric expression starting with a number, followed by a letter and ending with an alphanumeric subexpression
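Several of the ASCII patterns posted above look different but should accept exactly the same strings. The Python sketch below is one way to convince yourself: it brute-forces all short strings over a small sample alphabet (chosen arbitrarily for this example) and collects any string on which the two patterns disagree:

```python
import re
from itertools import product

# "Digits first" form and "letter anywhere" form from the answers above.
DIGITS_FIRST = re.compile(r'[0-9]*[a-zA-Z][a-zA-Z0-9]*')
LETTER_ANYWHERE = re.compile(r'[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*')

# Small, arbitrary alphabet: two letters, two digits, two invalid characters.
ALPHABET = 'aZ09_ '


def disagreements(max_len: int = 4) -> list:
    """Return every string up to max_len on which the two patterns differ."""
    found = []
    for length in range(max_len + 1):
        for chars in product(ALPHABET, repeat=length):
            text = ''.join(chars)
            first = DIGITS_FIRST.fullmatch(text) is not None
            second = LETTER_ANYWHERE.fullmatch(text) is not None
            if first != second:
                found.append(text)
    return found


print(disagreements())  # []
```

An empty list is what you would expect: in any alphanumeric string that contains a letter, everything before the first letter is necessarily digits, so both forms describe the same set.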