Regex starts with alphabet or underscore (_)

I'm trying to check that a string starts with an underscore (_) or a letter and contains only letters, digits, hyphens, underscores, or periods.
The string can also be of length 1.
Expected valid strings:
_Name
_name.First
name-first
Name
name.First
name-first
A
b
I tried the regex below, but it does not work for a single letter.
^[a-zA-Z0-9_][a-zA-Z0-9_|/.|/-]{1,20}[a-zA-Z0-9]$

Use
^[a-zA-Z0-9_](?:[a-zA-Z0-9_.-]{0,20}[a-zA-Z0-9])?$
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
[a-zA-Z0-9_] any character of: 'a' to 'z', 'A' to 'Z',
'0' to '9', '_'
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
[a-zA-Z0-9_.-]{0,20} any character of: 'a' to 'z', 'A' to
'Z', '0' to '9', '_', '.', '-' (between
0 and 20 times (matching the most amount
possible))
--------------------------------------------------------------------------------
[a-zA-Z0-9] any character of: 'a' to 'z', 'A' to
'Z', '0' to '9'
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
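A quick way to check the pattern against the sample strings is a small Python sketch. It assumes Python's re module rather than whatever engine the question targets, and the names ANSWER, STRICT, VALID and is_valid are made up for illustration. Note that the pattern as written also accepts a leading digit, so a hypothetical stricter variant that drops 0-9 from the first character class is shown next to it.

```python
import re

# Pattern from the answer: one leading character, then an optional tail of
# up to 20 characters that must end in a letter or digit.
ANSWER = re.compile(r"^[a-zA-Z0-9_](?:[a-zA-Z0-9_.-]{0,20}[a-zA-Z0-9])?$")

# Hypothetical stricter variant (an assumption, not part of the answer):
# the first character may only be a letter or an underscore.
STRICT = re.compile(r"^[a-zA-Z_](?:[a-zA-Z0-9_.-]{0,20}[a-zA-Z0-9])?$")

# Sample strings listed in the question.
VALID = ["_Name", "_name.First", "name-first", "Name", "name.First", "A", "b"]


def is_valid(pattern, text):
    """Return True when the whole string matches the given pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in VALID + ["1abc", "name-", ""]:
        print(repr(sample), is_valid(ANSWER, sample), is_valid(STRICT, sample))
```

The optional group is what lets a single character pass: when the group matches nothing, only the first character class is required.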

Related

String must be alphanumeric and contain a certain substring

I'm working on adding a regex that determines whether a given input is valid. The input should be alpha numeric (underscores, dashes, periods also allowed) and between 1 and 60 characters. It should also contain a certain substring inside it (let's just say "foo.bar"). This is my attempt:
^.[a-zA-Z0-9_.-]{1,60}$
That does what I need, aside from the substring part. I'm not sure how to add the "the string must contain the substring foo.bar" requirement. FWIW I'm doing this in Ruby so I understand this means PCRE is being used.
As an example, this string should be valid:
aGreatStringWithfoo.barInIt1111
This one should not:
aBadStringWithoutTheSubstringInIt
Use
^(?=.{1,60}$)[a-zA-Z0-9_.-]*foo\.bar[a-zA-Z0-9_.-]*$
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
.{1,60} any character except \n (between 1 and
60 times (matching the most amount
possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of
the string
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
[a-zA-Z0-9_.-]* any character of: 'a' to 'z', 'A' to 'Z',
'0' to '9', '_', '.', '-' (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
foo 'foo'
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
bar 'bar'
--------------------------------------------------------------------------------
[a-zA-Z0-9_.-]* any character of: 'a' to 'z', 'A' to 'Z',
'0' to '9', '_', '.', '-' (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
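The lookahead handles the length limit while the rest of the pattern handles the allowed characters and the required substring. A minimal Python sketch of that split follows; the question is about Ruby, so this only illustrates the pattern itself, and the names PATTERN and contains_foo_bar are made up for the example.

```python
import re

# Same pattern as in the answer: the lookahead caps the total length at 60,
# the body requires the literal substring "foo.bar" somewhere inside.
PATTERN = re.compile(r"^(?=.{1,60}$)[a-zA-Z0-9_.-]*foo\.bar[a-zA-Z0-9_.-]*$")


def contains_foo_bar(text):
    """Return True when text is 1-60 allowed characters and contains foo.bar."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    print(contains_foo_bar("aGreatStringWithfoo.barInIt1111"))
    print(contains_foo_bar("aBadStringWithoutTheSubstringInIt"))
```

Escaping the dot as \. matters here: an unescaped dot would also accept strings such as fooXbar.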

Regex Only words and single spaces between them

I'm not good at writing regexes. The requirement is to validate names:
Only letters should be included
Only one space between words
No other whitespace characters are allowed
Use
^[a-zA-Z]+(?: [a-zA-Z]+)*$
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
[a-zA-Z]+ any character of: 'a' to 'z', 'A' to 'Z'
(1 or more times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
' '
--------------------------------------------------------------------------------
[a-zA-Z]+ any character of: 'a' to 'z', 'A' to 'Z'
(1 or more times (matching the most
amount possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
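To see how the "word, then zero or more groups of one space plus a word" structure behaves, here is a short Python sketch. The question does not name a language, so Python is an assumption, and NAME and is_valid_name are illustrative names.

```python
import re

# A word, followed by any number of " word" groups. Because every space must
# be followed by at least one letter, leading, trailing and doubled spaces
# are all rejected.
NAME = re.compile(r"^[a-zA-Z]+(?: [a-zA-Z]+)*$")


def is_valid_name(text):
    """Return True when text is letters-only words joined by single spaces."""
    return NAME.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["John", "John Smith", "John  Smith", " John", "John "]:
        print(repr(sample), is_valid_name(sample))
```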

Regex [Python]: extract path parameters from a URL

I have URLs from an access log. Examples:
/someService/US/getPersonFromAllAccessoriesByDescription/67814/alloy%20nudge%20w
/someService/NZ/asdNmasdf423-asd342e/getDealerFromSomethingSomething/FS443GH/front%20parking%20sen
I cannot make any assumption on the service name or the function name.
I'm trying to find a regex that matches only the following in the first log entry:
67814
alloy%20nudge%20w
and in the second:
asdNmasdf423-asd342e
FS443GH
front%20parking%20sen
As a heuristic, I tried [a-zA-Z0-9_%-]{15,}|[A-Z0-9]{5,} to match only long strings, but the function names (getPersonFromAllAccessoriesByDescription, getDealerFromSomethingSomething) were also caught.
I was thinking about a regex that does the same as [a-zA-Z0-9_%-]{15,} but with the condition that it must contain at least one digit, so that the function names are skipped.
Thank you
Your heuristic is fine; use
\b(?=[a-zA-Z_%-]*[0-9])[a-zA-Z0-9_%-]{5,}
Explanation
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
[a-zA-Z_%-]* any character of: 'a' to 'z', 'A' to
'Z', '_', '%', '-' (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
[0-9] any character of: '0' to '9'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
[a-zA-Z0-9_%-]{5,} any character of: 'a' to 'z', 'A' to 'Z',
'0' to '9', '_', '%', '-' (at least 5
times (matching the most amount possible))
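Since the question is about Python, the pattern can be tried with re.findall on the two log lines from the question. This is only a sketch: PARAM and extract_params are made-up names, and the heuristic will still miss parameters shorter than five characters or without a digit, and will pick up any function name that happens to contain a digit.

```python
import re

# Word boundary, then a lookahead requiring a digit somewhere in the token,
# then the token itself (at least 5 allowed characters).
PARAM = re.compile(r"\b(?=[a-zA-Z_%-]*[0-9])[a-zA-Z0-9_%-]{5,}")


def extract_params(path):
    """Return the path segments that look like parameters, in order."""
    return PARAM.findall(path)


if __name__ == "__main__":
    first = ("/someService/US/getPersonFromAllAccessoriesByDescription"
             "/67814/alloy%20nudge%20w")
    second = ("/someService/NZ/asdNmasdf423-asd342e"
              "/getDealerFromSomethingSomething/FS443GH/front%20parking%20sen")
    print(extract_params(first))   # ['67814', 'alloy%20nudge%20w']
    print(extract_params(second))  # ['asdNmasdf423-asd342e', 'FS443GH', 'front%20parking%20sen']
```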

Regex to forbid any space before or after apostrophes and hyphens

I need validation with the following requirements:
1. Accept letters (A-Z or a-z)
2. May include these special characters: åçêëèïîìæôöòûùÿáíóúñ
3. Accept spaces, apostrophes, and hyphens
4. Apostrophes and hyphens may appear only between non-space characters, for example:
Le'brahm
Ben-John
It must not accept Le' brahm, Ben -John, or Ben- John.
I currently use this regex, but it does not fulfill requirement 4 and only partially fulfills requirement 3:
^[a-zA-Z åçêëèïîìæôöòûùÿáíóúñ]*$
If I add a hyphen and an apostrophe like this:
^[a-zA-Z -'åçêëèïîìæôöòûùÿáíóúñ]*$
the regex misbehaves and accepts digits, which it should not.
You may try this regex:
^[a-zA-Zåçêëèïîìæôöòûùÿáíóúñ]+(?:[-' ][a-zA-Zåçêëèïîìæôöòûùÿáíóúñ]+)*$
RegEx Details:
[a-zA-Zåçêëèïîìæôöòûùÿáíóúñ]+: Matches 1+ of given letters inside [...]
(?:[-' ][a-zA-Zåçêëèïîìæôöòûùÿáíóúñ]+)*: Matches 0 or more of same set of characters separated by - or ' or a space.
Use lookarounds:
^(?!.*[-'‘’ ]{2})(?![-'‘’ ])(?!.*[-'‘’ ]$)[a-zA-Zåçêëèïîìæôöòûùÿáíóúñ '‘’-]+$
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
[-‘’' ]{2} any character of: '-', ''', ' ', '‘', '’' (2
times)
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
[-'‘’ ] any character of: '-', ''', ' ', '‘', '’'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
[-'‘’ ] any character of: '-', ''', ' ', '‘', '’'
--------------------------------------------------------------------------------
$ before an optional \n, and the end of
the string
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
[a-zA-Zåçêëèïîìæôöòûùÿáíóúñ '‘’-]+ any character of: 'a' to 'z', 'A' to 'Z',
'å', 'ç', 'ê', 'ë', 'è', 'ï', 'î', 'ì',
'æ', 'ô', 'ö', 'ò', 'û', 'ù', 'ÿ', 'á',
'í', 'ó', 'ú', 'ñ', ' ', ''', '-', '‘', '’' (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
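The two patterns suggested in this thread can be compared side by side on the examples from the question. The sketch below assumes Python (the question does not name a language); LETTERS, SEPARATED, LOOKAROUND and check are illustrative names. The only difference on these inputs is that the lookaround version also allows the curly quotes ‘ and ’.

```python
import re

# Letters allowed in a name, shared by both patterns.
LETTERS = "a-zA-Zåçêëèïîìæôöòûùÿáíóúñ"

# First suggestion: runs of letters joined by exactly one separator.
SEPARATED = re.compile(rf"^[{LETTERS}]+(?:[-' ][{LETTERS}]+)*$")

# Second suggestion: negative lookaheads forbid two separators in a row,
# a separator at the start, and a separator at the end.
LOOKAROUND = re.compile(
    rf"^(?!.*[-'‘’ ]{{2}})(?![-'‘’ ])(?!.*[-'‘’ ]$)[{LETTERS} '‘’-]+$"
)


def check(text):
    """Return (first pattern result, second pattern result) for text."""
    return (SEPARATED.fullmatch(text) is not None,
            LOOKAROUND.fullmatch(text) is not None)


if __name__ == "__main__":
    for sample in ["Le'brahm", "Ben-John", "Le' brahm", "Ben -John", "Ben- John"]:
        print(repr(sample), check(sample))
```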

How can I write a regular expression that matches case insensitive emails?

I have the following code to find a user by email.
User.findOne({
  email: { $regex: new RegExp('^' + req.body.email.toLowerCase() + '$', 'i') }
})
It lowercases the given email and runs a case-insensitive search for the user.
The problem is that we have some emails like john+doe@johndoe.com, and this regular expression doesn't match them.
What should I add to the regular expression to match that kind of email?
The issue is that you're using the e-mail address, as req.body.email, unescaped in a regular expression.
As you noticed, characters that have a special meaning in regexes, like +, will cause problems. Even worse, when a user enters .* as their e-mail address, your query will match any user, which is a security concern.
What you want is to escape the e-mail address input so any special characters will be searched for as-is (have their "special meaning" stripped from them).
The easiest way is to use a module like regex-escape that will do that for you:
var escape = require('regex-escape');
...
// Escape the user-supplied address so characters like + and . are matched literally.
User.findOne({
  email: { $regex: new RegExp('^' + escape(req.body.email) + '$', 'i') }
})
Since the regex is already set to match case-insensitively, there's no need to lowercase the string.
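The effect of escaping is easy to demonstrate outside of Node. The sketch below uses Python, where the standard library's re.escape plays the role of the escape module; it only illustrates the idea and is not the Mongo query itself. The function names and the example address are hypothetical.

```python
import re


def unsafe_pattern(email):
    """Build the regex the way the question does: raw input, no escaping."""
    return re.compile("^" + email + "$", re.IGNORECASE)


def exact_email_pattern(email):
    """Build a case-insensitive regex that matches the input literally."""
    return re.compile("^" + re.escape(email) + "$", re.IGNORECASE)


if __name__ == "__main__":
    address = "john+doe@example.com"  # illustrative address, not real data

    # Unescaped, "+" means "one or more of the previous character",
    # so the address does not even match itself.
    print(unsafe_pattern(address).match(address))

    # Escaped, the address matches itself regardless of case.
    print(exact_email_pattern(address).match("John+Doe@Example.com"))

    # Unescaped ".*" matches every address; escaped, it matches only ".*".
    print(unsafe_pattern(".*").match("anyone@example.com"))
    print(exact_email_pattern(".*").match("anyone@example.com"))
```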
Description
I use this expression. It's not perfect, as some edge cases will slip by, but those are easy enough to catch by simply sending a test email:
^[_a-z0-9-+]+(?:\.[_a-z0-9-+]+)*@[a-z0-9-]+(?:\.[a-z0-9-]+)*(?:\.[a-z]{2,4})$
By adding A-Z to each of the character classes I've made the same expression case insensitive.
^[_a-zA-Z0-9-+]+(?:\.[_a-zA-Z0-9-+]+)*@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*(?:\.[a-zA-Z]{2,4})$
Example
Live Demo
https://regex101.com/r/uC5oG4/1
Explanation
NODE EXPLANATION
----------------------------------------------------------------------
^ the beginning of a "line"
----------------------------------------------------------------------
[_a-z0-9-+]+ any character of: '_', 'a' to 'z', '0' to
'9', '-', '+' (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
----------------------------------------------------------------------
\. '.'
----------------------------------------------------------------------
[_a-z0-9-+]+ any character of: '_', 'a' to 'z', '0'
to '9', '-', '+' (1 or more times
(matching the most amount possible))
----------------------------------------------------------------------
)* end of grouping
----------------------------------------------------------------------
@ '@'
----------------------------------------------------------------------
[a-z0-9-]+ any character of: 'a' to 'z', '0' to '9',
'-' (1 or more times (matching the most
amount possible))
----------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
----------------------------------------------------------------------
\. '.'
----------------------------------------------------------------------
[a-z0-9-]+ any character of: 'a' to 'z', '0' to
'9', '-' (1 or more times (matching the
most amount possible))
----------------------------------------------------------------------
)* end of grouping
----------------------------------------------------------------------
(?: group, but do not capture:
----------------------------------------------------------------------
\. '.'
----------------------------------------------------------------------
[a-z]{2,4} any character of: 'a' to 'z' (between 2
and 4 times (matching the most amount
possible))
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
$ before an optional \n, and the end of a
"line"
----------------------------------------------------------------------
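As an alternative to adding A-Z to every character class, most engines offer a case-insensitive flag. Here is a minimal Python sketch of that approach, written with @ separating the local part from the domain; EMAIL and looks_like_email are made-up names, and the pattern keeps the limitations mentioned above (for example, top-level domains longer than four letters are rejected).

```python
import re

# Lowercase-only character classes plus the IGNORECASE flag, instead of
# spelling out a-zA-Z in each class.
EMAIL = re.compile(
    r"^[_a-z0-9-+]+(?:\.[_a-z0-9-+]+)*@[a-z0-9-]+(?:\.[a-z0-9-]+)*(?:\.[a-z]{2,4})$",
    re.IGNORECASE,
)


def looks_like_email(text):
    """Return True when text has the rough shape local@domain.tld."""
    return EMAIL.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["john+doe@johndoe.com", "John.Doe@Example.COM", "john@localhost"]:
        print(repr(sample), looks_like_email(sample))
```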