A comprehensive regex for keyword validation

How can I write a regex for keyword validation with the following constraints?
All letters and digits are allowed.
You can use a blank space within a keyword, but leading or trailing spaces are not allowed.
The hyphen character '-' can only be used to hyphenate words and cannot have a blank space around it.
For example:
"kkda asdlkfj-kklda12" should match
"kdka - klad lakdjoa" should not match
" kdakla120alsd " should not match (space at start and end)

You can use pattern:
^(?:[a-z0-9]+(?: (?!$)|-|$))+$
^ Beginning of line.
(?: Non capturing group.
[a-z0-9]+ Match 1+ lowercase letters or digits (use a case-insensitive flag or [A-Za-z0-9] to allow uppercase as well).
(?: (?!$)|-|$) Non capturing group for either a space (as long as it does not precede the end of the string, (?!$)), a - character or the end of the string.
) Close non capturing group.
+ Repeat non capturing group.
$ Assert position end of line.
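A minimal Python sketch for checking the pattern against the examples from the question. The names KEYWORD_RE, STRICT_RE and is_keyword are made up for illustration, and re.IGNORECASE is an assumption added because the question allows all letters while the pattern lists only a-z. One thing worth noticing: the pattern as posted still accepts a trailing hyphen such as "abc-". STRICT_RE is a hypothetical tighter variant and is not part of the answer above.

```python
import re

# Pattern from the answer; IGNORECASE is an assumption so uppercase letters pass too.
KEYWORD_RE = re.compile(r"^(?:[a-z0-9]+(?: (?!$)|-|$))+$", re.IGNORECASE)

# Hypothetical stricter variant: words separated by exactly one space or one hyphen.
STRICT_RE = re.compile(r"[a-z0-9]+(?:[ -][a-z0-9]+)*", re.IGNORECASE)


def is_keyword(text, pattern=KEYWORD_RE):
    """Return True if the whole text matches the given keyword pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["kkda asdlkfj-kklda12", "kdka - klad lakdjoa", " kdakla120alsd ", "abc-"]:
        print(repr(sample), is_keyword(sample), is_keyword(sample, STRICT_RE))
```

The two patterns agree on the three examples from the question and differ only on inputs that end with a hyphen.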

Related

What is the proper regex for capturing everything after "String" and between two delimiters ('=' and a non-alphanumeric character)

Details={
AwsEc2SecurityGroup={GroupName=m.com-rds, OwnerId=123, VpcId=vpc-123,
IpPermissions=[{FromPort=3306, ToPort=3306, IpProtocol=tcp, IpRanges=[{CidrIp=1.1.1.1/32}, {CidrIp=2.2.2.2/32}, {CidrIp=0.0.0.0/0}, {CidrIp=3.3.3.3/32}],
UserIdGroupPairs=[{UserId=123, GroupId=sg-123abc}]}], IpPermissionsEgress=[{IpProtocol=-1, IpRanges=[{CidrIp=0.0.0.0/0}]}], GroupId=sg-123abc}},
Region=us-east-1, Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc}]
}
I want to capture exactly arn:aws:ec2:us-east-1:123:security-group/sg-123abc in this example. Generically, I want to capture the value of Id regardless of placement. My current solution is /Details={.*Id=(.*\w)/, but this only works if it's the last object in the data. How can I take into account the following potential scenario:
Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]
Your pattern contains .* twice. The first .* matches to the end of the line or string (depending on whether the dot matches a newline) and then backtracks to the last position where this part of the pattern, Id=(.*\w), can match.
If you want to use a capture group, you can make the format and the allowed characters a bit more specific:
\bId=(\w+(?:[:\/-]\w+)+)
The pattern in parts
\b A word boundary to prevent a partial word match
Id= Match literally
( Capture group 1
\w+ Match 1+ word chars
(?:[:\/-]\w+)+ Repeat 1+ times either : / - and 1+ word chars
) Close group 1
Or if you know that it starts with Id=arn:
\bId=(arn:[\w:\/-]+)
Note that you only have to escape the forward slash (\/) when the delimiters of the regex are forward slashes, but there is no language tagged.
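As a rough illustration, here is how the two capture-group patterns could be tried in Python, where no regex delimiters exist and the slash needs no escaping. SAMPLE is an abridged version of the data in the question with the extra ", Thing=123abc" scenario appended, and extract_id is a hypothetical helper name.

```python
import re

# Abridged from the question's data, with the trailing "Thing" field added.
SAMPLE = (
    "Details={AwsEc2SecurityGroup={GroupName=m.com-rds, OwnerId=123, VpcId=vpc-123, "
    "GroupId=sg-123abc}}, Region=us-east-1, "
    "Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc, Thing=123abc}]}"
)

ID_RE = re.compile(r"\bId=(\w+(?:[:/-]\w+)+)")   # generic form
ARN_RE = re.compile(r"\bId=(arn:[\w:/-]+)")      # when the value starts with arn:


def extract_id(text, pattern=ID_RE):
    """Return the captured Id value, or None when there is no match."""
    match = pattern.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_id(SAMPLE))
    print(extract_id(SAMPLE, ARN_RE))
```

The word boundary is what keeps OwnerId=, VpcId= and GroupId= from matching, since in those keys "Id" is preceded by a word character.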
You can use look-behind to check that there is the Id= prefix, and then match anything that is not a space, comma or closing brace:
(?<=\bId=)[^,}\s]*
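The look-behind version can be sketched the same way. Python's re module only accepts fixed-width look-behinds, which (?<=\bId=) satisfies. The function name find_id is an assumption for illustration.

```python
import re

# Whole match is the value itself; the Id= prefix is only asserted, not consumed.
LOOKBEHIND_RE = re.compile(r"(?<=\bId=)[^,}\s]*")


def find_id(text):
    """Return the text following a standalone Id= key, or None if absent."""
    match = LOOKBEHIND_RE.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(find_id("Region=us-east-1, Id=arn:aws:ec2:us-east-1:123:security-group/sg-123abc}]"))
```

Because the character class is repeated with *, an empty value after Id= yields an empty string rather than None.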

Regex to validate cookie string (Key value paired)

So far I tried this regex but no luck.
([^=;]+=[^=;]+(;(?!$)|$))+
Valid Strings:
something=value1;another=value2
something=value1 ; anothe=value2
Invalid Strings:
something=value1 ;;;name=test
some=value=3;key=val
somekey=somevalue;
You might use an optional repeating group to get the matches.
If you don't want to cross newline boundaries, you might add \n or \r\n to the negated character class.
^[^=;\n]+=[^=;\n]+(?:;[^=;\n]+=[^=;\n]+)*$
Explanation
^ Start of string
[^=;\n]+=[^=;\n]+ Match the key and value using a negated character class
(?: Non capture group
;[^=;\n]+=[^=;\n]+ Match a semicolon followed by the same key and value pattern
)* Close group and repeat 0+ times
$ End of string
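A short Python sketch that runs the pattern over the valid and invalid strings listed in the question. COOKIE_RE and is_valid_cookie_string are hypothetical names chosen for this example.

```python
import re

# key=value pairs separated by single semicolons, no trailing semicolon.
COOKIE_RE = re.compile(r"^[^=;\n]+=[^=;\n]+(?:;[^=;\n]+=[^=;\n]+)*$")


def is_valid_cookie_string(text):
    """Return True if text is a semicolon-separated list of key=value pairs."""
    return COOKIE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "something=value1;another=value2",
        "something=value1 ; anothe=value2",
        "something=value1 ;;;name=test",
        "some=value=3;key=val",
        "somekey=somevalue;",
    ]
    for sample in samples:
        print(repr(sample), is_valid_cookie_string(sample))
```

Spaces around the semicolon are accepted because the negated character class excludes only =, ; and newlines, so they become part of the neighbouring key or value.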

Match all instances of a certain character inside every word preceded by a certain word and not delimited by a space

Given a string such as below:
word.hi. bla. word.
I want to construct a regex which will match every "." that is preceded by "word" and, optionally, any other non-space characters.
So, in the above example, I would want the first, second and last dots to be matched.
While matching the first and last dots would be easy with global flag (/(?:word.*)\K./gU), I'm not sure how to construct a regex that would also match the second dot.
Appreciate any pointers.
You might match word and then get all consecutive matches using the \G anchor excluding matching whitespace chars or a dot.
(?:\bword|\G(?!\A))[^.\s]*\K\.
In parts
(?: Non capture group
\bword Match word preceded by a word boundary
| Or
\G(?!\A) Assert the position at the end of the previous match, not at the start
) Close non capture group
[^.\s]* Match 0+ occurrences of any char except . or a whitespace char
\K Clear the match buffer (forget what is matched until now)
\. Match a dot
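Python's built-in re module supports neither \K nor \G, so the pattern above cannot be used there as written. The sketch below swaps in a different, two-step technique that should give the same dots for inputs like the one in the question: first find each whitespace-delimited run that starts at "word", then locate the dots inside that run. The function name dots_after_word is an assumption for illustration.

```python
import re


def dots_after_word(text, word="word"):
    """Return the indexes of dots that follow `word` without an intervening space."""
    positions = []
    # Step 1: each run of non-space characters beginning at a standalone `word`.
    for token in re.finditer(r"\b" + re.escape(word) + r"\S*", text):
        # Step 2: every dot inside that run, shifted to an index in `text`.
        for dot in re.finditer(r"\.", token.group()):
            positions.append(token.start() + dot.start())
    return positions


if __name__ == "__main__":
    print(dots_after_word("word.hi. bla. word."))
```

For the question's string this reports the first, second and last dots and skips the one after "bla".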

Match names joined with a delimiter except last

Let's suppose we have a text file with many rows, each containing multiple names joined with a ";" delimiter, where the last name is not followed by the delimiter.
We might try the following regex:
^(\w+;)+$ // Not good
This regex won't work because it forces the last name, and hence the whole row, to end with a ";" as well.
You could add a single \w+ after the group. If you don't need the capturing group, you might make it non capturing.
This way you repeat matching word characters followed by a ; and end the match with word characters.
^(?:\w+;)+\w+$
Explanation
^ Start of string
(?: Non capturing group
\w+; Match 1+ word chars followed by ;
)+ Close non capturing group and repeat 1+ times
\w+ Match 1+ word chars
$ End of string
If a single word should also match, you could repeat the group 0+ times using * instead of +
^(?:\w+;)*\w+$
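A small Python sketch comparing the two variants on a few made-up rows. The names AT_LEAST_TWO, ONE_OR_MORE and row_matches are hypothetical, as are the sample rows.

```python
import re

AT_LEAST_TWO = re.compile(r"^(?:\w+;)+\w+$")  # requires at least one ';'
ONE_OR_MORE = re.compile(r"^(?:\w+;)*\w+$")   # a single name is also accepted


def row_matches(row, pattern):
    """Return True if the whole row matches the given pattern."""
    return pattern.fullmatch(row) is not None


if __name__ == "__main__":
    for row in ["alice;bob;carol", "alice;bob;", "alice"]:
        print(repr(row), row_matches(row, AT_LEAST_TWO), row_matches(row, ONE_OR_MORE))
```

When reading rows from a file, strip the trailing newline from each line before testing it.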

How can I validate a special character at a particular position in a regexp

I have written the regexp #"^([a-zA-Z ]+[a-zA-Z0-9 ]*)$". It allows letters, digits and spaces but no special characters, and the first character cannot be a digit. Now I have to allow the '-' character anywhere except as the first or last character. How can I modify it?
You can use this:
#"^([a-zA-Z ]+[a-zA-Z0-9\- ]*[a-zA-Z0-9 ]+)$"
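The first character class allows only letters and spaces (at least one).
The second character class allows letters, digits, spaces and - (zero or more).
The last character class allows only letters, digits and spaces (at least one), so the string cannot end with -.
Because the first and last classes each need at least one character, the string must be at least two characters long.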
You could add a - to the second character class and add a negative lookahead (?!.*-$) to make sure the string does not end with -.
^(?!.*-$)([a-zA-Z ]+[a-zA-Z0-9 -]*)$
Explanation
^ Assert position at the start of the line
(?!.*-$) Negative lookahead to assert that the string does not end with -
( Capturing group
[a-zA-Z ]+ Match character class one or more times
[a-zA-Z0-9 -]* Match character class with - zero or more times
) Close capturing group
$ Assert position at the end of the line
Note
Your regex is inside a capturing group. If you don't use that group you might leave out the parentheses:
^(?!.*-$)[a-zA-Z ]+[a-zA-Z0-9 -]*$
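To see how the two answers behave, here is a hedged Python sketch that runs both patterns over a few made-up inputs. The names THREE_CLASS_RE, LOOKAHEAD_RE and is_valid are assumptions for illustration. The two patterns differ only on single-character input, which the three-class version rejects.

```python
import re

# First answer: three character classes, the last one excludes '-'.
THREE_CLASS_RE = re.compile(r"^([a-zA-Z ]+[a-zA-Z0-9\- ]*[a-zA-Z0-9 ]+)$")

# Second answer: negative lookahead rejects a trailing '-'.
LOOKAHEAD_RE = re.compile(r"^(?!.*-$)[a-zA-Z ]+[a-zA-Z0-9 -]*$")


def is_valid(text, pattern=LOOKAHEAD_RE):
    """Return True if the whole text matches the given pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["ab-c1", "abc-", "-abc", "1abc", "a"]:
        print(repr(sample), is_valid(sample, THREE_CLASS_RE), is_valid(sample))
```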