Match a String with an optional number of hyphens - Java Regex

I am trying to match strings with an optional number of hyphens.
For example,
string1-string2,
string1-string2-string3,
string1-string2-string3 and so on.
Right now, I have something that matches one hyphen. How can I make the regex match an optional number of hyphens?
My current regex is: arn:aws:iam::\d{12}:[a-zA-Z]/?[a-zA-Z]-?[a-zA-Z]*
What do I need to add?

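Use this regex (the backslashes are doubled because it is written as a Java string literal):
^\\w+(-\\w+)*$
Explanation:
\\w+ - match one or more word characters, i.e. [a-zA-Z_0-9]
(-\\w+)* - match a hyphen followed by one or more word characters, zero or more times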
Note that this won't match an empty string, or a string containing anything other than word characters and hyphens. You could handle these cases separately or extend the regex.
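As a quick check, here is a minimal Python sketch of the same pattern. It uses single backslashes because a Python raw string does not need the Java-style doubling. The helper name `is_hyphenated` and the sample inputs are made up for illustration.

```python
import re

# Same pattern as above, written as a Python raw string (single backslashes).
HYPHENATED = re.compile(r"^\w+(-\w+)*$")


def is_hyphenated(text: str) -> bool:
    """Return True if text is one or more word chunks joined by single hyphens."""
    return HYPHENATED.search(text) is not None


if __name__ == "__main__":
    samples = ["string1", "string1-string2", "string1-string2-string3",
               "", "string1-", "string1--string2"]
    for sample in samples:
        print(repr(sample), is_hyphenated(sample))
```

The group `(-\w+)*` is what makes the number of hyphens optional: each repetition consumes exactly one hyphen plus the chunk after it.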

Related

Regex for replacing anything other than characters, more than one spaces and number only in end with empty char

I want to remove everything other than letters, spaces, and digits at the end of the string. In other words, any digits or spaces that come at the start or in the middle of the string should be replaced with an empty string.
Example
**Input** | **Output**
Ndd12 | Ndd12
12Ndd12 | Ndd12
Ndd 12 | Ndd 12
Nav G45up | Nav Gup
Attempted Code
regexp_replace(df1[col_name], "(^[A-Za-z]+[0-9 ])", "")
You may use:
\d+(?!\d*$)|[^\w\n]+(?!([A-Z]|$))
Explanation:
\d+(?!\d*$): Match 1+ digits that are not followed by 0+ digits and end of line
|: OR
[^\w\n]+(?!([A-Z]|$)): Match 1+ characters that are neither word characters nor newlines and that are not followed by an uppercase letter or the end of the line
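To see how this pattern behaves on the sample inputs from the question, here is a small Python sketch. It uses the `re` module rather than the `regexp_replace` call from the question, and the helper name `clean` is made up for illustration. The `Ndd 12` case is worth checking against your own expectations, since a space followed by a digit also satisfies the second branch and gets removed.

```python
import re

# Pattern from the answer above, unchanged.
PATTERN = re.compile(r"\d+(?!\d*$)|[^\w\n]+(?!([A-Z]|$))")


def clean(text: str) -> str:
    """Delete every match of PATTERN from text."""
    return PATTERN.sub("", text)


if __name__ == "__main__":
    # Sample inputs taken from the question.
    for sample in ["Ndd12", "12Ndd12", "Ndd 12", "Nav G45up"]:
        print(repr(sample), "->", repr(clean(sample)))
```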
If you use Python, you can use regular expressions through the re module.
import re
new_string = re.sub(r"[^a-zA-Z0-9]","",s)
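Inside a character class, a leading ^ negates the class, so this pattern matches every character that is not a letter or a digit (spaces included).
Regular expressions are available in most other languages too, so an equivalent pattern should be usable there.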
I came up with this regex to capture all characters that you want to remove from the string.
^\d+|(?<=\w)\d+(?![\d\s])|(?<=\s)\s+
Use it like this:
regexp_replace(df1[col_name], "^\d+|(?<=\w)\d+(?![\d\s])|(?<=\s)\s+", "")
Explanation:
^\d+ - matches a run of digits at the start of the string.
(?<=\w)\d+(?![\d\s]) - matches a run of digits in the middle of a word: the positive lookbehind requires a preceding word character, and the negative lookahead rejects a following digit or whitespace character. (This matches the digits in G45up.)
(?<=\s)\s+ - matches one or more whitespace characters that follow a whitespace character, i.e. all additional spaces after the first.
Note: This regex could be inefficient on large strings because it relies on lookarounds.
^\d+|(?<=\w)\d+(?![\d\s])|(?<=\s)\s+|(?<=\w)\W|\W(?=\w)|(?<!\w)\W|\W(?!\w)

Regex match last occurrence of substring among the same substrings in the string

For example we have a string:
asd/asd/asd/asd/1#s_
I need to match this part: /asd/1#s_ or asd/1#s_
How can this be done with plain regex?
I've tried a negative lookahead like this, but it didn't work:
\/(?:.(?!\/))?(asd)(\/(([\W\d\w]){1,})|)$
It matches '/asd/asd/asd/asd/asd/asd/1#s_'
from 'prefix/asd/asd/asd/asd/asd/asd/1#s_',
but I need to match '/asd/1#s_' without all the preceding /asd/ parts.
The match should work with plain regex, without any helper functions from a programming language.
I use https://regexr.com/ to check whether the regex matches.
Here are the possible strings:
prefix/asd/asd/asd/1#s
prefix/asd/asd/asd/1s#
prefix/asd/asd/asd/s1#
prefix/asd/asd/asd/s#1
prefix/asd/asd/asd/#1s
prefix/asd/asd/asd/#s1
The asd part could be replaced with any word, like
prefix/a1sd/a1sd/a1sd/1#s
prefix/a1sd/a1sd/a1sd/1s#
...
So I need to match the last repeating part together with everything to its right.
Everything to the right can be letters, non-letters, or digits, in any order.
A more complicated string example:
prefix/a1sd/a1sd/a1sd/1s#/ds/dsse/a1sd/22$$#!/123/321/asd
This should match the part:
/a1sd/22$$#!/123/321/asd
If you want the match only, you can use \K to reset the match buffer right before the parts that you want to match:
^.*\K/a\d?sd/\S+
The pattern matches:
^ Start of string
.* Match any char except a newline, as many times as possible (to the end of the line), then backtrack as needed
\K Forget what is matched until now
/a\d?sd/ Match a, an optional digit, and sd between forward slashes
\S+ Match 1+ non-whitespace chars
See a regex demo
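Python's built-in `re` module does not support `\K`, so the following sketch swaps in a capturing group to get the same substring. This is a substitute for the technique above, not the same pattern; the helper name `last_chunk` is made up for illustration.

```python
import re

# The greedy .* runs to the end of the line and then backtracks, so the
# group ends up on the LAST "/a<optional digit>sd/" chunk.
# The capturing group stands in for \K, which Python's re does not support.
LAST_CHUNK = re.compile(r"^.*(/a\d?sd/\S+)")


def last_chunk(text: str):
    """Return the last matching chunk plus everything to its right, or None."""
    match = LAST_CHUNK.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(last_chunk("asd/asd/asd/asd/1#s_"))
    print(last_chunk("prefix/a1sd/a1sd/a1sd/1s#/ds/dsse/a1sd/22$$#!/123/321/asd"))
```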

Can I make my regex shorter?

Is there a better (shorter) regex than the one below that satisfies the following conditions?
/((.*,)|\s*)String((,.*)|\s*)/
Conditions:
--> It should match only when there is an exact match for the string (String might be one item in a comma-separated list or the only item)
Some inputs that should be accepted:
String, some other, something other
some other, String
String
Example inputs for failure:
String test,String new,Stringtest
The problem is that, after encoding, the URL gets longer because of this big regex. So I am wondering whether there is a shorter regex that meets the same conditions.
You may use
(^|,\s*)String($|\s*,)
See the regex demo.
Details
(^|,\s*) - either the start of string (^) or (|) a comma followed with 0+ whitespace chars
String - a literal String
($|\s*,) - either the end of string ($) or (|) 0+ whitespace chars followed with a comma.
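A minimal Python sketch of this pattern, run against the inputs from the question. The helper name `has_exact_item` is made up for illustration, and `re.escape` is added only so the item can be any literal text.

```python
import re


def has_exact_item(csv_text: str, item: str = "String") -> bool:
    """Return True if item appears as a whole entry of a comma-separated list."""
    # (^|,\s*) on the left and ($|\s*,) on the right, as in the answer above.
    pattern = r"(^|,\s*)" + re.escape(item) + r"($|\s*,)"
    return re.search(pattern, csv_text) is not None


if __name__ == "__main__":
    for sample in ["String, some other, something other",
                   "some other, String",
                   "String",
                   "String test,String new,Stringtest"]:
        print(repr(sample), has_exact_item(sample))
```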

Regex matches parts of a string, but not whole string

This is the regex I'm using to validate a string that can contain lowercase and uppercase letters, numbers and dash:
/([a-zA-Z0-9-])+$/
It has the following results:
abd - matches
abcd- - matches
abcd0 - matches
abcd0- - matches
abc# - doesn't match (correct)
abc#efg - matches (incorrect, it shouldn't)
What am I doing wrong?
I would say you need /^([a-zA-Z0-9-])+$/. You want to match the whole string, not just a part of it, but you're missing the ^ anchor for the beginning of the string.
Together, ^ and $ require the match to span from the beginning to the end of the string, and ([a-zA-Z0-9-])+ says there must be one or more characters from a-zA-Z0-9-.
Your regexp matches any string that ends with one or more characters from a-zA-Z0-9-, no matter what comes before them.
You can test your regular expression on regex101.com (a very good online tool for testing regular expressions, with explanations, a reference, etc.).
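The difference is easy to reproduce. The sketch below uses Python's `re.search`, which, like the JavaScript test in the question, looks for a match anywhere in the string; the constant names are made up for illustration.

```python
import re

UNANCHORED = re.compile(r"([a-zA-Z0-9-])+$")   # pattern from the question
ANCHORED = re.compile(r"^([a-zA-Z0-9-])+$")    # pattern with ^ added

if __name__ == "__main__":
    for sample in ["abd", "abcd-", "abcd0", "abcd0-", "abc#", "abc#efg"]:
        loose = UNANCHORED.search(sample)
        strict = ANCHORED.search(sample)
        # Show which part of the string each pattern actually matched.
        print(sample,
              "| unanchored:", loose.group(0) if loose else None,
              "| anchored:", strict.group(0) if strict else None)
```

For `abc#efg`, the unanchored pattern finds the trailing `efg`, which is why the string was reported as matching.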

Regex - alphabetical with hyphen

I would like a regular expression that checks whether a string has up to 14 alphanumeric characters. It can include hyphens, but not at the beginning or end.
This is what I have so far:
var patt = new RegExp("^([a-zA-Z0-9]+(-[a-zA-Z0-9])*){1,14}$");
But it's not working - http://jsfiddle.net/u6cWs/1/
Any ideas?
You need to use a positive lookahead (it counts the alphanumeric characters, each optionally followed by a hyphen).
If only a single hyphen is allowed:
^(?=([a-zA-Z0-9]-?){1,14}$)[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)?$
If multiple hyphens are allowed:
^(?=([a-zA-Z0-9]-?){1,14}$)[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$
Additional option:
^[a-zA-Z0-9](?:-?[a-zA-Z0-9]){0,13}$
Demo
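One detail worth checking for your use case: in the multiple-hyphen pattern and the additional option above, the limit of 14 applies to the alphanumeric characters, and hyphens are not counted. A small Python sketch to illustrate, with made-up constant and helper names:

```python
import re

# Multiple-hyphen pattern and the "additional option" from above, unchanged.
MULTI_HYPHEN = re.compile(
    r"^(?=([a-zA-Z0-9]-?){1,14}$)[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*$")
COMPACT = re.compile(r"^[a-zA-Z0-9](?:-?[a-zA-Z0-9]){0,13}$")


def accepted(pattern, text: str) -> bool:
    """Return True if pattern matches the whole of text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    samples = ["abc-def-ghi", "-abc", "abc-", "ab--cd",
               "a" * 15,               # 15 alphanumerics
               "a-b-c-d-e-f-g-h"]      # 8 alphanumerics + 7 hyphens = 15 chars
    for sample in samples:
        print(sample, accepted(MULTI_HYPHEN, sample), accepted(COMPACT, sample))
```

The last sample is 15 characters long in total but is still accepted by both patterns, because only its 8 alphanumeric characters count toward the limit.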
Here is a simple solution that is faster because it does not use lookaheads:
^[A-Za-z0-9](?:[-A-Za-z0-9]{0,12}[A-Za-z0-9])?$
See demo.
How does it work?
Like your original pattern, this regex is anchored between ^ and $, enforcing our limit on the number of characters.
The first character has to be a letter or digit.
The rest of the string, enclosed in a (?: ... ) non-capturing group, is made optional by the ? at the end. If it is present (that is, the string has more than one character), it must end with a letter or digit. Before that final character, you can have between 0 and 12 letters, digits or hyphens.
Optionally
If you want your regex to be a little shorter, turn on the case-insensitive option, and remove either the lower-case chars or the upper-case ones, for instance:
^[a-z0-9](?:[-a-z0-9]{0,12}[a-z0-9])?$
Use two regexes for simplicity and readability.
First check that it matches this:
/^[A-Za-z0-9-]{1,14}$/
then check that it does NOT match this:
/^-|-$/
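Here is a minimal Python sketch of the two-step check, placed next to the single lookahead-free regex from the earlier answer so the two can be compared. Both count hyphens toward the 14-character limit. The function and constant names are made up for illustration.

```python
import re

ALLOWED = re.compile(r"^[A-Za-z0-9-]{1,14}$")   # step 1: charset and length
EDGE_HYPHEN = re.compile(r"^-|-$")              # step 2: must NOT match
SINGLE = re.compile(r"^[A-Za-z0-9](?:[-A-Za-z0-9]{0,12}[A-Za-z0-9])?$")


def valid_two_step(text: str) -> bool:
    """Two-regex check: allowed characters, then no leading/trailing hyphen."""
    return bool(ALLOWED.search(text)) and not EDGE_HYPHEN.search(text)


def valid_single(text: str) -> bool:
    """Single-regex check using the lookahead-free pattern."""
    return SINGLE.search(text) is not None


if __name__ == "__main__":
    for sample in ["a", "abc-def", "-abc", "abc-", "a--b",
                   "a" * 14, "a" * 15, "", "ab#c"]:
        print(repr(sample), valid_two_step(sample), valid_single(sample))
```

Note that both versions accept consecutive hyphens such as `a--b`; if that should be rejected, the check needs an extra condition.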