Finding a regex for an ID format with hyphens

I'm trying to validate an id using regex. The id is in the below format.
alphaNumeric-alphaNumeric-alphaNumeric (And the total length should be 14, and there should be two hyphens)
Below examples are valid formats
AS12-AS12-AB1C
AS-12ASBC-1234
N-IKNKL-A2LI40
The catch is that a hyphen must not appear at the beginning or at the end, and no two hyphens may be adjacent.
So far I'm using a positive lookahead for the length check, (?=^.{14}$), and matching the hyphen logic with (?=^[^-]*-[^-]*-[^-]*$)[a-zA-Z0-9-]+. So the regex I'm using is
(?=^.{14}$)(?=^[^-]*-[^-]*-[^-]*$)[a-zA-Z0-9-]+
The problem is that this still allows a hyphen at the beginning or at the end, and two hyphens in a row, all of which should fail my id validation check.

You may use this regex:
^(?=.{14}$)[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+){2}$
RegEx Details:
^: Match Start
(?=.{14}$): Lookahead condition to assert that the input is exactly 14 characters long
[a-zA-Z0-9]+: Match 1 or more of alphanumeric characters
(?:: Start a non-capturing group
-: Match a hyphen
[a-zA-Z0-9]+: Followed by 1 or more of alphanumeric characters
){2}: End non-capturing group. Match 2 instances of this group
$: Match end
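As a quick check of the pattern above, here is a minimal Python sketch that runs it against the examples from the question plus a few inputs that break each rule. The names ID_PATTERN and is_valid_id are illustrative, and Python is an assumption, since the question does not name a regex flavour.

```python
import re

# Pattern copied from the answer above; ID_PATTERN and is_valid_id are
# illustrative names, not part of the original question.
ID_PATTERN = re.compile(r"^(?=.{14}$)[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+){2}$")


def is_valid_id(candidate: str) -> bool:
    """True for 14 characters split into three alphanumeric parts by two hyphens."""
    # fullmatch also rejects a trailing newline, which "$" alone would tolerate.
    return ID_PATTERN.fullmatch(candidate) is not None


samples = [
    "AS12-AS12-AB1C",   # valid example from the question
    "AS-12ASBC-1234",   # valid example from the question
    "N-IKNKL-A2LI40",   # valid example from the question
    "-AS12AS12-AB1C",   # leading hyphen
    "AS12AS12AB1C--",   # trailing, adjacent hyphens
    "AS12--AS12AB1C",   # two hyphens together
    "AS12-AS12-AB1",    # only 13 characters
]
for sample in samples:
    print(f"{sample!r}: {is_valid_id(sample)}")
```

The three examples from the question are accepted and the other four are rejected.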

Related

Regex pattern reads correctly but doesn't produce desired result

I am testing the following regex:
(?<=\d{3}).+(?!',')
Test string:
187 SURNAME First Names 7 Every Street, Welltown Racing Driver
The sequence I require is:
Begin after 3 digit numeral
Read all characters
Don't read the comma
In other words:
SURNAME First Names 7 Every Street
But as the demo shows, the negative lookahead for the comma has no bearing on the result. I can't see anything wrong with my lookarounds.

You could match the 3 digits, and make use of a capture group capturing any character except a comma.
\b\d{3}\b\s*([^,]+)
Explanation
\b\d{3}\b Match 3 digits between word boundaries to prevent partial word matches
\s* Match optional whitespace chars
([^,]+) Capture group 1, match 1+ chars other than a comma
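A minimal Python sketch of the capture-group approach, using the test string from the question. Python is an assumption here; the question only mentions regex101.

```python
import re

line = "187 SURNAME First Names 7 Every Street, Welltown Racing Driver"

# Match the 3-digit number, skip whitespace, then capture up to the first comma.
match = re.search(r"\b\d{3}\b\s*([^,]+)", line)
wanted = match.group(1) if match else None
print(wanted)
```

The wanted text is in group 1, not in the whole match, which also contains the leading number.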

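.+ is greedy, so it consumes everything up to the end of the line.
The negative lookahead (?!,) is then checked at the end of the string, where no comma can follow, so it is guaranteed to succeed.
I'm not sure the quotes are correct for whichever flavour of regex you are using. A bare comma seems more correct.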
Try:
(?<=\d{3})[^,]+
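To make the difference concrete, here is a small Python sketch (Python is an assumption about the flavour) comparing the greedy attempt, written with a bare comma, against the negated character class. Both results start with a space, because the lookbehind does not consume the whitespace after the number.

```python
import re

line = "187 SURNAME First Names 7 Every Street, Welltown Racing Driver"

# Greedy .+ runs to the end of the line; the lookahead is only tested there.
greedy = re.search(r"(?<=\d{3}).+(?!,)", line).group(0)

# [^,]+ cannot cross a comma, so the match stops right before it.
fixed = re.search(r"(?<=\d{3})[^,]+", line).group(0)

print(repr(greedy))
print(repr(fixed.strip()))
```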

I wrote url validation regex but the regex is very slow

I know this is slow because of ([\.\-][a-z0-9])*. But I don't know how to optimize it.
^https:\/\/([a-z0-9]+([\.\-][a-z0-9])*)+(\.([a-z]{2,11}|[0-9]{1,5}))(:[0-9]{1,5})?(\/.*)?$

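You don't need the nested quantifier )*)+ in your pattern. Because the inner group is optional, the outer + can split a run of [a-z0-9] characters in many different ways, which can lead to catastrophic backtracking.
Note that you only have to escape the forward slash if the delimiters for the regex are also /, and you don't have to escape the dot or the hyphen inside the character class: [.-] is enough.
If you don't need those capture groups afterwards, you can omit them by using non-capturing groups (?:...).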
^https:\/\/[a-z0-9]+(?:[.-][a-z0-9]+)*\.(?:[a-z]{2,11}|[0-9]{1,5})(?::[0-9]{1,5})?(\/.*)?$
The pattern matches:
^ Start of string
https:\/\/ Match https:// As you only want to match https
[a-z0-9]+ Match 1+ times any of the listed
(?:[.-][a-z0-9]+)* Optionally repeat matching . or - and 1+ times any of the listed
\.(?:[a-z]{2,11}|[0-9]{1,5}) Match a dot followed by either 2-11 chars a-z or 1-5 digits
(?::[0-9]{1,5})? Optionally match : and 1-5 digits
(\/.*)? Optionally match / and the rest of the line
$ End of string
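A short Python sketch of the rewritten pattern, assuming Python's re module (where the forward slash needs no escaping). The sample URLs and the name URL_PATTERN are made up for illustration. The last input is the kind of string that is slow with nested quantifiers: a long run of allowed characters followed by one character that cannot match.

```python
import re

# Rewritten pattern from the answer above, without the nested quantifier.
URL_PATTERN = re.compile(
    r"^https://[a-z0-9]+(?:[.-][a-z0-9]+)*"
    r"\.(?:[a-z]{2,11}|[0-9]{1,5})(?::[0-9]{1,5})?(/.*)?$"
)

candidates = [
    "https://example.com",
    "https://sub.example-site.org:8080/path?q=1",
    "http://example.com",            # wrong scheme
    "https://example",               # no dot-separated final label
    "https://-example.com",          # host starts with a hyphen
    "https://" + "a" * 40 + "!",     # long run ending in an invalid character
]
for url in candidates:
    print(url, URL_PATTERN.match(url) is not None)
```

With the flat structure, the last input fails after backing off one character at a time instead of retrying every way of splitting the run.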

Match all except specific group

I have a test string repo-2019-12-31-14-30-11.gz and I want to exclude 2019-12-31-14-30-11.gz from that string and match everything else. Digits with date and hour can be different. String at the beginning of text can be any word, can contain digits, dashes or underscores. Constant characters are:
dash between repo name and date
.gz at end of text
I tried following regex:
^.*(?!-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}.gz$)
but it always matches whole text

The pattern that you tried ^.*(?!-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}.gz$) always matches the whole text because .* will first match until the end of the string. Then at the end of the string, it will assert that what is directly on the right is not the date like pattern.
That assertion will succeed as it is at the end of the string.
You could use a capturing group with a character class matching word characters or a hyphen and use that in the replacement:
^([\w-]+)-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}\.gz$
If the name cannot start with a hyphen and cannot contain consecutive hyphens, you could repeat matching a hyphen followed by word characters in a grouping structure \w+(?:-\w+)*
^(\w+(?:-\w+)*)-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}\.gz$
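A minimal Python sketch of using capture group 1 to keep only the repository name. Python and the helper name repo_name are assumptions, and the second file name is a made-up example with dashes and an underscore in the name.

```python
import re

# First pattern from the answer above: group 1 is everything before the timestamp.
ARCHIVE = re.compile(r"^([\w-]+)-\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2}\.gz$")


def repo_name(filename: str):
    """Return the part before the -YYYY-MM-DD-HH-MM-SS.gz suffix, or None."""
    m = ARCHIVE.match(filename)
    return m.group(1) if m else None


print(repo_name("repo-2019-12-31-14-30-11.gz"))
print(repo_name("my_repo-v2-2019-12-31-14-30-11.gz"))
print(repo_name("repo.gz"))

# The same group can be used in a replacement to drop the suffix.
print(ARCHIVE.sub(r"\1", "repo-2019-12-31-14-30-11.gz"))
```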

RegEx for identifying a date followed by a special pattern

I have a pattern of strings/values occurring at different interval. The Pattern is as follows:
30/09/2016 2,085,669 0 0 UC No
Date>SPACE>Number separated by comma>SPACE> NUMBER> SPACE> NUMBER> SPACE>STRING>SPACE>NUMBER
How do I identify this and extract it from a cell? I have been trying to use regex to solve this problem. Please note that the pattern can occur anywhere in a single cell, e.g.
Somestring(space)(30/09/2016 2,085,669 0 0 UC No)(space) More string
Somemorestring(space)(30/09/2016 2,085,669 0 0 UC No)
Brackets are for illustration only
To identify the date I am using the regex below. It is not the best way, but it does the job.
(^\d{1,2}\/\d{1,2}\/\d{4}$)
How do I stitch this together with the rest of the pattern?

You are only matching the date-like part, between the anchors that assert the start ^ and the end $ of the string.
Note that if you only want to match the value, you can omit the parentheses () that form a capturing group around the expression.
You could extend it to:
^\d{1,2}\/\d{1,2}\/\d{4} \d+(?:,\d+)+ \d+ \d+ [A-Za-z]+ [A-Za-z]+$
Explanation
^ Start of string
\d{1,2}\/\d{1,2}\/\d{4} Match date like pattern
\d+(?:,\d+)+ Match 1+ digits, then repeat 1+ times a comma followed by 1+ digits
\d+ \d+ Match two runs of 1+ digits, each preceded by a space
[A-Za-z]+ [A-Za-z]+ Match two runs of 1+ chars A-Z or a-z, each preceded by a space
$ End of string
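Because of the anchors, the extended pattern only matches when the cell contains nothing but the row. The Python sketch below shows that, and then tries the same pattern with ^ and $ removed, which is an assumption about how to handle the "anywhere in the cell" case from the question. The sample cell is adapted from the question.

```python
import re

# Anchored pattern from the answer above (no need to escape / in Python).
ANCHORED = re.compile(
    r"^\d{1,2}/\d{1,2}/\d{4} \d+(?:,\d+)+ \d+ \d+ [A-Za-z]+ [A-Za-z]+$"
)
# Assumption: the same pattern with ^ and $ dropped, so it can match mid-cell.
UNANCHORED = re.compile(
    r"\d{1,2}/\d{1,2}/\d{4} \d+(?:,\d+)+ \d+ \d+ [A-Za-z]+ [A-Za-z]+"
)

row = "30/09/2016 2,085,669 0 0 UC No"
cell = "Somestring " + row + " More string"

print(ANCHORED.search(row) is not None)    # the row on its own
print(ANCHORED.search(cell) is not None)   # the row embedded in other text
found = UNANCHORED.search(cell)
print(found.group(0) if found else None)
```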

If you only wish to extract the date from anywhere in a string, this expression uses two capturing groups before and after the date, and the middle group captures the desired date:
(.*?)(\d{1,2}\/\d{1,2}\/\d{4})(.*)
It works without the start ^ and end $ anchors, so you may not want to use them here.
If you wish to match and capture everything, you might just want to add more boundaries, and match patterns step by step, maybe similar to this expression:
(.*?)(\d{1,2}\/\d{1,2}\/\d{4})\s+([0-9,]+)\s+([0-9]+)\s+([0-9]+)\s+([A-Z]+)\s+(No)(.*)
I have added extra boundaries just to be safe; you can simplify the expression if you don't need them.
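A minimal Python sketch of the step-by-step capturing expression, applied to a cell adapted from the question. Python and the variable names are assumptions, and converting the comma-separated number to an integer is an extra step added for illustration.

```python
import re

# Capturing expression from the answer above (no need to escape / in Python).
FIELDS = re.compile(
    r"(.*?)(\d{1,2}/\d{1,2}/\d{4})\s+([0-9,]+)\s+([0-9]+)\s+([0-9]+)"
    r"\s+([A-Z]+)\s+(No)(.*)"
)

cell = "Somestring 30/09/2016 2,085,669 0 0 UC No More string"
m = FIELDS.search(cell)
if m:
    before, date, amount, first, second, code, flag, after = m.groups()
    # Illustrative extra step: strip the thousands separators.
    amount_value = int(amount.replace(",", ""))
    print(date, amount_value, first, second, code, flag)
```

Note that the last field is hard-coded as No in this expression, so rows with another value there would need a broader group.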

Regex to match 4 groups of letters/numbers, separated by hyphens

I need a regular expression that will match this pattern (case doesn't matter):
066B-E77B-CE41-4279
4 groups of letters or numbers 4 characters long per group, hyphens in between each group.
Any help would be greatly appreciated.

^(?:\w{4}-){3}\w{4}$
Explanation:
^ # must match beginning of string
(?: # make a non-capturing group (for duplicating entry)
\w{4} # a-z, A-Z, 0-9 or _ matching 4 times
- # hyphen
){3} # this group matches 3 times
\w{4} # 4 more of the letters numbers or underscore
$ # must match end of string
That would be my best bet. Then you can use Regex Match (static).
P.S. If you don't want to match underscores, the \w above can be replaced (both times) with [a-zA-Z0-9] (a character class matching lowercase and uppercase letters and digits), e.g.
^(?:[a-zA-Z0-9]{4}-){3}[a-zA-Z0-9]{4}$
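The difference between the two variants can be seen in a short sketch. It is written in Python as an assumption, since the answer itself points at .NET's Regex class, and the names KEY_WORD and KEY_ALNUM are illustrative. In Python, \w on str patterns also matches non-ASCII letters and digits unless re.ASCII is passed.

```python
import re

KEY_WORD = re.compile(r"^(?:\w{4}-){3}\w{4}$")                     # allows underscores
KEY_ALNUM = re.compile(r"^(?:[a-zA-Z0-9]{4}-){3}[a-zA-Z0-9]{4}$")  # letters and digits only

samples = [
    "066B-E77B-CE41-4279",   # example from the question
    "066b-e77b-ce41-4279",   # lowercase is accepted by both classes
    "06_B-E77B-CE41-4279",   # underscore: only the \w version accepts it
    "066B-E77B-CE41",        # only three groups
]
for sample in samples:
    print(sample, bool(KEY_WORD.match(sample)), bool(KEY_ALNUM.match(sample)))
```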

Try:
[A-Za-z0-9]{4}\-[A-Za-z0-9]{4}\-[A-Za-z0-9]{4}\-[A-Za-z0-9]{4}

With such a small sample of data, it's not easy to be certain what you actually want.
I'm going to assume that all the characters in that string are hex digits, and that's what you need to search for.
In that case, you would need a regular expression something like this:
^[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}$
If they can be any letter, then replace the fs with zs.
Oh, and use myRE.IgnoreCase = True to make it case insensitive.
If you need further advice on regular expressions, I'd recommend http://www.regular-expressions.info/ as a good site. They even have a VB.net-specific page.

Assuming from your example:
There are four groups of letters, separated by dashes.
Each group is four letters.
The letters are hexadecimal digits.
This pattern would match that:
^[\dA-F]{4}-[\dA-F]{4}-[\dA-F]{4}-[\dA-F]{4}$
Note that ^ and $ match the beginning and end of the string, which is important if you want to match the entire string and not check if the pattern occurs inside a string.
You could also make use of the repetitions in the pattern:
^(?:[\dA-F]{4}-){3}[\dA-F]{4}$
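Since the question says case doesn't matter, the hexadecimal pattern needs a case-insensitive flag. The Python sketch below uses re.IGNORECASE for that and also shows what the anchors change. Python, the flag choice, and the names HEX_KEY and HEX_KEY_ANYWHERE are assumptions, not part of the answer.

```python
import re

# Compact hex pattern from the answer above, made case-insensitive.
HEX_KEY = re.compile(r"^(?:[\dA-F]{4}-){3}[\dA-F]{4}$", re.IGNORECASE)
# Same pattern without anchors: finds a key inside a longer string.
HEX_KEY_ANYWHERE = re.compile(r"(?:[\dA-F]{4}-){3}[\dA-F]{4}", re.IGNORECASE)

print(bool(HEX_KEY.search("066B-E77B-CE41-4279")))       # whole string is a key
print(bool(HEX_KEY.search("066b-e77b-ce41-4279")))       # lowercase hex digits
print(bool(HEX_KEY.search("066G-E77B-CE41-4279")))       # G is not a hex digit
print(bool(HEX_KEY.search("id=066B-E77B-CE41-4279;")))   # extra text around the key
embedded = HEX_KEY_ANYWHERE.search("id=066B-E77B-CE41-4279;")
print(embedded.group(0) if embedded else None)
```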
