How can I recognize a valid barcode using regex?

I have a barcode of the format 123456########. That is, the first 6 digits are always the same followed by 8 digits.
How would I check that a variable matches that format?

You haven't specified a language, but regex syntax is relatively uniform across implementations, so something like the following should work: 123456\d{8}
\d indicates a numeric character and is typically equivalent to the set [0-9].
{8} indicates repetition of the preceding character set exactly eight times.
Depending on how the input is coming in, you may want to anchor the regex like this:
^123456\d{8}$
Where ^ matches the beginning of the line or string and $ matches the end. Alternatively, you may wish to use word boundaries, to ensure that your bar-code strings are properly separated:
\b123456\d{8}\b
Where \b matches the empty string but only at the edges of a word (normally defined as a sequence consisting exclusively of alphanumeric characters plus the underscore, but this can be locale-dependent).
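Since no language was given, here is a minimal Python sketch of the two variants above, assuming the standard re module. The names PREFIX, is_barcode and find_barcodes are made up for illustration; re.fullmatch stands in for the ^...$ anchoring.

```python
import re

# Hypothetical fixed prefix; substitute your own six digits.
PREFIX = "123456"

# Whole-value check: fullmatch only succeeds if the entire string fits.
BARCODE = re.compile(re.escape(PREFIX) + r"\d{8}")

# Search inside free text: \b stops a match inside a longer run of word characters.
BARCODE_IN_TEXT = re.compile(r"\b" + re.escape(PREFIX) + r"\d{8}\b")


def is_barcode(value):
    """True if value is exactly the prefix followed by eight digits."""
    return BARCODE.fullmatch(value) is not None


def find_barcodes(text):
    """Return every properly separated barcode found in text."""
    return BARCODE_IN_TEXT.findall(text)
```

Use is_barcode when the variable should hold nothing but the barcode, and find_barcodes when the barcode may be embedded in surrounding text.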

123456\d{8}
123456 # Literals
\d # Match a digit
{8} # 8 times
You can change the {8} to any number of digits depending on how many are after your static ones.
Regexr will let you try out the regex.

123456\d{8}
should do it. This breaks down to:
123456 - the fixed bit; substitute whatever your fixed part is, and remember to escape any regex special characters in it (with just digits you should be fine)
\d - a digit
{8} - the number of times the previous element must be repeated, 8 in this case.
The braces can also take two numbers to give a minimum and a maximum, so you could use {6,8} if the previous element had to be repeated between 6 and 8 times.
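To make the difference between {8} and {6,8} concrete, here is a small Python sketch, assuming the standard re module. The helper name trailing_ok is invented for the example.

```python
import re

exact = re.compile(r"123456\d{8}")     # exactly 8 trailing digits
ranged = re.compile(r"123456\d{6,8}")  # between 6 and 8 trailing digits


def trailing_ok(pattern, value):
    """True if the whole value fits the given compiled pattern."""
    return pattern.fullmatch(value) is not None
```

With these, a 12-character value such as 123456123456 is rejected by the exact pattern but accepted by the ranged one.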

The way you describe it, it's just
^123456[0-9]{8}$
...where you'd replace 123456 with your 6 known digits. I'm using [0-9] instead of \d because I don't know what flavor of regex you're using, and \d allows non-Arabic numerals in some flavors (if that concerns you).
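Python 3 happens to be one of the flavors where this matters: for str patterns, \d matches any Unicode decimal digit unless re.ASCII is set. A short sketch of the difference, with the test value built from Arabic-Indic digits purely as an illustration:

```python
import re

ascii_only = re.compile(r"123456[0-9]{8}")
unicode_digits = re.compile(r"123456\d{8}")        # \d is Unicode-aware for str patterns
ascii_flag = re.compile(r"123456\d{8}", re.ASCII)  # re.ASCII narrows \d to [0-9]

# ASCII prefix followed by eight Arabic-Indic digits (U+0661 to U+0668).
mixed = "123456" + "\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668"

print(unicode_digits.fullmatch(mixed) is not None)  # \d accepts the non-ASCII digits
print(ascii_only.fullmatch(mixed) is not None)      # [0-9] does not
```

If only ASCII digits should count, either [0-9] or the re.ASCII flag gives the stricter behavior.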

Related

Why does my regular expression that ignores the order of the characters not work?

I want to build a pattern for a string that:
is at least 7 characters long
has at least 1 digit, max 5
has at least 3 uppercase letters, max 5
has at least 1 lowercase letter, max 5
has at least 1 special character, max 5
How do I express this as a regular expression?
I can do something like
^((?=.*[A-Z]{3,5})(?=.*[a-z]{1,5})(?=.*[0-9]{1,5})(?=.*[.~!##$%^_&-]{1,5}))(?=.{7,20}).*$
I don't want to require this particular order. In fact, any mixed order should be accepted; only the character counts should be enforced.
This matches:
PASSW120P45ccb^&#%#
But this one does not:
PA12S1SW2045ccb^&#%#
How can I fix this?
P&#Ass120W45ccb^%#
P&#Ass20W45cb^%#
Please have a look at https://regex101.com/r/vF2yO7/51

You need to work with the negated character classes, put them into non-capturing groups, and repeat those groups:
^
(?=(?:\D*\d){1,5})
(?=(?:[^A-Z]*[A-Z]){3,5})
(?=(?:[^a-z]*[a-z]){1,5})
(?=(?:[^.\~!##$%^_&-]*[.\~!##$%^_&-]){1,5})
.{7,20}
$
See a demo on regex101.com.
The structure here is always the same. Taking the digits as an example: match anything that is not a digit zero or more times, followed by a digit, and repeat that whole group 1 to 5 times. In general:
(?=(?:not_what_you_want*what_you_want){min_times,max_times})
In the expression above, all positive lookaheads follow this scheme: [^...] negates the characters to be matched in the class, and \D* is essentially the same as [^\d]*.
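Here is a Python sketch that runs both patterns against the two strings from the question, assuming the standard re module; the answer's pattern is compiled with re.VERBOSE so it can keep its multi-line layout. The names ORIGINAL, SCATTERED, ordered and scattered are labels chosen for this example.

```python
import re

# Pattern quoted from the question: [A-Z]{3,5} needs the capitals in one consecutive run.
ORIGINAL = re.compile(
    r"^((?=.*[A-Z]{3,5})(?=.*[a-z]{1,5})(?=.*[0-9]{1,5})"
    r"(?=.*[.~!##$%^_&-]{1,5}))(?=.{7,20}).*$"
)

# Pattern from the answer: each lookahead skips non-members between the members it counts.
SCATTERED = re.compile(
    r"""
    ^
    (?=(?:\D*\d){1,5})
    (?=(?:[^A-Z]*[A-Z]){3,5})
    (?=(?:[^a-z]*[a-z]){1,5})
    (?=(?:[^.\~!##$%^_&-]*[.\~!##$%^_&-]){1,5})
    .{7,20}
    $
    """,
    re.VERBOSE,
)

ordered = "PASSW120P45ccb^&#%#"
scattered = "PA12S1SW2045ccb^&#%#"

print(ORIGINAL.match(scattered) is not None)   # capitals never form a run of three
print(SCATTERED.match(scattered) is not None)  # interleaved characters are skipped over
```

One thing this makes visible: the lookaheads are not anchored at the end, so they enforce the minimum counts but not the maximums. The scattered string contains seven digits and still matches.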

Need a Regex that contains at least one number, zero or more letters, no spaces, min/max

I need a regular expression that will match a string that contains:
at least one number
zero or more letters
no other characters such as spaces
The string must also be a minimum of 8 characters and a maximum of 13 characters.
Placement of the numbers and/or letters within the 8-13 character string does not matter. I haven't figured out how to make sure that the string contains a number, but here are some expressions that don't work because they are picking up spaces in the online tool Regexr. Take a look below:
- ([\w^/s]){8,13}
- ([a-zA-Z0-9]){8,13}
- ([a-zA-Z\d]){8,13}
I am specifically looking to exclude spaces and special characters. The linked and related questions all appear to allow for these characters. This is not for validating passwords, it is for detecting case numbers in natural language processing. This is different from "Password REGEX with min 6 chars, at least one letter and one number and may contain special characters" because I am looking for at least one number but zero or more letters. I also do not want to return strings that contain any special characters including spaces.

This is the typical password-validation pattern, adapted to your requirements.
Note that it will also match a string of 8 to 13 digits with no letters (which is what you asked for).
Ten million + 1 (and counting) happy customers ..
^(?=.*\d)[a-zA-Z\d]{8,13}$
Explained
^ # Beginning of string
(?= .* \d ) # Lookahead for a digit
[a-zA-Z\d]{8,13} # Consume 8 to 13 alphanum characters
$ # End of string
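A quick way to check what this pattern accepts is to run it over a few inputs. The sketch below assumes Python's re module; CASE_NUMBER and looks_like_case_number are names invented for the example.

```python
import re

# Lookahead demands a digit somewhere; the body allows only letters and digits.
CASE_NUMBER = re.compile(r"^(?=.*\d)[a-zA-Z\d]{8,13}$")


def looks_like_case_number(value):
    """True for 8 to 13 alphanumeric characters with at least one digit."""
    return CASE_NUMBER.match(value) is not None


for sample in ["abc12345", "12345678", "abcdefgh", "abc 1234", "abc1234"]:
    print(sample, looks_like_case_number(sample))
```

The all-letter string and the string with a space are both rejected, as is the 7-character one.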

I've seen the answer above (by sln) all over the internet. It does enforce the digit: the (?=.*\d) lookahead fails on a string of 8 to 13 letters with no digit, because .* may match any run of characters but \d still has to match after it.
What it does not do is require both an uppercase and a lowercase letter.
In order to match at least 1 digit, 1 A-Z and 1 a-z in a password that's at least 8 characters long, you'll need something like this:
(?=.{1,7}\d)(?=.{1,7}[a-z])(?=.{1,7}[A-Z])[a-zA-Z\d]{8,13}
it uses 3 lookaheads:
(?=.{1,7}\d)
(?=.{1,7}[a-z])
(?=.{1,7}[A-Z])
Each lookahead looks for its target (e.g. the first digit) but requires 1 to 7 characters of any kind before it, so the target has to appear somewhere in positions 2 through 8 of the match.
Then it will match 8 to 13 alphanumeric characters.
NOTE to PowerShell users:
Use a capture group to be able to extract the result:
$password = [regex]::Match($stringToSearch, '(?=.{1,7}\d)(?=.{1,7}[a-z])(?=.{1,7}[A-Z])([a-zA-Z\d]{8,13})').Groups[1].Value
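For comparison, a Python sketch of the same extraction, assuming the re module handles these constructs the same way as the .NET engine. The function name extract is made up here. It also probes the {1,7} window: each lookahead needs at least one character in front of its target, so a candidate whose only uppercase letter comes first is not found.

```python
import re

PATTERN = re.compile(
    r"(?=.{1,7}\d)(?=.{1,7}[a-z])(?=.{1,7}[A-Z])([a-zA-Z\d]{8,13})"
)


def extract(text):
    """Return the first captured candidate, or None when nothing matches."""
    found = PATTERN.search(text)
    return found.group(1) if found else None


print(extract("ref xY7abcdefg!"))  # candidate embedded in other text
print(extract("Abcdefg1x"))        # only uppercase letter is in first position
```

If a leading digit or capital must be allowed, replacing {1,7} with {0,7} in the lookaheads would be one way to widen the window.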

Decoding a regex... I know what its function is but I want to understand exactly what is happening

I have a regular expression that I'm going to be using to verify that an inputted number is in standard U.S. telephone format (i.e (###) ###-####). I am new to regex and still having some trouble figuring out the exact function of each character. If someone would go through this piece by piece/verify that I am understanding I would really appreciate it. Also if the regex is wrong I would obviously like to know that.
\D*?(\d\D*?){10}
What I think is happening:
\D*?( indicates an escape sequence for the parenthesis metacharacter... not sure why the \D*? is necessary
\d indicating digits
\D*? indicating there is a non-digit character (-) followed by the closing parenthesis.
{10} for the 10 digits
I feel very unsure explaining this, like my understanding is very vague in terms of why the regex is in the order that it is etc. Thanks in advance for help/explanations.
EDIT
It seems like this is not the best regex for what I want. Another possibility was [(][0-9]{3}[)] [0-9]{3}-[0-9]{4}, but I was told this would fail. I suppose I'll have to do a little more work with regular expressions to figure this out.

\D matches any non-digit character.
* means that the previous element is repeated 0 or more times.
*? means that the previous element is repeated 0 or more times, but lazily: it matches as few characters as possible and only takes more when the rest of the pattern needs it. It is perhaps a bit difficult at the start, but in your regex the next element is \d, so \D*? matches the fewest characters it can before the next digit.
( ... ) is a capture group, and is also used to group things. For instance, {10} means that the previous character or group is repeated exactly 10 times.
Now, \D*?(\d\D*?){10} will match exactly 10 digits, whether or not non-digit characters come first, with non-digit characters allowed between the digits.
[(][0-9]{3}[)] [0-9]{3}-[0-9]{4}
This regex is a bit better since it doesn't just accept anything (like the first regex does) and will match the format (###) ###-#### (notice the space is a character in regex!).
The new things introduced here are the square brackets. These represent character classes. [0-9] means any character from 0 to 9 inclusive, so it will match 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9. Adding {3} after it makes it match three characters from that class in a row, and since the class contains only digits, it will match exactly 3 digits.
A character class can be used to escape certain characters, such as ( or ) (note I mentioned earlier they are for capturing groups, or grouping) and thus, [(] and [)] are literal ( and ) instead of being used for capturing/grouping.
You can also use backslashes (\) to escape characters. Thus:
\([0-9]{3}\) [0-9]{3}-[0-9]{4}
Will also work. I would also recommend the use of line anchors ^ and $ if you're only trying to see if a phone number matches the above format. This ensures that the string has only the phone number, and nothing else. ^ matches the beginning of a line and $ matches the end of a line. Thus, the regex will become:
^\([0-9]{3}\) [0-9]{3}-[0-9]{4}$
However, I don't know all the combinations of the different formats of phone numbers in the US, so this regex might need some tweaking if you have different phone number formats.
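A minimal Python sketch of the strict pattern, assuming the standard re module; re.fullmatch plays the role of the ^ and $ anchors, and is_us_phone is a name chosen for the example.

```python
import re

US_PHONE = re.compile(r"\([0-9]{3}\) [0-9]{3}-[0-9]{4}")


def is_us_phone(value):
    """True only when the whole string has the form (###) ###-####."""
    return US_PHONE.fullmatch(value) is not None


for sample in ["(555) 123-4567", "555-123-4567", "(555)123-4567"]:
    print(sample, is_us_phone(sample))
```

Only the first sample passes; the missing parentheses and the missing space both cause a rejection.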

\D is "not a digit"; \d is "digit". With that in mind:
This matches zero or more non-digits, then it matches a digit and any number of non-digit characters 10 times. This won't actually verify that the number is formatted properly, just that it contains 10 digits. I suspect that the regex isn't what you want in the first place.
For example, the following will match your regex:
this is some bad text 1 and some more 2 and more 34567890
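This is easy to confirm with a short Python sketch, assuming the re module. The helper name loose_match is invented for illustration.

```python
import re

LOOSE = re.compile(r"\D*?(\d\D*?){10}")

bad = "this is some bad text 1 and some more 2 and more 34567890"


def loose_match(text):
    """Return the matched substring, or None if fewer than 10 digits are found."""
    found = LOOSE.search(text)
    return found.group(0) if found else None


print(loose_match(bad) == bad)        # the whole sentence is one match
print(loose_match("(555) 123-4567"))  # a well-formed number matches too
print(loose_match("12345"))           # too few digits
```

So the pattern counts digits rather than checking a layout, which is why the stricter bracketed form is usually the better fit for validation.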

\D matches a character that is not a digit
* repeats the previous item 0 or more times
? after * makes the repetition lazy, so it matches as few characters as possible
\d matches a digit
so your group matches a digit followed by any non-digits, and {10} repeats that group 10 times

Regular Expression - Start with 3538 and then contain 8 digits

E.g. match 353812345678 So far I have ^3538{1}[\d]{8} which works but does not restrict the length. How do I make sure the length is only a maximum of 12 digits?

If you want the number to be the only thing in the string: ^3538\d{8}$
If you just want the number in a string: \b3538\d{8}\b
^ is the start-of-string anchor, while $ is the end-of-string anchor, so the first one restricts the number to be the only thing on the line.
In the other one, \b means a word boundary, so it just means no other letters or digits may come directly before or after the number.
Also, note, in your original regex, the {1} is redundant, and [\d] means the same as \d.
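The three behaviours can be compared side by side in a small Python sketch, assuming the standard re module. The constant names are labels for this example only.

```python
import re

UNANCHORED = re.compile(r"^3538{1}[\d]{8}")  # pattern from the question, no end anchor
WHOLE = re.compile(r"^3538\d{8}$")           # number must be the whole string
IN_TEXT = re.compile(r"\b3538\d{8}\b")       # number may sit inside other text

too_long = "3538123456789"  # 13 digits

print(UNANCHORED.search(too_long) is not None)  # matches the first 12 digits
print(WHOLE.search(too_long) is not None)       # rejected by the $ anchor
print(IN_TEXT.findall("ring 353812345678 or 3538123456789"))
```

In the last line only the 12-digit number is returned, because the 13-digit one has no word boundary after its twelfth digit.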

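^3538{1}[\d]{8}[^\d] will ensure you have 3538 followed by 8 digits and then something that is NOT a digit, thus limiting the length. Note that [^\d] has to consume a character, so the number must be followed by at least one non-digit; a string that ends right after the twelfth digit will not match.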

Add a dollar sign ($) at the end of the regex:
^3538{1}[\d]{8}$

How to validate numeric values which may contain dots or commas?

I need a regular expression that validates one or two digits, then a , or ., and then one or two digits again.
So, these are valid inputs:
11,11
11.11
1.1
1,1

\d{1,2}[\,\.]{1}\d{1,2}
EDIT: updated to meet the new requirements (see comments) ;)
EDIT: removed the unnecessary quantifier, as per Bryan
^[0-9]{1,2}([,.][0-9]{1,2})?$

In order to represent a single digit in the form of a regular expression you can use either:
[0-9] or \d
In order to specify how many times the number appears you would add
[0-9]*: the star means there are zero or more digits
[0-9]{2}: {N} means N digits
[0-9]{0,2}: {N,M} N digits to M digits
Let's say I want to represent a one- or two-digit number (0 to 99); I would express it as such:
[0-9]{1,2} or \d{1,2}
Or let's say we were working with a binary display of a byte: each digit would be 0 or 1 and the length would be 8, so we would represent it as follows:
[0-1]{8} representation of a binary byte
Then if you want to add a , or a . symbol you would use:
\, or \. or you can use [.] or [,]
You can also state a selection between possible values as such
[.,] means either a dot or a comma symbol
And you just need to concatenate the pieces together, so in the case where you want to represent a 1 or 2 digit number followed by either a comma or a period and followed by two more digits you would express it as follows:
[0-9]{1,2}[.,]\d{1,2}
Also note that regular expression strings inside C++ strings must be double-back-slashed so every \ becomes \\
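The same backslash concern exists outside C++. As an illustration in Python (an assumption on my part, since the question names no language), a raw string passes the backslash through untouched, while an ordinary string needs it doubled. The names raw, doubled, BYTE and is_decimal are invented for the sketch.

```python
import re

# Raw string: the regex engine receives the single backslash in \d directly.
raw = r"[0-9]{1,2}[.,]\d{1,2}"
# Ordinary string: the backslash is doubled so that one survives string parsing.
doubled = "[0-9]{1,2}[.,]\\d{1,2}"

BYTE = re.compile(r"[0-1]{8}")  # eight binary digits, as in the byte example


def is_decimal(value):
    """True for one or two digits, a dot or comma, then one or two digits."""
    return re.fullmatch(raw, value) is not None


print(raw == doubled)
print(is_decimal("11,11"), is_decimal("111,1"))
```

Both spellings produce the identical pattern text, so the choice is purely about how the host language quotes strings.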

\d means a digit in most languages. You can also use [0-9] in all languages. For the "period or comma" use [\.,]. Depending on your language you may need more backslashes based on how you quote the expression. Ultimately, the regular expression engine needs to see a single backslash.
* means "zero-or-more", so \d* and [0-9]* mean "zero or more digits". ? means "zero-or-one". Neither of those quantifiers means exactly one. Most languages also let you use {m,n} to mean "between m and n" (i.e. {1,2} means "between 1 and 2").
Since the dot or comma and additional numbers are optional, you can put them in a group and use the ? quantifier to mean "zero-or-one" of that group.
Putting that all together you can use:
\d{1,2}([\.,]\d{1,2})?
Meaning, one or two digits \d{1,2}, followed by zero-or-one of a group (...)? consisting of a dot or comma followed by one or two digits [\.,]\d{1,2}
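A short Python sketch of the optional group, assuming the re module and using re.fullmatch so that the whole input has to fit. The name is_number is hypothetical.

```python
import re

# One or two digits, optionally followed by a dot or comma and one or two more digits.
NUMBER = re.compile(r"\d{1,2}([\.,]\d{1,2})?")


def is_number(value):
    """True if the entire value fits the pattern."""
    return NUMBER.fullmatch(value) is not None


for sample in ["11", "11,1", "1.11", "11.", "111", ",11"]:
    print(sample, is_number(sample))
```

The first three samples pass; a trailing separator, a third leading digit, or a missing leading digit all fail.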

\d{1,2}[,.]\d{1,2}
\d means a digit, the {1,2} part means 1 or 2 of the previous character (\d in this case) and the [,.] part means either a comma or dot.

Shortest regexp I know (16 char)
^\d\d?[,.]\d\d?$
The ^ and $ mean the beginning and end of the input string (without them, the 23.45 part of a string like 123.45 would be matched). The \d means a digit, \d? means an optional digit, and [,.] means a dot or comma.
[ // tests
'11,11', // valid
'11.11',
'1.1',
'1,1',
'111,1', // nonvalid
'11.111',
'11-11',
',11',
'11.',
'a.11',
'11,a',
].forEach(n=> console.log(`${n}\t valid: ${ /^\d\d?[,.]\d\d?$/.test(n) }`))

If you want to be very permissive and require only that the string ends with a comma or dot followed by two digits:
^([,.\d]+)([,.]\d{2})$
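To see just how permissive this is, here is a Python sketch that returns the two captured groups, assuming the re module. The function name split_amount is made up for the example.

```python
import re

PERMISSIVE = re.compile(r"^([,.\d]+)([,.]\d{2})$")


def split_amount(value):
    """Return (leading part, final separator plus two digits), or None."""
    found = PERMISSIVE.match(value)
    return found.groups() if found else None


for sample in ["1,234.56", "1.234,56", "11,11", "1,1", ",,.12"]:
    print(sample, split_amount(sample))
```

Two things stand out: an input like 1,1 from the question is rejected because exactly two final digits are required, while a string of bare separators such as ,,.12 is accepted.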
