Regular Expression to extract alphanumeric parts of a URL?

Given any URL, like:
https://stackoverflow.com/v1/summary/1243PQ/details/P1/9981
How do I extract the numeric or alphanumeric part of the URL? I.e. the following strings from the url given above:
1. v1
2. 1243PQ
3. P1
4. 9981
To rephrase: I need a regex that extracts, from a string (URL), the '/'-separated substrings that contain at least 1 digit and 0 or more letters.
I tried to capture a repeating group with (^[a-zA-Z0-9]+)+ and ([a-zA-Z]{0,100}[0-9]{1,100})+, but neither worked. In hindsight, intuition says this shouldn't work. I am unsure how to match a pattern over a whole group rather than a single character.

If I understand correctly, you want to extract the parts that consist only of digits, or of digits with letters before or after them.
In that case I can suggest this regex:
\b[a-zA-Z]*[0-9]+[a-zA-Z]*\b
I use \b to assert a word boundary at each end of a part.
Digits are required, and letters may come before or after them, hence the pattern above.
If letters and digits may be mixed freely, as long as there is at least one digit, I can suggest this regex:
\b[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*\b
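To see how the two \b patterns differ, here is a minimal sketch in JavaScript, with the character classes written as [a-zA-Z]. The URL is the one from the question; the string 'x/a1b2/y' and the variable names are made up for illustration.

```javascript
// Example URL from the question.
const url = 'https://stackoverflow.com/v1/summary/1243PQ/details/P1/9981';

// Letters, then at least one digit, then letters, bounded by \b on both sides.
const lettersDigitsLetters = /\b[a-zA-Z]*[0-9]+[a-zA-Z]*\b/g;
// Variant that lets letters and digits interleave, as long as one digit is present.
const anyAlnumWithDigit = /\b[a-zA-Z0-9]*[0-9]+[a-zA-Z0-9]*\b/g;

// String.prototype.match returns null when nothing matches, hence the ?? [].
const strictParts = url.match(lettersDigitsLetters) ?? [];
const looseParts = 'x/a1b2/y'.match(anyAlnumWithDigit) ?? [];
const strictOnMixed = 'x/a1b2/y'.match(lettersDigitsLetters) ?? [];

console.log(strictParts);   // [ 'v1', '1243PQ', 'P1', '9981' ]
console.log(looseParts);    // [ 'a1b2' ]
console.log(strictOnMixed); // []
```

The first pattern rejects a1b2 because digits appear in two separate runs; the second accepts it.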

I believe this should work for you:
(\d*\w+\d+\w*)
EDIT: actually, this should be sufficient
(\w+\d+\w*)
or
(\w*\d+\w*)

Well, you could do this:
(\w*\d+\w*) with the g (global) regex option
On the example URL, it would look like this:
const regex = /(\w*\d+\w*)/g;
const url = 'https://stackoverflow.com/v1/summary/1243PQ/details/P1/9981';
console.log(url.match(regex)); // [ 'v1', '1243PQ', 'P1', '9981' ]

Try \/[a-zA-Z]*\d+[a-zA-Z0-9]*
Explanation:
\/ - match / literally
[a-zA-Z]* - 0+ letters
\d+ - 1+ digits - thanks to this, we require at least one digit
[a-zA-Z0-9]* - 0+ letters or digits
The match includes the / at the beginning, so you need to trim it.
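The trimming step can be done in two ways; the sketch below shows both on the URL from the question. Wrapping the part after the slash in a capture group is my own variation, not something the answer proposes, and the variable names are illustrative.

```javascript
const url = 'https://stackoverflow.com/v1/summary/1243PQ/details/P1/9981';

// Option 1: match including the leading slash, then drop the first character.
const trimmed = (url.match(/\/[a-zA-Z]*\d+[a-zA-Z0-9]*/g) ?? []).map((m) => m.slice(1));

// Option 2: keep the slash outside a capture group and read group 1 of each match.
const captured = Array.from(
  url.matchAll(/\/([a-zA-Z]*\d+[a-zA-Z0-9]*)/g),
  (m) => m[1],
);

console.log(trimmed);  // [ 'v1', '1243PQ', 'P1', '9981' ]
console.log(captured); // [ 'v1', '1243PQ', 'P1', '9981' ]
```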

Related

Regular Expression to Validate Monaco Number Plates

I would like an expression to validate Monaco number plates.
They are written as follows:
A123
123A
1234
I started with:
^[a-zA-Z0-9]{1}?[0-9]{2}?[a-zA-Z0-9]{1}$
But that also accepts A12A, which is invalid.

You can use
^(?!(?:\d*[a-zA-Z]){2})[a-zA-Z\d]{4}$
See the regex demo. Details:
^ - start of string
(?!(?:\d*[a-zA-Z]){2}) - a negative lookahead that fails the match if, immediately to the right of the current location, there are two occurrences of zero or more digits followed by an ASCII letter (that is, if the string contains two letters)
[a-zA-Z\d]{4} - four alphanumeric chars
$ - end of string.
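A quick way to check the lookahead version is to run it over a few plates. This is a sketch in JavaScript; A12A comes from the question, while AB12, 12345 and 1A23 are inputs I made up. Note that 1A23 is accepted, since the pattern only limits the number of letters, not their position; whether that is acceptable depends on plate rules the question does not state.

```javascript
// Pattern quoted from the answer above.
const plate = /^(?!(?:\d*[a-zA-Z]){2})[a-zA-Z\d]{4}$/;

// [input, expected result] pairs.
const cases = [
  ['A123', true],
  ['123A', true],
  ['1234', true],
  ['A12A', false],  // two letters: rejected by the lookahead
  ['AB12', false],  // two letters again
  ['12345', false], // five characters: rejected by {4}
  ['1A23', true],   // a single letter in the middle also passes
];

// Collect every case where the regex disagrees with the annotation.
const mismatches = cases.filter(([input, expected]) => plate.test(input) !== expected);
console.log(mismatches.length === 0 ? 'all cases behave as annotated' : mismatches);
```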

You can write the pattern using 3 alternatives specifying all the allowed variations for the example data:
^(?:[a-zA-Z][0-9]{3}|[0-9]{3}[a-zA-Z]|[0-9]{4})$
See a regex demo.
Note that you can omit {1}, since a single occurrence is the default.
To avoid matching 2 chars A-Z, you can write the alternation as:
^(?:[a-zA-Z]\d{3}|\d{3}[a-zA-Z\d]|\d[a-zA-Z\d][a-zA-Z\d]\d)$
See another regex demo.

So it needs 3 consecutive digits and 1 letter or digit.
Then you can use this pattern:
^(?=.?[0-9]{3})[A-Za-z0-9]{4}$
The lookahead (?=.?[0-9]{3}) asserts that the 3 consecutive digits start at the first or second character.
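As a rough check of the positive-lookahead pattern, the sketch below runs it over the plates from the question plus two made-up inputs (1A23 and 12A3) that have a letter in the middle.

```javascript
// Pattern quoted from the answer above.
const plate = /^(?=.?[0-9]{3})[A-Za-z0-9]{4}$/;

// [input, expected result] pairs.
const cases = [
  ['A123', true],  // lookahead: .? consumes "A", then 123
  ['123A', true],  // lookahead: .? matches nothing, then 123
  ['1234', true],
  ['A12A', false], // no run of three digits at position 0 or 1
  ['1A23', false],
  ['12A3', false],
];

const mismatches = cases.filter(([input, expected]) => plate.test(input) !== expected);
console.log(mismatches.length === 0 ? 'all cases behave as annotated' : mismatches);
```

Because the three digits must be adjacent, this pattern accepts only the three layouts listed in the question.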

How to group expressions to be matched as one?

What I am trying to match looks like this:
char-char-int-int-int
char-char-char-int-int-int
char-char-int-int-int-optionalValue (optionalValue being a "-" followed by letters)
My current regexp looks like this:
([A-Za-z]{1,2})([1-9]{3})("-"[\w])
In the end, the regexp should match any of these:
AB001
aB999
Hm000
en789
rv005-ab
These should be invalid:
ab (because only letters)
abcfr (because too many letters)
158 (because only numbers)
78532 (because too many numbers)
123ab (because all letters should come before the numbers, optionalValue excepted)
a1b23 (because letters and numbers are mixed)
What am I doing wrong? (Please be gentle, this is my first post ever on Stack Overflow.)

If you use [A-Za-z]{1,2}, the second example will not match, as it has 3 letters (char-char-char).
Using \w would also match digits and an underscore. If you mean only the letters a-zA-Z, you can put them in an optional group preceded by a hyphen: (?:-[a-zA-Z]+)?
You could use
^[a-zA-Z]{2,3}[0-9]{3}(?:-[a-zA-Z]+)?$
^ Start of string
[a-zA-Z]{2,3} Match 2 or 3 times a char A-Za-z
[0-9]{3} Match 3 digits
(?:-[a-zA-Z]+)? Optionally match a - and 1 or more chars A-Za-z
$ End of string
Or using word boundaries \b instead of anchors
\b[a-zA-Z]{2,3}[0-9]{3}(?:-[a-zA-Z]+)?\b
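The anchored pattern can be checked against the valid and invalid samples listed in the question. This is a small JavaScript sketch; the variable names are mine.

```javascript
// Anchored pattern quoted from the answer above.
const code = /^[a-zA-Z]{2,3}[0-9]{3}(?:-[a-zA-Z]+)?$/;

// Samples copied from the question.
const shouldMatch = ['AB001', 'aB999', 'Hm000', 'en789', 'rv005-ab'];
const shouldFail = ['ab', 'abcfr', '158', '78532', '123ab', 'a1b23'];

const allValidPass = shouldMatch.every((s) => code.test(s));
const allInvalidFail = shouldFail.every((s) => !code.test(s));

console.log(allValidPass, allInvalidFail); // true true
```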

I have corrected your regex below. Please give it a try.
([A-Za-z]{1,2})([0-9]{3})(-\w*)?

How to exclude specific uri in regex

I have a PCRE regular expression like this, used from PHP:
^/someurl/*
There are a lot of urls like
/someurl/test
/someurl/something/{version}/{name}/{etc}
and I need to exclude URLs like this one:
^/someurl/test/{version}/commands/*
{version} is a dotted version number such as 2.1.1 or 2.4.
I've tried this:
^((?!/someurl/test/[0-9].+/commands/*))
It works, but I need to combine both into a single expression, something like
^/someurl/* Excluding ^((?!/someurl/test/[0-9].+/commands/*))
How do I join them? Thanks.

You may use
^/someurl/(?!test/[0-9.]+/commands/).*
Or a bit more precise
^/someurl/(?!test/[0-9]+(?:\.[0-9]+)+/commands/).*
See the regex demo
Details
^/someurl/ - /someurl/ at the start of the string
(?!test/[0-9.]+/commands/) - immediately to the right of the current position, there can't be test/, then 1+ digits or dots, then /commands/ substring (if [0-9]+(?:\.[0-9]+)+ is used it will match 1 or more digits, and then 1 or more repetitions of a dot followed with 1+ digits)
.* - any 0+ chars as many as possible.
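The difference between the two variants shows up on a version without a dot. The sketch below uses JavaScript rather than PHP, on the assumption that these particular patterns behave the same in both engines; the sample paths are made up for illustration.

```javascript
// String.raw keeps the backslashes exactly as written in the patterns above.
const loose = new RegExp(String.raw`^/someurl/(?!test/[0-9.]+/commands/).*`);
const precise = new RegExp(String.raw`^/someurl/(?!test/[0-9]+(?:\.[0-9]+)+/commands/).*`);

// [path, expected with loose, expected with precise]
const cases = [
  ['/someurl/test', true, true],
  ['/someurl/something/2.1/name/etc', true, true],
  ['/someurl/test/2.1.1/commands/run', false, false], // excluded by both
  ['/someurl/test/2.4/commands/', false, false],
  ['/someurl/test/2/commands/run', false, true],      // "2" has no dot
  ['/other/test', false, false],                      // wrong prefix
];

const mismatches = cases.filter(
  ([path, l, p]) => loose.test(path) !== l || precise.test(path) !== p,
);
console.log(mismatches.length === 0 ? 'all cases behave as annotated' : mismatches);
```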

Regex for String with first two characters fixed and rest digits

Is there a regular expression for the following?
String of length 8
First two characters fixed: 'UE' or 'ue'
Remaining 6 characters must be digits [0-9]
E.g.: https://regex101.com/r/PufypE/1
The expression I tried:
\^(UE|ue){2}[0-9]{6}\
but it's not working (no match found!)

You want:
\b(UE|ue)[0-9]{6}\b
You don't need the {2} next to (UE|ue), since the group already spells out both characters. The \b is a word boundary, so this will match each item in a list like the one you put in the comment: UE123456,ue654321. This is a good site for experimenting with regexes like this: http://regex101.com

Regex should be:
^[Uu][Ee][0-9]{6}$
(UE|ue){2} in your regex would match 2 occurrences of UE or ue, e.g. UEUE.
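The two suggested patterns are not quite equivalent, as this JavaScript sketch shows. The mixed-case input Ue123456 and the nine-character input are my own test values.

```javascript
// Word-boundary version: finds codes inside a longer string.
const wordBounded = /\b(UE|ue)[0-9]{6}\b/g;
// Anchored version: the whole string must be the code.
const anchored = /^[Uu][Ee][0-9]{6}$/;

// With the g flag, match() returns the full matches, not the groups.
const found = 'UE123456,ue654321'.match(wordBounded) ?? [];
console.log(found); // [ 'UE123456', 'ue654321' ]

console.log(anchored.test('UE123456'));  // true
console.log(anchored.test('UE1234567')); // false (nine characters)

// [Uu][Ee] also allows mixed case, which (UE|ue) does not.
console.log(anchored.test('Ue123456'));                    // true
console.log(/\b(UE|ue)[0-9]{6}\b/.test('Ue123456'));       // false
```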

Regex for Chilean RUT/RUN with PCRE

I'm having trouble validating the Chilean RUT/RUN with a regular expression in PCRE. I have the following regular expression, but sadly I can't make it work:
\b[0-9|.]{1,10}\-[K|k|0-9]
I need help seeing what is wrong with it. The application I need to use supports only PCRE.
Thank you.

You may use
^(\d{1,3}(?:\.\d{1,3}){2}-[\dkK])$
to match and capture (that is not usually necessary, but your app requires a capturing group to extract its contents) a whole string that matches the pattern. See the regex demo.
To match shorter strings that match this pattern inside a larger string, you may remove ^ and $ (see demo) or use \b word boundaries instead (see this demo).
Details:
^ - start of string
\d{1,3} - 1 to 3 digits
(?:\.\d{1,3}){2} - 2 sequences of a literal . and 1 to 3 digits
- - a hyphen
[\dkK] - a digit, k or K.
$ - end of string.
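A minimal sketch of how the pattern and its capturing group behave, written in JavaScript on the assumption that this pattern works the same way there as in PCRE. The RUT values are made-up test strings, not real identifiers.

```javascript
// Pattern quoted from the answer above.
const rut = /^(\d{1,3}(?:\.\d{1,3}){2}-[\dkK])$/;

// Group 1 holds the whole RUT when the string matches.
const m = '12.345.678-k'.match(rut);
const extracted = m ? m[1] : null;
console.log(extracted); // 12.345.678-k

console.log(rut.test('1.234.567-K'));  // true
console.log(rut.test('12345678-k'));   // false (this pattern requires the dots)
console.log(rut.test('12.345.678-x')); // false (check character must be a digit, k or K)
```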

As they sometimes omit the dots, I used this one:
^(\d{1,2}(?:[\.]?\d{3}){2}-[\dkK])$
Details:
^ - start of string
\d{1,2} - 1 or 2 digits
(?:[\.]?\d{3}){2} - 2 sequences of an optional '.' and 3 digits
- - a hyphen
[\dkK] - a digit, k or K
$ - end of string
1234567-k OK
12345678-k OK
1.234.567-k OK
12.345.678-k OK
Known issue:
12.345678-k and 12345.678-k are still accepted, and I do not like this :(
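One possible way around the known issue is to require either both dots or none, using an alternation. The pattern named allOrNoDots below is my own assumption, not part of the answer, and it is sketched in JavaScript rather than PCRE.

```javascript
// Pattern from the answer: each dot is independently optional.
const optionalDots = /^(\d{1,2}(?:[\.]?\d{3}){2}-[\dkK])$/;
// Hypothetical stricter variant: dotted form OR 7-8 plain digits.
const allOrNoDots = /^((?:\d{1,2}(?:\.\d{3}){2}|\d{7,8})-[\dkK])$/;

// [input, expected with optionalDots, expected with allOrNoDots]
const cases = [
  ['1234567-k', true, true],
  ['12345678-k', true, true],
  ['1.234.567-k', true, true],
  ['12.345.678-k', true, true],
  ['12.345678-k', true, false], // only the first dot present
  ['12345.678-k', true, false], // only the second dot present
];

const mismatches = cases.filter(
  ([s, a, b]) => optionalDots.test(s) !== a || allOrNoDots.test(s) !== b,
);
console.log(mismatches.length === 0 ? 'all cases behave as annotated' : mismatches);
```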

You need to change it to ^(\d{1,3}(?:\.\d{3}){2}-[\dkK])$ so that only 2 sequences of exactly 3 digits are accepted after the first sequence of 1-3 digits.
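The effect of tightening \d{1,3} to \d{3} can be seen on a malformed value. This is a JavaScript sketch with made-up test strings; the names loose and strict are mine.

```javascript
// Groups after the first may have 1 to 3 digits.
const loose = /^(\d{1,3}(?:\.\d{1,3}){2}-[\dkK])$/;
// Groups after the first must have exactly 3 digits.
const strict = /^(\d{1,3}(?:\.\d{3}){2}-[\dkK])$/;

console.log(loose.test('17.87.335-2'));   // true  (middle group has only two digits)
console.log(strict.test('17.87.335-2'));  // false
console.log(loose.test('12.345.678-5'));  // true
console.log(strict.test('12.345.678-5')); // true
```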

Please consider being more specific when building the regex, since it matched invalid numbers such as 17.87.335-2. The pattern above also didn't match formats without the dots or the hyphen.
Please consider using the following pattern: \b(\d{1,3}(?:(.?)\d{3}){2}(-?)[\dkK])\b
I modified the prior version to try the other formats: https://regex101.com/r/2Us0j6/9

We Keep Coding
