Split branch name regex

What regex could I use if I wanted to match bar-100 from foo/bar-100-baz? The original string could be longer, with more hyphens.
I'm a total regex beginner and don't really know where to start.
\/([^-]+) matches bar, but I want to match up to the second hyphen somehow.

If a full-match might be desired, then
(?<=/)[a-z]+-\d+
Demo 1
or,
[a-z]+-\d+(?=-)
Demo 2
or,
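[^/]+(?=-)
Demo 3
might also work OK. Note that this last pattern is greedy and stops at the last hyphen, so for a longer string such as foo/bar-100-baz-qux it matches bar-100-baz rather than bar-100.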
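To compare the three patterns side by side, here is a minimal Python sketch. The second branch name and the helper first_match are made up for illustration; they are not from the question.

```python
import re

def first_match(pattern, text):
    """Return the first full match of pattern in text, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

branch = "foo/bar-100-baz"
longer = "feature/bar-100-baz-qux"  # hypothetical longer branch name

# Lookbehind: letters, a hyphen, then digits, directly after a slash.
after_slash = r"(?<=/)[a-z]+-\d+"
# Lookahead: letters, a hyphen, then digits, followed by another hyphen.
before_hyphen = r"[a-z]+-\d+(?=-)"
# Greedy: everything without a slash, up to the LAST hyphen.
up_to_last_hyphen = r"[^/]+(?=-)"

for pattern in (after_slash, before_hyphen, up_to_last_hyphen):
    print(pattern, first_match(pattern, branch), first_match(pattern, longer))
```

The first two patterns return bar-100 for both inputs. The third only does so when exactly one hyphenated segment follows the number.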

Related

Regex: match only first instance of a pattern

Using a regex for a string, we need to remove all text before the first instance of four digits in a row. We have a regex that "sort of" works:
^((?!\d{4}\w).)*
Given this string:
foo-bar-spring_06-2006_02_25.rm
the desired output is:
2006_02_25.rm
That works - if there's only one instance of a four-digit pattern. The string:
batt-fall_01-2001-11-10_0200-0400.rm
produces this result:
0400.rm
It should produce:
2001-11-10_0200-0400.rm
Note: long story, but we cannot use a - or _ as a delimiter.
I feel like we're close. Does anyone have any suggestions?
Thanks!
You can use a positive lookahead pattern after a lazily repeated . instead:
^.*?(?=\d{4})
Demo: https://regex101.com/r/8DZDQp/1
Alternatively, you can group the 4 digits:
^.*?(\d{4})
and substitute the match with the first group $1.
Demo: https://regex101.com/r/8DZDQp/3
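For what it's worth, here is how both variants might look in Python. This is only a sketch: the function names are invented, and Python's re.sub writes the group reference as \1 rather than $1.

```python
import re

def strip_with_lookahead(name):
    # Lazily consume from the start until four digits are directly ahead,
    # then replace the consumed prefix with nothing.
    return re.sub(r"^.*?(?=\d{4})", "", name, count=1)

def strip_with_group(name):
    # Consume the prefix AND the four digits, then put the digits back.
    return re.sub(r"^.*?(\d{4})", r"\1", name, count=1)

for name in ("foo-bar-spring_06-2006_02_25.rm",
             "batt-fall_01-2001-11-10_0200-0400.rm"):
    print(strip_with_lookahead(name), strip_with_group(name))
```

If a string contains no run of four digits, neither pattern matches and the string comes back unchanged.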
A likely faster option would be to ignore the beginning and undesired part, without using lookarounds, and with a simple expression similar to:
(\d{4}.*\..+)$
or:
(\d{4}.*\.[a-z]+)$
The end anchor $ is also unnecessary; the expression still works without it.
Demo
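A rough Python sketch of this "search for the wanted part" approach, assuming the file extension is lowercase letters as in the examples (the function name is made up):

```python
import re

def tail_from_first_four_digits(name):
    # re.search scans left to right, so the match starts at the FIRST
    # position where four digits in a row are found.
    m = re.search(r"\d{4}.*\.[a-z]+", name)
    return m.group(0) if m else None

print(tail_from_first_four_digits("foo-bar-spring_06-2006_02_25.rm"))
print(tail_from_first_four_digits("batt-fall_01-2001-11-10_0200-0400.rm"))
```

Unlike the substitution approach, this returns None when there is no match, so the caller has to decide what to do with such names.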

Capture number between two whitespaces (RegEx)

I have the following data:
SOMEDATA .test 01/45/12 2.50 THIS IS DATA
and I want to extract the number 2.50 out of this. I have managed to do this with the following RegEx:
(?<=\d{2}\/\d{2}\/\d{2} )\d+.\d+
However that doesn't work for input like this:
SOMEDATA .test 01/45/12 2500 THIS IS DATA
In this case, I want to extract the number 2500.
I can't seem to figure out a regex rule for that. Is there a way to extract something between two spaces, that is, the text or number after the date up to the next whitespace? All I know is that the date will always have the same format, and there will always be a space after the date and then a space after the number I want to extract.
Can someone help me out with this?
Capture number between two whitespaces
A whitespace is matched with \s, and non-whitespace with \S.
So, what you can use is:
\d{2}\/\d{2}\/\d{2} +(\S+)
                      ^^^
See the regex demo
The 1+ non-whitespace symbols are captured into Group 1.
If - for some reason - you need to only get the value as a whole match, use your lookbehind approach:
(?<=\d{2}\/\d{2}\/\d{2} )\S+
Or - if you are using PCRE - you may leverage the match reset operator \K:
\d{2}\/\d{2}\/\d{2} +\K\S+
                     ^^
See another demo
NOTE: the \K and capture-group approaches allow one or more spaces after the date and are thus more flexible.
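As an illustration, here is a small Python sketch of the capture-group and lookbehind variants. The function names are made up. Python's built-in re module does not support \K, so that variant is left out.

```python
import re

# Date, one or more spaces, then the next run of non-whitespace in group 1.
DATE_THEN_TOKEN = re.compile(r"\d{2}/\d{2}/\d{2} +(\S+)")
# Same idea as a whole match: fixed-width lookbehind with exactly one space.
TOKEN_AFTER_DATE = re.compile(r"(?<=\d{2}/\d{2}/\d{2} )\S+")

def value_via_group(line):
    m = DATE_THEN_TOKEN.search(line)
    return m.group(1) if m else None

def value_via_lookbehind(line):
    m = TOKEN_AFTER_DATE.search(line)
    return m.group(0) if m else None

for line in ("SOMEDATA .test 01/45/12 2.50 THIS IS DATA",
             "SOMEDATA .test 01/45/12 2500 THIS IS DATA"):
    print(value_via_group(line), value_via_lookbehind(line))
```

With two spaces after the date, only the capture-group version still finds the value, which is the flexibility mentioned in the note.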
I see some people have helped you already, but if you want an alternative for some reason, here is one that also works :)
.+ \d+\/\d+\/\d+ (\d+[\.\d]*)
The .+ matches anything up to the space before the date,
then \d+\/\d+\/\d+ matches the date, followed by a space.
The capturing group is the number. As you can see, I made the last part optional, so both floating-point values and whole numbers can be matched. Hope this helps!
Proof: https://regex101.com/r/fY3nJ2/1
Just make the fractional part optional:
(?<=\d{2}\/\d{2}\/\d{2} )\d+(?:\.\d+)?
Demo: https://regex101.com/r/jH3pU7/1
Update following clarifications in comments:
To match anything (but space) surrounded by spaces and prepended by date use:
(?<=\d{2}\/\d{2}\/\d{2} )\S+
Demo: https://regex101.com/r/jH3pU7/3
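The difference between the two patterns in this answer shows up when the value after the date is not a number. A quick Python sketch; the N/A line is a hypothetical input, not from the question:

```python
import re

# Digits with an optional fractional part, right after "date + space".
NUMBER_AFTER_DATE = re.compile(r"(?<=\d{2}/\d{2}/\d{2} )\d+(?:\.\d+)?")
# Any run of non-whitespace right after "date + space".
TOKEN_AFTER_DATE = re.compile(r"(?<=\d{2}/\d{2}/\d{2} )\S+")

def find(pattern, line):
    m = pattern.search(line)
    return m.group(0) if m else None

lines = [
    "SOMEDATA .test 01/45/12 2.50 THIS IS DATA",
    "SOMEDATA .test 01/45/12 2500 THIS IS DATA",
    "SOMEDATA .test 01/45/12 N/A THIS IS DATA",  # hypothetical non-numeric value
]
for line in lines:
    print(find(NUMBER_AFTER_DATE, line), find(TOKEN_AFTER_DATE, line))
```

The numeric pattern is the stricter choice if only numbers should be accepted; the \S+ pattern takes whatever token follows the date.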
Rather than capturing, you can make your entire match be the target text by using a lookbehind:
(?<=\d\d(\/\d\d){2} )\S+
This matches the first series of non-whitespace that follows a "date like" part.
Note also the reduction in the length of the "date like" pattern. You may consider using this part of the regex in whatever solution you use.

Regex Greediness

I have a Perl regex that I'm fairly certain should work, but it is being too greedy:
regex:
(?:.*serial[^\d]+?(\d+).*)
Test string:
APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou
Desired group 1 match:
123456
Actual group 1 Match:
12
I've tried every permutation of lookahead and behind and laziness and I can't get the damn thing to work.
WHAT AM I MISSING.
Thanks!
The Problem is Not Greediness, but Case-Sensitivity
Currently your regex matches the 12 at the end of serialnun12, probably because it is case-sensitive. We have two options: using upper-case, or making the pattern case-insensitive.
Option 1: Use Upper-Case
If you only want 123456, you can use:
SERIALNO\K\d+
The \K tells the engine to drop what was matched so far from the final match it returns.
If you want to match the whole string and capture 123456 to Group 1, use:
.*?SERIAL\D+(\d+).*
Option 2: Turning Case-Insensitivity On using (?i) inline or the i flag
To only match 123456, you can use:
(?i)serial\D+\K\d+
Note that if you use the g flag, this would match both numbers.
If you want to match the whole string and capture 123456 to Group 1, use:
(?i).*?serial\D+(\d+).*
A few tips
You can turn on case-insensitivity with either the (?i) inline modifier or the i flag at the end of the pattern: /serial\D+\K\d+/i
Instead of [^\d], use \D
There is no need for a lazy quantifier in something like \D+\d+ because the two tokens are mutually exclusive: there is no danger that the \D will run over the \d
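To see the effect outside Perl, here is a short Python sketch using the test string from the question. Python's re module has no \K, so a capture group is used instead. It also shows why the full-string variants above use a lazy .*? in front.

```python
import re

text = "APPLICATIONSERIALNO123456Plnsn123456te20140728tdrnserialnun12hou"

# Case-sensitive: only the lowercase "serial" near the end can match.
sensitive = re.search(r"serial\D+(\d+)", text).group(1)

# Case-insensitive: the earlier uppercase "SERIAL" now matches first.
insensitive = re.search(r"serial\D+(\d+)", text, re.IGNORECASE).group(1)

# A greedy leading .* runs to the end and backtracks, so it settles on
# the LAST "serial", even when case is ignored.
greedy_prefix = re.match(r".*serial\D+(\d+)", text, re.IGNORECASE).group(1)

# A lazy .*? grows from the start, so it stops at the FIRST "serial".
lazy_prefix = re.match(r".*?serial\D+(\d+)", text, re.IGNORECASE).group(1)

# Finding all matches corresponds to the g flag mentioned above.
all_numbers = re.findall(r"serial\D+(\d+)", text, re.IGNORECASE)

print(sensitive, insensitive, greedy_prefix, lazy_prefix, all_numbers)
```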
The problem is not greediness; it's case-sensitivity.
Currently your regex matches the 12 at the end of serialnun12 because those are the only digits following serial. The ones you want follow SERIAL. S and s are different characters.
There are two solutions.
Use the uppercase characters in the pattern.
my ($serial) = $string =~ /SERIAL\D*(\d+)/;
Use case-insensitive matching.
my ($serial) = $string =~ /serial\D*(\d+)/i;
There's probably no need for this, but I thought I'd mention it just in case.

Regex to match N-NN-NN

I need some help with a RegEx pattern match.
How do I write a regex if I want it to match
N-NN-N-NN-NN-N-NNN
but also
N-NN-NN-NN
Example:
10pcs- ratchet spanner combination wrench 6-8-10-11-12-13-14-15-17-19
Cr-v,heated 12pcs-1/4dr 4-4.5-5-5.5-6-7-8-9-10-11-12-13 Cr-v,heated
17pcs-1/2dr 10-11-12-13-14-15-16-17-18-19-20-21-22-23-24-27-30
Cr-v,heated 1-2-33 Cr-V heater 1-.2-1-4
It needs to match where there are at least two - characters in the string, so a phone number like 020-11223344 should not be matched.
The strings almost always look like 6-8-10-11-12-13-14-15-17-19, except that sometimes a . can appear before a number. They also differ in length. Is this possible?
I came up with this so far, but it also matches phone numbers, and when a . appears it doesn't match at all.
(\d-[^>])
On this page you can find the different patterns: http://www.cazoom.nl/en/partij-aanbod/186-pcs-working-tools-trolly-3
What about this pattern:
[\d.]+(?:-[\d.]+){2,}
[\d.]+ matches a run of digits and dots, but only if it is followed by at least two repetitions of -[\d.]+
(?: ... ) is a non-capturing group, used here for the repetition.
test at regex101
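Here is a quick Python check of this pattern against the sample lines from the question, as a sketch (the names SIZE_RUN and size_runs are made up):

```python
import re

# A run of digits/dots followed by at least two more "-run" parts,
# i.e. at least two hyphens in total.
SIZE_RUN = re.compile(r"[\d.]+(?:-[\d.]+){2,}")

def size_runs(text):
    # The group is non-capturing, so findall returns the full matches.
    return SIZE_RUN.findall(text)

samples = [
    "10pcs- ratchet spanner combination wrench 6-8-10-11-12-13-14-15-17-19",
    "Cr-v,heated 12pcs-1/4dr 4-4.5-5-5.5-6-7-8-9-10-11-12-13 Cr-v,heated",
    "Cr-v,heated 1-2-33 Cr-V heater 1-.2-1-4",
    "020-11223344",
]
for sample in samples:
    print(size_runs(sample))
```

One thing to keep in mind: anything with two or more hyphens between digit runs matches, so a date written like 2001-11-10 would be picked up too.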
The following regex will match these strings.
(?:\.?\d\.?\d?-){2,}\.?\d\.?\d?
Debuggex Demo
Just try with following regex:
^\d-\d{2}-\d(\d-\d{2})|(\d-\d{2}-\d-\d{3})$

Regex matching problem

I don't know how to write such a regex. I will start with an example.
My bad regex:
(\d*),?(\d*\.?\d*)-?(\d*\.?\d*),?([0-1]?),?([0-1]?),?([^\/]*)
Matches that are OK:
1,2-3,1,1,asdf
1,2-3,1,1
1,2-3,1
1,2-3
1,2
1
But unfortunately this will also be matched and I don't want it to be:
asdf
1,asdf
Ideally, I would like something like: match a group only if the previous groups were matched.
I know that a positive lookbehind could probably be used, but if I'm not wrong, it would have to go right in front of each group except the first, and the regex would be large and smelly after that. Also, it would probably be variable-length.
Is there any elegant way to do that?
EDIT
I want to match all lines given below Matches that are OK.
I would like to match \d* to first group. Then, if there was a match to \d* followed by ,, I would like to match (\d*\.?\d*) to second group. After that, if there was a match in first group followed by , and match in second group followed by - I would like to match another (\d*\.?\d*)... etc. to the end of Regex.
You're not very clear in your question, but from the examples I think this is what you need:
^\d(,\d(-\d(,\d(,\d(,[a-z]+)?)?)?)?)?$
It matches:
1,2-3,1,1,asdf
1,2-3,1,1
1,2-3,1
1,2-3
1,2
1
Test link.
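The nested-optional-groups idea can be checked in Python with a sketch like the one below. It assumes single-digit fields and a lowercase trailing word, as in the examples, and it puts the -\d part in its own optional group so that 1,2 is accepted. Non-capturing groups and re.fullmatch are my choices here, not part of the answer.

```python
import re

# Each optional group can only match if everything before it matched,
# because it is nested inside the previous group.
NESTED = re.compile(r"\d(?:,\d(?:-\d(?:,\d(?:,\d(?:,[a-z]+)?)?)?)?)?")

def is_valid(line):
    # fullmatch anchors at both ends, so no ^ or $ is needed.
    return NESTED.fullmatch(line) is not None

wanted = ["1,2-3,1,1,asdf", "1,2-3,1,1", "1,2-3,1", "1,2-3", "1,2", "1"]
unwanted = ["asdf", "1,asdf"]

for line in wanted + unwanted:
    print(line, is_valid(line))
```

To allow multi-digit or decimal fields as described in the question, each \d would need to be swapped for something like \d+(?:\.\d+)?, which I have not done here.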