Pattern match any alphanumeric text before - in the string - regex

I am trying to match every alphanumeric word that appears immediately before a hyphen (-) in a string. For example, given:
earth-green, random-stuff, coffee-stuff, another-tag
I want to match earth, random, coffee and another.
I tried the following regex:
(\w*[^][\-])
However, it matches earth-, random-, coffee- and another-.
This DEMO shows the situation: earth-, random-, coffee- and another- are highlighted, but I don't want to include the hyphen in the highlighting.

This is a good example to use positive look ahead regex.
You can use a regex like this:
(\w+)(?=-)
Working demo
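On the other hand, the problem with your regex was that you put the hyphen and the [^] inside the capturing group, so the hyphen became part of the captured text:
(\w*[^][\-])
(By the way, you don't need the [^].)
Moving the hyphen outside the group fixes that; the word is then available in capture group 1: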
(\w+)-
Working demo
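To make the difference between the two patterns concrete, here is a minimal Python sketch (the variable names are chosen only for this illustration). With the lookahead, the whole match is already the word; with the plain capture group, the match includes the hyphen and the word has to be read from group 1.

```python
import re

text = "earth-green, random-stuff, coffee-stuff, another-tag"

# Lookahead: the hyphen must follow the word but is not part of the match.
lookahead_matches = [m.group(0) for m in re.finditer(r"\w+(?=-)", text)]

# Capture group: the hyphen is consumed by the match, group 1 holds only the word.
full_matches = [m.group(0) for m in re.finditer(r"(\w+)-", text)]
group_one = [m.group(1) for m in re.finditer(r"(\w+)-", text)]

print(lookahead_matches)
print(full_matches)
print(group_one)
```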

You can just add a word boundary before the group and a hyphen after it to bookend what you want:
\b(\w+)-
Demo

>>> import re
>>> x = 'earth-green, random-stuff, coffee-stuff, another-tag'
>>> re.compile(r'(\w+)-\w+').findall(x)
['earth', 'random', 'coffee', 'another']
Many more good examples covering diverse use cases can be found in the regex HOWTOs.

You can match it like this.
my $string = "earth-green, random-stuff, coffee-stuff, another-tag";
# Capture the word before each hyphen in $1 instead of stripping the hyphen afterwards.
while ($string =~ m/(\w+)-/g)
{
    print "$1\n";
}
Hope this helps.

Related

How can I match multiple hits between 2 delimiters?

Hi, my fellow RegEx'ers ;)
I'm trying to match every piece of text enclosed in a pair of double quotes.
Here's my text:
...random code
someArray[] = ["Come and",
"get me,",
"or fail",
"trying!",
"Yours truly"]
random code...
So far, I have managed to get the correct matches with two patterns, executed one after the other:
(?s)someArray\[\].*?=.*?\[(.*?)\]
This extracts the text between the two brackets. On the result, I then use this one:
"(.*?)"
This works just fine, but I'd love to get the texts with a single regex.
Any help is highly appreciated!
Consider using \G. With its help, you can match "(.*?)" when it is preceded either by someArray[] = [ or by the previous match of "(.*?)" (strictly speaking, the previous match of the entire regex). Then just grab the first capture group from every match:
(?:(?s).*someArray\[\].*?=.*?\[|\G[^"\]]+)"(.*?)"
Demo: https://regex101.com/r/eBQWdU/3
How you grab the first capture groups depends on the language you're using the regex in. For example, in PHP you may do something like this:
preg_match_all('/(?:(?s).*someArray\[\].*?=.*?\[|\G[^"\]]+)"(.*?)"/', $input, $matches);
$array_items = $matches[1];
Demo: https://ideone.com/mZgU1x
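Not every engine offers \G; Python's standard re module, for instance, does not. In that case the two-step approach from the question is still a reasonable fallback. The following is only a sketch under that assumption: the function name extract_items is made up for this example, and the first pattern is a slightly tightened variant of the one in the question.

```python
import re

source = '''...random code
someArray[] = ["Come and",
"get me,",
"or fail",
"trying!",
"Yours truly"]
random code...'''


def extract_items(text):
    """Return the quoted strings inside the someArray[] = [...] literal."""
    # Step 1: isolate everything between the brackets (DOTALL lets . cross newlines).
    block = re.search(r'someArray\[\]\s*=\s*\[(.*?)\]', text, re.DOTALL)
    if block is None:
        return []
    # Step 2: pull each double-quoted string out of the isolated block.
    return re.findall(r'"(.*?)"', block.group(1))


items = extract_items(source)
print(items)
```

Like the patterns in the question, this stops at the first ] it sees, so it would need adjusting if the quoted strings themselves could contain a closing bracket.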

regex to select only the zipcode

,Ray Balwierczak,4/11/2017,,895 Forest Hill Rd,Apalachin,NY,13732,y,,
I want to select only 13732 from the line. I came up with this regex:
(\d)(\s*\d+)*(\,y,,)
But it also selects the ,y,, part. If I remove that part from the regex, the regex also matches the date. Please help me with this.
Generally, if you want to require some text next to your match without including it in the match, use a zero-length lookaround (lookahead or lookbehind). In your case, you can use a lookahead:
(\d)(\s*\d+)*(?=\,y,,)
The syntax (?=<stuff>) means "followed by <stuff>, without matching it".
More information on lookarounds can be found in this tutorial.
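As a rough illustration of the lookahead in Python (the variable names are only for this sketch, and the digit part is simplified to \d+ on the assumption that the ZIP code contains no internal spaces):

```python
import re

line = ",Ray Balwierczak,4/11/2017,,895 Forest Hill Rd,Apalachin,NY,13732,y,,"

# The lookahead requires ",y,," right after the digits but leaves it out of the match,
# so neither the date nor the street number qualifies.
match = re.search(r"\d+(?=,y,,)", line)
zipcode = match.group(0) if match else None

print(zipcode)
```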
Regex: \D*(\d{5})\D*
Explanation: match 5 digits surrounded by zero or more non-digits on both sides. Then you can extract the group containing the match.
Here's the code in Python:
import re
string = ",Ray Balwierczak,4/11/2017,,895 Forest Hill Rd,Apalachin,NY,13732,y,,"
search = re.search(r"\D*(\d{5})\D*", string)
print(search.group(1))
Output:
13732

regex a number range

I've got the following problem:
How can I write a regex for a string like this:
?partner=87835223&token=yygQWaaT
where 87835223 and yygQWaaT can be any other combination?
Thanks for the help!
You can use the following regular expression: =\w+. However, the = will also be part of each match, so you will have to strip it afterwards:
Input = ?partner=87835223&token=yygQWaaT
Matches = =87835223, =yygQWaaT
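One possible way to take care of the leading = is to wrap the \w+ in a capture group and read only that group. A small Python sketch (variable names are illustrative):

```python
import re

query = "?partner=87835223&token=yygQWaaT"

# Plain pattern: every match still starts with "=".
with_equals = re.findall(r"=\w+", query)

# With a capture group, findall returns only the grouped part.
values = re.findall(r"=(\w+)", query)

print(with_equals)
print(values)
```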
I think this regex will suffice
\?partner=\d+&token=\w+
Here \d+ matches one or more digits, and \w+ matches one or more alphanumeric characters.
Regex Demo
PHP Code
$re = "/\\?partner=\\d+&token=\\w+/";
$str = "?partner=87835223&token=yygQWaaT";
print(preg_match($re, $str, $matches));
Ideone Demo
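If the two values are needed rather than just a yes/no match, capture groups can be added around \d+ and \w+. The sketch below does this in Python with named groups; the group names partner and token are a choice made for this example, not part of the original regex.

```python
import re

query = "?partner=87835223&token=yygQWaaT"

# Same pattern as above, with a named group around each value.
pattern = re.compile(r"\?partner=(?P<partner>\d+)&token=(?P<token>\w+)")

match = pattern.search(query)
values = match.groupdict() if match else {}

print(values)
```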

how to replace a string with a dynamic string

Case 1.
I have a string of letters like fthhdtrhththjgyhjdtygbh. Using a regex, I want to change it to ftxxxxxxxxxxxxxxxxxxxxx, i.e., keep the first two letters and replace the rest with x.
After a lot of googling, I achieved this:
s/^(\w\w)(\w+)/$1 . "x" x length($2)/e;
Case 2.
I have a string of letters like sdsABCDEABCDEABCDEABCDEABCDEsdf. Using a regex, I want to change it to sdsABCDExyxyxyABCDEsdf, i.e., keep the first and last ABCDE and replace each ABCDE in the middle with xy.
I achieved this:
s/ABCDE((ABCDE)+)ABCDE/$len = length($1)\/5; ABCDE."xy"x $len . ABCDE/e;
Problem: I am not happy with my solutions. Is there a better or neater way to do this?
Constraint: Only one regex may be used.
Sorry for the poor English in the title and the body of the question; English isn't my first language. Please ask in the comments if anything is unclear.
Task 1: Simplify the password hider regex
Use a Positive Lookbehind Assertion to replace all word characters preceded by two other word characters. This removes the need for the /e Modifier:
my $str = 'fthhdtrhththjgyhjdtygbh';
$str =~ s/(?<=\w{2})\w/x/g;
print $str;
Outputs:
ftxxxxxxxxxxxxxxxxxxxxx
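The same lookbehind idea carries over to other engines as long as the lookbehind has a fixed width, which \w{2} does. A minimal Python sketch for comparison (variable names are illustrative):

```python
import re

text = "fthhdtrhththjgyhjdtygbh"

# Replace every word character that has two word characters directly before it.
# The lookbehind inspects the original string, so already-replaced characters
# do not affect later matches.
masked = re.sub(r"(?<=\w{2})\w", "x", text)

print(masked)
```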
Task 2: Translate inner repeated pattern regex
Use both a Positive Lookbehind and Lookahead Assertion to replace all ABCDE that are bookended by the same string:
my $str = 'sdsABCDEABCDEABCDEABCDEABCDEsdf';
$str =~ s/(?<=(ABCDE))\1(?=\1)/xy/g;
print $str, "\n";
Output:
sdsABCDExyxyxyABCDEsdf
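For comparison, here is a rough Python equivalent of the same lookbehind/lookahead idea. To keep the sketch simple it spells the token out through a variable instead of using the \1 backreference; the names token and inner are assumptions of this example.

```python
import re

text = "sdsABCDEABCDEABCDEABCDEABCDEsdf"
token = re.escape("ABCDE")

# Match an occurrence of the token only when another occurrence sits
# directly before it and directly after it.
inner = re.compile(rf"(?<={token}){token}(?={token})")

result = inner.sub("xy", text)
print(result)
```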
One regex with less redundancy, using \1 to refer to the first captured group:
s|(ABCDE)\K (\1+) (?=\1)| "xy" x (length($2)/length($1)) |xe;
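The computed-replacement approach (the /e modifier) has a counterpart in engines that accept a function as the replacement. The following Python sketch assumes that route; since Python's re module has no \K, a lookbehind stands in for it, and the helper name to_xy is made up for this example.

```python
import re

text = "sdsABCDEABCDEABCDEABCDEABCDEsdf"
unit = "ABCDE"

# One run of inner tokens: preceded by a token, followed by a token.
inner_run = re.compile(rf"(?<={unit})((?:{unit})+)(?={unit})")


def to_xy(match):
    """Return one 'xy' per token contained in the matched run."""
    return "xy" * (len(match.group(1)) // len(unit))


result = inner_run.sub(to_xy, text)
print(result)
```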

regular expression to match strings with decimals

I'm trying to create a regex which will do the following:
Name description: "QUARTERLY PATCH FOR XAQE (JUL 2013 - 11.2.0.3.20) : (125546467)"
Val version : 11.2.0.3.4
In order to output:
"Name, 11.2.0.3.20"
"Val, 11.2.0.3.4"
I have created the following regex: /^([\w]+).*([\d\.\d]+).*/, but it is only matching the last number in the 2nd group, i.e. in 11.2.0.3.4 it will only match 4. Could anyone help?
Also, there could be more than the two lines given above, so it needs to account for arbitrary lines where the version number could be anywhere in the line.
You can use a one-liner for this as well:
perl -lne '/(\w+).*?(\d+(\.\d+)+)/; print "$1, $2"' <filename>
__END__
Name, 11.2.0.3.20
Val, 11.2.0.3.4
If you are only planning for the output and not doing any processing over the captured groups, then this will do:
$str =~ s/([\n\r]|^)(Name|Val).*?(\d+(\.\d+)+).*/$1"$2, $3"/g;
Your problem is that .* is greedy and will consume as much as it can while the pattern still matches. One solution is to make it lazy: .*?
Also, [\d\.\d]+ is a character class that matches any of \d, \. and \d, so it's the same as [\d.]+. That isn't what you want, since it would match "2013" in the first line. \d+(\.\d+)+ is more suitable.
After those 2 changes you have:
^([\w]+).*?(\d+(\.\d+)+).*
RegExr
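To see the effect of both changes side by side, here is a small Python sketch that runs the original and the corrected pattern over the two sample lines. The function name summarize and the non-capturing (?:...) group are choices made for this example only.

```python
import re

lines = [
    'Name description: "QUARTERLY PATCH FOR XAQE (JUL 2013 - 11.2.0.3.20) : (125546467)"',
    "Val version : 11.2.0.3.4",
]

# Original pattern: the greedy .* leaves only the last digit for group 2.
greedy = re.compile(r"^([\w]+).*([\d\.\d]+).*")

# Corrected pattern: lazy .*? plus a version shape that requires at least one dot.
lazy = re.compile(r"^(\w+).*?(\d+(?:\.\d+)+)")


def summarize(line):
    """Return 'Label, version' for a line, or None if no version is found."""
    m = lazy.search(line)
    return f"{m.group(1)}, {m.group(2)}" if m else None


greedy_version = greedy.search(lines[1]).group(2)
results = [summarize(line) for line in lines]

print(greedy_version)
print(results)
```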