php separate strings with delimiters - regex

I have a string, for example:
$stringExample = "(({FAPAGE15}+500)/{GOGA:V18})";
// separate the content inside { }
I need the result to be something like this:
$response = array("FAPAGE15", "GOGA:V18");
I assume it must be something with preg_split or preg_match.

Here's the regex you need:
\{(.*?)\}
Regex example:
http://regex101.com/r/qU8eB0
PHP:
$str = "(({FAPAGE15}+500)/{GOGA:V18})";
preg_match_all("/\{(.*?)\}/", $str, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => FAPAGE15
[1] => GOGA:V18
)
Working Example:
https://eval.in/92516
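For comparison, here is a minimal sketch of the same capture-group approach in JavaScript rather than PHP. The helper name extractBraced is made up for illustration; the pattern is the one from the answer above.

```javascript
// extractBraced is a hypothetical helper name, not part of the original answer.
function extractBraced(str) {
  // matchAll requires the g flag; m[1] is the first capture group,
  // i.e. the text between { and }.
  return Array.from(str.matchAll(/\{(.*?)\}/g), m => m[1]);
}

console.log(extractBraced("(({FAPAGE15}+500)/{GOGA:V18})"));
```

The lazy .*? stops at the first closing brace, which is what keeps the two placeholders separate.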

You can use a negative character class: [^}] (all that is not a })
preg_match_all('~(?<={)[^}]++(?=})~', $str, $matches);
$result = $matches[0];
pattern details
~ # pattern delimiter
(?<={) # preceded by {
[^}]++ # all that is not a } one or more times (possessive)
(?=}) # followed by }
~ # pattern delimiter
Note: the possessive quantifier ++ is not essential to get the correct result and can be replaced by +.
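The two patterns agree on the sample string but differ on edge cases. The sketch below is in JavaScript, which has no possessive ++, so plain + is used instead; the sample string and the helper name grab are assumptions for illustration only.

```javascript
const lazy = /\{(.*?)\}/g;               // capture group with a lazy dot
const negated = /(?<=\{)[^}]+(?=\})/g;   // lookarounds with a negated class

// grab is a hypothetical helper: collect one group from every match.
const grab = (re, s, group) => Array.from(s.matchAll(re), m => m[group]);

// Hypothetical input: normal braces, empty braces, braces spanning a newline.
const sample = "{A}{}{B\nC}";

const lazyResult = grab(lazy, sample, 1);       // ["A", ""]
const negatedResult = grab(negated, sample, 0); // ["A", "B\nC"]
console.log(lazyResult, negatedResult);
```

The lazy dot accepts empty braces but cannot cross a newline without the s flag, while the negated class needs at least one character but happily crosses newlines.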

Related

Perl: Method to convert regexp with greedy quantifiers to non-greedy

My user gives a regexp with quantifiers that default to being greedy. He can give any valid regexp. So the solution will have to deal with anything that the user can throw at me.
How do I convert the regexp so any greedy quantifier will be non-greedy?
Does Perl have a (?...:regexp) construct that forces the greedy default for quantifiers into a non-greedy one?
If not: Is there a different way I can force a regexp with greedy quantifiers into a non-greedy one?
E.g., a user may enter:
.*
[.*]
[.*]{4,10}
[.*{4,10}]{4,10}
While these four examples may look similar, they have completely different meanings.
If you simply add ? after every */} you will change the character sets in the last three examples.
Instead they should be changed to/behave like:
.*?
[.*]
[.*]{4,10}?
[.*{4,10}]{4,10}?
but the matched string should be the minimal match, not the leftmost match that Perl defaults to:
$a="aab";
$a=~/(a.*?b)$/;
# Matches aab, not ab
print $1;
But given the non-greedy regexp, the minimal match can probably be obtained by prepending .*:
$a="aab";
$a=~/.*(a.*?b)$/;
# Matches ab
print $1;
"Greediness" is not a property of the whole regular expression. It's a property of a quantifier.
It can be controlled for each quantifier separately. Just add a ? after a quantifier to make it non-greedy, e.g.
[a-z]*?
a{2,3}?
[0-9]??
\s+?
And no, there isn't any built-in way to turn the whole regex to some "default-non-greedy" mode. You need to parse the regex, detect all quantifiers and change them accordingly. Maybe there's a regex-parsing library somewhere on CPAN.
The closest I've found so far is the Regexp::Parser module. I haven't tried it, but it looks like it could parse the regex, walk the tree, make the appropriate changes and then build a modified regex. Please take a look.
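That greediness belongs to each quantifier individually can be seen in a short sketch. It is written in JavaScript, whose ? suffix behaves the same way here; the sample strings are made up for illustration.

```javascript
const s = "<a><b>";

// Same pattern, only the quantifier differs.
const greedy = s.match(/<.+>/)[0];   // "<a><b>"
const lazy = s.match(/<.+?>/)[0];    // "<a>"

// Both kinds can be mixed in one pattern: the first group is lazy,
// the second is greedy, and each behaves independently.
const mixed = "aaa".match(/(a{1,3}?)(a*)/);
console.log(greedy, lazy, mixed[1], mixed[2]); // <a><b> <a> a aa
```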
You can use a state machine:
#!/usr/bin/perl
use strict;
use warnings;
my @regexes = ( ".*", "[.*]", "[.*]{4,10}", "[.*{4,10}]{4,10}" );
for (@regexes) {
    print "give: $_\n";
    my $ungreedy = make_ungreedy($_,0);
    print "got: $ungreedy\n";
    print "============================================\n"
}
sub make_ungreedy {
    my $regex = shift;
    my $class_state = 0;    # inside a [...] character class
    my $escape_state = 0;   # previous character was a backslash
    my $found = 0;          # previous character ended a quantifier
    my $ungreedy = "";
    for (split (//, $regex)) {
        if ($found) {
            # append ? unless the quantifier is already non-greedy
            $ungreedy .= "?" unless (/\?/);
            $found = 0;
        }
        $ungreedy .= $_;
        $escape_state = 0, next if ($escape_state);
        $escape_state = 1, next if (/\\/);
        $class_state = 1, next if (/\[/);
        if ($class_state) {
            $class_state = 0 if (/\]/);
            next;
        }
        $found = 1 if (/[*}+]/);
    }
    $ungreedy .= '?' if $found;
    return $ungreedy;
}
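The same state machine can be sketched in JavaScript, which makes it easy to check against the four examples from the question. The function name makeUngreedy is an assumption, and the sketch deliberately shares the limitations of the Perl version: it ignores the ? quantifier, would mangle possessive quantifiers such as ++, and would wrongly treat the closing brace of \p{L} as a quantifier.

```javascript
// makeUngreedy is a hypothetical port of the Perl make_ungreedy above.
function makeUngreedy(regex) {
  let inClass = false;  // inside a [...] character class
  let escaped = false;  // previous character was a backslash
  let found = false;    // previous character ended a quantifier (*, + or })
  let out = "";
  for (const ch of regex) {
    if (found) {
      // Append ? unless the quantifier is already non-greedy.
      if (ch !== "?") out += "?";
      found = false;
    }
    out += ch;
    if (escaped) { escaped = false; continue; }
    if (ch === "\\") { escaped = true; continue; }
    if (inClass) {
      if (ch === "]") inClass = false;
      continue;
    }
    if (ch === "[") { inClass = true; continue; }
    if (ch === "*" || ch === "+" || ch === "}") found = true;
  }
  if (found) out += "?";
  return out;
}

for (const re of [".*", "[.*]", "[.*]{4,10}", "[.*{4,10}]{4,10}"]) {
  console.log(re, "=>", makeUngreedy(re));
}
```

For anything beyond simple patterns, a real regex parser (such as the module mentioned earlier) is the safer route.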

Find all commas between two separate characters in string

I have a substring that contains commas. This substring lives inside another string that is a semicolon-delimited list. I need to match the commas in that substring. The substring has a key field "u3=" in front of it.
Example:
u1=something;u2=somethingelse;u3=cat,matt,bat,hat;u4=anotherthing;u5=yetanotherthing
Regex so far:
(?<=u3)(.*)(?=;)
The regex I've been working on above matches everything between "u3" and the last ";" in the outer string. I need to match only the commas in the substring.
Any guidance would be greatly appreciated.
You didn't specify language!
C#, VB (.NET):
Using an infinite positive lookbehind,
(?<=u3=[^;]*),
Java:
Using a variable-length positive lookbehind:
(?<=u3=[^;]{0,9999}),
PHP (PCRE), Perl, Ruby:
Using \G along with \K token:
(?>u3=|\G(?!^))[^,;]+\K,
JavaScript:
Using two replace() methods (if you are going to substitute),
var s = 'u1=something;u2=somethingelse;u3=cat,matt,bat,hat;u4=anotherthing;u5=yetanotherthing';
console.log(
    s.replace(/u3=[^;]+/, function(match) {
        return match.replace(/,/g, '*');
    })
);
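In JavaScript engines that implement ES2018 lookbehind assertions (an assumption about the runtime, for example a recent Node.js), the .NET-style pattern can also be used directly. A minimal sketch:

```javascript
const input = 'u1=something;u2=somethingelse;u3=cat,matt,bat,hat;u4=anotherthing;u5=yetanotherthing';

// A comma qualifies only if "u3=" precedes it with no semicolon in between.
const commaInU3 = /(?<=u3=[^;]*),/g;

const replaced = input.replace(commaInU3, '*');
const commaCount = (input.match(commaInU3) || []).length;
console.log(replaced);
console.log(commaCount); // 3
```

Note that this would also match commas after a key that merely ends in "u3", such as "xu3="; anchoring the key with (?:^|;) inside the lookbehind is one way to tighten it.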
Try to use this regex:
(?<=u3)[^;]+
The result is:
=cat,matt,bat,hat
If this were PHP, I would do this:
<?php
$str = 'u1=something;u2=somethingelse;u3=cat,matt,bat,hat;u4=anotherthing;u5=yetanotherthing;';
$matches = array(); // stays empty if no u3 field is found
$split = explode(';', $str);
foreach ($split as $key => $value) {
    $subsplit = explode('=', $value);
    if ($subsplit[0] == 'u3') {
        echo $subsplit[1];
        // record the offset of every comma within the u3 value
        preg_match_all('/,/', $subsplit[1], $matches, PREG_OFFSET_CAPTURE);
    }
}
var_dump($matches);

Get Value from string between ":" and ","

Here is example of my string
"{"id":128,"order":128,"active":"1","name":"\"
Now I need to get "128", the id parameter. So it's the first value between ":" and ",".
I have tried preg_match with different regular expressions, but I'm just not good at regular expressions. Maybe someone knows how to do it?
$id = preg_match('/:(,*?)\,/s', $content, $matches);
Here is a sample code to get the number after the first : using a regex:
$re = "/(?<=\\:)[0-9]+/";
$str = "\"{\"id\":128,\"order\":128,\"active\":\"1\",\"name\":\"\"";
preg_match($re, $str, $matches);
print $matches[0];
Here is a sample program on TutorialsPoint.
Just a small detail about this regex (?<=\\:)[0-9]+: it uses a fixed-width look-behind that PHP supports, fortunately.
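The same lookbehind idea can be sketched in JavaScript, assuming an engine with ES2018 lookbehind support. The second pattern, which keys on the literal "id" name, is an alternative added here for illustration and is not from the answer above.

```javascript
// The sample string from the question (truncated JSON, so JSON.parse is not an option).
const str = '"{"id":128,"order":128,"active":"1","name":"\\"';

// First run of digits that directly follows a colon.
const viaLookbehind = str.match(/(?<=:)[0-9]+/);

// Alternative: anchor on the "id" key and capture its digits.
const viaKey = str.match(/"id":(\d+)/);

console.log(viaLookbehind ? viaLookbehind[0] : null); // 128
console.log(viaKey ? viaKey[1] : null);               // 128
```

Keying on "id" keeps working if the field order changes, whereas the lookbehind version simply returns whichever number comes first.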
<?php
$txt='"{"id":128,"order":128,"active":"1","name":"\\"';
$re1='.*?'; # Non-greedy match on filler
$re2='(\\d+)'; # Integer Number 1
$re3='.*?'; # Non-greedy match on filler
$re4='(\\d+)'; # Integer Number 2
$re5='.*?'; # Non-greedy match on filler
$re6='(\\d+)'; # Integer Number 3
if ($c=preg_match_all ("/".$re1.$re2.$re3.$re4.$re5.$re6."/is",$txt, $matches))
{
$int1=$matches[1][0];
$int2=$matches[2][0];
$int3=$matches[3][0];
print "($int1) ($int2) ($int3) \n";
}
?>

RegEx - grouping a string

Can't seem to figure out an expression which handles this line of text:
'SOME_TEXT','EVEN_MORE_TEXT','EXPRESSION IS IN (''YES'',''NO'')'
into these groupings:
SOME_TEXT
EVEN_MORE_TEXT
EXPRESSION IS IN ('YES', 'NO')
....I'd rather have a nifty regex than solving this by string functions like indexOf(), etc..
The regex '([^']|'')+' will match the parts you're interested in, as this demo shows:
$text = "'SOME_TEXT','EVEN_MORE_TEXT','EXPRESSION IS IN (''YES'',''NO'')'";
preg_match_all("/'([^']|'')+'/", $text, $matches);
print_r($matches[0]);
which prints:
Array
(
[0] => 'SOME_TEXT'
[1] => 'EVEN_MORE_TEXT'
[2] => 'EXPRESSION IS IN (''YES'',''NO'')'
)
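The matches above still carry their surrounding quotes and the doubled inner quotes. A minimal JavaScript sketch of one way to post-process them follows; the variable name fields and the un-doubling step are assumptions, not part of the answer.

```javascript
const text = "'SOME_TEXT','EVEN_MORE_TEXT','EXPRESSION IS IN (''YES'',''NO'')'";

const fields = (text.match(/'(?:[^']|'')+'/g) || [])
  // Drop the outer quotes, then turn each doubled quote back into a single one.
  .map(f => f.slice(1, -1).replace(/''/g, "'"));

console.log(fields);
// [ "SOME_TEXT", "EVEN_MORE_TEXT", "EXPRESSION IS IN ('YES','NO')" ]
```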

regex: match a pattern between last occurrence of a "/" and the end of the line

I can't figure out how to match a pattern between LAST / and the end of the line.
I have tons of:
/usr/etc/blabla:/etc/bbb
/usr/etc/blabla:/etc/bffb.gh
/usr/etc/blabla:/local/fffusr
/usr/etc/blabla:/bin/dfusrd
/usr/etc/var:/etc/aaaaaf.ju
For example, I want to match "usr" only when it is in the bold part (the part after the last /).
I'm using grep.
EDIT:
I've a small problem with this solution:
/([^/]+)$
It doesn't match the pattern if it is immediately after the /, for example those:
/usr/etc/blabla:/bin/usrlala
/bin/bla/:/etc/usr
are not matched
FOUND IT: /([^/]*)$
It would be:
/([^/]+)$
But maybe you must escape the slash (/) depending on your language:
/\/([^\/]+)$/
Why do you want to use a regex for such a simple task?
If you're using PHP you can use
$pos = strrpos($line, '/');
to find the last occurrence of / and then copy everything after it:
$name = substr($line, $pos+1);
A regex is not the ultimate solution to everything. It will be slower on such simple string operations; a well-written routine of your own that parses the string will generally beat it.
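The same non-regex idea, sketched in JavaScript on the sample lines from the question. The helper name lastSegment is hypothetical.

```javascript
// lastSegment is a hypothetical helper name.
function lastSegment(line) {
  // lastIndexOf returns -1 when there is no "/", so slice(0) keeps the whole line.
  return line.slice(line.lastIndexOf("/") + 1);
}

const lines = [
  "/usr/etc/blabla:/etc/bbb",
  "/usr/etc/blabla:/etc/bffb.gh",
  "/usr/etc/blabla:/local/fffusr",
  "/usr/etc/blabla:/bin/dfusrd",
  "/usr/etc/var:/etc/aaaaaf.ju",
];

// Keep only the lines whose part after the last "/" contains "usr".
const hits = lines.filter(line => lastSegment(line).includes("usr"));
console.log(hits);
```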
echo "
/usr/etc/blabla:/etc/bbb
/usr/etc/blabla:/etc/bffb.gh
/usr/etc/blabla:/local/fffusr
/usr/etc/blabla:/bin/dfusrd
/usr/etc/var:/etc/aaaaaf.ju" | sed -n 's#.*/##;/.*usr.*/p'
fffusr
dfusrd
Answer in JavaScript:
var s = "/usr/etc/var:/etc/aaaaaf.ju";
s; //# => /usr/etc/var:/etc/aaaaaf.ju
var last = s.match(/[^/]+$/)[0]; // match() returns an array; [0] is the matched text
last; //# => aaaaaf.ju
Using PCRE:
// [^\/\n]* keeps "usr" inside the part after the last slash of each line
$re = '/^.*\/[^\/\n]*usr[^\/\n]*$/im';
$string = '/usr/etc/blabla:/etc/bbb
/usr/etc/blabla:/etc/bffb.gh
/usr/etc/blabla:/local/fffusr
/usr/etc/blabla:/bin/dfusrd
/usr/etc/var:/etc/aaaaaf.ju';
$nMatches = preg_match_all($re, $string, $aMatches);
Result:
Array
(
[0] => Array
(
[0] => /usr/etc/blabla:/local/fffusr
[1] => /usr/etc/blabla:/bin/dfusrd
)
)