Problem with regex_replace in prestashop smarty - regex

I'm trying to use regex_replace in prestashop product_list.tpl. My code is like:
{$product.description|regex_replace:".*(?=Kompatybilny)":""|strip_tags:'UTF-8'}
I'd like it to show $product.description after the word "Kompatybilny", but it doesn't work and I don't know why. I've tried different regex functions, but the result is the same: the variable isn't shown at all.

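The pattern needs regex delimiters (here /.../), since Smarty's regex_replace modifier passes it to PHP's preg_replace. You may use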
{$product.description|regex_replace:"/.*?(?=Kompatybilny)/su":''}
The regex will match
.*? - any 0+ chars, as few as possible, up to (but excluding from the match) the first occurrence of
(?=Kompatybilny) - the Kompatybilny substring
su - the modifiers: s lets . match line break chars, and u makes the pattern and subject be treated as UTF-8 (Unicode) strings.
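If you want to try the pattern outside Smarty, here is a minimal Python sketch of the same idea. It assumes Python's re module treats this pattern the way PCRE does: re.S stands in for the s modifier, and Python 3 strings are already Unicode, so there is no u flag. The sample description is made up for illustration.

```python
import re

# Hypothetical product description; the real one comes from PrestaShop.
description = "<p>Opis produktu.</p>\n<p>Kompatybilny z: model X</p>"

# .*? lazily consumes everything before the first "Kompatybilny";
# re.S lets . cross the line break. count=1 limits it to one replacement.
trimmed = re.sub(r".*?(?=Kompatybilny)", "", description, count=1, flags=re.S)

print(trimmed)
```

The lookahead is not part of the match, so the word "Kompatybilny" itself stays in the output.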

Related

How to match characters between two occurrences of the same but random string

Base string looks like:
repeatedRandomStr ABCXYZ /an/arbitrary/##-~/sequence/of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc
The things I know about this base string are:
ABCXYZ is constant and always present.
repeatedRandomStr is random, but its first occurrence is always at the beginning and before ABCXYZ
So far I looked at regex context matching, recursion and subroutines but couldn't come up with a solution myself.
My currently working solution is to first determine what repeatedRandomStr is with:
^(.*)\sABCXYZ
and then use:
repeatedRandomStr\sABCXYZ\s(.*)\srepeatedRandomStr
to match what I want in $1. But this requires two separate regex queries. I want to know if this can be done in a single execution.
In Go, which uses the RE2 library, there is no way other than yours: first extract the value before ABCXYZ, then use a second regex to match the string between the two occurrences. RE2 does not, and will not, support backreferences.
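If the regex flavor can be switched to PCRE or a compatible one, you can use either of the following. The first (greedy Group 2) captures up to the last repetition of Group 1, the second (lazy Group 2) up to the first repetition: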
^(.*?)\s+ABCXYZ\s(.*)\1
^(.*?)\s+ABCXYZ\s(.*?)\1
See the regex demo.
Details:
^ - start of string
(.*?) - Group 1: zero or more chars other than line break chars as few as possible
\s+ - one or more whitespaces
ABCXYZ - some constant string
\s - a whitespace
(.*) - Group 2: zero or more chars other than line break chars as many as possible
\1 - the same value as in Group 1.
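As a quick check of the backreference approach, here is a small Python sketch. Python's re module is not PCRE, but it does support \1, so it serves as a stand-in here. The input is the base string from the question.

```python
import re

# Base string from the question, split only to keep the line short.
text = ("repeatedRandomStr ABCXYZ /an/arbitrary/##-~/sequence/"
        "of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc")

# Group 1 captures the random prefix; \1 requires the same text to repeat later.
pattern = re.compile(r"^(.*?)\s+ABCXYZ\s(.*)\1")

match = pattern.search(text)
if match:
    print("Group 1:", match.group(1))
    print("Group 2:", match.group(2))
```

Note that Group 2 keeps the trailing / before the repeated string, because nothing in the pattern consumes it.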

REGEX - How to find two hyphens in a filename?

I'd like to search for filenames that contain exactly two hyphens. Some filenames have one hyphen; I just want the ones with two hyphens in the name:
THIS: some text - more text - yet more.txt
NOT THIS: some text - more text.txt
The hyphens are always surrounded by a space, FWIW.
I tried using (.*) - (.*) - (.*) and a couple variants, but the results aren't what I am looking for. I either get nothing or filenames with just one hyphen when I try various combinations.
I know this is an obvious one, but I have tried wading through regex tutorials concerning greedy, look aheads, etc. but can't for the life of me solve this. Can anyone help? I'm not looking for just the solution--I'd like to understand what I'm doing wrong in the regex syntax.
You can use this regex,
^[^-]*(?:-[^-]*){2}$
Written out in expanded form, it looks like this:
^[^-]*-[^-]*-[^-]*$
That is the shape you were after; I've just compacted it with a quantifier that restricts the number of hyphens to exactly two.
Demo
If you want to keep your own regex, just change .* to [^-]*; otherwise .* will match additional hyphens too, leading to unexpected matches. Your regex then becomes:
^([^-]*) - ([^-]*) - ([^-]*)$
Note that you need the start ^ and end $ anchors so that the regex has to match the whole filename.
Demo with your modified regex
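To see the hyphen-counting pattern in action, here is a short Python sketch. The first two filenames are the ones from the question; the third, with three hyphens, is a made-up extra case.

```python
import re

# Exactly two hyphens: non-hyphen runs separated by two "-" characters.
two_hyphens = re.compile(r"^[^-]*(?:-[^-]*){2}$")

names = [
    "some text - more text - yet more.txt",  # two hyphens
    "some text - more text.txt",             # one hyphen
    "a - b - c - d.txt",                     # three hyphens (hypothetical)
]

matching = [name for name in names if two_hyphens.match(name)]
print(matching)
```

Only the first name should survive the filter, since [^-]* can never swallow an extra hyphen.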

Match all occurrences with a single or regex

I need a regex that matches both:
;hostname:MytestHello;
;message:#Hellowtestworld;
In this value:
;hostname:MytestHello;severity:major;message:#Hellowtestworld;
Here is my regex shot:
(hostname:|message:).*?(test).*?\;
But I only get the first occurrence:
hostname:nimsofttest22;
What can I do to get BOTH results?
While the multiple matching part is easy to solve with a global modifier or the correct language function/method that returns multiple matches, your pattern contains a flaw: it may return unwanted results if message or hostname with no test after them appear before another occurrence with test. See this regex demo to understand what I mean.
So, the correct way is to restrict . here, to match any char but ; (that acts as a delimiter in your string):
/(?:hostname|message):[^;]*?test[^;]*;/g
See this regex demo.
Note: you should adapt the pattern to whatever language method/function you choose later in the code.
Details
(?:hostname|message) - either of the 2 substrings
: - a colon
[^;]*? - any 0+ chars other than ;, as few as possible
test - test
[^;]* - any 0+ chars other than ; as many as possible
; - a semi-colon.
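Here is a rough Python sketch of both points: getting all matches (findall plays the role of the g modifier) and the flaw in the original pattern. The second input string is a made-up example where hostname has no "test" in it.

```python
import re

fixed = re.compile(r"(?:hostname|message):[^;]*?test[^;]*;")
original = re.compile(r"(hostname:|message:).*?(test).*?;")

value = ";hostname:MytestHello;severity:major;message:#Hellowtestworld;"
# No capturing groups in the fixed pattern, so findall returns whole matches.
matches = fixed.findall(value)
print(matches)

# Hypothetical input: hostname without "test", followed by a message with it.
tricky = ";hostname:abc;message:test1;"
original_hit = original.search(tricky).group(0)  # . crosses the ";" delimiter
fixed_hit = fixed.search(tricky).group(0)        # [^;] stays inside one field
print(original_hit)
print(fixed_hit)
```

On the tricky input, the original pattern starts at hostname and runs on into the message field, while the fixed one only returns the message field.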

Regex filter list of url's

I need some help with a regex to filter a large list of urls, like:
/page-to-search-for/id/any-string
The problem is that the list also includes url's with a sub-page, like
/page-to-search-for/id/any-string/registration-form
Those pages need to be excluded from the results.
So, the regex needs to look somewhat like:
/page-to-search-for\/(\d+)\/(\w+)(\/?(?!registration-form))
Unfortunately, the last part isn't working.
Hopefully someone can help me out?
Thanks!
It seems you want to block any URLs that, right after any-string, have registration-form at the end of string position.
You may use
some-page\/(\d+)\/([^\/]+)(?:\/(?!registration-form$).*)?$
See the regex demo.
I suggest replacing \w+ with [^\/]+ (1+ chars other than /, so it matches any path segment, including hyphens, which \w does not match). The rest of the pattern works as follows:
(?:\/(?!registration-form$).*)?$ - 1 or 0 (optionally) sequences of:
\/ - a slash
(?!registration-form$) - not followed with registration-form and end of string ($)
.* - any 0+ chars
$ - end of string.
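A small Python sketch of the filter follows. Two assumptions: page-to-search-for from the question stands in for some-page, and the numeric id 42 and the other-page URL are made up. In Python the / does not need escaping.

```python
import re

# Assumption: "page-to-search-for" replaces "some-page" from the answer.
pattern = re.compile(
    r"page-to-search-for/(\d+)/([^/]+)(?:/(?!registration-form$).*)?$"
)

urls = [
    "/page-to-search-for/42/any-string",
    "/page-to-search-for/42/any-string/registration-form",
    "/page-to-search-for/42/any-string/other-page",  # hypothetical sub-page
]

# Keep every URL except those ending in /registration-form.
kept = [url for url in urls if pattern.search(url)]
print(kept)
```

Because the lookahead only rejects registration-form at the end of the string, other sub-pages still pass. If all sub-pages should be excluded, dropping the optional group and keeping the final $ would be enough.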

Mixing Lookahead and Lookbehind in 1 Regexp

I'm trying to match first occurrence of window.location.replace("http://stackoverflow.com") in some HTML string.
Especially I want to capture the URL of the first window.location.replace entry in whole HTML string.
To capture the URL, I formulated these two rules:
it should be after this string: window.location.redirect("
it should be before this string ")
To achieve it I think I need to use lookbehind (for 1st rule) and lookahead (for 2nd rule).
I end up with this Regex:
.+(?<=window\.location\.redirect\(\"?=\"\))
It doesn't work. I'm not even sure that it is legal to mix both rules like I did.
Can you please help me with translating my rules to Regex? Other ways of doing this (without lookahead(behind)) also appreciated.
The pattern you wrote is not the one you need, as it matches something very different from what you expect: in the text window.location.redirect("=") something, it matches window.location.redirect("="). Also, it will only work in PCRE/Python if you remove the ? before \" (as lookbehinds should be fixed-width there). It will work with the ? in .NET regex.
If it is JS, a lookbehind may not be available: lookbehinds were only added in ECMAScript 2018, and older engines do not support them.
Instead, use a capturing group around the unknown part you want to get:
/window\.location\.redirect\("([^"]*)"\)/
or
/window\.location\.redirect\("(.*?)"\)/
See the regex demo
Without the /g modifier, only the first occurrence is matched. Access the value you need in Group 1.
The ([^"]*) captures 0+ characters other than a double quote (the URLs you need should not contain one). If your URLs can contain a ", use the second approach, since (.*?) matches any 0+ characters other than a newline up to the first ").
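For comparison, here is a Python sketch with both the capturing-group approach and a lookbehind plus lookahead version. Python's re accepts this lookbehind because it is fixed-width. The HTML string is made up, with two redirects so you can see that only the first is returned.

```python
import re

# Hypothetical HTML containing two redirect calls.
html = (
    '<script>window.location.redirect("http://stackoverflow.com");</script>\n'
    '<script>window.location.redirect("http://example.com");</script>'
)

# Capturing group: the URL ends up in Group 1; search() stops at the first hit.
group_match = re.search(r'window\.location\.redirect\("([^"]*)"\)', html)
first_url = group_match.group(1) if group_match else None

# Lookarounds: fixed-width lookbehind before the URL, lookahead after it,
# so the whole match is the URL itself.
look_match = re.search(r'(?<=window\.location\.redirect\(")[^"]*(?="\))', html)
look_url = look_match.group(0) if look_match else None

print(first_url)
print(look_url)
```

So mixing a lookbehind and a lookahead in one pattern is legal where the engine supports both; the capturing group is simply the more portable choice.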