Google Tag Manager - Regex match - regex

I want to check if a specific string is included in a GTM variable. The value of this variable is a URI-decoded first-party cookie value that looks like this:
"\"prodirversion\":5,\"panellanguage\":\"de\",\"preferences\":false,"\"marketing\":true,\"necessary\":true,\"statistics\":false,\"social_"
I now want to check if the following string is included.
marketing":true
I created another variable with a regex table and tried different regular expressions, but nothing seems to work. They work in online regex testers but not in Google Tag Manager.
My guess would be the following but it doesn't work.
marketing\\":true
or
marketing.{3}true
or
marketing\\.{2}true

Some regex engines raise an error when the " character is not escaped, as in marketing\\":true
Try escaping it like this: marketing\\\":true, and it should match.
Update:
marketing":true seems to work in GTM.
From that, we can conclude that the escaping \ characters shown in the input string are display-only in GTM's case, and should be ignored when testing or debugging the regex.
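To make the conclusion above concrete, here is a minimal Python sketch (not GTM itself, whose regex engine may differ in details). The names `displayed` and `actual` are hypothetical: `displayed` holds the text as the preview shows it, with backslashes, and `actual` holds what the variable is assumed to really contain once those backslashes are treated as display-only.

```python
import re

# Value as shown in the preview, backslashes included (assumed display artifact).
displayed = r'\"preferences\":false,\"marketing\":true,\"necessary\":true'

# Value the variable is assumed to actually hold: plain double quotes.
actual = '"preferences":false,"marketing":true,"necessary":true'

plain_pattern = r'marketing":true'        # no backslash expected in the input
escaped_pattern = r'marketing\\":true'    # expects a literal backslash in the input

# The plain pattern only matches the real value.
plain_on_actual = re.search(plain_pattern, actual) is not None
plain_on_displayed = re.search(plain_pattern, displayed) is not None

# The escaped pattern only matches the text as displayed.
escaped_on_actual = re.search(escaped_pattern, actual) is not None
escaped_on_displayed = re.search(escaped_pattern, displayed) is not None
```

This would explain why patterns that account for the backslash pass in an online tester (where the displayed text was pasted) but fail against the real value.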


Lucene regex v4

I am trying to query on Kibana version 7.9.1 for a UUID v4. I disabled KQL and now it looks like it is using Lucene.
Example of a uuid v4:
2334e133-37a6-4039-8acd-b0a561b961b2
Now if I input :
/[0-9a-fA-F]{8}/
in the search bar I get hits, but as soon as I try to escape the hyphen like
/[0-9a-fA-F]{8}\-/
nothing shows up. I would like to use the full regular expression:
[0-9a-fA-F]{8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{12}
But I can't because of the hyphens.
Is there any other way to escape that pesky hyphen?
I am using Elasticsearch 7.9.1, by the way.
I'm not sure why that regex above won't work for you, but this was the best I could come up with given the context: ^[0-9a-fA-F]{8}[^\s\d\w!##$%^&*()_+=\\\][{}|';:"\/.,<>?][0-9a-fA-F]{4}[^\s\d\w!##$%^&*()_+=\\\][{}|';:"\/.,<>?][0-9a-fA-F]{4}[^\s\d\w!##$%^&*()_+=\\\][{}|';:"\/.,<>?][0-9a-fA-F]{4}[^\s\d\w!##$%^&*()_+=\\\][{}|';:"\/.,<>?][0-9a-fA-F]{12}$
It basically replaces each "-" with a negated character class "[^...]" that I filled with almost everything except "-", and adds a start anchor "^" and an end anchor "$".
Again, I'm not sure whether Lucene simply doesn't support certain parts of regex, but try not escaping the hyphens; I know some programs automatically escape symbols for you when using regex.
I ended up using the following regex with Lucene in Kibana's Discover view:
/[0-9a-fA-F]{8}/ AND /[0-9a-fA-F]{4}/ AND /[0-9a-fA-F]{12}/
Not pretty, but it works.
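One possible explanation, which is an assumption and not confirmed in this thread: if the field is an analyzed text field, the analyzer splits the UUID at the hyphens, and a Lucene regexp query is matched against each whole term rather than against the original string. The Python sketch below only simulates that idea; `tokens` and `term_matches` are hypothetical names, and splitting on "-" is a rough stand-in for the real analyzer.

```python
import re

uuid_value = "2334e133-37a6-4039-8acd-b0a561b961b2"

# Rough stand-in for an analyzer that breaks the value at hyphens (assumption).
tokens = uuid_value.split("-")

def term_matches(pattern, terms):
    """Return True if the pattern matches at least one whole term."""
    return any(re.fullmatch(pattern, term) for term in terms)

# A pattern containing a hyphen can never match, because no term contains one.
with_hyphen = term_matches(r"[0-9a-fA-F]{8}-", tokens)

# Each hyphen-free piece matches one of the terms, as in the AND query above.
eight = term_matches(r"[0-9a-fA-F]{8}", tokens)
four = term_matches(r"[0-9a-fA-F]{4}", tokens)
twelve = term_matches(r"[0-9a-fA-F]{12}", tokens)
```

If that assumption holds, querying a non-analyzed (keyword) version of the field, where one exists, might allow the full pattern to match, but that was not tried in this thread.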

Id match in firestore database rules

I am working on my security rules, but when I try to match the document id with a regex, it doesn't work.
I tried to use the matches function, but it doesn't seem to accept the method.
I even tried the Firebase pattern for YYYY-MM-DD (/^(19|20)[0-9][0-9][-\\/. ](0[1-9]|1[012])[-\\/. ](0[1-9]|[12][0-9]|3[01])$/) from here, but it didn't work (I tried with 1950-01-01).
I am trying to check roomId for this pattern (/^(\\d){6,}#[a-zA-Z0-9]{65,}$/)
Edit: I tried removing the " " around the regex but it gives me this error: mismatched input ')' expecting {'{', '/', PATH_SEGMENT}
(I know the regex is OK, but I don't know why it won't work in the code I wrote)
You're getting the syntax mixed up between Realtime Database and Firestore.
In Realtime Database security rules, the regular expression is specified as a JavaScript regex, so it is enclosed in / at the start and end.
In Firestore security rules the regular expression needs to be passed as a string, which also means it shouldn't be wrapped in / symbols.
So:
allow create: if docId.matches("^(19|20)[0-9][0-9][-\\/. ](0[1-9]|1[012])[-\\/. ](0[1-9]|[12][0-9]|3[01])$");
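As a quick sanity check of the pattern itself, outside the rules engine, here is a minimal Python sketch. Python's `re` module is not the engine Firestore uses, so this only confirms that the date pattern accepts the sample id; `is_valid_doc_id` is a hypothetical helper name.

```python
import re

# In the rules file the string literal contains "\\/", which denotes the regex
# escape \/ once the string is unescaped. A Python raw string writes it directly.
DATE_PATTERN = r"^(19|20)[0-9][0-9][-\/. ](0[1-9]|1[012])[-\/. ](0[1-9]|[12][0-9]|3[01])$"

def is_valid_doc_id(doc_id):
    """Return True if doc_id looks like a YYYY-MM-DD date per the pattern."""
    return re.match(DATE_PATTERN, doc_id) is not None

sample_ok = is_valid_doc_id("1950-01-01")
bad_month = is_valid_doc_id("1950-13-01")
# Wrapping the value in slashes, as a JavaScript-style literal would, does not match.
with_slashes = is_valid_doc_id("/1950-01-01/")
```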

Removing sensitive information from the logs using regex

In my Ruby app I have the following regex that helps me remove sensitive information from logs:
/(\\"|")secure[^:]+:\s*\1.*?\1/
It works when the logs contain the following:
{"secure_data": "Test"}
but it does not work when the logs contain an object instead of a string:
{"secure_data": {"name": "Test"}}
How can I update regex to work with both scenarios?
https://rubular.com/r/h9EBZot1e7NUkS
You may use this regex with negated character classes and an alternation:
"secure[^:]+:\s*(?:"[^"]*"|{[^}]*})
Inside non-capturing group (?:"[^"]*"|{[^}]*}) we are matching a quoted string or an object that starts with { and ends with }.
Update RegEx Demo
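For illustration, here is the same pattern tried in Python rather than Ruby (the syntax used here is common to both). `scrub` and the `[FILTERED]` mask are hypothetical names chosen for this sketch.

```python
import re

# The pattern from the answer, unchanged: a "secure..." key followed by
# either a quoted string or a flat {...} object.
SECURE_FIELD = re.compile(r'"secure[^:]+:\s*(?:"[^"]*"|{[^}]*})')

def scrub(line, mask="[FILTERED]"):
    """Replace every matched secure key/value pair with the mask."""
    return SECURE_FIELD.sub(mask, line)

string_case = scrub('{"secure_data": "Test"}')
object_case = scrub('{"secure_data": {"name": "Test"}}')
```

Both sample lines are reduced to `{[FILTERED]}`. Note that `{[^}]*}` stops at the first closing brace, so an object nested more than one level deep would only be partially matched.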
The following should work for what you're trying to do. I'd suggest using a json parser though.
{"secure[^:]*?:\s({?(?:(?:,[^"]*?)?"[^"]*?"(?::\s"[^"]*?")?)*?)*?}?}
With this regex the object in secure_data may also contain multiple key-value(string)-pairs. It will still match. Other objects will not.
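The JSON-parser route suggested above could look roughly like this. It is a Python sketch rather than Ruby, and it assumes each log line is a complete, valid JSON document; `redact`, `redact_log_line`, and the `[FILTERED]` mask are hypothetical names.

```python
import json

def redact(value, mask="[FILTERED]"):
    """Recursively mask the value of any key that starts with 'secure'."""
    if isinstance(value, dict):
        return {
            key: mask if key.startswith("secure") else redact(item, mask)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, mask) for item in value]
    return value

def redact_log_line(line):
    """Parse one JSON log line, mask secure fields, and serialize it again."""
    return json.dumps(redact(json.loads(line)))

string_case = redact_log_line('{"secure_data": "Test"}')
object_case = redact_log_line('{"secure_data": {"name": "Test"}}')
```

Because the structure is parsed instead of pattern-matched, nesting depth and embedded quotes or braces in values no longer matter. Lines that are not valid JSON would raise an error and need separate handling.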

Regular expression not working in google analytics

I'm trying to build a regular expression to capture URLs which contain a certain parameter 7136D38A-AA70-434E-A705-0F5C6D072A3B
I've set up a simple regex to capture a URL with anything before and anything after this parameter (that is, all URLs which contain this parameter). I've tested this on an online checker, http://scriptular.com/, and it seems to work fine. However, Google Analytics says it is invalid when I try to use it. Any idea what is causing this?
Url will be in the format
/home/index?x=23908123890123&y=kjdfhjhsfd&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd
So I just want to capture URLs that contain that specific "z" parameter.
regex
^.+(?=7136D38A-AA70-434E-A705-0F5C6D072A3B).+$
You just need
^.+=7136D38A-AA70-434E-A705-0F5C6D072A3B.+$
Or (a bit safer):
^.+=7136D38A-AA70-434E-A705-0F5C6D072A3B($|&.+$)
And I think you can even use
=7136D38A-AA70-434E-A705-0F5C6D072A3B($|&)
See demo
Your regex is invalid because GA regex flavor does not support look-arounds (and you have a (?=...) positive look-ahead in yours).
Here is a good GA regex cheatsheet.
To match /home/index?x=23908123890123&y=kjdfhjhsfd&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd you can use:
\S*7136D38A-AA70-434E-A705-0F5C6D072A3B\S*
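The lookaround-free patterns above can be checked quickly in Python. This is only a sketch of the matching logic, not of GA's own engine; `has_target_param` is a hypothetical helper name.

```python
import re

# Shortest variant from the answer: the value must follow "=" and be followed
# by either the end of the URL or the next "&". No lookahead is needed.
TARGET = re.compile(r"=7136D38A-AA70-434E-A705-0F5C6D072A3B($|&)")

def has_target_param(url):
    """Return True if the URL carries the target GUID as a parameter value."""
    return TARGET.search(url) is not None

sample = ("/home/index?x=23908123890123&y=kjdfhjhsfd"
          "&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd")

in_middle = has_target_param(sample)
at_end = has_target_param("/home/index?z=7136D38A-AA70-434E-A705-0F5C6D072A3B")
longer_value = has_target_param("/home/index?z=7136D38A-AA70-434E-A705-0F5C6D072A3BXYZ")
```

The `($|&)` part is what makes this variant a bit safer: a longer value that merely starts with the GUID is rejected.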

RegEx SQL, issue escaping quotes

I am trying to use PSQL, specifically AWS Redshift to parse a line. Sample data follows
{"c.1.mcc":"250","appId":"sx-calllog","b.level":59,"c.1.mnc":"01"}
{"appId":"sx-voice-call","b.level":76,"foreground":9}
I am trying the following regex in order to extract the appId field, but my query is returning empty fields.
'appId\":\"[\w*]\",'
Query
SELECT app_params,
regexp_substr(app_params, 'appId\":\"[\w*]\",')
FROM sample;
You can do that as follows:
(\"appId\":\"[^"]*\")(?:,)
Demo: http://regex101.com/r/xP0hW3
The first extracted group is what you want.
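Your regex was not matching because \w does not match -. In addition, [\w*] is a character class that matches only a single character (a word character or a literal *), not a run of them.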
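To see the difference between the two patterns, here is a small Python sketch run against the sample rows. It illustrates the regex behaviour only and is not Redshift; `original` and `fixed` are hypothetical variable names.

```python
import re

lines = [
    '{"c.1.mcc":"250","appId":"sx-calllog","b.level":59,"c.1.mnc":"01"}',
    '{"appId":"sx-voice-call","b.level":76,"foreground":9}',
]

# Pattern from the question: [\w*] allows exactly one character between the quotes.
original = re.compile(r'appId":"[\w*]",')

# Pattern from the answer: any run of non-quote characters, captured in group 1.
fixed = re.compile(r'("appId":"[^"]*")(?:,)')

original_hits = [original.search(line) is not None for line in lines]
fixed_groups = [fixed.search(line).group(1) for line in lines]
```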
Adding this here despite this being an old question since it may help someone viewing this down the road...
If your lines of data are valid JSON, you can use Redshift's JSON_EXTRACT_PATH_TEXT function to extract the value of a given key. Emphasis on the JSON being valid: it will fail if even one line cannot be parsed, and Redshift will throw a JSON parsing error.
Example using given data:
select json_extract_path_text('{"c.1.mcc":"250","appId":"sx-calllog","b.level":59,"c.1.mnc":"01"}','appId');
returns sx-calllog
This is especially useful since Redshift uses POSIX regex, which does not support lookahead/lookbehind or extracting groups.
You can try using a lookahead and a lookbehind to isolate just the text inside the quotes for the appId: (?<=appId\":\")(?=.*\",)[^\"]*. I tested this out a bit using the examples you provided here.
To explain the regex a bit more: (?<=appId\":\")(?=.*\",)[^\"]*
(?<=appId\":\"): positive lookbehind for appId":". Since you don't want the appId text itself being returned (just the value), you can preface the regex with a lookbehind to say "find me the following regex, but only when it follows the lookbehind text".
(?=.*\",): positive lookahead for the ending ",. You don't want quotes to be returned in your match, but as with the lookbehind you want your regex to be bounded a bit, and a lookahead does that.
[^\"]*: The actual matching portion. You want to find the string of chars that are NOT ". This will match the entire value and stop matching right before the closing ".
EDIT: Changed the 3rd step a little bit, removed the , from that last piece, it is not needed and would break the match if the value were to actually contain a ,.
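Here is a minimal Python sketch of the lookaround pattern described above, for engines that support it (as noted earlier in this thread, Redshift's POSIX regex does not). `extract_app_id` is a hypothetical helper name.

```python
import re

# Lookbehind anchors the match right after appId":" and the lookahead requires
# a closing ", somewhere later; only the value itself is consumed.
APP_ID = re.compile(r'(?<=appId":")(?=.*",)[^"]*')

def extract_app_id(line):
    """Return the appId value from a line, or None if the pattern does not match."""
    match = APP_ID.search(line)
    return match.group(0) if match else None

first = extract_app_id(
    '{"c.1.mcc":"250","appId":"sx-calllog","b.level":59,"c.1.mnc":"01"}'
)
second = extract_app_id('{"appId":"sx-voice-call","b.level":76,"foreground":9}')
```

Because the lookahead requires a `",` after the value, a line where appId is the last key in the object would not match.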