Regex to check for valid URL in Firebase Security Rule

I'm trying to implement a Firebase security rule that uses a regex to check whether a value is a valid URL. Here is a sample URL:
https://firebasestorage.googleapis.com/andTheRestIsJustThePartAfterIUploadedIntoFirebaseStorage
I used this regex to check:
"imageURL": {
".validate": "newData.val().matches(/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+#)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%#.\w_]*)#?(?:[\w]*))?)/)"
}
However, I'm getting an error at line 26: Invalid escape: '\+'.
Any ideas?

It's not necessary to escape + inside a regex character class. Use this instead:
[-+=&;%#.\w_]
I see other syntax errors in your pattern: some characters should be escaped and others should not. Your pattern should look like this:
/((([A-Za-z]{3,9}:(?:\\/\\/)?)(?:[-;:&=+\\$,\\w]+#)?[A-Za-z0-9.-]+|(?:www.|[-;:&=+\\$,\\w]+#)[A-Za-z0-9.-]+)((?:\\/[%+~\\/.\\w-_]*)?\\??(?:[-+=&;%#.\\w_]*)#?(?:[\\w]*))?)/
I'm not sure about the logic of your pattern, though. For more information, I suggest reading the Firebase Security Rules documentation on regular expressions.
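The point about + inside a character class can be checked outside Firebase. The sketch below uses Python's re module purely as an illustration (an assumption, since Firebase rules use their own regex dialect, and Python, unlike Firebase, tolerates the redundant \+ escape). The sample strings are made up.

```python
import re

# Inside a character class "+" is a literal, so escaping it is redundant.
unescaped = re.compile(r"^[-+=&;%#.\w]+$")
escaped = re.compile(r"^[-\+=&;%#.\w]+$")

# Hypothetical sample inputs; "no spaces" contains a space, which the class excludes.
samples = ["a+b=c", "key=value&x=1", "no spaces", "50%"]
results = [(bool(unescaped.match(s)), bool(escaped.match(s))) for s in samples]

for sample, (plain, with_escape) in zip(samples, results):
    print(f"{sample!r}: unescaped={plain} escaped={with_escape}")
```

Both patterns accept and reject exactly the same strings, which is why dropping the backslash is safe.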

Related

Consolidated RegEx to parse syslog data

Goal
I am trying to craft a RegEx that will parse out specific data from various syslog entries that contain subtle differences in logged content. While I am able to accomplish my goal using multiple RegEx statements, if possible, I would like to combine these statements into a single consolidated RegEx.
Log entries
The main issue I'm having is that some log entries have a URL that needs to be parsed to a named group and other log entries do not have any URL. Examples of these two different log entries are provided below.
Entry with URL
Nov 3 11:33:04 host1 postfix/smtpd[12812]: NOQUEUE: reject: RCPT from 178.red-83-59-180.dynamicip.rima-tde.net[83.59.180.178]: 554 5.7.1 Service unavailable; Client host [83.59.180.178] blocked using b.barracudacentral.org; http://www.barracudanetworks.com/reputation/?pr=1&ip=83.59.180.178; from=<lmclapp68#newmail.spamcop.net> to=<user1#example.com> proto=ESMTP helo=<178.red-83-59-180.dynamicip.rima-tde.net>
Entry without URL
Nov 2 16:01:25 host1 postfix/smtpd[31667]: NOQUEUE: reject_warning: RCPT from mail1.sendersrv.com[185.3.229.125]: 554 5.7.1 Service unavailable; Client host [185.3.229.125] blocked using bl.spamcop.net; from=<bounces+rL59wUXq98_inBrG#sendersrv.com> to=<user1#example.com> proto=ESMTP helo=<mail1.sendersrv.com>
RegEx statements
In the RegEx statements that follow, the first two are what I currently use for each of the previous log messages. The third RegEx is my attempt at consolidating these both into a single RegEx that will parse data from either log message. My attempt was to use a conditional statement that would basically check for the existence of http(s) and if found, then to parse the URL to a named group. If http(s) was not found, then it would parse out everything until the next RegEx token.
The issue is that when I test the RegEx against a log entry that has a URL, the RegEx does not seem to find http(s) despite this token being set as optional (i.e. using the ? quantifier). However, if I remove the ? quantifier, it does find http(s) and then parses the URL as desired. However, without the quantifier, the RegEx does not work with log entries that do not have a URL.
Parse entries with URL
^(?P<datetime>.+) host1 postfix.+RCPT from (?P<srcDns>.+)\[(?P<srcIp>[0-9\.]+)\]:.+blocked using (?P<blkList>.+);.+https?:\/{2}(?P<entryUrl>.+);\s.+\sto=\<(?P<destEm>.+)>.+$
Parse entries without URL
^(?P<datetime>.+) host1 postfix.+RCPT from (?P<srcDns>.+)\[(?P<srcIp>[0-9\.]+)\]:.+blocked using (?P<blkList>.+);\s.+\sto=\<(?P<destEm>.+)>.+$
Attempt at consolidating RegEx
^(?P<datetime>.+) host1 postfix.+RCPT from (?P<srcDns>.+)\[(?P<srcIp>[0-9\.]+)\]:.+blocked using (?P<blkList>.+)(?<=[a-z]);.+(https?:\/{2})?(?(5)(?P<entryUrl>.+)|.+)to=\<(?P<destEm>.+)>.+$
I'm sure the issue is my misunderstanding of how conditional statements and the ? quantifier work.
Looking at your patterns, the email address for to: is between tags < and > but due to the formatting in the question they are not shown.
Parts of your pattern like .+ first match until the end of the string, and then backtrack to try to match the rest of the pattern.
You can make the pattern a bit more performant by making the parts whose format you know more specific.
For the datetime, you can make the pattern match the specified format instead of .+ using ^(?P<datetime>[A-Z][a-z]{2}\s+\d{1,2}\s* \d{1,2}:\d{1,2}:\d{1,2})
For (?P<blkList>[^;]+) and (?P<entryUrl>[^;]+) you can use a negated character class matching any char except ;
For group (?P<destEm>[^<>\s]+) you can exclude matching tags.
To match the URL, instead of using a conditional you can make the group optional using ?
For example
^(?P<datetime>[A-Z][a-z]{2}\s+\d{1,2}\s* \d{1,2}:\d{1,2}:\d{1,2}) host1 postfix\b.*? RCPT from (?P<srcDns>.*?)\[(?P<srcIp>[0-9\.]+)\]:.*? blocked using (?P<blkList>[^;]+);(?:.+?https?:\/\/(?P<entryUrl>[^;]+);)?\s.*? to=[^<]*<(?P<destEm>[^<>\s]+)>
See a regex demo.
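As a rough check of the optional-group approach, here is a minimal Python sketch that runs a pattern adapted from the one above against both log entries from the question. The parse helper is a hypothetical name, and the pattern is lightly simplified (for example \s+ between day and time), so treat it as an illustration rather than the exact regex from the answer.

```python
import re

# Adapted pattern: the URL part is an optional non-capturing group.
PATTERN = re.compile(
    r"^(?P<datetime>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}) host1 postfix\b.*?"
    r" RCPT from (?P<srcDns>.*?)\[(?P<srcIp>[0-9.]+)\]:.*?"
    r" blocked using (?P<blkList>[^;]+);"
    r"(?:.+?https?://(?P<entryUrl>[^;]+);)?"
    r"\s.*? to=[^<]*<(?P<destEm>[^<>\s]+)>"
)

WITH_URL = (
    "Nov 3 11:33:04 host1 postfix/smtpd[12812]: NOQUEUE: reject: RCPT from "
    "178.red-83-59-180.dynamicip.rima-tde.net[83.59.180.178]: 554 5.7.1 Service unavailable; "
    "Client host [83.59.180.178] blocked using b.barracudacentral.org; "
    "http://www.barracudanetworks.com/reputation/?pr=1&ip=83.59.180.178; "
    "from=<lmclapp68#newmail.spamcop.net> to=<user1#example.com> proto=ESMTP "
    "helo=<178.red-83-59-180.dynamicip.rima-tde.net>"
)
WITHOUT_URL = (
    "Nov 2 16:01:25 host1 postfix/smtpd[31667]: NOQUEUE: reject_warning: RCPT from "
    "mail1.sendersrv.com[185.3.229.125]: 554 5.7.1 Service unavailable; "
    "Client host [185.3.229.125] blocked using bl.spamcop.net; "
    "from=<bounces+rL59wUXq98_inBrG#sendersrv.com> to=<user1#example.com> proto=ESMTP "
    "helo=<mail1.sendersrv.com>"
)


def parse(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


with_url = parse(WITH_URL)
without_url = parse(WITHOUT_URL)
print(with_url["entryUrl"])
print(without_url["entryUrl"])  # None: the optional group did not participate
```

When the optional group does not take part in the match, entryUrl comes back as None, so a single pattern covers both kinds of entry.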
Have you tried testing your regex on a page like regex101?
to=\<(?P<destEm>.+)> doesn't seem to match your examples. You should either remove <> or replace to with helo. Be careful to make your quantifier lazy after blkList otherwise you might catch too much text.
You can then make your url optional with ? and it should work in both cases:
^(?P<datetime>.+) host1 postfix.+RCPT from (?P<srcDns>.+)\[(?P<srcIp>[0-9\.]+)\]:.+blocked using (?P<blkList>.+?);(.+https?:\/{2}(?P<entryUrl>.+);\s)?.+\sto=(?P<destEm>.+?)\s.*$
One approach would be to replace in the first regex .+https?:\/{2}(?P<entryUrl>.+); with (?:.+https?:\/{2}(?P<entryUrl>.+);)? where ?: indicates that it is a non-capturing group and the ? at the end means that it is optional.
However, it still does not work because .+ is greedy, so use lazy .+? instead.
Final regex:
^(?P<datetime>.+?) host1 postfix.+?RCPT from (?P<srcDns>.+?)\[(?P<srcIp>[0-9\.]+)\]:.+?blocked using (?P<blkList>.+?);(?:.+?https?:\/{2}(?P<entryUrl>.+?);)?\s.+?\sto=\<(?P<destEm>.+?)>.+?$
https://regex101.com/r/QkmXWz (to see it in action)
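To see why the greedy quantifiers matter here, the following Python sketch compares a greedy and a lazy variant on shortened, hypothetical log fragments (not the full syslog lines). The patterns are trimmed versions of the ones discussed above, kept only to show the effect.

```python
import re

# Hypothetical, shortened log fragments.
with_url = "blocked using bl.example.org; http://example.com/x; from=<a> to=<b>"
without_url = "blocked using bl.example.org; from=<a> to=<b>"

# Greedy: blkList's .+ runs to the last ";" and swallows the URL,
# so the optional group has nothing left to match.
greedy = re.compile(
    r"blocked using (?P<blkList>.+);(?:.+https?://(?P<entryUrl>.+);)?\s.+\sto="
)
# Lazy: each .+? stops at the first position where the rest can match.
lazy = re.compile(
    r"blocked using (?P<blkList>.+?);(?:.+?https?://(?P<entryUrl>.+?);)?\s.+?\sto="
)

greedy_match = greedy.search(with_url)
lazy_match = lazy.search(with_url)
lazy_no_url = lazy.search(without_url)

print("greedy:", greedy_match.group("blkList"), "|", greedy_match.group("entryUrl"))
print("lazy:  ", lazy_match.group("blkList"), "|", lazy_match.group("entryUrl"))
print("no url:", lazy_no_url.group("blkList"), "|", lazy_no_url.group("entryUrl"))
```

With the greedy version the URL ends up inside blkList and entryUrl stays empty. The lazy version separates the two and still matches the entry without a URL.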

Setting regular expression to validate URL format in Adobe CQ5

I want to validate a URL inside a textfield using Adobe CQ5, so I set up the regex and regexText properties as usual, but for some reason it is not working:
<facebook
jcr:primaryType="cq:Widget"
emptyText="http://www.facebook.com/account-name"
fieldDescription="Set the Facebook URL"
fieldLabel="Facebook"
name="./facebookUrl"
regex="/^(http://www.|https://www.|http://|https://)[a-z0-9]+([-.]{1}[a-z0-9]+)*.[a-z]{2,5}(:[0-9]{1,5})?(/.*)?$/"
regexText="Invalid URL format"
xtype="textfield"/>
So when I type inside the component I can see an error message at the console:
Uncaught TypeError: this.regex.test is not a function
To be more accurate the error comes from this line:
if (this.regex && !this.regex.test(value)) {
I tried several regular expressions and none of them worked. I guess the problem is the regular expression itself, because on the other hand I have this other regex to validate email addresses, and it works perfectly fine:
/^[A-za-z0-9]+[\\._]*[A-za-z0-9]*#[A-za-z.-]+[\\.]+[A-Za-z]{2,4}$/
Any suggestions? Thanks in advance.
The syntax of your regex seems to treat the forward slashes (/) as special characters. Since you want to parse a URL containing slashes, my guess is you should escape them twice like this: '\\/' instead of '/'. The result would be:
/^(http:\\/\\/www.|https:\\/\\/www.|http:\\/\\/|https:\\/\\/)[a-z0-9]+([-.]{1}[a-z0-9]+)*.[a-z]{2,5}(:[0-9]{1,5})?(\\/.*)?$/
You need to escape them twice because the string to be compiled as a regex must contain '\/' to escape the slashes, but to introduce a backslash in a string you have to escape the backslash itself too.
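The two layers of escaping can be illustrated in Python, used here only as a stand-in, since CQ5 compiles the string as a JavaScript regex. In the sketch below, is_valid_url is a hypothetical helper, and the dots are escaped as well, which is my own assumption about the intended pattern rather than part of the answer.

```python
import re

# In an ordinary (non-raw) string literal "\\" is a single backslash,
# so "\\/" reaches the regex engine as the two characters \/ .
prefix = "^(http:\\/\\/www\\.|https:\\/\\/www\\.|http:\\/\\/|https:\\/\\/)"
same_as_raw = prefix == r"^(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)"

URL_RE = re.compile(
    prefix + "[a-z0-9]+([-.][a-z0-9]+)*\\.[a-z]{2,5}(:[0-9]{1,5})?(\\/.*)?$"
)


def is_valid_url(value):
    """Return True when value matches the URL pattern."""
    return URL_RE.match(value) is not None


print(same_as_raw)
print(is_valid_url("http://www.facebook.com/account-name"))
print(is_valid_url("facebook.com/account-name"))
```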

Regex HTTP Response Body Message

I use JMeter for REST testing.
I have made an HTTP Request, and this is the response data:
{"id":11,"name":"value","password":null,"status":"ACTIVE","lastIp":"0.0.0.0","lastLogin":null,"addedDate":1429090984000}
I need just the ID (which is 11) in
{"id":11,....
I use the REGEX below :
([0-9].+?)
It works perfectly, but it becomes a problem if my ID has more than 2 digits. I would need to change the regex to:
([0-9][0-9].+?)
Is there a dynamic regex that handles an ID of any length? Thank you for your attention.
Regards,
Stefio
If you want any integer between {"id": and , use the following Regular Expression:
{"id":(\d+),
However, a smarter way of dealing with JSON data could be the JSON Path Extractor (available via JMeter Plugins). Going forward, this option can be much easier to use against complex JSON.
See Using the XPath Extractor in JMeter guide (scroll down to "Parsing JSON") to learn more on syntax and use cases.
I suggest using the following regular expression:
"id":([^,]*),
This will first find "id": and then look for anything that is not a comma until it finds a comma. Note the character grouping is only around the value of the ID.
This will work for ANY length ID.
Edit:
The same concept works for almost any JSON data, for example where the value is quoted:
"key":"([^"]*)"
That regular expression will extract the value from given key, as long as value is quoted and does not contain quotes. It first finds "key": and then matches anything that is not a quote until the next quote.
You can use the quantifier like this:
([0-9]{2,}.+?)
It matches 2 or more digits followed by at least one more character (the lazy .+? takes as few as possible). If you want to allow no other characters after the digits, use * instead of +:
([0-9]{2,}.*?)
Regex demo
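Outside JMeter, the two approaches suggested above (a \d+ capture group versus parsing the JSON) can be compared with a short Python sketch. The function names are hypothetical, and the second sample response is invented to show a longer ID.

```python
import json
import re

response = (
    '{"id":11,"name":"value","password":null,"status":"ACTIVE",'
    '"lastIp":"0.0.0.0","lastLogin":null,"addedDate":1429090984000}'
)


def id_from_regex(text):
    """Capture one or more digits between '{"id":' and the next comma."""
    match = re.search(r'\{"id":(\d+),', text)
    return int(match.group(1)) if match else None


def id_from_json(text):
    """Parse the body as JSON and read the id field directly."""
    return json.loads(text)["id"]


# Invented response with a longer ID, to show that \d+ is not tied to a length.
longer = '{"id":1234567,"name":"value"}'

print(id_from_regex(response), id_from_json(response))
print(id_from_regex(longer), id_from_json(longer))
```

The regex version depends on the exact layout of the response. The JSON version keeps working if fields are reordered or whitespace is added.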

Jmeter Regex to return token

I have a JMeter HTTP Request that returns
{
"Token" : "VwAMVWXTakkdffdkEj1I9IiTr8DlYa89fK4yimmQNWSitIY1qBb1Qbs1FU9CfZHWMMlTed3hHOaBD7vJGNh9ZugFZuANtAomk17vIjg3Zgl1Fp0kulb6UTsbnkyyGNwNMGR"
}
in the response data. The string after the colon will change each time. I need this string to then be passed to another HTTP request. I have the rest set up but I am struggling with the regex, I get the default constantly.
Currently the regex looks like -
"Token":"(.+?)"
but doesn't work.
Can anyone help?
Thanks
Use a positive lookbehind,
(?<=\"Token\" : ).*
It matches all the characters that come right after the string "Token" :
OR
(?<=\"Token\"\s:\s\")[^\"]*
If you want the strings inside double quotes then use above regex.
Below regex would capture the matched characters,
(?<=\"Token\"\s:\s\")([^\"]*)
Your return is JSON data, so the best option would be to handle it as JSON, which also would make your application far easier to maintain and evolve, when the data you are handling gets more complicated than just one attribute. Plus, you can handle it easily in JavaScript. Suggested reads:
Jmeter extracting fields/parsing JSON response
Use BSF Postprocessor to parse JSON response and save the properties as JMeter variables
Your regular expression needs to account for whitespace. I'd recommend using the Regular Expression Extractor, which will make this a lot easier for you.
Reference Name: FOO
Regular Expression: "Token" : "(.+?)"
Template: $1$
Use corresponding variable to access the match. ${FOO}
When the match number is negative (all matches), the variables are set as follows:
FOO_matchNr - Number of matches found, possibly 0
FOO_n - (n = 1, 2, etc..) Generated by the template
FOO_n_gm - (m = 0, 1, 2) Groups for the match (n)
FOO - By itself it is always set to the default value
FOO_gn - Not set at all
You can use the regex ([^"]+) once you have received the response from the HTTP request.
Example:
"Token":([^"]+) --> Not required to add double quotation.
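The whitespace issue is easy to reproduce outside JMeter. The Python sketch below uses a shortened, hypothetical token value. It shows the original pattern failing on the response layout from the question, a whitespace-tolerant pattern succeeding, and JSON parsing as an alternative.

```python
import json
import re

# Same layout as the response in the question; the token is shortened and hypothetical.
response = '{\n"Token" : "VwAMVWXTakkd"\n}'

# Original pattern: expects no whitespace around the colon, so it finds nothing.
strict = re.search(r'"Token":"(.+?)"', response)

# Tolerant pattern: \s* allows any whitespace on either side of the colon.
tolerant = re.search(r'"Token"\s*:\s*"(.+?)"', response)

# Alternative: treat the body as JSON instead of matching text.
parsed = json.loads(response)["Token"]

print(strict)
print(tolerant.group(1))
print(parsed)
```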

Regex match multiple data in URL for IIS Rewrite rule

I'm looking for some help with a regex pattern for rewriting a URL. My URL structure is:
http://domain.com/[username]/[token]/[userid]/
The data types are:
username = alphanumeric
token = alphanumeric
userid = numeric
An example with data:
http://domain.com/john1975/aBc123/123456789/
Using a regular expression, I'm trying to capture each piece of data so I can rewrite to:
index.asp?username={R:1}&token={R:2}&userid={R:3}
Also keep in mind the regex shouldn't be too greedy, so I can still access files such as:
http://domain.com/about.asp
http://domain.com/images/logo.png
The regex I've tried is:
^[0-9a-z]+/[0-9a-z]+/[0-9]+$
This doesn't match my example URL.
You're missing the trailing forward slash. The regex should be :
^([0-9a-z]+)/([0-9a-z]+)/([0-9]+)/$
I'm assuming you're flagging it as case insensitive. If not then you need
^([0-9a-zA-Z]+)/([0-9a-zA-Z]+)/([0-9]+)/$
You also need the brackets so you can call your back references, which are also wrong: you want to match on 1, 2 and 3, not 0, which is the match of the whole expression. They should read:
index.asp?username={R:1}&token={R:2}&userid={R:3}
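The capture groups and the trailing slash can be tried out with a small Python sketch. It assumes, as IIS URL Rewrite rules normally do, that the pattern is matched against the path without the leading slash. The rewrite function is a hypothetical stand-in for the rule's action, with the groups playing the role of {R:1} to {R:3}.

```python
import re

# Same pattern as above, with case-insensitive matching instead of a-zA-Z.
RULE = re.compile(r"^([0-9a-z]+)/([0-9a-z]+)/([0-9]+)/$", re.IGNORECASE)


def rewrite(path):
    """Return the rewritten URL, or None when the rule does not apply."""
    match = RULE.match(path)
    if match is None:
        return None
    username, token, userid = match.groups()
    return f"index.asp?username={username}&token={token}&userid={userid}"


for candidate in (
    "john1975/aBc123/123456789/",
    "john1975/aBc123/123456789",
    "about.asp",
    "images/logo.png",
):
    print(candidate, "->", rewrite(candidate))
```

Only the first path is rewritten. The version without the trailing slash and the two ordinary file paths are left alone, which keeps about.asp and images/logo.png reachable.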