Oracle Regex Patterns

I have 3 regex patterns from Oracle 10/11 that I need to port to Oracle 9. However, I am getting different results when I try to do the matching. Is there a way to fix this? Is it even possible to port these patterns and get the same results when matching?
Here are the patterns:
'^[[:alnum:]\_]$' which I would think would map to '[A-Za-z0-9_]*$'
'^[[:alpha:]]$' which I would think would map to '^[A-Za-z]*$'
'^(\"|\'')' which I would think needs to map over into 2 patterns '^["]*$' or '^['']*$'
owa pattern docs
oracle 10g regex docs
EDIT:
So my question was originally part of a lexer that was giving me trouble in a parser. It turns out my problem was related to a type (and not the regex). My regex patterns match closely enough (and I actually changed the :alnum pattern to use "\w" instead of the other pattern).
There's one thing I still don't understand: what do the double enclosing brackets ([[) in the example regex mean?
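On the double brackets: in a POSIX bracket expression the outer [ ] delimit the set, and the inner [:alnum:] is a named character class inside that set. The sketch below is only an illustration in Python, whose re module does not support POSIX class names, so ASCII ranges stand in for them (an assumption: the Oracle classes may accept more characters depending on database settings). It shows that adding * to the ported pattern changes what matches, which may explain some of the differing results.

```python
import re

# ASCII stand-ins for the POSIX classes. This is an assumption: the Oracle
# classes may also accept non-ASCII characters depending on settings.
ONE_WORD_CHAR = re.compile(r'^[A-Za-z0-9_]$')    # like the original, exactly one character
ANY_WORD_CHARS = re.compile(r'^[A-Za-z0-9_]*$')  # the proposed port, with '*' added

def matches(pattern, text):
    """Return True when the pattern finds a match in text."""
    return pattern.search(text) is not None

for sample in ["a", "ab", "", "a-b"]:
    print(repr(sample), matches(ONE_WORD_CHAR, sample), matches(ANY_WORD_CHARS, sample))
```

The original pattern accepts only a single character, while the version with * also accepts the empty string and longer strings.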

Related

REGEX number not in a list failing with a long list

I have a list of the following numbers and want a Regular expression that matches when a number is not in the list.
0,1,2,3,4,9,11,12,13,14,15,16,18,19,250
I have written the following REGEX statement.
^(?!.*(0|1|2|3|4|9|11|12|13|14|15|16|18|19|250)).*$
The problem is that it correctly gives a match for 5,6,7,8 etc but not for 17 or 251 for example.
I have been testing this on the online REGEX simulators.

This should resolve your issue..
^(?!\D*(0|1|2|3|4|9|11|12|13|14|15|16|18|19|250)\b).*$
In your earlier regex, the .* inside the lookahead lets the alternation match anywhere in the input, so you were effectively rejecting every number that contains any of the digits 0, 1, 2, 3, 4 or 9.
So your original regex matches 5, 6, 7, 8 or 88 but not 17 or 251, which both contain a 1. It also made the 11-19 and 250 alternatives in the list redundant.
As others have, I would also recommend not using regex for this, as I believe it is overkill and a maintenance nightmare.
As an extra note, lookarounds over variable-length input are also inefficient compared with regular checks.
I would use \b\d+\b to get each number in the string and check whether it is in your list. That would likely be much faster.
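A minimal sketch of both ideas from this answer, written in Python because the question does not name an engine (an assumption). The function name numbers_not_in_list is made up for illustration.

```python
import re

# The numbers from the question that must NOT produce a match.
EXCLUDED = {0, 1, 2, 3, 4, 9, 11, 12, 13, 14, 15, 16, 18, 19, 250}

# The corrected lookahead from this answer, for comparison.
LOOKAHEAD = re.compile(r'^(?!\D*(0|1|2|3|4|9|11|12|13|14|15|16|18|19|250)\b).*$')

def numbers_not_in_list(text):
    """Pull out each whole number with \\b\\d+\\b and keep those not in EXCLUDED."""
    return [int(tok) for tok in re.findall(r'\b\d+\b', text) if int(tok) not in EXCLUDED]

print(numbers_not_in_list("5, 17, 11, 250, 251"))  # [5, 17, 251]
print(bool(LOOKAHEAD.match("17")), bool(LOOKAHEAD.match("11")))  # True False
```

The set lookup keeps the list of numbers as data, so extending it does not require touching the regex.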

You can use the discard technique by matching what you do not want and capturing what you really want.
You can use a regex like this:
\b(?:[0-49]|1[1-689]|250)\b|(\d+)
Here you can check a working demo where in blue you have the matches (what you don't want) and in green the content you want. Then you have to grab the content from the capturing group
Working demo
Not sure what regex engine you are using, but here I created a sample using java:
https://ideone.com/B7kLe0
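The same discard technique can be sketched in Python (an assumption, since the engine is unknown; the linked sample uses Java). The left side of the alternation consumes the listed numbers, and only the capturing group on the right holds a wanted number. The helper name wanted_numbers is hypothetical.

```python
import re

# Left alternative: listed numbers, matched and thrown away.
# Right alternative: any other number, kept in group 1.
PATTERN = re.compile(r'\b(?:[0-49]|1[1-689]|250)\b|(\d+)')

def wanted_numbers(text):
    """Return only the numbers captured by group 1."""
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

print(wanted_numbers("0 5 11 17 250 251"))  # ['5', '17', '251']
```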

Regular expression not working in google analytics

I'm trying to build a regular expression to capture URLs which contain a certain parameter 7136D38A-AA70-434E-A705-0F5C6D072A3B
I've set up a simple regex to capture a URL with anything before and anything after this parameter (that is, all URLs which contain this parameter). I've tested this on an online checker, http://scriptular.com/, and it seems to work fine. However, Google Analytics says it is invalid when I try to use it. Any idea what is causing this?
Url will be in the format
/home/index?x=23908123890123&y=kjdfhjhsfd&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd
So I just want to capture URLs that contain that specific "z" parameter.
regex
^.+(?=7136D38A-AA70-434E-A705-0F5C6D072A3B).+$

You just need
^.+=7136D38A-AA70-434E-A705-0F5C6D072A3B.+$
Or (a bit safer):
^.+=7136D38A-AA70-434E-A705-0F5C6D072A3B($|&.+$)
And I think you can even use
=7136D38A-AA70-434E-A705-0F5C6D072A3B($|&)
See demo
Your regex is invalid because GA regex flavor does not support look-arounds (and you have a (?=...) positive look-ahead in yours).
Here is a good GA regex cheatsheet.
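As a rough check of the shortest pattern above, here is a sketch that uses Python's re module as a stand-in for the GA engine (an assumption, the two flavors are not identical). The pattern uses no look-arounds, and the sample URLs other than the one from the question are made up.

```python
import re

GUID = "7136D38A-AA70-434E-A705-0F5C6D072A3B"

# '=' + GUID, followed by either end of string or the next '&' parameter.
PATTERN = re.compile(r'=' + re.escape(GUID) + r'($|&)')

def has_parameter(url):
    """Return True when the URL carries the GUID as a complete parameter value."""
    return PATTERN.search(url) is not None

print(has_parameter("/home/index?x=23908123890123&y=kjdfhjhsfd&z=" + GUID + "&p=kljdaslkjasd"))  # True
print(has_parameter("/home/index?z=" + GUID + "EXTRA"))  # False
```

Anchoring on ($|&) is what stops a longer value that merely starts with the GUID from matching.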

To match /home/index?x=23908123890123&y=kjdfhjhsfd&z=7136D38A-AA70-434E-A705-0F5C6D072A3B&p=kljdaslkjasd you can use:
\S*7136D38A-AA70-434E-A705-0F5C6D072A3B\S*

Filter by regex example

Could anyone provide an example of a regex filter for the Google Chrome Developer toolbar?
I especially need exclusion. I've tried many regexes, but somehow they don't seem to work:

It turned out that Google Chrome actually didn't support this until early 2015, see Google Code issue. With newer versions it works great, for example excluding everything that contains banners:
/^(?!.*?banners)/

It's possible -- at least in Chrome 58 Dev. You just need to wrap your regex with forward-slashes: /my-regex-string/
For example, this is one I'm currently using: /^(.(?!fallback font))+$/
It successfully filters out any messages that contain the substring "fallback font".
EDIT
Something else to note is that if you want to use the ^ (caret) symbol to search from the start of the log message, you have to first match the "fileName.js?someUrlParam:lineNumber " part of the string.
That is to say, the regex is matching against not just the log message, but also the stack-entry for the line which made the log.
So this is the regex I use to match all log messages where the actual message starts with "Dog":
/^.+?:[0-9]+ Dog/
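To see how these two filters behave, here is a small sketch that runs them in Python instead of DevTools (an assumption: both patterns use only syntax the two engines share). The log lines are invented and follow the "fileName.js?someUrlParam:lineNumber message" shape described above.

```python
import re

# Exclusion filter: every character must not be followed by "fallback font".
EXCLUDE_FALLBACK = re.compile(r'^(.(?!fallback font))+$')
# Messages whose text starts with "Dog", after the file and line prefix.
STARTS_WITH_DOG = re.compile(r'^.+?:[0-9]+ Dog')

# Hypothetical console lines, including the stack-entry prefix.
lines = [
    "app.js?v=3:12 Dog barked",
    "app.js?v=3:40 Using fallback font for glyph",
    "util.js:7 Cat meowed",
]

kept = [line for line in lines if EXCLUDE_FALLBACK.search(line)]
dogs = [line for line in lines if STARTS_WITH_DOG.search(line)]

print(kept)  # ['app.js?v=3:12 Dog barked', 'util.js:7 Cat meowed']
print(dogs)  # ['app.js?v=3:12 Dog barked']
```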

The negative or exclusion case is much easier to write and think about when using the DevTool's native syntax. To provide the exclusion logic you need, simply use this:
-/app/ -/some\sother\sregex/
The "-" prior to the regex makes the result negative.

Your expression should not contain the forward slashes and /s, these are not needed for crafting a filter.
I believe your regex should finally read:
!(appl)
Depending on what exactly you want to filter.
The regex above will filter out all lines without the string "appl" in them.
edit: apparently exclusion is not supported?

Looking for a regex to match more than one reference string in TortoiseSVN

We used two different methods to reference external documents and Bugzilla bug numbers.
I'm now looking for a regular expression that matches these two kinds of reference strings for convenient display and linking in the TortoiseSVN 1.6.16 log screen. The first is a Bugzilla entry of the form [BZ#123]; the second is [some text and numbers], which should not be converted into a URL.
This can be matched with
\[BZ#\d+\]
and
\[.*?\]
My problem now is to concatenate those two match strings together. Usually this would be done by the regex (first|second), and I've done it this way:
(\[.*?\]|\[BZ#\d+\])
Unfortunately, in this case TortoiseSVN seems to treat the whole match as the bug number because of the round braces. Even if I add a second expression, which (according to the documentation) is meant to be used to extract the issue number itself, this second expression seems to be ignored:
(\[.*?\]|\[BZ#\d+\])
\[BZ#(\d+)\]
In this case TortoiseSVN displays the bug and document references correctly in the separate column, but uses them completely for the bugtracker url, which is of course not working:
https://mybugzillaserver/show_bug.cgi?id=[BZ#949]
BTW, Mercurial uses a better way by using {1}, {2}, ... as the placeholder in URLs.
Has anybody an idea how to solve this problem?
EDIT
In short: We have used [BZ#123] as bug number references and [anytext] as references to other (partly non-electronic) documents. We would like to have both patterns listed in TortoiseSVN's extra column, but only the bug number from the first part should be used as %BUGID% in the URL string.
EDIT 2
Supposedly TortoiseSVN cannot handle nested regex groups (round braces), so this question doesn't have any satisfactory answer at the moment.

I'm not familiar with TortoiseSVN regex, but it looks like the problem is that the first piece of the regex (\[.*?\]) would always match, so you would never even get to evaluating the second part, \[BZ#(\d+)\]
Try this one instead:
((?<=\[BZ#)\d+(?=\])|\[.*?\])
Explanation:
( #Opening group.
(?<=\[BZ#) #Look behind for a bugzilla placeholder.
\d+ #Capture just the digits.
(?=\]) #Look ahead for the closing bracket (probably not necessary.)
| #Or, if that fails,
\[.*?\] #Find all other placeholders.
) #Closing the group.
Edit: I've just looked at TortoiseSVN docs. You could also try to keep the Message part expression the same, but change the Bug-ID expression to:
(?<=\[BZ#)(\d+)(?=\])
Edit: ?<= represents a zero-width lookbehind. See http://www.regular-expressions.info/lookaround.html. It is possible that TortoiseSVN doesn't support lookbehinds.
What happens if you just use (\d+) for your Bug-ID expression?
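The two-expression idea can be tried outside TortoiseSVN. The sketch below uses Python's re module, which does support lookbehind, so it says nothing about whether TortoiseSVN's engine does (that remains an open question here). The function name split_references and the sample log message are made up.

```python
import re

MESSAGE_PART = re.compile(r'\[.*?\]')          # any bracketed reference
BUG_ID = re.compile(r'(?<=\[BZ#)(\d+)(?=\])')  # digits inside [BZ#...] only

def split_references(log_message):
    """Return (all bracketed references, bug numbers found inside them)."""
    references = MESSAGE_PART.findall(log_message)
    bug_ids = [m.group(1) for ref in references for m in BUG_ID.finditer(ref)]
    return references, bug_ids

refs, ids = split_references("Fix crash [BZ#949] per [Spec 12 rev B]")
print(refs)  # ['[BZ#949]', '[Spec 12 rev B]']
print(ids)   # ['949']
```

Both references are listed, but only 949 would be available for the %BUGID% placeholder.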

CAtlRegExp for a regular expression that matches 4 characters max

Short version:
How can I get a regex that matches a#a.aaaa but not a#a.aaaaa using CAtlRegExp ?
Long version:
I'm using CAtlRegExp http://msdn.microsoft.com/en-us/library/k3zs4axe(VS.80).aspx to try to match email addresses. I want to use the regex
^[A-Z0-9._%+-]+#(?:[A-Z0-9-]+\.)+[A-Z]{2,4}$
extracted from here.
But the syntax that CAtlRegExp accepts is different than the one used there. This regex returns the error REPARSE_ERROR_BRACKET_EXPECTED, you can check for yourself using this app: http://www.codeproject.com/KB/string/mfcregex.aspx
Using said app, I created this regex:
^[a-zA-Z0-9\._%\+\-]+#([a-zA-Z0-9-]+\.)+[a-zA-Z]$
But the problem is that this matches a#a.aaaaa as valid; I need it to match 4 characters maximum for the top-level domain.
So, how can I get a regex that matches a#a.aaaa but not a#a.aaaaa ?

Try: ^[a-zA-Z0-9\._%\+\-]+#([a-zA-Z0-9-]+\.)+\c\c\c?\c?$
This expression replaces the [A-Z]{2,4} sequence which CAtlRegExp doesn't support with \c\c\c?\c?
\c serves as an abbreviation of [a-zA-Z]. The question marks after the 3rd and 4th \c's indicate they can match either zero or one characters. As a result, this portion of the expression matches 2, 3 or 4 characters, but neither more nor less.
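The equivalence can be checked outside ATL. This sketch uses Python, which has no \c shorthand, so each \c is written out as [a-zA-Z] (an assumption based on the description above) and compared with the {2,4} form. The # separator is kept as it appears in the question.

```python
import re

LETTER = "[a-zA-Z]"  # stands in for CAtlRegExp's \c
PREFIX = r'^[a-zA-Z0-9._%+\-]+#([a-zA-Z0-9-]+\.)+'

# Two required letters followed by two optional ones.
EXPANDED = re.compile(PREFIX + LETTER + LETTER + LETTER + '?' + LETTER + '?$')
# The bounded quantifier that CAtlRegExp rejects.
BOUNDED = re.compile(PREFIX + LETTER + '{2,4}$')

for sample in ["a#a.a", "a#a.aa", "a#a.aaaa", "a#a.aaaaa"]:
    print(sample, bool(EXPANDED.match(sample)), bool(BOUNDED.match(sample)))
```

Both patterns accept a#a.aa and a#a.aaaa and reject a#a.a and a#a.aaaaa.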

You are trying to match email addresses, a very widely used critical element of internet communication.
To which I would say that this job is best done with the most widely used most correct regex.
Since email address format rules are described by RFC822, it seems useful to do internet searches for something like "RFC822 email regex".
For Perl the answer seems to be easy: use Mail::RFC822::Address: regexp-based address validation
RFC 822 Email Address Parser in PHP
Thus, to achieve the most correct handling of email addresses, one should either locate the most precise regex that there is out somewhere for the particular toolkit (ATL in your case) or - in case there's no suitable existing regex yet - adapt a very precise regex of another toolkit (Perl above seems to be a very complete albeit difficult candidate).
If you're trying to match a specific sub part of email addresses (as seems to be the case given your question), then it probably still makes sense to start with the most up-to-date/correct/universal regex and specifically limit it to the parts that you require.
Perhaps I stated the obvious, but I hope it helped.
