Regex and textmatching issue

I am doing some basic text matching in Postgres 9.3.5.0.
Here is my code so far:
Select text from eightks
WHERE other_events = true and
keywordRegexs = [\y(director and member \s+ and resigned)\y/ix];
I am getting the following errors
psql:test3.sql:3: invalid command \y(director
psql:test3.sql:5: ERROR: syntax error at or near "["
LINE 3: keywordRegexs = [
I am trying to find documents which contain those exact phrases.

The regular expression match operator in Postgres is ~.
The case insensitive variant is ~*.
Branches are enclosed in ().
SELECT text
FROM eightks
WHERE other_events = true
AND keywordregexs ~* '(\y(director | member \s+ |resigned)\y)';
The meaning of "those exact phrases" is not clear in the question.
Details in the manual.

Related

Regex in PostgreSQL

I'm ultimately trying to use the following regular expression.
SELECT *
into table
FROM table2
Where
(Description ~ '\bD\s*(&|AND|&AMP;|N|AMP|\*|\+)\s*B.*')
However this returns the following errors:
[XX000] ERROR: Invalid preceding regular expression prior to repetition operator. The error occured while parsing the regular expression fragment: 'P;|N|AMP|>>>HERE>>>|+)sB.'. Detail: ----------------------------------------------- error: Invalid preceding regular expression prior to repetition operator. The error occured while parsing the regular expression fragment: 'P;|N|AMP|>>>HERE>>>|+)sB.'. code: 8 ...
Any idea on the fix?

Replace \b with \y (or \m) to fix the pattern: in PostgreSQL regular expressions \b means a backspace character, while \y matches a word boundary and \m matches only at the start of a word. You can also move the single characters out of the alternation in the capturing group and into a character class, where you do not have to escape them: (&|\*|\+) -> [*+&]. Note that you do not need .* at the end when you only check for a regex match with ~.
Use
'\yD\s*(AND|&AMP;|N|AMP|[*+&])\s*B'
See the online demo:
CREATE TABLE tb1
(website character varying)
;
INSERT INTO tb1
(website)
VALUES
('D AND B...'),
('ROCK''N''ROLL'),
('www.google.com'),
('More text here'),
('D N Brother')
;
SELECT * FROM tb1 WHERE website ~ '\yD\s*(AND|&AMP;|N|AMP|[*+&])\s*B';
Output: the rows 'D AND B...' and 'D N Brother'.

C# Regex Match start and end

How can I get the following result using Regex in C#?
string input = "<P>With effect from <<Effective Date>>, the xyz is amended as follows:</P><P>The xyz will xyz the Insured for Claims including x amount Costs or Legal Fees which arise out of or in xyz with <<Description of xyz/abc>>.</P><P>All other terms and conditions of the dddd remain unchanged.</P>";
Regex r = new Regex("Regular expression needed!!!");
So I am looking for the following field collection using Regex (starting with the special characters << and ending with >>):
<<Effective Date>>
<<Description of xyz/abc>>

Questions like this usually need to show some effort, rather than a new regular expression object with "Regular expression needed!!!" as the pattern. Next time, please state the exact problem and what you have already attempted.
To get you started, you can use the following.
foreach (Match m in Regex.Matches(input, @"<<[^>]*>>"))
    Console.WriteLine(m.Value);
Explanation:
<< # '<<'
[^>]* # any character except: '>' (0 or more times)
>> # '>>'
Working Demo
Here are a few references to start your path of learning regular expressions.
Regular-Expressions.info
Quick-Start: Regex Cheat Sheet
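
The pattern itself is not specific to .NET, so it can be sanity-checked outside C#. Here is a small sketch, assuming a grep that supports -o. The shortened input string and the variable names are made up for illustration.

```shell
#!/bin/sh
# Shortened, hypothetical version of the input string from the question.
input='<P>With effect from <<Effective Date>>, ... in xyz with <<Description of xyz/abc>>.</P>'

# -o prints each match on its own line; '<<[^>]*>>' is the same pattern
# that the C# snippet passes to Regex.Matches.
fields=$(printf '%s\n' "$input" | grep -o '<<[^>]*>>')
printf '%s\n' "$fields"
```

Because [^>]* stops at the first '>', a placeholder whose name itself contains a '>' would not be matched by this pattern.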

Confusion in regex pattern for search

I am learning regex in bash and trying to fetch all lines which end with .com.
Initially I did:
cat patternNpara.txt | egrep "^[[:alnum:]]+(.com)$"
Why: + matches one or more occurrences, so placing it after alnum should fetch the occurrence of any digit, word or sign, but apparently this logic is failing.
Then I did this (purely hit-and-try, not really applying any logic) and it worked:
cat patternNpara.txt | egrep "^[[:alnum:]].+(.com)$"
What's confusing me: . matches only a single occurrence, so how am I getting the output? I mean, how is it really matching the pattern?
Question: what's the difference between [[:alnum:]]+ and [[:alnum:]].+ (this one has . in it) in the above matching pattern, and how does it work?
PS: I am looking for a possible explanation, not a "try it this way" thing... :)
Some test lines for the file patternNpara.txt which are fetched as output:
valid email = abc@abc.com
invalid email = ab@abccom
another invalid = abc@.com
1 : abc,s,11@gmail.com
2: abc.s.11@gmail.com

Looking at your screenshot, it seems you're trying to match an email address, which also contains the @ character; that character is not included in your regex. You can use this regex:
egrep "[@[:alnum:]]+(\.com)" patternNpara.txt
Difference between the two regexes:
[[:alnum:]] matches only [a-zA-Z0-9]. If you have @ or , then you need to include them in the character class as well.
Your 2nd case includes the .+ pattern, which means 1 or more matches of ANY CHARACTER.

If you want to match any lines that end with '.com', you should use
egrep ".*\.com$" file.txt
To match all the following lines
valid email = abc@abc.com
invalid email = ab@abccom
another invalid = abc@.com
1 : abc,s,11@gmail.com
2: abc.s.11@gmail.com
^[[:alnum:]].+(.com)$ will work, but ^[[:alnum:]]+(.com)$ will not. Here are the reasons:
^[[:alnum:]].+(.com)$ matches strings that start with one character from a-zA-Z or 0-9, followed by two or more arbitrary characters (one or more from .+ plus one from the unescaped . in the group), and end with 'com' (not '.com').
^[[:alnum:]]+(.com)$ matches strings that start with one or more characters from a-zA-Z or 0-9, followed by exactly one character that could be anything, and end with 'com' (not '.com').
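
The difference can be checked directly on the sample lines. This is a minimal sketch, assuming a grep that accepts -E (the modern spelling of egrep) and a scratch file created with mktemp. The variable names are made up for illustration.

```shell
#!/bin/sh
# Recreate the sample lines from the question in a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
valid email = abc@abc.com
invalid email = ab@abccom
another invalid = abc@.com
1 : abc,s,11@gmail.com
2: abc.s.11@gmail.com
EOF

# Only alphanumerics are allowed between the line start and the last four
# characters, so the spaces, '=', ':' and '@' make every line fail.
strict=$(grep -cE '^[[:alnum:]]+(.com)$' "$sample")

# '.+' accepts any characters after the first alphanumeric, so every line
# matches, including the one ending in 'abccom' (the dot is unescaped).
loose=$(grep -cE '^[[:alnum:]].+(.com)$' "$sample")

# Escaping the dot requires a literal '.com' at the end of the line.
escaped=$(grep -cE '\.com$' "$sample")

echo "strict=$strict loose=$loose escaped=$escaped"
rm -f "$sample"
```

Running it prints strict=0 loose=5 escaped=4: the first pattern matches nothing, the second matches every line, and only the escaped dot rejects the line ending in 'abccom'.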

Try this (with "positive-lookahead") :
.+(?=\.com)
Demo :
http://regexr.com?38bo0

Regular expression extract filename from line content

I'm very new to regular expressions. I want to extract the following string
"109_Admin_RegistrationResponse_20130103.txt"
from this file content, which is read line by line:
01-10-13 10:44AM 47 107_Admin_RegistrationDetail_20130111.txt
01-10-13 10:40AM 11 107_Admin_RegistrationResponse_20130111.txt
The regular expression should not pick the second line, only the first line should return a true.

Your Regex has a lot of different mistakes...
Your line does not start with your required filename but you put an ^ there
missing + in your character group [a-zA-Z], hence only able to match a single character
does not include _ in your character group, hence it won't match Admin_RegistrationResponse
The \ is missing, so d{2} would match only the literal dd.
As per M42's answer (which I left out), you also need to escape your dot . too, or it would match 123_abc_12345678atxt too (notice the a before txt)
Your regex should be
\d+_[a-zA-Z_]+_\d{4}\d{2}\d{2}\.txt$
which can be simplified as
\d+_[a-zA-Z_]+_\d{8}\.txt$
as \d{4}\d{2}\d{2} is just a redundant way of writing \d{8}, unless you want capturing groups, in which case you would do:
\d+_[a-zA-Z_]+_(\d{4})(\d{2})(\d{2})\.txt$
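
The simplified pattern can be tried from a shell. This is a rough sketch, assuming a grep that supports -E and -o. POSIX extended regular expressions have no \d, so [0-9] stands in for it here, and the variable names are illustrative only.

```shell
#!/bin/sh
# The two sample lines from the question.
listing='01-10-13 10:44AM 47 107_Admin_RegistrationDetail_20130111.txt
01-10-13 10:40AM 11 107_Admin_RegistrationResponse_20130111.txt'

# -o prints only the matched part of each line; [0-9] replaces \d for POSIX ERE.
all_files=$(printf '%s\n' "$listing" | grep -oE '[0-9]+_[a-zA-Z_]+_[0-9]{8}\.txt$')

# The generic pattern matches both file names. To pick a single report type,
# spell out the fixed part instead of using [a-zA-Z_]+.
response_only=$(printf '%s\n' "$listing" | grep -oE '[0-9]+_Admin_RegistrationResponse_[0-9]{8}\.txt$')

printf '%s\n' "$all_files"
echo "---"
printf '%s\n' "$response_only"
```

Note that the generic pattern cannot tell the two sample lines apart. If only one kind of file should match, the distinguishing part of the name has to appear literally in the pattern.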

Remove the anchors and escape the dot:
\d+[a-zA-Z_]+\d{8}\.txt

I'm a newbie in PHP, but I think you can use the explode() function in PHP or an equivalent in your language.
$string = "01-09-13 10:17AM 11 109_Admin_RegistrationResponse_20130103.txt";
$pieces = explode("_", $string);
$stringout = "";
// Note: this joins the pieces back together without the underscores.
for ($i = 0; $i < count($pieces); $i++) {
    $stringout = $stringout . $pieces[$i];
}

groovy: how to escape "(" regex etc in textarea?

I have some text area field in my grails application. I got the following errors:
.PatternSyntaxException: Unmatched closing ')' near index 36 Name: note: 1) data listing ....
How could I escape the regular expression characters in the text area field?
thanks.

The same as in a literal: place a backslash before it:
\(
EDIT: But if none of the characters from the text area should be treated specially, try:
String escapedContents = java.util.regex.Pattern.quote(textArea.getText());
