Postgres: match a character X but not XX with a regex

Hi, I'm trying to find strings in a Postgres table that contain '=' but not '=='. If I use the following search:
SELECT * FROM someTable where someColumn ~ ' R ';
I find all strings with R. But I want to exclude the ones that contain RR. However, if a string is 'something R other RR other', I would still like to have it in the result.
Can you give me some tips on how to resolve this?
Thanks.

You can try something like this: SELECT * FROM someTable where someColumn ~ ' R[^R] ';
This matches an R that is followed by a character other than R. Note that [^R] has to consume a character, so an R at the very end of the string is not matched.

If you want to use a regex, the word-boundary escape \y can be used here:
select * from your_table where s ~ '\yR\y';
See PostgreSQL documentation:
\y matches only at the beginning or end of a word
See an online test:
CREATE TABLE table1
(s character varying)
;
INSERT INTO table1
(s)
VALUES
('R'),
('that are RR'),
('that are R')
;
select * from table1 where s ~ '\yR\y';
Output:
s
1 R
2 that are R
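The pattern logic can also be checked outside the database. Below is a minimal sketch in Python, assuming its re module, where \b stands in for the \y of PostgreSQL. The lookaround variant for '=' versus '==' is an extra assumption that is not part of the answers above, and whether the same syntax is available depends on the PostgreSQL version in use.

```python
import re

# \b in Python's re is used here as a stand-in for PostgreSQL's \y.
LONE_R = re.compile(r"\bR\b")

# Hypothetical variant for a non-word character such as '=': word
# boundaries do not help there, so lookarounds reject a neighbouring '='.
LONE_EQ = re.compile(r"(?<!=)=(?!=)")


def has_lone_r(s):
    """True if s contains an R that is a word on its own."""
    return LONE_R.search(s) is not None


def has_lone_eq(s):
    """True if s contains an '=' with no '=' directly before or after it."""
    return LONE_EQ.search(s) is not None


samples = ["R", "that are RR", "that are R", "something R other RR other"]
lone_r_rows = [s for s in samples if has_lone_r(s)]
print(lone_r_rows)
```

A string that contains both a lone R and an RR is kept, which is what the question asks for.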

Related

Extract all substrings bounded by the same characters

Given a name_loc column of text like the following:
{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}
I'm trying to extract the names, ideally separated by commas:
Charlie, Wrigley, Ana
I've gotten this far:
SELECT SUBSTRING(CAST(name_loc AS VARCHAR) from '"([^ –]+)')
FROM table;
which returns
Charlie
How can I extend this query to extract all names?
You can do this with a combination of regexp_matches (to extract the names), array_agg (to regroup all matches in a row) and array_to_string (to format the array as you'd like, e.g. with a comma separator):
WITH input(name_loc) AS (
VALUES ('{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}')
, ('{"Other - somewhere"}') -- added this to show that multiple rows do not get merged
)
SELECT array_to_string(names, ', ')
FROM input
CROSS JOIN LATERAL (
SELECT array_agg(name)
FROM regexp_matches(name_loc, '"(\w+)', 'g') AS f(name)
) AS f(names);
array_to_string
Charlie, Wrigley, Ana
Other
My two cents, though I'm rather new to PostgreSQL and had to copy the first piece from @Marth's answer:
WITH input(name_loc) AS (
VALUES ('{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}')
, ('{"Other - somewhere"}')
)
SELECT REGEXP_REPLACE(name_loc, '{?(,)?"(\w+)[^"]+"}?','\1\2', 'g') FROM input;
regexp_replace
Charlie,Wrigley,Ana
Other
Your string literal happens to be a valid array literal.
(Maybe not by coincidence? And the column should be type text[] to begin with?)
If that's the reliable format, there is a safe and simple solution:
SELECT t.id, x.names
FROM tbl t
CROSS JOIN LATERAL (
SELECT string_agg(split_part(elem, ' – ', 1), ', ') AS names
FROM unnest(t.name_loc::text[]) elem
) x;
Or:
SELECT id, string_agg(split_part(elem, ' – ', 1), ', ') AS names
FROM (SELECT id, unnest(name_loc::text[]) AS elem FROM tbl) t
GROUP BY id;
Steps
Unnest the array with unnest() in a LATERAL CROSS JOIN, or directly in the SELECT list.
See: What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
Take the first part with split_part(). I chose ' – ' as delimiter, not just ' ', to allow for names with nested space like "Anne Nicole". See:
Split comma separated column data into additional columns
Aggregate results with string_agg(). I added no particular order as you didn't specify one.
See: Concatenate multiple result rows of one column into one, group by another column
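The steps above can be sketched outside the database as well. This is only an illustration in Python, assuming the array has already been unnested into a list of strings; extract_names is a hypothetical helper name, not something from the answer.

```python
def extract_names(elements, delimiter=" – "):
    """Return the part of each element before the delimiter, comma-joined."""
    # Mirrors split_part(elem, ' – ', 1): keep the text before the first
    # delimiter, or the whole element if the delimiter is absent.
    first_parts = [elem.split(delimiter, 1)[0] for elem in elements]
    # Mirrors string_agg(..., ', '): join the pieces with a comma separator.
    return ", ".join(first_parts)


row = [
    "Charlie – White Plains, NY",
    "Wrigley – Minneapolis, MN",
    "Ana – Decatur, GA",
]
names = extract_names(row)
print(names)
```

Splitting on ' – ' rather than on a single space keeps names such as "Anne Nicole" intact.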

Postgres regexp replace not working

I'm trying to create a regexp for this query:
SELECT gruppo
FROM righe_conto_ready
WHERE regexp_replace(gruppo,'(\[{1})|(\].*?\[)|(\].*$)','','g') = '[U6][U53]'
LIMIT 10
This is an example of 'gruppo' column:
[U6] CAFFETTERIA [U43] THE E TISANE
I'm currently using this query for testing:
SELECT regexp_replace(gruppo,'(\[{1})|(\].*?\[)|(\].*$)','','g') FROM ....
and it returns just U6
How can I change the regexp to remove everything outside the brackets?
You can use regexp_matches() with the much simpler regular expression:
with righe_conto_ready(gruppo) as (
select '[U6] CAFFETTERIA [U43] THE E TISANE'::text
)
select gruppo
from righe_conto_ready,
lateral regexp_matches(gruppo, '\[.+?\]', 'g') matches
group by 1
having string_agg(matches[1], '') = '[U6][U43]'
gruppo
-----------------------------------------
[U6] CAFFETTERIA [U43] THE E TISANE
(1 row)
When you are looking for multiple matches of some pattern, regexp_matches() seems more natural than regexp_replace().
You can also search for first two substrings in brackets (without the g flag the function yields no more than one row):
select gruppo
from righe_conto_ready,
lateral regexp_matches(gruppo, '(\[.+?\]).*(\[.+?\])') matches
where concat(matches[1], matches[2]) = '[U6][U43]'
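The "collect every bracketed piece, then concatenate" idea can be tried quickly outside the database. A minimal sketch in Python, assuming its re module handles the non-greedy pattern the same way for this input; bracket_codes is a hypothetical helper name.

```python
import re

# Same pattern as in the regexp_matches() call: a '[' followed by the
# shortest possible run of characters up to the next ']'.
BRACKETED = re.compile(r"\[.+?\]")


def bracket_codes(gruppo):
    """Concatenate every bracketed substring, dropping the text in between."""
    return "".join(BRACKETED.findall(gruppo))


codes = bracket_codes("[U6] CAFFETTERIA [U43] THE E TISANE")
print(codes)
```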

Use Regex from a column in Redshift

I have 2 tables in Redshift, one of them has a column containing Regex strings. And I want to join them like so:
select *
from one o
join two t
on o.value ~ t.regex
But this query throws an error:
[Amazon](500310) Invalid operation: The pattern must be a valid UTF-8 literal character expression
Details:
-----------------------------------------------
error: The pattern must be a valid UTF-8 literal character expression
code: 8001
context:
query: 412993
location: cgx_impl.cpp:1911
process: padbmaster [pid=5211]
-----------------------------------------------;
As far as I understood from searching in the docs, the right side of a regex operator ~ must be a string literal.
So this would work:
select *
from one o
where o.value ~ 'regex'
And this would fail:
select *
from one o
where 'regex' ~ o.value
Is there any way around this? Anything I missed?
Thanks!
Here's a workaround I am using. Maybe it's not super fast, but it works:
First create a function:
CREATE FUNCTION is_regex_match(pattern text, s text) RETURNS BOOLEAN IMMUTABLE AS $$
import re
return True if re.search(pattern, s) else False
$$ LANGUAGE plpythonu;
Then use it like this (o.value contains a regex pattern):
select *
from one o
where is_regex_match(o.value, 'some string');
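The body of the function can be sanity-checked locally before creating the UDF. A rough sketch, assuming a local Python interpreter behaves like the one Redshift uses for this simple call; the sample values and patterns are made up for illustration.

```python
import re


def is_regex_match(pattern, s):
    # Same body as the UDF above.
    return True if re.search(pattern, s) else False


# Hypothetical stand-ins for the two tables.
values = ["abc123", "hello", "2024-01-01"]
regexes = [r"\d+", r"^h"]

# Emulates the intended join: every (value, regex) pair where the
# value matches the regex.
pairs = [(v, r) for v in values for r in regexes if is_regex_match(r, v)]
print(pairs)
```

Note the argument order: the pattern comes first, the string to search comes second.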
You could try using the built-in function regexp_substr()
https://docs.aws.amazon.com/redshift/latest/dg/REGEXP_SUBSTR.html
select *
from one o
join two t
on regexp_substr(o.value, t.regex) <> ''
Edit: added an example of a raw query.
It appears that the fields must be explicitly cast as varchars when built.
with fake_table as (
SELECT 'sample value'::varchar as value, '[a-z]'::varchar as pattern
)
SELECT *
, regexp_substr(value, pattern)
FROM
fake_table
WHERE
regexp_substr(value, pattern) <>''

SQL pattern matching using regular expression

Can we use regex, i.e. regular expressions, in SQL Server? I'm using SQL Server 2012 and 2014, and there is a requirement to match and return input from my stored procedure.
I can't use LIKE in this situation, since LIKE only returns matching words. Using regex I could match a whole bunch of characters such as space, hyphen and numbers.
Here is my SP
--Suppose XYZ P is my Search Condition
Declare @Condition varchar(50) = 'XYZ P'
CREATE PROCEDURE [dbo].[usp_MATCHNAME]
@Condition varchar(25)
as
Begin
select * from tblPerson
where UPPER(Name) like UPPER(@Condition) + '%'
-- It should return both XYZ P and xyzp
End
Here my SP returns all rows matching the condition Name = 'XYZ P', but how can I also retrieve rows whose Name is one of [XYZP, XYZ-P]?
And if the search condition contains an alphanumeric value like
--Suppose XYZ 1 is my Search Condition
Declare @Condition varchar(50) = 'XYZ 1'
Then my search result should also return nonspace value like [XYZ1, xyz1, Xyz -1].
I don't want to use Substring by finding space and splitting them based on space and then matching.
Note: My input condition, i.e. @Condition, can contain a space, no space, or a hyphen (-) when the stored procedure is executed.
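Use the REPLACE function.
It replaces each single space with %, so it will return your expected results. Keep in mind that % matches any sequence of characters, so 'XYZ P' will also match names such as 'XYZ ABC P':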
SELECT *
FROM tblPerson
WHERE UPPER(Name) LIKE REPLACE(UPPER(@Condition), ' ', '%') + '%'
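To see which names such a pattern accepts, the LIKE comparison can be emulated outside SQL Server. This is a rough sketch in Python that only handles the % and _ wildcards (bracket expressions in LIKE are ignored); like_to_regex and name_matches are hypothetical helper names.

```python
import re


def like_to_regex(like_pattern):
    """Translate a LIKE pattern using only % and _ into a compiled regex."""
    parts = []
    for ch in like_pattern:
        if ch == "%":
            parts.append(".*")   # % matches any sequence of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def name_matches(name, condition):
    # Mirrors: UPPER(Name) LIKE REPLACE(UPPER(condition), ' ', '%') + '%'
    like_pattern = condition.upper().replace(" ", "%") + "%"
    return like_to_regex(like_pattern).fullmatch(name.upper()) is not None


for candidate in ["XYZ P", "xyzp", "XYZ-P", "XYZ ABC P", "ABC"]:
    print(candidate, name_matches(candidate, "XYZ P"))
```

The loop shows both the wanted matches and the over-matching on names with extra text between the two parts.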

Postgres Query with Regex

I'm trying to create a regex to find (and then eventually replace) parts of strings in a PG DB. I'm using PSQL 9.0.4
I've tested my regex outside of PG and it works perfectly. However, it isn't playing well with PG. If anyone can help me understand what I'm doing wrong, it would be much appreciated.
Regex:
{php}.*\n.*\n.*'mister_xx']\)\);.*\n} \n{\/php}
Postgres Query:
SELECT COUNT(*) FROM (SELECT * FROM "Table" WHERE "Column" ~ '{php}.*\n.*\n.*'mister_xx']\)\);.*\n} \n{\/php}') as x;
Postgres Response:
WARNING: nonstandard use of escape in a string literal
LINE 1: ...M (SELECT * FROM "Table" WHERE "Column" ~ '{php}.*\n...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
ERROR: syntax error at or near "mister_xx"
LINE 1: ..."Table" WHERE "Column" ~ '{php}.*\n.*\n.*'mister_x...
In SQL, a single quote inside a string literal is escaped by doubling it, for example:
'Child''s play'
Applying this to your regex makes it work:
SELECT COUNT(*)
FROM "Table"
WHERE "Column" ~ '{php}.*\n.*\n.*''mister_xx'']\)\);.*\n} \n{\/php}';
Note also that the redundant subquery has been removed.
You need to double the backslashes and add an E before the string literal:
SELECT * FROM "Table" WHERE "Column" ~ E'{php}.*\\n.*\\n.*''mister_xx'']\\)\\);.*\\n} \\n{\\/php}'
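The two escaping rules mentioned in the answers (double each single quote, and double each backslash inside an E'...' literal) can be applied mechanically. A minimal sketch in Python, for illustration only; to_escape_string_literal is a hypothetical helper, and passing the pattern as a query parameter from client code is usually the safer route.

```python
def to_escape_string_literal(regex):
    """Wrap a raw regex in a PostgreSQL E'...' literal.

    Backslashes are doubled first, then single quotes are doubled.
    """
    escaped = regex.replace("\\", "\\\\").replace("'", "''")
    return "E'" + escaped + "'"


raw = r"{php}.*\n.*\n.*'mister_xx']\)\);.*\n} \n{\/php}"
literal = to_escape_string_literal(raw)
print(literal)
```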