How to remove carriage returns and newlines from all the columns in a table using PostgreSQL?

I am trying to see if there is any way to remove carriage returns and newlines from all the varchar columns in a table using one statement.
I know that we can do this for a single column using something like the following:
select regexp_replace(field, E'[\\n\\r]+', ' ', 'g' )
In that case I need one such expression for every column, which I don't want to do unless there is no easier way.
Appreciate your help!

You can do this either by creating a PL/pgSQL function that executes dynamic SQL, or by running it directly via DO, as in the following example (replace <mytable> with the name of your table):
do $$declare _q text; _table text = '<mytable>';
begin
select 'update '||attrelid::regclass::text||E' set\n'||
string_agg(' '||quote_ident(attname)||$q$ = regexp_replace($q$||quote_ident(attname)||$q$, '[\n\r]+', ' ', 'g')$q$, E',\n' order by attnum)
into _q
from pg_attribute
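-- regtype prints varchar as 'character varying', so compare type OIDs instead of names
where attnum > 0 and not attisdropped and atttypid in ('text'::regtype, 'varchar'::regtype)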
group by attrelid
having attrelid = _table::regclass;
raise notice E'Executing:\n\n%', _q;
-- uncomment this line when happy with the query:
-- execute _q;
end;$$;
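To preview the shape of the statement that the DO block assembles, the same string building can be sketched outside the database. This is only an illustration in Python: the helper names build_update and quote_ident, the table name my_table and the column names title and body are all hypothetical, and the quoting here always double-quotes identifiers, which is a simplification of what PostgreSQL's quote_ident does.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Always double-quote an identifier, doubling embedded quotes (simplified)."""
    return '"' + name.replace('"', '""') + '"'


def build_update(table: str, columns: List[str]) -> str:
    """Build one UPDATE that replaces CR/LF runs with a space in every listed column."""
    assignments = [
        "  {c} = regexp_replace({c}, '[\\n\\r]+', ' ', 'g')".format(c=quote_ident(col))
        for col in columns
    ]
    return "update " + quote_ident(table) + " set\n" + ",\n".join(assignments)


# Hypothetical table and column names, for illustration only.
sql = build_update("my_table", ["title", "body"])
print(sql)
```

Printing the statement first, as the DO block does with raise notice, makes it easy to review the column list before anything is executed.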

Truncation when using CASE in SQL statement in SAS (Enterprise Guide)

I am trying to manipulate some text files in SAS Enterprise Guide, loading them line by line into a character variable "text", which gets a length of 1677 characters.
I can use the Tranwrd() function on this variable to create a new variable text21 and get the desired result, as shown below.
But if I put a condition on exactly the same Tranwrd() call to form the variable text2 (also shown below), it goes wrong: the text in the variable is truncated to around 200 characters, even though text2 has a length of 1800 characters:
PROC SQL;
CREATE TABLE WORK.Area_Z_Added AS
SELECT t1.Area,
t1.pedArea,
t1.Text,
/* text21 */
( tranwrd(t1.Text,'zOffset="0"',compress('zOffset="'||put(t2.Z,8.2)||'"'))) LENGTH=1800 AS text21,
/* text2 */
(case when t1.type='Area' then
tranwrd(t1.Text,'zOffset="0"',compress('zOffset="'||put(t2.Z,8.2)||'"'))
else
t1.Text
end) LENGTH=1800 AS text2,
t1.Type,
t1.id,
t1.x,
t1.y,
t2.Z
FROM WORK.VISSIM_IND t1
LEFT JOIN WORK.AREA_Z t2 ON (t1.Type = t2.Type) AND (t1.Area = t2.Area)
ORDER BY t1.id;
QUIT;
Anybody got a clue?
This is a known problem with using character functions inside a CASE expression. See this thread on SAS Communities: https://communities.sas.com/t5/SAS-Programming/Truncation-when-using-CASE-in-SQL-statement/m-p/852137#M336855
Instead, compute the result once in the other variable and refer to it inside the CASE expression with the CALCULATED keyword.
CREATE TABLE WORK.Area_Z_Added AS
SELECT
t1.Area
,t1.pedArea
,t1.Text
,(tranwrd(t1.Text,'zOffset="0"',cats('zOffset="',put(t2.Z,8.2),'"')))
AS text21 length=1800
,(case when t1.type='Area'
then calculated text21
else t1.Text
end) AS text2 LENGTH=1800
,t1.Type
,t1.id
,t1.x
,t1.y
,t2.Z
FROM WORK.VISSIM_IND t1
LEFT JOIN WORK.AREA_Z t2
ON (t1.Type = t2.Type)
AND (t1.Area = t2.Area)
ORDER BY t1.id
;
If you don't need the extra TEXT21 variable then use the DROP= dataset option to remove it.
CREATE TABLE WORK.Area_Z_Added(drop=text21) AS ....

Looking for the proper way to format the text in a column and compare that with the value of a cell?

I am trying to format the information from a column that I am querying and compare that to information in a cell. I have tried to hack together various ways to do this, but I am not a proficient SQL/spreadsheet user.
In column I there is nothing.
In column K there is a match on A2.
In column N there is information formatted like 31'-40' and 41'+.
I would prefer to use = instead of contains.
The REPLACE function seems to work when I substitute a string for N and run it on the W3Schools website.
REGEXREPLACE seems to work on D2. I would expect the two results to match, but they do not.
COUNT( QUERY( '2019'!A2:P, "select D where I='' and upper(K) contains '" & UPPER(A2) & "' and REPLACE(REPLACE(REPLACE(N, '-', ''), '''', ''), '+','') contains '"& Regexreplace(D2,"[[:punct:]]","") &"' "))
I get 0 matches.
You almost had it, but try it like this:
=COUNTA(FILTER('2019'!D2:D, '2019'!I2:I="",
REGEXMATCH(UPPER('2019'!K2:K), UPPER(A2)),
REGEXMATCH(UPPER('2019'!N2:N), UPPER(D2))))

QueryBuilder: Search a value in a column containing comma-separated integers

I have a column tags containing ids in a comma separated list.
I want to search all rows where a given value is in that column.
Say I have two rows where the column tags looks like this:
Row1: 1,2,3,4
Row2: 2,5,3,12
and I want to search for a row where the column contains a 1. I try to do it this way:
$qb = $this->createQueryBuilder('p')
->where(':value IN (p.tags)')
->setParameter('value', 1);
I expect it to do something like
SELECT p.* FROM mytable AS p WHERE 1 IN (p.tags)
Executing this in MySQL directly works perfectly. In Doctrine it does not work:
Error: Expected Literal, got 'p'
It works the other way around, though, but this is not what I need:
->where("p.tags IN :value")
I've tried a lot to make this work, but it just won't... Any ideas?
I think you should use LIKE with one pattern for each position the value can occupy in the list, for example:
$q = "1";
$qb = $this->createQueryBuilder('p');
$qb->andWhere(
$qb->expr()->orX(
$qb->expr()->eq('p.tags', $qb->expr()->literal($q)), // The only value...
$qb->expr()->like('p.tags', $qb->expr()->literal($q.',%')), // Starts with...
$qb->expr()->like('p.tags', $qb->expr()->literal('%,'.$q.',%')), // In the middle...
$qb->expr()->like('p.tags', $qb->expr()->literal('%,'.$q)) // Ends with...
)
);
See the SQL statement result in this fiddle
Hope this helps.
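To see why the value needs separate patterns for the start, middle and end of the list rather than a single '%1%', the matching can be sketched in Python. The sql_like helper below is a hypothetical, minimal emulation of LIKE (only % and _ are handled), and the exact-match pattern for a row holding a single id is an added assumption; the sample rows are the two from the question.

```python
import re


def sql_like(value: str, pattern: str) -> bool:
    """Minimal LIKE emulation: % matches any run of characters, _ matches one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def tags_contain(tags: str, wanted: str) -> bool:
    """True if `wanted` is one whole element of the comma-separated list."""
    patterns = [
        wanted,                    # the only value
        wanted + ",%",             # at the start
        "%," + wanted + ",%",      # in the middle
        "%," + wanted,             # at the end
    ]
    return any(sql_like(tags, p) for p in patterns)


rows = {"Row1": "1,2,3,4", "Row2": "2,5,3,12"}
matches = [name for name, tags in rows.items() if tags_contain(tags, "1")]
naive = [name for name, tags in rows.items() if sql_like(tags, "%1%")]
print(matches)  # only Row1 contains the element 1
print(naive)    # the naive pattern also hits Row2 because of 12
```

The naive pattern matches Row2 through the 1 in 12, which is what the comma-anchored patterns avoid.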

How to find all the source lines containing desired table names from user_source by using 'regexp'

For example, we have a large database that contains lots of Oracle packages, and now we want to see where a specific table is referenced in the source code. The source code is stored in the user_source table, and the table we are looking for is called 'company'.
Normally, I would like to use:
select * from user_source
where upper(text) like '%COMPANY%'
This will return every line that contains 'company' anywhere, like
121 company cmy
14 company_id, idx_name %% end of coding
453 ;companyname
1253 from db.company.company_id where
989 using company, idx, db_name,
So how can this result be made more intelligent, using a regular expression to return only the source lines that contain a meaningful table name (that is, a table name as the compiler would see it)?
The matched word may be adjacent to characters like . ; , '' "" but not _
Can anyone make this work?
To find company as a "whole word" with a regular expression:
SELECT * FROM user_source
WHERE REGEXP_LIKE(text, '(^|\s)company(\s|$)', 'i');
The third argument, 'i', makes the REGEXP_LIKE search case-insensitive.
As for ignoring the characters . ; , '' "", you can use REGEXP_REPLACE to turn them into spaces before doing the comparison, so that a name such as db.company still ends up as a separate word:
SELECT * FROM user_source
WHERE REGEXP_LIKE(REGEXP_REPLACE(text, '[.;,''"]', ' '), '(^|\s)company(\s|$)', 'i');
Addendum: The following query will also help locate table references. It won't give the source line, but it's a start:
SELECT *
FROM user_dependencies
WHERE referenced_name = 'COMPANY'
AND referenced_type = 'TABLE';
If you want to identify the objects that refer to your table, you can get that information from the data dictionary:
select *
from all_dependencies
where referenced_owner = 'DB'
and referenced_name = 'COMPANY'
and referenced_type = 'TABLE';
You can't get the individual line numbers from that, but you can then either look at user_source or use a regexp on the specific source code, which would at least reduce false positives.
SELECT * FROM user_source
WHERE REGEXP_LIKE(text,'(^|[^_a-z0-9])company([^_a-z0-9]|$)','i')
Thanks @Ed Gibbs, with a little trick this modified answer could be more intelligent.
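The effect of a boundary-style pattern can be checked against the sample lines from the question. The sketch below uses Python's re module rather than Oracle's REGEXP_LIKE, so it only illustrates the idea; the start and end anchors in the pattern are an assumption added so that a name at the very beginning or end of a line is still found.

```python
import re

# "company" must not be preceded or followed by an identifier character.
pattern = re.compile(r"(^|[^_a-z0-9])company([^_a-z0-9]|$)", re.IGNORECASE)

# Sample source lines taken from the question (without the line numbers).
lines = [
    "company cmy",
    "company_id, idx_name %% end of coding",
    ";companyname",
    "from db.company.company_id where",
    "using company, idx, db_name,",
]

hits = [line for line in lines if pattern.search(line)]
for line in hits:
    print(line)
```

Lines that contain only company_id or companyname are skipped, while db.company and a bare company are kept.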

oracle regular expression and MERGE

As an update to my previous question:
I have some newline-separated strings.
I need to insert each of those words into a table.
The new requirement is that a word should be inserted if it does not exist yet; otherwise its count should be incremented by 1 (as with MERGE).
My current query only inserts, using the CONNECT BY LEVEL method without checking whether the value already exists.
The logic I want is roughly:
if the word already EXISTS THEN
UPDATE my_table set w_count = w_count +1 where word = '...';
else
INSERT INTO my_table (word, w_count)
SELECT REGEXP_SUBSTR(i_words, '[^[:cntrl:]]+', 1 ,level),
1
FROM dual
CONNECT BY REGEXP_SUBSTR(i_words, '[^[:cntrl:]]+', 1 ,level) IS NOT NULL;
end if;
Try this
MERGE INTO my_table m
USING(WITH the_data AS (
SELECT 'a
bb
&
c' AS dat
FROM dual
)
SELECT regexp_substr(dat, '[^[:cntrl:]]+', 1 ,LEVEL) wrd
FROM the_data
CONNECT BY regexp_substr(dat, '[^[:cntrl:]]+', 1 ,LEVEL) IS NOT NULL) word_list
ON (word_list.wrd = m.word)
WHEN matched THEN UPDATE SET m.w_count = m.w_count + 1
WHEN NOT matched THEN insert(m.word,m.w_count) VALUES (word_list.wrd,1);
More details on MERGE here.
Sample fiddle
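The insert-or-increment logic that the MERGE is meant to express can be modelled in a few lines of Python. This is a sketch of the intended behaviour only, not of Oracle's MERGE semantics: the table is represented by a plain dict, merge_words is a hypothetical helper name, and the character class is an ASCII approximation of [^[:cntrl:]]+.

```python
import re
from typing import Dict


def merge_words(table: Dict[str, int], text: str) -> Dict[str, int]:
    """Insert each word with count 1, or add 1 to the count if it already exists."""
    # Runs of non-control characters, an ASCII stand-in for [^[:cntrl:]]+.
    for word in re.findall(r"[^\x00-\x1f\x7f]+", text):
        if word in table:
            table[word] += 1   # WHEN MATCHED: bump the counter
        else:
            table[word] = 1    # WHEN NOT MATCHED: insert with count 1
    return table


# Hypothetical starting state: the word "a" is already stored with count 2.
counts = merge_words({"a": 2}, "a\nbb\n&\nc")
print(counts)
```

With the same four-line input as in the MERGE example, the existing word is incremented and the three new words are added with a count of 1.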