How to execute a dynamic SQL statement in a single Select statement? - regex

I'm wondering how to evaluate the content of a dynamic SQL string within a single SELECT. The query below is only an example; what I'd like is to build such statements dynamically and run them with single selects. (I know SELECT is meant for reading rather than modifying, but this deep into quarantine I'm turning into a crazy developer.)
SELECT 'SELECT SETVAL(' || chr(39) || c.relname || chr(39)|| ' ,
(SELECT MAX(Id)+1 FROM ' || regexp_replace(c.relname, '_[a-zA-Z]+_[a-zA-Z]+(_[a-zA-Z0-9]+)?', '', 'g') ||' ), true );'
FROM pg_class c WHERE c.relkind = 'S';
The original output is:
SELECT SETVAL('viewitem_id_seq' , (SELECT MAX(Id)+1 FROM viewitem ), true );
SELECT SETVAL('userform_id_seq' , (SELECT MAX(Id)+1 FROM userform ), true );
This is the dynamic part:
(SELECT MAX(Id)+1 FROM ' || regexp_replace(c.relname, '_[a-zA-Z]+_[a-zA-Z]+(_[a-zA-Z0-9]+)?', '', 'g')
It is a string that produces SQL as its output. How can I evaluate that statement in the same query?
The desired output is:
SELECT SETVAL('viewitem_id_seq' , 25, true );
SELECT SETVAL('userform_id_seq' , 85, true );
thanks!

If those are serial or identity columns it would be better to use pg_get_serial_sequence() to get the link between a table's column and its sequence.
You can actually run dynamic SQL inside a SQL statement by using query_to_xml()
I use the following script if I need to synchronize the sequences for serial (or identity) columns with their actual values:
with sequences as (
-- this query is only to identify all sequences that belong to a column
-- it's essentially similar to your select * from pg_class where relkind = 'S'
-- but returns the sequence name, table and column name to which the
-- sequence belongs
select *
from (
select table_schema,
table_name,
column_name,
pg_get_serial_sequence(format('%I.%I', table_schema, table_name), column_name) as col_sequence
from information_schema.columns
where table_schema not in ('pg_catalog', 'information_schema')
) t
where col_sequence is not null
), maxvals as (
select table_schema, table_name, column_name, col_sequence,
--
-- this is the "magic" that runs the SELECT MAX() query
--
(xpath('/row/max/text()',
query_to_xml(format('select max(%I) from %I.%I', column_name, table_schema, table_name), true, true, ''))
)[1]::text::bigint as max_val
from sequences
)
select table_schema,
table_name,
column_name,
col_sequence,
coalesce(max_val, 0) as max_val,
setval(col_sequence, coalesce(max_val, 1)) --<< this uses the value from the dynamic query
from maxvals;
The dynamic part here is the call to query_to_xml()
First I use format() to properly deal with identifiers. It also makes writing the SQL easier as no concatenation is required. So for every table returned by the first CTE, something like this is executed:
query_to_xml('select max(id) from public.some_table', true, true, '')
This returns something like:
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<max>42</max>
</row>
The value is then extracted from the XML value using xpath() and converted to a number, which is then used in the final SELECT to actually call setval().
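As an illustration of that extraction step outside the database, here is a minimal Python sketch that parses XML shaped like the sample above and pulls out the max value. The helper name extract_max is hypothetical, and the xsi:nil form used for a NULL result is an assumption about what query_to_xml() emits when its nulls argument is true (for example, for an empty table).

```python
import xml.etree.ElementTree as ET

# XML shaped like the query_to_xml() sample shown above.
SAMPLE = """<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <max>42</max>
</row>"""

# Assumed shape of a NULL result (nulls => true); not taken from the answer.
NIL_SAMPLE = """<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <max xsi:nil="true"/>
</row>"""


def extract_max(xml_text):
    """Return the <max> value as an int, or None when it has no text."""
    row = ET.fromstring(xml_text)
    text = row.findtext("max")  # plays the role of xpath('/row/max/text()')
    return int(text) if text else None


max_val = extract_max(SAMPLE)
empty_val = extract_max(NIL_SAMPLE)
```

The None case corresponds to what the coalesce() calls in the final SELECT have to deal with.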
The nesting with multiple CTEs is only used to make each part more readable.
The same approach can e.g. be used to find the row count for all tables.
Some background on how query_to_xml() works

Related

How to use LISTAGG redshift

SELECT Distinct 'DROP TABLE IF EXISTS deleted.' || LISTAGG("table_name",',') || ';' FROM svv_all_columns WHERE schema_name = 'sn' and database_name='db';
ERROR:One or more of the used functions must be applied on at least one user created tables. Examples of user table only functions are LISTAGG, MEDIAN, PERCENTILE_CONT, et
SELECT Distinct 'DROP TABLE IF EXISTS deleted.' || LISTAGG("table_name",',') || ';' FROM svv_all_columns WHERE schema_name = 'sn' and database_name='db';
It concatenates all table names into one DROP statement.
That's what LISTAGG() does - it aggregates strings from multiple rows together. It can work in conjunction with GROUP BY to combine strings from only within a group.
It sounds like you just want to have individual table names concatenated with the static text. Like this:
SELECT Distinct 'DROP TABLE IF EXISTS deleted.' || "table_name" || ';' FROM svv_all_columns WHERE schema_name = 'sn' and database_name='db'
Which will put each DROP statement on its own row. If instead you want one block of text you can use LISTAGG() over the combined strings like this:
SELECT LISTAGG('DROP TABLE IF EXISTS deleted.' || "table_name" || ';',',') FROM svv_all_columns WHERE schema_name = 'sn' and database_name='db';
Now DISTINCT doesn't make sense in this case, as there is only one string. Without it, a table name that appears on several rows (svv_all_columns has one row per column) produces repeated DROP statements. So if you really need the minimum number of DROP statements AND all of this in one string:
WITH drops AS (
SELECT Distinct 'DROP TABLE IF EXISTS deleted.' || "table_name" || ';' AS statement
FROM svv_all_columns WHERE schema_name = 'sn' and database_name='db'
)
SELECT LISTAGG(statement,' ')
FROM drops;
Note that I took the ',' out as the text separator as this doesn't make sense in a block of SQL.
Hopefully these examples will give you enough info to understand the basics of LISTAGG().
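To make the difference between per-row concatenation and aggregation concrete, here is a small Python sketch of the same idea. It is not Redshift code: the table names are made up, and drop_statements is a hypothetical helper that plays the role of the DISTINCT subquery, while " ".join() stands in for LISTAGG(statement, ' ').

```python
# Hypothetical input: one entry per column, so table names repeat.
rows = ["orders", "orders", "customers", "orders"]


def drop_statements(table_names):
    """Build one DROP statement per distinct table, keeping first-seen order."""
    seen = []
    for name in table_names:
        stmt = f"DROP TABLE IF EXISTS deleted.{name};"
        if stmt not in seen:  # the DISTINCT step
            seen.append(stmt)
    return seen


per_row = drop_statements(rows)  # one statement per row of output
one_block = " ".join(per_row)    # what LISTAGG(statement, ' ') would collapse to
```

Joining only after deduplicating is the point of the CTE version: aggregating first leaves nothing for DISTINCT to remove.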

how do i loop through tables to get counts

I have tried to use the following, but it seems that different nodes cannot be mixed:
WITH tables_i_want AS (
SELECT *, table_schema||'.'||table_name as tbl FROM temp.redshift_mod_dates WHERE table_schema = 'whatever'
)
SELECT nspname
FROM pg_catalog.pg_class AS c
JOIN pg_catalog.pg_namespace AS ns
ON c.relnamespace = ns.oid
INNER JOIN tables_i_want as tiw
ON tiw.tbl = c.oid
AND relname not like 'pg_%'
so, then I tried a procedure:
CREATE OR REPLACE PROCEDURE f_test()
LANGUAGE plpgsql
AS $$
DECLARE
full_table_name1 VARCHAR;
full_table_name VARCHAR;
BEGIN
FOR full_table_name IN (SELECT table_schema||'.'||table_name as full_table_name FROM temp.redshift_mod_dates WHERE table_schema = 'whatever')
LOOP
EXECUTE 'SELECT INTO temp.redshift_tables_with_cnt %, COUNT(*) FROM %', full_table_name;
RAISE INFO '%', full_table_name;
END LOOP;
END;
$$;
seems there's an issue with the variable:
[42601] ERROR: syntax error at or near "$1" Where: SQL statement in PL/PgSQL function "f_test" near line 5
If you want the row count for all tables, you can get it with the following query:
select tab.table_schema,
tab.table_name,
tinf.tbl_rows as rows
from svv_tables tab
join svv_table_info tinf
on tab.table_schema = tinf.schema
and tab.table_name = tinf.table
where tab.table_type = 'BASE TABLE'
and tab.table_schema not in('pg_catalog','information_schema')
and tinf.tbl_rows > 1
order by tinf.tbl_rows desc;
You can store the result in a temporary table and then move it to a persistent table or process it further as required.
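If you specifically need a COUNT(*) per table, as the procedure in the question attempts, another option is to run the loop on the client side. The following is only a sketch of that idea in Python: it uses an in-memory SQLite database as a stand-in for the real connection, and the table names and the count_rows helper are hypothetical.

```python
import sqlite3


def count_rows(conn, table_names):
    """Run SELECT COUNT(*) once per table and collect the results."""
    counts = {}
    for name in table_names:
        # Quote the identifier so unusual table names cannot break the statement.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts


# Stand-in database with two made-up tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.execute("CREATE TABLE customers (id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?)", [(1,), (2,), (3,)])
conn.execute("INSERT INTO customers VALUES (1)")

counts = count_rows(conn, ["orders", "customers"])
```

The list of table names would come from a query such as the one against temp.redshift_mod_dates in the question.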

Postgres regular expression to find procedures with wrong Update syntax

My Postgres database was migrated from MySQL using a tool, and the code base has a lot of syntax issues.
One of the issues is with the UPDATE statements inside procedures, where the column name is prefixed with the table alias, as below.
UPDATE table1 t1 SET t1.col1 = 'some_value';
Using the alias after the SET keyword, as in t1.col1, is invalid syntax in Postgres.
As the number of procedures is huge, I'm trying to write a regular expression to find which procedures have this pattern.
select proname, prosrc from pg_proc
where regexp_replace(prosrc, E'[\\n\\r]+', ' ', 'g' ) ~* '[:UPDATE:]([[:space:]]+)[:set:]([[:space:]]+)^[a-z]([a-z]|[0-9])+\.^[a-z]([a-z]|[0-9])+([[:space:]]*)[:=:]';
The regexp_replace part on the left side of the condition is to remove line breaks which works fine. The main part on the right side is not returning the desired result.
I am trying to find the procedures that have the UPDATE keyword followed by one or more spaces, followed by the SET keyword, followed by one or more spaces, followed by one or more alphanumeric characters (starting with a letter), followed by a dot (.), followed by one or more alphanumeric characters (starting with a letter), followed by zero or more spaces, followed by an equals sign (=).
But the statement I formed seems to be wrong. Any help on this is much appreciated.
I think this may be more complex than you think... A procedure/function may have more than one update statement, and a simple regex will likely come up with many false positives.
I think you want a function to do a better job of eliminating false positives that result from:
Alias that occurs after the update, in a separate statement (after the semicolon) -- fix by splitting statements by semicolons
Aliases within the update that occur after a FROM or WHERE clause, which are valid and not syntax errors
Less frequent, aliases used in a CTE prior to the update - fix by ignoring everything prior to the update keyword
Here is a boilerplate for what I think will get you close and minimize false positives:
create or replace function find_bad_syntax()
returns setof text
language plpgsql as
$BODY$
DECLARE
r pg_proc%rowtype;
dml varchar[];
eval varchar;
alias varchar;
BEGIN
FOR r IN
SELECT * FROM pg_proc WHERE prosrc ilike '%update%'
LOOP
dml := string_to_array (r.prosrc, ';');
foreach eval in array dml
loop
alias := substring (lower (eval), 'update [\w.]+\s+(\w+)');
continue when alias is null or lower (alias) = 'set';
eval := regexp_replace (eval, 'from\s+.*', '', 'i');
eval := regexp_replace (eval, 'where\s.*', '', 'i');
eval := regexp_replace (eval, '^.*update', '', 'i');
if eval ~* (alias || '\.\w+\s*=') then
-- if eval ~* (alias || '\.\w+\s+=') then
return next format ('PROC: %s ALIAS: %s => ERROR: %s', r.proname, alias, eval);
end if;
end loop;
END LOOP;
END;
$BODY$
So to get the results simply:
select * from find_bad_syntax()
I did a test run, and your function did show up in the results.
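For anyone who wants to try the same steps outside the database, here is a rough Python port of the loop body. It is a sketch, not a drop-in replacement: find_bad_updates is a hypothetical name, it takes the procedure source as a plain string, and re.S is used on the assumption that "." in the PostgreSQL patterns also matches newlines.

```python
import re


def find_bad_updates(source):
    """Return (alias, fragment) pairs for UPDATEs that use alias.column in SET."""
    hits = []
    for stmt in source.split(";"):  # one statement at a time
        m = re.search(r"update [\w.]+\s+(\w+)", stmt.lower())
        if m is None or m.group(1) == "set":
            continue  # no UPDATE here, or an UPDATE without an alias
        alias = m.group(1)
        # Drop everything from FROM/WHERE onward: aliases are valid there.
        body = re.sub(r"from\s+.*", "", stmt, count=1, flags=re.I | re.S)
        body = re.sub(r"where\s.*", "", body, count=1, flags=re.I | re.S)
        # Drop everything up to and including the UPDATE keyword (e.g. a CTE).
        body = re.sub(r"^.*update", "", body, count=1, flags=re.I | re.S)
        if re.search(re.escape(alias) + r"\.\w+\s*=", body, flags=re.I):
            hits.append((alias, body.strip()))
    return hits


source = (
    "UPDATE table1 t1 SET t1.col1 = 'x'; "
    "UPDATE table2 t2 SET col2 = 1 WHERE t2.id = 5; "
    "UPDATE table3 SET col3 = 2;"
)
hits = find_bad_updates(source)
```

Only the first statement is reported: the second uses the alias only in its WHERE clause, and the third has no alias at all.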
The query below gives me the expected results. It checks for code where SET is followed by one or more spaces, then one or more alphanumeric characters or underscores, then a dot (.), then one or more alphanumeric characters or underscores, then one or more spaces, and then =.
This fetches all the procedures that have the issue I described in the question.
select proname, prosrc from pg_proc
where regexp_replace(prosrc, E'[\\n\\r]+', ' ', 'g' )
~* '( SET)[[:space:]]+([a-z]|[0-9]|(_))+\.([a-z]|[0-9]|(_))+[[:space:]]+(=)';
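A quick way to sanity-check a pattern like this before running it against pg_proc is to try it on a few strings. The sketch below does that with Python's re module, which is not the PostgreSQL regex engine; has_aliased_set is a hypothetical helper, and the pattern is a close variant rather than a copy (it uses \w and allows zero spaces before the equals sign).

```python
import re

# SET, whitespace, identifier, dot, identifier, optional whitespace, "=".
BAD_SET = re.compile(r"\bSET\s+\w+\.\w+\s*=", re.IGNORECASE)


def has_aliased_set(source):
    """True if the source contains SET alias.column = ..."""
    flattened = re.sub(r"[\n\r]+", " ", source)  # same idea as the regexp_replace call
    return BAD_SET.search(flattened) is not None


bad = has_aliased_set("UPDATE table1 t1\nSET t1.col1 = 'some_value';")
good = has_aliased_set("UPDATE table1 t1 SET col1 = 'some_value' WHERE t1.id = 1;")
```

The second string is not flagged because the alias appears only in the WHERE clause.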
Yes, in PostgreSQL this does not work:
UPDATE table1 t1 SET t1.col1 = 'some_value';
But this works correctly:
UPDATE table1 t1 SET col1 = 'some_value';
So we only need to remove the alias from the column in the SET clause.
An example of doing that:
with t1(txt) as (
select 'UPDATE table1 t1 SET t1.col1 = some_value'
)
select regexp_replace(t1.txt, 'SET (.*)\d\.', 'SET ', 'g') from t1
To find the affected statements:
with t1(txt) as (
select 'UPDATE table1 t1 SET t1.col1 = some_value'
)
select * from t1 where t1.txt ~ 'SET (.*)\d\.'
With a small change (\w instead of \d) and more test data:
with t1(txt) as (
select 'UPDATE table1 t1 SET t1.col1 = some_value'
union all
select 'UPDATE table1 tbp3232 SET tbp3232.col1 = some_value'
union all
select 'select pp3.* from table1 pp3'
union all
select 'UPDATE table1 SET col1 = some_value'
union all
select 'UPDATE table1 t SET t.col1 = some_value'
)
select * from t1 where t1.txt ~ 'SET (.*)\w\.'
--Result:
'UPDATE table1 t1 SET t1.col1 = some_value'
'UPDATE table1 tbp3232 SET tbp3232.col1 = some_value'
'UPDATE table1 t SET t.col1 = some_value'

How to exclude null values using REGEXP_SUBSTR

The following statement retrieves the value of the sub tag msg_id from the MISC column if the sub tag contains a value like %PACS%.
SELECT REGEXP_SUBSTR(MISC, '(^|\s|;)msg_id = (.*?)\s*(;|$)',1,1,NULL,2) AS TRANS_REF FROM MISC_HEADER
WHERE MISC LIKE '%PACS%';
I notice the query also returns records with a null value (those without msg_id). Is there any way to exclude those null records through the REGEXP_SUBSTR syntax itself, without adding another WHERE condition?
Sample data of MISC:
channel=atm ; phone=0123 ; msg_id=PACS00812 ; ustrd=U123
channel=pos; phone=9922; ustrd=U156
The second record has no msg_id, so it needs to be excluded.
This method does not use REGEXP so may not be suitable for you.
However, it does satisfy your requirement.
This takes your embedded list (the msg_id column in the example) and breaks it out into one row per component for each ID (I've assumed you have something that uniquely identifies each record).
It then only returns the original row where one of the rows for the ID has 'PACS' in it.
WITH thedata
AS (SELECT 1 AS theid
, 'channel=atm ; phone=0123 ; msg_id=PACS00812 ; ustrd=U123'
AS msg_id
FROM DUAL
UNION ALL
SELECT 2, 'channel=pos; phone=9922; ustrd=U156' FROM DUAL)
, mylist
AS (SELECT theid, COLUMN_VALUE AS msg_component
FROM thedata
, XMLTABLE(('"' || REPLACE(msg_id, ';', '","') || '"')))
SELECT *
FROM thedata td
WHERE EXISTS
(SELECT 1
FROM mylist m
WHERE m.theid = td.theid
AND m.msg_component LIKE '%PACS%')
Thedata sub-query is simply to generate a couple of records and pretend to be your table. You could remove that and substitute your actual table name.
There are other ways to break up an embedded list including ones that use REGEXP, I just find the XMLTABLE method 'cleaner'.
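The split-and-filter idea is easy to try outside Oracle as well. Here is a minimal Python sketch using the two sample rows from the question; the helper names components and msg_id are hypothetical, and this only illustrates the logic, not the XMLTABLE mechanics.

```python
def components(misc):
    """Split 'a=1 ; b=2' into ['a=1', 'b=2'], trimming the spaces around ';'."""
    return [part.strip() for part in misc.split(";")]


def msg_id(misc):
    """Return the msg_id value from a MISC string, or None if there is none."""
    for part in components(misc):
        key, _, value = part.partition("=")
        if key.strip() == "msg_id":
            return value.strip()
    return None


rows = [
    "channel=atm ; phone=0123 ; msg_id=PACS00812 ; ustrd=U123",
    "channel=pos; phone=9922; ustrd=U156",
]

# Keep only rows that actually have a msg_id containing PACS.
trans_refs = [ref for ref in map(msg_id, rows) if ref is not None and "PACS" in ref]
```

The second row drops out because it has no msg_id component at all, which is the behaviour asked for in the question.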

Is it possible to combine REPLACE with LIKE to replace multiple values in oracle database column

This is similar to the question here but instead of replacing a single value I want to replace multiple values with matching pattern.
--create table
create table my_table (column1 varchar2(10));
--load table
insert into my_table values ('Test1');
insert into my_table values ('Test2');
insert into my_table values ('Test3');
insert into my_table values ('Test4');
insert into my_table values ('Test5');
insert into my_table values ('Lesson');
--this query replaces 'Test1' with blank
select replace(column1, 'Test1', ' ') from my_table;
--now i want to replace all matching values with blank but i get an error
select replace(column1, like '%Test%', ' ') from my_table; --this throws below error.
--ORA-00936: missing expression
--00936. 00000 - "missing expression"
--*Cause:
--*Action:
--Error at Line: 19 Column: 25
Running Oracle Database 11g Enterprise Edition Release 11.2.0.1.0
I would use regexp_replace.
select regexp_replace(column1,'Test[[:digit:]]', ' ') from my_table;
regexp_replace
In the original post, you were indicating by %Test% that you want to replace the entire string with a space if it had the string "Test" anywhere in it:
with my_table(col1) as
( select 'Test1' from dual
union
select 'Test2' from dual
union
select 'thisisaTestofpatternmatching4' from dual
union
select 'thisisa Test ofpatternmatching5' from dual
union
select 'Test at the start' from dual
union
select 'Testat the start no following space' from dual
union
select 'Ending with Test' from dual
union
select 'Ending nospacebeforewithTest' from dual
union
select 'Testy' from dual
union
select 'Lesson' from dual
)
select regexp_replace(col1, '^.*Test.*$', ' ') from my_table;
I suspect you really only want to replace the word Test though? Can it occur more than once in a line?
select regexp_replace(col1, 'Test', ' ') from my_table;
The word test followed by a digit?
select regexp_replace(col1, 'Test\d', ' ') from my_table;
Tip: Make sure your test cases are set up to include all kinds of combinations of your test data, even where they may be unexpected. When testing your regular expressions, sometimes you may get unexpected results so make sure all possible conditions are tested.
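In that spirit, here is a small Python sketch that runs the three patterns above against a handful of the sample strings so the differences are easy to compare. It uses Python's re module rather than Oracle's regex engine, so treat it as an approximation; like regexp_replace with its default arguments, re.sub replaces every occurrence.

```python
import re

samples = ["Test1", "thisisaTestofpatternmatching4", "Testy", "Lesson"]

# Replace the whole string if it contains Test anywhere.
whole_line = [re.sub(r"^.*Test.*$", " ", s) for s in samples]

# Replace only the word Test itself.
word_only = [re.sub(r"Test", " ", s) for s in samples]

# Replace Test only when it is followed by a digit.
word_digit = [re.sub(r"Test\d", " ", s) for s in samples]

for name, result in [("whole_line", whole_line),
                     ("word_only", word_only),
                     ("word_digit", word_digit)]:
    print(name, result)
```

Note how 'Testy' and 'thisisaTestofpatternmatching4' behave differently under each pattern, which is exactly the kind of case the tip is about.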