POSIX regular expression does not work with PostgreSQL in trigger function

I am working on a trigger function, and I only want to insert the data when it matches a specific format. So I tried to use POSIX regular expressions, but none of them seems to work.
if (new.tin ~ '\d{10}') then
    if not exists (
        select *
        from taxcodes
        where code = substr(new.tin, 1, 4)
    ) then
        return null;
    end if;
end if;
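One thing worth checking in the condition above: PostgreSQL's `~` operator succeeds when the pattern matches anywhere in the string, so `'\d{10}'` also accepts values that merely contain ten consecutive digits. The sketch below uses Python's `re` module as a stand-in (its regex dialect is not identical to PostgreSQL's, but `\d{10}`, `^` and `$` behave the same way for these inputs) to show what anchoring changes. The function names are made up for illustration.

```python
import re

def contains_ten_digits(tin: str) -> bool:
    # Unanchored: true if ten consecutive digits appear anywhere in the string.
    return re.search(r"\d{10}", tin) is not None

def is_exactly_ten_digits(tin: str) -> bool:
    # Anchored: true only if the whole string consists of ten digits.
    return re.search(r"^\d{10}$", tin) is not None

for sample in ["1234567890", "abc1234567890xyz", "12345"]:
    print(sample, contains_ten_digits(sample), is_exactly_ten_digits(sample))
```

In the trigger, the anchored form would presumably be written as `new.tin ~ '^\d{10}$'`. This assumes `standard_conforming_strings` is on, so that the backslash in the string literal reaches the regex engine unchanged.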

Related

Determine if string is valid XML in Postgres

I have a text field in a table that contains JSON data as well as XML data. Since I want to work with the XML data only if it is valid XML, I need a way to make sure I can cast the string to XML without the cast failing on values such as '{"key":"val"}'.
Basically, I want select isxml('{"key":"val"}') to return false and select isxml('<key>1</key>') to return true.
I checked existing Postgres functions such as xml_is_well_formed, but they still return true when checking JSON strings. Maybe I can catch the error and deal with it in exceptions after a bad cast? Is there a good way to do this?

One possibility is to combine xml_is_well_formed with a function that checks whether the text content is valid JSON, for example:
create or replace function is_valid_json(content text)
returns boolean
as
$$
begin
    -- the cast raises an error for anything that is not valid JSON
    return (content::json is not null);
exception
    when others then
        return false;
end;
$$ language plpgsql;
And in your query, you do [...] xml_is_well_formed(content) and not is_valid_json(content) [...].

My temporary solution is as follows:
CREATE OR REPLACE FUNCTION isjson(p_json text)
RETURNS integer
LANGUAGE plpgsql
IMMUTABLE
AS $function$
begin
perform (p_json::json is not null);
return 1;
exception
when others then
return 0;
end;
$function$;
CREATE OR REPLACE FUNCTION isxml(p_xml text)
RETURNS boolean
LANGUAGE plpgsql
IMMUTABLE
AS $function$
BEGIN
PERFORM (p_xml::XML IS NOT NULL);
IF (xml_is_well_formed(p_xml)
AND NOT (CASE WHEN isjson(p_xml) = 1 THEN TRUE ELSE false END)
AND (SELECT p_xml ~ '^<.*>') )THEN -- regex matches <>, this may have uncovered edge cases
RETURN true;
ELSE
RETURN false;
END IF;
EXCEPTION
WHEN OTHERS THEN
RETURN false;
END;
$function$;
Note: my isjson function returns integer for legacy compatibility reasons; a boolean would be simpler for this specific case. This should rule out most problematic cases, but the regex used has many limitations, so suggestions for improvement are welcome.

Amazon Redshift: having issue at the END IF point

I am working with Amazon Redshift and get an error at the END IF point:
CREATE OR REPLACE PROCEDURE IF_CON()
AS $$
DECLARE
BEGIN
IF(SELECT EXISTS(SELECT clientid FROM ods_epremis.new_old_merge)) THEN
BEGIN
UPDATE ods_epremis.new_old_merge SET patientencounter_id=(SELECT max(patientencounter_id)
FROM ods_epremis.new_old_merge)+1
WHERE new_old_merge.clain_oid =(SELECT top 1 claim_oid from ods_epremis.new_old_merge)
INSERT INTO ods_epremis.CLM_REM_MAPPING_PATIENT_ENCOUNTER
SELECT * from ods_epremis.new_old_merge
where claim_oid=(select top 1 claim_oid from ods_epremis.new_old_merge order by claim_oid)
END IF
END
$$ LANGUAGE plpgsql;

Your nesting is: IF / BEGIN / END IF / END
It should probably be: IF / BEGIN / END / END IF
This keeps the BEGIN/END block inside the IF.
Also, based on the examples in Structure of PL/pgSQL - Amazon Redshift, statements should end with semicolons (;):
CREATE OR REPLACE PROCEDURE record_example()
LANGUAGE plpgsql
AS $$
DECLARE
rec RECORD;
BEGIN
FOR rec IN SELECT a FROM tbl_record
LOOP
RAISE INFO 'a = %', rec.a;
END LOOP;
END;
$$;

Can I use REGEXP_LIKE as a condition with IF in a PL/SQL block

I'm trying to create a function that traverses a tree of organisational units, filtering out some of them based on their level in the tree structure and whether they appear on our intranet page. The inputs to the function are the ORG_UNIT_ID of the starting unit, a flag that says whether we should care about the intranet flag, and a comma-separated list of levels, for instance '2,3'. I'm trying to use REGEXP_LIKE in conjunction with an ELSEIF inside a loop to run up the tree until I hit the first eligible parent unit.
T_STOP is the control variable for the loop. R_ORG_UNIT_OVER is used to query metadata on the unit above. During the loop's first pass this will be the unit above the one passed as input to the function.
The cursor definition:
CURSOR C_ORG_UNIT_OVER(V_ORG_UNIT_ID ORG_UNIT.ORG_UNIT_ID%TYPE) IS
SELECT ORUI.ORG_UNIT_ID
, ORUI.ORG_LEVEL
, ORUI.SHOW_ON_INTRANET
FROM ORG_UNIT ORUI
JOIN ORG_UNIT_PARENT OUPA ON ORUI.ORG_UNIT_ID=OUPA.ORG_UNIT_ID_PARENT
WHERE OUPA.ORG_UNIT_ID = V_ORG_UNIT_ID;
The failing code segment in the loop:
IF R_ORG_UNIT_OVER.SHOW_ON_INTRANET = 'N' THEN
T_ORG_UNIT_ID := R_ORG_UNIT_OVER.ORG_UNIT_ID;
ELSEIF REGEXP_LIKE (P_SKIP_LEVEL, '(^|,)' || R_ORG_UNIT_OVER.ORG_LEVEL || '($|,)') THEN
T_ORG_UNIT_ID := R_ORG_UNIT_OVER.ORG_UNIT_ID;
ELSE
T_STOP := 'Y';
END IF;
However this code always throws a PLS-00103 error on the REGEXP_LIKE symbol. Is there some sort of limitation or alternate way in which REGEXP_LIKE works when used as a condition in a PL/SQL IF/ELSEIF block as opposed to in a regular query?

PL/SQL uses ELSIF, not ELSEIF. With your edit your code does get the error you described; with this it doesn't:
IF R_ORG_UNIT_OVER.SHOW_ON_INTRANET = 'N' THEN
T_ORG_UNIT_ID := R_ORG_UNIT_OVER.ORG_UNIT_ID;
ELSIF REGEXP_LIKE (P_SKIP_LEVEL, '(^|,)' || R_ORG_UNIT_OVER.ORG_LEVEL || '($|,)') THEN
T_ORG_UNIT_ID := R_ORG_UNIT_OVER.ORG_UNIT_ID;
ELSE
T_STOP := 'Y';
END IF;
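For readers unsure what the concatenated pattern in the ELSIF condition does: it tests whether the level appears as a whole item in the comma-separated list, delimited by the start or end of the string or by commas. Here is a minimal sketch of the same idea in Python's `re` module, used as a stand-in for REGEXP_LIKE; the function name `level_in_list` is hypothetical.

```python
import re

def level_in_list(skip_levels: str, org_level: int) -> bool:
    # Same shape as the PL/SQL pattern: the level must be delimited by the
    # start/end of the string or by a comma on each side.
    pattern = r"(^|,)" + re.escape(str(org_level)) + r"($|,)"
    return re.search(pattern, skip_levels) is not None

print(level_in_list("2,3", 3))    # True: 3 is an item of the list
print(level_in_list("12,13", 2))  # False: 2 only occurs inside 12
print(level_in_list("2,3", 4))    # False: 4 is not in the list
```

The delimiters are what prevent a level such as 2 from matching inside 12.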

Yes you can.
declare
testvar varchar2(20) := 'Kittens';
begin
if regexp_like(testvar, '^K') then
dbms_output.put_line(testvar || ' matches ''^K''');
end if;
end;
Kittens matches '^K'
PL/SQL procedure successfully completed.
Include some test data and I'll try to see what's not working as expected. For example,
declare
p_skip_level number := 2;
org_level number := 3;
begin
if regexp_like (p_skip_level, '(^|,)' || org_level || '($|,)')
then
dbms_output.put_line('Matched');
else
dbms_output.put_line('Not matched');
end if;
end;

PL/SQL: how to make my code shorter

Can anyone help me make my code shorter? (Notice that both branches of the IF/ELSIF statement run the same SELECT, so I wish there were a way to run this SELECT once and update with 1 or 0 depending on pilot_action.)
Below is my code.
create or replace
PROCEDURE F_16 (TRK_ID NUMBER, pilot_action NUMBER) IS
BEGIN
BEGIN
IF pilot_action=0 THEN
UPDATE "ControlTow"
SET "Intention"=0
WHERE "Id" IN (
SELECT "Id" FROM "ControlTow" WHERE "Id"=TRK_ID );
ELSIF pilot_action=1 THEN
UPDATE "ControlTow"
SET "Intention"=1
WHERE "Id" IN (
SELECT "Id" FROM "ControlTow" WHERE "Id"=TRK_ID );
END IF;
EXCEPTION
WHEN NO_DATA_FOUND THEN dbms_output.put_line('False Alarm');
COMMIT;
END;
END F_16;
Thank you in advance.

Your code has several issues I have addressed in the comments below. Note that transaction management is not discussed as it's not clear based on the question when commit/rollback should take place.
-- #1 use of explicit parameter mode
create or replace procedure f_16(p_trk_id in number, p_pilot_action in number) is
begin
-- #2 use of in
if p_pilot_action in (0, 1)
then
-- #3 unnecessary subquery removed
update controltow
set intention = p_pilot_action
where id = p_trk_id;
-- #4 use pl/sql implicit cursor attribute to check the number of affected rows
if sql%rowcount = 0
then
dbms_output.put_line('false alarm');
end if;
end if;
end;
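The reason for point #4 is that an UPDATE matching no rows does not raise NO_DATA_FOUND in PL/SQL, so the affected-row count has to be checked explicitly. The sketch below shows the same control flow in Python with an in-memory SQLite database as a stand-in for Oracle, where `cursor.rowcount` plays the role of `sql%rowcount`. The function name `set_intention` and the two-column table layout are assumptions made for the example.

```python
import sqlite3

def set_intention(conn: sqlite3.Connection, trk_id: int, pilot_action: int) -> bool:
    """Return True if a row was updated, False otherwise."""
    if pilot_action not in (0, 1):
        return False
    cur = conn.execute(
        "UPDATE controltow SET intention = ? WHERE id = ?",
        (pilot_action, trk_id),
    )
    # rowcount is the number of rows the UPDATE touched (like sql%rowcount).
    return cur.rowcount > 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE controltow (id INTEGER PRIMARY KEY, intention INTEGER)")
conn.execute("INSERT INTO controltow (id, intention) VALUES (1, 0)")

print(set_intention(conn, 1, 1))   # True: row 1 exists and was updated
print(set_intention(conn, 99, 1))  # False: no such row, the "false alarm" case
```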

Since you seem to be assigning pilot_action to Intention, I would do the following:
create or replace
PROCEDURE F_16 (TRK_ID NUMBER, pilot_action NUMBER) IS
BEGIN
BEGIN
IF pilot_action IN (0, 1) THEN
-- if the only condition in subselect is the ID then use it directly
UPDATE "ControlTow"
SET "Intention"= pilot_action
WHERE "Id"=TRK_ID;
-- if there are more conditions than just the ID then subselect may be the way to go
--(hard to say without more information)
-- WHERE "Id" IN (
-- SELECT "Id" FROM "ControlTow" WHERE "Id"=TRK_ID AND ... )
ELSE
Null; -- do whatever you need in this case. Raise exception?
END IF;
EXCEPTION
WHEN NO_DATA_FOUND THEN dbms_output.put_line('False Alarm');
COMMIT;
END;
END F_16;
EDIT: As @user272735 said, there was room for more improvement in the code, specifically rewriting the IF condition to use IN and simplifying the WHERE clause (supposing Id is really the only condition for selecting the rows to be updated).

PL/SQL optimize searching a date in varchar

I have a table that contains a date field (let it be date s_date) and a description field (varchar2(n) desc). I need to write a script (or a single query, if possible) that parses the desc field and, if it contains a valid Oracle date, extracts this date and updates s_date if it is null.
There is one more condition: there must be exactly one occurrence of a date in desc. If there are 0 or more than 1, nothing should be updated.
So far I have come up with this pretty ugly solution using regular expressions:
----------------------------------------------
create or replace function to_date_single( p_date_str in varchar2 )
return date
is
l_date date;
pRegEx varchar(150);
pResStr varchar(150);
begin
pRegEx := '((0[1-9]|[12][0-9]|3[01])[.](0[1-9]|1[012])[.](19|20)\d\d)((.|\n|\t|\s)*((0[1-9]|[12][0-9]|3[01])[.](0[1-9]|1[012])[.](19|20)\d\d))?';
pResStr := regexp_substr(p_date_str, pRegEx);
if not (length(pResStr) = 10)
then return null;
end if;
l_date := to_date(pResStr, 'dd.mm.yyyy');
return l_date;
exception
when others then return null;
end to_date_single;
----------------------------------------------
update myTable t
set t.s_date = to_date_single(t.desc)
where t.s_date is null;
----------------------------------------------
But it works extremely slowly (more than a second per record, and I need to update about 30000 records). Is it possible to optimize the function somehow? Maybe there is a way to do this without regexp? Any other ideas?
Any advice is appreciated :)
EDIT:
OK, maybe this will be useful for someone. The following regular expression checks for a valid date (DD.MM.YYYY), taking into account the number of days in each month, including the check for leap years:
(((0[1-9]|[12]\d|3[01])\.(0[13578]|1[02])\.((19|[2-9]\d)\d{2}))|((0[1-9]|[12]\d|30)\.(0[13456789]|1[012])\.((19|[2-9]\d)\d{2}))|((0[1-9]|1\d|2[0-8])\.02\.((19|[2-9]\d)\d{2}))|(29\.02\.((1[6-9]|[2-9]\d)(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00))))
I used it with the query suggested by @David (see accepted answer), but I tried a select instead of an update (so it's one regexp less per row, because we don't call regexp_substr), just for benchmarking purposes.
The numbers probably won't tell much here, because it all depends on the hardware, software and specific DB design, but it took about 2 minutes to select 36K records for me. The update will be slower, but I think it will still take a reasonable time.

I would refactor it along the lines of a single update query.
Use two regexp_instr() calls in the where clause to find rows for which a first occurrence of the match occurs and a second occurrence does not, and regexp_substr() to pull the matching characters for the update.
update my_table
set my_date = to_date(regexp_substr(desc,...),...)
where regexp_instr(desc,pattern,1,1) > 0 and
regexp_instr(desc,pattern,1,2) = 0
You might get even better performance with:
update my_table
set my_date = to_date(regexp_substr(desc,...),...)
where case regexp_instr(desc,pattern,1,1)
when 0 then 'N'
else case regexp_instr(desc,pattern,1,2)
when 0 then 'Y'
else 'N'
end
end = 'Y'
... as it only evaluates the second regexp if the first is non-zero. The first query might also do that but the optimiser might choose to evaluate the second predicate first because it is an equality condition, under the assumption that it's more selective.
Or reordering the Case expression might be better -- it's a trade-off that's difficult to judge and probably very dependent on the data.

I don't think there is a way to speed this task up. In fact, to achieve what you want it would have to get even slower.
Your regular expression matches text like 31.02.2013 or 31.04.2013, where the day is outside the range of the month. If you take the year into account,
it gets even worse: 29.02.2012 is valid, but 29.02.2013 is not.
That's why you have to test whether the result is a valid date.
Since there isn't a full regular expression for that, you really have to do it in PL/SQL.
In your to_date_single function you return null when an invalid date is found.
But that doesn't mean there won't be other valid dates later in the text.
So you have to keep trying until you either find two valid dates or hit the end of the text:
create or replace function fn_to_date(p_date_str in varchar2) return date is
l_date date;
pRegEx varchar(150);
pResStr varchar(150);
vn_findings number;
vn_loop number;
begin
vn_findings := 0;
vn_loop := 1;
pRegEx := '((0[1-9]|[12][0-9]|3[01])[.](0[1-9]|1[012])[.](19|20)\d\d)';
loop
pResStr := regexp_substr(p_date_str, pRegEx, 1, vn_loop);
if pResStr is null then exit; end if;
begin
l_date := to_date(pResStr, 'dd.mm.yyyy');
vn_findings := vn_findings + 1;
-- your crazy requirement :)
if vn_findings = 2 then
return null;
end if;
exception when others then
null;
end;
-- you have to keep trying :)
vn_loop := vn_loop + 1;
end loop;
return l_date;
end;
Some tests:
select fn_to_date('xxxx29.02.2012xxxxx') c1 --ok
, fn_to_date('xxxx29.02.2012xxx29.02.2013xxx') c2 --ok, 2nd is invalid
, fn_to_date('xxxx29.02.2012xxx29.02.2016xxx') c3 --null, both are valid
from dual
Since you are going to have to use trial and error anyway, one idea would be to use a simpler regular expression.
Something like \d\d[.]\d\d[.]\d\d\d\d would suffice. That would depend on your data, of course.
Using @David's idea you could reduce the number of rows the slow to_date_single function is applied to,
but regular expressions alone won't do what you want:
update my_table
set my_date = fn_to_date( )
where regexp_instr(desc,pattern,1,1) > 0
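The "loose pattern plus real date validation" approach described above can be sketched outside the database as well. The Python version below is only an illustration of the logic, not a replacement for the PL/SQL function: it scans for DD.MM.YYYY-shaped candidates, lets the date parser decide which ones are real dates, and gives up as soon as a second valid date turns up. The name `single_valid_date` is hypothetical.

```python
import re
from datetime import date, datetime
from typing import Optional

# Loose candidate pattern; calendar validity is left to the date parser.
DATE_CANDIDATE = re.compile(r"\d\d\.\d\d\.\d\d\d\d")

def single_valid_date(text: str) -> Optional[date]:
    """Return the only valid date in text, or None if there are 0 or 2+."""
    found = None
    for match in DATE_CANDIDATE.finditer(text):
        try:
            parsed = datetime.strptime(match.group(), "%d.%m.%Y").date()
        except ValueError:
            continue  # shaped like a date but not a real one, e.g. 29.02.2013
        if found is not None:
            return None  # a second valid date disqualifies the text
        found = parsed
    return found

print(single_valid_date("xxxx29.02.2012xxxxx"))
print(single_valid_date("xxxx29.02.2012xxx29.02.2013xxx"))
print(single_valid_date("xxxx29.02.2012xxx29.02.2016xxx"))
```

The three calls mirror the test cases of fn_to_date: the first two yield 29 February 2012, the third yields no date because both candidates are valid.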
