PL/SQL key-value String using Regex

I have a String stored in a table in the following key-value format: "Key1☺Value1☺Key2☺Value2☺KeyN☺ValueN☺".
Given a Key how can I extract the Value? Is regex the easiest way to handle this? I am new to PL/SQL as well as Regex.

In this case, I would use just a regular split and iterate through the resulting array.
public string GetValue(string keyValuePairedInput, string key, char separator)
{
    // The input ends with a separator, so Split yields one trailing empty entry;
    // looping while i + 1 < Length skips it instead of treating it as a key.
    var split = keyValuePairedInput.Split(separator);
    for (int i = 0; i + 1 < split.Length; i += 2)
    {
        if (split[i] == key)
            return split[i + 1];
    }
    throw new KeyNotFoundException();
}
(this was not compiled and is not pl/sql anyway, treat it as pseudocode ☺)
OK I hear your comment...
Making use of pl/sql functions, you might be able to use something like this:
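-- The separator is written as CHR(1), assuming ☺ is how character code 1 is displayed;
-- use '☺' instead if it is a literal character. 'Key2' is the key being looked up.
select substr(kv, valueIndex, instr(kv, sep, valueIndex) - valueIndex) as value
from (select keyValueStringField as kv, CHR(1) as sep,
             instr(CHR(1) || keyValueStringField, CHR(1) || 'Key2' || CHR(1)) + length('Key2') + 1 as valueIndex
      from your_table) -- placeholder table name
where valueIndex > length('Key2') + 1 -- instr returned 0, so the key is not present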

For this kind of string slicing and dicing in PL/SQL you will probably have to use regular expressions. Oracle has a number of regular expression functions you can use. The most commonly used one is REGEXP_LIKE which is very similar to the LIKE operator but does RegEx matching.
However you probably need to use REGEXP_INSTR to find the positions where the separators are then use the SUBSTR function to slice up the string at the matched positions. You could also consider using REGEXP_SUBSTR which does the RegEx matching and slicing in one step.
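To make the "match and slice in one step" idea concrete, here is a minimal sketch written in Python rather than PL/SQL, purely so it can be run anywhere. The helper name `get_value` and the single-character separator are assumptions for illustration, and Oracle's regex flavor differs from Python's, so the pattern would need adapting before being handed to REGEXP_SUBSTR.

```python
import re

SEP = "☺"  # separator taken from the question's example string


def get_value(pairs, key, sep=SEP):
    """Return the value stored for `key`, or None if the key is absent.

    Assumes `sep` is a single character and the string has the form
    key<sep>value<sep>key<sep>value<sep>... with a trailing separator.
    """
    s = re.escape(sep)
    # Lazily skip whole key/value pairs so that `key` is only ever compared
    # against text in a key position, never against a value.
    pattern = rf"(?:[^{s}]*{s}[^{s}]*{s})*?{re.escape(key)}{s}([^{s}]*){s}"
    match = re.match(pattern, pairs)
    return match.group(1) if match else None


data = "Key1☺Value1☺Key2☺Value2☺KeyN☺ValueN☺"
print(get_value(data, "Key2"))
# Here the first "Key2" is a value (of key "A"), not a key.
print(get_value("A☺Key2☺Key2☺B☺", "Key2"))
```

Skipping pairs from the start of the string is what keeps a value that happens to equal the key from being mistaken for it; a plain search for key plus separator would not make that distinction.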

As an alternative to regular expressions...
Assuming you have an input such as this:
Key1,Value1|Key2,Value2|Key3,Value3
You could use some PL/SQL as shown below:
FUNCTION get_value_by_key
(
    p_str           VARCHAR2
  , p_key           VARCHAR2
  , p_kvp_separator VARCHAR2
  , p_kv_separator  VARCHAR2
) RETURN VARCHAR2
AS
    v_key   VARCHAR2(32767);
    v_value VARCHAR2(32767);
    v_which NUMBER;           -- 0 = reading a key, 1 = reading a value
    v_cur   VARCHAR2(1 CHAR); -- CHAR semantics so a multi-byte character fits
BEGIN
    v_which := 0;
    FOR i IN 1..length(p_str)
    LOOP
        v_cur := substr(p_str,i,1);
        IF v_cur = p_kvp_separator
        THEN
            IF v_key = p_key
            THEN
                EXIT;
            END IF;
            v_key := '';
            v_value := '';
            v_which := 0;
        ELSIF v_cur = p_kv_separator
        THEN
            v_which := 1;
        ELSE
            IF v_which = 0
            THEN
                v_key := v_key || v_cur;
            ELSE
                v_value := v_value || v_cur;
            END IF;
        END IF;
    END LOOP;
    IF v_key = p_key
    THEN
        RETURN v_value;
    END IF;
    raise_application_error(-20001, 'key not found!');
END;
To get the value for 'Key2' you could do this (assuming your function was in a package called test_pkg):
SELECT test_pkg.get_value_by_key('Key1,Value1|Key2,Value2|Key3,Value3','Key2','|',',') FROM dual

Related

PL/SQL. Parse clob UTF8 chars with regexp_like regular expressions

I want to check whether any line of my CLOB contains strange characters such as ñ or §. The lines are read from a CSV file with an unexpected encoding (UTF-8), which garbles some of the characters.
I tried to filter each line with a regular expression, but it is not working as intended. Is there a way to detect the encoding of a CSV file when it is read?
How can I fix the regular expression so that it accepts only lines made of these characters: a-zA-Z 0-9 .,;:"'()-_& space tab?
CLOB example read from the CSV:
l_clob clob :='
"exp","objc","objc","OBR","031110-5","S","EXAMPLE","NAME","08/03/2018",,"122","3","12,45"
"xp","objc","obj","OBR","031300-5","S","EXAMPLE","NAME","08/03/2018",,"0","0","0"
';
Another clob:
DECLARE
    l_clob CLOB
        := '"exp","objc","objc","OBR","031110-5","S","EXAMPLE","NAME","08/03/2018",,"122","3","12,45"
"xp","objc","obj","OBR","031300-5","S","EXAMPLE","NAME","08/03/2018",,"0","0","0"';
    l_offset PLS_INTEGER := 1;
    l_line VARCHAR2 (32767);
    csvregexp CONSTANT VARCHAR2 (1000)
        := '^([''"]+[-&\s(a-z0-9)]*[''"]+[,:;\t\s]?)?[''"]+[-&\s(a-z0-9)]*[''"]+' ;
    l_total_length PLS_INTEGER := LENGTH (l_clob);
    l_line_length PLS_INTEGER;
BEGIN
    WHILE l_offset <= l_total_length
    LOOP
        l_line_length := INSTR (l_clob, CHR (10), l_offset) - l_offset;
        IF l_line_length < 0
        THEN
            l_line_length := l_total_length + 1 - l_offset;
        END IF;
        l_line := SUBSTR (l_clob, l_offset, l_line_length);
        IF REGEXP_LIKE (l_line, csvregexp, 'i')
        THEN -- i (case insensitive matches)
            DBMS_OUTPUT.put_line ('Ok');
            DBMS_OUTPUT.put_line (l_line);
        ELSE
            DBMS_OUTPUT.put_line ('Error');
            DBMS_OUTPUT.put_line (l_line);
        END IF;
        l_offset := l_offset + l_line_length + 1;
    END LOOP;
END;
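If you only want to allow those specific characters, you can use this regex:
Your Regex
csvregexp CONSTANT VARCHAR2 (1000) := '^[a-zA-Z0-9 .,;:"''()_&' || CHR (9) || '-]+$'; -- hyphen placed last so it is literal rather than the range )-_ ; CHR (9) is the tab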
Regex-Details
^ Start of your string - no chars before this - prevents partial match
[] a set of allowed chars
[]+ a set of allowed chars. Has to be one char minimum up to inf. (* instead of + would mean 0-inf.)
[a-zA-Z]+ 1 to inf. letters
[a-zA-Z0-9]+ 1 to inf. letters and numbers
$ end of your string - no chars behind this - prevents partial match
I think you can work it out with this ;-)
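One detail worth checking with a bracket list like this: a hyphen between two characters is read as a range. If the allowed list from the question is typed in the order given there, `)-_` becomes the range from `)` to `_`, which also covers `*`, `+`, `/`, `<`, `=`, `>`, `?`, `@` and more. The sketch below uses Python's `re` module only as a convenient way to try this out (bracket ranges behave the same way here as in POSIX-style classes); the names `naive` and `strict` are made up for the illustration.

```python
import re

# The allowed list typed in the order given in the question: ")-_" is a range.
naive = re.compile(r"""^[a-zA-Z 0-9 .,;:"'()-_&]+$""")
# Same list with the hyphen moved to the end (literal) and the tab added.
strict = re.compile(r"""^[a-zA-Z0-9 .,;:"'()_&\t-]+$""")

for sample in ['"exp","031110-5","12,45"', "a<b>c", "1+1=2"]:
    print(repr(sample), bool(naive.match(sample)), bool(strict.match(sample)))
```

Both patterns accept the CSV-style sample, but only the first one lets `<`, `>`, `+` and `=` through.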
If you know there could be another encoding in your input, you could try to convert it and check against the regex again.
Example-convert
select convert('täst','us7ascii', 'utf8') from dual;
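As a rough illustration of the "convert and check again" idea, here is a small Python sketch (not Oracle code). It assumes the only two candidates are UTF-8 and Latin-1, and `decode_guess` is a hypothetical helper name. This is a heuristic, not real detection: a file containing only ASCII is valid in both encodings, and Latin-1 never fails because every byte maps to some character.

```python
def decode_guess(raw):
    """Try strict UTF-8 first; fall back to Latin-1, which accepts any byte."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("latin-1"), "latin-1"


utf8_bytes = "ñ§".encode("utf-8")

# What the characters look like when UTF-8 bytes are read as a single-byte charset.
print(utf8_bytes.decode("latin-1"))
print(decode_guess(utf8_bytes))
print(decode_guess("ñ§".encode("latin-1")))
```

The first print shows the kind of "strange characters" the question describes: each accented character turns into two unrelated ones.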

How to use REGEXP_LIKE in trigger's When condition?

create or replace trigger emp_trig
before insert or update of salary on emp
for each row
when `REGEXP_LIKE(:new.job_id, 'ac*','i')` -- Here
BEGIN
    IF inserting then
        :new.commission_pct := 0.20;
    elsif (:old.commission_pct is null) then
        :new.commission_pct := 0.1;
    END IF;
END;
The WHEN clause has to be enclosed in parentheses, and inside it the row is referenced as new/old without the leading colon:
create or replace trigger emp_trig
before insert or update of salary on emp
for each row
when (REGEXP_LIKE(new.job_id, 'ac*','i'))
BEGIN
    IF inserting then
        :new.commission_pct := 0.20;
    elsif (:old.commission_pct is null) then
        :new.commission_pct := 0.1;
    END IF;
END;
/
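A side note on the pattern itself: `ac*` means "an a followed by zero or more c", and an unanchored match succeeds anywhere in the value, so with the case-insensitive flag it is true for every job_id that contains the letter a. If the intent is "starts with AC", the pattern needs an anchor. The Python sketch below shows the difference; the job IDs are sample values chosen for illustration, and Python's `re` is used only because this part of the syntax behaves the same way.

```python
import re

job_ids = ["AC_MGR", "SA_MAN", "IT_PROG"]  # illustrative sample values

for job_id in job_ids:
    # 'a' followed by zero or more 'c', anywhere in the string
    anywhere = bool(re.search(r"ac*", job_id, re.IGNORECASE))
    # the string must start with 'ac'
    prefix = bool(re.search(r"^ac", job_id, re.IGNORECASE))
    print(job_id, anywhere, prefix)
```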
Hey. If you are only trying to do a simple match, avoid regular expressions and use LIKE with your test condition instead. The snippet below shows a simple example that should cover your requirement. Hope it helps.
CREATE OR REPLACE TRIGGER emp_trig
BEFORE INSERT OR UPDATE OF sal ON emp
FOR EACH ROW
WHEN (new.job LIKE '%TEST%')
DECLARE
BEGIN
    IF inserting THEN
        :new.comm := 0.20;
    ELSIF (:old.comm IS NULL) THEN
        :new.comm := 0.1;
    END IF;
END;

Handle special characters during textfile import into OracleDB via Apex

I'm working on a tool that imports textfiles into a BLOB column (OracleDB). This is handled via an Apex page with a File Browse button and connected import procedure.
For more details about the import to BLOB procedure: http://ittichaicham.com/2011/03/file-browser-in-apex-4-with-blob-column-specified-in-item-source-attribute/
The textfiles that I'm using contain special characters, null values, decimal separators etc. For example:
(...) 111888|Overflakkée, Blabla|streetname with Rhône||12-13|UXC
Placename (...)
Since it's all character data, I'm converting the BLOB to CLOB with this procedure:
FUNCTION blob_to_clob (blob_in IN BLOB)
RETURN CLOB
AS
    v_clob CLOB;
    v_varchar VARCHAR2(32767);
    v_start PLS_INTEGER := 1;
    v_buffer PLS_INTEGER := 32767;
BEGIN
    DBMS_LOB.CREATETEMPORARY(v_clob, TRUE);
    FOR i IN 1..CEIL(DBMS_LOB.GETLENGTH(blob_in) / v_buffer)
    LOOP
        v_varchar := UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(blob_in, v_buffer, v_start));
        DBMS_LOB.WRITEAPPEND(v_clob, LENGTH(v_varchar), v_varchar);
        v_start := v_start + v_buffer;
    END LOOP;
    RETURN v_clob;
END blob_to_clob;
For more info, see: http://www.dba-oracle.com/t_convert_blob_to_clob_script.htm
The problem:
While converting the blob to clob, some of the special characters are lost/altered.
For example, this row:
(...) 111888|Overflakkée, Blabla|streetname with Rhône||12-13|UXC
Placename (...)
will become this row:
(...) 111888|Overflakk� Blabla|streetname with Rh�|12-13|UXC
Placename (...)
Row length, characters and even separators (in this case a '|') are altered or no longer visible.
Is there a way to recover the lost characters and keep separators and null values in place? (If it's necessary to change 'é' to 'e', that's fine.)
Is there a more efficient way to import textfiles into a BLOB/CLOB column?
Regards
You need to convert from the source character set to the character set of the database.
Here is an example I made (mainly to work with big JSON objects, since JavaScript is UTF-8, in an 8859P1 database). It's pretty simple, so I won't explain it too much.
example usage with conversion:
l_clob := blob_to_clob (l_blob, '1');
Function:
function blob_to_clob (blob_in in blob, p_convertutf8 in char default '0')
return clob as
    /* Ólafur Tryggvason */
    l_clob clob;
    l_varchar varchar2 (32767);
    l_start pls_integer := 1;
    l_buffer pls_integer := 32767;
    l_characterset nls_database_parameters.value%type;
begin
    -- look up the database character set to convert into
    select value
    into l_characterset
    from nls_database_parameters
    where parameter = 'NLS_CHARACTERSET';
    dbms_lob.createtemporary (l_clob, true);
    for i in 1 .. ceil (dbms_lob.getlength (blob_in) / l_buffer) loop
        l_varchar := utl_raw.cast_to_varchar2 (dbms_lob.substr (blob_in, l_buffer, l_start));
        if p_convertutf8 = '1' then
            l_varchar := convert (l_varchar, l_characterset, 'UTF8'); -- WE8ISO8859P1
        end if;
        dbms_lob.writeappend (l_clob, length (l_varchar), l_varchar);
        l_start := l_start + l_buffer;
    end loop;
    return l_clob;
end blob_to_clob;
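One thing to watch when converting a BLOB in fixed-size byte chunks, as the functions above do: in UTF-8 a character such as é takes two bytes, so a chunk boundary can fall in the middle of a character and each half is then converted on its own. The Python sketch below (not Oracle code) reproduces that effect with a deliberately tiny chunk size of 10 bytes and shows one way around it, an incremental decoder that carries incomplete bytes over to the next chunk. The function name `decode_in_chunks` and the chunk size are assumptions for the demonstration.

```python
import codecs


def decode_in_chunks(data, chunk_size, encoding="utf-8"):
    """Decode `data` chunk by chunk without breaking multi-byte characters."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    for start in range(0, len(data), chunk_size):
        # The incremental decoder buffers an incomplete trailing byte sequence
        # and completes it when the next chunk arrives.
        parts.append(decoder.decode(data[start:start + chunk_size]))
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


raw = "Overflakkée|Rhône".encode("utf-8")

# Naive approach: decode every 10-byte chunk on its own.
naive = "".join(
    raw[i:i + 10].decode("utf-8", errors="replace") for i in range(0, len(raw), 10)
)

print(naive)
print(decode_in_chunks(raw, 10))
```

In the naive result the é is replaced by two replacement characters because its two bytes landed in different chunks. On the Oracle side, DBMS_LOB.CONVERTTOCLOB accepts the source character set ID of the BLOB and may be worth looking at as an alternative to hand-written chunking.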

Delphi - How can I extract the digits from a character string?

I am developing a program that validates a CPF, a type of identity document in my country. I have already done all the math, but in the input Edit1 the user will type something like:
123.456.789-00
I need to get only the digits, without the hyphen and the dots, for my calculations to work.
I'm a newbie with Delphi, but I think this is simple. How can I do that? Thanks to all.
You can use
text := '123.456.789-00';
text := TRegEx.Replace(text, '\D', ''); // TRegEx is declared in System.RegularExpressions
Here, \D matches any non-digit character, and each match is replaced with an empty string.
The result is 12345678900.
Using David's suggestion, iterate your input string and remove characters that aren't numbers.
{$APPTYPE CONSOLE}
function GetNumbers(const Value: string): string;
var
  ch: char;
  Index, Count: integer;
begin
  SetLength(Result, Length(Value));
  Count := 0;
  for Index := 1 to length(Value) do
  begin
    ch := Value[Index];
    if (ch >= '0') and (ch <='9') then
    begin
      inc(Count);
      Result[Count] := ch;
    end;
  end;
  SetLength(Result, Count);
end;
begin
  Writeln(GetNumbers('123.456.789-00'));
  Readln;
end.

Checking for a pure string using regexp_like

I need to check a "substring of the first 6 characters" of an input string for a pure string.
declare
    p_str varchar2(30) := 'ABCD1240';
    l_result varchar2(20);
begin
    if REGEXP_LIKE(substr(p_str,1,6), '[[:alpha:]]') then
        dbms_output.put_line('It is a pure string');
    else
        dbms_output.put_line('It is an alphanumeric');
    end if;
end;
/
I can see that the first 6 characters of the string ABCD1290 is alphanumeric as it contains 12.
But, the output that is printed says otherwise.
Am I doing something wrong with the "alpha" in regexp_like ?
I thought alpha was supposed to be pure characters and not numbers.
Here, ABCD1290 should give me: alphanumeric as output.
ABCDXY90 should be : pure string
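REGEXP_LIKE returns true if the pattern matches anywhere in the string, and '[[:alpha:]]' matches a single letter, so any value containing at least one letter passes your test. Anchor the pattern and require all six characters instead. Note that \D below means "not a digit", so punctuation also counts; '^[[:alpha:]]{6}' is the stricter, letters-only form. Try this: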
declare
    l_res varchar2(100);
begin
    for i in (select 'abcdef123' val from dual union
              select '123abc123' from dual union
              select '123456abc' from dual)
    loop
        if REGEXP_LIKE(i.val, '^\D{6}')
        then
            l_res := 'alpha';
        else
            l_res := 'numeric';
        end if;
        dbms_output.put_line(i.val || ' is ' || l_res);
    end loop;
end;
Output:
123456abc is numeric
123abc123 is numeric
abcdef123 is alpha
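The difference between the three checks discussed here (any letter present, six letters, six non-digits) can be tried out quickly with the Python sketch below. It is not Oracle code: Python's `re` has no `[[:alpha:]]`, so `[A-Za-z]` stands in for it as an ASCII-only assumption, and `classify` is a hypothetical helper name.

```python
import re


def classify(value):
    """Apply three different tests to the first six characters of `value`."""
    head = value[:6]
    any_letter = bool(re.search(r"[A-Za-z]", head))          # true if ANY letter is present
    six_letters = bool(re.fullmatch(r"[A-Za-z]{6}", head))   # all six must be letters
    six_non_digits = bool(re.fullmatch(r"\D{6}", head))      # all six must be non-digits
    return any_letter, six_letters, six_non_digits


for value in ["ABCD1290", "ABCDXY90", "AB-CD_90"]:
    print(value, classify(value))
```

The first test is what the original `'[[:alpha:]]'` pattern does, which is why ABCD1290 was reported as a pure string. The last sample shows where `\D` and a letters-only class disagree: punctuation is a non-digit but not a letter.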