Oracle to Vertica reqexp_replace - keep just numbers and alphas - regex

I am using the following expression in Oracle and it performs as expected returning only alpha numeric characters. When I try to use it in Vertica I get null. Any wisdom out there?
trim(upper(regexp_replace(PATIENT_CITY,'[^[:alpha:]^[:alnum:]'' '']', NULL))),

Why are you setting the fourth parameter (position) to NULL?
The following statement works well:
SELECT regexp_replace('451 #$%!^ 657asdsg', '[^[:alpha:][:alnum:]'' '']')
Note the second caret ("^") should not be there (even for Oracle).

Related

OpenModelica SimulationOptions 'variableFilter' not working with '^' exceptions

To reduce size of my simulation output files, I want to give variable name exceptions instead of a list of many certain variables to the simulationsOptions/outputFilter (cf. OpenModelica Users Guide / Output) of my model. I found the regexp operator "^" to fullfill my needs, but that didn't work as expected. So I think that something is wrong with the interpretation of connected character strings when negated.
Example:
When I have any derivatives der(...) in my model and use variableFilter=der.* the output file will contain all the filtered derivatives. Since there are no other varibles beginning with character d the same happens with variableFilter=d.*. For testing I also tried variableFilter=rde.* to confirm that every variable is filtered.
When I now try to except by variableFilter=^der.*, =^rde.* or =^d.*, I get exactly the same result as without using ^. So the operator seems to be ignored in this notation.
When I otherwise use variableFilter=[^der].*, =[^rde].* or even =[^d].*, all wanted derivation variables are filtered from the ouput, but there is no difference between those three expressions above. For me it seems that every character is interpretated standalone and not as as a connected string.
Did I understand and use the regexp usage right or could this be a code bug?
Side/follow-up question: Where can I officially report this for software revision?
_
OpenModelica v.1.19.2 (64-bit)

Trying to write an SQL query with regexp_matches() look behind positive in postgresql

From a PostgreSQL database, I'm trying to match 6 or more digits that come after a string that looks like "(OCoLC)" and I thought I had a working regular expression that would fit that description:
(?<=\(ocolc\))[0-9]{6,}
Here are some strings that it should return the digits for:
|a(OCoLC)08507541 will return 08507541
|a(OCoLC)174097142 will return 174097142
etc...
This seems to work to match strings when I test it on regex101.com, but when I incorporate it into my query:
SELECT
regexp_matches(v.field_content, '(?<=\(ocolc\))[0-9]{6,}', 'gi')
FROM
varfield as v
LIMIT
1;
I get this message:
ERROR: invalid regular expression: quantifier operand invalid
I'm not sure why it doesn't seem to like that expression.
UPDATE
I ended up just resorting to using a case statement, as that seemed to be the best way to work around this...
SELECT
CASE
WHEN v.field_content ~* '\(ocolc\)[0-9]{6,}'
THEN (regexp_matches(v.field_content, '[0-9]{6,}', 'gi'))[1]
ELSE v.field_content
END
FROM
varfield as v
as electricjelly noted, I'm kind of after just the numeric characters, but they have to be preceded by the "(OCoLC)" string, or they're not exactly what I'm after. This is part of a larger query, so I'm running a second case statement a boolean flag in cases where the start of the string wasn't "(OCoLC)". These seems to be more helpful anyway, as I'm going to probably want to preserve those other values somehow.
After looking over your question it seems your error is caused from a syntax problem, not so much from the function not being available on your version of PostgreSQl, as I tested it on 9.6 and I received the same error.
However, what you seem to want is to pull the numbers from a given field as in
|a(OCoLC)08507541 becomes 08507541
an easy way you could accomplish this would be to use regex_replace
the function would be:
regexp_replace('table.field', '\D', '', 'g')
the \D in the function finds all non-numbers and replaces it with a nothing (hence the '') and returns everything else
It looks like after doing some more searching, this is only a feature of versions of PostgreSQL server >= 9.6
https://www.postgresql.org/docs/9.6/static/functions-matching.html#POSIX-CONSTRAINTS-TABLE
The version I am running is version 9.4.6
https://www.postgresql.org/message-id/E1ZsIsY-0006z6-6T#gemulon.postgresql.org
So, the answer is it's not available for this version of PostgreSQL, but presumably this would work just fine in the latest version of the server.

Untranslatable character when extracting dates from strings

I am attempting to extract dates from a free-text field (because our process is awesome like that :\ ) and keep hitting Teradata error 6706. The regex I'm using is: REGEXP_SUBSTR(original_field,'(\d{2})\/(\d{2})\/(\d{4})',1) AS new_field. I'm unsure of the field's type HELP TABLE has a blank in the Type column for the field.
I've already tried converting using TRANSLATE(col USING LATIN_TO_UNICODE), as well as UNICODE_TO_LATIN, those both actually cause the error by themselves. A straight CAST(original_field AS VARCHAR(255)) doesn't fix the issue, though that cast does work. I've also tried stripping various special characters (new-line, carriage return, etc.) from the field before letting the REGEXP_SUBSTR take a crack at it, both by itself and with the CAST & TRANSLATEs I already mentioned.
At this point I'm not sure what the issue could be, and could use some guidance on additional options to try.
The final version that worked ended up being
, CASE
WHEN TRANSLATE_CHK(field USING LATIN_TO_UNICODE) = 0 THEN
REGEXP_SUBSTR(TRANSLATE(field USING LATIN_TO_UNICODE),'(\d{2})\/(\d{2})\/(\d{4})',1)
ELSE NULL
END AS Ref_Date
For whatever reason, using a TRIM inside the TRANSLATE seems to cause an issue. Only once I striped any and all functions from inside the TRANSLATE did the TRANSLATE, and thus the REGEXP_SUBSTR, work.

Remove or replace '�' character in Informatica

We have a requirement wherein we need to replace or remove '�' character (which is an unrecognizable, undefined character) present in our source. While running my workflow it runs successfully but when i check the records in target they are not committed. I get the following error in Informatica
Error executing query for record 37: 6706: The string contains an untranslatable character.
I tried functions like replace_chr, reg_replace, replace_str etc., but none seems to be working. Kindly advise on how to get rid of this. Any reply is greatly appreciated.
You need to use in your schema definitions charset=> utf8-unidode-ci
but now you can do:
UPDATE tablename
SET columnToCheck = REPLACE(CONVERT(columnToCheck USING ascii), '?', '')
WHERE ...
or
update tablename
set columnToCheck = replace(columnToCheck , char(146), '');
Replace NonASCII Characters in MYSQL
You can replace the special characters in an expression transformation.
REPLACESTR(1,Column_Name,'?',NULL)
REPLACESTR - Function
1 - Position
Column_Name - Column name which has a special character
? - Special character
NULL - Replacing character
You need to fetch rows with the appropriate character set defined on your connection. What is the connection you're using, ODBC or native? What's the DB?
Special characters are a challenge and having checked the informatica network I can see there is a kludge involving replace_str setting first a variable to the string with all non special characters first and then using the resulting variable in a replace_str so that the final value has only the allowed characters https://network.informatica.com/thread/20642 (awesome workaround by nico so long as you can positively identify every character that should be allowed) ...
As an alternate kludge I would also attempt something using an xml transformation somewhere within the mapping as informatica conveniently converts special characters to encoded (decimal or hex I cant remember) values... so long as you can live with these encoded values appearing in your target text you should be fine ( and build some extra space into your strings to accommodate any bloatage from the extra characters

How to check the field value value is blank and make it to 0 in informatica powercenter?

I am currently working on a project where there is a scenario in which I have to check the field value(in decimal data type) is blank or not, if it is blank I have to make that particular blank value into 0. How to bring out this logic using expression transformation in informatica powercenter 9.6.1?
Create an output port with the expression:
IIF(ISNULL(field),0,field)
Create a new output port in the Expression transformation and use either of the two expressions.
Expression 1:
Decode(Field,NULL,0,Field)
Expression 2:
IIF(ISNULL(Field),0,Field)
Since the question speaks about blank values as well, you might want to check for '' apart from nulls as mentioned in other answers:
IIF(ISNULL(FIELD) OR LTRIM(RTRIM(FIELD))='',0,FIELD)