How to remove the space between the minus sign and the number in Informatica - informatica

I have an issue where an amount field has data like
(- 98765.00), i.e. minus{spaces}{numbers}. I need to remove the space between the minus and the number and get it as (-98765.00). How do I do it in an Expression transformation?
The field datatype is decimal (8,2).
Thanks,
Kiran

output_port: TO_DECIMAL(REPLACECHR(FALSE,input_port,' ',''))
REPLACECHR replaces the blanks with an empty string, essentially removing them. The first argument (TRUE/FALSE) specifies whether the match is case sensitive, which does not matter in this case.
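For readers who want to try the idea outside Informatica, here is a minimal Python sketch of the same character replacement. It is only an analogy: remove_spaces and remove_whitespace are hypothetical helper names, and Python's str.replace and re.sub stand in for REPLACECHR and a regex replace.

```python
import re
from decimal import Decimal


def remove_spaces(text):
    # Analogous to REPLACECHR(FALSE, text, ' ', ''): delete every space character.
    return text.replace(' ', '')


def remove_whitespace(text):
    # Regex variant: \s+ also covers tabs and other whitespace, not just spaces.
    return re.sub(r'\s+', '', text)


# Convert the cleaned string back to a decimal value.
amount = Decimal(remove_spaces('- 98765.00'))
print(amount)
```

Plain character replacement is enough when only ordinary spaces occur; the regex variant is worth it only if other whitespace can appear in the field.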

You can use the REG_REPLACE function to remove the spaces.
To achieve this, follow the steps below:
* Create two variable ports.
* REG_REPLACE requires a string column, so convert the decimal column to a string using the TO_CHAR function.
First variable port(string) - TO_CHAR(column_name)
* The previous port holds the data as a string; now apply REG_REPLACE to it and convert the result back to decimal.
Second variable port(decimal) - to_decimal(reg_replace(first_variable_port,'\s+',''))
\s matches whitespace characters in Informatica regular expressions. The backslash is required; a bare s+ would match the literal letter s.
The same number you provided was used for the test, with the same data type and function. The debugger shows the expected result, with the whitespace removed.
If you still see the space, the issue may be in another transformation that the data passes through. Debug and verify the data at each step.
I hope this helps; feel free to ask if you run into any issues.
For more on Informatica, see https://etlinfromatica.wordpress.com/

If my understanding is correct, you need to replace both the spaces and the brackets. Here's the expression:
TO_DECIMAL(
REPLACECHR(0,
REPLACECHR(0, '(- 98765.00)', ' ', '') -- this part does the space replacement
, '()', '') -- this part replaces the brackets
)
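The nested replacement above can be sketched in Python to check the logic on the sample value. This is an illustration only; to_amount is a hypothetical helper, and str.replace / str.translate stand in for the two REPLACECHR calls.

```python
from decimal import Decimal


def to_amount(raw):
    # Inner step: drop the spaces, like REPLACECHR(0, raw, ' ', '').
    cleaned = raw.replace(' ', '')
    # Outer step: drop both bracket characters, like REPLACECHR(0, ..., '()', '').
    cleaned = cleaned.translate(str.maketrans('', '', '()'))
    # Final step: convert the cleaned text to a decimal, like TO_DECIMAL.
    return Decimal(cleaned)


print(to_amount('(- 98765.00)'))
```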

Related

Regex to detect string is x.x.x where x is a digit from 1-3 digits

I have 1000+ rows with values entered in varying formats, as below:
5.99
5.188.0
v5.33
v.440.0
In another column of the Google Sheet, I want to perform the following operations:
Remove the 'v' character from the values
If the 2nd '.' is missing, append '.0', so that 5.88 --> 5.88.0
Can you please help with the regex and replace logic? I tried the following, but I am new to writing regexes. Thanks for any help.
=regexmatch(<cellvalue>,"^[0-9]{1}\.[0-9]{1,3}\.[0-9]{1,3}$")
So far I can detect the format: 5.88.0 returns TRUE and 5.99 returns FALSE. I now need to append ".0" so that 5.99 --> 5.99.0, and remove the 'v' if found.
You can use a combination of functions. It may not be pretty, but it does the job.
Replace any instance of v with an empty string using SUBSTITUTE. SUBSTITUTE is case sensitive, so the cell content is first converted to upper case with UPPER; without UPPER(cell), either the upper-case V or the lower-case v would be missed, depending on which one you search for.
SUBSTITUTE(text_to_search, search_for, replace_with, [occurrence_number])
=SUBSTITUTE(UPPER(A1),"V","")
Next, look for cells missing the last .xxx block. To do that, adjust your regex slightly to specify that the last group is not present:
^([0-9]{1}\.[0-9]{1,3}(\.[0-9]{1,3}){0})$
Using REGEXMATCH and IF, we can then CONCATENATE .0 as the last group:
REGEXMATCH(text, regular_expression)
CONCATENATE(string1, [string2, ...])
=IF(REGEXMATCH(substitute(upper(A2),"V",""),"^([0-9]{1}\.[0-9]{1,3}(\.[0-9]{1,3}){0})$"),concatenate(A2,".0"), A2)
The last A2 will be replaced with something similar to what we have so far. Before that, we need one small change in the regex: we want to match values where the last group is present, which is your original regex. If the value matches, it is put in the cell; otherwise the cell shows INVALID, which you can change to anything you want.
^([0-9]{1}\.[0-9]{1,3}\.[0-9]{1,3})$
This is the piece we put in place of the last A2:
IF(REGEXMATCH(substitute(upper(A2),"V",""),"^([0-9]{1}\.[0-9]{1,3}\.[0-9]{1,3})$"),substitute(upper(A2),"V",""),"INVALID")
With this the final code to put in your cell will be:
=IF(REGEXMATCH(substitute(upper(A2),"V",""),"^([0-9]{1}\.[0-9]{1,3}(\.[0-9]{1,3}){0})$"),concatenate(SUBSTITUTE(UPPER(A2),"V",""),".0"),IF(REGEXMATCH(substitute(upper(A2),"V",""),"^([0-9]{1}\.[0-9]{1,3}\.[0-9]{1,3})$"),substitute(upper(A2),"V",""),"INVALID"))
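If it helps to check the branching before pasting the formula into a sheet, the same decision logic can be sketched in Python. This is an illustration under assumptions: normalize_version is a hypothetical name, and Python's re module stands in for the Sheets regex engine.

```python
import re

# x.x: one digit, a dot, then 1-3 digits (the trailing block is missing).
TWO_PART = re.compile(r'[0-9]\.[0-9]{1,3}')
# x.x.x: the asker's original pattern with the trailing block present.
THREE_PART = re.compile(r'[0-9]\.[0-9]{1,3}\.[0-9]{1,3}')


def normalize_version(value):
    # Mirror SUBSTITUTE(UPPER(cell), "V", ""): drop v/V regardless of case.
    cleaned = value.upper().replace('V', '')
    if TWO_PART.fullmatch(cleaned):
        return cleaned + '.0'   # append the missing last block
    if THREE_PART.fullmatch(cleaned):
        return cleaned          # already in the target format
    return 'INVALID'            # anything else is flagged


for sample in ['5.99', '5.188.0', 'v5.33', 'v.440.0']:
    print(sample, '->', normalize_version(sample))
```

Note that a value such as v.440.0 has no leading digit once the v is removed, so it falls through to INVALID, just as it would in the formula.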

Extract Specific Parameter value using Regex Postgresql

Given input string as
'PARAM_1=TRUE,THRESHOLDLIST=kWh,2000,Gallons,1000,litre,3000,PARAM_2=TRUE,PARAM_3=abc,123,kWh,800,Gallons,500'
and unit_param = 'Gallons'
I need to extract value of unit_param (Gallons) which is 1000 using postgresql regex functions.
As of now, I have a function that first extracts value for THRESHOLDLIST which is "kWh,2000,Gallons,1000,litre,3000", then splits and loops over the array to get the value.
Can I get this more efficiently using regex?
SELECT substring('PARAM_1=TRUE,THRESHOLDLIST=kWh,2000,Gallons,1000,litre,3000,PARAM_2=TRUE,PARAM_3=abc,123,xyz' FROM '%THRESHOLDLIST=#".........#",%' FOR '#')
Use substring() with the target input grouped:
substring(myCol, 'THRESHOLDLIST=[^=]*Gallons,([0-9]+)')
The expression [^=]* means “characters that are not =”, so it won’t match Gallons within another parameter.
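To see why the [^=]* guard matters, here is a small Python sketch that applies the same pattern to the input from the question. Python's re.search is used as a stand-in for substring(); extract_threshold is a hypothetical helper name.

```python
import re

INPUT = ('PARAM_1=TRUE,THRESHOLDLIST=kWh,2000,Gallons,1000,litre,3000,'
         'PARAM_2=TRUE,PARAM_3=abc,123,kWh,800,Gallons,500')


def extract_threshold(text, unit):
    # [^=]* cannot cross an '=' sign, so the search stays inside the
    # THRESHOLDLIST value and never reaches the Gallons entry in PARAM_3.
    pattern = r'THRESHOLDLIST=[^=]*' + re.escape(unit) + r',([0-9]+)'
    match = re.search(pattern, text)
    return match.group(1) if match else None


print(extract_threshold(INPUT, 'Gallons'))
```

The unit name is escaped before being spliced into the pattern, which is a precaution in case a unit ever contains regex metacharacters.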
select
Substring('PARAM_1=TRUE,THRESHOLDLIST=kWh,2000,Gallons,1000,litre,3000,PARAM_2=TRUE,PARAM_3=abc,123,xyz' from 'Gallons,\d*');
returns Gallons,1000

How can I replace multiple words "globally" using regexp_replace in Oracle?

I need to replace multiple words such as (dog|cat|bird) with nothing in a string where there may be multiple consecutive occurrences of a word. The actual code is to remove salutations and suffixes from a name. Unfortunately the garbage data I get sometimes contains "SNERD JR JR."
I was able to create a regular expression pattern that accomplishes my goal but only for the first occurrence. I implemented a stupid hack to get rid of the second occurrence, but I believe there has to be a better way. I just can't figure it out.
Here is my "hacked" code;
FUNCTION REMOVE_SALUTATIONS(IN_STRING VARCHAR2) RETURN VARCHAR2 DETERMINISTIC
AS
REGEX_SALUTATIONS VARCHAR2(4000) := '(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)';
BEGIN
RETURN TRIM(REGEXP_REPLACE(REGEXP_REPLACE(IN_STRING,REGEX_SALUTATIONS,' '),REGEX_SALUTATIONS,''));
END REMOVE_SALUTATIONS;
I was actually proud that I was able to get this far, as regular expressions are not very regular to me. All help is appreciated.
EDIT:
The default for regexp_replace, based on my understanding, is to do a global replace. But on the off chance that my DB is configured differently, I did try:
select REGEXP_REPLACE('SNERD JR JR','(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)',' ',1,0) from dual;
and the results are;
SNERD JR
Use the occurrence parameter of the REGEXP_REPLACE function. The docs say:
occurrence is a nonnegative integer indicating the occurrence of the replace operation:
If you specify 0, then Oracle replaces all occurrences of the match.
If you specify a positive integer n, then Oracle replaces the nth occurrence.
https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions137.htm#SQLRF06302
It should look like:
...
REGEXP_REPLACE(IN_STRING,REGEX_SALUTATIONS,' ', 1,0 )
...
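One thing worth checking is the result reported in the question's edit, where a global replace still leaves one JR behind. That is consistent with non-overlapping matching: the first match consumes the space that the second JR would need in front of it. The Python sketch below reproduces this with the same pattern. It is an analogy only, assuming Oracle's engine behaves like Python's re here; strip_once and strip_until_stable are hypothetical helpers, not something from the thread.

```python
import re

SALUTATIONS = re.compile(
    r'(^|\s)(MR|MS|MISS|MRS|DR|MD|M D|SR|SIR|PHD|P H D|II|III|IV|JR)(\.?)(\s|$)'
)


def strip_once(name):
    # One global pass: every non-overlapping match is replaced by a space.
    return SALUTATIONS.sub(' ', name)


def strip_until_stable(name):
    # Repeat the global pass until nothing changes, so that adjacent
    # repeats such as "JR JR" are all removed.
    previous = None
    while previous != name:
        previous = name
        name = SALUTATIONS.sub(' ', name)
    return name.strip()


print(repr(strip_once('SNERD JR JR')))
print(repr(strip_until_stable('SNERD JR JR.')))
```

Repeating the pass is essentially a generalised version of the double REGEXP_REPLACE in the question, so the "hack" is closer to a reasonable workaround than it may seem.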

Extract numbers from a field in PostgreSQL

I have a table with a column po_number of type varchar in Postgres 8.4. It stores alphanumeric values with some special characters. I want to ignore the characters [/alpha/?/$/encoding/.] and check whether the column contains a number. If it is a number, it needs to be cast to a number; otherwise null should be passed, as my output field po_number_new is a number field.
Below is the example:
SQL Fiddle.
I tried this statement:
select
(case when regexp_replace(po_number,'[^\w],.-+\?/','') then po_number::numeric
else null
end) as po_number_new from test
But I got an error for explicit cast:
Simply:
SELECT NULLIF(regexp_replace(po_number, '\D','','g'), '')::numeric AS result
FROM tbl;
\D being the class shorthand for "not a digit".
And you need the 4th parameter 'g' (for "globally") to replace all occurrences.
Details in the manual.
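As a rough illustration of what the query does per row, here is the same strip-then-NULLIF logic in Python. It is a sketch, not the Postgres implementation: po_number_new is a hypothetical function name, re.sub replaces all matches by default (the counterpart of the 'g' flag), and None plays the role of NULL.

```python
import re
from decimal import Decimal

# re.ASCII restricts \D to "anything except 0-9".
NON_DIGIT = re.compile(r'\D', re.ASCII)


def po_number_new(po_number):
    if po_number is None:
        return None                      # NULL in, NULL out
    digits = NON_DIGIT.sub('', po_number)
    # NULLIF(..., ''): an empty result becomes NULL instead of failing the cast.
    return Decimal(digits) if digits else None


print(po_number_new('PO-1234/56'))
print(po_number_new('abc'))
```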
For a known, limited set of characters to replace, plain string manipulation functions like replace() or translate() are substantially cheaper. Regular expressions are just more versatile, and we want to eliminate everything but digits in this case. Related:
Regex remove all occurrences of multiple characters in a string
PostgreSQL SELECT only alpha characters on a row
Is there a regexp_replace equivalent for postgresql 7.4?
But why Postgres 8.4? Consider upgrading to a modern version.
Consider pitfalls for outdated versions:
Order varchar string as numeric
WARNING: nonstandard use of escape in a string literal
I think you want something like this:
select (case when regexp_replace(po_number, '[^\w],.-+\?/', '') ~ '^[0-9]+$'
then regexp_replace(po_number, '[^\w],.-+\?/', '')::numeric
end) as po_number_new
from test;
That is, you need to do the conversion on the string after replacement.
Note: This assumes that the "number" is just a string of digits.
The logic I would use to determine if the po_number field contains numeric digits is that its length should decrease when attempting to remove numeric digits.
If so, then all non numeric digits ([^\d]) should be removed from the po_number column. Otherwise, NULL should be returned.
select case when char_length(regexp_replace(po_number, '\d', '', 'g')) < char_length(po_number)
then regexp_replace(po_number, '[^0-9]', '', 'g')
else null
end as po_number_new
from test
If you want to extract floating numbers try to use this:
SELECT NULLIF(regexp_replace(po_number, '[^\.\d]','','g'), '')::numeric AS result FROM tbl;
It's the same as Erwin Brandstetter's answer, but with a different expression:
[^...] - matches any character except those in a list of excluded characters; put the excluded characters in place of ...
\. - point character (also you can change it to , char)
\d - digit character
Since version 12 (released 2 years and 4 months before the time of writing, but after the last edit that I can see on the accepted answer), you can use a GENERATED FIELD to do this quite easily on a one-time basis, rather than having to calculate it each time you wish to SELECT a new po_number.
Furthermore, you can use the TRANSLATE function to extract your digits, which is less expensive than the REGEXP_REPLACE solution proposed by @ErwinBrandstetter!
I would do this as follows (all of the code below is available on the fiddle here):
CREATE TABLE s
(
num TEXT,
new_num INTEGER GENERATED ALWAYS AS
(NULLIF(TRANSLATE(num, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ. ', ''), '')::INTEGER) STORED
);
You can add to the 'ABCDEFG... string in the TRANSLATE function as appropriate - I have decimal point (.) and a space ( ) at the end - you may wish to have more characters there depending on your input!
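The character-deletion idea behind TRANSLATE can be tried out in Python, where str.translate with a deletion table plays a similar role. This is a sketch under assumptions: to_int_or_none is a hypothetical helper, and the deletion set mirrors the 'ABCDEFG...' string plus the decimal point and space.

```python
import string

# Characters to delete: upper-case letters, the decimal point and the space.
DELETE_TABLE = str.maketrans('', '', string.ascii_uppercase + '. ')


def to_int_or_none(num):
    if num is None:
        return None                      # NULL stays NULL
    cleaned = num.translate(DELETE_TABLE)
    # NULLIF(..., ''): nothing left after deletion means no number.
    return int(cleaned) if cleaned else None


for value in ['2', '', None, ' ']:
    print(repr(value), '->', to_int_or_none(value))
```

As with TRANSLATE, any character not listed in the deletion set survives, so an unexpected character (for example a lower-case letter) would make the integer conversion fail rather than be silently dropped.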
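And checking (table t is presumably the equivalent table whose generated column uses the REGEXP_REPLACE expression; its definition is only in the fiddle):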
INSERT INTO s VALUES ('2'), (''), (NULL), (' ');
INSERT INTO t VALUES ('2'), (''), (NULL), (' ');
SELECT * FROM s;
SELECT * FROM t;
Result (same for both):
num new_num
2 2
NULL
NULL
NULL
So, I wanted to check how efficient my solution was, so I ran the following test inserting 10,000 records into both tables s and t as follows (from here):
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
INSERT INTO t
with symbols(characters) as
(
VALUES ('ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
)
select string_agg(substr(characters, (random() * length(characters) + 1) :: INTEGER, 1), '')
from symbols
join generate_series(1,10) as word(chr_idx) on 1 = 1 -- word length
join generate_series(1,10000) as words(idx) on 1 = 1 -- # of words
group by idx;
The differences weren't that huge but the regex solution was consistently slower by about 25% - even changing the order of the tables undergoing the INSERTs.
However, where the TRANSLATE solution really shines is when doing a "raw" SELECT as follows:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT
NULLIF(TRANSLATE(num, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ. ', ''), '')::INTEGER
FROM s;
and the same for the REGEXP_REPLACE solution.
The differences were very marked, the TRANSLATE taking approx. 25% of the time of the other function. Finally, in the interests of fairness, I also did this for both tables:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT
num, new_num
FROM t;
Both extremely quick and identical!

DB2: find field value where first character is a lower case letter

I am trying to pick out a value in a field where the first character is a lower case letter. This is difficult since DB2 does not permit regular expressions. My current attempt is:
select * from mytable
where field1 like lcase('_%')
where I was hoping that the underscore followed by the percent wildcard would find any character in the first position, and that wrapping lcase() around it would ensure it is lower case. The result is that every value gets selected, so lcase() is not doing what I wanted; in hindsight, it is meant for casting a value to lowercase.
With that in mind, how do I ensure that the result of
('_%')
is lowercase only?
Thanks very much
I would use something like:
... where substr(field1,1,1) <> upper(substr(field1,1,1))
A solution based on the range 'a'...'z' will not work with characters outside the Latin character set (e.g. Cyrillic).
Why not
where field1 >= 'a' and field1 < '{'
This will even make use of an appropriate index, if any.
Be warned, however, that this won't work when your DB instance does lexicographic ordering. I am not sure whether the latter is a DB attribute or a session attribute.
Another, more general way (especially when considering non-ASCII letters) would be to check that the length of the field is > 0, that the lowercased first character equals the first character, and that the uppercased first character does not equal the first character. (Look up the functions in the DB2 reference; I don't have mine at hand at the moment.)
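The case-comparison check described above can be sketched in Python to see how it treats digits, empty values and non-Latin letters. This only illustrates the logic, not DB2 syntax; starts_with_lowercase is a hypothetical name.

```python
def starts_with_lowercase(value):
    # An empty or missing value has no first character to test.
    if not value:
        return False
    first = value[0]
    # Lower-case letter: unchanged by lower(), changed by upper().
    # Digits and punctuation are unchanged by both, so they are rejected.
    return first == first.lower() and first != first.upper()


for sample in ['abc', 'Abc', '1bc', '', 'жук']:
    print(repr(sample), '->', starts_with_lowercase(sample))
```

Comparing against both the lower-cased and upper-cased character is what excludes digits and symbols; checking only first == first.lower() would accept '1bc' as well.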
DB2 DOES allow Regular expressions with xQuery. For example:
with cteGender(VALUE) as
(
values
('M'),('F'),('U'),('S'),(' M'),('f')
),
cteResult(VALUE,RESULT_BOOLEAN) as
(
select '"' || VALUE || '"',
xmlquery('fn:matches($VALUE,''^[MFU]{1}$'')') from cteGender
)
select VALUE, RESULT_BOOLEAN,
xmlcast(RESULT_BOOLEAN as integer) RESULT_INTEGER from cteResult;
I took this example from: http://www.idug.org/p/bl/et/blogid=278&blogaid=187 That article explains very well how to use xQuery.
DB2 does not have SQL functions for Regular Expressions, but with xQuery you can do that. But if you really want SQL functions for RegEx, please visit this site: https://www.ibm.com/developerworks/jp/data/library/db2/j_d-regularexpression/ (In Japanese, but the code can be understood)
For more information about RegEx in DB2 please visit: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrregexp.html