How to do it in Informatica - informatica

I am new to Informatica.
For example: this is a flat file to flat file load.
I have an expression that has calculated the data shown in the sample below.
Some CUSTs have one entry with the N flag, and some have two entries, one with N and one with Y.
I need only the 1/N occurrence (when it is the only one) or the 2/Y occurrence in the target table, as stated below. Please let me know how to do it in Informatica.
Source
CUST-111|N|1
CUST-222|N|1
CUST-222|Y|2
CUST-333|N|1
CUST-444|N|1
CUST-555|N|1
CUST-555|Y|2
CUST-666|N|1
CUST-666|Y|2
Target:
CUST-111|N|1
CUST-222|Y|2
CUST-333|N|1
CUST-444|N|1
CUST-555|Y|2
CUST-666|Y|2
Thanks a lot guys

First calculate the number of rows per customer. Then, if count = 1 and flag = N, pass the row to the target; if count > 1, pass only the row with flag = Y.
Steps:
1. Sort the data by Cust ID (CID).
2. Use an Aggregator to calculate the count. Use CUST_ID as the group-by port and create a new output port:
out_FLAG_CNT = COUNT(*)
3. Use a Joiner to join the output of step 2 with the output of step 1. The join condition is Cust ID.
4. Use a Filter with the condition below:
IIF (out_FLAG_CNT>1 AND FLAG='Y',TRUE, IIF( out_FLAG_CNT=1 AND FLAG='N', TRUE, FALSE))
5. Finally, link this data to the target.
                            |--> AGG (count by CID) -->|
SQ --> SRT (sort by CID) -->|------------------------->| JNR (on CID) --> FIL (cond above) --> Target
Please note: if a customer can have more than one N row or more than one Y row, the above will not work and you need to attach another Aggregator at the end.
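The count-then-filter logic above can be checked outside Informatica with a minimal Python sketch. The function name `select_rows` and the tuple layout are assumptions made for illustration; the sample rows are the ones from the question.

```python
from collections import Counter

# Sample rows from the question: (customer id, flag, sequence)
SOURCE = [
    ("CUST-111", "N", 1),
    ("CUST-222", "N", 1),
    ("CUST-222", "Y", 2),
    ("CUST-333", "N", 1),
    ("CUST-444", "N", 1),
    ("CUST-555", "N", 1),
    ("CUST-555", "Y", 2),
    ("CUST-666", "N", 1),
    ("CUST-666", "Y", 2),
]


def select_rows(rows):
    """Keep the lone N row, or the Y row when a customer has several rows."""
    # Aggregator step: count rows per customer id.
    counts = Counter(cust for cust, _flag, _seq in rows)
    # Joiner + Filter step: look up the count for each row and apply the condition.
    return [
        (cust, flag, seq)
        for cust, flag, seq in rows
        if (counts[cust] > 1 and flag == "Y") or (counts[cust] == 1 and flag == "N")
    ]


for row in select_rows(SOURCE):
    print("|".join(str(field) for field in row))
```

As with the mapping, this sketch assumes at most one N row and one Y row per customer.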

Related

How to merge these two records into one row, removing the null values, in Informatica using a transformation? Please see the scenario below.

Input:
Code  value  Min   Max
A     abc    10    null
A     abc    null  20
Output:
Code  value  Min   Max
A     abc    10    20
You can use an Aggregator transformation to remove the nulls and get a single row. This solution is based on your sample data only.
Use an Aggregator with the ports below:
inout_Code (group by)
inout_value (group by)
in_Min
in_Max
out_Min= MAX(in_Min)
out_Max = MAX(in_Max)
Then attach out_Min, out_Max, code and value to the target.
You will get one record per combination of code and value, and the null values will be gone, because MAX ignores nulls.
If a code, value combination has many rows (4, 5, 6 or more), some of the Min and Max columns are null, and you want multiple records in the output, you need more complex mapping logic. Let me know if this helps. :)
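The effect of grouping by code and value and taking MAX of each column can be sketched in Python as follows. This is only an illustration: `merge_rows` and `max_ignoring_none` are hypothetical names, and `None` stands in for the null values.

```python
def max_ignoring_none(a, b):
    """Return the larger of two values, skipping None (like MAX skipping nulls)."""
    present = [x for x in (a, b) if x is not None]
    return max(present) if present else None


def merge_rows(rows):
    """Collapse rows that share (code, value) into one row of non-null min/max."""
    merged = {}
    for code, value, min_val, max_val in rows:
        cur_min, cur_max = merged.get((code, value), (None, None))
        merged[(code, value)] = (
            max_ignoring_none(cur_min, min_val),
            max_ignoring_none(cur_max, max_val),
        )
    return [(code, value, lo, hi) for (code, value), (lo, hi) in merged.items()]


# The two input rows from the question.
rows = [("A", "abc", 10, None), ("A", "abc", None, 20)]
print(merge_rows(rows))
```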

How to countif 56 exists in 156/56/2567 and only return true once? Google sheets

I have one sheet with data on my Facebook ads. I have another sheet with data on the products in my store. I'm having trouble with some countifs where I'm counting how many times my product ID exists in a cell that holds multiple numbers. They are formatted like this: /2032/2034/2040/1/
It's easy on the rows where only one product ID exists, but some rows have multiple IDs separated by a /. I need to see whether the ID exists as an exact match, either alone or somewhere between the /'s.
Rows with facebook ads data:
A1: /2032/2034/2040/1/
A2: /1548/84/2154/2001/
A3: /2032/1689/1840/2548/
Row with product data:
B1: 2034
C1: I need a countifs here that checks how many times B1 exists in column A. Let's say I have thousands of rows with different variations of A1, where B1 could also stand alone. How do I count this? I always need exact matches.
You can compare the number you want (56) using the regex that @MonkeyZeus commented, with a small change -> "(?:^|/)"&B1&"(?:/|$)" so the end result is:
=IF(REGEXMATCH(A1, "(?:^|/)"&B1&"(?:/|$)"), true, false)
UPDATE
If you need to count the total occurrences of 56 across X rows, you can replace the "True / False" results of the condition with "1 / 0" and then do a =SUM(C1:C5) on the last row:
=IF(REGEXMATCH(A1, "(?:^|/)"&B1&"(?:/|$)"), 1, 0)
UPDATE 2
Thanks for contributing. Unfortunately I'm not able to do it this way
since I have loads of data to do this on. Is there a way to do it with
a countif in a single cell without adding a extra step with "sum"?
In that case you can do:
=COUNTA(FILTER(A:A, REGEXMATCH(A:A, "(?:^|/)"&B2&"(?:/|$)")))
UPDATE 3
With the following condition you check every single possibility just by adding another COUNTIF:
=COUNTIF(A:A,B1) + COUNTIF(A:A, "*/"&B1) + COUNTIF(A:A, B1&"/*") + COUNTIF(A:A, "*/"&B1&"/*")
Hope this helps!
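For anyone who wants to test the pattern outside Sheets, here is a minimal Python sketch of the same exact-match idea. The function name `count_rows_with_id` is an assumption for illustration; the sample cells are A1:A3 from the question.

```python
import re


def count_rows_with_id(cells, product_id):
    """Count the cells in which product_id appears as a whole /-delimited token."""
    # Same shape as the Sheets pattern: the id must be bounded by a slash
    # or by the start/end of the cell on both sides.
    pattern = re.compile(r"(?:^|/)" + re.escape(str(product_id)) + r"(?:/|$)")
    # Each cell counts at most once, however many times the id occurs in it.
    return sum(1 for cell in cells if pattern.search(cell))


cells = [
    "/2032/2034/2040/1/",
    "/1548/84/2154/2001/",
    "/2032/1689/1840/2548/",
]
print(count_rows_with_id(cells, 2034))
print(count_rows_with_id(cells, 2032))
```

Partial matches such as 154 inside 1548 or 2154 are not counted, because the id has to sit between delimiters.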
try:
=COUNTIF(SPLIT(A1, "/"), B1)
UPDATE:
=ARRAYFORMULA(IF(A2<>"", {
SUM(IF((REGEXMATCH(""&DATA!C:C, ""&A2))*(DATA!B:B="carousel"), 1, )),
SUM(IF((REGEXMATCH(""&DATA!C:C, ""&A2))*(DATA!B:B="imagepost"), 1, ))}, ))

Reading Values from a Bag of Tuples in pig

I have UDF output as :-
Sample records:-
({(Todd,1),(Todd,1),(Todd,1),(Todd,1),(Todd,1),(Todd,5),(Todd,10),(Todd,20),(Todd,10),(Todd,10),(Todd,10),(Todd,10),(Todd,10),(Todd,10)})
({(Jon,1),(Jon,1),(Jon,1),(Jon,1),(Jon,1),(Jon,5),(Jon,10),(Jon,20),(Jon,10),(Jon,10),(Jon,10),(Jon,10),(Jon,5),(Jon,20),(Jon,1)})
Schema for UDF:- name:chararray (one single column)
Now I want to read this bag of tuples and generate output as :-
Todd,240
Jon,422
I stored the output of the UDF in a temp file and read it back using a different schema:
D = LOAD '/home/training/pig/pig/UDFdata.txt' AS (B: bag {T: tuple(name:chararray, denom:int)});
After that I am trying to use a foreach loop and dot notation to find the sum.
X = foreach D generate B.T.name,SUM(B.T.denom);
2017-03-04 13:52:59,507 ERROR org.apache.pig.tools.grunt.Grunt: ERROR
1128: Cannot find field T in name:chararray,denom:int Details at
logfile: /home/training/pig_1488648405070.log
Can you please let me know how to find it? I am new to Apache Pig, so I am not sure how to traverse a bag of tuples and find the sum.
GROUP the dataset on name before performing SUM.
FLATTEN the bag to perform GROUP.
flattened = FOREACH D GENERATE FLATTEN(B);
dump flattened;
...
(Todd,10)
(Todd,10)
(Jon,1)
(Jon,1)
....
Then, GROUP them on name
grouped = GROUP flattened by name;
dump grouped;
(Jon,{(Jon,1),(Jon,20),(Jon,5),(Jon,10),(Jon,10),(Jon,10),(Jon,10),(Jon,20),(Jon,10),(Jon,5),(Jon,1),(Jon,1),(Jon,1),(Jon,1),(Jon,1)})
(Todd,{(Todd,10),(Todd,10),(Todd,10),(Todd,10),(Todd,10),(Todd,10),(Todd,20),(Todd,10),(Todd,5),(Todd,1),(Todd,1),(Todd,1),(Todd,1),(Todd,1)})
And apply SUM() over the result
final_sum = FOREACH grouped GENERATE group, SUM(flattened.denom);
dump final_sum;
(Jon,106)
(Todd,100)
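The FLATTEN, GROUP, SUM sequence above can be cross-checked with a short Python sketch over the two sample bags from the question. The name `sum_by_name` is an assumption for illustration; each inner list plays the role of one bag.

```python
from collections import defaultdict

# The two sample bags from the question, written compactly.
bags = [
    [("Todd", 1)] * 5 + [("Todd", 5), ("Todd", 10), ("Todd", 20)] + [("Todd", 10)] * 6,
    [("Jon", 1)] * 5
    + [("Jon", 5), ("Jon", 10), ("Jon", 20)]
    + [("Jon", 10)] * 4
    + [("Jon", 5), ("Jon", 20), ("Jon", 1)],
]


def sum_by_name(bags):
    """Flatten every bag into (name, denom) tuples and total denom per name."""
    totals = defaultdict(int)
    for bag in bags:                # FLATTEN: walk each tuple of each bag
        for name, denom in bag:
            totals[name] += denom   # GROUP by name and SUM denom
    return dict(totals)


for name, total in sum_by_name(bags).items():
    print(f"({name},{total})")
```

For these sample bags the totals agree with the dump above: 100 for Todd and 106 for Jon.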

plsql if control different calculation

I am using PL/SQL to implement the function described below.
I have a table that holds piece-level data with each piece's weight. Basically I want to implement the following:
if the piece weight is 1 LB or more, group by ceil(weight) (next LB)
if the piece weight is less than 1 LB, group by ceil(weight*16) (next OZ)
I am just curious how I can do that in PL/SQL. I feel I need an if statement, but I am not sure how to write it.
(Weight is already a column in that table; do I need to declare it here?)
begin
if weight <1 then
select ceil(weight*16),sum(weight)
from ops_owner.track_mail_item
where manifestdate = '24-aug-2016'
group by ceil(weight*16)
else select ceil(weight),sum(weight)
from ops_owner.track_mail_item
where manifestdate = '24-aug-2016'
end if,
end;
Thank you very much!
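I would adjust the weight value in an inline view, I think. Carrying a unit column alongside it keeps the ounce and pound buckets apart; otherwise an 8 OZ bucket and an 8 LB bucket would fall into the same group.
select
  weight_unit,
  ceil(adjusted_weight),
  sum(adjusted_weight)
from
  (
    select
      -- pieces under 1 LB are converted to ounces
      case when weight < 1 then 'OZ' else 'LB' end weight_unit,
      case
        when weight < 1 then weight * 16
        else weight
      end adjusted_weight
    from
      ops_owner.track_mail_item
    where
      manifestdate = '24-aug-2016'
  )
group by
  weight_unit,
  ceil(adjusted_weight);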
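To make the bucketing rule concrete, here is a minimal Python sketch of it. The weights are made-up sample values, not data from the table, and `bucket` and `summarize` are hypothetical names. Tagging each bucket with its unit is an assumption about the desired output, so that 8 OZ and 8 LB remain separate groups.

```python
import math
from collections import defaultdict


def bucket(weight_lb):
    """Return (unit, bucket, adjusted weight) for a piece weight given in pounds."""
    if weight_lb < 1:
        ounces = weight_lb * 16          # pieces under 1 LB are measured in ounces
        return "OZ", math.ceil(ounces), ounces
    return "LB", math.ceil(weight_lb), weight_lb


def summarize(weights):
    """Sum the adjusted weight per (unit, bucket), like the GROUP BY query."""
    totals = defaultdict(float)
    for w in weights:
        unit, size, adjusted = bucket(w)
        totals[(unit, size)] += adjusted
    return dict(totals)


# Hypothetical sample weights in pounds.
weights = [0.5, 0.25, 0.5, 1.5, 2.0, 7.5]
for (unit, size), total in sorted(summarize(weights).items()):
    print(unit, size, total)
```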

Based on count value i have to create number of rows,is that possible without java transformation?

Hey guys, does anyone know how to create a number of rows based on a count value without using a Java transformation in Informatica 9.6 (for a flat file)? Please help me with that.
You can create an auxiliary table that holds each possible count value n between 1 and N, repeated n times:
1
2
2
3
3
3
...
N (repeated N times)
Join this table to the source data using the count value as the key, and you will get n copies of each source row.
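The idea can be sketched in Python as follows. This is an illustration only: `build_aux` and `replicate` are hypothetical names, and the source rows are made-up (id, count) pairs.

```python
def build_aux(max_count):
    """Auxiliary table: each value n from 1..max_count appears n times."""
    return [n for n in range(1, max_count + 1) for _ in range(n)]


def replicate(rows, aux):
    """Equi-join rows to the auxiliary table on the count column (index 1)."""
    return [row for row in rows for key in aux if key == row[1]]


# Hypothetical source rows: (id, count)
source = [("A", 2), ("B", 3), ("C", 1)]
aux = build_aux(3)
print(aux)
for row in replicate(source, aux):
    print(row)
```

Each source row matches exactly n auxiliary rows, so the join emits it n times. N has to be at least the largest count in the source, otherwise rows with a larger count find no match and drop out.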