Get output with flag if same record comes next time in informatica power center - compare

I need to set a flag when the same record arrives again from the source in Informatica PowerCenter.
This needs to be done in Informatica PowerCenter. After that, I will use a Filter transformation to pass only the flag = 1 records to the output. Basically, I need to track the changed records using the flag and load them as SCD Type 2 into the target table.
Input
Number Code Date
1234 3 2022/01/22
1234 3 2022/01/23
1234 4 2022/01/24
1234 3 2022/01/25
1234 3 2022/01/26
1234 2 2022/01/27
1234 4 2022/01/28
4567 1 2022/01/29
4567 1 2022/01/20
4567 3 2022/01/21
Output
Number Code Date Flag
1234 3 2022/01/22 1
1234 3 2022/01/23 2
1234 4 2022/01/24 1
1234 3 2022/01/25 1
1234 3 2022/01/26 2
1234 2 2022/01/27 1
4567 1 2022/01/29 1
4567 1 2022/01/20 2
4567 3 2022/01/21 1

You need to use variable ports in an expression transformation to track values in the previous record and set a flag depending on whether a value has changed or not.
Because Informatica evaluates variable ports in order, place the variable port that compares the current record (input port) with the previous record (variable port x) before variable port x itself. At the moment of the comparison, variable port x still holds the value from the previous record; it is only overwritten with the current value afterwards.
There are plenty of detailed examples of this common pattern online.
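The ordering trick can be illustrated outside Informatica. The following is a minimal Python sketch of the logic only, not PowerCenter expression code; the function name `flag_repeats` and the variables `prev_key` and `count` (standing in for variable ports) are hypothetical. It assumes the rows arrive in the order shown in the question and uses the first six input rows.

```python
def flag_repeats(rows):
    """Return (number, code, date, flag); flag restarts at 1 when (number, code) changes."""
    prev_key = None  # stands in for the variable port holding the previous record
    count = 0
    out = []
    for number, code, date in rows:
        key = (number, code)
        # Compare with the previous record BEFORE overwriting prev_key,
        # mirroring the port order described above.
        count = count + 1 if key == prev_key else 1
        prev_key = key
        out.append((number, code, date, count))
    return out


rows = [
    (1234, 3, "2022/01/22"),
    (1234, 3, "2022/01/23"),
    (1234, 4, "2022/01/24"),
    (1234, 3, "2022/01/25"),
    (1234, 3, "2022/01/26"),
    (1234, 2, "2022/01/27"),
]
flagged = flag_repeats(rows)
# Equivalent of the downstream filter that keeps only flag = 1 records
first_seen = [r for r in flagged if r[3] == 1]
```

If `prev_key` were updated before the comparison, every row would compare equal to itself, which is why the port order matters.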

Related

Flag everytime when ID change date DAX

I have a table with orders, the articles belonging to each order, and their shipping dates. What I want to do is flag every time the shipping date changes or, when all dates for an OrderID are the same, flag only once.
I tried to use calculated columns written in DAX, such as nextdate, prevdate, nextorder and prevorder, and refer to them, but it doesn't work.
I would appreciate any tips on how to solve my problem. Thanks!
OrderID  Article ID  Shipping date  Flag
123      1           01.01.2012     1
123      2           01.01.2012     0
123      1           02.01.2012     1
1234     12          15.03.2012     1
678      12          25.05.2014     1
678      345         25.05.2014     0
678      567         25.05.2014     0
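One way to read the rule in this question is "flag the first row of each (OrderID, Shipping date) pair". That reading is an assumption, although it does reproduce the Flag column above. The following Python sketch shows the logic only and is not DAX; the name `flag_first_per_date` is hypothetical.

```python
def flag_first_per_date(rows):
    """Return rows with a flag of 1 for the first row of each (order_id, shipping_date) pair."""
    seen = set()
    out = []
    for order_id, article_id, shipping_date in rows:
        key = (order_id, shipping_date)
        flag = 0 if key in seen else 1  # 1 only the first time this pair appears
        seen.add(key)
        out.append((order_id, article_id, shipping_date, flag))
    return out


orders = [
    (123, 1, "01.01.2012"),
    (123, 2, "01.01.2012"),
    (123, 1, "02.01.2012"),
    (1234, 12, "15.03.2012"),
    (678, 12, "25.05.2014"),
    (678, 345, "25.05.2014"),
    (678, 567, "25.05.2014"),
]
flags = [row[3] for row in flag_first_per_date(orders)]
```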

Amazon QuickSight - Working out size of network

I have a database table with a record for each IOT device connected, each device has a unique device id and a unique network id associated with it.
For example:
device_id network_id
1 1
2 1
3 1
4 2
5 2
6 3
7 3
8 3
9 3
10 4
I would like to be able to visualise the size of each network based on its id. Based on the above data, the output would look like this:
network_id size
1 3
2 2
3 4
4 1
I'm not currently sure how to do this.
I found that using the countOver function worked for this.
I made a calculated field called NetworkSize, which was defined as:
countOver({device_id}, [{network_id}])
This gives the output I was looking for.
However, I have to include device_id in the visual, which is a bit inconvenient.
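The partitioned count can be pictured as "count the device rows within each network_id, then attach that count to every row". Below is a minimal Python sketch of that idea using the sample data from the question. It illustrates the logic only and is not QuickSight syntax; the variable names are hypothetical.

```python
from collections import Counter

# (device_id, network_id) pairs from the question
devices = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2),
           (6, 3), (7, 3), (8, 3), (9, 3), (10, 4)]

# Count devices per network (the partition is network_id)
network_size = Counter(network_id for _, network_id in devices)

# Attach the partition count to every device row, as a windowed count would
with_size = [(device_id, network_id, network_size[network_id])
             for device_id, network_id in devices]
```

Because the count is attached per device row, the result stays at device granularity, which may be why device_id has to appear in the visual.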

Plotting categorical variables using a bar diagram/bar chart

data
I am trying to plot a bar graph for both the Sept and Oct waves. As you can see in the image, each id is an individual surveyed over time. On a single graph I need to plot Sept in-house, Oct in-house, Sept out-house and Oct out-house, showing only the proportion of people who said yes in each of them. The other answer categories do not have to be taken into account.
I also have to show whiskers for the 95% confidence interval of each of these proportions.
* Example generated by -dataex-. For more info, type help dataex
clear
input float(id sept_outhouse sept_inhouse oct_outhouse oct_inhouse)
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 3 3 3
5 4 4 3 3
6 4 4 3 3
7 4 4 4 1
8 1 1 1 1
9 1 1 1 1
10 1 1 1 1
end
label values sept_outhouse codes
label values sept_inhouse codes
label values oct_outhouse codes
label values oct_inhouse codes
label def codes 1 "yes", modify
label def codes 2 "no", modify
label def codes 3 "don't know", modify
label def codes 4 "refused", modify
save tokenexample, replace
rename (*house) (house*)
reshape long house, i(id) j(which) string
replace which = subinstr(proper(which), "_", " ", .)
gen yes = house == 1
label def WHICH 1 "Sept Out" 2 "Sept In" 3 "Oct Out" 4 "Oct In"
encode which, gen(WHICH) label(WHICH)
statsby, by(WHICH) clear: ci proportion yes, jeffreys
set scheme s1color
twoway scatter mean WHICH ///
|| rspike ub lb WHICH, xla(1/4, noticks valuelabel) xsc(r(0.9 4.1)) ///
xtitle("") legend(off) subtitle(Proportion Yes with 95% confidence interval)
This has to be solved backwards.
The means and confidence intervals have to be plotted using twoway, because graph bar is a dead end here: it does not support whiskers.
The confidence limits have to be put in variables before the graph is drawn. Some graph commands, notably graph bar, will calculate means for you, but as said that is a dead end. So we need to calculate the means too.
To do that you need an indicator variable for Yes.
The best way I know to get the results then is to reshape to a different structure and then apply ci proportion under statsby.
As a detail, the option jeffreys is explicit as a signal that there are different methods for the confidence interval calculation. You should choose one knowingly.
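To make the "indicator, then proportion, then interval" sequence concrete, here is a Python sketch on the example data. It is not a translation of the Stata code. It swaps in the Wilson score interval because that needs only the standard library, so its limits will differ from the Jeffreys limits that `ci proportion yes, jeffreys` reports. The names `wilson_interval` and `results` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Response codes per wave, copied column by column from the dataex example
data = {
    "Sept Out": [1, 2, 3, 4, 4, 4, 4, 1, 1, 1],
    "Sept In":  [1, 2, 3, 3, 4, 4, 4, 1, 1, 1],
    "Oct Out":  [1, 2, 3, 3, 3, 3, 4, 1, 1, 1],
    "Oct In":   [1, 2, 3, 3, 3, 3, 1, 1, 1, 1],
}


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion (NOT the Jeffreys interval)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


results = {}
for wave, codes in data.items():
    yes = [1 if code == 1 else 0 for code in codes]  # indicator for "yes"
    lb, ub = wilson_interval(sum(yes), len(yes))
    results[wave] = (sum(yes) / len(yes), lb, ub)    # (mean, lower, upper)
```

Each tuple holds the three quantities the twoway call plots: the mean for the scatter and the two limits for the spike.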

How to sum by group in Power Query Editor?

My table looks like this:
Serial WO# Value Indicator
A 333 10 333-1
A 333 4 333-2
B 456 5 456-1
A 334 1 334-1
A 334 5 334-2
I want to create a new column that sums up the Values based on WO#. It should look like this:
Serial WO# Value Indicator SumValue
A 333 10 333-1 14
A 333 4 333-2 14
B 456 5 456-1 5
A 334 1 334-1 6
A 334 5 334-2 6
Eventually I will remove duplicates on the WO# and remove the Value and Indicator Columns from the data. I can't seem to find a function in M that allows for sum by group. Thanks in advance!
If you load the data with Power Query, there is a Group command on the ribbon that will do just that.
Make sure to use the Advanced option and add all columns you want to retain to the grouping section. The command is available in both Excel and Power BI.
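For readers who want to see what the requested SumValue column amounts to, here is a short Python sketch on the sample table. It shows the logic only and is not M code: total Value per WO#, then attach the total to each row. The variable names are hypothetical.

```python
from collections import defaultdict

# (Serial, WO#, Value, Indicator) rows from the question
rows = [
    ("A", 333, 10, "333-1"),
    ("A", 333, 4, "333-2"),
    ("B", 456, 5, "456-1"),
    ("A", 334, 1, "334-1"),
    ("A", 334, 5, "334-2"),
]

# First pass: total Value per WO#
totals = defaultdict(int)
for serial, wo, value, indicator in rows:
    totals[wo] += value

# Second pass: append the group total as a SumValue column
with_sum = [row + (totals[row[1]],) for row in rows]
```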

Find social network components in Stata

[I copied part of the below example from a separate post and changed it to suit my specific needs]
pos_1 pos_2
2 4
2 5
1 2
3 9
4 2
9 3
The above is read as person_2 is connected to person_4,...,person_4 is connected to person_2, and person_9 is connected to person_3.
I want to create a third categorical [edited] variable, component, that lets me know if the observed link is part of a connected component (subnetwork) within this network. In this case, there are two connected components in the network:
pos_1 pos_2 component
2 4 1
2 5 1
1 2 1
3 9 2
4 2 1
9 3 2
All nodes in component 1 are connected to each other, but not to the nodes in component 2 and vice versa. Is there a way to generate this component variable in Stata? I know there are alternative programs to do this in, but my code would be more seamless if I can integrate it into Stata.
If you reshape the data to long form, you can use group_id (from SSC) to get what you want:
clear
input pos_1 pos_2
2 4
2 5
1 2
3 9
4 2
9 3
end
gen id = _n
reshape long pos_, i(id) j(n)
clonevar comp = id
list, sepby(comp)
group_id comp, match(pos)
reshape wide pos_, i(id) j(n)
egen component = group(comp)
list
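For comparison, the same component labels can be obtained with a union-find (disjoint-set) structure. The Python sketch below is an alternative illustration, not a description of how group_id works internally; the function name `label_components` is hypothetical. Components are numbered in order of first appearance in the edge list.

```python
def label_components(edges):
    """Return one component label per edge, numbered by first appearance."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the two endpoints of every link
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Map each root to a consecutive component number
    labels = {}
    result = []
    for a, _ in edges:
        root = find(a)
        labels.setdefault(root, len(labels) + 1)
        result.append(labels[root])
    return result


edges = [(2, 4), (2, 5), (1, 2), (3, 9), (4, 2), (9, 3)]
components = label_components(edges)
```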