What is a rolling sum and how do I implement it in Informatica, given the requirement below? - informatica

Can someone please tell me what a rolling sum is and how to implement it in Informatica?
My requirement, as given by the client, is below:
ETI_DUR :
SUM(CASE WHEN AGENT_EXPNCD_DIM.EXCEPTION_CD='SYS/BLDG ISSUES ETI' THEN IEX_AGENT_DEXPN.SCD_DURATION ELSE 0 END)
ETI_30_DAY :
ROLLING SUM(CASE WHEN (SYSDATE-IEX_AGENT_DEXPN.ROW_DT)<=30 AND AGENT_EXPNCD_DIM.EXCEPTION_CD = 'SYS/BLDG ISSUES ETI'
THEN IEX_AGENT_DEXPN.SCD_DURATION ELSE 0 END)
ETI_30_DAY_OVRG :
CASE WHEN ETI_DUR > 0 THEN
    CASE
        WHEN ROLLINGSUM(ETI_DUR_30_DAY FOR LAST 29 DAYS) BETWEEN 0 AND 600 AND ROLLINGSUM(ETI_DUR_30_DAY FOR LAST 29 DAYS) + ETI_DUR > 600 THEN ROLLINGSUM(ETI_DUR_30_DAY FOR LAST 30 DAYS) - 600
        WHEN ROLLINGSUM(ETI_DUR_30_DAY FOR LAST 29 DAYS) > 600 THEN ETI_DUR
        ELSE 0 END
ELSE 0 END
I have implemented it as below in Informatica.
Expression Transformation:
o_ETI_DUR-- IIF(UPPER(EXCEPTION_CD_AGENT_EXPNDIM)='SYS/BLDG ISSUES ETI',SCD_DURATION,0)
o_ETI_29_DAY-- IIF(DATE_DIFF(TRUNC(SYSDATE),trunc(SCHD_DATE),'DD') <=29 AND UPPER(EXCEPTION_CD_AGENT_EXPNDIM) = 'SYS/BLDG ISSUES ETI' ,SCD_DURATION,0)
o_ETI_30_DAY -- IIF(DATE_DIFF(TRUNC(SYSDATE),trunc(SCHD_DATE),'DD') <=30 AND UPPER(EXCEPTION_CD_AGENT_EXPNDIM) = 'SYS/BLDG ISSUES ETI' ,SCD_DURATION,0)
Aggregator transformation:
o_ETI_30_DAY_OVRG:
IIF(sum(i_ETI_DUR) > 0,
IIF((sum(i_ETI_29_DAY)>=0 and sum(i_ETI_29_DAY)<=600) and (sum(i_ETI_29_DAY)+sum(i_ETI_DUR)) > 600,
sum(i_ETI_30_DAY) - 600,
IIF(sum(i_ETI_29_DAY)>600,sum(i_ETI_DUR),0)),0)
But it is not working. Please help.
Thanks a lot!

A rolling sum is just the sum of some amount over a fixed window of time. For example, every day you can calculate the sum of expenses for the last 30 days.
I guess you can use an aggregator to calculate ETI_DUR, ETI_30_DAY and ETI_29_DAY. After that, in an expression you can implement the logic for ETI_30_DAY_OVRG. Note that you cannot write an IIF expression like that in an aggregator. Output ports must use an aggregate function.
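To make the second step concrete, here is a minimal Python sketch of the ETI_30_DAY_OVRG branching, applied after the three sums have been aggregated. The function and argument names are hypothetical; the body only transcribes the client's CASE logic from the question and is not Informatica code.

```python
def eti_30_day_overage(eti_dur, sum_29_day, sum_30_day, cap=600):
    """Transcribe the client's CASE logic; all inputs are already-aggregated sums."""
    if eti_dur <= 0:
        return 0
    if 0 <= sum_29_day <= cap and sum_29_day + eti_dur > cap:
        # Today's duration pushes the rolling total over the cap.
        return sum_30_day - cap
    if sum_29_day > cap:
        # Already over the cap, so all of today's duration counts as overage.
        return eti_dur
    return 0


print(eti_30_day_overage(100, 550, 650))  # 50
print(eti_30_day_overage(100, 700, 800))  # 100
print(eti_30_day_overage(100, 300, 400))  # 0
```

In a mapping, the same branching would sit in an Expression transformation placed after the Aggregator, with the three sums arriving as input ports.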

Here is a rolling sum example:
count, rolling_sum
1,1
2,3
5,8
1,9
1,10
Each rolling_sum value is the sum of all count values up to and including the current row. To implement it in Informatica, use 'local variables' (a variable port in an Expression transformation) as follows:
input port: count
variable port: v_sum_count = v_sum_count + count
output port: rolling_sum = v_sum_count
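The variable-port idea can be sketched in Python as a total carried forward from row to row. This is only an illustration of the logic, not Informatica code; the function name running_sum is made up for the example.

```python
from itertools import accumulate

counts = [1, 2, 5, 1, 1]


def running_sum(values):
    """Carry a total forward row by row, like the variable port does."""
    total = 0
    result = []
    for value in values:
        total += value  # v_sum_count = v_sum_count + count
        result.append(total)
    return result


print(running_sum(counts))       # [1, 3, 8, 9, 10]
print(list(accumulate(counts)))  # same result using the standard library
```

Note that this total never drops old rows, so it is a cumulative sum rather than a sum over a fixed window such as the last 30 days.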

Informatica has a moving sum function, listed under the Numerical functions in the Expression transformation:
MOVINGSUM(n as numeric, i as integer, [where as expression])
Please check if it helps.
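For intuition, a moving sum over the last i rows can be sketched in Python as below. Returning None until a full window of rows has been read is an assumption made for this sketch, so check the Informatica documentation for how MOVINGSUM treats the first rows. The name moving_sum is hypothetical.

```python
from collections import deque


def moving_sum(values, window):
    """Sum of the most recent `window` values; None until the window is full (assumption)."""
    recent = deque(maxlen=window)  # old values fall out automatically
    result = []
    for value in values:
        recent.append(value)
        result.append(sum(recent) if len(recent) == window else None)
    return result


print(moving_sum([1, 2, 5, 1, 1], 3))  # [None, None, 8, 8, 7]
```

Unlike the running total, each result here only covers a fixed number of rows, which is closer to what a 30-day requirement needs when there is one row per day.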

DAX - average with multiple filter conditions

I have data like:
   Folder       Replied  Complied
1  testing      1        1
2  /complete/   0        1
3  none         1        1
4  Incomplete   0        1
5  complete//   0        0
6  Incomplete   1        0
7  ABCcomplete  1        1
I would like a measure to calculate the average of Complied (sum divided by count), only where Folder contains the string complete AND Replied is 0 (both conditions simultaneously).
Therefore rows 2, 4, 5 should be used in the count, resulting in 0.66... (1 + 1 + 0)/3
I've tried several things, but the formula either results in an error or returns the wrong result.
For example:
Measure = CALCULATE (
Average( [Complied]),
CONTAINSSTRING([Folder],"complete") && [replied] = 0
)
DAX is very confusing to me. Thanks in advance
Edit:
I've seen examples like
= CALCULATE(AVERAGE([col]), CONTAINSSTRING([Folder],"complete") , [replied] = 0)
Note the , instead of &&, but that doesn't work for some reason either. Neither does AND(condition1, condition2).
This DAX measure should be the one you are looking for:
Measure = CALCULATE(AVERAGE(Sheet1[Complied]),
CONTAINSSTRING(Sheet1[Folder],"complete") && Sheet1[Replied]=0)
So how does it work?
CONTAINSSTRING checks whether Folder contains "complete", working like the VBA InStr function.
&& requires both conditions to be met.
CALCULATE(expression, filter) evaluates the expression over only the rows that pass the filter.
You may first test with the following expression, added as a calculated column, to check whether the condition works in your case:
IF(CONTAINSSTRING(Sheet1[Folder],"complete") && Sheet1[Replied]=0,"True","False")
Only three rows (2, 4 and 5) are True here.
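As a sanity check on the expected result, the same filter-then-average logic can be sketched in plain Python over the sample rows from the question. The lower-casing mirrors the fact that CONTAINSSTRING is not case-sensitive; the variable names are made up for the example.

```python
# (Folder, Replied, Complied) rows from the question, in order.
rows = [
    ("testing", 1, 1),
    ("/complete/", 0, 1),
    ("none", 1, 1),
    ("Incomplete", 0, 1),
    ("complete//", 0, 0),
    ("Incomplete", 1, 0),
    ("ABCcomplete", 1, 1),
]

# Keep Complied only where Folder contains "complete" AND Replied is 0.
matching = [
    complied
    for folder, replied, complied in rows
    if "complete" in folder.lower() and replied == 0
]
average = sum(matching) / len(matching)

print(matching)           # [1, 1, 0]
print(round(average, 2))  # 0.67
```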

Transforming a logic constraint into python pulp code

I started working on a problem in the past several days...
A company plans its business in a three month period. It can produce
110 units at a cost of 600 each. The minimum amount it must produce
per month is 15 units if active (but of course, it can choose to be closed
during the month, and produce 0 units). Each month it can subcontract the
production of 60 units, at a cost of 660 each. Storing a unit for one month
costs 20$ per unit per month. The marketing department has forecasted
sales of 100, 130 and 150 units for the next three months, respectively.
The goal is to meet the demand each month while minimizing the total
cost.
I deduced that we need an objective function of the form min[Sum(i=1..3) 600*x1+660*x2+20*x3].
We need to add some constraints: x1>=15, and 0<=x2<=60.
We will also need another constraint for each month...
For the first one, i=1 => x1+x2 = 100 - x3last (x3last is an extra variable that holds the amount in storage from the previous month), and the same constraints for i=2 and i=3.
I don't have any idea how to write this in PuLP, and I would appreciate some help. Thanks ^_^
I'd tend to agree with @Erwin that you should focus on formulating the problem as a Linear Program. It is then easy to translate this into code in PuLP or one of many other LP libraries/tools/languages.
As an example of this, let's work through this process for the example problem you have written out in your question.
Decision Variables
The first thing to decide is what you can/should decide. This set of information is called the decision variables. Picking the best/easiest decision variables for your problem comes with practice - the important thing is that once you know the values of the variables you have a unique solution to the problem.
Here I would suggest the following. These assume that the forecasts for demand are perfect. For each month i:
Whether the production line should be open - o[i]
How much to produce in that month - p[i]
How much to hold in storage for next month - s[i]
How much to get made externally - e[i]
Objective Function
The objective in your case is obvious: minimise the total cost. So we can just write this down: sum(i=1...3)[p[i]*600 + s[i]*20 + e[i]*660]
Constraints
Let's lift these directly out of your problem description:
"It can produce 110 units at a cost of 600 each. The minimum amount it must produce per month is 15 units if active (but of course, it can choose to be closed during the month, and produce 0 units)."
p[i] >= o[i]*15
p[i] <= o[i]*110
The first constraint forces the minimum production amount to be 15 if the production line is open that month (o[i] == 1); if it is not open, the constraint reduces to p[i] >= 0. The second constraint sets a maximum value on p[i] of 110 if the production line is open and a maximum production of 0 if it is closed that month (o[i] == 0).
"Each month it can subcontract the production of 60 units, at a cost of 660 each"
e[i] <= 60
"The marketing department has forecasted sales of 100, 130 and 150 units for the next three months, respectively. The goal is to meet the demand each month while minimizing the total cost." If we declare the sales in each month to be sales[i], we can define our "flow constraint" as:
p[i] + e[i] + s[i-1] == s[i] + sales[i]
The way to think of this constraint is inputs on the left, and outputs on the right. Inputs of units are production, external production, and stuff taken out of storage from last month. Outputs are units left/put in storage for next month and sales.
Finally in code:
from pulp import *

all_i = [1, 2, 3]
all_i_with_0 = [0, 1, 2, 3]
sales = {1: 100, 2: 130, 3: 150}

# Decision variables ('Continuous' is the PuLP category for real-valued variables)
o = LpVariable.dicts('open', all_i, cat='Binary')
p = LpVariable.dicts('production', all_i, cat='Continuous')
s = LpVariable.dicts('stored', all_i_with_0, lowBound=0, cat='Continuous')
e = LpVariable.dicts('external', all_i, lowBound=0, cat='Continuous')

prob = LpProblem("MinCost", LpMinimize)
prob += lpSum([p[i]*600 + s[i]*20 + e[i]*660 for i in all_i])  # Objective

for i in all_i:
    prob += p[i] >= o[i]*15    # Minimum production if open
    prob += p[i] <= o[i]*110   # Capacity if open, zero if closed
    prob += e[i] <= 60         # Subcontracting limit
    prob += p[i] + e[i] + s[i-1] == sales[i] + s[i]  # Flow balance
prob += s[0] == 0  # No stock inherited from previous months

prob.solve()

# The status of the solution
print("Status:", LpStatus[prob.status])
# Display the optimum of each variable
for v in prob.variables():
    print(v.name, "=", v.varValue)
# Objective function
print("Obj. Fcn: ", value(prob.objective))
Which returns:
Status: Optimal
external_1 = 0.0
external_2 = 10.0
external_3 = 40.0
open_1 = 1.0
open_2 = 1.0
open_3 = 1.0
production_1 = 110.0
production_2 = 110.0
production_3 = 110.0
stored_0 = 0.0
stored_1 = 10.0
stored_2 = 0.0
stored_3 = 0.0
Obj. Fcn: 231200.0
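The reported solution can be checked by hand against the constraints and the objective. The following plain-Python sketch does that without needing a solver; the dictionaries simply restate the printed values, and the variable names are chosen for the example.

```python
sales = {1: 100, 2: 130, 3: 150}
production = {1: 110, 2: 110, 3: 110}
external = {1: 0, 2: 10, 3: 40}
stored = {0: 0, 1: 10, 2: 0, 3: 0}

for i in sales:
    # Flow balance: production + external + previous stock == sales + new stock.
    inputs = production[i] + external[i] + stored[i - 1]
    outputs = sales[i] + stored[i]
    assert inputs == outputs, f"month {i} does not balance"
    # Capacity limits (every month is open in this solution).
    assert 15 <= production[i] <= 110
    assert 0 <= external[i] <= 60

total_cost = sum(
    production[i] * 600 + external[i] * 660 + stored[i] * 20 for i in sales
)
print(total_cost)  # 231200
```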

ternary operator ?: is throwing errors about wrong arguments

I'm trying to display a different time frame's MACD on a given chart, e.g. the 5 min MACD on a 1 min chart.
I've decided to accomplish that by multiplying the interval, which is an integer, by 5, turning the result into a string, and using that in the plot.
This works fine since I don't have to change it every time I change the time frame of the chart from 1 to 10 min etc., and it will still display the longer time frame MACD based on the multiple.
This following code works fine using the ternary operator ?:
//@version=2
study(title="test")
source = close
fastLength = input(12, minval=1)
slowLength=input(26,minval=1)
signalLength=input(9,minval=1)
// res5 multiplies the current interval, which is an integer, by a factor of 5 and turns it into a string with the value of "interval*5" or "1D", depending on the value of interval*5
res5= interval*5 < 1440 ? tostring(interval*5) : "1D"
src5=security(tickerid, res5, close)
fastMA5 = ema(src5, fastLength)
slowMA5 = ema(src5, slowLength)
macd5 = fastMA5 - slowMA5
signal5 = sma(macd5, signalLength)
outMacD5 = security(tickerid, res5, macd5)
plot( outMacD5 ? outMacD5 : na, color= red)
But if I were to change it to have more conditions like below, the ternary operator fails.
//@version=2
study(title="test")
source = close
fastLength = input(12, minval=1)
slowLength=input(26,minval=1)
signalLength=input(9,minval=1)
// res5 multiplies the current interval, which is an integer, by a factor of 5 and turns it into a string with the value of "interval*5" or "1D", depending on the value of interval*5
//res5= interval*5 < 1440 ? tostring(interval*5) : "1D"
res5= interval*5 < 1440 ? tostring(interval*5) : interval >= 1440 and interval*5 < 2880 ? "1D":na
src5=security(tickerid, res5, close)
fastMA5 = ema(src5, fastLength)
slowMA5 = ema(src5, slowLength)
macd5 = fastMA5 - slowMA5
signal5 = sma(macd5, signalLength)
outMacD5 = security(tickerid, res5, macd5)
plot( outMacD5 ? outMacD5 : na, color= red)
That brings back the error
Add to Chart operation failed, reason: Error: Cannot call `operator ?:` with arguments (bool, literal__string, na); available overloads ...
Using the iff brings back the same error about the arguments being incorrect.
I could really use some help here. I'm so lost in using these conditional operators.
Any tips are helpful.
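Use this, which returns an empty string instead of na so that both result branches of the ternary are strings: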
res5= interval*5 < 1440 ? tostring(interval*5) : interval >= 1440 and interval*5 < 2880 ? "1D": ""
plotchar(res5=="5", "res5 test", "", location=location.top)
The plotchar() call will allow you to confirm res5's value. Here it is being tested for "5", so you will be able to verify in the Data Window (doesn't print anything in the indicator's pane so it doesn't disturb scale) that its value is 1 -> true when you are on a 1min chart.
[Edit 2019.08.19 09:02 — LucF]
Your question was about the ternary not working, which the code above resolves. Following your comment, you also need a more complete function to calculate a multiple of the current timeframe in v2. Use this:
f_MultipleOfRes( _mult) =>
    // Convert target timeframe in minutes.
    _TargetResInMin = interval * _mult * (
      isseconds ? 1. / 60. :
      isminutes ? 1. :
      isdaily ? 1440. :
      isweekly ? 7. * 24. * 60. :
      ismonthly ? 30.417 * 24. * 60. : na)
    // Find best way to express the TF.
    _TargetResInMin <= 0.0417 ? "1S" :
      _TargetResInMin <= 0.167 ? "5S" :
      _TargetResInMin <= 0.376 ? "15S" :
      _TargetResInMin <= 0.751 ? "30S" :
      _TargetResInMin <= 1440 ? tostring(round(_TargetResInMin)) :
      tostring(round(min(_TargetResInMin / 1440, 365))) + "D"
See here for a use case, but don't use that function code as it's v4.
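To follow the arithmetic of the v2 function above outside of Pine, here is a rough Python port. It is a sketch only: the unit is passed in as a string instead of being read from isseconds, isminutes and so on, rounding is done half-up for positive values, and the names multiple_of_res and MINUTES_PER_UNIT are made up for the example.

```python
# Minutes represented by one unit of each chart timeframe type.
MINUTES_PER_UNIT = {
    "seconds": 1 / 60,
    "minutes": 1,
    "daily": 1440,
    "weekly": 7 * 24 * 60,
    "monthly": 30.417 * 24 * 60,
}


def multiple_of_res(interval, unit, mult):
    """Return a resolution string for `mult` times the current timeframe."""
    # Convert the target timeframe to minutes.
    target_min = interval * mult * MINUTES_PER_UNIT[unit]
    # Find the best way to express the timeframe.
    if target_min <= 0.0417:
        return "1S"
    if target_min <= 0.167:
        return "5S"
    if target_min <= 0.376:
        return "15S"
    if target_min <= 0.751:
        return "30S"
    if target_min <= 1440:
        return str(int(target_min + 0.5))
    return str(int(min(target_min / 1440, 365) + 0.5)) + "D"


print(multiple_of_res(1, "minutes", 5))   # 5
print(multiple_of_res(1, "daily", 5))     # 5D
print(multiple_of_res(30, "seconds", 1))  # 30S
```

An unknown unit raises a KeyError here, whereas the Pine version would propagate na.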

Power BI dividing function shows negative value instead of positive

I have a problem that is probably very simple. My data contains sales values for 2 years for some markets. I'm trying to calculate annual growth, so I divide the 2019 value of a market by its 2018 value and then subtract 1. (Example: 2018 sales: $100, 2019 sales: $200. Growth is (200/100) - 1 = 1 = 100%.) But some markets had 0 sales in 2018 and only started operating in 2019. For these the growth should be +100%, but the measure gives -100%.
YTD19vs18 = (DIVIDE(SUM(YTDPerformans[YTD 2019]);SUM(YTDPerformans[YTD 2018])))-1
Sure it gives -1, because that's the result of the calculation: the 2018 value is 0, so DIVIDE returns BLANK instead of dividing by zero, and BLANK - 1 evaluates to -1.
You can include a check for this case. Something like: if the previous year is zero, always return 1 = 100%:
YTD19vs18 = IF(
    SUM(YTDPerformans[YTD 2018]) = 0;
    1;
    DIVIDE(SUM(YTDPerformans[YTD 2019]);
           SUM(YTDPerformans[YTD 2018])) - 1
)
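The intended behaviour can be sketched in a few lines of Python. Treating growth from a zero base as 100% is the convention asked for in the question, not a mathematical result, and the function name growth is made up for the example.

```python
def growth(current, previous):
    """Year-over-year growth; a zero previous year is reported as +100% by convention."""
    if previous == 0:
        return 1.0
    return current / previous - 1


print(growth(200, 100))  # 1.0
print(growth(200, 0))    # 1.0
print(growth(50, 100))   # -0.5
```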

How to create a variable taking value X+1 if an event doesn't occur in X periods?

How can I create a new variable that takes the value X+1 if an event doesn't occur in X periods of time?
Specifically, I have data on many people over 12 years. For one question, they could answer yes (1) or no (0). I care about the first time someone says Yes during the 12 years, and I created a variable that takes the value of the number of years with Yes replies.
But if someone replies No for all 12 years, I want to set that variable to 13, and I'm stuck on how to do that.
by hhidpn (wave), sort: gen byte EarlyHeart = sum(rhearte) == 1
gen EarlyHeart1=year if EarlyHeart==1
(what's next?)
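If the last cumulative sum for an individual is 0, then every reply was 0. Test the running sum itself rather than the indicator, because the indicator is also 0 at the end for anyone whose running sum has passed 1 (cumheart is a variable name introduced here for illustration):
by hhidpn (wave), sort: gen cumheart = sum(rhearte)
by hhidpn: gen byte EarlyHeart = cumheart == 1
by hhidpn: replace EarlyHeart = 13 if cumheart[_N] == 0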
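For readers who want to see the per-person logic outside Stata, here is a small Python sketch that returns the wave of the first Yes, or 13 when all 12 replies are No. It is one reading of the goal stated in the question, not a translation of the Stata code, and the function name first_yes_wave is hypothetical.

```python
def first_yes_wave(replies, never_value=13):
    """Return the 1-based wave of the first Yes (1), or never_value if every reply is No (0)."""
    for wave, reply in enumerate(replies, start=1):
        if reply == 1:
            return wave
    return never_value


print(first_yes_wave([0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]))  # 3
print(first_yes_wave([0] * 12))                               # 13
```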