COUNTROWS gives wrong output for a large dataset in Power BI DAX - powerbi

Below are the details of Table1, which has columns such as shop, shelf and product. Each row means that a particular shop has a particular shelf on which a product is placed.
The table below is a small sample of the dataset.
shop shelf product
a a1 p1xxxxx
a a2 p2xxxxx
a a3 p1xxxxx
a a4 p1xxxxx
b b1 p1xxxxx
b b2 p2xxxxx
b b3 p3xxxxx
b b4 p1xxxxx
b b5 p2xxxxx
b b6 p1xxxxx
c c1 p3xxxxx
c c2 p3xxxxx
c c3 p2xxxxx
c c5 p2xxxxx
c c6 p3xxxxx
My aim is to count the rows where a particular product, "p1", is placed.
For a small volume of data the formula below works fine, but when I run the same formula on a large volume of data, it shows the total sum for that product on each row.
count = IF(
    CALCULATE(
        COUNTROWS(Table1),
        SEARCH("*p1*", Table1[product], , 0)
    ) = BLANK(),
    0,
    CALCULATE(
        COUNTROWS(Table1),
        SEARCH("*p1*", Table1[product], , 0)
    )
)
The screenshot below is for the small volume of data, where the formula works fine.
But for the large volume of data the formula doesn't work; instead it gives the total sum for that particular product. Below is the screenshot for reference.
This post is also related to another question about the same dataset.

Related

How to create an alphanumeric grid in a certain sequence allowing double digit numbers?

I have a grid feature class that varies in size and shape. My test shapefile is a 3x4 grid. I need to create an alphanumeric sequence that goes in a specific order but can be scaled for a different size grid. Below is the order the grid is in:
A4 | B4 | C4
A3 | B3 | C3
A2 | B2 | C2
A1 | B1 | C1
To use this alphanumeric sequence, the list needs to be printed in a specific order (starting from the bottom left of the table, moving to the right, and then returning to the leftmost value on the next row up):
A1, B1, C1, A2, B2, C2, A3, B3, C3, A4, B4, C4
I had this:
from itertools import product
from string import ascii_uppercase, digits
for x, y in product(ascii_uppercase, digits):
    print('{}{}'.format(x, y))
It generates a sequence like: A0 through A9, then B0 through B9, and so forth.
However, I also need larger grids, so the script has to allow double-digit numbers after 9 when the grid is more than 9 rows high,
i.e. A10, B10, C10
I then tried to make 2 lists and then combine them together, but I ran into the problem of joining these in the sequence I need.
w = 3
h = 4
alpha = []
numeric = []
for letter in ascii_uppercase[:w]:
    alpha.append(letter)
for num in range(1, h+1):
    numeric.append(num)
I assume I might not need to make a numeric list, but I don't know how to do it otherwise. I know slightly more than just the basics of Python and have created some more complex scripts, but this is really puzzling me! I feel like I am so close but missing something really simple in both of my samples above. Thank you for any help you can give me!
Solved; here is what I ended up with, for others who might have the same question:
from string import ascii_uppercase

w = 9
h = 20
alpha = []
numeric = []
for letter in ascii_uppercase[:w]:
    alpha.append(letter)
for num in range(1, h+1):
    numeric.append(num)
# Width of the largest row number, used to zero-pad the labels
longest_num = len(str(max(numeric)))
for y in numeric:
    for x in alpha:
        print('{}{:0{}}'.format(x, y, longest_num))
I didn't need the code formatted as a table since I was going to perform a field calculation in ArcMap.
After you compute numeric, also do:
longest_num = len(str(max(numeric)))
and change your format statement to:
'{}{:0{}}'.format(x, y, longest_num)
This ensures that when you get to double digits you get the following result:
A12 | B12 | C12
A11 | B11 | C11
...
A02 | B02 | C02
A01 | B01 | C01
To actually print the grid, however, you need to change your code:
longest_num = len(str(max(numeric)))
for y in reversed(numeric):
    print(" | ".join('{}{:0{}}'.format(x, y, longest_num)
                     for x in alpha))
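For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name grid_labels and its pad parameter are made up for illustration, and it assumes single-letter columns (at most 26).

```python
from string import ascii_uppercase


def grid_labels(width, height, pad=False):
    """Return labels row by row, starting at the bottom-left cell."""
    if width > len(ascii_uppercase):
        raise ValueError("single-letter columns only support up to 26 columns")
    # With pad=True every row number is zero-padded to the widest one
    digits = len(str(height)) if pad else 1
    return [
        "{}{:0{}}".format(letter, row, digits)
        for row in range(1, height + 1)          # bottom row first
        for letter in ascii_uppercase[:width]    # left to right
    ]


print(", ".join(grid_labels(3, 4)))
```

With pad=False the labels look like A1 ... C10; with pad=True they look like A01 ... C10.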

Minimum between two boolean decision variables in Pyomo

I need to do something like this:
d1 == min(d2,d3)
where d1, d2 and d3 are decision variables. I need to use Pyomo. In cplex this is achieved with the function minl; how can I do this in Pyomo, or in an equivalent linear form?
I searched on Google and found that I could require d1 to be less than or equal to both d2 and d3. But this does not fit my problem, because if d2 and d3 are both equal to 1, this only gives d1 <= 1, while I need d1 == 1.
Thanks for replies.
When the d variables are binary variables,
d1 = min(d2,d3)
is really the same as multiplication
d1 = d2*d3
This is often linearized as
d1 <= d2
d1 <= d3
d1 >= d2+d3-1
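A quick way to convince yourself is to enumerate all binary combinations in plain Python (no solver involved; the helper name satisfies_linearization is only illustrative). For every (d2, d3) exactly one value of d1 satisfies the three inequalities, and it equals both min(d2, d3) and d2*d3:

```python
from itertools import product


def satisfies_linearization(d1, d2, d3):
    """True if (d1, d2, d3) satisfies the three linear constraints."""
    return d1 <= d2 and d1 <= d3 and d1 >= d2 + d3 - 1


for d2, d3 in product((0, 1), repeat=2):
    # Collect every binary d1 that the constraints allow for this (d2, d3)
    feasible = [d1 for d1 in (0, 1) if satisfies_linearization(d1, d2, d3)]
    assert feasible == [min(d2, d3)] == [d2 * d3]
    print(d2, d3, feasible)
```

Note that this only holds for binary variables; for general integer or continuous variables the min is not the same as the product.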

Predict data values given history and constraints

If I have a data series and a set of constraints and want to predict the most likely values, what is the right algorithm or approach? For example, given the data table as follows:
The first three rows illustrate typical data values. Imagine we have dozens or hundreds of such rows. The constraints on the system are as follows:
G1 + G2 + G3 + G4 == D1 + D2 + D3
G1 + G2 = D1 - C1
G3 = D2 + C1 - C2
G4 = D3 + C2
So, given D1, D2 and D3 we need to predict G1, G2, G3, G4, C1, and C2. Note that there may not necessarily be enough information to solve the system by linear programming alone and so some kind of trend analysis or probability distribution might need to be made.
What is the right algorithm or approach to solve a problem like this?

"initial values are not feasible" error message while running structural equation model in Stata?

I am working on a structural equation model (SEM) with 47 observed variables and 6 latent variables, of which 5 observed variables and one latent variable are endogenous. The data have no missing values and the sample size is 4,634.
I ran sem in Stata using the following command:
sem (I -> i1 i2 i3 i4 i5_1) ///
(N -> n1 n2 n3 n4) ///
(S -> s1 s2 s3 s4 s5 s6 s7 s8 s9) ///
(T -> t1 t2 t3 t4) ///
(SES -> se1 se2 se3 se4 se5 se6 se7 se8 se9 se10 ///
se11 se12 se13 se14 se15 se16 se17 se18 se19 se20) ///
(CS -> c1 c2 c3 c4 c5) ///
(CS <- I N S T SES)
It returned the following error message:
initial values are not feasible
Why am I receiving this message? How can I deal with this error?
I would start by looking at each measurement model separately, to see if there are problems there, i.e.:
sem (i1 i2 i3 i4 i5_1 <- I)
sem (n1 n2 n3 n4 <- N)
etc.
My guess would be that the model for SES might prove to be the problem.
Edit:
Based on your comment we now know that the measurement models converge in isolation. The next step would be to check whether each of the measurement models makes sense: Does each of the loadings have the expected sign? Are there loadings that are unexpectedly large or small? If you see that, you need to figure out why that is the case. This just requires staring at your data, looking at graphs and correlation tables.
If there is no problem with your measurement models, then the next step would be to look at the structural part. Obviously you cannot do the same trick as with the measurement models, that is, you cannot estimate the structural part without the measurement models. The structural part contains latent variables, and it is the measurement models that define what they are. So without measurement models, the structural model is not identified.
What I would do instead is simplify your model, and then add complexity until you run into problems. For example, I might start with:
sem (I -> i1 i2 i3 i4 i5_1) ///
(CS -> c1 c2 c3 c4 c5) ///
(CS <- I)
Then continue with:
sem (I -> i1 i2 i3 i4 i5_1) ///
(N -> n1 n2 n3 n4) ///
(CS -> c1 c2 c3 c4 c5) ///
(CS <- I N)
etc.
That way you can find which latent variable causes trouble. My first move would be to look at the measurement model of that variable and look at the scale of that variable. By default sem "borrows" the scale of one of the observed variables by setting the loading for that variable to 1. Is that variable in some sense "weird"? Similarly I would also look at the scale for your endogenous latent variable CS. If they are weird, you can choose to constrain the loading of another variable with a more reasonable scale to 1, or you can "standardize" your latent variable by constraining the variance of the latent variable to be 1.

Finding a submatrix with largest sum [duplicate]

Closed 11 years ago.
Possible Duplicate:
Finding a submatrix with the maximum possible sum in O(n^2)
I have an NxN matrix. I want to find the MxM submatrix of it with the largest sum of elements.
What is an efficient algorithm for this?
I have an algorithm whose running time stays constant as M grows and grows quadratically as N grows.
Compute the sum of the first submatrix the usual way and save it. Then move one field to the right along the row: the two MxM matrices overlap, so you only need to sum the two non-overlapping columns. Save all the sums. Now you can choose the largest sum for the line.
Move to the next line. Remember the saved sums? The first MxM matrices of the first and second lines overlap again, so you only need to sum the two non-overlapping rows (the first row of the upper matrix and the last row of the lower one) to compute the first sum in the second line.
Now move to the second sum of the second line. Do the same thing as above, but notice that the rows needed for the first sum and for the second sum in the second line overlap again.
I know it is kind of confusing; if you don't get it, let me know and I will draw a picture of it. The algorithm is based on the paper in this answer.
EDIT: I know I promised picture, but this should be enough:
A AB AB AB AB B
AC ABCD ABCD ABCD ABCD BD
AC ABCD ABCD ABCD ABCD BD
AC ABCD ABCD ABCD ABCD BD
AC ABCD ABCD ABCD ABCD BD
C CD CD CD CD D
These are four submatrices, A, B, C and D, positioned like this:
AB
CD
First you count the sum of the A submatrix: sum(A). No optimizations here. Now you want to count the sum of B: sum(B). You can do the same as for A, but notice that A and B overlap. So you assign sum(A) to sum(B), count the sum of the vertical vector A AC AC AC AC and subtract it from sum(B), then count the sum of the vertical vector B BD BD BD BD and add it to sum(B).
sum(B) = sum(A) - sum(A AC AC AC AC) + sum(B BD BD BD BD)
You have sum(B). Now you can continue and compute the whole first line of submatrices.
Move to the second line: matrices C and D. You don't have to sum the whole matrix C, because in the previous line you saved sum(A). Notice they overlap again. You just need to add the difference between A and C:
//code (1)
subC = sum([A AB AB AB AB]) //row to subtract for C
addC = sum([C CD CD CD CD]) //row to add for C
sum(C) = sum(A) - subC + addC
You have sum(C). Now you can get sum(D) like this:
//code (2)
subD = sum([AB AB AB AB B]) //row to subtract for D
addD = sum([CD CD CD CD D]) //row to add for D
sum(D) = sum(B) - subD + addD
but compare subD vs. subC and addD vs. addC. They overlap! So you can do it this way:
//code (3)
subD = subC - A + B //from subC subtract the value at A and add the value at B
addD = addC - C + D //same as above
sum(D) = sum(B) - subD + addD
You see that instead of 25 additions for computing the sum of one submatrix, we do 6. For every possible size of MxM, we have MxM additions for the first submatrix, M*2+2 additions for the first row and for the first column, and 6 additions for the rest.
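Here is a minimal Python sketch of the same overlap idea. It is a simplified variant, not the exact subC/addC bookkeeping above: it keeps one running sum per column for the current band of M rows, updates those as the band moves down, and slides the window to the right by adding one column sum and removing another. The function name max_submatrix_sum and its return format are made up for illustration.

```python
def max_submatrix_sum(matrix, m):
    """Return (best_sum, top_row, left_col) over all m x m windows."""
    n = len(matrix)
    if not 1 <= m <= n or any(len(row) != n for row in matrix):
        raise ValueError("expected an n x n matrix and 1 <= m <= n")

    # col_sums[j] = sum of column j over the current band of m rows
    col_sums = [sum(matrix[i][j] for i in range(m)) for j in range(n)]
    best = None
    for top in range(n - m + 1):
        if top > 0:
            # Move the band down one row: drop the row above, add the new bottom row
            for j in range(n):
                col_sums[j] += matrix[top + m - 1][j] - matrix[top - 1][j]
        window = sum(col_sums[:m])
        for left in range(n - m + 1):
            if left > 0:
                # Slide right: drop the leftmost column strip, add the new right one
                window += col_sums[left + m - 1] - col_sums[left - 1]
            if best is None or window > best[0]:
                best = (window, top, left)
    return best


example = [
    [1, -2, 0],
    [3, 4, -5],
    [-1, 2, 6],
]
print(max_submatrix_sum(example, 2))
```

After the first band is set up, each window costs a constant number of additions plus one column update per band step, so the total work is on the order of N*N regardless of M.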