Postgres: Window Function row_number() wrong output?

I have a confusing problem here. I'm working with some arrays and trying to get the 10 smallest values across all of them, along with the array each value belongs to and its position inside that array.
My relation is arrays(id int, array float[]);
It holds several stored arrays:
1, '{v1,v2,v3,v4,v5...}'
2, '{v1,v2,v3,v4,v5...}'...etc
My first query is:
WITH T1 AS(SELECT id, unnest(array) value from arrays order by value LIMIT 10)
SELECT T1.id as id, cell(array,value) as offset, value from T1;
Here cell() is a UDF I wrote that returns the position of a given value in a given array.
The second query (using window functions) is:
WITH T1 AS(SELECT id, unnest(array) value from arrays)
SELECT id, row_number() over (partition by sid) as offset, value from T1 order by value LIMIT 10;
Although both return the same values (which is correct), the offsets differ and seem to be somehow upside-down.
Here are some example outputs from the bigger arrays I'm working with, where you can see the problem I'm having.
Query 1 output:
id | offset | value
-----+--------+-----------
1 | 17569 | 0.0156216
1 | 20801 | 0.0164499
1 | 20802 | 0.0171007
1 | 17570 | 0.0171008
1 | 17568 | 0.0180476
1 | 20800 | 0.0182249
1 | 20803 | 0.0194675
1 | 1411 | 0.02142
1 | 1412 | 0.02142
1 | 1413 | 0.0215976
Query 2 output:
id | offset | value
-----+--------+-----------
1 | 6591 | 0.0156216
1 | 9823 | 0.0164499
1 | 9824 | 0.0171007
1 | 6592 | 0.0171008
1 | 6590 | 0.0180476
1 | 9822 | 0.0182249
1 | 9825 | 0.0194675
1 | 26144 | 0.02142
1 | 26140 | 0.02142
1 | 26149 | 0.0215976
I would appreciate any help please. Thank you!

You haven't specified an order in your window function in Query 2, which means that Postgres will probably be sorting internally by sid before your outer ORDER BY value is applied. Specify the order inside the OVER clause:
WITH t1 AS (
    SELECT id, UNNEST( array ) AS value
    FROM arrays
)
SELECT id, row_number() OVER ( PARTITION BY sid ORDER BY value ) AS offset, value
FROM t1
ORDER BY value
LIMIT 10;
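The underlying point is that a row number only means something relative to an explicit order. As a rough illustration outside Postgres, the Python sketch below tags each element with its 1-based position while the array is still in its stored order, and only then picks the smallest values. The sample data and the names `arrays` and `smallest_with_offsets` are made up for this example and do not come from the question.

```python
import heapq

# Hypothetical sample data: id -> array of floats (stands in for the arrays table).
arrays = {
    1: [0.5, 0.1, 0.9, 0.3],
    2: [0.7, 0.2, 0.4],
}

def smallest_with_offsets(arrays, n):
    """Return the n smallest (id, offset, value) rows; offset is 1-based."""
    rows = [
        (array_id, offset, value)
        for array_id, values in arrays.items()
        # enumerate records the position before any sorting takes place
        for offset, value in enumerate(values, start=1)
    ]
    # Sort by value only; the offset travels with the value unchanged.
    return heapq.nsmallest(n, rows, key=lambda row: row[2])

for row in smallest_with_offsets(arrays, 3):
    print(row)
```

Because the offset is attached before the sort, it cannot be affected by whatever order the rows are later processed in.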

What is a more efficient way to find the row-wise sum in Power BI DAX?

I have a sample table with the following values:
location | col1 | col2 | col3 | col4
------------------------------------------
usa1 | 1 | 1 | 1 | 1
usa2 | 1 | 0 | 1 | 1
Values are boolean: true (1) and false (0).
I would like to add a new column that shows the sum per row. The article at https://www.c-sharpcorner.com/article/sum-multiple-column-using-dax-in-power-bi/
suggested the following approach:
Measure Total = SUM(table[col1]) + SUM(table[col2]) + ... + SUM(table[colx])
I get the expected sum for the four columns I tried, but if I have 20 columns, I was hoping you could guide me to write the DAX more efficiently.
Expected output:
location | col1 | col2 | col3 | col4 | sum
------------------------------------------
usa1 | 1 | 1 | 1 | 1 | 4
usa2 | 1 | 0 | 1 | 1 | 3
I would use the unpivot feature of Power Query to go from wide to long: select location and unpivot all other columns.
Then the sum by location is immediate in any visual, with no need for DAX.
One way I do it is
Sum = table[col1] + table[col2] + table[col3] + ...
I am not sure whether there is another way for your situation, since I have only ever had at most 5 columns to add.
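To make the wide-to-long idea concrete, here is a small Python sketch (not Power Query or DAX) that unpivots the sample rows from the question and then sums by location. The function name `unpivot` and the variable names are illustrative only.

```python
from collections import defaultdict

# Sample rows from the question, in wide form.
wide = [
    {"location": "usa1", "col1": 1, "col2": 1, "col3": 1, "col4": 1},
    {"location": "usa2", "col1": 1, "col2": 0, "col3": 1, "col4": 1},
]

def unpivot(rows, key="location"):
    """Turn each wide row into (key, column name, value) triples."""
    return [
        (row[key], name, value)
        for row in rows
        for name, value in row.items()
        if name != key
    ]

long_rows = unpivot(wide)

# Once the data is long, the row-wise sum is a plain group-by on location.
totals = defaultdict(int)
for location, _column, value in long_rows:
    totals[location] += value

print(dict(totals))
```

The sum no longer needs to name each column, so going from 4 to 20 columns changes nothing in the aggregation step.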

How to sum up a measure based on different levels in Power BI using DAX

I have the following table structure:
| Name 1 | Name 2 | Month | Count 1 | Count 2 | SumCount |
|--------|--------|--------|---------|---------|----------|
| A | E | 1 | 5 | 3 | 8 |
| A | E | 2 | 1 | 6 | 7 |
| A | F | 3 | 3 | 4 | 7 |
Now I calculate the following with a DAX measure.
Measure = (sum(Table[Count 2]) - sum(Table[Count 1])) * sum(Table[SumCount])
I can't use a calculated column because then the formula is applied before a level (e.g. Month) is excluded. With the measure added to my table structure and Month excluded, it looks like this:
| Name 1 | Name 2 | Count 1 | Count 2 | SumCount | Measure |
|--------|--------|---------|---------|----------|---------|
| A | E | 6 | 9 | 15 | 45 |
| A | F | 3 | 4 | 7 | 7 |
I added a table to the view which only displays Name 1, in which case the measure of course sums Count 1, Count 2 and SumCount first and then applies the formula, which leads to the following result:
| Name 1 | Measure |
|--------|---------|
| A | 88 |
But the desired result should be
| Name 1 | Measure |
|--------|---------|
| A | 52 |
which is the sum of Measure.
So basically I want the calculation Measure = (sum(Table[Count 2]) - sum(Table[Count 1])) * sum(Table[SumCount]) at my base level, but when drilling up and grouping those names it should only perform a sum.
An iterator function like SUMX is what you want here since you are trying to sum row by row rather than aggregating first.
Measure = SUMX ( Table, ( Table[Count 2] - Table[Count 1] ) * Table[SumCount] )
Any filters you have will be applied to the first argument, Table, and it will only sum the corresponding rows.
Edit:
If I'm understanding correctly, you want to aggregate over Month before taking the difference and product. One way to do this is by summarizing (excluding Month) before using SUMX like this:
Measure =
VAR Summary =
    SUMMARIZE (
        Table,
        Table[Name 1],
        Table[Name 2],
        "Count1Sum", SUM ( Table[Count 1] ),
        "Count2Sum", SUM ( Table[Count 2] ),
        "SumCountSum", SUM ( Table[SumCount] )
    )
RETURN
    SUMX ( Summary, ( [Count2Sum] - [Count1Sum] ) * [SumCountSum] )
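The difference between the three approaches can be checked by hand. The Python sketch below (not DAX) applies the same formula to the question's three rows at different levels: after aggregating everything, per original row, and per (Name 1, Name 2) group. The function names are made up for this illustration.

```python
from collections import defaultdict

# Rows from the question: (name1, name2, month, count1, count2, sum_count)
rows = [
    ("A", "E", 1, 5, 3, 8),
    ("A", "E", 2, 1, 6, 7),
    ("A", "F", 3, 3, 4, 7),
]

def formula(count1, count2, sum_count):
    return (count2 - count1) * sum_count

def aggregate_then_apply(rows):
    """Sum every column first, then apply the formula once."""
    return formula(
        sum(r[3] for r in rows),
        sum(r[4] for r in rows),
        sum(r[5] for r in rows),
    )

def apply_per_row(rows):
    """Apply the formula to each original row, then sum the results."""
    return sum(formula(r[3], r[4], r[5]) for r in rows)

def apply_per_group(rows):
    """Sum within each (name1, name2) group, apply the formula, then sum."""
    groups = defaultdict(lambda: [0, 0, 0])
    for name1, name2, _month, count1, count2, sum_count in rows:
        group = groups[(name1, name2)]
        group[0] += count1
        group[1] += count2
        group[2] += sum_count
    return sum(formula(*group) for group in groups.values())

print(aggregate_then_apply(rows))  # 88, the unwanted total
print(apply_per_row(rows))         # 26, row-by-row including Month
print(apply_per_group(rows))       # 52, the desired total
```

Only the per-group version reproduces the 45 + 7 = 52 from the question, which is why the summarizing step excludes Month before the row-by-row sum.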
You don't want a measure in this case; you need a new column instead.
The same formula as a new column will give your desired result.
Column = ('Table (2)'[Count1]-'Table (2)'[Count2])*'Table (2)'[SumCount]

How to index match a condition set in a cell

I am trying to avoid a multiple-IF formula by index matching against a table instead; however, what I need to match is the actual condition and a string.
Lookup table:
+---+------------------+-------------------+-------+
| | A | B | C |
+---+------------------+-------------------+-------+
| 1 | Current to Prior | Portfolio Comment | Error |
| 2 | =0 | "" | 1 |
| 3 | <>0 | "" | -1 |
| 4 | >0 | OK – Losses | 0 |
| 5 | <0 | OK – Losses | 1 |
| 6 | <0 | OK – New Sales | 0 |
| 7 | >0 | OK – New Sales | 1 |
+---+------------------+-------------------+-------+
Column A: Lookup Condition
Column B: Lookup string
Column C: Return value
Data example with correct hard coded output (column C):
+---+------------------+-------------------+-------+
| | A | B | C |
+---+------------------+-------------------+-------+
| 1 | Current to Prior | Portfolio comment | Error |
| 2 | 0 | | 1 |
| 3 | -100 | OK – Losses | 1 |
| 4 | 50 | | -1 |
| 5 | 200 | OK – Losses | 0 |
| 6 | 0 | | 1 |
| 7 | -400 | OK – New Sales | 0 |
| 8 | 0 | | 1 |
+---+------------------+-------------------+-------+
Column A: Data value
Column B: Data string
Column C: Output formula
I need a formula that matches the data value with the lookup condition, the data string with the lookup string and outputs the return value.
I know you weren't necessarily asking for a VBA solution, but I (and many others) prefer using UDFs because, in my opinion, they make formulas cleaner and easier to read, and you can do without the helper cells.
We start the UDF with a Select Case statement. We could use either the numerical value or the string for the cases; I decided to go with the string.
Within each case, you compare the numerical value passed in the lngCondition parameter, which determines the value the function returns.
Since you didn't have any cases for when a textual value could have lngCondition = 0, I made it return the worksheet error #VALUE!, just as you'd expect from any other Excel formula. This is the reason the UDF has a Variant return type.
Public Function ReturnErrorCode(lngCondition As Long, strComment As String) As Variant
    Select Case strComment
        Case ""
            If lngCondition = 0 Then
                ReturnErrorCode = 1
            Else
                ReturnErrorCode = -1
            End If
        Case "OK - Losses"
            If lngCondition > 0 Then
                ReturnErrorCode = 0
            ElseIf lngCondition < 0 Then
                ReturnErrorCode = 1
            Else
                ' Your conditions don't specify that 'OK - Losses'
                ' can have a 0 value
                ReturnErrorCode = CVErr(xlErrValue)
            End If
        Case "OK - New Sales"
            If lngCondition < 0 Then
                ReturnErrorCode = 0
            ElseIf lngCondition > 0 Then
                ReturnErrorCode = 1
            Else
                ' Your conditions don't specify that 'OK - New Sales'
                ' can have a 0 value
                ReturnErrorCode = CVErr(xlErrValue)
            End If
        Case Else
            ReturnErrorCode = CVErr(xlErrValue)
    End Select
End Function
You would then use this formula in the worksheet as such:
=ReturnErrorCode(A2, B2)
Great! But I have no knowledge of VBA and don't know how to add a UDF.
First, you need to open the VBA Editor. You can do this by simultaneously pressing Alt + F11.
Next, you need to create a standard code module. In the VBE, click Insert then select Module (NOT Class module!).
Then copy the code above, and paste it into the new code module you just created.
Since you have now added VBA code to your workbook, you now need to save it as a macro-enabled workbook the next time you save.
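For comparison, the question's original idea of driving the result from a lookup table (rather than hard-coding the branches) can be sketched in Python as follows. This is only an illustration of the table-driven approach, not a replacement for the UDF; the names `CONDITIONS`, `LOOKUP` and `return_error_code` are made up, and the comment strings are typed with a plain hyphen as in the UDF above.

```python
# Map each condition string from the lookup table to a test on the value.
CONDITIONS = {
    "=0": lambda value: value == 0,
    "<>0": lambda value: value != 0,
    ">0": lambda value: value > 0,
    "<0": lambda value: value < 0,
}

# Lookup table from the question: (condition, comment, return value).
LOOKUP = [
    ("=0", "", 1),
    ("<>0", "", -1),
    (">0", "OK - Losses", 0),
    ("<0", "OK - Losses", 1),
    ("<0", "OK - New Sales", 0),
    (">0", "OK - New Sales", 1),
]

def return_error_code(value, comment):
    """Return the result of the first row whose comment and condition match."""
    for condition, lookup_comment, result in LOOKUP:
        if lookup_comment == comment and CONDITIONS[condition](value):
            return result
    return None  # no matching row; the UDF returns #VALUE! in this case

# Data example from the question.
data = [
    (0, ""), (-100, "OK - Losses"), (50, ""), (200, "OK - Losses"),
    (0, ""), (-400, "OK - New Sales"), (0, ""),
]
print([return_error_code(value, comment) for value, comment in data])
```

Adding a new rule then means adding a row to the table instead of another branch.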

How to find a cell based on row and col criteria?

I have a table like this:
| a | b | c |
x | 1 | 8 | 6 |
y | 5 | 4 | 2 |
z | 7 | 3 | 5 |
What I want to do is find a value based on the row and column titles; for example, if I have c and y, it should return 2. Which function(s) should I use to do this in OpenOffice Calc?
Later:
I tried =INDEX(B38:K67;MATCH('c';B37:K37;0);MATCH('y';A38:A67;0)), but it writes invalid argument.
It turned out I wrote the arguments of INDEX in the wrong order. The =INDEX(B38:K67;MATCH('y';A38:A67;0);MATCH('c';B37:K37;0)) formula works properly. The second argument is the row number and not the col number.
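The argument order is easy to see in a small Python sketch of the same lookup (an analogy, not Calc syntax): find the row position, find the column position, then index with the row first. The names `row_titles`, `col_titles`, `table` and `lookup` are illustrative.

```python
# The table from the question, with its row and column titles.
col_titles = ["a", "b", "c"]
row_titles = ["x", "y", "z"]
table = [
    [1, 8, 6],
    [5, 4, 2],
    [7, 3, 5],
]

def lookup(row_title, col_title):
    """Find a cell by its row and column titles."""
    row = row_titles.index(row_title)  # plays the role of MATCH on the row titles
    col = col_titles.index(col_title)  # plays the role of MATCH on the column titles
    return table[row][col]             # row first, then column, as in INDEX

print(lookup("y", "c"))
```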

Show count of columns distinct values

Hello my fellow colleagues from StackOverflow!
I will be brief, and cut to the point:
I have a table in MS Access; it contains 2 columns of interest: County and TGTE (Type Of Geothermal Energy). Column TGTE is of type VARCHAR and can have one of two values; to make it easier, let's say it is either L or H.
I need to create an SQL query that shows the result described below.
Below is part of the table:
County | TGTE | ... |
First | L |
First | L |
First | H |
Second | H |
Third | L |
__________________
I need a resulting query that shows the count of distinct TGTE in every County like this:
County | TGTE = L | TGTE = H |
First | 2 | 1 |
Second | 0 | 1 |
Third | 1 | 0 |
__________________________________
How can I create a query that displays the desired result described above?
NOTE:
I have browsed through the archive and found similar things, but nothing that helped me.
To be honest, I do not know how to formulate the question properly, so I guess that is why Google couldn't be of much help...
I have tried with this:
SELECT County, COUNT(TGTE) as [Something]
FROM MyTable
WHERE TGTE = "L"
GROUP BY COUNTY;
but this is the result I get:
County | TGTE = L |
First | 2 |
Second | 0 |
Third | 1 |
__________________________________
If I change L to H, in the query above, I get this:
County | TGTE = H |
First | 1 |
Second | 1 |
Third | 0 |
__________________________________
I work on Windows XP, in C++, using ADO to access an MS Access 2007 database.
If there is anything else that I can do to help, ask and I will gladly do it.
EDIT #1:
After trying Declan's solution this is what I get:
Values in main table:
| County | TGTE |
| Стари Град | H |
| Сурчин | L |
| Стари Град | H |
| Савски Венац | H |
| Раковица | H |
Output :
| County | TGTE = L | TGTE = H |
| Раковица | 1 | 1 |
| Савски Венац | 1 | 0 |
| Сурчин | 1 | 0 |
| Стари Град | 1 | 0 |
It should output this:
| County | TGTE = L | TGTE = H |
| Раковица | 1 | 0 |
| Савски Венац | 1 | 0 |
| Сурчин | 0 | 1 |
| Стари Град | 2 | 0 |
EDIT #2:
On Declan's request, here is the original query I use:
wchar_t *query = L"select Општина, \
sum( iif( Тип_геотермалне_енергије = 'Хидрогеотермална енергија', 1, 0 ) ) as [HGTE], \
sum( iif( Тип_геотермалне_енергије = 'Литогеотермална енергија', 1, 0 ) ) as [LGTE] \
from Објекат \
group by Општина; ";
Translated to our example, it looks like this:
wchar_t *query = L"select County, \
sum( iif( TGTE = 'H', 1, 0 ) ) as [HGTE], \
sum( iif( TGTE = 'L', 1, 0 ) ) as [LGTE] \
from MyTable \
group by County; ";
EDIT #3:
After I copy the above query into Access and run it, everything works fine, so I believe the problem lies in the usage of ADO.
EDIT #4:
After browsing the Internet, I am sure the problem is ADO.
How can I use IIF() in ADO so that my query works?
If it can't be done, then how can I modify my query to do what I have described above?
You need to use the iif function within the two additional columns. Here is some pseudo code to get you started.
SELECT County
,sum(iif(TGTE = "L",1,0)) as [L_Count]
,sum(iif(TGTE = "H",1,0)) as [H_Count]
FROM MyTable
GROUP BY
COUNTY;
I have reworked Declan's query as shown below, and it works:
SELECT County
,sum( switch( TGTE = 'L', 1, TGTE = 'H', 0 ) ) as [L_Count]
,sum( switch( TGTE = 'H', 1, TGTE = 'L', 0 ) ) as [H_Count]
FROM MyTable
GROUP BY
County;
Everything works fine when I run it through ADO and MS Access 2007.
I do not understand why IIF() isn't working in ADO; maybe it is not supported or something...
Thank you anyway, Declan, for your solution. You have +1 from me.
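The conditional-sum pattern used in both queries can be tried outside Access. The sketch below runs it against an in-memory SQLite database from Python, which is an assumption made purely for illustration: it uses the standard CASE expression in place of IIf() or Switch(), and it says nothing about how ADO handles either function. The table and column names follow the simplified example in the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Access table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (County TEXT, TGTE TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("First", "L"), ("First", "L"), ("First", "H"), ("Second", "H"), ("Third", "L")],
)

# One conditional sum per TGTE value; CASE plays the role of IIf/Switch here.
query = """
    SELECT County,
           SUM(CASE WHEN TGTE = 'L' THEN 1 ELSE 0 END) AS L_Count,
           SUM(CASE WHEN TGTE = 'H' THEN 1 ELSE 0 END) AS H_Count
    FROM MyTable
    GROUP BY County
    ORDER BY County
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each row contributes 1 to exactly one of the two sums, so a county with no L rows still appears with a count of 0 instead of being filtered out.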