Handle account balance during concurrent transactions - concurrency

I've been developing an application that handles accounts and transactions made over all the accounts.
Currently the MariaDB table the application uses is modeled as follows.
The id column in account_transaction is the primary key and auto-increments.
account_transaction
+------+-------------+----------------------+---------+------------------+-----+
| id | account_id | date | value | resulting_amount | ... |
+------+-------------+----------------------+---------+------------------+-----+
| 101 | 100 | 03/may/2012 10:13:33 | 2000 | 2000 | ... |
| 102 | 100 | 03/may/2012 10:13:33 | 500 | 2500 | ... |
| 103 | 100 | 03/may/2012 10:13:34 | -1000 | 1500 | ... |
| 104 | 200 | 03/may/2012 10:13:35 | 1300 | 1300 | ... |
| 105 | 200 | 03/may/2012 10:13:36 | 200 | 1500 | ... |
| 106 | 200 | 03/may/2012 10:13:37 | -500 | 1000 | ... |
+------+-------------+----------------------+---------+------------------+-----+
The query to credit 300 to account_id 100 is:
INSERT INTO account_transaction (account_id,date, value, resulting_amount)
VALUES (100, NOW(), 300, COALESCE((SELECT at.resulting_amount
FROM account_transaction at
WHERE at.account_id = 100
ORDER BY at.date DESC, at.id DESC
LIMIT 1), 0) + 300)
The query to debit 300 from account_id 100 is:
INSERT INTO account_transaction (account_id,date, value, resulting_amount)
VALUES (100, NOW(), -300, COALESCE((SELECT at.resulting_amount
FROM account_transaction at
WHERE at.account_id = 100
ORDER BY at.date DESC, at.id DESC
LIMIT 1), 0) - 300)
I am using a subquery to find the latest balance while inserting the new transaction. COALESCE handles the case where the account has no transactions yet.
I could have run the subquery below separately to find the current balance of the account and then used it in the new transaction, but then multiple concurrent transactions read the same balance, which leads to balance discrepancies and a loss to the company. So I put the subquery inside the INSERT to avoid that.
SELECT at.resulting_amount
FROM account_transaction at
WHERE at.account_id = 100
ORDER BY at.date DESC, at.id DESC
LIMIT 1
The subquery-inside-INSERT approach avoided balance discrepancies when there were fewer than 50 concurrent requests.
With more than 50 concurrent transactions, balance discrepancies sometimes occur.
Example of a balance discrepancy: if the account balance is 1000 and 2 concurrent transactions each want to debit 100, then resulting_amount for both transactions would be 900, which is incorrect (the second one should be 800).
Please suggest a better approach to handle balance discrepancies when a large number of concurrent transactions are placed. If you want to suggest a locking approach, then use a column-level lock (lock the account_id column).

The easy answer is: don't keep resulting_amount in the transaction table; keep just a balance in a separate table (with primary key account_id).
Or do that and, in a single transaction, update the account balance and use the new balance as the resulting_amount to insert.
Your existing code assumes that ORDER BY at.date DESC, at.id DESC will always find the most recently inserted record, and that isn't going to hold true with concurrent requests: as in the example in the question, two inserts that start at the same time can both read the same latest row.
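To make the second suggestion concrete, here is a minimal sketch of the "balance table plus transaction" idea. It is written in Python against an in-memory SQLite database purely so it can be run as-is; the table account_balance and the helper names open_db and post are my own assumptions, not part of the question. With MariaDB/InnoDB the same statement sequence should apply, and the UPDATE there takes a row lock on that account's balance row until commit, so concurrent writers to one account queue up instead of reading a stale balance.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a balance table and a ledger table.

    account_balance is a hypothetical table (one row per account);
    account_transaction mirrors the table from the question.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE account_balance (
            account_id INTEGER PRIMARY KEY,
            balance    INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE account_transaction (
            id               INTEGER PRIMARY KEY AUTOINCREMENT,
            account_id       INTEGER NOT NULL,
            date             TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
            value            INTEGER NOT NULL,
            resulting_amount INTEGER NOT NULL
        );
        """
    )
    return conn


def post(conn, account_id, value):
    """Apply a credit (value > 0) or debit (value < 0) and log it.

    All statements run in one transaction: the balance is changed
    relative to its current value, then the new balance is read back
    and stored as resulting_amount.
    """
    with conn:  # commits on success, rolls back on an exception
        # Make sure the account has a balance row (no-op if it exists).
        conn.execute(
            "INSERT OR IGNORE INTO account_balance (account_id, balance) "
            "VALUES (?, 0)",
            (account_id,),
        )
        # Relative update: never computes the new balance in the client.
        conn.execute(
            "UPDATE account_balance SET balance = balance + ? "
            "WHERE account_id = ?",
            (value, account_id),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM account_balance WHERE account_id = ?",
            (account_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO account_transaction "
            "(account_id, value, resulting_amount) VALUES (?, ?, ?)",
            (account_id, value, balance),
        )
    return balance


if __name__ == "__main__":
    db = open_db()
    post(db, 100, 1000)
    print(post(db, 100, -100), post(db, 100, -100))
```

The key design choice is that the balance is only ever changed with balance = balance + ?, so no request computes a new balance from a value it read earlier.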

Related

Calculate monthly balance in PowerBI

Data-Table:
DB-Fiddle
CREATE TABLE vouchers (
id SERIAL PRIMARY KEY,
event_date DATE,
credits_collected INT,
credits_redeemed INT
);
INSERT INTO vouchers
(event_date, credits_collected, credits_redeemed
)
VALUES
('2020-01-08', '900', '700'),
('2020-02-15', '500', '300'),
('2020-02-20', '100', '250'),
('2020-03-19', '600', '850'),
('2020-04-03', '450', '130');
SQL-Query:
SELECT
t1.event_month AS event_month,
t1.credits_collected AS credits_collected,
t1.credits_redeemed AS credits_redeemed,
SUM(t1.credits_collected - t1.credits_redeemed) OVER (
ORDER BY t1.event_month ASC ROWS UNBOUNDED PRECEDING) AS balance
FROM
(SELECT
DATE_PART('month', v.event_date) AS event_month,
SUM(v.credits_collected) AS credits_collected,
SUM(v.credits_redeemed) AS credits_redeemed
FROM vouchers v
GROUP BY 1
ORDER BY 1) t1
GROUP BY 1,2,3
ORDER BY 1;
Result:
event_month | credits_collected | credits_redeemed | balance
-------------|---------------------|--------------------|------------
1 | 900 | 700 | 200
2 | 600 | 550 | 250
3 | 600 | 850 | 0
4 | 450 | 130 | 320
I am loading the above data-table into PowerBI.
Now, I want to create a report that looks like the results I am getting with the SQL-Query above.
I am able to put credits_collected and credits_redeemed to the report.
However, I have no clue what DAX formula I need to calculate the balance at the end of each month.
Do you have any idea how I can solve this issue?
I could solve the issue with two steps:
Step 1: Implementing an additional column with this DAX formula:
column_balance_calculated_daily = CALCULATE(SUM(Tabelle1[credits_collected])-SUM(Tabelle1[credits_redeemed]),ALL('Tabelle1'),'Tabelle1'[event_date]<=EARLIER('Tabelle1'[event_date] ))
Step 2: Use the column in the following DAX formula:
balance = CALCULATE(MAX(Tabelle1[column_balance_calculated_daily]),ENDOFMONTH(Tabelle1[event_date]))
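For anyone who wants to check the numbers outside Power BI, here is a small Python sketch of the same calculation: net credits per month, then a running total. It is not DAX and not the method above, just an independent way to verify the expected result; the function name monthly_balance is my own. Like the SQL query, it groups by month number only, so it assumes all rows fall within one year.

```python
from collections import defaultdict
from datetime import date

# Rows from the vouchers table in the question:
# (event_date, credits_collected, credits_redeemed)
vouchers = [
    (date(2020, 1, 8), 900, 700),
    (date(2020, 2, 15), 500, 300),
    (date(2020, 2, 20), 100, 250),
    (date(2020, 3, 19), 600, 850),
    (date(2020, 4, 3), 450, 130),
]


def monthly_balance(rows):
    """Return {month: cumulative balance at the end of that month}."""
    # Step 1: net change (collected - redeemed) per month.
    net_by_month = defaultdict(int)
    for event_date, collected, redeemed in rows:
        net_by_month[event_date.month] += collected - redeemed

    # Step 2: running total over the months in ascending order.
    balance = 0
    result = {}
    for month in sorted(net_by_month):
        balance += net_by_month[month]
        result[month] = balance
    return result


if __name__ == "__main__":
    print(monthly_balance(vouchers))  # {1: 200, 2: 250, 3: 0, 4: 320}
```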

Adding a measure which finds the next row value for every row (similar to SQL Lead window function)

I would be very grateful if you could share your experience and advice on the following problem in Power BI:
3 Tables given in the data model:
calendar dimension table
fact table on sessions
fact table on spending
| CW | Total cost | Sessions | Expected Column 1 | Expected Column 2 |
+----+-------------+-----------+-------------------+-------------------+
| 1 | 1200 | 50 | | |
| 2 | 1500 | 60 | 1200 | 50 |
| 3 | 1700 | 48 | 1500 | 60 |
| 4 | 1150 | 36 | 1700 | 48 |
| 5 | 900 | 29 | 1150 | 36 |
+----+-------------+-----------+-------------------+-------------------+
CW column indicates the calendar week and it is from calendar table. Sessions and Total cost are from sessions and spending tables respectively. Data is aggregated and visualized on calendar week level.
Problem: I need to create measures to derive Expected Column 1 and Expected Column 2 from the Total cost and Sessions columns. Basically, each row should pick up the value from a neighboring row, similar to a lead window function (in the table above, each row shows the value from the previous calendar week).
I have checked the Power BI community and there are several ideas (for example here https://community.powerbi.com/t5/Desktop/DAX-Query-to-Find-Next-Value/td-p/833896).
But these solutions assume all columns are from the same table, whereas in the case described above all 3 columns are from different tables.
Will it be possible to get Expected Columns 1 and 2, and how? Many thanks in advance!
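To pin down the intended semantics, here is a rough Python sketch of the lookup, not a DAX solution. It assumes each fact table has already been aggregated to one value per calendar week and that week numbers are consecutive integers within one year; the function name previous_week_values is hypothetical. Because the lookup is keyed only on the calendar week, it does not matter that the values come from different tables.

```python
def previous_week_values(weeks, values_by_week):
    """For each calendar week, return the value of the week before it.

    weeks          -- week numbers from the calendar dimension
    values_by_week -- {week: aggregated value} from one fact table
    Weeks with no predecessor in the data map to None (blank).
    """
    return {cw: values_by_week.get(cw - 1) for cw in weeks}


# Sample data from the table in the question.
calendar_weeks = [1, 2, 3, 4, 5]
total_cost = {1: 1200, 2: 1500, 3: 1700, 4: 1150, 5: 900}  # spending
sessions = {1: 50, 2: 60, 3: 48, 4: 36, 5: 29}             # sessions

if __name__ == "__main__":
    print(previous_week_values(calendar_weeks, total_cost))
    print(previous_week_values(calendar_weeks, sessions))
```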

ALL() isn't working to "remove a filter" in DAX; relationship issue?

Basic premise:
'Orders' are comprised of items from multiple 'Zones'.
Customers can call in for 'Credits' (refunds) on botched 'Orders'.
There is a true many-to-many relationship here, because one order could have multiple credits called in at different times; similarly, a customer can call in once regarding multiple orders (generating only one credit memo).
'Credits' granularity is at the item level, i.e.
CREDIT | SO | ITEM | ZONE | CREDAMT
-------------------------------------------------------
42 | 1 | 56 | A | $6
42 | 1 | 52 | A | $8
42 | 1 | 62 | B | $20
42 | 2 | 56 | A | $12
'Order Details' granularity is at the zone level, i.e.
SO | ZONE | DOL_AMT
-------------------------------
1 | A | $50
1 | B | $20
1 | C | $100
2 | A | $26
I have a 'Zone' filter table that helps me sort things better and roll up into broader categories, i.e.
ZONE | TEMP | SORT
-------------------------------
A | DRY | 2
B | COLD | 3
C | DRY | 1
What I need:
I want a pair of visuals for a side by side comparison of order total by zone next to credit total by zone.
What's working:
The 'Credits' component is easy, CreditTotal = abs(sumx(Credits,Credits[CREDAMT])) with Zone as a legend item.
I have an ORDER component that works when the zone is in the credit memo:
Order $ by Zone =
CALCULATE (
SUM ( 'Order Details'[DOL_AMT] ),
USERELATIONSHIP ( 'Order Details'[SO], Credits[SO] ),
ALL ( Credits[CreditCategory] )
)
My issue:
Zones that didn't have a credit against them won't show up. So instead of
CREDIT | ZONE | ORDER $ BY ZONE
----------------------------------
42 | A | $76
42 | B | $20
42 | C | $100
I get
CREDIT | ZONE | ORDER $ BY ZONE
----------------------------------
42 | A | $76
42 | B | $20
I have tried to remove this filter by tacking on ALL(Zones[Zone]) and/or ALL('Order Details'[Zone]), but it doesn't help, presumably because it is reporting "all zones" actually found in the 'Credits' table. I'm hoping there's some way to ask it to report all zones in the 'Order Details' table based upon SOs in the 'Credits' table.
In case it helps, here's how the relationships are structured; as an aside, I've tried mixing and matching various combinations of active/inactive, single vs. bidirectional filtering, etc., but the current configuration is the only one that seems to remotely work as desired.
I'm grateful for any suggestions; please let me know if anything is unclear. Thank you.
I was able to get it to work by using 'Order Details'[Zone] rather than Zones[Zone] in the table visual and this measure:
Order $ by Zone =
CALCULATE (
SUM ( 'Order Details'[DOL_AMT] ),
USERELATIONSHIP ( 'Order Details'[SO], Credits[SO] )
)
Notice that regardless of your measure, there is no row in Credits corresponding to zone C, so it doesn't know what to put in the CREDIT column unless you tell it exactly how.
If you remove the CREDIT dimension column, then you don't need to swap tables as I suggested above. You can just use the measure above and then write a new measure for the CREDIT column instead:
CreditValue =
CALCULATE(
VALUES(Credits[CREDIT]),
ALL(Credits),
Credits[SO] IN VALUES('Order Details'[SO])
)

SUM of column conditional to many values of another column

I am trying to accomplish something, but don't know how to do it.
I have a Dimension (Table called TEntry) that represents time entries for employees like so :
Id | EmployeeId | EntryDT | TimeInMinutes | PriceAgreementId
------ | ---------- | ---------- | ------------- | ----------------
1 | 1 | 2017-03-20 | 100 | 1
2 | 1 | 2017-03-31 | 50 | null
3 | 2 | 2017-03-21 | 100 | 1
4 | 2 | 2017-03-23 | 125 | 2
5 | 3 | 2017-03-15 | 90 | null
6 | 3 | 2017-03-25 | 60 | 1
Sometimes they work on "PriceAgreements", and sometimes they don't.
In my dashboard, I have a table that groups TEntry by EmployeeId and sums TimeInMinutes. I also have a slicer for EntryDT:
EmployeeId | TimeInMinutes
-------------- | -------------
1 | 150
2 | 225
3 | 150
I need to create 2 new columns that represent :
The total TimeInMinutes an Employee has worked on all PriceAgreements
So for EmployeeId #1, the Total would be 100.
The total TimeInMinutes ALL Employees have worked, but only for the PriceAgreements the current Employee (current row) has worked on.
The Table would look like this (without the PriceAgreementIds in parenthesis) :
EmployeeId | TimeInMinutes | TimeInMinutes on PriceAgreements | TimeInMinutes on PriceAgreements ALL other EmployeeIds
-------------- | ------------- | -------------------------------- | ------------------------------------------------------
1 | 150 | 100 (PriceAgreementId=1) | 260 (PriceAgreementId=1)
2 | 225 | 225 (PriceAgreementId=1 and 2) | 385 (PriceAgreementId=1 and 2)
3 | 150 | 60 (PriceAgreementId=1) | 260 (PriceAgreementId=1)
Column "TimeInMinutes on PriceAgreements" is quite easy, but for the other one I cannot find a solution.
I have this DAX expression I started, but it is not complete:
CALCULATE(SUM(TEntry[TimeInMinutes]), NOT ISBLANK(TEntry[PriceAgreementId]), ALL(TEmployee))
TEmployee is a Dimension linked to the main TEntry Table.
Any help would be appreciated.
Thank you
I'm throwing this on as an answer because (a) it might get you (or someone else) going in the right direction and (b) if it's guaranteed that an Employee would only ever have time entries corresponding to 2 price agreements, this would work - which is unlikely the case for you, but might be the case for others trying to accomplish a similar thing.
Measure =
CALCULATE (
SUM ( TEntry[TimeInMinutes] ),
FILTER (
ALL ( TEntry ),
(
TEntry[PriceAgreementID] = MIN ( TEntry[PriceAgreementID] )
|| TEntry[PriceAgreementID] = MAX ( TEntry[PriceAgreementID] )
)
&& TEntry[PriceAgreementID] <> BLANK ()
)
)
This measure is saying: SUM the TimeInMinutes for all records in the TEntry table where the PriceAgreementID matches either the minimum OR maximum PriceAgreementID (in the context of the current row) AND the PriceAgreementID isn't blank.
The fatal flaw in this answer is in the MIN and MAX. For Employee ID 2, who has 2 PriceAgreementIDs (1 & 2) - the MIN will calculate the minutes for PriceAgreementID 1 and the MAX will calculate the minutes for PriceAgreementID 2. However, to expand to a case where there might be more than 2 PriceAgreements...I don't know how to do that.
It does work on the sample data in your question, though (since there is a max of 2 price agreements per employee):
Typically when I'm faced with a problem like this that isn't easy to solve, I think about my data model and make sure that it conforms to a star schema as closely as possible.
In your case, an employee can have multiple price agreements, and a price agreement can be associated with many employees. That, to me, suggests a many-to-many relationship. I'd strongly recommend reading more about many-to-many relationships and whether restructuring the underlying tables (e.g. to include a bridge table) would help get you closer to the answer you need.
A good starting point might be: https://www.sqlbi.com/articles/many-to-many-relationships-in-power-bi-and-excel-2016/
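To show what the general case (any number of price agreements per employee) has to compute, here is a minimal Python sketch over the sample data. It is not DAX and it ignores the EntryDT slicer; the function name minutes_on_shared_agreements is my own. The idea is the same one a bridge table would express: collect each employee's set of agreements, then sum all employees' minutes over that set.

```python
# Sample rows from TEntry:
# (Id, EmployeeId, TimeInMinutes, PriceAgreementId)
entries = [
    (1, 1, 100, 1),
    (2, 1, 50, None),
    (3, 2, 100, 1),
    (4, 2, 125, 2),
    (5, 3, 90, None),
    (6, 3, 60, 1),
]


def minutes_on_shared_agreements(rows):
    """For each employee, sum the minutes ALL employees logged on the
    price agreements that this employee has worked on."""
    agreements_by_employee = {}  # employee -> set of agreement ids
    minutes_by_agreement = {}    # agreement id -> total minutes, all employees

    for _entry_id, employee, minutes, agreement in rows:
        agreements_by_employee.setdefault(employee, set())
        if agreement is None:
            continue  # entries without a price agreement are ignored
        agreements_by_employee[employee].add(agreement)
        minutes_by_agreement[agreement] = (
            minutes_by_agreement.get(agreement, 0) + minutes
        )

    return {
        employee: sum(minutes_by_agreement[a] for a in agreements)
        for employee, agreements in agreements_by_employee.items()
    }


if __name__ == "__main__":
    print(minutes_on_shared_agreements(entries))  # {1: 260, 2: 385, 3: 260}
```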

% of Grand Total of a Measure that uses other Measures and is Crossfiltered

this seems so easy in my head but I haven't been able to get it for the last few hours....
I have a Table visualization that provides Cost by Hour using measures.
Category | Total Cost | Hours | Cost per Hour
A | 1000 | 10 | 100
B | 2000 | 100 | 20
C | 100 | 4 | 25
D | -500 | 100 | -5
Total | 2600 | 214 | 12.1495
For my purposes, I would also like to create a % of Grand Total of Cost per Hour to add to a treechart visualization. However, if I simply add [Cost per Hour] to the treechart again and use the "quick calc" functionality on the field, it returns 823.1% for the first record in the above table, as (100/12.1495) = 8.2308. I would like this % of GT of Cost per Hour to use the total sum of the Cost per Hour column instead. Desired result:
Category | Total Cost | Hours | Cost per Hour | % of Cost per Hour
A | 1000 | 10 | 100 | 71.4%
B | 2000 | 100 | 20 | 14.3%
C | 100 | 4 | 25 | 17.9%
D | -500 | 100 | -5 | -3.6%
Total | 2600 | 214 | 12.1495 | 100%
A few things to note that make the application of any DAX challenging. All of the measures below are filtered by multiple filter visualizations from Tables 1-5 and page-level filters from Tables 1-5.
The table visualization exists in Table1. Costs exist in Tables 2-5 and are related to Table1 using a Many-to-One Single Direction Filter Relationship.
[Total Cost] is a measure that adds together values from 4 different tables. E.g.: Total Cost = sum(table2[value])+sum(table3[value])+sum(table4[value])+sum(table5[value])
[Hours] is a measure that adds together a column from a table and divides by the distinct count of records in that table. E.g.: Hours = sum(table1[hours])/Distinctcount(table1[records])
[Cost per Hour] is a measure consisting of two other measures: Cost per Hour = [Total Cost] / [Hours]
I sort of feel like this is similar to people wanting to add percentages to pie charts... I'm just trying to ascribe a real number to express the proportion displayed in the TreeChart visualization. I really hope that this is easier than it seems.
EDIT #alejandrozuleta:
Table1 is the original table from which tables 2-5 are referenced&created. An index number was assigned in Table1 and tables 2-5 are linked on this reference number. The reason that tables 2-5 exists separately is because they contain separate cost "types" and a join that occurs in these tables adds additional columns that are only applicable to specific costs types.... for example Table2 is Personnel Costs:
index | Category | Cost Type | Value | Age of Personnel
1 | A | Personnel | 1 | 33
and Table3 is Maintenance Costs:
index | Category | Cost Type | Value | Scheduled or UnScheduled Maint
2 | A | Maintenance | 5 | Scheduled
If [Age of Personnel] existed in Table3, it would have a "null" for any record of the Maintenance [Cost Type]; vice versa, [Scheduled or UnScheduled Maint] would have a "null" if it existed in Table2. Because I don't want to deal with filter visualizations needing to select "(blanks)" for certain cost types, the data relationship between these tables is a Many-to-One Single Direction Filter using [index] as the key.
EDIT2:
Working .pbix file with notional data and the data model I described is linked:
StackOverflow_GTofMeasure_Crosfilltered.pbix
I think this solution could work for you. Basically I've created two helper measures (which you don't have to show in your table):
CostPerHourHelper = SUMX(TableName,[Cost per Hour])
CostPerHourTotal = SUMX(ALL(TableName),[Cost per Hour])
Now you can create your % of Cost per Hour measure using this expression:
% Cost Per Hour = [CostPerHourHelper]/[CostPerHourTotal]
It should produce:
UPDATE:
Use ALLSELECTED() function to preserve the explicit filters you applied.
% Cost Per Hour = SUMX ( TableName, [Cost per Hour] )
/ SUMX ( ALLSELECTED ( TableName ), [Cost per Hour] )
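As a sanity check on the arithmetic, here is a short Python sketch of what the measure computes for the sample table: each category's Cost per Hour divided by the sum of the per-category Cost per Hour values (140), not by the visual's total row (12.1495). The function name share_of_total is my own, and the sketch ignores filters entirely.

```python
# Cost per Hour per category, from the sample table in the question
# (Total Cost / Hours).
cost_per_hour = {
    "A": 1000 / 10,
    "B": 2000 / 100,
    "C": 100 / 4,
    "D": -500 / 100,
}


def share_of_total(values):
    """Return each value as a fraction of the sum of all values."""
    total = sum(values.values())
    if total == 0:
        raise ValueError("shares are undefined when the values sum to zero")
    return {key: value / total for key, value in values.items()}


if __name__ == "__main__":
    for category, share in share_of_total(cost_per_hour).items():
        print(f"{category}: {share:.1%}")
```

Note that with a negative category in the mix, the positive shares add up to more than 100%, which may look odd in a treechart.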
Let me know if this helps.