The time_start column holds the start of the conversation and the time_ends column holds the end; both are time-type columns in hh:mm:ss format.
How can I calculate the time duration between calls:
time_between_calls = time_start2 - time_end1
where time_start2 is the start of the 2nd call and time_end1 is the end of the 1st call.
I need to create a measure which will give the average of that duration for xy calls.
I don't know how to calculate this. I guess I need some kind of temporary table where I would sort dates first.
Any hint could be useful.
I forgot to add that I need to filter by agent_id first in order to get call information only for that individual agent.
So
FOR EACH agent_id do {
all above }
I've managed to solve it by following this solution.
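For reference, the gap-averaging logic can be sketched outside DAX. Here is a minimal Python sketch (the column names mirror the question; the exact data shape is an assumption) that sorts each agent's calls and averages the gap between one call's end and the next call's start:

```python
from datetime import datetime
from itertools import groupby

def avg_time_between_calls(calls):
    """calls: list of dicts with agent_id, time_start, time_end ('hh:mm:ss' strings).
    Returns {agent_id: average gap in seconds between consecutive calls}."""
    parse = lambda t: datetime.strptime(t, "%H:%M:%S")
    result = {}
    # Group the rows per agent, sorted by start time within each agent.
    ordered = sorted(calls, key=lambda c: (c["agent_id"], c["time_start"]))
    for agent, rows in groupby(ordered, key=lambda c: c["agent_id"]):
        rows = list(rows)
        # Gap = start of the next call minus end of the previous call.
        gaps = [(parse(b["time_start"]) - parse(a["time_end"])).total_seconds()
                for a, b in zip(rows, rows[1:])]
        if gaps:
            result[agent] = sum(gaps) / len(gaps)
    return result
```

This mirrors the FOR EACH agent_id pseudocode: sort, take pairwise differences, average.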
I have a google sheet where I'm getting the duration of a Youtube video as follows:
=REGEXEXTRACT(IMPORTXML(A2,"//*[@itemprop='duration']/@content"),"PT(\d+)M(\d+)S")
This gives me two cells with two values (minutes and seconds). However, I want to perform further calculations on them (multiply the minutes by 60 and add the seconds). How can I 'access' these values within a function, if at all?
You want to retrieve the duration as a number of seconds.
You want to achieve this using the built-in formulas of Spreadsheet.
If my understanding is correct, how about these sample formulas?
Sample formula:
=VALUE(REGEXREPLACE(IMPORTXML(A2,"//*[@itemprop='duration']/@content"),"PT(\d+)M(\d+)S","00:$1:$2")*24*3600)
In this sample formula, the cell "A2" has the URL like https://www.youtube.com/watch?v=###.
The retrieved duration is converted to the time format, and the value is retrieved as seconds.
For example, when IMPORTXML(A2,"//*[@itemprop='duration']/@content") returns PT1M10S, VALUE(REGEXREPLACE("PT1M10S","PT(\d+)M(\d+)S","00:$1:$2")*24*3600) returns 70.
Even when the time is more than 1 hour, a value like PT123M45S is returned, and =VALUE(REGEXREPLACE("PT123M45S","PT(\d+)M(\d+)S","00:$1:$2")*24*3600) returns 7425.
References:
REGEXREPLACE
VALUE
If I misunderstood your question and this was not the result you want, I apologize.
Added:
As another pattern, if you want to use =REGEXEXTRACT(IMPORTXML(A2,"//*[@itemprop='duration']/@content"),"PT(\d+)M(\d+)S"), how about the following formula?
Sample formula:
=QUERY(ARRAYFORMULA(VALUE(REGEXEXTRACT(IMPORTXML(A2,"//*[@itemprop='duration']/@content"),"PT(\d+)M(\d+)S"))),"SELECT Col1*60+Col2 label Col1*60+Col2 ''")
In this formula, values from the array are used and calculated.
or like this:
=TEXT(VALUE("00:"&SUBSTITUTE(REGEXREPLACE(
IMPORTXML(A1, "//*[@itemprop='duration']/@content"), "PT|S", ""), "M",":")), "[ss]")*1
or shortest:
=REGEXREPLACE(IMPORTXML(A1,"//*[@itemprop='duration']/@content"),
"PT(\d+)M(\d+)S", "00:$1:$2")*86400
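All of the formulas above boil down to parsing the ISO 8601 duration string and converting it to seconds. A Python sketch of the same parsing (the optional-hours branch is an assumption beyond the formulas, which only cover minutes and seconds):

```python
import re

def iso_duration_to_seconds(duration):
    """Convert a YouTube-style ISO 8601 duration like 'PT1M10S' to total seconds.
    Mirrors the spreadsheet formulas above; each H/M/S part is optional."""
    m = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", duration)
    if not m:
        raise ValueError(f"unrecognised duration: {duration!r}")
    h, mi, s = (int(g) if g else 0 for g in m.groups())
    return h * 3600 + mi * 60 + s
```

This computes the same "minutes times 60 plus seconds" the question asks for, without going through the time-serial intermediate step.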
I've written some code in Siddhi that logs/prints the average of each batch of the last 100 events: the average for events 0-100, 101-200, etc. I now want to compare these averages with each other to find some kind of trend. In the first place I just want to see whether there is a simple downward or upward trend across a certain number of averages. For example, I want to compare each average value with the upcoming 1-10 average values.
I've looked into Siddhi documentation but I did not find the answer that I wanted. I tried some solutions with partitioning, but this did not work. The below code is what I have right now.
define stream HBStream(ID int, DateTime string, Result double);
#info(name = 'Average100Query')
from HBStream#window.lengthBatch(100)
select ID, DateTime, Result, avg(Result)
insert into OutputStream;
Siddhi sequences can be used to match the averages and to identify a trend, https://siddhi.io/en/v5.1/docs/query-guide/#sequence
from every e1=OutputStream, e2=OutputStream[e2.avgResult > e1.avgResult], e3=OutputStream[e3.avgResult > e2.avgResult]
select e1.ID, e3.avgResult - e1.avgResult as tempDiff
insert into TempDiffStream;
Please note that you have to use a partition to evaluate this pattern per ID if you need the averages to be calculated per sensor. In your app, also use group by if you need the average per sensor:
#info(name = 'Average100Query')
from HBStream#window.lengthBatch(100)
select ID, DateTime, Result, avg(Result) as avgResult
group by ID
insert into OutputStream;
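To make the intent of the sequence concrete, here is a small Python sketch (not Siddhi) that batches values, averages each batch, and finds runs of strictly increasing averages, mirroring the e1 < e2 < e3 match above:

```python
def batch_averages(values, size=100):
    """Average each consecutive full batch of `size` values,
    like #window.lengthBatch(100) with avg(Result)."""
    return [sum(values[i:i + size]) / size
            for i in range(0, len(values) - size + 1, size)]

def rising_runs(averages, length=3):
    """Return start indices where `length` consecutive averages strictly
    increase, mirroring the e1 < e2 < e3 sequence match."""
    return [i for i in range(len(averages) - length + 1)
            if all(averages[i + j] < averages[i + j + 1]
                   for j in range(length - 1))]
```

Extending `length` (or comparing each average with the next 1-10) generalises the same idea of detecting an upward trend across batch averages.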
I'm trying to apply a certain type of ranking system to my data-set and having trouble.
My Issue:
RANK() OVER(PARTITION BY Staff, Storage ORDER BY Order_Flow)
Essentially, whenever Storage 20 occurs, I want to assign a number to that row, and anything between it and the next occurrence of Storage 20 has the same number. Then from the next occurrence of Storage 20 to the next, the same thing.
My current Rank function will not accurately capture Storage 80 because it only started occurring later in the order flow.
Please view the image below (the numbering can start from 1; it doesn't necessarily have to start from 0).
[image of example data]
It looks like this can be solved using a RESET WHEN in your window function:
MAX(<column>) OVER(
    PARTITION BY <...>
    ORDER BY Order_Flow
    RESET WHEN Storage = 20
)
I believe you can leave out the PARTITION BY if you just want to control the ordering and don't need to do any partitioning. Or just use a constant value, like PARTITION BY 1 or something to that effect.
Documentation:
https://docs.teradata.com/reader/756LNiPSFdY~4JcCCcR5Cw/8uRgqNTevlcmjBfsU3WQsw
Stackoverflow:
teradata, reset when, partition by, order by
It's a simple Cumulative Sum over a CASE:
sum(case when Storage = 20 then 1 else 0 end)
over(Partition By Staff
Order By Order_Flow
rows unbounded preceding)
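The effect of that cumulative sum can be sketched in Python (assuming the rows are already ordered by Order_Flow within each Staff partition): the group number increments each time Storage = 20 appears, and every row until the next occurrence keeps the same number:

```python
def storage_groups(storages, marker=20):
    """Assign a group number per row, mirroring
    SUM(CASE WHEN Storage = 20 THEN 1 ELSE 0 END)
    OVER (PARTITION BY Staff ORDER BY Order_Flow ROWS UNBOUNDED PRECEDING).
    `storages` must already be ordered by Order_Flow within one partition."""
    group = 0
    out = []
    for storage in storages:
        if storage == marker:
            group += 1  # a new occurrence of Storage 20 starts a new group
        out.append(group)
    return out
```

Unlike the earlier RANK attempt, this numbering captures later-appearing values such as Storage 80, because every row simply inherits the count of Storage-20 markers seen so far.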
I hope somebody can help me with some hints for the following analysis. Students may perform some actions for some courses (enrol, join, grant, ...) and also the reverse: cancel the latest action.
The first metric is to count all the actions that occurred in the system between two dates; these are exposed as a filter/slicer.
Some sample data :
person-id,person-name,course-name,event,event-rank,startDT,stopDT
11, John, CS101, enrol,1,2000-01-01,2000-03-31
11, John, CS101, grant,2,2000-04-01,2000-04-30
11, John, CS101, cancel,3,2000-04-01,2000-04-30
11, John, PHIL, enrol, 1, 2000-02-01,2000-03-31
11, John, PHIL, grant, 2, 2000-04-01,2000-04-30
The data set (ds) is above and I have added the following code for the count metric:
evaluate
{
sumx(
addcolumns( ds
,"z+", if([event] <> "cancel",1,0)
,"z-", if([event] = "cancel",-1,0)
)
,[z+] + [z-])
}
The metric should display: 3 subscriptions (John-CS101 = 1, John-PHIL = 2).
There are some other rules but I don't know how to add them to the DAX code: the cancel date is the same as the date of the action (non-cancel) above it, and the rank of the cancel action = the rank of the non-cancel action + 1.
There is also a need to add the number of distinct student-course pairs, the composite key. How can I add this to the code, please? (via SUMMARIZE, RANKX)
Regards,
Q
This isn't technically an answer, but more of a recommendation.
It sounds like your challenge is that you have actions that may then be cancelled. There is specific logic that determines whether an action is cancelled or not (i.e. the cancellation has to be the immediate next row and the dates must match).
What I would recommend, which doesn't answer your specific question, is to adjust your data model rather than put the cancellation logic in DAX.
For example, if you could add a column to your data model that flags a row as subsequently cancelled, then all DAX has to do is check that flag, with a simple CALCULATE statement, to know whether an action is cancelled or not. You don't have to have lots of logic to determine whether the event was cancelled, and you entirely eliminate the need for SUMX, which can be slow when working with a lot of rows since it works row by row.
The logic for whether an action is cancelled or not moves to your source system (e.g. SQL, or even a calculated column in Excel) or to your ETL (e.g. the Query Editor in Power BI), which are better equipped for such tasks. The logic is applied once and then exists in your data model for all measures, instead of needing to be applied each time a measure is used.
I know this doesn't help you solve your logic question, but the reason I make this recommendation is that DAX is fundamentally a giant calculator. It adds things up. It's great at filters (adding some things up but not others), but it works best when everything is reduced to columns that it can sum or count. Once you go beyond that (e.g. wanting to look at the row below to adjust something about the current row), your DAX is going to get very complicated (and slow), whereas a source system or the Query Editor will likely be able to handle such requirements more easily.
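As an illustration of moving the logic upstream, here is a Python sketch of that cancellation flag (the field names follow the sample data; the matching rules are the ones stated in the question: the next row for the same person/course is a cancel with rank + 1 and the same start date):

```python
def flag_cancelled(events):
    """events: list of dicts sorted by (person, course, rank).
    Adds a 'cancelled' flag: True when the immediately following row for the
    same person/course is a 'cancel' with rank + 1 and the same start date.
    This is the kind of flag the answer suggests computing in ETL, not DAX."""
    for cur, nxt in zip(events, events[1:]):
        cur["cancelled"] = (
            nxt["event"] == "cancel"
            and nxt["person"] == cur["person"]
            and nxt["course"] == cur["course"]
            and nxt["rank"] == cur["rank"] + 1
            and nxt["startDT"] == cur["startDT"]
        )
    if events:
        events[-1]["cancelled"] = False  # last row has no following cancel
    return events
```

With the flag in place, the subscription count is just the number of non-cancel rows whose flag is False, which a plain filtered count (or CALCULATE in DAX) can produce.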
I need to define a calculated member in MDX (this is SAS OLAP, but I'd appreciate answers from people who work with different OLAP implementations anyway).
The new measure's value should be calculated from an existing measure by applying an additional filter condition. I suppose it will be clearer with an example:
Existing measure: "Total traffic"
Existing dimension: "Direction" ("In" or "Out")
I need to create a calculated member "Incoming traffic", which equals "Total traffic" with an additional filter (Direction = "In")
The problem is that I don't know MDX and I'm on a very tight schedule (so sorry for a newbie question). The best I could come up with is:
([Measures].[Total traffic], [Direction].[(All)].[In])
This almost works, except for cells with a specific direction: it looks like the "intrinsic" filter on Direction is overridden by my own filter. I need an intersection of the "intrinsic" filter and my own. My gut feeling was that it has to do with intersecting [Direction].[(All)].[In] with the intrinsic coordinates of the cell being evaluated, but it's hard to know what I need without first reading up on the subject :)
[update] I ended up with
IIF([Direction].currentMember = [Direction].[(All)].[Out],
0,
([Measures].[Total traffic], [Direction].[(All)].[In])
)
...but at least in SAS OLAP this causes extra queries (to calculate the value for [In]) against the underlying data set, so I didn't use it in the end.
To begin with, you can define a new calculated measure in your MDX, and tell it to use the value of another measure, but with a filter applied:
WITH MEMBER [Measures].[Incoming Traffic] AS
'([Measures].[Total traffic], [Direction].[(All)].[In])'
Whenever you show the new measure on a report, it will behave as if it has a filter of 'Direction > In' on it, regardless of whether the Direction dimension is used at all.
But in your case, you WANT the Direction dimension to take precedence when used... so things get a little messy. You will have to detect whether this dimension is in use, and act accordingly:
WITH MEMBER [Measures].[Incoming Traffic] AS
'IIF([Direction].currentMember = [Direction].[(All)].[Out],
([Measures].[Total traffic]),
([Measures].[Total traffic], [Direction].[(All)].[In])
)'
To see if the Dimension is in use, we check if the current cell is using OUT. If so we can return Total Traffic as it is. If not, we can tell it to use IN in our tuple.
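To see the difference between the two definitions, here is a toy Python model (the traffic numbers are made up) of how the plain tuple always overrides Direction, while the IIF variant leaves "Out" cells alone:

```python
# Hypothetical Total traffic split by Direction.
TRAFFIC = {"In": 70, "Out": 30}

def total_traffic(direction=None):
    """Total traffic in the current context; None stands for the All member."""
    return TRAFFIC[direction] if direction else sum(TRAFFIC.values())

def incoming_plain(direction=None):
    """Plain tuple: always overrides the cell's Direction with 'In'."""
    return total_traffic("In")

def incoming_iif(direction=None):
    """IIF variant: leave 'Out' cells alone, force 'In' everywhere else."""
    return total_traffic(direction) if direction == "Out" else total_traffic("In")
```

The plain version returns the "In" figure even on an "Out" cell, which is exactly the override problem the question describes; the IIF version respects the cell's own coordinate.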
I think you should put a column in your Total Traffic fact table for the In/Out indication and create a Dim table for the In and Out values. You can then analyse your data based on In and Out.