I have two tables:
DateDim
Time
I am trying to get the sum of hours_actual from my Time table where they are between two dates from my DateDim. They have a relationship on the date shown in the following:
I am currently using the following DAX formula:
PreviousPeriod_Hours =
CALCULATE(
    SUM('Time'[hours_actual]),
    DATESBETWEEN(
        DateDim[FullDateAlternateKey],
        [Start of Previous Period],
        [End of Previous Period]
    ),
    ALL(DateDim)
)
The values for [Start of Previous Period] and [End of Previous Period] are dates calculated in DAX, and they show the values I would expect.
To arrive at those dates, I first create a few DAX measures:
Start of This Period = FIRSTDATE(DateDim[FullDateAlternateKey])
End of This Period = LASTDATE(DateDim[FullDateAlternateKey])
Days in This Period = DATEDIFF([Start of This Period],[End of This Period],DAY)
End of Previous Period = PREVIOUSDAY(LASTDATE(DATEADD(DateDim[FullDateAlternateKey],-1*[Days in This Period],DAY)))
Start of Previous Period = PREVIOUSDAY(FIRSTDATE(DATEADD(DateDim[FullDateAlternateKey],-1*[Days in This Period] + IF(MOD(Year('MeasureTable'[End of This Period]),4) == 0,1,0),DAY)))
To quickly summarize the above: it finds the number of days between the selected start and end dates, then subtracts that many days from both dates. If the selected end date falls in a leap year, it adds a day.
The DAX formula gives me the correct sum total. However, if I display the hours by month between the two dates, the monthly values are completely different from what they should be, and they don't add up to the total it displays.
I was expecting the following values:
I am not sure where the 13 is coming from, and the 28.25 looks like a repeat from the previous month of the following year. What am I missing here? Is my current approach correct and I am just doing something wrong, or am I taking the wrong approach altogether?
UPDATE - Adding in some of the data I am working with:
Then the DateDim is just a generated date table, for example, a row looks like the following (2016-2021):
FullDateAlternateKey Year Month Month Name Quarter Week of Year Week of Month Day Day of Week Day of Year Day Name Fiscal Year Fiscal Period Fiscal Quarter
2016-01-02 2016 1 January 1 1 1 2 6 2 Saturday 2016 5 2
And the hours_actual and date look like the following:
Date_Start hours_actual
2019-03-05 12:00:00 AM 5
2019-03-26 12:00:00 AM 3
2019-04-23 12:00:00 AM 0.75
2019-04-24 12:00:00 AM 0.08
2019-05-22 12:00:00 AM 4
2019-05-22 12:00:00 AM 2
2019-05-22 12:00:00 AM 1.75
2019-05-27 12:00:00 AM 8
2019-05-31 12:00:00 AM 0.25
2019-06-03 12:00:00 AM 0.25
2019-06-05 12:00:00 AM 0.25
2019-06-21 12:00:00 AM 1
2019-06-27 12:00:00 AM 2
2019-06-27 12:00:00 AM 0.5
2019-06-28 12:00:00 AM 1
2019-06-28 12:00:00 AM 3
2019-07-04 12:00:00 AM 3
2019-07-05 12:00:00 AM 3
2019-07-10 12:00:00 AM 2.5
2019-07-10 12:00:00 AM 0.5
2019-07-10 12:00:00 AM 1.5
2019-07-10 12:00:00 AM 0.5
2019-07-10 12:00:00 AM 2
2019-07-12 12:00:00 AM 2.5
2019-07-17 12:00:00 AM 1
2019-07-18 12:00:00 AM 0.5
2019-07-24 12:00:00 AM 0.5
2019-07-24 12:00:00 AM 1
2019-07-24 12:00:00 AM 1.5
2019-07-24 12:00:00 AM 1
2019-07-25 12:00:00 AM 1
2019-07-25 12:00:00 AM 0.5
2019-07-31 12:00:00 AM 1
2019-07-31 12:00:00 AM 1.5
2019-07-31 12:00:00 AM 1
2019-07-31 12:00:00 AM 0.5
2019-08-01 12:00:00 AM 2
2019-08-07 12:00:00 AM 4
2019-08-07 12:00:00 AM 3.75
2019-08-08 12:00:00 AM 4
2019-08-14 12:00:00 AM 1.25
2019-09-11 12:00:00 AM 3.5
2019-09-11 12:00:00 AM 2.5
2019-09-12 12:00:00 AM 3
2019-09-12 12:00:00 AM 1.75
2019-09-13 12:00:00 AM 4
2019-09-13 12:00:00 AM 1.75
2019-09-13 12:00:00 AM 3
2019-09-14 12:00:00 AM 2
2019-09-14 12:00:00 AM 3.25
2019-09-16 12:00:00 AM 0.5
2019-09-16 12:00:00 AM 0.5
2019-09-26 12:00:00 AM 2.5
After experimenting a little more, I found that the DAX measures for the previous start and end dates were being evaluated for each month row as well as for the whole year. My mistake was thinking the measures would only be evaluated against the slicer selection and not against the values presented in the table.
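The effect can be reproduced outside Power BI. The following is a minimal Python sketch that transcribes the arithmetic of the measures above. The function name `previous_period` is hypothetical, and plain date arithmetic only approximates how DATEADD behaves against a real date table, so treat it as an illustration rather than an exact model of the engine.

```python
from datetime import date, timedelta
from typing import Tuple


def previous_period(start: date, end: date) -> Tuple[date, date]:
    """Mirror the arithmetic of the [Start/End of Previous Period] measures."""
    days = (end - start).days  # DATEDIFF(start, end, DAY)
    # Same simplified leap-year test as the question (year divisible by 4).
    leap_adjust = 1 if end.year % 4 == 0 else 0
    # PREVIOUSDAY(... shifted back by `days` ...)
    prev_end = end - timedelta(days=days) - timedelta(days=1)
    prev_start = start - timedelta(days=days - leap_adjust) - timedelta(days=1)
    return prev_start, prev_end


# Slicer context: the whole of 2020 is selected.
year_window = previous_period(date(2020, 1, 1), date(2020, 12, 31))

# Context of one row in a month-by-month visual: only March 2020 is visible.
march_window = previous_period(date(2020, 3, 1), date(2020, 3, 31))

print(year_window)   # window used for the grand total
print(march_window)  # window used for the March row
```

With the full year selected, the window is the 2019 calendar year. For the March 2020 row, the same arithmetic yields 2020-01-31 to 2020-02-29 rather than March 2019, which would be consistent with monthly rows that do not match the expected values or add up to the total.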
I took a different approach, and basically created a reference table of the Time table, and added a column that added a year to the date for each row. I then joined the reference table to my DateDim table by this future_date column. I was finally able to show the values by the current period and previous period and it accurately gave the results I was looking for.
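As a rough illustration of that idea, here is a minimal Python sketch. The names `add_one_year` and `previous_period_hours_by_month` are hypothetical, the rows are a subset of the sample data above, and mapping 29 February to 28 February is an assumption of the sketch, not something stated in the original post.

```python
from collections import defaultdict
from datetime import date


def add_one_year(d: date) -> date:
    """Shift a date forward one year (assumption: 29 Feb maps to 28 Feb)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


# Subset of the Date_Start / hours_actual sample rows from the question.
time_rows = [
    (date(2019, 3, 5), 5.0),
    (date(2019, 3, 26), 3.0),
    (date(2019, 5, 22), 4.0),
    (date(2019, 5, 22), 2.0),
    (date(2019, 5, 22), 1.75),
    (date(2019, 5, 27), 8.0),
    (date(2019, 5, 31), 0.25),
]


def previous_period_hours_by_month(rows):
    """Sum hours under the (year, month) of the shifted 'future_date'."""
    totals = defaultdict(float)
    for day, hours in rows:
        future = add_one_year(day)
        totals[(future.year, future.month)] += hours
    return dict(totals)


result = previous_period_hours_by_month(time_rows)
print(result)
```

Because each row is keyed by its date plus one year, selecting March 2020 in the date table picks up the hours logged in March 2019, whatever the granularity of the visual.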
Example
Record Table
id value created_datetime
1 10 2022-01-18 10:00:00
2 11 2022-01-18 10:15:00
3 8 2022-01-18 15:15:00
4 25 2022-01-19 09:00:00
5 16 2022-01-19 12:00:00
6 9 2022-01-20 11:00:00
I want to filter 'Record Table' to get the latest value for each date. For example, there are three dates (2022-01-18, 2022-01-19 and 2022-01-20), and the latest value for each of them is shown below.
Latest value for each date (the result I am looking to get):
id value created_datetime
3 8 2022-01-18 15:15:00
5 16 2022-01-19 12:00:00
6 9 2022-01-20 11:00:00
So how do I filter to receive the results shown in the table above?
It can be done in two steps:
First get the latest datetime for each day and then filter the records by that.
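from django.db.models import Max
# Step 1: latest created_datetime for each calendar day.
max_daily_date_times = Record.objects.extra(select={'day': 'date(created_datetime)'}).values('day') \
    .annotate(latest_datetime=Max('created_datetime'))
# Step 2: keep only the records stamped with one of those daily maxima.
records = Record.objects.filter(
    created_datetime__in=[entry['latest_datetime'] for entry in max_daily_date_times]
).values('id', 'value', 'created_datetime')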
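The same two-step logic can be checked without a database. This is a minimal plain-Python sketch using the rows from the question; the function name `latest_per_day` is hypothetical and it is not a replacement for the ORM query.

```python
from datetime import datetime

# (id, value, created_datetime) rows from the question's Record Table.
records = [
    (1, 10, "2022-01-18 10:00:00"),
    (2, 11, "2022-01-18 10:15:00"),
    (3, 8, "2022-01-18 15:15:00"),
    (4, 25, "2022-01-19 09:00:00"),
    (5, 16, "2022-01-19 12:00:00"),
    (6, 9, "2022-01-20 11:00:00"),
]


def latest_per_day(rows):
    """Keep, for each calendar day, the row with the greatest timestamp."""
    latest = {}  # day -> (timestamp, original row)
    for row in rows:
        ts = datetime.strptime(row[2], "%Y-%m-%d %H:%M:%S")
        day = ts.date()
        if day not in latest or ts > latest[day][0]:
            latest[day] = (ts, row)
    return [latest[day][1] for day in sorted(latest)]


result = latest_per_day(records)
for row in result:
    print(row)
```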
I have 4 values collected daily.
I want to graph the average of the 4 values on a time series graph.
If I were to plot this, 1/03/2021 would show an average value of 15 and 2/03/2021 would show an average value of 35.
I tried using the quick measure for a rolling average (1 day before, 0 days after), but it gives me an error.
The DAX I tried didn't work either. It fails with "Too many arguments were passed to the VALUES function. The maximum argument count for the function is 1". This is my first time trying to follow instructions online.
Day Avg = AVERAGEX(VALUES([Date], [Values]))
Thanks for the input.
Gem
Assuming your data looks like this
Table
Date Time Value
01/03/2021 00:01:00 10
01/03/2021 06:00:00 20
01/03/2021 12:00:00 15
01/03/2021 18:00:00 15
02/03/2021 00:01:00 30
02/03/2021 06:00:00 20
02/03/2021 12:00:00 40
02/03/2021 18:00:00 50
It seems your row context is at the table level, so you don't need to use VALUES.
AVG =
AVERAGEX ( 'Table', 'Table'[Value] )
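When Date is placed on the chart axis, the measure is evaluated once per date over only that day's rows. The arithmetic can be checked with a minimal Python sketch over the sample rows above; the function name `daily_average` is hypothetical.

```python
from collections import defaultdict

# (Date, Time, Value) rows from the sample table.
readings = [
    ("01/03/2021", "00:01:00", 10),
    ("01/03/2021", "06:00:00", 20),
    ("01/03/2021", "12:00:00", 15),
    ("01/03/2021", "18:00:00", 15),
    ("02/03/2021", "00:01:00", 30),
    ("02/03/2021", "06:00:00", 20),
    ("02/03/2021", "12:00:00", 40),
    ("02/03/2021", "18:00:00", 50),
]


def daily_average(rows):
    """Average the Value column separately for each Date."""
    buckets = defaultdict(list)
    for day, _time, value in rows:
        buckets[day].append(value)
    return {day: sum(vals) / len(vals) for day, vals in buckets.items()}


averages = daily_average(readings)
print(averages)
```

The two results match the 15 and 35 expected in the question.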
I have a query that generates every day of the year (shown below). What if I want a series of every hour of every day of the past year, counting back from the current timestamp? Example: today is July 23, 2019 10:30:00 AM, and the result I am hoping to get is below
2019-07-23 20:30:00
2019-07-23 20:00:00
2019-07-23 19:00:00
2019-07-23 18:00:00
.
.
.
2018-07-23 20:00:00
This is a Redshift (PostgreSQL 8.0.2) query for Eclipse Birt. Hoping to create a parameter for both date and time but seems difficult to achieve if 2 separate ranges.
select cast(convert_timezone('UTC','AEST',cast(now() as timestamp without time zone)) as date) - generate_series(0, 365) date,
to_char(cast(convert_timezone('UTC','AEST',cast(now() as timestamp without time zone)) as date) - generate_series(0, 365), 'dd/mm/yyyy') date_disp;
This is similar to your previous question.
Use:
SELECT date_trunc('hour', now()::timestamp) - generate_series(0, 24 * 365) * interval '1 hour'
This outputs:
2019-07-23 05:00:00
2019-07-23 04:00:00
etc
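To make the shape of that result concrete, here is a minimal Python sketch of the same idea: truncate the current time to the hour, then step back one hour at a time for 24 * 365 steps. The function name `hourly_series` and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta


def hourly_series(now: datetime, days: int = 365):
    """Hourly timestamps from the top of the current hour back `days` days."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    # range(...) is inclusive of 0 and 24 * days, like generate_series(0, 24 * 365).
    return [top_of_hour - timedelta(hours=h) for h in range(24 * days + 1)]


# Hypothetical "now"; only the hour matters after truncation.
series = hourly_series(datetime(2019, 7, 23, 5, 47))
print(series[0], series[1], series[-1], len(series))
```

The series contains 8761 rows because both endpoints are included.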
You can use the DATEADD Redshift function, using "h", "hr" or "hrs" as your first parameter. Documentation for this function can be found here and here.
I am trying to create a variance measure in PowerBI.
This is the data that I have,
Month Year MonthNo Value
Jan 2016 1 700
Feb 2016 2 800
March 2016 3 900
April 2016 4 750
.
.
Jan 2017 13 690
Feb 2017 14 730
And my variance for month number 7 should be:
`{Avg(Values(4,5,6)) - Value(7)} / Value(7)`
i.e. (average of the previous 3 months' values - current month's value) / current month's value
How to do this in Power BI? Thanks.
If it is okay for you to use a column, I believe you could add one with this code to get what you want:
Variance =
(
    CALCULATE(
        AVERAGEX(Sheet1, Sheet1[Value]),
        FILTER(
            FILTER(Sheet1, Sheet1[MonthNo] <= EARLIER(Sheet1[MonthNo]) - 1),
            Sheet1[MonthNo] >= EARLIER(Sheet1[MonthNo]) - 3
        )
    ) - Sheet1[Value]
) / Sheet1[Value]
You'll need to replace all instances of Sheet1 with the name of your table.
It'll give you something like this:
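The arithmetic behind the column can also be checked by hand. Below is a minimal Python sketch using only the 2016 values listed in the question (months 1 to 4). The function name `variance` is hypothetical, and returning None when no earlier months exist is a choice of this sketch, not necessarily what the DAX column shows for the first row.

```python
import math

# MonthNo -> Value, taken from the question's sample rows.
values_by_month = {1: 700, 2: 800, 3: 900, 4: 750}


def variance(values, month_no, window=3):
    """(average of up to `window` previous months - current) / current."""
    previous = [values[m] for m in range(month_no - window, month_no) if m in values]
    current = values[month_no]
    if not previous or current == 0:
        return None  # nothing to compare against, or division by zero
    average = sum(previous) / len(previous)
    return (average - current) / current


april = variance(values_by_month, 4)     # (800 - 750) / 750
february = variance(values_by_month, 2)  # (700 - 800) / 800
print(april, february)
```

For April, the previous three months average 800, so the variance is 50 / 750, roughly 0.067.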
My pandas dataframe is structured like this (with 'date' as index):
starttime duration_seconds
date
2012-12-24 11:52:00 31800
2012-12-23 0:28:00 35940
2012-12-22 2:00:00 26820
2012-12-21 1:57:00 23520
2012-12-20 1:32:00 23100
2012-12-19 0:50:00 25080
2012-12-18 1:17:00 24780
2012-12-17 0:38:00 25440
2012-12-15 10:38:00 32760
2012-12-14 0:35:00 23160
2012-12-12 22:54:00 3960
2012-12-12 0:21:00 24060
2012-12-10 23:45:00 900
2012-12-11 11:00:00 24840
2012-12-10 0:27:00 25980
2012-12-09 19:29:00 4320
2012-12-09 3:00:00 29880
2012-12-08 2:07:00 34380
I use the following to groupby date and sum the total seconds each day:
df_sum = df.groupby(df.index.date).sum()
What I'd like to do is sum duration_seconds from noon on one day to noon on the following day. Is there an elegant (pandas) way of doing this? Thanks in advance!
pd.TimeGrouper is a custom groupby class for time-interval grouping of NDFrames with a DatetimeIndex, TimedeltaIndex or PeriodIndex. (If your dataframe index is using date-strings, you'll need to convert it to a DatetimeIndex first by using df.index = pd.DatetimeIndex(df.index).)
df.groupby(pd.TimeGrouper('24H')).sum() groups df using 24-hour intervals starting at time 00:00:00.
df.groupby(pd.TimeGrouper('24H', base=12)).sum() groups df using 24-hour intervals starting at time 12:00:00:
In [90]: df.groupby(pd.TimeGrouper('24H', base=12)).sum()
Out[90]:
duration_seconds
2012-12-07 12:00:00 34380.0
2012-12-08 12:00:00 34200.0
2012-12-09 12:00:00 26880.0
2012-12-10 12:00:00 24840.0
2012-12-11 12:00:00 28020.0
2012-12-12 12:00:00 NaN
2012-12-13 12:00:00 23160.0
2012-12-14 12:00:00 32760.0
2012-12-15 12:00:00 NaN
2012-12-16 12:00:00 25440.0
2012-12-17 12:00:00 24780.0
2012-12-18 12:00:00 25080.0
2012-12-19 12:00:00 23100.0
2012-12-20 12:00:00 23520.0
2012-12-21 12:00:00 26820.0
2012-12-22 12:00:00 35940.0
2012-12-23 12:00:00 31800.0
Documentation on pd.TimeGrouper is a little sparse. It is a subclass of pd.Grouper, so many of its parameters have the same meaning as those documented for pd.Grouper. You can find more examples of pd.TimeGrouper usage in the Cookbook. I found the base parameter by inspecting the source code. The base parameter in pd.TimeGrouper has the same meaning as the base parameter in df.resample, which is not surprising since df.resample is implemented using pd.TimeGrouper.
In fact, come to think of it, another way to compute the desired result is
df.resample('24H', base=12).sum()
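Two caveats may be worth keeping in mind. Recent pandas releases have removed pd.TimeGrouper and the base argument in favour of pd.Grouper(freq=...) and an offset argument. Also, if the index holds only the date while the time of day sits in the starttime column, the two would need to be combined into a full timestamp before a noon-to-noon grouping is meaningful. The bucketing rule itself is simple, as this minimal standard-library sketch suggests; the names `noon_bucket` and `sum_noon_to_noon` are hypothetical, and the rows are a subset of the question's data.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (date, starttime, duration_seconds) rows from the question.
rows = [
    ("2012-12-11", "11:00:00", 24840),
    ("2012-12-10", "23:45:00", 900),
    ("2012-12-10", "0:27:00", 25980),
    ("2012-12-09", "19:29:00", 4320),
    ("2012-12-09", "3:00:00", 29880),
    ("2012-12-08", "2:07:00", 34380),
]


def noon_bucket(ts: datetime) -> datetime:
    """Return the noon that opens the 24-hour window [noon, next noon) holding ts."""
    shifted = ts - timedelta(hours=12)
    return shifted.replace(hour=12, minute=0, second=0, microsecond=0)


def sum_noon_to_noon(data):
    """Sum duration_seconds per noon-to-noon window."""
    totals = defaultdict(int)
    for date_str, time_str, seconds in data:
        ts = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
        totals[noon_bucket(ts)] += seconds
    return dict(sorted(totals.items()))


totals = sum_noon_to_noon(rows)
for start, seconds in totals.items():
    print(start, seconds)
```

Subtracting 12 hours before truncating to the day is what moves the window boundary from midnight to noon; the label of each bucket is the noon at which it starts.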