How to specify a time range relative to the latest data instead of relative to today or a fixed date in Superset?

Data is not imported continuously but only every now and then. I would like some charts to display data of the last week, where "last" means not the last seven days from now but the last seven days before the maximum time in the data. Is that possible without a virtual dataset? I could solve this with a virtual dataset but would prefer a more generic solution.

You can achieve this behavior by using Custom SQL in Filters. You can filter based on a subquery that selects the max timestamp from the same table or from another table.
Example:
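A minimal sketch of such a filter, assuming a hypothetical table `events` with a timestamp column `ts` (both names are illustrative and do not come from the question). The string in `CUSTOM_SQL_FILTER` is the kind of expression one would enter as Custom SQL in a chart filter; the surrounding Python only runs it against an in-memory SQLite database to show the effect. Date arithmetic differs between databases, so the expression would need adapting to the engine behind the dataset, and the chart's own time range would presumably have to be left unfiltered so it does not narrow the result further.

```python
import sqlite3

# Hypothetical WHERE-clause expression for a chart's Custom SQL filter.
# `events` and `ts` are assumed names; the date arithmetic is SQLite syntax.
CUSTOM_SQL_FILTER = "ts > (SELECT datetime(MAX(ts), '-7 days') FROM events)"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2021-01-01 00:00:00", 1),  # far older than the latest import
        ("2021-03-26 12:00:00", 2),  # within 7 days of the latest row
        ("2021-03-31 08:00:00", 3),  # the latest row in the data
    ],
)

# The window is anchored on MAX(ts), not on the current date.
rows = conn.execute(
    f"SELECT ts, value FROM events WHERE {CUSTOM_SQL_FILTER} ORDER BY ts"
).fetchall()
print(rows)
```

Because the cutoff comes from the subquery, the chart keeps showing the most recent week of data even when the last import happened long ago.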

Related

Power BI: Relative Time Under 5 Hours returns no data

I have a Power BI Desktop dashboard that pulls machine data from a local SQL Server. I'm using a relative date/time filter on one of the pages to drill down into the data for a live feed; however, for any relative time under 5 hours, the data goes blank.
I use 4 log tables for the raw data, each with its own time stamp for each instance. They are related through an ID table that contains other general information. In addition, time is related using a calculated table that creates a timeframe of all instances:
[Image: relationship model]
DateTable = distinct(union(SUMMARIZE(LogFault,LogFault[Time]),SUMMARIZE(LogGood,LogGood[Time]),SUMMARIZE(LogReject,LogReject[Time]),SUMMARIZE(LogState,LogState[Time])))
[Images: 5 hours relative time; 4 hours relative time]
As you can see from the top right of the images, not even the times are pulled into the page. Is there a limitation in Power BI on the relative time function? That wouldn't make sense to me, given that there is a "minutes" option under relative time. Any feedback on this would be appreciated.
For those looking in the future: unfortunately, Power BI Desktop, along with the Power BI service, appears to work only in the UTC time zone. The relative date/time filter was therefore evaluated against UTC, not my time zone (EST). To resolve this, I created a new calculated column next to my distinct time stamps to correct for the time zone. I then used the adjusted time for the relative time filtering, while the charts stayed on the original time stamps.
UTC to EST time zone adjust
UTC_AdjustTZ = FORMAT(DateTable[Time]+TIME(4,0,0),"General Date")
[Images: chart example after adjust; chart after fix implemented]
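To illustrate why the unadjusted time stamps fell outside short relative windows, here is a rough Python sketch of the same idea. It assumes the logged time stamps are local wall-clock times at a fixed UTC-4 offset, matching the TIME(4,0,0) shift in the DAX column; the function names are hypothetical, and a fixed offset ignores daylight saving changes.

```python
from datetime import datetime, timedelta, timezone

# Assumption: source time stamps are naive local times at a fixed UTC-4 offset.
LOCAL_OFFSET = timezone(timedelta(hours=-4))


def to_utc(local_naive):
    """Interpret a naive local time stamp at the fixed offset; return naive UTC."""
    aware = local_naive.replace(tzinfo=LOCAL_OFFSET)
    return aware.astimezone(timezone.utc).replace(tzinfo=None)


def within_last_hours(local_naive, now_utc, hours):
    """Relative-time test performed on the UTC-adjusted value."""
    adjusted = to_utc(local_naive)
    return now_utc - timedelta(hours=hours) <= adjusted <= now_utc


now_utc = datetime(2021, 6, 1, 14, 30)   # "now" as a UTC-based filter sees it
logged = datetime(2021, 6, 1, 10, 0)     # logged 30 minutes ago in local time

# Unadjusted, the row looks 4.5 hours old and drops out of a 1-hour window.
print(now_utc - timedelta(hours=1) <= logged <= now_utc)
# Adjusted, the same row is correctly inside the 1-hour window.
print(within_last_hours(logged, now_utc, 1))
```

Under these assumptions, any window shorter than the offset plus the age of the newest row comes back empty, which is consistent with the blank pages described in the question.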
This is probably because your filter on Date Table doesn't reach the destination table. Normally a filter moves from the one side to the many side, then again from the one side to the many side, along a chain of relationships; but
in your case, for example:
The filter goes from Date Table to Log Reject, and then it can't move on to RejectDefinitions because of the filter direction. You have 2 options here:
1) Change the model relationships: make Log Reject the one side and RejectDefinitions the many side, if that is possible.
OR
2) Set the filter direction to Both in the model.
You need to do this for all the remaining log tables (LogFault-FaultDefinitions, LogState-StateDefinitions).
I hope it solves your problem. Please check that your model is not ambiguous after making those changes.

How to set the current quarter in Superset?

I want to set the current quarter dynamically, e.g. [2021-01-01, 2021-04-01).
Does Superset support this? If so, how do I configure it?
The Last vs Previous and date range control in general has been a source of confusion for my users.
Last Quarter just shows the last 3 months [because it's a quarter of a year?].
It would be great to have options like Week to date, Month/Period to date, Quarter to date, etc...
Another issue is that each company may define their quarters/periods on different starting dates, depending on their fiscal calendar.
As a stop-gap, I've done the following.
Enriched the underlying dataset with additional columns like period_start_date and fiscal_quarter_start_date.
Created a fiscal_dates table that contains one row for every day over the years I need to query. Its columns correlate with date columns in my other tables, like dob, fiscal_week_start_date, period_start_date, fiscal_quarter_start_date. I created this table in Postgres using generate_series.
Created a new virtual dataset that contains the column period_start_date and shows the last 4 years of period start dates.
Used a value native filter to select from the list of dates.
Sorted the values descending and set the default value to "first item in list".
This allows the user to select all records that occur in the same quarter/period, with a default of the current quarter.
The tentative apache/superset#17416 pull request should remedy this problem, i.e., for the QTD you would simply specify the START as datetrunc(datetime("now"), quarter) and leave the END undefined.
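For reference, the arithmetic behind a quarter-to-date range can be sketched in plain Python. This is only an illustration of what truncating "now" to the quarter means for calendar quarters; the function names are hypothetical, it is not Superset code, and a fiscal calendar with different starting dates would need its own mapping.

```python
from datetime import date


def quarter_start(d):
    """First day of the calendar quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)


def next_quarter_start(d):
    """First day of the following calendar quarter (exclusive end of the range)."""
    start = quarter_start(d)
    if start.month == 10:
        return date(start.year + 1, 1, 1)
    return date(start.year, start.month + 3, 1)


today = date(2021, 2, 15)
# Half-open range [start, end) for the quarter containing `today`.
print(quarter_start(today), next_quarter_start(today))
```

For a date in February 2021 this gives the half-open range [2021-01-01, 2021-04-01) asked for in the question.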

How would one go about creating a due-by attribute in Redshift?

I am currently trying to calculate due-by dates in a table by adding the SLA time to the time the request was created. From what I understand, the way to go about this is to create a table with the work days and hours and query that table to find the due date. However, Redshift does not allow one to declare variables. I was wondering how I would go about creating a work hour table in Redshift and, if that is not possible, how I would calculate the due date by other means. Thanks!
It appears that you would like to provide a timestamp and then calculate the timestamp that is 'n work hours later', most probably taking into account certain rules such as:
Weekdays: 9am-5pm
Weekends: No Hours
Holidays: Occasional weekdays with No Hours
This could be done by creating a scalar Python UDF in Amazon Redshift (see "Creating a scalar Python UDF" in the Amazon Redshift documentation) that would be passed a 'start' timestamp and a number of hours, and would return the 'end' timestamp.
Please note that scalar UDFs cannot access tables or 'call outside' of Redshift, so the function would need to be self-contained.
There is code on the web, such as the Stack Overflow question "How to find the number of hours between two dates excluding weekends and certain holidays in Python? BusinessHours package", that computes the number of work hours between two dates. You would need to modify such code to add a given duration rather than find the duration.
The alternate method of "creating a work hour table" would work well when trying to find the number of work hours between two timestamps, but would be a bit harder when trying to add work hours to a timestamp.
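A minimal sketch of the logic such a UDF body might contain, assuming the rules listed above (weekdays 9am-5pm, no weekend hours, holidays passed in as a set of dates). The function name `add_work_hours` and its parameters are hypothetical. Redshift's Python UDFs run on Python 2.7, so the code deliberately avoids newer syntax, and a self-contained UDF would have to embed the holiday list rather than read it from a table.

```python
from datetime import datetime, date, time, timedelta

WORK_START = time(9, 0)   # assumed opening time
WORK_END = time(17, 0)    # assumed closing time


def add_work_hours(start, hours, holidays=frozenset()):
    """Return the timestamp that lies `hours` working hours after `start`."""
    remaining = timedelta(hours=hours)
    current = start
    while True:
        day = current.date()
        is_workday = current.weekday() < 5 and day not in holidays
        day_open = datetime.combine(day, WORK_START)
        day_close = datetime.combine(day, WORK_END)
        if is_workday and current < day_close:
            # Requests created before opening start counting at opening time.
            current = max(current, day_open)
            available = day_close - current
            if remaining <= available:
                return current + remaining
            remaining -= available
        # Nothing (more) to consume today: jump to the next day's opening time.
        current = datetime.combine(day + timedelta(days=1), WORK_START)


created = datetime(2021, 1, 1, 15, 0)  # a Friday afternoon
print(add_work_hours(created, 4))
print(add_work_hours(created, 4, holidays={date(2021, 1, 4)}))
```

In this example two hours are consumed on Friday, the weekend is skipped, and the remaining two hours land on the next working morning; marking that Monday as a holiday pushes the due time to Tuesday.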

How to use REGEX patterns to return day of the week if a date is entered?

I am trying to create something that will populate the day of the week in one cell based on a date entered in another, BUT in Master Data Services. I know I will need to do this in a business rule and apply it to the attributes. I am wondering if this can be done using REGEX patterns or any other clever method. So, for example, if I have a column with 12/21/2016 in it, I want the next column to say "Wednesday". Thanks!
The simplest way would be to use SQL, either with a calculated column or directly in a view. I'm not that familiar with MDS, so there might be a better solution (or this might not even be possible).
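-- PERSISTED is omitted: DATENAME depends on the session language, so SQL Server treats it as non-deterministic
ALTER TABLE mytable ADD weekday AS DATENAME(dw, fieldname)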
Never try to write your own date/time functions; you will fail.
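The same principle, sketched in Python for comparison: the date parsing and weekday calculation are left to the standard library rather than to a regular expression. The helper name `weekday_name` and the month/day/year input format are assumptions for illustration; the explicit list of English day names avoids depending on the system locale.

```python
from datetime import datetime

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def weekday_name(text, fmt="%m/%d/%Y"):
    """Parse a date string and return the English name of its weekday."""
    parsed = datetime.strptime(text, fmt)
    return DAY_NAMES[parsed.weekday()]


print(weekday_name("12/21/2016"))
```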

Add column with difference in days

I'm trying the new Power BI (Desktop) to create a bar chart that shows the duration in days for the delivery of an order.
I have 2 files: one with the delivery data (date, barcode) and another with the delivery statuses (date, barcode).
I created a relationship on barcode in the Power BI relationships tab on the left side: 1 Delivery to many DeliveryStatusses.
Now I want to add a column/measure to calculate the number of days before a package is delivered. I searched a few blogs but with no success.
The function DATEDIFF is only recognized in a measure, and measures seem to work on table data, not row data. So adding a column using the DATEDIFF function doesn't work.
Adding a column using a formula :
Duration = [DeliveryDate] - Delivery[OrderDate]
results in an error that the right side is a list (It seems the relationship isn't in place)?
What am I doing wrong?
You might try doing this in the Query window instead, since I think each barcode has just one delivery date and one delivery status. You could merge the two queries into a single table. Then you wouldn't need to worry about the relationships. If, on the other hand, you can have multiple lines for each delivery in the delivery status table, then you need to get more fancy. If you're only interested in the last status (as opposed to the history of statuses), you could again use the Query window to group the data. If you need the full flexibility, you'd probably need to create a measure that expresses the logic you want.
The RELATED function is used to reference a column in a related table. Update your formula as follows and it should work.
Like this:
Duration = [DeliveryDate] - RELATED(Delivery[OrderDate])
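Conceptually, RELATED looks up the single matching row on the one side of the relationship for each row on the many side. A rough Python analogy, with made-up barcodes and dates and hypothetical names, might look like this:

```python
from datetime import date

# One side of the relationship: barcode -> order date (sample data, assumed).
order_dates = {
    "A1": date(2015, 8, 1),
    "B2": date(2015, 8, 3),
}

# Many side: one row per delivery status, keyed by barcode (sample data).
status_rows = [
    ("A1", date(2015, 8, 4)),
    ("B2", date(2015, 8, 10)),
]


def durations_in_days(rows, lookup):
    """For each status row, look up the related order date and subtract."""
    return [
        (barcode, (delivered - lookup[barcode]).days)
        for barcode, delivered in rows
    ]


print(durations_in_days(status_rows, order_dates))
```

The dictionary lookup plays the role of the relationship on barcode: without it, the subtraction has no single order date to work with, which is what the "right side is a list" error points to.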