AmCharts4: Data grouping, tooltip text and changing ranges

From a server, I get an entry for every hour in the range I request. If I request a range of one year, I get something like 8,000 data points.
When rendering the graph, I want to group my data into hours (which is the raw data without grouping), days and months. However, the chart looks like this:
The tooltip only displays on the very first column; all other columns are above 1.5, but my ValueAxis does not scale automatically. I already checked whether I set a fixed min and max on the valueAxis; this is not the case.
Interestingly, if I use the scrollbar to zoom in until grouping kicks in, everything seems to work:
After zooming out again, it also works, but I cannot see the tooltip on the "June" column:
And finally, if I trigger "invalidateData", the graph goes back to the state it was in before.
My grouping looks as follows:
series = entry.chart.series.push(new am4charts.ColumnSeries());
dateAxis.groupData = true;
dateAxis.groupCount = 40;
dateAxis.groupIntervals.setAll([
{ timeUnit: "hour", count: 1 },
{ timeUnit: "day", count: 1 },
{ timeUnit: "month", count: 1 },
{ timeUnit: "year", count: 1 },
{ timeUnit: "year", count: 10 }
]);
series.groupFields.valueY = "sum";
I am also not sure what I should set those values to (see the sketch after this list). I want to see:
months when the period is 3 months or more
days when the period is between 3 days and 3 months
hours when the period is below 3 days
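One way to approximate those thresholds, as a sketch: groupCount is the maximum number of data items shown before the axis falls back to the next, coarser interval in groupIntervals, so a single value has to serve both cutoffs. Three days of hourly points is 72 items and three months of daily points is roughly 90, so something in the 72-90 range is a compromise:
// Sketch: approximate the 3-day / 3-month switchover points.
dateAxis.groupData = true;
dateAxis.groupCount = 90; // compromise between 72 (3 days of hours) and ~90 (3 months of days)
dateAxis.groupIntervals.setAll([
  { timeUnit: "hour", count: 1 },  // used for ranges below ~3 days
  { timeUnit: "day", count: 1 },   // used for ranges up to ~3 months
  { timeUnit: "month", count: 1 }  // used for anything longer
]);
The year-level intervals from the original list can stay at the end if decade-scale ranges are ever shown.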
It is very difficult to make a fiddle for this, as there is already so much code and it's hard to extract only the essential parts.
Maybe I am missing something obvious, please help!
Edit:
I forgot another question which is part of datagrouping:
How can I make the tooltip show the date in a formatted manner (see the sketch after this list), so that:
hour columns show "dd hh:mm" (where mm is obviously always 00)
day columns show "dd.mm"
month columns show "MM. YYYY"
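For the record, a hedged sketch of one approach to the tooltip formats: amCharts 4 fires a "groupperiodchanged" event on the DateAxis when the active grouping interval changes, so the tooltip date format can be swapped there. The currentDataSetId check below is an assumption about how the active period is exposed, so verify it against your amCharts version; also note that in amCharts format codes "MM" means months and "mm" means minutes.
dateAxis.events.on("groupperiodchanged", function (ev) {
  // ASSUMPTION: currentDataSetId encodes the active group period,
  // e.g. something containing "hour", "day" or "month".
  var id = String(ev.target.currentDataSetId || "");
  if (id.indexOf("hour") !== -1) {
    dateAxis.tooltipDateFormat = "dd HH:mm";  // hour columns
  } else if (id.indexOf("day") !== -1) {
    dateAxis.tooltipDateFormat = "dd.MM";     // day columns
  } else if (id.indexOf("month") !== -1) {
    dateAxis.tooltipDateFormat = "MM. yyyy";  // month columns
  }
});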

Never mind, I solved this issue. It was actually a bug, fixed in this release:
https://github.com/amcharts/amcharts4/releases/tag/4.7.19
so updating my local amCharts4 files did the trick.
I still do not know how to change the tooltip text and the grouping as described in my question.

Related

Hour:Minute format on an APEX chart is not possible

I use Oracle APEX (v22.1), and on a page I created a line chart, but I have the following problem with the visualization:
on the y-axis it is not possible to show the values in the format 'hh:mi'. I need help with this.
Details for the axis:
x-axis: a date column represented as a string: to_char(time2, 'YYYY-MM')
y-axis: two date columns whose average difference is calculated: AVG(time2 - time1); the date time2 is the same as the date on the x-axis.
So I have the following SQL query for the visualization of the series:
SELECT to_char(time2, 'YYYY-MM') AS year_month, -- x-axis
       AVG(time2 - time1) AS average_value      -- y-axis
FROM users
GROUP BY to_char(time2, 'YYYY-MM')
ORDER BY to_char(time2, 'YYYY-MM')
One more note: I am not familiar with JavaScript, in case the solution is only possible that way. I am new to APEX, but I have seen in different tutorials that you can use JS. So if JS is the only solution, I would be happy to get a short description of what I must do on the page.
(I don't know if this point is important for this case: The values time1 and time2 are updated daily.)
In the chart attributes I enabled 'Time Axis Type' under Settings.
On the y-axis I changed the format to "Time - Short" and tried different patterns such as ##:##, but then every value on the y-axis shows as '01:00', although the line itself is drawn correctly. When I change the format to Decimal, the values are shown correctly and match the line chart.
I also tried the EXTRACT function for the value, as in EXTRACT(HOUR FROM AVG(time2 - time1)) || ':' || EXTRACT(MINUTE FROM AVG(time2 - time1)), but I get an error message (subtracting two DATE values yields a NUMBER of days in Oracle, so EXTRACT cannot be applied to it).
So where is my mistake, or is this more difficult to solve?
ROUND(TRUNC(AVG(time2 - time1)/60) + MOD(AVG(time2 - time1), 60)/100, 2) AS Y
will get close to what you want. You can set the Y-axis minimum to 0 and maximum to 24;
then 12.23 means 12 hours and 23 minutes.
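As a worked example: Oracle DATE subtraction yields days, so convert to minutes first (e.g. AVG((time2 - time1) * 24 * 60)). For an average of 743 minutes, TRUNC(743/60) = 12 and MOD(743, 60) = 23, so the expression returns 12 + 23/100 = 12.23, read as 12 hours 23 minutes. The formatting is cosmetic only; 12.30 reads as 12:30 but still sorts and plots as a plain decimal.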

Calculate the difference between 2 rows in PowerBI using DAX

I'm trying to do something that should be quite simple, but for the life of me I can't work it out.
I'm trying to calculate the difference between 2 rows that share the same 'Scan type'.
I have attached a photo showing sample data from production. We run a scan and depending on the results of the scan, it's assigned a color.
I want to find the difference in Scan IDs between each Red scan.
Using the attached photo of sample data, I would expect a difference of 0 for ID 3, a difference of 1 for ID 4, and a difference of 10 for ID 14.
I have (poorly) written something that works based on the maximum value of the scan ID.
I have also tried following a few posts to see if I can get it to work:
VAR _curid = MAX(Table1[scanid])
VAR _curclueid = MAX(Table1[scanid])
VAR _calc = CALCULATE(SUM(Table1[scanid]), FILTER(ALLSELECTED(Table1[scanid]), Table1[scanid]))
RETURN IF(_curid - _calc = _curid, 0, _curid - _calc)
Edit: I forgot to mention that I have already checked these threads:
57699052
61464745
56703516
57710425
Try the following DAX, and if it helps then accept it as the answer.
Create a calculated column that returns the ID where the colour is Red, as follows:
Column = IF('Table'[Colour] = "Red", 'Table'[ID])
Create another column as following:
Column 2 =
VAR Colr = 'Table'[Colour]
VAR SCAN = 'Table'[Scan ID]
VAR Prev_ID =
    CALCULATE(MAX('Table'[Column]),
        FILTER('Table', 'Table'[Colour] = Colr && 'Table'[Scan ID] < SCAN))
RETURN
'Table'[Column] - Prev_ID
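Walking through the sample data: scan ID 4 finds 3 as the latest earlier red ID, giving 4 - 3 = 1; scan ID 14 finds 4, giving 14 - 4 = 10; and scan ID 3 has no earlier red scan, so Prev_ID is blank (handled by the edit below). This matches the differences expected in the question.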
Output:
EDIT:
If you want your first value (ID 3) to be 0, then replace the RETURN line with the following:
IF(ISBLANK(Prev_ID) && 'Table'[Colour] = "Red", 0, 'Table'[Column] - Prev_ID )
This will give you the following result:

Efficient way of running Django query over list of dates

I am working on an investment app in Django which requires calculating portfolio balances and values over time. The database is currently set up this way:
class Ledger(models.Model):
asset = models.ForeignKey('Asset', ....)
amount = models.FloatField(...)
date = models.DateTimeField(...)
...
class HistoricalPrices(models.Model):
asset = models.ForeignKey('Asset', ....)
price = models.FloatField(...)
date = models.DateTimeField(...)
Users enter transactions in the Ledger, and I update prices through APIs.
To calculate the balance for a day (note multiple Ledger entries for the same asset can happen on the same day):
def balance_date(date):
return Ledger.objects.filter(date__date__lte=date).values('asset').annotate(total_amount=Sum('amount'))
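With the sample data further below, for example, balance_date(datetime.date(2019, 10, 10)) returns a queryset along the lines of [{'asset': 1, 'total_amount': 25.0}, {'asset': 2, 'total_amount': 18.0}]; 'asset' holds primary keys, so the ids here are illustrative.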
Trying to then get values for every day between the date of the first Ledger entry and today becomes more challenging. Currently I am doing it this way, assuming start_date and end_date are datetime.date objects and tr_dates is a list of the unique dates on which transactions occurred (to avoid calculating balances on days where nothing happened):
import pandas as pd
idx = pd.date_range(start_date, end_date)
main_df = pd.DataFrame(index=tr_dates)
main_df['date_send'] = main_df.index
main_df['balances'] = main_df['date_send'].apply(lambda x: balance_date(x))
main_df = main_df.sort_index()
main_df.index = pd.DatetimeIndex(main_df.index)
main_df = main_df.reindex(idx, method='ffill')
This works, but my issue is performance. It takes at least 150-200 ms to run, and then I need to get the prices for each date (all of them, not just transaction dates) and somehow match and multiply by the correct balances, which brings the total run time to about 800 ms or more.
Given this is a web app, a view taking at least 800 ms to calculate is hardly scalable, so I was wondering if anyone has a better way to do this?
EDIT - Simple example of expected input/output
Ledger entries (JSON format):
[
{
"asset":"asset_1",
"amount": 10,
"date": "2015-01-01"
},
{
"asset":"asset_2",
"amount": 15,
"date": "2017-10-15"
},
{
"asset":"asset_1",
"amount": -5,
"date": "2018-02-09"
},
{
"asset":"asset_1",
"amount": 20,
"date": "2019-10-10"
},
{
"asset":"asset_2",
"amount": 3,
"date": "2019-10-10"
}
]
Sample prices from HistoricalPrices:
[
{
"date": "2015-01-01",
"asset": "asset_1",
"price": 5
},
{
"date": "2015-01-01",
"asset": "asset_2",
"price": 15
},
{
"date": "2015-01-02",
"asset": "asset_1",
"price": 6
},
{
"date": "2015-01-02",
"asset": "asset_2",
"price": 11
},
...
{
"date": "2017-10-15",
"asset": "asset_1",
"price": 20
},
{
"date": "2017-10-15",
"asset": "asset_2",
"price": 30
}
]
In this case:
tr_dates is ['2015-01-01', '2017-10-15', '2018-02-09', '2019-10-10']
date_range is ['2015-01-01', '2015-01-02', '2015-01-03', ..., '2019-12-14', '2019-12-15']
Final output I am after: Balances by date with price by date and total value by date
date         asset    balance  price  value
2015-01-01   asset_1  10       5      50
2015-01-01   asset_2  0        10     0
...          (balances do not change as there are no new Ledger entries, but prices change)
2015-01-02   asset_1  10       6      60
2015-01-02   asset_2  0        11     0
...          (all dates between 2015-01-02 and 2017-10-15: no change in balance, but prices change)
2017-10-15   asset_1  10       20     200
2017-10-15   asset_2  15       30     450
...          (dates in between)
2018-02-09   asset_1  5        ...    (etc. based on price)
2018-02-09   asset_2  15       ...    (etc. based on price)
...          (dates in between)
2019-10-10   asset_1  25       ...    (etc. based on price)
2019-10-10   asset_2  18       ...    (etc. based on price)
...          (continues until the end of date_range)
I have managed to get this working, but it takes about a second to compute, and I ideally need it to be at least 10x faster if possible.
EDIT 2: Following ac2001's method:
ledger = (Ledger
.transaction
.filter(portfolio=p)
.annotate(transaction_date=F('date__date'))
.annotate(transaction_amount=Window(expression=Sum('amount'),
order_by=[F('asset').asc(), F('date').asc()],
partition_by=[F('asset')]))
.values('asset', 'transaction_date', 'transaction_amount'))
df = pd.DataFrame(list(ledger))
df.transaction_date = pd.to_datetime(df.transaction_date).dt.date
df.set_index('transaction_date', inplace=True)
df.sort_index(inplace=True)
df = df.groupby(by=['asset', 'transaction_date']).sum()
yields the following dataframe (with multiindex):
transaction_amount
asset transaction_date
asset_1 2015-01-01 10.0
2018-02-09 5.0
2019-10-10 25.0
asset_2 2017-10-15 15.0
2019-10-10 18.0
These balances are correct (and also yield correct results on more complex data), but now I need to find a way to ffill these results to all dates in between, as well as from the last date (2019-10-10) to today (2019-12-15), and I am not sure how that works given the multi-index.
Final solution
Thanks to @ac2001's code and pointers, I have come up with the following:
ledger = (Ledger
.objects
.annotate(transaction_date=F('date__date'))
.annotate(transaction_amount=Window(expression=Sum('amount'),
order_by=[F('asset').asc(), F('date').asc()],
partition_by=[F('asset')]))
.values('asset', 'transaction_date', 'transaction_amount'))
df = pd.DataFrame(list(ledger))
df.transaction_date = pd.to_datetime(df.transaction_date)
df.set_index('transaction_date', inplace=True)
df.sort_index(inplace=True)
df['date_cast'] = pd.to_datetime(df.index).dt.date
df_grouped = df.groupby(by=['asset', 'date_cast']).last()
df_unstacked = df_grouped.unstack('asset')
df_unstacked.index = pd.DatetimeIndex(df_unstacked.index)
df_unstacked = df_unstacked.reindex(idx)  # idx = pd.date_range(start_date, end_date) from above
df_unstacked = df_unstacked.ffill()
This gives me a matrix of assets by dates. I then get a matrix of prices by dates (from the database) and multiply the two matrices.
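For concreteness, a sketch of that final multiplication (idx is the pd.date_range built earlier; the pivot details are illustrative rather than the exact code I ran):
import pandas as pd
from django.db.models import F

# Build a (date x asset) price matrix, forward-filling missing prices.
prices = pd.DataFrame(list(
    HistoricalPrices.objects
    .annotate(price_date=F('date__date'))
    .values('asset', 'price_date', 'price')
))
price_matrix = prices.pivot(index='price_date', columns='asset', values='price')
price_matrix.index = pd.DatetimeIndex(price_matrix.index)
price_matrix = price_matrix.reindex(idx).ffill()

# Balances and prices now share the same shape, so the per-day value
# of each asset is an element-wise product.
values = df_unstacked['transaction_amount'] * price_matrix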
Thanks
I think this might take some back and forth. The best approach is to do it in a couple of steps.
Let's start with getting daily asset balances, and then we will merge the prices in. The transaction amount is a cumulative total. Does this look correct? I don't have your data, so it is a little difficult for me to tell.
ledger = (Ledger
.objects
.annotate(transaction_date=F('date__date'))
.annotate(transaction_amount=Window(expression=Sum('amount'),
order_by=[F('asset').asc(), F('date').asc()],
partition_by=[F('asset')]))
.values('asset', 'transaction_date', 'transaction_amount'))
df = pd.DataFrame(list(ledger))
df.transaction_date = pd.to_datetime(df.transaction_date)
df.set_index('transaction_date', inplace=True)
df = df.groupby('asset').resample('D').ffill()
df = df.reset_index()  # added this line
Then create a dataframe from HistoricalPrices and merge it with the ledger. You might have to adjust the merge criteria to ensure you are getting what you want, but I think this is the correct path.
ledger = df
prices = (HistoricalPrices
.objects
.annotate(transaction_date=F('date__date'))
.values('asset', 'price', 'transaction_date'))
prices = pd.DataFrame(list(prices))
result = ledger.merge(prices, how='left', on=['asset', 'transaction_date'])
Depending on how you are using the data later: if you need a list of dicts, which is a preferred format in Django templates, you can do that conversion with df.to_dict(orient='records').
If you want to group your Ledgers by date and then calculate the daily asset amount:
Ledger.objects.values('date__date').annotate(total_amount=Sum('amount'))
This should help.
Second edit: assuming you want to group them by asset as well:
Ledger.objects.values('date__date', 'asset').annotate(total_amount=Sum('amount'))
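With the sample ledger above, this yields one row per (day, asset) pair, e.g. {'date__date': datetime.date(2019, 10, 10), 'asset': 1, 'total_amount': 20.0} and {'date__date': datetime.date(2019, 10, 10), 'asset': 2, 'total_amount': 3.0}, where 'asset' holds foreign-key ids (the ids shown are illustrative). Note these are per-day totals, not the running balances the question ultimately needs.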

How to write CouchDB view to get currently active servers given start timestamp and end timestamp of each server?

I have a set of documents containing a server name plus the start and end timestamps for that server, e.g.:
[
{
serverName: "Houston",
startTimestamp: "2018/03/07 17:52:13 +000",
endTimestamp: "2018/03/07 18:50:10 +000"
},
{
serverName: "Canberra",
startTimestamp: "2018/03/07 18:48:09 +000",
endTimestamp: "2018/03/07 20:10:00 +000"
},
{
serverName: "Melbourne",
startTimestamp: "2018/03/08 01:43:13 +000",
endTimestamp: "2018/03/08 12:09:10 +000"
}
]
With this data, given a timestamp, I need to get the list of servers active at that point in time.
For example, for TS = "2018/03/07 18:50:00 +000", the list of active servers from the above data is ["Houston", "Canberra"].
Is it possible to achieve this using only CouchDB views? If so, how do I go about it?
Note: Initially I tried the following approach. In the map function I emit two rows per document:
one with key=doc.startTimestamp and value={"station_add": doc.station}
one with key=doc.endTimestamp and value={"station_rem": doc.station}
My intention was to iterate through these in the reduce function, adding stations present in "station_add" and removing stations in "station_rem". But I found that CouchDB does not guarantee anything about the ordering of values passed to the reduce function.
If you can live with fixed periods, and don't mind the extra disk space that might be needed for the view results, you can create a view of active servers per hour, for example.
Iterate over the periods between start and end, and emit the time that each server was online during each period:
function(doc) {
var start = new Date(doc.startTimestamp).getTime()
var end = new Date(doc.endTimestamp).getTime()
var msPerPeriod = 60*60*1000
var msOfflineInFirstPeriod = start % msPerPeriod
var firstPeriod = start - msOfflineInFirstPeriod
var msOnlineInLastPeriod = end % msPerPeriod
var lastPeriod = end - msOnlineInLastPeriod
if (firstPeriod === lastPeriod) {
// The server was only online within one period.
emit([new Date(firstPeriod), doc.serverName], [1, msOnlineInLastPeriod - msOfflineInFirstPeriod])
} else {
// The server was online over multiple periods.
emit([new Date(firstPeriod), doc.serverName], [1,msPerPeriod - msOfflineInFirstPeriod])
for (var period = firstPeriod + msPerPeriod; period < lastPeriod; period += msPerPeriod) {
emit([new Date(period), doc.serverName], [1, msPerPeriod])
}
emit([new Date(lastPeriod), doc.serverName], [1,msOnlineInLastPeriod])
}
}
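As a worked example with the sample data: the "Houston" document runs from 17:52:13 to 18:50:10, so with one-hour periods it emits two rows, [1, 467000] for the 17:00 period (7m47s online) and [1, 3010000] for the 18:00 period (50m10s online).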
If you want the total without the server names, just add a reduce function with the built-in shortcut _sum. You'll get the number of servers online during the period as the first number and the milliseconds that the servers were online in that period as the second number.
You can play with the view if you emit the year, month and day as the first keys. Then you can use the group_level at query time to get a finer or more coarse overview.
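For example, a day-level query might look like this (the database, design document and view names are placeholders, and the key arrays would be URL-encoded in practice):
GET /mydb/_design/servers/_view/online?group_level=3&startkey=[2018,3,1]&endkey=[2018,3,31,{}]
With group_level=3, rows are grouped by [year, month, day] and the [count, milliseconds] value pairs are summed per day.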
Bear in mind that this view might get large on disk, as each row has to be stored, along with the intermediate results for each group level. So you shouldn't make the period duration too small; emitting a row for each second, for example, would take a lot of disk space.

cloudant index: count number of unique users per time period

A very similar post was made about this issue here. In cloudant, I have a document structure storing when users access an application, that looks like the following:
{"username":"one","timestamp":"2015-10-07T15:04:46Z"}---| same day
{"username":"one","timestamp":"2015-10-07T19:22:00Z"}---^
{"username":"one","timestamp":"2015-10-25T04:22:00Z"}
{"username":"two","timestamp":"2015-10-07T19:22:00Z"}
What I want is to count the number of unique users for a given time period. E.g.:
2015-10-07 = {"count": 2} (two different users accessed on 2015-10-07)
2015-10-25 = {"count": 1} (one different user accessed on 2015-10-25)
2015 = {"count": 2} (two different users accessed in 2015)
This becomes tricky because, for example, on 2015-10-07 username "one" has two access records, but they should only add 1 to the unique-user total.
I've tried:
function(doc) {
  var time = new Date(Date.parse(doc['timestamp']));
  // Note: getUTCDay() returns the day of the week (getUTCDate() is the
  // day of the month), and getUTCMonth() is zero-based.
  emit([time.getUTCFullYear(),time.getUTCMonth(),time.getUTCDay(),doc.username], 1);
}
This suffers from several issues, which are highlighted by Jesus Alva, who commented on the post linked above.
Thanks!
There's probably a better way of doing this, but off the top of my head ...
You could try emitting an index entry for each level of granularity:
function(doc) {
var time = new Date(Date.parse(doc['timestamp']));
var year = time.getUTCFullYear();
var month = time.getUTCMonth()+1;
var day = time.getUTCDate();
// day granularity
emit([year,month,day,doc.username], null);
// year granularity
emit([year,doc.username], null);
}
// reduce function - `_count`
Day query (2015-10-07):
inclusive_end=true&
start_key=[2015, 10, 7, "\u0000"]&
end_key=[2015, 10, 7, "\uefff"]&
reduce=true&
group=true
Day query result - your application code would count the number of rows:
{"rows":[
{"key":[2015,10,7,"one"],"value":2},
{"key":[2015,10,7,"two"],"value":1}
]}
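Counting those two rows gives {"count": 2} unique users for 2015-10-07, which matches the expected output; note the per-row values (2 and 1) are document counts per user, not what you count.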
Year query:
inclusive_end=true&
start_key=[2015, "\u0000"]&
end_key=[2015, "\uefff"]&
reduce=true&
group=true
Query result - your application code would count the number of rows:
{"rows":[
{"key":[2015,"one"],"value":3},
{"key":[2015,"two"],"value":1}
]}