Query from Power BI to Application Insights suddenly fails with (502): Bad Gateway

In Power BI Desktop we use a Power Query (M) script to get data from Application Insights Analytics. We published the report to Power BI online, configured with a daily refresh. It worked fine until the refresh started failing on 25-1-2017 (UTC).
The error we get is:
DataSource.Error: Web.Contents failed to get contents from '.....' (502): Bad Gateway
This is the complete error:
DataSource.Error: Web.Contents failed to get contents from 'https://management.azure.com/subscriptions/<subscriptionId>/resourcegroups/fps.fsa/providers/microsoft.insights/components/4PS%20Field%20Service%20iOS%20-%20iOS/api/query?api-version=2014-12-01-preview&csl=customEvents%0A%7C%20where%20timestamp%20%3E%20ago%2830d%29%0A%7C%20order%20by%20timestamp%20desc%0A%7C%20extend%20dimensionUserId%20%3D%20tostring%28customDimensions.%5B%27userId%27%5D%29%0A%7C%20extend%20dimensionHost%20%3D%20tostring%28customDimensions.%5B%27url%27%5D%29%0A%7C%20extend%20measurementQuantity%20%3D%20iff%28%20isnotempty%28customMeasurements.%5B%27value%27%5D%29%2C%20todouble%28customMeasurements.%5B%27value%27%5D%29%2C%200.0%29%0A%7C%20extend%20measurementKey%20%3D%20tostring%28customDimensions.%5B%27key%27%5D%29%0A%7C%20extend%20platform%20%3D%20%27iOS%27%0A&x-ms-app=AAPBI' (502): Bad Gateway
Details:
DataSourceKind=Web
DataSourcePath=https://management.azure.com/subscriptions/<subscriptionId>/resourcegroups/fps.fsa/providers/microsoft.insights/components/4PS%20Field%20Service%20iOS%20-%20iOS/api/query
Url=https://management.azure.com/subscriptions/<subscriptionId>/resourcegroups/fps.fsa/providers/microsoft.insights/components/4PS%20Field%20Service%20iOS%20-%20iOS/api/query?api-version=2014-12-01-preview&csl=customEvents%0A%7C%20where%20timestamp%20%3E%20ago%2830d%29%0A%7C%20order%20by%20timestamp%20desc%0A%7C%20extend%20dimensionUserId%20%3D%20tostring%28customDimensions.%5B%27userId%27%5D%29%0A%7C%20extend%20dimensionHost%20%3D%20tostring%28customDimensions.%5B%27url%27%5D%29%0A%7C%20extend%20measurementQuantity%20%3D%20iff%28%20isnotempty%28customMeasurements.%5B%27value%27%5D%29%2C%20todouble%28customMeasurements.%5B%27value%27%5D%29%2C%200.0%29%0A%7C%20extend%20measurementKey%20%3D%20tostring%28customDimensions.%5B%27key%27%5D%29%0A%7C%20extend%20platform%20%3D%20%27iOS%27%0A&x-ms-app=AAPBI
Does anyone know how to solve this?

The 502 Bad Gateway message is usually due to the Analytics query returning too much data. The gateway is limited to 8 MB of data, period.
Example
You created a dashboard in December 2016 that gave you all requests from the start of the month, and you use the Power BI dashboard to calculate some metrics from the raw results with a query like the one below. Now it's January 2017 and it's failing.
requests | where timestamp > datetime(2016-12-01)
The fix
Determine how many days you really care about. If your intention was to get all requests and their timings from the first of the month, you can cut out a lot of extra data by limiting the time range to the current month AND by projecting only the columns you need:
requests | where timestamp > startofmonth(now()) | project name, duration
Another fix
Assuming you are calculating things like averages and percentiles, you could also just have Analytics do this for you and have Power BI merely display the results:
requests | where timestamp > startofmonth(now()) | summarize count(), avg(duration), min(duration), max(duration), stdev(duration), percentiles(duration, 50, 75, 90, 95, 99) by name
More Examples
You might want a meaningful graph, so you could split up the aggregations by time period. This still returns a fair amount of data, but far less than you'd get by querying the raw data points.
By Day
requests | where timestamp > startofmonth(now()) | summarize count(), avg(duration), min(duration), max(duration), stdev(duration), percentiles(duration, 50, 75, 90, 95, 99) by bin(timestamp, 1d), name
By Hour
requests | where timestamp > startofmonth(now()) | summarize count(), avg(duration), min(duration), max(duration), stdev(duration), percentiles(duration, 50, 75, 90, 95, 99) by bin(timestamp, 1h), name
I've only given a few examples, and you'd have to make sure they fit the intention of your dashboard.

The recommendation, as James suggested, is to limit the result size. If you still need to return a larger data set, you can work directly with the AI API instead of ARM.
1) You will need to create an API Key, see https://dev.applicationinsights.io/documentation/Authorization/API-key-and-App-ID
2) Next you will need to update the Power BI M script that you exported from Analytics, replacing the ARM URL with the AI API one:
Replace ARM call:
.....
Source =
Json.Document(Web.Contents("https://management.azure.com/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourcegroups/<RESOURCE GROUP>/providers/microsoft.insights/components/<APP NAME>/api/query?api-version=2014-12-01-preview",
[Query=[#"csl"="requests",#"x-ms-app"="AAPBI"],Timeout=#duration(0,0,4,0)])),.....
With AI API call:
.....
Source =
Json.Document(Web.Contents("https://api.applicationinsights.io/beta/apps/<APPLICATION_ID>/query?api-version=2014-12-01-preview",
[Query=[#"csl"="requests",#"x-ms-app"="AAPBI"],Timeout=#duration(0,0,4,0)])),.....
3) Finally, update the credentials to Basic and use your API key.
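If you prefer to pass the key in the request itself rather than through the credentials dialog, the AI API also accepts it in an x-api-key request header. A minimal sketch in M, assuming hypothetical <APPLICATION_ID> and <YOUR_API_KEY> placeholders that you substitute with your own values:
let
    // The x-api-key header authenticates directly against the AI API,
    // so this call needs no separate credential configuration.
    Source = Json.Document(Web.Contents(
        "https://api.applicationinsights.io/beta/apps/<APPLICATION_ID>/query?api-version=2014-12-01-preview",
        [Query = [#"csl" = "requests", #"x-ms-app" = "AAPBI"],
         Headers = [#"x-api-key" = "<YOUR_API_KEY>"],  // placeholder, not a real key
         Timeout = #duration(0, 0, 4, 0)]))
in
    Source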

Related

AWS CloudWatch: visualise multiple time series

I would like to visualise how event occurrences change over time.
Use case:
Let's say my logs contain 2 types of events (eventA, eventB).
I'm interested in a line graph that shows the number of events per hour. (line#1: dataA1, dataA2...; line#2: dataB1, dataB2...)
What I'm aware of:
Query the logs: fields #timestamp, eventName | stats count() by bin(1h), eventName | sort bin(1h) asc
The above query gives all the data needed for the desired graph (e.g. [bin(1h)], [count()], [eventName]).
If I remove the eventName field from display, I get a log table with the correct data, but the line graph shows the data points of both events mixed into one series (e.g. dataA1, dataA2, dataB3, dataA4, dataB5).
The question:
Is it possible to generate a line graph with multiple series in it?
If yes, what parametrization do I need?
See https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_Insights-Visualizing-Log-Data.html
Visualizing time series data
Time series visualizations work for queries with the following characteristics:
The query contains one or more aggregation functions. For more information, see Aggregation Functions in the Stats Command.
The query uses the bin() function to group the data by one field.
These queries can produce line charts, stacked area charts, bar charts, and pie charts.
You can't use a line chart for your example because only a single bin() grouping can be used to produce a time series. You can, however, use e.g. a pie chart for your use case.
Alternatively, if applicable to your use case, you can start producing logs in a different format, such as:
{
  "eventA": 1,
  "eventB": 0
}
Then you can write the query as:
stats sum(eventA), sum(eventB) by bin(1h)
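For completeness, a slightly fuller sketch of that query; the eventA/eventB field names are assumptions taken from the reformatted log example above, and the as aliases should become the series labels in the chart legend:
fields @timestamp
| filter ispresent(eventA) or ispresent(eventB)
| stats sum(eventA) as eventA_per_hour, sum(eventB) as eventB_per_hour by bin(1h)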

Retroactive calculation of VM instance total running time (GCP)

I have a number of instances in a GCP project, and I want to check retroactively how long they've been in use in the last 30 days, i.e. sum the total time each instance was not stopped or terminated during a specific month.
Does anyone know if this can be calculated, and if so - how?
Or maybe another idea that would allow me to sum the total time an instance was in use?
Based on this other post, I would recommend something like:
fetch gce_instance
| metric 'compute.googleapis.com/instance/uptime'
| align delta(30d)
| every 30d
| group_by [metric.instance_name]
I would also consider creating uptime checks, as one of the answers in said post recommends, for future monitoring, but those wouldn't work retroactively. If you need more info about MQL, you can find it here.

Visualize time values over days in QuickSight

I have an event dataset in QuickSight, where each record has a timestamp field as following:
last_day_record_ts |
-------------------|
2020-01-19 05:46:55|
2020-01-20 05:55:37|
2020-01-21 06:00:12|
2020-01-22 06:12:57|
2020-01-23 06:02:15|
2020-01-24 06:15:35|
2020-01-25 06:20:05|
2020-01-26 05:55:48|
I want to build a visualization of the time values over days as a line chart.
However, I find it difficult to get this in AWS QuickSight. Any ideas?
Instead of the desired result, QuickSight persistently gives just aggregated record counts (i.e. 1 for each day), not the time values themselves...
UPDATE. The workaround I found for now is to add calculated fields to the data set in order to get numeric values instead of timestamp ones.
Calculated fields:
day_midnight | truncDate('DD',{last_day_record_ts})
time_diff_in_hours_dec | abs(dateDiff({last_day_record_ts},{day_midnight},"MI")) / 60
time_diff_in_hours_int | decimalToInt({time_diff_in_hours_dec})
time_diff_in_min | ({time_diff_in_hours_dec} - {time_diff_in_hours_int}) * 60
The only problem I still cannot solve is getting the Y-axis labels in HH:MM format (as in the green rectangle in my screenshot). For now, they're numeric decimals...
Unfortunately (after many attempts of my own), this type of visual does not appear to be possible in QuickSight at the time of writing.
QuickSight has many nice features, but it's still missing some (very basic, imo) things, which makes it limiting for anyone working with data outside the expected use cases.

Why can't Amazon Forecast train the predictor?

While training my predictor I came across this error and got stuck on how to fix it.
I have two data series: a "target time series" with 9234 rows and a single item_id, and a "related time series" with the same number of rows, as I only have a single id.
I'm setting the window to 180 days, which is exactly the difference between the two numbers that appear in the error: 9414 - 9234 = 180.
We were unable to train your predictor.
Please ensure there are no missing values for any items in the related time series, All items need data until 2020-03-15 00:00:00.0. For example, following items have missing data: item: brl only has 9234/9414 required datapoints starting 1994-06-07 00:00:00.0, please refer to documentation for additional details.
Since my data has no missing values and is on a daily basis, why is it returning this error?
My data starts on 1994-06-07 and ends on 2019-09-17. Why should I have 9414 data points rather than 9234?
Should I take 180 days out of my target time series data?
The future values of the related time-series data must be known.
Example of a good related time series: you know the past and future days on which marketing has sent or will send email newsletters promoting the product you're forecasting. You can use this data as a related time series.
Example of a bad related time series: you notice that Google searches for your brand correlate with sales of your product, so you want to use search volume as a related time series. Since you don't know how many searches will occur in the future, you can't use it.
In your case, you have TARGET_TIME_SERIES data for 9234 days and you want to predict demand for the next 180 days. That means your RELATED_TIME_SERIES data should cover 9234 + 180 = 9414 days, which is exactly the number the error asks for.
Edit: I have not tested this with Amazon's forecasting product. I'm basing my answer on working with Facebook Prophet (which is one of the models Amazon Forecast uses). Please let me know if my solution worked.

Today's Google Trends data for a specific query

I fetch the necessary data from Google Trends using the following URL:
http://www.google.com/trends/fetchComponent?q=doctor&cid=TIMESERIES_GRAPH_0&export=3&date=4/2013+3m&hl=en-US
But the output of Google Trends does not contain data for yesterday and today (not even extrapolated from the past hours). For example, the above-mentioned URL on 2013-04-28 returned JavaScript code with the following fragment:
...,
{"c":[{"v":new Date(2013,3,26),"f":"Friday, April 26, 2013"},{"v":65.0,"f":"65"}]},
{"c":[{"v":new Date(2013,3,27),"f":"Saturday, April 27, 2013"},{"v":null}]},
{"c":[{"v":new Date(2013,3,28),"f":"Sunday, April 28, 2013"},{"v":null}]},
...
Notice the null values for April 27 and 28.
But, as we know, "Hot Trends" are available with hourly granularity. That implies Google possesses enough data to give us the "complete answer" even for a trends request about a specific query, not only for the "hottest" ones.
Does anybody know a hack to fetch up-to-date daily data from Google Trends for a specific query? Or maybe a workaround to derive similar trends data from other sources?