How to get average over time only when PC is up in Prometheus - wmi

I'm currently trying to get the average amount of free RAM in the last week on a Windows PC, using the WMI Exporter (https://github.com/martinlindhe/wmi_exporter) and Prometheus with the following query :
avg_over_time(wmi_os_physical_memory_free_bytes{instance="foo"}[1w])
The query inside "avg_over_time" is working correctly, but the problem is the PC is not up 24/7. Since I can't have holes in a range, I was considering the idea of using a recording rules.
The other problem is that the PC doesn't start and stop at known values, so I can't use the following solution : How to get the average over time only during the day in Prometheus because I can't tell the time of the day I need to start collecting.
Is there any recording rule that could concatenate all the information gathered only during the up time of the pc ?
Thanks in advance.

The above expression already does what you want, as there'll be no data for the periods when the target couldn't be scraped and avg_over_time doesn't try to do anything fancy with gaps.

Related

Powerquery & Zabbix API - DataSource.Error (500) Internal Server Error when asking for much data

I am using Powerquery to fetch the data from Zabbix using their API. It works fine when I fetch the data for some days, but as I increase the period and the amount of data surpasses the millions of rows I just get the error below after some time waiting, and the query doesn't return anything.
I am using Web.contents to get the data as follows:
I have added that timeout as you can see above but the error just happens much before 5 minutes have passed. How should I solve this? Are there ways to fetch large amounts of data in power query in parts, without being all at once? Or does this error happen because of connection parameters inherent to zabbix configs.?
My team changed all the possible parameters regarding server memory and nothing seemed to have worked. One thing to notice is that, although power query seems to face the same error (500) internal server error if I get data for a period of 3 days or 30 days, for the first case it shows the error much faster while in the last case it takes much more time and eventually gets to the same error.
Thanks!
It's a PHP memory limit hit, you should modify the maximum memory.
For example, in an Apache standard setup you should edit /etc/httpd/conf.d/zabbix.conf and modify the php_value memory_limit to a greater value (restart apache!).
The default is 128M, the "right" setting depends on the memory available on the system and the maximum data size you want to get.

Google AutoML Importing text items very slow

I'm importing text items to Google's AutoML. Each row contains around 5000 characters and I'm adding 70K of these rows. This is a multi-label data set. There is no progress bar or indication of how long this process will take. Its been running for a couple of hours. Is there any way to calculate time remaining or total estimated time. I'd like to add additional data sets, but I'm worried that this will be a very long process before the training even begins. Any sort of formula to create even a semi-wild guess would be great.
-Thanks!
I don't think that's possible today, but I filed a feature request [1] that you can follow for updates. I asked for both training and importing data, as for training it could be useful too.
I tried training with 50K records (~ 300 bytes/record) and the load took more than 20 mins after which I killed it. I retried with 1K, which ran for 20 mins and then emailed me an error message saying I had multiple labels per input (yes, so what? training data is going to have some of those) and I had >100 labels. I simplified the classification buckets and re-ran. It took another 20 mins and was successful. Then I ran 'training' which took 3 hours and billed me $11. That maps to $550 for 50K recs, assuming linear behavior. The prediction results were not bad for a first pass, but I got the feeling that it is throwing a super large neural net at the problem. Would help if they said what NN it was and its dimensions. They do say "beta" :)
don't wast your time trying to using google for text classification. I am a GCP hard user but microsoft LUIS is far better, precise and so much faster that I can't believe that both products are trying to solve same problem.
Luis has a much better documentation, support more languages, has a much better test interface, way faster.. I don't know if is cheaper yet because the pricing model is different but we are willing to pay more.

How do I get the most recent report data?

I'm trying to build a tool that collects a few data points from a user usage report with
https://www.googleapis.com/admin/reports/v1/usage/{user}/all/dates/{yyyy-mm-dd}
Since the data is delayed - how do I get the most recent report? If I were to query today's (2013-11-22) date I would get something like:
Data for dates later than 2013-11-19 is not yet available. Please check back later
Is there a set number of days/hours for reports to be available - or do I have to trial and error backwards until I get a successful response?
I believe there is a delay of about 48 hours for the reports as of right now. However, if Google is able to improve on that, you'll want your app to be able to take advantage of those improvements without any changes needed.
I suggest you make a first attempt using today's date. When that fails, parse the error response to grab the last date report data is available for and use that value. This way you're always making only 2 max attempts and if Google improves the delay to 24 hours or even less, your app is able to take immediate advantage of that change.

Amount of Test Data needed for load testing of a web service

I am currently working on a project that requires load testing of web services.
One of the services is being called 60,000 times in the production during Busy-Day/Busy-HR.
{PerfTest Env=PROD}
Input Account Number
Output AccountDetails
Do I really need 60,000 unique account numbers(TEST DATA) for this loadrunner script to simulate the production scenario?
If unique data is required, for endurance test I will have to prepare lot of test data for each web service.
If I don't get that much test data, what is the chance of Load Test being affected due to Application Server Cache mechanism??
Can somebody help me?
Thanks
Ram
Are you simulating a day or the highest volume hour in the last year? This can help you to shape the amount of data that you need. Rarely would you start with a 24 hour test. Instead you would be looking at your high water test of an hour with a ramp up and ramp down, so you would need approximately 1.333* your high water hour's worth of data.
So this can drop your 60K to (potentially) 20K(?) I am making an assumption that your worst hour over the last year is somewhere around 1/3 of your traditional day. I have observed this pattern over and over again in different environments over the past two decades. You will want to objectively verify this with log data or query data to support the number in your environment.
Next up, how many of these inquiries are actually unique? You are really going to need a log of the queries across a day (or your high water hour) to determine this. Log processing tools such as Microsoft Logparser or Splunk/Splunk Storm can help you to pull the observed distribution of unique account references within your data, including counts of those which are multiple. Once you know this you can simply use a data file with a fixed block size for each user for unique data and once the data is exhausted the user exits.

SimpleDB Incremental Index

I understand SimpleDB doesn't have an auto increment but I am working on a script where I need to query the database by sending the id of the last record I've already pulled and pull all subsequent records. In a normal SQL fashion if there were 6200 records I already have 6100 of them when I run the script I query records with an ID greater than > 6100. Looking at the response object, I don't see anything I can use. It just seems like there should be a sequential index there. The other option I was thinking would be a real time stamp. Any ideas are much appreciated.
Using a timestamp was perfect for what I needed to do. I followed this article to help me on my way:http://aws.amazon.com/articles/1232 I would still welcome if anyone knows if there is a way to get an incremental index number.