How do I query Prometheus for the timeseries that was updated last? - concurrency

I have 100 instances of a service that use one database. I want them to export a Prometheus metric with the number of rows in a specific table of this database.
To avoid hitting the database with 100 queries at the same time, I periodically elect one of the instances to do the measurement and set a Prometheus gauge to the number obtained. Different instances may be elected at different times. Thus, each of the 100 instances may have its own value of the gauge, but only one of them is “current” at any given time.
What is the best way to pick only this “current” value from the 100 gauges?
My first idea was to export two gauges from each instance: the actual measurement and its timestamp. Then perhaps I could take max(timestamp) and combine it with the actual metric using the and operator. But I can’t figure out how to do this in PromQL, because max erases the instance label that I would need to match on with and on(instance).
My second idea was to reset the gauge to −1 (some sentinel value) at some time after the measurement. But this looks brittle, because if I don’t synchronize everything tightly, the “current” gauge could be reset before or after the “new” one is set, causing gaps or overlaps. Similar considerations go for explicitly deleting the metric and for exporting it with an explicit timestamp (to induce staleness).

I figured out the first idea (not tested yet):
avg(my_rows_count and on(instance) topk(1, my_rows_count_timestamp))
avg could as well be max or min, it only serves to erase instance from the final result.
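To make the intent of that expression concrete, here is a minimal Python sketch (not PromQL) that mimics the selection on made-up data: find the instance with the largest timestamp gauge, then keep only that instance's row count. The instance names and numbers are invented for illustration.

```python
# Hypothetical samples: instance -> (row count gauge, timestamp gauge).
samples = {
    "instance-1": (1200, 1700000000),
    "instance-2": (1250, 1700000600),  # measured most recently
    "instance-3": (1100, 1699999400),
}


def current_row_count(samples):
    """Return the gauge value from the instance with the newest timestamp."""
    # In the spirit of topk(1, my_rows_count_timestamp): find the instance
    # whose timestamp gauge is the largest.
    newest_instance = max(samples, key=lambda name: samples[name][1])
    # In the spirit of "my_rows_count and on(instance) ...": keep only the
    # row count that belongs to that instance.
    return samples[newest_instance][0]


print(current_row_count(samples))
```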

last_over_time should do the trick
last_over_time(my_rows_count[1m])
given only one of them is “current” at any given time, like you said.

Related

How to conditionally execute a SET operation in DynamoDB

I have an aggregations table in DynamoDb with the following columns: id, sum, count, max, min, and hash. I will ALWAYS want to update sum and count but will want to update min and max only when I have values greater than/lesser than the values already in the database. Also, I only want this operation to succeed when the stored hash is different from what I am sending, to prevent reprocessing the same data.
I currently have these:
UpdateExpression: ADD sum :sum, count :count SET hash = :hash
UpdateCondition: attribute_not_exists(hash) OR hash <> :hash
The thing is that I need something like this for min and max:
SET min :min IF :min < min and something alike for max. Of course, this doesn't currently work. I could not find a suitable update function that would perform this comparison in DynamoDB. What is the proper way to achieve this?
PS.: It was already suggested that I make multiple requests to DynamoDB and put the max/min checks in the update conditions, but I want to avoid this multiple-request approach for data consistency reasons.
PS2.: Another way to express what I want, in a JavaScript-ish way, would be something like SET :min < min ? :min : min
I got to a solution to this problem by realizing that what I wanted was just not possible. There must be just one condition to the entire update and since there is no such thing as SET min = minimum(:min, min) I had to accept my fate and make more than one UpdateItem request to DynamoDB.
The nice thing is that the order of execution of these updates doesn't matter. The hard thing here is to make sure that each update is executed exactly once. Because we are firing a lot of requests (and having peaks eventually) there is a real chance of some failing updates due to ProvisionedThroughputExceededException or maybe just some rate limiting from AWS.
So here is my final solution:
Lambda function receives payload with hundreds of data points.
Lambda function aggregates these data points in memory and produces an intermediary aggregation object of the form {id, sum, count, min, max}.
Lambda function generates 3 update objects per aggregation object, of the forms (these updates are referring to the same record):
{UpdateExpression: 'ADD #SUM :sum, #COUNT :count'}
{ConditionExpression: '#MAX < :max OR attribute_not_exists(#MAX)', UpdateExpression: 'SET #MAX = :max'}
{ConditionExpression: '#MIN > :min OR attribute_not_exists(#MIN)', UpdateExpression: 'SET #MIN = :min'}
Because we need to be 100% sure that these updates will always be processed successfully, the lambda function sends them to a FIFO SQS queue (as 3 separate messages). I am not using a FIFO queue here because I want the order to be preserved, but because I want the guarantee of exactly-once delivery.
A consumer keeps polling the queue and whenever there are messages it just shoots them to DynamoDB as the parameter of .updateItem.
At the end of this process, I was able to do real-time aggregations for thousands of records :)
PS.: Got rid of the hash column
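The three update objects described in this answer could be generated along the following lines. This is only a sketch in Python rather than the JavaScript the answer implies: build_updates is a hypothetical helper, the parameter dicts are shaped the way boto3's Table.update_item accepts them, and the item key is assumed to be a single id attribute.

```python
def build_updates(agg):
    """Build the three update parameter dicts for one aggregation object.

    `agg` is assumed to look like {id, sum, count, min, max}. Values are
    kept as ints here; boto3 expects Decimal rather than float for
    non-integer numbers.
    """
    key = {"id": agg["id"]}
    return [
        {   # Always applied: accumulate sum and count.
            "Key": key,
            "UpdateExpression": "ADD #SUM :sum, #COUNT :count",
            "ExpressionAttributeNames": {"#SUM": "sum", "#COUNT": "count"},
            "ExpressionAttributeValues": {":sum": agg["sum"], ":count": agg["count"]},
        },
        {   # Applied only if the new max is larger (or no max exists yet).
            "Key": key,
            "ConditionExpression": "#MAX < :max OR attribute_not_exists(#MAX)",
            "UpdateExpression": "SET #MAX = :max",
            "ExpressionAttributeNames": {"#MAX": "max"},
            "ExpressionAttributeValues": {":max": agg["max"]},
        },
        {   # Applied only if the new min is smaller (or no min exists yet).
            "Key": key,
            "ConditionExpression": "#MIN > :min OR attribute_not_exists(#MIN)",
            "UpdateExpression": "SET #MIN = :min",
            "ExpressionAttributeNames": {"#MIN": "min"},
            "ExpressionAttributeValues": {":min": agg["min"]},
        },
    ]


updates = build_updates({"id": "sensor-1", "sum": 42, "count": 7, "min": 1, "max": 13})
for params in updates:
    print(params["UpdateExpression"])
```

Note that when the min or max condition fails, DynamoDB rejects that update with a conditional check failure; a consumer would presumably treat that as "nothing to do" rather than as an error worth retrying.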
It is not possible to do this in a single update since UpdateExpression doesn't support functions like max() and min(). The documentation for supported operations and functions can be found here
The best way to achieve the same effect is to add a field called latest or something similar which stores the latest value. You will need to change your update expression to be something like the following.
UpdateExpression: SET hash = :hash, latest = :latest, sum = sum + :latest, count = count + :num
Where :hash is of course your update hash to guard against replays, :latest is the latest value, and :num is 1 or whatever your increment is.
Then you can use DynamoDB Streams with a Lambda that looks at each update and checks if latest is less than min or greater than max. If not, ignore the update, otherwise perform a second update to set min or max to the latest value accordingly.
The main drawback to this approach is that there will be a small window where latest might be outside of the range of min or max. However, this can be normalized easily in your application code when you read the records.
You should also consider the additional cost that will result from the DynamoDB Stream and Lambda invocations.
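The check that the stream-triggered Lambda performs might look roughly like this. It is a sketch under assumptions: fields_to_update is a hypothetical helper, and the item is assumed to be the already-deserialized new image of the record (plain Python numbers), with min and max possibly absent on a fresh item.

```python
def fields_to_update(item):
    """Return which of 'min'/'max' need a follow-up update for this item."""
    latest = item["latest"]
    fields = []
    # A missing attribute is treated like "no bound yet", so latest wins.
    if "min" not in item or latest < item["min"]:
        fields.append("min")
    if "max" not in item or latest > item["max"]:
        fields.append("max")
    return fields


# latest is inside the current range: nothing to do.
print(fields_to_update({"latest": 5, "min": 1, "max": 9}))
# latest is a new maximum: a second update should set max.
print(fields_to_update({"latest": 12, "min": 1, "max": 9}))
```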
I had a similar situation where I needed to atomically update a min value, and ended up doing this:
Let each item have an attribute of type Set (NS) keeping the candidate values for the minvalue, and when you want to set a new value that might be the new min, just add it to the set. Then at read time, find the lowest number in the set on the client side.
This is atomic and requires no condition expression, but has the downside that the set grows over time, so I added a clean up request to run as needed, for example when the set has more than N values, or simply on every get. The clean up might need to use a condition expression to be concurrent safe though, depending on if you also remove values through other use cases.
This does not solve all scenarios, but worked for me. In my case the value was a timestamp of an event in the future, and I wanted to store when the next event occurs. I could then easily also clean up by removing all values in the past.
Summary:
Set new potentially minimum value: ADD #values :value.
Read minimum value: GetItem followed by finding the lowest value in values client-side. This could if needed be combined with a clean up that finds all obsolete values, then calls UpdateItem DELETE #values [x, y, z...]
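The pattern in this summary can be simulated in memory as follows. This sketch does not talk to DynamoDB: the MinCandidates class is a hypothetical stand-in for one item with a number-set attribute, and the clean-up rule (drop values in the past) is the one from the answer's timestamp use case.

```python
class MinCandidates:
    """In-memory stand-in for an item holding a number set named 'values'."""

    def __init__(self):
        self.values = set()

    def add(self, value):
        # Mirrors "ADD #values :value": adding to a set needs no condition.
        self.values.add(value)

    def read_min(self):
        # Mirrors GetItem followed by a client-side minimum.
        return min(self.values) if self.values else None

    def clean_up(self, now):
        # Mirrors "DELETE #values [x, y, z...]" for values that are obsolete,
        # here assumed to mean timestamps earlier than `now`.
        obsolete = {v for v in self.values if v < now}
        self.values -= obsolete
        return obsolete


item = MinCandidates()
for timestamp in (300, 100, 200):
    item.add(timestamp)
print(item.read_min())
item.clean_up(now=150)
print(item.read_min())
```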

DynamoDB Scan/Query Return x Number of Items

If I scan or query in DynamoDB it is possible to set the Limit property. The DynamoDB documentation says the following:
The maximum number of items to evaluate (not necessarily the number of matching items).
So the problem with this is if you set filters and such it won't return all the items.
My goal is to have a filter in a scan or query, but have it return x number of items, no matter what. I'm ok with having to use LastEvaluatedKey and make multiple requests, but I would like to make it as seamless and easy as possible (so not doing that would be best).
The only way I have thought to do this is to set the Limit property to say 1 or something. Then just keep scanning or querying using the LastEvaluatedKey until I reach that x number of items I'm looking for. Problem is, this seems VERY wasteful and inefficient. I mean if you have a table of millions of records you might have to make thousands and thousands of requests. It doesn't seem like it scales very well. Of course I'm sure it's no different than what DynamoDB would be doing behind the scenes.
But is there a way to do this more efficiently where I can reduce the number of requests I have to make? Or is that the only way to achieve this?
How would you achieve this goal?
A single Query operation will read up to the maximum number of items set (if using the Limit parameter) or a maximum of 1 MB of data and then apply any filtering to the results using FilterExpression.
You're 100% right that Limit is applied before FilterExpression. This means Dynamo might return fewer documents than the Limit while other documents that satisfy the FilterExpression still exist in the table but aren't returned.
It sounds like it would be unacceptable for your API to behave in the same manner. That is going to mean that in some cases a single request to your service will result in multiple requests to Dynamo. Also, keep in mind that there is no way to predict what the LastEvaluatedKey will be, which would be required to parallelize these requests. So in the case that your service makes multiple requests to Dynamo, they will be serial. To me, this is a rather heavy tradeoff, but if it is a requirement that you satisfy the Limit whenever possible, you have options.
First, Dynamo will automatically page at 1 MB. That means you could simply send your query to Dynamo without a Limit and implement the Limit on your end. You may still need to make multiple requests to ensure that you've satisfied the Limit, but this approach will result in the fewest requests to Dynamo. The tradeoff here is the total data being read and transferred. Chances are your Limit will not happen to line up perfectly with the 1 MB limit, which means the excess data being read, filtered, and transferred is wasted.
You already mentioned the other extreme of sending a Limit of 1 and pointed out that it will result in the maximum number of requests to Dynamo.
Another approach along these lines is to create some sort of probabilistic function that takes the Limit given to your service by the client and computes a new Limit for Dynamo. For example, suppose your FilterExpression filters out about half of the documents in the table. That means you can multiply the client Limit by 2 and that would be a reasonable Limit to send to Dynamo. Of the approaches we've talked about so far, this one has the highest potential for efficiency; however, it also has the highest potential for complexity. For example, you might find that using a simple linear function is not good enough and instead you need to use machine learning to find a multi-variate non-linear function to calculate the new Limit. This approach also heavily depends on the uniformity of your data in Dynamo as well as your access patterns. Again, you might need machine learning to optimize for those variables.
In any of the cases where you are implementing the Limit on your end, if you plan on sending back the LastEvaluatedKey to the client for subsequent calls to your service, you will also need to take care to keep track of the LastEvaluatedKey that you evaluated. You will no longer be able to rely on the LastEvaluatedKey returned from Dynamo.
The final approach would be to reorganize/regroup your data either with a GSI, a separate table that you keep in sync using Dynamo Streams or a different schema altogether with the goal of not requiring a FilterExpression.
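The "implement the Limit on your end" approach, including keeping track of your own last evaluated key, can be sketched like this. Everything here is illustrative: make_fetch_page fakes a paged Query/Scan over an in-memory list (using the item's position as its key), and collect is a hypothetical helper, not part of any AWS SDK.

```python
def make_fetch_page(table, page_size):
    """Fake paged read: returns raw (unfiltered) items plus a LastEvaluatedKey."""
    def fetch_page(start_key=None):
        start = 0 if start_key is None else start_key + 1
        items = table[start:start + page_size]
        end = start + len(items) - 1
        # No LastEvaluatedKey once the end of the table has been reached.
        last_key = end if end < len(table) - 1 else None
        return {"Items": items, "LastEvaluatedKey": last_key}
    return fetch_page


def collect(fetch_page, predicate, limit, start_key=None):
    """Page serially until `limit` matching items are found or data runs out."""
    results = []
    cursor = start_key
    while True:
        page = fetch_page(cursor)
        for item in page["Items"]:
            cursor = item["id"]  # our own key: the last item actually evaluated
            if predicate(item):
                results.append(item)
                if len(results) == limit:
                    # Stop mid-page and hand back our key, not Dynamo's.
                    return results, cursor
        if page["LastEvaluatedKey"] is None:
            return results, None
        cursor = page["LastEvaluatedKey"]


table = [{"id": i, "even": i % 2 == 0} for i in range(20)]
fetch_page = make_fetch_page(table, page_size=4)
first, key = collect(fetch_page, lambda item: item["even"], limit=3)
second, key = collect(fetch_page, lambda item: item["even"], limit=3, start_key=key)
print([item["id"] for item in first], [item["id"] for item in second])
```

The second call resumes from the key returned by the first, which is why the key handed to the client has to be the last item the service evaluated rather than the page boundary reported by the backend.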

Is there a 'value in time' concept in metrics, or how do I create one?

I am using metrics-clojure. Its documentation at http://metrics-clojure.readthedocs.io/en/latest/ lists gauges, counters, meters, timers and histograms.
What I want is instead to report a number.
Very much like counter, but with a set! operation instead of just inc!/dec! or a meter that accepts a value.
One use case is processing batches of events. I can create a meter to watch the batches, but I would prefer to include the batch size such that the reporting end can use the correct units (so I can plot the number of events processed instead of the number of batches).
Another use case is wanting to produce a plot of some number that changes over time. Say again I was processing events, and I wanted to plot per event how many unique combinations of events I'd seen so far, how can I do this?
I can fake this a little bit with a gauge. I can create an atom, and have the gauge report the atom value, and set the atom value in the code... but I can't control when the gauge will report the value. So the value will only be plotted at points whenever the gauge happened to be queried, but I might want to record the values at more specific points (like the end of a batch, at intervals in a batch, or on every event).
And it seems convoluted.
Any suggestions?

Reducing query time in table with unsorted timeranges

I had a question regarding this matter some days ago, but I'm still wondering about how to tune my performance on this query.
I have a table looking like this (SQLite)
CREATE TABLE ZONEDATA (
TIME INTEGER NOT NULL,
CITY INTEGER NOT NULL,
ZONE INTEGER NOT NULL,
TEMPERATURE DOUBLE,
SERIAL INTEGER ,
FOREIGN KEY (SERIAL) REFERENCES ZONES,
PRIMARY KEY ( TIME, CITY, ZONE));
I'm running a query like this:
SELECT temperature, time, city, zone from zonedata
WHERE (city = 1) and (zone = 1) and (time BETWEEN x AND y);
x and y are variables which may have several hundred thousand records between them.
temperature ranges from -10.0 to 10.0, city and zone from 0-20 (in this case it is 1 and 2, but can be something else). Records are logged continuously at intervals of about 5-6 seconds from different zones and cities. This creates a lot of data, and the records are not necessarily logged in correct order of time.
The question is how I can optimize retrieval of records in a big time range (where records are not sorted 100% correctly by time). This can take a lot of time, especially when I'm retrieving from several cities and zones. That means running the mentioned query with different parameters several times. What I'm looking for is specific changes to the query, table structure (preferably not) or other changeable settings.
My application using this is btw implemented in c++.
Your data already is sorted by Time.
By having a Primary Key on (Time, City, Zone) all the records with that same Time value will be next to each other. (Unless you have specified a CLUSTER INDEX elsewhere, though I'm not familiar enough with SQLite to know if that's possible.)
In your particular case, however, that means the records that you want are not next to each other. Instead they're in bunches. Each bunch of records will have (city=1, zone=1) and have the same Time value. One bunch for Time1, another bunch for Time2, etc, etc.
It's like putting it all in Excel and ordering by Time, then by City, then by Zone.
To bunch ALL the records you want (for the same City and Zone) change that to (City, Zone, Time).
Note, however, that if you also have a query for all cities and zones but a time = ??? the key I suggested won't be perfect for that, your original key would be better.
For that reason you may wish/need to add different indexes in different orders, for different queries.
This means that to give you a specific recommended solution we need to know the specific query you will be running. My suggested key/index order may be ideal for your simplified example, but the real-life scenario may be different enough to warrant a different index altogether.
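One way to check the effect of the (City, Zone, Time) ordering is to try it on a small in-memory database and inspect the query plan. The sketch below uses Python's sqlite3 module rather than C++, adds the ordering as a secondary index (the index name idx_city_zone_time is made up) instead of changing the primary key, omits the foreign key, and fills the table with synthetic rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ZONEDATA (
        TIME INTEGER NOT NULL,
        CITY INTEGER NOT NULL,
        ZONE INTEGER NOT NULL,
        TEMPERATURE DOUBLE,
        SERIAL INTEGER,
        PRIMARY KEY (TIME, CITY, ZONE)
    )
    """
)
# Index ordered so that all rows for one (city, zone) are adjacent and
# sorted by time within that group.
conn.execute("CREATE INDEX idx_city_zone_time ON ZONEDATA (CITY, ZONE, TIME)")

# Synthetic data: 100 time steps for 3 cities x 3 zones.
rows = [(t, c, z, 0.5, None) for t in range(100) for c in range(3) for z in range(3)]
conn.executemany("INSERT INTO ZONEDATA VALUES (?, ?, ?, ?, ?)", rows)

query = (
    "SELECT temperature, time, city, zone FROM zonedata "
    "WHERE city = ? AND zone = ? AND time BETWEEN ? AND ?"
)
params = (1, 1, 10, 20)

# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(row[-1] for row in plan)
result = conn.execute(query, params).fetchall()

print(plan_text)
print(len(result))
```

Running the same EXPLAIN QUERY PLAN against the real database, before and after adding the index, should show whether the planner switches to it for the actual query.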
You can index those columns, it will sort it internally for faster query but you will not see it.
For a database between is hard to optimize. One way out of this is adding extra fields so you can replace between with an =. For example, if you add a day field, you could query for:
where city = 1 and zone = 1 and day = '2012-06-22' and
time between '2012-06-22 08:00' and '2012-06-22 12:00'
This query is relatively fast with an index on city, zone, day.
This requires thought to pick the proper extra fields. It requires additional code to maintain the field. If this query is in an important performance path of your application, it might be worth it.

Amazon SimpleDB Woes: Implementing counter attributes

Long story short, I'm rewriting a piece of a system and am looking for a way to store some hit counters in AWS SimpleDB.
For those of you not familiar with SimpleDB, the (main) problem with storing counters is that the cloud propagation delay is often over a second. Our application currently gets ~1,500 hits per second. Not all those hits will map to the same key, but a ballpark figure might be around 5-10 updates to a key every second. This means that if we were to use a traditional update mechanism (read, increment, store), we would end up inadvertently dropping a significant number of hits.
One potential solution is to keep the counters in memcache, and using a cron task to push the data. The big problem with this is that it isn't the "right" way to do it. Memcache shouldn't really be used for persistent storage... after all, it's a caching layer. In addition, then we'll end up with issues when we do the push, making sure we delete the correct elements, and hoping that there is no contention for them as we're deleting them (which is very likely).
Another potential solution is to keep a local SQL database and write the counters there, updating our SimpleDB out-of-band every so many requests or running a cron task to push the data. This solves the syncing problem, as we can include timestamps to easily set boundaries for the SimpleDB pushes. Of course, there are still other issues, and though this might work with a decent amount of hacking, it doesn't seem like the most elegant solution.
Has anyone encountered a similar issue in their experience, or have any novel approaches? Any advice or ideas would be appreciated, even if they're not completely fleshed out. I've been thinking about this one for a while, and could use some new perspectives.
The existing SimpleDB API does not lend itself naturally to being a distributed counter. But it certainly can be done.
Working strictly within SimpleDB there are 2 ways to make it work. An easy method that requires something like a cron job to clean up. Or a much more complex technique that cleans as it goes.
The Easy Way
The easy way is to make a different item for each "hit", with a single attribute which is the key. Pump the domain(s) with counts quickly and easily. When you need to fetch the count (presumably much less often) you have to issue a query:
SELECT count(*) FROM domain WHERE key='myKey'
Of course this will cause your domain(s) to grow unbounded and the queries will take longer and longer to execute over time. The solution is a summary record where you roll up all the counts collected so far for each key. It's just an item with attributes for the key {summary='myKey'} and a "Last-Updated" timestamp with granularity down to the millisecond. This also requires that you add the "timestamp" attribute to your "hit" items. The summary records don't need to be in the same domain. In fact, depending on your setup, they might best be kept in a separate domain. Either way you can use the key as the itemName and use GetAttributes instead of doing a SELECT.
Now getting the count is a two step process. You have to pull the summary record and also query for 'Timestamp' strictly greater than whatever the 'Last-Updated' time is in your summary record and add the two counts together.
SELECT count(*) FROM domain WHERE key='myKey' AND timestamp > '...'
You will also need a way to update your summary record periodically. You can do this on a schedule (every hour) or dynamically based on some other criteria (for example do it during regular processing whenever the query returns more than one page). Just make sure that when you update your summary record you base it on a time that is far enough in the past that you are past the eventual consistency window. 1 minute is more than safe.
This solution works in the face of concurrent updates because even if many summary records are written at the same time, they are all correct and whichever one wins will still be correct because the count and the 'Last-Updated' attribute will be consistent with each other.
This also works well across multiple domains even if you keep your summary records with the hit records, you can pull the summary records from all your domains simultaneously and then issue your queries to all domains in parallel. The reason to do this is if you need higher throughput for a key than what you can get from one domain.
This works well with caching. If your cache fails you have an authoritative backup.
The time will come where someone wants to go back and edit / remove / add a record that has an old 'Timestamp' value. You will have to update your summary record (for that domain) at that time or your counts will be off until you recompute that summary.
This will give you a count that is in sync with the data currently viewable within the consistency window. This won't give you a count that is accurate up to the millisecond.
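The summary-plus-recent-hits arithmetic of the easy way can be sketched in memory as below. This does not use SimpleDB: hits are plain dicts, and current_count and roll_up are hypothetical helpers standing in for the SELECT count(*) query and the periodic summary update.

```python
def current_count(summary, hits, key):
    """Summary count plus hits strictly newer than the summary's timestamp."""
    newer = sum(
        1 for h in hits
        if h["key"] == key and h["timestamp"] > summary["last_updated"]
    )
    return summary["count"] + newer


def roll_up(summary, hits, key, cutoff):
    """Fold hits up to `cutoff` into a new summary record.

    `cutoff` should lie far enough in the past to be outside the eventual
    consistency window.
    """
    folded = sum(
        1 for h in hits
        if h["key"] == key and summary["last_updated"] < h["timestamp"] <= cutoff
    )
    return {"count": summary["count"] + folded, "last_updated": cutoff}


hits = [{"key": "myKey", "timestamp": t} for t in range(1, 11)]
summary = {"count": 0, "last_updated": 0}
print(current_count(summary, hits, "myKey"))

summary = roll_up(summary, hits, "myKey", cutoff=6)
print(summary, current_count(summary, hits, "myKey"))
```

Because the count and the last-updated timestamp are written together, the total stays the same before and after a roll-up, which is the property the answer relies on for concurrent summary writes.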
The Hard Way
The other way is to do the normal read - increment - store mechanism but also write a composite value that includes a version number along with your value, where the version number you use is 1 greater than the version number of the value you are updating.
get(key) returns the attribute value="Ver015 Count089"
Here you retrieve a count of 89 that was stored as version 15. When you do an update you write a value like this:
put(key, value="Ver016 Count090")
The previous value is not removed and you end up with an audit trail of updates that is reminiscent of Lamport clocks.
This requires you to do a few extra things.
the ability to identify and resolve conflicts whenever you do a GET
a simple version number isn't going to work; you'll want to include a timestamp with resolution down to at least the millisecond, and maybe a process ID as well.
in practice you'll want your value to include the current version number and the version number of the value your update is based on, to more easily resolve conflicts.
you can't keep an infinite audit trail in one item, so you'll need to issue deletes for older values as you go.
What you get with this technique is like a tree of divergent updates. You'll have one value, and then all of a sudden multiple updates will occur and you will have a bunch of updates based off the same old value, none of which know about each other.
When I say resolve conflicts at GET time I mean that if you read an item and the value looks like this:
      11 --- 12
     /
10 --- 11
     \
      11
You have to be able to figure out that the real value is 14, which you can do if you include for each new value the version of the value(s) you are updating.
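One possible reading of that resolution step, sketched in Python: if every stored value records the version it was based on, each value contributes only its own increment over its base, and the real count is the root count plus all increments. The record layout (version, base, count) and the resolve helper are assumptions for illustration, and the sketch assumes each value's base is still present in the item.

```python
def resolve(values):
    """Combine divergent updates into one count.

    Each value is a dict with 'version', 'base' (the version it was derived
    from, or None for the root) and 'count'.
    """
    by_version = {v["version"]: v for v in values}
    total = 0
    for v in values:
        if v["base"] is None:
            total += v["count"]  # the root contributes its full count
        else:
            # A derived value contributes only what it added to its base.
            total += v["count"] - by_version[v["base"]]["count"]
    return total


# The tree from the diagram: 10 is the root; one branch went 11 then 12,
# and two other writers each wrote 11 based on the same root.
values = [
    {"version": "v0", "base": None, "count": 10},
    {"version": "v1a", "base": "v0", "count": 11},
    {"version": "v2a", "base": "v1a", "count": 12},
    {"version": "v1b", "base": "v0", "count": 11},
    {"version": "v1c", "base": "v0", "count": 11},
]
print(resolve(values))
```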
It shouldn't be rocket science
If all you want is a simple counter: this is way over-kill. It shouldn't be rocket science to make a simple counter. Which is why SimpleDB may not be the best choice for making simple counters.
That isn't the only way, but most of those things will need to be done if you implement a SimpleDB solution in lieu of actually having a lock.
Don't get me wrong, I actually like this method precisely because there is no lock and the bound on the number of processes that can use this counter simultaneously is around 100. (because of the limit on the number of attributes in an item) And you can get beyond 100 with some changes.
Note
But if all these implementation details were hidden from you and you just had to call increment(key), it wouldn't be complex at all. With SimpleDB the client library is the key to making the complex things simple. But currently there are no publicly available libraries that implement this functionality (to my knowledge).
To anyone revisiting this issue, Amazon just added support for Conditional Puts, which makes implementing a counter much easier.
Now, to implement a counter - simply call GetAttributes, increment the count, and then call PutAttributes, with the Expected Value set correctly. If Amazon responds with an error ConditionalCheckFailed, then retry the whole operation.
Note that you can only have one expected value per PutAttributes call. So, if you want to have multiple counters in a single row, then use a version attribute.
pseudo-code:
begin
  attributes = SimpleDB.GetAttributes
  initial_version = attributes[:version]
  attributes[:counter1] += 3
  attributes[:counter2] += 7
  attributes[:version] += 1
  SimpleDB.PutAttributes(attributes, :expected => {:version => initial_version})
rescue ConditionalCheckFailed
  retry
end
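The same read, increment, conditional-put loop can be exercised without SimpleDB using an in-memory stand-in. In this Python sketch, FakeDomain, its method names and the ConditionalCheckFailed exception class are all hypothetical; only the retry logic mirrors the pseudo-code above, with an added cap on attempts.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored version differs from the expected one."""


class FakeDomain:
    """In-memory stand-in for a single item with a version attribute."""

    def __init__(self):
        self.item = {"counter1": 0, "counter2": 0, "version": 0}

    def get_attributes(self):
        return dict(self.item)

    def put_attributes(self, attributes, expected_version):
        # The write only succeeds if nobody changed the item in between.
        if self.item["version"] != expected_version:
            raise ConditionalCheckFailed()
        self.item = dict(attributes)


def increment(domain, deltas, max_attempts=10):
    """Apply `deltas` to the counters, retrying on version conflicts."""
    for _ in range(max_attempts):
        attributes = domain.get_attributes()
        initial_version = attributes["version"]
        for name, delta in deltas.items():
            attributes[name] += delta
        attributes["version"] = initial_version + 1
        try:
            domain.put_attributes(attributes, expected_version=initial_version)
            return attributes
        except ConditionalCheckFailed:
            continue  # someone else won the race; re-read and try again
    raise RuntimeError("gave up after repeated conflicts")


domain = FakeDomain()
increment(domain, {"counter1": 3, "counter2": 7})
increment(domain, {"counter1": 3, "counter2": 7})
print(domain.item)
```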
I see you've accepted an answer already, but this might count as a novel approach.
If you're building a web app then you can use Google's Analytics product to track page impressions (if the page to domain-item mapping fits) and then to use the Analytics API to periodically push that data up into the items themselves.
I haven't thought this through in detail so there may be holes. I'd actually be quite interested in your feedback on this approach given your experience in the area.
Thanks
Scott
For anyone interested in how I ended up dealing with this... (slightly Java-specific)
I ended up using an EhCache on each servlet instance. I used the UUID as a key, and a Java AtomicInteger as the value. Periodically a thread iterates through the cache and pushes rows to a simpledb temp stats domain, as well as writing a row with the key to an invalidation domain (which fails silently if the key already exists). The thread also decrements the counter with the previous value, ensuring that we don't miss any hits while it was updating. A separate thread pings the simpledb invalidation domain, and rolls up the stats in the temporary domains (there are multiple rows to each key, since we're using ec2 instances), pushing it to the actual stats domain.
I've done a little load testing, and it seems to scale well. Locally I was able to handle about 500 hits/second before the load tester broke (not the servlets - hah), so if anything I think running on ec2 should only improve performance.
Answer to feynmansbastard:
If you want to store a huge number of events, I suggest using a distributed commit log system such as Kafka or AWS Kinesis. They let you consume a stream of events cheaply and simply (Kinesis pricing is $25 per month for 1K events per second). You just need to implement a consumer (in any language) that bulk-reads all events since the previous checkpoint, aggregates the counters in memory, flushes the data into permanent storage (DynamoDB or MySQL), and then commits the checkpoint.
Events can be logged simply using the nginx log and transferred to Kafka/Kinesis using fluentd. This is a very cheap, performant and simple solution.
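The consumer loop described here (read from checkpoint, aggregate in memory, flush, commit) can be sketched without Kafka or Kinesis. In this Python sketch the log is a plain list, permanent storage is a dict, and consume is a hypothetical helper; a real consumer would use the client library of the chosen system.

```python
from collections import Counter


def consume(log, checkpoint, storage):
    """Process all events after `checkpoint` and return the new checkpoint."""
    batch = log[checkpoint:]
    # Aggregate the whole batch in memory first...
    counts = Counter(event["key"] for event in batch)
    # ...then flush one update per key to permanent storage.
    for key, n in counts.items():
        storage[key] = storage.get(key, 0) + n
    # Committing the checkpoint last means a crash before this point causes
    # the batch to be re-read rather than lost.
    return checkpoint + len(batch)


log = [{"key": "page-a"}, {"key": "page-b"}, {"key": "page-a"}]
storage = {}
checkpoint = consume(log, 0, storage)
log.append({"key": "page-a"})
checkpoint = consume(log, checkpoint, storage)
print(storage, checkpoint)
```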
I also had similar needs/challenges.
I looked at using Google Analytics and count.ly. The latter seemed too expensive to be worth it (plus they have a somewhat confusing definition of sessions). GA I would have loved to use, but I spent two days using their libraries and some third-party ones (gadotnet and one other, maybe from CodeProject). Unfortunately I could only ever see counters post in the GA realtime section, never in the normal dashboards, even when the API reported success. We were probably doing something wrong, but we exceeded our time budget for GA.
We already had an existing SimpleDB counter that updated using conditional updates, as mentioned by a previous commenter. This works well, but suffers when there is contention and concurrency, where counts are missed (for example, our most updated counter lost several million counts over a period of 3 months, versus a backup system).
We implemented a newer solution which is somewhat similar to the answer for this question, except much simpler.
We just sharded/partitioned the counters. When you create a counter you specify the number of shards, which is a function of how many simultaneous updates you expect. This creates a number of sub-counters, each of which has the shard count it was started with as an attribute:
COUNTER (w/5shards) creates :
shard0 { numshards = 5 } (informational only)
shard1 { count = 0, numshards = 5, timestamp = 0 }
shard2 { count = 0, numshards = 5, timestamp = 0 }
shard3 { count = 0, numshards = 5, timestamp = 0 }
shard4 { count = 0, numshards = 5, timestamp = 0 }
shard5 { count = 0, numshards = 5, timestamp = 0 }
Sharded Writes
Knowing the shard count, just randomly pick a shard and try to write to it conditionally. If it fails because of contention, choose another shard and retry.
If you don't know the shard count, get it from the root shard which is present regardless of how many shards exist. Because it supports multiple writes per counter, it lessens the contention issue to whatever your needs are.
Sharded Reads
If you know the shard count, read every shard and sum them.
If you don't know the shard count, get it from the root shard and then read all and sum.
Because of slow update propagation, you can still miss counts when reading, but they should get picked up later. This is sufficient for our needs, although if you wanted more control over this you could ensure, when reading, that the last timestamp was as you expect, and retry if not.
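The sharded write and read paths can be illustrated with an in-memory model. This Python sketch is not the author's implementation: ShardedCounter is hypothetical, shards are numbered from 0 for simplicity, and contention is simulated by passing in the set of shards whose conditional write should fail.

```python
import random


class ShardedCounter:
    """In-memory model of a counter split across several shards."""

    def __init__(self, num_shards, rng=None):
        self.num_shards = num_shards
        self.counts = [0] * num_shards
        self.rng = rng or random.Random()

    def _conditional_write(self, shard, contended):
        # Stand-in for a conditional put: fails if the shard is contended.
        if shard in contended:
            return False
        self.counts[shard] += 1
        return True

    def increment(self, contended=frozenset()):
        """Pick shards in random order until one write succeeds."""
        shards = list(range(self.num_shards))
        self.rng.shuffle(shards)
        for shard in shards:
            if self._conditional_write(shard, contended):
                return shard
        raise RuntimeError("all shards contended")

    def read(self):
        # Sharded read: sum every shard.
        return sum(self.counts)


counter = ShardedCounter(5, rng=random.Random(0))
for _ in range(100):
    counter.increment()
# With shards 0-3 contended, the write has to land on shard 4.
landed_on = counter.increment(contended={0, 1, 2, 3})
print(counter.read(), landed_on)
```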