Is there a MiniZinc Predicate to model time-dependent resource bounds (like cumulative)? - scheduling

I want to find an efficient way to model events with a duration that consume certain resources (like cumulative), where the available resources can change over time (an array of resource counts, one per time slot).
I'm trying to model the typical scheduling problem with some events happening at certain times that consume an amount of limited resources that cannot be exceeded. Resources can change across time, so predicates like cumulative don't fit. I have tried to check that resources are not exceeded for every time slot, but it is incredibly slow compared to the built-in cumulative predicate, and I was wondering if there's something like:
% Ensures that resource usage never exceeds the time-dependent bound b.
predicate desired_cumulative(array[int] of var int: s, array[int] of var int: d, array[int] of var int: r, array[int] of var int: b)

You can model a varying resource availability by using the maximum possible resource availability as the cumulative limit, and adding extra fixed tasks that remove some of that available resource.
For example, say that you have a resource that starts out at 13 at time 0, goes up to 15 at time 5, and then goes down to 12 from time 10 until the end of time (say 100). To model this, use a cumulative with a fixed capacity of 15 (the maximum) and add two tasks: one in the time span 0 to 5 with resource usage 2 (15-13), and one in the time span 10 to 100 with resource usage 3 (15-12).
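The conversion from a per-slot capacity array to fixed blocking tasks can be automated before the model is built. The following is a minimal Python sketch of that preprocessing step, not part of MiniZinc itself; the function name `filler_tasks` and the `(start, duration, usage)` tuple layout are illustrative assumptions.

```python
from typing import List, Tuple

def filler_tasks(capacity: List[int]) -> Tuple[int, List[Tuple[int, int, int]]]:
    """Return (limit, tasks) for a per-slot capacity profile.

    limit is the maximum capacity; each task is a fixed (start, duration, usage)
    blocker that removes the capacity missing during a run of equal slots.
    """
    limit = max(capacity)
    tasks = []
    t = 0
    n = len(capacity)
    while t < n:
        start = t
        gap = limit - capacity[start]
        # Extend the run while the capacity stays the same.
        while t < n and capacity[t] == capacity[start]:
            t += 1
        if gap > 0:
            tasks.append((start, t - start, gap))
    return limit, tasks

# The profile from the answer: 13 for slots 0..4, 15 for 5..9, 12 for 10..99.
profile = [13] * 5 + [15] * 5 + [12] * 90
limit, tasks = filler_tasks(profile)
print(limit, tasks)  # 15 [(0, 5, 2), (10, 90, 3)]
```

The resulting starts, durations and usages can be written into the data file as fixed arrays and appended to s, d and r in the call to cumulative, with `limit` as its capacity argument.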

Related

How does AWS DynamoDB count read units for Query?

I am working on a table, in which every item is approx. 3KB in size.
Now, as per the docs, read units are calculated in 4 KB increments, i.e. every item smaller than 4 KB is counted as 4 KB and occupies 1 read unit.
Let's say I have a table of 100 items of 3 KB each (total table = 300 KB). I run a query whose condition is satisfied by 50 items, and they are returned to me.
Now, will the read units be counted like this: 50 items of 3 KB (rounded up to 4 KB) = 200 KB = 200/4 = 50 read units?
Any help is appreciated! :) Thanks!
I think this should clarify the issue:
Capacity Units Consumed by Query
DynamoDB calculates the number of read capacity units consumed based on item size, not on the amount of data that is returned to an application.
When you do the query, you can specify a parameter ReturnConsumedCapacity to get the number of read capacity units consumed:
TOTAL — The response includes the aggregate number of read capacity units consumed.
It also depends if you use eventually consistent reads (by default for query) or strongly consistent:
for eventually consistent reads (1 unit is 2 reads): 200 / 4 / 2 = 25 units
for strongly consistent reads (1 unit is 1 read): 200 / 4 / 1 = 50 units
Yes, if you read 50 items of 3K each with strongly consistent reads, the cost will be 50 units. If you do eventually consistent reads, the answer will be half: 25 units.
However, there is another important cost issue you should be aware of, if you are not already. You mentioned you actually have 100 items, but are only retrieving half of them by using a "query condition". DynamoDB actually has two types of "conditions" on queries. One is called a key condition (KeyConditions or KeyConditionExpression) and the other is a post-query filter (QueryFilter or FilterExpression). If you use key conditions, you will only pay for the retrieved items, as you hoped. But if you use filtering, you will pay for all items read before the filter is applied, not just for the retrieved items. So in your example you would be paying 100 units instead of 50.
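The arithmetic in these answers can be checked with a short Python sketch. It is a rough calculator, not an AWS API; the function names are illustrative. `rcu_per_item` rounds each item up to 4 KB separately, as the answers above do. The AWS developer guide describes a Query as a single read operation whose total item size is rounded up to the next 4 KB, so `rcu_summed` is included for comparison and may be closer to what ReturnConsumedCapacity reports.

```python
import math

READ_UNIT_KB = 4  # one read unit covers up to 4 KB

def rcu_per_item(item_sizes_kb, strongly_consistent=True):
    """Round each item up to 4 KB separately (the arithmetic used in the answers)."""
    units = sum(math.ceil(size / READ_UNIT_KB) for size in item_sizes_kb)
    return units if strongly_consistent else units / 2

def rcu_summed(item_sizes_kb, strongly_consistent=True):
    """Sum the item sizes first, then round the total up to 4 KB."""
    units = math.ceil(sum(item_sizes_kb) / READ_UNIT_KB)
    return units if strongly_consistent else units / 2

matched = [3] * 50   # items selected by a key condition
scanned = [3] * 100  # items read when a filter is applied after the read

print(rcu_per_item(matched))         # 50
print(rcu_per_item(matched, False))  # 25.0
print(rcu_per_item(scanned))         # 100
print(rcu_summed(matched))           # 38
```

Either way, the key point of the last answer holds: the cost follows the items read, so a filter that discards half of them does not halve the bill.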

Will I get better dynamodb throughput with this change?

Right now I have mobile apps hitting a serverless aws lambda endpoint, writing 1 record. At times, the mobile app writes several of these records over and over again (50-300). An example of what 1 record looks like can be seen below:
{
"name": "John Doe",
"miscValue": "1f2ea989-5b33-49e5-a88a-19c7594afd9d",
"ratio": "1.7777777777777777",
"new": true,
"timestamp": "1524156952325"
}
Now, if I change it, so that instead of writing 1 record per lambda call, it can write multiple records per call, and then do fewer calls, would that result in a lower dynamodb throughput?
Example Scenario:
The app could write 1 record per second for 100 seconds vs 10 records per second for 10 seconds.
AWS Documentation states:
One write capacity unit represents one write per second for an item up
to 1 KB in size. If you need to write an item that is larger than 1
KB, DynamoDB will need to consume additional write capacity units. The
total number of write capacity units required depends on the item
size.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ProvisionedThroughput.html
This leads me to believe that, since my record size is well under 1 KB, if I made the change to write multiple records at a time I would see a significant improvement in DynamoDB throughput utilization. Is that correct?
One write capacity unit represents one write per second for an item up to 1 KB in size
No matter how you do it, if you provision 1 write unit on the table you are allowed to write 1 item per second (besides burst capacity https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-design.html#bp-partition-key-throughput-bursting). If the item is bigger than 1 KB, you need more than 1 write unit (ceil(size / 1KB)) to write it. If the item is smaller than 1 KB, you still need 1 write unit.
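Applied to the scenario in the question, the rule above can be sketched in Python as follows. The 0.2 KB record size is an assumption standing in for "well under 1 KB", the function names are illustrative, and the sketch assumes each record is still stored as its own item regardless of how many are sent per Lambda call.

```python
import math

def wcu_for_item(size_kb):
    """Write units for one item: ceil(size / 1 KB), with a minimum of 1."""
    return max(1, math.ceil(size_kb))

def write_load(records_per_second, seconds, size_kb):
    """Return (units needed per second, total units consumed)."""
    per_second = records_per_second * wcu_for_item(size_kb)
    return per_second, per_second * seconds

slow = write_load(1, 100, 0.2)  # 1 record per second for 100 seconds
fast = write_load(10, 10, 0.2)  # 10 records per second for 10 seconds
print(slow)  # (1, 100)
print(fast)  # (10, 100)
```

Both patterns consume the same total, but the second needs ten times the per-second capacity, so grouping the writes does not lower the throughput the table must provide.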

Number of WCU equal to number of items to write in DynamoDB?

I have been struggling to understand the meaning of WCU in AWS DynamoDB Documentation. What I understood from AWS documentation is that
If your application needs to write 1000 items where each item is of
size 0.2KB then you need to provision 1000 WCU (i.e. 0.2/1 = 0.2 which
makes nearest 1KB, so 1000 items(to write) * 1KB() = 1000WCU)
If my understanding above is correct, then I am wondering about applications that need to write millions of records into DynamoDB per second: do those applications need to provision that many millions of WCU?
I would appreciate it if you could clarify this for me.
I've used DynamoDB in past (and experienced scaling out the RCU and WCU for my application) and according to AWS docs :-
One write capacity unit represents one write per second for an item up
to 1 KB in size. If you need to write an item that is larger than 1
KB, DynamoDB will need to consume additional write capacity units. The
total number of write capacity units required depends on the item
size.
So it means that if you are writing a document of size 4.5 KB, it will consume 5 WCU; DynamoDB rounds it up to the next integer.
Also your understanding
here each item is of size 0.2KB then you need to provision 1000 WCU
(i.e. 0.2/1 = 0.2 which makes nearest 1KB, so 1000 items(to write) *
1KB() = 1000WCU).
is correct.
To save WCU, you need to design your system so that your document size is always just under a round-off boundary.
Note: to avoid the large cost associated with DynamoDB, if you have lots of reads you can use caching on top of DynamoDB, which is also suggested by AWS and was implemented by us as well. (If your application is write heavy, then this approach will not work and you should consider some other alternative like Elasticsearch etc.)
According to http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html :
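The effect of the round-off can be illustrated with a small Python sketch. The helper names and the sample sizes other than 0.2 KB and 4.5 KB are assumptions chosen for illustration.

```python
import math

def wcu_for_item(size_kb):
    """Write units for one item: size rounded up to the next whole KB, minimum 1."""
    return max(1, math.ceil(size_kb))

def wasted_fraction(size_kb):
    """Share of the consumed write units that carries no data."""
    units = wcu_for_item(size_kb)
    return (units - size_kb) / units

for size_kb in (0.2, 1.0, 1.1, 4.5):
    print(size_kb, wcu_for_item(size_kb), round(wasted_fraction(size_kb), 2))

# 1000 writes per second of 0.2 KB items, as in the question.
print(1000 * wcu_for_item(0.2))  # 1000
```

An item of 1.1 KB costs twice as much to write as one of 1.0 KB, which is why trimming items that sit just above a boundary pays off.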
A caching solution can mitigate the skewed read activity for popular
items. In addition, since it reduces the amount of read activity
against the table, caching can help reduce your overall costs for
using DynamoDB.

DynamoDB: Making range query v/s query each item separately

Let's say I have several items in DynamoDB with the same partition key and different sort keys.
Is there any difference between consumed read capacity units if I query the records using a sort-key constraint in a single go v/s query each item individually? Assume that the number of sort-keys to be fetched at-a-time are around 50. The official-documentation says that
One read capacity unit represents one strongly consistent read per
second, or two eventually consistent reads per second, for an item up
to 4 KB in size.
From this definition, it doesn't seem that there should be a difference since this definition is independent of how we query the database.
Apart from additional network delay, does the second approach have any other downside?
Please note that the costing is based on Read Capacity Units (RCU) and Write Capacity Units (WCU).
RCU formula:-
RCU = read capacity unit per item × number of reads per second
Before going into the below calculation, calculate the item size. You can get the item size from AWS console.
Go to the dynamodb table on AWS console --> Overview tab --> See at the bottom.
Let's talk about RCU. In the above case,
Scenario 1 - Getting all the data in one go using hash key only:-
In this scenario, the amount of data read per request will be high (i.e. the data of 50 items). Calculate the size and check how many RCU are required.
Scenario 2 - Getting the data multiple times using hash key and sort key:-
In this scenario, the API will be called multiple times, so the number of reads per second will go up. Calculate the number of reads required and check how many RCU are required.
Compare the RCU calculated in scenarios 1 and 2. Choose the option that needs fewer RCU in order to save cost.
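The comparison between the two scenarios can be sketched in Python. This is a rough estimate, not an AWS API: it assumes the rounding rules described in the AWS developer guide, where a single Query sums the sizes of the items it reads before rounding up to 4 KB, while each individual read is rounded up on its own. The function names and item sizes are illustrative.

```python
import math

def rcu_single_query(item_count, item_kb, strongly_consistent=True):
    """One range query: total size of all items rounded up to the next 4 KB."""
    units = math.ceil(item_count * item_kb / 4)
    return units if strongly_consistent else units / 2

def rcu_individual_reads(item_count, item_kb, strongly_consistent=True):
    """One read per item: each item rounded up to 4 KB separately."""
    units = item_count * math.ceil(item_kb / 4)
    return units if strongly_consistent else units / 2

for item_kb in (0.5, 3, 4):
    print(item_kb, rcu_single_query(50, item_kb), rcu_individual_reads(50, item_kb))
```

Under these assumptions the two approaches cost the same only when every item fills a whole 4 KB unit; for small items the single query is considerably cheaper, in addition to saving the network round trips.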

How is Amazon DynamoDB throughput calculated and limited?

Is it averaged per second? Per minute? Per hour?
For example, if I pay for 10 "read units" which allows for 10 highly consistent reads per second, will I be throttled if I try to do 20 reads in a single second, even if it was the only 20 reads that occurred in the last hour? The Amazon documentation and FAQ do not answer this critical question anywhere that I could find.
The only related response I could find in the FAQ completely ignores the issue of how usage is calculated and when throttling may happen:
Q: What happens if my application performs more reads or writes than
my provisioned capacity?
A: If your application performs more
reads/second or writes/second than your table’s provisioned throughput
capacity allows, requests above your provisioned capacity will be
throttled and you will receive 400 error codes. For instance, if you
had asked for 1,000 write capacity units and try to do 1,500
writes/second of 1 KB items, DynamoDB will only allow 1,000
writes/second to go through and you will receive error code 400 on
your extra requests. You should use CloudWatch to monitor your request
rate to ensure that you always have enough provisioned throughput to
achieve the request rate that you need.
It appears that they track writes in a five minute window and will throttle you when your average over the last five minutes exceeds your provisioned throughput.
I did some testing. I created a test table with throughput of 1 write/second. If I don't write to it for a while and then send a stream of requests, Amazon seems to accept about 300 before it starts throttling.
The caveat, of course, is that this is not stated in any official Amazon documentation and could change at any time.
DynamoDB provides 'Burst Capacity', which allows for spikes in the amount of data read from a table. You can read more about it at: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html#GuidelinesForTables.Bursting
Basically it's what @abjennings noticed: it uses a 5-minute window to average the number of reads from a table.
If I pay for 10 "read units" which allows for 10 highly consistent
reads per second, will I be throttled if I try to do 20 reads in a
single second, even if it was the only 20 reads that occurred in the
last hour?
Yes. Amazon DynamoDB is built around fast and predictable performance with seamless scalability, and the quoted FAQ already addresses this correctly (i.e. you have to take operations/second literally). The calculation is better illustrated in Provisioned Throughput in Amazon DynamoDB:
A unit of Write Capacity enables you to perform one write per second
for items of up to 1KB in size. Similarly, a unit of Read Capacity
enables you to perform one strongly consistent read per second (or two
eventually consistent reads per second) of items of up to 1KB in size.
Larger items will require more capacity. You can calculate the number
of units of read and write capacity you need by estimating the number
of reads or writes you need to do per second and multiplying by the
size of your items (rounded up to the nearest KB).
Units of Capacity required for writes = Number of item writes per second x item size (rounded up to the nearest KB)
Units of Capacity required for reads* = Number of item reads per second x item size (rounded up to the nearest KB)
* If you use eventually consistent reads you'll get twice the throughput in terms of reads per second.
[emphasis mine]
Getting these calculations right for real-world use cases is potentially complex, though, so please make sure to check further details, e.g. the Provisioned Throughput Guidelines in Amazon DynamoDB.
My guess would be that they don't state it explicitly on purpose. It's probably liable to change/have regional differences/depend on the position of the moon and stars, or releasing the information would encourage abuse. I would do my calculations on a worst-scenario basis.
From AWS :
DynamoDB currently retains up to five minutes (300 seconds) of unused read and write capacity
DynamoDB provides some flexibility in the per-partition throughput provisioning. When you are not fully utilizing a partition's throughput, DynamoDB retains a portion of your unused capacity for later bursts of throughput usage. DynamoDB currently retains up to five minutes (300 seconds) of unused read and write capacity. During an occasional burst of read or write activity, these extra capacity units can be consumed very quickly, even faster than the per-second provisioned throughput capacity that you've defined for your table. However, do not design your application so that it depends on burst capacity being available at all times: DynamoDB can and does use burst capacity for background maintenance and other tasks without prior notice.
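The behaviour described in this quote, and the roughly 300 accepted writes observed in the test above, are consistent with a simple token bucket. The Python sketch below is a simplified model under that assumption, not DynamoDB's actual algorithm: it ignores partitions and the background use of burst capacity, and the function and parameter names are illustrative.

```python
def simulate(provisioned, demand_per_second, retention_seconds=300):
    """Return the number of accepted requests for each second of demand.

    Unused capacity accumulates in a bucket that holds at most
    provisioned * retention_seconds units; anything beyond it is throttled.
    """
    bucket_limit = provisioned * retention_seconds
    tokens = 0
    accepted = []
    for demand in demand_per_second:
        tokens = min(tokens + provisioned, bucket_limit)
        served = min(demand, tokens)
        tokens -= served
        accepted.append(served)
    return accepted

# 1 write unit, idle for 10 minutes, then a burst of 400 followed by steady demand.
idle_then_burst = [0] * 600 + [400] + [5] * 3
result = simulate(1, idle_then_burst)
print(result[600:])  # [300, 1, 1, 1]
```

In this model, 20 reads in one second against 10 provisioned units would all be accepted after an idle period, but the quote's warning still applies: the stored capacity is not guaranteed to be there.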
We set our 'write-limit' to 10 units/sec for one of the tables. The CloudWatch graph (see image) shows we exceeded this by one unit (11 writes/sec). I'm assuming there's a small amount of wiggle room (<= 10%). Again, I'm just assuming.
https://aws.amazon.com/blogs/developer/rate-limited-scans-in-amazon-dynamodb/
It is possible to limit the consumed capacity using the RateLimiter class from the Google Guava library.