Creating prefixes in S3 to parallelise reads and increase performance - amazon-web-services

I'm doing some research and I was reading this page
https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
It says
Amazon S3 automatically scales to high request rates. For example, your application can achieve at least 3,500 PUT/POST/DELETE and 5,500 GET requests per second per prefix in a bucket. There are no limits to the number of prefixes in a bucket. It is simple to increase your read or write performance exponentially. For example, if you create 10 prefixes in an Amazon S3 bucket to parallelise reads, you could scale your read performance to 55,000 read requests per second.
I'm not sure what the last bit means. My understanding is that for the filename 'Australia/NSW/Sydney', the prefix is 'Australia/NSW'. Correct?
How does creating 10 of these improve your read performance? Do you create for example Australia/NSW1/, Australia/NSW2/, Australia/NSW3/, and then map them to a load balancer somehow?

S3 is designed like a Hashtable/HashMap in Java. The prefix forms the hash for the hash-bucket, and the actual files are stored in groups in these buckets.
To find a particular file you need to compare against all the files in a hash-bucket, whereas getting to the hash-bucket itself is instant (constant-time).
Thus the more descriptive the keys, the more hash-buckets and hence the fewer items in each bucket, which makes the lookup faster.
E.g. a bucket with tourist attraction details for all countries in the world:
Bucket1: placeName.jpg (all files in the bucket, no prefix)
Bucket2: countryName/state/placeName.jpg
Now if you are looking for Sydney.info in Australia/NSW, the lookup will be faster in the second bucket.
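In HashMap terms, that analogy looks roughly like the toy sketch below (illustrative only, not how S3 is actually implemented): the prefix plays the role of the hash bucket, and a lookup only has to scan the keys that share that prefix.

# Toy sketch of the HashMap analogy above (not how S3 is actually implemented).
from collections import defaultdict

keys = [
    "Australia/NSW/Sydney.jpg",
    "Australia/NSW/Newcastle.jpg",
    "Australia/VIC/Melbourne.jpg",
    "France/IDF/Paris.jpg",
]

buckets = defaultdict(list)
for key in keys:
    prefix = key.rsplit("/", 1)[0]   # e.g. "Australia/NSW"
    buckets[prefix].append(key)

# Finding "Sydney.jpg" only scans the two keys under "Australia/NSW",
# instead of every key in the bucket.
print(buckets["Australia/NSW"])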

No, S3 doesn't connect to an LB, ever. This article covers the topic, but here are the important highlights:
(...) keys in S3 are partitioned by prefix
(...)
Partitions are split either due to sustained high request rates, or because they contain a large number of keys (which would slow down lookups within the partition). There is overhead in moving keys into newly created partitions, but with request rates low and no special tricks, we can keep performance reasonably high even during partition split operations. This split operation happens dozens of times a day all over S3 and simply goes unnoticed from a user performance perspective. However, when request rates significantly increase on a single partition, partition splits become detrimental to request performance. How, then, do these heavier workloads work over time? Smart naming of the keys themselves!
So Australia/NSW/ could be read from a single partition, while Australia/NSW1/ and Australia/NSW2/ might be read from two others. It doesn't have to work out that way, but prefixes still give you some control over how the data is partitioned, because you have a better understanding of what kind of reads you will be doing on it. You should aim to spread reads evenly over the prefixes.
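To make the parallel-read part concrete, here is a minimal sketch, assuming boto3 and a hypothetical bucket whose keys have already been spread across prefixes Australia/NSW1/ through Australia/NSW10/; each prefix gets its own worker, so each prefix's request-rate budget is used in parallel.

# Minimal sketch: fan reads out over several prefixes in parallel.
# Bucket and prefix names are made up for illustration.
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
BUCKET = "my-tourism-bucket"  # hypothetical bucket name
PREFIXES = [f"Australia/NSW{i}/" for i in range(1, 11)]  # ten made-up prefixes

def read_prefix(prefix):
    # List everything under one prefix and fetch each object.
    paginator = s3.get_paginator("list_objects_v2")
    bodies = []
    for page in paginator.paginate(Bucket=BUCKET, Prefix=prefix):
        for obj in page.get("Contents", []):
            resp = s3.get_object(Bucket=BUCKET, Key=obj["Key"])
            bodies.append(resp["Body"].read())
    return bodies

# One worker per prefix: S3's request-rate guidance is per prefix, so
# spreading keys over prefixes lets total read throughput scale roughly
# with the number of prefixes being read in parallel.
with ThreadPoolExecutor(max_workers=len(PREFIXES)) as pool:
    results = list(pool.map(read_prefix, PREFIXES))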

Related

Athena query timeout for bucket containing too many log entries

I am running a simple Athena query as in
SELECT * FROM "logs"
WHERE parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z')
BETWEEN parse_datetime('2021-12-01:00:00:00','yyyy-MM-dd:HH:mm:ss')
AND
parse_datetime('2021-12-21:19:00:00','yyyy-MM-dd:HH:mm:ss');
However this times out due to the default DML 30 min timeout.
The path I am querying contains a few million entries.
Is there a way to address this in Athena or is there a better suited alternative for this purpose?
This is normally solved with partitioning. For data that's organized by date, partition projection is the way to go (versus an explicit partition list that's updated manually or via Glue crawler).
That, of course, assumes that your data is organized by the partition (eg, s3://mybucket/2021/12/21/xxx.csv). If not, then I recommend changing your ingest process as a first step.
You may want to change your ingest process anyway: Athena isn't very good at dealing with a large number of small files. While the tuning guide doesn't give an optimal file size, I recommend at least a few tens of megabytes. If you're getting a steady stream of small files, use a scheduled Lambda to combine them into a single file (see the sketch below). If you're using Firehose to aggregate files, increase the buffer sizes / time limits.
And while you're doing that, consider moving to a columnar format such as Parquet if you're not already using it.
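To illustrate the compaction idea from the previous paragraph, here is a minimal sketch of a scheduled Lambda, with made-up bucket and prefix names, assuming headerless log files that can simply be concatenated:

# Minimal sketch of a scheduled compaction Lambda. Names are hypothetical.
import datetime

import boto3

s3 = boto3.client("s3")
BUCKET = "mybucket"          # hypothetical, as in the path example above
SOURCE_PREFIX = "raw/"       # hypothetical prefix where small files land
TARGET_PREFIX = "combined/"  # hypothetical prefix for the combined output

def handler(event, context):
    # Collect every small object under the source prefix.
    paginator = s3.get_paginator("list_objects_v2")
    keys, parts = [], []
    for page in paginator.paginate(Bucket=BUCKET, Prefix=SOURCE_PREFIX):
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
            parts.append(s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read())
    if not parts:
        return
    # Write one larger object (assumes headerless files that concatenate cleanly),
    # then remove the originals in batches of at most 1,000 keys.
    stamp = datetime.datetime.utcnow().strftime("%Y/%m/%d/%H%M%S")
    s3.put_object(Bucket=BUCKET, Key=f"{TARGET_PREFIX}{stamp}.csv", Body=b"".join(parts))
    for i in range(0, len(keys), 1000):
        s3.delete_objects(
            Bucket=BUCKET,
            Delete={"Objects": [{"Key": k} for k in keys[i:i + 1000]]},
        )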

AWS DynamoDB: What does the graph imply? What needs to be done? A few of my batch write (delete) requests failed

Can somebody tell me what needs to be done?
I'm facing a few issues when I have 1000+ events.
A few of them are not getting deleted after my process runs.
I'm doing a batch delete through BatchWriteItem.
Each partition on a DynamoDB table is subject to a hard limit of 1,000 write capacity units and 3,000 read capacity units. If your workload is unevenly distributed across partitions, or if the workload relies on short periods of time with high usage (a burst of read or write activity), the table might be throttled.
It seems you are relying on DynamoDB adaptive capacity. Adaptive capacity automatically boosts throughput capacity for high-traffic partitions, but each partition is still subject to the hard limit. This means that adaptive capacity can't solve larger issues with your table or partition design. To avoid hot partitions and throttling, optimize your table and partition structure.
https://aws.amazon.com/premiumsupport/knowledge-center/dynamodb-table-throttled/
One way to better distribute writes across a partition key space in Amazon DynamoDB is to expand the space. You can do this in several different ways. You can add a random number to the partition key values to distribute the items among partitions. Or you can use a number that is calculated based on something that you're querying on.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-sharding.html
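As an illustration of the sharding suggestion linked above, here is a minimal sketch assuming a hypothetical table named events with partition key pk (plus a sort key such as an event id): writes for one logical key are spread over several partition key values, at the cost of fanning reads out over all the suffixes.

# Minimal sketch of write sharding. Table and key names are hypothetical;
# calculated suffixes (based on an attribute you query on) work too.
import random

import boto3
from boto3.dynamodb.conditions import Key

NUM_SHARDS = 10
table = boto3.resource("dynamodb").Table("events")  # hypothetical table name

def put_event(logical_key, item):
    # Append a random shard suffix so writes for the same logical key are
    # spread over NUM_SHARDS partition key values (the table is assumed to
    # also have a sort key, e.g. an event id, so items don't overwrite each other).
    shard = random.randint(0, NUM_SHARDS - 1)
    table.put_item(Item=dict(item, pk=f"{logical_key}#{shard}"))

def query_all_shards(logical_key):
    # Reads have to fan out over every shard of the logical key.
    items = []
    for shard in range(NUM_SHARDS):
        resp = table.query(KeyConditionExpression=Key("pk").eq(f"{logical_key}#{shard}"))
        items.extend(resp["Items"])
    return items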

s3 bucket date path format for faster operations

I was told by one of the consultants from AWS itself that, when naming the folders (objects) in S3 with a date, I should use MM-DD-YYYY for faster S3 operations like GetObject, but I usually use YYYY-MM-DD. I don't understand what difference it makes. Is there a difference, and if so, which one is better?
This used to be a limitation due to the way data had been stored in the back end, but it no longer applies (to the original extent, see jellycsc's comment below).
The reason for this recommendation was, that in the past Amazon Simple Storage Service (S3) partitioned data using the key. With many files having the same prefix (like e.g. all starting with the same year) this could have led to reduced performance when many files needed to be loaded from the same partition.
However, since 2018, hashing and random prefixing the S3 key is no longer required to see improved performance: https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance/
S3 creates so-called partitions under the hood in order to serve up your requests to the bucket. Each partition has the ability to serve 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second. They partition the bucket based on the common prefix among all the object keys. MM-DD-YYYY date format would be slightly faster than YYYY-MM-DD because objects with MM-DD-YYYY naming will spread across more partitions.
Key takeaway here: more randomness at the beginning of the object keys will likely give you more performance out of the S3 bucket.
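For what the older recommendation amounted to in practice (and what "randomness at the beginning of the key" means), here is a toy sketch that prepends a short hash to a date-based key; since the 2018 change this is usually unnecessary, and the names below are purely illustrative.

# Toy sketch: prepend a short, deterministic hash so keys fan out over many
# prefixes instead of all sharing one date prefix.
import hashlib

def sharded_key(original_key, shard_chars=2):
    # "2021-12-21/report.csv" -> "xx/2021-12-21/report.csv",
    # where xx is a stable 2-character hex hash of the key.
    digest = hashlib.md5(original_key.encode("utf-8")).hexdigest()
    return f"{digest[:shard_chars]}/{original_key}"

print(sharded_key("2021-12-21/report.csv"))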

Recommended way to read an entire table (Lambda, DynamoDB/S3)

I'm new to AWS and am working on a Serverless application where one function needs to read a large array of data. Never will a single item be read from the table, but all the items will routinely be updated by a schedule function.
What is your recommendation for the most efficient way of handling this scenario? My current implementation uses the scan operation on a DynamoDB table, but with my limited experience I'm unsure if this is going to be performant in production. Would it be better to store the data as a JSON file on S3 perhaps? And if so would it be so easy to update the values with a schedule function?
Thanks for your time.
PS: to give an idea of the size of the database, there will be ~1500 items, each containing an array of up to ~100 strings
It depends on the size of each item. Here's how:
First of all, with either DynamoDB or S3 you pay for two things (in your case*):
1- Requests per month
2- Storage per month
If you have small items, the first factor can be up to 577 times cheaper if you read items from DynamoDB instead of S3.
How: $0.01 per 1,000 requests for S3, compared to 5.2 million reads (up to 4 KB each) per month for DynamoDB. Plus you pay $0.01 per GB for data retrieval in S3, which should be added to that price. However, your writes into S3 will be free, while you pay for each write into DynamoDB (which is almost 4 times more expensive than reading).
However, if your items require many RCUs per read, S3 may be cheaper in this case.
And regarding the storage cost, S3 is cheaper, but again you should see how big your data will be, as you pay at most $0.023 per GB per month for S3 while you pay $0.25 per GB per month for DynamoDB, which is almost 10 times more expensive.
Conclusion:
If you have many requests and your items are small, it's easier and more straightforward to use DynamoDB, as you're not giving up any of the query functionality that DynamoDB offers and that you clearly won't have with S3. Otherwise, you can consider keeping pointers in DynamoDB to objects stored in S3.
(*) The costs for tags in S3 or indexes in DynamoDB are other factors to be considered if you need to use them.
Here is how I would do it:
Schedule Updates:
Lambda (to handle schedule changes) --> DynamoDB --> DynamoDB Stream --> Lambda (read the object if it exists, apply the changes to all objects and save to a single object in S3)
Read Schedule:
With Lambda, read the single object from S3 and serve all schedules or a single schedule depending on the request. You can check whether the object has been modified before the next read, so you don't have to read from S3 every time and can serve from memory.
Scalability:
If you want to scale, you need to split the object into chunks of a certain size so that you don't load objects exceeding 3 GB (the Lambda process memory size).
Hope this helps.
EDIT1:
When your serving Lambda cold starts, load the object from S3 first; after that, you can check S3 for an updated object (after a certain time interval or a certain number of requests) using the last-modified date attribute.
You can also keep that data in Lambda memory and serve from memory until the object is updated.
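Here is a minimal sketch of the serving side described above, with hypothetical bucket and key names; module-level variables survive between invocations of a warm Lambda, and the object's LastModified decides whether to re-download (you could also, as suggested, only check after a certain interval or number of requests).

# Minimal sketch of the serving Lambda. Bucket and key names are made up;
# module-level state survives between warm invocations.
import json

import boto3

s3 = boto3.client("s3")
BUCKET = "schedules-bucket"  # hypothetical
KEY = "schedules.json"       # hypothetical

_cache = {"last_modified": None, "data": None}

def handler(event, context):
    head = s3.head_object(Bucket=BUCKET, Key=KEY)
    if _cache["data"] is None or head["LastModified"] != _cache["last_modified"]:
        # Cold start or the object changed: download and cache it.
        body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
        _cache["data"] = json.loads(body)
        _cache["last_modified"] = head["LastModified"]
    # Serve either all schedules or a single one, depending on the request.
    schedule_id = (event or {}).get("schedule_id")
    if schedule_id is not None:
        return _cache["data"].get(schedule_id)
    return _cache["data"]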

Orphan management with AWS DynamoDB & S3 data spread across multiple items & buckets?

DynamoDB items are currently limited to a 400KB maximum size. When storing items longer than this limit, Amazon suggests a few options, including splitting long items into multiple items, splitting items across tables, and/or storing large data in S3.
Sounds OK if nothing ever failed. But what's a recommended approach to deal with making updates and deletes consistent across multiple DynamoDB items plus, just to make things interesting, S3 buckets too?
For a concrete example, imagine an email app with:
EmailHeader table in DynamoDB
EmailBodyChunk table in DynamoDB
EmailAttachment table in DynamoDB that points to email attachments stored in S3 buckets
Let's say I want to delete an email. What's a good approach to make sure that orphan data will get cleaned up if something goes wrong during the delete operation and data is only partially deleted? (Ideally, it'd be a solution that won't add additional operational complexity like having to temporarily increase the provisioned read limit to run a garbage-collector script.)
There are a couple of alternatives for your use case:
Use the DynamoDB transactions library that:
enables Java developers to easily perform atomic writes and isolated reads across multiple items and tables when building high scale applications on Amazon DynamoDB.
It is important to note that it requires 7N+4 writes, which'll be costly. So, go this route only if you require strong ACID properties, such as for banking or other monetary applications.
If you are okay with the DB being inconsistent for a short duration, you can perform the required operations one by one and mark the entire thing complete only at the end.
You could manage your deletion events with an SQS queue that supports exactly-once semantics and use that queue to start a Step Functions workflow that deletes the corresponding header, body chunks and attachments. In retrospect, the queue does not even need to be exactly-once, as you can just stop the workflow if the header does not exist.
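To make the last option concrete, here is a minimal sketch of an idempotent deletion step with hypothetical table, bucket and attribute names; because each step tolerates already-deleted data and the header is removed last, a failed run can simply be retried from the queue without leaving orphans.

# Minimal sketch of an idempotent deletion worker. Table, bucket and
# attribute names are hypothetical; query pagination is omitted for brevity.
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")

headers = dynamodb.Table("EmailHeader")        # hypothetical table names
chunks = dynamodb.Table("EmailBodyChunk")
attachments = dynamodb.Table("EmailAttachment")

def delete_email(email_id):
    # 1. Delete S3 attachment objects first (their pointers still exist if this fails).
    resp = attachments.query(KeyConditionExpression=Key("email_id").eq(email_id))
    for item in resp["Items"]:
        s3.delete_object(Bucket=item["s3_bucket"], Key=item["s3_key"])
        attachments.delete_item(
            Key={"email_id": email_id, "attachment_id": item["attachment_id"]}
        )
    # 2. Delete body chunks.
    resp = chunks.query(KeyConditionExpression=Key("email_id").eq(email_id))
    for item in resp["Items"]:
        chunks.delete_item(Key={"email_id": email_id, "chunk_id": item["chunk_id"]})
    # 3. Delete the header last; while it exists, a retry can still find the email.
    headers.delete_item(Key={"email_id": email_id})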