Objective: Using an iPhone app, I would like users to store objects in DynamoDB, with Fine-Grained Access Control for the objects using IAM with a TVM (Token Vending Machine).
The objects will contain only strings, no image or file storage, so I'm thinking I won't need S3?
Question: Since there is no server-side application, do I still need an EC2 instance? Which AWS services will I have to subscribe to in order to accomplish my objective?
You can use either DynamoDB or S3, and neither requires an EC2 instance; there is no dependency.
If it were me, I'd first see if I could get what I wanted done in S3 (because you mentioned it as a possibility), and then go to DynamoDB if I couldn't (i.e. if I wanted to be able to run aggregation queries across my data set). S3 will be cheaper and, depending on what you are doing, may even be faster. It would also let you distribute the stored data globally through CloudFront easily, which may be beneficial if you have a globally diverse user base.
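If you do go with DynamoDB, here is a minimal sketch of the kind of IAM policy a TVM could attach to the temporary credentials it hands out. It assumes the table's partition key holds the user's ID. The table ARN, the user ID, the action list and the function name are all made up for illustration and are not from the question.

```python
import json


def user_scoped_dynamodb_policy(table_arn: str, user_id: str) -> dict:
    """Build an IAM policy that limits a user to items whose partition key is user_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Scan is left out on purpose: it cannot be restricted by leading key.
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                    "dynamodb:DeleteItem",
                    "dynamodb:Query",
                ],
                "Resource": [table_arn],
                "Condition": {
                    # Only requests whose partition key equals this user's ID are allowed.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [user_id]
                    }
                },
            }
        ],
    }


# Hypothetical table ARN and user ID, for illustration only.
policy = user_scoped_dynamodb_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/UserObjects", "user-42"
)
print(json.dumps(policy, indent=2))
```

The `dynamodb:LeadingKeys` condition key is what makes the access control "fine-grained": the same table serves every user, but each set of credentials can only touch that user's own items.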
Related
My AWS Lambda function needs to access data that is updated every hour, and it is going to be called very often via an API. What is the most efficient and least expensive way to do this?
The hourly update is already handled by a batch Lambda function, but I don't know where to store the data.
How about writing the latest data to a "latest" bucket in Amazon S3 each time? Or, even though there may be a hot partition problem, how about storing it in Amazon DynamoDB, since access is simple? I also considered the gateway cache, refreshed every hour, but that comes at a cost. Please advise.
Since you mentioned the "least expensive way", I suggest using Amazon DynamoDB, because 25 GB of storage is free (always free, not just the 12-month free tier). Even if your data is larger than 25 GB, you can still use DynamoDB rather than other services such as RDS or S3, which come at a cost.
The simplest option would be to use AWS Systems Manager Parameter Store. It is secured via IAM and is a great way to share parameters between AWS Lambda functions.
If your data is too big to store in Parameter Store, then consider storing it in Amazon S3. It is easily accessible and low-cost.
If there are problems using these services, then you could look at using databases, but there is insufficient information in your question to make an appropriate recommendation.
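As a rough sketch of the S3 option: the hourly batch function overwrites one fixed key, and the API-facing function reads that same key. The key name `latest.json` and both function names are assumptions of mine. `s3_client` is expected to behave like `boto3.client("s3")`, passed in so the code can be tried without AWS access.

```python
import json

LATEST_KEY = "latest.json"  # hypothetical fixed key that always holds the newest data


def write_latest(s3_client, bucket: str, data: dict) -> None:
    """Called by the hourly batch function: overwrite the fixed key."""
    s3_client.put_object(
        Bucket=bucket,
        Key=LATEST_KEY,
        Body=json.dumps(data).encode("utf-8"),
        ContentType="application/json",
    )


def read_latest(s3_client, bucket: str) -> dict:
    """Called by the API-facing function: fetch and decode the newest data."""
    response = s3_client.get_object(Bucket=bucket, Key=LATEST_KEY)
    return json.loads(response["Body"].read())
```

Because the key never changes, readers need no lookup step to find the current data; they always fetch the same object.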
I have an application that queries some of my AWS accounts every few hours. Is it safe (from a memory and number-of-connections perspective) to create a new client object for every request? As we need to sync almost all of the resource types for almost all of the regions, we end up with about a hundred clients (number of regions multiplied by resource types) per service run.
In general, creating AWS clients is pretty cheap, and it is fine to create them and quickly dispose of them. The one area I would be careful with when it comes to performance is when the SDK has to resolve credentials, for example by assuming IAM roles. It sounds like in your case you are iterating through a bunch of accounts, so I'm guessing you are explicitly setting credentials, and that will be okay.
I have 8 different AWS Lambda functions that need to share some common data. (like common configuration for database, etc)
There is no in-built technique for sharing data between Lambda functions. Each function runs independently and there is no shared datastore.
You will need to use an external datastore -- that is, something outside of Lambda that can persist the data.
Some options include:
Amazon S3: You could store information in an S3 object, that is retrieved by your Lambda functions.
Amazon DynamoDB: A fully-managed NoSQL database that provides fast performance. Ideal if you are storing and retrieving a blob of data, such as a JSON object. Your Lambda function would access DynamoDB via standard API calls. For extreme performance, you could use DynamoDB Accelerator (DAX).
AWS Systems Manager Parameter Store: Provides secure, hierarchical storage for configuration data management and secrets management.
The above options are fully-managed services, so you don't need to run any extra infrastructure.
There are other options, such as Amazon ElastiCache, but they would require additional services to be running.
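For the Parameter Store option, a minimal sketch of how each function could load the shared configuration. It assumes all shared values live under one path prefix such as `/myapp`; that prefix and the function name are made up. `ssm_client` is expected to behave like `boto3.client("ssm")`.

```python
def load_shared_config(ssm_client, path: str) -> dict:
    """Return {name: value} for every parameter stored under the given path."""
    config = {}
    kwargs = {"Path": path, "Recursive": True, "WithDecryption": True}
    while True:
        page = ssm_client.get_parameters_by_path(**kwargs)
        for param in page["Parameters"]:
            # Strip the shared prefix so "/myapp/db/host" becomes "db/host".
            short_name = param["Name"][len(path):].lstrip("/")
            config[short_name] = param["Value"]
        token = page.get("NextToken")
        if not token:
            return config
        # Results are paginated; keep going until no token is returned.
        kwargs["NextToken"] = token
```

Calling this once outside the handler (at module import time) means a warm function reuses the loaded values instead of hitting Parameter Store on every invocation.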
Depending on the nature of configuration you can decide on using different storage options.
If the configuration doesn't change often and is mostly static, you can use the following options:
AWS Systems Manager Parameter Store
Embed configuration as a part of code
If it changes often, you can consider:
Amazon DynamoDB
Amazon S3
Note: Depending on your use case you can also consider these configurations as parameters to these Lambda functions.
I'll add my 2 cents:
Use a Lambda layer (50 MB) for shared data that persists across function invocations.
Use EFS (EC2, within a VPC) for persistent data shared across any AWS service.
Lambda's temporary storage (/tmp; 512 MB by default, recently extended to up to 10 GB) is ephemeral and should be relied on only for the current invocation.
I'm working on a client-side SDK for my product (based on AWS). The workflow is as follows:
User of SDK somehow uploads data to some S3 bucket
User somehow saves command on some queue in SQS
One of the workers on EC2 polls the queue, executes the operation and sends a notification via SNS. This point seems to be clear.
As you might have noticed, there are quite some unclear points about access management here. Is there any common practice to provide access to AWS services (S3 and SQS in this case) for 3rd-party users of such SDK?
Options which I see at the moment:
We create IAM users for users of the SDK, with access to some S3 resources and write permission for SQS.
We create an additional server/layer between AWS and the SDK, which writes messages to SQS on behalf of users and provides a one-time, short-lived link for the SDK to write data directly to S3.
The first one seems to be OK, but I'm worried that I'm missing some obvious issues here. The second one seems to have a problem with scalability: if this layer is down, the whole system won't work.
P.S.
I tried my best to explain the situation, however I'm afraid that question might still lack some context. If you want more clarification - don't hesitate to write a comment.
I recommend you look closely at Temporary Security Credentials in order to limit customer access to only what they need, when they need it.
Keep in mind with any solution to this kind of problem, it depends on your scale, your customers, and what you are ok exposing to your customers.
With your first option, letting the customer directly use IAM or temporary credentials exposes knowledge to them that AWS is under the hood (since they can easily see requests leaving their system). It has the potential for them to make their own AWS requests using those credentials, beyond what your code can validate & control.
Your second option is better since it addresses this: making your server the only point of contact for AWS lets you perform input validation etc. before sending customer-provided data to AWS. It also lets you replace the implementation easily without affecting customers. As for availability/scalability concerns, that's what EC2 (and similar services) are for.
Again, all of this depends on your scale and your customers. For a toy application where you have a very small set of customers, simpler may be better for the purposes of getting something working sooner (rather than building & paying for a whole lot of infrastructure for something that may not be used).
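To make the Temporary Security Credentials suggestion a bit more concrete, here is a sketch of a small backend function that issues short-lived credentials limited to one customer's upload prefix and the command queue. The `uploads/<customer_id>/` key layout, the function names and the 15-minute lifetime are assumptions of mine. `sts_client` is expected to behave like `boto3.client("sts")` created with long-term IAM user credentials, which `GetFederationToken` requires.

```python
import json


def customer_policy(bucket: str, queue_arn: str, customer_id: str) -> dict:
    """Allow uploads under the customer's own prefix and sends to one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Hypothetical key layout: uploads/<customer_id>/<file>
                "Resource": f"arn:aws:s3:::{bucket}/uploads/{customer_id}/*",
            },
            {
                "Effect": "Allow",
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
            },
        ],
    }


def issue_temporary_credentials(sts_client, bucket, queue_arn, customer_id):
    """Return short-lived credentials scoped to a single customer."""
    response = sts_client.get_federation_token(
        Name=customer_id[:32],  # STS limits the federated user name to 32 characters
        Policy=json.dumps(customer_policy(bucket, queue_arn, customer_id)),
        DurationSeconds=900,  # 15 minutes, the minimum STS accepts for this call
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken and Expiration.
    return response["Credentials"]
```

The effective permissions are the intersection of the calling IAM user's permissions and the policy passed in, so the credentials can never do more than the issuing user can.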
I'd like to set up a separate S3 bucket folder for each of my mobile app users for them to store their files. However, I also want to set up size limits so that they don't use up too much storage. Additionally, if they do go over the limit, I'd like to offer them increased space if they sign up for a premium service.
Is there a way I can set folder size limits through S3 configuration or the API? If not, would I have to use the APIs somehow to calculate folder size on every upload? I know that there is the DevPay feature in Amazon, but it might be a hassle for users to sign up with Amazon if they just want to use a small amount of free space.
There does not appear to be a way to do this, probably at least in part because there is actually no such thing as "folders" in S3. There is only the appearance of folders.
Amazon S3 does not have concept of a folder, there are only buckets and objects. The Amazon S3 console supports the folder concept using the object key name prefixes.
— http://docs.aws.amazon.com/AmazonS3/latest/UG/FolderOperations.html
All of the keys in an S3 bucket are actually in a flat namespace, with the / delimiter used as desired to conceptually divide objects into logical groupings that look like folders, but it's only a convenient illusion. It seems impossible that S3 would have a concept of the size of a folder, when it has no actual concept of "folders" at all.
If you don't maintain an authoritative database of what's been stored by clients (which suggests that all uploads should pass through an app server rather than going directly to S3, which is the only approach that makes sense to me at all), then your only alternative is to poll S3 to discover what's there. An imperfect shortcut would be for your application to read the S3 bucket logs to discover what had been uploaded, but that is only provided on a best-effort basis. It should be reliable but is not guaranteed to be perfect.
This service provides a best effort attempt to log all access of objects within a bucket. Please note that it is possible that the actual usage report at the end of a month will slightly vary.
Your other option is to develop your own service that sits between users and Amazon S3, that monitors all requests to your buckets/objects.
— http://aws.amazon.com/articles/1109#13
Again, having your app server mediate all requests seems to be the logical approach, and would also allow you to detect immediately (as opposed to "discover later") that a user had exceeded a threshold.
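For reference, the "poll S3 to discover what's there" alternative could look roughly like the sketch below. It assumes each user's objects share a key prefix such as `users/alice/`, which is my own convention for the example. `s3_client` is expected to behave like `boto3.client("s3")`.

```python
def prefix_usage_bytes(s3_client, bucket: str, prefix: str) -> int:
    """Sum the sizes of all objects whose key starts with the given prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    total = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is missing from the page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            total += obj["Size"]
    return total
```

Each call lists every object under the prefix, so this gets slower and more expensive as users accumulate files, which is one more reason to keep a running total on the app server instead.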
I would maintain a separate database in the cloud to hold each user's total storage usage. It's easy to manage the count via S3 event notifications, which can trigger a Lambda function that in turn writes to the DB.
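A rough sketch of the Lambda side of that idea: turn an S3 event notification into per-user byte counts that can then be added to the usage database. It assumes keys look like `<user_id>/<filename>`; that layout and the function name are my own. The database write is left out because it depends on the store you pick.

```python
from urllib.parse import unquote_plus


def usage_deltas(event: dict) -> dict:
    """Sum newly uploaded bytes per user from an S3 event notification payload."""
    deltas = {}
    for record in event.get("Records", []):
        # Only count uploads; delete events do not carry the object's size.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        obj = record["s3"]["object"]
        key = unquote_plus(obj["key"])  # keys arrive URL-encoded in notifications
        user_id = key.split("/", 1)[0]  # hypothetical layout: <user_id>/<filename>
        deltas[user_id] = deltas.get(user_id, 0) + obj.get("size", 0)
    return deltas
```

Two caveats with this approach: overwriting an existing key is counted as a fresh upload, and handling deletions needs the stored size of each object, so the database would have to track sizes per key rather than just one total per user.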