When using the AWS .NET SDK, what should be the lifecycle of client objects? - amazon-web-services

I have an application that queries some of my AWS accounts every few hours. Is it safe (from a memory and connection-count perspective) to create a new client object for every request? Since we need to sync almost all of the resource types for almost all of the regions, we end up with hundreds of clients (number of regions multiplied by number of resource types) per service run.

In general, creating AWS clients is pretty cheap and it is fine to create them and quickly dispose of them. The one area I would be careful with when it comes to performance is when the SDK has to resolve credentials, for example assuming IAM roles to get credentials. It sounds like in your case you are iterating through a bunch of accounts, so I'm guessing you are explicitly setting credentials, and that will be okay.
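For illustration, a minimal sketch of that pattern with the .NET SDK, assuming explicit credentials and EC2 as the service being synced (the key placeholders, region list and DescribeInstances call are examples, not from the question):

    using Amazon;
    using Amazon.EC2;
    using Amazon.EC2.Model;
    using Amazon.Runtime;

    // Placeholder credentials; in practice these come from your per-account configuration.
    var credentials = new BasicAWSCredentials("ACCESS_KEY_ID", "SECRET_ACCESS_KEY");

    foreach (var region in new[] { RegionEndpoint.USEast1, RegionEndpoint.EUWest1 })
    {
        // Clients are cheap to construct; disposing them releases the underlying HTTP resources.
        using (var ec2 = new AmazonEC2Client(credentials, region))
        {
            var reservations = await ec2.DescribeInstancesAsync(new DescribeInstancesRequest());
            // ... sync the resources for this region ...
        }
    }

If you do end up assuming roles instead, it helps to construct a single credentials object (e.g. AssumeRoleAWSCredentials) once and share it across all the clients, so the STS call and its caching happen in one place rather than per client.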

Related

How to configure automatic failover of storage account when using AzureWebJobsStorage for Web Job Timer Triggers

I have an Azure Web Job that runs on a TimerTrigger to put some messages on a Service Bus queue. I have deployed this Web Job on 2 separate regions, for high availability in case one region goes down. As per https://github.com/Azure/azure-webjobs-sdk-extensions/wiki/TimerTrigger, I can see that the distributed lock mechanism is working perfectly and the timer is only executing in one region at a time, so there are no duplicate requests coming through.
However, the Web Jobs in both the regions are using the same common storage account, and the storage account is deployed to just one region. I can't use 2 separate storage accounts, because then I lose on the distributed lock functionality. I know that Azure provides Geo-redundant storage for my storage account, so the data is replicated to a secondary region.
My question is - in the event of a disaster in one region (specifically the primary region of the storage account), is there a way to have the web job automatically fail over to the secondary endpoint? Right now, I have the "AzureWebJobsStorage" application setting specified to be one of the shared access keys of the storage account.
Appreciate any pointers!
I'm not an expert on the storage SDK, but I've linked a few docs that may help walk you through how to make your app highly available.
https://learn.microsoft.com/en-us/azure/storage/common/storage-disaster-recovery-guidance?toc=/azure/storage/blobs/toc.json
https://learn.microsoft.com/en-us/azure/storage/common/geo-redundant-design?tabs=legacy
https://learn.microsoft.com/en-us/azure/storage/blobs/storage-create-geo-redundant-storage?tabs=dotnet11
Since the caveat with geo-redundant storage is that the secondary is read-only until you initiate a failover, I did find the GeoRedundantSecondaryUri property on BlobClientOptions, which will use the secondary address as part of the retry policy.
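A rough sketch of wiring that up (the account URI and connection string are placeholders):

    using System;
    using Azure.Storage.Blobs;

    var options = new BlobClientOptions
    {
        // Read requests that fail against the primary are retried against this secondary endpoint.
        GeoRedundantSecondaryUri = new Uri("https://mystorageaccount-secondary.blob.core.windows.net")
    };

    var blobService = new BlobServiceClient("<storage-connection-string>", options);

Note this only covers reads; writes still go to the primary until the storage account itself is failed over.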

Access management for AWS-based client-side SDK

I'm working on client-side SDK for my product (based on AWS). Workflow is as follows:
User of SDK somehow uploads data to some S3 bucket
User somehow saves a command on some queue in SQS
One of the workers on EC2 polls the queue, executes the operation and sends a notification via SNS. This point seems to be clear.
As you might have noticed, there are quite a few unclear points about access management here. Is there any common practice to provide access to AWS services (S3 and SQS in this case) for third-party users of such an SDK?
Options which I see at the moment:
We create an IAM user for users of the SDK that has access to some S3 resources and write permission for SQS.
We create an additional server/layer between AWS and the SDK that writes messages to SQS on behalf of users, as well as provides a one-time, short-lived link for the SDK to write data directly to S3.
The first one seems to be OK; however, I'm worried that I'm missing some obvious issues here. The second one seems to have a problem with availability: if this layer goes down, the whole system won't work.
P.S.
I tried my best to explain the situation, however I'm afraid that question might still lack some context. If you want more clarification - don't hesitate to write a comment.
I recommend you look closely at Temporary Security Credentials in order to limit customer access to only what they need, when they need it.
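For example, a rough sketch of your backend vending short-lived, scoped-down credentials via STS (the user name, bucket, queue ARN and policy here are illustrative, not from the question):

    using Amazon.SecurityToken;
    using Amazon.SecurityToken.Model;

    using var sts = new AmazonSecurityTokenServiceClient();

    var response = await sts.GetFederationTokenAsync(new GetFederationTokenRequest
    {
        Name = "sdk-user-42",          // hypothetical per-user identifier
        DurationSeconds = 3600,        // short-lived
        Policy = @"{
          ""Version"": ""2012-10-17"",
          ""Statement"": [
            { ""Effect"": ""Allow"", ""Action"": ""s3:PutObject"",    ""Resource"": ""arn:aws:s3:::my-upload-bucket/sdk-user-42/*"" },
            { ""Effect"": ""Allow"", ""Action"": ""sqs:SendMessage"", ""Resource"": ""arn:aws:sqs:us-east-1:123456789012:my-command-queue"" }
          ]
        }"
    });

    // Hand response.Credentials (AccessKeyId, SecretAccessKey, SessionToken, Expiration) to the SDK user.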
Keep in mind with any solution to this kind of problem, it depends on your scale, your customers, and what you are ok exposing to your customers.
With your first option, letting the customer directly use IAM or temporary credentials exposes to them the knowledge that AWS is under the hood (since they can easily see requests leaving their system). It also has the potential for them to make their own AWS requests using those credentials, beyond what your code can validate and control.
Your second option is better since it addresses this - by making your server the only point of contact for AWS, it allows you to perform input validation, etc. before sending customer-provided data to AWS. It also lets you replace the implementation easily without affecting customers. As for availability/scalability concerns, that's what EC2 (and similar services) are for.
Again, all of this depends on your scale and your customers. For a toy application where you have a very small set of customers, simpler may be better for the purposes of getting something working sooner (rather than building & paying for a whole lot of infrastructure for something that may not be used).
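As a concrete sketch of the second option's "one-time short-lived link": the intermediate server can issue a pre-signed S3 PUT URL instead of handing out credentials at all (bucket and key names below are placeholders):

    using System;
    using Amazon.S3;
    using Amazon.S3.Model;

    var s3 = new AmazonS3Client();

    // The server generates a short-lived URL scoped to exactly one object.
    string uploadUrl = s3.GetPreSignedURL(new GetPreSignedUrlRequest
    {
        BucketName = "my-upload-bucket",
        Key = "sdk-user-42/payload-0001.json",
        Verb = HttpVerb.PUT,
        Expires = DateTime.UtcNow.AddMinutes(15)
    });

    // Return uploadUrl to the client; it can HTTP PUT the data without holding any AWS credentials.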

Amazon AWS: DynamoDB requirements

Objective: Using an iPhone app, I would like users to store objects in DynamoDB and have fine-grained access control for those objects using IAM with a TVM.
The objects will contain only strings, no images/file storage -- I'm thinking I won't need S3?
Question: Since there is no server-side application, do I still need an EC2 instance? Which suite of AWS services will I have to subscribe to in order to accomplish my objective?
You can use either DynamoDB or S3, and neither of them requires an EC2 instance - there is no dependency.
If it were me, I'd first see if I could get what I wanted done in S3 (because you mentioned it as a possibility), and then go to DynamoDB if I couldn't (e.g. if I wanted to be able to run aggregation queries across my data set). S3 will be cheaper and, depending on what you are doing, may even be faster, and it would let you easily distribute the stored data globally through CloudFront, which may be beneficial if you have a globally diverse user base.
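If you do go the DynamoDB route, the fine-grained access control piece is essentially an IAM policy with a dynamodb:LeadingKeys condition attached to the role (or TVM-vended identity) your app uses. A rough sketch, with a hypothetical table, role name and the Login with Amazon user-id variable as placeholders:

    using Amazon.IdentityManagement;
    using Amazon.IdentityManagement.Model;

    using var iam = new AmazonIdentityManagementServiceClient();

    // Restrict each caller to items whose partition key equals their federated user id.
    await iam.PutRolePolicyAsync(new PutRolePolicyRequest
    {
        RoleName = "MobileAppRole",          // hypothetical role assumed by the app's users
        PolicyName = "PerUserItemAccess",
        PolicyDocument = @"{
          ""Version"": ""2012-10-17"",
          ""Statement"": [{
            ""Effect"": ""Allow"",
            ""Action"": [""dynamodb:GetItem"", ""dynamodb:PutItem"", ""dynamodb:Query""],
            ""Resource"": ""arn:aws:dynamodb:us-east-1:123456789012:table/UserObjects"",
            ""Condition"": { ""ForAllValues:StringEquals"": { ""dynamodb:LeadingKeys"": [""${www.amazon.com:user_id}""] } }
          }]
        }"
    });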

can i meter or set a size limit to an s3 folder

I'd like to set up a separate s3 bucket folder for each of my mobile app users for them to store their files. However, I also want to set up size limits so that they don't use up too much storage. Additionally, if they do go over the limit I'd like to offer them increased space if they sign up for a premium service.
Is there a way I can set folder file size limits through S3 configuration or the API? If not, would I have to use the APIs somehow to calculate folder size on every upload? I know that there is the DevPay feature in Amazon, but it might be a hassle for users to sign up with Amazon if they just want to use a small amount of free space.
There does not appear to be a way to do this, probably at least in part because there is actually no such thing as "folders" in S3. There is only the appearance of folders.
Amazon S3 does not have concept of a folder, there are only buckets and objects. The Amazon S3 console supports the folder concept using the object key name prefixes.
— http://docs.aws.amazon.com/AmazonS3/latest/UG/FolderOperations.html
All of the keys in an S3 bucket are actually in a flat namespace, with the / delimiter used as desired to conceptually divide objects into logical groupings that look like folders, but it's only a convenient illusion. It seems impossible that S3 would have a concept of the size of a folder, when it has no actual concept of "folders" at all.
If you don't maintain an authoritative database of what's been stored by clients (which suggests that all uploads should pass through an app server rather than going directly to S3, which is the only approach that makes sense to me at all), then your only alternative is to poll S3 to discover what's there. An imperfect shortcut would be for your application to read the S3 bucket logs to discover what had been uploaded, but that is only provided on a best-effort basis. It should be reliable but is not guaranteed to be perfect.
This service provides a best effort attempt to log all access of objects within a bucket. Please note that it is possible that the actual usage report at the end of a month will slightly vary.
Your other option is to develop your own service that sits between users and Amazon S3, that monitors all requests to your buckets/objects.
— http://aws.amazon.com/articles/1109#13
Again, having your app server mediate all requests seems to be the logical approach, and would also allow you to detect immediately (as opposed to "discover later") that a user had exceeded a threshold.
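If you do end up polling, totalling a "folder" is just a prefixed listing. A rough sketch with the .NET SDK (bucket name and prefix are placeholders):

    using System.Linq;
    using Amazon.S3;
    using Amazon.S3.Model;

    var s3 = new AmazonS3Client();
    long totalBytes = 0;

    var request = new ListObjectsV2Request
    {
        BucketName = "my-app-bucket",
        Prefix = "users/42/"      // the conceptual "folder"
    };

    // Page through the listing and sum the object sizes under the prefix.
    ListObjectsV2Response response;
    do
    {
        response = await s3.ListObjectsV2Async(request);
        totalBytes += response.S3Objects.Sum(o => o.Size);
        request.ContinuationToken = response.NextContinuationToken;
    } while (!string.IsNullOrEmpty(response.NextContinuationToken));

    // Compare totalBytes against the user's quota before accepting further uploads.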
I would maintain a separate database in the cloud to hold each user's total storage usage count. It's easy to manage the count via S3 event notifications (e.g. ObjectCreated events), which can trigger a Lambda that in turn writes to a DB.
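A rough sketch of that idea, assuming an s3:ObjectCreated:* notification wired to a Lambda, a users/{id}/... key layout, and a hypothetical Usage table keyed by UserId:

    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Amazon.DynamoDBv2;
    using Amazon.DynamoDBv2.Model;
    using Amazon.Lambda.Core;
    using Amazon.Lambda.S3Events;

    [assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]

    public class UsageCounter
    {
        private static readonly AmazonDynamoDBClient Dynamo = new AmazonDynamoDBClient();

        // Invoked by the s3:ObjectCreated:* event notification on the bucket.
        public async Task Handler(S3Event s3Event, ILambdaContext context)
        {
            foreach (var record in s3Event.Records)
            {
                var key = record.S3.Object.Key;        // e.g. "users/42/photo.jpg"
                var userId = key.Split('/')[1];        // assumes the users/{id}/... layout
                var size = record.S3.Object.Size;

                // Atomically add the new object's size to the user's running total.
                await Dynamo.UpdateItemAsync(new UpdateItemRequest
                {
                    TableName = "Usage",
                    Key = new Dictionary<string, AttributeValue> { ["UserId"] = new AttributeValue { S = userId } },
                    UpdateExpression = "ADD TotalBytes :sz",
                    ExpressionAttributeValues = new Dictionary<string, AttributeValue> { [":sz"] = new AttributeValue { N = size.ToString() } }
                });
            }
        }
    }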

AWS IAM forced delay to start using a recently created user

I am facing a strange behavior in an application that uses AWS IAM to automatically create users and roles.
My sequence of operations is:
Send an action CreateUser;
Send an action CreateAccessKey for this created user;
Send an action GetUser for this created user to get the account id. I need to do this because I only have the root key and secret;
Send an action CreateRole, with an AssumeRolePolicyDocument where the Principal is this created user.
When I execute step 4, I receive a MalformedPolicyDocument (Invalid principal in policy: "AWS":"arn:aws:iam::123412341234:user/newuser").
But if I put a 15-second delay before step 4, it runs without any problem.
Is there any workflow where I don't need to stick with a fixed delay, like reading some IAM web service to check whether the user is ready to be used?
As outlined in my answer to Deterministically creating and tagging EC2 instances, the AWS APIs need to be generally treated as eventually consistent only.
Specifically, I mention that it is reasonable to assume that each and every single API action is operated entirely independently by AWS, i.e. each is a micro service on its own. This explains why, even within a service like Amazon EC2 or, in your case, AWS Identity and Access Management (IAM), a call to one API action that results in a resource state change isn't necessarily visible to (all) other API actions within that service right away - that's precisely what you are experiencing: even though the created user is already visible to one IAM API action (GetUser), it isn't yet visible to a different IAM API action (CreateRole).
The correct workflow to work around this inherent characteristic is to repeat the desired API call with an exponential backoff strategy until it succeeds (or the configured timeout is reached), which is good practice anyway in asynchronous communication scenarios. Several AWS SDKs now offer integrated support for retries with exponential backoff, which is usually applied transparently but can be tailored to specific scenarios if need be, e.g. to extend the default timeout for very-high-latency scenarios.
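A minimal sketch of that retry loop with the .NET SDK (the attempt cap, delays and role name here are arbitrary choices, not SDK defaults):

    using System;
    using System.Threading.Tasks;
    using Amazon.IdentityManagement;
    using Amazon.IdentityManagement.Model;

    using var iam = new AmazonIdentityManagementServiceClient();

    // Trust policy naming the freshly created user as Principal (values from the question).
    var assumeRolePolicyJson = @"{
      ""Version"": ""2012-10-17"",
      ""Statement"": [{ ""Effect"": ""Allow"",
                        ""Principal"": { ""AWS"": ""arn:aws:iam::123412341234:user/newuser"" },
                        ""Action"": ""sts:AssumeRole"" }]
    }";

    var request = new CreateRoleRequest
    {
        RoleName = "WorkerRole",                      // hypothetical role name
        AssumeRolePolicyDocument = assumeRolePolicyJson
    };

    for (var attempt = 0; ; attempt++)
    {
        try
        {
            await iam.CreateRoleAsync(request);
            break;                                    // succeeded
        }
        catch (MalformedPolicyDocumentException) when (attempt < 6)
        {
            // IAM hasn't propagated the new user yet; back off exponentially and retry.
            await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
        }
    }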