Load testing AWS SDK client - amazon-web-services

What is the recommended way to performance test AWS SDK clients? I'm basically just listing/describing resources and would like to see what happens when I query 10k objects. Does AWS provide some type of mock API, or do I really need to request 10k of each type of resource to do this?
I can of course mock in at least two levels:
SDK: I wrap the SDK with my own interfaces and create mocks. This doesn't exercise the SDK's JSON-to-object deserialization code, and my mocks affect the AppDomain with additional memory, garbage collection, etc.
REST API: As I understand it, the SDKs are just wrappers around the REST API (hence the HTTP response codes shown in the objects). It seems I can configure the SDK to go to custom endpoints.
This isolates the mocks from the main AppDomain and is more representative, but of course I'm still making some assumptions about response time, limits, etc.
Besides the above taking a long time to implement, I would like to make sure my code won't fail at scale, either locally or at AWS. The only way I see to guarantee that is creating (and paying for) the resources at AWS. Am I missing anything?

When you query 10k or more objects you'll have to deal with:
Pagination - the API usually returns only a limited number of items per call, providing NextToken for the next call.
Rate Limiting - if you hammer some AWS APIs too hard they'll rate limit you, which the SDK will probably report as some kind of Rate Limit Exceeded Exception.
Memory usage - hopefully you don't collect all the results in memory before processing. Process them as they arrive to conserve memory.
Other than that I don't see why it shouldn't work.
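The three points above can be sketched together in Python. This is only an illustration under stated assumptions: `fetch_page`, `ThrottledError`, `make_fake_fetch` and the `Items` key are hypothetical names, not part of any AWS SDK; only `NextToken` comes from the answer above. The generator follows the token, retries throttled calls with a randomized delay, and yields items one at a time so nothing accumulates in memory.

```python
import random
import time
from typing import Callable, Iterator, Optional


class ThrottledError(Exception):
    """Hypothetical stand-in for whatever throttling error the real SDK raises."""


def iter_items(fetch_page: Callable[[Optional[str]], dict],
               max_retries: int = 5,
               base_delay: float = 0.1,
               sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield items one by one, following NextToken and retrying throttled calls."""
    token = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(token)
                break
            except ThrottledError:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                # Randomized, exponentially growing wait before retrying.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
        yield from page["Items"]
        token = page.get("NextToken")
        if token is None:
            return  # last page reached


def make_fake_fetch(total: int, page_size: int, throttle_every: int = 0):
    """Build a fake paged API with `total` items; every Nth call is throttled."""
    calls = {"n": 0}

    def fetch_page(token: Optional[str]) -> dict:
        calls["n"] += 1
        if throttle_every and calls["n"] % throttle_every == 0:
            raise ThrottledError()
        start = int(token or 0)
        end = min(start + page_size, total)
        page = {"Items": [{"id": i} for i in range(start, end)]}
        if end < total:
            page["NextToken"] = str(end)
        return page

    return fetch_page
```

A fake like `make_fake_fetch(10_000, 100, throttle_every=7)` lets you exercise your own paging and retry logic at 10k items without creating real resources, though it says nothing about real AWS latency or limits.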
Update: Also check out Moto - the AWS mocking library (for Python) that can also run in a standalone mode for use with other languages. However, as with any mock, it may not behave 100% the same as the real thing, for instance around rate-limiting behaviour.

Related

Can AWS Lambda Replace an entire Rest Api layer in an enterprise web application

I am new to AWS and have been reading about AWS Lambda. It's very useful, but you still have to write individual Lambda functions rather than one application as a whole. I am wondering whether, in practice, AWS Lambda can replace an entire REST API layer in an enterprise web application.
Of course, everything is possible in the computer world, but the question you need to answer is whether Lambda and serverless are the best way for you.
For example, you need a smaller business flow per Lambda (Lambda has some hardware limits, and functions need short compute and startup times to save cost). That means you must split up your flows, and how well that works depends on your business area and implementation. Does your domain fit this? Lambda can handle almost everything in combination with other AWS services. To be honest, in some cases Lambda may be a bit harder to work with than your current system, and community support is smaller than for traditional systems, but it also has lots of advantages, as you know. You can check this repo, a fully serverless booking app, and this serverless e-commerce repo.
To sum up, if your team is ready for it, you can start by converting one part of your application and check that everything works. The answer depends entirely on your team and your business, because nothing is impossible; that's engineering.
That's just my opinion, since your question reads like one that calls for opinions.

Dummy API for a Django Test

I have a booking app that can deal with both local and remote API bookings. Our logic —for (eg) pricing and availability— follows two very different pathways. We obviously need to test both.
But running regular tests against a remote API is slow. The test environment provided manages a response in 2-17 seconds. It's not feasible to use this in my pre_commit tests. Even if they sped that up, it's never going to be fast and will always require a connection to pass.
But I still need to test our internal logic for API bookings.
Is there some way that, within a test runner, I can spin up a little webserver (quite separate from the Django website) that serves a reference copy of their API? I can then plug that into the models we're dealing with and query against that locally, at speed.
What's the best way to handle this?
Again, I need to stress that this reference API should not be part of the actual website. Unless there's a way of adding views that only apply at test-time. I'm looking for clean solutions. The API calls are pretty simple. I'm not looking for verification or anything like that here, just that bookings made against an API are priced correctly internally, handle availability issues, etc.
For your testing purposes, you can mock the functions that make the API calls.
You can read more here:
https://williambert.online/2011/07/how-to-unit-testing-in-django-with-mocking-and-patching/
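A minimal sketch of that mocking approach, using the standard library's `unittest.mock`. `RemoteBookingClient`, `get_availability`, `price_booking` and the response fields are hypothetical names invented for illustration; the real app's models and API wrapper will look different. In a Django project the body of `run_pricing_check` would normally live in a `TestCase` method.

```python
from unittest import mock


class RemoteBookingClient:
    """Hypothetical wrapper around the slow remote API."""

    def get_availability(self, room_id, date):
        raise RuntimeError("network call; should not run in fast tests")


def price_booking(client, room_id, date, base_price):
    """Hypothetical internal logic: reject unavailable rooms, add remote surcharge."""
    info = client.get_availability(room_id, date)
    if not info["available"]:
        raise ValueError("room not available")
    return base_price + info["surcharge"]


def run_pricing_check():
    client = RemoteBookingClient()
    canned = {"available": True, "surcharge": 15}  # reference response, made up here
    # Replace only the network-facing method; the pricing logic runs for real.
    with mock.patch.object(client, "get_availability", return_value=canned) as fake:
        total = price_booking(client, room_id=7, date="2024-01-01", base_price=100)
    fake.assert_called_once_with(7, "2024-01-01")
    return total
```

Because only the remote call is patched, the test needs no connection and no extra views in the website.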

How to relieve a rate-limited API?

We run a website which heavily relies on the Amazon Product Advertising API (APAA). What happens is that when we experience a sudden spike in users it happens that we hit the rate-limit and all functions relying on the APAA shut down for a while. What can we do so that doesn't happen?
So, obviously we have some basic caching in place, but the APAA doesn't allow us to cache data for a very long time, and APAA queries can vary a lot so there may not be any cached data at all to query.
I think that your only option is to retry the API calls until they work — but do so in a smart way. Unfortunately, that's what everybody that gets throttled does and AWS expects people to handle that themselves.
You can implement exponential backoff and add jitter to prevent clustered retries. AWS has a great blog post about solutions for this kind of problem: https://www.awsarchitectureblog.com/2015/03/backoff.html
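As a rough illustration of exponential backoff with jitter, here is a small Python sketch. `RateLimited` and `call_with_backoff` are hypothetical names, and the delay parameters are arbitrary assumptions, not values recommended by Amazon. Each retry waits a random time between zero and an exponentially growing ceiling, so that many throttled clients do not all retry at the same instant.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised when the API throttles a request."""


def call_with_backoff(func, max_attempts=6, base=0.5, cap=30.0,
                      sleep=time.sleep, rng=random.uniform):
    """Call func(), retrying on RateLimited with capped exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller decide what to do
            ceiling = min(cap, base * 2 ** attempt)
            sleep(rng(0, ceiling))  # jitter: random wait in [0, ceiling]
```

The `sleep` and `rng` parameters exist only so the behaviour can be tested without real waiting.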

Api Gateway, multiple lambda in the same JAR

I'm trying to deploy an API suite using API Gateway, with the code implemented in Java on Lambda. Is it OK to have many (related, of course) Lambdas in a single jar (which is what I'm planning to do), or is it better to create a single jar for each Lambda I want to deploy (this will become a mess very easily)?
This is really a matter of taste but there are a few things you have to consider.
First of all there are limitations to how big a single Lambda upload can be (50MB at time of writing).
Second, there is also a limit to the total size of all code that you upload (currently 1.5GB).
These limitations may not be a problem for your use case but are good to be aware of.
The next thing you have to consider is where you want your overhead.
Let's say you deploy a CRUD interface to a single Lambda and you pass an "action" parameter from API Gateway so that you know which operation you want to perform when you execute the Lambda function.
This adds a slight overhead to your execution as you have to route the action to the appropriate operation. This is likely a very fast routing but nevertheless, it adds CPU cycles to your function execution.
On the other hand, deploying the same jar over several Lambda functions will quickly get you closer to the limits I mentioned earlier, and it also adds administrative overhead in managing your Lambda functions as that number grows. They can of course be managed via CloudFormation or CLI scripts, but it will still add administrative overhead.
I wouldn't say there is a right and a wrong way to do this. Look at what you are trying to do, think about what you would need to manage the deployment and take it from there. If you get it wrong you can always start over with another approach.
Personally, I like very small service Lambdas that do internal routing and handle more than just a single operation, but are still very small and focused on a specific type of task, be it CRUD for a database table or managing a select few very closely related operations.
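The internal routing described above (a single function that receives an "action" parameter and dispatches to the right operation) might look roughly like this. The sketch is in Python for brevity even though the question is about Java; the same dispatch-table shape applies to a Java handler. The event layout (`action`, `payload`) and all function names are assumptions made for illustration.

```python
def create_item(payload):
    return {"status": "created", "item": payload}


def read_item(payload):
    return {"status": "found", "id": payload["id"]}


def delete_item(payload):
    return {"status": "deleted", "id": payload["id"]}


# Dispatch table: one dictionary lookup is the whole routing overhead.
ROUTES = {"create": create_item, "read": read_item, "delete": delete_item}


def handler(event, context=None):
    """Single entry point; routes on the 'action' field passed in by the gateway."""
    action = event.get("action")
    operation = ROUTES.get(action)
    if operation is None:
        return {"status": "error", "message": f"unknown action: {action!r}"}
    return operation(event.get("payload", {}))
```

For example, `handler({"action": "read", "payload": {"id": 3}})` dispatches to `read_item`.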
There's some nice advice on serverless.com.
As polythene says, the answer is "it depends". But they've listed the pros and cons for 4 ways of going about it:
Microservices Pattern
Services Pattern
Monolithic Pattern
Graph Pattern
https://serverless.com/blog/serverless-architecture-code-patterns/

Developing/Testing Twitter apps without slamming the API

I'm currently working on an app that works with Twitter, but while developing/testing (especially those parts that don't rely heavily on real Twitter data), I'd like to avoid constantly hitting the API or publishing junk tweets.
Is there a general strategy people use for taking it easy on the API (caching aside)? I was thinking of rolling my own library that would essentially intercept outgoing requests and return mock responses, but I wanted to make sure I wasn't missing anything obvious first.
I would probably start by mocking the specific parts of the API you need for your application. In fact, this may actually force you to come up with a cleaner design for your app, because it more or less requires you to think about your application in terms of "what" it should do rather than "how" it should do it.
For example, if you are using the Twitter Search API, your application most likely should not care whether or not you are using the JSON or the Atom format option. The ability to search Twitter using a given query and get results back represents the functionality you want, so you should mock the API at that level of abstraction. The output format is just an implementation detail.
By mocking the API in terms of functionality instead of in terms of low-level implementation details, you can ensure that the application does what you expect it to do, before you actually connect to Twitter for real. At that point, you've already verified that the app works as intended, so the only thing left is to write the code to make the REST requests and parse the responses, which should be fairly straightforward, so you probably won't end up hitting Twitter with a lot of junk data at that point.
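One way to picture mocking at the level of functionality is the following Python sketch. `TweetSearch`, `FakeTweetSearch` and `count_mentions` are hypothetical names invented for illustration; nothing here is a real Twitter client. The application code depends only on "search for a query, get results back", so the fake can stand in for it without any JSON or Atom parsing.

```python
from typing import List, Protocol


class TweetSearch(Protocol):
    """The 'what': search by query and get tweet texts back."""

    def search(self, query: str) -> List[str]: ...


class FakeTweetSearch:
    """In-memory stand-in; never touches the network."""

    def __init__(self, tweets: List[str]):
        self._tweets = tweets

    def search(self, query: str) -> List[str]:
        # Case-insensitive substring match over the canned tweets.
        return [t for t in self._tweets if query.lower() in t.lower()]


def count_mentions(searcher: TweetSearch, query: str) -> int:
    """Application logic that only cares about the abstraction, not the wire format."""
    return len(searcher.search(query))
```

A real implementation of `TweetSearch` that makes the REST requests can be written last and swapped in without touching `count_mentions`.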
Caching is probably the best solution. Besides that, I believe the API is limited to 100 requests per hour. So maybe write a function that counts each request and, as the count gets close to 100, throttles itself, for example by only pulling data on every 10th request. It wouldn't be a hard cutoff, more likely a gradient function that tapers off as you near the limit.
I've used Tweet#. It caches and should do everything you need, since it covers 100% of Twitter's API and then some...
http://dimebrain.com/2009/01/introducing-tweet-the-complete-fluent-c-library-for-twitter.html
Cache stuff in a database... If the cache is too old then request the latest data via the API.
Also think about getting your application account whitelisted; it will give you a limit of 20,000 API requests per hour versus the measly 100 (which is meant for a user, not an application).
http://twitter.com/help/request_whitelisting
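The caching advice above (serve from the cache unless the entry is too old, otherwise ask the API again) can be sketched as follows. This is a simplified illustration: it keeps entries in an in-memory dictionary rather than a database, and `TTLCache`, `fetch` and `max_age` are hypothetical names.

```python
import time


class TTLCache:
    """Serve cached values until they are older than max_age seconds."""

    def __init__(self, fetch, max_age, clock=time.monotonic):
        self._fetch = fetch        # function that actually calls the API
        self._max_age = max_age    # seconds an entry stays fresh
        self._clock = clock        # injectable clock, handy in tests
        self._entries = {}         # key -> (timestamp, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]        # fresh enough: no API request made
        value = self._fetch(key)   # missing or stale: refresh from the API
        self._entries[key] = (now, value)
        return value
```

Wrapping each API call this way means repeated requests for the same data within `max_age` cost nothing against the hourly limit.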