Which is faster in AWS Lambda: a network call or a child process?

I have two pieces of code:
Lambda A: (written in Python)
Lambda B: (written in Node.js)
Scenario 1:
Lambda A calls Lambda B with some payload and waits for the output from Lambda B.
Lambda B, as part of its logic, makes API calls and returns data. I invoke Lambda B using boto3 (InvocationType: RequestResponse).
Scenario 2:
I create a zip file containing both pieces of code and create a single Lambda function from it. In the Python code, I invoke the Node.js code using subprocess.call().
Can anyone tell me which approach is faster, and what the pros and cons of each are (w.r.t. billed duration, execution time, scalability, etc.)?
As I understand it, the cons of each approach are:
Scenario 1:
Because of the synchronous network call, I am billed for both functions for the duration of the call.
There is some network overhead.
Scenario 2:
Subprocess creation overhead.

The answer here boils down to "benchmark it."
The process creation overhead, itself, should be minimal, but the overhead of starting up the Node child could be a performance killer.
The reason centers around container reuse.
When a Node Lambda function is invoked for the first time, then finishes, the container and the process inside it remain on a warm standby for the next invocation. When that happens, your process is already running, and the handler function is invoked in a matter of microseconds. There is no time required to set up the container and start the process and run through any initialization code on that second invocation.
This means that, in scenario 1, the time for the function to get started is minimized. The overhead is how long it takes for the caller to make the request to Lambda and for Lambda to return the response, once available. In between those two things, there is very little time.
By contrast, if you spin up a child process with each request in scenario 2, you have all of that initialization overhead with each request.
I recently had occasion to run some code in Lambda that was in a language Lambda doesn't support, called by a Lambda function written in Node.js. I did this with a child process, but with a twist: the child process reads from STDIN and writes to STDOUT, for IPC to and from the JS code. I can then send a "request" to the child process, and an event is triggered when the child writes the response.
So, the child is started from Node, with its controlling Node object in a global variable, only if not already present... but it is likely to be already present, again, due to container reuse.
In Node/Lambda, setting context.callbackWaitsForEmptyEventLoop to false allows Lambda to consider the invocation finished even if the event loop is still busy, which means I can leave that child process running across invocations.
With this mechanism in place, I achieve best-case runtimes of under 3 milliseconds per Lambda invocation when the container is reused. For each new container, the first initialization of that child process takes in excess of 1000 ms. The 3 ms time is doubtless better than I could achieve by calling a second Lambda function from inside the first one, but the savings come from keeping the inner process alive while the container remains alive.
Since your outer function is Python, it's not clear to me exactly what the implications are for you, or how useful this might be, but I thought it might serve to illustrate the value of keeping your child process alive between invocations.
But start with what you have, and benchmark both of your scenarios, multiple times, to ensure that any longer-than-expected runtimes aren't an artifact of new container creation.

Related

What is the timeout time for simple caching in AWS Lambda?

So I was looking for caching solutions for my AWS Lambda functions and found something called "simple caching". It fits perfectly for what I want, since my data doesn't change frequently. However, one thing I was unable to find is the timeout for this cache. When is the data refreshed by the function, and is there any way I can control it?
An example of the code I am using for the function:
let cachedValue;

module.exports.handler = function(event, context, callback) {
  console.log('Starting Lambda.');

  if (!cachedValue) {
    console.log('Setting cachedValue now...');
    cachedValue = 'Foobar';
  } else {
    console.log('Cached value is already set: ', cachedValue);
  }

  callback(null, cachedValue); // return the (possibly cached) value
};
What you're doing here is taking advantage of a side effect of container reuse. There is no lower or upper bound for how long such values will persist, and no guarantee that they will persist at all. It's a valid optimization to use, but it's entirely outside your control.
Importantly, you need to be aware that this stores the value in one single container. It lives for as long as the Node process in the container is alive, and it is accessible whenever a future invocation of the function reuses that process in that container.
If you have two or more invocations of the same function running concurrently, they will not be in the same container, and they will not see each other's global variables. This doesn't make it an invalid technique, but you need to be aware of that fact. The /tmp/ directory will exhibit very similar behavior, which is why you need to clean that up when you use it.
If you throw an exception, the process (and possibly the container) will be destroyed; either way, the cached values will be gone on the next invocation, since there's only one Node process per container.
If you don't invoke the function at all for some undefined/undocumented number of minutes, the container will be released by the service, and the cache goes away with it.
Re-deploying the function will also clear this "cache," since a new function version won't reuse containers from older function versions.
It's a perfectly valid strategy as long as you recognize that it is a feature of a black box with no user-serviceable parts.
See also https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/ -- a post that is several years old but still accurate.

Spawn a new thread as soon as another has finished

I have an expensive function that needs to be executed 1000 times. Execution can take between 5 seconds and 10 minutes, so the run time varies a lot.
I'd like to have multiple threads working on it. My current implementation divides these 1000 calls into 4 batches of 250 calls and spawns 4 threads. However, if one thread has a "bad day", it takes much longer to finish than the other 3 threads.
Hence, I'd like to make a new call to the function whenever a thread has finished its previous call, until all 1000 calls have been made.
I think a thread pool would work, but if possible I'd like a simple method (as little additional code as possible). A task-based design also goes in this direction (I think). Is there an easy solution for this?
Initialize a semaphore with 1000 units. Have each of the 4 threads loop around a semaphore wait() and the work function.
All the threads will then work on the function until it has been executed 1000 times. Even if three of the threads get stuck and take ages, the fourth will handle the other 997 calls.
[Edit]
Meh.. apparently, the standard C++11 library does not include semaphores. A semaphore is, however, a basic OS synchronization primitive and so should be easy enough to call, e.g. via POSIX.
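As a rough sketch of the idea, under the assumption that a mutex-protected counter is an acceptable stand-in for a real semaphore (std::counting_semaphore only arrived in C++20), and with expensive_function as a hypothetical placeholder for the real work:

#include <mutex>
#include <thread>
#include <vector>

// Tiny counting "semaphore" for C++11: try_acquire() hands out one unit
// per call until the initial count is exhausted, then reports failure.
class work_counter {
public:
    explicit work_counter(int count) : count_(count) {}
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(m_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }
private:
    std::mutex m_;
    int count_;
};

void expensive_function() { /* stand-in for the real 5 s - 10 min work */ }

int main() {
    work_counter units(1000);             // one unit per required call
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        workers.emplace_back([&units] {
            // Each thread keeps pulling units until none are left, so a
            // slow call only delays that one thread, not a fixed batch.
            while (units.try_acquire())
                expensive_function();
        });
    }
    for (auto& t : workers) t.join();
}

A plain std::atomic<int> counter would do the same job here; the semaphore framing just matches the description above.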
You can use one of the reference implementations of Executors and then call the function via:
#include <experimental/thread_pool>

using std::experimental::post;
using std::experimental::thread_pool;

thread_pool pool_{1};

void do_big_task()
{
    const int n = 1000; // number of calls to make
    for (int i = 0; i < n; ++i)
    {
        post(pool_, [=]
        {
            // do your work here;
        });
    }
}
Executors are coming in C++17 so I thought I would get in early.
Or if you want to try another flavour of executors then there is a more recent implementation with a slightly different syntax.
Given that you have already been able to segment the calls into separate units of work for the threads to handle, one approach is to use std::packaged_task (with its associated std::future) to represent each function call, and place the tasks in a queue of some sort. Each thread can then pick up packaged tasks from the queue and process them.
You will need to lock the queue for concurrent access, and there may be some bottlenecking here, but compared to the concern that a thread can have "a bad day", this should be minimal. This is effectively a thread pool, but it gives you some control over the execution of the tasks.
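A minimal sketch of that approach (not a production-quality pool; expensive_function is a hypothetical stand-in for the real work): all 1000 calls are packaged up front, their futures are kept for collecting results, and four workers drain the queue.

#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

int expensive_function(int input) { return input; } // stand-in for the real work

int main() {
    std::queue<std::packaged_task<int()>> tasks;
    std::vector<std::future<int>> results;
    std::mutex queue_mutex;

    // Package all 1000 calls up front and keep their futures for later.
    for (int i = 0; i < 1000; ++i) {
        std::packaged_task<int()> task([i] { return expensive_function(i); });
        results.push_back(task.get_future());
        tasks.push(std::move(task));
    }

    // Four workers drain the queue; a thread that finishes early simply
    // takes the next task, so a slow call never holds up a fixed batch.
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        workers.emplace_back([&] {
            for (;;) {
                std::packaged_task<int()> task;
                {
                    std::lock_guard<std::mutex> lock(queue_mutex);
                    if (tasks.empty()) return;
                    task = std::move(tasks.front());
                    tasks.pop();
                }
                task(); // runs the call and stores the result in its future
            }
        });
    }
    for (auto& t : workers) t.join();

    // results[i].get() now returns the value of the i-th call.
}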
Another alternative is to use std::async and specify its launch policy as std::launch::async. The disadvantage is that you do not control the thread creation itself, so you are dependent on how efficiently your standard library manages threads relative to how many cores you have.
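A minimal sketch of that variant (again with a hypothetical expensive_function); note that std::launch::async gives each call its own thread, which is exactly the lack of control mentioned above:

#include <future>
#include <vector>

int expensive_function(int input) { return input; } // stand-in for the real work

int main() {
    std::vector<std::future<int>> futures;
    futures.reserve(1000);
    // std::launch::async forces each call onto a new thread; the library,
    // not your code, decides how those threads compete for the cores.
    for (int i = 0; i < 1000; ++i)
        futures.push_back(std::async(std::launch::async,
                                     [i] { return expensive_function(i); }));
    for (auto& f : futures)
        f.get(); // blocks until that particular call has finished
}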
Either approach would work; the key is to measure the performance of both over a reasonable sample size. Measure both time and resource use (threads, and keeping the cores busy). Most OSes include ways of measuring the resource usage of a process.

Grand Central Dispatch, dispatch queue: enumerate array and create many concurrent tasks for each element

So, I have an array (a fetch result from Core Data).
I want to iterate over it and, for each element of the array (a URL and other data), create a new parallel (network) task.
Only after all of these tasks have completed do I need to start another parallel task.
How can I do this? A serial queue? A dispatch queue?
A fun way that I have done this is to use dispatch_group_t. You can say:
dispatch_group_t myGroup = dispatch_group_create();
Then, for each operation you need to track, call:
dispatch_group_enter(myGroup);
Inside each completion block, call this as the last line:
dispatch_group_leave(myGroup);
Finally, call:
dispatch_group_notify(myGroup, dispatch_get_main_queue(), ^{
    // kick off tasks after the number of dispatch_group_leave(myGroup) calls
    // equals the number of dispatch_group_enter(myGroup) calls
});
This was a challenging problem for me in an app I am currently working on and this worked like a charm. Some things to note: Make sure you don't call enter more than leave otherwise notify will never be called. Also, your application will crash if you call leave too many times as you will be referencing a group that has been already notified and therefore released. I usually prevent this by calling enter inside a separate loop before kicking off the network tasks that have the leave calls in their completion. This may not be necessary, but it makes me feel more secure that all my enters are called before any leaves, therefore the number of enters and leaves are never equal before the last completion. Cheers!
Use enumerateObjectsWithOptions:usingBlock: and pass option NSEnumerationConcurrent.

Why should I use std::async?

I'm trying to explore all the options of the new C++11 standard in depth. While using std::async and reading its definition, I noticed two things, at least under Linux with GCC 4.8.1:
it's called async, but it has really "sequential behaviour": basically, on the line where you call get() on the future associated with your async function foo, the program blocks until the execution of foo is completed.
it depends on exactly the same external library as other, better, non-blocking solutions, namely pthread: if you want to use std::async you need pthread.
At this point it's natural to ask why I should choose std::async over even a simple set of functors. It's a solution that doesn't scale at all: the more futures you use, the less responsive your program will be.
Am I missing something? Can you show an example that is guaranteed to be executed in an async, non-blocking way?
it's called async, but it has really "sequential behaviour"
No, if you use the std::launch::async policy then it runs asynchronously in a new thread. If you don't specify a policy it might run in a new thread.
basically, on the line where you call get() on the future associated with your async function foo, the program blocks until the execution of foo is completed
It only blocks if foo hasn't completed, but if it was run asynchronously (e.g. because you use the std::launch::async policy) it might have completed before you need it.
it depends on exactly the same external library as other, better, non-blocking solutions, namely pthread: if you want to use std::async you need pthread
Wrong, it doesn't have to be implemented using Pthreads (and on Windows it isn't, it uses the ConcRT features.)
At this point it's natural to ask why I should choose std::async over even a simple set of functors.
Because it guarantees thread-safety and propagates exceptions across threads. Can you do that with a simple set of functors?
It's a solution that doesn't scale at all: the more futures you use, the less responsive your program will be.
Not necessarily. If you don't specify the launch policy then a smart implementation can decide whether to start a new thread, or return a deferred function, or return something that decides later, when more resources may be available.
Now, it's true that with GCC's implementation, if you don't provide a launch policy then with current releases it will never run in a new thread (there's a bugzilla report for that) but that's a property of that implementation, not of std::async in general. You should not confuse the specification in the standard with a particular implementation. Reading the implementation of one standard library is a poor way to learn about C++11.
Can you show an example that is guaranteed to be executed in an async, non-blocking way?
This shouldn't block:
auto fut = std::async(std::launch::async, doSomethingThatTakesTenSeconds);
auto result1 = doSomethingThatTakesTwentySeconds();
auto result2 = fut.get();
By specifying the launch policy you force asynchronous execution, and if you do other work while it's executing then the result will be ready when you need it.
If you need the result of an asynchronous operation, then you have to block, no matter what library you use. The idea is that you get to choose when to block, and, hopefully when you do that, you block for a negligible time because all the work has already been done.
Note also that std::async can be launched with policies std::launch::async or std::launch::deferred. If you don't specify it, the implementation is allowed to choose, and it could well choose to use deferred evaluation, which would result in all the work being done when you attempt to get the result from the future, resulting in a longer block. So if you want to make sure that the work is done asynchronously, use std::launch::async.
I think your problem is with std::future saying that it blocks on get. It only blocks if the result isn't already ready.
If you can arrange for the result to be already ready, this isn't a problem.
There are many ways to know that the result is already ready. You can poll the future and ask it (relatively simple); you could use locks or atomic data to relay the fact that it is ready; you could build up a framework to deliver "finished" future items into a queue that consumers can interact with; or you could use signals of some kind (which is just blocking on multiple things at once, or polling).
Or, you could finish all the work you can do locally, and then block on the remote work.
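For the polling option, a future can be asked whether its result is ready without blocking. A small sketch, assuming fut is a std::future<int> obtained from std::async:

#include <chrono>
#include <future>

bool result_is_ready(std::future<int>& fut) {
    // A zero-length wait queries the future's status without blocking.
    return fut.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

(Note that a future holding a deferred task reports std::future_status::deferred instead.)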
As an example, imagine a parallel recursive merge sort. It splits the array into two chunks, then does an async sort on one chunk while sorting the other chunk. Once it is done sorting its half, the originating thread cannot progress until the second task is finished. So it does a .get() and blocks. Once both halves have been sorted, it can then do a merge (in theory, the merge can be done at least partially in parallel as well).
This task behaves like a linear task to those interacting with it on the outside -- when it is done, the array is sorted.
We can then wrap this in a std::async task and have a future sorted array. If we want, we could add a signalling procedure to let us know that the future is finished, but that only makes sense if we have a thread waiting on the signals.
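A minimal sketch of that merge-sort shape (not a tuned implementation; the grain size and the unconditional std::launch::async are arbitrary choices for illustration):

#include <algorithm>
#include <future>
#include <iterator>
#include <vector>

template <typename Iter>
void parallel_merge_sort(Iter first, Iter last, std::size_t grain = 10000) {
    auto size = static_cast<std::size_t>(std::distance(first, last));
    if (size <= grain) {                  // small ranges: plain sequential sort
        std::sort(first, last);
        return;
    }
    Iter mid = std::next(first, size / 2);
    // One half is handed to an async task while this thread sorts the other;
    // get() blocks only once there is no local work left to do.
    auto left = std::async(std::launch::async,
                           [=] { parallel_merge_sort(first, mid, grain); });
    parallel_merge_sort(mid, last, grain);
    left.get();
    std::inplace_merge(first, mid, last);  // merge the two sorted halves
}

int main() {
    std::vector<int> data(1 << 20);
    for (std::size_t i = 0; i < data.size(); ++i)
        data[i] = static_cast<int>(data.size() - i);
    parallel_merge_sort(data.begin(), data.end());
}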
In the reference: http://en.cppreference.com/w/cpp/thread/async
If the async flag is set (i.e. policy & std::launch::async != 0), then async executes the function f on a separate thread of execution as if spawned by std::thread(f, args...), except that if the function f returns a value or throws an exception, it is stored in the shared state accessible through the std::future that async returns to the caller.
It is a nice property that exceptions thrown are recorded and delivered through the future.
http://www.cplusplus.com/reference/future/async/
there are three types of policy:
launch::async
launch::deferred
launch::async | launch::deferred
By default, launch::async | launch::deferred is passed to std::async.

thread-safe function pointers in C++

I'm writing a network library that a user can pass a function pointer to for execution on certain network events. In order to keep the listening loop from holding up the developer's application, I pass the event handler to a thread. Unfortunately, this creates a bit of a headache for handling things in a thread-safe manner. For instance, if the developer passes a function that makes calls to their Windows::Forms application's elements, then an InvalidOperationException will be thrown.
Are there any good strategies for handling thread safety?
Function pointers cannot be thread-safe by themselves, as they merely declare a point to call; they are just pointers.
Your code always runs in the thread it was called from (via the function pointer).
What you want to achieve is that your code runs in a specific thread (maybe the UI thread).
For this you must use some kind of queue to synchronize the invocation onto the main thread.
This is exactly what .NET's BeginInvoke()/Invoke() on a Form do. The queue in that case is (somewhere deep inside the .NET framework) the Windows message queue.
But you can use any other queue, as long as the "correct" thread reads and executes the call requests from that queue.
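As a rough sketch of such a queue in plain C++ (independent of .NET; the CallQueue name and the stop mechanism are made up for illustration): the network thread only posts the handler, and the one thread that calls run() is the one that actually executes it.

#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

class CallQueue {
public:
    // May be called from any thread (e.g. the network/listening thread).
    void post(std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(fn));
        }
        cv_.notify_one();
    }
    void stop() { post([this] { stopped_ = true; }); }
    // Called only on the thread that should execute the handlers.
    void run() {
        while (!stopped_) {
            std::function<void()> fn;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return !q_.empty(); });
                fn = std::move(q_.front());
                q_.pop();
            }
            fn(); // the handler runs here, on the run() thread
        }
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    bool stopped_ = false; // only touched from the run() thread
};

int main() {
    CallQueue queue;
    std::thread network([&queue] {
        // Instead of invoking the user's handler directly, hand it over.
        queue.post([] { std::puts("handler running on the main thread"); });
        queue.stop();
    });
    queue.run(); // drains handlers until stop() has been processed
    network.join();
}

The same idea is what a UI framework's message loop gives you for free; here the queue is just made explicit.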