Cannot invoke more than 64 Lambda functions at one time? - amazon-web-services

I cannot invoke more than 64 synchronous lambda functions without getting an Ops Limit: 64 exception, and I have no idea why.
There is no mention of this in the Lambda limits documentation. In fact, it says you can have up to 1000 concurrent executions. The lambda scaling docs additionally state that temporary bursts up to 3k invocations are supported.
So why are my paltry 65 invocations causing Lambda to reject things?
Reproducing:
On my account I have a dead simple Lambda which sits for 5 seconds (to simulate work) and then returns the default json blob.
import json
import time
def lambda_handler(event, context):
    # TODO implement
    time.sleep(5)
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
Excluding the time.sleep call, this is the exact code generated when you create a new Python Lambda function. So, no weirdness going on here.
Invoking
Now to invoke it, I just submit the invocation task into a thread pool.
(defn invoke-python [_]
  (aws/invoke
    lambda
    {:op :Invoke
     :request {:FunctionName "ExampleSlowPythonFunction"
               :Payload (json/write-str payload)}}))
Here's all the call is doing. Just a straight invoke call to AWS. (The library here is Cognitect's AWS API. But it just defers down to the REST APIs, so shouldn't matter.)
The thread pool is just one of Java's executors. I hand it a size and the tasks, and it executes said tasks in a pool of size n.
(defn call-it-a-bunch
  [n tasks]
  (let [pool (Executors/newFixedThreadPool n)]
    (let [futures (.invokeAll pool tasks)]
      (.shutdown pool)
      (mapv #(.get %) futures))))
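For readers less familiar with Clojure's Java interop, the fixed-size pool pattern above can be sketched in Python with the standard library. This is only an analogue under assumptions: `fake_invoke` is a hypothetical stand-in for the Lambda call, and `call_it_a_bunch` merely mirrors the helper's name; neither is part of any AWS API.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_invoke(i):
    # Hypothetical stand-in for a synchronous Lambda invocation.
    time.sleep(0.05)
    return {"statusCode": 200, "task": i}

def call_it_a_bunch(n, tasks):
    # Run zero-argument callables on a pool of n threads and
    # return their results in submission order.
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [f.result() for f in futures]

# Each task is a callable; nothing runs until the pool calls it.
tasks = [lambda i=i: fake_invoke(i) for i in range(8)]
results = call_it_a_bunch(8, tasks)
```

In this sketch the tasks are callables, so the work happens on the pool's threads rather than at the moment the task list is built.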
64 invocations: no problem
(def sixty-four-invoke-tasks (map invoke-python (range 64)))
(call-it-a-bunch 64 sixty-four-invoke-tasks)
A-OK. No problemo.
65 invocations: PROBLEM
(def sixty-FIVE-invoke-tasks (map invoke-python (range 65)))
(call-it-a-bunch 65 sixty-FIVE-invoke-tasks)
I get an Ops limit reached: 64 error on that 65th request.
I have no other Lambdas running on my account. I've tried dialing up the reserved concurrency on the Python Lambda function to make doubly sure that the Lambda IS available.
However, the Ops limit error remains.
Why can I not invoke my function more than 64 times concurrently despite having a pool of 1000 concurrent executions available on my account?

I'm unable to reproduce your problem. I'm using regular Java. I have your exact Python Lambda and a Runnable:
import com.amazonaws.regions.Regions;
import com.amazonaws.services.lambda.AWSLambda;
import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
import com.amazonaws.services.lambda.model.InvokeRequest;
import com.amazonaws.services.lambda.model.InvokeResult;
public class LambdaThread implements Runnable {
    private String name;
    public LambdaThread(String name) {
        this.name = name;
    }
    @Override
    public void run() {
        System.out.println("starting thread " + name);
        Regions region = Regions.fromName("us-west-2");
        AWSLambda client = AWSLambdaClientBuilder.standard()
                .withRegion(region).build();
        InvokeRequest req = new InvokeRequest()
                .withFunctionName("test-load");
        InvokeResult result = client.invoke(req);
        System.out.println("result from thread " + name + " is " + new String(result.getPayload().array()));
    }
}
And a runner:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class LambdaRunner {
    public static void main(String[] argv) {
        long startTime = System.currentTimeMillis();
        ExecutorService executor = Executors.newFixedThreadPool(999);
        for (int i = 0; i < 999; i++) {
            Runnable worker = new LambdaThread("thread " + i);
            executor.execute(worker);
        }
        executor.shutdown();
        while (!executor.isTerminated()) {
            try { Thread.sleep(1000); } catch (InterruptedException ie) { /* ignored */ }
            System.out.println("waiting for thread termination...");
        }
        System.out.println("Finished all threads in " + (System.currentTimeMillis() - startTime) + "ms");
    }
}
To me, the wording of the error looks like it comes from the Clojure side rather than from AWS. What version of the Cognitect library are you using? I can't see this message in the master branch.
If I run this with say 2000 threads I get:
Exception in thread "pool-1-thread-1990"
com.amazonaws.services.lambda.model.TooManyRequestsException: Rate
Exceeded. (Service: AWSLambda; Status Code: 429; Error Code:
TooManyRequestsException; Request ID:
b7f1426b-419a-4d40-902d-e0ed306ff120)
Nothing related to ops.

Can someone please explain the proper usage of Timers and Triggers in Apache Beam?

I'm looking for examples of using triggers and timers in Apache Beam. I want to use processing-time timers to collect my data from Pub/Sub every 5 minutes, and processing-time triggers to process all the data collected over an hour together, in Python.
Please take a look at the following resources: "Stateful processing with Apache Beam" and "Timely (and Stateful) Processing with Apache Beam".
The first blog post is more general in how to handle states for context, and the second has some examples on buffering and triggering after a certain period of time, which seems similar to what you are trying to do.
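Since the question asks about Python, the buffer-then-flush idea from the second post can be illustrated in plain Python. This is a conceptual sketch only, not Beam code: `TimedBuffer`, `process` and `on_tick` are hypothetical names, and a real pipeline would rely on Beam's own state and timer primitives instead of passing `now` around by hand.

```python
class TimedBuffer:
    """Buffer elements and release them as one batch once `interval`
    seconds of processing time have passed since the first element."""

    def __init__(self, interval):
        self.interval = interval
        self.buffer = []
        self.deadline = None  # armed when the first element arrives

    def process(self, element, now):
        # Arm the timer only for the first element of a batch, so later
        # elements do not keep pushing the deadline back.
        if self.deadline is None:
            self.deadline = now + self.interval
        self.buffer.append(element)

    def on_tick(self, now):
        # Return the batch if the timer has fired, otherwise None.
        if self.deadline is not None and now >= self.deadline:
            batch, self.buffer = self.buffer, []
            self.deadline = None
            return batch
        return None

buf = TimedBuffer(interval=300)   # 5 minutes
buf.process("a", now=0)
buf.process("b", now=120)
early = buf.on_tick(now=299)      # timer has not fired yet
batch = buf.on_tick(now=300)      # timer fires, buffer is flushed
```

A second `TimedBuffer(interval=3600)` fed with the same elements would model the hourly output.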
A full example was requested. Here is what I was able to come up with:
PCollection<String> records =
    pipeline.apply(
        "ReadPubsub",
        PubsubIO.readStrings()
            .fromSubscription(
                "projects/{project}/subscriptions/{subscription}"));
TupleTag<Iterable<String>> every5MinTag = new TupleTag<>();
TupleTag<Iterable<String>> everyHourTag = new TupleTag<>();
PCollectionTuple timersTuple =
    records
        // A KV<> is required to use state. Keying by data is more appropriate than a hardcoded key.
        .apply("WithKeys", WithKeys.of(1))
        .apply(
            "Batch",
            ParDo.of(
                    new DoFn<KV<Integer, String>, Iterable<String>>() {
                      @StateId("buffer5Min")
                      private final StateSpec<BagState<String>> bufferedEvents5Min =
                          StateSpecs.bag();
                      @StateId("count5Min")
                      private final StateSpec<ValueState<Integer>> countState5Min =
                          StateSpecs.value();
                      @TimerId("every5Min")
                      private final TimerSpec every5MinSpec =
                          TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
                      @StateId("bufferHour")
                      private final StateSpec<BagState<String>> bufferedEventsHour =
                          StateSpecs.bag();
                      @StateId("countHour")
                      private final StateSpec<ValueState<Integer>> countStateHour =
                          StateSpecs.value();
                      @TimerId("everyHour")
                      private final TimerSpec everyHourSpec =
                          TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
                      @ProcessElement
                      public void process(
                          @Element KV<Integer, String> record,
                          @StateId("count5Min") ValueState<Integer> count5MinState,
                          @StateId("countHour") ValueState<Integer> countHourState,
                          @StateId("buffer5Min") BagState<String> buffer5Min,
                          @StateId("bufferHour") BagState<String> bufferHour,
                          @TimerId("every5Min") Timer every5MinTimer,
                          @TimerId("everyHour") Timer everyHourTimer) {
                        // MoreObjects is Guava's replacement for the removed Objects.firstNonNull.
                        int count5Min = MoreObjects.firstNonNull(count5MinState.read(), 0);
                        if (count5Min == 0) {
                          // Arm the timer only for the first element of each 5-minute batch.
                          every5MinTimer
                              .offset(Duration.standardMinutes(5))
                              .align(Duration.standardMinutes(5))
                              .setRelative();
                        }
                        // Record the element so later ones do not re-arm the timer.
                        count5MinState.write(count5Min + 1);
                        buffer5Min.add(record.getValue());
                        int countHour = MoreObjects.firstNonNull(countHourState.read(), 0);
                        if (countHour == 0) {
                          // Arm the timer only for the first element of each hourly batch.
                          everyHourTimer
                              .offset(Duration.standardMinutes(60))
                              .align(Duration.standardMinutes(60))
                              .setRelative();
                        }
                        countHourState.write(countHour + 1);
                        bufferHour.add(record.getValue());
                      }
                      @OnTimer("every5Min")
                      public void onTimerEvery5Min(
                          OnTimerContext context,
                          @StateId("buffer5Min") BagState<String> bufferState,
                          @StateId("count5Min") ValueState<Integer> countState) {
                        if (!bufferState.isEmpty().read()) {
                          context.output(every5MinTag, bufferState.read());
                          bufferState.clear();
                          countState.clear();
                        }
                      }
                      @OnTimer("everyHour")
                      public void onTimerEveryHour(
                          OnTimerContext context,
                          @StateId("bufferHour") BagState<String> bufferState,
                          @StateId("countHour") ValueState<Integer> countState) {
                        if (!bufferState.isEmpty().read()) {
                          context.output(everyHourTag, bufferState.read());
                          bufferState.clear();
                          countState.clear();
                        }
                      }
                    })
                .withOutputTags(every5MinTag, TupleTagList.of(everyHourTag)));
timersTuple
    .get(every5MinTag)
    .setCoder(IterableCoder.of(StringUtf8Coder.of()))
    .apply(<<do something every 5 min>>);
timersTuple
    .get(everyHourTag)
    .setCoder(IterableCoder.of(StringUtf8Coder.of()))
    .apply(<< do something every hour>>);
pipeline.run().waitUntilFinish();

Delaying actions using Decentraland's ECS

How do I make an action occur after a delay?
The setTimeout() function doesn’t work in Decentraland scenes, so is there an alternative?
For example, I want an entity to wait 300 milliseconds after it’s clicked before I remove it from the engine.
To implement this you’ll have to create:
A custom component to keep track of time
A component group to keep track of all the entities with a delay in the scene
A system that updates the timers on all these components on each frame.
It sounds rather complicated, but once you have created one delay, implementing another delay only takes one line.
The component:
@Component("timerDelay")
export class Delay implements ITimerComponent {
  elapsedTime: number;
  targetTime: number;
  onTargetTimeReached: (ownerEntity: IEntity) => void;
  private onTimeReachedCallback?: () => void
  /**
   * @param millisecs amount of time in milliseconds
   * @param onTimeReachedCallback callback for when time is reached
   */
  constructor(millisecs: number, onTimeReachedCallback?: () => void) {
    this.elapsedTime = 0
    this.targetTime = millisecs / 1000
    this.onTimeReachedCallback = onTimeReachedCallback
    this.onTargetTimeReached = (entity) => {
      if (this.onTimeReachedCallback) this.onTimeReachedCallback()
      entity.removeComponent(this)
    }
  }
}
The component group:
export const delayedEntities = engine.getComponentGroup(Delay)
The system:
// define system
class TimerSystem implements ISystem {
  update(dt: number) {
    for (let entity of delayedEntities.entities) {
      // Look up the Delay component defined above.
      let timerComponent = entity.getComponent(Delay)
      timerComponent.elapsedTime += dt
      if (timerComponent.elapsedTime >= timerComponent.targetTime) {
        timerComponent.onTargetTimeReached(entity)
      }
    }
  }
}
// instance system
engine.addSystem(new TimerSystem())
Once all these parts are in place, you can simply do the following to delay an execution in your scene:
const myEntity = new Entity()
myEntity.addComponent(new Delay(1000, () => {
log("time ran out")
}))
engine.addEntity(myEntity)
A few years late, but the OP's selected answer is kind of deprecated because you can accomplish a delay doing:
import { Delay } from "node_modules/decentraland-ecs-utils/timer/component/delay"
const ent = new Entity
ent.addComponent(new Delay(3 * 1000, () => {
// this code will run when time is up
}))
Read the docs.
Use the utils.Delay() function in the utils library.
This function just takes the delay time in milliseconds, and the function you want to execute.
Here's the full documentation, explaining how to add the library + how to use this function, including example code:
https://www.npmjs.com/package/decentraland-ecs-utils

RxJava - SwitchMap alike with multiple limited active streams

I'm wondering how to transform an observable similarly to switchMap, but with several (a limited number of) active streams instead of a single one.
The purpose is to have multiple tasks working concurrently up to some task count limit, and to admit new tasks with a FIFO strategy: any new task that arrives starts immediately, and the oldest running task is canceled.
switchMap creates an Observable for each emission of the source and cancels the previously running Observable as soon as a new one is created. I want something similar, but with some level of concurrency (like flatMap): an Observable is created for each emission and they run concurrently up to a concurrency limit. When the limit is reached, the oldest Observable is canceled and the new one is started.
This is also similar to flatMap with maxConcurrent, except that when maxConcurrent is reached, new Observables do not wait in a queue; instead the oldest Observable is canceled and the new one starts immediately.
You could try with this transformer:
public static <T, R> Observable.Transformer<T, R> switchFlatMap(
        int n, Func1<T, Observable<R>> mapper) {
    return f ->
        Observable.defer(() -> {
            final AtomicInteger ingress = new AtomicInteger();
            final Subject<Integer, Integer> cancel =
                    PublishSubject.<Integer>create().toSerialized();
            return f.flatMap(v -> {
                int id = ingress.getAndIncrement();
                Observable<R> o = mapper.call(v)
                        .takeUntil(cancel.filter(e -> e == id + n));
                cancel.onNext(id);
                return o;
            });
        });
}
The demonstration:
public static void main(String[] args) {
    PublishSubject<Integer> ps = PublishSubject.create();
    @SuppressWarnings("unchecked")
    PublishSubject<Integer>[] pss = new PublishSubject[3];
    for (int i = 0; i < pss.length; i++) {
        pss[i] = PublishSubject.create();
    }
    AssertableSubscriber<Integer> ts = ps
            .compose(switchFlatMap(2, v -> pss[v]))
            .test();
    ps.onNext(0);
    ps.onNext(1);
    pss[0].onNext(1);
    pss[0].onNext(2);
    pss[0].onNext(3);
    pss[1].onNext(10);
    pss[1].onNext(11);
    pss[1].onNext(12);
    ps.onNext(2);
    pss[0].onNext(4);
    pss[2].onNext(20);
    pss[2].onNext(21);
    pss[2].onNext(22);
    pss[1].onCompleted();
    pss[2].onCompleted();
    ps.onCompleted();
    ts.assertResult(1, 2, 3, 10, 11, 12, 20, 21, 22);
}
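The eviction policy described in the question (keep at most n inner streams, cancel the oldest when a new one arrives) can be modelled without Rx at all. The following is a rough sketch in plain Python, not a port of the transformer: `SwitchFlatMapLimiter`, `start` and `finish` are hypothetical names, and `cancel` is whatever callback actually stops a task.

```python
from collections import deque

class SwitchFlatMapLimiter:
    """Track up to `limit` active tasks; starting one more cancels the oldest."""

    def __init__(self, limit, cancel):
        self.limit = limit
        self.cancel = cancel      # called with the evicted task
        self.active = deque()     # oldest task sits on the left

    def start(self, task):
        evicted = None
        if len(self.active) == self.limit:
            # At capacity: drop the oldest task instead of queueing the new one.
            evicted = self.active.popleft()
            self.cancel(evicted)
        self.active.append(task)
        return evicted

    def finish(self, task):
        # A task that completes on its own frees its slot.
        if task in self.active:
            self.active.remove(task)

cancelled = []
limiter = SwitchFlatMapLimiter(2, cancelled.append)
for name in ["t0", "t1", "t2", "t3"]:
    limiter.start(name)
```

With a limit of 2, starting `t2` cancels `t0` and starting `t3` cancels `t1`, which matches the demonstration above where stream 0 stops emitting once stream 2 begins.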
Though a ready made solution is unavailable, something like below should assist.
public static void main(String[] args) {
    Observable.create(subscriber -> {
        for (int i = 0; i < 5; i++) {
            Observable.timer(i, TimeUnit.SECONDS).toBlocking().subscribe();
            subscriber.onNext(i);
        }
    })
    .switchMap(
        n -> {
            System.out.println("Main task emitted event - " + n);
            return Observable.interval(1, TimeUnit.SECONDS).take((int) n * 3)
                .doOnUnsubscribe(() -> System.out.println("Unsubscribed for main task event - " + n));
        }).subscribe(n2 -> System.out.println("\t" + n2));
    Observable.timer(20, TimeUnit.SECONDS).toBlocking().subscribe();
}
The Observable.create section creates a slow producer: it emits 0 immediately, sleeps for 1s and emits 1, sleeps for 2s and emits 2, and so on.
switchMap creates an Observable for each element, and each of these emits a number every second. Note that a line is printed every time the main Observable emits an element and also every time an inner Observable is unsubscribed.
So in your case, you may want to close the oldest task in doOnUnsubscribe. Hope it helps.
The pseudocode below might make this clearer.
getTaskObservable()
    .switchMap(
        task -> {
            System.out.println("Main task emitted event - " + task);
            return Observable.create(subscriber -> {
                initiateTaskAndNotify(task, subscriber);
            }).doOnUnsubscribe(() -> checkAndKillIfMaxConcurrentTasksReached(task));
        }).subscribe(value -> System.out.println("Done with task and got output" + value));

Schedule/batch for large number of webservice callouts?

I'm new to Apex and I have to call a webservice for every account (for some thousands of accounts).
Usually a single webservice request takes 500 to 5000 ms.
As far as I know schedulable and batchable classes are required for this task.
My idea was to group the accounts by country codes (Europe only) and start a batch for every group.
First batch is started by the schedulable class, next ones start in batch finish method:
global class AccValidator implements Database.Batchable<sObject>, Database.AllowsCallouts {
    private List<String> countryCodes;
    private Integer countryIndex;
    global AccValidator(List<String> countryCodes, Integer countryIndex) {
        this.countryCodes = countryCodes;
        this.countryIndex = countryIndex;
        ...
    }
    // Get Accounts for current country code
    global Database.QueryLocator start(Database.BatchableContext bc) {...}
    global void execute(Database.BatchableContext bc, list<Account> myAccounts) {
        for (Integer i = 0; i < this.AccAccounts.size(); i++) {
            // Callout for every Account
            HttpRequest request ...
            Http http = new Http();
            HttpResponse response = http.send(request);
            ...
        }
    }
    global void finish(Database.BatchableContext BC) {
        if (this.countryIndex < this.countryCodes.size() - 1) {
            // start next batch
            Database.executeBatch(new AccValidator(this.countryCodes, this.countryIndex + 1), 200);
        }
    }
    global static List<String> getCountryCodes() {...}
}
And my schedule class:
global class AccValidatorSchedule implements Schedulable {
    global void execute(SchedulableContext sc) {
        List<String> countryCodes = AccValidator.getCountryCodes();
        Id AccAddressID = Database.executeBatch(new AccValidator(countryCodes, 0), 200);
    }
}
Now I'm stuck with Salesforce's execution governors and limits:
For nearly all callouts I get the exceptions "Read timed out" or "Exceeded maximum time allotted for callout (120000 ms)".
I also tried asynchronous callouts, but they don't work with batches.
So, is there any way to schedule a large number of callouts?
Have you tried limiting the batch size passed to executeBatch to 100? Salesforce only allows 100 callouts per transaction, and each execute call is one transaction, i.e.
Id AccAddressID = Database.executeBatch(new AccValidator(countryCodes, 0), 100);
Perhaps this might help you:
https://salesforce.stackexchange.com/questions/131448/fatal-errorsystem-limitexception-too-many-callouts-101
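The numbers in the question also suggest that 100 may still be too large. Assuming the 120000 ms in the error message is a cumulative budget for all callouts made in one execute call (an assumption worth checking against the current Salesforce limits documentation), a rough back-of-the-envelope calculation looks like this. It is plain arithmetic in Python, not Apex, and `max_safe_scope` is a hypothetical helper name.

```python
def worst_case_total_ms(scope, worst_case_ms):
    # Total callout time if every request in the batch is as slow as possible.
    return scope * worst_case_ms

def max_safe_scope(worst_case_ms, callout_limit=100, time_budget_ms=120_000):
    # Largest batch size that stays within both the callout count limit
    # and the assumed cumulative time budget.
    by_time = time_budget_ms // worst_case_ms
    return max(1, min(callout_limit, by_time))

# Figures from the question: 500 to 5000 ms per request, batch size 200.
current = worst_case_total_ms(200, 5000)   # 1_000_000 ms, far over budget
slow = max_safe_scope(5000)                # 24
fast = max_safe_scope(500)                 # 100
```

Under these assumptions a batch size in the low twenties would keep even the slowest requests within the time budget, at the cost of more batch executions.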

Akka Actors logging processing time

I have a set of Akka Actors and I send a couple of hundred messages to each one of them. I want to track how much time each instance of that Actor took to process all the messages that it received. What I'm doing currently is to have a state in the Actor instance as:
var startTime
var firstCall
I set both the variables when the Actor instance is first called. Is there another way that I could use to track the processing time for my Actor instances? I want to avoid having a local state in my Actor instance.
This is a good use case for context.become.
Remember that a receive block in an Akka actor is just a PartialFunction[Any, Unit], so we can wrap that in another partial function. This is the same approach taken by Akka's built-in LoggingReceive.
class TimingReceive(r: Receive, totalTime: Long)(implicit ctx: ActorContext) extends Receive {
  def isDefinedAt(o: Any): Boolean = {
    r.isDefinedAt(o)
  }
  def apply(o: Any): Unit = {
    val startTime = System.nanoTime
    r(o)
    val newTotal = totalTime + (System.nanoTime - startTime)
    // Log the updated total; the logger is taken from the actor system.
    ctx.system.log.debug("Total time so far: " + newTotal + " nanoseconds")
    ctx.become(new TimingReceive(r, newTotal))
  }
}
object TimingReceive {
  def apply(r: Receive)(implicit ctx: ActorContext): Receive = new TimingReceive(r, 0)
}
Then you can use it like this:
class FooActor extends Actor {
  def receive = TimingReceive {
    case x: String => println("got " + x)
  }
}
After each message, the actor will log the time taken so far. Of course, if you want to do something else with that variable, you'll have to adapt this.
This approach doesn't measure the time the actor is alive of course, only the time taken to actually process messages. Nor will it be accurate if your receive function creates a future.
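The underlying idea, a handler that times each message and then replaces itself with a copy carrying the new total, is not specific to Akka. Here is a small language-neutral sketch in Python; `timing_receive` and the injectable `clock` parameter are hypothetical names chosen for illustration, and returning the next handler stands in for `context.become`.

```python
import time

def timing_receive(handler, total_ns=0, clock=time.perf_counter_ns):
    """Wrap `handler` so each call also yields the next behaviour,
    which carries the accumulated processing time without mutable state."""
    def receive(message):
        start = clock()
        handler(message)
        new_total = total_ns + (clock() - start)
        # Analogue of context.become: hand back a fresh wrapper
        # that remembers the updated total.
        return timing_receive(handler, new_total, clock), new_total
    return receive

# Deterministic demo with a fake clock that returns 0, 5, 10, 17 in turn.
ticks = iter([0, 5, 10, 17])
seen = []
receive = timing_receive(seen.append, clock=lambda: next(ticks))
receive, total_after_first = receive("a")    # took 5 - 0 = 5
receive, total_after_second = receive("b")   # took 17 - 10 = 7, total 12
```

As with the Scala version, only the time spent inside the handler is counted, so work handed off to another thread is not measured.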