AWS Batch permits only 32 concurrent jobs in array configuration

I'm running some AI experiments that require multiple parallel runs in order to speed up the process.
I've built and pushed a container to ECR and I'm trying to run it with AWS Batch with an array size of 35. But only 32 jobs start immediately, while the last three remain in the RUNNABLE state and don't start until another job has finished.
I'm running Fargate Spot for cost-saving reasons with 1 vCPU and 8GB RAM.
I looked at the documentation, but there are no Service Quota limits to increase regarding the array size (the max seems to be 10k) in Fargate, ECS, or AWS Batch.
What could be the cause?

My bad. The limit is actually imposed by the compute environment associated with the jobs.
I'm answering my own question hoping to help somebody in the future.
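For anyone hitting the same wall, a minimal boto3 sketch of checking and raising that ceiling (the compute environment name is a placeholder):

    import boto3

    batch = boto3.client("batch")

    # Inspect the compute environment attached to the job queue.
    ce = batch.describe_compute_environments(
        computeEnvironments=["my-fargate-spot-ce"]
    )["computeEnvironments"][0]
    print("maxvCpus:", ce["computeResources"]["maxvCpus"])

    # Raise the ceiling so all 35 one-vCPU array children can run at once.
    batch.update_compute_environment(
        computeEnvironment="my-fargate-spot-ce",
        computeResources={"maxvCpus": 64},
    )

With 1 vCPU per job, maxvCpus effectively caps the number of concurrent array children.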

Related

Why won't my AWS Batch jobs all start and run in parallel?

hope someone can help me stop tearing my hair out!
I have a job with an array of ~700 indexes.
When I submit the job, I get no more than 20-30 running simultaneously.
They all run eventually, which leads me to assume it's a constraint elsewhere, and as all jobs are the same, it's not permissions/roles/connectivity.
They are array/index jobs with one job in the queue, and I can't find any limits on these types of jobs running.
Note: I'm using EC2 unmanaged, as the job was too big for Fargate.
I've tried:
double-checked they are parallel, not sequential
dropped individual CPU/memory for each job to 0.25 vCPU and 1GB memory
created 'huge' compute environments of max 4096 vCPUs - no desired or min
added up to 3 compute environments to a queue (as per the limit)
What am I missing? Hope someone can point me in a different direction.
thanks
Ben
Based on the comments.
The issue was caused by EC2 service limits. AWS Batch uses EC2 to run the jobs, and it will not launch more resources than those allowed by the EC2 limits. You can request an increase to the service quota of your Amazon EC2 resources to overcome the issue.
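A sketch of checking and requesting that quota with boto3 (the quota code shown is my assumption for the "Running On-Demand Standard instances" vCPU limit; list the quotas for service code "ec2" to confirm it, and the desired value is only an example):

    import boto3

    quotas = boto3.client("service-quotas")

    # Assumed quota code for "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances".
    resp = quotas.get_service_quota(ServiceCode="ec2", QuotaCode="L-1216C47A")
    print(resp["Quota"]["QuotaName"], resp["Quota"]["Value"])

    # Request an increase if the current vCPU quota caps your Batch concurrency.
    quotas.request_service_quota_increase(
        ServiceCode="ec2",
        QuotaCode="L-1216C47A",
        DesiredValue=512.0,
    )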

AWS Batch permits approx. 25 concurrent jobs in array configuration while the compute environment allows 256 CPUs

I am running a job array on AWS Batch using a Fargate Spot environment.
The main goal is to do some work as quickly as possible. So, when I run 100 jobs I expect that all of these jobs will run simultaneously.
But only approx. 25 of them start immediately; the rest of the jobs wait in the RUNNABLE status.
The jobs run on a compute environment with a maximum of 256 CPUs. Each job uses 1 CPU or even less.
I haven't found any limits or quotas that can influence the process of running jobs.
What could be the cause?
I've talked with AWS Support and they advised me not to use Fargate when I need to process a lot of jobs as quickly as possible.
For large-scale job processing, an On-Demand solution is recommended.
So, after changing the provisioning model to On-Demand, the number of concurrent jobs grew up to the CPU limit set in the compute environment, which was exactly what I needed.
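For reference, a minimal boto3 sketch of such an On-Demand (EC2) managed compute environment; all names, subnets, and role ARNs are placeholders, and the specific settings are assumptions rather than the poster's exact configuration:

    import boto3

    batch = boto3.client("batch")

    batch.create_compute_environment(
        computeEnvironmentName="ondemand-ce",
        type="MANAGED",
        computeResources={
            "type": "EC2",                    # On-Demand EC2 instead of FARGATE_SPOT
            "allocationStrategy": "BEST_FIT_PROGRESSIVE",
            "minvCpus": 0,
            "maxvCpus": 256,                  # ceiling on concurrently running vCPUs
            "instanceTypes": ["optimal"],
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroupIds": ["sg-0123456789abcdef0"],
            "instanceRole": "arn:aws:iam::123456789012:instance-profile/ecsInstanceRole",
        },
        serviceRole="arn:aws:iam::123456789012:role/AWSBatchServiceRole",
    )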

ECS starting tasks sequentially though resources are available

In our ECS cluster setup with an ASG capacity provider, we have 5 EC2 instances and each instance can take around 20 tasks. So overall there are resources available to run 100 tasks. Now, if we submit a service with 100 tasks, even though there are enough resources, not all tasks are started in parallel. I see tasks coming up in batches of size 20 with a gap of 10 seconds between each batch. I observed this from the ECS service event logs. Is there any configuration we can tweak to achieve complete parallelism?
This behavior is due to artificially controlled throughput (expressed in tasks per second - TPS) that the ECS service control plane imposes. There is a bursting concept in there (which is why you see a batch of tasks being launched and then a delay of a few seconds). These limits exist to avoid being throttled in other parts of the service surface. They can be lifted if there is a strong need, but the engineering team will need to validate the use case and expectations (see the point about potentially hitting other limits). The best way to address this is by opening a ticket with AWS Support and exploring your alternatives (based on your requirements).

Problems with Memory and CPU limits in AWS ECS cluster running on reserved EC2 instance

I am running an ECS cluster that currently has 3 services running on a T3 medium instance. Each of those services runs only one task, which has a soft memory limit of 1GB; the hard limit is different for each (but that should not be the problem). I will always have enough memory to run one newly deployed task (the new one will also take 1GB, and the T3 medium can handle it since it has 4GB total). After the new task is up and running, the old one will be stopped and I will again have 1GB free for the next deployment. I did something similar with the CPU (2048 CPU units, each task has 512, and 512 free for new deployments).
So everything runs fine now, but I am not completely satisfied with this setup for the future. What will happen if I need to add another service with another task? I would need to redeploy all existing tasks and modify their task definitions to use less CPU and memory in order to fit the new task (and its deployments). I am planning to get a reserved EC2 instance, so it will not be easy to swap the current EC2 instance for a larger one.
Is there a way to spin up another EC2 instance in the same ECS cluster to handle bursts in my tasks? Deployments are also an issue: it's not ideal to be able to deploy only one task at a time and then wait for the old one to be killed before deploying the next one, all without downtime.
And my biggest concern: if I need a new service and task, I again have to adjust all the others in order to run and deploy the new one, which is not very maintainable. What if I cannot lower CPU and memory any further because I have already reached the lowest point at which the tasks run smoothly?
I was thinking about having another EC2 instance in the same cluster that would handle bursts, deployments, and new services/tasks, but I'm not sure if that's possible or if it's the best way of doing this. I was also thinking about Fargate, but it is much more expensive and I cannot afford it for now. What do you think? Any ideas, suggestions, and hints will be helpful since I am desperate to find the best way to avoid the problems mentioned above.
Thanks in advance!
So unfortunately, there is no out-of-the-box solution to ensure that all your tasks run on the minimum possible number of instances (i.e. one). You can use our new feature called Capacity Providers (CP), which will allow you to ensure the minimum number of EC2 instances required to run all your tasks. The major difference between a CP and a plain ASG is that a CP gives more weight to task placement (whereas an ASG scales in/out based on resource utilization, which isn't ideal in your case).
However, it's not an ideal solution. Just as you said in your comment, when the service needs to scale out during a deployment, the CP will spin up another instance, the new task will be placed on it, and once it reaches the RUNNING state, the old task will be stopped.
But now you have an "extra" EC2 instance because there is no way to replace a running task. The only way I can think of would be to use a Lambda function that drains the new instance, which will move all the service tasks to the other instance. The CP will terminate this instance after about 15 minutes, as there are no tasks running on it.
A couple of caveats:
CPs are new, a little rough around the edges, and you can't delete or modify them; you can only create or deactivate them.
A CP needs an underlying ASG, and they must have a 1-1 relationship.
Make sure to enable managed scaling when creating the CP.
Choose a 100% capacity target.
Don't forget to add a default capacity provider strategy for the cluster.
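A minimal boto3 sketch covering those last three caveats (cluster, CP, and ASG names/ARNs are placeholders):

    import boto3

    ecs = boto3.client("ecs")

    # Create the capacity provider with managed scaling enabled and a 100% target.
    ecs.create_capacity_provider(
        name="my-cp",
        autoScalingGroupProvider={
            "autoScalingGroupArn": "arn:aws:autoscaling:us-east-1:123456789012:autoScalingGroup:uuid:autoScalingGroupName/my-asg",
            "managedScaling": {"status": "ENABLED", "targetCapacity": 100},
            "managedTerminationProtection": "DISABLED",
        },
    )

    # Attach it to the cluster and make it the default capacity provider strategy.
    ecs.put_cluster_capacity_providers(
        cluster="my-cluster",
        capacityProviders=["my-cp"],
        defaultCapacityProviderStrategy=[{"capacityProvider": "my-cp", "weight": 1}],
    )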
Minimizing EC2 instances used:
If you're using a capacity provider, the 'binpack' placement strategy minimises the number of EC2 hosts that are used.
However, there are some scale-in scenarios where you can end up with a single task running on its own EC2 instance. As Ali mentions in their answer; ECS will not replace this running task, but depending on your setup, it may be fairly easy for you to replace it yourself by configuring your task to voluntarily 'quit'.
In my case; I always have at least 2 tasks running per service. So I just added some logic to my tasks' healthchecks, so they report as unhealthy after ~6 hours. ECS will spot the 'unhealthy' task, remove it from the load balancer, and spin up a replacement (according to the binpack strategy).
Note: If you take this approach; add some variation to your timeout so you're less likely to have all of your tasks expire at the same time. Something like: expiry = now + timedelta(hours=random.uniform(5.5,6.5))
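A short sketch of that healthcheck (Flask is my assumption here; any HTTP framework or container HEALTHCHECK script works the same way):

    import random
    from datetime import datetime, timedelta

    from flask import Flask

    app = Flask(__name__)

    # Stagger expiries so all tasks don't recycle at the same time.
    EXPIRY = datetime.utcnow() + timedelta(hours=random.uniform(5.5, 6.5))

    @app.route("/health")
    def health():
        if datetime.utcnow() > EXPIRY:
            return "expired", 503  # load balancer marks the task unhealthy and drains it
        return "ok", 200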
Sharing memory 'headspace' with soft-limits:
If you set both soft and hard memory limits, ECS will place your tasks based on the soft limit. If your tasks' memory usage varies with load, it's fairly easy to get your EC2 instance to start swapping.
For example: say you have a task defined with a soft limit of 900MB and a hard limit of 1800MB. You spin up a service with 4 running instances, and ECS provisions all 4 of them on a single t3.medium. Notice that each instance thinks it can safely use up to 1800MB, when in fact there's very little free memory on the host server. When you hit your service with some traffic, each task tries to use some more memory, and your t3.medium is incapacitated as it starts swapping memory to disk. ECS does not recover from this type of failure very well. It notices that the task instances are no longer available and will attempt to provision replacements, but the capacity provider is very slow to replace the swapping t3.medium.
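For illustration, this is roughly how the two limits appear in a container definition; the family, image, and numbers are placeholders mirroring the example above:

    import boto3

    ecs = boto3.client("ecs")

    ecs.register_task_definition(
        family="my-service",
        requiresCompatibilities=["EC2"],
        containerDefinitions=[
            {
                "name": "app",
                "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/app:latest",
                "memoryReservation": 900,  # soft limit (MiB) - used for task placement
                "memory": 1800,            # hard limit (MiB) - container is killed above this
                "essential": True,
            }
        ],
    )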
My suggestion:
Configure your service to auto-scale based on memory usage (this will be a percentage of your soft-limit), for example: a target memory usage of 70%
Configure your tasks' healthchecks so that they report as unhealthy when they are nearing their soft-limit. This way, your tasks still have some headroom for quick spikes of memory usage, while giving your load balancer a chance to drain and gracefully replace tasks that are getting greedy. This is fairly easy to do by reading the value within /sys/fs/cgroup/memory/memory.usage_in_bytes.
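A sketch of that check (cgroup v1 path as referenced above; the soft limit and threshold values are assumptions):

    # Report unhealthy when the container nears its soft memory limit.
    SOFT_LIMIT_BYTES = 900 * 1024 * 1024  # matches the task's memoryReservation
    THRESHOLD = 0.9                       # flag at 90% of the soft limit

    def memory_ok() -> bool:
        with open("/sys/fs/cgroup/memory/memory.usage_in_bytes") as f:
            used = int(f.read().strip())
        return used < SOFT_LIMIT_BYTES * THRESHOLD

The healthcheck endpoint can combine this with the time-based expiry shown earlier and return 503 when either check fails.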

Running steps of EMR in parallel

I am running a Spark job on an EMR cluster. The issue I am facing is that all the EMR jobs triggered are executed as steps (in a queue).
Is there any way to make them run in parallel?
If not, is there any alternative for that?
Elastic MapReduce comes by default with a YARN setup that is very "step" oriented, with a single CapacityScheduler queue that has 100% of the cluster resources assigned. Because of this configuration, any time you submit a job to an EMR cluster, YARN maximizes the cluster usage for that single job, granting all available resources to it until it finishes.
Running multiple concurrent jobs in an EMR cluster (or any other YARN-based Hadoop cluster, in fact) requires a proper YARN setup with multiple queues to properly grant resources to each job. YARN's documentation is quite good about all of the Capacity Scheduler features, and it is simpler than it sounds.
YARN's FairScheduler is quite popular but it uses a different approach and may be a bit more difficult to configure depending on your needs. Given the simplest scenario where you have a single Fair queue, YARN will try to grant containers to waiting jobs as soon as they are freed by running jobs, ensuring that all the jobs submitted to a cluster get at least a fraction of compute resources as soon as they are available.
If you are concerned about YARN jobs sitting in a queue (submitted by Spark), there are multiple ways to run the jobs in parallel.
By default, EMR uses the YARN CapacityScheduler with the DefaultResourceCalculator and has one single default queue where all YARN jobs are submitted. Since there is only one queue, the number of YARN jobs that you can run (not just submit) in parallel really depends on the number of AMs, mappers, and reducers that your EMR cluster supports in parallel.
For example: you have a cluster that can run at most 10 mappers in parallel (see AWS EMR Parallel Mappers?).
Suppose you submit 2 map-only jobs, each requiring 10 mappers, one after another. The first job takes up all the mapper container capacity and runs, while the second waits in the queue for containers to free up. This behavior is the same for AMs and reducers as well.
Now, to make jobs run in parallel in spite of that limitation on the number of containers the cluster supports:
Keeping the Capacity Scheduler, you can create multiple queues, configuring a capacity percentage and a max capacity for each queue, so that a job in the first queue cannot use up all containers even if it needs them. You can then submit your second job to the second queue, which has its own pre-determined capacity.
Alternatively, you can use the Fair Scheduler by configuring yarn-site.xml. The Fair Scheduler allows you to configure queues and share resources across those queues fairly. You might also use the preemption option of the Fair Scheduler.
Note that the choice of which option to go with really depends on your use case and business needs. It is important to learn about all the options and their possible impact.
https://www.safaribooksonline.com/library/view/hadoop-the-definitive/9781491901687/ch04.html
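A hedged sketch of the multi-queue Capacity Scheduler option, expressed as an EMR configuration passed at cluster launch with boto3 (queue names, percentages, instance types, and roles are illustrative only):

    import boto3

    emr = boto3.client("emr")

    configurations = [
        {
            "Classification": "capacity-scheduler",
            "Properties": {
                "yarn.scheduler.capacity.root.queues": "default,adhoc",
                "yarn.scheduler.capacity.root.default.capacity": "60",
                "yarn.scheduler.capacity.root.default.maximum-capacity": "80",
                "yarn.scheduler.capacity.root.adhoc.capacity": "40",
                "yarn.scheduler.capacity.root.adhoc.maximum-capacity": "80",
            },
        }
    ]

    emr.run_job_flow(
        Name="parallel-yarn-jobs",
        ReleaseLabel="emr-6.15.0",
        Applications=[{"Name": "Spark"}],
        Instances={
            "MasterInstanceType": "m5.xlarge",
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        Configurations=configurations,
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )

Jobs submitted with spark.yarn.queue=adhoc (or mapreduce.job.queuename=adhoc) then land in the second queue and run alongside jobs in the default queue.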
Amazon EMR now supports the ability to run multiple steps in parallel. The number of steps allowed to run at once is configurable and can be set when a cluster is launched and at any time after the cluster has started.
Please see this announcement for more details: https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-emr-now-allows-you-to-run-multiple-steps-in-parallel-cancel-running-steps-and-integrate-with-aws-step-functions/.
Just adding updated information. EMR supports parallel steps:
https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-emr-now-allows-you-to-run-multiple-steps-in-parallel-cancel-running-steps-and-integrate-with-aws-step-functions/
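For completeness, a minimal boto3 sketch of turning this on for a running cluster (the cluster ID is a placeholder):

    import boto3

    emr = boto3.client("emr")

    emr.modify_cluster(
        ClusterId="j-EXAMPLECLUSTERID",
        StepConcurrencyLevel=10,  # number of steps allowed to run at once
    )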