Scheduling strategy behind AWS Batch - amazon-web-services

I am wondering what the scheduling strategy behind AWS Batch looks like. The official documentation on this topic doesn't provide much detail:
The AWS Batch scheduler evaluates when, where, and how to run jobs that have been submitted to a job queue. Jobs run in approximately the order in which they are submitted as long as all dependencies on other jobs have been met.
(https://docs.aws.amazon.com/batch/latest/userguide/job_scheduling.html)
"Approximately" FIFO is quite vague, especially since the execution order I observed when testing AWS Batch didn't look like FIFO.
Did I miss something? Is there a way to change the scheduling strategy, or to configure Batch to execute jobs in the exact order in which they were submitted?

I've been using Batch for a while now, and it has always seemed to behave in roughly a FIFO manner. Jobs that are submitted first will generally be started first, but because of limitations with distributed systems, this general rule won't work out perfectly. Jobs with dependencies are kept in the PENDING state until their dependencies have completed, and then they go into the RUNNABLE state. In my experience, whenever Batch is ready to run more jobs from the RUNNABLE state, it picks the job with the earliest time submitted.
However, there are some caveats. First, if Job A was submitted first but requires 8 cores while Job B was submitted later but only requires 4 cores, Job B might be selected first if Batch has only 4 cores available. Second, after a job leaves the RUNNABLE state, it goes into STARTING while Batch downloads the Docker image and gets the container ready to run. Depending on a number of factors, jobs that were submitted at the same time may spend more or less time in the STARTING state. Finally, if a job fails and is retried, it goes back into the PENDING state with its original submit time. When Batch next selects jobs to run, it will generally pick the job with the earliest submit time, which will be the job that failed. If other jobs started before the first job failed, the first job's second run will start after those jobs.
There's no way to configure Batch to be perfectly FIFO because it's a distributed system, but generally if you submit jobs with the same compute requirements spaced a few seconds apart, they'll execute in the same order you submitted them.
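To illustrate that last point, here's a minimal sketch of spacing out submissions with boto3. The queue and job definition names are placeholders, and the Batch client is passed in as a parameter (you'd pass `boto3.client("batch")`), so the helper itself doesn't assume an AWS account.

```python
import time


def submit_in_order(batch, names, queue="my-job-queue", jobdef="my-job-def",
                    vcpus=4, memory=2048, spacing=3.0):
    """Submit jobs with identical resource requirements, spaced a few
    seconds apart, so Batch's roughly-FIFO scheduler is likely to start
    them in submission order. `batch` is a boto3 Batch client."""
    job_ids = []
    for name in names:
        resp = batch.submit_job(
            jobName=name,
            jobQueue=queue,
            jobDefinition=jobdef,
            # Identical resources for every job, so an earlier job is
            # never skipped for fitting reasons (caveat #1 above).
            containerOverrides={"vcpus": vcpus, "memory": memory},
        )
        job_ids.append(resp["jobId"])
        time.sleep(spacing)  # space submissions a few seconds apart
    return job_ids
```

This doesn't make Batch strictly FIFO; it just avoids the known reorderings (resource fit and near-simultaneous submits).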

Related

Understanding Batch Job Behavior

If we create 100 batch jobs at a time and each job takes around 2 minutes, will the jobs run sequentially or in parallel? If sequentially, does the next batch job have to wait for the previous job to complete before it starts?

How to reduce the time taken by the glue etl job(spark) to actually start executing?

I want to start a Glue ETL job. The execution time itself is acceptable, but the time Glue takes to actually start executing the job is far too long.
I looked through various documentation and answers, but none of them offered a solution. There was some explanation of this behavior (cold start) but no solution.
I expect the job to be up ASAP; it sometimes takes around 10 minutes to start a job that executes in 2 minutes.
Unfortunately it's not possible right now. Glue uses EMR under the hood, and it requires some time to spin up a new cluster with the desired number of executors. As far as I know, they keep a pool of spare EMR clusters with the most common DPU configurations, so if you're lucky your job can get one and start immediately; otherwise it will wait.

How can aws boto3 submit a final batch job that depends on the completion of all previous jobs?

The boto3 documentation describes how to submit a dependsOn parameter, but a single job can only depend on the completion of a maximum of 20 jobs. How can I submit a job that depends on the completion of an arbitrarily large number of jobs? Can this be done by specifying the final job type as SEQUENTIAL? Or does this need to be done by creating a lower priority queue?
While AWS Batch does limit you to 20 arbitrary job dependencies (you can contact AWS to ask about raising the limit), they introduced array jobs in November 2017.
https://docs.aws.amazon.com/batch/latest/userguide/array_jobs.html
This is for when you want the same basic job step run a number of times (i.e., not totally arbitrary jobs). It takes that one job and can break it into up to 10,000 child jobs. Each child is given an index, so you could, for example, pass a large document and have each child work on a given page number.
Your next job step can then depend on that array job that spawned 2 to 10,000 children.
Check the documentation for details, especially since you can configure the dependencies in different ways.
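A rough sketch of this pattern with boto3 (the queue and job definition names are placeholders, and the Batch client is passed in so the helper can be exercised without AWS credentials): submit one array job, then a final job that depends on the parent array job's ID, which makes it wait for every child to finish.

```python
def fan_out_then_finalize(batch, queue="my-job-queue", jobdef="my-job-def",
                          size=100):
    """Submit an array job with `size` children (up to 10,000), then a
    final job that waits for all of them. `batch` is a boto3 Batch client."""
    array = batch.submit_job(
        jobName="process-pages",
        jobQueue=queue,
        jobDefinition=jobdef,
        # Each child container receives its index in the
        # AWS_BATCH_JOB_ARRAY_INDEX environment variable.
        arrayProperties={"size": size},
    )
    final = batch.submit_job(
        jobName="merge-results",
        jobQueue=queue,
        jobDefinition=jobdef,
        # Depending on the parent array job ID means the final job
        # waits for every child, not just one of the 20-dependency slots.
        dependsOn=[{"jobId": array["jobId"]}],
    )
    return array["jobId"], final["jobId"]
```

This keeps the final job within the dependency limit because it names a single parent job, regardless of how many children the array fans out to.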

How to reduce the initialisation and termination time in google dataflow job?

I'm currently working on a POC, primarily focusing on Dataflow for ETL processing. I created the pipeline using the Dataflow 2.1 Java Beam API; it takes about 3-4 minutes just to initialise and about 1-2 minutes to terminate on each run, while the actual transformation (ParDo) takes less than a minute. I tried running the jobs in several different ways:
Running the job on local machine
Running the job remotely on GCP
Running the job via Dataflow template
But all of the above methods take more or less the same time for initialization and termination. This is a bottleneck for the POC, as we intend to run hundreds of jobs every day.
I'm looking for a way to share the initialisation/termination time across all jobs so it becomes a one-time activity, or any other approach to reduce these times.
Thanks in advance!
From what I know, there is no way to reduce startup or teardown time. You shouldn't consider it a bottleneck, since each run of a job is independent of the last, so you can run jobs in parallel. You could also consider converting this to a streaming pipeline, if that's an option, to eliminate those times entirely.

Scaling Oozie Map Reduce Job: Does splitting into smaller jobs reduce overall runtime and memory usage?

I have an Oozie workflow that runs a Map-Reduce job within a particular queue on the cluster.
I have to add more input sources/clients to this job, so this job will be processing n times more data than what it does today.
My question is: if, instead of having one big job that processes all the data, I break it down into multiple jobs, one per source, will I reduce the total time the jobs take to complete?
I know MapReduce breaks a job down into smaller tasks and spreads them across the grid anyway, so one big job should behave the same as multiple small jobs.
Also, capacity allocation within the queue is done on a per-user basis [1], so no matter how many jobs are submitted under one user, the capacity allocated to that user stays the same. Or is there something I'm missing?
So will my jobs really run any faster if broken down into smaller jobs?
Thanks.
[1] https://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html#Resource+allocation