Specifying Share Identifier in EventBridge rule for an AWS Batch job

I am writing a CloudFormation template for an AWS Batch job triggered by an EventBridge rule. However, I am getting the following error:
shareIdentifier must be specified. (Service: AWSBatch; Status Code: 400; Error Code: ClientException;
I cannot find any documentation on how to pass a shareIdentifier to my Batch job. How can I add it to my EventBridge rule's CloudFormation template?
I have tried passing it as the Input variable:
Input: |
  {
    "shareIdentifier": "mid"
  }
This is not picked up. I have also tried passing shareIdentifier/ShareIdentifier directly in the BatchParameters, but this was an unrecognised key.
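For reference, this is roughly the equivalent of what I was configuring, expressed as a boto3 sketch (all names and ARNs are placeholders); the Batch target parameters simply don't expose a share identifier field, and the Input document isn't forwarded as one either:

import json
import boto3

events = boto3.client("events")

# Placeholder names/ARNs. BatchParameters has no ShareIdentifier key,
# and the Input document is not translated into a shareIdentifier for SubmitJob.
events.put_targets(
    Rule="my-batch-rule",
    Targets=[{
        "Id": "batch-job-target",
        "Arn": "arn:aws:batch:eu-west-1:123456789012:job-queue/my-queue",
        "RoleArn": "arn:aws:iam::123456789012:role/events-batch-role",
        "BatchParameters": {
            "JobDefinition": "my-job-definition",
            "JobName": "my-job",
        },
        "Input": json.dumps({"shareIdentifier": "mid"}),
    }],
)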

In the end I couldn't crack this; I had to wrap it in a state machine step and call that from EventBridge instead. This was the logic in the state machine to add the Share Identifier:
"States": {
"Batch SubmitJob": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobName": <name>,
"JobDefinition": <Arn>,
"JobQueue": <QueueName>,
"ShareIdentifier": <Share>
},
If anyone works out how to do it directly from EventBridge, I'd love to hear it.
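For completeness, the EventBridge side then just points at the state machine instead of the Batch job queue; a minimal boto3 sketch of that wiring (rule name, schedule, ARNs and role are placeholders):

import boto3

events = boto3.client("events")

# Scheduled rule that starts the state machine; the state machine's
# SubmitJob task is what supplies the ShareIdentifier.
events.put_rule(Name="nightly-batch-job", ScheduleExpression="cron(0 2 * * ? *)")
events.put_targets(
    Rule="nightly-batch-job",
    Targets=[{
        "Id": "submit-job-state-machine",
        "Arn": "arn:aws:states:eu-west-1:123456789012:stateMachine:submit-batch-job",
        "RoleArn": "arn:aws:iam::123456789012:role/events-start-execution",
    }],
)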

Related

DS SDK - AWS Step Functions Lambda job cancelled immediately

I am getting this weird result when I try to deploy the Lambda step. If I define my Lambda step like this:
lambda_step = steps.compute.LambdaStep(
    "Query Training Results",
    parameters={
        "FunctionName": execution_input["LambdaFunctionName"],
        "Payload": {"TrainingJobName.$": "$.TrainingJobName"},
    },
)
For some reason it will automatically just grey out the box, meaning the job will be cancelled immediately. If I simply remove the "Payload" part then it will work, but the Lambda step will still fail because it does not know the training job name that I am trying to pass in the Payload.
I followed this example to the T here. Any suggestions would be greatly appreciated.
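For context, the example defines the execution input roughly like this and supplies the concrete values when executing the workflow (values below are placeholders, trimmed down from the referenced example):

from stepfunctions.inputs import ExecutionInput

# Placeholders the steps reference at definition time
execution_input = ExecutionInput(
    schema={
        "LambdaFunctionName": str,
        "TrainingJobName": str,
    }
)

# ...build the LambdaStep and workflow as above, then supply concrete values at run time:
execution = workflow.execute(
    inputs={
        "LambdaFunctionName": "query-training-results",  # placeholder function name
        "TrainingJobName": "my-training-job",            # placeholder training job name
    }
)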

How to add shareIdentifier to an AWS EventBridge rule for scheduled execution of an AWS Batch job

I configured an AWS EventBridge rule (via the web GUI) for running an AWS Batch job. The rule is triggered, but I am getting the following error after invocation:
shareIdentifier must be specified. (Service: AWSBatch; Status Code: 400; Error Code: ClientException; Request ID: 07da124b-bf1d-4103-892c-2af2af4e5496; Proxy: null)
My job is using a scheduling policy and needs shareIdentifier to be set, but I don't know how to set it. Here is a screenshot of the rule configuration:
There are no additional settings for further job arguments/parameters; the only thing I can configure is retries. I also checked the aws-cli command for putting a rule (https://awscli.amazonaws.com/v2/documentation/api/latest/reference/events/put-rule.html), but it doesn't seem to have any additional settings. Any suggestions on how to solve it? Or working examples?
Edit:
I ended up using the Java SDK for AWS Batch: https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-batch. I have a scheduled method that periodically spawns jobs with the following piece of code:
AWSBatch client = AWSBatchClientBuilder.standard().withRegion("eu-central-1").build();
SubmitJobRequest request = new SubmitJobRequest()
        .withJobName("example-test-job-java-sdk")
        .withJobQueue("job-queue")
        .withShareIdentifier("default")
        .withJobDefinition("job-type");
SubmitJobResult response = client.submitJob(request);
log.info("job spawn response: {}", response);
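For anyone doing the same from Python, the equivalent boto3 call also accepts the share identifier; a small sketch, reusing the names from the Java example above:

import boto3

batch = boto3.client("batch", region_name="eu-central-1")

response = batch.submit_job(
    jobName="example-test-job-python-sdk",
    jobQueue="job-queue",
    jobDefinition="job-type",
    shareIdentifier="default",  # required when the queue uses a scheduling policy
)
print(response["jobId"])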
Have you tried providing additional settings to your target via the input transformer, as referenced in the AWS docs under AWS Batch Jobs as EventBridge Targets?
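For what it's worth, the input transformer mentioned above is configured on the rule target; a rough boto3 sketch (names and ARNs are placeholders, and whether Batch actually reads a shareIdentifier out of the transformed input is exactly what is in question here):

import boto3

events = boto3.client("events")

events.put_targets(
    Rule="my-batch-rule",
    Targets=[{
        "Id": "batch-job-target",
        "Arn": "arn:aws:batch:eu-west-1:123456789012:job-queue/my-queue",
        "RoleArn": "arn:aws:iam::123456789012:role/events-batch-role",
        "InputTransformer": {
            # Pull values out of the triggering event...
            "InputPathsMap": {"firedAt": "$.time"},
            # ...and reshape the input sent to the target
            "InputTemplate": '{"shareIdentifier": "mid", "firedAt": "<firedAt>"}',
        },
    }],
)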
FWIW I'm running into the same problem.
I had a similar issue; from both the CLI and the GUI, I just couldn't find a way to pass ShareIdentifier from an EventBridge rule. In the end I had to use a state machine (Step Function) instead:
"States": {
"Batch SubmitJob": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobName": <name>,
"JobDefinition": <Arn>,
"JobQueue": <QueueName>,
"ShareIdentifier": <Share>
},
...
As you can see, it handles ShareIdentifier fine.

SNS mail notification when a step is not kicked off within a threshold timeframe

I have an EMR step which is submitted through a Step Function. During the step run I can see the task is submitted, but the EMR step is not executed and the EMR console doesn't have any information.
How can I debug this?
How can I send an SNS notification when a step doesn't start execution within a threshold timeframe? In my case the Step Function shows the EMR task as submitted, but there is no information on the EMR console and the pipeline has been running without failing for more than half an hour.
You could start the debugging process through the Step Functions execution log, identify the specific step that has failed, and then move on to the EMR console or the specific service that has failed. Usually, when the EMR step doesn't appear in the EMR console, it is due to a runtime error caused by an exception raised when calling the EMR step.
For this scenario, you can use the error handling that Step Functions provides through the Catch and TimeoutSeconds fields; you can find more details in the AWS documentation here.
Basically, you need to add these fields as shown below:
{
  "StartAt": "EmrStep",
  "States": {
    "EmrStep": {
      "Type": "Task",
      "Resource": "arn:aws:emr:execute-X-step",
      "Comment": "This is your EMR step",
      "TimeoutSeconds": 10,
      "Catch": [ {
        "ErrorEquals": ["States.Timeout"],
        "Next": "ShutdownClusterAndSendSNS"
      } ],
      "End": true
    },
    "ShutdownClusterAndSendSNS": {
      "Type": "Pass",
      "Comment": "This step handles the timeout exception raised",
      "Result": "You can shutdown the EMR cluster to avoid increased cost here and later send a sns notification!",
      "End": true
    }
  }
}
Note: to catch the timeout exception you have to catch the States.Timeout error, but you can also define the same Catch field for other types of errors.
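If you want the timeout path to actually notify, the Pass state above can be swapped for a Task that invokes a small Lambda which publishes to SNS. A minimal sketch of such a handler (the topic ARN is a placeholder):

import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:eu-west-1:123456789012:emr-step-alerts"  # placeholder topic

def handler(event, context):
    # Invoked by the state reached through the States.Timeout catch
    sns.publish(
        TopicArn=TOPIC_ARN,
        Subject="EMR step did not start within the expected time",
        Message=f"Step Functions input for the timed-out step: {event}",
    )
    return {"notified": True}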

Recursive AWS Lambda function calls - Best Practice

I've been tasked to look at a service built on AWS Lambda that performs a long-running task of turning VMs on and off. Mind you, I come from the Azure team, so I am not familiar with the style or best practices of AWS services.
The approach the original developer has taken is to send the entire workload to one Lambda function and then have that function take a section of the workload and then recursively call itself with the remaining workload until all items are gone (workload = 0).
Pseudo-ish Code:
// Assume this gets sent to a HTTP Lambda endpoint as a whole
let workload = [1, 2, 3, 4, 5, 6, 7, 8]

// The Lambda HTTP endpoint
function Lambda(workload) {
  if (!workload.length) {
    return "No more work!"
  }
  const toDo = workload.splice(0, 2) // get first two items
  doWork(toDo)
  // Then... except it builds a new HTTP request with aws sdk
  Lambda(workload) // 3, 4, 5, 6, 7, 8, etc.
}
This seems highly inefficient and unreliable (correct me if I am wrong). There is a lot of state being stored in this process and in my opinion that creates a lot of failure points.
My plan is to suggest we re-engineer the entire service to use a Queue/Worker type framework instead, where ideally the endpoint would handle one workload at a time, and be stateless.
The queue would be populated by a service (Jenkins? Lambda? Manually?), then a second service would read from the queue (and ideally scale-out as well, as needed).
UPDATE: AWS EventBridge now looks like the preferred solution.
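For reference, a minimal sketch of the queue/worker idea above, assuming an SQS queue configured as the Lambda's event source (all names are placeholders):

import json

def handler(event, context):
    # Each invocation receives a small batch of SQS messages;
    # the function stays stateless and SQS handles retries / dead-lettering.
    for record in event["Records"]:
        item = json.loads(record["body"])
        do_work(item)  # hypothetical unit of work, e.g. turn one VM on or off

The producer side would just send one message per work item (sqs.send_message or send_message_batch), and the worker scales out with the queue backlog instead of recursing.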
It's "Coupling" that I was thinking of, see here: https://www.jeffersonfrank.com/insights/aws-lambda-design-considerations
Coupling
Coupling goes beyond Lambda design considerations; it's more about the system as a whole. Lambdas within a microservice are sometimes tightly coupled, but this is nothing to worry about as long as the data passed between Lambdas within their little black box of a microservice is not over pure HTTP and isn't synchronous.
Lambdas shouldn’t be directly coupled to one another in a Request Response fashion, but asynchronously. Consider the scenario when an S3 Event invokes a Lambda function, then that Lambda also needs to call another Lambda within that same microservice and so on.
[diagram: aws lambda coupling]
You might be tempted to implement direct coupling, like allowing Lambda 1 to use the AWS SDK to call Lambda 2 and so on. This introduces some of the following problems:
If Lambda 1 is invoking Lambda 2 synchronously, it needs to wait for the latter to be done first. Lambda 1 might not know that Lambda 2 also called Lambda 3 synchronously, and Lambda 1 may now need to wait for both Lambda 2 and 3 to finish successfully. Lambda 1 might timeout as it needs to wait for all the Lambdas to complete first, and you’re also paying for each Lambda while they wait.
What if Lambda 3 has a concurrency limit set and is also called by another service? The call between Lambda 2 and 3 will fail until it has concurrency again. The error can be returned to all the way back to Lambda 1 but what does Lambda 1 then do with the error? It has to store that the S3 event was unsuccessful and that it needs to replay it.
This process can be redesigned to be event-driven: [diagram: lambda coupling]
Not only is this the solution to all the problems introduced by the direct coupling method, but it also provides a method of replaying the DLQ if an error occurred for each Lambda. No message will be lost or need to be stored externally, and the demand is decoupled from the processing.
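To make the quoted distinction concrete, here is a small boto3 sketch of the two invocation styles (the function name is hypothetical); the synchronous call is the request/response coupling the article warns about, while the asynchronous call hands the event off and returns immediately:

import json
import boto3

lambda_client = boto3.client("lambda")
payload = json.dumps({"key": "value"})

# Synchronous: Lambda 1 blocks (and is billed) while Lambda 2 runs
lambda_client.invoke(
    FunctionName="lambda-2",               # hypothetical function name
    InvocationType="RequestResponse",
    Payload=payload,
)

# Asynchronous: the event is queued for Lambda 2 and the caller returns
lambda_client.invoke(
    FunctionName="lambda-2",
    InvocationType="Event",
    Payload=payload,
)

Putting a queue, an event bus, or an orchestrator between the functions removes the direct dependency entirely.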
AWS Step Functions is one way you can achieve this. Step Functions are used to orchestrate multiple Lambda functions in any manner you want - parallel executions, sequential executions or a mix of both. You can also put wait steps, condition checks, retries in between if you need.
Your overall step function might look something like this (say you want Tasks 1, 2 and 3 to execute in parallel; then, when all of these are complete, you want to execute Task 4, and then 5 and 6 in parallel again).
Configuring this is also pretty simple. It accepts JSON like the following:
{
  "Comment": "An example of the Amazon States Language using a parallel state to execute two branches at the same time.",
  "StartAt": "Parallel",
  "States": {
    "Parallel": {
      "Type": "Parallel",
      "Next": "Task4",
      "Branches": [
        {
          "StartAt": "Task1",
          "States": {
            "Task1": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:ap-south-1:XXX:function:XXX",
              "End": true
            }
          }
        },
        {
          "StartAt": "Task2",
          "States": {
            "Task2": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:ap-south-1:XXX:function:XXX",
              "End": true
            }
          }
        },
        {
          "StartAt": "Task3",
          "States": {
            "Task3": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:ap-south-1:XXX:function:XXX",
              "End": true
            }
          }
        }
      ]
    },
    "Task4": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:ap-south-1:XXX:function:XXX",
      "Next": "Parallel2"
    },
    "Parallel2": {
      "Type": "Parallel",
      "Next": "Final State",
      "Branches": [
        {
          "StartAt": "Task5",
          "States": {
            "Task5": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:ap-south-1:XXX:function:XXX",
              "End": true
            }
          }
        },
        {
          "StartAt": "Task6",
          "States": {
            "Task6": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:ap-south-1:XXX:function:XXX",
              "End": true
            }
          }
        }
      ]
    },
    "Final State": {
      "Type": "Pass",
      "End": true
    }
  }
}
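If it helps, once a definition like the one above has been deployed as a state machine, kicking off an execution from code is a one-liner with boto3 (the ARN below is a placeholder):

import json
import boto3

sfn = boto3.client("stepfunctions")

response = sfn.start_execution(
    stateMachineArn="arn:aws:states:ap-south-1:123456789012:stateMachine:my-workflow",
    input=json.dumps({}),  # initial input passed to the first state
)
print(response["executionArn"])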

How can I create an "aws.cloudformation" CloudWatch event type for a specific CloudFormation stack?

I need to create an aws.cloudformation event type for a specific CloudFormation stack. For example, when StackA receives the UpdateStack event, I need to be able to catch that event.
Through the console I was able to create the following event rule (which is an AWS API Call via CloudTrail type event):
{
  "source": [
    "aws.cloudformation"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "cloudformation.amazonaws.com"
    ],
    "eventName": [
      "UpdateStack",
      "CreateStack"
    ]
  }
}
However, this event isn't for any specific CloudFormation stack, and I don't see any option for adding anything specific (such as whenever StackA gets an UpdateStack call).
The documentation for event types gives examples of other event types and how we can add a specific resource that triggers the event. For example, with the aws.codepipeline event, you can specify a pipeline value equal to PipelineA, and then the event would get triggered whenever PipelineA gets to the state you specified in the State parameter.
How can I do something similar with an aws.cloudformation event type?
Unfortunately, the only way (as far as I have found) to get stack-specific events is the notification configuration inside the stack, which can only be provided on creation/update.
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-add-tags.html
I'm looking for a similar solution and found something that should work:
https://aws.amazon.com/ru/blogs/mt/tracking-aws-service-catalog-products-provisioned-by-individual-saml-users/
Step 5:
aws events put-rule --name "sc-add-user" --event-pattern '{"source":["aws.cloudformation"],"detail-type":["AWS API Call via CloudTrail"],"detail":{"eventSource":["cloudformation.amazonaws.com"],"eventName":["CreateStack"]}}'
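Building on that, one way to scope such a rule to a single stack (an assumption I haven't verified here) is to also match the stack name that CloudTrail records under requestParameters. A boto3 sketch:

import json
import boto3

events = boto3.client("events")

# Assumption: CloudTrail API-call events expose the stack name at
# detail.requestParameters.stackName, so the pattern can filter on it.
pattern = {
    "source": ["aws.cloudformation"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["cloudformation.amazonaws.com"],
        "eventName": ["CreateStack", "UpdateStack"],
        "requestParameters": {"stackName": ["StackA"]},
    },
}
events.put_rule(Name="stacka-stack-events", EventPattern=json.dumps(pattern))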