Can a Glue Workflow or Trigger get parameters from EventBridge?

My system design
I have created four Glue jobs: testgluejob1, testgluejob2, testgluejob3, and common-glue-job.
An EventBridge rule detects the SUCCEEDED state of testgluejob1, testgluejob2, and testgluejob3.
When one of those jobs succeeds, a Glue trigger runs to start common-glue-job.
Problem
I want to use the triggering job's name as a parameter in the common-glue-job script.
Is it possible to pass parameters to a Glue Workflow or Trigger from EventBridge?
What I tried
A Glue trigger can pass arguments to common-glue-job:
  https://docs.aws.amazon.com/ja_jp/AWSCloudFormation/latest/UserGuide/aws-resource-glue-trigger.html
Type: AWS::Glue::Trigger
...
Actions:
  - JobName: prod-job2
    Arguments:
      '--job-bookmark-option': job-bookmark-enable
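On the job side, arguments passed this way can be read with getResolvedOptions. A minimal sketch (the '--ORIGINAL_JOB' argument is hypothetical here, matching what the question ultimately wants to pass):

import sys
from awsglue.utils import getResolvedOptions

# Resolve the arguments passed to this job run; names are given
# without the leading '--'.
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'ORIGINAL_JOB'])
print(args['ORIGINAL_JOB'])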
If I set run properties on the Glue workflow, I can get them from common-glue-job by using boto3 and the get_workflow_run_properties() function. But I have no idea how to put run properties from EventBridge via CloudFormation.
https://docs.aws.amazon.com/glue/latest/dg/workflow-run-properties-code.html
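For reference, a minimal boto3 sketch of writing and reading run properties (the workflow name and run ID are placeholders; this still doesn't cover setting them from EventBridge):

import boto3

glue = boto3.client('glue')
workflow_name = 'GlueWorkflowTest01'  # hypothetical workflow name
run_id = 'wr_0123456789abcdef'        # hypothetical workflow run ID

# Write a run property onto a specific workflow run...
glue.put_workflow_run_properties(
    Name=workflow_name,
    RunId=run_id,
    RunProperties={'ORIGINAL_JOB': 'testgluejob1'})

# ...and read it back, e.g. from inside common-glue-job.
props = glue.get_workflow_run_properties(
    Name=workflow_name, RunId=run_id)['RunProperties']
print(props.get('ORIGINAL_JOB'))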
I set an InputTransformer on the target of the EventBridge rule, but I'm not sure how to use this value in common-glue-job.
DependsOn:
  - EventBridgeGlueExecutionRole
  - GlueWorkflowTest01
Type: AWS::Events::Rule
Properties:
  Name: EventRuleTest01
  EventPattern:
    source:
      - aws.glue
    detail-type:
      - Glue Job State Change
    detail:
      jobName:
        - !Ref GlueJobTest01
      state:
        - SUCCEEDED
  Targets:
    - Arn: !Sub arn:aws:glue:${AWS::Region}:${AWS::AccountId}:workflow/${GlueWorkflowTest01}
      Id: GlueJobTriggersWorkflow
      RoleArn: !GetAtt 'EventBridgeGlueExecutionRole.Arn'
      InputTransformer:
        InputTemplate: >-
          {
            "--ORIGINAL_JOB": <jobName>
          }
        InputPathsMap:
          jobName: "$.detail.jobName"
Any help would be greatly appreciated.

If I understand you correctly, you already have the information in the EventBridge event but cannot access it from your Glue job. I used the following workaround to do this:
First, get the event ID from the Glue workflow's run properties:
event_id = glue_client.get_workflow_run_properties(
    Name=self.args['WORKFLOW_NAME'],
    RunId=self.args['WORKFLOW_RUN_ID'])['RunProperties']['aws:eventIds'][1:-1]
Next, look up all NotifyEvent events from CloudTrail for the last several minutes. It's up to you to decide how much time can pass between the workflow start and your job start.
response = event_client.lookup_events(
    LookupAttributes=[{'AttributeKey': 'EventName',
                       'AttributeValue': 'NotifyEvent'}],
    StartTime=(datetime.datetime.now() - datetime.timedelta(minutes=5)),
    EndTime=datetime.datetime.now())['Events']
Finally, check which event encloses the event ID we got from the Glue workflow:
for record in response:
    event_payload = json.loads(record['CloudTrailEvent'])['requestParameters']['eventPayload']
    if event_payload['eventId'] == event_id:
        event = json.loads(event_payload['eventBody'])
The event variable now holds the full content of the event that triggered the workflow.
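Assembled into one self-contained sketch (assuming the job receives the standard WORKFLOW_NAME and WORKFLOW_RUN_ID arguments that Glue passes to jobs started by a workflow):

import datetime
import json
import sys

import boto3
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ['WORKFLOW_NAME', 'WORKFLOW_RUN_ID'])
glue_client = boto3.client('glue')
event_client = boto3.client('cloudtrail')

# The event ID stored on the workflow run (strip the surrounding brackets).
event_id = glue_client.get_workflow_run_properties(
    Name=args['WORKFLOW_NAME'],
    RunId=args['WORKFLOW_RUN_ID'])['RunProperties']['aws:eventIds'][1:-1]

# Recent NotifyEvent calls recorded by CloudTrail.
events = event_client.lookup_events(
    LookupAttributes=[{'AttributeKey': 'EventName', 'AttributeValue': 'NotifyEvent'}],
    StartTime=datetime.datetime.now() - datetime.timedelta(minutes=5),
    EndTime=datetime.datetime.now())['Events']

# Find the entry whose payload matches this workflow run's event ID.
for record in events:
    payload = json.loads(record['CloudTrailEvent'])['requestParameters']['eventPayload']
    if payload['eventId'] == event_id:
        event = json.loads(payload['eventBody'])
        print(event['detail']['jobName'])  # the job that triggered the workflow
        break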

Related

Terraform / CloudFormation - Pass parameter as YAML

I use Terraform to launch a CloudFormation stack to create Glue DataBrew resources that don't yet exist in Terraform.
The thing is that I have a variable in Terraform that holds the list of my data sources, and in order to create the DataBrew resources associated with this data, I loop over the list to create one instance of my CloudFormation template per data source.
Inside this template, I have a resource that I want to differ per data source: the AWS::DataBrew::Ruleset resource.
It looks like this:
DataBrewDataQualityRuleset:
  Type: AWS::DataBrew::Ruleset
  Properties:
    Name: !Ref RuleSetName
    Description: Data Quality ruleset
    Rules:
      - Name: Check columns for missing values
        Disabled: false
        CheckExpression: AGG(MISSING_VALUES_PERCENTAGE) == :val1
        SubstitutionMap:
          - ValueReference: ":val1"
            Value: '0'
        ColumnSelectors:
          - Regex: ".*"
      - Name: Check two
        Disabled: false
        CheckExpression: :col IN :list
        SubstitutionMap:
          - ValueReference: ":col"
            Value: "`group`"
          - ValueReference: ":list"
            Value: "[\"Value1\", \"Value2\"]"
    TargetArn: !Sub SomeArn
What I want to do is extract the Rules part of this resource and create one file where I put all my rules per data source, ending up with something like the below:
DataBrewDataQualityRuleset:
  Type: AWS::DataBrew::Ruleset
  Properties:
    Name: !Ref RuleSetName
    Description: Data Quality ruleset
    Rules: !Ref Rules
    TargetArn: !Sub SomeArn
In my Terraform, the Rules parameter would then be the actual set of rules for one particular data source.
I've thought about having one YAML file over which I would loop in Terraform, but I'm not sure it's doable, or whether CloudFormation would accept YAML as a parameter type.
Below you'll also find my Terraform component:
resource "aws_cloudformation_stack" "databrew_jobs" {
  for_each = var.data_sources

  name = "datachecks-${each.value.stack_name}"

  parameters = {
    Bucket         = "test_bucket"
    DataSetKey     = "raw/${each.value.job_name}"
    DataSetName    = "dataset-${each.value.stack_name}"
    RuleSetName    = "ruleset-${each.value.stack_name}"
    JobName        = "profile-job-${each.value.stack_name}"
    DataSourceName = "${each.value.stack_name}"
    JobResultKey   = "databrew-results/${each.value.job_name}"
    RoleArn        = iam_role_test.arn
  }

  template_body = file("${path.module}/databrew-job.yaml")
}
Do you have any idea how I could achieve this?
Thanks in advance!

Invoking AWS lambda on a schedule using the Go CDK

I have an awslambda.Function that I've configured using the Go CDK.
I want to trigger this lambda every night at 2 AM UTC.
To this end, I am trying to define a CloudWatch Events rule like the one in the JavaScript example below:
const newLambda = new lambda.Function(this, 'newLambda', {
  runtime: lambda.Runtime.PYTHON_3_8,
  code: lambda.Code.fromAsset('functions'),
  handler: 'index.handler',
});
const eventRule = new events.Rule(this, 'scheduleRule', {
  schedule: events.Schedule.cron({ minute: '0', hour: '1' }),
});
eventRule.addTarget(new targets.LambdaFunction(newLambda));
I believe that I need to use awsevents.NewRule, but I cannot figure out how to pass my lambda
where the "WHAT GOES HERE?" placeholder is written. Note that I am using v2 of the aws-cdk.
event := awsevents.NewRule(stack, aws.String("TriggerDownloadOrdersToS3LambdaEvent"), &awsevents.RuleProps{
    Schedule: awsevents.Schedule_Cron(&awsevents.CronOptions{Hour: aws.String("2")}),
    Targets:  &[]awsevents.IRuleTarget{WHAT GOES HERE?},
})
Does anyone know how I can define a recurring trigger for my lambda using the Go CDK?
You can create an IRuleTarget using the awseventstargets module:
&awsevents.RuleProps{
    Schedule: awsevents.Schedule_Cron(&awsevents.CronOptions{Hour: aws.String("2"), Minute: aws.String("0")}),
    Targets:  &[]awsevents.IRuleTarget{awseventstargets.NewLambdaFunction(lambda, nil)},
},

AWS CloudFormation & Service Catalog - Can I require tags with user values?

Our problem seems very basic, and I would expect it to be common.
We have tags that must always be applied (for billing). However, the tag values are only known at the time the stack is deployed... We don't know what the tag values will be when developing the stack, or when creating the product in the Service Catalog...
We don't want to wait until AFTER the resource is deployed to discover the tag is missing, so as cool as AWS Config may be, we don't want to rely on its rules if we don't have to.
So things like TagOptions don't work, because it appears they expect us to know the tag value months prior to some deployment (which isn't the case).
Is there any way to mandate tags be used for a CloudFormation template when it is deployed? Better yet, can we have Service Catalog query for a tag value when deploying? Tags like "system" or "project", for instance, come and go over time and are not known up-front for many types of CloudFormation templates we develop.
Isn't this a common scenario?
I am worried that I am missing something very, very simple and basic which mandates tags be used up-front, but I can't seem to figure out what. Thank you in advance. I really did Google a lot before asking, without finding a satisfying answer.
I don't know anything about Service Catalog, but you can create Conditions and then use them to conditionally create (or even fail) your resources. For example, conditional resource creation:
Parameters:
  ResourceTag:
    Type: String
    Default: ''

Conditions:
  isTagEmpty:
    !Equals [!Ref ResourceTag, '']
  isTagNotEmpty:
    !Not [!Equals [!Ref ResourceTag, '']]

Resources:
  DBInstance:
    Type: AWS::RDS::DBInstance
    Condition: isTagNotEmpty
    Properties:
      DBInstanceClass: <DB Instance Type>
Here the RDS DB instance will only be created if the tag is non-empty (note the isTagNotEmpty condition on the resource), but CloudFormation will still return success either way.
Alternatively, you can let the resource creation fail outright:
Resources:
  DBInstance:
    Type: AWS::RDS::DBInstance
    Properties:
      DBInstanceClass: !If [isTagEmpty, !Ref "AWS::NoValue", <DB instance type>]
I haven't tried this, but it should fail, as the required DBInstanceClass property will be missing if the tag is empty.
Edit: You can also create your stack using the createStack CFN API. Write some code to read & validate the input (e.g. read from Service Catalog) & call the createStack API. I am doing the same from Lambda (Node.js), reading some input from Parameter Store. Sample code:
module.exports.create = async (event, context, callback) => {
  let request = JSON.parse(event.body);
  let subnetids = await ssm.getParameter({
    Name: '/vpc/public-subnets'
  }).promise();
  let securitygroups = await ssm.getParameter({
    Name: '/vpc/lambda-security-group'
  }).promise();
  let params = {
    StackName: request.customerName, /* required */
    Capabilities: [
      'CAPABILITY_IAM',
      'CAPABILITY_NAMED_IAM',
      'CAPABILITY_AUTO_EXPAND',
      /* more items */
    ],
    ClientRequestToken: 'qwdfghjk3912',
    EnableTerminationProtection: false,
    OnFailure: request.onfailure,
    Parameters: [
      {
        ParameterKey: "SubnetIds",
        ParameterValue: subnetids.Parameter.Value,
      },
      {
        ParameterKey: 'SecurityGroupIds',
        ParameterValue: securitygroups.Parameter.Value,
      },
      {
        ParameterKey: 'OpsPoolArnList',
        ParameterValue: request.userPoolList,
      },
      /* more items */
    ],
    TemplateURL: request.templateUrl,
  };
  cfn.config.region = request.region;
  let result = await cfn.createStack(params).promise();
  console.log(result);
};
Another option: add an AWS Custom Resource backed by Lambda. Check for tags in that resource's handler & return failure if they don't satisfy the constraints. Make all other resources depend on this resource (so that they are only created if your checks pass). The link also contains an example. You will also have to add handling for stack update & deletion (like a default success). I think this is your best bet as of now.
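As a rough sketch of that last option in Python (the required tag keys and the Tags property name are assumptions for illustration; cfnresponse is the helper module AWS provides for Lambda-backed custom resources):

import cfnresponse

REQUIRED_TAGS = {'system', 'project'}  # hypothetical required tag keys

def handler(event, context):
    # Report success on Delete so a rollback or stack deletion isn't blocked.
    if event['RequestType'] == 'Delete':
        cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
        return
    # The template would pass tags as a property of the custom resource,
    # e.g. Properties: { Tags: { system: ..., project: ... } }.
    tags = event.get('ResourceProperties', {}).get('Tags', {})
    missing = REQUIRED_TAGS - set(tags)
    if missing:
        # FAILED fails the stack, so resources depending on this one never create.
        cfnresponse.send(event, context, cfnresponse.FAILED, {})
    else:
        cfnresponse.send(event, context, cfnresponse.SUCCESS, {})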

Dynamic AWS SAM Schedule Event Input param

We are automating a Lambda via SAM to run on a Schedule event. We use YAML, but we are unable to work out how to use !Sub to make the Input dynamic.
The SAM documentation says that Input needs to be a JSON-formatted string.
The following code works for us:
Events:
  Event1:
    Type: Schedule
    Properties:
      Schedule: rate(1 minute)
      Input: >-
        {
          "sqsUrl": "https://sqs.12344.url",
          "snsArn": "arn:val"
        }
But we need to insert dynamic params into the Input like so:
Events:
  Event1:
    Type: Schedule
    Properties:
      Schedule: rate(1 minute)
      Input: >-
        {
          "sqsUrl": "https://sqs.${AWS::AccountId}.url",
          "snsArn": "arn:val"
        }
We have tried to do this in multiple ways using !Sub, but the deployment always fails, saying that it needs to be valid JSON.
What is the correct way to make this JSON string use variables?
Thanks,
Mark
You should wrap the whole Input value (in your case a JSON string, which of course must be quoted) in the !Sub function.
It will then look like:
Input:
  Fn::Sub: '{"sqsUrl": "https://sqs.${AWS::AccountId}.url","snsArn": "arn:val"}'
I've used something like:
!Sub |
  {
    "sqsUrl": "https://sqs.${AWS::AccountId}.url",
    "snsArn": "arn:val"
  }
The | (and >-, among others) defines the way YAML handles line breaks in the string.
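For completeness, whatever JSON ends up in Input is delivered verbatim as the handler's event object. A minimal Python sketch (key names match the example above):

def handler(event, context):
    # The Schedule event's Input JSON arrives as-is in `event`.
    sqs_url = event['sqsUrl']
    sns_arn = event['snsArn']
    print(f'polling {sqs_url}, notifying {sns_arn}')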

Invoke Lambda from CodePipeline with multiple UserParameters

This tutorial shows how to invoke a Lambda from CodePipeline, passing a single parameter:
http://docs.aws.amazon.com/codepipeline/latest/userguide/how-to-lambda-integration.html
I've built a slackhook Lambda that needs to get two parameters:
webhook_url
message
Passing in JSON via the CodePipeline editor results in the JSON block being sent wrapped in single quotes, so it can't be parsed directly.
UserParameter passed in:
{
  "webhook": "https://hooks.slack.com/services/T0311JJTE/3W...W7F2lvho",
  "message": "Staging build awaiting approval for production deploy"
}
User Parameter in Event payload
UserParameters: '{
  "webhook":"https://hooks.slack.com/services/T0311JJTE/3W...W7F2lvho",
  "message":"Staging build awaiting approval for production deploy"
}'
When trying to pass multiple UserParameters directly in the CloudFormation like this:
Name: SlackNotification
ActionTypeId:
  Category: Invoke
  Owner: AWS
  Version: '1'
  Provider: Lambda
OutputArtifacts: []
Configuration:
  FunctionName: aws-notify2
  UserParameters:
    - webhook: !Ref SlackHook
    - message: !Join [" ", [!Ref app, !Ref env, "build has started"]]
RunOrder: 1
I get an error: Configuration must only contain simple objects or strings.
Any guesses on how to get multiple UserParameters passed from a CloudFormation template into a Lambda would be much appreciated.
Here is the lambda code for reference:
https://github.com/byu-oit-appdev/aws-codepipeline-lambda-slack-webhook
You should be able to pass multiple UserParameters as a single JSON-object string, then parse the JSON in your Lambda function upon receipt.
This is exactly how the Python example in the documentation handles this case:
try:
    # Get the user parameters which contain the stack, artifact and file settings
    user_parameters = job_data['actionConfiguration']['configuration']['UserParameters']
    decoded_parameters = json.loads(user_parameters)
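In a full handler you would also report the outcome back to CodePipeline. A hedged Python sketch (the event shape is what CodePipeline passes to Lambda invoke actions; error handling is abbreviated):

import json

import boto3

codepipeline = boto3.client('codepipeline')

def handler(event, context):
    job = event['CodePipeline.job']
    try:
        # UserParameters arrives as a single JSON string; decode it here.
        params = json.loads(
            job['data']['actionConfiguration']['configuration']['UserParameters'])
        # ... use params['webhook'] and params['message'] here ...
        codepipeline.put_job_success_result(jobId=job['id'])
    except Exception as e:
        codepipeline.put_job_failure_result(
            jobId=job['id'],
            failureDetails={'type': 'JobFailed', 'message': str(e)})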
Similarly, JSON.parse should work fine in Node.js to parse a JSON-object string (as shown in your event payload example) into a usable JSON object:
> JSON.parse('{ "webhook":"https://hooks.slack.com/services/T0311JJTE/3W...W7F2lvho", "message":"Staging build awaiting approval for production deploy" }')
{ webhook: 'https://hooks.slack.com/services/T0311JJTE/3W...W7F2lvho',
message: 'Staging build awaiting approval for production deploy' }