How much of Amazon Connect is scriptable, either through Terraform, Ansible or another approach?

This question concerns AWS Connect, the cloud-based call center. For those who have been involved in the setup and configuration of AWS Connect: is any portion of Amazon Connect configurable through a continuous integration flow, other than any possible Lambda touchpoints? What I am looking for is scripting various functions, such as loading exported flows.
Looking at the AWS CLI, I see a number of Amazon Connect calls, but the majority only retrieve information (https://docs.aws.amazon.com/cli/latest/reference/connect/index.html); very few are available to configure portions of Amazon Connect.

There is basically nothing at this time. All contact flows must be imported/exported by hand. Other settings (e.g. routing profiles, prompts) must be re-created manually.
Someone has created a "beta" Connect CloudFormation template, though it actually uses Puppeteer behind the scenes to automate the import/export process. I imagine Amazon will eventually support this, because DevOps is certainly one of the rough edges of the platform right now.

For new people checking this question: Amazon has recently published the APIs you are looking for, such as create-contact-flow.
It uses a JSON-based language specific to Amazon Connect; below is an example:
{
  "Version": "2019-10-30",
  "StartAction": "12345678-1234-1234-1234-123456789012",
  "Metadata": {
    "EntryPointPosition": {"X": 88, "Y": 100},
    "ActionMetadata": {
      "12345678-1234-1234-1234-123456789012": {
        "Position": {"X": 270, "Y": 98}
      },
      "abcdef-abcd-abcd-abcd-abcdefghijkl": {
        "Position": {"X": 545, "Y": 92}
      }
    }
  },
  "Actions": [
    {
      "Identifier": "12345678-1234-1234-1234-123456789012",
      "Type": "MessageParticipant",
      "Transitions": {
        "NextAction": "abcdef-abcd-abcd-abcd-abcdefghijkl",
        "Errors": [],
        "Conditions": []
      },
      "Parameters": {
        "Prompt": {
          "Text": "Thanks for calling the sample flow!",
          "TextType": "text",
          "PromptId": null
        }
      }
    },
    {
      "Identifier": "abcdef-abcd-abcd-abcd-abcdefghijkl",
      "Type": "DisconnectParticipant",
      "Transitions": {},
      "Parameters": {}
    }
  ]
}
Exporting from the GUI does not produce JSON in this format. Obviously, one problem with this approach is keeping state. I am keeping a close eye on Terraform/CloudFormation/CDK and will update this post if there is any support (that does not use Puppeteer).
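For illustration, here is a minimal boto3 sketch of pushing a flow definition like the one above through the new API; the instance ID, flow name, and file name are placeholders of mine, not values from the question:
import boto3

connect = boto3.client("connect")

# Read a flow definition written in the Amazon Connect flow language,
# e.g. the sample JSON above saved to a local file (placeholder name).
with open("sample_flow.json") as f:
    content = f.read()

# Placeholder instance ID and flow name; substitute your own values.
response = connect.create_contact_flow(
    InstanceId="11111111-2222-3333-4444-555555555555",
    Name="sample-flow",
    Type="CONTACT_FLOW",
    Description="Flow pushed from a CI job",
    Content=content,
)
print(response["ContactFlowId"])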

I think it's doable now; with the newest APIs, you can do many things to script the entire process. There are some issues with the contact flows themselves, but I think this will improve over the next few months.
In the meantime, there is some effort to add Amazon Connect to Terraform. Here are the issue and the WIP PRs:
GitHub Issue
Service PR
Resource PR

What are SageMaker pipelines actually?

SageMaker Pipelines are rather unclear to me; I'm not experienced in the field of ML, but I'm working on figuring out the pipeline definitions.
I have a few questions:
Are SageMaker Pipelines a stand-alone service/feature? I don't see any option to create them through the console, though I do see CloudFormation and CDK resources.
Is a SageMaker pipeline essentially CodePipeline? How do these integrate, and how do they differ?
There's also a Python SDK; how does this differ from the CDK and CloudFormation?
I can't seem to find any examples besides the Python SDK usage. How come?
The docs and workshops only seem to properly describe the Python SDK usage. It would be really helpful if someone could clear this up for me!
SageMaker has two things called Pipelines: Model Building Pipelines and Serial Inference Pipelines. I believe you're referring to the former.
A model building pipeline defines steps in a machine learning workflow, such as pre-processing, hyperparameter tuning, batch transformations, and setting up endpoints.
A serial inference pipeline is two or more SageMaker models run one after the other.
A model building pipeline is defined in JSON, and is hosted/run in some sort of proprietary, serverless fashion by SageMaker.
Are SageMaker Pipelines a stand-alone service/feature? I don't see any option to create them through the console, though I do see CloudFormation and CDK resources.
You can create/modify them using the API, which can also be called via the CLI, Python SDK, or CloudFormation. These all use the AWS API under the hood.
You can start/stop/view them in SageMaker Studio:
Left-side Navigation bar > SageMaker resources > Drop-down menu > Pipelines
Is a SageMaker pipeline essentially CodePipeline? How do these integrate, and how do they differ?
Unlikely. CodePipeline is more for building and deploying code, and is not specific to SageMaker. There is no direct integration as far as I can tell, other than that you can start a SageMaker pipeline with CodePipeline.
There's also a Python SDK; how does this differ from the CDK and CloudFormation?
The Python SDK is a stand-alone library for interacting with SageMaker in a developer-friendly fashion. It's more dynamic than CloudFormation and lets you build pipelines using code, whereas CloudFormation takes a static JSON string.
A very simple example of Python SageMaker SDK usage:
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.steps import ProcessingStep
from sagemaker.workflow.pipeline import Pipeline

processor = SKLearnProcessor(
    framework_version="0.23-1",
    instance_count=1,
    instance_type="ml.m5.large",
    role="role-arn",
)
processing_step = ProcessingStep(
    name="processing",
    processor=processor,
    code="preprocessor.py",
)
pipeline = Pipeline(name="foo", steps=[processing_step])
pipeline.upsert(role_arn=...)
pipeline.start()
pipeline.definition() produces rather verbose JSON like this:
{
  "Version": "2020-12-01",
  "Metadata": {},
  "Parameters": [],
  "PipelineExperimentConfig": {
    "ExperimentName": {
      "Get": "Execution.PipelineName"
    },
    "TrialName": {
      "Get": "Execution.PipelineExecutionId"
    }
  },
  "Steps": [
    {
      "Name": "processing",
      "Type": "Processing",
      "Arguments": {
        "ProcessingResources": {
          "ClusterConfig": {
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30
          }
        },
        "AppSpecification": {
          "ImageUri": "246618743249.dkr.ecr.us-west-2.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3",
          "ContainerEntrypoint": [
            "python3",
            "/opt/ml/processing/input/code/preprocessor.py"
          ]
        },
        "RoleArn": "arn:aws:iam::123456789012:role/foo",
        "ProcessingInputs": [
          {
            "InputName": "code",
            "AppManaged": false,
            "S3Input": {
              "S3Uri": "s3://bucket/preprocessor.py",
              "LocalPath": "/opt/ml/processing/input/code",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          }
        ]
      }
    }
  ]
}
You could use the above JSON with CloudFormation/CDK, but you build the JSON with the SageMaker SDK.
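For example, the same JSON definition can also be registered and run directly through the API with boto3; a minimal sketch, where the pipeline name, file name, and role ARN are placeholders:
import boto3

sm = boto3.client("sagemaker")

# Load a pipeline definition JSON string, e.g. the output of pipeline.definition() above.
with open("pipeline-definition.json") as f:
    definition = f.read()

# Register the pipeline (use update_pipeline instead if it already exists).
sm.create_pipeline(
    PipelineName="foo",
    PipelineDefinition=definition,
    RoleArn="arn:aws:iam::123456789012:role/foo",
)

# Kick off an execution of the registered pipeline.
sm.start_pipeline_execution(PipelineName="foo")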
You can also define model building workflows using Step Functions state machines (via the Data Science SDK) or Airflow.

Serverless framework for AWS: Adding initial data into a DynamoDB table

Currently I am using Serverless framework for deploying my applications on AWS.
https://serverless.com/
Using the serverless.yml file, we create the DynamoDB tables that are required for the application. These tables are accessed from the Lambda functions.
When the application is deployed, I want a few of these tables to be loaded with an initial set of data.
Is this possible?
Can you provide me some pointers for inserting this initial data?
Is this possible with AWS SAM?
I don't know if there's a specific way to do this in Serverless; however, you can just add a call to the AWS CLI like this to your build pipeline:
aws dynamodb batch-write-item --request-items file://initialdata.json
Where initialdata.json looks something like this:
{
  "Forum": [
    {
      "PutRequest": {
        "Item": {
          "Name": {"S": "Amazon DynamoDB"},
          "Category": {"S": "Amazon Web Services"},
          "Threads": {"N": "2"},
          "Messages": {"N": "4"},
          "Views": {"N": "1000"}
        }
      }
    },
    {
      "PutRequest": {
        "Item": {
          "Name": {"S": "Amazon S3"},
          "Category": {"S": "Amazon Web Services"}
        }
      }
    }
  ]
}
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SampleData.LoadData.html
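If you prefer to seed the table from code rather than the CLI (for example in a small post-deploy script), here is a boto3 sketch of the same call, assuming the initialdata.json file shown above:
import json
import boto3

dynamodb = boto3.client("dynamodb")

# Load the same request-items document used with the CLI above.
with open("initialdata.json") as f:
    request_items = json.load(f)

# BatchWriteItem accepts up to 25 put/delete requests per call;
# unprocessed items are returned so they can be retried.
response = dynamodb.batch_write_item(RequestItems=request_items)
print(response.get("UnprocessedItems", {}))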
A more Serverless-Framework-native option is to use a tool like the serverless-plugin-scripts plugin, which allows you to hook your own CLI commands into the deploy process:
https://github.com/mvila/serverless-plugin-scripts

Execute stack after CloudFormation deploy

I have an API Gateway built with the Serverless Application Model that I integrated with GitHub via CodePipeline. Everything is running fine: the pipeline reads the webhook, builds from the buildspec.yml, and deploys the CloudFormation template, creating or updating the stack.
The thing is, after the stack is updated it still needs an approval in the console. How can I make the stack update execute automatically?
It sounds like your pipeline is doing one of two things, unless I'm misunderstanding you:
1. Making a change set but not executing it in the CloudFormation console.
2. Proceeding to a manual approval step in the pipeline and awaiting your confirmation.
Since #2 is simply solved by removing that step, let's talk about #1.
Assuming you are successfully creating a change set called ChangeSetName, you need a step in your pipeline with the following (CloudFormation JSON template syntax):
"Name": "StepName",
"ActionTypeId": {"Category": "Deploy",
"Owner": "AWS",
"Provider": "CloudFormation",
"Version": "1"
},
"Configuration": {
"ActionMode": "CHANGE_SET_EXECUTE",
"ChangeSetName": {
"Ref": "ChangeSetName"
},
...
Keep the other parameters (e.g. RoleArn) consistent, as usual.
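If you ever need to execute an already-created change set outside of the pipeline (for example from a one-off script), here is a small boto3 sketch, with placeholder stack and change set names:
import boto3

cfn = boto3.client("cloudformation")

# Execute a change set that was created earlier (e.g. by a CHANGE_SET_REPLACE action).
cfn.execute_change_set(
    ChangeSetName="my-change-set",
    StackName="my-stack",
)

# Optionally wait for the stack update to finish.
waiter = cfn.get_waiter("stack_update_complete")
waiter.wait(StackName="my-stack")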

Error when updating a PowerBI workspace collection from an ARM template

We have deployed a Power BI Embedded workspace collection with the following really simple ARM template:
{
  "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {},
  "variables": {},
  "resources": [
    {
      "comments": "Test Power BI workspace collection",
      "apiVersion": "2016-01-29",
      "type": "Microsoft.PowerBI/workspaceCollections",
      "location": "westeurope",
      "sku": {
        "name": "S1",
        "tier": "Standard"
      },
      "name": "myTestPowerBiCollection",
      "tags": {
        "displayName": "Test Power BI workspace collection"
      }
    }
  ],
  "outputs": {}
}
For deployment we used the well-known PowerShell command New-AzureRmResourceGroupDeployment. After the creation, if we try to execute the command again, it fails with the following message:
New-AzureRmResourceGroupDeployment : Resource Microsoft.PowerBI/workspaceCollections 'myTestPowerBiCollection' failed with message
{
  "error": {
    "code": "BadRequest",
    "message": ""
  }
}
If we delete the collection and execute again, it succeeds without a problem. I tried both options for the -Mode parameter (Incremental, Complete) and that didn't help, even though Incremental is the default option.
This is a major issue for us, as we want to provision the collection as part of our continuous delivery pipeline and we execute this several times.
Any ideas on how to bypass this problem?
As you mentioned, if the Power BI workspace collection name already exists, an exception is thrown when we try to deploy the Power BI workspace collection again.
If it is possible to add custom logic, we could use Get-AzureRmPowerBIWorkspaceCollection to check whether the Power BI workspace collection exists. If it exists, the cmdlet returns the workspace collection object; otherwise it throws a "not found" exception.
We could also use the Remove-AzureRmPowerBIWorkspaceCollection command to remove the Power BI workspace collection. So if the workspace collection exists, we can either skip the deployment or delete and re-create it, according to our own logic.

Want to server-side encrypt S3 data node file created by ShellCommandActivity

I created a ShellCommandActivity with stage = "true". The shell command creates a new file and stores it in ${OUTPUT1_STAGING_DIR}. I want this new file to be server-side encrypted in S3.
According to the documentation, all files created in an S3 data node are server-side encrypted by default. But after my pipeline completes, an unencrypted file gets created in S3. I tried setting s3EncryptionType to SERVER_SIDE_ENCRYPTION explicitly on the S3DataNode, but that doesn't help either. I want to encrypt this new file.
Here is the relevant part of the pipeline definition:
{
  "id": "DataNodeId_Fdcnk",
  "schedule": {
    "ref": "DefaultSchedule"
  },
  "directoryPath": "s3://my-bucket/test-pipeline",
  "name": "s3DataNode",
  "s3EncryptionType": "SERVER_SIDE_ENCRYPTION",
  "type": "S3DataNode"
},
{
  "id": "ActivityId_V1NOE",
  "schedule": {
    "ref": "DefaultSchedule"
  },
  "name": "FileGenerate",
  "command": "echo 'This is a test' > ${OUTPUT1_STAGING_DIR}/foo.txt",
  "workerGroup": "my-worker-group",
  "output": {
    "ref": "DataNodeId_Fdcnk"
  },
  "type": "ShellCommandActivity",
  "stage": "true"
}
Short answer: Your pipeline definition looks correct. You need to ensure you're running the latest version of the Task Runner. I will try to reproduce your issue and let you know.
P.S. Let's keep the conversation within a single thread, either here or in the AWS Data Pipeline forums, to avoid confusion.
Answer on official AWS Data Pipeline Forum page
This issue was resolved when I downloaded the new TaskRunner-1.0.jar. I was running an older version.