How to Automate AWS Database Migration Service (DMS) Tasks

Is there any way to schedule a DMS task to run at a specific time? I didn't find any related options in the AWS console.

Take a look at Automating AWS DMS Migration Tasks | AWS Database Blog:
Currently, DMS tasks cannot be scheduled using the DMS console. To schedule DMS tasks, we need to use the native tools present in the OS (Windows or Linux) that you use to access AWS resources. This blog post shows you how to do so.
Moving forward, the process that this post describes should greatly simplify automated task deployments and modification scenarios.
Following is a detailed description of how to run a DMS task automatically or schedule it for execution on Linux and Windows.
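For example, here is a minimal sketch of the kind of script the blog post describes, assuming Python with boto3 and a placeholder replication task ARN; cron on Linux or Task Scheduler on Windows would then invoke it at the desired time:

```python
# start_dms_task.py - minimal sketch, to be run from cron / Task Scheduler.
# The replication task ARN below is a placeholder for your own task.
import boto3

REPLICATION_TASK_ARN = "arn:aws:dms:us-east-1:123456789012:task:EXAMPLE"  # placeholder

dms = boto3.client("dms")

def start_task():
    # "start-replication" runs the task from the beginning; use
    # "resume-processing" or "reload-target" for other scenarios.
    response = dms.start_replication_task(
        ReplicationTaskArn=REPLICATION_TASK_ARN,
        StartReplicationTaskType="start-replication",
    )
    print("Task status:", response["ReplicationTask"]["Status"])

if __name__ == "__main__":
    start_task()
```

A crontab entry such as `0 2 * * * python /opt/scripts/start_dms_task.py` (the path is hypothetical) would then start the task every day at 02:00.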

Related

AWS ECS run latest task definition

I am trying to run the latest task definition image built from my GitHub deployment (CD). On AWS it creates a new task definition revision, for example "task-api:1", "task-api:2", but my cluster is still running task-api:1 even though a new image has been built and a newer revision exists. So far I have to manually stop the old task and start a new one. How can I have it automated?
You must wrap your tasks in a service and use rolling updates for automated deployments.
When the rolling update (ECS) deployment type is used for your service and a new service deployment is started, the Amazon ECS service scheduler replaces the currently running tasks with new tasks.
Read: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/deployment-type-ecs.html
This is DevOps, so you need a CI/CD pipeline that will do the rolling updates for you. Look at CodeBuild, CodeDeploy and CodePipeline (and CodeCommit if you integrate your code repository in AWS with your CI/CD)
Read: https://docs.aws.amazon.com/codepipeline/latest/userguide/tutorials-ecs-ecr-codedeploy.html
This is a complex topic, but it pays off in the end.
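Before the full pipeline is in place, a rolling update can also be triggered programmatically. Here is a minimal sketch, assuming Python/boto3 and placeholder cluster and service names; if the task definition references a floating image tag such as :latest, forcing a new deployment makes the replacement tasks pull the freshly pushed image:

```python
# Force an ECS service to start a new rolling deployment.
# Cluster and service names below are placeholders.
import boto3

ecs = boto3.client("ecs")

# Re-deploy the service with its current task definition. With a floating
# image tag (e.g. :latest), the new tasks pull the freshly pushed image.
response = ecs.update_service(
    cluster="my-cluster",          # placeholder
    service="task-api-service",    # placeholder
    forceNewDeployment=True,
)
print(response["service"]["deployments"])
```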
Judging from what you have said in the comments:
I created my task via the AWS console, I am running just the task definition on its own without service plus service with task definition launched via the EC2 not target both of them, so in the task definition JSON file on my Github both repositories they are tied to a revision of a task (could that be a problem?).
It's difficult to understand exactly how you have this set up and it'd probably be a good idea for you to go back and understand the services you are using a little better using the guide you are following or AWS documentation. Pushing a new task definition does not automatically update services to use the new definition.
That said, my guess is that you need to update the service in ECS to use the latest task definition. You can do that in many ways:
Through the console (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/update-service-console-v2.html).
Through the CLI (https://docs.aws.amazon.com/cli/latest/reference/ecs/update-service.html).
Through the IaC like the CDK (https://docs.aws.amazon.com/cdk/api/latest/docs/aws-ecs-readme.html).
This can be automated but you would need to set up a process to automate it.
I would recommend reading some guides on how you could automate deployment and updates using the CDK. Amazon provides a good guide to get you started: https://docs.aws.amazon.com/cdk/latest/guide/ecs_example.html.
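For illustration, a hedged boto3 sketch of the CLI/SDK route: it resolves the newest active revision of the task definition family and points the service at it (family, cluster, and service names are placeholders):

```python
# Point an ECS service at the latest revision of a task definition family.
# Family, cluster and service names are placeholders for your own values.
import boto3

ecs = boto3.client("ecs")

# describe_task_definition with just the family name resolves to the
# latest ACTIVE revision of that family.
latest = ecs.describe_task_definition(taskDefinition="task-api")
latest_arn = latest["taskDefinition"]["taskDefinitionArn"]

ecs.update_service(
    cluster="my-cluster",        # placeholder
    service="task-api-service",  # placeholder
    taskDefinition=latest_arn,
)
print("Service now uses", latest_arn)
```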

AWS Batch vs AWS CodeBuild

I am very new to AWS, and when I was searching for something to download code from GitHub (a Python project), run it, and save the output to S3, the first service that I found was CodeBuild.
So I implemented this kind of workflow using CodeBuild.
But now I have seen that AWS has a service called AWS Batch, and I am wondering if I should migrate my architecture to AWS Batch.
Can you explain which one, AWS CodeBuild or AWS Batch, is more suitable for my case? When should I use AWS Batch instead of AWS CodeBuild?
Thank you very much.
TL;DR: AWS CodeBuild is the nicer choice for simple jobs.
My (reverse) experience...
I needed to run a simple job that pulls data from an external API, reads/writes to an external database, and generates a CSV report.
The job takes ~1 hour to run, so AWS Lambda is out of the picture.
After some googling, I found AWS Batch and decided to give Creating a Simple “Fetch & Run” AWS Batch Job a try.
The steps required to get this "simple" job working:
Build a Docker image with the fetch & run script
Create an Amazon ECR repository for the image
Push the built image to ECR
Create a simple job script and upload it to S3
Create an IAM role to be used by jobs to access S3
Configure a compute environment
Create a job queue
Create a job definition that uses the built image
Submit and run a job that executes the job script from S3
After spending the time to create all these resources, the job did not work out of the box. I found myself debugging random things I shouldn't have had to debug, such as:
Dockerfile
entrypoint script
ECS cluster
EC2 instance and autoscaling group
After failing to find simple practical examples, and realizing the amount of effort required, I decided to explore other solutions.
I stumbled onto Using AWS CodeBuild to execute administrative tasks and this post.
I've used AWS CodeBuild in the past for CI/CD pipelines, and thought "what the hell, let's give it a try". In a shorter amount of time, and with less effort, I was able to get a "CodeBuild job" running on a CloudWatch scheduler, with CodeBuild Slack notifications added (a sketch of the scheduling piece follows the list below):
Connect build project to your source code
Select a runtime environment
Create IAM role
Create a buildspec.yml and add runtime commands
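For reference, a hedged boto3 sketch of the scheduling piece mentioned above: a CloudWatch Events (EventBridge) rule that starts the CodeBuild project on a fixed rate. The project name, rule name, and IAM role ARN are placeholders:

```python
# Schedule a CodeBuild project with a CloudWatch Events (EventBridge) rule.
# The project name, rule name and IAM role ARN are placeholders.
import boto3

events = boto3.client("events")
codebuild = boto3.client("codebuild")

# Look up the ARN of the existing CodeBuild project.
project = codebuild.batch_get_projects(names=["my-report-job"])["projects"][0]

rule = events.put_rule(
    Name="run-report-job-hourly",            # placeholder
    ScheduleExpression="rate(1 hour)",
    State="ENABLED",
)

events.put_targets(
    Rule="run-report-job-hourly",
    Targets=[{
        "Id": "codebuild-report-job",
        "Arn": project["arn"],
        # Role that allows events.amazonaws.com to call codebuild:StartBuild.
        "RoleArn": "arn:aws:iam::123456789012:role/events-codebuild-role",  # placeholder
    }],
)
print("Rule ARN:", rule["RuleArn"])
```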
One major advantage is that CodeBuild runs tasks in a full-blown Linux environment.
Drawbacks:
Max execution time of 8 hours
AWS CodeBuild was much easier to get working for my simple job.
Sorry for the long post, I just wanted to share my experience with these two services.
AWS Batch is meant for highly parallel computations, e.g., processing a large number of images at the same time:
AWS Batch enables you to run batch computing workloads on the AWS Cloud. Batch computing is a common way for developers, scientists, and engineers to access large amounts of compute resources, and AWS Batch removes the undifferentiated heavy lifting of configuring and managing the required infrastructure, similar to traditional batch computing software.
Thus it's not suited for what you are trying to do. CodeBuild is the better choice, based on your description.
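To illustrate the kind of workload Batch targets, here is a hedged boto3 sketch that submits an array job, fanning one job definition out into many parallel child jobs; the queue and job definition names are placeholders:

```python
# Submit an AWS Batch array job: one submission fans out into many
# parallel child jobs, each processing a different shard of the input.
# Queue and job definition names are placeholders.
import boto3

batch = boto3.client("batch")

response = batch.submit_job(
    jobName="resize-images",
    jobQueue="image-processing-queue",        # placeholder
    jobDefinition="image-resizer:3",          # placeholder
    arrayProperties={"size": 1000},           # 1000 parallel child jobs
    containerOverrides={
        # Each child job can read AWS_BATCH_JOB_ARRAY_INDEX to pick its shard.
        "environment": [{"name": "IMAGE_BUCKET", "value": "my-images"}],
    },
)
print("Submitted job", response["jobId"])
```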

Starting AWS DMS Replication Task in Terraform

Is there any way to start an AWS Database Migration Service full-load-and-cdc replication task through Terraform? Preferably, this would start automatically upon creation of the task.
The AWS DMS console provides an option to "Start task on create", and the AWS CLI provides a start-replication-task command, but I'm not seeing similar options in the Terraform resources. The aws_dms_replication_task provides the cdc_start_time argument, but I think this may apply only to cdc tasks. I've tried setting this argument to a number of past/current/future timestamps with my full-load-and-cdc replication task, but the task never started (it was merely created and entered the ready state).
I'd be glad to log a feature request to Terraform if this feature is not supported, but wanted to check with the community first to see if I was overlooking an existing way to do this today.
(Note: This question has also been logged to the Terraform Google group.)
I've logged an issue for this feature request:
Terraform AWS Provider #2083: Support Starting AWS Database Migration Service Replication Task

Exploring tools to trigger a build script to roll out a specific Git branch to a subset of Amazon EC2 instances

We have multiple Amazon EC2 instances behind a load balancer. Our build script is written in Phing and is integrated with Git.
We are looking for a tool (like Jenkins or AWS CodeDeploy) that could display all the active instances currently behind the load balancer, allow us to select some of them (or a previously defined group), and then trigger either of the following (whichever is better):
a build script hosted on the same dedicated server where the tool is hosted,
or the respective build scripts hosted on the selected EC2 instances.
We should be able to do the following:
specify a Git branch name, optionally, when we trigger the build script for any group of instances.
roll out in batches of boxes, so as to get some time to monitor load, and then move to the next batch if all is good. The best way, I guess, would be to specify a batch size (e.g. 10), so that the process waits for a user prompt after the rollout of each batch completes.
So, if we have to roll out two different Git branches to two groups of instances, we should be able to run them in two steps (if we do not specify a batch size).
Would like to know about experiences of people who dealt with something similar.
CodeDeploy supports Git (more precisely, GitHub). It also allows you to deploy only to tagged EC2 instances. If combined with a custom DeploymentConfig (http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-create-deployment-configuration.html), you can also control how fast (the size of each batch) to deploy.
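As an illustration of that batch-size control, here is a hedged boto3 sketch that creates a custom deployment configuration; the name and percentage are placeholders you would tune to your fleet:

```python
# Create a custom CodeDeploy deployment configuration that limits how many
# instances are taken out of service at once. Name and value are placeholders.
import boto3

codedeploy = boto3.client("codedeploy")

# Requiring 75% of the fleet to stay healthy means CodeDeploy updates at
# most 25% of the instances at a time, giving you rolling batches.
config = codedeploy.create_deployment_config(
    deploymentConfigName="OneQuarterAtATime",   # placeholder
    minimumHealthyHosts={
        "type": "FLEET_PERCENT",
        "value": 75,
    },
)
print("Deployment config id:", config["deploymentConfigId"])
```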
I would restructure the question into two parts:
the choices you have for application deployment,
and whether the tool has an option to perform rolling deployments.
Jenkins is CI/CD software; it would have to use plugins, custom scripting, or an existing orchestration setup to perform the deployments.
For orchestration software, you have many choices; some of the better-known tools are Chef, Puppet, Ansible, etc. All of these require you to manage some kind of centralized setup, and all of them support application deployment.
You need to make a decision on whether you would want to invest in maintaining such a setup.
If you decide against such a setup, you have the option of using managed services such as AWS OpsWorks, AWS CodeDeploy, hosted Chef, etc.
In choosing any of these services, you delegate the management of the orchestration software to a vendor, who ensures the service is up all the time.
AWS CodeDeploy and AWS OpsWorks are managed services on AWS and work pretty well with AWS setups.
AWS OpsWorks uses Chef under the hood.
AWS CodeDeploy provides only a subset of what OpsWorks provides and is responsible only for deployments. With AWS CodeDeploy you get a convenient visualization of your software deployments through the AWS console.
With AWS CodeDeploy, you can achieve the goal of a partial rollout to EC2 instances.
You can do the same with other tools as well, but CodeDeploy in an AWS environment will take the least amount of work.
CodeDeploy also allows you to deploy from Git. Please refer to the following AWS documentation:
http://docs.aws.amazon.com/codedeploy/latest/userguide/github-integ-tutorial.html
The pitfall with CodeDeploy is that the agent that runs on the instances has been tested and is supported for only a limited set of operating systems (http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-run-agent.html#how-to-run-agent-supported-oses).
Also, if you decide to move away from AWS in the future, you will have to redo the deployment-related work.
The CodeDeploy service only charges you for the underlying AWS resources.
Please find the link to pricing documentation below:
https://aws.amazon.com/codedeploy/pricing/

Background Process in AWS

I am new to AWS and have a question about an application I'm trying to write. I have a bunch of data that sits within Amazon RDS. On a periodic basis, I would like a small snippet of code to run against this data and in certain situations have notifications sent where appropriate. Of all the AWS services, what is the best architecture for this?
Thanks
You could use a simple cron job running on an EC2 instance. The cron job could run a script (PHP, Perl, whatever) to go fetch the data and then do something with it (notify people, generate reports etc)
Does that help?
See here for details on getting started with a Linux instance: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html
You could achieve the same results using a Windows machine and Scheduled Tasks. Here's the getting started guide for Windows instances: http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/EC2Win_GetStarted.html
You can use a scheduled Lambda function driven by CloudWatch Events. It acts as a resilient cron.
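A minimal sketch of such a handler, assuming Python; the database check is a placeholder and the SNS topic ARN is hypothetical. The function would be wired to a CloudWatch Events schedule expression such as rate(1 hour):

```python
# Lambda handler invoked on a CloudWatch Events / EventBridge schedule.
# The data check is a placeholder; the SNS topic ARN is hypothetical.
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:data-alerts"  # placeholder

def fetch_interesting_rows():
    # Placeholder: query your RDS database here (e.g. with a DB driver
    # bundled in the deployment package) and return rows needing attention.
    return []

def handler(event, context):
    rows = fetch_interesting_rows()
    if rows:
        # Notify only when the periodic check actually finds something.
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject="Periodic RDS check",
            Message=f"{len(rows)} rows need attention",
        )
    return {"rows_flagged": len(rows)}
```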