I have two AWS accounts. I develop code in one CodeCommit repository. Once it is done, I need to clone that code into the other account's CodeCommit repository. Is there a way to do that using a Lambda function, or any other method, to automate the process?
Please help me, it has been a real headache for more than a month. :)
There are several ways of doing that. Essentially, what you'll need is a trigger that kicks off the replication into the other account after each commit. Below are two documented approaches.
Lambda + Fargate
The first one uses a combination of Lambda, for which you can configure CodeCommit as a trigger. The Lambda function then runs a Fargate task, which in turn replicates the repository using git clone --mirror. Fargate is used here because replicating larger repositories might exceed the temporary storage that Lambda can allocate.
https://aws.amazon.com/blogs/devops/replicate-aws-codecommit-repository-between-regions-using-aws-fargate/
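If it helps to picture the Lambda side of that setup, here is a minimal sketch, assuming the ECS cluster, task definition, container name and networking already exist. Every name and ID below is a placeholder, not a value from the blog post.

```python
# Minimal sketch: Lambda triggered by CodeCommit launches a one-off Fargate task.
import boto3

ecs = boto3.client("ecs")

def handler(event, context):
    # The CodeCommit trigger event carries the repository ARN; the last segment is the repo name.
    repository = event["Records"][0]["eventSourceARN"].split(":")[-1]

    # Run a one-off Fargate task that performs `git clone --mirror` + `git push --mirror`.
    ecs.run_task(
        cluster="repo-mirror-cluster",          # placeholder
        launchType="FARGATE",
        taskDefinition="repo-mirror-task",      # placeholder
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],         # placeholder
                "securityGroups": ["sg-0123456789abcdef0"],      # placeholder
                "assignPublicIp": "ENABLED",
            }
        },
        overrides={
            "containerOverrides": [
                {"name": "mirror", "environment": [{"name": "REPO_NAME", "value": repository}]}
            ]
        },
    )
```

The heavy lifting (the actual git mirror) happens inside the Fargate container, not in the Lambda function itself.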
CodePipeline + CodeBuild
This is probably the "cleaner" variant, as it uses native CI/CD tooling in AWS, making it easier to set up than ECS/Fargate, amongst other advantages.
Here you're setting up AWS CodePipeline, which will monitor the CodeCommit repository for any changes. When a commit is detected, it will trigger CodeBuild, which in turn runs the same git command outlined earlier.
https://medium.com/geekculture/replicate-aws-codecommit-repositories-between-regions-using-codebuild-and-codepipeline-39f6b8fcefd2
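For reference, the replication step in both approaches boils down to two git commands. Here is a rough Python sketch of the equivalent; in CodeBuild this would normally just be shell commands in the buildspec, the repository URLs below are placeholders, and credentials are assumed to come from the CodeCommit git credential helper.

```python
# Sketch: mirror-clone the source repo and push all refs to the destination repo.
import subprocess
import tempfile

SRC_URL = "https://git-codecommit.us-east-1.amazonaws.com/v1/repos/source-repo"   # placeholder
DST_URL = "https://git-codecommit.us-east-1.amazonaws.com/v1/repos/replica-repo"  # placeholder

def mirror(src_url: str, dst_url: str) -> None:
    """Mirror the source repository (all branches and tags) into the destination."""
    with tempfile.TemporaryDirectory() as workdir:
        repo_dir = f"{workdir}/repo.git"
        subprocess.run(["git", "clone", "--mirror", src_url, repo_dir], check=True)
        subprocess.run(["git", "push", "--mirror", dst_url], cwd=repo_dir, check=True)

if __name__ == "__main__":
    mirror(SRC_URL, DST_URL)
```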
Assuming that you have repo 1 in account A and repo 2 in account B, and you want to sync repo 1 -> repo 2:
The easiest way is to do the following:
create an SNS topic in account A
enable notifications for repo 1, and send all events to the SNS topic
create a Lambda function that subscribes to the SNS topic
make sure you follow this guide https://docs.aws.amazon.com/codecommit/latest/userguide/cross-account.html to grant the Lambda function cross-account CodeCommit permissions
write a Python function that decides which git events you want to replicate. If you just want to sync the main branch and ignore all other branches, you can say something like: if event["source_ref"].endswith("main"), then use the boto3 CodeCommit API https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/codecommit.html (take a look at batch_get_commits) to commit the change to the remote CodeCommit repo. A rough sketch follows below.
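Here is that rough sketch. The exact shape of the SNS message depends on how the notification/trigger is configured, so the parsing below is an assumption you will need to adapt; the repository names, the cross-account role ARN and the branch filter are placeholders.

```python
# Sketch: SNS-triggered Lambda that filters pushes to main and prepares a
# cross-account CodeCommit client for the replication.
import json
import boto3

SOURCE_REPO = "repo-1"   # placeholder, account A
DEST_REPO = "repo-2"     # placeholder, account B
CROSS_ACCOUNT_ROLE = "arn:aws:iam::222222222222:role/codecommit-replication"  # placeholder

source_cc = boto3.client("codecommit")

def dest_codecommit_client():
    """CodeCommit client for account B, using the cross-account role from the guide above."""
    creds = boto3.client("sts").assume_role(
        RoleArn=CROSS_ACCOUNT_ROLE, RoleSessionName="repo-replication"
    )["Credentials"]
    return boto3.client(
        "codecommit",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

def handler(event, context):
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    for record in message.get("Records", []):
        for reference in record["codecommit"]["references"]:
            # Only replicate pushes to main; ignore every other branch.
            if not reference["ref"].endswith("main"):
                continue
            commit_id = reference["commit"]
            commit = source_cc.batch_get_commits(
                commitIds=[commit_id], repositoryName=SOURCE_REPO
            )["commits"][0]
            dest = dest_codecommit_client()
            # Sanity-check access to the destination repo; the actual replication
            # (get_differences / get_blob on the source, put_file / create_commit
            # on the destination) goes here.
            dest.get_repository(repositoryName=DEST_REPO)
            print(f"Would replicate commit {commit_id}: {commit.get('message', '').strip()}")
```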
However, I really doubt that you actually need to do this. How about just dumping the whole git history as a zip to S3 in your remote account, and importing it every time you see any changes? I believe your remote account is mostly READ ONLY and just serves as a backup. If you only need a backup, you can just dump to S3 and don't even need to import.
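For that "just dump it to S3" route, a single git bundle per push is enough. A rough sketch, with the bucket name and repository URL as placeholders:

```python
# Sketch: dump the full git history as one bundle file and upload it to S3.
import subprocess
import tempfile
import boto3

REPO_URL = "https://git-codecommit.us-east-1.amazonaws.com/v1/repos/repo-1"  # placeholder
BACKUP_BUCKET = "my-repo-backups"                                            # placeholder

def backup_repository():
    with tempfile.TemporaryDirectory() as workdir:
        repo_dir = f"{workdir}/repo.git"
        bundle_path = f"{workdir}/repo-1.bundle"
        subprocess.run(["git", "clone", "--mirror", REPO_URL, repo_dir], check=True)
        # A bundle contains all refs and objects, so it can be cloned from directly later.
        subprocess.run(["git", "bundle", "create", bundle_path, "--all"], cwd=repo_dir, check=True)
        boto3.client("s3").upload_file(bundle_path, BACKUP_BUCKET, "repo-1.bundle")
```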
I am looking for a way to trigger a Jenkins job whenever the S3 bucket is updated with a particular file format.
I have tried a Lambda function method with an "Add trigger -> S3 bucket PUT method". I have followed this article, but it's not working. I have explored and found out that AWS SNS and AWS SQS can also be used for this, but the problem is that some people say this is outdated. So which is the simplest way to trigger the Jenkins job when the S3 bucket is updated?
I just want a trigger whenever the zip file is uploaded from job A in jenkins1 to the S3 bucket called 'testbucket' in AWS environment2. Both Jenkins instances are in different AWS accounts under separate private VPCs. I have attached my Jenkins workflow as a picture. Please refer to the picture below.
The approach you are using seems solid and a good way to go. I'm not sure what specific issue you are having, so I'll list a couple of things that could explain why this might not be working for you:
Permissions issue - Check to ensure that the Lambda can be invoked by the S3 service. If you are doing this in the console (manually) then you probably don't have to worry about that, since the permissions should be set up automatically. If you're doing this through infrastructure as code, then it's something you need to add.
Lambda VPC config - Your Lambda will need to be configured to run in your VPC, in a subnet that has network access to the Jenkins instance. Lambda by default is not associated with a VPC and will not have access to the Jenkins instance (unless it's publicly available over the internet).
I found this other Stack Overflow post that describes the SNS/SQS setup if you want to continue down that path: Trigger Jenkins job when a S3 file is updated
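For reference, the Lambda side of the S3 -> Jenkins approach is usually just a call to Jenkins' remote build trigger URL. A minimal sketch, assuming the job has "Trigger builds remotely" enabled with an authentication token; the Jenkins URL, job name and token below are placeholders.

```python
# Sketch: Lambda invoked by the S3 PUT event triggers a parameterized Jenkins job.
import urllib.parse
import urllib.request

JENKINS_URL = "https://jenkins.internal.example.com"  # placeholder, must be reachable from the Lambda's VPC
JOB_NAME = "deploy-from-s3"                           # placeholder
JOB_TOKEN = "my-trigger-token"                        # placeholder, configured on the Jenkins job

def handler(event, context):
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        # Only react to the file format we care about.
        if not key.endswith(".zip"):
            continue
        params = urllib.parse.urlencode({"token": JOB_TOKEN, "S3_KEY": key})
        url = f"{JENKINS_URL}/job/{JOB_NAME}/buildWithParameters?{params}"
        with urllib.request.urlopen(url) as response:
            print(f"Triggered {JOB_NAME} for {key}: HTTP {response.status}")
```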
I want the code in my CodeCommit repository to be used as the Lambda function's code, instead of writing the complete code in Lambda itself. Is it possible to achieve that?
Not by default in the console or CLI, but this is very common in CI/CD pipelines.
CodeCommit -> CodeBuild (or some other tools) -> Lambda
In more detail: usually there is a pipeline (CodePipeline, CircleCI, Jenkins, etc.) that is triggered by a commit to the repo. The pipeline clones the code, CodeBuild (or some other tool) processes it, and then the pipeline deploys it to Lambda.
In the case of Lambda code preparation, the usual process in CodeBuild is to zip the Lambda code and publish it to an artifacts bucket.
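The deploy step then typically boils down to pointing the function at the new artifact. A minimal sketch of that step with boto3; the function name, bucket and key are placeholders.

```python
# Sketch: point an existing Lambda function at the zip that CodeBuild published to S3.
import boto3

lambda_client = boto3.client("lambda")

def deploy(function_name: str, artifact_bucket: str, artifact_key: str) -> None:
    lambda_client.update_function_code(
        FunctionName=function_name,
        S3Bucket=artifact_bucket,
        S3Key=artifact_key,
        Publish=True,  # publish a new version so each deployment is traceable
    )

# Example usage (placeholder values):
deploy("my-function", "my-artifacts-bucket", "builds/my-function-1.2.3.zip")
```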
I am trying to understand the correct way to setup my project on AWS so that I ultimately get the possibility to have CI/CD on the lambda functions. And also to ingrain good practices.
My application is quite simple: an API that calls Lambda functions based on users' requests.
I have deployed the application using AWS SAM. For that, I used a SAM template that was using local paths to the Lambda functions' code and that created the necessary AWS resources (API Gateway and Lambda). It was necessary to use local paths for the Lambda functions because the way SAM works does not allow using existing S3 buckets for S3 event triggers (see here), and I deploy a Lambda function that watches the S3 bucket for any updated code in order to trigger Lambda updates.
Now what I have to do is push my Lambda code to GitHub, and have a way for the Lambda functions' code to be pushed from GitHub to the S3 bucket (and the correct prefix) created during the SAM deploy. What I would like is a way to do that automatically upon a GitHub push.
What is the preferred way to achieve that? I could not find clear information in the AWS documentation. Also, if you see a clear flaw in my process, don't hesitate to point it out.
What you're looking to do is a standard CI/CD pipeline.
The steps of your pipeline will be (more or less): Pull code from GitHub -> Build/Package -> Deploy
You want this pipeline to be triggered upon a push to GitHub, this can be done by setting up a Webhook which will then trigger the pipeline.
The last two steps are supported by SAM, which I think you have already implemented, so it will be a matter of triggering the same from the pipeline.
These capabilities are supported by most CI/CD tools, if you want to keep everything in AWS you could use CodePipeline which also supports GitHub integration. Nevertheless, Jenkins is perfectly fine and suitable for your use case as well.
There are a lot of ways you can do it, so it would eventually depend on how you decide to do it and what tools you are comfortable with. If you want to use native AWS tools, then CodePipeline is what might be useful.
You can use CDK for that
https://aws.amazon.com/blogs/developer/cdk-pipelines-continuous-delivery-for-aws-cdk-applications/
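For a sense of what that looks like, here is a minimal CDK Pipelines sketch in Python, assuming a GitHub token is stored in Secrets Manager under the default name CDK expects; the organization, repo and branch names are placeholders.

```python
# Sketch: a self-mutating CDK pipeline that rebuilds on every push to a GitHub branch.
from aws_cdk import Stack
from aws_cdk.pipelines import CodePipeline, CodePipelineSource, ShellStep
from constructs import Construct

class PipelineStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        CodePipeline(
            self,
            "Pipeline",
            synth=ShellStep(
                "Synth",
                input=CodePipelineSource.git_hub("my-org/my-repo", "main"),  # placeholders
                commands=[
                    "npm install -g aws-cdk",
                    "pip install -r requirements.txt",
                    "cdk synth",
                ],
            ),
        )
```

Application stages (for example the stack containing your Lambda functions) are then added to the pipeline with add_stage; the blog post above walks through that part.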
If you are not familiar with CDK and would prefer CloudFormation, then this can get you started.
https://docs.aws.amazon.com/codepipeline/latest/userguide/tutorials-github-gitclone.html
Well, I would like to prevent some types of commits from triggering an AWS CodePipeline, but I can't find any configuration for this in the Source phase:
But if AWS CodeBuild is not linked with AWS CodePipeline, I have access to more trigger options:
How can I configure trigger options using AWS CodePipeline?
You can do this by editing the CloudWatch Event for the pipeline. Using a Lambda function, you can look for a specific type of change in your commit. The example in the link below looks for changes to specific files - so if you change the readme.md file, for example, don't deploy.
https://aws.amazon.com/blogs/devops/adding-custom-logic-to-aws-codepipeline-with-aws-lambda-and-amazon-cloudwatch-events/
You could take this example further and look for specific flags in your commit message, for example.
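A rough sketch of the Lambda logic behind that pattern: the CloudWatch Events rule invokes the function instead of the pipeline directly, the function inspects the pushed commit, and it only starts the pipeline when something other than the ignored files changed. The repository, pipeline and file names below are placeholders.

```python
# Sketch: start the pipeline only when the commit touches files we care about.
import boto3

codecommit = boto3.client("codecommit")
codepipeline = boto3.client("codepipeline")

REPOSITORY = "my-repo"          # placeholder
PIPELINE = "my-pipeline"        # placeholder
IGNORED_FILES = {"readme.md"}   # changes touching only these files won't deploy

def handler(event, context):
    # The "CodeCommit Repository State Change" event puts the pushed commit id in the detail.
    commit_id = event["detail"]["commitId"]
    commit = codecommit.get_commit(repositoryName=REPOSITORY, commitId=commit_id)["commit"]
    parents = commit.get("parents", [])

    diff_kwargs = {"repositoryName": REPOSITORY, "afterCommitSpecifier": commit_id}
    if parents:
        diff_kwargs["beforeCommitSpecifier"] = parents[0]
    differences = codecommit.get_differences(**diff_kwargs)["differences"]

    changed_files = {
        (d.get("afterBlob") or d.get("beforeBlob"))["path"] for d in differences
    }

    if changed_files - IGNORED_FILES:
        codepipeline.start_pipeline_execution(name=PIPELINE)
        print(f"Started {PIPELINE} for commit {commit_id}")
    else:
        print(f"Skipped {PIPELINE}: only {changed_files} changed")
```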
I want to schedule aws s3 sync s3://bucket1 s3://bucket2 command to run everyday at defined time, say 3 AM.
What options do we have to schedule this using aws resources like lambda etc?
I saw many people using the Windows scheduler, but as this is an S3-to-S3 sync, using a server's Windows scheduler to run this command through the CLI is not a good option.
This sounds like a case of The X-Y Problem. That is, it's likely "scheduling an AWS CLI command to run" is not your underlying problem. I'd urge you to consider whether your problem is actually "getting one S3 bucket to exactly replicate the contents of another".
On this point, you have multiple options. These fall broadly into two categories:
Actively sync objects from bucket A to bucket B. This can be done using any number of methods already mentioned, including your idea of scheduling the AWS CLI command.
Lean on S3's built-in replication. This is probably what you want.
The reason S3 replication was implemented by AWS is to solve exactly this problem. Unless you have additional considerations (if you do, please update your question so we can better answer it :) ), replication is likely your best, easiest, and most reliable option.
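If you go the replication route and want to set it up programmatically rather than in the console, here is a minimal sketch with boto3. The bucket names and the IAM role ARN are placeholders, and versioning must already be enabled on both buckets; note that a new replication rule only applies to objects created after it is added.

```python
# Sketch: replicate every new object from bucket1 to bucket2 using S3 replication.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_replication(
    Bucket="bucket1",  # placeholder source bucket
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # placeholder
        "Rules": [
            {
                "ID": "replicate-everything",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},          # empty prefix = all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": "arn:aws:s3:::bucket2"},  # placeholder destination
            }
        ],
    },
)
```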
There are so many ways to do this, I'll elaborate on the ones I use.
CloudWatch Events to trigger whatever is going to perform your task. You can use it just like a crontab (a scheduling sketch is included at the end of this answer).
Lambda functions:
1 - give the Lambda function an IAM role that allows reading from bucket1 and writing to bucket2, and then call the API (a sketch follows after this list).
2 - since the AWS CLI is a Python tool, you could embed the AWS CLI as a Python dependency and use it within your function.
Here's a link to a tutorial:
https://bezdelev.com/hacking/aws-cli-inside-lambda-layer-aws-s3-sync/
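As a sketch of option 1 (calling the API directly with boto3 instead of embedding the CLI), here is a simplified bucket-to-bucket copy. It copies new or changed keys but does not delete extra ones, and the bucket names are placeholders.

```python
# Sketch: simplified "sync" from bucket1 to bucket2 using plain boto3 (no deletions).
import boto3

s3 = boto3.client("s3")
SOURCE_BUCKET = "bucket1"       # placeholder
DESTINATION_BUCKET = "bucket2"  # placeholder

def handler(event, context):
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=SOURCE_BUCKET):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            try:
                dest = s3.head_object(Bucket=DESTINATION_BUCKET, Key=key)
                if dest["ETag"] == obj["ETag"]:
                    continue  # already in sync
            except s3.exceptions.ClientError:
                pass  # not present in the destination yet
            s3.copy_object(
                Bucket=DESTINATION_BUCKET,
                Key=key,
                CopySource={"Bucket": SOURCE_BUCKET, "Key": key},
            )
```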
Docker+ECS Fargate:
0 - pick any Docker image with the AWS CLI preinstalled, like this one
1 - create an ECS Fargate cluster (it will cost you nothing)
2 - create an ECS task definition, and inside it use the image you chose in step 0 and for the command put "aws s3 sync s3://bucket1 s3://bucket2"
3 - create a schedule that will use the task definition created in step 2
Additional considerations:
Those are the ones I would use. You could also have CloudWatch trigger a CloudFormation stack that creates an EC2 instance and uses the user data field to run the sync, or you could create an AMI of an EC2 instance that runs the sync command and then a halt command from /etc/rc.local, and there are several other options that work. But I'd advise you to go with the Lambda option, unless your sync job takes more than 15 minutes (which is Lambda's timeout), in which case I'd go with the Docker option.
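And for the 3 AM schedule itself, here is a minimal sketch of wiring a CloudWatch Events / EventBridge rule to the Lambda with boto3 (the rule name, function ARN and statement id are placeholders, the cron expression is in UTC, and you can of course do the same thing in the console or CloudFormation).

```python
# Sketch: schedule the sync Lambda to run every day at 03:00 UTC.
import boto3

events = boto3.client("events")
lambda_client = boto3.client("lambda")

RULE_NAME = "daily-s3-sync"                                              # placeholder
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:s3-sync"  # placeholder

# Run every day at 03:00 UTC.
rule_arn = events.put_rule(
    Name=RULE_NAME,
    ScheduleExpression="cron(0 3 * * ? *)",
    State="ENABLED",
)["RuleArn"]

# Point the rule at the sync Lambda...
events.put_targets(Rule=RULE_NAME, Targets=[{"Id": "s3-sync-lambda", "Arn": FUNCTION_ARN}])

# ...and allow CloudWatch Events to invoke it.
lambda_client.add_permission(
    FunctionName=FUNCTION_ARN,
    StatementId="allow-events-daily-s3-sync",  # placeholder
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule_arn,
)
```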