AWS: How to specify the files whose changes trigger a CodePipeline build? - amazon-web-services

I have a monorepo containing several sub-projects (microservices) stored on GitHub. I am trying to build it using a combination of AWS CodePipeline and CodeBuild. I would like to start pipelines depending on which sub-project has been changed. For example, if a file in the service-1 folder is changed I want to run the service-1 pipeline, and I don't want to run builds for service-2 or service-3.
In Google Cloud Build it is possible to specify "included files" and "ignored files". I am trying to find exactly the same thing in AWS.
Question: How do I specify the files/folders whose changes should trigger a CodePipeline build?

Like with many services in AWS, you'd have to build this yourself. There is a blog post on how to customize triggers. It roughly consists of the following:
Create a CloudWatch Events rule that triggers a Lambda function on CodeCommit repository updates.
Write the logic that suits your use case in that Lambda function (a rough sketch is shown after the article link below).
The Lambda function eventually emits a custom CloudWatch event.
Configure your pipelines to trigger based on the custom event payload.
Article can be found at: https://aws.amazon.com/blogs/devops/adding-custom-logic-to-aws-codepipeline-with-aws-lambda-and-amazon-cloudwatch-events/
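For illustration, here is a minimal sketch of what that Lambda could look like, assuming a CodeCommit source as in the blog post. The SERVICE_PATHS mapping, the custom event source/detail-type names, and the reliance on the oldCommitId field are all assumptions made for this sketch, not details from the article:

```
import json
import boto3

codecommit = boto3.client("codecommit")
events = boto3.client("events")

# Hypothetical mapping of monorepo folders to the pipelines they should trigger.
SERVICE_PATHS = {
    "service-1/": "service-1-pipeline",
    "service-2/": "service-2-pipeline",
    "service-3/": "service-3-pipeline",
}

def handler(event, context):
    # A "CodeCommit Repository State Change" event carries the repository name,
    # the new commit id and (for branch updates) the previous commit id.
    detail = event["detail"]

    # Diff the pushed commit against the previous tip of the branch.
    diff = codecommit.get_differences(
        repositoryName=detail["repositoryName"],
        beforeCommitSpecifier=detail["oldCommitId"],
        afterCommitSpecifier=detail["commitId"],
    )
    changed = [d["afterBlob"]["path"] for d in diff["differences"] if "afterBlob" in d]

    # Emit one custom event per affected sub-project; each pipeline would have a
    # CloudWatch/EventBridge rule matching its own name in the event detail.
    for prefix, pipeline in SERVICE_PATHS.items():
        if any(path.startswith(prefix) for path in changed):
            events.put_events(Entries=[{
                "Source": "custom.monorepo",
                "DetailType": "MonorepoServiceChanged",
                "Detail": json.dumps({"pipeline": pipeline}),
            }])
```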

Related

AWS Lambda CI/CD process

I am trying to understand the correct way to set up my project on AWS so that I can ultimately have CI/CD for the Lambda functions, and also to ingrain good practices.
My application is quite simple: an API that calls Lambda functions based on users' requests.
I have deployed the application using AWS SAM. For that, I used a SAM template that used local paths to the Lambda functions' code and that created the necessary AWS resources (API Gateway and Lambda). It was necessary to use local paths for the Lambda functions because the way SAM works does not allow using existing S3 buckets for S3 event triggers (see here), and I deploy a Lambda function that watches the S3 bucket for updated code in order to trigger Lambda updates.
Now what I have to do is push my Lambda code to GitHub, and have the Lambda functions' code pushed from GitHub to the S3 bucket created during the SAM deploy, under the correct prefix. What I would like is a way to do that automatically upon a GitHub push.
What is the preferred way to achieve that? I could not find clear information in the AWS documentation. Also, if you see a clear flaw in my process, don't hesitate to point it out.
What you're looking to do is a standard CI/CD pipeline.
The steps of your pipeline will be (more or less): Pull code from GitHub -> Build/Package -> Deploy
You want this pipeline to be triggered upon a push to GitHub; this can be done by setting up a webhook, which will then trigger the pipeline.
The last two steps are supported by SAM, which I think you have already used, so it will be a matter of triggering the same commands from the pipeline.
These capabilities are supported by most CI/CD tools, if you want to keep everything in AWS you could use CodePipeline which also supports GitHub integration. Nevertheless, Jenkins is perfectly fine and suitable for your use case as well.
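If you go the CodePipeline route, the GitHub (version 1) source action can be wired to a webhook via the API. A rough boto3 sketch, with placeholder pipeline/action names and secret, might look like this; note that the newer CodeStar Connections based GitHub source handles the trigger for you:

```
import boto3

codepipeline = boto3.client("codepipeline")

# All names and the secret below are placeholders for this sketch.
codepipeline.put_webhook(
    webhook={
        "name": "my-github-webhook",
        "targetPipeline": "my-sam-app-pipeline",
        "targetAction": "Source",
        "filters": [
            # Only trigger on pushes to the branch configured on the source action.
            {"jsonPath": "$.ref", "matchEquals": "refs/heads/{Branch}"}
        ],
        "authentication": "GITHUB_HMAC",
        "authenticationConfiguration": {"SecretToken": "replace-with-a-secret"},
    }
)

# Registers the webhook with GitHub so pushes reach CodePipeline.
codepipeline.register_webhook_with_third_party(webhookName="my-github-webhook")
```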
There are a lot of ways you can do this, so it eventually depends on how you decide to do it and what tools you are comfortable with. If you want to use native AWS tools, then CodePipeline is what might be useful.
You can use CDK for that
https://aws.amazon.com/blogs/developer/cdk-pipelines-continuous-delivery-for-aws-cdk-applications/
If you are not familiar with CDK and would prefer CloudFormation, then this can get you started.
https://docs.aws.amazon.com/codepipeline/latest/userguide/tutorials-github-gitclone.html
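If you do go with CDK Pipelines (the first link), a minimal Python skeleton along the lines of that blog post could look like the following; the repository name, branch, and synth commands are placeholders, and the GitHub connection/token still has to be configured separately:

```
from aws_cdk import Stack, pipelines
from constructs import Construct

class AppPipelineStack(Stack):
    """Minimal CDK Pipelines skeleton: pull from GitHub, synth, self-mutate, deploy."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        pipelines.CodePipeline(
            self, "Pipeline",
            synth=pipelines.ShellStep(
                "Synth",
                # "owner/repo" and "main" are placeholders.
                input=pipelines.CodePipelineSource.git_hub("owner/repo", "main"),
                commands=[
                    "npm install -g aws-cdk",
                    "pip install -r requirements.txt",
                    "cdk synth",
                ],
            ),
        )
        # Application stages (e.g. dev, prod) would be added here with add_stage().
```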

Amazon Rekognition Custom Labels

Currently trying to process a number of images simultaneously using custom labels via Postman. I'm a business client with AWS and have been on hold for over 30 minutes to speak with an engineer, but because AWS customer service sucks I'm asking the community if they can help. Rather than analyzing one image at a time, is there a way to analyze images all at once? Any help would be great, really need it at this time.
Nick
I don't think there is a direct API or SDK by AWS for asynchronous image processing with custom labels.
But the right workaround here can be introducing an event-based architecture yourself.
You can upload images in batch to S3 and configure S3 events to send the event notification to an SNS topic.
You can have your API subscribed to this SNS topic so that it receives the object name and bucket name. Then, within the API, you have the logic to use custom labels and store the results in a database like DynamoDB. This way, you can process images asynchronously.
Just make sure you have the right inference hours configured so you don't flood your systems, making them unavailable.
Hope this process solves your problem.
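As an illustration of that flow, a minimal Lambda sketch subscribed to the SNS topic might look like this; the table name, its key schema, and the model ARN are placeholders for this sketch:

```
import json
import boto3

rekognition = boto3.client("rekognition")
dynamodb = boto3.resource("dynamodb")

# Placeholder names for this sketch.
RESULTS_TABLE = "custom-label-results"
MODEL_ARN = "arn:aws:rekognition:<region>:<account>:project/<project>/version/<version>/<id>"

def handler(event, context):
    table = dynamodb.Table(RESULTS_TABLE)

    # Each SNS record wraps an S3 event notification.
    for record in event["Records"]:
        s3_event = json.loads(record["Sns"]["Message"])
        for s3_record in s3_event["Records"]:
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]

            # Run custom-label inference on the uploaded image.
            response = rekognition.detect_custom_labels(
                ProjectVersionArn=MODEL_ARN,
                Image={"S3Object": {"Bucket": bucket, "Name": key}},
            )

            # Persist the labels so they can be queried later.
            table.put_item(Item={
                "image_key": key,
                "labels": [
                    {"name": label["Name"], "confidence": str(label["Confidence"])}
                    for label in response["CustomLabels"]
                ],
            })
```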
You can achieve this by using a batch processing solution published by AWS.
Please refer this blog for the solution: https://aws.amazon.com/blogs/machine-learning/batch-image-processing-with-amazon-rekognition-custom-labels/
Also, the solution can be deployed from GitHub, where it is published as an AWS Sample: https://github.com/aws-samples/amazon-rekognition-custom-labels-batch-processing. If you are in a region for which the deployment button is not provided, please raise an issue.
Alternatively, you can deploy this solution using SAM. The solution is developed as an AWS Serverless Application Model (SAM) application, so it can be deployed with the SAM CLI using the following steps:
Install the sam cli - https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html
Download the code repository on your local machine
From within the folder, execute the following steps. The folder name is referenced as sam-app in the example below.
Step 1 - Build your application: cd sam-app, then sam build
Step 2 - Deploy your application: sam deploy --guided

How to enforce standards and controls when using CDK Pipeline

CDK Pipelines is great, especially for cross-account deployments. It enables developers to define and customize the CI/CD pipeline for their app to their heart's content.
But to remain SoC compliant, we need to make sure that necessary controls like the ones below are validated/enforced:
A manual approval stage should be present before the stage that does the cross-account deployment to production
Direct deployment to production bypassing dev/staging environment is not allowed
Test cases (Unit tests/Integration tests) and InfoSec tests should pass before deployment
I know that the above things are straightforward to implement in CDK Pipelines, but I am not quite sure how to ensure that every CDK pipeline always conforms to these standards.
I can think of the solutions below:
Branch restrictions - Merges to the master branch (which the CDK pipeline monitors) should be restricted and allowed only via pull requests
Tests - Add unit tests or integration tests which validate that the generated CloudFormation template has specific resources/properties (a sketch of such a test follows this question)
Create a standard production stage with all necessary controls defined and wrap it in a library which developers need to use in their definition of the CDK pipeline if they want to deploy to production
But how to enforce the above controls in an automated fashion? Developers can choose to bypass them by simply not specifying them while defining the pipeline. And we do not want to rely on an approver to check these things manually.
So in summary, the question is - When using CDK pipelines, how to give developers maximum customizability and freedom in designing their CI/CD solution while ensuring that SoC restrictions and mandatory controls are validated and enforced in an automated fashion?
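For the "Tests" idea above, a minimal sketch using CDK's assertions module might look like the following; PipelineStack and the module path are hypothetical stand-ins for whatever stack defines the pipeline in your app:

```
# test_pipeline_compliance.py
import aws_cdk as cdk
from aws_cdk.assertions import Template, Match

# Hypothetical import: whatever stack defines the CDK pipeline in your app.
from my_app.pipeline_stack import PipelineStack

def test_pipeline_contains_manual_approval():
    app = cdk.App()
    stack = PipelineStack(app, "PipelineStack")
    template = Template.from_stack(stack)

    # Assert that the synthesized CodePipeline has at least one manual approval action.
    template.has_resource_properties(
        "AWS::CodePipeline::Pipeline",
        Match.object_like({
            "Stages": Match.array_with([
                Match.object_like({
                    "Actions": Match.array_with([
                        Match.object_like({
                            "ActionTypeId": Match.object_like({"Category": "Approval"})
                        })
                    ])
                })
            ])
        }),
    )
```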
Open Policy Agent might be helpful to check infra against custom policies in the pipeline.
https://aws.amazon.com/blogs/opensource/realize-policy-as-code-with-aws-cloud-development-kit-through-open-policy-agent/
After researching a lot, I concluded that implementing these checks via a custom AWS Config rule is the best approach.
Let's say we want to ensure that:
A manual approval stage is always present in every pipeline that has a stage that deploys to prod.
We need to:
Enable AWS Config and configure it to record all changes to CodePipeline
Create a custom AWS Config rule backed by an AWS Lambda function (say pipeline-compliance-checker)
The Lambda function gets triggered on every configuration change of any pipeline and receives the latest configuration of the pipeline in question
It parses the latest pipeline configuration and checks whether the pipeline has a manual approval stage before the stage that deploys to prod. If yes, it deems the pipeline COMPLIANT, else NON_COMPLIANT (a rough sketch of this checker is shown below)
Create an AWS EventBridge rule to send a notification to an SNS topic (say pipeline-non-compliance) when any pipeline is marked as NON_COMPLIANT - (doc)
Create another AWS Lambda function (say pipeline-compliance-enforcer) that is subscribed to that SNS topic. It stops the non-compliant pipeline in question (if it is in STARTED state) and then disables the incoming transition to the stage that deploys to prod. We can also delete the pipeline here if required.
I have tested the above setup and it fulfils the requirements.
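For reference, the compliance-checker Lambda could be sketched roughly as follows. The assumption that prod stages can be recognised by "prod" in the stage name, and the exact shape of the recorded configuration item, are specific to this sketch and should be verified against what AWS Config actually records:

```
import json
import boto3

config = boto3.client("config")

def has_approval_before_prod(pipeline):
    """Return True if a manual approval stage appears before the prod deploy stage."""
    approval_seen = False
    for stage in pipeline.get("stages", []):
        actions = stage.get("actions", [])
        if any(a.get("actionTypeId", {}).get("category") == "Approval" for a in actions):
            approval_seen = True
        # Recognising prod stages by name is an assumption of this sketch.
        if "prod" in stage.get("name", "").lower():
            return approval_seen
    return True  # no prod stage at all, nothing to enforce

def handler(event, context):
    invoking_event = json.loads(event["invokingEvent"])
    item = invoking_event["configurationItem"]

    # Only evaluate CodePipeline pipelines.
    if item["resourceType"] != "AWS::CodePipeline::Pipeline":
        return

    pipeline = item["configuration"].get("pipeline", item["configuration"])
    compliant = has_approval_before_prod(pipeline)

    config.put_evaluations(
        Evaluations=[{
            "ComplianceResourceType": item["resourceType"],
            "ComplianceResourceId": item["resourceId"],
            "ComplianceType": "COMPLIANT" if compliant else "NON_COMPLIANT",
            "OrderingTimestamp": item["configurationItemCaptureTime"],
        }],
        ResultToken=event["resultToken"],
    )
```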
I also learnt later that AWS describes the same solution to this problem in this talk - CI/CD Pipeline Security: Advanced Continuous Delivery Best Practices

Configure commit message filter in AWS CodePipeline

Well, I would like to prevent some types of commits from triggering an AWS CodePipeline, but I can't find any configuration for this in the Source phase.
But if AWS CodeBuild is not linked with AWS CodePipeline, I have access to more trigger options.
How can I configure trigger options using AWS CodePipeline?
You can do this by editing the CloudWatch Event for the pipeline. Using a Lambda function, you can look for a specific type of change in your commit. The example in the link below looks for changes to specific files - so if you change the readme.md file, for example, don't deploy.
https://aws.amazon.com/blogs/devops/adding-custom-logic-to-aws-codepipeline-with-aws-lambda-and-amazon-cloudwatch-events/
You could take this example further and look for specific flags in your commit message, for example.
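For example, a rough sketch of such a Lambda for a CodeCommit source, skipping commits whose message contains a marker like "[skip ci]", could look like this; the pipeline name and marker are placeholders, and the pipeline's default change-detection rule would need to be replaced by this custom one:

```
import boto3

codecommit = boto3.client("codecommit")
codepipeline = boto3.client("codepipeline")

# Placeholder names for this sketch.
PIPELINE_NAME = "my-pipeline"
SKIP_MARKER = "[skip ci]"

def handler(event, context):
    detail = event["detail"]

    # Look up the commit that triggered the repository state change event.
    commit = codecommit.get_commit(
        repositoryName=detail["repositoryName"],
        commitId=detail["commitId"],
    )["commit"]

    # Only start the pipeline when the commit message does not opt out.
    if SKIP_MARKER not in commit["message"]:
        codepipeline.start_pipeline_execution(name=PIPELINE_NAME)
```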

S3 Bucket Notifications on Multiple Criteria

I am running a continuous code deployment with Jenkins that will automatically compile and upload binaries to S3 in parallel for multiple targets.
The final step in my deployment mechanism is to detect that all the binaries for a particular build have been uploaded, and then deploy them together.
S3 has event notifications that can trigger when objects have been pushed, but do they have anything more sophisticated that can trigger when multiple objects have been pushed?
Example:
Build machine on Windows uploads binary to S3.
Build machine on OS X uploads binary to S3.
S3 detects that both binaries are now uploaded and triggers an event.
Build machine takes both binaries and releases them together.
Right now the only solution I can think of is to set up AWS Lambda and have the event handler manually check for the existence of the other binary, which may not even be feasible if S3 has special race conditions.
Any ideas?
The short answer is no. There is no mechanism that would let you trigger an action only when all of the objects have been uploaded. There is no conditional notification, just simple events.
But you can use something else. Create a DynamoDB table for the build records and create a row there when your build succeeds on any build machine, before you upload any files. Then, for each build, use a separate attribute on that row. Have S3 publish a notification to your Lambda, have the Lambda look up and update this row, and when all the attributes are in the desired state, the Lambda can do the release.
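A minimal sketch of that Lambda, with placeholder table, key layout, and attribute names, might look like this:

```
import boto3

dynamodb = boto3.client("dynamodb")

# Placeholder table name, key layout and attribute names for this sketch.
TABLE = "build-records"
EXPECTED = {"windows_done", "osx_done"}

def handler(event, context):
    for record in event["Records"]:
        key = record["s3"]["object"]["key"]
        # A key layout like "<build_id>/<platform>/binary" is assumed here.
        build_id, platform, _ = key.split("/", 2)

        # Mark this platform's binary as uploaded and read back the whole row.
        row = dynamodb.update_item(
            TableName=TABLE,
            Key={"build_id": {"S": build_id}},
            UpdateExpression="SET #p = :done",
            ExpressionAttributeNames={"#p": f"{platform}_done"},
            ExpressionAttributeValues={":done": {"BOOL": True}},
            ReturnValues="ALL_NEW",
        )["Attributes"]

        # If every expected attribute is now present, kick off the release.
        if EXPECTED.issubset(row.keys()):
            release(build_id)

def release(build_id):
    # Placeholder for whatever "deploy them together" means in your setup.
    print(f"All binaries uploaded for build {build_id}, releasing...")
```

Note that two nearly simultaneous uploads could each observe the completed set; a conditional update on a "released" flag would guard against a double release.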
Amazon S3 is a "base" system upon which many things can be built (e.g. Dropbox!). As such, the functionality of Amazon S3 is limited (but very scalable and reliable).
Thus, you'll have to build your own logic on top of Amazon S3 to implement your desired solution.
One option would be to trigger an AWS Lambda function when an object is created. This Lambda function could then implement any logic you desire, such as your step #3.