Implement a Lambda function to run commands via SSM - amazon-web-services

I have a run document in SSM to install some agents on the server.
Now I want to automate this task by running these documents whenever a new instance is launched.
I want to achieve this through AWS Lambda (a script to implement Run Commands upon launch of a new instance).
Any help would be appreciated!

Unfortunately this is a very broad question, one that could not possibly be answered simply.
I would first suggest you decide which language you wish to write your Lambda function in; currently there are .NET, Python, Node.js, Java and Go.
Node.js is a fairly easy language to start with, as it's well supported and you can write it within the inline AWS code editor.
I would suggest looking at the template Node.js Lambda functions that AWS provides when creating a new Lambda function within the console. This will help you see how that could be put together and the various ways they may be used. If you get the hang of these and find them easy enough to understand, then you can look at the Node.js SSM API, which should be available by default in the Lambda runtime, and try out running a few commands.
Of course, if you're not competent in Node.js and primarily use another language, then that's an entirely different question.
There are many resources and examples online for writing Lambdas that can be found very easily.

Use a CloudWatch rule for this.
Create a CloudWatch rule for EC2 Instance State-change Notification and the running state. Use Lambda as the target and invoke the SSM command from Lambda (via the API) on the instance. The event will have the details you need, like the instance ID. I hope you are familiar with the AWS APIs and how to use them. You will need a proper IAM role for your Lambda for this to work. Also, remember CloudWatch Events are region specific and can only invoke a Lambda in the same region.
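As an added example (not code from the original thread), a Python Lambda handler implementing the answer above: it takes the EC2 state-change event, acts only on the running state, and calls SSM SendCommand for a pre-configured document. The document name, the `SSM_DOCUMENT` environment variable and the `FakeSsmClient` used for offline testing are assumptions.

```python
import os
import re

INSTANCE_ID_RE = re.compile(r"^i-[0-9a-f]{8}(?:[0-9a-f]{9})?$")  # e.g. i-0123456789abcdef0

def build_send_command(event, document_name):
    """Return SendCommand parameters for an EC2 'running' notification, else None,
    so a misrouted or malformed event never targets anything else."""
    if event.get("source") != "aws.ec2":
        return None
    if event.get("detail-type") != "EC2 Instance State-change Notification":
        return None
    detail = event.get("detail") or {}
    instance_id = str(detail.get("instance-id", ""))
    if detail.get("state") != "running" or not INSTANCE_ID_RE.match(instance_id):
        return None
    return {
        "DocumentName": document_name,  # fixed by configuration, never by the event
        "InstanceIds": [instance_id],
        "TimeoutSeconds": 600,
    }

def handler(event, context, ssm_client=None):
    """Lambda entry point; ssm_client is injectable so the logic runs offline.
    In Lambda it defaults to a boto3 client in the function's own region (the
    rule, the function and the instance must all be in that region)."""
    if ssm_client is None:
        import boto3  # bundled with the Lambda Python runtime
        ssm_client = boto3.client("ssm")
    # SSM_DOCUMENT and its default value are assumed names for the run document.
    params = build_send_command(event, os.environ.get("SSM_DOCUMENT", "InstallAgents"))
    if params is None:
        return {"status": "ignored"}
    # Caveat: a just-launched instance may not have registered with SSM yet, so
    # SendCommand can raise InvalidInstanceId. Scope the function's IAM role to
    # ssm:SendCommand on this document and these instances only.
    resp = ssm_client.send_command(**params)
    return {"status": "sent", "command_id": resp["Command"]["CommandId"]}

class FakeSsmClient:
    """Constructed stand-in for boto3's SSM client, for running the handler offline."""
    def __init__(self):
        self.calls = []
    def send_command(self, **params):
        self.calls.append(params)
        return {"Command": {"CommandId": "cmd-0001"}}  # constructed id

# Constructed event shaped like a real EC2 state-change notification.
SAMPLE_EVENT = {"source": "aws.ec2", "detail-type": "EC2 Instance State-change Notification",
                "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"}}
```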

Related

Is there a way to containerize a normal AWS Lambda function?

My AWS Lambda functions take their input from AWS SNS (topic subscription) and their output goes to CRUD in a NoSQL database (e.g. MongoDB).
So currently I have the SNS & Lambda function set up in the AWS Cloud and they are working fine. However, I would like to containerize the Lambda function as well as the MongoDB database and host them on AWS EKS using Docker + the Kubernetes service. (So the functions will be a Docker image.)
I am totally new to this container thing, and I searched online but could not find anything that mentions how to containerize AWS Lambda functions.
Is this possible? If it is, what are the ways to do it?
Thank you.
The Docker environment for AWS Lambda functions already exists, and it is lambci/lambda. So if you want to run/test your functions locally, this is the tool normally used for that:
A sandboxed local environment that replicates the live AWS Lambda environment almost identically – including installed software and libraries, file structure and permissions, environment variables, context objects and behaviors – even the user and running process are the same.
Since it's open-sourced, you can also modify it if it does not suit your needs.
Lambda already uses Firecracker, a microVM technology. So I'm not really sure why it's required to create a container out of Lambda.
The beauty of Lambda/serverless is to simply write the function code and forget about the rest. If it's all about more control, then look at Knative, which runs on top of K8s.

AWS CDK VS SDK for IaC

I recently started working with AWS and IaC. I'm using CloudFormation to provision my AWS resources, but I discovered that AWS provides both an SDK and a CDK to enable you to provision resources programmatically instead of with plain JSON/YAML.
But based on the documentation I did not really understand how they differ. Can someone explain to me how they differ and for which use case you should use which?
CDK: A framework to model and provision your infrastructure or stack. A stack can consist of a database (for example DynamoDB), an S3 bucket, Lambda, API Gateway etc. It provides a facility to write code to create infrastructure in AWS. Also called infrastructure as code.
Check here
SDK: These are the code libraries provided by Amazon in various languages, like Java, Python, PHP, JavaScript, TypeScript etc. These libraries help you interact with AWS services (like creating data in DynamoDB) which you either create through the CDK or the console. SDKs simplify using AWS services in your application with an API.
Check here
The AWS SDK is a library primarily intended to ease access to AWS services by handling the data (de)serialization, credentials management, failure handling, etc. for you. Perhaps, for specific scenarios, you could use the AWS SDK as the infrastructure-as-code tool, however it could be cumbersome as that is not the intended usage of the library.
Based on https://docs.aws.amazon.com/whitepapers/latest/develop-deploy-dotnet-apps-on-aws/infrastructure-as-code.html, the dedicated tools for IaC are AWS CloudFormation and AWS CDK.
AWS CDK is an abstraction on top of CloudFormation. CDK scripts are in fact transformed into CloudFormation definitions when the scripts are synthesized.
The difference can best be described with an example: imagine that for each Lambda function in your stack you want to create an error CloudWatch alarm and connect it to an SNS topic.
With CloudFormation you will either a) need to write a pretty similar bunch of YAML/JSON definitions for each Lambda function to ensure the monitoring, b) use nested stack templates, or c) use CloudFormation modules.
With CDK you can write a generic code construct, a class or method, which creates the alarm for the given Lambda function and creates the SNS alarm action for the given topic.
In other words, CDK helps you generalize and re-use your IaC in a way very familiar to how you develop your business code. The code is shorter and more readable than the CF definitions.
The difference is even more remarkable when you need to set up similar resources in different AWS regions and when you have a different AWS account per environment. You can manage all AWS accounts and regions with a single CDK codebase.
Some background first: CloudFormation is Amazon's solution for an “Infrastructure as Code” approach to managing the definition, provisioning and deployment of a bunch of resources across accounts/regions. This is done by using their declarative yaml/json-based template language to define it all, and then executing the templates through various means (console, cli, APIs...). More info:
white paper: https://docs.aws.amazon.com/whitepapers/latest/develop-deploy-dotnet-apps-on-aws/infrastructure-as-code.html
faq: https://aws.amazon.com/cloudformation/faqs/
There are other popular IaC solutions or tools to help achieve it more easily out there, such as Terraform and Kubernetes (container orchestration that also uses declarative templates to define desired states).
Potential benefits of IaC: At a high level, you can better track & audit your infra, reuse definitions/processes, make all your changes in a more consistent manner, faster thanks to all the automation and assurances you can get with an infra-as-code approach. You may be familiar with these as mentioned in previous answers and more, such as:
version controlling your infrastructure definitions,
more efficient and logically complex ways of constructing templates,
ability to write tests against them,
do diffs (see "change sets") before making real infra changes with the templates,
detect when live infra differs from your definitions,
automate rollbacks,
and lots of other state management assistance through a framework like CF that might be needed when performing regular ops duties.
CDK:
This is for helping to automate CloudFormation as part of an IaC approach to provisioning and deploying resources. It lets you use various popular programming languages to help with the creation, testing, and management of your CF setup. Some of AWS's motivations: "YAML is an excellent format for describing the desired state of your cluster, but it does not have primitives for expressing logic and reusable abstractions." "AWS CDK uses the familiarity and expressive power of programming languages for modeling your applications."
More info: https://docs.aws.amazon.com/cdk/v2/guide/home.html

However, Amazon knows about other solutions, and happily points them out on the main CDK page now, downplaying its original connection to CF. You don't need to use CloudFormation if you don't want to; specifically, they mention you can use the same CDK constructs with the help of:
cdktf for Terraform maintained by its creators, Hashicorp
cdk8s for Kubernetes by AWS. re: “We realized this was exactly the same problem our customers had faced when defining their applications through CloudFormation templates, a problem solved by the AWS Cloud Development Kit (AWS CDK), and that we could apply the same design concepts from the AWS CDK to help all Kubernetes users.”

SDK:

AWS has an API for all of their services, and the various SDKs give you access to them. For example, I can use AWS’s Java SDK to manage an API Gateway. If I wanted to script some custom deployment process, I could do so with the SDK, managing all the state, etc. myself. You could probably even re-implement the CloudFormation service with the various underlying APIs... The APIs have varying levels of documentation though. E.g. CloudFormation Java APIs are only mentioned in the raw API reference, not the friendlier Developer Guide.
I find that the difference for me is that the CDK codifies the CloudFormation JSON/YAML. The first response is great (ya, okay, in code), but the benefit on the code side of things is that you can write unit tests against the code. Therefore you get to build that sense of security or an insurance policy against the provisioned services in the CDK.
There are other ways to test CF; however, with a dev background, this feels more comfortable.

How to programmatically recreate resources done via AWS consoles?

I am trying to programmatically recreate a bunch of AWS resources that were created/configured manually via AWS consoles.
The AWS consoles do a lot for you.
For example, you can create a Lambda function with an Api-Gateway trigger in about 10 seconds using the AWS console.
The console is doing a lot of magic under the covers, defining and configuring resources such as policies, stages, permissions, models, etc.
In theory, CloudTrail is supposed to allow me to see what exactly is happening under the covers, but it seems to be silent in this case (i.e. Lambda function with Api-Gateway trigger).
I can play hide and seek and do extensive dumps using the CLI to list stages, policies, export API definitions, etc. etc. and look for the differences, but is there an easier way? Like some way to trace the REST calls that the console is making when it does all its magic?
Note: CloudFormer could have helped, but it is only half-written software (hey Amazon!) and only covers about a third of the resources I have defined. Does embracing CloudFormation imply not using these great time-saving consoles?
CloudFormation and other infrastructure-as-code services are there to lessen the clicks you make while using the AWS console or any other cloud console to manage your resources.
But these come in handy when you have to spin up resources which will have almost the same configuration and software stack.
If you use CloudFormation you will be able to define the policies according to your needs, which OS image to use, which stack to install, etc. etc. It provides you minute control over your resources.
I suggest that if you have to deploy these resources multiple times, then create a CloudFormation template and use it.
So, I would suggest that rather than finding a way to recreate the code from your current infrastructure, you create a CloudFormation template and use it for future needs.
If you need something easier than your current flow, this is it, as you just have to write your required configuration once.
Hashicorp Terraform is also a good alternative to AWS CloudFormation. You can use Terraforming to export the current infrastructure into Terraform readable files.

On AWS, run an AWS CLI command daily

I have an AWS CLI invocation (in this case, to launch a configured EMR cluster to do some steps and then shut down), but I'm not sure how to go about running it daily.
I guess one way to do it is an EC2 micro instance running a cron job, or an ECS task in a micro that launches the command, but that all seems like it might be overkill. It looks like there's also a way to do it in Lambda, but from what I can tell it'd be kludgy.
This doesn't have to be a good long-term solution, something that's suitable until I can do it right (Data Pipelines) would work just fine.
Suggestions?
If it is not a strict requirement to use the AWS CLI, you can use one of the AWS SDKs instead to programmatically invoke Lambda.
Schedule a CloudWatch Rules using cron
When configured, the CloudWatch Rules will trigger a Lambda function
Implement a Lambda function that calls EMR using one of the supported SDKs (e.g. the EMR class in the AWS JavaScript SDK)
Make sure that you have the IAM configuration in place
A full example is available in "Schedule AWS Lambda Functions Using CloudWatch Events".
Kludgy? Yes, configuration is needed, however if you take into account the amount of work required to launch EC2 / ECS (and make sure that it re-launches in the event of failure), I'd say it evens out.
Not sure about the whole task that you are doing, but to avoid doing it:
manually,
or with another set of resources in AWS (as you mentioned),
I would create a simple job in a Continuous Integration (CI) server like Jenkins, Bamboo, CircleCI ... (the list can go on). I would assume that you might already have a CI server running, so why not use it?

Accessing files in EC2 from Lambda

I have a few EC2 servers in AWS. Whenever the disk space exceeds a limit, I want to delete some files (maybe the logs folder) in the EC2 instance automatically. I am planning to use Lambda and CloudWatch for this. Can I use Lambda to interact with EC2? If not possible, what is the alternate approach to achieve this functionality?
This is not an appropriate use-case for an AWS Lambda function.
AWS Lambda is suitable for tasks where compute is required in response to an event. Your use-case, however, is to manipulate information on an EC2 instance, which does not need cloud compute.
You could run a script on each computer, triggered by a Scheduled Task.
Alternatively, you could use the Systems Manager Run Command (also known as the EC2 Run Command), which allows you to run commands on multiple Amazon EC2 instances and view the results. This could be used to trigger a local script, or it could pass the whole command to run (including the script). It is purpose-built for the type of task you describe.
AWS Lambda has access to your instances if they are reachable from the internet. If they are not reachable from the internet, it is possible to give AWS Lambda access using a NAT or Internet Gateway in your VPC.
The problem is: access to your instance does not mean access to the instance's filesystem. To delete the files from Lambda you can use two alternatives:
Configure a network filesystem service in your instances and connect to this service in your Lambda function. Using Windows you would just "share" your disks, but in that case you would use some SMB library in your Lambda code, which "I think" does not have native SMB support. Just keep in mind that your security guy will scream out loud when you propose this alternative.
Create an "agent" in your EC2 instances, keep it running as a Windows service, and call this agent from your Lambda function. In that case, the Lambda will start the execution of the agent, which will be responsible for the file deletion.
Another option is to follow Ramesh's suggestion and create a PowerShell script and configure a cron job. To make it easy, you can create an image with this PowerShell script and use the image to initialize each instance. The same solution would be applicable to "the agent" solution in the Lambda alternatives.
I think that, in any case, you will need to change something in your 150 servers. Using a customized image can help you simplify this a little bit, but you will not get a solution without some changes.
According to the following thread, you cannot access files inside an EC2 VM unless you are exposing the files to the public using a different methodology.
AWS Forum
Quoting from the forum:
If you are talking about the underlying EC2 instance, the answer is no, you cannot access those files.
However, as a solution to your problem, you can use a scheduled job to clean up your files depending on your usage. You can use a service or a cron job.