While cloud-init is the usual way to create users on an EC2 host at the time it is instantiated, I would like to hear how the community manages a dynamic group of users. I am planning to use Ansible for the user configuration management, but I am unsure how to call an Ansible playbook from a cloud-init script. Any pointers would be helpful.
As you said, cloud-init by nature runs once and only once per instance, at creation time. If you expect your list of users to change over time and you want to keep an instance up to date, cloud-init isn't going to cover that use case.
Ansible's user module can be a good way to manage system users throughout the lifecycle of an instance.
I can't speak for the entire community, but one possibility would be to maintain a list of users in Ansible and then run your playbook against the instance or instances whenever the user list changes.
Rather than relying on cloud-init, these ansible-playbook runs could be performed manually by an operator, run on a regular schedule, or triggered automatically with a combination of version control and continuous delivery.
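As a rough sketch, assuming an inventory group webservers and a playbook users.yml that drives Ansible's user module (all names here are placeholders), each run might look like:

```bash
# Ad-hoc use of the user module against a group of instances:
ansible webservers -i inventory.ini -b -m user -a "name=alice state=present groups=sudo"

# Or, more typically, re-run a playbook that loops over the maintained user list:
ansible-playbook -i inventory.ini users.yml --limit webservers
```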
I am able to successfully provision linux ssh users via cloud-init upon initial startup using this guide: https://aws.amazon.com/premiumsupport/knowledge-center/ec2-user-account-cloud-init-user-data/
The issue I am facing is that every time I need to add or remove users, I need to shut down the machine to modify the user-data script. How would I go about creating and removing users while reducing downtime?
If I manually add users (https://aws.amazon.com/premiumsupport/knowledge-center/new-user-accounts-linux-instance/), will those users be removed when the instance restarts and runs the startup script?
You could potentially use EC2 launch templates. They are very easy to set up and easy to clone for different instance sizes and classes. It's pretty slick how I use them with user data to control environment variables for dev, stage, and prod use too!
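As a hedged sketch (the template name, version, and file are placeholders): rather than stopping the instance to edit its user data, you publish a new template version, and newly launched instances pick up the updated user list:

```bash
# Create a new version of an existing launch template with updated user data;
# running instances are untouched, and new launches use the new version.
# UserData in launch template data must be base64-encoded (GNU coreutils base64).
aws ec2 create-launch-template-version \
    --launch-template-name app-template \
    --source-version 1 \
    --launch-template-data "{\"UserData\":\"$(base64 -w0 user-data.sh)\"}"
```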
Let's say I have a webapp where users can click a button that starts a long-running task (e.g. 3 days long). The user can also select options, for example what they want that task to do.
No matter what, the task will be the same script that runs when the instance starts. However, I would like it to somehow take arguments from the button click to change the function of the startup script.
Is there a way to do this with AWS EC2 instances?
So, you're saying that you want to pass certain parameters to some software that will be launched on an EC2 instance.
There are many ways to do this:
When the instance is launched, you can pass User Data. This is commonly used to run a startup script, but it can also be used just to pass information to the instance, which can then be accessed via http://169.254.169.254/latest/user-data/. So, either pass the configuration directly or pass it as part of the startup script (see the sketch after these options).
Store it in tags on the instance when the instance is launched. Once the software starts up, it can retrieve the tags associated with the instance (itself) and act appropriately.
Store the configuration in a database and have the software access the database to determine what it should do.
Store the configuration in Amazon S3 and have the software retrieve the configuration file.
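A minimal sketch of the first option (the AMI ID and option string are placeholders; the AWS CLI base64-encodes user data for you):

```bash
# At launch, pass the task options as User Data:
aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type t3.micro \
    --user-data 'TASK_MODE=full RETENTION_DAYS=3'

# Inside the instance, the startup script can read the options back:
curl -s http://169.254.169.254/latest/user-data/
```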
Personally, I like the idea of Tags. It's very Cloud.
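A rough sketch of the Tags approach (the tag key "Mode" is made up, and the instance needs a role allowing ec2:DescribeTags):

```bash
#!/bin/bash
# Discover this instance's ID from the metadata service, then read its tags.
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

MODE=$(aws ec2 describe-tags \
    --filters "Name=resource-id,Values=$INSTANCE_ID" "Name=key,Values=Mode" \
    --query 'Tags[0].Value' --output text)

echo "Starting long-running task in mode: $MODE"
```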
This behaviour isn't directly related to EC2, although EC2 could host an application that does these long-running parameterised tasks. Whether EC2 is a good choice also depends on how your tasks react to underlying failures: if the EC2 instance fails or restarts, what happens to your task?
Depending on your use case, some managed options might be:
AWS Step Functions
Amazon Simple Workflow Service (SWF)
AWS Batch
I have an EC2 instance that is running a few processes. I also have a Lambda script that is triggered through various means. I would like this Lambda script to talk to my EC2 instance and get a list of running processes from it (essentially, run ps aux on the EC2 box and read the output).
Now this is easy enough with just one instance and its instance-id: just SSH in, run the command, get the output, and be on my way. However, I would like to scale this to multiple EC2 instances, for which only the instance-id is known and SSH keys may not be available.
Is such a configuration possible with Lambda and Boto (or other libraries)? Or do I just have to run a microserver on each of my instances that will reply with the given information (something I'm really trying to avoid)?
You can do this easily with AWS Systems Manager - Run Command
AWS Systems Manager provides you safe, secure remote management of your instances at scale without logging into your servers, replacing the need for bastion hosts, SSH, or remote PowerShell.
Specifically:
Use the send-command API from a Lambda function to get the list of processes on a group of instances. You can do this by providing a list of instance IDs or even a tag query (see the sketch after this list).
You can also use CloudWatch Events to trigger a Run Command directly
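A minimal CLI sketch of that flow (a Lambda function would make the equivalent send_command call through its SDK; the tag key/value and instance ID are placeholders, and the instances must be running the SSM agent with an appropriate instance profile):

```bash
# Run "ps aux" on every instance tagged Role=worker via SSM Run Command.
COMMAND_ID=$(aws ssm send-command \
    --document-name "AWS-RunShellScript" \
    --targets "Key=tag:Role,Values=worker" \
    --parameters 'commands=["ps aux"]' \
    --query "Command.CommandId" --output text)

# Later, fetch the output for one of the targeted instances.
aws ssm get-command-invocation \
    --command-id "$COMMAND_ID" \
    --instance-id "i-0123456789abcdef0" \
    --query "StandardOutputContent" --output text
```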
I don't think there is anything available out of the box for this scenario.
Instead of querying, try an alternate approach: install an agent on all EC2 instances that reports the required information to a central service, or perhaps a DynamoDB table with InstanceId as the hash key.
You may want to bake this script into the AMI itself as a cron job (executed hourly, perhaps?).
With this implementation, you reduce the complexity of managing and running a separate web service on each EC2 instance.
Query the DynamoDB table on demand. There will be a lag, as the data may not be real time, but you can always reduce the cron interval to suit your needs.
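A hedged sketch of such an agent script (the table name "instance-processes" and its attribute names are assumptions, and jq must be installed):

```bash
#!/bin/bash
# Cron job on each instance: report the current process list to DynamoDB,
# keyed by this instance's ID.
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)

aws dynamodb put-item \
    --table-name instance-processes \
    --item "$(jq -n \
        --arg id "$INSTANCE_ID" \
        --arg ts "$(date -u +%FT%TZ)" \
        --arg ps "$(ps aux)" \
        '{InstanceId: {S: $id}, ReportedAt: {S: $ts}, Processes: {S: $ps}}')"
```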
As Yeshodhan mentioned, there is no direct approach for this.
However, there is one more approach.
1) Save your private key file to an S3 bucket, create a Lambda function, and use the Python fabric module to log in to the remote machines from the Lambda function and execute commands.
The above-mentioned approach is possible, but I would highly recommend launching a separate machine, using a configuration management system (preferably Ansible), and getting the results from the remote machines.
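For example, from such a management machine (the inventory path is a placeholder):

```bash
# Gather process lists from all inventoried instances ad hoc with Ansible:
ansible all -i inventory.ini -m shell -a "ps aux"
```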
I seem to not understand one central point of AWS Autoscaling:
I created an AMI of an (Ubuntu) EC2 instance having my web-server installed, and I use this AMI as the launch-configuration for my Autoscaling group.
But when Autoscaling decides to spin up a new instance, how am I supposed to launch my webserver on that instance? Am I supposed to write some start-up scripts, or what is the best practice for starting a process on a newly spun-up instance from Autoscaling?
When I deploy an application (PostgreSQL, Elasticsearch, whatever) to an EC2 instance, I usually do it intending on being able to repeat the process. So my first step is to create an initial deployment script which will do as much of the install and setup process as possible, without needing to know the IP address, hostname, amount of memory, number of processors, etc. Basically, as much as I can without needing to know anything that can change from one instance to the next or upon shutdown / restart.
Once that is stable, I create an AMI of it.
I then create an initialization script, which I use in the launch-configuration so that it executes on instances launched from the previously created AMI.
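A hedged sketch of that wiring (the names, AMI ID, and script file are placeholders; the CLI base64-encodes user data for you):

```bash
# Each instance the Auto Scaling group launches from the AMI runs
# initialize.sh at first boot via User Data.
aws autoscaling create-launch-configuration \
    --launch-configuration-name web-lc \
    --image-id ami-0123456789abcdef0 \
    --instance-type t3.micro \
    --user-data file://initialize.sh
```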
That's for highly configured applications. If you're just going with default settings (e.g. IP address = 0.0.0.0), then yes, I would simply run 'sudo update-rc.d <> defaults 95 10' so that it runs on startup.
Then create the AMI. When you create a new instance from that AMI, the webserver should start up by default. If it doesn't, I would check whether you really set the init.d script to do so.
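As a quick sketch (the service name "mywebserver" is a placeholder, and this assumes a SysV-style /etc/init.d script on the AMI):

```bash
# Register the init script to start at boot, verify it runs, then bake the AMI.
sudo update-rc.d mywebserver defaults 95 10
sudo service mywebserver start
```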
Launching a new instance from an AMI should be no different than booting up a previously shutdown instance.
By the way, as a matter of practice when creating these scripts, I also do a few things to make things much cleaner for me:
1) Create modules in separate bash scripts (e.g. creating user accounts, setting environment variables, etc.) for repeatability.
2) Each deployment script starts by downloading and installing the AWS CLI
3) Every EC2 instance is launched with an IAM role that has S3 read access, IAM SSH describe rights, EC2 Address allocation/association, etc.
4) Load all of the scripts onto S3 and then have the deployment / initialization scripts download the necessary bash module scripts, chmod +x, and execute them. It's as close to OOP as I can get without overdoing it, but it creates really clean bash scripts. The top-level launch / initialization scripts for the most part just download individual scripts from S3 and execute them.
5) I source all of the modules instead of simply executing them. That way bash shares variables.
6) Make the Linux account creation part of the initialization script (not the AMI). Using the CLI, you can query for users, grab their public SSH keys from AWS, create their accounts, and have everything ready for them to log in automagically (sketched below).
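A rough sketch of item 6 (this assumes users have uploaded SSH public keys to IAM, and that the instance role allows iam:ListUsers, iam:ListSSHPublicKeys, and iam:GetSSHPublicKey):

```bash
#!/bin/bash
# Create a Linux account for each IAM user and install their active IAM SSH keys.
for user in $(aws iam list-users --query 'Users[].UserName' --output text); do
    id "$user" &>/dev/null || useradd -m -s /bin/bash "$user"
    mkdir -p "/home/$user/.ssh"
    for key_id in $(aws iam list-ssh-public-keys --user-name "$user" \
            --query 'SSHPublicKeys[?Status==`Active`].SSHPublicKeyId' --output text); do
        aws iam get-ssh-public-key --user-name "$user" \
            --ssh-public-key-id "$key_id" --encoding SSH \
            --query 'SSHPublicKey.SSHPublicKeyBody' --output text \
            >> "/home/$user/.ssh/authorized_keys"
    done
    chown -R "$user:$user" "/home/$user/.ssh"
    chmod 700 "/home/$user/.ssh"
done
```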
This way, when you need to change something (i.e. the version of an application, a configuration, etc.), you simply modify the module script; if the change affects the AMI, re-launch and re-AMI. Otherwise, if it only changes instance-specific settings, then just launch the AMI with the new initialization script.
Hope that helps...
We are discussing with a client how to bootstrap auto-scaled AWS instances. Essentially, an instance comes up with hardly anything on it. It has a generic startup script that asks somewhere, "What am I supposed to do next?"
I'm thinking we can use Amazon tags and have the instance itself ask AWS, using the awscli tool set, to find out its role. This could give Puppet info, environment info (dev/stage/prod, for example), and so on. This should be doable with just the DescribeTags privilege. I'm facing resistance, however.
I am looking for suggestions on how a fresh AWS instance can find out its own purpose, whether from AWS or perhaps from a service broker of some sort.
EC2 instances offer a feature called User Data meant to solve this problem. User Data executes a shell script to perform provisioning functions on new instances. A typical pattern is to use the User Data to download or clone a configuration management source repository, such as Chef, Puppet, or Ansible, and run it locally on the box to perform more complete provisioning.
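A minimal sketch of that pattern using Ansible applied locally (the repo URL and playbook name are placeholders, and package names vary by distro):

```bash
#!/bin/bash
# User Data: install tooling, pull the config repo, and provision this box.
yum install -y git ansible            # or apt-get install on Debian/Ubuntu
git clone https://github.com/example/infra-config.git /opt/infra-config
ansible-playbook -i localhost, -c local /opt/infra-config/site.yml
```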
As @e-j-brennan states, it's also common to prebundle an AMI that has already been provisioned. This approach is faster, since no provisioning needs to happen at boot time, but it is perhaps less flexible since the instance isn't customized at launch.
You may also be interested in instance metadata, which exposes some data such as network details and tags via a URL path accessible only to the instance itself.
An instance doesn't have to come up with 'hardly anything on it' though. You can/should build your own custom AMI (Amazon machine image), with any and all software you need to have running on it, and when you need to auto-scale an instance, you boot it from the AMI you previously created and saved.
http://docs.aws.amazon.com/gettingstarted/latest/wah-linux/getting-started-create-custom-ami.html
I would recommend using AWS Elastic Beanstalk for creating specific instances; it makes things easier since it will create the Auto Scaling groups and Launch Configurations (boot-up code), which you can edit later. Also, you only pay for the EC2 instances, and you can manage most things from the Beanstalk console.