Options for putting large UserData contents in a CF template - amazon-web-services

I have a CloudFormation template for deploying Cisco 8000v instances. In order to bootstrap these I have a very long device-specific user-data file. I can put the whole contents in the UserData block, but then my CF template is not very reusable. Can I refer to the contents via a filename and import them somehow? I can't find any examples of this. What is a more common way to approach this? The UserData string has several instance-specific configurations. Should I base64 encode the string and refer to it as a parameter?

You would store your long script externally to the instance, e.g. in S3. Then your user_data would be very short, limited to downloading the script from S3 and executing it.
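For illustration, a minimal sketch of such a bootstrap, written here in Python (assuming a general-purpose Linux AMI where boto3 is available and the instance profile can read the bucket; the bucket and key names are placeholders):
#!/usr/bin/env python3
# Minimal user-data bootstrap: fetch the long device-specific script
# from S3 and run it. Bucket/key are placeholders, not from the question.
import subprocess

import boto3

BUCKET = "my-bootstrap-bucket"        # hypothetical
KEY = "bootstrap/device-config.sh"    # hypothetical

s3 = boto3.client("s3")
s3.download_file(BUCKET, KEY, "/tmp/bootstrap.sh")
subprocess.run(["bash", "/tmp/bootstrap.sh"], check=True)
The equivalent can of course be done in a couple of lines of shell; the point is just that the UserData block shrinks to a fetch-and-run stub while the long script lives in S3.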
Alternatively, you can create a custom AMI which is pre-configured for your use case. This way your user-data script can be reduced or even fully eliminated.

You cannot put too much in user_data; it is limited to 16 KB. You can check this in the EC2 user data documentation.
The best way to do it: store the script on S3, or another place your EC2 instance can reach, then download and execute it from user_data.

Related

Update AWS ECS Task Definition with PowerShell

Long story short, I need to update my ECS task definition via PowerShell in order to increase the "EphemeralStorage_SizeInGiB", which is only available via the AWS CLI.
I am able to successfully grab the task definition via the Get-ECSTaskDefinitionDetail cmdlet, but I'm stuck on what to do next.
I was able to convert that output to JSON and update the ephemeral storage field in the JSON file, but cannot figure out how to send that back to AWS. All my attempts with the Register-ECSTaskDefinition cmdlet seem to fail, as it wants individual arguments for each parameter instead of a JSON upload.
Any advice would be appreciated.
Thanks,
I don't have one to test with, but most AWS cmdlets return objects which can be piped to each other. Get-ECSTaskDefinitionDetail does too, returning a DescribeTaskDefinitionResponse object with what looks like all the right properties to auto-fill the registration. Try out:
Get-ECSTaskDefinitionDetail -TaskDefinition $ARN |
Register-ECSTaskDefinition -EphemeralStorage_SizeInGiB $newSize
Or it might require using this .TaskDefinition property:
$Response = Get-ECSTaskDefinitionDetail -TaskDefinition $ARN
$Response.TaskDefinition | Register-ECSTaskDefinition -EphemeralStorage_SizeInGiB $newSize
and maybe it's that easy?
Note that you must not use -Select in the Get command, or it will return a different object type.
That said, it's pretty awkward that it won't take JSON when two of its parameters do. Might be worth reopening this feature request:
https://github.com/aws/aws-tools-for-powershell/issues/184

In an AWS lambda, how do I access the image_id or tag of the launched container from within it?

I have an AWS lambda built using SAM. I want to propagate the id (or, if it's easier, the tag) of a lambda's supporting docker image through to the lambda runtime function.
How do I do this?
Note: I do mean image id and NOT container id - what you'd see if you called docker image ls locally. Getting the container id / hostname is the easy bit :D
I have tried to declare a parameter in the template.yaml and have it picked up as an environment variable that way. I would prefer to define the value at most once within the template.yaml, and preferably have it auto-populated, though I am not aware of best practice there. The aim is to avoid human error. I don't want to pass the value on the command line unless I have to.
If it's too hard to get the image id then as a fallback the DockerTag would be fine. Again, I don't want this in multiple places in the template.yaml. Thanks!
Unanswered similar question: Finding the image ID of a container from within the container
The launched image URI is available in the packaged template file after running sam package, so it's possible to extract the tag from there.
For example, if using YAML:
grep -w ImageUri packaged.yaml | cut -d: -f3
This will find the URI in the packaged template (which looks like ImageUri: 12345.dkr.ecr.us-east-1.amazonaws.com/myrepo:mylambda-123abc-latest) and grab the tag, which is after the 2nd : (i.e. the third :-separated field).
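If you'd rather not depend on field positions, a rough Python equivalent of the same extraction (the packaged.yaml filename is assumed, as above):
import re

with open("packaged.yaml") as f:
    text = f.read()

# Find every ImageUri in the packaged template and print its tag,
# i.e. whatever follows the last ':' in the URI.
for match in re.finditer(r"ImageUri:\s*(\S+)", text):
    uri = match.group(1)
    print(uri.rpartition(":")[2])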
That said, I don't think either of these is a great solution. I wish there were a way to do it with the SAM CLI.

AWS Lambda generates large size files to S3

Currently we have an AWS Lambda (Java-based runtime) which takes an SNS event as input, performs business logic, generates one XML file, and stores it to S3.
The current implementation creates the XML in the /tmp location, which we know has an AWS Lambda space limitation (512 MB by default).
Is there any way to still use Lambda but stream the XML file to S3 without using the /tmp folder?
I did some research but still have not found a solution.
Thank you.
You can directly upload an object to S3 from memory without having to store it locally, using the PutObject API. However, keep in mind that you still have time and total memory limits with Lambda as well; you may run out of those too if your object is too big.
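A minimal sketch of that call (shown in Python/boto3 for brevity; the AWS SDK for Java exposes the same PutObject operation, and the bucket/key plus the build_xml helper are hypothetical):
import boto3

s3 = boto3.client("s3")
xml_body = build_xml()  # hypothetical: returns the document as bytes or str

# The whole object is sent from memory; nothing is written to /tmp.
s3.put_object(
    Bucket="my-output-bucket",   # placeholder
    Key="reports/output.xml",
    Body=xml_body,
    ContentType="application/xml",
)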
If you can split the file into chunks, and don't need to update the beginning of the file while working on its end, you can use a multipart upload: provide a ready-to-go chunk, then free that memory before building the next chunk.
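A hedged sketch of that pattern (Python/boto3 again; generate_xml_chunks is a hypothetical generator, and note that every part except the last must be at least 5 MB):
import boto3

BUCKET, KEY = "my-output-bucket", "reports/output.xml"  # placeholders

s3 = boto3.client("s3")
mpu = s3.create_multipart_upload(Bucket=BUCKET, Key=KEY)

# Upload each chunk as soon as it is built, so only one chunk
# needs to be held in memory at a time.
parts = []
for number, chunk in enumerate(generate_xml_chunks(), start=1):
    resp = s3.upload_part(
        Bucket=BUCKET, Key=KEY,
        UploadId=mpu["UploadId"],
        PartNumber=number,
        Body=chunk,
    )
    parts.append({"PartNumber": number, "ETag": resp["ETag"]})

s3.complete_multipart_upload(
    Bucket=BUCKET, Key=KEY,
    UploadId=mpu["UploadId"],
    MultipartUpload={"Parts": parts},
)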
Otherwise you still need temporary storage to form all the parts of the XML. You can use DynamoDB or Redis, and once you have collected all the parts of the XML there, you can start uploading it part by part, then clean up the DB (or set a TTL to automate the cleanup).

When using cloud-init, is it possible to both specify a boot script and a data blob that can be queried from inside the instance?

I have some current instances that get some data by passing a json blob through the user data string. I would like to also pass a script to be run at boot time through the user data. Is there a way to do both of these things? I've looked at cloud-config, but setting an arbitrary value doesn't seem to be one of the options.
You're correct that on EC2, there is only one 'user-data' blob that can be specified. Cloud-init addresses this limitation by allowing the blob to be an "archive" format of sorts.
There are two such formats: MIME multipart, and cloud-config archive.
The cloud-config archive format is unfortunately not documented at the moment, but there is an example in doc/examples/cloud-config-archive.txt. It is expected to be YAML and to start with '#cloud-config-archive'. Note that YAML is a strict superset of JSON, so anything that can dump JSON can be used to produce this YAML.
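As a sketch of that point, a plain JSON dumper is enough to emit a valid archive (the part contents here are purely illustrative):
import json

# Each entry is one user-data part; 'type' tells cloud-init how to handle it.
parts = [
    {"type": "text/x-shellscript",
     "content": "#!/bin/sh\necho boot script ran > /tmp/boot.log\n"},
    {"type": "text/plain",
     "content": json.dumps({"app": "myapp", "env": "staging"})},  # the data blob
]

# A JSON list is valid YAML, so this is a well-formed cloud-config-archive.
user_data = "#cloud-config-archive\n" + json.dumps(parts)
print(user_data)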
Both of these formats require changes to all consumers to share user-data as a common resource. cloud-init will ignore MIME types that it does not understand and handle those that it does; you'd have to modify the other application producing and consuming user-data to do the same.
Well, cloud-init supports multi-part MIME. With that in mind, you could have your boot script as one part and a custom MIME part for your data (see the sketch below). Note that you would need to write a Python handler that tells cloud-init what to do with that part (most likely moving it to wherever your app expects it). This handler code ends up in the handlers directory, as described here.
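For illustration, here is one way to build such a payload with Python's standard email module (filenames and contents are placeholders):
import json
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart()

# Part 1: the boot script; cloud-init executes text/x-shellscript parts at boot.
script = MIMEText("#!/bin/sh\necho hello > /tmp/hello.txt\n", _subtype="x-shellscript")
script.add_header("Content-Disposition", "attachment", filename="boot.sh")
msg.attach(script)

# Part 2: the JSON blob; your custom handler decides where to put it.
blob = MIMEText(json.dumps({"env": "staging"}), _subtype="plain")
blob.add_header("Content-Disposition", "attachment", filename="app-data.json")
msg.attach(blob)

print(msg.as_string())  # this string becomes the instance user-data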

Can I parameterize AWS lambda functions differently for staging and release resources?

I have a Lambda function invoked by S3 put events, which in turn needs to process the objects and write to a database on RDS. I want to test things out in my staging stack, which means I have a separate bucket, different database endpoint on RDS, and separate IAM roles.
I know how to configure the lambda function's event source and IAM stuff manually (in the Console), and I've read about lambda aliases and versions, but I don't see any support for providing operational parameters (like the name of the destination database) on a per-alias basis. So when I make a change to the function, right now it looks like I need a separate copy of the function for staging and production, and I would have to keep them in sync manually. All of the logic in the code would be the same, and while I get the source bucket and key as a parameter to the function when it's invoked, I don't currently have a way to pass in the destination stuff.
For the destination DB information, I could have a switch statement in the function body that checks the originating S3 bucket and makes a decision, but I hate making every function have to keep that mapping internally. That wouldn't work for the DB credentials or IAM policies, though.
I suppose I could automate all or most of this with the SDK. Has anyone set something like this up for a continuous integration-style deployment with Lambda, or is there a simpler way to do it that I've missed?
I found a workaround using Lambda function aliases. Given the context object, I can get the invoked_function_arn property, which has the alias (if any) at the end.
# e.g. arn:aws:lambda:us-east-1:123456789012:function:myfunc:staging
arn_string = context.invoked_function_arn
alias = arn_string.split(':')[-1]  # 'staging' (the function name if no alias was used)
Then I just use the alias as an index into a dict in my config.py module, and I'm good to go.
config[alias].host
config[alias].database
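For illustration, a minimal config.py along those lines (hostnames and alias names hypothetical) could be:
from types import SimpleNamespace

# One entry per Lambda alias; attribute access matches config[alias].host above.
config = {
    "staging": SimpleNamespace(host="staging-db.example.com", database="app_staging"),
    "prod": SimpleNamespace(host="prod-db.example.com", database="app_prod"),
}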
One thing I'm not crazy about is that I have to invoke my function from an alias every time, and now I can't use aliases for any other purpose without affecting this scheme. It would be nice to have explicit support for user parameters in the context object.