How to achieve consistency of re-baking an AMI - amazon-web-services

I am wondering what the best approach for baking an AMI would be. Baking offers a lot of consistency, but that consistency is hard to keep when you need to re-bake the AMI for a small security update or a new package version. More than likely you will end up updating other packages that you didn't need to update, and that can cause something to break.
So far I am baking in all my package installs, including Docker, and pulling base images (Ubuntu, for example).
I know it is possible to specify exactly what package version you need when you do apt-get install or its cfn-init equivalent, but what if it is no longer supported? Should I put my packages in an S3 bucket? But then what about all the dependencies? Are there any simple ways of doing apt-get install from s3 instead of going out to the 3rd party repo?

I just answered a similar question about baking resources into an AMI vs. using a configuration management tool like Chef, Puppet, etc.
Short answer is to try and not bake software into the AMI but rather build on top of base images with repeatable "recipes" (Chef term).
As for the specific versions of packages to install, you certainly can pin software dependencies to specific versions. If you aren't doing anything special with them, I would strongly advise using the native package managers where you can. As for packages not being available anymore, with Ubuntu LTS that hopefully shouldn't be much of an issue.
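To make the pinning part concrete, here is a minimal sketch that turns a list of pins into one apt-get command. The helper name pinned_install_cmd and the package versions are placeholders I made up for illustration. It only prints the command, so you can review it before it goes into your bake script.

```shell
#!/bin/sh
# Hypothetical helper: print an apt-get command that installs exact versions.
# Every argument must be a "name=version" pin, which apt-get accepts as-is.
pinned_install_cmd() {
    # Refuse anything without an explicit version so nothing floats.
    for pin in "$@"; do
        case "$pin" in
            *=*) ;;
            *) echo "unpinned package: $pin" >&2; return 1 ;;
        esac
    done
    echo "apt-get install -y $*"
}

# Placeholder pins; find real version strings with: apt-cache policy <package>
pinned_install_cmd "curl=7.58.0-2ubuntu3" "nginx=1.14.0-0ubuntu1"
```

After the install, running apt-mark hold on the same packages should keep a later apt-get upgrade from moving them.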
See the full answer here.

Related

Is it possible to use lambda layers with zappa?

I want to deploy my Wagtail project (Wagtail is a CMS based on Django) onto an AWS Lambda function. The best option seems to be using Zappa.
Wagtail needs opencv installed to support all the features.
As you might know, just running pip install opencv-python is not enough because opencv needs some os level packages to be installed. So before running pip install opencv-python one has to install some packages on the Amazon Linux in which the lambda environment is running. (yum install ...)
The only solution that came to my mind is using lambda layers to properly install opencv.
But I'm not sure whether it's possible to use lambda layers with projects deployed by zappa.
Any kind of help and sharing experiences would be really appreciated!
There is an open pull request that is ready to merge, but needs additional user testing.
The older project has a pull request that claims layer support has been merged.
Feel free to try it out and let the maintainers know so documentation can be updated.

Node.JS native addons on LINUX [duplicate]

I'm using AWS Lambda, which involves creating an archive of my node.js script, including the node_modules folder and uploading that to their infrastructure to run.
This works fine, except when it comes to node modules with native bindings (using node-gyp). Because the binding was compiled and the project archived on my local computer (OS X), it is not compatible with AWS's (Amazon Linux) servers.
How can I cross-compile/install a node module (specifically, node-sqlite3) so when I upload it to another server arch it runs?
While not really a solution to your problem, a very easy workaround could be to simply compile the native addons on a Linux machine.
For your particular situation, I would use Vagrant. Vagrant can create virtual machines and configure them within seconds.
Find an OS image that resembles Amazon's Linux distro (Fedora, CentOS, others that use yum as package manager - see Wiki)
Use a simple configuration script that, when run by Vagrant on machine startup, will run npm install (optionally it might also remove the node_modules folder before to ensure a clean installation)
For extra comfort, the script can also create the zip file for deployment
Once the installation finishes, the script will shut down the VM to avoid unnecessary consumption of system resources
Deploy!
It might require some tuning if the linked libraries are not at the same place on the target machine but generally this seems to me like the best and quickest solution.
While installing the app using Vagrant might be sufficient in some cases, I have found it necessary to build the app on a Linux system that is as close to Lambda's Amazon Linux AMI as possible.
You can read the original answer here: https://stackoverflow.com/a/34019739/303184
Steps to make it work:
Spawn a new EC2 instance. Make sure it is based on exactly the same image as your AWS Lambda runtime. You can review Lambda env details here: http://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html. In our case, it was the Amazon Linux AMI called amzn-ami-hvm-2015.03.0.x86_64-gp2.
Install nvm and use it to install the same version of Node.js as on the AWS Lambda. At the time of writing this, it was v0.10.36. You can refer to http://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html again to find out.
You will probably need to install git and the g++ compiler on the EC2 instance. You can do this by running:
sudo yum install git gcc-c++
Finally, clone your app to your new EC2 instance and install your app's dependencies:
nvm use 0.10.36
npm install --production
You can then easily download the node_modules using scp or such.
Along the same lines as Robert's answer: when I have to work in a different OS on my Mac, I use virtualization software such as Oracle's free VirtualBox to run Linux on the Mac, at no cost to me. Or sign up for a new AWS account; you get a micro instance free for a year. Use that as your Linux box and do whatever you need there.
AWS has a page describing how to deal with native NPM modules: https://aws.amazon.com/blogs/compute/nodejs-packages-in-lambda/

Can I load additional libraries in Gitpod without creating my own Docker image?

I have recently tried out Gitpod, which seems to be a quite cool tool.
For testing purposes, I have opened some C++ GitHub repository of mine that uses Boost (among other libraries). Unfortunately, Boost does not seem to be installed in the Docker image, so my code does not compile.
I know about the possibility of creating my own Docker images, but I was wondering if there are alternative, easier options as well. Does Gitpod provide any Environment Modules-like option to dynamically load/unload certain "commonly used" libraries or do I always have to provide my own Docker image in this case?
I work on Gitpod. Thank you for trying it and the compliment :)
We didn't want to invent yet another module system for Gitpod.
Instead, we decided to support Dockerfiles and build them on-demand, because Dockerfiles allow using all those amazing module systems that are already out there: Debian's packages, Alpine's packages, Node Version Manager (NVM), Ruby Version Manager (RVM), SDKman, etc. Basically any Linux-compatible package manager down to simple wget.
You can also use your own Docker images, but I find Dockerfiles more convenient because you can check them into git and thereby version them together with your source code. It's dev-environment-as-code and should be shared across the team. Also, you don't need to bother with building and pushing them to a registry (e.g. hub.docker.com).
What Gitpod does offer, however, is a selection of Docker images (src). The most prominent one is gitpod/workspace-full, which is Gitpod's default image.
To get back to your question about the easiest way to get the right modules into your Gitpod development environment:
Inheriting from gitpod/workspace-full is very convenient.
If you don't want to inherit from it, copy'n'pasting sections from the gitpod/workspace-full Dockerfile is convenient.
Often, putting RUN apt-get update && apt-get install -y libboost-all-dev into your Dockerfile is enough. This uses APT to install the package libboost-all-dev.
Most software projects have documentation on how to build them under Linux. These instructions usually work in Dockerfiles, too.
Search on hub.docker.com for useful Docker images. You can inherit from those images or find their Dockerfiles and copy'n'paste sections from there.
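For the Boost case from the question, the first and third points can be combined into a short sketch like the one below. It writes a Dockerfile that inherits from gitpod/workspace-full plus a .gitpod.yml that points at it. The REPO_ROOT variable is my own placeholder, and the sudo calls assume the base image runs as a non-root user; drop them if yours runs as root.

```shell
#!/bin/sh
# Sketch: write the two files Gitpod reads to build a custom workspace image.
# REPO_ROOT is an assumed variable; it defaults to the current directory.
repo_root=${REPO_ROOT:-.}

# Dockerfile that inherits from the default image and adds Boost via APT.
# sudo is used on the assumption that the base image runs as a non-root user.
cat > "$repo_root/.gitpod.Dockerfile" <<'EOF'
FROM gitpod/workspace-full
RUN sudo apt-get update \
 && sudo apt-get install -y libboost-all-dev \
 && sudo rm -rf /var/lib/apt/lists/*
EOF

# Tell Gitpod to build the workspace from that Dockerfile.
cat > "$repo_root/.gitpod.yml" <<'EOF'
image:
  file: .gitpod.Dockerfile
EOF
```

Commit both files to the repository so the image definition is versioned together with the source code.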

Can you add CLI tools to a Cloud Foundry app?

Terribly sorry if this is an obvious question. But I have a webapp that relies on a CLI tool to get it to work. I was wondering if there was a way I could specify this without using a custom buildpack. And how to go about doing this if possible
Any help on this would be great, thanks
It's not possible to directly install things with apt or apt-get. Your app runs as an unprivileged user and is unable to run those tools to install things.
This leaves you with a couple of options:
Get the binary and bundle it with your app. Some people (no judgement from me though) would say that your app is responsible for bringing everything it needs to run anyway, so you should be doing this already.
Twelve-factor apps also do not rely on the implicit existence of any system tools. Examples include shelling out to ImageMagick or curl. [1]
This path works well for dependencies that are small or self-contained, like statically built Go apps. If your app needs shared libraries or other resources to function, you need to bundle those with your app too. It's also not great if the size of what you bundle is large. Everything you bundle will be pushed up with the app, so it can slow down your pushes. You are also responsible for tracking updates and making sure that you have the latest, bug-free and security-patched binaries and libraries.
The general steps for doing this are:
Create a folder like binaries/ under the root of your app, with subfolders of bin/ and lib/.
Place all your binaries under binaries/bin and any shared libraries they require under binaries/lib.
Add a .profile file at the root of your app. This will be sourced prior to your app starting so it will put any binaries you bundle on the path and add libraries to the search path.
In .profile put the following:
export PATH=$HOME/binaries/bin:$PATH
export LD_LIBRARY_PATH=$HOME/binaries/lib:$LD_LIBRARY_PATH
That should be it. Just push your app with all the new files.
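If it helps, the layout and the .profile can be scripted, roughly like this. APP_ROOT is a placeholder variable of mine for the root of your app. The LD_LIBRARY_PATH line is a slight variant of the one above that avoids a trailing colon when the variable starts out empty, since an empty entry makes the loader search the current directory.

```shell
#!/bin/sh
# Sketch: create the bundled-binaries layout and write the .profile.
# APP_ROOT is an assumed variable; it defaults to the current directory.
app_root=${APP_ROOT:-.}

# Binaries go in binaries/bin, the shared libraries they need in binaries/lib.
mkdir -p "$app_root/binaries/bin" "$app_root/binaries/lib"

# Quoted heredoc: $HOME and $PATH are expanded when the app starts, not now.
cat > "$app_root/.profile" <<'EOF'
export PATH="$HOME/binaries/bin:$PATH"
export LD_LIBRARY_PATH="$HOME/binaries/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
EOF
```

You can check the result locally by sourcing .profile in a subshell with HOME pointed at the app root and confirming that your bundled tool resolves from binaries/bin.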
Another, easier option is to use the Apt buildpack [2]. This buildpack can install required dependencies using apt. You just need to add an apt.yml file to the root of your app and run your app with multi-buildpacks (apt buildpack first, then your normal buildpack).
The main benefit of doing this is that you don't have to manage the dependencies. The apt buildpack will automatically install them from the repo you tell it to use, so it'll pick up new versions from there as well. This is good if what you need to install has a lot of dependencies, particularly sensitive dependencies (like openssl) or dependencies that get updated/patched often like other language runtimes (Python, Perl, Ruby, etc...).
Other benefits. It's easier because the buildpack takes care of adjusting PATH & LD_LIBRARY_PATH. It also makes the app size smaller so pushes are faster.
The downsides of this option are that the apt-buildpack is not an official buildpack (it's community maintained). It also works best when you have Internet access, so it can download binaries from the Internet, although you can work around this by using an internal repo.
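As a rough sketch of that setup, the snippet below writes an apt.yml and a manifest.yml that lists the apt buildpack first. The app name, the imagemagick package and ruby_buildpack as the final buildpack are placeholders of mine, as is the APP_ROOT variable. Check the apt-buildpack README [2] for the options your version supports.

```shell
#!/bin/sh
# Sketch: files for pushing an app with the apt buildpack in front.
# APP_ROOT is an assumed variable; it defaults to the current directory.
app_root=${APP_ROOT:-.}

# Packages for the apt buildpack to install (placeholder package name).
cat > "$app_root/apt.yml" <<'EOF'
---
packages:
  - imagemagick
EOF

# Buildpacks run in order; the last one is your app's normal buildpack.
# The app name and the final buildpack are placeholders.
cat > "$app_root/manifest.yml" <<'EOF'
---
applications:
  - name: my-app
    buildpacks:
      - https://github.com/cloudfoundry/apt-buildpack
      - ruby_buildpack
EOF
```

With both files in place, a plain cf push from the app root should pick up the manifest.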
There are a couple of other options as well, but I wouldn't recommend them unless both options above are definitely not going to work for you.
Use Docker. You can set up your own Docker container with all the dependencies you need, plus your app code, and cf push the Docker image to CF. The downside of this is that you lose the advantages of using buildpacks, so you're back to building and managing Docker images and all the required dependencies of your app on your own.
You could create your own custom buildpack and supply the dependencies that way. I don't see any reason you'd want to do this though. It's a decent bit of work and in the end, you'd have something that's just more brittle and less flexible than Apt Buildpack.
It's technically possible to ship your own rootfs, but you really really shouldn't (I'm just including this to be thorough). This is the base file system that's used by all apps on CF. Doing this has a lot of drawbacks though, chiefly being that it's difficult. It also applies to all apps on the foundation, can bloat the size of the rootfs, and makes a larger attack surface for anything using the rootfs (i.e. all apps).
Hope that helps!
[1] https://12factor.net/dependencies
[2] https://github.com/cloudfoundry/apt-buildpack

Trying Dask on AWS

I am a scientist who is exploring the use of Dask on Amazon Web Services. I have some experience with Dask, but none with AWS. I have a few large custom task graphs to execute, and a few colleagues who may want to do the same if I can show them how. I believe that I should be using Kubernetes with Helm because I fall into the "Try out Dask for the first time on a cloud-based system like Amazon, Google, or Microsoft Azure" category.
I also fall into the "Dynamically create a personal and ephemeral deployment for interactive use" category. Should I be trying native Dask-Kubernetes instead of Helm? It seems simpler, but it's hard to judge the trade-offs.
In either case, how do you provide Dask workers a uniform environment that includes your own Python packages (not on any package index)? The solution I've found suggests that packages need to be on a pip or conda index.
Thanks for any help!
Use Helm or Dask-Kubernetes?
You can use either. Generally starting with Helm is simpler.
How to include custom packages
You can install custom software using pip or conda. The packages don't need to be on PyPI or the Anaconda default channel. You can point pip or conda to other channels. Here is an example installing software using pip from a branch of a GitHub repository (pip selects the branch with @):
pip install git+https://github.com/username/repository@branch
For small custom files you can also use the Client.upload_file method.