I have a Java standalone application which I have dockerized. I want to run this Docker container every time an object is put into S3 storage. One way is to do it via AWS Batch, which I am trying to avoid.
Is there a direct and easy way to call docker run from a lambda?
Yes and no.
What you can't do is execute docker run to run a container within the context of the Lambda call. But you can trigger a task on ECS to be executed. For this to work, you need to have a cluster set up on ECS, which means you need to pay for at least one EC2 instance. Because of that, it might be better to not use Docker, but I know too little about your application to judge that.
There are a lot of articles out there on how to connect S3, Lambda and ECS. Here is a pretty in-depth article by Amazon that you might be interested in:
https://aws.amazon.com/blogs/compute/better-together-amazon-ecs-and-aws-lambda/
If you are looking for code, this repository implements what is discussed in the above article:
https://github.com/awslabs/lambda-ecs-worker-pattern
Here is a snippet we use in our Lambda function (Python) to run a Docker container from Lambda:
result = boto3.client('ecs').run_task(
    cluster=cluster,
    taskDefinition=task_definition,
    overrides=overrides,
    count=1,
    startedBy='lambda'
)
We pass in the name of the cluster on which we want to run the container, as well as the task definition that defines which container to run, the resources it needs and so on. overrides is a dictionary/map with settings that you want to override in the task definition, which we use to specify the command we want to run (i.e. the argument to docker run). This enables us to use the same Lambda function to run a lot of different jobs on ECS.
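To make that concrete, here is a minimal, hedged sketch of how the whole Lambda handler might look when triggered by an S3 event; the cluster name, task definition name and the container name 'worker' are placeholders, not values from your setup:

import boto3

ecs = boto3.client('ecs')

def handler(event, context):
    # Pull the bucket and key out of the S3 event that triggered this Lambda.
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']

    # Override the container command so the task processes exactly this object.
    # 'my-cluster', 'my-task-def' and 'worker' are placeholder names.
    overrides = {
        'containerOverrides': [
            {
                'name': 'worker',
                'command': ['process', 's3://{}/{}'.format(bucket, key)],
            }
        ]
    }

    return ecs.run_task(
        cluster='my-cluster',
        taskDefinition='my-task-def',
        overrides=overrides,
        count=1,
        startedBy='lambda',
    )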
Hope that points you in the right direction.
Yes. It is possible to run containers out of Docker images stored in Docker Hub within AWS Lambda using SCAR.
For example, you can create a Lambda function to execute a container out of the ubuntu:16.04 image in Docker Hub as follows:
scar init ubuntu:16.04
And then you can run a command or a shell-script within that container upon each invocation of the function:
scar run scar-ubuntu-16-04 whoami
SCAR: Request Id: ed5e9f09-ce0c-11e7-8375-6fc6859242f0
Log group name: /aws/lambda/scar-ubuntu-16-04
Log stream name: 2017/11/20/[$LATEST]7e53ed01e54a451494832e21ea933fca
---------------------------------------------------------------------------
sbx_user1059
You can use your own Docker images stored in Docker Hub. Some limitations apply, but it can be effectively used to run generic applications on AWS Lambda. It also features a programming model for file-processing, event-driven applications. It uses udocker under the hood.
Yes, try udocker.
udocker is a simple tool written in Python with a minimal set of dependencies, so it can be executed in a wide range of Linux systems.
udocker does not make use of docker nor requires its installation.
udocker "executes" the containers by simply providing a chroot like environment over the extracted container. The current implementation uses PRoot to mimic chroot without requiring privileges.
Examples
Pull from docker hub and list the pulled images.
udocker pull fedora
Create the container from a pulled image and run it.
udocker create --name=myfed fedora
udocker run myfed cat /etc/redhat-release
It's also good to check the Hackernoon article on this, because:
In Lambda, the only place you are allowed to write is /tmp, but udocker will attempt to write to the home directory by default, among other things.
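A hedged sketch of how a Lambda handler might work around that, assuming the udocker script itself has been bundled into the deployment package (UDOCKER_DIR is udocker's documented override for its default ~/.udocker location):

import os
import subprocess

def handler(event, context):
    # /tmp is the only writable path in the Lambda environment, so point both
    # HOME and udocker's own state directory there before doing anything else.
    os.environ['HOME'] = '/tmp'
    os.environ['UDOCKER_DIR'] = '/tmp/.udocker'

    # Mirror the example above: pull an image, create a container, run a command.
    subprocess.run(['./udocker', 'pull', 'fedora'], check=True)
    subprocess.run(['./udocker', 'create', '--name=myfed', 'fedora'], check=True)
    result = subprocess.run(
        ['./udocker', 'run', 'myfed', 'cat', '/etc/redhat-release'],
        check=True, capture_output=True, text=True,
    )
    return result.stdout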
Related
I have a Docker image in Elastic Container Registry (ECR). It was created via a simple Dockerfile which I have control over.
The image itself is fine, but I have a problem where the shared memory is insufficient when working inside a container in SageMaker Studio. Therefore I need to raise the shared memory of these containers.
To raise the shared memory of a container, I believe the usual method is to pass the --shm-size argument to the docker run command when starting the container. However, I do not have control over this command, as SageMaker is doing that bit for me. The docs say that SageMaker is running docker run <image> train when starting a container.
Is it possible to work around this problem? Either by somehow providing additional arguments to the command, or by specifying something when creating the Docker image (such as in the Dockerfile or the deployment script to ECR).
According to this issue, there is no option you can use in SageMaker at the moment. If ECS is an option for you, it does support the equivalent of --shm-size (the sharedMemorySize setting) in the task definition.
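For illustration, a hedged boto3 sketch of registering such a task definition; the family name, image URI and sizes are placeholders, and sharedMemorySize is only honoured for the EC2 launch type, not Fargate:

import boto3

ecs = boto3.client('ecs')

ecs.register_task_definition(
    family='my-training-task',
    containerDefinitions=[
        {
            'name': 'trainer',
            'image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest',
            'memory': 4096,
            'linuxParameters': {
                # In MiB; this is the task-definition equivalent of --shm-size=2g.
                'sharedMemorySize': 2048,
            },
        }
    ],
)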
As pointed out by @rok (thank you!), it is not possible in this situation to pass arguments to docker run, although it would be if switching to ECS.
It is, however, possible to pass the --shm-size argument to docker build when building the image to push to ECR. This seems to have fixed the problem, although it does require a new Docker image to be built and pushed whenever you want to change this parameter.
AWS Step Functions may be run in a local Docker environment using Step Functions Local Docker. However, the step functions need to be defined using the JSON-based Amazon States Language. This is not at all convenient if your AWS infrastructure (Step Functions plus lambdas) is defined using AWS CDK/CloudFormation.
Is there a way to create the Amazon States Language definition of a state machine from the CDK or CloudFormation output, such that it’s possible to run the step functions locally?
My development cycle is currently taking me 30 minutes to build/deploy/run my Lambda-based step functions in AWS in order to test them and there must surely be a better/faster way of testing them than this.
We have been able to achieve this by the following:
Download:
https://docs.aws.amazon.com/step-functions/latest/dg/sfn-local.html
To run Step Functions Local, run the following in the directory where you extracted the Step Functions Local files:
java -jar StepFunctionsLocal.jar --lambda-endpoint http://localhost:3003
To create a state machine, you need a JSON definition. It can be pulled from the generated template, or you can install the AWS Toolkit plug-in for VS Code, type "step functions", and select from a template as your starter. You can also get it from the AWS console, in the Definition tab of the step function.
Run this command in the same directory as the definition json:
aws stepfunctions --endpoint-url http://localhost:8083 create-state-machine --definition "$(cat step-function.json)" --name "local-state-machine" --role-arn "arn:aws:iam::012345678901:role/DummyRole"
You should be able to hit the SF now (hopefully) :)
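If you prefer to drive Step Functions Local from code rather than the CLI, here is a hedged boto3 sketch; the dummy credentials, role ARN and input payload are placeholders, and Step Functions Local does not validate them:

import json
import boto3

# Point the SDK at the local endpoint instead of AWS.
sfn = boto3.client(
    'stepfunctions',
    endpoint_url='http://localhost:8083',
    region_name='us-east-1',
    aws_access_key_id='dummy',
    aws_secret_access_key='dummy',
)

with open('step-function.json') as f:
    definition = f.read()

machine = sfn.create_state_machine(
    name='local-state-machine',
    definition=definition,
    roleArn='arn:aws:iam::012345678901:role/DummyRole',
)

execution = sfn.start_execution(
    stateMachineArn=machine['stateMachineArn'],
    input=json.dumps({'hello': 'world'}),
)
print(sfn.describe_execution(executionArn=execution['executionArn'])['status'])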
You can use cdk watch or the --hotswap option to deploy your updated state machine or Lambda functions without a CloudFormation deployment.
https://aws.amazon.com/blogs/developer/increasing-development-speed-with-cdk-watch/
If you want to test with Step Functions local, cdk synth generates the CloudFormation code containing the state machine's ASL JSON definition. If you get that and replace the CloudFormation references and intrinsic functions, you can use it to create and execute the state machine in Step Functions Local.
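As a rough, hedged illustration of that last step, the sketch below pulls the DefinitionString out of a synthesized template and flattens the intrinsics; the stack file name and the placeholder Lambda ARN are assumptions, and real templates may need more cases handled:

import json

def flatten(node):
    # Collapse Fn::Join nodes into plain strings and replace any other
    # unresolved intrinsic (Ref, Fn::GetAtt, ...) with a dummy ARN that
    # Step Functions Local can be pointed at.
    if isinstance(node, dict):
        if 'Fn::Join' in node:
            separator, parts = node['Fn::Join']
            return separator.join(flatten(part) for part in parts)
        return 'arn:aws:lambda:us-east-1:123456789012:function:placeholder'
    return str(node)

with open('cdk.out/MyStack.template.json') as f:
    template = json.load(f)

for resource in template['Resources'].values():
    if resource['Type'] == 'AWS::StepFunctions::StateMachine':
        print(flatten(resource['Properties']['DefinitionString']))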
How some people have automated this:
https://nathanagez.com/blog/mocking-service-integration-step-functions-local-cdk/
https://github.com/kenfdev/step-functions-testing
Another solution that might help is to use LocalStack, which supports many tools such as CDK and CloudFormation and lets developers run the stack locally.
There are a variety of ways to run it; one of them is to run it manually in a Docker container, following the Get Started instructions.
Next, following the What's Next instructions, configure the AWS CLI or use awslocal.
All subsequent steps and templates should be the same as for the AWS API in the cloud.
Terraform has a dedicated "docker" provider which works with images and containers and which can use a private registry and supply it with credentials, cf. the registry documentation. However, I didn't find any means to supply a Dockerfile directly without the use of a separate registry. The problem of handling changes to Dockerfiles themselves is already solved, e.g. in this question, albeit without the use of Terraform.
I could do a couple of workarounds: not using the dedicated Docker provider, but using some other provider (although I don't know which one). Or I could start my own private registry (possibly in a Docker container with Terraform), run the docker commands locally which generate the image files (from Terraform this could be done using the null_resource of the null provider) and then continue with those.
None of these workarounds make much sense to me. Is there a way to deploy docker containers described in a docker file directly using terraform?
Terraform is a provisioning tool rather than a build tool, so building artifacts like Docker images from source is not really within its scope.
Much as how the common and recommended way to deal with EC2 images (AMIs) is to have some other tool build them and Terraform simply to use them, the same principle applies to Docker images: the common and recommended path is to have some other system build your Docker images -- a CI system, for example -- and to publish the results somewhere that Terraform's Docker provider will be able to find them at provisioning time.
The primary reason for this separation is that it separates the concerns of building a new artifact and provisioning infrastructure using artifacts. This is useful in a number of ways, for example:
If you're changing something about your infrastructure that doesn't require a new image then you can just re-use the image you already built.
If there's a problem with your Dockerfile that produces a broken new image, you can easily roll back to the previous image (as long as it's still in the registry) without having to rebuild it.
It can be tempting to try to orchestrate an entire build/provision/deploy pipeline with Terraform alone, but Terraform is not designed for that and so it will often be frustrating to do so. Instead, I'd recommend treating Terraform as just one component in your pipeline, and use it in conjunction with other tools that are better suited to the problem of build automation.
If avoiding running a separate registry is your goal, I believe that can be accomplished by skipping using docker_image altogether and just using docker_container with an image argument referring to an image that is already available to the Docker daemon indicated in the provider configuration.
docker_image retrieves a remote image into the daemon's local image cache, but docker build writes its result directly into the local image cache of the daemon used for the build process, so as long as both Terraform and docker build are interacting with the same daemon, Terraform's Docker provider should be able to find and use the cached image without interacting with a registry at all.
For example, you could build an automation pipeline that runs docker build first, obtains the raw id (hash) of the image that was built, and then runs terraform apply -var="docker_image=$DOCKER_IMAGE" against a suitable Terraform configuration that can then immediately use that image.
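As a hedged sketch of that kind of wrapper (assuming the Terraform configuration declares a variable named docker_image and feeds it to a docker_container resource), the pipeline script might look like this:

import subprocess

# Build the image; 'docker build -q' prints only the ID (sha256:...) of the
# resulting image on success.
image_id = subprocess.run(
    ['docker', 'build', '-q', '.'],
    check=True, capture_output=True, text=True,
).stdout.strip()

# Hand the ID straight to Terraform so the docker_container resource can use
# the image already sitting in the local daemon's cache.
subprocess.run(
    ['terraform', 'apply', '-auto-approve', '-var=docker_image=' + image_id],
    check=True,
)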
Having such a tight coupling between the artifact build process and the provisioning process does defeat slightly the advantages of the separation, but the capability is there if you need it.
I have an AWS generic container environment, and for some reason I want to run multiple dotnet webapi applications in the same container. I know it is possible, because I've already seen someone do it by passing a bash script into the ENTRYPOINT. This script creates one kestrel-$API.service file for each application, puts them in the /etc/systemd/system/ directory, and runs systemctl to enable those services.
But I want to know if there is a better or more elegant way to do it.
The reason I don't want to use multi-container is that in the Dockerrun file I have to specify a memory limit for each container, and I would also have to create a container network.
I am currently using AWS Elastic Beanstalk and I was curious how (as in, internally) it knows, when you fire up an instance (or it does automatically with scaling), to unpack the zip I uploaded as a version. Is there some environment setting that looks up my zip in my S3 bucket and then unpacks it automatically for every instance running in that environment?
If so, could this be used to automate a task such as run an SQL query on boot-up (instance deployment) too? Are these automated tasks changeable or viewable at all?
Thanks
I don't know how Beanstalk knows which version to download and unpack, but running a task on start-up is trivial. Check out cloud-init, a tool originally written for Ubuntu that's now packaged in Amazon Linux. It allows you to pass arbitrary shell scripts into the UserData section of the instance configuration, and those shell scripts will run on startup.
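For example, a hedged boto3 sketch of launching an instance with a user-data script (the AMI ID and the startup script path are placeholders):

import boto3

ec2 = boto3.client('ec2')

# cloud-init picks this script up from the UserData field and runs it on the
# instance's first boot; boto3 handles the base64 encoding for you.
user_data = """#!/bin/bash
echo "bootstrapping $(hostname)" >> /var/log/bootstrap.log
/usr/local/bin/run-my-startup-task.sh
"""

ec2.run_instances(
    ImageId='ami-0123456789abcdef0',
    InstanceType='t3.micro',
    MinCount=1,
    MaxCount=1,
    UserData=user_data,
)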
It's a great way to bootstrap instances on startup, which avoids the soul-sucking misery of managing AMIs.
A quick (possibly non-applicable) warning: If you're running a SQL query on a database that lives on the beanstalk AMI, you're pretty much guaranteed to lose your database at some point. Those machines are designed to be entirely transient. Do not put databases on them. See this answer for more details.
Since your goal seems to be to run custom configuration tasks, the answer is yes, there is a way to do that. You can define custom actions in .ebextensions configuration files packaged with your app. For example, you can configure a command to run every time a new machine is deployed:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html#linux-commands