Docker Volume vs AWS S3

Probably I am completely off in my assumptions, but I am pretty new to both Docker and AWS. We have two applications running as Dockerized containers on the same docker-compose bridge network.
We have been looking for a way for these two containers to share some files. Since we are on the cloud, one suggestion was an Amazon S3 bucket, which is great. But my question is: since we are in a Docker environment, does it not make more sense to share those files in a Docker volume? I thought that's exactly what a Docker volume is: a mounted virtual place where files can be shared. At least that is my shallow and simplistic understanding after reading about Docker volumes.
So I do have some questions:
1. Is my assumption that an AWS S3 bucket and a Docker volume provide similar functionality correct, i.e., is this an apples-to-apples comparison?
2. If my assumption is correct, would a Docker volume qualify to be called an object store?
3. If it does qualify to be called an object store, would it be wise to use a Docker volume as a replacement for AWS S3?
4. If not, why?

1. No. They are different and even complementary. There's a plugin for Docker volumes on AWS here:
https://github.com/joeduffy/blocker
2. I wouldn't use the term object store. A Docker volume is implemented as a filesystem mounted on the container.
3. No...
4. ... for the reason stated in (1).
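For the original use case - two containers on one docker-compose network sharing files - a named Docker volume is usually all that's needed. A minimal sketch with the plain Docker CLI (the volume and container names are made up for illustration):
# Create a named volume and mount it into two containers.
docker volume create shared-data
# One container writes into the volume...
docker run -d --name writer -v shared-data:/data alpine sh -c 'echo hello > /data/msg; sleep 3600'
# ...and another reads the same file.
docker run --rm --name reader -v shared-data:/data alpine cat /data/msg
In docker-compose, the equivalent is a top-level volumes: entry plus a volumes: mount on each service. S3, by contrast, is remote object storage accessed over HTTP, which is why the two are complementary rather than interchangeable.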

Related

Best practices for copying container images between ECRs in different accounts

I wonder what the best practices are for copying Docker container images from one ECR to another in AWS.
I have to copy container images periodically between multiple ECR repositories, each in a separate AWS account - like mirroring, but with specific filters for what to copy and what to skip. I wrote a script that does this work by pulling the missing images from the 'source' ECR to an EC2 VM and pushing them to the 'target' ECR.
This works, but I am not satisfied with the performance of doing it in a single thread, and it's not network throughput that limits it but the 'expenses' of wrapping commands, running the necessary calls to AWS, etc.
So I am thinking of rewriting the script as a multi-threaded application, but I wonder if I'm reinventing the wheel and there is already a known, better solution for this task.
As confirmed by AWS support, there is no 'out-of-the-box' way to do this job, other than directly mirroring an entire repository.
So I rewrote the tool to do it in a smarter and faster way, and published it:
https://github.com/dfad1ripe/aws-ecr-cross-account-clone
To use it, both source and destination AWS accounts should be defined as profiles in ~/.aws/credentials, and the host should be running either Docker Engine or Docker Desktop.
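For reference, the core of any such copy - single-threaded or not - is a pull/retag/push cycle. A minimal sketch for one image, assuming AWS CLI v2 and two profiles named source and target in ~/.aws/credentials (the account IDs, region, and image name below are placeholders):
SRC=111111111111.dkr.ecr.us-east-1.amazonaws.com
DST=222222222222.dkr.ecr.us-east-1.amazonaws.com
IMG=my-app:1.0.0
# Log in to both registries.
aws ecr get-login-password --profile source --region us-east-1 | docker login --username AWS --password-stdin "$SRC"
aws ecr get-login-password --profile target --region us-east-1 | docker login --username AWS --password-stdin "$DST"
# Pull from the source account, retag, and push to the target account.
docker pull "$SRC/$IMG"
docker tag "$SRC/$IMG" "$DST/$IMG"
docker push "$DST/$IMG"
The per-image overhead of these wrapper calls is exactly what makes a single-threaded loop slow, which is why running several such cycles in parallel (or using a tool like the one above) helps.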

How is the `/tmp` folder managed when using ECS Fargate?

I'm currently running some containers in production using AWS Fargate. I'm running an application that from time to time writes some files to the /tmp folder.
That said, I want to know what happens to this /tmp folder. Is it something managed by Fargate (by the ECS Container Agent, for example), or is it something that I need to manage myself (using a cron job to clear the files there, for example)?
NOTE 1: One way to handle this is to use S3 for that kind of behavior; however, the question is about how Fargate behaves regarding the /tmp folder.
NOTE 2: I don't need the files in the /tmp folder; they just happen to appear there, and I want to know if I need to remove them or if ECS will do that for me.
I couldn't find anything about this in the documentation. If someone points to where the docs cover it, I would be happy to accept that answer.
If I understand your question correctly, it looks like you want more precise control over temporary storage within your container.
I don't think there is anything special that ECS or Fargate does with the /tmp folder on the filesystem within the container.
However, Docker does have the notion of a tmpfs mount. This lets you designate a path whose contents live only in memory and are never written to the container's host machine:
https://docs.docker.com/storage/tmpfs/
ECS recently added support for the shm-size and tmpfs parameters:
https://aws.amazon.com/about-aws/whats-new/2018/03/amazon-ecs-adds-support-for-shm-size-and-tmpfs-parameters/
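With plain Docker, the flag looks like the sketch below (the mount path, size, and image are arbitrary examples); in an ECS task definition, the equivalent lives under linuxParameters.tmpfs in the container definition:
# Mount an in-memory filesystem at /tmp; its contents disappear when the container stops.
docker run --rm --tmpfs /tmp:rw,noexec,nosuid,size=64m alpine df -h /tmp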
If I understand correctly, once your Fargate task stops running, all of its storage goes away.
According to the AWS documentation, a Fargate task receives some storage when provisioned, but it is ephemeral storage.
So unless you are using a forever-running task, you don't need to deal with the temporary files; they will be gone along with the storage.
I hope this helps.

Persistence in AWS Fargate Containers

I have 2 containers in a Fargate task definition. One of the containers is a database server, and I want to persist its data directory. However, Fargate doesn't support the Source Path field when setting up a volume in the task definition. Does anyone know how to set up persistence in Fargate?
AWS Fargate at this moment is targeted at stateless container solutions only - but we never know, maybe AWS is already working on a solution for it.
Remember that you are sharing the same host with other AWS customers. Your instance could be terminated and restarted on another host at any time. You can also scale out your service at any time.
You can use any of the options below:
use RDS for general-purpose databases;
if your database engine is not available on RDS, you can start a new EC2 instance and install the database there;
continue to use Fargate for the other services.
AWS Fargate supports EFS volumes, at last!
I can think of 3 ways to do this:
use a storage solution compatible with container workloads (Longhorn or Portworx are good calls);
use RDS;
use a distributed database that can keep multiple copies of its data (but you will have to handle the case where all the copies were shut down).
There is an item on the AWS containers roadmap you can track: "[Fargate] [Volumes]: Allow at least EFS mounts to Fargate Containers":
https://github.com/aws/containers-roadmap/issues/53
Until then you can:
Generate a dump of the database periodically within the container.
With the help of the AWS CLI/SDK, upload it to S3.
Use the dump to recover whenever required (see the sketch below).
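A minimal sketch of that dump-and-restore cycle for MySQL, assuming the AWS CLI is available in the container; the database name, bucket, and object names are placeholders:
# Dump the database and ship the dump to S3.
mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" mydb > /tmp/mydb.sql
aws s3 cp /tmp/mydb.sql "s3://my-backup-bucket/mydb-$(date +%F).sql"
# Recover later by pulling a dump back (example object name) and replaying it.
aws s3 cp "s3://my-backup-bucket/mydb-2020-01-01.sql" /tmp/restore.sql
mysql -u root -p"$MYSQL_ROOT_PASSWORD" mydb < /tmp/restore.sql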

AWS ECS mysql DB instance

Hey, I wanted a design answer regarding using a MySQL database in an AWS ECS container. I'm not using RDS as I'm currently doing an MVP. Is it possible to run MySQL as a Docker container, and if it is, how do I make sure prod data is persisted when a deployment of this DB container happens?
Please guide me through this scenario.
Yes, entirely possible.
Explaining it from start to finish is way too much for an SO answer. AWS has thorough documentation on ECS, and I would recommend starting there: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/Welcome.html
The section concerning data persistence is here: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html
The thing to remember with volumes and ECS: named volumes are for sharing data between containers; host volumes are for persisting data beyond the lifecycle of any number of containers. So you'll want to mount a volume from the underlying EC2 instance into the container at the path where the MySQL data is stored.
Depending on which MySQL image you choose, the container data directory might differ. Any image worth its salt is going to tell you where this directory is located in the README, because that is a very common question with databases + Docker.
Yes, it is possible. All you have to do is find a MySQL image, such as the official one, and, just as instructed in the image's documentation, run:
docker run --name my-container-name -v /my/own/datadir:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql/mysql-server:tag
The -v /my/own/datadir:/var/lib/mysql part of the command mounts the /my/own/datadir directory from the underlying host system as /var/lib/mysql inside the container, where MySQL by default will write its data files.
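To see why this persists data across deployments: the container can be removed and replaced while the host directory survives. A quick check, reusing the placeholder names from the command above:
# Remove the running container; the data files stay in /my/own/datadir on the host.
docker rm -f my-container-name
# A replacement container pointed at the same host directory starts with the old data.
docker run --name my-container-name -v /my/own/datadir:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql/mysql-server:tag
On classic (EC2-backed) ECS, the same idea is expressed as a task-definition volume with a host source path plus a mountPoint in the container definition.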

Motivation for putting Docker containers inside an AWS EC2 instance

There seems to be a growing trend of developers setting up multiple Docker containers inside their AWS EC2 instances. This seems counterintuitive to me. Why put a virtual machine inside another virtual machine? Why not just use smaller EC2 instances (i.e., make the instance the same size as your container and then not use Docker at all)? One argument I've heard is that Docker containers make it easy to bring your development environment exactly as-is to prod. Can't the same be done with Amazon Machine Images (AMIs)? Is there some sort of marginal cost savings? Is it so that developers can be cloud-agnostic and avoid vendor lock-in? I understand the benefits of Docker on physical hardware, just not on EC2.
The main advantage of Docker related to your question is the concept of images. These are lightweight and easier to configure than AMIs. Also note that Docker runs perfectly well inside a VM (in this case, an EC2 instance).
More info here - How is Docker different from a normal virtual machine?
"Docker containers inside of their AWS EC2 instances. This seems counterintuitive to me. Why put a virtual machine inside another virtual machine?"
Err... a Docker container is not a VM. It is a way to isolate part of the resources (filesystem, CPU, memory) of your host (here, an EC2 VM).
An AMI (Amazon Machine Image) is just one kind of EC2 resource.