Docker base image version for Spark 3 on Docker Hub - Dockerfile

I want to know the details of a public Apache Docker base image for Spark 3. For example, in a Dockerfile:
FROM datamechanics/spark:3.1.2-dm17
Is there any public image available from Apache for the Dockerfile?
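For reference, a minimal sketch of the kind of Dockerfile being asked about, built on the image named above (the job file name and the /opt/spark path are illustrative assumptions, not taken from the question):
FROM datamechanics/spark:3.1.2-dm17
# Layer application code on top of the Spark base image (paths are assumptions)
COPY my_job.py /opt/app/my_job.py
CMD ["/opt/spark/bin/spark-submit", "/opt/app/my_job.py"]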

Related

How to run a docker image from within a docker image?

I run a dockerized Django-Celery app which takes some user input/data from a webpage and (is supposed to) run a Unix binary on the host system for subsequent data analysis. The data analysis takes a while, so I use Celery to run it asynchronously. The data analysis software is dockerized as well, so my django-celery worker should do os.system('docker run ...'). However, Celery says docker: command not found, obviously because docker is not installed within my Django docker image. What is the best solution to this problem? I don't want to run docker within docker, because my analysis software should be allowed to use all system resources and not just the resources assigned to the Django image.
I don't want to run docker within docker, because my analysis software should be allowed to use all system resources and not just the resources assigned to the Django image.
I didn't catch the causal relationship here. In fact, we just need to add two steps to your Django image:
Follow Install client binaries on Linux to download the prebuilt docker client binary; then your Django image will have the docker command.
When starting the Django container, add a /var/run/docker.sock bind mount. This allows the Django container to talk directly to the docker daemon on the host machine and start the data-analysis tool container on the host. As the analysis container is not started inside the Django container, the two can have separate system resources. In other words, the analysis container's resources do not depend on the resources of the Django container.
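A minimal sketch of step 1, assuming a Debian-based Django image (the python:3.9-slim base and the Docker version in the download URL are illustrative):
FROM python:3.9-slim
# Install the prebuilt, statically linked docker client binary so the image has the docker command
RUN apt-get update && apt-get install -y curl \
 && curl -fsSL https://download.docker.com/linux/static/stable/x86_64/docker-20.10.9.tgz \
  | tar xzf - --strip-components=1 -C /usr/local/bin docker/docker
# ... the rest of your Django image ...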
A sample using a docker image which already has the docker client in it:
root@pie:~# ls /dev/fuse
/dev/fuse
root@pie:~# docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock docker /bin/sh
/ # ls /dev/fuse
ls: /dev/fuse: No such file or directory
/ # docker run --rm -it -v /dev:/dev alpine ls /dev/fuse
/dev/fuse
As you can see, although the first container does not have access to the host's /dev folder, the container launched from within it (via the host's daemon) does, so the two containers really do have separate resources.
If the above is what you need, then this is the right solution for you. Otherwise, you will have to install the analysis tool in your Django image.

How do you containerize a Django app for the purpose of running multiple instances?

The idea is to create a Django app that would serve as the backend for an Android application and provide a web admin interface for managing the mobile application's data.
Different sites of the company sometimes need different backends for the same Android app (the data has to be manageable completely separately). The application will be hosted on Windows server(s).
How can I containerize the app so I can run multiple instances of it (listening on different ports of the same IP), and so I can move it to different servers if needed and set up a new instance there?
I'm familiar with the Django development part, but I have never used Docker (or any other) containers before.
What I need:
Either a tutorial or documentation that deals with this specific topic
OR
Ordered points with some articles or tips on how to get this done.
Is this the kind of thing you wanted?
https://atrisaxena.github.io/projects/docker-basic-django-application-deployment/
The secret to having multiple instances is to map the ports when you run the container.
When you run
docker run -d -p 8000:8000 djangosite
you can change the port mapping by changing the 8000:8000 setting to any <host_port>:<container_port> you want.
e.g. if you follow the example above, you end up exposing port 8000 on the container (EXPOSE 8000 in the Dockerfile). The above command maps port 8000 on the host to 8000 on the container. If you want to then run a second instance of the container on port 8001, you simply run
docker run -d -p 8001:8000 djangosite
The final step is to use a proxy such as nginx to map the ports on the Docker host machine to URLs that are accessible via a browser (i.e. via port 80 for http and 443 for https).
Regarding moving the container, you simply need to import the docker image that you built onto whichever docker host machine you want; there is no need to move the source code.
Does this answer your question?
P.S. It is worth noting that the tutorial above recommends running the Django server using manage.py runserver which is NOT the standard way of deploying a Django site. The proper way to do it is to use WSGI or similar (via apache, nginx, gunicorn, etc.) within the container to properly interface with the container boundaries. See https://docs.djangoproject.com/en/3.2/howto/deployment/ for more info on how to properly deploy the site. All of the methods detailed in the documentation can be done within the container (but take care not to make your container too bulky or it will weigh down your host machines).
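For illustration, a hedged sketch of what the container's start command might look like with gunicorn (the mysite module name is a placeholder, and gunicorn must be added to your requirements):
# in the Dockerfile, instead of manage.py runserver:
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "mysite.wsgi:application"]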
P.P.S. It is also not strictly necessary to tag and push your docker image to a remote repository as suggested in the linked article. You can build the image locally with docker build (see https://docs.docker.com/engine/reference/commandline/build/) and save it as a file using docker save (see https://docs.docker.com/engine/reference/commandline/save/). You can then load the image on new hosts using docker load (https://docs.docker.com/engine/reference/commandline/load/).
N.B. Don't confuse docker save and docker load with docker export and docker import because they serve different functions. Read the docs for more info there. docker save and docker load work with images whereas docker export and docker import work directly with containers (i.e. specific instances of an image).
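A sketch of that registry-free workflow, with illustrative image and file names:
docker build -t djangosite .
docker save -o djangosite.tar djangosite
# copy djangosite.tar to the new host (scp etc.), then on that host:
docker load -i djangosite.tar
docker run -d -p 8000:8000 djangosite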
I would recommend having a docker-compose file with two services, named differently and mapped to different host ports. That's it:
version: '2'
services:
  backend:
    ports:
      # host_port:container_port
      - "8080:8000"
    build:
      context: ./directory_containing_docker_file
      dockerfile: Dockerfile
    restart: unless-stopped
    networks:
      - your-network
  backend2:
    ports:
      # host_port:container_port
      - "8090:8000"
    build:
      context: ./directory_containing_docker_file
      dockerfile: Dockerfile
    restart: unless-stopped
    networks:
      - your-network
networks:
  your-network:
    driver: bridge
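With that file in place, both instances can be started together (assuming it is saved as docker-compose.yml):
docker-compose up -d   # backend on host port 8080, backend2 on 8090
docker-compose ps      # verify both containers are running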

What is difference between alpine, alpine-jre and alpine-slim docker base images?

I want to add a docker base image to my Dockerfile. I am a little confused between these three types of images. I am trying to get a base image for Java 8 OpenJDK.
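For illustration, the kind of base-image lines at stake; both tags have been published in the openjdk repository on Docker Hub, but current availability should be verified:
FROM openjdk:8-jdk-alpine   # full JDK on Alpine: compiler plus runtime, larger
FROM openjdk:8-jre-alpine   # JRE only: smaller, enough to run compiled code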

Docker image different size when pushed to ECR than locally

I have a docker image that is 1.46GB on my local machine, but when it is pushed to AWS ECR (either from my local machine or via CircleCI deployment) it is only 537.05MB. I'm pretty new to Docker and to AWS, so any help figuring out why this may be would be appreciated!
I have a feeling that it has not fully uploaded to ECR for whatever reason, as I am trying to use this container for a Batch job, but the same command that works locally does not work in the job definition. The command is simply python app.py, but I have also tried the absolute path python /usr/local/src/app/app.py, both of which result in [Errno 2] No such file or directory.
Commands used in my Makefile deployment are as below:
docker build --force-rm=true -t $(EXTRACTOR_IMAGE_NAME) ./extractor
docker tag $(EXTRACTOR_IMAGE_NAME) $(EXTRACTOR_ECR_IMAGE_NAME)
$(shell aws ecr get-login --no-include-email)
docker push ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/$(EXTRACTOR_ECR_REPO)
Edit 1:
I think this might be to do with the size of the base image, which is python:2.7 in this case. The base image is 914MB, plus the size of my ECR image 537.05MB = 1451.05MB, i.e. approx 1.46GB. Still not sure what the issue is with the Batch command though...
Edit 2:
I've been mounting my code into the container using a volume, which is why this has been working locally. I forgot to copy the code into the image at build time, which I assume is the only reason this is not working in Batch!
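A minimal sketch of the fix (the python:2.7 base is from Edit 1, the path matches the command above, and the file layout is assumed):
FROM python:2.7
WORKDIR /usr/local/src/app
# Copy the application code into the image so it is present without a volume mount
COPY . /usr/local/src/app
CMD ["python", "app.py"]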
That could be due to how the Docker client behaves before it pushes the image to ECR, as documented:
Beginning with Docker version 1.9, the Docker client compresses image layers before pushing them to a V2 Docker registry. The output of the docker images command shows the uncompressed image size, so it may return a larger image size than the image sizes shown in the AWS Management Console.
So when you pull an image you will notice that the image layers go through three stages:
Downloading
Extraction
Completion
Regarding the command python /usr/local/src/app/app.py: are you executing it while inside /usr/local/src/app/? You might want to verify that first. Also, have you checked the command inside a container run from the image before you push it? The error seems to be code-related more than a Docker issue.
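For example, a quick local check before pushing (using the image name variable from the Makefile above):
docker run --rm $(EXTRACTOR_IMAGE_NAME) python /usr/local/src/app/app.py
docker run --rm $(EXTRACTOR_IMAGE_NAME) ls /usr/local/src/app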
We can read the following in the AWS ECR documentation:
Note
Beginning with Docker version 1.9, the Docker client compresses image layers before pushing them to a V2 Docker registry. The output of the docker images command shows the uncompressed image size, so it may return a larger image size than the image sizes shown in the AWS Management Console.
I suspect you'd get the sizes you expect if you used the CLI (docker images) instead of the ECR web console.

Need to capture a docker image or container from a machine in AWS with the application installed

As I am working with Docker, I need help capturing a container or image from an existing AWS box. On my AWS box, our application is installed and initialized.
Our application takes a long time to initialize, so I want to deploy this container (with the application installed) at box launch time. As I understand it, the container I capture will already have my application initialized, so I can save the application initialization time.
I am launching the machine through Ansible in an AWS VPC, so I can start the Docker container there.
Can anyone help with how to do this?
With Thanks,
Ezhilmurugan M I
If you docker commit your changes into an image with a tag, you can then push to a registry, and then pull down the images on another server.
$ docker commit <hash or name> yourusername/red_panda
$ docker push yourusername/red_panda
On other host
$ docker pull yourusername/red_panda
You could also export the container, transfer the archive however you want, and then import it as an image on the new server (note that docker export operates on a container, not an image):
$ docker export red_panda > latest.tar
$ cat latest.tar | docker import - exampleimagelocal:new