Datalab - how to install and keep packages - google-cloud-platform

I decided to try Google Cloud Datalab for a small project I'm working on, rather than a Jupyter Notebook in an Anaconda environment on an AWS instance.
How can I install a package (for example OpenCV) onto the Datalab VM so that I don't have to reinstall it every time I restart the VM? Why do the packages disappear after every restart while the updated notebooks remain persistent? Any help answering these questions and clarifying how the Datalab VM works would be appreciated.

The notebooks are stored in a Docker volume mount that maps to a location on the persistent disk, which is maintained across restarts of the VM.
The packages you install, however, are stored in the running container and are therefore lost on each restart.
You could create a custom Docker image and use that instead. See the --image-name argument of the datalab create command.
Here is an example of a Dockerfile you'll want to use:
FROM gcr.io/cloud-datalab/datalab:latest
RUN pip install opencv-python  # opencv-python is the PyPI package that provides the cv2 module
Note that you'll need to build the Docker image from this Dockerfile and push the image to Google Container Registry. My memory is a bit fuzzy on this, but it is possible the image needs to be marked as public.
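For reference, a minimal sketch of that build-push-create flow; the project ID, image name, and instance name below are placeholders:
# Build the image from the Dockerfile above and push it to Container Registry
docker build -t gcr.io/my-project/datalab-opencv .
gcloud auth configure-docker
docker push gcr.io/my-project/datalab-opencv
# Create the Datalab instance from the custom image
datalab create --image-name gcr.io/my-project/datalab-opencv my-datalab-instance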
Hope that helps!

Related

Clone a google cloud VM

I have a Google Cloud VM with Ubuntu installed, along with various services and libraries. I need to make a similar bootable VM with the same OS and all the data, libraries, etc. as the already configured VM. How do I clone the VM with these requirements?
I tried to create an image from the already existing VM and could not SSH into the instance created from it.
So I retraced my installations step by step, trying to figure out which step breaks the image.
I created an Ubuntu (18.04) VM and used that to create an image. I was able to SSH into the instance I created from that image.
Next, I installed Ubuntu desktop and the xorg server and created an image after that. Using that image, I created a new VM and tried to SSH into it.
But unfortunately, the SSH connection could not be established. So I think these installations are causing the error, if it is not some sort of system error.
Below are the exact commands I ran to install these after creating an Ubuntu(18.04) VM:
sudo passwd username
sudo su -
passwd
apt update && apt upgrade -y
adduser username root
adduser username admin
adduser username sudo
apt-get install ubuntu-desktop -y
apt-get install xserver-xorg-video-dummy
nano /etc/X11/xorg.conf
and pasted the following into the .conf file
Section "Device"
Identifier "Configured Video Device"
Driver "dummy"
EndSection
Section "Monitor"
Identifier "Configured Monitor"
HorizSync 31.5-48.5
VertRefresh 50-70
EndSection
Section "Screen"
Identifier "Default Screen"
Monitor "Configured Monitor"
Device "Configured Video Device"
DefaultDepth 24
SubSection "Display"
Depth 24
Modes "1600x900"
EndSubSection
EndSection
After this state, I created the image using which I could not instantiate a VM that I could SSH into.
Since you have your VM ready and running, back up your image as per this GCP document. Before you begin the process, follow the guidelines mentioned in the document, such as updating the Google Cloud CLI, setting the default region and zone, and the general image guidelines.
A few networking features may require specific guest operating system features to be set on the image. You can also check how to export a custom image to Cloud Storage.
You can also consider the Snapshot Approach.
Follow this process to create an image that matches the one you have already set up and know is working correctly. As you may already know, this is a custom image, so it is available only to your Cloud project. You can also create a custom image from boot disks and other images if you like. Then use the custom image to create an instance.
I also suggest you take a look at this document, which gives a deeper explanation of the task.
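As a rough sketch of that flow with the gcloud CLI (the disk, image, and instance names here are placeholders):
# Create a custom image from the existing VM's boot disk
# (stop the VM first, or add --force / work from a snapshot)
gcloud compute images create my-custom-image \
    --source-disk=my-configured-vm \
    --source-disk-zone=us-central1-a
# Create a new instance from that custom image
gcloud compute instances create my-cloned-vm \
    --image=my-custom-image \
    --zone=us-central1-a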
Regards,
Just spin up a new VM from a disk snapshot if you need an exact copy. And if you cannot SSH, you may not have an SSH public key provisioned, no external IP assigned, or port 22 closed.
gcloud compute ssh usually works. You can also provision project-wide SSH keys, which every VM in the project will then inherit. The documentation below, About VM metadata, explains all of this in detail.
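For example, project-wide keys can be added to the project metadata roughly like this (the key file is a placeholder; note that the ssh-keys value is replaced as a whole, so the file should contain every key you want to keep):
# Add project-wide SSH keys from a file containing lines like
#   username:ssh-ed25519 AAAA... username
gcloud compute project-info add-metadata \
    --metadata-from-file ssh-keys=ssh-keys.txt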
My personal favorite, though, is startup scripts, which describe the configuration instead of copying it.
And it's not so difficult to get started with these: cat ~/.bash_history > rocky8_startup.sh. In a software-defined data center, it makes sense to use software-defined configurations (you simply cannot vary the installation per VM instance when starting from a disk snapshot).
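A minimal sketch of that approach, assuming the public Rocky Linux 8 image family (adjust the names and flags to your setup):
# Boot a fresh VM that configures itself from the startup script
gcloud compute instances create rocky8-vm \
    --image-family=rocky-linux-8 \
    --image-project=rocky-linux-cloud \
    --metadata-from-file startup-script=rocky8_startup.sh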
xserver-xorg-video-dummy is questionable, because you can enable a virtual display device on the instance instead; but unless you are recording the screen, the dummy driver might still suffice, e.g. for VNC sessions.

Make custom kernels available to SageMaker Studio notebooks

On starting the SageMaker Studio server, I can only see a set of predefined kernels when I select a kernel for any notebook.
I create conda environments and persist them between sessions by pointing .condarc to a custom miniconda directory stored on EFS.
I want all notebooks to have access to the environments stored in the custom miniconda directory. I can use them from the system terminal, but I can't seem to find a way to make the kernels available to notebooks.
I am aware of Lifecycle Configuration, but that seems to work only with notebook instances rather than SageMaker Studio.
Desired outcomes
Ideally, the custom kernels would be persistently available to notebooks, but if that isn't feasible or requires a custom Docker image, I am happy to run a script manually every time I start the server.
What I have tried so far:
I ran the following, which is a tweaked version of the start.sh meant for Lifecycle Configuration.
#!/bin/bash
set -e
sudo -u sagemaker-user -i <<'EOF'
unset SUDO_UID
WORKING_DIR=/home/sagemaker-user/.SageMaker/custom-miniconda/
source "$WORKING_DIR/miniconda/bin/activate"
for env in $WORKING_DIR/miniconda/envs/*; do
    BASENAME=$(basename "$env")
    source activate "$BASENAME"
    python -m ipykernel install --user --name "$BASENAME" --display-name "$BASENAME"
done
EOF
That didn't work and I couldn't access the kernels from the notebooks.
If you need a persistent custom kernel in SageMaker Studio, you can create an ECR repository and build a Docker image with your custom environment configuration. This image can then be attached to SageMaker Studio notebooks. Reference link!
SageMaker Studio now also supports the use of lifecycle configurations. Reference link!
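As a hedged sketch of that route, a Studio lifecycle configuration could be registered with the AWS CLI roughly like this, using a script such as the one in the question (the config name and script file are placeholders, and the right app type depends on where the script needs to run; check the linked reference):
# Register the kernel-setup script as a Studio lifecycle configuration
aws sagemaker create-studio-lifecycle-config \
    --studio-lifecycle-config-name register-custom-kernels \
    --studio-lifecycle-config-app-type JupyterServer \
    --studio-lifecycle-config-content "$(base64 -w0 register-kernels.sh)"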

docker context create ecs myecs - requires exactly one argument

I'm trying to create a Docker context that will automatically integrate with AWS's ECS.
I'm following this tutorial
The author just does:
docker context create ecs myecs and gets a "pick an integration" prompt, whereas I get an error saying it needs exactly 1 argument.
docker context create" requires exactly 1 argument.
See 'docker context create --help'.
Usage: docker context create [OPTIONS] CONTEXT
Create a context
You need to install the Docker Compose CLI preview
The below curl is from here: Docker docs
curl -L https://raw.githubusercontent.com/docker/compose-cli/main/scripts/install/install_linux.sh | sh
sudo docker context create ecs myecs
It didn't work without sudo for me for some reason.
After the script finished I had some weird errors:
cp: cannot stat '/tmp/tmp.d4QjhW8T6k/docker-compose': No such file or directory and docker context create ecs myecs didn't work at first, but once I tried with sudo it worked fine.
EDIT: . ~/.zshrc (or just close your terminal and open a new one) made it possible for me to run docker context create ecs myecs without sudo.
Author of the blog/tutorial here. It looks like you don't have the prerequisite installed. In the blog I call out the prerequisites in pieces like this.
....In July, Docker released a beta for Docker Desktop that embedded these functionalities and, on September 15th, Docker released an updated experience in their Docker Desktop stable channel....
and then
...For now the only thing you need is Docker Desktop and an AWS account. For this test , I am using Docker Desktop (stable) version 2.5.0.1....
and finally
The core of this integration is built around a new tool dubbed Compose CLI (this is not to be confused with the original docker-compose CLI). This new CLI surfaces to the user as new functionalities in the docker command. While in Docker Desktop all this plumbing is completely hidden and available out of the box, if you are using a Linux machine you can set it up using either a script or a manual install. This new CLI is, essentially, a new version of the docker binary.
I'm eager to understand how we could make it clearer / more front and center that there was stuff to install and/or minimum software versions you had to use.
Thanks for trying it out!
If you're on Linux and you're running the docker context create ecs myecscontext command from the docs then try enabling experimental features in docker:
Edit /etc/docker/daemon.json
Set contents to
{
"experimental": true
}
Restart the docker service: sudo systemctl restart docker
Exit your terminal and open a new one so that the changes take effect.
Source1
Source2
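Once experimental mode is on, the command from the docs should go through; you can sanity-check the result with the context subcommands:
docker context create ecs myecscontext
docker context ls          # the new ecs context should show up in the list
docker context use myecscontext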
I had the same issue, but after installing the Docker Desktop version the problem was resolved.
The server-only (Docker Engine) version doesn't have this kind of functionality.

Customizing the cloud environment to include a package permanently

I have been installing some packages with the sudo apt-get command in Cloud Shell, but now I want to make them permanent. I got this message in the shell:
You are running apt-get inside of Cloud Shell. Note that your Cloud Shell
machine is ephemeral and no system-wide change will persist beyond session end.
You can customize your environment to permanently include this package by
updating your environment at https://cloud.google.com/console/cloudshell/environment/view.
So how to customize the cloud environment to include a package permanently?
You have several options.
1) Reinstall everything each time you launch Cloud Shell. This sounds bad but if you keep your files on GCS, the copy happens very fast.
2) Cloud Shell is a Docker container. You can modify that container so that you launch Cloud Shell using your customized container. Launch Cloud Shell. In the title bar on the right-hand side is an icon that looks like a laptop. Click it. This will open a window with details on configuring the Docker container.
3) Keep everything local to your home directory. Your home directory tree is persistent and will be restored each time your Cloud Shell VM is recreated.
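One way to take advantage of that persistence, assuming the $HOME/.customize_environment hook that Cloud Shell runs as root when the VM boots, is a small script in your home directory that reinstalls the packages automatically:
#!/bin/sh
# $HOME/.customize_environment -- executed at Cloud Shell boot (as root)
apt-get update
apt-get install -y your-package   # placeholder: the package you want to keep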

Getting Started with EC2 Container Registry

This is giving me a headache.
Here's what I've done so far
Created an EC2 Virtual Server Instance, and its running
Installed the AWS CLI
Installed Docker on my EC2 Virtual Server after I SSH'd into it
So looking at the docs it tells you how to build an image. Now comes my confusion.
Question 1: So am I right in assuming that one basically has the option to a) build an image off your host or b) pull an image created by others from Docker Hub?
Question 2: If I'm right about Question #1, then what am I building an image off of, if I am not pulling one from Docker Hub, with the AWS docs here?
Question 3: Then I see a whole different route I can take, using Docker Compose; would I use that instead of all of the above? This is so confusing.
EC2 Container Registry – Now Generally Available
So again, here, it tells you to install Docker on the host and then immediately jumps into "create an image". Create an image off what, that host's OS? I don't get it; I guess that's what it means, OR I can pull an image from Docker Hub and not go this route?
Same here: it's talking about creating a Docker image. Off what, the host?
Or maybe I'm not understanding what "image" means, but I assume that going this route, instead of pulling a Docker image from Docker Hub, I'm creating an image off my EC2 virtual instance?
A1: No, you can't build an image off your host.
You can create a new image according to your requirements, such as which operating system (Ubuntu, Fedora), stack (LAMP, LEMP), and many other things.
Or you can pull an image that comes pre-configured with all the packages, like a WordPress stack image, Magento stack image, or Bitnami image, from Docker Hub.
A2: As I mentioned earlier, you can build an image of any operating system you want (Ubuntu, Fedora, Debian), but not off the host.
You just need to pull an image from Docker Hub, e.g. docker pull ubuntu will pull a minimal image of Ubuntu 14.04. And if you need a specific version of Ubuntu,
like Ubuntu 12.04, then docker pull ubuntu:12.04 will pull a minimal image of Ubuntu 12.04.
A3: Docker Compose is a tool for defining and running multi-container Docker applications. docker-compose uses a compose file
in which you configure your application's services.
And finally, Amazon EC2 Container Registry is a slightly different thing. The idea is the same as Docker's registry, but Amazon provides
it as part of the EC2 Container Service with other functionality that Docker doesn't have right now.
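For completeness, once you've built or pulled an image on your EC2 host, pushing it to ECR with the current AWS CLI looks roughly like this (the account ID, region, and repository name are placeholders):
# Create a repository and push a locally built image to it
aws ecr create-repository --repository-name my-app
aws ecr get-login-password --region us-east-1 \
    | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker build -t my-app .
docker tag my-app:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest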
Hope it helps :-)