Customizing the cloud environment to include a package permanently - google-cloud-platform

I have been installing some packages with the sudo apt-get command in Cloud Shell, but now I want to make them permanent. I got this message in the shell:
You are running apt-get inside of Cloud Shell. Note that your Cloud Shell
machine is ephemeral and no system-wide change will persist beyond session end.
You can customize your environment to permanently include this package by
updating your environment at https://cloud.google.com/console/cloudshell/environment/view.
So how do I customize the cloud environment to include a package permanently?

You have several options.
1) Reinstall everything each time you launch Cloud Shell. This sounds bad, but if you keep your files on GCS, the copy happens very fast.
2) Cloud Shell is a Docker container. You can modify that container so that you launch Cloud Shell using your customized container. Launch Cloud Shell. In the title bar, on the right-hand side, is an icon that looks like a laptop. Click it. This will open a window with details on configuring the Docker container.
3) Keep everything local to your home directory. Your home directory tree is persistent and will be restored each time your Cloud Shell VM is recreated (see the sketch after this list).
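For example, options 1 and 3 combine nicely: keep a small reinstall script in your persistent home directory and run it once per session. This is only a sketch; the package names are placeholders.
#!/bin/bash
# $HOME/reinstall-tools.sh -- lives in the persistent home directory (option 3)
# and reinstalls system packages once per session (option 1).
# Package names are placeholders; substitute the ones you actually use.
set -e
sudo apt-get update
sudo apt-get install -y htop tmux
Each new session then only needs a single bash ~/reinstall-tools.sh.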

Related

Make custom kernels available to SageMaker Studio notebooks

On starting the SageMaker Studio server, I can only see a set of predefined kernels when I select a kernel for any notebook.
I create conda environments and persist them between sessions by pointing .condarc to a custom miniconda directory stored on EFS.
I want all notebooks to have access to environments stored in the custom miniconda directory. I can do that on the system terminal but can't seem to find a way to make the kernels available to notebooks.
I am aware of Lifecycle Configuration, but that seems to work only with notebook instances rather than SageMaker Studio.
Desired outcomes
Ideally, custom kernels would be persistently available to notebooks, but if that isn't feasible or requires a custom docker image, I am happy to run a script manually every time I start the server.
What I have tried so far:
I ran the following, which is a tweaked version of a start.sh script meant for Lifecycle Configuration.
#!/bin/bash
set -e
sudo -u sagemaker-user -i <<'EOF'
unset SUDO_UID
# Custom miniconda install persisted on EFS
WORKING_DIR=/home/sagemaker-user/.SageMaker/custom-miniconda/
source "$WORKING_DIR/miniconda/bin/activate"
# Register an ipykernel kernelspec for every conda environment
for env in "$WORKING_DIR"/miniconda/envs/*; do
    BASENAME=$(basename "$env")
    source activate "$BASENAME"
    python -m ipykernel install --user --name "$BASENAME" --display-name "$BASENAME"
done
EOF
That didn't work and I couldn't access the kernels from the notebooks.
If you need a persistent custom kernel in SageMaker Studio, you can create an ECR repository and build a docker image with your custom environment configuration. This image can then be attached to SageMaker Studio notebooks (reference link).
SageMaker Studio now also supports the use of lifecycle configurations (reference link).
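A rough outline of the ECR route might look like the following; the account ID, region, and repository name are placeholders, and the image itself must meet SageMaker Studio's custom-image requirements described in the AWS documentation referenced above.
# Placeholders throughout: 123456789012 (account), us-east-1 (region), smstudio-custom (repo).
aws ecr create-repository --repository-name smstudio-custom
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker build -t smstudio-custom:conda .
docker tag smstudio-custom:conda 123456789012.dkr.ecr.us-east-1.amazonaws.com/smstudio-custom:conda
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/smstudio-custom:conda
# The pushed image is then registered and attached to the Studio domain
# (aws sagemaker create-image / create-app-image-config / update-domain),
# as covered in the AWS documentation referenced above.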

Run docker from within toolbox

I would like to use Google Container OS as my cloud development environment. How would I run the docker command from the toolbox? Do I need to add the docker.sock as a bind mount? I need to be able to run docker (and docker-compose) to run my development environment.
Google Container OS images come with docker already installed and configured, so if you create a virtual machine from one of these images and SSH into it, you can use the docker command from the command line without any prior configuration.
As for docker-compose, this doesn't come pre-installed. However, you can install it, and any other tools or programs you require, by using the toolbox you mentioned, which provides a shell (including a package manager) in a Debian chroot-like environment where you automatically gain root privileges.
You can install docker-compose by following these steps:
1) If you haven't already, enter the toolbox environment by running /usr/bin/toolbox
2) Check the latest version of docker-compose here.
3) Run the following to retrieve and install docker-compose on the machine (substitute the docker-compose version number with the latest version you found in step 2), then mark it executable:
curl -L https://github.com/docker/compose/releases/download/1.18.0/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
You've probably found at this point that although you can now run the freshly installed docker-compose command within the toolbox, you can't run the docker command. This is because, by default, the toolbox environment doesn't have access to all paths within the rootfs, and the filesystem it sees doesn't correspond to the one in the standard environment.
It may be possible to remedy this by exiting the toolbox shell and then editing the /etc/default/toolbox file, which lets you configure the toolbox settings. This would allow you to expose the docker binary from the standard environment by following these steps:
1) Ensure you are no longer in the toolbox shell, then run which docker. You will see something similar to /usr/bin/docker.
2) Open the file /etc/default/toolbox
3) The TOOLBOX_BIND line specifies the paths from the rootfs to be made available inside the toolbox environment. To make docker available inside the toolbox environment, you could try adding an entry to TOOLBOX_BIND, for example --bind=/usr/bin/docker:/usr/bin/docker, as sketched below.
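As a rough illustration only (the binds already present on your image may differ, so append the new entry to whatever is already listed rather than replacing it):
# /etc/default/toolbox -- keep the existing binds and add the docker binary
TOOLBOX_BIND="--bind=/usr/bin/docker:/usr/bin/docker"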
However, I've found that even after editing /etc/default/toolbox to make the docker binary available inside the toolbox, certain docker commands still produce errors. The pre-installed docker is configured to use particular configuration files and directories, and while it may be possible to expose all of the required locations through /etc/default/toolbox, it may be simpler to install docker within the toolbox by following the instructions for installing docker on Debian found here.
You would then be able to issue both the docker and docker-compose commands from within the toolbox.
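The full step-by-step Debian instructions are in the page linked above; as a shortcut, Docker's upstream convenience script usually works in a Debian-like environment such as the toolbox (review the script before running it):
# Run inside the toolbox environment
curl -fsSL https://get.docker.com -o /tmp/get-docker.sh
sh /tmp/get-docker.sh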
Alternatively, it's possible to simply install docker and docker-compose on a standard VM (i.e. without necessarily using a Google Container OS machine type) although the suitability of this depends on your use case.

Datalab - how to install and keep packages

I decided to try and use Google Cloud Datalab for a small project that I'm working on rather than a Jupyter Notebook in an Anaconda environment on an AWS instance.
How can I install a package (for example OpenCV) onto the Datalab VM so that I don't have to reinstall it every time I restart my VM? Why do the packages disappear after every restart but the updated notebooks remain persistent? Any help answering these questions and clarifying how the Datalab VM works would be very helpful.
The notebooks are stored in a docker volume mount that represents a location on the persistent disk that is maintained across restarts of the VM.
The packages you install however are stored in the running container and hence lost on each restart.
You could create a custom docker image and use that instead. On the datalab create command, see the --image-name argument.
Here is an example of a Dockerfile you'll want to use:
FROM gcr.io/cloud-datalab/datalab:latest
RUN pip install opencv-python
Note that you'll need to build the docker image using this Dockerfile and push the image to Google Container Registry. My memory is a bit fuzzy on this, but it is possible this image needs to be marked as public.
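For reference, the build-and-push step might look roughly like this; PROJECT_ID, the image name, and the instance name are placeholders.
# Build the image from the Dockerfile above and push it to Container Registry.
docker build -t gcr.io/PROJECT_ID/datalab-opencv .
gcloud auth configure-docker              # allow docker to push to gcr.io
docker push gcr.io/PROJECT_ID/datalab-opencv
# Then launch Datalab with the custom image:
datalab create --image-name gcr.io/PROJECT_ID/datalab-opencv my-datalab-vm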
Hope that helps!

Can I run gcloud components update?

Will updating gcloud components from within my Google Cloud Shell instance persist?
Will updating anything, like Go or NPM, that is pre-installed with Google Cloud Shell persist?
Yes, depending upon where you install those tools.
When you start a new Cloud Shell session, you get a disk of your own, and the system image is constructed from a template. So any changes you make to your disk will persist, while anything you do to the core image will not.
All the pre-installed tools are part of the system image, which is updated for all users and maintained by the GCP team. If you update or switch versions there, those changes will not persist.
But if you want to install custom tools, or pin a specific version, you can install them under your $HOME. Those tools will live on your disk and hence will persist across termination/relaunches.
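For example, to pin your own Go version independently of the pre-installed one (the version number and paths below are only illustrative):
# Download a specific Go release into $HOME so it survives Cloud Shell restarts.
curl -sSL https://go.dev/dl/go1.21.5.linux-amd64.tar.gz -o /tmp/go.tar.gz
mkdir -p "$HOME/tools"
tar -C "$HOME/tools" -xzf /tmp/go.tar.gz
# Put it ahead of the system Go on PATH for future sessions.
echo 'export PATH="$HOME/tools/go/bin:$PATH"' >> "$HOME/.bashrc"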

How to manually trigger rsync using vagrant and vagrant-aws?

I'm using Vagrant to deploy chef scripts to an AWS server (and it mostly works awesome). I have set up a local rsync in my Vagrantfile to mirror my dev directory onto the server.
config.vm.synced_folder "../geoevents", "/vagrant/geoevents-repo"
This syncs fine on 'vagrant provision'. I'm wondering whether there is an easier way to have vagrant trigger only that rsync, or to control how often the rsync occurs.
Or, should I not be using rsync, but instead mount a shared file system?
Vagrant CLI now has two new commands, vagrant rsync and vagrant rsync-auto which can do the job.
Command: vagrant rsync
This command forces a re-sync of any rsync synced folders.
Note that if you change any settings for the rsync synced folders, such as exclude paths, you will need to run vagrant reload before this command will pick up those changes.
https://www.vagrantup.com/docs/cli/rsync.html
https://www.vagrantup.com/docs/cli/rsync-auto.html
https://www.vagrantup.com/docs/cli/non-primary.html
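In practice the workflow is just the following (these subcommands require Vagrant 1.5 or newer):
# One-off re-sync of every rsync synced folder to the AWS instance:
vagrant rsync
# Or watch the local folders and re-sync automatically on change:
vagrant rsync-auto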
Currently, the following plugin may fit your needs:
https://github.com/cromulus/vagrant-rsync
By the way, most of the plugin features will be available in Vagrant 1.5 (currently in development).
The vagrant-rsync plugin is deprecated as of Vagrant 1.5. One solution out there is vagrant-unison. You may also check out this discussion. A plain vagrant reload should also work.