Memory error in Google Cloud Platform AI Jupyter notebook

I am trying to run a sentiment analysis on Google Cloud Platform (AI Platform). When I try to split the data into training and test sets, it shows a memory error like the one below:
MemoryError: Unable to allocate 194. GiB for an array with shape (414298,) and data type <U125872
How do I increase the memory size accordingly? Should I change the machine type of the instance? If so, which setting would be appropriate?

From the error, it seems the VM is out of memory.
1 - Create a new Notebook with another Machine type.
For this, go to AI Platform > Notebooks and click on NEW INSTANCE. Select the option that best fits you (R 3.6, Python 2 and 3, etc.) and click on ADVANCED OPTIONS in the pane that pops up. In the Machine configuration area you can pick a machine type with more memory.
Please start with n1-standard-16 or n1-highmem-8, and if neither of those works, jump to n1-standard-32 or n1-highmem-16.
You can also change the machine type of an existing instance from the command line:
gcloud compute instances set-machine-type INSTANCE_NAME \
--machine-type NEW_MACHINE_TYPE
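Note that the machine type of an existing VM can only be changed while it is stopped. A minimal sketch, where the instance name and zone are placeholders for your own values:
# Stop the notebook VM, resize it, then start it again
gcloud compute instances stop my-notebook-instance --zone=us-central1-a
gcloud compute instances set-machine-type my-notebook-instance \
    --zone=us-central1-a --machine-type=n1-highmem-16
gcloud compute instances start my-notebook-instance --zone=us-central1-a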
2 - Change the dtype.
If you are working with the np.float64 type, you can change it to np.float32 in order to reduce memory usage. For example, you could modify the line:
result = np.empty(self.shape, dtype=dtype)
to:
result = np.empty(self.shape, dtype=np.float32)
If you don't want to modify your code, I suggest you follow the first option.

Changing the machine type to one with enough resources is necessary but might not be sufficient. As indicated here, the Jupyter systemd service settings also need to allow for greater memory usage. Make sure of this by trying the following steps:
Open a terminal on your Jupyter instance and run the following command:
sudo nano /lib/systemd/system/jupyter.service
Check whether the MemoryHigh and MemoryMax parameters in the text editor that opens (like the example shown below) are set to your desired capacity. If not, change them.
[Unit]
Description=Jupyter Notebook
[Service]
Type=simple
PIDFile=/run/jupyter.pid
CPUQuota=97%
MemoryHigh=3533868160
MemoryMax=3583868160
ExecStart=/bin/bash --login -c '/opt/conda/bin/jupyter lab --config=/home/jupyter/.jupyter/jupyter_notebook_config.py'
User=jupyter
Group=jupyter
WorkingDirectory=/home/jupyter
Restart=always
[Install]
WantedBy=multi-user.target
Save and exit.
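After saving, reload systemd and restart the Jupyter service so the new limits take effect; a minimal sketch, assuming the unit is named jupyter.service as above:
# Pick up the edited unit file and restart Jupyter
sudo systemctl daemon-reload
sudo systemctl restart jupyter.service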
Finally, run the following command on the terminal:
echo 1 | sudo tee /proc/sys/vm/overcommit_memory
This will allow full usage of the VM resources on the Jupyter instance.

Related

Expose kernel for existing environment on Vertex AI User Managed Notebook startup

This is a continuation of this thread, posted here because it was too complicated for a comment.
TL;DR
In a Vertex AI User Managed Notebook, how does one retain the exposed kernel icons for existing venv (and conda, if possible) environments stored on the data disk, through repeated stop and start cycles?
Details
I am using User Managed Notebook Instances built off a Docker image. Once the Notebook is launched, I manually go in and create a custom environment. For the moment, let's say this is a venv Python environment. The environment works fine and I can expose the kernel so it shows as an icon in the JupyterLab Launcher. If I shut the instance down and restart it, the icon is gone. I have been trying to create a start-up script that re-exposes the kernel, but it is not working properly. I have been trying to use method #2 proposed by @gogasca in the link above. Among other operations (which do execute correctly), my start-up script contains the following:
cat << 'EOF' > /home/jupyter/logs/exposeKernel.sh
#!/bin/bash
set -x
if [ -d /home/jupyter/envs ]; then
    # For each env creation file...
    for i in /home/jupyter/envs/*.sh; do
        tempName="${i##*/}"
        envName=${tempName%.*}
        # If there is a corresponding env directory, then expose the kernel
        if [ -d /home/jupyter/envs/${envName} ]; then
            /home/jupyter/envs/${envName}/bin/python3 -m ipykernel install --prefix=/root/.local --name $envName &>> /home/jupyter/logs/log.txt
            echo -en "Kernel created for: $envName \n" &>> /home/jupyter/logs/log.txt
        else
            echo -en "No kernels can be exposed\n" &>> /home/jupyter/logs/log.txt
        fi
    done
fi
EOF
chown root /home/jupyter/logs/exposeKernel.sh
chmod a+r+w+x /home/jupyter/logs/exposeKernel.sh
su -c '/home/jupyter/logs/exposeKernel.sh' root
echo -en "Existing environment kernels have been exposed\n\n" &>> /home/jupyter/logs/log.txt
I am attempting to log the operations, and I see in the log that the kernel is created successfully in the same location where it would be created if I were to manually activate the environment and expose the kernel from within. Despite the apparent success in the log (no errors, at least), the kernel icon does not appear. If I manually run the exposeKernel.sh script from the terminal using su -c '/home/jupyter/logs/exposeKernel.sh' root, it also works fine and the kernel is exposed correctly. @gogasca's comments on the aforementioned thread suggest that I should be using the jupyter user instead of root, but repeated testing and logging indicate that the jupyter user fails to execute the code while root succeeds (though neither creates the kernel icon when called from the start-up script).
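One way I have been checking where the kernelspecs actually end up after boot is to list them for both users from a terminal; a hedged debugging sketch, assuming Jupyter lives under /opt/conda as on the standard Deep Learning VM images:
# List the kernelspecs visible to root and to the jupyter user (the conda path is an assumption)
sudo /opt/conda/bin/jupyter kernelspec list
sudo -u jupyter /opt/conda/bin/jupyter kernelspec list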
Questions:
(1) My goal is to automatically re-expose the existing environment kernels on startup. Presumably they disappear each time the VM is stopped and started because there is some kind of linking to the boot disk that is rebuilt each time. What is the appropriate strategy here? Is there a way to build the environments (interested in both conda and venv) so that their kernel icons don't vaporize on shut-down?
(2) If the answer to (1) is no, then why does the EOF-created file fail to accomplish the job when called from a start-up script?
(3) Tangentially related, am I correct in thinking that the post-startup-script executes only once during the initial Notebook instance creation process, while the startup-script or startup-script-url executes each time the Notebook is started?

Copy files to Container-Optimised OS from a GCP Storage bucket

How can one download files from a GCP Storage bucket to a Container-Optimised OS (COS) on instance startup?
I know of the following solutions:
gcloud compute copy-files
SSH through console
SCP
Yet all of these have to be done manually and externally after an instance is started.
There is also cloud-init, yet I can't find any info on how to copy files from a Storage bucket. Examples seem to suggest that it's better to include the content of files directly in the cloud-init file, which is not something I want to do for security reasons. Is it possible to download files from a Storage bucket using cloud-init?
I considered using a startup script, yet COS lacks CLI tools such as gcloud or gsutil, so I can't run any such commands in a startup script.
I know I could copy the files manually and then save the image as a boot disk, but I'm hoping there are solutions that avoid having to do so.
Most of all, I'm assuming I'm not asking for something impossible, given that COS instance setup allows me to specify Docker volumes that I could mount onto the starting container. This seems to suggest I should be able to have some private files on the instance the moment COS will attempt to run my image on startup. But how?
Trying to execute a startup-script with a cloud-sdk image and copying files there as suggested by Guillaume didn't work for me for a while, showing this log. Eventually I realised that the cloud-sdk image is 2.41GB when uncompressed and takes over 2 minutes to complete pulling. I tried again with an empty COS instance and the startup script completed successfully, downloading the data from a Storage bucket.
However, a 2.41GB image and over 2 minutes of boot time sound like overkill to download a 2KB file, don't they?
I'm glad to see a working solution to my question (thanks Guillaume!) although I'm still wondering: isn't there a nicer way to do this? I feel that this method is even less tidy than manually putting the files on the COS instance and then creating a machine image to use in the future.
Based on Guillaume's answer I created and published a gsutil wrapper image, available as voyz/gsutil_wrap. This way I am able to run a startup-script with the following command:
docker run -v /host/path:/container/path \
--entrypoint gsutil voyz/gsutil_wrap \
cp gs://bucket/path /container/path
It's essentially a copy of what Guillaume suggested, except it is using an image containing only a minimum setup required to run gsutil. As a result it weighs 0.22GB and pulls within 10-20 seconds on average - as opposed to 2.41GB and over 2 minutes respectively for the google/cloud-sdk image suggested by Guillaume.
Also, credit to this incredibly useful StackOverflow answer that allows gsutil to use the default service account for authentication.
The startup-script is the correct place to do this. And YES, COS lacks some useful libraries.
BUT you can run container! And, for example, the Google Cloud SDK container!
So, add this startup-script in the VM metadata:
key -> startup-script
value ->
docker run -v /local/path/to/copy/files:/dummy/container/path \
--entrypoint gsutil google/cloud-sdk \
cp gs://your_bucket/path/to/file /dummy/container/path
Note: the startup script runs as root. Perform a chmod/chown in your startup script if you need to change the file access mode.
Let me know if you need more explanation of this command line.
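For example, a hedged sketch of the chmod/chown step, appended after the docker run in the startup script (the user name is a placeholder):
# Hypothetical follow-up: hand the downloaded file to a non-root user and tighten its mode
chown my-user:my-user /local/path/to/copy/files/file
chmod 640 /local/path/to/copy/files/file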
Of course, with a fresh COS image, the startup time is quite long (it has to pull the container image and extract it).
To reduce the startup time, you can "bake" your image. I mean, start with a COS image, download/install what you want on it (or only perform a docker pull of the google/cloud-sdk container) and create a custom image from this.
This way, all the required dependencies will already be present on the image and the boot will be quicker.
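A hedged sketch of baking such an image, assuming you have prepared an instance named cos-base whose boot disk carries the same name:
# Stop the prepared VM, then create a custom image from its boot disk
gcloud compute instances stop cos-base --zone=us-central1-a
gcloud compute images create cos-with-cloud-sdk \
    --source-disk=cos-base --source-disk-zone=us-central1-a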

Is there a way to set overcommit memory in a Google Cloud notebook instance?

I've set up a jupyter notebook instance on the Google Cloud AI Platform and am able to see that the overcommit memory is set to 0 by using the command:
cat /proc/sys/vm/overcommit_memory
However, when I try to set the value to 1 by using the command echo 1 > /proc/sys/vm/overcommit_memory, it gives me the following error:
bash: /proc/sys/vm/overcommit_memory: Permission denied
I know that this is likely due to something in the IAM and changes would need to be made there, but exactly what policy do I need to add or alter to my current project so that I have permission to change the overcommit memory type?
Use this:
echo 1 | sudo tee /proc/sys/vm/overcommit_memory
I tried it in my instance and it works.
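Note that writing to /proc only lasts until the next reboot. A minimal sketch for persisting the setting across restarts, with a drop-in file name chosen here for illustration:
# Persist vm.overcommit_memory=1 via sysctl and apply it now
echo "vm.overcommit_memory=1" | sudo tee /etc/sysctl.d/90-overcommit.conf
sudo sysctl --system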
Source

Google Cloud Platform: cloudshell - is there any way to "keep" gcloud init configs?

Does anyone know of a way to persist configurations done using "gcloud init" commands inside cloudshell, so they don't vanish each time you disconnect?
I figured out how to persist Python pip installs using the --user flag, for example: pip install --user pandas
But, when I create a new configuration using gcloud init, use it for a bit, close cloudshell (or cloudshell times out on me), then reconnect later, the configurations are gone.
Not a big deal, but I bounce between projects etc., so it would be nice to have the configs saved so I can simply run
gcloud config configurations activate config-name
Thanks...Rich Murnane
Google Cloud Shell only persists data in your $HOME directory. Commands like gcloud init modify environment variables and store configuration files in /tmp, which is deleted when the VM is restarted. The VM is terminated after being idle for 20 minutes or 60 minutes, depending on which document you read.
Google Cloud Shell is a Docker container. You can modify the Docker image to customize it to fit your needs. This method will allow you to install packages, tools, etc. that are not located in your $HOME directory.
You can also store your files and configuration scripts on Google Cloud Storage. Modify .bashrc to download your cloud files and run your configuration script.
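For example, a hedged sketch of the .bashrc approach, where the bucket and script names are placeholders:
# Appended to ~/.bashrc: fetch and run a personal configuration script on each session
gsutil -q cp gs://my-config-bucket/gcloud-setup.sh "$HOME/gcloud-setup.sh" && bash "$HOME/gcloud-setup.sh"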
Either method will allow you to create a persistent environment.
This StackOverflow answer covers in detail what gcloud init does and how to basically emulate the same thing via script or command line.
gcloud init details
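For example, a hedged sketch of emulating gcloud init with plain config commands (the account, project and zone are placeholders):
# Roughly what gcloud init would set up interactively
gcloud config configurations create my-config
gcloud config set account you@example.com
gcloud config set project my-project-id
gcloud config set compute/zone us-central1-a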
This isn't exactly what I wanted, but since my account (userid) isn't changing, I'm simply going to run the command
gcloud config set project second-project-name
good enough, thanks...Rich

Getting Data From A Specific Website Using Google Cloud

I have a machine learning project and I have to get data from a website every 15 minutes. And I cannot use my own computer, so I will use Google Cloud. I am trying to use Google Compute Engine and I have a script for getting data (here is the link: https://github.com/BurkayKirnik/Automatic-Crypto-Currency-Data-Getter/blob/master/code.py). This script gets data every 15 mins and writes it down to CSV files. I can run this code by opening an SSH terminal and executing it from there, but it stops working when I close the terminal. I tried to run it by executing it in a startup script, but it doesn't work this way either. How can I run this and save the CSV files? BTW, I have to install an API to run the code and I am doing it in the startup script. There is no problem in this part.
Instances running in Google Cloud Platform can be configured with the same tools available in the operating system they are running. If your instance is a Linux instance, the best method would be to use a cron job to execute your script repeatedly at your chosen interval.
Once you have accessed the instance via SSH, you can open the crontab configuration file by running the following command:
$ crontab -e
The above command will provide access to your personal crontab configuration (for the user you are logged in as). If you want to run the script as root you can use this instead:
$ sudo crontab -e
You can now edit the crontab configuration and add an entry that tells cron to execute your script at your required interval (in your case every 15 minutes).
Therefore, your crontab entry should look something like this:
*/15 * * * * /path/to/your/script.sh
Notice that the first field is for minutes, so by using */15, you are telling the cron daemon to execute the script once every 15 minutes.
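For example, a hedged sketch of an entry that runs the Python script from the question and appends its output to a log file (the paths are placeholders):
*/15 * * * * /usr/bin/python3 /home/your_user/code.py >> /home/your_user/data_getter.log 2>&1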
Once you have edited the crontab configuration file, it is a good idea to restart the cron daemon to ensure the change you made will take place. To do this you can run:
$ sudo service cron restart
If you would like to check the status to ensure the cron service is running you can run:
$ sudo service cron status
Your script will now execute every 15 minutes.
In terms of storing the CSV files, you could either program your script to store them on the instance, or alternatively use a Google Cloud Storage bucket. Files can be copied to buckets easily by making use of the gsutil command (part of the Cloud SDK) as described here. It's also possible to mount buckets as a file system as described here.
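For example, a hedged sketch of copying the generated CSV files to a bucket (the bucket name is a placeholder):
gsutil cp /home/your_user/*.csv gs://your-data-bucket/crypto-data/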