GCP - run all cells of a Jupyter Notebook without opening the browser and show the logs in the terminal - google-cloud-platform

I started using VM instances in Google Cloud Platform to train deep learning models. On a Linux machine, what is the best way to run all cells of a Jupyter Notebook without opening the browser, just from a command in the terminal? I also want to see all the output in the terminal.

Yes, this is possible, and there are different ways of doing it.
One way is to use runipy, which will run all cells in a notebook.
The source code is here: runipy
You can also save the output as an HTML report or as an executed notebook.
You can install runipy using pip:
$ pip3 install runipy
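For example, a minimal sketch based on runipy's documented usage (the notebook and report names are placeholders):
$ runipy MyNotebook.ipynb                       # run all cells, streaming progress to the terminal
$ runipy MyNotebook.ipynb OutputNotebook.ipynb  # save the executed notebook
$ runipy MyNotebook.ipynb --html report.html    # save an HTML report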
Another method is to use the Python 3 module nbconvert.
It can execute notebooks from the command line, and it also lets you run them programmatically from a Python interactive shell.
See the official nbconvert documentation here: Executing notebooks from the command line
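A rough sketch with nbconvert (file names are placeholders; the timeout flag is optional and only matters for long-running cells):
$ jupyter nbconvert --to notebook --execute MyNotebook.ipynb --output Executed.ipynb --ExecutePreprocessor.timeout=600
The executed copy, including all cell outputs, is written to Executed.ipynb, while nbconvert prints its own progress messages to the terminal as it runs.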

Related

Make custom kernels available to SageMaker Studio notebooks

When I start the SageMaker Studio server, I can only see a set of predefined kernels when I select a kernel for any notebook.
I create conda environments and persist them between sessions by pointing .condarc to a custom miniconda directory stored on EFS.
I want all notebooks to have access to environments stored in the custom miniconda directory. I can do that on the system terminal but can't seem to find a way to make the kernels available to notebooks.
I am aware of Lifecycle Configurations, but those seem to work only with notebook instances rather than SageMaker Studio.
Desired outcomes
Ideally, custom kernels would be persistently available to notebooks, but if that isn't feasible or requires a custom Docker image, I am happy to run a script manually every time I start the server.
What I have tried so far:
I ran the following, which is a tweaked version of the start.sh script intended for a Lifecycle Configuration.
#!/bin/bash
set -e
# Run the kernel registration as the sagemaker-user account
sudo -u sagemaker-user -i <<'EOF'
unset SUDO_UID
# Custom miniconda installation persisted on EFS
WORKING_DIR=/home/sagemaker-user/.SageMaker/custom-miniconda/
source "$WORKING_DIR/miniconda/bin/activate"
# Register every conda environment as a Jupyter kernel
for env in $WORKING_DIR/miniconda/envs/*; do
    BASENAME=$(basename "$env")
    source activate "$BASENAME"
    python -m ipykernel install --user --name "$BASENAME" --display-name "$BASENAME"
done
EOF
That didn't work and I couldn't access the kernels from the notebooks.
If you need a persistent custom kernel in SageMaker Studio, you can create an ECR repository and build a Docker image with your custom environment configuration. This image can then be attached to SageMaker Studio notebooks (reference link).
SageMaker Studio now also supports lifecycle configurations (reference link).
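A rough sketch of the lifecycle-configuration route with the AWS CLI (the config name and script file are placeholders, and the exact attachment settings depend on your domain setup):
aws sagemaker create-studio-lifecycle-config \
    --studio-lifecycle-config-name register-conda-kernels \
    --studio-lifecycle-config-app-type KernelGateway \
    --studio-lifecycle-config-content "$(base64 -w0 register-kernels.sh)"
The resulting ARN can then be attached to the Studio domain or to a user profile (for example with aws sagemaker update-user-profile) so the script runs each time an app starts.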

bq command not found

I am currently working on a Windows machine and installed WSL to be able to work in a Linux environment.
I installed the Google Cloud SDK and am able to run gsutil and gcloud commands.
However, when I try to run bq, I get a "bq: command not found" error.
Can someone help me here?
"bq" is one of the default Cloud SDK components, and gets installed by default.
Please check with the command "gcloud components list" to confirm if "bq" is available.
If not, maybe somehow your installation got corrupted. Please try re-installing to fix this issue.
Otherwise, try running these commands, see how the path for all are set and same like "/usr/bin" in the given example. This may reveal some path setting related issues which need to be fixed.
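For example (a hedged sketch; the exact paths will differ per installation):
gcloud components list
which gcloud
which gsutil
which bq
If bq is missing from the components list, it can usually be added with gcloud components install bq (assuming the SDK was installed with the interactive installer rather than a system package manager).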
I've run into a similar issue when working on a Windows environment. I have found that calling bq.cmd helps to get the BigQuery commands to execute.
So running:
bq.cmd ls
instead of running:
bq ls
to list the datasets in your current project.
In WSL2, install the Google Cloud CLI with this command as shown in the documentation.
curl https://sdk.cloud.google.com | bash
Then restart your WSL installation. After that, the bq command works both at a Windows command prompt and in a WSL terminal.
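To pick up the new installation and verify it, something along these lines should work (a hedged sketch):
wsl --shutdown    (run from a Windows command prompt; restarts the WSL VM)
bq version        (run back in the WSL terminal; prints the BigQuery CLI version if bq is on the PATH)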

Restart Jupyter Lab server running in the background

I'm trying to restart a Jupyter Lab server (not just the kernels) running in the background of an AWS SageMaker notebook instance. I have already tried the following:
Killing the server by its process ID
pgrep doesn't show me the process
pkill can't find the process
ps aux shows the process ID as constantly changing
Stopping the server through jupyter notebook stop
I get an SSL error and nothing happens
The only thing I've been able to do is reboot the entire instance, which isn't a great option as it can take a while to become available again.
Edit 1:
The main reason I am trying to do this is that after installing the tqdm package and trying to use tqdm.notebook in Jupyter Lab, I need to enable/install the notebook and lab extensions for it to display correctly. For those extensions to take effect, the server then needs to be restarted.
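For reference, the extension steps I mean look roughly like this (assuming ipywidgets and a JupyterLab version older than 3.0, which still needs the manual labextension install; newer versions ship the widget extension together with ipywidgets):
pip install ipywidgets
jupyter nbextension enable --py widgetsnbextension
jupyter labextension install @jupyter-widgets/jupyterlab-manager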
Try this:
In the left-hand navbar, open Commands.
Navigate to the Help section in the pop-out menu.
Select Reset Application State.
Both classic Jupyter and JupyterLab live within the same process.
sudo initctl restart jupyter-server --no-wait is what AWS suggests in https://forums.aws.amazon.com/thread.jspa?messageID=917594#917594
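On newer notebook instances based on Amazon Linux 2, which use systemd rather than upstart, the equivalent is presumably (an assumption worth verifying on your instance):
sudo systemctl restart jupyter-server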
Assuming it runs on port 8888:
jupyter lab stop 8888 && jupyter lab

Programming the Pepper robot without the "Choregraphe" software?

Usually a developer uses SoftBank's own software, Choregraphe, to put programs on the Pepper robot.
Isn't there a way to set up a different development environment? For example, access via SSH and creating Python scripts with a simple text editor, then starting the scripts manually. In other words, writing and starting Python scripts for Pepper without using Choregraphe.
You can also use qibuild (pip install qibuild): https://github.com/aldebaran/qibuild
It contains a qipkg command; just run
qipkg deploy-package path/to/your/file.pml --url USER@IP:/home/nao
A .pml file is a project; it is created by Choregraphe, or you can use this tool:
https://github.com/pepperhacking/robot-jumpstarter
in order to get a sample app.
Of course, using Choregraphe is not an obligation; you can use the different SDKs directly.
You can, for instance, create a Python script on your computer and copy it onto the robot:
scp path/to/script/myscript.py nao@robotIp:
Then ssh onto the robot and launch the script:
ssh nao@robotIp
python myscript.py
You can also ssh onto the robot, create a script there (using nano, for instance) and launch it from the robot.
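For a quick test without writing any files, once you are connected over SSH you can also call a NAOqi service directly with the qicli tool that ships with NAOqi 2.x (a hedged example; the spoken text is a placeholder):
ssh nao@robotIp
qicli call ALTextToSpeech.say "Hello from the terminal"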
I've been using PyCharm Pro for 6 months and I am happy with it. You get automatic deployment and remote debugging. The most basic setup must still be done with Choregraphe, but it takes less than one minute.

Setting up an Apache Spark Cluster on Amazon EC2 Using CMD

I am working on my graduation project, and it's my first time dealing with Spark and EC2,
so I am following the steps in this blog:
http://www.supergloo.com/fieldnotes/apache-spark-cluster-amazon-ec2-tutorial/#comment-3843
The problem is that the author works on a Mac, and I don't know how to make these commands work on Windows (CMD).
For example, this command:
ec2/spark-ec2 --key-pair=courseexample --identity-file=courseexample.pem launch spark-cluster-example
Any help?
Try running it in the following way (from the same folder):
python -Wdefault "ec2\spark_ec2.py" --key-pair=courseexample --identity-file=courseexample.pem launch spark-cluster-example
If you don't know how to open a console in Windows, just press Start -> Run, type cmd and hit Enter; then you need to navigate to your Spark home folder and execute the above command.
NOTE: I don't currently own a Windows machine, so I haven't tried this command myself.
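Also note that spark_ec2.py expects your AWS credentials in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables; in cmd these are set with set rather than export (a hedged sketch, values are placeholders):
set AWS_ACCESS_KEY_ID=your-access-key-id
set AWS_SECRET_ACCESS_KEY=your-secret-access-key
python -Wdefault "ec2\spark_ec2.py" --key-pair=courseexample --identity-file=courseexample.pem launch spark-cluster-example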