Copy a SageMaker Notebook to SageMaker Studio - amazon-web-services

My colleague made a notebook in SageMaker but I want to copy that notebook into SageMaker Studio so that future collaborations and changes are smoother.
From Studio, I can't see anything that relates it to SageMaker classic.
Any advice?

If it's a single notebook, download it from Jupyter on the notebook instance (File menu in Jupyter), then upload it to Studio notebooks.
If it's many notebooks, open a terminal on the notebook instance and copy them to S3; then, in Studio, copy them from S3 into the Studio storage.
aws s3 cp /tmp/my_notebook s3://my_bucket/my_notebook
aws s3 cp s3://my_bucket/my_notebook /tmp/my_notebook
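For a whole directory of notebooks, a recursive copy is usually easier. A minimal sketch, assuming a hypothetical bucket named my_bucket and an instance role that can read and write it:
# On the notebook instance: push every .ipynb under SageMaker/ to the bucket
aws s3 cp /home/ec2-user/SageMaker/ s3://my_bucket/notebooks/ --recursive --exclude "*" --include "*.ipynb"
# In a Studio terminal: pull them down into a local folder
aws s3 cp s3://my_bucket/notebooks/ ./notebooks/ --recursive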

Related

Is it possible to access sagemaker jupyter notebook from intellij IDE?

I have deployed a model via a Jupyter notebook on a SageMaker instance.
Now I am wondering: is there any way to access the SageMaker Jupyter notebook from the IntelliJ IDE?
I am looking for a way to set up an environment for working with peers so that I can get code reviews.
I can see that I can control AWS Lambda functions via the terminal, but I'm not sure about Jupyter notebooks on a SageMaker instance.

How does one move python and other types of files from one GCP notebook instance to another?

I have a Vertex AI notebook instance that contains a lot of Python and Jupyter notebook files, as well as pickled data files. I need to move these files to another notebook instance. There isn't a lot of documentation on Google's help center.
Has someone had to do this yet? I'm new to GCP.
You can try the steps in this article: copy your files to a Google Cloud Storage bucket, then move them to the new notebook using the gsutil tool.
In your notebook's terminal, run this command to copy your files to your Cloud Storage bucket:
gsutil cp -R /home/jupyter/* gs://BUCKET_NAME/PATH
Then open a terminal on the target notebook and run this command to copy the directory into the notebook:
gsutil cp -R gs://BUCKET_NAME/PATH/* /home/jupyter/
Just change BUCKET_NAME and PATH to your Cloud Storage bucket name and the path you used.
I'm assuming that both notebooks are in the same GCP project and that you have the same permissions on both.
There are many ways to do that; listing some here:
The hardest to execute, but the simplest in concept: download everything from the original notebook instance to your computer/workstation, then go to the second notebook instance and upload everything.
Use Google Cloud Storage, the object storage service, as the medium for moving the files. To do that you need to (1) create a storage bucket, (2) use the original notebook instance's terminal to copy the data from the instance to the bucket, and (3) use the terminal on the target notebook instance to copy the data from the bucket to the instance; a sketch of these steps follows below.
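A minimal sketch of those three steps, assuming a hypothetical bucket name my-transfer-bucket and that both instances run in the same project with access to Cloud Storage:
# (1) create a bucket (the name is a placeholder and must be globally unique)
gsutil mb gs://my-transfer-bucket
# (2) on the source instance: copy the Jupyter home directory into the bucket
gsutil -m cp -R /home/jupyter/* gs://my-transfer-bucket/
# (3) on the target instance: copy everything back down
gsutil -m cp -R gs://my-transfer-bucket/* /home/jupyter/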

How to create a notebook in EMR Studio using boto3?

I am going through the boto3 documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.create_studio
but I cannot see any sort of create/delete notebook operation for EMR Studio, only create/delete studio.
How can I create an emr-studio-notebook that preloads a notebook stored somewhere on S3, via boto3?
Create/delete notebook (Workspace) operations can only be performed through the EMR Studio UI; there is no CLI/SDK for them as of today. You can create a Workspace from EMR Studio and upload your existing notebook file via the JupyterLab UI.

How can I schedule a .ipynb notebook in SageMaker using AWS Lambda and a Lifecycle Configuration?

I want to schedule my .ipynb file with AWS Lambda. I am following the steps of this publication: https://towardsdatascience.com/automating-aws-sagemaker-notebooks-2dec62bc2c84. Starting and stopping the notebook instance works very well, but my .ipynb file is not executing, even though I wrote the lifecycle configuration exactly as in the publication.
I only changed these lines to point to my notebook instance's source:
"NOTEBOOK_FILE="/home/ec2-user/SageMaker/Test Notebook.ipynb"
/home/ec2-user/anaconda3/bin/activate "$ENVIRONMENT"
"source /home/ec2-user/anaconda3/bin/deactivate".
CloudWatch is working very well for the notebook instance, but the .ipynb file is not executed.
Can someone help me with this problem?
Check out this aws-sample of how to run a notebook in AWS SageMaker.
This document shows how to install and run the sagemaker-run-notebook library, which lets you run and schedule Jupyter notebook executions as SageMaker Processing Jobs.
This library provides three interfaces to the notebook execution functionality:
A command line interface (CLI)
A Python library
A JupyterLab extension that can be enabled for JupyterLab running locally, in SageMaker Studio, or on a SageMaker notebook instance
https://github.com/aws-samples/sagemaker-run-notebook
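As a rough sketch of the CLI interface (the command name and flags below follow that repo's README, so treat them as an assumption rather than a verified recipe):
# Run a notebook once as a SageMaker Processing Job
run-notebook run "Test Notebook.ipynb" -p some_param=42
# Run it every night at 08:00 UTC
run-notebook schedule --at "cron(0 8 * * ? *)" --name nightly-run "Test Notebook.ipynb"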
Also, check out this example of scheduling Jupyter notebooks on SageMaker: you can write code in a Jupyter notebook and run it on an Amazon SageMaker ephemeral instance with the click of a button, either immediately or on a schedule. With the tools provided there, you can do this from anywhere: at a shell prompt, in JupyterLab on Amazon SageMaker, in another JupyterLab environment you have, or automated in a program you've written.
https://aws.amazon.com/blogs/machine-learning/scheduling-jupyter-notebooks-on-sagemaker-ephemeral-instances/

How to serve previously created Jupyter Notebook on Google Cloud Platform

I have a previously created Jupyter notebook that I'd like to run on Google Cloud Platform.
I currently have a notebook instance running on a GCP VM and it works fine. I was also able to create a storage bucket and upload all the dataset and notebook files to the bucket. However, these files don't show up in the Jupyter notebook directory tree. I know I can access the dataset files using something like...
from io import BytesIO
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('name-of-bucket')
blob = storage.Blob('directory/to/files', bucket)
fid = BytesIO(blob.download_as_string())
But I'm not sure how to actually serve up a notebook file to use, and I really don't feel like copying and pasting all my previous work.
All help appreciated!
Very simple. You can upload directly from within the Jupyter Notebook and bypass the bucket if desired (the icon with the up arrow).
[Image: Jupyter Notebook upload icon]
The only issue with this is you can't upload folders, so zip the folder first then upload it.
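For example, bundling a project folder before and after the upload might look like this (a sketch; the folder and archive names are placeholders):
# On your machine: zip the folder, then upload project.zip through the Jupyter UI
zip -r project.zip my_project/
# In a terminal on the notebook instance: unpack it into the Jupyter home directory
unzip project.zip -d /home/jupyter/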
You can use JupyterLab's Git extension to host your notebooks on GitHub and pull them from there.
FYI, if you use GCP's AI Platform Notebooks you'll get a pre-configured Jupyter environment with many ML/DL libraries pre-installed, and that Git extension will be pre-installed as well.
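Pulling the notebooks down then only takes a clone (the repository URL below is a placeholder), either through the extension's UI or from a terminal:
# In a terminal on the notebook instance (the Git extension offers the same via its UI)
git clone https://github.com/your-user/your-notebooks.git /home/jupyter/your-notebooks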