I am going through the boto3 documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.create_studio
but I cannot see any create/delete notebook operation for EMR Studio, only create/delete studio.
How can I use boto3 to create an EMR Studio notebook that preloads a notebook stored somewhere on S3?
Create/delete notebook (Workspace) operations can only be performed through the EMR Studio UI; no CLI or SDK operations are available for them as of today. You can create a Workspace from EMR Studio and upload your existing notebook file through the JupyterLab UI.
Related
I am trying to use Glue interactive sessions in a SageMaker notebook by configuring the glue-conda-pyspark kernel through AWS lifecycle configurations. This worked earlier when I created a notebook instance. Now the instance runs with the configuration, but I can no longer see the conda Glue PySpark kernel in the kernel list. Could anybody help with the create script and start script needed to run the notebook with glue-pyspark?
I am configuring it using this AWS doc: https://docs.aws.amazon.com/glue/latest/dg/interactive-sessions-sagemaker.html#is-sagemaker-existing
and I also took help from the AWS sample scripts on GitHub: https://github.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/blob/master/scripts/install-conda-package-single-environment/on-start.sh
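For reference, lifecycle scripts can also be registered from boto3 rather than the console, which makes it easier to re-apply them when a kernel goes missing. This is only a rough sketch: the config name and the script bodies are placeholders (the actual kernel install commands should come from the AWS guide linked above), and `lifecycle_config_request` and `register_lifecycle_config` are hypothetical helper names. The part that is easy to miss is that the script content has to be base64-encoded.

```python
import base64

# Placeholder scripts: replace the comment lines with the commands from the AWS guide.
ON_CREATE = "#!/bin/bash\nset -e\n# one-time setup commands go here\n"
ON_START = "#!/bin/bash\nset -e\n# commands to run on every instance start go here\n"


def encode_script(script_text):
    """Base64-encode a shell script, as the lifecycle config API expects."""
    return base64.b64encode(script_text.encode("utf-8")).decode("ascii")


def lifecycle_config_request(name, on_create, on_start):
    """Build the keyword arguments for create_notebook_instance_lifecycle_config."""
    return {
        "NotebookInstanceLifecycleConfigName": name,
        "OnCreate": [{"Content": encode_script(on_create)}],
        "OnStart": [{"Content": encode_script(on_start)}],
    }


def register_lifecycle_config(name, on_create=ON_CREATE, on_start=ON_START):
    """Create the lifecycle configuration in SageMaker (needs AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3 installed

    sagemaker = boto3.client("sagemaker")
    request = lifecycle_config_request(name, on_create, on_start)
    return sagemaker.create_notebook_instance_lifecycle_config(**request)
```

Calling `register_lifecycle_config("glue-kernel-config")` (a made-up name) would create the configuration, which then still has to be attached to the notebook instance.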
My colleague made a notebook in SageMaker but I want to copy that notebook into SageMaker Studio so that future collaborations and changes are smoother.
From Studio, I can't see anything that relates it to SageMaker classic.
Any advice?
If it's a single notebook, download it from Jupyter on the notebook instance (Jupyter File menu), then upload it to Studio.
If there are many notebooks, open a terminal on the notebook instance and copy them to S3, then in Studio copy them from S3 to the Studio storage:
aws s3 cp /tmp/my_notebook s3://my_bucket/my_notebook
aws s3 cp s3://my_bucket/my_notebook /tmp/my_notebook
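If there are many notebooks in nested folders, the upload half of this can also be scripted with boto3. The following is a minimal sketch, not a tested migration tool: the `notebooks` prefix and the helper names `notebook_upload_plan` and `upload_notebooks` are made up for illustration, and it assumes the instance role is allowed to write to the bucket.

```python
from pathlib import Path


def notebook_upload_plan(local_dir, bucket, prefix="notebooks"):
    """Return (local_path, bucket, key) tuples for every .ipynb under local_dir."""
    root = Path(local_dir)
    plan = []
    for path in sorted(root.rglob("*.ipynb")):
        relative = path.relative_to(root)
        # Skip Jupyter's autosave copies.
        if ".ipynb_checkpoints" in relative.parts:
            continue
        plan.append((str(path), bucket, f"{prefix}/{relative.as_posix()}"))
    return plan


def upload_notebooks(local_dir, bucket, prefix="notebooks"):
    """Upload every planned notebook to S3 (needs AWS credentials)."""
    import boto3  # imported here so the planning step works without boto3 installed

    s3 = boto3.client("s3")
    for local_path, bucket_name, key in notebook_upload_plan(local_dir, bucket, prefix):
        s3.upload_file(local_path, bucket_name, key)
```

Run `upload_notebooks("/home/ec2-user/SageMaker", "my_bucket")` on the notebook instance (adjust the directory to wherever the notebooks live), then pull the files down in Studio, for example with `aws s3 cp --recursive`.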
I can create a SageMaker notebook instance with aws sagemaker create-notebook-instance --notebook-instance-name test-123
but I can't find a similar CLI command to create a "SageMaker Studio" instance.
Thanks
SageMaker Studio is a web-based IDE for machine learning, with multiple components. At its core, Studio consists of a Domain and a list of user profiles. Each user profile can contain multiple "apps", which can host notebooks, among other features.
See "Onboard to Amazon SageMaker Studio" for details. Each of these can be created through the CLI with the following aws sagemaker commands:
create-domain
create-user-profile
create-app
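The same three steps are available in boto3 as `create_domain`, `create_user_profile` and `create_app`. Below is a rough sketch of how they chain together, assuming IAM auth mode and a default JupyterServer app; the helper names and every argument value are placeholders you would supply yourself. The domain has to reach the InService status before a user profile can be added, so the sketch polls `describe_domain` in between.

```python
import time


def studio_requests(domain_name, vpc_id, subnet_ids, execution_role_arn, user_name):
    """Build keyword arguments for create_domain, create_user_profile and create_app."""
    domain = {
        "DomainName": domain_name,
        "AuthMode": "IAM",  # assumption: IAM auth rather than SSO
        "DefaultUserSettings": {"ExecutionRole": execution_role_arn},
        "SubnetIds": list(subnet_ids),
        "VpcId": vpc_id,
    }
    user = {"UserProfileName": user_name}
    app = {"UserProfileName": user_name, "AppType": "JupyterServer", "AppName": "default"}
    return domain, user, app


def wait_for_domain(sagemaker, domain_id, delay=30, attempts=40):
    """Poll describe_domain until the domain is InService."""
    for _ in range(attempts):
        status = sagemaker.describe_domain(DomainId=domain_id)["Status"]
        if status == "InService":
            return
        if status == "Failed":
            raise RuntimeError(f"Domain {domain_id} failed to create")
        time.sleep(delay)
    raise TimeoutError(f"Domain {domain_id} was not ready in time")


def create_studio(domain_name, vpc_id, subnet_ids, execution_role_arn, user_name):
    """Create a domain, a user profile and a default app (needs AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3 installed

    sagemaker = boto3.client("sagemaker")
    domain, user, app = studio_requests(
        domain_name, vpc_id, subnet_ids, execution_role_arn, user_name
    )
    response = sagemaker.create_domain(**domain)
    # The domain ID is the last segment of the returned ARN.
    domain_id = response["DomainArn"].split("/")[-1]
    wait_for_domain(sagemaker, domain_id)
    sagemaker.create_user_profile(DomainId=domain_id, **user)
    sagemaker.create_app(DomainId=domain_id, **app)
    return domain_id
```

Whether the explicit `create_app` call is needed depends on your setup, so treat that last step as optional.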
I have a transient EMR cluster up and ready, and I want to run a simple PySpark script in an EMR notebook.
Is there any way to create and modify the EMR notebook through Terraform?
Thanks in advance.
As far as I know, AWS says: "You create an EMR notebook using the Amazon EMR console. Creating notebooks using the AWS CLI or the Amazon EMR API is not supported." [AWS Documentation on creating EMR Notebook][1]
You can create a notebook in the console, and it will be stored in S3 as an .ipynb file. By giving its relative path, you can then execute the notebook on the cluster. Refer to the boto3 documentation for more info: [Boto3 Documentation][2]
[1]: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-managed-notebooks-create.html
[2]: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.start_notebook_execution
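A minimal sketch of that `start_notebook_execution` call, assuming the notebook already exists and the cluster is running. The editor ID, relative path, cluster ID and service role are values you need to look up for your own account, and `notebook_execution_request` and `run_notebook` are hypothetical helper names.

```python
def notebook_execution_request(editor_id, relative_path, cluster_id, service_role, name=None):
    """Build the keyword arguments for EMR's start_notebook_execution."""
    request = {
        "EditorId": editor_id,  # ID of the EMR notebook
        "RelativePath": relative_path,  # path of the .ipynb relative to the notebook's S3 location
        "ExecutionEngine": {"Id": cluster_id, "Type": "EMR"},
        "ServiceRole": service_role,
    }
    if name:
        request["NotebookExecutionName"] = name
    return request


def run_notebook(editor_id, relative_path, cluster_id, service_role, name=None):
    """Start the execution and return its ID (needs AWS credentials)."""
    import boto3  # imported here so the request builder works without boto3 installed

    emr = boto3.client("emr")
    request = notebook_execution_request(
        editor_id, relative_path, cluster_id, service_role, name
    )
    response = emr.start_notebook_execution(**request)
    return response["NotebookExecutionId"]
```

The returned ID can then be passed to `describe_notebook_execution` to check on progress.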
Yes, you can create and modify an EMR cluster from Terraform and choose which tools will be installed, but this seems like the "hard way". A SageMaker notebook or the new Glue DataBrew tool would be easier.
I am planning to start a Jupyter notebook instance and execute each notebook file in AWS SageMaker using AWS Step Functions. Can this be achieved?
The AWS Step Functions Data Science SDK is an open source library that allows data scientists to easily create workflows that process and publish machine learning models using SageMaker and Step Functions.
The following example notebooks are available in Jupyter notebook instances in the SageMaker console and in the related GitHub project:
hello_world_workflow.ipynb
machine_learning_workflow_abalone.ipynb
training_pipeline_pytorch_mnist.ipynb