How to run an AWS SageMaker Studio job on a predefined schedule - amazon-web-services

Currently I am developing a model in AWS SageMaker Studio. SageMaker offers multiple options for running a model, such as notebook instances and SageMaker Studio. To schedule a task on a notebook instance, it is known that we need to use AWS Lambda. But I can't find any documentation on how to run a scheduled job in AWS SageMaker Studio.
I would appreciate suggestions on this. I know this is not a good question by Stack Overflow guidelines, since it shows no code, but the problem itself is fairly new, as is AWS SageMaker Studio.

A new feature allows you to Operationalize your Amazon SageMaker Studio notebooks as scheduled notebook jobs

Related

How can we orchestrate and automate data movement and data transformation in AWS sagemaker pipeline

I'm migrating our ML notebooks from Azure Databricks to an AWS environment using SageMaker and Step Functions. I have separate notebooks for data processing, feature engineering, and ML algorithms, which I want to run in sequence, each starting after the previous notebook completes. Can you point me to any resource that shows how to execute SageMaker notebooks in sequence using AWS Step Functions?
For this type of architecture you need to involve some other AWS services as well.
A helpful combination is EventBridge (scheduled rules) triggering a Lambda function, which in turn calls SageMaker to execute your notebooks.
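As a sketch of that combination, a Lambda handler behind an EventBridge scheduled rule could start a (stopped) notebook instance like this. The instance name is a placeholder, and a lifecycle configuration on the instance would then actually run the notebook:

```python
NOTEBOOK_INSTANCE_NAME = "my-scheduled-notebook"  # hypothetical instance name


def is_scheduled_event(event):
    """True if the event came from an EventBridge scheduled rule."""
    return event.get("source") == "aws.events"


def lambda_handler(event, context):
    # Ignore anything that is not the scheduled trigger.
    if not is_scheduled_event(event):
        return {"started": None}

    # boto3 ships with the Lambda runtime; imported lazily so the module
    # can be loaded (and unit-tested) without AWS credentials.
    import boto3

    boto3.client("sagemaker").start_notebook_instance(
        NotebookInstanceName=NOTEBOOK_INSTANCE_NAME
    )
    return {"started": NOTEBOOK_INSTANCE_NAME}
```

The EventBridge rule would use a schedule expression such as `cron(0 6 * * ? *)` and list this Lambda as its target.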
A new feature allows you to Operationalize your Amazon SageMaker Studio notebooks as scheduled notebook jobs. Unfortunately, there is no way yet to tie them together into a pipeline.
The other alternative would be to convert your notebooks to processing and training jobs, and use something like AWS Step Functions, or SageMaker Pipelines to run them as a pipeline.
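The Step Functions route can be sketched as an Amazon States Language definition that chains the two jobs through the `.sync` service integrations; all ARNs, image URIs, and bucket names below are placeholders:

```python
import json

# Minimal ASL sketch: a SageMaker Processing job followed by a Training job.
# Step Functions' ".sync" integration waits for each job to finish before
# moving to the next state.
definition = {
    "StartAt": "Preprocess",
    "States": {
        "Preprocess": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
            "Parameters": {
                "ProcessingJobName.$": "$.preprocess_job_name",
                "RoleArn": "arn:aws:iam::123456789012:role/StepFunctionsSageMakerRole",
                "AppSpecification": {
                    "ImageUri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/preprocess:latest"
                },
                "ProcessingResources": {
                    "ClusterConfig": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.m5.xlarge",
                        "VolumeSizeInGB": 30,
                    }
                },
            },
            "Next": "Train",
        },
        "Train": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "Parameters": {
                "TrainingJobName.$": "$.training_job_name",
                "RoleArn": "arn:aws:iam::123456789012:role/StepFunctionsSageMakerRole",
                "AlgorithmSpecification": {
                    "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/train:latest",
                    "TrainingInputMode": "File",
                },
                "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/models/"},
                "ResourceConfig": {
                    "InstanceCount": 1,
                    "InstanceType": "ml.m5.xlarge",
                    "VolumeSizeInGB": 30,
                },
                "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
            },
            "End": True,
        },
    },
}

print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into `aws stepfunctions create-state-machine --definition` or the Step Functions console.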

Is it possible to trigger Sagemaker notebook from AWS Lambda function?

I am trying to trigger Sagemaker notebook or Sagemaker Studio notebook from AWS Lambda when data is available in S3 bucket. I want to know if this is possible and if yes, how?
All I want is that once data is uploaded to S3, the Lambda function should be able to spin up the SageMaker notebook with a standard CPU cluster.
Here is a Jupyter plugin that you can use to do this; please note it is not managed by AWS. It is experimental software and should be treated as such.
https://github.com/aws-samples/sagemaker-run-notebook
Using this extension, you can run your notebook based on an event.
I work at AWS and my opinions are my own.
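For reference, a similar effect can be sketched without the extension: a Lambda subscribed to the bucket's S3 put events can start a notebook instance via boto3. The instance name is hypothetical, and the event-parsing helper is just an illustration:

```python
def objects_from_s3_event(event):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    return [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in event.get("Records", [])
    ]


def lambda_handler(event, context):
    uploads = objects_from_s3_event(event)
    if uploads:
        # boto3 ships with the Lambda runtime; imported lazily so the
        # module can be loaded without AWS credentials.
        import boto3

        boto3.client("sagemaker").start_notebook_instance(
            NotebookInstanceName="my-classifier-notebook"  # hypothetical
        )
    return {"uploads": uploads}
```

A lifecycle configuration on the notebook instance would then pick up the new object and run the classifier.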

How to create SageMaker Studio environment from CLI?

I can create a SageMaker notebook instance with aws sagemaker create-notebook-instance --notebook-instance-name test-123,
but I can't find a similar CLI command to create a "SageMaker Studio" instance.
Thanks
SageMaker Studio is a web-based IDE for machine learning, with multiple components. At its core, Studio consists of a Domain and a list of user profiles. Each user profile can contain multiple "apps" which can host notebook instances, among other features.
See Onboard to Amazon SageMaker Studio for details. Each of these can be created through the CLI, such as -
create-domain
create-user-profile
create-app
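As a minimal sketch of those three calls through boto3 (the CLI's `create-domain`, `create-user-profile`, and `create-app` take the same parameters), with placeholder VPC, subnet, and role values:

```python
def domain_request(name, vpc_id, subnet_ids, execution_role_arn):
    """Build the CreateDomain request body (mirrors `aws sagemaker create-domain`)."""
    return {
        "DomainName": name,
        "AuthMode": "IAM",
        "DefaultUserSettings": {"ExecutionRole": execution_role_arn},
        "VpcId": vpc_id,
        "SubnetIds": subnet_ids,
    }


def create_studio(name, vpc_id, subnet_ids, execution_role_arn, user="default-user"):
    # boto3 imported lazily; this function needs AWS credentials to actually run.
    import boto3

    sm = boto3.client("sagemaker")
    domain_arn = sm.create_domain(
        **domain_request(name, vpc_id, subnet_ids, execution_role_arn)
    )["DomainArn"]
    domain_id = domain_arn.split("/")[-1]  # arn:...:domain/d-xxxx -> d-xxxx
    sm.create_user_profile(DomainId=domain_id, UserProfileName=user)
    sm.create_app(
        DomainId=domain_id,
        UserProfileName=user,
        AppType="JupyterServer",
        AppName="default",
    )
    return domain_id
```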

Can you start and execute a Jupyter Notebook in Sagemaker using Step Functions?

I am planning to start a Jupyter Notebook instance and execute each notebook file in AWS Sagemaker using AWS Step Functions. Can this be achieved?
The AWS Step Functions Data Science SDK is an open source library that allows data scientists to easily create workflows that process and publish machine learning models using SageMaker and Step Functions.
The following example notebooks are available in Jupyter notebook instances in the SageMaker console and in the related GitHub project:
hello_world_workflow.ipynb
machine_learning_workflow_abalone.ipynb
training_pipeline_pytorch_mnist.ipynb
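A hello-world workflow in the style of those notebooks can be sketched with the SDK (`pip install stepfunctions`); the workflow name and role ARN are placeholders:

```python
def build_hello_workflow(role_arn):
    """Sketch of the hello_world_workflow pattern using the AWS Step Functions
    Data Science SDK. Imported lazily because the SDK is not part of the
    standard library."""
    from stepfunctions.steps import Chain, Pass
    from stepfunctions.workflow import Workflow

    hello = Pass(state_id="HelloWorld")
    return Workflow(
        name="HelloWorldWorkflow",  # hypothetical name
        definition=Chain([hello]),
        role=role_arn,
    )
```

Calling `workflow.create()` and then `workflow.execute()` on the returned object registers and runs the state machine; `TrainingStep` and `ProcessingStep` can replace the `Pass` state to run real SageMaker jobs in sequence.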

Use AWS Lambda to execute a jupyter notebook on AWS Sagemaker

I made a classifier in Python that uses a lot of libraries. I have uploaded the model to Amazon S3 as a pickle (my_model.pkl). Ideally, every time someone uploads a file to a specific S3 bucket, it should trigger an AWS Lambda that would load the classifier, return predictions and save a few files on an Amazon S3 bucket.
I want to know if it is possible to use a Lambda to execute a Jupyter notebook in AWS SageMaker. This way I would not have to worry about the dependencies, and it would generally make the classification more straightforward.
So, is there a way to use an AWS Lambda to execute a Jupyter Notebook?
Scheduling notebook execution is a bit of a SageMaker anti-pattern, because (1) you would need to manage data I/O (training set, trained model) yourself, (2) you would need to manage metadata tracking yourself, (3) you cannot run on distributed hardware, and (4) you cannot use Spot instances. Instead, for scheduled tasks it is recommended to leverage the various SageMaker long-running, background job APIs: SageMaker Training, SageMaker Processing, or SageMaker Batch Transform (in the case of batch inference).
That being said, if you still want to schedule a notebook to run, you can do it in a variety of ways:
In the SageMaker CI/CD re:Invent 2018 video, notebooks are launched as CloudFormation templates, and their execution is automated via a SageMaker lifecycle configuration.
AWS released this blog post documenting how to launch notebooks from within Processing jobs.
But again, my recommendation for scheduled tasks would be to take them out of Jupyter, turn them into scripts, and run them in SageMaker Training.
No matter your choice, all of those tasks can be launched as API calls from within a Lambda function, as long as the function's role has the appropriate permissions.
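To make the "scripts in SageMaker Training" recommendation concrete, here is a sketch of a Lambda that starts a training job via boto3. The image URI, role ARN, and bucket are placeholders, and the Lambda's role would need `sagemaker:CreateTrainingJob` plus `iam:PassRole` on the execution role:

```python
import time


def training_job_request(job_name_prefix, image_uri, role_arn, output_s3_uri):
    """Build a minimal CreateTrainingJob request; the entry-point script baked
    into `image_uri` replaces the notebook's code. All values are placeholders
    to adapt."""
    return {
        "TrainingJobName": f"{job_name_prefix}-{int(time.time())}",  # must be unique
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def lambda_handler(event, context):
    # boto3 ships with the Lambda runtime; imported lazily so the module
    # can be loaded without AWS credentials.
    import boto3

    request = training_job_request(
        "nightly-train",  # hypothetical prefix
        "123456789012.dkr.ecr.us-east-1.amazonaws.com/train:latest",
        "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
        "s3://my-bucket/models/",
    )
    boto3.client("sagemaker").create_training_job(**request)
    return request["TrainingJobName"]
```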
I agree with Olivier. Using Sagemaker for Notebook execution might not be the right tool for the job.
Papermill is the framework for running Jupyter notebooks in this fashion.
You can consider trying this. It allows you to deploy your Jupyter notebook directly as a serverless cloud function and uses Papermill behind the scenes.
Disclaimer: I work for Clouderizer.
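Under the hood, both the sagemaker-run-notebook extension and Clouderizer come down to Papermill's one main call; a minimal sketch (`pip install papermill`), with hypothetical notebook names:

```python
def run_notebook(input_nb, output_nb, parameters=None):
    """Execute a notebook with Papermill, writing the executed copy (including
    cell outputs) to `output_nb`. Imported lazily because papermill is not
    part of the standard library."""
    import papermill as pm

    pm.execute_notebook(input_nb, output_nb, parameters=parameters or {})
```

Called for example as `run_notebook("classifier.ipynb", "classifier-out.ipynb", {"s3_key": "uploads/data.csv"})`; parameters are injected into the notebook's cell tagged `parameters`, which is how an S3 event's bucket/key can be passed in.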
It is totally possible, not an anti-pattern at all; it really depends on your use case. AWS actually wrote a great article describing it, which includes a Lambda function.