How to run SageMaker Distributed training from SageMaker Studio? - amazon-web-services

The sample notebooks for SageMaker Distributed training, such as this one: https://github.com/aws/amazon-sagemaker-examples/blob/main/advanced_functionality/distributed_tensorflow_mask_rcnn/mask-rcnn-scriptmode-s3.ipynb rely on the docker build and docker push commands, which are neither available nor installable in Amazon SageMaker Studio.
Are there alternatives to these notebooks that are compatible with SageMaker Studio?

SageMaker Studio does not support Docker, because the Studio apps are themselves containers. Instead, you can use the SageMaker Docker Build tool (the Studio Image Build CLI, sm-docker) to build Docker images from Studio; it uses AWS CodeBuild in the backend. See the blog post "Using the Amazon SageMaker Studio Image Build CLI to build container images from your Studio notebooks" and the GitHub repo for details.
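
As a rough illustration, the build can be triggered from a Studio notebook by shelling out to the CLI. This is a minimal sketch, assuming the sagemaker-studio-image-build package is installed (pip install sagemaker-studio-image-build) and that the Studio execution role is allowed to use CodeBuild and ECR. The helper names and the repository name in the usage comment are hypothetical; check `sm-docker build --help` for the options your installed version accepts.

```python
import shlex
import subprocess
from typing import List, Optional


def sm_docker_build_command(context_dir: str = ".",
                            repository: Optional[str] = None) -> List[str]:
    """Assemble the argv for an `sm-docker build` call (hypothetical helper)."""
    cmd = ["sm-docker", "build", context_dir]
    if repository:
        # e.g. "my-training-image:latest" (hypothetical ECR repository:tag)
        cmd += ["--repository", repository]
    return cmd


def run_build(context_dir: str = ".", repository: Optional[str] = None) -> None:
    """Run the build; raises CalledProcessError if sm-docker exits non-zero."""
    cmd = sm_docker_build_command(context_dir, repository)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


# Usage from a Studio notebook cell, in the directory containing the Dockerfile:
# run_build(".", repository="my-training-image:latest")
```

The build itself runs remotely in CodeBuild, so no Docker daemon is needed inside the Studio app.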

Related

How to run an AWS SageMaker Studio job on a predefined schedule

I am currently developing a model in AWS SageMaker Studio. SageMaker offers several ways to run a model, such as notebook instances and SageMaker Studio. To schedule a task on a notebook instance, I know that AWS Lambda is needed, but I can't find any documentation on how to run a scheduled job in SageMaker Studio.
I need suggestions on this. I know this is not a good question by StackOverflow guidelines (it shows no code), but the problem is fairly new, as is SageMaker Studio itself.
A newer feature covers this; see "Operationalize your Amazon SageMaker Studio notebooks as scheduled notebook jobs".

How to create SageMaker Studio environment from CLI?

I can create a SageMaker notebook instance with aws sagemaker create-notebook-instance --notebook-instance-name test-123
but I can't find a similar CLI command to create a "SageMaker Studio" instance.
Thanks
SageMaker Studio is a web-based IDE for machine learning, made up of multiple components. At its core, Studio consists of a Domain and a list of user profiles. Each user profile can contain multiple "apps", which host notebooks, among other features.
See Onboard to Amazon SageMaker Studio for details. Each of these can be created through the CLI, with commands such as:
create-domain
create-user-profile
create-app
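
The same three calls are also available in boto3 as create_domain, create_user_profile and create_app. The following is a minimal sketch rather than a complete onboarding script: the function name, the argument values, and the choice of IAM auth with a default JupyterServer app are assumptions, and the polling loop is a simple stand-in for proper error handling.

```python
import time
from typing import Dict, List


def create_studio_environment(sm_client, domain_name: str, user_profile_name: str,
                              execution_role_arn: str, vpc_id: str,
                              subnet_ids: List[str],
                              poll_seconds: float = 15.0) -> Dict[str, str]:
    """Create a Studio Domain, a user profile, and a default app (sketch)."""
    domain = sm_client.create_domain(
        DomainName=domain_name,
        AuthMode="IAM",  # assumption: IAM auth rather than SSO
        DefaultUserSettings={"ExecutionRole": execution_role_arn},
        VpcId=vpc_id,
        SubnetIds=subnet_ids,
    )
    # The domain ID is the last segment of the ARN (".../domain/d-xxxx").
    domain_id = domain["DomainArn"].split("/")[-1]

    # Domain creation is asynchronous; wait until it is usable.
    while True:
        status = sm_client.describe_domain(DomainId=domain_id)["Status"]
        if status == "InService":
            break
        if status == "Failed":
            raise RuntimeError(f"Domain {domain_id} failed to create")
        time.sleep(poll_seconds)

    profile = sm_client.create_user_profile(
        DomainId=domain_id, UserProfileName=user_profile_name)
    app = sm_client.create_app(
        DomainId=domain_id, UserProfileName=user_profile_name,
        AppType="JupyterServer", AppName="default")  # assumption: classic Studio app
    return {"domain_id": domain_id,
            "user_profile_arn": profile["UserProfileArn"],
            "app_arn": app["AppArn"]}


# Usage (all values hypothetical):
# import boto3
# create_studio_environment(boto3.client("sagemaker"), "my-domain", "my-user",
#                           "arn:aws:iam::123456789012:role/MyStudioRole",
#                           "vpc-0abc", ["subnet-0abc"])
```

Passing the client in as an argument keeps the function easy to try against a stub before running it against a real account.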

How to create a notebook in EMR Studio using boto3?

I am going through the boto3 documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.create_studio
but I cannot find any create/delete operation for EMR Studio notebooks, only create/delete for the Studio itself.
How can I use boto3 to create an EMR Studio notebook that preloads a notebook stored on S3?
Create/delete operations for notebooks (Workspaces) can only be performed through the EMR Studio UI; no CLI or SDK support is available for them as of today. You can create a Workspace from EMR Studio and upload your existing notebook file through the JupyterLab UI.

Building Windows Docker images: Lambda-EC2 vs Docker Hub vs AWS ECR

Problem
The problem is that CodeBuild cannot build Windows Docker images. This is because CodeBuild runs inside a Docker container, and Microsoft does not support Docker inside Docker.
I know this is not the first question on this topic (see, e.g., this one), but I would like to propose some alternatives to the standard workflow, which looks like this.
Important: As I understand it, a Windows Docker image based on Windows Server 2016 can only be built from a Windows Server 2016 system/container.
Standard approach
CodeBuild triggers Lambda.
Lambda launches an EC2 instance from an image with Docker installed.
The EC2 instance pulls the source code, builds the image from the Dockerfile, pushes the image to the repo, and triggers CodePipeline.
CodePipeline deploys the image.
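
The Lambda step of this workflow could be sketched roughly as follows. This is only an illustration under stated assumptions: the AMI ID, instance type, instance profile, repository URL and image URI are hypothetical placeholders, the AMI is assumed to have Git and Docker preinstalled, and registry login and the CodePipeline trigger are left out.

```python
from typing import List


def build_user_data(repo_url: str, image_uri: str) -> str:
    """PowerShell user data that builds and pushes the image, then shuts down."""
    lines: List[str] = [
        "<powershell>",
        f"git clone {repo_url} C:\\build",
        f"docker build -t {image_uri} C:\\build",
        # Registry login is omitted here; it must happen before the push.
        f"docker push {image_uri}",
        "Stop-Computer -Force",
        "</powershell>",
    ]
    return "\n".join(lines)


def launch_build_instance(ec2_client, ami_id: str, repo_url: str, image_uri: str,
                          instance_profile_name: str,
                          instance_type: str = "t3.large") -> str:
    """Start one Windows build instance and return its instance ID."""
    response = ec2_client.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=build_user_data(repo_url, image_uri),
        IamInstanceProfile={"Name": instance_profile_name},
        # Terminate rather than stop when the script shuts the machine down.
        InstanceInitiatedShutdownBehavior="terminate",
    )
    return response["Instances"][0]["InstanceId"]


# Inside a Lambda handler (all values hypothetical):
# import boto3
# def handler(event, context):
#     return launch_build_instance(boto3.client("ec2"), "ami-0123456789abcdef0",
#                                  "https://example.com/my/repo.git",
#                                  "123456789012.dkr.ecr.us-east-1.amazonaws.com/app:latest",
#                                  "windows-build-profile")
```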
Questions
Instead of implementing a custom image-build step, we could use a third-party solution: Docker Hub or AWS ECR.
Is AWS ECR able to build Docker images from a Dockerfile? Is it possible to run builds on a Windows Server 2016 system/container?
Is Docker Hub able to build Docker images on a Windows Server 2016 system/container?

Which build server and code scan tool to use on an AWS EC2 Windows instance?

I have to implement a code scan tool in a CI/CD pipeline on AWS. I have an EC2 Windows instance.
I checked a few tutorials and found some Jenkins plugins, but all of the samples are for Linux.
I want to know how to install Jenkins, or an alternative, on EC2 Windows, and which code scan tool to use in this environment.
You can follow the How to Install Jenkins on Windows tutorial to see how to use Jenkins on Windows.
An alternative to Jenkins is Atlassian Bamboo, which is also widely used for CI/CD.
Some widely used code scan tools are:
Checkmarx
SonarQube
IBM AppScan (commercial)