Can the SageMaker model compiler be used to optimize a model for AWS App Runner? Based on the documentation at https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_OutputConfig.html#sagemaker-Type-OutputConfig-CompilerOptions it looks like it supports CPU targets, both x86 and ARM. Would the output model work with CPU-based App Runner?
Thanks very much for any insight, if anyone has tried this before.
You are probably referring to SageMaker Neo, which is used for inference, rather than to SageMaker Training Compiler (which is used for training).
SageMaker Neo can compile the model for a target CPU architecture. App Runner is a high-level service for lightweight containers, designed with simplicity as a primary goal. In App Runner you can set the number of vCPUs, but I couldn't find any guarantee about which CPU model will be used to run the container (no surprise, really). In practice, you could try optimizing for the Intel instruction set and see whether the resulting runtime is supported on App Runner.
Or if you really need more control and performance, switch to AWS Fargate and explicitly choose x86 or ARM.
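For reference, here's a minimal sketch of kicking off such a compilation job with boto3, targeting a generic x86_64 Linux platform instead of a specific instance type; the job name, bucket, role, framework, and input shape are placeholders to adapt:

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_compilation_job(
    CompilationJobName="app-runner-cpu-compile",             # hypothetical job name
    RoleArn="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    InputConfig={
        "S3Uri": "s3://my-model-bucket/model.tar.gz",        # your trained model artifact
        "DataInputConfig": '{"input_1": [1, 224, 224, 3]}',  # model-specific input shape
        "Framework": "TENSORFLOW",
    },
    OutputConfig={
        "S3OutputLocation": "s3://my-model-bucket/compiled/",
        # A generic target platform, since App Runner gives no CPU-model guarantees;
        # CompilerOptions (see the OutputConfig docs above) can pin an instruction set.
        "TargetPlatform": {"Os": "LINUX", "Arch": "X86_64"},
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```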
We're testing a cloud version of our core product, which is fundamentally a Windows machine running a small VM instance of a heavily modified old OS (QNX 6.5) as the core of a suite of our applications. WorkSpaces is perfect for our use case (letting a lot of operators in easily, using a variety of clients). However, we're having real trouble creating this nested VM, which is fundamental. There doesn't seem to be a way to activate the hypervisor, but it's still present, which means that VMware won't run. Is there any solution to getting this up and running, or is WorkSpaces a non-starter? The required resources are very light. Any help will be greatly appreciated, thanks.
I am currently in the process of building a custom container for a SageMaker training job, using Amazon Linux 2 as my base image. However, I could not find resources/examples online about this. I am currently struggling with the requirement of installing only the CUDA toolkit, but not the NVIDIA driver.
I understand that there is a prebuilt NVIDIA image which I could use as my base, but unfortunately my application doesn't allow that. Conda may not be the best option either. Please advise :)
What requires you to start from scratch? Just make sure you have a really good reason, because it's extra work for you.
You could build on top of AWS's Deep Learning Containers.
Or at the very least build on top of NVIDIA's NGC container, so you don't have to deal with CUDA installation. Here's an example of enabling an NGC container with the high-performance EFA networking driver on SageMaker.
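Whichever base image you choose, launching the training job with it looks the same. Here's a minimal sketch with the SageMaker Python SDK; the ECR image URI, role, buckets, and instance type are placeholders. Note that the NVIDIA driver lives on the SageMaker training host and is exposed to the container at runtime, which is why your image only needs the CUDA toolkit:

```python
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training:latest",  # placeholder
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder execution role
    instance_count=1,
    instance_type="ml.g4dn.xlarge",  # GPU instance; the driver comes from the host
    output_path="s3://my-bucket/output/",
)

# "training" is an arbitrary channel name; SageMaker mounts the S3 data into the container
estimator.fit({"training": "s3://my-bucket/train/"})
```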
I am trying to run a CI/CD pipeline on my codebase, but in order to run my tests, I need a GPU-enabled VM (to produce deep learning results).
However, the only configurable machine option I see is the machine type (number of cores and memory). I don't see an option for adding an accelerator type (GPU).
Is there a way to attach a GPU to the build VM, and if not, is there another method for triggering a test on another GPU enabled VM?
Thanks!
Google Cloud Build doesn't currently provide machine types equipped with GPUs. One option, though, is to use the remote-builder cloud builder, which lets you run your builds on Compute Engine instances in your project. You can use the INSTANCE_ARGS option to customize the instance to fit your specific needs, in this case adding one or more GPUs. You can have a look here for some example configs. Any flag available on the gcloud compute instances create command can be used, including the --accelerator flag for GPUs.
I have a few questions about SageMaker Neo:
1) Can I take advantage of SageMaker Neo if I have an externally trained TensorFlow/MXNet model?
2) SageMaker provides a container image for 'image-classification', and it has released a new image named 'image-classification-neo' for the Neo compilation job. What is the difference between the two? Do I similarly need a new Neo-compatible image for each prebuilt SageMaker template (container)?
Any help would be appreciated
Thanks!!
1) Yes. Upload your model to an S3 bucket as a model.tar.gz file (similar to what SageMaker would save after training) and you can compile it.
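As a minimal sketch of that packaging step (file and bucket names are hypothetical):

```python
import tarfile
import boto3

# Bundle the externally trained artifacts the way SageMaker expects them
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("export/", arcname=".")  # directory holding your TensorFlow/MXNet model files

# Upload to S3 so a Neo compilation job can read it
boto3.client("s3").upload_file("model.tar.gz", "my-model-bucket", "model.tar.gz")
```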
2) The Neo versions use the Neo runtime to load and predict, so yes, the containers are different. Right now, Neo supports the XGBoost and Image Classification built-in algos. Of course, you could build your own custom container and use Neo inside that. For more info: https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html
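If you use the SageMaker Python SDK, compiling and deploying might look roughly like this sketch; the framework, input shape, paths, role, and the inference.py entry point are all assumptions to adapt:

```python
from sagemaker.mxnet import MXNetModel

model = MXNetModel(
    model_data="s3://my-model-bucket/model.tar.gz",       # placeholder
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    entry_point="inference.py",                           # hypothetical inference script
    framework_version="1.8.0",
    py_version="py37",
)

# compile() runs a Neo compilation job; on deploy, SageMaker picks the matching
# Neo inference container for the compiled artifact.
compiled = model.compile(
    target_instance_family="ml_c5",
    input_shape={"data": [1, 3, 224, 224]},  # model-specific
    output_path="s3://my-model-bucket/compiled/",
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    framework="mxnet",
    framework_version="1.8",
)

predictor = compiled.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")
```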
Julien
It's been a long time since this question was asked, but in case someone turns up here after searching for the same thing:
Mainly, Amazon Neo is an optimizer that makes a model compatible with multiple underlying hardware targets and platforms. From the documentation:
"Neo is a new capability of Amazon SageMaker that enables machine learning models to train once and run anywhere in the cloud and at the edge. "
And yes, those two Docker images are different: one of them has the optimizer code, the other doesn't.
The difference is not in the input, so 'image-classification-neo' can work with the same images that 'image-classification' can.
But the output is different.
The output of 'image-classification-neo' can be used on multiple platforms.
You can check out the supported hardware platforms at the link below:
https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html
I have finally arrived in the cloud to take my NLP work to the next level, but I am a bit overwhelmed by all the possibilities I have. So I am coming to you for advice.
Currently I see three possibilities:
SageMaker
- Jupyter Notebooks are great
- It's quick and simple
- Saves a lot of time spent on managing everything; you can very easily get the model into production
- Costs more
- No version control

Cloud9

EC2 (with an AMI)
Well, that's where I am for now. I really like SageMaker, although I don't like the lack of version control (at least I haven't found anything so far).
Cloud9 just seems to be an IDE in front of an EC2 instance. I haven't found any comparisons of Cloud9 vs. SageMaker for machine learning, maybe because Cloud9 is not advertised as an ML solution. But it seems to be an option.
What is your take on that question? What have I missed? What would you advise me to go for? What is your workflow and why?
I am looking for an easy work environment where I can quickly test my models. And it won't be only me working on it; it's a team effort.
Since you are working as a team, I would recommend using SageMaker with custom Docker images. That way you have complete freedom over your algorithm. The Docker images are stored in ECR, where you can upload many versions of the same image and tag them to keep track of the different versions (which you build from a Git repo).
SageMaker also passes the execution role into the Docker container, so you still have full access to other AWS resources (if the execution role has the right permissions).
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
In my opinion this is a good example to start with, because it shows how SageMaker interacts with your image.
Some notes on other solutions:
The problem with every other solution you posted is that you build and execute on the same machine. Sure, you can do this, but keep in mind that GPU instances are expensive, and therefore you might want to switch to the cloud only when the code is ready to run.
Some other notes:
Jupyter Notebooks in general are not made for collaborative programming. I think they want to change this with JupyterLab, but that is still in development, and SageMaker only uses the classic notebook at the moment.
EC2 is cheaper than SageMaker, but you have to do more work, especially if you want to run your models as Docker images. Also, with SageMaker you can easily build an endpoint for model inference, which would be much more complex to set up on EC2 yourself.
Cloud9: I have never used this service, but at first glance it seems good to develop on. The question remains whether you want to do this on a GPU machine; because it runs on an EC2 instance, you have the same advantages/disadvantages as EC2.
One thing I'd like to call out first is that the SageMaker notebook is not the only IDE environment in which you can interact with other components of SageMaker, such as training and hosting. In fact, you can make API calls to SageMaker training/hosting from Cloud9, from any IDE you've installed on EC2, or even from your laptop, as long as you have the AWS SDK or SageMaker Python SDK installed.
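For example, a minimal sketch with boto3 from any machine with AWS credentials configured (the region is a placeholder):

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")  # placeholder region

# List recent training jobs from your laptop, Cloud9, or any EC2 box
for job in sm.list_training_jobs(MaxResults=10)["TrainingJobSummaries"]:
    print(job["TrainingJobName"], job["TrainingJobStatus"])
```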
Regarding the choice of IDE, it really comes down to your particular needs. The SageMaker notebook is Jupyter-based (it now also supports JupyterLab in beta), ML-focused, and fully managed. Hundreds of Python packages commonly used in ML, as well as TensorFlow, Keras, MXNet, the SageMaker Python SDK, etc., are preinstalled and automatically maintained for you. It also integrates more closely with the other components of SageMaker, as one would imagine.
Cloud9 is a managed IDE too, but it is general-purpose rather than ML-specific. If you want to use Jupyter on Cloud9, it requires extra work on your side, and Cloud9 does not preinstall and maintain the versions of common ML/DL packages the way the SageMaker notebook does.