I have few questions for Sagemaker Neo:
1) Can I take advantage of Sagemaker Neo if I have an externally trained tensorflow/mxnet model?
2) Sagemaker provides container image for 'image-classification' and it has released a new image with name 'image-classification-neo' for the neo compilation job. What is the difference between both of them? Do I require a new Neo compatible image for each pre built sagemaker template(container) similarly?
Any help would be appreciated
Thanks!!
1) Yes. Upload your model to an S3 bucket as a model.tar.gz file (similar to what SageMaker would save after training) and you can compile it.
2) The Neo versions use the Neo runtime to load and predict, so yes, the containers are different. Right now, Neo supports the XGBoost and Image Classification built-in algos. Of course, you could build your own custom container and use Neo inside that. For more info: https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html
Julien
It is long time that this question has been asked. But in case someone turns up here after searching for the same question:
Mainly, Amazon NEO is an optimizer for making the program compatible for multiple underlying hardware and platform. Based on documentation:
"Neo is a new capability of Amazon SageMaker that enables machine learning models to train once and run anywhere in the cloud and at the edge. "
And yes, those 2 docker images are different. As one of them has the optimiser code, the other doesn't.
The difference is not in the input, so 'image-classification-neo' can work with images that 'image-classification' can work.
But the output is different.
The output of 'image-classification-neo' can be used on multiple platforms.
you can check out the supported hardware platforms in the link below:
https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html
Related
I've been trying to perform distributed training on SageMaker yet I'm still having problems executing it.
Currently using TensorFlow and modified original Word2vec, so I cannot use SageMaker built-in algorithms. Also not using GPU, so Horovod may not fit. So, I've been trying to figure out how to execute the code with parameter server method on SageMaker. However, I couldn't find general guidance as how to rewrite TensorFlow code for that purpose. Does anyone know how to solve this or provide some reference?
Can SageMaker model compiler be used to optimize model for AWS App Runner? Based on documentation https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_OutputConfig.html#sagemaker-Type-OutputConfig-CompilerOptions it looks like it can support CPU devices - both X86 and ARM. Would the output model work with CPU-based App Runner?
Thanks very much for any insight if anyone tried it before.
You are probably referring to SageMaker Neo which is used for inference, rather to SageMaker Training Compiler (used for training).
SageMaker Neo can compile the model to a target CPU architecture, AppRunner is a high level service for light containers designed with simplicity as a primary goal. In App Runner you can set the num of vCPUs, I couldn't find any guarantees on which CPU model will be used to run the container (no surprise really). In practice, you could try optimizing for Intel instructions set, and see if the resulting runtime is supported on app runner.
Or if you really need more control and performance, switch to AWS Fargate and explicitly choose x86 or ARM.
I've been using SageMaker for a while and have performed several experiments already with distributed training. I am wondering if it is possible to test and run SageMaker distributed training in local mode (using SageMaker Notebook Instances)?
Yes, SageMaker supports distributed training in local mode. But as #Philipp Schmid said some other features (like pipemode are not supported).
No, not possible yet. local mode does not support the distributed training with local_gpufor Gzip compression, Pipe Mode, or manifest files for inputs
I am a newbie in AWS. Right now I have defined an image segmentation function in SageMaker notebook instance and this will return masks.
I didn't train my models there, what I have done is pip install models packages there, upload pre-trained weights manually. The rest is very similar to working in local machine: I imported package, load the weights, defined a function to take an image as input then outputs masks.
My question is: is there a way to host my function so that I can call it with URL endpoint + one image info, then it returns me masks in response?
Again I am so new to AWS and I begin to doubt SageMaker is not designed for this job... The reason I chose SageMaker is the need of computing capacity, I don't think I can do this job with pure lambda.
SageMaker inference endpoints currently rely on an interface based on Docker images. At the base level, you can set up a Docker image that runs a web server and responds to the endpoints on the ports that AWS require. This guide will show you how to do it: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html.
This is an annoying amount of work. If you're using a well-known framework they have a container library that contains some boilerplate code you might be able to reuse: https://github.com/aws/sagemaker-containers. You might have to do some customization there.
Or don't use SageMaker inference endpoints at all :) If your model can fit within the size / memory restrictions of AWS Lambda, that is an easier option!
Full disclaimer, I'm working on a platform that competes with SageMaker: Model Zoo
I have finally arrived in the cloud to put my NLP work to the next level, but I am a bit overwhelmed with all the possibilities I have. So I am coming to you for advice.
Currently I see three possibilities:
SageMaker
Jupyter Notebooks are great
It's quick and simple
saves a lot of time spent on managing everything, you can very easily get the model into production
costs more
no version control
Cloud9
EC2(-AMI)
Well, that's where I am for now. I really like SageMaker, although I don't like the lack of version control (at least I haven't found anything for now).
Cloud9 seems just to be an IDE to an EC2 instance.. I haven't found any comparisons of Cloud9 vs SageMaker for Machine Learning. Maybe because Cloud9 is not advertised as an ML solution. But it seems to be an option.
What is your take on that question? What have I missed? What would you advise me to go for? What is your workflow and why?
I am looking for an easy work environment where I can quickly test my models, exactly. And it won't be only me working on it, it's a team effort.
Since you are working as a team I would recommend to use sagemaker with custom docker images. That way you have complete freedom over your algorithm. The docker images are stored in ecr. Here you can upload many versions of the same image and tag them to keep control of the different versions(which you build from a git repo).
Sagemaker also gives the execution role to inside the docker image. So you still have full access to other aws resources (if the execution role has the right permissions)
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
In my opinion this is a good example to start because it shows how sagemaker is interacting with your image.
Some notes on other solutions:
The problem of every other solution you posted is you want to build and execute on the same machine. Sure you can do this but keep in mind, that gpu instances are expensive and therefore you might only switch to the cloud when the code is ready to run.
Some other notes
Jupyter Notebooks in general are not made for collaborative programming. I think they want to change this with jupyter lab but this is still in development and sagemaker only use the notebook at the moment.
EC2 is cheaper as sagemaker but you have to do more work. Especially if you want to run your model as docker images. Also with sagemaker you can easily build an endpoint for model inference which would be even more complex to realize with ec2.
Cloud 9 I never used this service and but on first glance it seems good to develop on, but the question remains if you want to do this on a gpu machine. Because you're using ec2 as instance you have the same advantage/disadvantage.
One thing I'd like to call out first is SageMaker notebook is not the only IDE environment in which you can interact with other components of SageMaker such as training and hosting. In fact you can make API calls to SageMaker training/hosting through Cloud9 or any IDEs you've installed on EC2 or even your laptop, as long as you have AWS SDK or SageMaker Python SDK installed.
Regarding the choice of the IDE, it's really up to your particular needs. SageMaker notebook is Jupyter based (now also supports JupyterLab beta), ML focused, and fully managed. Hundreds of Python packages that are commonly used in ML, as well as Tensorflow, Keras, MxNet, SageMaker Python SDK, etc., are preinstalled and automatically maintained for you. It also integrates more closely with other components of SageMaker as one can imagine.
Cloud9 is a managed IDE too but it is for general purpose rather than ML specific. If you want to use Jupyter on cloud9 it requires extra work from your side. It does not preinstall and maintain the version of common ML/DL related packages like SageMaker notebook does.