I successfully trained a deep net using TensorFlow. The net is able to modify the content of a square 512x512 px image.
Now I would like to deploy the net on a server and create a basic service, something like this: https://letsenhance.io/, where a user (client) can send a picture and the server responds with a modified picture.
I read a lot of tutorials about AWS Lambda, but my problem is that the model data (weights) is very large: 700 MB, which seems to exceed the Lambda limits (https://docs.aws.amazon.com/lambda/latest/dg/limits.html).
So can I use Lambda? Or is there an alternative, better way to do this?
It will exceed the limitations of AWS Lambda.
You can instead try running on an EC2 instance with GPU support, but it will take some time to set up, and it costs much more than running Lambdas.
I've heard about TensorFlow on AWS but I haven't tried it.
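Whichever EC2 instance type you pick, the serving side itself can stay simple. Below is a minimal sketch of an inference endpoint, assuming the trained net is exported as a Keras SavedModel, that it maps a 512x512 RGB image to another image in the [0, 1] range, and that Flask is acceptable; the model path, route name, and pre/post-processing are placeholder assumptions, not your actual pipeline.

```python
# Minimal sketch of an image-in/image-out inference server on EC2.
# Model path, route, and normalization are placeholder assumptions.
import io

import numpy as np
import tensorflow as tf
from flask import Flask, request, send_file
from PIL import Image

app = Flask(__name__)

# Load the ~700 MB model once at startup, not on every request.
model = tf.keras.models.load_model("/opt/models/my_image_model")  # hypothetical path

@app.route("/enhance", methods=["POST"])
def enhance():
    # Expect the raw image bytes in the request body.
    img = Image.open(io.BytesIO(request.data)).convert("RGB").resize((512, 512))
    x = np.asarray(img, dtype=np.float32)[None, ...] / 255.0      # (1, 512, 512, 3)
    y = model.predict(x)[0]                                        # assumed output in [0, 1]
    out = Image.fromarray((np.clip(y, 0.0, 1.0) * 255).astype(np.uint8))
    buf = io.BytesIO()
    out.save(buf, format="PNG")
    buf.seek(0)
    return send_file(buf, mimetype="image/png")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

For anything beyond a prototype you would put a proper WSGI server (gunicorn) and NGINX in front of this instead of the Flask development server.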
I know this is a very different use case for Elasticsearch and I need your help.
Main structure (can't be changed):
There are some physical machines with sensors on them. Data from these sensors goes to AWS Greengrass.
Then, via a Lambda function, the data goes to Elasticsearch using MQTT. Elasticsearch is running in Docker.
This is the structure, and everything up to this point is ready and running ✅
Now, on top of ES, I need some software that can send this data over MQTT to a cloud database, for example DynamoDB.
But this is not a one-time migration; it should send the data continuously. Basically, I need a channel between ES and AWS DynamoDB.
Also, the sensors produce a lot of data, and we don't want to store all of it in the cloud; we only want to store it all in ES. Some filtering is needed on the Elasticsearch side before we send data to the cloud, e.g. "save every 10th record to the cloud", so only 1 record out of 10 goes up.
Do you have any idea how this can be done? I have no experience in this field and it looks like a challenging task. I would love to get some suggestions from people experienced in these areas.
Thanks a lot! 🙌😊
I haven’t worked on a similar use case but you can try looking into Logstash for this.
It's an open-source service, part of the ELK stack, and provides the option of filtering the output. The pipeline will look something like the one below:
data ----> ES ----> Logstash -----> DynamoDB or any other destination.
It supports various plugins required for your use case, like:
DynamoDB output plugin -
https://github.com/tellapart/logstash-output-dynamodb
Logstash MQTT Output Plugin -
https://github.com/kompa3/logstash-output-mqtt
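If those plugins turn out not to fit your setup (the DynamoDB output plugin is fairly old), the same filter-and-forward idea can also be hand-rolled outside Logstash. The following is only a rough sketch of that idea, assuming the `elasticsearch` and `boto3` Python clients, a numeric `timestamp` field on each document, and placeholder index/table names:

```python
# Rough sketch: poll Elasticsearch and forward every 10th document to DynamoDB.
# Index, table, and field names are placeholders; adjust to your mapping.
import time

import boto3
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
table = boto3.resource("dynamodb", region_name="eu-west-1").Table("SensorData")

last_seen = 0   # highest timestamp already processed
counter = 0     # used to keep only 1 document out of 10

while True:
    resp = es.search(
        index="sensor-data",
        body={
            "query": {"range": {"timestamp": {"gt": last_seen}}},
            "sort": [{"timestamp": "asc"}],
            "size": 500,
        },
    )
    for hit in resp["hits"]["hits"]:
        doc = hit["_source"]
        last_seen = doc["timestamp"]
        counter += 1
        if counter % 10 == 0:   # "save every 10th record to the cloud"
            # Note: DynamoDB rejects Python floats; convert them to Decimal first.
            table.put_item(Item=doc)
    time.sleep(30)              # poll interval
```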
Apologies in advance for my limited knowledge of AWS.
I'm trying to draw parallels between my current setup on Heroku and a move to AWS. I've run into some memory issues on Heroku because of some machine learning models I'm running, and Heroku seems too expensive for my needs.
I was recommended to move to AWS using Fargate, which would be a better fit for my app. Below is my whole architecture; I'm hoping for some guidance on my direction, given what I have and where I plan to go.
A Django application running on Heroku.
The core functionality is that the user records a video on their mobile device and uploads it to S3. A message from SNS is sent to my Heroku server when the upload is complete. The server kicks off a Celery task that downloads the video from S3 and uses a machine learning model to do some natural language processing, then saves the results to my PostgreSQL database. Obviously this is very compute-intensive, so I've run into some memory issues and can foresee scaling issues to come.
After lots of tweaking and attempts to no avail, I've decided to move over to AWS and leverage the cost benefits I've seen, compared to Heroku, of running more memory-intensive tasks.
I should also mention there is a web interface involved with this Django project, and it isn't just a REST API.
As far as AWS goes, I'm looking for a bit of direction. Possibly just a rough outline of the architecture I should look deeper into.
My first plan is to Dockerize my application and go from there, but I'm a bit stuck on how my application (website, REST API, worker processes) fits into the AWS ecosystem.
AWS is a great fit for the application you describe. AWS Fargate / RDS will host your Django application. You have the option of using AWS Batch to handle your processing. One huge advantage is the ability to scale according to the needs of your application.
One possible way to structure it: Fargate runs the Django web app, RDS hosts the database, and AWS Batch handles the heavy processing jobs. It's a lot of work to get to this point, but AWS offers a lot of power and flexibility for reasonable cost, IMO.
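If you go with AWS Batch for the heavy processing, the Django/Celery side mostly just has to submit a job when the SNS notification arrives. A hedged sketch of that call, assuming you have already created a job queue and a job definition whose container image holds your NLP code; the queue and definition names below are placeholders:

```python
# Sketch: submit one AWS Batch job per uploaded video.
# "video-nlp-queue" and "video-nlp-job" are placeholder resource names.
import boto3

batch = boto3.client("batch", region_name="us-east-1")

def process_video(bucket: str, key: str) -> str:
    """Kick off a Batch job for a newly uploaded video and return its job id."""
    response = batch.submit_job(
        jobName=f"nlp-{key.replace('/', '-')}",
        jobQueue="video-nlp-queue",
        jobDefinition="video-nlp-job",
        containerOverrides={
            "environment": [
                {"name": "S3_BUCKET", "value": bucket},
                {"name": "S3_KEY", "value": key},
            ]
        },
    )
    return response["jobId"]
```

The container then downloads the video from S3, runs the model, and writes the results to the database, so the memory-hungry work never touches the web tier.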
I've been reading some articles on this topic and have preliminary thoughts as to what I should do, but I still want to see if anyone with more experience running machine learning on AWS can share comments. I was doing a project for a professor at school, and we decided to use AWS. I need to find a cost-effective and efficient way to deploy a forecasting model on it.
What we want to achieve is:
read the data from an S3 bucket monthly (new data will come in every month),
run a few Python files (.py) using our custom-built packages and their installed dependencies (all the files together are no more than 30 KB),
write the predicted results to a file back in S3 (JSON or CSV works), or push them to other endpoints (most likely some BI tool such as Tableau), but this step can really be flexible (it won't be a web app, for sure).
My first thought is AWS SageMaker. However, we'll be using an "fb prophet" model to predict the results, and we built a customized package to use with the model, so I don't think the notebook instance is going to help us (please correct me if I'm wrong). My understanding is that SageMaker is an environment to build and train models, but we have already built and trained ours. Plus, we won't be using any AWS pre-built models anyway.
Another thing is that if we want to use a custom-built package, we'll need to create a container image; I've never done that before, so I'm not sure how much effort that takes.
The second option is to create multiple Lambda functions:
one that runs the Python scripts from the S3 bucket (2-3 .py files) every time a new file lands in the S3 bucket, which will happen monthly;
one that triggers after the Python scripts are done running, and saves the produced results into the S3 bucket.
The third option would combine both:
- Use a Lambda function to trigger the Python scripts in the S3 bucket when a new file comes in.
- Serve the result using a SageMaker endpoint, which means we host the model on SageMaker and deploy it from there.
I am still not entirely sure how to put a pre-built model and Python scripts onto a SageMaker instance and host them from there.
I'm hoping someone with more experience with AWS services can give me some guidance on the most cost-effective and efficient way to run the model.
Thank you!!
I would say it all depends on how heavy your model is and how much data you're running through it. You're right to identify that Lambda will likely be less work. It's quite easy to get a Lambda up and running to do the things you need, and Lambda has a very generous free tier. The problems are:
Lambda functions are fundamentally limited in their processing capacity (they time out after a maximum of 15 minutes).
Your model might be expensive to load.
If you have a lot of data to run through your model, you will need multiple lambdas. Multiple lambdas means you have to load your model multiple times, and that's wasted work. If you're working with "big data" this will get expensive once you get through the free tier.
If you don't have much data, Lambda will work just fine. I would eyeball it as follows: assuming your data processing step is dominated by your model step, and if all your model interactions (loading the model + evaluating all your data) take less than 15min, you're definitely fine. If they take more, you'll need to do a back-of-the-envelope calculation to figure out whether you'd leave the Lambda free tier.
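To make that back-of-the-envelope concrete, here's an illustrative calculation; the free-tier figures are approximate and should be checked against the current Lambda pricing page:

```python
# Back-of-the-envelope: does one monthly forecasting run stay in the free tier?
# All numbers are illustrative assumptions, not your actual workload.
free_tier_gb_seconds = 400_000        # approx. free Lambda compute per month
memory_gb = 3.0                        # memory configured for the function
run_minutes = 10                       # load the model + run all predictions once

gb_seconds_per_run = memory_gb * run_minutes * 60          # 1,800 GB-seconds
runs_per_month_free = free_tier_gb_seconds / gb_seconds_per_run
print(runs_per_month_free)             # ~222 runs/month -> one monthly run is easily free
```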
Regarding Lambda: you can literally copy-paste code in to set up a prototype. If processing all your data in one execution would take more than 15 minutes, you'll need a way of splitting the data up between multiple Lambdas; consider Step Functions for this.
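A minimal sketch of what that prototype Lambda could look like, assuming an S3 trigger on new uploads and treating `run_forecast` from your custom package as a placeholder entry point:

```python
# Sketch of an S3-triggered Lambda: read the new monthly file, run the
# (placeholder) forecasting code, and write the results back to S3.
import json

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # S3 put events contain one record per new object.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # run_forecast stands in for your custom-built package's entry point.
    from my_forecast_package import run_forecast   # hypothetical import
    results = run_forecast(body)

    out_key = key.replace("input/", "output/") + ".predictions.json"
    s3.put_object(Bucket=bucket, Key=out_key, Body=json.dumps(results))
    return {"status": "ok", "output_key": out_key}
```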
SageMaker is a set of services, each responsible for a different part of the machine learning process. What you might want to use is the hosted version of Jupyter notebooks in SageMaker. You get a lot of freedom in the size of the instance you use (CPU/GPU, memory, and disk), and you can install various packages on that instance (such as FB Prophet). If you need it once a month, you can stop the notebook instance between runs, start it again, and "Run all" the cells in your notebooks; it will only cost you the minutes of execution.
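For the monthly run inside such a notebook, the Prophet part itself is short. A sketch, assuming your data lands in S3 as a CSV with the usual `ds`/`y` columns; the bucket, key, and 30-day horizon are placeholder assumptions:

```python
# Sketch: read the monthly data from S3, fit Prophet, write the forecast back.
# Bucket/key names and the forecast horizon are placeholders.
import pandas as pd
from fbprophet import Prophet   # packaged as "prophet" in newer releases

df = pd.read_csv("s3://my-forecast-bucket/input/latest.csv")   # needs s3fs installed
m = Prophet()
m.fit(df)                                       # expects columns "ds" and "y"

future = m.make_future_dataframe(periods=30)    # forecast 30 days ahead
forecast = m.predict(future)

forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].to_csv(
    "s3://my-forecast-bucket/output/forecast.csv", index=False
)
```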
Regarding the other alternatives: it is not trivial to run FB Prophet in Lambda due to the size limit on the libraries you can install in Lambda (meant to avoid overly long cold starts). You can also use ECS (the container service), where you can have much larger images, but you need to know how to build a Docker image of your code and expose an endpoint to be able to call it.
I have developed a Django API which accepts images from a live-feed camera as base64 in the request. In the API, the image is converted into a NumPy array and passed to a machine learning model, i.e. object detection using the TensorFlow Object Detection API. The response is simple text listing the detected objects.
I need a GPU-based cloud instance where I can deploy this application for fast processing to achieve real-time results. I have searched a lot but haven't found such a resource. I believe a Google Cloud Console instance can be connected to a live API, but I am not sure exactly how.
Thanks
I assume that you're using a GPU locally or wherever your Django application is hosted.
The first thing is to make sure you are using tensorflow-gpu and that all the necessary CUDA setup is done.
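A quick way to verify that after the CUDA setup (this uses the TensorFlow 2.x API; 1.x exposes tf.test.is_gpu_available() instead):

```python
# Quick check that TensorFlow can actually see the GPU on the instance.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)   # an empty list means CPU-only
```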
You can start your GPU instance easily on Google Cloud Platform (GCP). There are multiple ways to do this.
Quick option
Search for Notebooks in the console and start a new instance with the required GPU and RAM.
Instead of the notebook instance, you can set up the instance separately if you need some specific OS and more flexibility on choosing the machine.
To access the instance with SSH, simply add your SSH public key to Metadata, which can be seen when you open the instance details.
Set up Django as you would on any server. To test it, simply run the development server on host 0.0.0.0 and your preferred port (e.g. python manage.py runserver 0.0.0.0:8000).
You can access the APIs via the external IP of the machine, which can be found on the instance details page.
Some suggestions
While the first option is quick and dirty, it's not recommended to use it in production.
It is better to use a deployment service such as TensorFlow Serving, along with Kubeflow.
If you think you're handling the inference itself properly, then make sure you load-balance the server properly. Use NGINX or another good server along with gunicorn/uWSGI.
You can use Redis for queue management. When someone calls the API, a GPU is not necessarily free for the inference. It is fine to skip this when you have very few hits on the API per second, but when scaling up, say to 50 requests per second, which a single GPU can't handle at once, you can use a queue system.
All requests should go to Redis first, and the GPU worker takes the jobs to be done from the queue. If required, you can always scale the GPUs.
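A minimal sketch of that pattern with the redis Python client; the queue name and payload shape are placeholder choices. The API only enqueues jobs, and a separate worker process that owns the GPU drains the queue:

```python
# Sketch of a Redis-backed job queue: the API enqueues, a GPU worker dequeues.
# "inference-jobs" and the JSON payload shape are placeholder choices.
import json

import redis

r = redis.Redis(host="localhost", port=6379)

def enqueue_request(image_b64: str, request_id: str) -> None:
    """Called from the Django view instead of running inference inline."""
    r.rpush("inference-jobs", json.dumps({"id": request_id, "image": image_b64}))

def gpu_worker(run_inference) -> None:
    """Separate process that owns the GPU and drains the queue."""
    while True:
        _, payload = r.blpop("inference-jobs")    # blocks until a job arrives
        job = json.loads(payload)
        result = run_inference(job["image"])      # your object-detection call
        r.set(f"result:{job['id']}", json.dumps(result))
```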
Google Cloud offers Cloud GPUs. If you are looking to perform heavier computations in applications that require real-time capabilities, I would suggest you look into the following link for more information.
https://cloud.google.com/gpu/
Compute Engine also provides GPUs that can be added to your virtual machine instances. Use GPUs to accelerate specific workloads on your instances such as Machine Learning and data processing.
https://cloud.google.com/compute/docs/gpus/
However, if your application requires a lot of resources, you'll need enough GPU quota available in your project, and you should make sure to pick a zone where GPUs are available. If you need more computing power, you would need to submit a request to increase your quota. https://cloud.google.com/compute/docs/gpus/add-gpus#create-new-gpu-instance
Since you would be using the TensorFlow API for your application on ML Engine, I would advise you to take a look at the link below. It provides instructions for creating a Deep Learning VM instance with TensorFlow and other tools pre-installed.
https://cloud.google.com/ai-platform/deep-learning-vm/docs/tensorflow_start_instance
I'm confused about some facets of the Amazon Web Services stuff. Here is what I want to do.
My site lets users enter equations and solve them. Some of the equations will deal with large data sets and math that is too computationally expensive for the browser.
My site will look at each equation and determine if it should be solved in the browser or on a server.
If it needs to be solved on the server, I want to do one of two things: either send the data and a function and have AWS run that code on the data, or have preset code on the server that is simply given the data.
AWS then runs the code and returns a JSON of the solution.
For example, let's say a user has a numeric matrix of 1,000 by 1,000 and wants to take the inverse or do Gaussian elimination. My code would look at the size of the matrix and decide that it needs to be run on the server. The code would then call my function on AWS to solve this, send it the data, and AWS would return the answer.
From what I've read, I don't understand exactly how to set up EC2 so that I can call a function on it from the browser or from an AJAX call. Does AWS not do what I think it does? Do I need to host my site on AWS to do this?
If it matters, I am running a LAMP stack on Hostmonster.
You can use Amazon EC2 to create a server (eg a web server) that is accessible on the Internet. What you load on the server, and how you use the server, is up to you.
There is no functionality provided by Amazon EC2 that would help you for your specific stated use case. Anything you would run on a "normal" server can be run on Amazon EC2, since it is just a virtual machine running an operating system and whatever software you configure.
From your description, you will need to develop a web app that runs mostly in the browser (eg with JavaScript), but also makes calls to a back-end server. How you do that is totally in your control.
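To make that concrete, here is one hedged sketch of such a back end: a small Flask (Python) service on EC2 that accepts a matrix as JSON and returns its inverse, which your page would call via AJAX whenever it decides the matrix is too large for the browser. The route name and port are arbitrary choices, not anything AWS-specific:

```python
# Sketch of a tiny compute endpoint for the matrix example (names are arbitrary).
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/invert", methods=["POST"])
def invert():
    # Expect {"matrix": [[...], [...], ...]} in the request body.
    matrix = np.array(request.get_json()["matrix"], dtype=float)
    try:
        result = np.linalg.inv(matrix)
    except np.linalg.LinAlgError:
        return jsonify({"error": "matrix is singular"}), 400
    return jsonify({"solution": result.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Since your site itself stays on Hostmonster, the browser would be making a cross-origin request to this instance, so you would also need to enable CORS on the endpoint (and ideally put it behind HTTPS).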