under the assumptions that the model training itself is very fast, I'm wondering what is the best practice to spin up ~ > 1K models endpoints
fast as possible.
Thanks for any hint
Christian
Assuming these are different models (not production variants for testing), you'll need one endpoint per model and thus one SageMaker instance. Probably not the greatest option (cost, time to spin up instances, synchronous calls, API throttling, etc). For now, I'd use another service to deploy, e.g. an ECS cluster.
Could you please tell me a little more about your use case (business problem, framework, model size, etc)? You're not the first one to ask about this capability and your feedback would be very valuable in building the best solution.
Julien (AWS)
Related
I have an API (python+FastAPI) with different endpoints. One of these endpoints manages images, and apply different models (up to 4) one after another one. Furthermore, there can be a preprocessing and postprocessing between these models are applied in series.
What is the best approach to scale this kind of endpoints using AWS? I have in mind pages such as huggingface where you can try models and they run fast even they can have many calls at the same time. Or for example, pages like https://interiorai.com.
Right now I have an instance inside an autoscaling group, which adds images to a queue using celery+rabbitmq, and then process images one by one since celery can only work with maximum 1 worker if you need to use GPU.
Should I look for real-time instance inference? Is there any better aproximations? Do you have a good example on how to start with this?
Your best bet is SageMaker real-time inference, with this you can also enable AutoScaling. You can also utilize SageMaker Serial Inference Pipelines to stitch together multiple models (in containers) behind just one endpoint.
i just arrived on this architecture, am doing a lot of research and i understood how it work in general but it's all theorical.
I decided to separate each step for the development of this architecture to start implementing so i can understand better these steps.
The first that i wanted to learn was the tenant provisioning, i wanted to apply it on AWS to mirror a production software example.
So, starting on that the common AWS service that i see most people using is AWS Cognito, but it's not clear in my mind the steps of the implementation, like how should i get the tenant data to onboard him in my app? Assuming it's tier based.
Should i have one database to store all tenants data separate from the application database?
I want to use microservices on this one because i think is better to onboard the tenant with different tiers and much more benefits.
Which AWS services should i use to make this process work? I'm not really asking about the implementation itself but a path to understand which services to use and how it connects with each other.
I hope i was clear about my doubts, english is not my mother tongue, sorry about that!
You are thinking in the right direction. However, there are decisions you need to make before diving into any saas service stack. I would start with
Planning my infrastructure - how many tenants/group.
the kind of tenant onboarding system you want
How will tenants onboard their users and manage authorization/authentication
Multitenant architecture, which needs to account for several things at the least like - DB model, shared vs isolated, data privacy, design keeping in mind industry data security standards
what will be your tenant deployment model. Remember one of the disadvantages of multitenancy is also slow time to market.
Your API stack needs to account for which apis needs to be multitenant and which are generic product offerings.
operational tool to monitor app health, client analytics.
how will you meter and bill the client and other non-functional decisions.
AWS offers good documentation to get started here : https://aws.amazon.com/blogs/apn/building-a-multi-tenant-saas-solution-using-aws-serverless-services/
This may be a very trivial question but its something I could not find a convincing answer.
I am building a web application with Django as the backend.
My application has a user profile which can take a profile picture. I have to decide on
cloud storage solution.
I wanted to ask what is best strategy for such a task. Storing on cloud, or storing on the web server itself. I want to know the pros and cons (ignoring cost).
Also if I decide to go with cloud storage, what are the best for such a task. Here its profile picture (limiting single picture to say less than 1MB), so taking into account latency etc.
I have worked on AWS S3 and have a solution in place to store these images on S3. But I wanted to have opinions.
We currently run a Java backend which we're hoping to move away from and switch to Node running on AWS Lambda & Serverless.
Ideally during this process we want to build out a fully service orientated architecture.
My question is if our frontend angular app requests the current user's ordered items to get that information it would need to hit three services, the user service, the order service and the item service.
Does this mean we would need make three get requests to these services? At the moment we would have a single endpoint built for that specific request, which can then take advantage of DB joins for optimal performance.
I understand the benefits SOA, but how to do we scale when performing more compex requests such as this? Are there any good resources I can take a look at?
Looking at your question I would advise to align your priorities first: why do you want to move away from the Java backend that you're running on now? Which problems do you want to overcome?
You're combining the microservices architecture and the concept of serverless infrastructure in your question. Both can be used in conjunction, but they don't have to. A lot of companies are using microservices, even bigger enterprises like Uber (on NodeJS), but serverless infrastructures like Lambda are really just getting started. I would advise you to read up on microservices especially, e.g. here are some nice articles. You'll also find answers to your question about performance and joins.
When considering an architecture based on Lambda, do consider that there's no state whatsoever possible in a Lambda function. This is a step further then stateless services that we usually talk about; they generally target 'client state' that does not exist anymore. But a Lambda function cannot have any state, so e.g. a persistent DB-connection pool is not possible. For all the downsides, there's also a lot of stuff you don't have to deal with which can be very beneficial, especially in terms of scalability.
I have a web app running on php, mysql, apache on a virtual windows server. I want to redesign it so it is scalable (for fun so I can learn new things) on AWS.
I can see how to setup an EC2 and dump it all in there but I want to make it scalable and take advantage of all the cool features on AWS.
I've tried googling but just can't find a simple guide (note - I have no command line experience of Linux)
Can anyone direct me to detailed resources that can lead me through the steps and teach me? Or alternatively, summarise the steps in an answer so I can research based on what you say.
Thanks
AWS is growing and changing all the time, so there aren't a lot of books to help. Amazon offers training that's excellent. I took their three day class on Architecting with AWS that seems to be just what you're looking for.
Of course, not everyone can afford to spend the travel time and money to attend a class. The AWS re:Invent conference in November 2012 had a lot of sessions related to what you want, and most (maybe all) of the sessions have videos available online for free. Building Web Scale Applications With AWS is probably relevant (slides and video available), as is Dissecting an Internet-Scale Application (slides and video available).
A great way to understand these options better is by fiddling with your existing application on AWS. It will be easy to just move it to an EC2 instance in AWS, then start taking more advantage of what's available. The first thing I'd do is get rid of the MySql server on your own machine and use one offered with RDS. Once that's stable, create one or more read replicas in RDS, and change your application to read from them for most operations, reading from the main (writable) database only when you need completely current results.
Does your application keep any data on the web server, other than in the database? If so, get rid of all local storage by moving that data off the EC2 instance. Some of it might go to the database, some (like big files) might be suitable for S3. DynamoDB is a good place for things like session data.
All of the above reduces the load on the web server to just your application code, which helps with scalability. And now that you keep no state on the web server, you can use ELB and Auto-scaling to automatically run multiple web servers (and even automatically launch more as needed) to handle greater load.
Does the application have any long running, intensive operations that you now perform on demand from a web request? Consider not performing the operation when asked, but instead queueing the request using SQS, and just telling the user you'll get to it. Now have long running processes (or cron jobs or scheduled tasks) check the queue regularly, run the requested operation, and email the result (using SES) back to the user. To really scale up, you can move those jobs off your web server to dedicated machines, and again use auto-scaling if needed.
Do you need bigger machines, or perhaps can live with smaller ones? CloudWatch metrics can show you how much IO, memory, and CPU are used over time. You can use provisioned IOPS with EC2 or RDS instances to improve performance (at a cost) as needed, and use difference size instances for more memory or CPU.
All this AWS setup and configuration can be done with the AWS web console, or command-line tools, or SDKs available in many languages (Python's boto library is great). After learning the basics, look into CloudFormation to automate it better (I've written a couple of posts about that so far).
That's a bit of the 10,000 foot high view of one approach. You'll need to discover the details of each AWS service when you try to use them. AWS has good documentation about all of them.
Depending on how you look at it, this is more of a comment than it is an answer, but it was too long to write as a comment.
What you're asking for really can't be answered on SO--it's a huge, complex question. You're basically asking is "How to I design a highly-scalable, durable application that can be deployed on a cloud-based platform?" The answer depends largely on:
The specifics of your application--what does it do and how does it work?
Your tolerance for downtime balanced against your budget
Your present development and deployment workflow
The resources/skill sets you have on-staff to support the application
What your launch time frame looks like.
I run a software consulting company that specializes in consulting on Amazon Web Services architecture. About 80% of our business is investigating and answering these questions for our clients. It's a multi-week long project each time.
However, to get you pointed in the right direction, I'd recommend that you look at Elastic Beanstalk. It's a PaaS-like service that abstracts away the underlying AWS resources, making AWS easier to use for developers who don't have a lot of sysadmin experience. Think of it as "training wheels" for designing an autoscaling application on AWS.