Recommended way to run a web server on demand, with auto-shutdown (on AWS)

Recommended way to run a web server on demand, with auto-shutdown (on AWS) - amazon-web-services

I am trying to find the best way to architect a low cost solution to provide an on-demand web server for a certain amount of time.
The context is as follows: I have some large amount of data sitting on S3. From time to time, users will want to consult that data. I've written a Flask app that can display the data in a nice way for them. Beign poorly written, it really only accepts a single user session at the time. Currently therefore they have to download the Flask app and run it on their own machine.
I would like to find a way for users to request a cloud-based web server that would run the Flask app (through a docker container for example) on-demand, and give them access to it quickly, without having to do much if anything on their own machine.
Every user wanting to view the data would have their own web server created on demand (to avoid multiple users sharing the same web server, which wouldn't work with my Flask app)
Critically, and in order to avoid cost, the web server would terminate itself automatically after some (configurable) idle time (possibly with the Flask app informing the user that it's about to shut down, so that they can "renew" the lease).
Initially I thought that maybe AWS Fargate would be good: it can run docker instances, is quite configurable in terms of CPU/disk it can get (my Flask app is resource-hungry), and at least on paper could be used in a way that there is zero cost when users are not consulting the data (bar S3 costs naturally). But it's when it comes to the detail that I'm not sure...
How to ensure that every new user gets their own Fargate instance?
How to shut-down the instance automatically after idle time?
Is Fargate quick enough in terms of boot time?

The closest I can think is AWS App Runner. It's built on top of Fargate and it provides an intelligent scale out mechanism (probably you are not interested in this) as well as a scale to (almost) 0 capability. The way it works is that when the endpoint is solicited and it's doing work you pay for the entire fargate task (cpu/memory) you have selected in the configuration. If the endpoint is doing nothing you only pay for the memory (note the memory cost is roughly 20% of the entire cost so it's not scale to 0 but "quasi"). Checkout the pricing examples at the bottom of this page.
Please note you can further optimize costs by pausing/starting the endpoint (when it's paused you pay nothing) but in that case you need to create the logic that pauses/restarts it.
Another option you may want to explore is using Lambda this way (which would allow you to use the same container image and benefit from the intrinsic scale to 0 of Lambda). But given your comment "Lambda doesn’t have enough power, and the timeout is not flexible" may be a show stopper.

Related

AWS EC2 multi-tenant no scale up: which is the simplest way to implement it?

the need: I have a team of people who are not particularly tech savvy but we still want them to be able to access a jupyter notebook in a hosted ec2 instance (which has got access to a variety of resources in AWS). Not all of the people use this instance all the time but it's reasonable to believe that that the instance will be used continuosly by one person or the other throughout the day ( and we do not have budget to scale up things on demand).
the solution: to use a single medium sized instance in a multi-tenant fashion so that basically every user could link up to it and, as more users connect, resources are re-distributed (meaning the already present users will witness reduced capacity in RAM and CPU - up to a limit as CPUs are finite in number) and, as they disconnect, the opposite happens.
the problem & question: I saw there are some pretty beefy/complicated infrastructure designs for this. my question is: what's the simplest (=least complicated/expensive to maintain solution) architecture to cater such need?
p.s. on security: all users are equal, there are not users who have got more rights than others, instance will be accessible just via a pwd, no ID management is required.
p.s. on caching sessions: after a certain period of inactivity (or you close down the browser in this case), session will be terminated on its own. no session caching needs to be employed

Display real time data on website that scales?

I am starting a project where I want to create a website which will display LIVE flight information and status. We all have seen this at airport. An example is given here - http://www.computronics.biz/productimages/prodairport4.jpg. As you can see this information changes continuously. The website will talk to a backend api and the this backend api will talk to database. Now the important part is that the flight information in the database will be updated by the airline itself. There could be several airlines and they will update their data respectively. I have drawn a diagram and uploaded here - https://imgur.com/a/ssw1S.
Now those airlines will obviously have an interface (website talking to some backend API) through which they will update the database.
Now here is my attempt to solve it. We need to have some sort of trigger such that if any airline updates a flight detail in the database between current time - 1 hour to current + 4 hours (website will only display few hours of flights), we need to call the web api and then send the update to the website in the real time. The user must not refresh the page at all. At the same time the website needs to scale well i.e. if 1 million users are on the website, and there is an update in the database in the correct time range, all 1 million user's website should get updated within a decent amount of time.
I did some research and it looks like we need to have an event based approach. For example - we need to create a function (AWS lambda or Azure function) that should be called whenever there is an update in the database (Dynamo DB for example) within the correct time range. This function then should call an API which should then update the website through web socket technology for example.
I am not looking for any code but just some alternative suggestions on how this can be solved in a scalable way. Also how do we test scalability?

Dont use serverless functions(Lambda/Azure functions)
Although I am a huge fan of serverless functions, and currently running a full web app in Lambda, I don't think its needed for your use case and doesn't make sense economically. As you've answered in the comments, each airline will not write directly to the database, they'll push to an API, meaning you are explicitly told when flights have changed. When an airline has sent you new data you can simply propagate this to all the browser endpoints via websockets. This keeps the design very simple. There is no need to artificially create a database event that then triggers a function that will then tell you a flight has been updated. Thats like removing your doorbell and replacing it with a motion detector that triggers a doorbell :)
Cost
Money always deserves its own section. Lambda is more of an economic break through than a technological one. You have to know when its cost effective. You pay per request so if your dealing with a process that handles 10,000 operations a month, or something that only fires 1,000 times a day, than lambda is dirt cheap and practically free. You also pay for the length of time the function is executing and the memory consumed while executing. Generally, it makes sense to use lambda functions where a dedicated server would be sitting idle for most of the time. So instead of a whole EC2 instance, AWS provides you with a container on demand. There are points at which high requests rates and constantly running processes makes lambda more expensive than EC2. This article discusses how generally its cheaper to use lambda up to a point -> https://www.trek10.com/blog/lambda-cost/ The same applies to Azure functions and googles equivalent. They are all just containers offered on demand.
If you're dealing with flight information I would imagine you will have thousands of flights being updated every minute so your lambda functions will be firing constantly as if you were running an EC2 instance. You will end up paying a lot more than EC2. When you have a service that needs to stay up 24/7 and run 24/7 with high activity that is most certainly a valid use case for a dedicated server or servers.
Proposed Solution
These are the components I would use below:
Message Queue of some sort (RabbitMQ or AWS SQS with SNS perhaps)
Web Socket Backend (The choice will depend on programming language)
Airline input API (REST,GraphQL, or maybe AWS Kinesis Data Firehose)
The airlines publish their data to a back-end api. The updates are stored on a message queue and the web applicaton that actually displays the results to users, via websockets, reads from the queue.
Scalability
For scalability you can run the websocket application on multiple EC2 instances (all reading from the same queuing service) in an autoscaling group, so with extra load more instances will be created automatically hence the name "autoscaling". And those instances can sit behind an elastic load balancer. Lots of AWS documentation on how to do this and its their flagship design pattern. If you use AWS SQS you don't have to manage the scalability details yourself, aws handles that. The only real components to scale are your websocket application and the flight data input endpoint. You can run the flight api in an autoscaling group as well but AWS does offer an additional tool for high traffic data processing. I detail that below.
Testing Scalability
It would be fairly easy to have a mock airline blast your service with thousands and thousands of fake updates and on the other end you can easily run multiple threads of selenium tests simulating browser clicks and validating that the UI is still operational.
Additional tools
If it ends up being large amounts of data, rather than using a conventional REST api for your flight update service you could consider a service AWS offers specifically for dealing with large amounts of real time updates (Kinessis Data Firehose) https://aws.amazon.com/kinesis/data-firehose/ But I've never used it.

First, please don't over think this. This is a trivial problem to solve and doesn't require any special techniques, technologies or trendy patterns & frameworks.
You actually have three functional areas you can address almost separately.
Ingestion - Collection and normalization of the data from the various sources. For this, you'll need a process and transformation engine, LogicApps or such.
Your databases. You'll quickly learn that not all flights are the same ;). While it might seem so, the amount of data isn't that much. Instances of MySQL/SQL Server tuned for a particular function will work just fine. Hint, you don't need to have data for every movement ready to present all the time.
Presentation. The data API and UIs. This, really, is the easy part. I would suggest you use basic polling at first. For reasons you will never have any control over, the SLA for flight data is ~5 minutes so a real-time client notification system is time you should spend elsewhere at first.

I need feedback on this partly serverless architecture design

I want to host a scalable blog or application of this sort in nodeJS on AWS making use of AWS technologies. The idea here is to have a small EC2 server that is not responsible for serving the website, but only for running the CMS/admin panel. While these operations could be serverless as well, I think having a dedicated small VM EC2 instance could be more efficient, and works better with existing frameworks, etc.
In my diagram above, you can see there's two type of users audiences and admin/writers. Admin CRUD operations also cause lambda to run. Lambda generates the static site after Admin changes, which is delivered to S3. Users are directed to the static site hosted in S3. Only admins/writers have access to the server-connecting part of the site.
I think this is a good design for an extremely scalable and relatively cheap site, as long as the user-facing side is all static. An alternative to this is a CDN, but then I have to deal with cache invalidation issues, a site that updates slower, and a larger server.
This seems like a win-win to me. Feedback?

This ought to be a comment rather than an answer, but as I don't have enough points...
There are a couple of other considerations for this architecture. Lambda functions are great for scaling out microservices horizontally with each small function being executed in parallel tens or hundreds of times. Generation of a static site is typically a single threaded operation so you may not see the gains you expect, you'll also need to watch the timeout period (maximum 300 seconds currently) and make sure that you can generate the site in that time. Of course if you are not running Lambda code you are not getting charged.
For your admin frontend I would suggest ElasticBeanstalk, even if you peg it at a single instance, it gives you lots of great features like rolling updates.
Good luck with the project.

How to convert a WAMP stacked app running on a VPS to a scalable AWS app?

I have a web app running on php, mysql, apache on a virtual windows server. I want to redesign it so it is scalable (for fun so I can learn new things) on AWS.
I can see how to setup an EC2 and dump it all in there but I want to make it scalable and take advantage of all the cool features on AWS.
I've tried googling but just can't find a simple guide (note - I have no command line experience of Linux)
Can anyone direct me to detailed resources that can lead me through the steps and teach me? Or alternatively, summarise the steps in an answer so I can research based on what you say.
Thanks

AWS is growing and changing all the time, so there aren't a lot of books to help. Amazon offers training that's excellent. I took their three day class on Architecting with AWS that seems to be just what you're looking for.
Of course, not everyone can afford to spend the travel time and money to attend a class. The AWS re:Invent conference in November 2012 had a lot of sessions related to what you want, and most (maybe all) of the sessions have videos available online for free. Building Web Scale Applications With AWS is probably relevant (slides and video available), as is Dissecting an Internet-Scale Application (slides and video available).
A great way to understand these options better is by fiddling with your existing application on AWS. It will be easy to just move it to an EC2 instance in AWS, then start taking more advantage of what's available. The first thing I'd do is get rid of the MySql server on your own machine and use one offered with RDS. Once that's stable, create one or more read replicas in RDS, and change your application to read from them for most operations, reading from the main (writable) database only when you need completely current results.
Does your application keep any data on the web server, other than in the database? If so, get rid of all local storage by moving that data off the EC2 instance. Some of it might go to the database, some (like big files) might be suitable for S3. DynamoDB is a good place for things like session data.
All of the above reduces the load on the web server to just your application code, which helps with scalability. And now that you keep no state on the web server, you can use ELB and Auto-scaling to automatically run multiple web servers (and even automatically launch more as needed) to handle greater load.
Does the application have any long running, intensive operations that you now perform on demand from a web request? Consider not performing the operation when asked, but instead queueing the request using SQS, and just telling the user you'll get to it. Now have long running processes (or cron jobs or scheduled tasks) check the queue regularly, run the requested operation, and email the result (using SES) back to the user. To really scale up, you can move those jobs off your web server to dedicated machines, and again use auto-scaling if needed.
Do you need bigger machines, or perhaps can live with smaller ones? CloudWatch metrics can show you how much IO, memory, and CPU are used over time. You can use provisioned IOPS with EC2 or RDS instances to improve performance (at a cost) as needed, and use difference size instances for more memory or CPU.
All this AWS setup and configuration can be done with the AWS web console, or command-line tools, or SDKs available in many languages (Python's boto library is great). After learning the basics, look into CloudFormation to automate it better (I've written a couple of posts about that so far).
That's a bit of the 10,000 foot high view of one approach. You'll need to discover the details of each AWS service when you try to use them. AWS has good documentation about all of them.

Depending on how you look at it, this is more of a comment than it is an answer, but it was too long to write as a comment.
What you're asking for really can't be answered on SO--it's a huge, complex question. You're basically asking is "How to I design a highly-scalable, durable application that can be deployed on a cloud-based platform?" The answer depends largely on:
The specifics of your application--what does it do and how does it work?
Your tolerance for downtime balanced against your budget
Your present development and deployment workflow
The resources/skill sets you have on-staff to support the application
What your launch time frame looks like.
I run a software consulting company that specializes in consulting on Amazon Web Services architecture. About 80% of our business is investigating and answering these questions for our clients. It's a multi-week long project each time.
However, to get you pointed in the right direction, I'd recommend that you look at Elastic Beanstalk. It's a PaaS-like service that abstracts away the underlying AWS resources, making AWS easier to use for developers who don't have a lot of sysadmin experience. Think of it as "training wheels" for designing an autoscaling application on AWS.

What configurations need to be set for a LAMP server for heavy traffic?

I was contracted to make a groupon-clone website for my client. It was done in PHP with MYSQL and I plan to host it on an Amazon EC2 server. My client warned me that he will be email blasting to about 10k customers so my site needs to be able to handle that surge of clicks from those emails. I have two questions:
1) Which Amazon server instance should I choose? Right now I am on a Small instance, I wonder if I should upgrade it to a Large instance for the week of the email blast?
2) What are the configurations that need to be set for a LAMP server. For example, does Amazon server, Apache, PHP, or MySQL have a maximum-connections limit that I should adjust?
Thanks

Technically, putting the static pages, the PHP and the DB on the same instance isn't the best route to take if you want a highly scalable system. That said, if the budget is low and high availablity isn't a problem then you may get away with it in practise.
One option, as you say, is to re-launch the server on a larger instance size for the period you expect heavy traffic. Often this works well enough. You problem is that you don't know the exact model of the traffic that will come. You will get a certain percentage who are at their computers when it arrives and they go straight to the site. The rest will trickle in over time. Having your client send the email whilst the majority of the users are in bed, would help you somewhat, if that's possible, by avoiding the surge.
If we take the case of, say, 2,000 users hitting your site in 10 minutes, I doubt a site that hasn't been optimised would cope, there's very likely to be a silly bottleneck in there. The DB is often the problem, a good sized in-memory cache often helps.
This all said, there are a number of architectural design and features provided by the likes of Amazon and GAE, that enable you, with a correctly designed back-end, to have to worry very little about scalability, it is handled for you on the most part.
If you split the database away from the web server, you would be able to put the web server instances behind an elastic load balancer and have that scale instances by demand. There also exist standard patterns for scaling databases, though there isn't any particular feature to help you with that, apart from database instances.
You might want to try Amazon mechanical turk, which basically lots of people who'll perform often trivial tasks (like navigate to a web page click on this, etc) for a usually very small fee. It's not a bad way to simulate real traffic.
That said, you'd probably have to repeat this several times, so you're better off with a load testing tool. And remember, you can't load testing a time-slicing instance with another time-slicing instance...

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js