I am building a system on AWS that will have an extremely high volume of CRUD transactions. What considerations must I keep in mind, given that none of the data may be lost?
My research so far suggests using SQS queues to make sure data is not lost. What other backup, redundancy, and fast-processing considerations should I keep in mind?
If you want to create a system that is highly resilient as well as redundant, I would advise you to read the AWS Well-Architected Framework. It goes into more detail than anyone can provide on Stack Overflow.
Regarding individual technologies:
Since your workload is transactional, as you said, you should look at a relational data store. I'd recommend taking a look at Amazon Aurora: it has built-in features such as auto scaling of read replicas and multi-master support. Even if you are expecting large numbers, with auto scaling you will only pay for what you use.
Try to decouple your APIs: if you can, put a thin validation layer in front before handing off to your backend. Technologies like SQS (as you mentioned) help with decoupling when combined with Lambda.
SQS guarantees at-least-once delivery, so if your system must not write duplicates you'll need to build idempotency into your application.
Also use a dead letter queue (DLQ) to handle any failed actions.
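As a minimal sketch of that idempotency point: record each SQS message ID before performing side effects, and skip any redelivery. The in-memory set below stands in for a durable store such as a DynamoDB table with a conditional put, and all names are illustrative.

```python
import json

# Stand-in for a durable store of processed message IDs
# (in production: DynamoDB conditional put, a unique DB constraint, etc.)
_processed_ids = set()

def handle_message(message_id, body, processed_ids=_processed_ids):
    """Process an SQS message at most once.

    Returns True if the message was processed, False if it was a
    duplicate delivery (SQS is at-least-once, not exactly-once).
    """
    if message_id in processed_ids:
        return False  # duplicate redelivery: skip the side effects
    payload = json.loads(body)
    # ... perform the actual write/transaction with `payload` here ...
    processed_ids.add(message_id)
    return True
```

In a real consumer you would poll with boto3's `sqs.receive_message`, call `delete_message` only after `handle_message` succeeds, and let the queue's redrive policy move repeatedly failing messages to the DLQ.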
Ensure any resources residing in your VPC are spread across availability zones.
Use S3 versioning, AWS Backup and RDS snapshots to ensure data is backed up. Most other services have some sort of backup functionality you can enable.
Use autoscaling wherever possible to ensure you're reducing costs.
Build any infrastructure using an IaC tool (CloudFormation or Terraform), and provision resources with a tool like Ansible, Puppet or Chef. Try to follow a pre-baked AMI workflow so it is quick to return to the base server state.
Hey guys I need advice in order to make the right architectural decision.
I need to be able to run a console application (or a Docker container in the future) in different locations (countries/cities) without paying for hundreds of always-running virtual machines.
In other words, I need to press a button and run the application for a couple of hours on a server in New York; on the next press, the same application will run in Istanbul.
The straightforward approach is to buy hundreds of virtual machines, but there are two problems with it:
It's too expensive.
Probably only a couple of them will be used but I'll have to pay for all of them.
What can you recommend?
Does Azure support it? Or maybe AWS?
First, cloud service providers work on a per-region basis rather than per city (New York etc. as you mentioned), but you can always choose the region nearest to the country/city in which you want to run your application. You can also try cloudping to find the nearest AWS region.
In other words, I need to press the button and run the application for a couple of hours on a server in New York, next press, and the same application will be run in Istanbul.
So I recommend a Docker container: since you want to run the same application in different regions, a container is better than maintaining AMIs.
AWS Fargate is designed for pay-as-you-go use with zero server maintenance: you just specify the Docker image and run your application, and AWS takes care of the underlying resources.
AWS Fargate is a serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Fargate makes it easy for you to focus on building your applications. Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.
As you mentioned:
without paying for hundreds always running virtual machines.
So there is no upfront payment; you only pay for the compute hours used by your application while the container is running.
With AWS Fargate, there are no upfront payments and you only pay for the resources that you use. You pay for the amount of vCPU and memory resources consumed by your containerized applications.
AWS Fargate pricing
For deployment I recommend Terraform: you only need to define the resources once and parameterize the region, so the same configuration works for every region.
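To make the "press the button" flow concrete, here is a sketch using boto3 (the AWS SDK for Python). The cluster name, task definition and subnet IDs are placeholders for per-region resources you would create up front (e.g. with Terraform); the parameter-building helper is split out so the request shape is easy to inspect.

```python
def fargate_run_task_params(cluster, task_definition, subnets):
    """Build the request for ecs.run_task with the Fargate launch type."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,  # one copy of the container
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "assignPublicIp": "ENABLED",
            }
        },
    }

def run_in_region(region, cluster, task_definition, subnets):
    """'Press the button': start the container in the chosen region."""
    import boto3  # AWS SDK for Python
    ecs = boto3.client("ecs", region_name=region)
    return ecs.run_task(**fargate_run_task_params(cluster, task_definition, subnets))
```

Calling `run_in_region("us-east-1", ...)` today and `run_in_region("eu-central-1", ...)` tomorrow runs the same image near New York and near Istanbul respectively, billed only while the task runs.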
I am new to Amazon AWS and, as a freelancer, I am not clear on how I would facilitate dozens of clients using AWS. I average 5 clients per month. How would I do billing and set up instances for multiple clients? I have been using GoDaddy for a long time and they have a pro user dashboard that manages all of that.
You should create a separate AWS account for each client. If you are handling the AWS payments, then you could use AWS Organizations to combine the accounts into a single bill. You will be able to split the billing report into accounts to see exactly what each client owes you for AWS services.
This will also allow you to hand over an AWS account to a client, or provide their developers with access if they need it, without compromising your other clients in any way.
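A sketch of that setup with boto3 (the AWS SDK for Python): each member account needs a unique email address, and plus-addressing is a common way to derive one per client from a single inbox. All names here are illustrative, and the call requires AWS Organizations to be enabled with suitable permissions.

```python
def member_account_email(base_email, client_name):
    """Derive a unique per-client email via plus-addressing,
    e.g. billing@example.com -> billing+acme@example.com."""
    local, _, domain = base_email.partition("@")
    return f"{local}+{client_name}@{domain}"

def create_client_account(client_name, base_email):
    """Create a member account under your AWS Organization."""
    import boto3  # AWS SDK for Python
    org = boto3.client("organizations")
    return org.create_account(
        Email=member_account_email(base_email, client_name),
        AccountName=client_name,
    )
```

Account creation is asynchronous; in practice you would poll the returned `CreateAccountStatus` before handing credentials to the client.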
If you are the only person who can access the AWS services (eg management console, create resources, etc), then MarkB's suggestion is sound: create separate AWS accounts under an Organization, then bill the customers for their usage.
Another benefit of this method is that you might want to charge your clients a fixed amount per month, or an uplift (eg extra 20% on top of AWS costs) for your service of managing their account and taking care of payments.
If, however, your clients have the ability to create resources under AWS, you might want to have them set up the AWS accounts so that AWS bills them directly. This is because your clients might create resources that cost additional money and might then claim that they didn't realise the impact of what they were doing, thus leaving you with a bill that they don't want to pay.
Let's assume the standard data engineering problem:
every day at 3.00 AM connect to an API
download data
store them in a data lake
Let's say there is a python script that does the API hit and storage, but that is not that important.
Ideally I would like to have some service that comes alive, runs this script and kills itself... So far, I thought about those possibilities (using AWS services):
(AWS) Lambda - FaaS, an ideal match for the use case. But there is a problem: the limited resources of the function (RAM/CPU) and the timeout of 5 mins.
(AWS) Lambda + Step Functions + range requests: fire multiple Lambdas in parallel, each downloading a part of the file. Coordination via Step Functions. It solves the issue of 1) but it feels very complicated.
(AWS EC2) Static VM: the classic approach: I have a VM, I have a Python interpreter, I have a cron job -> every night I run the script. Or, every night I can trigger a build of a new EC2 machine using CloudFormation, run the script, and then kill the machine. Problem: it feels very old-school - there has to be a better way to do it.
(AWS ECS) Docker: I have very little experience with Docker. Probably similar to the VM case, but it feels more versatile/controllable. I don't know if there is a good orchestrator for this kind of job, or how easy it is (starting a container and killing it).
How I see it:
1) Exactly what I would like to have, but not good for downloading big data because of the resource constraints.
2) A complicated workaround for 1).
3) Feels very old-school, with additional DevOps expenses.
4) I don't know a lot about this topic; it feels like the current state of the art.
My question is: what is the current state-of-art for this kind of job? What services are useful and what are the experiences with them?
A variation on #3... Launch a Linux Amazon EC2 instance with a User Data script, with Shutdown Behavior set to Terminate.
The User Data script performs the download and copies the data to Amazon S3. It then executes sudo shutdown -h to turn off the instance. (Or, if the script is complex, the User Data script can download a program from an S3 bucket, then execute it.)
Linux EC2 instances are now charged per-second, so think of it like a larger version of Lambda that has more disk space and does not have a 5-minute limit.
There is no need to use CloudFormation to launch the instance because then you'd just need to delete the CloudFormation stack. Instead, just launch the instance directly with the necessary parameters. You could even create a Launch Template with the parameters and then simply launch an instance using the Launch Template.
You could even add a few smarts to the process and launch the instance using Spot Pricing (set the bid price to normal On-Demand pricing, since worst case you'll just pay the normal price). If the Spot Instance won't launch due to insufficient spare capacity, then launch an On-Demand instance instead.
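The launch described above can be sketched with boto3 (the AWS SDK for Python). The AMI ID, script paths and bucket name are placeholders; the key parameter is the shutdown behavior, which makes a plain `shutdown -h` terminate the instance rather than merely stop it.

```python
# Placeholder User Data script: download, copy to S3, power off.
USER_DATA = """#!/bin/bash
python3 /opt/fetch.py                              # hit the API, write /tmp/data.json
aws s3 cp /tmp/data.json s3://example-data-lake/   # store in the data lake
shutdown -h now                                    # power off -> instance terminates
"""

def nightly_job_params(ami_id, instance_type="t3.micro"):
    """Build the request for ec2.run_instances for a self-terminating job."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "UserData": USER_DATA,
        # Makes OS shutdown terminate the instance instead of stopping it:
        "InstanceInitiatedShutdownBehavior": "terminate",
    }

def launch_nightly_job(ami_id, region="us-east-1"):
    import boto3  # AWS SDK for Python
    ec2 = boto3.client("ec2", region_name=region)
    return ec2.run_instances(**nightly_job_params(ami_id))
```

A scheduled trigger (e.g. a cron entry or a CloudWatch Events rule invoking a tiny Lambda) can call `launch_nightly_job` at 3:00 AM, and the instance cleans itself up when the script finishes.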
I have a web application with very fluctuating traffic: anywhere from 30-40 users daily to thousands of people simultaneously. It's a ticketing app, so this kind of behavior is here to stay, and I want to make a strategic choice. I don't want to buy a host with a high configuration because it would just sit idle most of the time. We're running a Node.js server, so we usually run low on RAM. My question is this: what are my options, and how difficult is it to go from a normal VPS to something like Microsoft Azure, Google Cloud, or AWS?
It's difficult to be specific without knowing more about your application architecture but both AWS Lambda and Google App Engine offer 'serverless architecture' and support Node.js. Serverless architectures allow you to host code directly rather than running servers and associated infrastructure. Scaling is given to you by the services, costs are based on consumption and you can configure constraints and alerts to prevent racking up huge unexpected bills. In both instances you would need to front the services with additional Google or AWS services to make them accessible to customers, but these offer a great way to scale and pay only for what you need.
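To show what "hosting code directly" looks like, here is a minimal handler in the shape AWS Lambda expects behind an API Gateway proxy integration (the event and response shapes are the standard ones; the greeting logic is just a stand-in for real app code, which in your case would be Node.js rather than Python):

```python
def handler(event, context=None):
    """Minimal Lambda handler: read a query parameter, return an
    API Gateway proxy response. Scaling and server management are
    handled entirely by the service."""
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {"statusCode": 200, "body": f"hello {name}"}
```

Each concurrent request gets its own invocation, which is why this model absorbs spikes from 30 users to thousands without pre-provisioned RAM.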
A first step is to offload static content to Amazon S3 (or similar service). Those services will handle any load and will lessen the load on your web server.
If the load goes up/down gradually (eg over the course of 30 minutes), you can use Auto Scaling to add/remove Amazon EC2 servers based upon load metrics. For example, you probably don't need many servers at night.
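A sketch of that gradual-scaling setup with boto3 (the AWS SDK for Python): a target-tracking policy that adds or removes instances to hold the group's average CPU near a target. The group name is a placeholder, and in a full setup you would create the Auto Scaling group and launch template first.

```python
def cpu_target_tracking_policy(asg_name, target_cpu=50.0):
    """Build the request for autoscaling.put_scaling_policy: hold the
    Auto Scaling group's average CPU near `target_cpu` percent."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,
        },
    }

def attach_policy(asg_name, region="us-east-1"):
    import boto3  # AWS SDK for Python
    asg = boto3.client("autoscaling", region_name=region)
    return asg.put_scaling_policy(**cpu_target_tracking_policy(asg_name))
```

With this attached, the group shrinks at night and grows during the day without manual intervention.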
However, for handling of spiky traffic, rewriting an application as Serverless will make it highly resilient, highly scalable and most likely a lot cheaper too!
I have been learning AWS for quite some time. I would like to confirm the overall picture of what I have learned so far, taking a normal PC as an analogy:
EC2 is similar to the arithmetic and logic unit (ALU) of a PC
EMR is similar to the OS of a PC
S3 is similar to the hard disk of a PC
Please correct me if I am wrong, and explain AWS EC2, EMR and S3 with a comparison to another system/service etc.
(Please dont direct to amazon doc links/tutorials as I have crossed all those and I want to confirm my understanding)
Thanks in advance
I think your analogies are reasonable from a 10,000 foot view. However, I wouldn't say they are correct since there are a lot of subtleties involved. Let me list a few.
EC2 does handle the compute side of your application, so it has a role similar to the ALU in a microprocessor. However, there are two major differences.
a) EC2 is not like the ALU because EC2 consists of the ability to launch/terminate new compute resources. An ALU by definition is a fixed compute entity while EC2 by definition is a system for provisioning compute resources. Very different.
b) EC2 is not stateless, but an ALU is. EC2-provided instances have disk, memory, etc., so they can carry the entire state of an application; S3 is not a required component. In a computer, an ALU by itself isn't useful: additional memory is required.
EMR to OS. EMR is really just Hadoop. Hadoop is a task distribution platform. EMR is like an OS in that it does task scheduling. However, a major part of an OS is arbitrating between different app threads, whereas Hadoop is about taking a big data problem and running it in a distributed fashion across many computers. It does no resource arbitration and works on one problem at a time. Thus, it's not really like an OS. Apache YARN is closer to an OS, by the way.
Your S3 analogy is also partially correct. AWS has many types of storage. There is ephemeral storage, which is like memory and goes away when an instance dies. There are EBS volumes, which are permanent disks attached to instances (or sitting idle) with data on them. S3 is the third type of storage, which is like having storage on the web: you can upload files to S3 and access them, so S3 is very much like a remote disk. To complete the picture, AWS also has Glacier, which is archival storage even more distant than S3.
Hope this helps.