Is there a best practice to test your stack locally before deploying to AWS and avoid deploying your stack over and over during debugging?

I have been working with AWS and the Serverless Framework/CloudFormation over the last few months.
A solid amount of time went into debugging my applications, and most of that time was spent staring at my console while my stack was being deployed.
I read in "The Software Craftsman" (Sandro Mancuso) that the author worked for a company where the developers worked in a similar fashion: changing a tiny bit of code, deploying all of the code to the server, executing it, checking print statements, and then again changing a tiny bit of code and deploying all the code again.
Mancuso heavily criticized this approach and strongly recommended writing tests before deployment to avoid this kind of behavior. Since I am currently developing in pretty much exactly the same fashion, I gave this approach some thought, but I came across some issues.
Of course testing is very important, and it catches some issues I would have missed before deploying my code. However, when working on cloud infrastructure, microservices and other distributed systems, there are a lot of aspects I simply cannot capture in my tests: errors stemming from the AWS infrastructure itself, errors stemming from interaction with other microservices or connected systems, etc.
Therefore I am looking for a way (if one exists) to test my AWS stack locally, to avoid changing tiny bits of code and then waiting a few minutes for my code to deploy to AWS during debugging.

I have not yet found a perfect solution to this. Even if you test code locally, with some mocked services, it can still fail after deployment because you forgot to configure the IAM rights, permissions, security groups, policies, etc.
Currently I am working with the AWS CLI, which creates CloudFormation stacks. You can test a Lambda locally, that is not a problem, but even if it communicates with your local DB, it can still fail after deployment because the DB in your account is in a VPC and you forgot to change the policies...
Our current approach is to work with nested stacks, so that we don't have to redeploy the entire infrastructure, only the part that actually changed.
Nested stacks work well with the AWS CLI.
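To make the "test a Lambda locally" part concrete, here is a minimal sketch of exercising a handler in a plain unit test with the AWS client mocked out. The handler, the table name and the event shape are hypothetical stand-ins for your own code; as noted above, this catches logic bugs but not IAM/VPC misconfiguration.

```python
import unittest
from unittest.mock import MagicMock

def handler(event, context, dynamodb=None):
    """Hypothetical Lambda handler: looks up a user by id in DynamoDB."""
    if dynamodb is None:                      # real resource only outside tests
        import boto3                          # needs boto3 + credentials when deployed
        dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("users")           # table name is an assumption
    item = table.get_item(Key={"id": event["userId"]}).get("Item")
    return {"statusCode": 200 if item else 404, "body": item}

class HandlerTest(unittest.TestCase):
    def test_returns_404_for_missing_user(self):
        fake_dynamodb = MagicMock()
        fake_dynamodb.Table.return_value.get_item.return_value = {}   # no "Item" key
        result = handler({"userId": "42"}, context=None, dynamodb=fake_dynamodb)
        self.assertEqual(result["statusCode"], 404)

if __name__ == "__main__":
    unittest.main()
```

Because the real boto3 resource is only created when no fake is injected, the test runs entirely offline; the same handler still works unchanged when deployed.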

Related

Usefulness of IaC provisioning tools like Terraform?

I have a quick point of confusion regarding the whole idea of "Infrastructure as Code" (IaC) provisioning with tools like Terraform.
I've been working on a team recently that uses Terraform to provision all of its AWS resources, and I've been learning it here and there and admit that it's a pretty nifty tool.
But beyond Infrastructure as Code being a "cool" alternative to manually provisioning resources in the AWS console, I don't understand why it's actually useful.
Take, for example, a typical deployment of a website with a database. After my initial provisioning of this infrastructure, why would I ever need to run the Terraform plan again? With everything I need provisioned in my AWS account, what are the use cases in which I'll need to "reprovision" this infrastructure?
Under this assumption, the work of provisioning everything I need is front-loaded to begin with, so why should I bother learning these tools when I can just click some buttons in the AWS console when I first deploy my website?
Honestly I thought this would be a pretty common point of confusion, but I couldn't seem to find clarity elsewhere, so I thought I'd ask here. Probably a naive question, but keep in mind I'm new to this whole philosophy.
Thanks in advance!
Manual provisioning is, in the long term, slow, non-reproducible, troublesome, not self-documenting and difficult to do in teams.
With tools such as Terraform or CloudFormation you get the following benefits:
You can apply all the same development principles you have when you write traditional code. You can use comments to document your infrastructure. You can track all changes, and who made them, using a software version control system (e.g. git).
You can easily share your infrastructure architecture. Your VPC and ALB don't work? Just post your Terraform code to SO or share it with a colleague for review. That is much easier than sharing screenshots of your VPC and ALB when they were set up manually.
It is easy to plan for disaster recovery and global applications. You just deploy the same infrastructure in different regions automatically. Doing the same manually in many regions would be difficult.
You get separation of dev, prod and staging infrastructure. You just re-use the same infrastructure code across different environments. A change to dev infrastructure can be easily ported to prod.
You can inspect changes before actually performing them (see the sketch after this list). Manual upgrades to your infrastructure can have disastrous effects due to a domino effect: changing one thing can change or break many other components of your architecture. With infrastructure as code, you can preview the changes and get a good understanding of their implications before you actually make the change.
Teamwork. You can have many people working on the same infrastructure code, proposing changes, testing and reviewing.
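As a concrete illustration of the "inspect changes before performing them" point, here is a minimal sketch of the kind of script a CI job could use to surface planned Terraform changes for review before anything is applied. It assumes Terraform is on the PATH and is run from the directory containing your configuration; the summarization of the plan JSON is deliberately superficial.

```python
import json
import subprocess

def planned_changes(workdir="."):
    # Produce a saved plan without applying anything.
    subprocess.run(["terraform", "init", "-input=false"], cwd=workdir, check=True)
    subprocess.run(["terraform", "plan", "-input=false", "-out=tfplan"],
                   cwd=workdir, check=True)
    # Render the saved plan in machine-readable form.
    shown = subprocess.run(["terraform", "show", "-json", "tfplan"],
                           cwd=workdir, check=True, capture_output=True, text=True)
    plan = json.loads(shown.stdout)
    # Summarize what would change; nothing has been applied at this point.
    return [(rc["address"], rc["change"]["actions"])
            for rc in plan.get("resource_changes", [])
            if rc["change"]["actions"] != ["no-op"]]

if __name__ == "__main__":
    for address, actions in planned_changes():
        print(f"{address}: {', '.join(actions)}")
```

The same review step works in reverse order of importance: a human (or a pipeline gate) looks at the create/update/delete actions, and only then is `terraform apply` run against the approved plan file.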
I really like @Marcin's answer.
Here are some additional points from my experience:
On the version control point: you can not only see history and authors and perform code review, but also treat infrastructure changes as product features. Say, for example, you're adding CDN support to your application, so you have to make changes to your infrastructure (to provision a cloud CDN service), your application (to actually support and work with the CDN) and your pipelines (to deliver static assets to the CDN, if you're using this approach). If all changes related to this new feature live in a single branch, they are transparent to everyone on the team and can easily be tracked down later.
Another thing related to version control is the ability to easily provision and destroy infrastructure for review apps semi-automatically, using triggers and the capabilities of your CI/CD tools, for automated and manual testing. It is even possible to run automated tests against changes to your infrastructure declaration.
If you are working on multiple similar projects, or if your project requires multiple similar but mutually isolated environments, IaC can save countless hours of provisioning and tracking everything down. It is not always a silver bullet, but in almost all cases it saves time and avoids most accidental mistakes.
Last but not least, it helps you see the bigger picture if you are working with hybrid or multi-cloud environments. Not as good as infrastructure diagrams, but diagrams, unlike your code, are not always up to date.

How to use CodeStar in a serverless project with a microservice architecture?

I'm totally new to AWS serverless architecture.
I was trying to generate the project architecture, and I read about AWS CodeStar and how it can easily create new projects using templates for AWS Lambda with Python (which is my case).
But I didn't know if I should:
generate one project (the main project) with AWS CodeStar and then create separate folders for every microservice I have (UsersService, ContactService, etc.)
OR
generate every microservice via AWS CodeStar, so that each service is a separate CodeStar project for my Lambdas?
Maybe it's a very stupid question for some of you; please, any help or useful links are welcome.
Thanks
How you deploy is generally your decision, although I feel the general consensus will be option 2. I'll try to explain why.
Option 1 is what you would call a monolith; it means everything for your app is in one place. This might initially seem great, but it has a few limitations, which I've detailed below:
All-or-nothing deployments: if you update a tiny part of the app, you need to redeploy everything.
It leads to coupling between unrelated components; the design pattern generally leads to overlapping changes that can break other parts of your stack.
It is harder to scale: you generally scale larger chunks (i.e. you can't scale search and booking independently, only everything together).
You can mitigate these issues, but it can be a bit of a headache.
The second option leads more towards a Microservice/Decoupled Architecture.
Some of the benefits of this method are:
Only deploy the changes you've made: if the search service changes, only deploy that.
Easier to scale the infrastructure to meet specific demand.
Easier to implement functional testing of each component.
You can restrict access to the users who develop specific components.
Option 2 is your microservice-based repository setup, so I would suggest using it.
I don't have enough reputation to comment so I will post this as an answer.
What you are asking is a software architecture question: whether to use a monorepo or a polyrepo. You've already made the decision about microservices, so this is not a monolith.
The answer is... it depends. There is no general consensus. Just do a search on monorepo vs. polyrepo (or multirepo) and be prepared to go down the rabbit hole.
Being a serverless application should have no bearing on the type of structure you decide on. However, CodeStar may have some limitations that make it more difficult to use a monorepo. I'm using the CDK for that.
Here are a couple of articles to get you started:
https://medium.com/@mattklein123/monorepos-please-dont-e9a279be011b
https://medium.com/@adamhjk/monorepo-please-do-3657e08a4b70
Here is another that pertains directly to serverless applications:
https://lumigo.io/blog/mono-repo-vs-one-per-service/

How do I handle development and production environments in AWS? [closed]

Building an app to be launched in production - and unsure how to handle the dev/production environments on AWS.
If I use multiple buckets, multiple DynamoDB tables, multiple Lambda functions, multiple Elastic Search instances, EC2, API gateway - it seems SUPER cumbersome to have a production and a dev environment?
Currently there is only ONE environment, and once the app goes live - any changes will be changing the production environment.
So how do you handle two environments on AWS? The only way I can think of is to make copies of every Lambda function, every database, every EC2 instance, every API and bucket... But that would literally double the cost and be super tedious to update once the app is live.
Any suggestions?
There are a couple of approaches. However, regardless of the option, I have found it best to keep as much of the infrastructure as code as possible. This gives the maximum flexibility in terms of environment set-up and recoverability.
There's the separate account approach
Create a new account
Move all your objects (EC2, S3, etc.) into this account
This is easily done if you have the majority of your infrastructure as code and are working out of version control such as git, as you can use AWS CloudFormation.
Make sure you rename the S3 buckets to something which is globally unique and compliant
You are then running 2 separate instances of everything. However, you can put some cost controls in place such as smaller EC2 instances. Or, you can just delete the entire Cloudformation stack when you are not using it and then spin it up when needed. There is more up front cost in terms of time with this approach, but it can save $$$ in the long run. Plus, the separation of accounts is great from a security perspective.
One account approach
This can get a bit messy but there are several features which can help you split out one account into dev and production.
Lambda versioning. If you are using Lambdas, you can use versioning and aliases, which in effect means you can have one Lambda set up with a production and a dev version under the same function name (see the sketch after this list).
API Gateway has 'stages'. These are effectively environments, and you can label one production and one development to separate the concerns for a single API.
S3 buckets. You can always make a key at the top level of the bucket, e.g. s3://mybucket/prod/ and s3://mybucket/dev/. It's a bit messy, but better than having everything mixed together.
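Here is a minimal sketch of the Lambda-versioning approach, using boto3. The function name and alias names are assumptions; adapt them to your own setup.

```python
import boto3

lambda_client = boto3.client("lambda")
FUNCTION = "my-service-handler"   # hypothetical function name

# $LATEST serves as the "dev" code; publishing freezes it as an immutable version.
published = lambda_client.publish_version(
    FunctionName=FUNCTION,
    Description="Release candidate",
)

def point_alias(name, version):
    """Create the alias if it does not exist yet, otherwise move it."""
    try:
        lambda_client.update_alias(FunctionName=FUNCTION, Name=name,
                                   FunctionVersion=version)
    except lambda_client.exceptions.ResourceNotFoundException:
        lambda_client.create_alias(FunctionName=FUNCTION, Name=name,
                                   FunctionVersion=version)

# Callers invoke "my-service-handler:prod" and never see untested $LATEST code.
point_alias("prod", published["Version"])
# A "dev" alias can simply track $LATEST.
point_alias("dev", "$LATEST")
```

An API Gateway stage (or any other caller) then references the alias rather than the bare function name, so promoting code to production is just a matter of repointing the prod alias.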
However, what you really need to ask is how much it actually costs to run a second account versus one account for this use case, and the answer is probably close to the same.
That's the advantage of AWS, and cloud computing in general: you only pay for what you use. Running your Lambdas across two accounts costs the same as running them in a single account, as long as they are invoked the same total number of times.
The two-account approach also gives a lot more clarity about what is going on, and helps prevent issues where a piece of development code finds its way into production because everything is in one account.
I suggest two AWS accounts. Then make a CloudFormation template that provisions all the resources you need. Once you make the template, it is no longer cumbersome, and having side-by-side environments makes it easy to test code updates before they go live. It is not a good idea to test changes in a production environment.
Yes, this will mean double the costs, but you can always delete the CloudFormation stack in your pre-prod account when you are done testing so there are no idle resources. You just spin them up when you need to test, then spin them down when you are done. So you are only doubling costs for that small window of time when you are testing. And pushing the changes live is just a matter of updating the CloudFormation stack.
These cloud capabilities are one of the big selling points for moving to the cloud in the first place--they can solve the problem you describe without it being cumbersome, but it does take investment in building the CloudFormation template (infrastructure-as-code).
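Here is a minimal sketch (boto3, with a hypothetical stack name and template path) of that spin-up/spin-down cycle in the pre-prod account, assuming pre-prod credentials are active in your environment.

```python
import boto3

cfn = boto3.client("cloudformation")   # assumes pre-prod credentials/profile
STACK = "myapp-preprod"                # hypothetical stack name

def spin_up(template_path="template.yaml"):
    """Create the full pre-prod environment from the shared template."""
    with open(template_path) as f:
        cfn.create_stack(
            StackName=STACK,
            TemplateBody=f.read(),
            Capabilities=["CAPABILITY_NAMED_IAM"],  # only needed if the template creates IAM resources
        )
    cfn.get_waiter("stack_create_complete").wait(StackName=STACK)

def spin_down():
    """Tear everything down again so no idle resources are billed."""
    cfn.delete_stack(StackName=STACK)
    cfn.get_waiter("stack_delete_complete").wait(StackName=STACK)
```

Run spin_up before a test session and spin_down afterwards; the production stack in the other account is never touched.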

AWS CloudFormation vs. Web Console?

I'm trying to understand the real-world usefulness of AWS CloudFormation. It seems to be a way of describing AWS infrastructure as a JSON file, but even then I'm struggling to understand what benefits that serves (besides potentially "recording" your infrastructure changes in VCS).
What purpose do CloudFormation's JSON files serve? What benefits do they have over using the AWS web console and making changes manually?
CloudFormation gives you the following benefits:
You get to version control your infrastructure. You have a full record of all changes made, and you can easily go back if something goes wrong. This alone makes it worth using.
You have a full and complete documentation of your infrastructure. There is no need to remember who did what on the console when, and exactly how things fit together - it is all described right there in the stack templates.
In case of disaster you can recreate your entire infrastructure with a single command, again without having to remember just exactly how things were set up.
You can easily test changes to your infrastructure by deploying separate stacks, without touching production. Instead of having permanent test and staging environments you can create them automatically whenever you need to.
Developers can work on their own, custom stacks while implementing changes, completely isolated from changes made by others, and from production.
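To make the last two points concrete, here is a minimal sketch (boto3; the template path, stack naming scheme and EnvName parameter are assumptions about your template) of standing up an isolated stack per environment or per developer from the same template.

```python
import boto3

cfn = boto3.client("cloudformation")

def deploy_isolated_stack(env_name, template_path="template.yaml"):
    """Deploy the shared template as its own stack, e.g. myapp-staging or myapp-alice."""
    stack_name = f"myapp-{env_name}"
    with open(template_path) as f:
        cfn.create_stack(
            StackName=stack_name,
            TemplateBody=f.read(),
            # Hypothetical template parameter used to namespace resource names.
            Parameters=[{"ParameterKey": "EnvName", "ParameterValue": env_name}],
            Capabilities=["CAPABILITY_NAMED_IAM"],
        )
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return stack_name
```

Each developer or test run gets a fully isolated copy of the infrastructure, and deleting the stack removes it again in one operation.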
It really is very good, and it gives you both more control, and more freedom to experiment.
First, you seem to underestimate the power of tracking changes in your infrastructure provisioning and configuration in VCS.
Provisioning and editing your infrastructure configuration via the web interface is usually a very lengthy process. Having the configuration in a file, rather than spread across multiple web dashboards, gives you much-needed perspective and an overall view of what you use and how it is configured. Also, when you repeatedly configure similar stacks, you can re-use the code and avoid errors and mistakes.
It's also important to note that AWS CloudFormation resource support frequently lags behind the development of services available in the AWS Console. CloudFormation also requires some know-how and time to get used to, but in the end the benefits prevail.

How to convert a WAMP stacked app running on a VPS to a scalable AWS app?

I have a web app running on PHP, MySQL and Apache on a virtual Windows server. I want to redesign it so it is scalable (for fun, so I can learn new things) on AWS.
I can see how to set up an EC2 instance and dump it all in there, but I want to make it scalable and take advantage of all the cool features of AWS.
I've tried googling but just can't find a simple guide (note: I have no Linux command-line experience).
Can anyone direct me to detailed resources that can lead me through the steps and teach me? Or alternatively, summarise the steps in an answer so I can research based on what you say.
Thanks
AWS is growing and changing all the time, so there aren't a lot of books to help. Amazon offers training that's excellent. I took their three day class on Architecting with AWS that seems to be just what you're looking for.
Of course, not everyone can afford to spend the travel time and money to attend a class. The AWS re:Invent conference in November 2012 had a lot of sessions related to what you want, and most (maybe all) of the sessions have videos available online for free. Building Web Scale Applications With AWS is probably relevant (slides and video available), as is Dissecting an Internet-Scale Application (slides and video available).
A great way to understand these options better is by fiddling with your existing application on AWS. It will be easy to just move it to an EC2 instance, then start taking more advantage of what's available. The first thing I'd do is get rid of the MySQL server on your own machine and use one offered by RDS. Once that's stable, create one or more read replicas in RDS, and change your application to read from them for most operations, reading from the main (writable) database only when you need completely current results.
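As a rough sketch of that read/write split at the application level: routine reads go to the replica endpoint and writes (plus reads that must be completely current) go to the primary. The endpoints, credentials and table names below are placeholders, and pymysql is just one of several MySQL drivers you could use.

```python
import pymysql

WRITER_HOST = "primary.example.com"   # placeholder for the primary RDS endpoint
READER_HOST = "replica.example.com"   # placeholder for the read replica endpoint

def connect(host):
    return pymysql.connect(host=host, user="app", password="***",
                           database="myapp", cursorclass=pymysql.cursors.DictCursor)

def list_products():
    # Routine read traffic goes to the replica, taking load off the primary.
    conn = connect(READER_HOST)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT id, name FROM products")
            return cur.fetchall()
    finally:
        conn.close()

def create_order(product_id, quantity):
    # Writes always go to the primary (writable) database.
    conn = connect(WRITER_HOST)
    try:
        with conn.cursor() as cur:
            cur.execute("INSERT INTO orders (product_id, quantity) VALUES (%s, %s)",
                        (product_id, quantity))
        conn.commit()
    finally:
        conn.close()
```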
Does your application keep any data on the web server, other than in the database? If so, get rid of all local storage by moving that data off the EC2 instance. Some of it might go to the database, some (like big files) might be suitable for S3. DynamoDB is a good place for things like session data.
All of the above reduces the load on the web server to just your application code, which helps with scalability. And now that you keep no state on the web server, you can use ELB and Auto-scaling to automatically run multiple web servers (and even automatically launch more as needed) to handle greater load.
Does the application have any long running, intensive operations that you now perform on demand from a web request? Consider not performing the operation when asked, but instead queueing the request using SQS, and just telling the user you'll get to it. Now have long running processes (or cron jobs or scheduled tasks) check the queue regularly, run the requested operation, and email the result (using SES) back to the user. To really scale up, you can move those jobs off your web server to dedicated machines, and again use auto-scaling if needed.
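A minimal sketch of that queue-and-notify pattern with boto3 follows; the queue name, email addresses and the report-building step are placeholders, and the sender address would need to be verified in SES.

```python
import json
import boto3

sqs = boto3.client("sqs")
ses = boto3.client("ses")
QUEUE_URL = sqs.get_queue_url(QueueName="report-requests")["QueueUrl"]   # hypothetical queue

def enqueue_report_request(user_email, params):
    """Called from the web request handler: just queue the work and return."""
    sqs.send_message(QueueUrl=QUEUE_URL,
                     MessageBody=json.dumps({"email": user_email, "params": params}))

def run_expensive_report(params):
    # Stand-in for the real long-running operation.
    return f"Report for {params}"

def worker_loop():
    """Runs on a separate worker machine, cron job or scheduled task."""
    while True:
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1,
                                   WaitTimeSeconds=20)   # long polling
        for msg in resp.get("Messages", []):
            job = json.loads(msg["Body"])
            result = run_expensive_report(job["params"])
            ses.send_email(
                Source="reports@example.com",              # must be SES-verified
                Destination={"ToAddresses": [job["email"]]},
                Message={"Subject": {"Data": "Your report is ready"},
                         "Body": {"Text": {"Data": result}}},
            )
            # Only delete the message once the work has actually succeeded.
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```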
Do you need bigger machines, or perhaps can you live with smaller ones? CloudWatch metrics can show you how much I/O, memory, and CPU are used over time. You can use provisioned IOPS with EC2 or RDS instances to improve performance (at a cost) as needed, and use different size instances for more memory or CPU.
All this AWS setup and configuration can be done with the AWS web console, or command-line tools, or SDKs available in many languages (Python's boto library is great). After learning the basics, look into CloudFormation to automate it better (I've written a couple of posts about that so far).
That's a bit of the 10,000 foot high view of one approach. You'll need to discover the details of each AWS service when you try to use them. AWS has good documentation about all of them.
Depending on how you look at it, this is more of a comment than it is an answer, but it was too long to write as a comment.
What you're asking for really can't be answered on SO; it's a huge, complex question. You're basically asking "How do I design a highly scalable, durable application that can be deployed on a cloud-based platform?" The answer depends largely on:
The specifics of your application--what does it do and how does it work?
Your tolerance for downtime balanced against your budget
Your present development and deployment workflow
The resources/skill sets you have on-staff to support the application
What your launch time frame looks like.
I run a software consulting company that specializes in consulting on Amazon Web Services architecture. About 80% of our business is investigating and answering these questions for our clients. It's a multi-week long project each time.
However, to get you pointed in the right direction, I'd recommend that you look at Elastic Beanstalk. It's a PaaS-like service that abstracts away the underlying AWS resources, making AWS easier to use for developers who don't have a lot of sysadmin experience. Think of it as "training wheels" for designing an autoscaling application on AWS.