The infrastructure for one application (both dev and prod environments) has been built on a single AWS account. It is quite big, includes 15 instances,... Now we're going to build a new infrastructure for another application. I would like to know whether it's better to create another AWS account for the new project. What would be the advantages?
I would have preferred a separate account for each environment rather than for each project, but since the first project was built entirely on one account, I think the best remaining option is to at least create another AWS account for the new project.
Also, in any case, is there an easy way to transfer the production environment to another account in order to separate the environments?
Any suggestion would be appreciated.
I'm not sure as to the circumstances in your case but I imagine having a separate account for each environment does give you more control and less room for error.
If you're working alone, decide for yourself whether the effort is worth it. If you're part of a team, or even leading one, and someone has access to a 'global' AWS account containing both the development and production instances, errors can easily be made. If you're consuming the AWS API, for example, and terminate the wrong instance... Food for thought.
Another reason is that you will need to become very granular with your IAM roles if you want to work with a global account that holds every environment and still keep some level of control.
Lastly, CloudWatch will give you nice detailed reports on how your instances are doing, and when each environment is in its own AWS account, it becomes a quick way to see which servers are operating in which fashion.
If all your environments are in the same account, this can become quite confusing as to which instances are production / development.
TLDR, it is good practise to split up the different environments to keep a higher level of control and overview.
Today (I know I'm answering a very old question), AWS makes it easy and very useful to group accounts into Organizations.
For a big setup, this means you can consolidate billing, reservations and other reductions, as well as many security and compliance aspects, while keeping each account operationally separate. While it may be some overhead for a small setup it will be less overhead than trying to keep separate two development teams that are using one account, and extra costs are small to none.
In short, there are a number of very significant advantages and as far as I can see no significant downsides to separating different spheres of responsibility into different accounts.
We have been running a multi-tier application on AWS, using various AWS services like ECS, Lambda and RDS. We're looking for a way to map billing items to actual system components, find the component that costs the most, etc.
AWS has improved its detailed Cost and Usage Reports and offers a Cost Explorer API, but these only break the bill down by service or by instance. A per-instance breakdown does not bring much value if you are looking for the cost of each component. Any solutions/recommendations for this?
Cost Allocation Tags
You can create a tag such as "system" or "app" and apply it to all of your resources and set the value to the different applications/systems/Components that you wish to track. Then you can go to the billing page, click on "Cost Allocation Tags" and activate that tag that you created.
Then you can see costs broken down by the different values of that tag. They will show up in Cost Explorer, where the tag will be one of the available filters. However, I think it takes 24 hours after activation before they show up.
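If you want the same breakdown programmatically, here is a minimal sketch in Python. It assumes the tag key is "system" (as in the example above) and that the response has the shape returned by Cost Explorer's GetCostAndUsage when grouped by a tag; the dates and amounts are made up, and the real boto3 call is only shown in a comment, not executed.

```python
from collections import defaultdict
from decimal import Decimal


def build_request(tag_key, start, end):
    """Parameters for Cost Explorer's GetCostAndUsage, grouped by one tag key."""
    return {
        "TimePeriod": {"Start": start, "End": end},  # the End date is exclusive
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "TAG", "Key": tag_key}],
    }


def cost_per_tag_value(response):
    """Sum UnblendedCost per tag value across all returned time periods."""
    totals = defaultdict(Decimal)
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            # Group keys look like "<tag key>$<tag value>"; the value part
            # is empty for resources that do not carry the tag.
            _, _, value = group["Keys"][0].partition("$")
            amount = Decimal(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[value or "(untagged)"] += amount
    return dict(totals)


# With boto3 installed and credentials configured, the real call would be roughly:
#   import boto3
#   request = build_request("system", "2024-01-01", "2024-03-01")
#   response = boto3.client("ce").get_cost_and_usage(**request)


def _group(key, amount):
    """Build one group entry of the (made-up) sample response."""
    return {"Keys": [key], "Metrics": {"UnblendedCost": {"Amount": amount, "Unit": "USD"}}}


# Hypothetical response for illustration only.
sample_response = {
    "ResultsByTime": [
        {"Groups": [_group("system$web", "120.50"),
                    _group("system$batch", "30.25"),
                    _group("system$", "5.00")]},
        {"Groups": [_group("system$web", "100.00"),
                    _group("system$batch", "40.75")]},
    ]
}

for value, total in sorted(cost_per_tag_value(sample_response).items()):
    print(f"{value}: {total} USD")
```

The "(untagged)" bucket is worth watching: it tells you how much spend is still escaping your tagging scheme.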
If you do need to enforce tag usage, and you have developers who work on multiple components, it's possible to have an IAM role for managing each component, with each role limited to interacting with resources that carry a specific tag (i.e. it can only modify existing resources with that tag, and can only create new resources with that tag). A developer can have an IAM user (or you could federate identities, but that's a whole different conversation) that is allowed to assume different roles depending on which component they are working on. This has the added benefit of making cross-account management easier. However, it may require a non-trivial IAM overhaul.
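To make the tag-limited role idea a bit more concrete, here is a rough sketch that generates such a policy document in Python. The tag key "component" and value "billing" are hypothetical, only a few EC2 actions are covered, and a real launch policy needs further statements (for the AMI, subnet, volumes, security groups and for ec2:CreateTags at launch), so treat it as a starting point rather than a complete policy.

```python
import json


def component_policy(tag_key, tag_value):
    """IAM policy document limiting EC2 instance actions to one tagged component."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Existing instances: only those already carrying the tag.
                "Sid": "ManageTaggedInstances",
                "Effect": "Allow",
                "Action": [
                    "ec2:StartInstances",
                    "ec2:StopInstances",
                    "ec2:RebootInstances",
                ],
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            },
            {
                # New instances: the launch request itself must include the tag.
                # NOTE: incomplete on purpose; RunInstances also touches other
                # resource types that need their own Allow statements.
                "Sid": "LaunchOnlyWithTag",
                "Effect": "Allow",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringEquals": {f"aws:RequestTag/{tag_key}": tag_value}
                },
            },
        ],
    }


# Hypothetical tag key and value, for illustration only.
policy = component_policy("component", "billing")
print(json.dumps(policy, indent=2))
```

You would attach one such policy to each per-component role and let developers assume the role that matches what they are working on.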
More info on cost allocation tags here: https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html
Divide Cost boundaries by AWS account
To attack the components that are not taggable such as data transfers, you could build your account strategy around cost boundaries and have a separate account for each cost silo (if that's tenable). That may increase cost, because you'd have to break systems into specific accounts (and therefore specific EC2 Instances).
When you centralize reporting, monitoring, config management, log analysis, etc., each application will add a little bit to that cost, but usually you just have to consider that centralization a system in itself and cost it out separately. Obviously, you can have separate monitoring, alerting, reporting, log collection, config management, etc. for each system, but this will cost more overall (both in infrastructure costs and engineering hours). So you would have to prioritize cost visibility versus cost optimization.
There are still a great deal of capabilities within AWS to connect resources from disparate accounts, and it's not difficult to have a data-layer in one account, and an app-tier in another (though it's not a paradigm I often see).
Custom Tooling
If the above are imperfect solutions for your environment, you could use them as far as they are feasible and write scripts to estimate usage of the things that are more difficult to track. For bandwidth, if you had your own EC2 instances running as forward proxies or NAT gateways, you could write some outbound data transfer accounting software. If everything in your VPCs had a route pointing to ENIs on these instances, then you could track outbound transfer by any parameters you choose. This does sound a little fragile to me, and there may be several cases where this isn't tenable from a network perspective, but it's a possibility.
Similarly, with CloudWatch metrics, you can use namespaces. I wasn't able to find any reference to the ability to filter by CloudWatch namespace in Cost Explorer, but it would probably be pretty easy to suss out raw metrics per namespace and estimate costs per namespace. Then you could divide your components in CloudWatch by namespace. This may lead to some duplication, which may lead to more management effort or increased cost, but that would be the tradeoff for more granular cost visibility.
Kubernetes
This may be very pie-in-the-sky for your environment, but it's worth mentioning. If you ran a cluster using EKS or a self-managed cluster on EC2, you could harness the power of that platform, which would allow you to provision a base level of compute resources, divide components into namespaces and use built-in or third-party tools to grab usage statistics per namespace (or even per workload). This is much easier to enforce, because you can give developers access to specific namespaces, and outliers are generally more obvious. When you know the amount of CPU and memory each workload uses over time, you can get a pretty good estimate of individual cost patterns by component.
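As a back-of-the-envelope illustration of that last step, the sketch below splits a cluster's compute bill across namespaces in proportion to their CPU and memory usage. The 50/50 weighting between CPU and memory, the namespace names and all the numbers are assumptions for the example; in practice the usage figures would come from whatever metrics tool you run.

```python
def allocate_cost(cluster_cost, usage, cpu_weight=0.5):
    """Split cluster_cost across namespaces by share of CPU and memory usage.

    usage maps namespace -> (cpu_core_hours, memory_gib_hours).
    cpu_weight is the fraction of the bill attributed to CPU; the rest
    is attributed to memory. The 0.5 default is an arbitrary assumption.
    """
    total_cpu = sum(cpu for cpu, _ in usage.values())
    total_mem = sum(mem for _, mem in usage.values())
    costs = {}
    for namespace, (cpu, mem) in usage.items():
        cpu_share = cpu / total_cpu if total_cpu else 0.0
        mem_share = mem / total_mem if total_mem else 0.0
        blended = cpu_weight * cpu_share + (1 - cpu_weight) * mem_share
        costs[namespace] = cluster_cost * blended
    return costs


# Made-up monthly figures, for illustration only.
monthly_usage = {
    "web": (600, 1000),
    "batch": (300, 2000),
    "monitoring": (100, 1000),
}

for namespace, cost in allocate_cost(1000.0, monthly_usage).items():
    print(f"{namespace}: {cost:.2f}")
```

Allocating by actual usage leaves idle capacity spread across everyone; allocating by resource requests instead is a common alternative if you want teams to pay for what they reserve.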
Of course, you will still have a cost for the k8s management plane, which will be in a cost bucket apart from all of your other applications/systems.
Istio, while not a simple technology by any means, allows you to collect granular metrics about data egress, which you can use to get an idea of how much data transfer cost is being run up.
It might be easier to duplicate monitoring in each namespace, since you already have to abstract your monitoring workload to a certain extent to run on k8s at all. However, that still increases management and overall cost, but perhaps less than siloing at the Infrastructure (AWS) layer.
Summary
I don't know of many options for getting to the level of granularity and control that you need in AWS, and efforts to this end will probably increase overall cost and management overhead. AWS is rather notorious for its difficult-to-estimate cost model. Perhaps look into platforms other than AWS for your workloads that might provide better visibility into component costs.
It's also difficult to avoid systems that operate centrally and whose cost-per-system is difficult to trace. These include log management, config management, authentication systems, alerting systems, monitoring systems, etc. Generally it's more cost effective and more manageable to centralize these functions for all of your workloads, but then the TCO of individual apps becomes difficult to determine. In my experience, most teams write this off as infrastructure cost, and track the cost of an app more with the compute, storage, and AWS service usage data points.
My employer is asking me what hours I want to use AWS VMs.
They don't want to grant me full corporate access, because in the past people have shut down mission critical instances by mistake.
I'd like the flexibility to start/stop my own instance and not be reliant on asking someone else to extend the hours on an ad hoc basis, as I often work odd hours into the night if I am on a roll with something.
Other than the expense of a 24/7 use case, is there a more cost-effective capability that I can point the gatekeeper to, that would allow this sort of flexibility?
At the moment, I'm pretty naive on the AWS front.. I just use the VMs I've been given to use.
BTW: I think there are issues about having them in certain domains - so I can't just have my own individual account.
Thanks in advance for your advice.
I think there are issues about having them in certain domains - so I can't just have my own individual account.
This is what AWS Organizations is for: you have your own account, but it's tied to the corporate account and can be granted access to perform certain functions.
You don't describe what you're creating these instances for, but I'm going to assume that it's development testing. In that case, you would work entirely within your own sandbox, and be unable to affect the mission-critical resources. If there's a need for explicit domain naming, they can delegate authority for a sub-domain, and if necessary use CNAMEs to link hosts in that sub-domain to the parent domain.
If you need to do production support work, such as bringing up a new production machine, they can create a role that grants you permission to do so -- probably one that allows you to start machines but not stop them.
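For what it's worth, a "start but not stop" role could look roughly like the policy generated below. This is only a sketch: the choice of actions is my assumption about what such a role would need, and a real one would probably be narrowed further (for example to particular instances or tags).

```python
import json


def start_only_policy():
    """IAM policy document that permits starting instances but not stopping them."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowStartAndDescribe",
                "Effect": "Allow",
                # DescribeInstances does not support resource-level
                # restrictions, hence the wildcard resource.
                "Action": ["ec2:StartInstances", "ec2:DescribeInstances"],
                "Resource": "*",
            },
            {
                # An explicit Deny wins over any Allow granted elsewhere.
                "Sid": "DenyStopAndTerminate",
                "Effect": "Deny",
                "Action": ["ec2:StopInstances", "ec2:TerminateInstances"],
                "Resource": "*",
            },
        ],
    }


print(json.dumps(start_only_policy(), indent=2))
```

The explicit Deny is the important part: even if another policy attached to the same role later allows stopping instances, the Deny still takes precedence.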
At the moment, I'm pretty naive on the AWS front
Unfortunately, it sounds like they are as well. I think the best thing you can do is point them at the Organizations doc.
Is there a best practice around separating environments in AWS?
I've got a solution that employs the following services:
Lambda
SNS
SQS
DynamoDB
API Gateway
S3
IAM
We're not live yet, but we're getting close. By the time we go-live, I'd like a proper production, test, and development environment with a "reasonable" amount of isolation between them.
Separate account per environment
Single Account and separate VPC per environment
I read the article AWS NETWORKING, ENVIRONMENTS AND YOU by Charity Majors. I'm down with segmentation via VPC, but I don't know that all the services in my stack are VPC scoped? Here are some of my requirements:
Limit Service Name Collision (for non global services)
Establish a very clear boundary between environments
Eventually, grant permissions at the environment level
I am using an AWS Organization.
P.S. Apologies if this isn't the right forum for the question. If there is something better, just let me know and I'll move it.
I recommend one AWS account per environment. The reasons, in no particular order:
security: managing complex IAM policies to create boundaries within a single account is really hard; conversely, one account per environment forces boundaries by default: you can enable cross account access but you have to be very deliberate
auditing access to your different environments is more difficult when all activity happens in the same account
performance: some services don't have the same performance characteristics when operating in a VPC vs. outside one (e.g. Lambda cold starts have increased latency when operating in a VPC)
naming: instead of using the AWS account id to identify the environment you're operating in, you have to add prefixes or suffixes to all the resources in the account - this is a matter of preference but nonetheless..
compliance: if you ever need to adhere to some compliance standard such as HIPAA, which imposes strict restrictions on how long you can hold on to data and who can access data, it becomes really difficult to prove which data is production and which data is test etc. (this goes back to the security and auditing points above)
cost control: in dev, test, staging environments you may want to give people pretty wide permissions to spin up new resources but put low spending caps to prevent accidental usage spikes; conversely in a production account you'll want restricted ability to spin up resources but higher spending caps; easy to enforce via separate account - not so much in the same account
Did I miss anything? Possibly! But these are the reasons why I would use separate accounts.
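To illustrate the "very deliberate" cross-account access from the security point: a role in one account only becomes usable from another account if its trust policy names that account. Here is a small sketch that builds such a trust policy; the account ID is a placeholder and the MFA condition is an optional extra I added, not a requirement.

```python
import json


def trust_policy(trusted_account_id, require_mfa=True):
    """Trust policy letting principals in another account assume this role."""
    statement = {
        "Effect": "Allow",
        # The ":root" principal means "the account as a whole"; that account's
        # own IAM policies then decide which of its users may assume the role.
        "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if require_mfa:
        statement["Condition"] = {"Bool": {"aws:MultiFactorAuthPresent": "true"}}
    return {"Version": "2012-10-17", "Statement": [statement]}


# Placeholder account ID, for illustration only.
print(json.dumps(trust_policy("111111111111"), indent=2))
```

Nothing in the dev account can reach production unless someone writes a document like this on the production side, which is exactly the default-deny boundary a single shared account lacks.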
By the way - I am NOT advocating against using VPCs. They exist for a reason and you should definitely use VPCs for network isolation. What I am trying to argue is that for anybody who also uses other services such as DynamoDB, Lambda, SQS, S3, etc., VPCs are not really the way to isolate resources, IMO.
The downsides to one account per stage that I can think of are mostly around continuous deployment if you use tools that are not flexible enough to be able to deploy to different accounts.
Finally, some people like to bring up billing as a possible issue, but really, wouldn't you want to know how much money you spend on Production vs Staging vs Development?!
Avoid separate accounts for each environment to avoid additional complexity and obstacles in accessing shared resources.
Instead, try using:
resource groups
tagging
as recommended by AWS:
https://aws.amazon.com/blogs/startups/managing-resources-across-multiple-environments-in-aws/
The account separation is recommended by the AWS Well Architected Framework security pillar.
I would like to know a system by which I can keep track of multiple aws accounts, somewhere around 130+ accounts with each account containing around 200+ servers.
I wanna know methods to keep track of machine failure, service failure etc.
I also wanna know methods by which I can automatically turn up a machine if the underlying hardware failed or the machine terminated while on spot.
I'm open to all solutions including chef/terraform automation, healing scripts etc.
You guys will be saving me a lot of sleepless nights :)
Thanks in advance!!
This is purely my take on implementing your problem statement.
1) Well.. for managing and keeping track of multiple AWS accounts you can use AWS Organizations. This will help you centrally manage all the other 130+ accounts from one root account. You can enable consolidated billing as well.
2) As far as keeping track of failures... you may need to customize this according to your requirements. For example: you can build a microservice on top of Docker containers or ECS whose sole purpose is to keep track of failures, generate a report and push it to S3 on a daily basis. You can further create a dashboard with Amazon QuickSight from these reports in S3.
There can be another micro service which will rectify the failures. It just depends on how exhaustive and fine grained you want your implementation to be.
3) Spawning replacement instances when spot instances are terminated can be achieved through simple Auto Scaling configurations. Here are some articles you may want to go through, which will give you some ideas:
Using Spot Instances with On-Demand instances
Optimizing Spot Fleet+Docker with High Availability
AWS Organizations is useful for management. You can also look at a multiple account billing strategy and security strategy. A shared services account with your IAM users will make things easier.
Regarding tracking failures you can set up automatic instance recovery using CloudWatch. CloudWatch can also have alerts defined that will email you when something happens you don't expect, though setting them up individually could be time consuming. At your scale I think you should look into third party tools.
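A rough sketch of what the automatic recovery alarm looks like: it watches the system status check for one instance and triggers the built-in EC2 recover action. The period and evaluation counts are just plausible values I picked, the instance ID and region are placeholders, and the boto3 call is shown in a comment rather than executed. Note that recovery is only supported for certain instance types and will not bring back a spot instance that has been terminated.

```python
def recovery_alarm_params(instance_id, region):
    """Keyword arguments for CloudWatch put_metric_alarm that auto-recover one instance."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",
        "Namespace": "AWS/EC2",
        # Fails when the underlying host has a problem, as opposed to the
        # instance's own OS (that is StatusCheckFailed_Instance).
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds; assumed value
        "EvaluationPeriods": 2,  # assumed value
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in action that moves the instance to healthy hardware.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


# With boto3 installed and credentials configured, roughly:
#   import boto3
#   cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
#   cloudwatch.put_metric_alarm(**recovery_alarm_params("i-0123456789abcdef0", "us-east-1"))

# Placeholder instance ID and region, for illustration only.
params = recovery_alarm_params("i-0123456789abcdef0", "us-east-1")
print(params["AlarmName"], "->", params["AlarmActions"][0])
```

Generating the parameters in a loop over your instances (or from Terraform/CloudFormation) takes care of the "setting them up individually" problem.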
I would like to have different environments in AWS. At first I thought of differentiating environments by tags on AWS resources. But then I cannot restrict users from changing the tags of a machine. That is, if I allow them ec2:CreateTags, they can not only create a tag but also change the tags of any resource, since I cannot apply a condition to it (for example, that the resource belongs to a particular VPC or subnet). If I don't grant them the privilege to create tags, then they can launch an instance but their tags are not applied, and hence any further operation on the instance is not permitted.
If I want to distinguish between environments by VPC ID, then for operations such as ec2:StartInstances I cannot apply a condition that allows the operation only in a specific VPC; I can conditionally allow it based on a resource tag, which, for the reasons in the previous paragraph, is not convincing.
The AWS documentation mentions:
One approach to maintaining this separation was discussed in the Working with AWS Access Credentials, that is, to use different accounts for development and production resources.
So it is possible to have one Paying Account for several other accounts which themselves are Paying Accounts? I still don't think multiple accounts for just different environments is a good idea.
How do you generally differentiate among environments for enforcing policies?
Thanks.
Different accounts is the way to go. There are so many places you'll want to create isolation that you'll make yourself crazy trying to do it within one account. Think about it - there's network controls, all the IAM permissions for all the services, access control lists, tags that have the limitations you describe, and on and on. Real isolation comes from putting things in different accounts for now.
The last thing you want is some weakness in your dev environment to pivot into your production environment - end of story. Consider also the productivity benefit of separating prod and dev accounts... you'll never break a prod system from a mistake or experiment in development.
Consolidated billing is the answer to paying for it all. It's easy to set up and track. If you need more reporting, look into Cloudability.
Where this gets really interesting is in the space of multiple production and multiple dev environments. There are a lot of opinions on isolation there. Some people combine all prod and dev into two accounts, and some put every prod and dev into their own. It all depends on your risk profile. Just don't end up like CloudSpaces.
It is possible to do consolidated billing, where one account is billed for its own usage plus the AWS usage of any other linked account. However, you cannot split that bill (e.g. have the master account pay only for EC2 services on a linked account, while the linked account pays for its other usage like S3, etc.).
As for differentiating between environments, I've used different security groups for each one (dev, staging, production) as an alternative to tags, but there are limitations when it comes to enforcing policies. The best option to have full policy control is to use different accounts.
I would suggest going with one VPC and using Security Groups for isolation. As your AWS infra grows, you will need Directory Services (Name Servers, User Directory, VM Directory, Lookup services etc.). If you have two VPCs, sharing the Directory Services will not be easy. Also, if you need a Code Repository (e.g. GitHub) or Build tools (e.g. Jenkins), having three separate VPCs for DEV, Staging and Production will make things really complicated.