AWS: VPC, Subnets, NACLs, Security Groups, IGWs, Route Tables, etc.

I've watched hours upon hours of tutorials and have read until my eyes were about to bleed, but I just cannot seem to grasp how Amazon VPCs work. I've created and deleted entire VPC environments with EC2 instances several times while following tutorials, but as soon as I go to create one without the tutorial, I'm lost.
I'm trying to come up with an analogy to help me to better understand. What I have so far is something like this:
A VPC is like a club. At the front of the club, you have an entrance: the IGW. Inside the club, you have different areas: the General Area, which would be the public subnet, and the Management Area, which is the private subnet.
Within the General Area you would have a dance floor/bar, which would equate to an EC2 instance, and a receiving bay where management can receive deliveries and whatnot from the outside world: the NAT.
Then in the Management Area you'd have an office (another EC2 instance) and your inventory, which is like your RDS.
I think that's a somewhat accurate analogy so far, but once I start to try and work in the SGs, NACLs, RTs, etc, I realize that I'm just not grasping it all.
Can anyone help me with finishing this analogy or supply a better analogy? I'm at my wits' end.

Rather than using analogies, let's use the network you already have at home.
Within your home, you probably have a router and various devices connected to it. They might be directly connected via ethernet cables (e.g. a PC), or they might be connected via wifi (e.g. tablets, phones, Alexa). Your home network is like a VPC: your various devices connect to the network, and all of the devices can talk to each other.
You also have some sort of box that connects your router to the Internet. This might be a cable modem, or a fibre router or (in the old days) a telephone connection. These boxes connect your network (VPC) to the Internet and are similar in function to an Internet Gateway. Without these boxes, your network would not be able to communicate with the Internet. Similarly, without an Internet Gateway, a VPC cannot communicate with the Internet.
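The decision between "stays on the local network" and "goes out through the box" is exactly what a VPC route table encodes. Here's a minimal sketch of that lookup in Python (a toy model, not the AWS API; the 10.0.0.0/16 CIDR and the igw-12345 ID are made-up examples):

```python
import ipaddress

# Toy VPC route table: each entry maps a CIDR to a target, and the most
# specific (longest-prefix) match wins, as in a real route table.
ROUTES = [
    ("10.0.0.0/16", "local"),    # traffic inside the VPC stays local
    ("0.0.0.0/0", "igw-12345"),  # everything else goes to the Internet Gateway
]

def route_for(destination):
    """Return the target for a destination IP via longest-prefix match."""
    dest = ipaddress.ip_address(destination)
    best = None
    for cidr, target in ROUTES:
        net = ipaddress.ip_network(cidr)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, target)
    return best[1] if best else None

print(route_for("10.0.1.5"))       # -> local
print(route_for("93.184.216.34"))  # -> igw-12345
```

Without the second route (the 0.0.0.0/0 entry pointing at the IGW), nothing outside the VPC is reachable, which is the route-table side of "no Internet Gateway, no internet".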
Some home routers allow you to broadcast a Guest network in addition to your normal network. This is a network where you can give guests a password, but they can't access your whole network -- this is good for security, since they can't snoop around your network to try and steal your data. This is similar in concept to having a separate subnet: there are two networks, and network access rules (NACLs) can block the traffic between them to improve security.
A home router typically blocks incoming access to your devices. This means that people on the Internet cannot access your computer, printer, devices, etc. This is good, since there are many bots on the Internet always trying to hack into devices on your network. However, the home router allows outbound requests from your devices to the Internet (e.g. a website) and it is smart enough to allow the responses to come back into the network. This is equivalent to a Security Group, which has rules that determine what Inbound and Outbound requests are permitted. Security Groups are stateful, which means they automatically allow return traffic even if it is not specifically listed. The difference is that the router is acting as the Security Group, whereas in an Amazon VPC it is possible to assign a Security Group to each individual resource (like having a router on each resource).
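The stateful/stateless distinction is the part that trips most people up, so here is a toy model of it in Python (not the real AWS API, and simplified: real return traffic arrives on ephemeral ports; the ports and rule sets here are made-up examples). A Security Group remembers outbound flows and lets the replies back in; a NACL checks each direction against explicit rules only:

```python
# Toy model (not the real AWS API) contrasting a stateful Security Group
# with a stateless NACL. The ports and rule sets are made-up examples.

class SecurityGroup:
    """Stateful: an allowed outbound request automatically lets the
    reply back in; no matching inbound rule is needed."""

    def __init__(self, outbound_ports):
        self.outbound_ports = outbound_ports
        self.tracked = set()  # remembered outbound flows (connection state)

    def allow_outbound(self, dst_port):
        if dst_port in self.outbound_ports:
            self.tracked.add(dst_port)  # remember the flow for the reply
            return True
        return False

    def allow_inbound_reply(self, src_port):
        return src_port in self.tracked  # return traffic rides the state


class NetworkAcl:
    """Stateless: each direction is checked against explicit rules only."""

    def __init__(self, inbound_ports, outbound_ports):
        self.inbound_ports = inbound_ports
        self.outbound_ports = outbound_ports

    def allow_outbound(self, dst_port):
        return dst_port in self.outbound_ports

    def allow_inbound_reply(self, src_port):
        return src_port in self.inbound_ports  # must be listed explicitly


sg = SecurityGroup(outbound_ports={443})
sg.allow_outbound(443)                # request to a website: allowed
print(sg.allow_inbound_reply(443))    # True: the reply comes back in

nacl = NetworkAcl(inbound_ports=set(), outbound_ports={443})
nacl.allow_outbound(443)              # request: allowed
print(nacl.allow_inbound_reply(443))  # False: no explicit inbound rule
```

This is why NACL configurations need explicit inbound rules for return traffic (typically the ephemeral port range), while Security Groups do not.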
That doesn't cover all the capabilities of an Amazon VPC, but it should give you an idea of how the network actually behaves.


How does AWS Direct Connect work?
From AWS docs:
AWS Direct Connect establishes a dedicated network connection between your on-premises network and AWS ... bypassing your internet service provider.
Please explain what it actually means in a practical sense. And why would it be any better?
NOTE: I know there are AWS docs (worst docs I've ever seen) and some other articles and answers on SO, but none of those explain what it actually means in practice, or they miss important points for somebody who's only used the public internet and can't imagine how it's possible to "bypass the public internet". So I decided to create this post from my collective knowledge, and I've also provided a real example from our company's case. I hope it will be useful to somebody.
So, from AWS docs:
AWS Direct Connect establishes a dedicated network connection between your on-premises network and AWS ... bypassing your internet service provider.
But to understand what it actually means, let's refresh some knowledge.
Public Internet
The internet cable goes from your PC to your ISP's (Internet Service Provider's) switch, usually located somewhere in your apartment building.
And then... it gets connected to another switch, and it goes on and on like that, travelling through other cables and switches until the signal reaches the target machine you wanted to reach. Each switch knows where to send the signal next (how it knows is a different topic).
So the signal first travels to some main switch of your ISP (where it does all the pricing/monitoring etc.). The ISP itself is then connected to another, bigger ISP (they have some sort of partnership where your ISP pays that other ISP for using its cables, just like you do with your own ISP). In total, lots of physical cables literally lie around the world, running undersea/underground/over the air, allowing the whole world to be connected.
Problem
There are millions of internet users. You can imagine how many switches and cables are needed to support such a big network, and how much traffic (hello TikTok) is travelling. Because of that:
Your signal doesn't travel via the optimal route through the switches when it needs to go from your PC to another target machine (some AWS machine in our case), especially when you're on different continents.
Heavy traffic also makes your ping fluctuate, depending on the load on each individual switch.
There are all sorts of switches along the way, and we can't really trust them. Moreover, ISPs aren't required to be compliant with the PCI security standard, or any other. If you want a secure connection, you have to use a VPN (IPsec, OSI layer 3), but it costs you in performance and latency.
AWS Direct Connect
AWS own internet
AWS came here and said: "let me create my own internet around the world. I'll literally lay down my own physical cables and connect them to only selected (partner) data centers, so you can use those instead of the public internet." AWS may still lease someone else's cables (especially the undersea ones), but most of the cables/switches are their own. Benefits:
It's a much smaller network, with fewer and more reliable switches. Cables go almost straight to AWS. Therefore it's faster.
AWS implements MACsec (OSI layer 2) security at the hardware level, so no VPN is required (though you could still use one). It's faster.
How to connect
Obviously, AWS can't connect to every PC in the world the way public ISPs do; otherwise it would just be the same public internet network, so to speak, and wouldn't make sense.
AWS has partnerships with selected data centers around the world, to which AWS has run physical connections and in which it has placed its own switches (an "AWS Direct Connect cage"). So if you put your server in such a data center (or at least have your server connect to one from the nearest location via the public internet), you can quickly enter the AWS network, where your signal will travel much faster. In that case you don't even need a public ISP.
You do this to:
Reduce latency and improve stability to your AWS environment.
Even reduce latency between non-AWS endpoints: both endpoints use the public internet only to reach the nearest AWS cage, and the cage-to-cage traffic then goes through AWS's internal network. And voila!
Results
In our company case, we managed to decrease the latency around 5 times (i.e. 0.5 seconds vs 2.5 seconds) for non-AWS connectivity.

Static IP to access GCP Machine Learning APIs via gRPC stream over HTTP/2

We're living behind a corporate proxy/firewall that can only consume static IP rules, not FQDNs.
For our project, we need to access the Google Speech-to-Text API: https://speech.googleapis.com. From outside the corporate network, we use a gRPC stream over HTTP/2 to do that.
The ideal scenario looks like:
Corporate network -> static IP in GCP -> forwarded gRPC stream to speech.googleapis.com
What we have tried is creating a global static external IP, but we failed when configuring the Load Balancer, as it can only connect to VMs, not APIs.
Alternatively, we were thinking of using the output of nslookup speech.googleapis.com for the IP address ranges and updating it daily, though that seems pretty 'dirty'.
I'm aware we can configure a compute engine resource / VM and forward the traffic, but this really doesn't seem like an elegant solution either. Preferably, we can achieve that with existing GCP networking components.
Many thanks for any pointers!
Google does not publish a CIDR block for you to use, and you will have daily grief trying to whitelist IP addresses. Most of Google's API services are fronted by the Global Frontend (GFE), which routes traffic using HTTP Host headers, not IP addresses, so filtering by IP address works against how the traffic is actually routed.
Trying to look up the IP addresses can be an issue. DNS does not have to return all IP addresses for name resolution in every call, which means a DNS lookup might return one set of addresses now and a different set an hour from now. This is one example of the grief you will cause yourself by whitelisting IP addresses.
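To make that concrete, here is a small illustration in Python (the addresses are invented, from the 203.0.113.0/24 documentation range, not real Google IPs): rules built from one DNS answer silently block the addresses returned later.

```python
# Invented addresses standing in for two DNS lookups of the same hostname
# at different times; DNS may return a different subset on each call.
lookup_at_noon = {"203.0.113.10", "203.0.113.11"}
lookup_at_1pm = {"203.0.113.12", "203.0.113.13"}

# Firewall whitelist built from the noon lookup only.
whitelist = set(lookup_at_noon)

# Every address served at 1pm that the whitelist now blocks.
blocked = lookup_at_1pm - whitelist
print(sorted(blocked))  # -> ['203.0.113.12', '203.0.113.13']
```

Every address in the second answer falls outside the whitelist, so all the 1pm traffic fails even though nothing on your side changed.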
Solution: Talk to your firewall vendor.
Found a solution thanks to clever networking engineers from Google, posting here for future reference:
You can use a CNAME in your internal DNS to point *.googleapis.com to private.googleapis.com. That record in public DNS points to a small range of public IP addresses (199.36.153.8/30) that are not reachable from the public internet, only through a VPN tunnel or Cloud Interconnect.
So if setting up a VPN tunnel to a project in GCP is possible (and it should be quite easy, see https://cloud.google.com/vpn/docs/how-to/creating-static-vpns), then this should solve the problem.
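As a quick sanity check on that range, Python's standard ipaddress module will expand the /30 quoted in the answer above (this only enumerates the block locally; it doesn't touch the network):

```python
import ipaddress

# The private.googleapis.com VIP range from the answer: a /30 block,
# i.e. four consecutive addresses your internal DNS should resolve to.
vip_range = ipaddress.ip_network("199.36.153.8/30")
addresses = [str(ip) for ip in vip_range]
print(addresses)
# -> ['199.36.153.8', '199.36.153.9', '199.36.153.10', '199.36.153.11']
```

Those four addresses are stable, which is what makes them usable in a static-IP-only firewall, unlike the rotating public addresses behind speech.googleapis.com.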

Can not communicate between subnets in the same virtual network

Not sure what exactly is happening, since it always worked before, but VMs on different subnets within the same virtual network, with no NSGs or firewalls between them, cannot talk to each other. Ping fails, as does any other sort of communication. Firewalls are disabled on both sides. All machines have internet access. Communication was attempted using IP addresses, not names. Both ping and TCP-based tests were used.
The effective route for app01, for example, is below.
By default, Azure allows communication between subnets in the same VNet.
Your issue seems to be on the Azure side; I suggest you open a ticket in the Azure Portal.

How To Secure Erlang Cluster Behind Private Subnet

I am testing Erlang and have a few questions related to the security of distribution. (There is a lot of mixed information out there.) These types of questions attract lots of opinions that depend on the situation and on your personal comfort level with the type of data you are dealing with. For the sake of this question, let's assume it is a simple chat server that users can connect to and chat together.
Example Diagram:
The cluster will be behind a private subnet in a VPC, with Elastic Load Balancing directing all connections to these nodes (to and from). The load balancer will be the only direct path to these nodes (there would be no way to connect to a node via name@privatesubnet).
My question is the following:
Based on this question and answer: Distributed erlang security how to?
There are two different types of inter-node communication that can take place: either directly connecting nodes using the built-in distribution, or doing everything over a TCP connection with a custom protocol. The first is the easiest, but I believe it comes with a few security issues, and I was wondering, based on the above diagram, if it would be good enough. (Er, okay, "good enough" is not always good when dealing with sensitive information, but there can always be better ways to do everything...)
How do you secure an Erlang cluster behind a private subnet? I would like to hide the nodes, connect them manually, and of course use cookies on them. Are there any flaws with this approach? And if a custom protocol over TCP would be the best option, what kind of impact does that have on performance? I want to know the potential security flaws. (As I said, there is a lot of mixed information out there on how to do this.)
I would be interested in hearing from people who have used Erlang in this manner!
On AWS, with your EC2 nodes in a private subnet, you are pretty safe from unwanted connections to your nodes. You can verify this by trying to connect (in any way) to the machines running your code: if you're using a private subnet, you will be unable to, because the instances are not even addressable from outside the subnet.
Your load-balancer should not be forwarding Erlang node traffic.
You can do a little better than the above using some security-group rules. Configure your nodes to use some range of ports. Then make a group "erlang" that allows connections to that port range from the "erlang" group and denies the connection otherwise. Finally, assign that security-group to all your Erlang-running instances. This prevents instances that don't need to talk to Erlang from being able to do so.
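A sketch of that self-referencing rule's effect, as a Python toy model (not the AWS API; the 9100-9199 port range is a made-up example, not an AWS or Erlang default). In a real deployment you would pin the distribution ports with Erlang's inet_dist_listen_min/inet_dist_listen_max kernel parameters so the security-group range matches what the nodes actually listen on:

```python
# Toy evaluation of the self-referencing "erlang" security-group rule:
# allow the Erlang distribution ports only when the source instance is
# itself a member of the "erlang" group.
EPMD_PORT = 4369                # Erlang port-mapper daemon (well-known port)
DIST_PORTS = range(9100, 9200)  # made-up pinned distribution port range

def connection_allowed(src_groups, dst_port):
    """True only for Erlang traffic coming from another 'erlang' member."""
    erlang_port = dst_port == EPMD_PORT or dst_port in DIST_PORTS
    return erlang_port and "erlang" in src_groups

print(connection_allowed({"erlang"}, 9150))  # True: member-to-member traffic
print(connection_allowed({"web"}, 9150))     # False: outsider is denied
```

The point of the self-reference is that membership, not IP address, decides access: adding a node to the group is all it takes to let it join the cluster.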
I think you have a very "classic" setup over there.
You aren't going to connect to the cluster from the Internet, "outside" the ELB. Assuming the "private" subnet is shared with something else, you can allow only certain IPs (or ranges) to connect via EPMD.
In any case, some machines must be "trusted" to connect via EPMD, and some others can only establish connections to certain other ports... otherwise, whatever is running your Erlang cluster is useless.
Something to think about: you might want to (and indeed will have to) connect to the cluster for some administrative tasks, either from the Internet or from somewhere else. I've seen this done via SSH; Erlang supports that out of the box.
A final word on doing everything over a TCP connection with a custom protocol: please don't. You will end up implementing on your own something that hardly has what Erlang already offers and is really awesome at, and in the end you'll have the same constraints.

How to setup EC2 Security Group to allow working with Firebase?

I am preparing a system of EC2 workers on AWS that use Firebase as a queue of tasks they should work on.
My app in node.js that reads the queue and works on tasks is done and working, and I would like to properly set up a firewall (EC2 Security Group) that allows my machines to connect only to my Firebase.
Each rule of that Security Group contains:
protocol
port range
and destination (IP address with mask, so it supports whole subnets)
My question is: how can I set up this rule for Firebase? I suppose the IP address of my Firebase is dynamic (it resolves to different IPs from different instances). Is there a list of possible addresses, or how would you address this issue? Could some kind of proxy be a solution that would not slow down my Firebase drastically?
Since using node to interact with Firebase is outbound traffic, the default security group should work fine (you don't need to allow any inbound traffic).
If you want to lock it down further for whatever reason, it's a bit tricky. As you noticed, there are a bunch of IP addresses serving Firebase. You could get a list of them all with "dig -t A firebaseio.com" and add all of them to your firewall rules. That would work for today, but there could be new servers added next week and you'd be broken. To be a bit more general, you could perhaps allow all of 75.126.*.*, but that is probably overly permissive and could still break if new Firebase servers were added in a different data center or something.
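If you did go down that road, the mechanical part is easy; the fragile part is the input. A hypothetical Python helper (the addresses below are made up, and tomorrow's dig output may differ, which is exactly the problem):

```python
# Hypothetical helper: turn a snapshot of resolved addresses (e.g. from
# "dig -t A firebaseio.com") into /32 egress rules. The addresses below
# are made up; tomorrow's lookup may return different ones, silently
# breaking these rules.
def egress_rules(ips, port=443):
    return [
        {"protocol": "tcp", "port": port, "cidr": f"{ip}/32"}
        for ip in sorted(ips)
    ]

rules = egress_rules(["75.126.0.11", "75.126.0.10"])
print(rules[0]["cidr"])  # -> 75.126.0.10/32
```

The helper itself is trivial; the maintenance burden of re-resolving and re-applying the rules every day is the real cost.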
FWIW, I wouldn't worry about it. Blocking inbound traffic is generally much more important than blocking outbound (since to generate outbound traffic, someone has to have already managed to run software on the box).