Originally asked on the AWS forums but I get the sense I won't hear back for quite some time, so I'm also posing my questions here:
I recently set up a Client VPN based on this guide. When connected I'm successfully able to access the internet as well as resources in a private subnet, so at this point I have a basic understanding of how all the parts fit together, except for one: the Client CIDR range. This concept gave me so much trouble that I think it stretched out the time-to-build by 2 days because of all the thrashing I did trying to connect it to the other concepts Client VPN involves. But it bugs me when I don't fully understand a thing so I have some questions about it:
Does the Range benefit at all from being in the same CIDR range as the VPC it's a part of, assuming it doesn't overlap with target network(s)? Why or why not?
Why does the Range need to be of size /22, while target networks can be as small as /27? Doesn't that imply that 2^5 times as many clients could be attempting to access a resource in a VPC as there are available addresses in a given subnet?
In setting up security groups for the private subnet I noticed that I had to use rules based on the CIDR range of the target subnet client connections landed in, rather than the Client CIDR range - why is that?
As you can probably tell from my questions, I'm not a network administrator. I'm trying to understand that world at the same time I'm trying to spin up useful infrastructure. My guess is the answers to these questions are blindingly obvious to someone with experience in that area, but I just don't get it.
Here are my attempts at clarification:
The range shouldn't overlap the VPC CIDR supernet (or individual subnets within the VPC), or you may get routing conflicts. So I'm not sure what you are referring to. Can you provide your configuration?
From what I can tell, the /16 to /22 range is not a technical restriction; it is probably just that AWS hasn't had a chance to add a feature that would allow more options. I'm assuming you want a smaller range? In Azure P2S VPN there is no such restriction: their minimum pool is a /29.
SGs are applied to resources such as EC2 instances and not to VPCs directly, but in the inbound rules you can specify CIDRs directly, so I'm not sure what you are referring to. Do you have a specific example you could share?
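To make the overlap and sizing points concrete, here is a minimal sketch using Python's standard `ipaddress` module. The CIDR values are hypothetical examples, not taken from the asker's configuration.

```python
import ipaddress

# Hypothetical example ranges (assumptions, not from the original post).
vpc_cidr = ipaddress.ip_network("10.0.0.0/16")
target_subnet = ipaddress.ip_network("10.0.1.0/27")
client_cidr_bad = ipaddress.ip_network("10.0.4.0/22")   # falls inside the VPC range
client_cidr_ok = ipaddress.ip_network("172.16.0.0/22")  # outside the VPC range


def overlaps_any(candidate, networks):
    """Return True if the candidate range overlaps any of the given networks."""
    return any(candidate.overlaps(net) for net in networks)


print(overlaps_any(client_cidr_bad, [vpc_cidr, target_subnet]))  # True
print(overlaps_any(client_cidr_ok, [vpc_cidr, target_subnet]))   # False

# Raw address counts behind the "/22 versus /27" question.
print(client_cidr_ok.num_addresses)   # 1024
print(target_subnet.num_addresses)    # 32
print(client_cidr_ok.num_addresses // target_subnet.num_addresses)  # 32, i.e. 2^5
```

The counts are raw block sizes only; the number of usable addresses in either range is smaller.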
We're using Lambda to submit API requests to various endpoints. Lately we have been getting 403-Forbidden replies from the API endpoint(s) we're using, but it's only happening randomly.
When it pops up it seems to happen for a couple of days and then stops for a while, but happens again later.
In order to troubleshoot this, the API provider(s) are asking me what IP address / domain we are sending requests from so that they can check their firewall.
I cannot find any report or anything showing me this, which seems unbelievable to me. I do see other threads about setting up VPC with private subnet, which would then use a static IP for all Lambda requests.
We can do that, but is there really no report or log that would show me a list of all the requests we've made and the IP/domain each came from in the current setup?
Any information on this would be greatly appreciated. Thanks!
I cannot find any report or anything showing me this, which seems unbelievable to me
Lambda exists to let you write functions without thinking about the infrastructure that it's deployed on. It seems completely reasonable to me that it doesn't give you visibility into its public IP. It may not have one.
AWS has the concept of an elastic network interface (ENI). This is an entity in the AWS software-defined network that is independent of both the physical hardware running your workload and any potential public IP addresses. For example, in EC2 an ENI is associated with an instance even when it's stopped, and even though it may run on different physical hardware and get a different public IP when it's next started (I've linked to the EC2 docs because that's the best description that I know of, but the same idea applies to Lambda, ECS, and anything else on the AWS network).
If you absolutely need to know what address a particular non-VPC Lambda invocation is using, then I think your only option is to call one of the "what's my IP" APIs. However, there is no guarantee that you'll ever see the same IP address associated with one of your Lambdas in the future.
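A minimal sketch of that approach, using only the standard library. It assumes `https://checkip.amazonaws.com` as the "what's my IP" service (it returns the caller's public IP as plain text); the handler name and the injectable `ip_lookup` parameter are illustrative choices, not part of any Lambda API.

```python
import json
import urllib.request

# Assumption: this endpoint returns the caller's public IP as plain text.
CHECK_IP_URL = "https://checkip.amazonaws.com"


def fetch_public_ip(url=CHECK_IP_URL, timeout=3.0):
    """Ask an external service which public IP this request came from."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def lambda_handler(event, context, ip_lookup=fetch_public_ip):
    """Log the egress IP seen for this invocation, then return it."""
    try:
        source_ip = ip_lookup()
    except OSError as exc:  # urllib's URLError and socket timeouts are OSErrors
        source_ip = f"unknown ({exc})"
    # Anything printed to stdout ends up in the function's CloudWatch log stream.
    print(json.dumps({"egress_ip": source_ip}))
    return {"egress_ip": source_ip}
```

Logging the value next to each outgoing API call would give the provider something to match against their firewall logs, with the caveat already stated: the address can differ from one invocation to the next.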
As people have noted in the comments, the best solution is to run your Lambdas in a private subnet in your VPC, with a NAT and Elastic IP to guarantee that they always appear to be using the same public IP.
I've watched hours upon hours of tutorials and have read until my eyes were about to bleed, but I just cannot seem to grasp how Amazon VPCs are working. I've created and deleted entire VPC environments with EC2 instances various times following tutorials, but as soon as I go to create one w/out the tutorial, I'm lost.
I'm trying to come up with an analogy to help me to better understand. What I have so far is something like this:
A VPC is like a Club. At the front of the club, you have an Entrance, the IGW. Inside the Club, you have different areas; the General Area which would be the public subnet and the Management Area which is the private subnet.
Within the General Area you would have a Dance Floor/Bar which would equate to an EC2 Instance and a Receiving Bay where management can receive deliveries and whatnot from the outside world, the NAT.
Then in the Management Area you'd have an Office, another EC2 Instance, and your Inventory which is like your RDS.
I think that's a somewhat accurate analogy so far, but once I start to try and work in the SGs, NACLs, RTs, etc, I realize that I'm just not grasping it all.
Can anyone help me with finishing this analogy or supply a better analogy? I'm at my wits' end.
Rather than using analogies, let's use the network you already have at home.
Within your home, you probably have a Router and various devices connected to the router. They might be directly connected via ethernet cables (eg a PC), or they might be connected via wifi (eg tablets, phones, Alexa). Your home network is like a VPC. Your various devices connect to the network and all of the devices can talk to each other.
You also have some sort of box that connects your router to the Internet. This might be a cable modem, or a fibre router or (in the old days) a telephone connection. These boxes connect your network (VPC) to the Internet and are similar in function to an Internet Gateway. Without these boxes, your network would not be able to communicate with the Internet. Similarly, without an Internet Gateway, a VPC cannot communicate with the Internet.
Some home routers allow you to broadcast a Guest network in addition to your normal network. This is a network where you can give guests a password, but they can't access your whole network -- this is good for security, since they can't snoop around your network to try and steal your data. This is similar in concept to having a separate subnet -- there are two networks, but filtering rules (like Network ACLs, or NACLs) block the traffic between them to improve security.
A home router typically blocks incoming access to your devices. This means that people on the Internet cannot access your computer, printer, devices, etc. This is good, since there are many bots on the Internet always trying to hack into devices on your network. However, the home router allows outbound requests from your devices to the Internet (eg a website) and it is smart enough to allow the responses to come back into the network. This is equivalent to a Security Group, which has rules that determine what Inbound and Outbound requests are permitted. Security Groups are stateful, which means they automatically allow return traffic even if it is not specifically listed. The difference is that the router is acting as the Security Group, whereas in an Amazon VPC it is possible to assign a Security Group to each individual resource (like having a router on each resource).
That doesn't cover all the capabilities of an Amazon VPC, but it should give you an idea of how the network actually behaves.
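The stateful behaviour described above for the home router and for Security Groups can be sketched as a toy model. This is only an illustration of the idea of connection tracking; the class, field names, and addresses are hypothetical, and it is not how AWS implements Security Groups.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int


@dataclass
class StatefulFirewall:
    """Toy stateful filter: allow rules only, replies to tracked flows pass."""
    allowed_outbound_ports: set
    allowed_inbound_ports: set
    _connections: set = field(default_factory=set)

    def outbound(self, pkt):
        if pkt.dst_port in self.allowed_outbound_ports:
            # Remember the flow so that the reply is accepted later.
            self._connections.add((pkt.src, pkt.src_port, pkt.dst, pkt.dst_port))
            return True
        return False

    def inbound(self, pkt):
        # A reply is the mirror image of a flow we initiated.
        if (pkt.dst, pkt.dst_port, pkt.src, pkt.src_port) in self._connections:
            return True
        # Otherwise only explicitly allowed inbound ports get through.
        return pkt.dst_port in self.allowed_inbound_ports


fw = StatefulFirewall(allowed_outbound_ports={443}, allowed_inbound_ports=set())
request = Packet(src="10.0.1.5", src_port=50000, dst="198.51.100.7", dst_port=443)
reply = Packet(src="198.51.100.7", src_port=443, dst="10.0.1.5", dst_port=50000)
unsolicited = Packet(src="198.51.100.7", src_port=443, dst="10.0.1.5", dst_port=22)

print(fw.outbound(request))     # True: outbound HTTPS is allowed
print(fw.inbound(reply))        # True: return traffic for a tracked flow
print(fw.inbound(unsolicited))  # False: no inbound rule, no matching flow
```

A stateless filter such as a NACL keeps no such connection table, so the reply would need its own explicit inbound rule.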
The official sample of AWS Advanced Networking Specialty questions contains a question (#7) about the most cost-effective connection between your on-premises data centre and AWS that ensures confidentiality and integrity of the data in transit to your VPC.
The correct answer involves establishing a managed VPN connection between the customer gateway appliance and the virtual private gateway over the Direct Connect connection.
However, one of the options in the list of answers offers a software VPN solution ("Set up an IPsec tunnel between your customer gateway and a software VPN on Amazon EC2 in the VPC"). The explanation of why this answer is incorrect says that:
it would not take advantage of the already existing Direct Connect connection
My question is: why wouldn't this software VPN connection take advantage of the already existing Direct Connect connection? What's the principal difference here?
Option 1: The question is flawed.
If you built a tunnel between a customer gateway device and an EC2 instance with traffic routing through the Direct Connect interconnection, then you are quite correct -- that traffic would use the existing Direct Connect connection.
If, on the other hand, you built a tunnel from the customer gateway to an EC2 instance over the Internet, then of course that traffic would not use the Direct Connect route.
There appears to be an implicit assumption that a tunnel between a device on the customer side and an EC2 instance would necessarily traverse the Internet, and that is a flawed assumption.
There are, of course, other reasons why the native solution might be preferable to a hand-rolled one with EC2 (e.g. survival of the complete loss of an AZ or avoidance of downtime due to eventual instance hardware failures), but that isn't part of the scenario.
Option 2: The answer is wrong for a different reason than the explanation offered.
Having written and reflected on the above, I realized there might be a much simpler explanation: "it would not take advantage of the already existing Direct Connect connection" is simply the wrong justification for rejecting this answer.
It must be rejected on procedural grounds, because of the instruction to Choose 3. Here are the other two correct answers.
A) Set up a VPC with a virtual private gateway.
C) Configure a public virtual interface on your Direct Connect connection.
You don't need to have either of these things in order to implement a roll-your-own IPSec tunnel between on-premise and EC2 over Direct Connect. A Virtual Private Gateway is the AWS side of an AWS-managed VPN, and a Public Virtual Interface is necessary to make one of those accessible from inside Direct Connect (among other things, but it is not necessary in order to access VMs inside a VPC using private IPs over Direct Connect).
I would suggest that the answer you selected may simply be incorrect because it doesn't belong with the other two, and that the explanation offered both misses the point entirely and is itself incorrect.
Can I assume that while my cloud function is running, no other cloud function (that is also currently running) also has the same IP address? In other words, do I "own" the IP address of the cloud function during the time in which it is running?
My guess is no, since it would just cost Google more money to do that without much benefit for 95% of users, but I couldn't find any info on this anywhere, hence this question.
If my intuition is correct, then perhaps the only way to be sure that my function has a unique IP is to assign it a static IP? As of writing, static IPs for Cloud Functions are apparently in beta.
As the product currently stands, you cannot assume that an outgoing request from Cloud Functions will appear to come from an IP address that no other function's outgoing traffic also appears to come from. As you've seen in the other question, there are blocks of addresses that Google owns, and the traffic could appear to come from anywhere within those blocks, depending on the region of deployment and other factors. You can expect that there are going to be far more Cloud Functions running concurrently, across all projects and all customers, than there are specific IPs within those blocks. So you should not make any assumptions about the originating IP. It could change at any time, and any function's or project's traffic may appear to come from it.
If this situation changes due to additional features offered by Cloud Functions, you might get a different set of guarantees, but it's not clear what those are without being in this beta program.
Doug is right. There is no guarantee about the IP address, and I haven't heard of any alpha/beta program offering a static public IP.
However, there is a beta program called VPC connector, in the networking section of the console, which allows you to define a small IP range (a /28 CIDR) that functions use to enter your project's VPC. You can then set up whatever routes and firewall rules you want for this range in your VPC.
Finally, the early access mentioned in the link, which shouldn't be public, is not exactly that. Stay tuned.
I am testing Erlang and have a few questions related to the security of the distribution. (There is a lot of mixed information out there.) These types of questions come with lots of opinions tied to specific situations, and depend on your personal comfort level with the type of data you are dealing with. For the sake of this question, let's assume it is a simple chat server where users can connect and chat together.
Example Diagram:
The cluster will be behind a private subnet VPC with elastic-load-balancing directing all connections to these nodes (to and from). The elastic-load-balancing will be the only direct path to these nodes (there would be no way to connect to a node via name#privatesubnet).
My question is the following:
Based on this question and answer: Distributed erlang security how to?
There are two different types of inter-node communication that can take place: either directly connecting nodes using built-in functionality, or doing everything over a TCP connection with a custom protocol. The first is the easiest, but I believe it comes with a few security issues, and I was wondering, based on the above diagram, whether it would be good enough. (OK, "good enough" is not always good when dealing with sensitive information, but there can always be better ways to do everything.)
How do you secure an Erlang cluster behind a private subnet? I would like to hide the nodes and manually connect them, and of course use cookies on them. Are there any flaws with this approach? And since a custom protocol using TCP would be the best option, what type of impact does that have on performance? I want to know the potential security flaws. (As I said, there is a lot of mixed information out there on how to do this.)
I would be interested in hearing from people who have used Erlang in this manner!
On AWS, with your EC2 nodes in a private subnet, you are pretty safe from unwanted connections to your nodes. You can verify this by trying to connect (in any way) to the machines running your code: if you're using a private subnet you will be unable to do so because the instances are not even addressable outside the subnet.
Your load-balancer should not be forwarding Erlang node traffic.
You can do a little better than the above using some security-group rules. Configure your nodes to use some range of ports. Then make a group "erlang" that allows connections to that port range only from members of the "erlang" group; security groups deny anything that is not explicitly allowed. Finally, assign that security group to all your Erlang-running instances. This prevents instances that don't need to talk to Erlang from being able to do so.
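As a rough sketch of the self-referencing "erlang" group, the rule payload could be built like this and passed to boto3's `authorize_security_group_ingress`. The group ID and the distribution port range are hypothetical placeholders; 4369 is the default EPMD port, and the range is assumed to match whatever the nodes were given via the kernel parameters `inet_dist_listen_min` and `inet_dist_listen_max`.

```python
# Assumptions: the port range and any group ID used here are placeholders.
EPMD_PORT = 4369                             # default Erlang port mapper port
DIST_PORT_MIN, DIST_PORT_MAX = 9100, 9105    # hypothetical distribution range


def self_referencing_ingress(group_id, port_ranges):
    """Build IpPermissions entries that admit TCP only from the same group."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": low,
            "ToPort": high,
            # Source is the security group itself rather than a CIDR block.
            "UserIdGroupPairs": [{"GroupId": group_id}],
        }
        for low, high in port_ranges
    ]


def apply_rules(group_id):
    """Attach the rules to an existing group (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=self_referencing_ingress(
            group_id,
            [(EPMD_PORT, EPMD_PORT), (DIST_PORT_MIN, DIST_PORT_MAX)],
        ),
    )
```

Because the source of each rule is the group itself, only instances that carry the same group can reach EPMD and the distribution ports.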
I think you have a very "classic" setup over there.
You aren't going to connect to the cluster from the Internet, that is, from "outside" the ELB. Assuming the "private" subnet is shared with something else, you can allow only certain IPs (or ranges) to connect via EPMD.
In any case, some machines must be "trusted" to connect via EPMD, while others can only establish a connection to some other port(s); otherwise anything running on your Erlang cluster is useless.
Something to think about: you might want to (and indeed you will have to) connect to the cluster to do some administrative tasks, either from the Internet or from somewhere else. I've seen this done via SSH; Erlang supports that out of the box.
A final word on doing everything over a TCP connection with a custom protocol: please don't. You will end up implementing something of your own that has hardly any of what Erlang offers, in an area where Erlang is really awesome. In the end, you'll have the same constraints.