How to Secure an Erlang Cluster Behind a Private Subnet - amazon-web-services

I am testing Erlang and have a few questions related to the security of its distribution. (There is a lot of mixed information out there.) These types of questions attract lots of opinions that depend on the situation and on your comfort level with the kind of data you are dealing with. For the sake of this question, let's assume it is a simple chat server that users can connect to and chat on together.
Example Diagram:
The cluster will sit behind a private subnet in a VPC, with Elastic Load Balancing directing all connections (to and from) to these nodes. The load balancer will be the only direct path to these nodes (there will be no way to connect to a node via name@privatesubnet).
My question is the following:
Based on this question and answer: Distributed erlang security how to?
There are two different types of inter-node communication that can take place: either directly connecting nodes using the built-in distribution, or doing everything over a TCP connection with a custom protocol. The first is the easiest, but I believe it comes with a few security issues, and I was wondering, based on the above diagram, whether it would be good enough. (Er, okay, "good enough" is never really good when dealing with sensitive information, but there are always better ways to do everything...)
How do you secure an Erlang cluster behind a private subnet? I would like to hide the nodes, connect them manually, and of course use cookies on them. Are there any flaws in this approach? And if a custom protocol over TCP is the better option, what kind of impact does that have on performance? I want to know the potential security flaws (as I said, there is a lot of mixed information out there on how to do this).
I would be interested in hearing from people who have used Erlang in this manner!

On AWS, with your EC2 nodes in a private subnet, you are pretty safe from unwanted connections to your nodes. You can verify this by trying to connect (in any way) to the machines running your code: if you're using a private subnet you will be unable to do so because the instances are not even addressable outside the subnet.
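As a quick illustration, a reachability probe run from a machine outside the subnet should fail; the private address below is a placeholder, and 4369 is EPMD's default port:

import socket

# Hypothetical private address of one of the Erlang nodes; from outside
# the private subnet this connection attempt should fail or time out.
NODE_PRIVATE_IP = "10.0.1.25"
EPMD_PORT = 4369  # Erlang Port Mapper Daemon's default port

try:
    with socket.create_connection((NODE_PRIVATE_IP, EPMD_PORT), timeout=5):
        print("Reachable -- the node is NOT properly isolated")
except OSError:
    print("Unreachable -- as expected for a private subnet")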
Your load-balancer should not be forwarding Erlang node traffic.
You can do a little better than the above with security-group rules. Configure your nodes to use a fixed range of ports for distribution. Then create a security group "erlang" that allows connections to that port range from members of the "erlang" group and denies them otherwise. Finally, assign that security group to all your Erlang-running instances. This prevents instances that don't need to talk to Erlang from being able to do so.
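A sketch of that self-referencing rule with boto3; the VPC ID and the 9100-9105 distribution port range are placeholders (use whatever range you configured via inet_dist_listen_min/max), and EPMD's port 4369 is included so the nodes can discover each other:

import boto3

ec2 = boto3.client("ec2")

# Hypothetical VPC ID and distribution port range -- adjust to your setup.
VPC_ID = "vpc-0123456789abcdef0"
PORT_MIN, PORT_MAX = 9100, 9105

sg_id = ec2.create_security_group(
    GroupName="erlang",
    Description="Erlang distribution traffic between cluster nodes only",
    VpcId=VPC_ID,
)["GroupId"]

# Allow the distribution port range and EPMD (4369) only from instances
# that are themselves members of this same "erlang" group.
ec2.authorize_security_group_ingress(
    GroupId=sg_id,
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": PORT_MIN, "ToPort": PORT_MAX,
         "UserIdGroupPairs": [{"GroupId": sg_id}]},
        {"IpProtocol": "tcp", "FromPort": 4369, "ToPort": 4369,
         "UserIdGroupPairs": [{"GroupId": sg_id}]},
    ],
)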

I think you have a very "classic" setup there.
You aren't going to connect to the cluster from the Internet, that is, from "outside" the ELB. If the "private" subnet is shared with something else, you can allow only certain IPs (or ranges) to connect via EPMD.
In any case, some machines must be "trusted" to connect via EPMD, and some others may only be allowed to establish connections to certain other ports; otherwise whatever is running your Erlang cluster is useless.
Something to think about: you might want to (and indeed you will have to) connect to the cluster for some administrative task(s), either from the Internet or from somewhere else. I've seen this done via SSH; Erlang supports that out of the box.
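One way to illustrate the SSH route (the bastion address, node IP, and port numbers here are placeholders): forward EPMD and a distribution port through a trusted host, then point a local Erlang shell at the forwarded ports.

import subprocess

# Hypothetical bastion and node addresses -- placeholders only.
BASTION = "admin@bastion.example.com"
NODE_IP = "10.0.1.25"

# Forward EPMD (4369) and one distribution port (9100) through the
# bastion so a local Erlang shell can reach the remote node.
subprocess.run([
    "ssh", "-N",
    "-L", f"4369:{NODE_IP}:4369",
    "-L", f"9100:{NODE_IP}:9100",
    BASTION,
])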
A final word on doing everything over a TCP connection with a custom protocol: please don't. You will end up implementing, on your own, something that hardly matches what Erlang already offers and is really awesome at. In the end, you'll have the same constraints.

Related

Restrict access to some endpoints on Google Cloud

I have a k8s cluster that runs my app (GCE as an ingress), and I want to restrict access to some endpoints ("/test/*") while keeping all other endpoints publicly available. I don't want to restrict specific IPs, so that I keep some flexibility and the ability to access restricted endpoints from any device, like my phone.
I considered IAP, but it restricts access to the whole service when I need it only for some endpoints, so it's overkill.
I have thought about a VPN, but I don't understand how to set one up, or whether it would even resolve my issues.
I have heard about proxies, but it seems to me they can't fulfill my requirements (?)
I can't say the solution needs to be super extensible or generic, because only a few people will use this feature.
I want the solution to be light, flexible, and simple while fulfilling my needs. If the available solutions are complex, I would consider restricting access by IP, but I worry about how viable the restricted-IP approach is in real life. Would it be too cumbersome to add my phone's IP every time I change location, and so on?
You can use API Gateway for that. It approximately meets your needs, though it's not especially flexible or simple.
But it's fully managed and can scale with your traffic.
For a more convenient solution, you have to use a software proxy (or an API gateway), or go to the bank and use Apigee.
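To illustrate the software-proxy option, here is a minimal sketch in Python/Flask that allowlists client IPs for paths under /test/ only; the allowlisted address is a placeholder, and a real deployment behind a load balancer would also need to trust the forwarded client-IP header:

from flask import Flask, request, abort

app = Flask(__name__)

# Hypothetical allowlist -- e.g. your VPN's egress IP.
ALLOWED_IPS = {"203.0.113.7"}

@app.before_request
def restrict_test_endpoints():
    # Only paths under /test/ are restricted; everything else stays public.
    if request.path.startswith("/test/") and request.remote_addr not in ALLOWED_IPS:
        abort(403)

@app.route("/test/secret")
def secret():
    return "restricted endpoint"

@app.route("/hello")
def hello():
    return "public endpoint"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)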
I set up OpenVPN.
The process was not free of various small obstacles, but I encourage you to do the same.
Get a host (machine, cluster, or whatever) with a static IP.
Set up an OpenVPN instance. I use the Docker image https://hub.docker.com/r/kylemanna/openvpn/ (follow its instructions, but set your host via -u YOUR_IP).
Ensure that the VPN setup works from your local machine.
On the routes you need to protect, limit IP access to the VPN's IP. Nginx example:
location /test/ {
    allow x.x.x.x;   # the VPN's static egress IP
    deny all;        # everyone else gets 403
}
Make sure that nginx sees the real client IP. I had an issue where nginx was taking the load balancer's IP as the client IP, so I had to mark it as trusted via the realip module: http://nginx.org/en/docs/http/ngx_http_realip_module.html
Test the setup (a quick check is sketched below).
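A minimal check that the restriction holds, assuming the hypothetical domain and the /test/ path from above; run it once off the VPN (expect 403) and once on it (expect 200):

import requests

# Hypothetical URL of an endpoint behind the allow/deny rules.
URL = "https://example.com/test/secret"

resp = requests.get(URL, timeout=10)
print(resp.status_code)  # expect 403 off the VPN, 200 when connected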

How does AWS Direct Connect work?

How does AWS Direct Connect work?
From AWS docs:
AWS Direct Connect establishes a dedicated network connection between your on-premises network and AWS ... bypassing your internet service provider.
Please explain what it actually means in a practical sense. And why would it be any better?
NOTE: I know there are the AWS docs (the worst docs I've ever seen) and some other articles and answers on SO, but none of those explain what it actually means in practice, or they miss notes that are important for somebody who has only ever used the public internet and can't imagine how it's possible to "bypass the public internet". So I decided to create this post from my collective knowledge, and I also provide a real example from our company's case. Hope it will be useful for somebody.
So, from AWS docs:
AWS Direct Connect establishes a dedicated network connection between your on-premises network and AWS ... bypassing your internet service provider.
But to understand what it actually means, let's refresh some knowledge.
Public Internet
An internet cable goes from your PC to your ISP's (Internet Service Provider's) switch, usually located somewhere in your apartment building.
From there it connects to another switch, and it goes on and on like that, travelling through further cables and switches until it reaches the target machine you wanted to reach. Each switch knows where to send the signal next (how is a different topic).
So the signal first travels to some main switch of your ISP (where all the pricing/monitoring etc. happens). Your ISP is in turn connected to another, bigger ISP (they have some sort of partnership in which your ISP pays that other ISP for using its cables, just as you do with your own ISP). In total, lots of physical cables literally lie around the world, going undersea/underground/over-the-air, allowing the whole world to be connected.
Problem
There are millions of internet users. You can imagine how many switches and cables are needed to support such a big network, and how much traffic (hello TikTok) is traveling over it. Because of that:
Your signal doesn't travel via the most optimal route through the switches when it needs to go from your PC to another target machine (some AWS machine in our case), especially when you're on different continents.
Heavy traffic also makes your ping fluctuate, depending on the load on each individual switch.
There are all sorts of switches, and we can't really trust them. Moreover, ISPs aren't required to comply with the PCI security standard, or any other. If you want a secure connection, you have to use a VPN (IPsec, OSI layer 3), but that costs you performance and ping time.
AWS Direct Connect
AWS own internet
AWS came along and said: "Let me create my own internet around the world. I'll literally lay down my own physical cables and connect them only to selected (partner) data centers, so you can use those instead of the public internet." AWS may still lease some cables (especially undersea ones), but most of the cables/switches are their own. Benefits:
It's a much smaller network with fewer, more reliable switches, and the cables go almost straight to AWS. Therefore it's faster.
AWS implements MACsec (OSI layer 2) security at the hardware level, so no VPN is required (though you could still use one). That's also faster.
How to connect
Obviously, AWS can't connect to every PC in the world the way public ISPs do; otherwise it would just be another public internet, so to speak, and wouldn't make sense.
AWS has partnerships with selected data centers around the world, to which it has run physical connections and in which it has placed its own switches (an "AWS Direct Connect cage"). So if you put your server in such a data center (or at least have your server connect to one from the nearest location via the public internet), you can quickly enter the AWS network, where your signal travels much faster. In that case you don't even need a public ISP.
You do this to:
Reduce latency into your AWS environment and improve stability.
Even reduce latency between non-AWS endpoints, when both endpoints use the public internet only to reach their nearest AWS cage, while the cage-to-cage traffic goes through AWS's internal network. And voilà!
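The difference is easy to measure from both paths. A rough probe (the endpoint is a placeholder, and timing a TCP handshake only approximates ping):

import socket
import time

# Hypothetical endpoint to probe -- run this once over the public
# internet and once over the Direct Connect path, then compare.
HOST, PORT = "example.com", 443

start = time.monotonic()
with socket.create_connection((HOST, PORT), timeout=10):
    rtt_ms = (time.monotonic() - start) * 1000
print(f"TCP connect time: {rtt_ms:.1f} ms")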
Results
In our company's case, we managed to decrease latency by around 5x (i.e. 0.5 seconds vs. 2.5 seconds) for non-AWS connectivity.

Why doesn't GCP's "Memorystore for Redis" allow the option to add a public IP?

Currently, when trying to create a "Memorystore for Redis" instance in GCP, there is no option to add a public IP.
This poses a problem, as I am unable to connect to it from a Compute Engine instance on an external network while the Redis instance is in another network.
Why is this missing?
Redis is designed to be accessed by trusted clients inside trusted environments. This means that usually it is not a good idea to expose the Redis instance directly to the internet or, in general, to an environment where untrusted clients can directly access the Redis TCP port or UNIX socket.
Redis Security
I think it's because of a design decision, but in general this is not something we can know, since we are not part of the product team, so I don't think this question can be easily answered on SO.
According to this Issue Tracker entry, there are no plans to support this in the near future.
That said, you may want to take a look at this doc, which shows some workarounds to connect from a network outside the VPC.
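One of those workarounds amounts to tunneling through a VM inside the VPC. A sketch with redis-py, where the host and port are the local end of a hypothetical SSH port-forward to the instance's private IP (not a public IP):

import redis

# Connect to the local end of a hypothetical SSH tunnel that forwards
# port 6379 to the Memorystore instance's private IP inside the VPC.
r = redis.Redis(host="127.0.0.1", port=6379)
r.set("greeting", "hello")
print(r.get("greeting"))  # b'hello'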

Should I block EC2 from accessing external networks to improve safety?

I want to use Kubernetes on some cloud (maybe Amazon, Google, etc.). Should I disallow my EC2 machines from accessing the external network? My guess is as follows, and I wonder whether it is correct or wrong.
I should disallow EC2 from accessing the external network. Otherwise, hackers can attack my machines more easily. (True?)
How to do it: I should use a dedicated load balancer (maybe an Ingress) with the external IP that my domain name is bound to. The load balancer will then talk to my actual application (which has no external IP and can only access the internal network). (True?)
Sorry I am new to Ops, and thanks for any help!
Allowing or disallowing your EC2 instances from accessing external networks, i.e. keeping or dropping the rule in your security group that allows all outgoing traffic, won't be of much use for keeping hackers out; that's what the incoming-traffic rules are for. Restricting outgoing traffic will, however, prevent unwanted traffic from going out after a hacker has reached your instance, managed to install malicious software on it, and tried to initiate outgoing communication.
The outgoing-traffic rule is usually kept in place to allow things like software installs and updates, and it won't affect how your instances respond to incoming requests (legitimate or not).
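If you do decide to lock down egress, here is a sketch with boto3 that removes the default allow-all outbound rule and re-allows only HTTPS; the group ID is a placeholder, and HTTPS-only is just an assumption to keep installs and updates working:

import boto3

ec2 = boto3.client("ec2")
SG_ID = "sg-0123456789abcdef0"  # hypothetical security group ID

# Remove the default "allow all outbound" rule...
ec2.revoke_security_group_egress(
    GroupId=SG_ID,
    IpPermissions=[{"IpProtocol": "-1",
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
)

# ...and allow only HTTPS out, e.g. for package installs and updates.
ec2.authorize_security_group_egress(
    GroupId=SG_ID,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
)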
It is a good idea to have a load balancer in front of your instances and have it be the only allowed point of entry to your services. It's a good pattern to follow, and your instances will not need to have an external IP address.
Having a bastion host is a good idea as well; use it to manage the instances themselves. I would also recommend Systems Manager's Session Manager for that task.

Websocket Load Balancing on AWS EC2

We are building a scaled application that uses WebSockets on AWS EC2. We considered using the default ELB (Elastic Load Balancing) for this, but that unnecessarily makes the load balancer itself a bottleneck for traffic-heavy operations (see this related thread), so we are currently looking into a way to send the client the connection details of a "good instance" to connect to instead. However, the Elastic Load Balancing API does not seem to support a query of the sort "give me the (public) connection details of a good instance", which is odd, because that is the core functionality of any load balancer. Maybe I have just not looked in the right place?
UPDATE:
Currently, we are investigating two simple solutions using default implementations:
Use ELB in TCP mode which tunnels all traffic through the ELB.
Simply connect to the public IP of the instance that the ELB connected you to for your GET request. The second solution requires public IPs to be enabled, but does not route all traffic through the ELB.
I was concerned about that very last part, because I assumed that the ELB is not in the same building as the instance it hands you. But I assume it usually is in the same building, or has some other high-speed connection to the instances? In that case, the tunneling overhead is negligible.
Both solutions seem to be equally viable, or am I overlooking something?
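To illustrate the second approach from the client's side: ask over HTTP which instance to use, then open the WebSocket directly to it. The /ws-endpoint route and its JSON shape are made up for this sketch (it uses the requests and websockets packages):

import asyncio
import requests
import websockets

async def main():
    # Hypothetical bootstrap endpoint behind the ELB that returns the
    # public host/port of a "good instance" to connect to.
    info = requests.get("https://elb.example.com/ws-endpoint", timeout=10).json()
    uri = f"ws://{info['host']}:{info['port']}/chat"

    # Open the WebSocket directly to that instance, bypassing the ELB.
    async with websockets.connect(uri) as ws:
        await ws.send("hello")
        print(await ws.recv())

asyncio.run(main())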
If your application manages to make the ELB a bottleneck, then you are a pretty big fish. Why not first try their load balancer, trusting that they do their job right? It is difficult to do "better", and the most difficult part is defining what "better" even means in the first place. You certainly haven't defined that very well in your question, so I am pretty sure you are well off just using their load balancer.
In some cases it might make sense to develop your own load balancing logic, especially if your machine usage depends on very special metrics not per se accessible to the ELB system.
Yes, I'd say both solutions are viable.
The upside of the second is that it allows greater customization of the load-balancing logic you may want to implement (an improvement over the ELB's round robin), dispatching requests to a server of your choosing after an initial HTTP GET request.
The downside may be on the security front. It's not clear whether security and SSL are part of your requirements, but if they are, the second solution forces you to handle them at the EC2-instance level, which can be inconvenient and can affect each node's performance; otherwise the WebSocket communication may be left unsecured.
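If you do terminate TLS on the instances, a minimal sketch of a wss:// echo server with the websockets package (the certificate paths are placeholders):

import asyncio
import ssl
import websockets

async def echo(ws):
    # Echo every message back to the sender.
    async for message in ws:
        await ws.send(message)

async def main():
    # Hypothetical certificate/key issued for this instance's public name.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain("/etc/ssl/node.crt", "/etc/ssl/node.key")

    # Serve wss:// directly from the EC2 instance, no ELB in the path.
    async with websockets.serve(echo, "0.0.0.0", 8443, ssl=ctx):
        await asyncio.Future()  # run forever

asyncio.run(main())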