AWS Network Firewall with Suricata rules

I'm looking into implementing AWS Network Firewall with Suricata IPS rules, and I'm finding it hard to locate real examples and ideas of which rules are relevant. Our customer puts emphasis on IPS, IDS, and anti-malware.
My setup today is Internet Gateway -> Application Load Balancer -> auto-scaling ECS containers. Correct me if I'm wrong, but the firewall fits in between the Internet Gateway and the ALB?
I have spent some time staring at the rule group setup screen in the console, and my initial questions are:
How do I determine what rules are applicable to me?
What is "Capacity" really?
Starting with number one: I believe the rules I can choose from are listed here, and initially I thought that I surely wanted to use all the 30k (?) rules they supply. Thinking about it a bit more, I assume that might hurt responsiveness for our end users. So, if I'm thinking IPS, which rule sets are necessary for a web solution with only ports 80 and 443 open to the public? The file containing all "emerging" rules lists about 30k rules, but I hardly think all of them are relevant to me.
Regarding point two, Capacity, Amazon states the following as an explanation:
Maximum processing capacity allowed for the rule group. Estimate the stateful rule group’s capacity requirement as the number of rules you expect to add. You can’t change or exceed this setting when you update the rule group.
Initially I thought that "one capacity" refers to one line (one rule in any rule set), but I later understood that one line itself might require up to 450 "capacity" (I've lost the link where I read/interpreted this).
I understand that this subject is huge, and I'm somewhat of a rookie when it comes to firewalls, but can anyone enlighten me on how to approach this? I feel as if I'm not certain what I'm asking about, so please let me know if I need to clarify anything.

I have recently developed an integration between IDSTower (a Suricata & rule-management solution) and AWS Network Firewall, so I can relate to the confusion :)
How do I determine what rules are applicable to me?
The starting point should be the services you are protecting; once you know that, things will be easier. ET Open/Suricata rules can be grouped in different ways: they are published in different files (e.g. emerging-smtp.rules, emerging-sql.rules, etc.), and each rule contains a classtype that classifies it (e.g. bad-unknown, misc-attack, etc.) as well as metadata like tags, signature_severity, etc.
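For example, since the classtype is embedded in each rule line, a few lines of Python are enough to see how a file breaks down by class. This is a rough sketch; the file name and regex are illustrative, not part of any official tooling:

```python
import re
from collections import Counter

# Sketch: count ET Open rules per classtype to see how a rules file
# breaks down. "emerging-sql.rules" is just an example file name.
classtype_re = re.compile(r"classtype:([\w-]+);")

with open("emerging-sql.rules") as f:
    counts = Counter(
        m.group(1)
        for line in f
        if (m := classtype_re.search(line))
    )

for classtype, n in counts.most_common():
    print(f"{classtype}: {n}")
```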
Another important thing to point out here is that AWS Network Firewall limits the size of the rules uploaded to a single stateful rule group to 2 MB, which will force you to pick and choose your rules.
There are several approaches to deciding which rules to enable:
Using the grouping of rules explained above, start by enabling a small subset, monitor the output, adjust/tune, and enable another subset, until you cover your services; so start small and grow the enabled rules.
Enable all of the rules (in IDS mode) and assess the alerts, disabling/tuning noisy or useless ones until you reach a state of confidence.
Enable rules that monitor the protocols your system speaks; if you are protecting HTTP-based web services, start by enabling rules that monitor the HTTP protocol ('alert http ...'), as in the sketch after this list.
If you are applying the above to a production environment, make sure you start by alerting only, and once you have removed false positives you can move the rules to drop.
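Here is a minimal sketch of that protocol-first approach, combined with the 2 MB limit mentioned above (the file name is a placeholder for a downloaded ET Open rules file):

```python
# Sketch: keep only 'alert http' rules from an ET Open file and check
# that the result fits in a single stateful rule group (2 MB limit).
# "emerging-all.rules" is a placeholder for a downloaded rules file.

MAX_RULE_GROUP_BYTES = 2 * 1024 * 1024

with open("emerging-all.rules") as f:
    http_rules = [line for line in f if line.lstrip().startswith("alert http")]

payload = "".join(http_rules)
print(f"{len(http_rules)} HTTP rules, {len(payload.encode('utf-8'))} bytes")

if len(payload.encode("utf-8")) > MAX_RULE_GROUP_BYTES:
    print("Too big for one rule group; split further, e.g. by classtype.")
```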
What is "Capacity" really?
AWS uses the capacity setting to make sure your cloud Suricata instance can deliver the promised performance, which is largely influenced by the number of enabled rules.
A single stateful rule consumes 1 capacity unit.
Initially I thought that "one capacity" refers to one line (one rule in any rule set), but I later understood that one line itself might require up to 450 "capacity" (I've lost the link where I read/interpreted this).
Yes, Suricata rules (which are stateful in the AWS Network Firewall world) consume 1 capacity point per rule line; for stateless rules, however, a single rule can consume more depending on protocols, sources, and destinations, as mentioned in the AWS docs:
A rule with a protocol that specifies 30 different protocols, a source with 3 settings, a destination with 5 settings, and single or no specifications for the other match settings has a capacity requirement of (30 × 3 × 5) = 450.
Here is the AWS Network Firewall Docs link
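To make the capacity setting concrete, here is a hedged boto3 sketch of creating a stateful rule group (the group name, capacity value, and rule are illustrative). Note that Capacity is fixed at creation time, so size it for the rules you expect to grow into:

```python
import boto3

# Sketch: create a stateful rule group from a Suricata-format string.
# Name and capacity are illustrative; Capacity cannot be changed after
# creation, so leave headroom for future rules (1 unit per rule line).
client = boto3.client("network-firewall")

suricata_rules = (
    'alert http any any -> any any '
    '(msg:"Example HTTP alert"; sid:1000001; rev:1;)\n'
)

client.create_rule_group(
    RuleGroupName="example-http-rules",  # hypothetical name
    Type="STATEFUL",
    Capacity=100,                        # room for ~100 rule lines
    Rules=suricata_rules,
    Description="Example stateful rule group",
)
```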

Related

GKE: Protect cluster from accidental deletion

Is there a way on Google Kubernetes Engine to prevent a cluster from accidental deletion?
I know that this can be set at the Compute Engine level as described in the relevant docs.
I cannot seem to find anything at the cluster level.
To protect a cluster, and all the resources involved with it, from accidental deletion, there is still work to be done ahead. As you can read in [1], this has been under discussion for quite a long time (almost 4 years), with some in favor and some against. Some of those flags are set on the managed resources in GKE, so only upgrades (or deleting the full cluster) can be done, but some of the flags may not work on other resources (like "protected"). So the handling of this is still left to the user, who needs to be careful when applying YAMLs that may affect the configuration, deployment cycles, and resources of their clusters. GKE actually prompts twice (even though it seems like once) when deleting a cluster, see [2], but once again, that relies on the client.
I trust this information can be helpful for you.
[1] https://github.com/kubernetes/kubernetes/issues/10179
[2] https://cloud.google.com/kubernetes-engine/docs/how-to/deleting-a-cluster
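As a point of reference, the Compute Engine-level protection the question mentions is a per-instance flag; a rough sketch with the Google API discovery client is below (project, zone, and instance names are placeholders, and note there is no equivalent flag for a GKE cluster itself):

```python
import googleapiclient.discovery

# Sketch: enable deletion protection on a single Compute Engine
# instance. Project, zone, and instance name are placeholders; GKE
# exposes no equivalent flag at the cluster level.
compute = googleapiclient.discovery.build("compute", "v1")

compute.instances().setDeletionProtection(
    project="my-project",      # hypothetical project ID
    zone="us-central1-a",      # example zone
    resource="my-instance",    # hypothetical instance name
    deletionProtection=True,
).execute()
```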

How to setup elastic inference combined with an EC2 on AWS

I just found out about the new AWS Elastic Inference to make access to and use of teraflops cheaper. I found it unbelievably complex to understand what the intro page is talking about, especially the part where they explain what needs to be done to get it set up and running. Until now I have just been using p2.xlarge instances to run deep learning training and inference.
I am mostly interested in combining an EC2 c4.xlarge or c5.xlarge with the eia1.large.
Did anybody go through the steps already? Is there a full step-by-step tutorial for that context? Unfortunately the current "tutorial" just points to other tutorials that are too general.
I was struggling at the beginning too, getting errors when trying to run their example, but eventually I managed. Here's where you should start, and I cannot emphasize enough the importance of following the instructions on this page in detail:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/setting-up-ei.html
A very, very important and complete set of instructions. Pay attention to all inbound and outbound rules and the different security group settings.
Then simply follow this example and take it from there:
https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-tf-elastic-inference.html
Good luck!
P.S. My instance setup was c5.xlarge + eia1.medium.
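For orientation, attaching an accelerator at launch looks roughly like this with boto3 (the AMI, subnet, and security group IDs are placeholders; the networking prerequisites from the setup guide above still apply):

```python
import boto3

# Sketch: launch a c5.xlarge with an eia1.medium accelerator attached.
# All IDs are placeholders; the instance still needs the VPC endpoint
# and security group rules from the setting-up-ei guide to reach the
# accelerator.
ec2 = boto3.client("ec2")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",            # hypothetical DLAMI
    InstanceType="c5.xlarge",
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",        # placeholder
    SecurityGroupIds=["sg-0123456789abcdef0"],  # placeholder
    ElasticInferenceAccelerators=[{"Type": "eia1.medium"}],
)
```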

What is the difference between ColdFusion Clustering and Network Load Balancing?

I have gone through several links regarding the difference between clustering and network load balancing. What I got:
Clustering: A cluster is a group of resources that are trying to achieve a common objective, and are aware of one another. Clustering usually involves setting up the resources (servers usually) to exchange details on a particular channel (port) and keep exchanging their states, so a resource’s state is replicated at other places as well.
Load Balancing: serving a large number of requests (web, VPN connections, ...) by having multiple "copies" of a server.
Now I am not sure: what is the difference between ColdFusion clustering and network load balancing?
What are the benefits of creating multiple instances on a single CF server, clustering them, and hosting my webapp on them?
In terms of CF, the main difference between "clustering" and "load balancing" is that with clustering the session scope variables, caches, etc. are shared around the cluster, so all servers know about the sessions and can therefore answer any request. With load balancing you are simply splitting the traffic between different servers that are not aware of each other, so things like session variables are not shared; if they are used for, say, a login process, the user will be logged out again when moving from Server A to Server B on subsequent requests. In that situation you will need to implement something like "sticky sessions" on the load balancer to stop people from moving between servers.
Clustering is a broad, imprecise term. Anything involving more than one computer is called "clustering", even the most trivial approaches, which makes it a great marketing term. There is also a data-mining technique, clustering (an alias of cluster analysis), that has nothing to do with server clustering.
Try to use more precise terms:
load-balancing for example indicates that every host is essentially doing the same thing.
failover goes a step beyond load-balancing: whereas in load balancing one could split the users (say, by the hash code of their username) so that every node is only responsible for part of the users, the term failover indicates that any host may fail and there is a node that can replace its functionality.
sharding is the opposite. It is also a type of load-balancing, but one where you split the load by some kind of key (see the sketch after this list). You can of course have sharded scenarios that still support failover (at reduced performance) when you do the sharding mostly to improve caching.
cluster-computing commonly refers to distributed computation, such as computing weather models by parallel-processing. This will require interaction between hosts, whereas load-balancing of web sites usually involves only one frontend node.
Most likely, ColdFusion only supports failover style load-balancing. Every node will be able to serve every request (so any node may fail), and will only benefit mildly from sharding.
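To make the split-by-key idea concrete, here is a minimal sketch (node names and the hash choice are arbitrary) that routes each username to a fixed node via a stable hash:

```python
import hashlib

# Sketch: map each username to one of N nodes with a stable hash, so
# the same user always lands on the same shard. Node names are
# arbitrary examples.
NODES = ["node-a", "node-b", "node-c"]

def shard_for(username: str) -> str:
    digest = hashlib.md5(username.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

print(shard_for("alice"))  # deterministic: always the same node
```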

Can you get a cluster of Google Compute Engine instances that are *physically* local?

Google Compute Engine lets you get a group of instances that are semantically local in the sense that only they can talk to each other and all external access has to go through a firewall etc. If I want to run Map-Reduce or other kinds of cluster jobs that are going to induce high network traffic, then I also want machines that are physically local (say, on the same rack). Looking at the APIs and initial documentation, I don't see any way to request that; does anyone know otherwise?
There is no support in GCE right now for specifying rack locality. However, we built the system to work well in the face of large numbers of instances talking to each other in a fully connected way, as long as they are in the same zone.
This is one of the things that allowed MapR to approach the record for a Hadoop TeraSort. You can see that in action in the video of Craig McLuckie's talk from I/O:
https://developers.google.com/events/io/sessions/gooio2012/302/
The best way to find out is to test your application and see how it performs.

Are single-purpose instances recommended?

When building an application/system that is to be run in the cloud (e.g., AWS),
is it recommended to always make single-purpose instances?
For example, should I have two instances running MySQL (master+slave), and then two web-server instances, instead of combining web+MySQL in one (possible larger) instance?
What are the pros and cons, besides separation of concerns?
The primary reasons why it's better to have single-purposes instances are:
1) It's easier to scale (e.g. you can scale up just the bottlenecks rather than having to scale the entire stack).
2) It's more secure (e.g. your MySQL database isn't on a server that has port 80 open because it also needs to accept your HTTP traffic); a sketch of locking down the database tier this way follows below.
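As an illustration of point 2 on AWS, here is a hedged boto3 sketch (both security group IDs are placeholders) that allows MySQL traffic into the database tier only from the web tier's security group, instead of exposing it publicly:

```python
import boto3

# Sketch: allow MySQL (port 3306) into the DB security group only from
# the web tier's security group. Both group IDs are placeholders.
ec2 = boto3.client("ec2")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # hypothetical DB tier group
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 3306,
        "ToPort": 3306,
        "UserIdGroupPairs": [
            {"GroupId": "sg-0fedcba9876543210"}  # hypothetical web tier group
        ],
    }],
)
```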
The only good reason not to have single-purpose instances is price. It costs money and for some people it's too much.
If you're doing any kind of e-commerce then definitely use single-purpose instances since most security standards (like PCI-DSS for example) require it. If you're running a content site that doesn't have any e-commerce components and doesn't accept sensitive data from your users, then you can probably be a little looser to save a few bucks, but I don't recommend it.
Splitting the database apart from the front-end web server(s) is a standard recommendation.
Here's a good writeup on some of the issues to consider:
http://www.mysqlperformanceblog.com/2006/10/16/should-mysql-and-web-server-share-the-same-box/