AWS WAF Log Utilisation + Penetration Testing with Web Applications - amazon-web-services

How can AWS Web Application Firewall help me identify which penetration testing I should use against my web application? Once I have access to the WAF logs, how can I best utilise them to identify penetration testing activity?

I would say that the AWS WAF (or any WAF) is not a good indicator of what type of pentesting you should be doing. Determining the scope and type of pentesting is one of the most important first steps a qualified pentester or consultancy should take.
On the topic of WAFs, I would also say that they are not a good indicator of true manual pentesting. While the AWS WAF is great at catching SQL Injection and XSS test cases, it is not capable of detecting parameter tampering attacks.
So while it can create alerts that are likely to detect scans, it may fall short of detecting subtle human-driven test cases (which are often more dangerous).
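That said, the WAF logs are still handy for spotting noisy, scanner-like activity. A minimal sketch, assuming the logs have been delivered as newline-delimited JSON (e.g. via Kinesis Firehose to S3) and copied to a local file whose path is a placeholder:

```python
import json
from collections import Counter

def summarize_waf_log(path="waf-logs.jsonl"):
    """Count blocked requests per terminating rule and per client IP.

    Assumes AWS WAF logs as newline-delimited JSON, one record per line.
    """
    by_rule, by_ip = Counter(), Counter()
    with open(path) as fh:
        for line in fh:
            record = json.loads(line)
            if record.get("action") != "BLOCK":
                continue
            by_rule[record.get("terminatingRuleId", "unknown")] += 1
            by_ip[record["httpRequest"]["clientIp"]] += 1
    return by_rule.most_common(10), by_ip.most_common(10)
```

A single client IP tripping many different rules in a short window is usually an automated scanner rather than a careful human tester.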
To detect true pentest test-cases, it is always most valuable to add instrumentation at the application layer. This way you can create alerts for when a user tries to access pages or objects that belong to other users.
Also consider that if you create these alerts at the application layer, you can include more valuable data points such as the user and IP address. This will provide a valuable distinction between alerts created by a random scanner on the internet and ones created by authenticated users.
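As a sketch of the kind of application-layer check and alert described above (the get_owner and send_alert hooks are hypothetical stand-ins for your own data-access and alerting code):

```python
import logging

logger = logging.getLogger("authz-alerts")

def check_object_access(user_id, client_ip, object_id, get_owner, send_alert):
    """Allow access only to the object's owner; alert on cross-user attempts.

    get_owner and send_alert are placeholders for your own data-access
    and alerting functions.
    """
    owner_id = get_owner(object_id)
    if owner_id == user_id:
        return True
    # An authenticated user probing another user's object is a classic
    # human-driven test case that a WAF will not flag.
    logger.warning(
        "possible IDOR attempt: user=%s ip=%s object=%s owner=%s",
        user_id, client_ip, object_id, owner_id,
    )
    send_alert(user=user_id, ip=client_ip, object_id=object_id)
    return False
```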

Related

API Gateway Authorizer and Logout (Performance/Security Considerations)

I am using Cognito, API Gateway and Authorizers. Authorizers are configured to be cached for 5 minutes for performance. I felt that this was a nice feature.
I understand that authorizers are a good way to keep auth logic in one place and apps can assume users are already authorized. However, I am having my doubts.
The pentest report recommends that once a user has logged out, their tokens should no longer be usable. So does that mean that, for security, I should not enable authorizer caching? It also means all authenticated APIs will have to go through the overhead of a Lambda authorizer ...
Also, from a coding perspective, is it really a good idea to use authorizers, which are hard to test end to end? I could test the Lambda function as a unit, but what's more important to me is that they are attached to the correct APIs. There's currently no way I can see to test this easily.
Another problem is that, looking at the code, I can no longer easily tell what authorization is required ... I have to look up which authorizer is supposed to be attached (e.g. in CloudFormation) and then the Lambda code itself.
Is there a good thing from using Authorizers? Or what's the best practice with this actually?
For security, I should not enable authorizer caching
If you have strict security requirements (e.g., as soon as a token is invalidated all requests should fail) you will need to turn off authorizer caching. See this answer from https://forums.aws.amazon.com/thread.jspa?messageID=703917:
Currently there is only one TTL for the authorizer cache, so in the scenario you presented API Gateway will continue to return the same cached policy (generating a 200) for the cache TTL regardless of whether the token may be expired or not. You can tune your cache TTL down to level you feel comfortable with, but this is set at level of the authorizer, not on a per token basis.
We are already considering updates to the custom authorizer features and will definitely consider your feedback and use case as we iterate on the feature.
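If immediate invalidation is the requirement, caching can be disabled outright by setting the TTL to zero when the authorizer is defined. A rough boto3 sketch (the API ID, names and ARNs are placeholders, not from the setup above):

```python
import boto3

apigw = boto3.client("apigateway")

# Placeholder IDs and ARNs -- substitute your own.
apigw.create_authorizer(
    restApiId="abc123",
    name="token-authorizer",
    type="TOKEN",
    authorizerUri=(
        "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/"
        "arn:aws:lambda:us-east-1:111122223333:function:my-authorizer/invocations"
    ),
    identitySource="method.request.header.Authorization",
    authorizerResultTtlInSeconds=0,  # 0 disables authorizer result caching
)
```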
It also means all authenticated APIs will have to go through the overhead of a Lambda authorizer...
That it does. However, in practice, my team fought far harder with Lambda cold starts and ENI attachment than anything else performance-wise, so the overhead that our authorizer added ended up being negligible. This does not mean that the performance hit was not measurable, but it ended up increasing latency on the order of milliseconds over placing authorizer code directly in the Lambda, a tradeoff which made sense in our application. In stark contrast, Lambda cold starts could often take up to 30s.
Also from a coding perspective, is it really a good idea to use authorizers which are hard to test end to end?
In many applications built on a service-oriented architecture you are going to have "end-to-end" scenarios which cross multiple codebases and can only be tested in a deployed environment. Tests at this level are obviously expensive, so you should avoid testing functionality which could be covered by a lower-level (unit, integration, etc.) test. However, it is still extremely important to test the cohesiveness of your application, and the fact that you will need such tests is not necessarily a huge detractor for SOA.
I could test the Lambda function as a unit. But what's more important to me is they are attached to the correct APIs. There's currently no way I see that allows me to test this easily.
If you are considering multiple authorizers, one way to test that the right authorizers are wired up is to have each authorizer pass a fingerprint down to the endpoint. You can then ping your endpoints and have them return a health check status.
For example,
[ HTTP Request ] -> [ API Gateway ] -> [ Authorizer 1 ] -> [ Lambda 1 ] -> [ HTTP Response ]
Authorizer 1 passes a payload downstream: user identity, authorizer: 1
The HTTP response payload contains: status: ok, authorizer: 1
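A rough sketch of how that fingerprint could be threaded through, assuming a Lambda token authorizer and a Lambda proxy integration for the endpoint (the fingerprint value and handler names are illustrative):

```python
import json

# Authorizer Lambda: stamp a fingerprint into the authorizer context.
def authorizer_handler(event, context):
    principal_id = "user|anonymous"  # replace with real token validation
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow",
                "Resource": event["methodArn"],
            }],
        },
        "context": {"authorizer": "1"},  # the fingerprint
    }

# Endpoint Lambda (proxy integration): echo the fingerprint in a health check.
def health_handler(event, context):
    fingerprint = event["requestContext"]["authorizer"].get("authorizer")
    return {
        "statusCode": 200,
        "body": json.dumps({"status": "ok", "authorizer": fingerprint}),
    }
```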
In practice, my team had one authorizer per service, which made testing this configuration non-critical (we only had to ensure that endpoints which should have been secured were).
Another problem is that, looking at the code, I can no longer easily tell what authorization is required... I have to look up which authorizer is supposed to be attached (e.g. in CloudFormation) and then the Lambda code itself.
Yes, this is true, and the nature of a hugely decoupled environment which was hard to test locally was one of the biggest gripes my team had when working with AWS infrastructure. However, I believe that it's mainly a learning curve when dealing with the AWS space. The development community as a whole is still relatively new to a lot of concepts that AWS exposes, such as infrastructure as code or microservices, so our tooling and education is lacking in this space when compared with traditional monolithic development.
Is this the right solution for your application? I couldn't tell you that without an in-depth analysis. There are plenty of opinions in the broader community which go both ways, but I would like to point you to this article, especially for the fallacies listed: Microservices – Please, don’t. Use microservices because you have developed a solid use case for them, not necessarily just because they're the newest buzzword in computer science.
Is there a good thing from using Authorizers? Or what's the best practice with this actually?
My team used authorizers for AuthN (via a custom auth service), and handled AuthZ at the individual Lambda layer (via a different custom auth service). This was hugely beneficial to our architecture as it allowed us to isolate what were often extremely complex object-specific security rules from the simple question of identity. Your use case may be different, and I certainly won't claim to know best practices. However, I will point you to the API Gateway Authorizer examples for more ideas on how you can integrate this service into your application.
Best of luck.

Microservices service registry registration and discovery

A little domain presentation
I currently have two microservices:
User - managing CRUD on users
Billings - managing CRUD on billings, with a "reference" to the user concerned by the billing
Explanation
I need, when a billing is requested in an HTTP request, to send the full billing object with the user loaded. In this specific case, I really need this.
At first I looked around, and it seemed like a good idea to use message queuing for asynchronicity, so that the billing service can send on a queue:
"Who is the user with the ID 123456? I need to load it."
So my two services could exchange information without really knowing each other, or each other's "location".
Problems
My first question is: what is the aim of using a service registry in that case? The message queue is able to give us the information without knowing anything at all about the user service's location, no?
When do we need to use service registration?
In the case of the Aggregator pattern with a RESTful API, we can navigate through HATEOAS links. In the case of the Proxy pattern maybe, when the microservices are fronted by another service?
Assuming now that we use the Proxy pattern, with a "front" service: in this case, it's okay for me to use service registration. But does it mean that the front-end service has to know the names of the user service and the billing service in the service registry? Example:
Service User registers as "UserServiceOfHell:http://80.80.80.80/v1/" on ZooKeeper
Service Billing registers as "BillingService:http://90.90.90.90/v4.3/"
The front-end service needs to send some requests to the user and billing services, which implies that it needs to know that the user service is "UserServiceOfHell". Is this defined at the beginning of the project?
Last question: can we use multiple microservices patterns in one microservices architecture, or is this a bad practice?
NB: Everything I ask is based on http://blog.arungupta.me/microservice-design-patterns/
A lot of good questions!
First of all, I want to answer your last question - multiple patterns are OK when you know what you're doing. It's fine to mix asynchronous queues, HTTP calls and even binary RPC - it depends on your consistency, availability and performance requirements. Sometimes simple pub/sub is a good fit and sometimes you need a distributed lock - microservices are different.
Your example is simple: two microservices need to exchange some information. You chose asynchronous queue - fine, in this case they don't really need to know about each other. Queues don't expect any discovery between consumers.
But we need service discovery in other cases! For example, for backing services: databases, caches and actually queues as well. Without service discovery you have probably hardcoded the URL to your queue, but if it goes down you have nothing. You need high availability - a cluster of nodes replicating your queue, for example. When you add a new node or an existing node crashes, you should not have to change anything; the service discovery tool should notice that and update the registry.
Consul is a perfect modern service discovery tool: you can just use a custom DNS name for accessing your backing services, and Consul will perform constant health checks and keep your cluster healthy.
The same rule can be applied to microservices - when you have a cluster running service A and you need to access it from service B without any queues (for example, via an HTTP call), you have to use service discovery to be sure that the endpoint you use will bring you to a healthy node. So it's a perfect fit for the Aggregator or Proxy patterns from the article you mentioned.
Probably most of the confusion is caused by the fact that you see "hardcoded" URLs in ZooKeeper and think that you need to manage them manually. Modern tools like Consul or etcd allow you to avoid that headache and just rely on them. It's actually also achievable with ZooKeeper, but it will require more time and resources to get a similar setup.
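As a rough illustration of that registration-and-lookup flow with Consul (assuming the python-consul client library; the service names, addresses and health-check URL are placeholders):

```python
import consul

c = consul.Consul(host="127.0.0.1", port=8500)

# Register the user service with an HTTP health check.
c.agent.service.register(
    name="user-service",
    service_id="user-service-1",
    address="10.0.0.11",
    port=8080,
    check=consul.Check.http("http://10.0.0.11:8080/health", interval="10s"),
)

# From the billing service: look up healthy instances of the user service.
_, nodes = c.health.service("user-service", passing=True)
endpoints = [(n["Service"]["Address"], n["Service"]["Port"]) for n in nodes]
```

Alternatively, callers can skip the HTTP API entirely and resolve user-service.service.consul through Consul's DNS interface.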
PS: please remember the most important rule of microservices - http://martinfowler.com/bliki/MonolithFirst.html

Building Erlang applications for the cloud

I'm working on a socket server that'll be deployed to AWS. So far we have the basic OTP application set up following a structure similar to the sample project in Erlang in Practice, but we wanted to avoid having a global message router because that's not going to scale well.
Having looked through the OTP design guide on Distributed Applications and the corresponding chapters (Distribunomicon and Distributed OTP) in Learn You Some Erlang it seems the built-in distributed application mechanism is geared towards on-premise solutions where you have known hostnames and IPs and the cluster configuration is determined ahead of time, whereas in our intended setup the application will need to scale dynamically up and down and the IP addresses of the nodes will be random.
Sorry that's a bit of a long-winded build up, my question is whether there are design guidelines for distributed Erlang applications that are deployed to the cloud and need to deal with all the dynamic scaling?
Thanks,
There are a few possible approaches:
In Erlang and OTP in Action, one method presented is to use one or two central nodes with known domains or IPs, and have all the other nodes connect to them to discover each other
Applications like https://github.com/heroku/redgrid/tree/logplex instead require a central Redis node where all Erlang nodes register themselves, and do membership management from there
Third party services like Zookeeper and whatnot to do something similar
Whatever else people may recommend
Note that in any case you're going to need to protect your communication, either by switching the distribution protocol to use SSL, or by using AWS security groups and whatnot to restrict who can access your network.
I'm just learning Erlang so I can't offer any practical advice of my own, but it sounds like your situation might require a "Resource Discovery" type of approach, as I've read about in Erlang & OTP in Action.
Erlware also has an application to help with this: https://github.com/erlware/resource_discovery
Other stupid answers in addition to Fred's smart answers include:
Using Route53 and targeting a name instead of an IP
Keeping an IP address in AWS KMS or AWS Secrets Manager, and connecting to that (nice thing about this is it's updatable without a rebuild)
Environment variables: scourge or necessary evil?
Stuffing it in a text file in an obscured, password-protected S3 bucket
VPNs
Hardcoding and updating the build in CI/CD
I mostly do #2
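For #2, a minimal sketch (shown in Python with boto3 for brevity; the secret name is a placeholder) of reading the current address out of Secrets Manager so it can be rotated without a rebuild - an Erlang node would do the equivalent through an AWS client library or a small bootstrap script:

```python
import boto3

def current_cluster_endpoint(secret_name="erlang/cluster-endpoint"):
    """Fetch the node address from Secrets Manager; the secret name is a placeholder."""
    client = boto3.client("secretsmanager")
    secret = client.get_secret_value(SecretId=secret_name)
    return secret["SecretString"]  # e.g. an IP or hostname to connect to
```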

How can I uniquely identify a desktop application making a request to my API?

I'm fleshing out an idea for a web service that will only allow requests from desktop applications (and desktop applications only) that have been registered with it. I can't really use a "secret key" for authentication because it would be really easy to discover and the applications that use the API would be deployed to many different machines that aren't controlled by the account holder.
How can I uniquely identify an application in a cross-platform way that doesn't make it incredibly easy for anyone to impersonate it?
You can't. As long as you put information in an uncontrolled place, you have to assume that information will be disseminated. Encryption doesn't really apply, because the only encryption-based approaches involve keeping a key on the client side.
The only real solution is to put the value of the service in the service itself, and make the desktop client be a low-value way to access that service. MMORPGs do this: you can download the games for free, but you need to sign up to play. The value is in the service, and the ability to connect to the service is controlled by the service (it authenticates players when they first connect).
Or you just make it too much of a pain to break the security: for example, by putting a credential check at the start and end of every single method, and then, because eventually someone will create a binary that patches out all of those checks, by loading pieces of the application from the server, with credential and timestamp checks in place and a different memory layout for each download.
Your comment proposes a much simpler scenario. Companies have a much stronger incentive to protect access to the service, and there will be legal agreements in effect regarding their liability if they fail to protect access.
The simplest approach is what Amazon does: provide a secret key, and require all clients to encrypt with that secret key. Yes, rogue employees within those companies can walk away with the secret. So you give the company the option (or maybe require them) to change the key on a regular basis. Perhaps daily.
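One common way to implement that shared-secret scheme is HMAC request signing; a minimal sketch (the header names and key-distribution mechanism are assumptions, not anything prescribed here):

```python
import hashlib
import hmac
import time

def sign_request(secret_key: bytes, method: str, path: str, body: bytes) -> dict:
    """Client side: produce headers carrying a timestamp and an HMAC signature."""
    timestamp = str(int(time.time()))
    message = b"\n".join([method.encode(), path.encode(), timestamp.encode(), body])
    signature = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return {"X-Timestamp": timestamp, "X-Signature": signature}

def verify_request(secret_key: bytes, method: str, path: str, body: bytes,
                   headers: dict, max_skew: int = 300) -> bool:
    """Server side: reject stale timestamps and invalid signatures."""
    if abs(int(time.time()) - int(headers["X-Timestamp"])) > max_skew:
        return False
    message = b"\n".join([method.encode(), path.encode(),
                          headers["X-Timestamp"].encode(), body])
    expected = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, headers["X-Signature"])
```

Rotating the key then only means updating the stored secret on both sides, not changing the protocol.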
You can enhance that with an IP check on all accesses: each customer will provide you with a set of valid IP addresses. If someone walks out with the desktop software, they still can't use it.
Or, you can require that your service be proxied by the company. This is particularly useful if the service is only accessed from inside the corporate firewall.
Encrypt it (the secret key), hard-code it, and then obfuscate the program. Use HTTPS for the web-service, so that it is not caught by network sniffers.
Generate the key using hardware-specific IDs - processor ID, MAC address, etc. Think of a deterministic GUID.
You can then encrypt it and send it over the wire.
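A rough sketch of deriving such a deterministic identifier (shown in Python; the exact inputs vary by platform, and none of these values are secret, so treat this as identification rather than authentication):

```python
import hashlib
import platform
import uuid

def machine_fingerprint() -> str:
    """Derive a deterministic ID from the MAC address and platform details.

    Note: uuid.getnode() may fall back to a random value on some systems,
    and all of these inputs can be spoofed by a determined attacker.
    """
    parts = [
        format(uuid.getnode(), "012x"),  # primary MAC address (usually)
        platform.system(),
        platform.machine(),
        platform.node(),
    ]
    return hashlib.sha256("|".join(parts).encode()).hexdigest()
```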

Identifying ASP.NET web service references

At my day job we have load balanced web servers which talk to load balanced app servers via web services (and lately WCF). At any given time, we have 4-6 different teams that have the ability to add new web sites or services or consume existing services. We probably have about 20-30 different web applications and corresponding services.
Unfortunately, given that we have no centralized control over this due to competing priorities, org structures, project timelines, financial buckets, etc., it is quite a mess. We have a variety of services that are reused, but a bunch that are specific to a front-end.
Ideally we would have better control over this situation, and we are trying to get control over it, but that is taking a while. One thing we would like to do is find out more about what all of the inter-relationships between web sites and the app servers.
I have used Reflector to find dependencies among assemblies, but would like to be able to see the traffic patterns between services.
What are the options for trying to map out web service relationships? For the most part, we are mainly talking about internal services (web to app, app to app, batch to app, etc.). Off the top of my head, I can think of two ways to approach it:
Analyze assemblies for any web references. The drawback here is that not everything is a web reference and I'm not sure how WCF connections are listed. However, this would at least be a start for finding 80% of the connections. Does anyone know of any tools that can do that analysis? Like I said, I've used Reflector for assembly references but can't find anything for web references.
Possibly tap into IIS and passively monitor the traffic coming in and out and somehow figure out what is being called and from where. We are looking at enterprise tools that could help, but it would be a while before they are implemented (and they cost a lot). But is there anything out there that could help out quickly and cheaply? One tool in particular (AmberPoint) can tap into IIS on the servers, monitor inbound and outbound traffic, add a little special sauce and begin to build a map of the traffic. Very nice, but costs a bundle.
I know, I know, how the heck did you get into this mess in the first place? Beats me, just trying to help us get control of it and get out of it.
Thanks,
Matt
The easiest way is to look through the logs, but if those don't include the referrer then you may also want to monitor what is going out from your web server to the app server. You can use tools like Wireshark or Microsoft Network Monitor to see this traffic.
The other "solution", and I use this term loosely, is to bind a specific web server to an app server and then run through a bundle of scenarios to see what it hits on the app server. You could probably do this in a test environment to lessen the effects on the users of the site.
You need a service registry (UDDI??)... If you had a means to catalog these services and their consumers, it would make this job of dependency discovery a lot easier. That is not an easy solution, though. It takes time and documentation to get a catalog in place.
I think the quickest solution would be to query your IIS logs and find source URLs which originate from your own servers. You would at least be able to track down which servers your consumers are coming from.
Also, if you already have some kind of authentication mechanism in place, you could trace who is using a particular service based on login.
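A quick-and-dirty sketch of that log-mining idea, assuming W3C-format IIS logs (with the default c-ip and cs-uri-stem fields) and a hypothetical list of your own server addresses:

```python
from collections import Counter

INTERNAL_SERVERS = {"10.0.1.10", "10.0.1.11"}  # hypothetical web/app server IPs

def internal_callers(log_path):
    """Count requests per (client IP, URL) where the caller is one of our own servers."""
    hits = Counter()
    fields = []
    with open(log_path) as fh:
        for line in fh:
            if line.startswith("#Fields:"):
                fields = line.split()[1:]  # field names declared in the log header
                continue
            if line.startswith("#") or not line.strip():
                continue
            row = dict(zip(fields, line.split()))
            if row.get("c-ip") in INTERNAL_SERVERS:
                hits[(row["c-ip"], row.get("cs-uri-stem"))] += 1
    return hits.most_common(20)
```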
You are right about AmberPoint. There are other tools that catalog service traffic and provide reports showing what is happening to your services. Systinet, SOA Software and Actional also have products similar to AmberPoint, but AmberPoint has a freeware version, I believe.