How to decouple uniqueness validation from the persistence layer?

How to decouple uniqueness validation from the persistence layer? - clojure

Let's say I have a super simple user registration check that a user's email must be unique across all users.
I've expressed this requirement in such functions.
(defn validate-user [user]
(and (:email user) (is-unique? (:email user))))
(defn is-unique? [email]
(not (db-api/user-exists {:email email})))
But I want to decouple my validation from the database, I want to make it purely functional. I could probably also inject the database API as a parameter to validate-user, like
(defn validate-user [db-api user]
(and (:email user) (is-unique? db-api (:email user))))
(defn is-unique? [db-api email]
(not ((:user-exists db-api) {:email email})))
but I don't know if this is idiomatic.
Also, it feels like the consumer of validate-user should not care about the database api. It feels like having this dependency undermines the entire concept of separating the business logic layer from the persistence layer. So I'm looking for a mindset that explains how to do this properly, or why it could not be done.

In order to avoid race conditions, the database should handle this constraint. You are probably looking for the equivalent of INSERT IF NOT EXISTSin the SQL world.
In practise, you could have a function create-user and a function update-user. The create-user function could use the IF NOT EXISTS check.
You cannot decouple it from the database : it is the database responsibility to maintain the constraints of the data (relations, in the relational world). Nobody else than the database can do it due to race conditions.
Let me expand on that with an example :
Suppose that two users (userA and userB) wish at the very same time to create an account with a new email not already chosen : "user#example.com". Your system then queries the database :
checking that "user#example.com" doesn't exists on behalf of userA, it returns true
checking that "user#example.com" doesn't exists on behalf of userB, it returns true
You then proceed to :
create an account for userA with a mail "user#example.com"
create an account for userB with a mail "user#example.com"
Depending of the semantics of the database and your requests, you may end up with :
two accounts with the mail "user#example.com" (ex. no constraint on the database, no unique index)
an account for "userB" is created, no account for userA (ex. the request to create the account for userA failed because there was already a mal present)
an account for "userB" is created, then the data for "userA" overrides the data from "userB" (ex. the semantics of an update statement are user in the database - probably a bug at this point)
Because of that, you should try to create an account, and let the database tell you if it failed. You just cannot check that yourself, without recreating the properties of a database yourself (I personally wouldn't dare trying).

Related

How to prioritize queries from different user roles in superset?

I have low-priority users and high-priority users. First of all, I need to process queries from high-priority users. Low-priority users should be limited initially because they slow down the high-priority users.
At the moment I have not found a solution. Is this generally possible or do I need to fork apache-superset and implement such logic myself in the source code? Is this functionality planned in the roadmap?

Superset generally is a thin layer over your existing data stores and doesn't have much of a compute layer.
With this in mind, the right technical decision is probably to configure this at the database / data store layer. Many people integrate LDAP both into Superset and into their data store so 2-way configuration of roles & datas tore priorities / permissions can be configured.
With that being said, Superset is open source! You're definitely welcome to fork the code and implement it yourself. Better yet, you can feel free to raise a discussion by creating a Github issue.

Like Srini said, it depends on your data layer.
One way you could this is by defining a custom SQL_QUERY_MUTATOR in your superset_config.py:
# superset_config.py
def SQL_QUERY_MUTATOR(sql, username, security_manager):
pool = "vip" if username in VIP_LIST else "normal"
return f"-- pool: {pool}\n{sql}"
This will prepend a comment to the query specifying a pool that's either "vip" or "normal". You could then send this to a proxy that parses the comment and dispatches the query to the proper header.
Another way of doing this is specifying a DB_CONNECTION_MUTATOR that sets connection parameters depending on the user:
# superset_config.py
def DB_CONNECTION_MUTATOR(uri, params, username, security_manager, source):
pool = "vip" if username in VIP_LIST else "normal"
params["configuration"] = {"job.queue.name": pool}}
return uri, params

Should I store failed login attempts in AWS Cognito or Dynamo DB?

I have a requirement to build a basic "3 failed login attempts and your account gets locked" functionality. The project uses AWS Cognito for Authentication, and the Cognito PreAuth and PostAuth triggers to run a Lambda function look like they will help here.
So the basic flow is to increment a counter in the PreAuth lambda, check it and block login there, or reset the counter in the PostAuth lambda (so successful logins dont end up locking the user out). Essentially it boils down to:
PreAuth Lambda
if failed-login-count > LIMIT:
block login
else:
increment failed-login-count
PostAuth Lambda
reset failed-login-count to zero
Now at the moment I am using a dedicated DynamoDB table to store the failed-login-count for a given user. This seems to work fine for now.
Then I figured it'd be neater to use a custom attribute in Cognito (using CognitoIdentityServiceProvider.adminUpdateUserAttributes) so I could throw away the DynamoDB table.
However reading https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-dg.pdf the section titled "Configuring User Pool Attributes" states:
Attributes are pieces of information that help you identify individual users, such as name, email, and phone number. Not all information about your users should be stored in attributes. For example, user data that changes frequently, such as usage statistics or game scores, should be kept in a separate data store, such as Amazon Cognito Sync or Amazon DynamoDB.
Given that the counter will change on every single login attempt, the docs would seem to indicate I shouldn't do this...
But can anyone tell me why? Or if there would be some negative consequence of doing so?
As far as I can see, Cognito billing is purely based on storage (i.e. number of users), and not operations, whereas Dynamo charges for read/write/storage.
Could it simply be AWS not wanting people to abuse Cognito as a storage mechanism? Or am I being daft?

We are dealing with similar problem and main reason why we have decided to store extra attributes in DB is that Cognito has quotas for all the actions and "AdminUpdateUserAttributes" is limited to 25 per second.
More information here:
https://docs.aws.amazon.com/cognito/latest/developerguide/limits.html
So if you have a pool with 100k or more it can create a bottle neck if wanted to update a Cognito user records with every login etc.

Cognito UserAttributes are meant to store information about the users. This information can then be read from the client using the AWS Cognito SDK, or just by decoding the idToken on the client-side. Every custom attribute you add will be visible on the client-side.
Another downside of custom attributes is that:
You only have 25 values to set
They cannot be removed or changed once added to the user pool.
I have personally used custom attributes and the interface to manipulate them is not excellent. But that is just a personal thought.
If you want to store this information, and not depend on DynamoDB, you can use Amazon Cognito Sync. Besides the service, it offers a client with great features that you can incorporate to your app.

AWS DynamoDb appears to be your best option, it is commonly used for such use cases. Some of the benefits of using it:
You can store separate record for each login attempt with as much info as you want such as ip address, location, user-agent etc. You can also add datetime that can be used by pre-auth Lambda to query by time range for example failed attempt within last 30 minutes
You don't need to manage table because you can set TTL for DynamoDb record so that record will be deleted automatically after specified time.
You can also archive items in S3

Django role based permissions

I'm developing a huge application in django and I need a permission system and I assume that the native user/group permission within django is not sufficient. Here my needs:
The application will be available through multiple departments. In each department there will be nearly the same actions. But maybe an user will be allowed to add a new team member in department A and in department B he is only allowed to view the team list and in the other departments he has no access at all.
I though using a RBAC system would be most appropriate. Roles must also be inheritable, stored in a model an managable through an interface. Any good ideas or suggestions? Regards

What you are looking for is called abac aka Attribute-Based Access Control. It's an evolution of RBAC as an access control model. In RBAC, you define access control in terms of roles, groups, and potentially permissions. You then have to write code within your application to make sense of the roles and groups. This is called identity-centric access control.
In ABAC, there are 2 new elements:
attributes which are a generalization of groups and roles. Attributes are a key-value pair that can describe anyone and anything. For instance, department, member, and action are all attributes.
policies tie attributes together to determine whether access should be granted or denied. Policies are a human-friendly way of expressing authorization. Rather than write custom code in your app, you write a policy that can be centrally managed and reused across apps, databases and APIs.
There are a couple of ABAC languages such as xacml and alfa. Using ALFA, I could write the following policy:
A user will be allowed to add a new team member in department A
In department B he is only allowed to view the team list
In the other departments he has no access at all.
Roles must also be inheritable, stored in a model an managable through an interface.
policyset appAccess{
apply firstApplicable
policy members{
target clause object = "member"
apply firstApplicable
/**
* A user can add a member to a department if they are a manager and if they are assigned to that department.
*/
rule addMember{
target clause role == "manager" and action == "add"
permit
condition user.department == target.department
}
}
}
One of the key benefits of ABAC is that you can develop as many policies as you like, audit them, share them, and not have to touch your application code at all because you end up externalizing authorization.
There are several engines / projects that implement ABAC such as:
AuthZForce (a Java library for XACML authorization)
Axiomatics Policy Server (commercial product - disclaimer: I work there)
AT&T XACML

There are two components to this question:
First, role management. Roles can be achieved through group membership, i.e. departmentA_addMember & departmentB_listMembers. These Groups would have corresponding permissions attached, e.g. "Member | Add" and "Member | View". A department in this context may have more resources included, that require separate permissions. Django allows to extend Objects with custom Permissions.
Second, inheritance. Do I understand you want to have individual Groups being member of other groups? Then this is something Django would require you to implement yourself.
However, should you be looking for a really more complex authentication solution, it may be worthwhile to integrate with 3rd party services through, e.g. django-allauth. There are sure more/other solutions, just to throw in one name.

How to realize modifying call in REST API

Assume you have some resource behind a REST API. This resource could well be modified using the usual HTTP verbs PUT or PATCH. But let's assume the server behind the API has to check some prerequisites to decide if the modification on the resource can be made or not (e.g. withdraw an amount from a bank account).
In this case there is no use in using POST (because we do not want to add a new resource), nor PUT or PATCH, because only the server knows about the new value of the resources' modified attribute, if he will allow the requested modification at all. In the above example the account's new balance would have to be computed on the server side like so : balance = balance - amount, and to my knowledge all the client can do with PUT or PATCH is to send the already modified resource (the account) or atttribute of that resource (the accounts' balance).
Am I then right in assuming that in this case the API designer has to provide a parameter (e.g. .../account?withdraw=amount) with the URL pointing to the resource ? What would be the correct HTTP verb for this operation ?

there is no use in using POST (because we do not want to add a new resource)
You do. A monetary exchange can be expressed in a transaction, hence: you're creating a new transaction.
So simply perform a POST with the transaction details to a /transaction endpoint.
You certainly don't want to allow users to PUT their new account balance, as that would require atomicity over HTTP, which is all REST stands against: the client would have to know the pre-transaction balance, and make sure in some way no transaction will be carried out before theirs arrives.

How to deal with deep level granularization with XACML in enterprise application

I am using IS WSO2 for authorization with XACML. I am am able to achieve authorization for static resource. But I am not sure with the design when it comes to granularization.
Example : if I have method like getCarDetails(Object User) where I should get only those cars which are assigned to this particular user, then how to deal this with XACMl?
Wso2 provides support for PIP where we can use custom classes which can fetch data from database. But I am not sure if we should either make copy of original database at PDP side or give the original database to PIP to get updated with live data.
Because Cars would be dynamic for the application eg. currently 10 cars assigned to user Alice. suddenly supervisor add 20 more car in his list which will be in application level database. Then how these other 20 cars will be automatically assigned in policy at PDP level until it also have this latest information.
I may making some mistake in understanding. But I am not sure how to deal with this as in whole application we can have lots of this kind of complex scenario where some times we will get data for one user from more than 4 or 5 tables then how to handle that scenario?

Your question is a great and the answer will highlight the key benefits of XACML and externalized authorization as a whole.
In XACML, you define generic, global rules, about what is allowed and what isn't using what I would call high-level attributes e.g. attributes of the vehicle (in your case) or the user (role, department, ...)
For instance a simple rule could be (using the ALFA syntax):
policy viewCars{
target clause actionId=="view" and resourceType=="car"
apply firstApplicable
rule allowSameRegion{
permit
condition user.region==car.region
}
}
Both the user's region and the car's region are maintained inside the application's database. The values are read using a PIP or Policy Information Point (details here).
In your example, you talk about direct assignment, i.e. a user has been directly assigned to a vehicle. In that case, the rule would become:
policy viewCars{
target clause actionId=="view" and resourceType=="car"
apply firstApplicable
rule allowAssignedVehicle{
permit
condition user.employeeId==car.assignedUser
}
}
This means that the assigned user information must be kept somewhere, in the application database, a CSV file, a web service, or another source of information. It means that from a management perspective, an administrator would add / remove vehicles from a user's assigned list (or perhaps the other way around: add / remove assigned users from a vehicle's assigned user list).
The XACML rule itself will not change. If the supervisor adds 20 more cars to the employee's list (maintained in the application-level database), then the PDP will be able to use that information via the PIP and access will be granted or denied accordingly.
The key benefit of XACML is that you could add a second rule that would state a supervisor can see the cars he/she is assigned to (the normal rule) as well as the cars assigned to his/her subordinates (a new proxy-delegate rule).
This diagram, taken from the Axiomatics blog, summarizes the XACML flow:
HTH, let me know if you have further questions. You can download ALFA here and you can watch tutorials here.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js