Django authentication with fine-grained access control

I am developing a Django web application with a suite of steel design tools for structural engineers. There will be a database table of inputs for each design tool, and each row of each table will correspond to a particular design condition to be "solved." The users may work solely or in groups. Each user needs to have ongoing access to his own work so that designs can be refined, copied and adapted, and so that reports can be created whenever convenient, usually at the end of a project when hard copy documentation will be needed. The database contents must then be available over any number of sessions occurring over periods measured in months or even years for a given design project.
When there is a group of users, typically all associated with a given design office, it will probably be acceptable for them all to have joint and mutual access to each other's work. The application supports routine engineering production activities, not innovative intellectual property work, and in-house privacy is not the norm in the industry anyway. However, the work absolutely must be shielded from prying eyes outside of the group. Ideally, each group would have one or more superusers authorized to police the membership of the group. Probably the main tool they would need would be the ability to remove a member from the group, discontinuing his access privileges. This would be a user group superuser and would not be the same as a superuser on the site side.
For convenient access, each row of each database table will be associated with a project number/project name pair that will be unique for a given company deploying a user or user group. A different company could easily choose to use a duplicate project number, and even could choose a duplicate project name, so discriminating exactly which database rows belong to a given user (or group) will probably have to be tracked in a separate related "ownership list" table for each user (or group).
It is anticipated (hoped) that, eventually, several hundred users (or user groups) associated with different (and often competing) companies will solve tens of thousands of design conditions for thousands of projects using these tools.
So, here are my questions:
First, is there any point in trying to salvage much of anything from the Django contrib.auth code? As I perceive it, contrib.auth is designed for authentication and access control that is suitable for the blogosphere and web journalism, but that doesn't support fine-grained control of access to "content."
Second, is there any available template, pattern, example, strategy or design advice I could apply to this problem?
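One way to picture the ownership scheme the question describes is the sketch below. It uses Python's standard-library sqlite3 instead of Django models so that it runs on its own. All table names, column names, and helper functions are hypothetical. The assumptions are that a project number only needs to be unique within one group, that every read goes through the membership table, and that a group-level admin flag (separate from any site superuser) controls who may remove members.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from Django or the question.
SCHEMA = """
CREATE TABLE design_group (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE membership (
    group_id INTEGER NOT NULL REFERENCES design_group(id),
    username TEXT NOT NULL,
    is_group_admin INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (group_id, username)
);
CREATE TABLE project (
    id INTEGER PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES design_group(id),
    number TEXT NOT NULL,
    name TEXT NOT NULL,
    UNIQUE (group_id, number)  -- unique per group, not across companies
);
CREATE TABLE design_condition (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(id),
    label TEXT NOT NULL
);
"""


def conditions_for(conn, username):
    """Return labels of the design rows a user can reach through group membership."""
    rows = conn.execute(
        """SELECT dc.label FROM design_condition dc
           JOIN project p ON p.id = dc.project_id
           JOIN membership m ON m.group_id = p.group_id
           WHERE m.username = ? ORDER BY dc.id""",
        (username,),
    )
    return [row[0] for row in rows]


def remove_member(conn, admin, group_id, username):
    """Let a group admin (not a site superuser) drop a member from the group."""
    is_admin = conn.execute(
        "SELECT 1 FROM membership"
        " WHERE group_id = ? AND username = ? AND is_group_admin = 1",
        (group_id, admin),
    ).fetchone()
    if not is_admin:
        raise PermissionError(f"{admin} does not administer group {group_id}")
    conn.execute(
        "DELETE FROM membership WHERE group_id = ? AND username = ?",
        (group_id, username),
    )


def demo():
    """Build an in-memory database with two companies reusing project number 100."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO design_group VALUES (?, ?)",
                     [(1, "Acme"), (2, "Bolt")])
    conn.executemany("INSERT INTO membership VALUES (?, ?, ?)",
                     [(1, "ann", 1), (1, "bob", 0), (2, "cy", 1)])
    conn.executemany("INSERT INTO project VALUES (?, ?, ?, ?)",
                     [(1, 1, "100", "Warehouse"), (2, 2, "100", "Warehouse")])
    conn.executemany("INSERT INTO design_condition VALUES (?, ?, ?)",
                     [(1, 1, "beam B1"), (2, 2, "column C7")])
    return conn
```

Because access is derived from the membership join rather than stored on each row, removing a member takes effect on the next query without touching any design data.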

django-authority: Documentation, code on GitHub

Related

Django project-apps: What's your approach about implementing a real database scheme?

I've read articles and posts about what a project and an app are in Django, and they basically end up using the typical example of Polls and Users. However, a real program generally uses a complex relational database, so its design gravitates around this RDB, and the eternal conflict arises once again: which parts should be considered an application, and which should be considered components of that application?
Let's take as an example this RDB (courtesy of Visual Paradigm):
I could consider the whole set as one application, or consider every entity an application; the outlook is gray. The only thing I'm sure about is this:
$ django-admin startproject movie_rental
So I wish to learn from the expertise of all of you: What approach (not necessarily those mentioned before) would you use to create applications based on this RDB for a Django project?
Thanks in advance.
PS1: MORE DETAILS RELATED ABOUT MY REQUEST
When programming something I follow these steps:
1. Understand the context of what you are going to program,
2. Identify the main actors and objects in this context,
3. If needed, make a UML diagram,
4. Design a solid relational database diagram (solid = constraints, triggers, procedures, etc.),
5. Create the relational database,
6. Start coding... suffer and enjoy
When I learn something new I hope they follow these same steps to understand where they want to go with their actions.
When reading articles and posts (and viewing videos), almost all of them omit the steps 1 to 5 (because they choose simple demo apps), and when programming they take the easy route, and don't show other situations or the many supposed features that Django offers (reusability, pluggability, etc).
In making this request, I wish to know what criteria experienced Django programmers use to determine which applications to create based on this sample RDB diagram.
With the (2) answers obtained so far, "application" for...
brandonris1 is about features/services
Jeff Hui is about implementing entities of a DB
James Bennett is about every action on an object; he likes making a lot of apps
Conclusion so far: Django application is a personal creed.
My initial request was about creating applications, but since models were mentioned, I have another question: with a legacy relational database (as shown in the picture), is it possible to create a Django project with multiple apps? I ask because in every Django demo project I have seen, every app has a model with its own tables, giving the impression that tables do not interact with those of other applications.
I hope my request is more clear. Thanks again for your help.
It seems you are trying to decide between building a single monolithic application vs microservices. Both approaches have their pros and cons.
For example, a single monolithic application is a good solution if you have a small amount of support resources and do not need to be able to develop new features in fast sprints across the different areas of the application (i.e. Film Management Features vs Staff Management Features)
One major downside to large monolithic applications is that eventually their feature sets grow too large and with each new feature, you have a significant amount of regression testing which will need to be done to ensure there aren't any negative repercussions in other areas of the application.
Your other option is to go with a microservice strategy. In this case, you would divide these entities amongst a series of smaller services and provide them each methods to integrate/communicate with each other (APIs).
Example:
- Film Service
- Customer Service
- Staff Service
The benefit of this approach is that it allows you to separate capabilities and features by specific service areas, thus reducing risk and regression testing across the application when new features are deployed or there is a catastrophic issue (e.g. a DB goes down).
The downside to this approach is that under true microservice architecture, all resources are separated therefore you need to have unique resources (ie Databases, servers) for each service thus increasing your operating cost.
Either of these options is a good option but is totally dependent on your support model and expected volumes. Hope this helps.
ADDITIONAL DETAIL:
After reading through your additional details, since this DB already exists and my assumption is that you cannot migrate it, you still have the same choice as to whether or not you follow a monolithic application or a microservices architecture.
For both approaches, you would need to connect your Django web app to the specific DB you are already using. I can't speak for every database backend out there, but I know that with MySQL, Django can read the pre-existing DB and generate the models.py file for the application (the inspectdb management command does this). Each generated model also has a Meta option, managed, which defines whether or not Django is responsible for actually managing the DB tables themselves.
The only thing this changes from an architecture perspective is how many times do you want to code this connection?
If you only want to do it once and completely comply with the DRY method, you can build a monolithic application knowing that as new features become required, application wide regression testing will be an absolute requirement.
If you want ultimate flexibility for future changes with this collection of features and don't mind recoding the migration across multiple apps while reducing the need for application wide regression testing as new features become required, a microservice architecture strategy is more appropriate.

Microservices Architecture: Cross Service data sharing

Consider the following micro services for an online store project:
Users Service keeps account data about the store's users (including first name, last name, email address, etc.)
Purchase Service keeps track of details about users' purchases.
Each service provides a UI for viewing and managing its relevant entities.
The Purchase Service index page lists purchases. Each purchase item should have the following fields:
id, full name of purchasing user, purchased item title and price.
Furthermore, as part of the index page, I'd like to have a search box to let the store manager search purchases by purchasing user name.
It is not clear to me how to get back data which the Purchase Service does not hold - for example: a user's full name.
The problem gets worse when trying to do more complicated things like search purchases by purchasing user name.
I figured that I can obviously solve this by syncing users between the two services by broadcasting some sort of event on user creation (and saving only the relevant user properties on the Purchase Service end). That's far from ideal in my perspective. How do you deal with this when you have millions of users? Would you create millions of records in each service that consumes user data?
Another obvious option is exposing an API at the Users Service end which brings back user details based on given ids. That means that every page load in the Purchase Service, I'll have to make a call to the Users Service in order to get the right user names. Not ideal, but I can live with it.
What about implementing a purchase search based on user name? Well I can always expose another API endpoint at the Users Service end which receives the query term, perform a text search over user names in the Users Service, and then return all user details which match the criteria. At the Purchase Service, map the relevant ids back to the right names and show them in the page. This approach is not ideal either.
Am I missing something? Is there another approach for implementing the above? Maybe the fact that I'm facing this issue is sort of a code smell? would love to hear other solutions.
This seems to be a very common and central question when moving into microservices. I wish there was a good answer for that :-)
About the suggested pattern already mentioned here, I would use the term Data Denormalization rather than Polyglot Persistence, as it doesn't necessarily need to involve different persistence technologies. The point is that each service handles its own data. And yes, you have data duplication and you usually need some kind of event bus to share data across services.
There's another option, which is a sort of a take on the first - making the search itself as a separate service.
So in your example, you have the Users service for managing users. The Purchases service manages purchases. Each handles its own data and only the data it needs (so, for instance, the Purchases service doesn't really need the user name, only the ID). And you have a third service - the Search Service - that consumes data produced by other services, and creates a search "view" from the combined data.
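A minimal sketch of such a Search Service, assuming the other services publish events as plain dictionaries. The event types ("user_saved", "purchase_created") and field names are hypothetical, and the event bus itself is left out: handle() is simply whatever the bus would call for each message.

```python
from dataclasses import dataclass, field


@dataclass
class SearchService:
    """Builds a denormalized search view from events published by other services."""
    user_names: dict = field(default_factory=dict)  # user_id -> full name
    purchases: dict = field(default_factory=dict)   # purchase_id -> purchase row

    def handle(self, event):
        """Apply one event from the (assumed) event bus to the local view."""
        kind = event["type"]
        if kind == "user_saved":
            self.user_names[event["user_id"]] = event["full_name"]
        elif kind == "purchase_created":
            self.purchases[event["purchase_id"]] = {
                "id": event["purchase_id"],
                "user_id": event["user_id"],
                "title": event["title"],
                "price": event["price"],
            }

    def search_by_user_name(self, term):
        """Return purchases whose buyer's full name contains the search term."""
        term = term.lower()
        hits = []
        for row in self.purchases.values():
            name = self.user_names.get(row["user_id"], "")
            if term in name.lower():
                hits.append({**row, "full_name": name})
        return sorted(hits, key=lambda r: r["id"])
```

The Purchases service still stores only user IDs; the name lookup and the text search both live in the combined view, which a real deployment might keep in a search engine rather than in memory.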
It's totally fine to keep appropriate data in different databases; it's called Polyglot Persistence. Yes, you would like to keep user data and data about purchases separately and use a message queue for sync. Millions of users seems fine to me; it's a scalability issue, not a design issue ;-)
In case of search - you probably want to search more than just username, right? So, if you use message queue to update data between services you can also easily route this data to ElasticSearch, for example. And from ElasticSearch perspective it doesn't really matter what field to index - username or product title.
I usually use both approaches. Sometimes I have another service which sits on top of x other services and combines the data. I don't really like this approach because it causes dependencies and coupling between services. So in general, in my last projects we tried to stick to polyglot persistence.
Also consider that if you need x sub-HTTP-requests to combine data in some kind of middleware service, it will lead to higher latency. We always try to cut down the number of requests for one task and handle everything possible through asynchronous queues (especially data sync).
If you conceptualize modules as the owners and controllers of the data they work on, then your model must also communicate that data out of that module to others. In contrast, the modules in a manufacturing process have the access to change data without possessing and controlling it.
Microservices is an architecture for distributed processing, like most code, where modules pass the data around to work on it. From classic articles by Harvard Business Review and McKinsey on the subject of owning members of a supply chain, I identified complexities arising from this model and wrote an article teaching programmers what you need to know: http://www.powersemantics.com/p.html
Manufacturing is an architecture for integrated processing, where modules work on the data without passing it around from point to point. This can be accomplished by having modules configured to access the same memory, files or database tables. My architecture shows how to accomplish this on memory via reference properties.
When you consider "exposing an API at the Users Service end which brings back user details based on given ids", you need to be aware that creates what HBR calls "irreversible" complexity, which I've dubbed centralization complexity. Don't build A->B (distributed) systems, because you can't decentralize them later after failing to separate requirements. Requirements in production processes represent user instructions, and centralized modules only enable you to change the wrong users' processes. In other words, centralized modules don't document user groups or distinguish them from derived-product-users.

multi user desktop application with privilege separation

I am writing a C++ application with a PostgreSQL 9.2 database backend. It is accounting software. It is a multi-user application with privilege separation features.
I need help in implementing the user account system. The privileges for users need not be mutually exclusive. Should I implement it at the application level, or at the database level?
The company is not very large at present. Assume about 15-20 offices with an average of 10 program users per office.
Can I make use of the roles in postgres to implement this? Will it become too tedious, unmanageable or are there some flaws in such an approach?
If I go via the application route, how do I store the set of privileges a user has? Will a binary string suffice? What if there are additional privileges later, how can I incorporate them? What do I need to do to ensure that there are no security issues? And in such an approach I am assuming the application connects with the privileges required for the most privileged user.
Some combination of the two methods? Or something entirely different?
All suggestions and arguments are welcome.
Never provide authorization from a client application, which runs in an uncontrolled environment. Every device that a user has physical access to is an uncontrolled environment. This is security through obscurity: a user can simply use a debugger to get the database access credentials from the client program's memory and then use psql to do anything.
Use roles.
When I was developing a C++/PostgreSQL desktop application, I chose to disallow all users from modifying any tables, and I created an API using PL/pgSQL functions with the VOLATILE SECURITY DEFINER options. But I think it wasn't the best approach, as it's unnatural and error-prone to use, for example:
select add_person(?,?,?,?,?,?,?,?,?,?,?,?);
I think a better way would be to allow modifications to tables which a user needs to modify and, when needed, enforce authorization using BEFORE triggers, which would throw an error when current_user does not belong to a proper role.
But remember to use set search_path=... option in all functions that have anything to do with security.
If you want to authorize read-only access to some tables then it gets even more complicated. Either you'd need to disable select privilege for these tables and create API using security definer functions for accessing all data. This would be a monster size API, extremely ugly and extremely fragile. Or you'd need to disable select privilege for these tables and create views for them using create view with (security_barrier). Also not pretty.

Oracle Apex - Administration and Users

I am creating an Oracle Apex application.
My question:
The application has two sides; Administration and Users. The purpose of administration is to assign users to groups.
Users should only be able to access information for the groups they have been assigned to.
For example; Mr A. is assigned to group A.
Mr B is assigned to group B.
Mr C is assigned to group A, B and C.
Users should only be allowed access to areas they are assigned to and not administrative controls. Only viewing privileges.
Is this possible?
I'm finding Oracle Apex difficult to get my head around!
You have a number of requirements, each of which can be built in Apex, but using different features:
1. Administrators may assign users to groups.
2. Only Administrators may access administrative controls.
3. Users may view privileges.
4. Users may only see information for their group.
I would use Apex Authorisation Schemes to take care of #1, #2 and #3. For example, create an authorization scheme called "Administrator" which checks whether the current user is an administrator, then apply this scheme to any pages, regions, buttons, items, etc. that should only be accessible to administrators.
For #4, there are a few solutions I can think of:
1. Predicates - make sure each query in the application checks whether the data is viewable by the current user according to their group.
2. Views - encapsulate the security predicates in a view on each table, so that you don't have to repeat this code throughout the application.
3. Oracle VPD or Row Level Security (requires Enterprise licence) - this hides the security predicates behind the SQL level.
I've used all 3 options above for different projects; the 3rd one was for a fairly large application with quite complex authority-checking rules. RLS made this project much simpler and easier to build and verify; however RLS may be overkill in some cases.
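To illustrate the idea behind the second option (keeping the security predicate in exactly one place), here is a minimal sketch in Python with the standard-library sqlite3 module. It is not Apex or Oracle code. The table names user_group and info are hypothetical, and the data mirrors the Mr A / Mr B / Mr C example from the question. In Oracle the same predicate would live in a view or a VPD policy rather than in a Python constant.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database mirroring the A/B/C example (hypothetical tables)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user_group (username TEXT, grp TEXT);
        CREATE TABLE info (id INTEGER PRIMARY KEY, grp TEXT, body TEXT);
        INSERT INTO user_group VALUES
            ('A', 'A'), ('B', 'B'), ('C', 'A'), ('C', 'B'), ('C', 'C');
        INSERT INTO info (grp, body) VALUES
            ('A', 'a-report'), ('B', 'b-report'), ('C', 'c-report');
    """)
    return conn


# The single place where the security predicate is written down.
VISIBLE_INFO = """
    SELECT i.body FROM info i
    WHERE i.grp IN (SELECT grp FROM user_group WHERE username = ?)
    ORDER BY i.id
"""


def visible_info(conn, username):
    """Return only the rows belonging to groups the user is assigned to."""
    return [row[0] for row in conn.execute(VISIBLE_INFO, (username,))]
```

Every page or report that reads info would call visible_info() rather than querying the table directly, so a change to the group rules is made once.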

How to enforce authorization policies across multiple applications?

Background
I have a backoffice that manages information from various sources. Part of the information is in a database that the backoffice can access directly, and part of it is managed by accessing web services. Such services usually provide CRUD operations plus paged searches.
There is an access control system that determines what actions a user is allowed to perform. The decision of whether the user can perform some action is defined by authorization rules that depend on the underlying data model. E.g. there is a rule that allows a user to edit a resource if she is the owner of that resource, where the owner is a column in the resources table. There are other rules such as "a user can edit a resource if that resource belongs to an organization and the user is a member of that organization".
This approach works well when the domain model is directly available to the access control system. Its main advantage is that it avoids replicating information that is already present in the domain model.
When the data to be manipulated comes from a Web service, this approach starts causing problems. I can see various approaches that I will discuss below.
Implementing the access control in the service
This approach seems natural, because otherwise someone could bypass access control by calling the service directly. The problem is that the backoffice has no way to know what actions are available to the user on a particular entity. Because of that, it is not possible to disable options that are unavailable to the user, such as an "edit" button.
One could add additional operations to the service to retrieve the authorized actions on a particular entity, but it seems that we would be handing multiple responsibilities to the service.
Implementing the access control in the backoffice
Assuming that the service trusts the backoffice application, one could decide to implement the access control in the backoffice. This seems to solve the issue of knowing which actions are available to the user. The main issue with this approach is that it is no longer possible to perform paged searches because the service will now return every entity that matches, instead of entities that match and that the user is also authorized to see.
Implementing a centralized access control service
If access control was centralized in a single service, everybody would be able to use it to consult access rights on specific entities. However, we would lose the ability to use the domain model to implement the access control rules. There is also a performance issue with this approach, because in order to return lists of search results that contain only the authorized results, there is no way to filter the database query with the access control rules. One has to perform the filtering in memory after retrieving all of the search results.
Conclusion
I am now stuck because none of the above solutions is satisfactory. What other approaches can be used to solve this problem? Are there ways to work around the limitations of the approaches I proposed?
One could add additional operations to the service to retrieve the
authorized actions on a particular entity, but it seems that we would
be handing multiple responsibilities to the service.
Not really. Return a flags field/property from the web service for each record/object that can then be used to pretty up the UI based on what the user can do. The flags are based off the same information that is used for access control that the service is accessing anyway. This also makes the service able to support a browser based AJAX access method and skip the backoffice part in the future for added flexibility.
Distinguish between the components of your access control system and implement each where it makes sense.
Access to specific search results in a list should be implemented by the service that reads the results, and the user interface never needs to know about the results the user doesn't have access to. If the user may or may not edit or interact in other ways with data the user is allowed to see, the service should return that data with flags indicating what the user may do, and the user interface should reflect those flags. The service implementing those interactions should not trust the user interface; it should validate that the user has access whenever it is called. You may have to implement the access control logic in multiple database queries.
Access to general functionality that the user may or may not have, independent of data, should again be controlled by the service implementing that functionality. That service should compute access through a module that is also exposed as a service so that the UI can respect the access rules and not try to call services the user does not have access to.
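The division of labour described above could be sketched as follows. This is an in-memory stand-in for the web service, not a real API: ResourceService, its methods, and the public attribute are hypothetical. The edit rule follows the question (owner, or member of the owning organization); the view rule (public or editable) is an assumption added so that the flags have something to distinguish.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    id: int
    owner: str
    org: Optional[str] = None
    public: bool = False  # hypothetical attribute, not part of the original rules


class ResourceService:
    """In-memory stand-in for the web service that owns the data."""

    def __init__(self, resources, org_members):
        self._resources = {r.id: r for r in resources}
        self._org_members = org_members  # org name -> set of usernames
        self.notes = {}                  # resource id -> last saved note

    def _can_edit(self, user, res):
        in_org = res.org is not None and user in self._org_members.get(res.org, set())
        return res.owner == user or in_org

    def _can_view(self, user, res):
        return res.public or self._can_edit(user, res)

    def search(self, user, page=1, page_size=10):
        """Filter before paging, and attach a flag the UI can use for its buttons."""
        visible = [r for r in sorted(self._resources.values(), key=lambda r: r.id)
                   if self._can_view(user, r)]
        start = (page - 1) * page_size
        return [{"id": r.id, "owner": r.owner, "can_edit": self._can_edit(user, r)}
                for r in visible[start:start + page_size]]

    def update_note(self, user, resource_id, note):
        """Re-check access on every write; the flag sent to the UI is only a hint."""
        res = self._resources[resource_id]
        if not self._can_edit(user, res):
            raise PermissionError(f"{user} may not edit resource {resource_id}")
        self.notes[resource_id] = note
```

Because filtering happens before the page is cut, paged searches keep working, and the UI can disable its "edit" button from the can_edit flag without knowing the rules behind it.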
I understand my response is very late - 3 years late. It's worth shedding some new light on an age-old problem. Back in 2011, access control was not as mature as it is today. In particular, there is a new model, ABAC, along with a standard implementation, XACML, which make centralized authorization possible.
In the OP's question, the OP writes the following re centralized access control:
Implementing a centralized access control service
If access control was centralized in a single service, everybody would be able to use it to consult access rights on specific entities. However, we would lose the ability to use the domain model to implement the access control rules. There is also a performance issue with this approach, because in order to return lists of search results that contain only the authorized results, there is no way to filter the database query with the access control rules. One has to perform the filtering in memory after retrieving all of the search results.
The drawbacks that the OP mentions may have been true in a home-grown access control system, in RBAC, or in ACL. But they are no longer true in ABAC and XACML. Let's take them one by one.
The ability to use the domain model to implement the access control rules
With attribute-based access control (ABAC) and the eXtensible Access Control Markup Language (XACML), it is possible to use the domain model and its properties (or attributes) to write access control policies. For instance, if the use case is that of a doctor wishing to view medical records, the domain model would define the Doctor entity with its properties (location, unit, and so on) as well as the Medical Record entity. A rule in XACML could look as follows:
A user with the role==doctor can do the action==view on an object of type==medical record if and only if the doctor.location==medicalRecord.location.
A user with the role==doctor can do the action==edit on an object of type==medical record if and only if the doctor.id==medicalRecord.assignedDoctor.id
One of the key benefits of XACML is precisely to mirror closely the business logic and the domain model of your applications.
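The logic of the two example rules can be mirrored in a few lines of plain Python. This is only an illustration of attribute-based evaluation, not XACML syntax and not the API of any policy engine. The function name doctor_may and the dictionary keys (role, type, location, id, assigned_doctor_id) are hypothetical.

```python
def doctor_may(action, doctor, record):
    """Evaluate the two example rules against plain attribute dictionaries."""
    if doctor.get("role") != "doctor" or record.get("type") != "medical record":
        return False
    if action == "view":
        # Rule 1: a doctor may view records held at the doctor's own location.
        return doctor["location"] == record["location"]
    if action == "edit":
        # Rule 2: a doctor may edit only records assigned to that doctor.
        return doctor["id"] == record["assigned_doctor_id"]
    return False  # anything not explicitly permitted is denied
```

The decision depends only on attributes of the subject, the action, and the resource, which is why the rules read so close to the domain model.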
Performance issue - the ability to filter items from a db
In the past, it was indeed impossible to create filter expressions. This meant that, as the OP points out, one would have to retrieve all the data first and then filter it. That would be an expensive task. Now, with XACML, it is possible to achieve reverse querying. A reverse query asks a question of the type "Which medical records can Alice view?" instead of the traditional binary question "Can Alice view medical record #123?".
The response of a reverse query is a filter condition which can be converted into a SQL statement, for instance in this scenario SELECT id FROM medicalRecords WHERE location='Chicago' assuming of course that the doctor is based in Chicago.
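A minimal sketch of the idea, assuming the view rule from the doctor example: instead of answering yes or no for one record, the function returns a WHERE fragment plus bound parameters that the database can apply. The function names are hypothetical and the fragment is produced by hand here. A real policy engine would derive it from the policy, and nothing below reflects an actual XACML API.

```python
import sqlite3


def view_filter_for(doctor):
    """Return (where_clause, params) describing which records this user may view."""
    if doctor.get("role") != "doctor":
        return "0 = 1", []  # a condition that matches no rows
    return "location = ?", [doctor["location"]]


def viewable_record_ids(conn, doctor):
    """Let the database do the filtering, so paging can happen in SQL as well."""
    where, params = view_filter_for(doctor)
    # The clause comes from trusted code; user-supplied values are bound as parameters.
    sql = f"SELECT id FROM medicalRecords WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]
```

Since the filter is part of the query, the service never loads records the user cannot see, which addresses the in-memory filtering concern raised in the question.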
What does the architecture look like?
One of the key benefits of a centralized access control service (also known as externalized authorization) is that you can apply the same consistent authorization logic to your presentation tier, business tier, APIs, web services, and even databases.