Django 2.x: Is using the Primary Key of a Model in the URL pattern a security concern?

The id (PK) of a model/ DB can be passed to and used in the URL pattern. Everyone, including hackers, would be able to piece together some information about my DB from this and the actual data in the template.
My questions are kind of general at this point. I would just like to understand how the info above could be used to compromise the data. Or if someone could point me to some further reading about this topic I would appreciate it.
This is a general question as I am trying to gain more understanding into securing Django sites. I have read several articles but nothing's satisfied the question.
Code:
Where the href passes the blog's id, which is used for URL matching and ultimately for pulling data from the DB in the view/template:
<a href= "{% url 'details' blog.id %}">
and
from django.urls import path
from . import views

urlpatterns = [
    path('<int:blog_id>/', views.details, name='details'),
]
And the URL being:
domain/appname/blog_id/
TL;DR: Can you hack my site with the few pieces of information I am freely giving away concerning the backend?

First it depends on how your ids are generated. The default in Django is to use sequential numbers, which gives away the following (non-exhaustive) information:
Someone can easily try other ids to see what they get. If you haven't properly protected access to ids you don't want to show, someone might be able to see content they shouldn't see. Many information leaks were just due to this: Guess the URL et voilà! Something that was supposed to be published tomorrow is suddenly leaked today. The same applies for dates in the URL. Of course, if you have proper checks for who's allowed to view "draft" posts, there's no harm.
By trying all ids, you can find out numbers: maybe you don't want others to know how many products you have in your database because it's sensitive information. If I can just do /products/4924 to fetch info about product #4924, I can easily create a script to quickly increase the number until I get 404 Not Found, by which time I know there are 10252 products in your database.
If you have a form to make changes to an order and use the id in the URL to determine which order to change (never do just that, by the way; make sure you check the order belongs to the user), someone could just pick different ids to mess with other people's orders. That can happen easily with an UpdateView where you forget to check permissions.
Regarding the last one: I see plenty of posts here on SO where people show their UpdateView for changing user profiles and other really sensitive information. In most cases the pk is the URL parameter used to fetch the UserProfile. But I almost never see a decorator or mixin (PermissionRequiredMixin or UserPassesTestMixin) to check that the user is actually the one authorised to modify this object. I just pray it's left out for clarity's sake :-)
On the other hand, in many cases there's not much harm in using ids. This site, Stack Overflow, uses a sequential id for the URL of a question/answer. Nothing serious can happen here if I randomly try other ids. And apparently they are happy to share how many questions and answers have been posted so far (57478609 when you posted this question).
TL;DR: Apart from giving visitors the ability to "count" objects in your database, the other security issues with sequential ids aren't real issues if you take care of your security. But by using random ids, e.g. UUIDs, in your URLs (not necessarily replacing the pk in the db) you can reduce the risk if you forgot to secure something where people can guess ids (or your intern forgot and it got past your code review and unit tests somehow).
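To make the "random id next to the pk" idea concrete, here is a minimal, framework-free sketch. The Blog class and find_by_public_id helper are made up for illustration; they only show a sequential pk staying internal while a random UUID is what appears in URLs.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Blog:
    pk: int      # sequential primary key, kept internal
    title: str
    # Random identifier exposed in URLs; uuid4 values cannot be found by counting.
    public_id: uuid.UUID = field(default_factory=uuid.uuid4)


def find_by_public_id(blogs, raw_id):
    """Return the blog whose public_id matches raw_id, or None."""
    try:
        wanted = uuid.UUID(raw_id)
    except ValueError:
        return None  # malformed id: treat it like "not found"
    return next((b for b in blogs if b.public_id == wanted), None)
```

In a Django project the same idea would probably be a models.UUIDField(default=uuid.uuid4, unique=True, editable=False) on the model plus the built-in uuid path converter in the URL pattern, but treat that as a suggestion to check against the docs rather than a drop-in recipe.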

You asked a general question, and the general answer would be: "It depends"
TL;DR: Can you hack my site with the few pieces of information I am freely giving away concerning the backend?
This question is broad. You could hack a site with a toothpick if you annoy the site owner by poking them with it until they give you the password.
Instead I'll assume you asked the titular question:
Q: Are PKs in URLs a security concern?
A: They can be.
In your example you mention blog posts, so let's assume your site has plenty of users all writing blog posts. Now you add the ability for a User to set their latest blog entry to "private". Blog posts marked private only show up on the dashboard for the user that wrote them, and don't show up on everyone else's blog feeds e.g:
{% for article in articles %}
{% if not article.private %}
... <article feed stuff here>
{% endif %}
{% endfor %}
Great!
However, one of your users posts a private article and looks at the address bar, which shows https://myblog.blog/articles/42, and then at a previous article they wrote yesterday, which is https://myblog.blog/articles/37, and deduces that the IDs are sequential. On a whim they type https://myblog.blog/articles/41 into the address bar and, oh dear, now they're looking at an article that someone else posted that, for the sake of argument, we'll say was also set to private.
Because we had no check in place to make sure that the user looking at the (private) blog post was permitted to do so, we exposed someone's private information. That is bad enough for blog posts, but a very expensive disaster for e.g. bank accounts (there are plenty of examples of major banks slipping up on this particular issue).
Django has a robust system for dealing with this sort of thing: https://docs.djangoproject.com/en/2.2/topics/auth/default/#limiting-access-to-logged-in-users-that-pass-a-test
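As a rough illustration of the missing check (plain Python, not Django's actual API; Article, can_view and article_detail are names invented for this sketch), the detail view has to test the requesting user against the article, not just hide private posts from the feed:

```python
from dataclasses import dataclass


@dataclass
class Article:
    id: int
    author: str
    private: bool = False


def can_view(article, username):
    """Public articles are visible to everyone; private ones only to their author."""
    return (not article.private) or article.author == username


def article_detail(articles, article_id, username):
    """Return (status, article) roughly the way a detail view would."""
    article = articles.get(article_id)
    if article is None or not can_view(article, username):
        # Same 404 in both cases, so a guesser cannot tell "private" from "missing".
        return 404, None
    return 200, article
```

With this in place, typing /articles/41 as another user gets the same answer as an id that does not exist.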
The argument can still be made that as well as permissions checks, good practice would be to use UUIDs (or short UUIDs) for the id "slugs" in the URLs of any objects that you would rather weren't guessable.
Also, not security related but on the subject of URLs for public articles and blog posts you may find this interesting: https://wellfire.co/learn/fast-and-beautiful-urls-with-django/

Related

stopping spam bots in coldfusion

I am blocking a huge number of bots, except the ones from search engines, and then only allowing 2 seconds of session management.
However, spam bots are still able to bypass these measures and create a huge number of requests, which is 'killing' the server.
I have read other articles on this site but none seem to directly answer this issue.

A bot probably behaves faster than a human. You could time how long it takes them to fill out the form. Anything less than a second or two is a bot.
A bot probably doesn't have JavaScript turned on. You could use that to your advantage.
You could hide a link via css (or not give it any text) that takes the bot to a bot.cfm page, which could then set a session value.
There are some open source projects but I can't remember the names of them off the top of my head.
CF10 has a new validation function.
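The timing and hidden-field ideas above are not ColdFusion-specific. Here is a small sketch in Python, assuming a hidden field named "website" that humans never see and a two-second threshold; both are arbitrary choices for the example.

```python
import time

MIN_SECONDS = 2.0  # assumed threshold; tune it to your form


def looks_like_bot(form, rendered_at, now=None):
    """Flag a submission if the hidden honeypot is filled or it arrived too fast."""
    now = time.time() if now is None else now
    if form.get("website", "").strip():  # hidden via CSS; real users leave it empty
        return True
    return (now - rendered_at) < MIN_SECONDS
```

rendered_at should come from somewhere the client cannot edit, for example the session, otherwise a bot can simply lie about it.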

Ben Nadel has written some useful posts in his blog regarding spiders/bots.
http://www.bennadel.com/blog/1083-ColdFusion-Session-Management-And-Spiders-Bots.htm
http://www.bennadel.com/blog/154-ColdFusion-Session-Management-Revisited-User-vs-Spider-III.htm
For forms, I use <cfimage> to create a captcha image. I have found that stuffing the captcha phrase in a session variable can cause problems (I can't remember what the problems were though). So, I now use <cfencrypt> to include an encrypted phrase in the form itself. The action page decrypts the phrase and compares it to what the user put in the captcha form field.
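A sketch of the same "keep the expected answer in the form instead of the session" idea, in Python. Note that it swaps the encryption described above for a keyed hash (HMAC), which is enough when the server only needs to compare; SECRET_KEY and both function names are placeholders.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical server-side secret, never sent to the client


def sign_phrase(phrase):
    """Token to embed in a hidden field next to the captcha image."""
    normalized = phrase.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalized, hashlib.sha256).hexdigest()


def captcha_ok(user_input, token_from_form):
    """Recompute the token from what the user typed and compare in constant time."""
    return hmac.compare_digest(sign_phrase(user_input), token_from_form)
```

As written, a solved phrase and its token could be replayed, so a real form would also want an expiry or a one-time value mixed into the signed data.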

I've found CFSPAMProtect to be very useful at blocking automated form fillers.
It bases its SPAM/HAM test on an aggregate score of a number of metrics, including time on page and mouse movement (via JS), as well as the classic hidden form fields that shouldn't be filled in (but are filled in by dumb robots).
You can assign your own weightings and monitor the SPAM catch via email to allow you to tailor things.
It can work on its own or link to some third party SPAM tools such as Akismet.
So far I've found that it's good enough on its own.
It's a custom tag and easy to implement in existing forms too which is nice.
Give it a go...
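For anyone curious what an "aggregate score with weightings" can look like, here is a toy version in Python. This is not CFSPAMProtect's actual algorithm; the signal names, weights and threshold are invented for the example.

```python
# Assumed weights per signal; a higher weight means stronger evidence of spam.
WEIGHTS = {"too_fast": 3, "no_mouse_movement": 2, "honeypot_filled": 5}
THRESHOLD = 5  # assumed cut-off


def spam_score(signals):
    """Sum the weights of every signal that fired."""
    return sum(WEIGHTS[name] for name, fired in signals.items() if fired)


def is_spam(signals):
    return spam_score(signals) >= THRESHOLD
```

The point of scoring rather than hard rules is that one weak signal alone (a fast typist, say) does not get a real user blocked.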

Django - URL design and best practices for identify one object

I'm currently working on a Django project and I'm not sure about the best URL format for accessing the page of one particular object.
I was thinking about these alternatives:
1) Using the autoincremental ID => .com/object/15
This is the simplest and best-known way of doing it. "id_object" is the autoincremental ID generated by the database engine when the object is saved. The problem I find with this approach is that the URLs are easy to iterate over: anyone can write a simple script that visits all the pages by incrementing the ID in the URL. That may be a security problem.
2) Using a <hash_id> => .com/object/c30204225d8311e185c3002219f52617
The "hash_id" would be some alphanumeric string value, generated for example with uuid functions. It's a good idea because it is not iterable. But generating "random" unique IDs may cause some problems.
3) Using a Slug => .com/object/some-slug-generated-with-the-object
Django comes with a "slug" field for models, and it can be used to identify an object in the URL. The problem I find in this case is that the slug may change over time, producing broken URLs. If a search engine like Google has indexed the broken URL, users may be sent to "not found" pages and our page rank can decrease. Freezing the slug can be a solution: save the slug only on the "Add" action, and not on "Update". But then the slug may come to represent something old or incorrect.
All the options have advantages and disadvantages. Maybe using some combination of them can solve the problems.
What do you think about that?

I think the best option is this:
.com/object/AUTOINCREMENT_ID/SLUG_FIELD
Why?
First reason: the AUTOINCREMENT_ID makes it simple for users to identify an object. For example, on an ecommerce site, if the user wants to visit a page several times (because he's not sure about buying the product) he will recognize the URL.
Second reason: the slug field will prevent the problem of someone iterating over the web pages and will make the URL clearer to people.
This .com/object/10/ford-mustang-2010 is clearer than .com/object/c30204225d8311e185c3002219f52617
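One common way to wire up an ID-plus-slug URL is to look the object up by ID and redirect to the canonical URL when the slug is stale or missing. A sketch in plain Python follows; Item and resolve are names made up for this example, and the "/object/" prefix is taken from the URLs above.

```python
from dataclasses import dataclass


@dataclass
class Item:
    id: int
    slug: str


def resolve(items, path):
    """Map '/object/<id>/<slug>' to (status, item_or_redirect_target)."""
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "object":
        return 404, None
    try:
        item = items.get(int(parts[1]))
    except ValueError:
        return 404, None  # the id segment is not a number
    if item is None:
        return 404, None
    canonical = f"/object/{item.id}/{item.slug}"
    if "/" + "/".join(parts) != canonical:
        return 301, canonical  # stale or missing slug: send to the current URL
    return 200, item
```

This keeps old links working when a slug changes. Be aware that in this variant the lookup is by ID only, so the slug helps readability but does not by itself stop someone from stepping through IDs; for that the slug would have to be checked and mismatches rejected.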

IDs are not strictly "iterable". Things get deleted, added back, etc. Over time, there's very rarely a straight linear progression of IDs from 1-1000. From a security perspective, it doesn't really matter. If views need to be protected for some reason, you use logins and only show what each user is allowed to see to each user.
There are upsides and downsides to every approach, but I find slugs to be the best option overall. They're descriptive, they help users know where they are, and at a glance they let them tell where they're going when they click a URL. And the downsides (404s if slugs change) can be mitigated by 1) never changing slugs, and 2) setting up proper redirects when a slug does need to change for some reason. Django even has a redirects framework baked in to make that easier.
The idea of combining an id and a slug is just crazy from where I'm sitting. You still rely on either the id or the slug part of the URL, so it's inherently no different than using one or the other exclusively. Or you rely on both, compound your problems and introduce additional points of failure. Using both simply provides no meaningful benefit and seems like nothing more than a great way to introduce headaches.

Nobody has mentioned the UUID field (see the Django model field reference page), which can be a good implementation of the "hash id". I think you can have a URL like:
.com/object/UUID/slug
It avoids exposing an ordering in the URL when that ordering is not relevant.
Other alternatives could be:
.com/object/yyyy-mm-dd/ID/slug
.com/object/kind/ID/slug
depending on which information you want to have in the URL.

Does this "invite friends" system for my website sound flawed to you?

I would like to have a system where my users can invite their friends. We prefer not to use a URL shortener when sending the invite link but it is also important that the link be relatively short. I am thinking the best way to accomplish this is just give each user a "profile username" like "tonyamoyal12" and let them request a new unique one if they want.
When my users send out invites, it will send out a URL like http://mydomain/invite/profile_username and essentially if the invitee logs in at that URL, the inviter gets credit. Can anyone think of drawbacks to this approach? Most invite URLs have hashes to verify the integrity of the invite but I think my approach works fine.
UPDATE
The profile username is that of the INVITER not INVITEE. So a user signs up on the INVITER'S profile page and therefore the inviter gets a "point" for having someone sign up on his page.
Thanks!

In these types of systems, you don't usually assign any user data (i.e., user names) before the invitee has actually signed up, and it may be a bit of a pain to get that kind of URL working depending on the framework you're using.
The process is normally:
Invite a user, which sends them an e-mail.
Invitee clicks through a link in the e-mail to go to the site's main registration page.
Invitee registers with a valid user name of their choosing, and based on some unique random key (included in the clickthrough link), you can do your business logic with the two users involved (add to friends list, or whatever).
The drawback to generating your own user names is that they're more likely to be guessed than a random number, because you'll likely use English words in them. If you generate and assign random user names (i.e., "s243k2ldk8sdl"), the invitee is not going to be pleased since they have to do extra work to change the user name, or somehow remember that name.
EDIT, since I didn't understand the question very well.
I think the scheme is fine, except I would just use the inviter's user name in the URL and not allow them to change it (why allow it?). The only issue is if there is some kind of limit put on the number of invites (or maybe there is a reward for each invite), where you'd want to secure each clickthrough with some kind of unique hash value only valid for the inviter's URL.
EDIT 2
Since the users in the system do not have user names assigned, you could go either way. Allowing "user name" assignment on a first-come, first-served basis would be fine, as this would let everyone share their URL more easily with friends since it's memorable and can simply be typed in. However, that goes out the window if a unique key is required to sign up... in which case, it's going to be simpler to just not implement the user name thing and direct everyone to a single registration page of some kind.

Why not just create your own bespoke URL shortening?
If the reason you are avoiding URL-shortening is that you don't want to depend on external companies then that could be a good solution for you.

You can't independently track the invites. At some point you may want to know how many invites went out from a user vs. how many were accepted. With this single URL system you can't track that information.
Bots can easily be written to spam such a system. (Perhaps solved with captcha on resulting pages)

Well, if the website is large you will get name conflicts, and you will be dependent on the inviter putting in the invitee's name, which they could do poorly.
If you want to do it that way then you will have to deal with name clashes.
Also, it is possible that someone could come along and decide to randomly type in names to see if they get a hit. Say I wanted to be nosy and spy on a friend to see if they are sending out invites to other friends.
EDIT: Ahh, OK. Well, if they are just clicking a link to go to the inviter then that's not a problem. That seems perfectly normal and there is no secret about exposed usernames.

You could create a unique hash for each invite and keep an association of hashes with user names. This would require a bit of storage overhead, but you could have expiration of invites to help combat that.
So http://mydomain/invite/RgetSqtu would be an example link, with a DB table that stores RgetSqtu/profile until it is used.
You would probably want to provide a helpful error page if the hash could not be found, like the following:
We are sorry but the invite you entered could not be found. This could be caused by the invite being typed incorrectly, being used already, or being too old (invites expire after 3 days).
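A rough sketch of that table in Python, with a dict standing in for the DB. create_invite and redeem_invite are hypothetical names, and the three-day lifetime is taken from the error message above.

```python
import secrets
import time

INVITE_TTL = 3 * 24 * 3600  # three days, in seconds

invites = {}  # token -> (inviter_username, created_at); stands in for a DB table


def create_invite(inviter, now=None):
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(6)  # 6 random bytes -> 8 URL-safe characters
    invites[token] = (inviter, now)
    return token


def redeem_invite(token, now=None):
    """Return the inviter's username once, or None if unknown, used, or expired."""
    now = time.time() if now is None else now
    record = invites.pop(token, None)  # pop makes each invite single-use
    if record is None:
        return None
    inviter, created_at = record
    if now - created_at > INVITE_TTL:
        return None
    return inviter
```

Because every invite has its own token, this also lets you count invites sent versus accepted per user.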

I'd suggest passing the inviter's username in the querystring, and have that querystring fill either an editable or non-editable textbox on the new user registration page. That way you still have just one registration page, the URL is short, and users get credit for referring friends.
http://mydomain/invite/register.html?inviter=invitersUsername
leads to
First Name: _________
Last Name: _________
Referred By: invitersUsername

If I understand the setup correctly, an existing user creates invites by giving the email addresses of their friends, and your system sends each of them a mail containing [inviteUrl]/[inviter'sUserName].
So in the case I send a mail to invite you, the url would be:
www.yourThang.com/invite/borisCallens
Every time somebody visits this, I (the user borisCallens) get a point.
What would stop me from visiting this url a gazillion times and thus win the invite-your-friend game?

How do I block people from intentionally re-submitting a form?

I'm building a website using Ubuntu, Apache, and Django. I'd like to block people from filling out and submitting a particular form on my site more than once. I know it's pretty much impossible to block a determined user from changing his IP address, deleting his cookies, and so on; all I'm looking for is something that will deter the casual user from re-submitting.
It seems to me that blocking multiple form submissions from the same IP address is the best way to achieve what I'm looking for. However, I'm unsure how I should do this, and whether I should be doing this from Apache or from Django. Any tips?
Edit: I'm looking to prevent intentional re-submission, not just unintentional double submission. e.g. I have a survey that I want to discourage people from voting multiple times on.

If your main concern is to prevent someone from writing a script that automatically submits the form many times, you may want to use a CAPTCHA with your form.

Several whole countries are NAT'ed, and some (most?) large multinational corporations too, many with several hundred thousand users each. Blocking anything by IP is a bad idea.
Go for a cookie instead, which is as good as it's going to get. You could also make the user log in, in which case you'd know if the form was submitted repeatedly for that login.

I would use the session id, and store form submissions in a table with session id, timestamp, and optionally some sort of form identifier. Then, when a form is submitted, you could check the table to make sure that it had not happened within a certain period of time.
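Something along these lines, sketched in Python with a dict in place of the table. The 24-hour window and the try_submit name are assumptions for the example.

```python
import time

COOLDOWN = 24 * 3600  # assumed window during which a repeat submission is refused

submissions = {}  # (session_id, form_name) -> timestamp of the last accepted submit


def try_submit(session_id, form_name, now=None):
    """Accept the submission unless this session already submitted recently."""
    now = time.time() if now is None else now
    key = (session_id, form_name)
    last = submissions.get(key)
    if last is not None and now - last < COOLDOWN:
        return False
    submissions[key] = now
    return True
```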

Filtering on IP address and/or cookies is easy to get around, but it will prevent the casual user from accidentally submitting the same stuff multiple times due to browser hiccups, impatience and so on.
If you want something better than that you could implement login, but of course that prevents a lot of users from responding.

Add to the form a monotonically increasing id number in a hidden field.
As each form is submitted, record the id in a "used" list/map (or mark it used, or whatever, implementation detail).
If you get the same id a second time (if it's already in your used map) inform the user they double-submitted.
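The three steps above could look roughly like this in Python (in-memory sets instead of real storage; the function names are made up). The sketch also remembers which ids were actually issued, so a made-up id is rejected as well.

```python
import itertools

_next_id = itertools.count(1)  # monotonically increasing id source
issued = set()                 # ids that were rendered into a form
used = set()                   # ids that have already been submitted


def new_form_id():
    """Call when rendering the form; put the result in a hidden field."""
    form_id = next(_next_id)
    issued.add(form_id)
    return form_id


def accept_submission(form_id):
    """True the first time an issued id comes back, False otherwise."""
    if form_id not in issued or form_id in used:
        return False
    used.add(form_id)
    return True
```

This catches double submits of the same rendered form; it does not stop someone from loading the form again to get a fresh id.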

While nothing is foolproof, I would suggest something like this: when a user loads the page with your form on it, a cookie is set, a fixed secret string is appended to the cookie's value, and the md5 of the result is written to a hidden field on the form. Ensure that a new value is generated each time the user accesses the form.
When the user submits the form, you check that the cookie value and form value match, that the cookie the user was given has not been used to submit the form before, and that the referrer matches the URL of the form. Optionally, you make sure that there have been no attempts to post from that IP in the last 2 minutes (fast enough that it won't matter to most people, but slow enough to slow down bots).
To get around this, the user has to write a script that loads the page, stores the cookies and submits the correct values. This is much more difficult than if the user could just submit the form.
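A sketch of the cookie-plus-hidden-field check in Python. It uses HMAC-SHA256 where the description above uses md5(cookie + secret), since that is the usual way to build a keyed hash today; SECRET and the function names are placeholders, and the referrer and IP checks are left out.

```python
import hashlib
import hmac
import secrets

SECRET = b"fixed-server-secret"  # hypothetical; keep it out of the page source
used_cookies = set()             # cookie values that already submitted the form


def issue_form():
    """Return (cookie_value, hidden_field_value) for a freshly rendered form."""
    cookie = secrets.token_hex(16)  # new random value on every page load
    hidden = hmac.new(SECRET, cookie.encode(), hashlib.sha256).hexdigest()
    return cookie, hidden


def verify(cookie, hidden):
    """Accept only a matching, not-yet-used cookie/hidden-field pair."""
    expected = hmac.new(SECRET, cookie.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, hidden):
        return False
    if cookie in used_cookies:
        return False
    used_cookies.add(cookie)
    return True
```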
Added Based on edit: I would block the users in the Django framework. This allows you to present a much better error message to the user and you only block them from that form.

This is a question of authentication and authorisation, which are related but not the same. In order to manage authorisation you must first authenticate (reliably identify) the user.
If you want to make this resist intentional misuse then you are going to end up with not only usernames and passwords but demands for information that personally identifies your users, along the lines of the stuff a bank asks for when you want to open an account. The bleeding hearts and lefties will snivel endlessly about invasion of privacy but in fact you are doing exactly the same as a bank and for exactly the same reasons.
It's a lot of work and may be affected by law. Do you really want to do it?

The following methods are all relatively simple, both to implement and to hack around. Anyone with Firebug and a little knowledge won't even blink.
The following JavaScript uses Mootools, and I haven't checked it to be bug free. I understand that JQ syntax is almost identical, and raw JS is similar enough, so the point should be clear.
1) If the form is being submitted via AJAX, you can check before submitting (sorry if I'm just stating the obvious).
var sent = 0;
$('myForm').addEvent('submit', function(e){
e.stop(); // cancel the normal (non-AJAX) submit
if(!sent){ sent = 1; this.send(); } // send only the first time
})
This is really simple, and surprisingly effective until they reload the page.
2) Add a JavaScript cookie. Again, with Mootools:
$('myForm').addEvent('submit', function(){
if(Cookie.read('submitted')){ alert('once only'); return false;}
else{ Cookie.write('submitted', 1); return true; }
})
This will work even if the user reloads the page.
3) Add a Python session cookie. I am not familiar with Python, but if it is like PHP, this will have no advantage over method 2. In either case, the user can delete the cookie with FireCookie or WebDeveloper Toolbar (or their equiv's on other browsers) and reload the page.
4) Add a Flash cookie (use Flex). This is ideal - Flash cookies are stored in a different location, are not obvious, and are very difficult to remove. The only downside is that you need to create and embed a tiny swf.
5) Store a value in a hidden field, and check for the value.
A hash can be added to the internal links to ensure that the value remains filled in even if the page is navigated away from.
6) Other games can be played incrementing a URL (or a custom URL using htaccess) for each visitor.
An swf cookie is the best idea of the above, though it can be combined with the others.

How can you tell if a site has been made with Django?

A company I'm looking at claims to have made the website for an airline and a furniture store using Django, but when I look at the sites, there is no indication what the underlying web technology is. How can you tell?

This is quite an old question, but I don't see any canonical answers. As the other answers have noted though, there's no sure-fire way to know, and if someone wanted to hide the fact that they're using Django, they can. That said, you can always do a little detective-work and determine with some confidence whether it uses Django or not. If that's your goal, here are some strong indicators you can look out for:
Admin Panel
First and foremost, check if the site has a /admin/ page. If it does, and it gives that familiar Django admin login page, you're 99% sure (unless someone went through a lot of trouble to make it look like Django).
Forms
There are a number of things you can look out for in forms:
Form fields with id attributes starting with id_
Check for a hidden field with the name csrfmiddlewaretoken
If the site has a formset, check for -TOTAL_FORMS and -DELETE hidden inputs.
Cookies
If the site uses the contrib.auth package for authentication, you will probably see a cookie called sessionid being set when you log in.
Forms will also probably set a cookie called csrftoken.
Trailing Slashes
Trailing slashes after URLs, and/or redirecting you to the page with a trailing slash if you try to go to one without it. This is Django's default behavior, and to my knowledge not extremely common in other frameworks. Note, though, that it can be easily deactivated in Django.
Error Pages
Failing all this, or if you're still not convinced, you can try to force error pages and learn something from those. Go to an unmapped URL to get a 404 page, and see if DEBUG still happens to be True (in which case you should probably notify the owner that they're not being very secure about their site).
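If you want to automate the form and cookie checks, something like the following Python sketch works on a page you have already downloaded. django_hints is a made-up helper; it only pattern-matches the indicators listed above, so a match is a hint and not proof.

```python
import re


def django_hints(html, cookie_names):
    """Return a list of Django-looking indicators found in a page and its cookies."""
    hints = []
    if re.search(r"name=['\"]csrfmiddlewaretoken['\"]", html):
        hints.append("csrfmiddlewaretoken field")
    if re.search(r"name=['\"][\w-]+-TOTAL_FORMS['\"]", html):
        hints.append("formset management form")
    if re.search(r"\bid=['\"]id_\w+['\"]", html):
        hints.append("id_ prefixed form fields")
    for name in ("csrftoken", "sessionid"):
        if name in cookie_names:
            hints.append(f"{name} cookie")
    return hints
```

Pass in the page source as a string and the set of cookie names the site sent; the more hints come back, the more confident you can be.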

You can try a few things, such as attempting to find error pages, and checking the default location of the administration panel that Django creates, but overall there's no way to determine what technologies a given site is using.
See also: https://stackoverflow.com/questions/563316/is-there-a-generic-way-to-see-what-is-a-website-running-on/563335#563335

Look for the csrf input box. It is present on any page that has a POST form, although it can be turned off (not recommended), and very old versions of Django may not have it. If it's there, it's a strong indicator.
It looks like this:
<input type='hidden' name='csrfmiddlewaretoken' value='3b3975ab79cec7ac3a2b9adaccff7572' />

Navigate to a page with a formset, and check if there are *-TOTAL_FORMS or *-DELETE hidden inputs.
That doesn't prove that they are using Django, but might be a clue that they are (with the mentioned model formsets).

Try to navigate to some 404 error page, or something of that sort. Chances are slim, but try to find a default django error page.
You can also try to login to www.website.com/admin and see if you get the default django admin page.
Other than that, if that didn't work, then you just can't.

There are no reliable indicators to my knowledge but you could check the /admin/ URL to see if you get the standard admin app or sometimes the feed-URLs use a common prefix compared to a common suffix (although this might not be an indicator at all but just a preference of the developers).
Trying to trigger a debug page (either via a 404 or using some broken input that might cause an internal error) might also be a good way (although this acts more as a test of competency of the original developers and admin than anything else :-) )

Could you ask the airline and / or the furniture store? I'm guessing that you want to know if this company has good experience in django, I think it is reasonable to ask for references if you are considering working with them.
The other companies may be quite happy to discuss what technologies were used - some are and some aren't, but it's worth asking.
