REST API - How to structure endpoint urls - web-services

I am creating a REST API and I'm having some doubts on how to organize the urls for the following endpoints:
list all universities
list all faculties of a given university
retrieve details for a faculty/university
I think it would make sense to have something like this (although the last one has an unnecessary parameter, inst_id, that I decided to have there for readability purposes):
#list all universities
/api/v1/universities
#retrieve university detail
/api/v1/universities/{inst_id}
#list faculties of a university
/api/v1/universities/{inst_id}/faculties
#retrieve details of a faculty
/api/v1/universities/{inst_id}/faculties/{inst_unit_id}
The problem with this is that university and faculty details are given by the same service, so it doesn't make sense to have two urls.
How should I organize this then? I think these two options are ok:
Retrieves faculty details with the university url. This is good because there are no unnecessary parameters, but it kind of "goes back" in the route to get the faculty details after listing all faculties:
#list all universities
/api/v1/universities
#retrieve university/faculty detail
/api/v1/universities/{inst_id}
#list faculties of a university
/api/v1/universities/{inst_id}/faculties
Retrieves university details with the faculty url. I think the flow is more understandable this way, it doesn't "go back" in the route. However, the last endpoint receives an unnecessary parameter and the university details url is not immediately after the list of universities url in the route:
#list all universities
/api/v1/universities
#list faculties of a university
/api/v1/universities/{inst_id}/faculties
#retrieve university/faculty detail
/api/v1/universities/{inst_id}/faculties/{inst_unit_id}
Which one should I use? Are there any other suggestions?
Thank you!

There is no such thing as a REST endpoint. -- Fielding, 2018
REST has resources. A common example of a REST resource is a web page; for example, the one you are reading right now.
The design question you are facing here is analogous to the question of whether the "university" web page should include the list of faculty, or not. Either way is fine.
If you are using a media type with processing rules that have an understanding of fragments (HTML, for example), then you can provide an identifier for the faculty list as a secondary resource:
/api/v1/universities/{inst_id}#faculty
The choice between one web page or two is likely to be a matter of weighing different trade-offs. For example, if the "university" part of the web page is much bigger than the "faculty" part, and changes more slowly than the "faculty" part, then you might consider separating them so that you can assign different caching metadata to the two parts.

Related

Django 2.x: Is using the Primary Key of a Model in the URL pattern a security concern?

The id (PK) of a model/ DB can be passed to and used in the URL pattern. Everyone, including hackers, would be able to piece together some information about my DB from this and the actual data in the template.
My questions are kind of general at this point. I would just like to understand how the info above could be used to compromise the data. Or if someone could point me to some further reading about this topic I would appreciate it.
This is a general question as I am trying to gain more understanding into securing Django sites. I have read several articles but nothing's satisfied the question.
Code:
Where the href passes the blogs id to be used in url matching and ultimately pulling data from the DB in the views/ template:
<a href= "{% url 'details' blog.id %}">
and
urlpatterns = [
path('<int:blog_id>/', views.details, name = 'details'),
]
And the URL being:
domain/appname/blog_id/
TL;DR: Can you hack my site with the few pieces of information I am freely giving away concerning the backend?
First it depends on how your ids are generated. The default in Django is to use sequential numbers, which gives away the following (non-exhaustive) information:
Someone can easily try other ids to see what they get. If you haven't properly protected access to ids you don't want to show, someone might be able to see content they shouldn't see. Many information leaks were just due to this: Guess the URL et voilà! Something that was supposed to be published tomorrow is suddenly leaked today. The same applies for dates in the URL. Of course, if you have proper checks for who's allowed to view "draft" posts, there's no harm.
By trying all ids, you can find out numbers: maybe you don't want others to know how many products you have in your database because it's sensitive information. If I can just do /products/4924 to fetch info about product #4924, I can easily create a script to quickly increase the number until I get 404 Not Found, by which time I know there are 10252 products in your database.
If you have a form to make changes to an order and use the id in the URL to determine which order to change (never do just that by the way, make sure you check the order belongs to the user), someone could just pick different ids to mess with other people's orders. That can happen easily with an UpdateView where you forget to check permissions.
Regarding the last one: I see plenty of posts here on SO where people show their UpdateView for changing user profiles and other really sensitive information. In most cases the pk is the URL parameter used to fetch the UserProfile. But I almost never see a decorator or mixin (PermissionRequiredMixin or UserPassesTestMixin) to check that the user is actually the one authorised to modify this object. I just pray it's left out for clarity's sake :-)
On the other hand, in many cases there's not much harm in using ids. This site, StackOverflow, uses a sequential id for the URL of a question/answer. Nothing serious can happen here if I randomly try other ids. And apparently they are happy to share how many questions and answers have been posted so far (57478609 when you posted this question).
TL;DR: Apart from giving visitors the ability to "count" objects in your database, the security issues with sequential ids aren't real issues if you take care of your security. But by using random ids, e.g. UUIDs, in your URLs (not necessarily replacing the pk in the db) you can reduce the risk if you forgot to secure something where people can guess ids (or your intern forgot and it got past your code review and unit tests somehow).
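To make the "counting" point concrete, here is a minimal, framework-free sketch. The names `sequential_store`, `uuid_store` and `probe_count` are hypothetical and the data is invented for illustration; in a Django project the random identifier would more likely be a separate UUID column next to the integer pk.

```python
import uuid

# Hypothetical in-memory "tables"; a real site would query the database.
sequential_store = {pk: f"product {pk}" for pk in range(1, 51)}
uuid_store = {uuid.uuid4(): name for name in sequential_store.values()}


def probe_count(lookup, start=1, limit=10_000):
    """Walk ids upward until the first miss, as a visitor's script might."""
    pk = start
    while pk < limit and lookup(pk) is not None:
        pk += 1
    return pk - start


# Sequential ids leak the row count to anyone who can request /products/<id>.
leaked = probe_count(sequential_store.get)

# The same probe learns nothing when the public ids are random UUIDs.
guessed = probe_count(lambda pk: uuid_store.get(uuid.UUID(int=pk)))
```

Random ids only hide the count and make guessing impractical; they are not a substitute for the permission checks described above.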
You asked a general question, and the general answer would be: "It depends"
TL;DR: Can you hack my site with the few pieces of information I am freely giving away concerning the backend?
This question is broad. You could hack a site with a toothpick if you annoy the site owner by poking them with it until they give you the password.
Instead I'll assume you asked the titular question:
Q: Are PKs in URLs a security concern?
A: They can be.
In your example you mention blog posts, so let's assume your site has plenty of users all writing blog posts. Now you add the ability for a User to set their latest blog entry to "private". Blog posts marked private only show up on the dashboard for the user that wrote them, and don't show up on everyone else's blog feeds, e.g.:
{% for article in articles %}
{% if not article.private %}
... <article feed stuff here>
{% endif %}
{% endfor %}
Great!
However, one of your users posts a private article and looks at the address bar which shows https://myblog.blog/articles/42 and then at a previous article they wrote yesterday which is https://myblog.blog/articles/37 and deduces that the IDs are sequential. On a whim they type into the address bar https://myblog.blog/articles/41 and oh dear, now they're looking at an article that someone else posted that for the sake of argument we'll say was also set to private.
Because we had no check in place to make sure that the user looking at the (private) blog post was permitted to do so, we exposed someone's private information. That is bad enough for blog posts, but a very expensive disaster for e.g. bank accounts (there are plenty of examples of major banks slipping up on this particular issue).
Django has a robust system for dealing with this sort of thing: https://docs.djangoproject.com/en/2.2/topics/auth/default/#limiting-access-to-logged-in-users-that-pass-a-test
The argument can still be made that as well as permissions checks, good practice would be to use UUIDs (or short UUIDs) for the id "slugs" in the URLs of any objects that you would rather weren't guessable.
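The missing check in the private-article scenario can be sketched without any framework. This is only an illustration under assumptions: `Article`, `Forbidden` and `get_article_for` are hypothetical names, and in Django the same test would normally live in the view (for example via UserPassesTestMixin) rather than in a helper like this.

```python
from dataclasses import dataclass


@dataclass
class Article:
    id: int
    author: str
    private: bool = False


class Forbidden(Exception):
    """Raised when the requester may not see the article."""


def get_article_for(user, article_id, articles):
    # Fetch by id first, then check permission before returning anything.
    article = articles[article_id]
    if article.private and article.author != user:
        raise Forbidden(f"article {article_id} is private")
    return article


# Hypothetical data matching the scenario: 41 and 42 are both private.
articles = {
    41: Article(41, author="alice", private=True),
    42: Article(42, author="bob", private=True),
}
```

With this in place, bob typing /articles/41 into the address bar gets an error instead of alice's private post, whether or not the id was guessable.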
Also, not security related but on the subject of URLs for public articles and blog posts you may find this interesting: https://wellfire.co/learn/fast-and-beautiful-urls-with-django/

Django url paths what is optimal number to cover all combinations

Example: Car Website.
Suppose you have a URL structure like /maker/model/year, where maker, model, and year are generic placeholders meaning "any". Replacing any part with a specific value filters the results.
So:
/maker/ will give you a list of car makers like VW or Ford.
/Ford/model/ will list all models made by Ford
/maker/model_of_car/ will list all models with that name (if two makers have a model with the same name, it will list both).
/maker/model/ will list all models
etc. If you need more examples, please say so. But I hope this is enough to get the idea across.
What is the optimal number of URLs needed to cover all possibilities? Can you show, if not all of them, enough examples to get the idea across, in the form url(r'(?P<maker>[\w-]+)/(?P<model>[\w-]+)/')? (Optimal meaning: NOT just making one super URL that would require a complex view and template to make it work.)
I am sorry about the question name, can someone change it to fit the body (if required). I feel like it does not do the body justice.
Thank you for your time.
I believe this will work for your needs, but I think you should rethink your URL convention. Look at how Django REST Framework builds its ViewSet URLs. It will work great for you.
url(r'^maker/$', <view>),
url(r'^maker/model/$', <view>),
url(r'^maker/(?P<model_of_car>[\w ]+)/$', <view>),
url(r'^(?P<maker>[\w ]+)/model/$', <view>),
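Because several of these patterns overlap, the order in which they are listed matters. The following sketch imitates first-match resolution with the standard `re` module so the behaviour can be checked outside Django; `ROUTES`, `resolve` and the view labels are hypothetical stand-ins for the `<view>` placeholders.

```python
import re

# Patterns from the answer, checked in order like a urlpatterns list.
# The view names are hypothetical labels, not real view functions.
ROUTES = [
    (re.compile(r'^maker/$'), 'list_makers'),
    (re.compile(r'^maker/model/$'), 'list_all_models'),
    (re.compile(r'^maker/(?P<model_of_car>[\w ]+)/$'), 'models_by_name'),
    (re.compile(r'^(?P<maker>[\w ]+)/model/$'), 'models_by_maker'),
]


def resolve(path):
    """Return (view_name, captured_kwargs) for the first matching pattern."""
    for pattern, view_name in ROUTES:
        match = pattern.search(path)
        if match:
            return view_name, match.groupdict()
    return None
```

Note that maker/model/ also satisfies the third and fourth patterns, so the literal pattern has to come before them. A path such as Ford/Focus/ matches nothing here and would need a fifth pattern.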

CRUD URL Design for Browsers (not REST)

Many questions have been asked on Stack Overflow about RESTful URL design.
To name a few:
Hierarchical RESTful URL design
Understanding REST: Verbs, error codes, and authentication
So I am well aware of RESTful URL design.
However, how about URL design for the browser, for traditional websites which are not Single Page Applications (SPA)?
For the purpose of this example, let's assume we have a Book Database. Let's further assume we have 2 traditional HTML sites created.
HTML Table for showing all books
HTML Form for showing one book (blank or pre-filled with book details)
Now we want that the user of our website can do CRUD operations with it. How about the following URL Design then:
GET /book/show/all // HTML Table
GET /book/show/{id} // HTML Form pre-filled
GET /book/new // HTML Form blank
POST /book/new // Submit HTML Form
POST /book/update/{id} // Submit updated HTML Form
POST /book/delete/{id} // A Link/Button with POST ability (no JS needed)
Questions:
Best practice Browser URL Design
Am I following best practice for URL Design in the Browser (I am not talking about REST here)? Also regarding SEO, Bookmarking and short URL Design? I was thinking of something like: /resource/action/ ...
GET and POST only URL Design
Browsers can only make GET and POST requests unless someone uses JavaScript. Considering the above URL design, would it be wiser to introduce JavaScript and make PUT and DELETE requests for updating and deleting a resource? Or should I stay with GET and POST only?
Cheers
Instead of CRUD (create-read-update-delete), I prefer the acronym (D)AREL (display, add, remove, edit, list) -- the (D) is silent ;-)
While not all RESTful API design choices make sense for a browser based crud app, we can borrow much of it, e.g.:
GET /books -- html table listing all books (alternatively /books/list to go with the DAREL acronym)
GET /books/add -- display a form for adding a new book
POST /books/add -- adds a new book and redirects to /book/1 (where 1 is a new book id)
I personally prefer to use plural nouns for collections and singular nouns for items, so..
GET /book/1 -- display book 1 info (e.g. a customer view)
GET /book/1/edit -- display a form to edit /book/1
POST /book/1/edit -- updates /book/1 and redirects to /book/1
GET /book/1/remove -- maybe/probably optional
POST /book/1/remove -- normally /book/1/edit will have a delete button that handles "are you sure..?" and posts here, redirects to /books
The uri scheme is /resource/unique-identifier/action. The (D) / display action is silent/default for a given resource uri.
This also works if you want to model that a book can have multiple authors:
GET /book/1/authors -- list all authors for /book/1
GET /book/1/authors/add -- add author form
GET /book/1/author/1
GET /book/1/author/1/edit
// etc.
although you will likely need a separate/additional url-hierarchy for authors:
GET /authors
GET /authors/add
GET /author/1
// etc.
and similarly, books that an author has written:
GET /author/1/books
// etc.
Most modern web-apps use ajax calls for sub-resources though, so here you can use a pure RESTful api too:
GET /api/book/1/authors -- returns list of all authors for /book/1
POST /api/book/1/authors -- create a new author, returns the new author uri, e.g. /api/author/1
GET /api/author/1 -- get /author/1 info according to MIME type etc.
PUT /api/author/1 -- update /author/1
DELETE /api/author/1 -- delete the /author/1 resource
DELETE /api/book/1/author/1 -- delete author/1 from /book/1? (or maybe this is covered by PUT /api/author/1 ?)
The translation from the original url-scheme is pretty mechanical
/resource/unique-id/action -> http-verb /resource/unique-id
where action = http-verb
display = GET (on a singular resource)
add = POST
remove = DELETE
edit = PUT
list = GET (on a plural/collection resource)
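The mechanical translation can be written down as a small function. This is a sketch under assumptions: `to_rest` and `ACTION_TO_VERB` are hypothetical names, the `/api` prefix follows the examples above, and the mapping is meant for the form-submitting requests (the GET that merely displays an add or edit form has no REST counterpart).

```python
# Action-to-verb table taken from the mapping above.
ACTION_TO_VERB = {'add': 'POST', 'remove': 'DELETE', 'edit': 'PUT'}


def to_rest(browser_path, prefix='/api'):
    """Translate a browser URL such as /book/1/edit into (verb, REST path)."""
    parts = browser_path.strip('/').split('/')
    if parts[-1] in ACTION_TO_VERB:
        verb = ACTION_TO_VERB[parts[-1]]
        parts = parts[:-1]  # the action moves out of the path into the verb
    else:
        verb = 'GET'  # display and list are both plain GETs
    return verb, prefix + '/' + '/'.join(parts)
```

For example, `to_rest('/book/1/edit')` gives `('PUT', '/api/book/1')` and `to_rest('/books')` gives `('GET', '/api/books')`.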

Creating one-to-one RESTful API relationship

I need to create a one-to-one relationship between a Game and a Site - each game happens in one site. In my database, site is an attribute of the Game object which points to a Site object.
I couldn't find much on the internet about this, these are my ideas:
GET /game/<game_id>/site
Gets a game's site, hiding the site id.
POST /game/<game_id>/site
Creates a game's site, this is only used once when creating the game.
PUT /game/<game_id>/site
Updates a game's site
DELETE /game/<game_id>/site
Deletes a game's site.
But what if someone wants to get a list of all the sites? Should I add a /sites URI and have the get method for the Site object detect whether a game_id has been passed in? Should I also let people access a site by /sites/<site_id> Or should I let the client populate their own list of sites by iterating over all games? Finally, I usually have an 'href' attribute for each object which is a link back to itself. If I went with the above design (incl. the /sites/ URI), do I link to /game/<game_id>/site or /sites/<site_id>? Should there be two places to access the same info?
Am I on the right track? Or is there a better way to model one-to-one relationships in REST?
If it matters, I'm using Flask-RESTful to make my APIs.
Your ideas make a lot of sense.
The big distinction is whether or not a site can exist independently of a game. It sounds like it can. For example, two games may point to the same site.
As far as I understand RESTful API design, there isn't a problem with exposing the same site resource through both /game/<game_id>/site and /sites/<site_id>. But REST encourages you to link data through hypermedia.
Exposing the site in two different places could complicate things, since you'd then expect to be able to interact with site objects through both of those URLs.
My recommendation to keep your structure explicit and simple would be:
Have a collection of site resources at /sites
Expose site resources at /sites/<site_id>
Use a link object from a game to a site. See Thoughts on RESTful API design by Geert Jansen.
Following the link object design, your game resource representation would include something like this:
{
"game_id": 10,
...,
"link": {
"rel": "resource/site",
"href": "/api/sites/14"
}
}
Without more design work, this would mean you'll make a second call to get the site's information. Every design has its compromises :)
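As a rough illustration of the link-object design, the game representation could be built as below. This is a plain-Python sketch, not Flask-RESTful code: `site_href` and `game_representation` are hypothetical helpers, and the `/api/sites/` prefix simply mirrors the example payload.

```python
import json


def site_href(site_id):
    # One canonical URL per site, so every game links to the same place.
    return f"/api/sites/{site_id}"


def game_representation(game_id, site_id):
    """Build the game payload with a link object instead of an embedded site."""
    return {
        "game_id": game_id,
        "link": {"rel": "resource/site", "href": site_href(site_id)},
    }


payload = json.dumps(game_representation(10, 14))
```

Keeping the href construction in one function means the 'href' self-link on the site resource and the link from the game cannot drift apart.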

How could I allow copy editors for my Django site to create internal links in a DRY manner? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
My Django site is an ecommerce store. Relatively nontechnical copy editors will be logging into the Django admin interface and writing the copy for each of the product pages. They have told me that they want to be able to create links in this copy to other pages on the site. For example, if a product references another product in its description, they want to link between the pages.
I see a couple of possible options:
They simply hardcode the urls in <a> tags in the copy. I've set up ckeditor for the admin textareas so this would be the simplest solution, but if the url structure of the site ever changed, (say we changed them for SEO purposes) all the links would break.
Introduce some sort of wiki syntax where they surround the text that they want the links to be in square brackets. Something like:
Widget A works really well with [[Widget B]]. It is good.
would produce:
Widget A works really well with Widget B. It is good.
Then you have the problem of what happens if the product's name changes?
Has anyone dealt with this problem before and come up with a solution that is flexible enough to allow changing links/names/etc?
I deal with this issue frequently. Ultimately, you have to be very persuasive to convince me to allow embedding links directly into the copy--especially with an e-commerce website.
What if the product name changes or is re-branded?
What if the product is discontinued... you don't want 404 errors from your internal links.
Do you really want to lead people away from your "add to cart" call to action that high up on the page?
Do they know your SEO strategy? Are they going to dilute your links? What verbiage will they use? Will they ensure the link is valid?
When I am asked to give copy/product development team the ability to add links I always start with a No. Ask them what they need them for, explain the problems that can arise (eg. extra cost in maintaining valid links, conversion rate considerations, SEO considerations), and offer alternative solutions.
For example, I usually offer the ability to allow them to associate products with products as "Associated Products", "Related Products", "Accessories", "More Information" etc. You can have these in tabs or lists at the bottom of the product page. These would be in models and thus you have control over not displaying discontinued products, the link names are the product names (which you have SEO control over), etc. Determine if they are going for cross-selling, up-selling, or providing the end user with more information.
As a last resort I have also used a custom code parser which is again based on the target object and not a hard-coded link. For example, let's say you give them the ability to do:
Widget A works really well with [product=123].
A custom template tag or a parser in your model/view can replace that with a link to the Product with id=123 (or use a slug) based on get_absolute_url(). If the product is discontinued, the name can still show but without a link. This only works if you have a policy of never deleting records. Even then, you may have to have some error handling for when they enter an invalid product ID or somebody does delete that product. That will happen.
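A minimal sketch of such a parser, using only the standard library, might look like this. Everything here is an assumption for illustration: `PRODUCTS` stands in for the product model, the URLs stand in for whatever get_absolute_url() would return, and dropping unknown ids silently is just one possible error-handling policy.

```python
import re
from html import escape

# Hypothetical product table: id -> (name, url or None when discontinued).
PRODUCTS = {
    123: ("Widget B", "/products/widget-b/"),
    456: ("Widget C", None),  # discontinued: keep the name, drop the link
}

TAG = re.compile(r"\[product=(\d+)\]")


def render_product_tags(copy):
    """Replace [product=ID] tags with links, plain names, or nothing."""
    def replace(match):
        product = PRODUCTS.get(int(match.group(1)))
        if product is None:
            return ""  # unknown or deleted id: emit nothing, not a dead link
        name, url = product
        if url is None:
            return escape(name)
        return f'<a href="{escape(url)}">{escape(name)}</a>'
    return TAG.sub(replace, copy)
```

Because the copy stores only the id, renaming a product or changing the URL structure updates every reference the next time the text is rendered.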