How to match multiple segments in CodeIgniter routing - regex

I'm trying to re-route my CodeIgniter URIs by changing the controller names only and leaving ANY OTHER SEGMENTS intact, regardless of how many there are (if any).
My controllers are using a single-level folder structure, such as controllers/user/profile_controller.php.
Each controller is called [name]_controller to distinguish it from other files/classes and avoid conflicts (e.g. a Users controller clashing with the Tank Auth users model), but I want the URI to be:
/users/profile
Therefore, a simple route (that works) would be:
$route['(:any)/(:any)'] = '$1/$2_controller';
But the above doesn't allow for subsequent segments, and I don't know how many there could be. (:any) doesn't help because it only applies to a single segment, and I don't want to write a separate route for each possible number of segments, even if that number is low.
I have tried a regular expression to match the rest of the URI (eg: users/profile/edit/123/abc), but the following doesn't work:
$route['(:any)/(:any)/(.+)'] = "$1/$2_controller/$3";
Does anyone know if it is possible to match the remaining segments and put them back onto the re-routed URI?
Thanks in advance.
Mat
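For illustration, here is a minimal sketch of the idea in Python's re module, not in CodeIgniter's router. It assumes each segment is anything other than a slash and that the trailing part, if present, starts with a slash; the function name reroute is made up for the example. Whether the equivalent pattern behaves the same way inside $route depends on how the CodeIgniter version in use translates (:any), so treat it as a starting point only.

```python
import re
from typing import Optional

# Two leading segments, then an optional tail that begins with "/".
ROUTE = re.compile(r"([^/]+)/([^/]+)(/.*)?")


def reroute(uri: str) -> Optional[str]:
    """Append "_controller" to the second segment and keep any trailing segments."""
    m = ROUTE.fullmatch(uri)
    if m is None:
        return None  # fewer than two segments: no rewrite
    tail = m.group(3) or ""  # group 3 is None when there is no tail
    return f"{m.group(1)}/{m.group(2)}_controller{tail}"


print(reroute("users/profile"))
print(reroute("users/profile/edit/123/abc"))
```

Making the whole tail (including its leading slash) one optional group means the same pattern covers both the two-segment URI and any longer one.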

Coldbox routing dynamic number of path variables

I am working on a ColdBox application where I would like to create a route that accepts any number of path variables as a single variable. Here is what I mean.
http://localhost/api/variable1/variable2/variable3/...
I would like to either be able to grab everything after /api as one path variable where I can split on / and get the values or be able to iterate over all variables after /api.
Is there a way to set up a route to do this?
with(pattern="/api", handler="api")
.addRoute(pattern="/:variables", action="index")
.endWith();
Any ideas would be most appreciated. Thanks in advance.
As you probably know, the default routing paradigm is to do name value pairs like so:
http://localhost/api/name1/value1/name2/value2/name3/value3
There is no need to create a custom route for that as everything after the matched part of the route is broken up into name/value pairs and placed in the rc automatically.
Now, it sounds like you're wanting to only have values in your route. If you know the maximum number of variables you'll ever have, you could create a route of optional, incrementally-named variables.
addRoute(pattern="/:var1?/:var2?/:var3?/:var4?/:var5?", action="index")
Now, if you truly might have an unlimited number of variables, there is no way to do a route that will match that. What you CAN do is have your route match the /api bit and write an onRequestCapture interceptor that grabs the URL and does your own custom parsing on it. Note, you may need to remove the name/value pairs that ColdBox will try to put in the rc.
I will add a note of caution: the only way for this to really work is for you to KNOW the order of the incoming variables ahead of time, and if you know that, there is no reason why you can't create a known route for it. Otherwise you're basically rebuilding the SES interceptor, which is an anti-pattern called the "inner-platform effect".
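The custom parsing mentioned above amounts to taking everything after /api and splitting it on "/". A rough sketch of that step in Python, outside of ColdBox (the function name path_values and the default prefix are assumptions for the example; in a real application this logic would live in the interceptor):

```python
from typing import List
from urllib.parse import unquote, urlsplit


def path_values(url: str, prefix: str = "/api") -> List[str]:
    """Return the path segments that follow `prefix`, in order."""
    path = urlsplit(url).path
    # Only accept the prefix itself or the prefix followed by "/".
    if path != prefix and not path.startswith(prefix + "/"):
        return []
    rest = path[len(prefix):]
    # Drop empty pieces produced by leading, trailing or doubled slashes.
    return [unquote(part) for part in rest.split("/") if part]


print(path_values("http://localhost/api/variable1/variable2/variable3"))
```

The result is positional, which is exactly why the order of the incoming values has to be known ahead of time.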
http://wiki.coldbox.org/wiki/URLMappings.cfm#URL_Mappings
http://wiki.coldbox.org/wiki/Interceptors.cfm#Core_Interception_Points
http://en.wikipedia.org/wiki/Inner-platform_effect

Custom CodeIgniter route regex

I'm trying to improve a custom CodeIgniter route regex that I have created. The purpose of the custom route is to create a cleaner/shorter URL for client profile pages, which have the format clients/client-slug, for example: clients/acme-inc. I only want this route to match if there are no additional segments after the client-slug segment, and if the client-slug value does not match any of the 'reserved' values which correspond to actual methods/routes in the Clients controller. Currently, this is what I'm using:
$route['clients/(?!some_method|another_method|foo|bar)(.+)'] = 'clients/index/$1';
This mostly works, except when a client-slug begins with the text of one of the reserved methods, e.g. clients/food-co: because it starts with clients/foo, the custom route is not matched. So I need the route to allow a slug that starts with one of the reserved method names ONLY IF the name is followed by additional characters other than a /.
Have you tried this?
$route['clients/(?!(?:some_method|another_method|foo|bar)(?:/|$))(.+)'] = 'clients/index/$1';
You should consider the _remap() method in the future. It will allow you to update your controller and add new methods without needing to update your route (you actually wouldn't need a route at all, so long as your URI matches the controller name).
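To see what the extra (?:/|$) inside the lookahead changes, here is the same pattern exercised with Python's re module rather than CodeIgniter. The helper name client_slug is made up, and fullmatch stands in for the anchoring that the router applies around a route.

```python
import re

RESERVED = ("some_method", "another_method", "foo", "bar")

# A reserved name is rejected only when it is the whole segment,
# i.e. when it is followed by "/" or by the end of the URI.
PATTERN = re.compile(
    r"clients/(?!(?:" + "|".join(RESERVED) + r")(?:/|$))(.+)"
)


def client_slug(uri):
    """Return the captured slug, or None when the route does not apply."""
    m = PATTERN.fullmatch(uri)
    return m.group(1) if m else None


for uri in ("clients/acme-inc", "clients/food-co", "clients/foo", "clients/foo/123"):
    print(uri, "->", client_slug(uri))
```

Note that (.+) still accepts further segments after the slug. If those should be rejected, swapping it for ([^/]+) is one option.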

Django reverse routing - solution to case where reverse routing is less specific than forward routing

I have a route defined as follows:
(r'^edit/(\d+)/$', 'app.path.edit')
I want to use the reverse function as follows:
url = reverse('app.path.edit', args=('-id-',))
The generated url gets passed to a js function, and client side code will eventually replace '-id-' with the correct numeric id. This of course won't work, because the 'reverse' function won't match the route, because the url is defined as containing a numeric argument.
I can change the route to accept any type of argument as follows, but then I lose some specificity:
(r'^edit/(.+)/$', 'app.path.edit')
I could create a separate url for each item being displayed, but I'll be displaying many items in a list, so it seems like a waste of bandwidth to include the full url for each item.
Is there a better strategy to accomplish what I want to do?
You can rewrite regexp like this:
(r'^edit/(\d+|-id-)/$', 'app.path.edit')
but I generally prefer this:
(r'^edit/([^/]+)/$', 'app.path.edit') # you can still differ "edit/11/" and "edit/11/param/"
Usually you will need to check that the entity exists anyway, with the get_object_or_404 shortcut or similar, so the only downside is that you have to be more careful with incoming data, as the id can contain almost any characters.
In my opinion, an easier solution would be to keep the original url and then pass the value '0' instead of '-id-'. In the client side then you replace '/0/' with the correct id. I think this is better because it doesn't obscure the url routing, and you don't lose specificity.
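The trade-offs between these patterns can be checked with plain re in Python, without Django's resolver. The helper names accepts and fill_id are made up for the example, and fill_id only mimics in Python the replacement the client-side code would do.

```python
import re

STRICT = re.compile(r"^edit/(\d+)/$")                   # original route
WITH_PLACEHOLDER = re.compile(r"^edit/(\d+|-id-)/$")    # digits or the literal -id-
ONE_SEGMENT = re.compile(r"^edit/([^/]+)/$")            # any single segment


def accepts(pattern, path):
    """True when the URL pattern matches the given path."""
    return pattern.match(path) is not None


def fill_id(url, real_id):
    """Swap the '0' placeholder segment for the real id."""
    return url.replace("/0/", f"/{real_id}/", 1)


print(accepts(STRICT, "edit/-id-/"))            # the original problem
print(accepts(WITH_PLACEHOLDER, "edit/-id-/"))
print(accepts(ONE_SEGMENT, "edit/11/param/"))   # extra segments are still rejected
print(fill_id("/edit/0/", 42))
```

With the '0' placeholder the strict \d+ route can stay as it is, since '0' is itself numeric.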

Match all characters in group except for first and last occurrence

Say I request
parent/child/child/page-name
in my browser. I want to extract the parent, children as well as page name. Here are the regular expressions I am currently using. There should be no limit as to how many children there are in the url request. For the time being, the page name will always be at the end and never be omitted.
^([\w-]{1,}){1} -> Match parent (returns 'parent')
(/(?:(?!/).)*[a-z]){1,}/ -> Match children (returns /child/child/)
[\w-]{1,}(?!.*[\w-]{1,}) -> Match page name (returns 'page-name')
The more I play with this, the more I feel how clunky this solution is. This is for a small CMS I am developing in ASP Classic (:(). It is sort of like MVC routing paths, but instead of calling controllers and functions based on the URL request, I would be travelling down the hierarchy and finding the appropriate page in the database. The database uses the nested set model and is linked by a unique page name for each child.
I have tried using the split function with a / delimiter, but I found I was nesting so many split statements together that it became very unreadable.
All said, I need an efficient way to parse out the parent, children as well as page name from a string. Could someone please provide an alternative solution?
To be honest, I'm not even sure if a regular expression is the best solution to my problem.
Thank you.
You could try using:
^([\w-]+)(/.*/)([\w-]+)$
And then access the three matching groups created using Match.SubMatches. See here for more details.
EDIT
Actually, assuming that you know that [\w-] is all that is used in the names of the parts, you can use ^([\w-]+)(/(?:[\w-]+/)*)([\w-]+)$ instead and it will handle the no-child case fine by itself as well.
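As an alternative to a regular expression, a single split is enough when the first segment is always the parent and the last is always the page name. A short sketch in Python to show the shape of it (the function name split_page_path is made up, and the same slicing would need translating to array indexing in ASP Classic):

```python
from typing import List, Tuple


def split_page_path(path: str) -> Tuple[str, List[str], str]:
    """Split "parent/child/.../page-name" into (parent, children, page name)."""
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected at least a parent and a page name")
    # First item is the parent, last is the page, everything between is children.
    return parts[0], parts[1:-1], parts[-1]


print(split_page_path("parent/child/child/page-name"))
print(split_page_path("parent/page-name"))
```

One split followed by indexing avoids the nested split calls, and the list of children can then be walked in order against the nested set.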

How to solve two REST problems: the interface document; loss of privacy in descriptive URLs

Coming from a lot of frustrating times with WSDL/SOAP, I very much like the REST paradigm, but am trying to solve two basic problems in our application before moving over to REST. The first problem relates to the lack of an interface document. I think I finally see how to handle this situation: one can query his way down from a top-level "/resources" resource using various requests of GET, HEAD, and OPTIONS to find the one needed resource in the correct hypermedia format. Is this the idea? If so, the client need only be provided with a top-level resource URI: http://www.mywebservicesite.com/mywebservice/resources. He will then have to do some searching and possibly keep track of what he is discovering, so that he can use the URIs again efficiently in the future to do GETs, POSTs, PUTs, and DELETEs. Are there any thoughts on what should happen here?
The other problem is that we cannot use descriptive URLs like /resources/../customer/Madonna/phonenumber. We do have an implementation of opaque URLs we use in the context of a session, and I'm wondering how opaque URLs might be applied to REST. The general problem is how to keep domain-specific details out of URLs, and still benefit from what REST has to offer.
The other problem is that we cannot use descriptive URLs like /resources/../customer/Madonna/phonenumber.
I think you've misunderstood the point of opaque URIs. The notion of opaque URIs is with respect to clients: a client shall not decipher a URI to guess anything of semantic meaning from it. So a service may well have URIs like /resources/.../customer/Madonna/phonenumber, and that's quite a good idea. Clients should treat the URIs as opaque: they should not infer from the URI that it represents Madonna's phone number, or that Madonna is a customer of some sort. That knowledge can only be obtained by looking inside the resource itself, or perhaps by remembering where the URI was discovered.
Edit:
A consequence of this is that navigation should happen by links, not by deconstructing the URI. So if you see /resources/customer/Madonna/phonenumber (and it actually represents customer Madonna's phone number), you should have links in that resource that point to the Madonna resource, e.g.
{
"phone_number" : "01-234-56",
"customer_URI": "/resources/customer/Madonna"
}
That's the only way to navigate from a phone number resource to a customer resource. An important aspect is that the server implementation might or might not have domain-specific information in the URI. The Madonna record might just as well live somewhere else: /resources/customers/byid/81496237. This is why clients should treat URIs as opaque.
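A small Python sketch of what following the link looks like from the client's side. The in-memory RESOURCES dict stands in for the server, and the "name" field and the function names are made up for the example; only the phone number document comes from the text above.

```python
import json

# Hypothetical stand-in for the server: URI -> JSON representation.
RESOURCES = {
    "/resources/customer/Madonna/phonenumber": json.dumps({
        "phone_number": "01-234-56",
        "customer_URI": "/resources/customer/Madonna",
    }),
    "/resources/customer/Madonna": json.dumps({"name": "Madonna"}),
}


def get(uri):
    """Pretend GET: fetch and decode the representation at `uri`."""
    return json.loads(RESOURCES[uri])


def customer_for_phone(phone_uri):
    """Navigate via the link inside the document, never by cutting up phone_uri."""
    phone = get(phone_uri)
    return get(phone["customer_URI"])


print(customer_for_phone("/resources/customer/Madonna/phonenumber"))
```

Because the client only reads customer_URI, the server could move the customer record to a different URI and this code would keep working unchanged.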
Edit 2:
Another question you have (in the comments) is how a client, which is required to have no knowledge of the server's URIs, is supposed to be able to find anything. A service can help clients find resources in the following ways:
1. Provide a search interface. This could be done by providing an OpenSearch description document, which tells clients how to search for items. An OpenSearch template can include several variables, and several endpoints, depending on what you're looking for. So if you have a "customer ID" that's unique, you could have the following template: "/customers/byid/{proprietary:customerid}". The customerid element needs to be documented somewhere, inside the proprietary namespace. A client can then know how to use such a template.
2. Provide a custom form. This implies making a custom media type in which you explicitly define how (based on an instance of the document) a URI to a customer can be forged, e.g. <customers template="/customers/byid/{id}"/>. The documentation (for the media type) would have to state that the template attribute must be interpreted as a relative URI after substituting an actual customer ID for the string "{id}".
3. Provide links to all resources. Some resources aren't innumerable, so you can simply make a link to each and every one of them, optionally including identifying information along with the links. This could also be done in a custom media type: <customer id="12345" href="/customer/byid/12345"/>.
It should be noted that #1 and #2 are two ways of saying the same thing: clients are allowed to create URIs if
they haven't got the URI structure a priori, and
a media type exists for which the documentation states that URIs should be created.
This is much the same way as a web browser has no idea of any URI structure on the web, except for the rules laid out in the definition of HTML forms, to add a ? and then all the query parameters separated by &.
In theory, if you have a customer with id 12345, then you could actually dispense with the href, since you could plug the customer id 12345 into #1 or #2. It's more common to actually provide real links between resources, rather than always relying on lookup or search techniques.
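To make the template idea concrete, here is a minimal sketch in Python of a client expanding a template it received from the server, such as the "/customers/byid/{id}" value in the custom form option. The function name expand is made up, and this is plain string substitution, not an implementation of OpenSearch or of any URI template standard.

```python
from urllib.parse import quote


def expand(template: str, **values: object) -> str:
    """Fill "{name}" placeholders in a server-supplied template."""
    uri = template
    for name, value in values.items():
        # Percent-encode the value so it cannot add path segments of its own.
        uri = uri.replace("{" + name + "}", quote(str(value), safe=""))
    return uri


# The template string comes from the server's document, not from client code.
print(expand("/customers/byid/{id}", id=12345))
```

The client knows only the substitution rule documented for the media type; the path structure itself stays under the server's control.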
I haven't really used web RPC systems (WSDL/SOAP), but I think the 'interface document' is there mostly to allow client libraries to create the service API, right? If so, REST shouldn't need it, because the verbs are already defined and don't really need to be documented again.
AFAIUI, the REST way is to document the structure of each resource (usually encoded in XML or JSON). In that document, you'll also have to document the relationships between those resources. In my case, a resource is often a container of other resources (sometimes more than one type), so the structure doc specifies which field holds a list of URLs pointing to the contained resources. Ideally, only one resource will need a single, fixed (documented) URL. Everything else follows from there.
The URL 'style' is meaningless to the client, since it shouldn't 'construct' a URL. Every URL it needs should already be present in a resource field. That lets you change the URL structure without changing the client (that has saved me tons of time). Your URLs can be as opaque or as descriptive as you like. (Personally, I don't like text keys or slugs; my keys are all BIGINTs or UUIDs.)
I am currently building a REST "agent" that addresses the first part of your question. The agent offers a temporary bookmarking service. The client code that is interacting with the agent can request that an URL be bookmarked using some identifier. If the client code needs to retrieve that representation again, it simply asks the agent for the url that corresponds to the saved bookmark and then navigates to that bookmark. Currently those bookmarks are not persisted so they only last for the lifetime of the client application, but I have found it a useful mechanism for accessing commonly used resources. E.g. The root representation provides a login link. I bookmark that link and if the client ever receives a 401 then I can redirect to the "login" bookmark.
To address an issue you mentioned in a comment, the agent also has the ability to store retrieved representations in a dictionary. If it becomes necessary to aggregate and manipulate multiple representations at the same time, then I can simply request that the agent store the current representation in a dictionary associated with a key and then continue navigating to the next resource. Once the client has accumulated all the necessary representations, it can do what it needs to do.
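A rough sketch in Python of the two agent features described above, in-memory bookmarks and a dictionary of stored representations. Every name here (Agent, bookmark, next_url, store, the example URLs) is hypothetical and only illustrates the shape of the idea, not the actual agent.

```python
class Agent:
    """Keeps bookmarks and stored representations for the life of the client."""

    def __init__(self):
        self._bookmarks = {}   # identifier -> URL
        self._stored = {}      # key -> representation

    def bookmark(self, name, url):
        self._bookmarks[name] = url

    def bookmarked_url(self, name):
        return self._bookmarks[name]

    def next_url(self, status, requested_url):
        """On a 401, go to the bookmarked login link instead of the requested URL."""
        if status == 401 and "login" in self._bookmarks:
            return self._bookmarks["login"]
        return requested_url

    def store(self, key, representation):
        self._stored[key] = representation

    def stored(self, key):
        return self._stored[key]


agent = Agent()
agent.bookmark("login", "/session/login")   # link found in the root representation
agent.store("customer", {"name": "Madonna"})
print(agent.next_url(401, "/orders"))
```

Nothing is persisted, so both dictionaries disappear when the client application exits, matching the temporary bookmarks described above.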