I have a question about generating example URLs from Django url and path definitions.
For example if I have 2 urls like:
path('example/<int:example_id>/', views.some_view, name='example')
url(r'^example/(?P<example_id>[0-9]+)/$', views.some_view, name='example')
Is it possible to generate, with Django's built-in tools, full example URLs for tests, such as:
example/234/
example/93/
example/228/
etc…
randomly.
For different urls based on concrete name + regex or converter parameters?
In other words where in source code Django understands that <int:example_id> is an integer and so on. If I have something like : 1) I expect example/ 2) I expect int - I would be able to use it to generate random example url.
Hopefully this is clear...
These urls are just an example. Real urls will differ.
Thank you.
In other words where in source code Django understands that <int:example_id> is an integer and so on.
This is in essence looking for <…:…> patterns in the path. Django furthermore has a set of path converters. The standard ones are int, str, slug, uuid and path, but you can define your own. Indeed, the Django documentation has a section on registering custom path converters that explains how you can design your own path converters.
What path(…) basically does is construct a regex. It thus searches for patterns like <int:example_id>. Since you used int here, the IntConverter [GitHub] is used:
class IntConverter:
    regex = '[0-9]+'

    def to_python(self, value):
        return int(value)

    def to_url(self, value):
        return str(value)
Here it will thus replace the <int:example_id> by (?P<example_id>[0-9]+), where the regex attribute of the converter is used as the body of the named group.
A path(…) does not only construct a regex. It also automatically maps the captured items to Python objects by calling .to_python(…). This means that if you work with re_path(r'^example/(?P<example_id>[0-9]+)/$', …), then example_id will be a string, but for path('example/<int:example_id>/', …), it will automatically call int(…) on the captured string, and thus pass an int object. Often that does not make much difference.
This also holds in the other direction, when you generate a URL: you can pass a value, and Django will call .to_url(…) on the converter to turn it into the string representation used in the generated URL. This means that you can write custom path converters that perform more sophisticated conversions.
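The translation described above can be sketched in a few lines of plain Python. This is a simplified illustration, not Django's actual implementation: the CONVERTERS table, route_to_regex and resolve are hypothetical names, and only three converters are modelled.

```python
import re

# Simplified stand-ins for three of Django's built-in converters:
# the regex each one contributes and the cast applied to the captured text.
CONVERTERS = {
    "int": (r"[0-9]+", int),
    "str": (r"[^/]+", str),
    "slug": (r"[-a-zA-Z0-9_]+", str),
}

# Matches placeholders such as <int:example_id>.
PLACEHOLDER = re.compile(r"<(?P<conv>[^>:]+):(?P<name>[^>]+)>")


def route_to_regex(route):
    """Turn 'example/<int:example_id>/' into an anchored regex string."""
    parts, last = ["^"], 0
    for m in PLACEHOLDER.finditer(route):
        parts.append(re.escape(route[last:m.start()]))  # literal text
        parts.append("(?P<%s>%s)" % (m["name"], CONVERTERS[m["conv"]][0]))
        last = m.end()
    parts.append(re.escape(route[last:]))
    parts.append("$")
    return "".join(parts)


def resolve(route, path):
    """Return the converted keyword arguments, or None if path does not match."""
    match = re.match(route_to_regex(route), path)
    if match is None:
        return None
    casts = {m["name"]: CONVERTERS[m["conv"]][1] for m in PLACEHOLDER.finditer(route)}
    return {name: casts[name](value) for name, value in match.groupdict().items()}


print(resolve("example/<int:example_id>/", "example/234/"))  # {'example_id': 234}
```

The captured text "234" arrives in the result as the integer 234, which mirrors the difference between path(…) and re_path(…) described above.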
I expect example/ 2) I expect int - I would be able to use it to generate random example url.
Well, eventually it boils down to a regular expression. A regular expression can be converted into a finite state machine, so we can generate random valid strings. For example, exrex [GitHub] is capable of generating all valid strings (this is often a generator that will keep proposing new strings), or random strings that match the regular expression. Matching the path(…) does not, however, guarantee that the URL makes sense for the view. If you for example use this int as the primary key of a Post object the view fetches, then for random URLs it is likely that you will generate URLs for non-existing Posts.
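For the simple case where the route only uses the standard converters, a sketch with the standard library alone might look as follows. The GENERATORS table and random_url are hypothetical helpers, not part of Django, and the value ranges (integers up to 999, eight-letter strings) are arbitrary assumptions.

```python
import random
import re
import string

# Hypothetical random-value generators, keyed by converter name.
GENERATORS = {
    "int": lambda rng: str(rng.randint(0, 999)),
    "str": lambda rng: "".join(rng.choice(string.ascii_lowercase) for _ in range(8)),
    "slug": lambda rng: "-".join(
        "".join(rng.choice(string.ascii_lowercase) for _ in range(4)) for _ in range(2)
    ),
}

# Matches placeholders such as <int:example_id>.
PLACEHOLDER = re.compile(r"<(?P<conv>[^>:]+):(?P<name>[^>]+)>")


def random_url(route, rng=random):
    """Replace every <converter:name> placeholder with a random matching value."""
    return PLACEHOLDER.sub(lambda m: GENERATORS[m["conv"]](rng), route)


# Prints five URLs of the form example/<number>/.
for _ in range(5):
    print(random_url("example/<int:example_id>/"))
```

As noted above, such URLs only match the pattern; whether the view finds an object behind them is a separate question.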
I want to pass into a variable, the language of the user.
But my client can't pass (or didn't pass) this information through the dataLayer. So the only solution I have is to use the URL path.
Indeed - The structure is:
http://www.website.be/en/subcategory/subsubcategory
I want to extract the "en" part.
I have no idea how to get this. I checked on Stack and on Google; some people talk about regex, others about Custom JS, but nothing worked for my specific setup.
Do you have an idea how to proceed on this point ?
Many thanks !!
Ludo
Make sure the built-in {{Page Path}} variable is enabled. Create a Custom JavaScript variable:
function() {
  var parts = {{Page Path}}.split("/");
  return parts[1];
}
This splits the path by the path delimiter "/" and gives you an array with the parts. Since the page path has a leading slash (I think), the first part is empty, so you return the second one (since array indexing starts with 0 the second array element has the index 1).
This might need a bit of refinement (for pages that do not start with a language signifier, if any), but that's the basic idea.
Regex is an alternative (via the regex table variable), but the above solution is a little easier to implement.
Say your program can import URL paths for seeding a crawl.
A user wants to define a pattern that should function as a seed, e.g.
http://example\.com/mypage-[0-9][0-9][0-9]?/jump(suit|er)/
It could just be a simplified version of regex syntax if need be, but the user would have to be able to enter something like the above.
From the above my software would then need to generate a long list like:
http://example.com/mypage-0/jumpsuit/
http://example.com/mypage-0/jumper/
http://example.com/mypage-1/jumpsuit/
http://example.com/mypage-1/jumper/
...
http://example.com/mypage-998/jumper/
http://example.com/mypage-999/jumper/
Is there anything around for Delphi that can do what I want? Or am I missing an obvious way of achieving what I want that does not require writing a regex parser from scratch? :)
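One way to avoid a full regex parser is to describe the pattern as a sequence of slots, each holding a finite list of alternatives, and take the Cartesian product. The idea is language-neutral; the sketch below uses Python for illustration, and the slot layout and the expand helper are assumptions, not an existing library.

```python
from itertools import product


def expand(slots):
    """Yield every string formed by picking one alternative from each slot."""
    for combo in product(*slots):
        yield "".join(combo)


# The example pattern, written out as slots by hand.
slots = [
    ["http://example.com/mypage-"],
    [str(n) for n in range(1000)],  # 0 .. 999
    ["/jump"],
    ["suit", "er"],                 # (suit|er)
    ["/"],
]

urls = list(expand(slots))
print(len(urls))   # 2000
print(urls[0])     # http://example.com/mypage-0/jumpsuit/
print(urls[-1])    # http://example.com/mypage-999/jumper/
```

A small front end would still be needed to turn the user's text into such slots, but that is much less work than general regex parsing.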
I have a user access log like this:
pagename    url
broker_pv   /broker/934832
broker_pv   /broker/983432
broker_pv   /broker/n/342349
listing_pv  /listing/a1-b2/
listing_pv  /listing/c3/
I want to find out whether a future url "/broker/245729" belongs to "broker_pv" or "listing_pv", or doesn't match at all.
It's like a machine learning process: I feed the computer some raw data, it learns, and then it helps me filter things.
One way to do it that I can think of is a "pattern finder" process, i.e., from the raw input we humans can deduce that "broker_pv" urls will match the pattern "/broker/(n/)?[0-9]+". So when a url like "/broker/245729" comes in, I can test all the patterns against it and judge which "pagename" it belongs to.
Then the question is, how to find out these patterns and thus build up a "pagename-pattern pair collection" for further use.
Or there's a better way, hopefully?
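The matching half of this idea, testing a new url against a "pagename-pattern pair collection", can be sketched as follows. The broker_pv pattern is the one given above; the listing_pv pattern is an assumption guessed from the two sample rows, and discovering the patterns automatically is not covered here.

```python
import re

# Hand-written "pagename-pattern pair collection".
PATTERNS = [
    ("broker_pv", re.compile(r"/broker/(n/)?[0-9]+")),
    ("listing_pv", re.compile(r"/listing/[a-z0-9-]+/")),  # assumed from the samples
]


def classify(url):
    """Return the first pagename whose pattern matches the whole url, else None."""
    for pagename, pattern in PATTERNS:
        if pattern.fullmatch(url):
            return pagename
    return None


print(classify("/broker/245729"))  # broker_pv
print(classify("/listing/c3/"))    # listing_pv
print(classify("/agent/12"))       # None
```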
I'm using the Moovweb SDK and using Tritium. I want my mobile site to behave like my desktop site. I have different URLs pointing to my homepage. Should I use regex? A common element? And what's the best syntax for matching the path?
The mappings.ts file in the scripts directory is where particular pages are matched. The file is imported in html.ts and allows us to say "when a certain page is matched, make the following transformations."
Most projects already have a mappings file generated. A simple layout looks like this:
match($path) {
  with(/home/) {
    log("--> Importing pages/home.ts in mappings.ts")
    #import pages/home.ts
  }
}
Every time you start working on a new page, you need to set up a new "map".
First: Match with a unique path
The Tritium above matches the path for the homepage. The path is the bit of a URL after the domain. For example, in www.example.com/search/item, "www.example.com" is the domain and "search/item" is the path.
The /home/ matcher specifies the "home" part as a regular expression. You could also use a plain string if necessary:
with("home")
If Tritium matches the path with the matcher, it will import the home page.
It's probably true that the homepage of a site doesn't actually contain the word home. For most sites the homepage is the URL with nothing after the domain. A better string matcher could be:
match($path) {
  with ("/")
}
Or, using regex:
with(/index|^\/$/) {
As you can see, the with() function of the mappings file is where knowledge of regex can really come in handy. Check out our short guide on regex. Sometimes it will be simpler, such as (/search/).
Remember to match on the most unique aspect of the URL possible. If two with() functions match the same URL, then the one that appears first in the mappings file will be used. If you cannot find a unique URL matcher for different page types, you may have to match via other means.
Why Use Regex?
It might seem easier to use a string rather than a regex matcher. However, regex provides a lot more flexibility over which URLs are matched.
For example, a site could use a string of numbers in its product page URLs. Using a normal string matcher would not be practical - you'd have to list out all the numbers possible for all the items on the site. An easier way would be to use regex to say, "If there's a string of 5 digits, continue!" (The code for matching 5 digits: /\d{5}/.)
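The behaviour of these matchers can be tried out outside Tritium. The sketch below uses Python's re module as a stand-in for the Tritium matcher, which is an assumption made purely for illustration; the MATCHERS list and page_for are hypothetical names. It shows both the 5-digit pattern and the rule that the first matching entry wins.

```python
import re

# Ordered like the with() blocks in a mappings file: first match wins.
MATCHERS = [
    ("home", re.compile(r"index|^/$")),
    ("product", re.compile(r"\d{5}")),
    ("search", re.compile(r"search")),
]


def page_for(path):
    """Return the name of the first matcher found anywhere in the path."""
    for page, pattern in MATCHERS:
        if pattern.search(path):
            return page
    return None


print(page_for("/"))              # home
print(page_for("/items/12345"))   # product
print(page_for("/search/item"))   # search
print(page_for("/search/12345"))  # product, because it is listed before search
```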
Second: Log the match
When matching a particular path, you should also use log() statements so you know exactly what's getting imported. The log statement will be printed in the command line window, so you can see if your regular expression accurately matches your path.
match($path) {
  with(/index|^\/$/) {
    log("--> importing pages/home.ts in mappings.ts")
  }
}
Third: Import the file
Finally, use the #import function to include the page-specific Tritium file.
match($path) {
  with(/index|^\/$/) {
    log("--> importing pages/home.ts in mappings.ts")
    #import pages/home.ts
  }
}
So in django we write
Entry.objects.filter(blog__id=3)
Which looks ugly because sometimes there are too many underscores
Entry.objects.filter(blog_something_underscore_too_many_id=3)
Why can't Django use syntax like
[entry.objects if blog.id=3 ]
?
I am not an expert on this, but why must it be a double underscore? Could there be a more elegant way to write this within Python's grammar?
Django runs on Python, which sets some basic constraints when it comes to syntax, making the following suggested syntax impossible (Python does not allow much re-definition of basic syntax):
[entry.objects if blog.id=3 ]
Also, "blog" and "id" are not objects, they refer to names in the database, so addressing these as blog.id is also problematic. Unless of course it is entered as a string, which is actually what is being done, seeing as keyword arguments are passed as a dictionary in Python. It could of course be done in other ways; here is an example of how to use dots as separators:
def dotstyle(lookups):
    # Translate dotted keys such as "blog.id" into Django's "blog__id".
    retdict = {}
    for key, value in lookups.items():
        retdict[key.replace(".", "__")] = value
    return retdict

Entry.objects.filter(**dotstyle({"blog.id": 3}))
By incorporating this into the filter function in Django, we could do away with the dotstyle function and the awkward **, but we would still be left with the dictionary braces, which is probably why they went with the double underscores instead.
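The syntax constraint can be checked directly without Django. In this sketch filter_kwargs and is_valid_call are hypothetical helpers: the first stands in for any function that accepts keyword arguments, the second asks the Python compiler whether a call expression is legal.

```python
def filter_kwargs(**kwargs):
    """Stand-in for a function like filter(): return the keywords it received."""
    return kwargs


def is_valid_call(source):
    """Return True if the expression compiles, False on a SyntaxError."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(is_valid_call("filter_kwargs(blog__id=3)"))  # True
print(is_valid_call("filter_kwargs(blog.id=3)"))   # False: a keyword must be a plain name

# A dotted key only gets through as a string inside a dictionary.
print(filter_kwargs(**{"blog.id": 3}))             # {'blog.id': 3}
```

A keyword name has to be a valid identifier, and a double underscore is one of the few separators that keeps it valid.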