Django ugettext_lazy, interpolation and ChoiceField - django

I want a ChoiceField with these choices:
choices = [(1, '1 thing'),
           (2, '2 things'),
           (3, '3 things'),
           ...]
and I want to have it translated.
This does not work:
choices = [(i, ungettext_lazy('%s thing', '%s things', i) % i) for i in range(1,4)]
because as soon as the lazy object is interpolated, it becomes a unicode object - since ChoiceField.choices is evaluated at startup, its choices will be in the language active during Django's startup.
I could use ugettext_lazy('%s things' % i), but that would require a translation for each numeral, which is silly. What is the right way to do this?

In the Django documentation, Translation… Working with lazy translation objects, I see a remark which seems to address your concern here.
Using ugettext_lazy() and ungettext_lazy() to mark strings in models and utility functions is a common operation. When you're working with these objects elsewhere in your code, you should ensure that you don't accidentally convert them to strings, because they should be converted as late as possible (so that the correct locale is in effect). This necessitates the use of the helper function described next.
Then they present django.utils.functional.lazy(func, *resultclasses), which is not presently covered by the django.utils.functional module documentation. However, according to the django.utils.functional.py source code, it "Turns any callable into a lazy evaluated callable. … the function is evaluated on every access."
Modifying their example from Other uses of lazy in delayed translations to incorporate your code, the following code might work for you.
from django.utils import six  # Python 3 compatibility
from django.utils.functional import lazy
from django.utils.safestring import mark_safe
from django.utils.translation import ungettext_lazy

choices = [
    (i, lazy(
        mark_safe(ungettext_lazy('%s thing', '%s things', i) % i),
        six.text_type
    ))  # lazy()
    for i in range(1, 4)
]
Also, the django.utils.functional module documentation does mention a function decorator allow_lazy(func, *resultclasses). This lets you write your own function which takes a lazy string as arguments. "It modifies the function so that if it's called with a lazy translation as the first argument, the function evaluation is delayed until it needs to be converted to a string." lazy(func, *resultclasses) is not a decorator, it modifies a callable.
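For illustration, here is a minimal, untested sketch modelled on the documentation's allow_lazy example; the function name and body are placeholders, not something from your code:

from django.utils import six
from django.utils.functional import allow_lazy

def fancy_utility_function(s, *args, **kwargs):
    # Do some conversion on the string 's' (placeholder body).
    return s.upper()

# If called with a lazy translation as its first argument, evaluation is
# delayed until the result actually needs to be converted to a string.
fancy_utility_function = allow_lazy(fancy_utility_function, six.text_type)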
N.B. I haven't tried this code in Django. I'm just passing along what I found in the documentation. Hopefully it will point you to something you can use.

For those who encounter this question: unfortunately, @Jim DeLaHunt's answer doesn't completely work - it's almost there, but not exactly what needs to be done.
The important distinctions are:
What you need to wrap with lazy is a function that returns a text value, not another lazy translation object, or you'll likely see weird <django.utils.functional.__proxy__ at ...> instead of the actual text (IIRC Django won't go deep down the chain of lazy objects). So, use ungettext, not ungettext_lazy.
You want to do string interpolation only when the wrapped function runs. If you write lazy(f("%d" % 42)) the interpolation would happen too early - in this case Python evaluates eagerly. And don't forget about variable scopes - you can't just refer to the iterator from the wrapped function.
Here, I've used a lambda that receives a number argument and does the interpolation. The code inside the lambda is only executed when the lazy object is evaluated, that is, when the choice is rendered.
So, the working code is:
choices = [
    (i, lazy(
        lambda cnt: ungettext(u"%(count)d thing",
                              u"%(count)d things", cnt) % {"count": cnt},
        six.text_type
    )(i))
    for i in [1, 2, 3]
]
This will correctly have the same intended effect as
choices = [
    (1, _("1 thing")),
    (2, _("2 things")),
    (3, _("3 things")),
]
But there will be just a single entry for this in the translation database, not multiple ones.
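For completeness, a minimal sketch (untested, field name made up) of how such a choices list might be used in a form:

from django import forms

class ThingForm(forms.Form):
    # The list itself is built at import time, but each label is a lazy
    # object that is only turned into text when the field is rendered.
    thing_count = forms.ChoiceField(choices=choices)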

This looks like a situation where you could benefit from the trick taught by Ilian Iliev's blog, Django forms ChoiceField with dynamic values….
Iliev shows a very similar initialiser:
my_choice_field = forms.ChoiceField(choices=get_my_choices())
He says, "the trick is that in this case my_choice_field choices are initialized on server (re)start. Or in other words once you run the server the choices are loaded(calculated) and they will not change until next (re)start." Sounds like the same difficulty you are encountering.
His trick is: "fortunately the form's class has an __init__ method that is called on every form load. Most of the time you skipped it in the form definition, but now you will have to use it."
Here is his sample code, blended with your generator expression:
class MyForm(forms.Form):
    def __init__(self, *args, **kwargs):
        super(MyForm, self).__init__(*args, **kwargs)
        self.fields['my_choice_field'] = forms.ChoiceField(
            choices=(
                (i, ungettext_lazy('%s thing', '%s things', i) % i)
                for i in range(1, 4)
            )  # choices=
        )  # __init__
The generator expression is enclosed in parentheses so that it is treated as a generator object, which is assigned to choices.
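As a hedged usage sketch (view and template names are made up), the point is that the choices are recomputed every time the form is instantiated, so the language active for the current request is used:

from django.shortcuts import render

def my_view(request):
    # __init__ runs here, rebuilding the choices for this request's language.
    form = MyForm()
    return render(request, 'my_template.html', {'form': form})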
N.B. I haven't tried this code in Django. I'm just passing along Iliev's idea.

Related


Does python have the ability to create dynamic keywords?
For example:
qset.filter(min_price__usd__range=(min_price, max_price))
I want to be able to change the usd part based on a selected currency.
Yes, it does. Use **kwargs in a function definition.
Example:
def f(**kwargs):
    print kwargs.keys()

f(a=2, b="b")        # -> ['a', 'b']
f(**{'d'+'e': 1})    # -> ['de']
But why do you need that?
If I understand what you're asking correctly,
qset.filter(**{
    'min_price__' + selected_currency + '__range':
        (min_price, max_price)})
does what you need.
You can easily do this by declaring your function like this:
def filter(**kwargs):
your function will now be passed a dictionary called kwargs that contains the keywords and values passed to your function. Note that, syntactically, the word kwargs is meaningless; the ** is what causes the dynamic keyword behavior.
You can also do the reverse. If you are calling a function, and you have a dictionary that corresponds to the arguments, you can do
someFunction(**theDictionary)
There is also the lesser-used *foo variant, which causes your function to receive a tuple of positional arguments. This is similar to C-style varargs.
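For example (a tiny illustration in the same Python 2 style as above):

def g(*args):
    print args          # args is a tuple of the positional arguments

g(1, 2, 3)              # -> (1, 2, 3)
g(*[4, 5])              # unpacking a list at the call site -> (4, 5)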
Yes, sort of.
In your filter method you can declare a wildcard variable that collects all the unknown keyword arguments. Your method might look like this:
def filter(self, **kwargs):
    for key, value in kwargs.items():
        if key.startswith('min_price__') and key.endswith('__range'):
            currency = key.replace('min_price__', '').replace('__range', '')
            rate = self.current_conversion_rates[currency]
            self.setCurrencyRange(value[0]*rate, value[1]*rate)

Python - Classes and objects

This may appear like a very trivial question but I have just started learning python classes and objects. I have a code like below.
class Point(object):
    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

    def __str__(self):
        return '(' + str(self.x) + ',' + str(self.y) + ')'

def main():
    p1 = Point(pt1, pt2)
    p2 = Point(pt3, pt4)
    p3 = Point(pt5, pt6)
    p4 = Point(pt7, pt8)
    parray = [p1, p2, p3, p4]
    print " Points are", p1, p2, p3, p4
    print "parray", parray
I'm getting the below output:
Points are (4.0,2.0) (4.0,8.0) (4.0,-1.0) (100.0,1.0)
parray [<intersection.Point object at 0x7ff09f00a550>, <intersection.Point object at 0x7ff09f00a410>, <intersection.Point object at 0x7ff09f00a590>]
My question is: why does the array show the addresses of the objects, while printing the objects individually gives me their values?
Can someone suggest a way to get the values shown for the objects in the array in the main function?
When you print an object as an individual argument to a print statement in Python 2 or the print() function in Python 3, Python calls str on the object before printing it out.
When you put the object inside a container like a list and print the list, the list gets str called on it, but it in turn calls repr on each of the items it contains, rather than str. To understand why, look at the list [1, '2, 3', 4] and imagine what it would look like if the quotation marks were not included in the output when it was printed. The quotation marks are part of the '2, 3' string's repr.
So to make your class work the way you want, either rename your __str__ method to __repr__ (which will also work for str calls, since the default implementation of __str__ is to call __repr__), or add an additional __repr__ method. Sometimes it's useful to have a __repr__ that returns a less ambiguous string than __str__ does (for instance, it might name the class as well as the arguments). One common convention is to make __repr__ return a string that could be evaled to get an equivalent object again. For your class, that could look like:
def __repr__(self):
    return "{}({!r}, {!r})".format(type(self).__name__, self.x, self.y)
I'd also recommend using string formatting like this (or the older %s style if you prefer), rather than concatenating lots of strings together to build your result.
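With such a __repr__ in place, printing the list would look roughly like this (untested sketch using values from your output):

>>> parray = [Point(4, 2), Point(4, 8)]
>>> print parray
[Point(4.0, 2.0), Point(4.0, 8.0)]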
Python containers, e.g. lists, use an object's __repr__ method when printing their contents, not its __str__. Define __repr__ instead:
def __repr__(self):
    return '(' + str(self.x) + ',' + str(self.y) + ')'
If you want a more detailed explanation of __repr__ vs __str__ see here

Custom Django count filtering

A lot of websites will display:
"1.8K pages" instead of "1,830 pages"
or
"43.2M pages" instead of "43,200,123 pages"
Is there a way to do this in Django?
For example, the following code will generate the quantified amount of objects in the queryset (i.e. 3,123):
Books.objects.all().count()
Is there a way to add a custom count filter to return "3.1K pages" instead of "3,123 pages"?
Thank you in advance!
First off, I wouldn't do anything that alters the way the ORM portion of Django works. There are two places this could be done; if you are only planning on using it in one place, do it on the frontend. With that said, there are many ways to achieve this result. Just to spout off a few ideas: you could write a property on your model that calls count and then converts that to something a little more human-readable on the backend, or, if you want to do it on the frontend, you could find a JavaScript lib that does the conversion.
I will edit this later from my computer and add an example of the property.
Edit: To answer your comment, the easier one to implement depends on your skills in python vs in JavaScript. I prefer python so I would probably do it in there somewhere on the model.
Edit 2: I have written an example to show how I would add a classmethod on a base model, or on the model that you need these numbers on. I found a Python package called humanize, took its function that converts numbers to readable text, and modified it a bit to allow for thousands and to drop some of the super-large-number conversion.
from django.db import models

def readable_number(value, short=False):
    # Modified from the package `humanize` on PyPI.
    powers = [10 ** x for x in (3, 6, 9, 12, 15, 18)]
    human_powers = ('thousand', 'million', 'billion', 'trillion', 'quadrillion')
    human_powers_short = ('K', 'M', 'B', 'T', 'QD')
    try:
        value = int(value)
    except (TypeError, ValueError):
        return value
    if value < powers[0]:
        return str(value)
    for ordinal, power in enumerate(powers[1:], 1):
        if value < power:
            chopped = value / float(powers[ordinal - 1])
            chopped = format(chopped, '.1f')
            if not short:
                return '{} {}'.format(chopped, human_powers[ordinal - 1])
            return '{}{}'.format(chopped, human_powers_short[ordinal - 1])

class MyModel(models.Model):
    @classmethod
    def readable_count(cls, short=True):
        count = cls.objects.all().count()
        return readable_number(count, short=short)

print(readable_number(62220, True))  # Returns '62.2K'
print(readable_number(6555500))      # Returns '6.6 million'
I would stick that readable_number in some sort of utils and just import it in your models file. Once you have that, you can just stick that string wherever you would like on your frontend.
You would use MyModel.readable_count() to get that value. If you want it under MyModel.objects.readable_count() you will need to make a custom object manager for your model, but that is a bit more advanced.
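In case it helps, here is an untested sketch of that manager approach; the model, the manager name and the import path of readable_number are all made up for illustration:

from django.db import models

from myapp.utils import readable_number  # hypothetical home of the helper above

class BookManager(models.Manager):
    def readable_count(self, short=True):
        # COUNT(*) in the database, then formatted by the helper.
        return readable_number(self.get_queryset().count(), short=short)

class Book(models.Model):
    title = models.CharField(max_length=200)

    objects = BookManager()

# Usage: Book.objects.readable_count()  ->  e.g. '3.1K'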

Python - null object pattern with generators

It is apparently Pythonic to return values that can be treated as 'False' versions of the successful return type, such that if MyIterableObject: do_things() is a simple way to deal with the output whether or not it is actually there.
With generators, bool(MyGenerator) is always True even if it would have a len of 0 or something equally empty. So while I could write something like the following:
result = list(get_generator(*my_variables))
if result:
    do_stuff(result)
It seems like it defeats the benefit of having a generator in the first place.
Perhaps I'm just missing a language feature or something, but what is the pythonic language construct for explicitly indicating that work is not to be done with empty generators?
To be clear, I'd like to be able to give the user some insight as to how much work the script actually did (if any) - contextual snippet as follows:
# Python 2.7
templates = files_from_folder(path_to_folder)
result = list(get_same_sections(templates))  # returns generator
if not result:
    msg("No data to sync.")
    sys.exit()

for data in result:
    for i, tpl in zip(data, templates):
        tpl['sections'][i]['uuid'] = data[-1]

msg("{} sections found to sync up.".format(len(result)))
It works, but I think that ultimately it's a waste to change the generator into a list just to see if there's any work to do, so I assume there's a better way, yes?
EDIT: I get the sense that generators just aren't supposed to be used in this way, but I will add an example to show my reasoning.
There's a semi-popular 'helper function' in Python that you see now and again when you need to traverse a structure like a nested dict or what-have-you. Usually called getnode or getn, whenever I see it, it reads something like this:
def get_node(seq, path):
    for p in path:
        if p in seq:
            seq = seq[p]
        else:
            return ()
    return seq
So in this way, you can make it easier to deal with the results of a complicated path to data in a nested structure without always checking for None or try/except when you're not actually dealing with 'something exceptional'.
mydata = get_node(my_container, ('path', 2, 'some', 'data'))
if mydata:  # could also be "for x in mydata", etc.
    do_work(mydata)
else:
    something_else()
It's looking less like this kind of syntax would (or could) exist with generators, without writing a class that handles generators in this way as has been suggested.
A generator does not have a length until you've exhausted its iterations.
The only way to find out whether it has anything is to exhaust it:
items = list(myGenerator)
if items:
    # do something
Unless you write a class with a __nonzero__ method that internally looks at your list of items:
class MyGenerator(object):
    def __init__(self, items):
        self.items = items

    def __iter__(self):
        for i in self.items:
            yield i

    def __nonzero__(self):
        return bool(self.items)
>>> bool(MyGenerator([]))
False
>>> bool(MyGenerator([1]))
True
>>>
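If you also want len() to work (your msg(...) call uses len(result)), one possible variation, untested and with a made-up name, is a wrapper that materializes the iterable once:

class MaterializedIterable(object):
    def __init__(self, iterable):
        self.items = list(iterable)   # consume the generator once, up front

    def __iter__(self):
        return iter(self.items)

    def __nonzero__(self):            # truth testing on Python 2
        return bool(self.items)

    __bool__ = __nonzero__            # same behaviour on Python 3

    def __len__(self):
        return len(self.items)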

Is there a way to filter a django queryset based on string similarity (a la python difflib)?

I have a need to match cold leads against a database of our clients.
The leads come from a third-party provider in bulk (thousands of records) and sales is asking us to (in their words) "filter out our clients" so they don't try to sell our service to an established client.
Obviously, there are misspellings in the leads. Charles becomes Charlie, Joseph becomes Joe, etc. So I can't really just do a filter comparing lead_first_name to client_first_name, etc.
I need to use some sort of string similarity mechanism.
Right now I'm using the lovely difflib to compare the leads' first and last names to a list generated with Client.objects.all(). It works, but because of the number of clients it tends to be slow.
I know that most sql databases have soundex and difference functions. See my test of it in the update below - it doesn't work as well as difflib.
Is there another solution? Is there a better solution?
Edit:
Soundex, at least in my db, doesn't behave as well as difflib.
Here is a simple test - look for "Joe Lopes" in a table containing "Joseph Lopes":
with temp (first_name, last_name) as (
    select 'Joseph', 'Lopes'
    union
    select 'Joe', 'Satriani'
    union
    select 'CZ', 'Lopes'
    union
    select 'Blah', 'Lopes'
    union
    select 'Antonio', 'Lopes'
    union
    select 'Carlos', 'Lopes'
)
select first_name, last_name
from temp
where difference(first_name+' '+last_name, 'Joe Lopes') >= 3
order by difference(first_name+' '+last_name, 'Joe Lopes')
The above returns "Joe Satriani" as the only match. Even reducing the similarity threshold to 2 doesn't return "Joseph Lopes" as a potential match.
But difflib does a much better job:
difflib.get_close_matches('Joe Lopes', ['Joseph Lopes', 'Joe Satriani', 'CZ Lopes', 'Blah Lopes', 'Antonio Lopes', 'Carlos Lopes'])
['Joseph Lopes', 'CZ Lopes', 'Carlos Lopes']
Edit after gruszczy's response:
Before writing my own, I looked for and found a T-SQL implementation of Levenshtein Distance in the repository of all knowledge.
In testing it, it still won't do a better matching job than difflib.
Which led me to research what algorithm is behind difflib. It seems to be a modified version of the Ratcliff-Obershelp algorithm.
Unhappily I can't seem to find some other kind soul who has already created a T-SQL implementation based on difflib's... I'll try my hand at it when I can.
If nobody else comes up with a better answer in the next few days, I'll grant it to gruszczy. Thanks, kind sir.
soundex won't help you, because it's a phonetic algorithm. Joe and Joseph aren't similar phonetically, so soundex won't mark them as similar.
You can try Levenshtein distance, which is implemented in PostgreSQL. Maybe in your database too and if not, you should be able to write a stored procedure, which will calculate the distance between two strings and use it in your computation.
It's possible with trigram_similar lookups since Django 1.10; see the docs for PostgreSQL-specific lookups and full-text search.
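A minimal, untested sketch of what that can look like ('django.contrib.postgres' must be in INSTALLED_APPS and the pg_trgm extension installed; the Client model and field are placeholders):

from myapp.models import Client  # hypothetical model

# Rows whose last_name is trigram-similar to the search term.
Client.objects.filter(last_name__trigram_similar='Lopes')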
As per andilabs' answer, you can use the Levenshtein function to create your own custom function. The Postgres docs indicate that the levenshtein function comes in two forms:
levenshtein(text source, text target, int ins_cost, int del_cost, int sub_cost) returns int
levenshtein(text source, text target) returns int
andilabs' answer uses only the second form. If you want a more advanced search with insertion/deletion/substitution costs, you can rewrite the function like this:
from django.db.models import Func

class Levenshtein(Func):
    template = "%(function)s(%(expressions)s, '%(search_term)s', %(ins_cost)d, %(del_cost)d, %(sub_cost)d)"
    function = 'levenshtein'

    def __init__(self, expression, search_term, ins_cost=1, del_cost=1, sub_cost=1, **extras):
        super(Levenshtein, self).__init__(
            expression,
            search_term=search_term,
            ins_cost=ins_cost,
            del_cost=del_cost,
            sub_cost=sub_cost,
            **extras
        )
And call the function:
from django.db.models import F

Spot.objects.annotate(
    lev_dist=Levenshtein(F('name'), 'Kfaka', 3, 3, 1)  # ins = 3, del = 3, sub = 1
).filter(
    lev_dist__lte=2
)
If you need to get there with Django and Postgres and don't want to use the trigram similarity introduced in 1.10 (https://docs.djangoproject.com/en/2.0/ref/contrib/postgres/lookups/#trigram-similarity), you can implement it using Levenshtein like this:
Required extension: fuzzystrmatch
You need to add the Postgres extension to your db in psql:
CREATE EXTENSION fuzzystrmatch;
Let's define a custom function with which we can annotate a queryset. It takes just one argument, the search_term, and uses the Postgres levenshtein function (see docs):
from django.db.models import Func

class Levenshtein(Func):
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            expression,
            search_term=search_term,
            **extras
        )
Then, anywhere else in the project, we just import the Levenshtein class defined above and F to pass the Django field:
from django.db.models import F

Spot.objects.annotate(
    lev_dist=Levenshtein(F('name'), 'Kfaka')
).filter(
    lev_dist__lte=2
)