django-haystack order_by not working - django

I have a query like
SearchQuerySet().all().models(Show).order_by('title')
This returns a list of objects. The titles may contain special characters, as in ./hack:twilight, and leading digits, as in 009:absolute.
According to the order_by documentation, special characters take priority. But when I look at the output, it starts with the numbers.
Basically, I need that query to produce the same ordering as this:
>>> list = ['apple', 'zebra', '.hack', 'orange', 'car', 'funk', 'python']
>>> list.sort()
>>> list
['.hack', 'apple', 'car', 'funk', 'orange', 'python', 'zebra']
Any idea?
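For reference, here is a small sketch of the ordering the question expects. Plain Python compares strings by code point, so '.' (46) sorts before digits (48 to 57), which sort before lowercase letters (97 and up). Whether the search backend behind Haystack orders a field the same way depends on how that field is indexed, which is an assumption here and not something the question confirms. The sample titles are taken from the question or made up for illustration.

```python
# Titles from the question plus a few plain ones, in arbitrary order.
titles = ['apple', '009:absolute', './hack:twilight', 'zebra', 'car']

# sorted() compares strings code point by code point.
by_code_point = sorted(titles)
print(by_code_point)
# ['./hack:twilight', '009:absolute', 'apple', 'car', 'zebra']
```

If the backend cannot be made to sort this way, sorting the already-fetched results in Python like this is one possible fallback, at the cost of loading them all first.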

Related

Combining filters for a Django queryset

Let's say I have models that look like this:
class Sauce(models.Model):
    ...
class Topping(models.Model):
    ...
class Pizza(models.Model):
    sauces = models.ManyToManyField(Sauce, related_name='pizzas')
    toppings = models.ManyToManyField(Topping, related_name='pizzas')
    geo_type = models.CharField(max_length=50, choices=(('NY', 'New York'), ('IT', 'Italy')))
Now, I have an endpoint which accepts URL parameters for filtering the pizza table. For example, one time I might get the following:
{
"sauces": [1, 4],
"toppings": [4, 7],
"geo_type": "NY"
}
Using this, I would simply filter using the following code:
Pizza.objects.filter(sauces__in=url_params["sauces"], toppings__in=url_params["toppings"], geo_type=url_params["geo_type"])
And this would work perfectly fine. However, sometimes I might get URL parameters which look like this:
{
"sauces": [],
"toppings": [4, 7],
"geo_type": "NY"
}
Notice the empty array for the sauces parameter. This means that for this request, I don't care about sauces and it can be anything. Now the query would be something like this:
Pizza.objects.filter(toppings__in=url_params["toppings"], geo_type=url_params["geo_type"])
Once again, this works as expected. However, the issue is that I have a lot of these fields to filter, and the number of combinations is huge. Is there some way to just tell my queryset to ignore a filter if it is an empty array? And if the geo_type is an empty string or null, it should ignore that too. Hopefully I have gotten my point across.
Thanks for any help.
You can omit the empty lists, for example by making a helper function:
def filter_ignore_if_empty(qs, **kwargs):
    return qs.filter(**{k: v for k, v in kwargs.items() if v != []})
and then filter with:
filter_ignore_if_empty(
    Pizza.objects.all(),
    sauces__in=url_params['sauces'],
    toppings__in=url_params['toppings'],
    geo_type=url_params['geo_type']
)
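The question also asks to ignore an empty string or null geo_type, which a check against [] alone does not cover. A minimal sketch of a broader check follows; the names is_empty and drop_empty_filters are hypothetical, and the dictionary is built without touching Django so it can be tried on its own.

```python
def is_empty(value):
    """True for None and for empty strings, lists, tuples, sets and dicts."""
    return value is None or (
        isinstance(value, (str, list, tuple, set, dict)) and len(value) == 0
    )


def drop_empty_filters(**lookups):
    """Keep only the lookups whose value should actually constrain the query."""
    return {name: value for name, value in lookups.items() if not is_empty(value)}


# Example parameters; in the view these would come from the request.
url_params = {"sauces": [], "toppings": [4, 7], "geo_type": None}

lookups = drop_empty_filters(
    sauces__in=url_params["sauces"],
    toppings__in=url_params["toppings"],
    geo_type=url_params["geo_type"],
)
print(lookups)  # {'toppings__in': [4, 7]}

# With Django available: Pizza.objects.filter(**lookups)
```

Values such as 0 or False are kept on purpose, since they can be legitimate filter values.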

How do I make row-based tuples from column range or list?

This is probably really simple but I'm not too clear.
Let's say I have a data frame and a list of column references. My goal is to make a list of tuples that give that row number's values for only the columns contained in my list.
import pandas as pd
raw_data = {'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
            'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
            'sex': ['male', 'female', 'male', 'female', 'female'],
            'dog': ['Burt','Kane','Billy','Bob','Thorton'],
            'cat': ['Evil','PurrEvil','Rorry','Meowth','Killer'],
            'fish': ['Johhny','Nemo','Dorry','Jacob','Pinky']}
df = pd.DataFrame(raw_data, columns = ['first_name', 'last_name', 'sex', 'dog', 'cat', 'fish'])
colref = ['dog','cat','fish']
I want to make a list of tuples like [('Burt','Evil','Johhny'),('Kane','PurrEvil','Nemo'),...]
but I want to do so without hardcoding column names or numbers. The real data set I am performing this on is much larger and variable in size but my list colref will always include all of the columns I want in my tuple list. Does anyone have any tips for me?
I think I may have figured it out.. lol
tuples = [tuple(x) for x in df[colref].values]
If there is a better solution though please let me know. I'm having fun looking at other people's solutions to the problems I encounter as a noob.

Nested Dictionary from Lists and Other Dictionaries

I am trying to make a master dictionary from preexisting lists and dictionaries. I am having a hard time getting it to work like I think it should.
I have a list of student names.
names=['Alice', 'Bob', 'Charles', 'Dan']
I then have 2 dictionaries that have info based on student ID number and another piece of info.
dict1={'100':9, '101':9, '102':11, '103':10} #the keys are student ID and the values are the grade level of the student. Alice is 100, Bob=101...
dict2={'100':9721234567, '101':6071234567, '103': 9727654321, '104':6077654321} #this dictionary gives the home phone number as a value using the student ID as a key.
How can I make a master dictionary to give me all of a student's information? Here's what I've tried based on reading answers to other questions.
Dicta=dict(zip(names, dict1))
Dictb=dict(zip(names, dict2))
Dict=dict(zip(Dicta, Dictb))
Here's the sort of answer I'd like to get.
>>> Dict['Alice']
'100, 9, 9721234567'
#this is Alice's ID, grade level, and home phone
names is ordered, but before Python 3.7 the keys of a dict were not guaranteed to keep any particular order, so you can't rely on zip(names, dict1) to correctly match names with keys (student IDs). For example, on an older interpreter:
>>> d = {'100':1,'101':2,'102':3,'103':4}
>>> d
{'102': 3, '103': 4, '100': 1, '101': 2} # keys not in the order declared.
You need one more dict matching names to student IDs. Below I've added an ordered list of IDs and used it to create that dictionary. Then I use a dict comprehension to compute the combined dictionary.
names = ['Alice','Bob','Charles','Dan']
ids = ['100','101','102','103']
dictn = dict(zip(names,ids))
dict1={'100':9, '101':9, '102':11, '103':10}
dict2={'100':9721234567, '101':6071234567, '102': 9727654321, '103':6077654321}
Dict = {k:(v,dict1[v],dict2[v]) for k,v in dictn.items()}
print(Dict)
print(Dict['Alice'])
Output (Python 3.7+):
{'Alice': ('100', 9, 9721234567), 'Bob': ('101', 9, 6071234567), 'Charles': ('102', 11, 9727654321), 'Dan': ('103', 10, 6077654321)}
('100', 9, 9721234567)
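The phone dictionary in the question has no entry for ID '102', so indexing it directly would raise a KeyError for Charles. A sketch of one way to tolerate such gaps, using dict.get so that missing values become None; the variable names grades, phones and students and the nested-dict layout are illustrative choices, not part of the original answer.

```python
names = ['Alice', 'Bob', 'Charles', 'Dan']
ids = ['100', '101', '102', '103']  # same order as names
grades = {'100': 9, '101': 9, '102': 11, '103': 10}
# Phone data as given in the question: '102' is missing, '104' is unused.
phones = {'100': 9721234567, '101': 6071234567, '103': 9727654321, '104': 6077654321}

# One record per student; .get() returns None when an ID has no entry.
students = {
    name: {'id': sid, 'grade': grades.get(sid), 'phone': phones.get(sid)}
    for name, sid in zip(names, ids)
}

print(students['Alice'])             # {'id': '100', 'grade': 9, 'phone': 9721234567}
print(students['Charles']['phone'])  # None
```

Keeping each record as a small dict with named fields also avoids having to remember which tuple position holds which value.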

Return all strings between a list of strings with regular expressions in Python

I have a list of strings like the following:
list = ['a1', 'a2', 'a3', 'a4', 'a5', 'a6']
I would like to retrieve all the items and the indices between a pair of strings. For example, all the items between 'a2' and 'a6'.
Is there a way to do it with regular expressions?
The desired output is the following:
(in reality I only need the indices, as I can retrieve all the values with the indices).
The reason I want regex: I am mining text extracted from PDFs, building one big list from all the extracted output, and trying to automate that extraction. Because the PDFs can have variable text and different formats, I want to be able to handle various representations of the same data. I figured regex would let me take text in a slightly variable format and transform it into the desired format.
example of reference list:
list = ['name', 'Mark', 'Smith', 'location', 'Florida', 'Coast', 'FL', 'date']
location_indices = [3, 6]
desired namelst = ['name', 'Mark', 'Smith']
location= ['location', 'Florida', 'Coast', 'FL']
I figured that the best way to go about this is to get the indices between 'location' and 'date', and from there I can generate the location list. Because my original list can vary slightly from the reference list, I think regex gives me the flexibility to handle slightly different original lists that I can then reformat.
Let's define your list:
>>> lst = ['a1', 'a2', 'a3', 'a4', 'a5', 'a6']
(So as not to overwrite a builtin, I renamed the list to lst.)
Now, let's retrieve the indices and values of all items from a2 to a6 inclusive:
>>> [(i,x) for (i,x) in enumerate(lst) if lst.index('a2')<=i<=lst.index('a6')]
[(1, 'a2'), (2, 'a3'), (3, 'a4'), (4, 'a5'), (5, 'a6')]
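The same result can be had with two index lookups and a slice, which avoids calling lst.index for every element. This is only a sketch: items_between and its include_end flag are hypothetical names, and it assumes both markers are present (otherwise list.index raises ValueError).

```python
def items_between(items, start, end, include_end=True):
    """Return (index, value) pairs from the first `start` up to `end`."""
    first = items.index(start)
    last = items.index(end, first)  # look for `end` only at or after `start`
    stop = last + 1 if include_end else last
    return list(enumerate(items[first:stop], start=first))


lst = ['a1', 'a2', 'a3', 'a4', 'a5', 'a6']
print(items_between(lst, 'a2', 'a6'))
# [(1, 'a2'), (2, 'a3'), (3, 'a4'), (4, 'a5'), (5, 'a6')]

fields = ['name', 'Mark', 'Smith', 'location', 'Florida', 'Coast', 'FL', 'date']
print(items_between(fields, 'location', 'date', include_end=False))
# [(3, 'location'), (4, 'Florida'), (5, 'Coast'), (6, 'FL')]
```

The second call matches the location example from the question, where the closing marker 'date' is not part of the result.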

Django: ordering numerical value with order_by

I'm in a situation where I must output quite a large list of objects ordered by a CharField used to store street addresses.
My problem is that the data is obviously ordered by ASCII codes since it's a CharField, with the predictable result that it sorts the numbers like this:
1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21....
Now the obvious step would be to change the CharField to the proper field type (IntegerField, let's say), but that cannot work since some addresses might have apartments, like "128A".
I really don't know how I can order this properly.
If you're sure there are only integers in the field, you could get the database to cast it as an integer via the extra method, and order by that:
MyModel.objects.extra(
    select={'myinteger': 'CAST(mycharfield AS INTEGER)'}
).order_by('myinteger')
Django is trying to deprecate the extra() method, but has introduced Cast() in v1.10. In sqlite (at least), CAST can take a value such as 10a and will cast it to the integer 10, so you can do:
from django.db.models import IntegerField
from django.db.models.functions import Cast
MyModel.objects.annotate(
    my_integer_field=Cast('my_char_field', IntegerField())
).order_by('my_integer_field', 'my_char_field')
which will return objects sorted by the street number first numerically, then alphabetically, e.g. ...14, 15a, 15b, 16, 16a, 17...
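The SQLite behaviour described above can be checked without Django, using the standard-library sqlite3 module. This is a sketch of the SQL the ORM call is expected to boil down to, not the exact query Django generates; the table name address and column street_number are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE address (street_number TEXT)')
conn.executemany(
    'INSERT INTO address VALUES (?)',
    [(n,) for n in ['16a', '2', '15b', '10', '17', '15a', '14', '16']],
)

# Order by the leading integer first, then by the raw text to break ties.
rows = conn.execute(
    'SELECT street_number FROM address '
    'ORDER BY CAST(street_number AS INTEGER), street_number'
).fetchall()
ordered = [row[0] for row in rows]
conn.close()

print(ordered)
# ['2', '10', '14', '15a', '15b', '16', '16a', '17']
```

Other databases may reject a value like '15a' in the cast instead of taking its numeric prefix, so this should be verified on the backend actually in use.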
If you're using PostgreSQL (not sure about MySQL) you can safely use the following code on char/text fields and avoid cast errors:
MyModel.objects.extra(
    select={'myinteger': "CAST(substring(charfield FROM '^[0-9]+') AS INTEGER)"}
).order_by('myinteger')
Great tip! It works for me! :) That's my code:
revisioned_objects = revisioned_objects.extra(select={'casted_object_id': 'CAST(object_id AS INTEGER)'}).extra(order_by = ['casted_object_id'])
I know that I’m late on this, but since it’s strongly related to the question, and since I had a hard time finding it:
You can put the Cast directly in the ordering option of your model.
from django.db import models
from django.db.models.functions import Cast
class Address(models.Model):
    street_number = models.CharField()
    class Meta:
        ordering = [
            Cast("street_number", output_field=models.IntegerField()),
        ]
From the doc about ordering:
You can also use query expressions.
And from the doc about database functions:
Functions are also expressions, so they can be used and combined with other expressions like aggregate functions. 
The problem you're up against is quite similar to how filenames get ordered when sorting by filename. There, you want "2 Foo.mp3" to appear before "12 Foo.mp3".
A common approach is to "normalize" numbers by expanding them to a fixed number of digits, and then sort based on the normalized form. That is, for purposes of sorting, "2 Foo.mp3" might expand to "0000000002 Foo.mp3".
Django won't help you here directly. You can either add a field to store the "normalized" address, and have the database order_by that, or you can do a custom sort in your view (or in a helper that your view uses) on address records before handing the list of records to a template.
In my case I have a CharField called name, which has mixed (int+string) values, for example "a1", "f65", "P", "55", etc.
I solved the issue by using the SQL cast (tested with postgres & mysql):
first I sort by the cast integer value, and then by the original value of the name field.
parking_slots = ParkingSlot.objects.all().extra(
    select={'num_from_name': 'CAST(name AS INTEGER)'}
).order_by('num_from_name', 'name')
This way, in any case, the correct sorting works for me.
In case you need to sort version numbers consisting of multiple numbers separated by a dot (e.g. 1.9.0, 1.10.0), here is a postgres-only solution:
from django.db import models

class VersionRecordManager(models.Manager):
    def get_queryset(self):
        return super().get_queryset().extra(
            select={
                'natural_version': "string_to_array(version, '.')::int[]",
            },
        )
    def available_versions(self):
        return self.filter(available=True).order_by('-natural_version')
    def last_stable(self):
        return self.available_versions().filter(stable=True).first()
class VersionRecord(models.Model):
    objects = VersionRecordManager()
    version = models.CharField(max_length=64, db_index=True)
    available = models.BooleanField(default=False, db_index=True)
    stable = models.BooleanField(default=False, db_index=True)
In case you want to allow non-numeric characters (e.g. 0.9.0 beta, 2.0.0 stable):
def get_queryset(self):
    return super().get_queryset().extra(
        select={
            'natural_version':
                "string_to_array( "
                "   regexp_replace( "                   # Remove everything except digits
                r"      version, '[^\d\.]+', '', 'g' "  # and dots, then split string into
                "   ), '.' "                            # an array of integers.
                ")::int[] "
        }
    )
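For databases without string_to_array, or for lists already loaded into memory, roughly the same idea can be sketched in Python: strip everything except digits and dots, split on the dots, and compare tuples of integers. The name version_key is hypothetical and this is an illustration, not a drop-in replacement for the manager above.

```python
import re


def version_key(version):
    """Turn '1.10.0 beta' into (1, 10, 0) so versions compare numerically."""
    digits_and_dots = re.sub(r'[^\d.]+', '', version)
    return tuple(int(part) for part in digits_and_dots.split('.') if part)


versions = ['1.9.0', '1.10.0', '0.9.0 beta', '2.0.0 stable', '1.2.10']
print(sorted(versions, key=version_key, reverse=True))
# ['2.0.0 stable', '1.10.0', '1.9.0', '1.2.10', '0.9.0 beta']
```

As with the SQL version, a suffix that itself contains digits (for example 'rc1') gets folded into the last number, so such tags would need separate handling.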
I was looking for a way to sort the numeric chars in a CharField and my search led me here. The name fields in my objects are CC Licenses, e.g., 'CC BY-NC 4.0'.
Since extra() is going to be deprecated, I was able to do it this way:
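from django.db.models import F, Func, IntegerField, Value
from django.db.models.functions import Cast
MyObject.objects.all() \
    .annotate(sorting_int=Cast(Func(F('name'), Value(r'\D'), Value(''), Value('g'), function='regexp_replace'), IntegerField())) \
    .order_by('-sorting_int')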
Thus, MyObject with name='CC BY-NC 4.0' now has sorting_int=40.
None of the answers in this thread worked for me because they assume numerical text. I found a solution that will work for a subset of cases. Consider this model:
class Block(models.Model):
    title = models.CharField()
Say I have titles that sometimes have leading letters followed by trailing digits. If I try to order normally:
>>> Block.objects.all().order_by('title')
<QuerySet [<Block: 1>, <Block: 10>, <Block: 15>, <Block: 2>, <Block: N1>, <Block: N12>, <Block: N4>]>
As expected, it's correct alphabetically, but makes no sense for us humans. The trick that I used for this particular use case is to replace any text I find with the number 9999 and then cast the value to an integer and order by it.
For most cases that have leading characters this will get the desired result. See below:
>>> from django.db.models.expressions import RawSQL
>>> Block.objects.all()\
...     .annotate(my_faux_integer=RawSQL("CAST(regexp_replace(title, '[A-Z]+', '9999', 'g') AS INTEGER)", []))\
...     .order_by('my_faux_integer', 'title')
<QuerySet [<Block: 1>, <Block: 2>, <Block: 10>, <Block: 15>, <Block: N1>, <Block: N4>, <Block: N12>]>
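When the list is small enough to sort in Python, the same ordering can be sketched without the 9999 placeholder by splitting each title into its letter prefix and its number. The function block_key is a hypothetical helper, and the fallback for titles that do not match the pattern is an assumption about what should happen to them.

```python
import re


def block_key(title):
    """Sort key: purely numeric titles first, then by letter prefix, then by number."""
    match = re.fullmatch(r'([A-Za-z]*)(\d+)', title)
    if match is None:
        # Assumed fallback: titles that don't fit the pattern go last, alphabetically.
        return (2, title, 0)
    prefix, number = match.groups()
    return (0 if prefix == '' else 1, prefix, int(number))


titles = ['1', '10', '15', '2', 'N1', 'N12', 'N4']
print(sorted(titles, key=block_key))
# ['1', '2', '10', '15', 'N1', 'N4', 'N12']
```

Unlike the placeholder approach, this keeps different prefixes apart, so 'A2' would sort before 'N1' regardless of the numbers involved.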