Haystack SearchIndex model_attr not following relation correctly?

Haystack SearchIndex model_attr not following relation correctly? - django

I'm using Django Haystack v2.0.0 and Whoosh v2.4.0. According to Haystack's documentation search indexes can use Django's related field lookup in the model_attr parameter. However, running the following code using manage.py shell command:
from haystack.query import SearchQuerySet
for r in SearchQuerySet():
print r.recruitment_agency # Prints True for every job
print r.recruitment_agency == r.object.employer.recruitment_agency
# Prints False if r.object.employer.recruitment_agency is False
I have tried rebuilding the index several times, the index's directory is writeable, and I don't get any error messages. All other fields work as expected.
I have the following (simplified) models:
companies/models.py:
class Company(models.Model):
recruitment_agency = models.BooleanField(default=False)
jobs/models.py:
class Job(models.Model):
employer = models.ForeignKey(Company, related_name='jobs')
jobs/search_indexes.py:
class JobIndex(indexes.SearchIndex, indexes.Indexable):
text = indexes.CharField(document=True, use_template=True)
recruitment_agency = indexes.BooleanField(model_attr='employer__recruitment_agency')
def get_model(self):
return Job
jobs/forms.py:
class JobSearchForm(SearchForm):
no_recruitment_agencies = forms.BooleanField(label="Hide recruitment agencies", required=False)
def search(self):
sqs = super(JobSearchForm, self).search()
if self.cleaned_data['no_recruitment_agencies']:
sqs = sqs.filter(recruitment_agency=False)
return sqs
Does anyone know what could be the problem?

Meanwhile I've switched over to the ElasticSearch backend, but the problem persisted, indicating that it might be a problem in haystack, and not in Whoosh.
The problem is that the python values True and False are not saved as boolean values, but as string, and they are not converted back to boolean values. To filter on boolean values, you have to check for the strings 'true' and 'false':
class JobSearchForm(SearchForm):
no_recruitment_agencies = forms.BooleanField(label="Hide recruitment agencies", required=False)
def search(self):
sqs = super(JobSearchForm, self).search()
if self.cleaned_data['no_recruitment_agencies']:
sqs = sqs.filter(recruitment_agency='false') # Change the filter here
return sqs

Related

Django: make model field and foreign key unique together

I have the following model:
class Locale (models.Model):
"""
Locale model
"""
locale_id = models.AutoField(primary_key=True)
locale_name = models.CharField(max_length=800)
magister = models.ForeignKey(Magister, on_delete=models.CASCADE)
def get_name(self):
return self.locale_name
In the database, there must be exactly one locale-magister pair.
To create each locale item, an administrator has to upload the locales. This is done via a bulk upload:
try:
lcl=Locale(locale_name = data_dict["locale_name"], magister = data_dict["magister "])
# lcl.full_clean()
locales_list.append(lcl)
rows+=1
if rows==INSERTNUMBER:
try:
Locale.objects.bulk_create(locales_list)
locales_uploaded+=rows
except IntegrityError as e:
print("Error: locale bulk_create "+repr(e))
locales_list=[]
rows=0
I tried using lcl.full_clean() in my bulk upload but I get a UNIQUE constraint failed: zones_locale.locale_name error and only about 1/2 of all locales upload successfully.
I also tried using:
def validate_unique(self, exclude=None):
lcl = Locale.objects.filter(locale_id=self.locale_id)
if lcl.filter(magister=self.magister).exists():
raise ValidationError("item already exists")
But the same error occurs.
I also tried using:
class Meta:
unique_together = (("locale_name", "magister"),)
This did not work either.
From what I can tell, the problem is that there exist locales with the same name that belong to different magisters.
How can I allow locales with the same name to be uploaded while also enforcing the uniqueness of any given locale-magister pair?

Django unique_together with nullable ForeignKey

I'm using Django 1.8.4 in my dev machine using Sqlite and I have these models:
class ModelA(Model):
field_a = CharField(verbose_name='a', max_length=20)
field_b = CharField(verbose_name='b', max_length=20)
class Meta:
unique_together = ('field_a', 'field_b',)
class ModelB(Model):
field_c = CharField(verbose_name='c', max_length=20)
field_d = ForeignKey(ModelA, verbose_name='d', null=True, blank=True)
class Meta:
unique_together = ('field_c', 'field_d',)
I've run proper migration and registered them in the Django Admin. So, using the Admin I've done this tests:
I'm able to create ModelA records and Django prohibits me from creating duplicate records - as expected!
I'm not able to create identical ModelB records when field_b is not empty
But, I'm able to create identical ModelB records, when using field_d as empty
My question is: How do I apply unique_together for nullable ForeignKey?
The most recent answer I found for this problem has 5 year... I do think Django have evolved and the issue may not be the same.

Django 2.2 added a new constraints API which makes addressing this case much easier within the database.
You will need two constraints:
The existing tuple constraint; and
The remaining keys minus the nullable key, with a condition
If you have multiple nullable fields, I guess you will need to handle the permutations.
Here's an example with a thruple of fields that must be all unique, where only one NULL is permitted:
from django.db import models
from django.db.models import Q
from django.db.models.constraints import UniqueConstraint
class Badger(models.Model):
required = models.ForeignKey(Required, ...)
optional = models.ForeignKey(Optional, null=True, ...)
key = models.CharField(db_index=True, ...)
class Meta:
constraints = [
UniqueConstraint(fields=['required', 'optional', 'key'],
name='unique_with_optional'),
UniqueConstraint(fields=['required', 'key'],
condition=Q(optional=None),
name='unique_without_optional'),
]

UPDATE: previous version of my answer was functional but had bad design, this one takes in account some of the comments and other answers.
In SQL NULL does not equal NULL. This means if you have two objects where field_d == None and field_c == "somestring" they are not equal, so you can create both.
You can override Model.clean to add your check:
class ModelB(Model):
#...
def validate_unique(self, exclude=None):
if ModelB.objects.exclude(id=self.id).filter(field_c=self.field_c, \
field_d__isnull=True).exists():
raise ValidationError("Duplicate ModelB")
super(ModelB, self).validate_unique(exclude)
If used outside of forms you have to call full_clean or validate_unique.
Take care to handle the race condition though.

#ivan, I don't think that there's a simple way for django to manage this situation. You need to think of all creation and update operations that don't always come from a form. Also, you should think of race conditions...
And because you don't force this logic on DB level, it's possible that there actually will be doubled records and you should check it while querying results.
And about your solution, it can be good for form, but I don't expect that save method can raise ValidationError.
If it's possible then it's better to delegate this logic to DB. In this particular case, you can use two partial indexes. There's a similar question on StackOverflow - Create unique constraint with null columns
So you can create Django migration, that adds two partial indexes to your DB
Example:
# Assume that app name is just `example`
CREATE_TWO_PARTIAL_INDEX = """
CREATE UNIQUE INDEX model_b_2col_uni_idx ON example_model_b (field_c, field_d)
WHERE field_d IS NOT NULL;
CREATE UNIQUE INDEX model_b_1col_uni_idx ON example_model_b (field_c)
WHERE field_d IS NULL;
"""
DROP_TWO_PARTIAL_INDEX = """
DROP INDEX model_b_2col_uni_idx;
DROP INDEX model_b_1col_uni_idx;
"""
class Migration(migrations.Migration):
dependencies = [
('example', 'PREVIOUS MIGRATION NAME'),
]
operations = [
migrations.RunSQL(CREATE_TWO_PARTIAL_INDEX, DROP_TWO_PARTIAL_INDEX)
]

Add a clean method to your model - see below:
def clean(self):
if Variants.objects.filter("""Your filter """).exclude(pk=self.pk).exists():
raise ValidationError("This variation is duplicated.")

I think this is more clear way to do that for Django 1.2+
In forms it will be raised as non_field_error with no 500 error, in other cases, like DRF you have to check this case manual, because it will be 500 error.
But it will always check for unique_together!
class BaseModelExt(models.Model):
is_cleaned = False
def clean(self):
for field_tuple in self._meta.unique_together[:]:
unique_filter = {}
unique_fields = []
null_found = False
for field_name in field_tuple:
field_value = getattr(self, field_name)
if getattr(self, field_name) is None:
unique_filter['%s__isnull' % field_name] = True
null_found = True
else:
unique_filter['%s' % field_name] = field_value
unique_fields.append(field_name)
if null_found:
unique_queryset = self.__class__.objects.filter(**unique_filter)
if self.pk:
unique_queryset = unique_queryset.exclude(pk=self.pk)
if unique_queryset.exists():
msg = self.unique_error_message(self.__class__, tuple(unique_fields))
raise ValidationError(msg)
self.is_cleaned = True
def save(self, *args, **kwargs):
if not self.is_cleaned:
self.clean()
super().save(*args, **kwargs)

One possible workaround not mentioned yet is to create a dummy ModelA object to serve as your NULL value. Then you can rely on the database to enforce the uniqueness constraint.

Django Tests: setUpTestData on Postgres throws: "Duplicate key value violates unique constraint"

I am running into a database issue in my unit tests. I think it has something to do with the way I am using TestCase and setUpData.
When I try to set up my test data with certain values, the tests throw the following error:
django.db.utils.IntegrityError: duplicate key value violates unique constraint
...
psycopg2.IntegrityError: duplicate key value violates unique constraint "InventoryLogs_productgroup_product_name_48ec6f8d_uniq"
DETAIL: Key (product_name)=(Almonds) already exists.
I changed all of my primary keys and it seems to be running fine. It doesn't seem to affect any of the tests.
However, I'm concerned that I am doing something wrong. When it first happened, I reversed about an hour's worth of work on my app (not that much code for a noob), which corrected the problem.
Then when I wrote the changes back in, the same issue presented itself again. TestCase is pasted below. The issue seems to occur after I add the sortrecord items, but corresponds with the items above it.
I don't want to keep going through and changing primary keys and urls in my tests, so if anyone sees something wrong with the way I am using this, please help me out. Thanks!
TestCase
class DetailsPageTest(TestCase):
#classmethod
def setUpTestData(cls):
cls.product1 = ProductGroup.objects.create(
product_name="Almonds"
)
cls.variety1 = Variety.objects.create(
product_group = cls.product1,
variety_name = "non pareil",
husked = False,
finished = False,
)
cls.supplier1 = Supplier.objects.create(
company_name = "Acme",
company_location = "Acme Acres",
contact_info = "Call me!"
)
cls.shipment1 = Purchase.objects.create(
tag=9,
shipment_id=9999,
supplier_id = cls.supplier1,
purchase_date='2015-01-09',
purchase_price=9.99,
product_name=cls.variety1,
pieces=99,
kgs=999,
crackout_estimate=99.9
)
cls.shipment2 = Purchase.objects.create(
tag=8,
shipment_id=8888,
supplier_id=cls.supplier1,
purchase_date='2015-01-08',
purchase_price=8.88,
product_name=cls.variety1,
pieces=88,
kgs=888,
crackout_estimate=88.8
)
cls.shipment3 = Purchase.objects.create(
tag=7,
shipment_id=7777,
supplier_id=cls.supplier1,
purchase_date='2014-01-07',
purchase_price=7.77,
product_name=cls.variety1,
pieces=77,
kgs=777,
crackout_estimate=77.7
)
cls.sortrecord1 = SortingRecords.objects.create(
tag=cls.shipment1,
date="2015-02-05",
bags_sorted=20,
turnout=199,
)
cls.sortrecord2 = SortingRecords.objects.create(
tag=cls.shipment1,
date="2015-02-07",
bags_sorted=40,
turnout=399,
)
cls.sortrecord3 = SortingRecords.objects.create(
tag=cls.shipment1,
date='2015-02-09',
bags_sorted=30,
turnout=299,
)
Models
from datetime import datetime
from django.db import models
from django.db.models import Q
class ProductGroup(models.Model):
product_name = models.CharField(max_length=140, primary_key=True)
def __str__(self):
return self.product_name
class Meta:
verbose_name = "Product"
class Supplier(models.Model):
company_name = models.CharField(max_length=45)
company_location = models.CharField(max_length=45)
contact_info = models.CharField(max_length=256)
class Meta:
ordering = ["company_name"]
def __str__(self):
return self.company_name
class Variety(models.Model):
product_group = models.ForeignKey(ProductGroup)
variety_name = models.CharField(max_length=140)
husked = models.BooleanField()
finished = models.BooleanField()
description = models.CharField(max_length=500, blank=True)
class Meta:
ordering = ["product_group_id"]
verbose_name_plural = "Varieties"
def __str__(self):
return self.variety_name
class PurchaseYears(models.Manager):
def purchase_years_list(self):
unique_years = Purchase.objects.dates('purchase_date', 'year')
results_list = []
for p in unique_years:
results_list.append(p.year)
return results_list
class Purchase(models.Model):
tag = models.IntegerField(primary_key=True)
product_name = models.ForeignKey(Variety, related_name='purchases')
shipment_id = models.CharField(max_length=24)
supplier_id = models.ForeignKey(Supplier)
purchase_date = models.DateField()
estimated_delivery = models.DateField(null=True, blank=True)
purchase_price = models.DecimalField(max_digits=6, decimal_places=3)
pieces = models.IntegerField()
kgs = models.IntegerField()
crackout_estimate = models.DecimalField(max_digits=6,decimal_places=3, null=True)
crackout_actual = models.DecimalField(max_digits=6,decimal_places=3, null=True)
objects = models.Manager()
purchase_years = PurchaseYears()
# Keep manager as "objects" in case admin, etc. needs it. Filter can be called like so:
# Purchase.objects.purchase_years_list()
# Managers in docs: https://docs.djangoproject.com/en/1.8/intro/tutorial01/
class Meta:
ordering = ["purchase_date"]
def __str__(self):
return self.shipment_id
def _weight_conversion(self):
return round(self.kgs * 2.20462)
lbs = property(_weight_conversion)
class SortingModelsBagsCalulator(models.Manager):
def total_sorted(self, record_date, current_set):
sorted = [SortingRecords['bags_sorted'] for SortingRecords in current_set if
SortingRecords['date'] <= record_date]
return sum(sorted)
class SortingRecords(models.Model):
tag = models.ForeignKey(Purchase, related_name='sorting_record')
date = models.DateField()
bags_sorted = models.IntegerField()
turnout = models.IntegerField()
objects = models.Manager()
def __str__(self):
return "%s [%s]" % (self.date, self.tag.tag)
class Meta:
ordering = ["date"]
verbose_name_plural = "Sorting Records"
def _calculate_kgs_sorted(self):
kg_per_bag = self.tag.kgs / self.tag.pieces
kgs_sorted = kg_per_bag * self.bags_sorted
return (round(kgs_sorted, 2))
kgs_sorted = property(_calculate_kgs_sorted)
def _byproduct(self):
waste = self.kgs_sorted - self.turnout
return (round(waste, 2))
byproduct = property(_byproduct)
def _bags_remaining(self):
current_set = SortingRecords.objects.values().filter(~Q(id=self.id), tag=self.tag)
sorted = [SortingRecords['bags_sorted'] for SortingRecords in current_set if
SortingRecords['date'] <= self.date]
remaining = self.tag.pieces - sum(sorted) - self.bags_sorted
return remaining
bags_remaining = property(_bags_remaining)
EDIT
It also fails with integers, like so.
django.db.utils.IntegrityError: duplicate key value violates unique constraint "InventoryLogs_purchase_pkey"
DETAIL: Key (tag)=(9) already exists.
UDPATE
So I should have mentioned this earlier, but I completely forgot. I have two unit test files that use the same data. Just for kicks, I matched a primary key in both instances of setUpTestData() to a different value and sure enough, I got the same error.
These two setups were working fine side-by-side before I added more data to one of them. Now, it appears that they need different values. I guess you can only get away with using repeat data for so long.

I continued to get this error without having any duplicate data but I was able to resolve the issue by initializing the object and calling the save() method rather than creating the object via Model.objects.create()
In other words, I did this:
#classmethod
def setUpTestData(cls):
cls.person = Person(first_name="Jane", last_name="Doe")
cls.person.save()
Instead of this:
#classmethod
def setUpTestData(cls):
cls.person = Person.objects.create(first_name="Jane", last_name="Doe")

I've been running into this issue sporadically for months now. I believe I just figured out the root cause and a couple solutions.
Summary
For whatever reason, it seems like the Django test case base classes aren't removing the database records created by let's just call it TestCase1 before running TestCase2. Which, in TestCase2 when it tries to create records in the database using the same IDs as TestCase1 the database raises a DuplicateKey exception because those IDs already exists in the database. And even saying the magic word "please" won't help with database duplicate key errors.
Good news is, there are multiple ways to solve this problem! Here are a couple...
Solution 1
Make sure if you are overriding the class method tearDownClass that you call super().tearDownClass(). If you override tearDownClass() without calling its super, it will in turn never call TransactionTestCase._post_teardown() nor TransactionTestCase._fixture_teardown(). Quoting from the doc string in TransactionTestCase._post_teardown()`:
def _post_teardown(self):
"""
Perform post-test things:
* Flush the contents of the database to leave a clean slate. If the
class has an 'available_apps' attribute, don't fire post_migrate.
* Force-close the connection so the next test gets a clean cursor.
"""
If TestCase.tearDownClass() is not called via super() then the database is not reset in between test cases and you will get the dreaded duplicate key exception.
Solution 2
Override TransactionTestCase and set the class variable serialized_rollback = True, like this:
class MyTestCase(TransactionTestCase):
fixtures = ['test-data.json']
serialized_rollback = True
def test_name_goes_here(self):
pass
Quoting from the source:
class TransactionTestCase(SimpleTestCase):
...
# If transactions aren't available, Django will serialize the database
# contents into a fixture during setup and flush and reload them
# during teardown (as flush does not restore data from migrations).
# This can be slow; this flag allows enabling on a per-case basis.
serialized_rollback = False
When serialized_rollback is set to True, Django test runner rolls back any transactions inserted into the database beween test cases. And batta bing, batta bang... no more duplicate key errors!
Conclusion
There are probably many more ways to implement a solution for the OP's issue, but these two should work nicely. Would definitely love to have more solutions added by others for clarity sake and a deeper understanding of the underlying Django test case base classes. Phew, say that last line real fast three times and you could win a pony!

The log you provided states DETAIL: Key (product_name)=(Almonds) already exists. Did you verify in your db?
To prevent such errors in the future, you should prefix all your test data string by test_

I discovered the issue, as noted at the bottom of the question.
From what I can tell, the database didn't like me using duplicate data in the setUpTestData() methods of two different tests. Changing the primary key values in the second test corrected the problem.

I think the problem here is that you had a tearDownClass method in your TestCase without the call to super method.
In this way the django TestCase lost the transactional functionalities behind the setUpTestData so it doesn't clean your test db after a TestCase is finished.
Check warning in django docs here:
https://docs.djangoproject.com/en/1.10/topics/testing/tools/#django.test.SimpleTestCase.allow_database_queries

I had similar problem that had been caused by providing the primary key value to a test case explicitly.
As discussed in the Django documentation, manually assigning a value to an auto-incrementing field doesn’t update the field’s sequence, which might later cause a conflict.
I have solved it by altering the sequence manually:
from django.db import connection
class MyTestCase(TestCase):
#classmethod
def setUpTestData(cls):
Model.objects.create(id=1)
with connection.cursor() as c:
c.execute(
"""
ALTER SEQUENCE "app_model_id_seq" RESTART WITH 2;
"""
)

How do I query for empty MultiValueField results in Django Haystack

Using Django 1.4.2, Haystack 2.0beta, and ElasticSearch 0.19, how do I query for results which have an empty set [] for a MultiValueField?

I'd create an integer field named num_<field> and query against it.
In this example 'emails' is the MultiValueField, so we'll create 'num_emails':
class PersonIndex(indexes.SearchIndex, indexes.Indexable):
text = indexes.CharField(document=True, use_template=True)
name = indexes.CharField(model_attr='name')
emails = indexes.MultiValueField(null=True)
num_emails = indexes.IntegerField()
def prepare_num_emails(self, object):
return len(object.emails)
Now, in your searches you can use
SearchQuerySet().filter(num_emails=0)

You can also change prepare_ method of your MultiValueField:
def prepare_emails(self, object):
emails = [e for e in object.emails]
return emails if emails else ['None']
Then you can filter:
SearchQuerySet().filter(emails=None)

Serialize in django witha query set that contains defer

i have i little problem, and that is how can serialize a django query with defer ?
I have this model :
class Evento(models.Model):
nome=models.CharField(max_length=100)
descricao=models.CharField(max_length=200,null=True)
data_inicio= models.DateTimeField()
data_fim= models.DateTimeField()
preco=models.DecimalField(max_digits=6,decimal_places=2)
consumiveis= models.CharField(max_length=5)
dress_code= models.CharField(max_length=6)
guest_list=models.CharField(max_length=15)
local = models.ForeignKey(Local)
user= models.ManyToManyField(User,null=True,blank=True)
def __unicode__(self):
return unicode('%s %s'%(self.nome,self.descricao))
my query is this :
eventos_totais = Evento.objects.defer("user").filter(data_inicio__gte=default_inicio,
data_fim__lte=default_fim)
it works fine i think (how can i check if the query has realy defer the field user ? ) but when i do:
json_serializer = serializers.get_serializer("json")()
eventos_totais = json_serializer.serialize(eventos_totais,
ensure_ascii=False,
use_natural_keys=True)
it always folow the natural keys for user and local, i need natural keys for this query because of the fields local. But i do not need the field user.

To serialize a subset of your models fields, you need to specify the fields argument to the serializers.serialize()
from django.core import serializers
data = serializers.serialize('xml', SomeModel.objects.all(), fields=('name','size'))
Ref: Django Docs

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Haystack SearchIndex model_attr not following relation correctly? - django

Related

Django: make model field and foreign key unique together

Django unique_together with nullable ForeignKey

Django Tests: setUpTestData on Postgres throws: "Duplicate key value violates unique constraint"

How do I query for empty MultiValueField results in Django Haystack

Serialize in django witha query set that contains defer

Categories

Resources