Testing triggers for fulltext search in Django

I'm adding a search engine to a Django project, so I have set up SearchVectorFields on several models, with custom triggers.
I would like to unit-test that my columns of type TSVECTOR are updated when the instance of a Model changes.
However, I've been unable to find any information on how to test the content of a SearchVectorField ... I can't compare my_document.search to SearchVector(Value("document content")) or similar, because the first one seems to be string-like, while the latter is an object.
TL;DR
More precisely, with the model:
from django.contrib.postgres.search import SearchVectorField
from django.db import models

class Document(models.Model):
    ...
    content = models.TextField()
    search = SearchVectorField()
and trigger:
-- create trigger function
CREATE OR REPLACE FUNCTION search_trigger() RETURNS trigger AS $$
begin
  NEW.search := to_tsvector(COALESCE(NEW.content, ''));
  return NEW;
end
$$ LANGUAGE plpgsql;
-- add trigger on insert
DROP TRIGGER IF EXISTS search_trigger ON myapp_document;
CREATE TRIGGER search_trigger
BEFORE INSERT
ON myapp_document
FOR EACH ROW
EXECUTE PROCEDURE search_trigger();
-- add trigger on update
DROP TRIGGER IF EXISTS search_trigger_update ON myapp_document;
CREATE TRIGGER search_trigger_update
BEFORE UPDATE OF content
ON myapp_document
FOR EACH ROW
WHEN (OLD.content IS DISTINCT FROM NEW.content)
EXECUTE PROCEDURE search_trigger();
How can I test that when I create a new Document instance, its search field is populated with the right values? The same question applies to updating an existing Document instance, but the answer should be fairly similar.
Thanks for any hint ;)

I think you can compare string representation of your SearchVectorField values:
from django.test import TestCase
from .models import Document

class DocumentTest(TestCase):
    def setUp(self):
        Document.objects.create(content='Pizza Recipes')

    def test_document_search(self):
        document_list = list(Document.objects.values_list('search', flat=True))
        search_list = ["'pizza':1 'recip':2"]
        self.assertSequenceEqual(document_list, search_list)
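If comparing the raw string feels brittle, one option is to parse the tsvector text into a mapping before asserting. The helper below is only a sketch: parse_tsvector is a hypothetical name (it is not part of Django), and it assumes the simple 'lexeme':positions text form shown in the answer above.

```python
import re

# Matches one tsvector entry such as 'pizza':1 or 'recip':2,5A.
# A doubled single quote ('') inside a lexeme is treated as an escaped quote.
_ENTRY = re.compile(r"'((?:[^']|'')*)'(?::([0-9A-D,]+))?")


def parse_tsvector(text):
    """Return {lexeme: [position, ...]} for the text form of a tsvector."""
    parsed = {}
    for lexeme, positions in _ENTRY.findall(text):
        # Positions stay strings because they may carry a weight letter (e.g. "5A").
        parsed[lexeme.replace("''", "'")] = positions.split(",") if positions else []
    return parsed


print(parse_tsvector("'pizza':1 'recip':2"))
```

In a test this would allow something like assertEqual(parse_tsvector(document.search), {"pizza": ["1"], "recip": ["2"]}), which does not depend on how the lexemes are ordered or spaced in the string.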

Related

What is use case for stealth_options in Django?

According to the documentation:
For custom management commands that use options not created using
parser.add_argument(), add a stealth_options attribute on the command:
class MyCommand(BaseCommand):
    stealth_options = ('option_name', ...)
But why not just add these options with parser.add_argument()? Is there any benefit to using stealth_options?
Mainly for testing purposes, IMO. Check this code snippet, taken from this example.
import re
from io import StringIO

from django.core.management import call_command
from django.test import TestCase


def inspectdb_tables_only(table_name):
    """
    Limit introspection to tables created for models of this app.
    Some databases such as Oracle are extremely slow at introspection.
    """
    return table_name.startswith('inspectdb_')


class InspectDBTestCase(TestCase):
    unique_re = re.compile(r'.*unique_together = \((.+),\).*')

    def test_stealth_table_name_filter_option(self):
        out = StringIO()
        # table_name_filter is not a parser argument; it is passed as a keyword only
        call_command('inspectdb', table_name_filter=inspectdb_tables_only, stdout=out)
        error_message = "inspectdb has examined a table that should have been filtered out."
        self.assertNotIn("class DjangoContentType(models.Model):", out.getvalue(), msg=error_message)
With stealth_options, the test can pass table_name_filter through call_command() without tripping its option check, while the option stays hidden from the command's exposed API.
call_command() now validates that the argument parser of the command
being called defines all of the options passed to call_command().
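To make the idea concrete, here is a simplified stand-in for that kind of check, written with plain argparse. It is not Django's implementation: unknown_options is a hypothetical helper, and the two parser arguments are only illustrative.

```python
import argparse


def unknown_options(declared, stealth_options, passed):
    """Return the passed option names that are neither declared nor stealth."""
    allowed = set(declared) | set(stealth_options)
    return sorted(set(passed) - allowed)


# Options the command declares through its argument parser (illustrative).
parser = argparse.ArgumentParser(prog="inspectdb")
actions = [
    parser.add_argument("--database", default="default"),
    parser.add_argument("--include-views", action="store_true"),
]
declared = [action.dest for action in actions]

# Keyword arguments a test might pass programmatically.
passed = {"database": "default", "table_name_filter": lambda name: True}

# Without stealth options the extra keyword is reported as unknown.
print(unknown_options(declared, (), passed))
# Listing it as a stealth option makes the same call acceptable.
print(unknown_options(declared, ("table_name_filter",), passed))
```

The point of the design is that a callable such as table_name_filter cannot sensibly be typed on a command line, so it is accepted from code without being advertised in the command's help output.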

Create DB Constraint via Django

I have a Django model which looks like this:
class Dummy(models.Model):
    ...
    system = models.CharField(max_length=16)
I want system never to be empty or to contain whitespace.
I know how to use validators in Django.
But I would like to enforce this at the database level.
What is the easiest and django-like way to create a DB constraint for this?
I use PostgreSQL and don't need to support any other database.
2019 Update
Django 2.2 added support for database-level constraints. The new CheckConstraint and UniqueConstraint classes enable adding custom database constraints. Constraints are added to models using the Meta.constraints option.
Your system validation would look something like this:
from django.db import models
from django.db.models.constraints import CheckConstraint
from django.db.models.query_utils import Q

class Dummy(models.Model):
    ...
    system = models.CharField(max_length=16)

    class Meta:
        constraints = [
            CheckConstraint(
                check=~Q(system="") & ~Q(system__contains=" "),
                name="system_not_blank")
        ]
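To see what such a check accepts and rejects, here is a small sketch that uses SQLite's in-memory database as a stand-in for PostgreSQL. The CHECK expression is my approximation of what the Q expression means, not the exact SQL Django generates, and try_insert is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dummy ("
    " system TEXT NOT NULL,"
    # Approximation of ~Q(system="") & ~Q(system__contains=" ")
    " CONSTRAINT system_not_blank CHECK (system <> '' AND system NOT LIKE '% %')"
    ")"
)


def try_insert(value):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute("INSERT INTO dummy (system) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


for candidate in ["prod", "", "a b", "a\tb"]:
    print(repr(candidate), try_insert(candidate))
```

Note that a value containing a tab is accepted here: a contains-a-space test only looks for the space character, so rejecting every kind of whitespace would need a regular-expression based check instead.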
First issue: creating a database constraint through Django
A) It seems that Django does not have this ability built in yet. There is a 9-year-old open ticket for it, but I wouldn't hold my breath for something that has been going on this long.
Edit: As of release 2.2 (April 2019), Django supports database-level check constraints.
B) You could look into the package django-db-constraints, through which you can define constraints in the model Meta. I did not test this package, so I don't know how useful it really is.
# example using this package
class Meta:
    db_constraints = {
        'price_above_zero': 'check (price > 0)',
    }
Second issue: field system should never be empty nor contain whitespaces
Now we would need to build the check constraint in postgres syntax to accomplish that. I came up with these options:
Check if the length of system is different after removing whitespaces. Using ideas from this answer you could try:
/* this check should only pass if `system` contains no
* whitespaces (`\s` also detects new lines)
*/
check ( length(system) = length(regexp_replace(system, '\s', '', 'g')) )
Check if the whitespace count is 0. For this you could use regexp_matches:
/* this check should only pass if `system` contains no
* whitespaces (`\s` also detects new lines)
*/
check ( length(regexp_matches(system, '\s', 'g')) = 0 )
Note that the length function can't be used with regexp_matches because the latter returns a set of text[] (set of arrays), but I could not find the proper function to count the elements of that set right now.
Finally, bringing both of the previous issues together, your approach could look like this:
class Dummy(models.Model):
    # this already sets NOT NULL to the field in the database
    system = models.CharField(max_length=16)

    class Meta:
        db_constraints = {
            # SQL string literals need single quotes; double quotes denote identifiers in PostgreSQL
            'system_no_spaces': r"check ( length(system) > 0 AND length(system) = length(regexp_replace(system, '\s', '', 'g')) )",
        }
This checks that the field's value:
does not contain NULL (CharField adds NOT NULL constraint by default)
is not empty (first part of the check: length(system) > 0)
has no whitespaces (second part of the check: same length after replacing whitespace)
Let me know how that works out for you, or if there are problems or drawbacks to this approach.
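For picking test values, the same logic can be mirrored in plain Python. This is only a sketch: passes_system_check is a hypothetical helper, and Python's \s may not match exactly the same set of characters as PostgreSQL's, so treat it as an approximation of the constraint rather than a replacement for testing against the database.

```python
import re


def passes_system_check(value):
    """Approximate the combined NOT NULL and CHECK rules for `system`."""
    # NULL is rejected by the NOT NULL constraint, not by the CHECK itself.
    if value is None:
        return False
    # Same idea as the SQL: the length must not change once whitespace is removed.
    stripped = re.sub(r"\s", "", value)
    return len(value) > 0 and len(value) == len(stripped)


for candidate in ["prod", "", "a b", "a\nb", None]:
    print(repr(candidate), passes_system_check(candidate))
```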
You can add a CHECK constraint via a custom Django migration. To check the string length you can use the char_length function, and position to check whether the value contains whitespace.
Quote from postgres docs (https://www.postgresql.org/docs/current/static/ddl-constraints.html):
A check constraint is the most generic constraint type. It allows you
to specify that the value in a certain column must satisfy a Boolean
(truth-value) expression.
To run arbitrary SQL in a migration, the RunSQL operation can be used (https://docs.djangoproject.com/en/2.0/ref/migration-operations/#runsql):
Allows running of arbitrary SQL on the database - useful for more
advanced features of database backends that Django doesn’t support
directly, like partial indexes.
Create empty migration:
python manage.py makemigrations --empty yourappname
Add the SQL that creates the constraints:
# Generated by Django A.B on YYYY-MM-DD HH:MM
from django.db import migrations

class Migration(migrations.Migration):
    dependencies = [
        ('yourappname', '0001_initial'),
    ]
    operations = [
        # Reject empty or whitespace-only values (> 0, so one-character values pass).
        migrations.RunSQL('ALTER TABLE yourappname_dummy ADD CONSTRAINT syslen '
                          'CHECK (char_length(trim(system)) > 0);',
                          'ALTER TABLE yourappname_dummy DROP CONSTRAINT syslen;'),
        # Reject values containing a space anywhere. The SQL literal ' ' sits in a
        # double-quoted Python string so its quotes do not end the string early.
        migrations.RunSQL('ALTER TABLE yourappname_dummy ADD CONSTRAINT syswh '
                          "CHECK (position(' ' in system) = 0);",
                          'ALTER TABLE yourappname_dummy DROP CONSTRAINT syswh;')
    ]
Run migration:
python manage.py migrate yourappname
I have modified my answer to meet your requirements.
So, if you would like to run a check against the database, try this:
import psycopg2

def your_validator():
    conn = psycopg2.connect("dbname=YOURDB user=YOURUSER")
    cursor = conn.cursor()
    cursor.execute("YOUR QUERY")
    # execute() itself returns None, so fetch a row to inspect the result
    query_result = cursor.fetchone()
    if query_result is None:
        pass  # Do stuff
    else:
        pass  # Other stuff
Then use the pre_save signal.
In your models.py file, add:
from django.db import models
from django.db.models.signals import pre_save

class Dummy(models.Model):
    ...

    @staticmethod
    def pre_save(sender, instance, *args, **kwargs):
        # Of course, feel free to parse args in your def.
        your_validator()

# the handler only runs once it is connected to the signal
pre_save.connect(Dummy.pre_save, sender=Dummy)

Best way to populate a new attribute (new database column) of existing Rails models?

I've just added a new column to an existing table in my database:
class AddMoveableDateToDocument < ActiveRecord::Migration
  def change
    add_column :documents, :moveable_date, :datetime
  end
end
In my Rails model I want the moveable_date attribute to be set to a default value upon creation, and the application will be able to change this date later. So, something like:
class Document < ActiveRecord::Base
  before_create :set_moveable_date

  def set_moveable_date
    self.moveable_date ||= self.created_at
  end
end
Now, the existing models that are already saved into the database will not have this moveable_date set yet (the value is nil). How do I run through all my existing models and populate the moveable_date attribute with its default value? What is the easiest/best practice way? Can be in the application code itself, in the console, in the terminal, or otherwise. Thanks!
You will get a lot of opinionated answers on this one. Some will suggest the console, some will suggest a one-time rake task.
I would suggest doing it as part of the migration that adds the column. After adding the column, you can run Document.reset_column_information so that the Rails app picks up on your new column, and then iterate through the existing document records and set the moveable date as appropriate.
Or if it's as simple as setting the moveable date to the created_at date, you can use something like Document.update_all("moveable_date = created_at") instead of iterating over them.
That's a good suggestion!
Another way is to add the line before_save :set_moveable_date to your model. It won't accomplish the transition immediately, but if your data is updated on a regular basis, it'd work.
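The update_all suggestion above boils down to a single UPDATE statement instead of one save per record. The sketch below shows that statement in isolation, using Python's sqlite3 module as a neutral stand-in for the Rails app's database. The sample rows and the WHERE clause (which limits the backfill to rows that are still empty) are my assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, created_at TEXT, moveable_date TEXT)"
)
conn.executemany(
    "INSERT INTO documents (created_at, moveable_date) VALUES (?, ?)",
    [
        ("2014-01-01 10:00:00", None),                   # not yet populated
        ("2014-02-01 12:30:00", "2014-03-01 09:00:00"),  # already set, left alone
    ],
)

# One statement backfills every row that has no value yet.
cursor = conn.execute(
    "UPDATE documents SET moveable_date = created_at WHERE moveable_date IS NULL"
)
conn.commit()
updated = cursor.rowcount

rows = conn.execute(
    "SELECT created_at, moveable_date FROM documents ORDER BY id"
).fetchall()
print(updated, rows)
```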

Django notification observe model (watching for product results)

I've been using django-notification (https://github.com/jtauber/django-notification.git) but the documentation is a little brief for a beginner.
I want to be able to have users keep a watch on searches (a results page with product listings) that have no results at the time of search. Then if a record is added that matches the search, the user should be notified.
I can't find any online explanation of how to use 'observe', which I think is what I'd need to use to watch for records appearing (in search results)? Perhaps this is the wrong approach (using django-notification), as I need a signal to await the occurrence of a filter result that would initially contain no objects...
(the project is too developed to consider an option like Pinax to provide a template for things like this)
I suppose I need to evaluate
f = Products.objects.filter(**search_request_args)
if f:
    notification.send([request.user], "product_match", {"from_user": settings.FROM_DEFAULT})
Perhaps as a cron job?
It looks like you want to use Django signals (see: https://docs.djangoproject.com/en/dev/topics/signals/).
Let's say you want to watch the creation of Product objects:
from django.db.models.signals import post_save
from my_app.models import Product

def new_product(sender, instance, created, **kwargs):
    # short-circuit the function if it isn't a new product (it's
    # being updated not created)
    if not created:
        return
    # note: instance is the newly saved Product object
    if (check_if_the_new_product_matches_searches_here):
        notification.send(...)

post_save.connect(new_product, sender=Product)
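The placeholder condition in that handler still has to be written. One possible shape, sketched in plain Python under several assumptions: saved searches are stored somewhere as field/value criteria, SavedSearch and users_to_notify are hypothetical names, and the product is represented as a dict for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class SavedSearch:
    """A search a user asked to be notified about (hypothetical structure)."""
    user: str
    criteria: dict = field(default_factory=dict)  # field name -> required value


def users_to_notify(product, saved_searches):
    """Return the users whose saved criteria all match the new product."""
    return [
        search.user
        for search in saved_searches
        if all(product.get(key) == value for key, value in search.criteria.items())
    ]


searches = [
    SavedSearch("alice", {"category": "bikes", "color": "red"}),
    SavedSearch("bob", {"category": "boats"}),
]
new_product = {"name": "Roadster", "category": "bikes", "color": "red"}
print(users_to_notify(new_product, searches))
```

Inside the signal handler, the returned users would be the recipients handed to the notification call. Checking each new product against the stored searches also avoids re-running every search on a schedule.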

Bulk create model objects in django

I have a lot of objects to save in the database, so I want to create Model instances from them.
With Django, I can create all the model instances with MyModel(data), and then I want to save them all.
Currently, I have something like this:
for item in items:
    object = MyModel(name=item.name)
    object.save()
I'm wondering if I can save a list of objects directly, e.g.:
objects = []
for item in items:
    objects.append(MyModel(name=item.name))
objects.save_all()
How to save all the objects in one transaction?
As of the Django development version, there is a bulk_create manager method that takes a list of objects created with the class constructor. Check out the Django docs.
Use the bulk_create() method. It's standard in Django now.
Example:
Entry.objects.bulk_create([
    Entry(headline="Django 1.0 Released"),
    Entry(headline="Django 1.1 Announced"),
    Entry(headline="Breaking: Django is awesome")
])
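Conceptually, a bulk insert groups many rows into a few multi-row INSERT statements. The sketch below illustrates that idea with sqlite3. It is not how Django builds its queries: batched_inserts is a hypothetical helper, and the table and column names are interpolated into the SQL, so they must come from trusted code.

```python
import sqlite3


def batched_inserts(table, column, values, batch_size=None):
    """Yield (sql, params) pairs: one multi-row INSERT per batch."""
    values = list(values)
    size = batch_size or len(values) or 1  # None means "everything in one batch"
    for start in range(0, len(values), size):
        batch = values[start:start + size]
        placeholders = ", ".join(["(?)"] * len(batch))
        yield f"INSERT INTO {table} ({column}) VALUES {placeholders}", batch


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT)")
headlines = [
    "Django 1.0 Released",
    "Django 1.1 Announced",
    "Breaking: Django is awesome",
]

# Three rows with batch_size=2 become two statements instead of three.
statements = list(batched_inserts("entry", "headline", headlines, batch_size=2))
for sql, params in statements:
    conn.execute(sql, params)

row_count = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
print(len(statements), row_count)
```

A batch size matters because databases limit how many bound parameters a single statement may carry, so very large lists have to be split.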
Using manual transaction handling for the loop worked for me (Postgres 9.1):
from django.db import transaction

with transaction.atomic():
    for item in items:
        MyModel.objects.create(name=item.name)
In fact, it's not the same as a 'native' database bulk insert, but it lets you avoid or reduce the transport, ORM, and SQL query analysis costs.
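What the single transaction does buy is all-or-nothing behaviour. The sketch below demonstrates this with sqlite3, whose connection context manager commits on success and rolls back on an exception. It is a stand-in for the idea, not for transaction.atomic() itself, and the table and create_all helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (name TEXT UNIQUE)")


def create_all(names):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        for name in names:
            conn.execute("INSERT INTO mymodel (name) VALUES (?)", (name,))


def count():
    return conn.execute("SELECT COUNT(*) FROM mymodel").fetchone()[0]


try:
    create_all(["a", "b", "a"])  # the duplicate "a" violates UNIQUE
except sqlite3.IntegrityError:
    pass
after_failure = count()  # the first two inserts were rolled back as well

create_all(["a", "b", "c"])
after_success = count()
print(after_failure, after_success)
```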
import uuid

# e.g. inside a view where request.data holds the submitted values
name = request.data.get('name')
period = request.data.get('period')
email = request.data.get('email')
prefix = request.data.get('prefix')
bulk_number = int(request.data.get('bulk_number'))
bulk_list = list()
for _ in range(bulk_number):
    # build a unique code from the submitted prefix
    code = prefix + uuid.uuid4().hex.upper()
    bulk_list.append(
        DjangoModel(name=name, code=code, period=period, user=email))
bulk_msj = DjangoModel.objects.bulk_create(bulk_list)
Here is how to bulk-create entities from a delimiter-separated file, leaving aside all unquoting and un-escaping routines:
from django.db.models import Model

class SomeModel(Model):
    @classmethod
    def from_file(model, file_obj, headers, delimiter):
        model.objects.bulk_create([
            # strip the trailing newline so it does not end up in the last field
            model(**dict(zip(headers, line.rstrip('\n').split(delimiter))))
            for line in file_obj],
            batch_size=None)
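If quoting and escaping do matter, the standard csv module can take over the line parsing. A minimal sketch, assuming the file has no header row of its own. rows_from_file and the sample data are hypothetical.

```python
import csv
import io


def rows_from_file(file_obj, headers, delimiter):
    """Return one dict per line, keyed by the given headers."""
    reader = csv.DictReader(file_obj, fieldnames=headers, delimiter=delimiter)
    return [dict(row) for row in reader]


# The quoted field contains the delimiter, which a plain split() would break on.
sample = io.StringIO('Alice;"Smith; Jr."\nBob;Jones\n')
rows = rows_from_file(sample, ["first", "last"], ";")
print(rows)
```

Each resulting dict could then be passed to the model constructor as keyword arguments before handing the list to bulk_create.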
Using create will cause one query per new item. If you want to reduce the number of INSERT queries, you'll need to use something else.
I've had some success using the Bulk Insert snippet, even though the snippet is quite old.
Perhaps there are some changes required to get it working again.
http://djangosnippets.org/snippets/446/
Check out this blog post on the bulkops module.
On my django 1.3 app, I have experienced significant speedup.
The bulk_create() method is one of the ways to insert multiple records into a database table. For example:
Event.objects.bulk_create([
    Event(event_name="Event WF -001", event_type="sensor_value"),
    Event(event_name="Event WT -002", event_type="geozone"),
    Event(event_name="Event WD -001", event_type="outage"),
])
For a single-line implementation, you can use a lambda expression in a map:
list(map(lambda x: MyModel.objects.get_or_create(name=x), items))
Here, the lambda binds each element of the items list to x and creates a database record if one does not already exist.
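One caveat worth knowing with this one-liner: in Python 3, map() is lazy, so nothing is executed until the result is consumed. The sketch below shows this with a stand-in function in place of the real get_or_create call.

```python
created = []


def get_or_create(name):
    # Stand-in for MyModel.objects.get_or_create(name=name).
    created.append(name)
    return name


items = ["a", "b"]

pending = map(get_or_create, items)
calls_before = len(created)  # the map object exists, but no call has run yet

results = list(pending)      # consuming the iterator triggers the calls
calls_after = len(created)

print(calls_before, calls_after)
```

A plain for loop avoids the surprise entirely and is usually the clearer choice when the calls are made for their side effects.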
Lambda Documentation
The easiest way is to use the create Manager method, which creates and saves the object in a single step.
for item in items:
    MyModel.objects.create(name=item.name)