Mongoengine create_user function generating null objectId's - django

It always seems to get the exact same error after I've added a single user to the DB.
Tried to save duplicate unique keys (E11000 duplicate key error index: mydb.user.$objectId_1 dup key: { : null })
What's strange is that I copied the code EXACTLY from the test suite and even that doesn't work. I look at the stack trace and the offending issue is always in the same place
/usr/local/lib/python2.7/dist-packages/mongoengine/django/auth.py in set_password
264. self.save()
It's incredibly frustrating and I have been looking at this for basically the last 2 days. It's not something in my code. For some reason it doesn't seem to be generating an ObjectId as it's always null. I don't even understand why that's the case
My code is simply
from mongoengine.django.auth import User
from django.contrib.auth import get_user_model
user_data = {
    'username': 'user',
    'email': 'user@example.com',
    'password': 'test',
}
manager = get_user_model()._default_manager
user = manager.create_user(user_data)

It looks like you have an extra unique index on an objectId field.
When a new User is created, no value is set for the objectId field, and because the index is unique but not sparse, you can't have two documents without a value.
To test in the mongo shell:
> use mydb
> db.user.getIndexes()
Drop the objectId_1 index:
> db.user.dropIndex("objectId_1")
> db.user.getIndexes()
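The same inspection can be sketched from Python. The helper below is hypothetical (it is not part of mongoengine or pymongo) and assumes its input is a dict shaped like the result of pymongo's Collection.index_information(); the sample data is made up for illustration. It flags unique indexes that are not sparse, which are the ones that reject a second document with a missing value.

```python
def unique_non_sparse_indexes(index_info):
    """Return the names of unique, non-sparse indexes other than the default _id index."""
    suspects = []
    for name, spec in index_info.items():
        if name == "_id_":
            continue  # the built-in primary-key index is always expected
        if spec.get("unique") and not spec.get("sparse"):
            suspects.append(name)
    return sorted(suspects)


# Made-up sample shaped like Collection.index_information() output
sample = {
    "_id_": {"v": 1, "key": [("_id", 1)]},
    "objectId_1": {"v": 1, "key": [("objectId", 1)], "unique": True},
    "username_1": {"v": 1, "key": [("username", 1)], "unique": True, "sparse": True},
}
print(unique_non_sparse_indexes(sample))
```

Any index name it reports is a candidate for the dropIndex call shown above, provided nothing else relies on it.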

Related

Create DB Constraint via Django

I have a Django model which looks like this:
class Dummy(models.Model):
    ...
    system = models.CharField(max_length=16)
I want system never to be empty or to contain whitespace.
I know how to use validators in Django.
But I would enforce this at database level.
What is the easiest and django-like way to create a DB constraint for this?
I use PostgreSQL and don't need to support any other database.
2019 Update
Django 2.2 added support for database-level constraints. The new CheckConstraint and UniqueConstraint classes enable adding custom database constraints. Constraints are added to models using the Meta.constraints option.
Your system validation would look something like this:
from django.db import models
from django.db.models.constraints import CheckConstraint
from django.db.models.query_utils import Q
class Dummy(models.Model):
    ...
    system = models.CharField(max_length=16)
    class Meta:
        constraints = [
            CheckConstraint(
                check=~Q(system="") & ~Q(system__contains=" "),
                name="system_not_blank")
        ]
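To see what a check constraint of this kind does at the database level without setting up Django, here is a minimal sketch. It assumes an in-memory SQLite database instead of PostgreSQL and a hand-written CHECK clause that approximates the Q expression above (SQLite's instr stands in for the contains lookup); the table name and the try_insert helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dummy ("
    " system VARCHAR(16) NOT NULL,"
    " CONSTRAINT system_not_blank"
    " CHECK (system <> '' AND instr(system, ' ') = 0))"
)


def try_insert(value):
    """Return True if the database accepted the row, False if the CHECK rejected it."""
    try:
        conn.execute("INSERT INTO dummy (system) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


results = {value: try_insert(value) for value in ["prod", "", "my system"]}
print(results)
```

The point is that the rejection happens inside the database, so it also applies to rows written by code that bypasses Django's validators.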
First issue: creating a database constraint through Django
A)
It seems that Django does not have this ability built in yet. There is a 9-year-old open ticket for it, but I wouldn't hold my breath for something that has been going on this long.
Edit: As of release 2.2 (April 2019), Django supports database-level check constraints.
B) You could look into the package django-db-constraints, through which you can define constraints in the model Meta. I did not test this package, so I don't know how useful it really is.
# example using this package
class Meta:
db_constraints = {
'price_above_zero': 'check (price > 0)',
}
Second issue: field system should never be empty nor contain whitespaces
Now we would need to build the check constraint in postgres syntax to accomplish that. I came up with these options:
Check if the length of system is different after removing whitespaces. Using ideas from this answer you could try:
/* this check should only pass if `system` contains no
* whitespaces (`\s` also detects new lines)
*/
check ( length(system) = length(regexp_replace(system, '\s', '', 'g')) )
Check if the whitespace count is 0. For this you could use regexp_matches:
/* this check should only pass if `system` contains no
* whitespaces (`\s` also detects new lines)
*/
check ( length(regexp_matches(system, '\s', 'g')) = 0 )
Note that the length function can't be used with regexp_matches because the latter returns a set of text[] (set of arrays), but I could not find the proper function to count the elements of that set right now.
Finally, bringing both of the previous issues together, your approach could look like this:
class Dummy(models.Model):
    # this already sets NOT NULL to the field in the database
    system = models.CharField(max_length=16)
    class Meta:
        db_constraints = {
            # raw string so \s reaches PostgreSQL unchanged; SQL string literals need single quotes
            'system_no_spaces': r"check ( length(system) > 0 AND length(system) = length(regexp_replace(system, '\s', '', 'g')) )",
        }
This checks that the field's value:
does not contain NULL (CharField adds NOT NULL constraint by default)
is not empty (first part of the check: length(system) > 0)
has no whitespaces (second part of the check: same length after replacing whitespace)
Let me know how that works out for you, or if there are problems or drawbacks to this approach.
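The length-comparison idea behind this check can be tried out in plain Python before committing it to a constraint. This is only a sketch: passes_check is a hypothetical helper, and Python's \s is assumed to be close enough to PostgreSQL's \s for ordinary ASCII input.

```python
import re


def passes_check(system):
    # Mirrors the constraint: non-empty, and unchanged in length
    # once every whitespace character has been removed.
    if system is None:
        return False  # the NOT NULL on the column rejects this case
    stripped = re.sub(r"\s", "", system)
    return len(system) > 0 and len(system) == len(stripped)


for candidate in ["prod", "", "a b", "a\nb"]:
    print(repr(candidate), passes_check(candidate))
```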
You can add a CHECK constraint via a custom Django migration. To check the string length you can use the char_length function, and position to check whether the value contains whitespace.
Quote from postgres docs (https://www.postgresql.org/docs/current/static/ddl-constraints.html):
A check constraint is the most generic constraint type. It allows you
to specify that the value in a certain column must satisfy a Boolean
(truth-value) expression.
To run arbitrary SQL in a migration, the RunSQL operation can be used (https://docs.djangoproject.com/en/2.0/ref/migration-operations/#runsql):
Allows running of arbitrary SQL on the database - useful for more
advanced features of database backends that Django doesn’t support
directly, like partial indexes.
Create empty migration:
python manage.py makemigrations --empty yourappname
Add sql to create constraint:
# Generated by Django A.B on YYYY-MM-DD HH:MM
from django.db import migrations
class Migration(migrations.Migration):
    dependencies = [
        ('yourappname', '0001_initial'),
    ]
    operations = [
        # reject values that are empty once trimmed
        migrations.RunSQL('ALTER TABLE appname_dummy ADD CONSTRAINT syslen '
                          'CHECK (char_length(trim(system)) > 0);',
                          'ALTER TABLE appname_dummy DROP CONSTRAINT syslen;'),
        # double-quoted Python string so the SQL literal ' ' survives intact
        migrations.RunSQL('ALTER TABLE appname_dummy ADD CONSTRAINT syswh '
                          "CHECK (position(' ' in trim(system)) = 0);",
                          'ALTER TABLE appname_dummy DROP CONSTRAINT syswh;')
    ]
Run migration:
python manage.py migrate yourappname
I have modified my answer to meet your requirements.
So, if you would like to run a DB constraint, try this:
import psycopg2
def your_validator():
    conn = psycopg2.connect("dbname=YOURDB user=YOURUSER")
    cursor = conn.cursor()
    cursor.execute("YOUR QUERY")
    # execute() returns None in psycopg2, so fetch the row explicitly
    query_result = cursor.fetchone()
    if query_result is None:
        pass  # Do stuff
    else:
        pass  # Other stuff
Then use the pre_save signal.
In your models.py file add,
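from django.db.models.signals import pre_save
class Dummy(models.Model):
    ...
    @staticmethod
    def pre_save(sender, instance, *args, **kwargs):
        # Of course, feel free to parse args in your def.
        your_validator()
# register the handler so it runs before every Dummy.save()
pre_save.connect(Dummy.pre_save, sender=Dummy)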

Django .save(using='...') correct database name

My Django settings.py contains the following:
DATABASES = {
'default': { # several settings omitted
'NAME': 'myproject', 'HOST': 'localhost', },
'other': { 'NAME': 'other', 'HOST': 'else.where', }
}
I now want to fetch some objects from the other DB and save them to default like so:
things = Thing.objects.using('other').raw('SELECT code,created FROM other.otherapp_bigthing WHERE created>...')
# 'code' is a unique string in other and PK of the Thing model
for t in things:
    t.save(using='default')
This gives me
ProgrammingError: (1146, "Table 'other.myapp_thing' doesn't exist")
which is a correct observation, however, by the documentation of the using parameter I expected the records to be saved to myproject.myapp_thing. Why is the database name still taken from the other configuration when I explicitly advised it to use default?
I eventually worked around this by making a raw insert (connections['default'].cursor().execute('INSERT INTO myproject.myapp_thing (...) ON DUPLICATE KEY UPDATE...') but still wanted to know why the question's code does not work.
The problem was that the Thing model contained another field: next to the code CharField/PK and created date (in my use case actually last_accessed) there was another date (think modified) with default=None, null=True and I assumed the default would be used if I do not explicitly include in the original raw query.
Apparently, this assumption was wrong and Django tried, on t.save, to look up a value for modified in the DB it originally came from -- and failed, since it came from a raw SQL query.
If I include ,NULL AS 'modified' in the raw SQL query the original error goes away but I still cannot reasonably use Django logic for my use case since t.save(using='default') would, for existing things, update last_accessed and overwrite the modified date with NULL, which I do not want, and t.save(using='default', update_fields=['last_accessed']) would force an update and thus fail for new things.

django retrieve specific data from a dictionary database field

I have a table that contains values saved as a dictionary.
FIELD_NAME: extra_data
VALUE:
{"code": null, "user_id": "103713616419757182414", "access_token": "ya29.IwBloLKFALsddhsAAADlliOoDeE-PD_--yz1i_BZvujw8ixGPh4zH-teMNgkIA", "expires": 3599}
I need to retrieve only the user_id value from the field "extra_data", not the whole dictionary as below.
event_list = Event.objects.filter(season_id=season_id, event_status_id=2).value('extra_data')
If you are storing a dictionary as text in the code you can easily convert it to a python dictionary using eval - although I don't know why you'd want to as it opens you to all sorts of potential malicious code injections.
event_list = eval(Event.objects.filter(season_id=season_id, event_status_id=2).value('extra_data'))
user_id = event_list['user_id']
print user_id
Would give:
"103713616419757182414"
Edit:
On closer inspection, that's not a Python dictionary. You could use a JSON library to parse it, or declare what null is like so:
null = None
event_list = eval(Event.objects.filter(season_id=season_id, event_status_id=2).value('extra_data'))
user_id = event_list['user_id']
Either way, the idea of storing any structured data in a django textfield is fraught with danger that will come back to bite you. The best solution is to rethink your data structures.
This method worked for me. Note that it only works with a JSON-compliant string.
import json
json_obj = json.loads(event_list)
dict1 = dict(json_obj)
print dict1['user_id']
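As a self-contained sketch of the JSON approach, the snippet below parses a string shaped like the extra_data value from the question (the access_token entry is left out for brevity, and the variable names are made up). json.loads maps JSON null to Python None, so no eval or "null = None" workaround is needed.

```python
import json

# Shortened copy of the stored value from the question (access_token omitted)
raw = '{"code": null, "user_id": "103713616419757182414", "expires": 3599}'

data = json.loads(raw)      # JSON text -> Python dict
user_id = data["user_id"]   # plain dictionary access
print(user_id)
```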

Why does get_FOO_display() return integer value when logging info (django)?

I have a model field that is using a choice to restrict its value. This works fine
and I have it working everywhere within the app, except when logging information,
when the get_FOO_display() method returns the underlying integer value instead
of the human-readable version.
This is the model definition (abridged):
THING_ROLE_MONSTER = 0
THING_ROLE_MUMMY = 1
ROLE_CHOICES = (
(THING_ROLE_MONSTER, u'Monster'),
(THING_ROLE_MUMMY, u'Mummy'),
)
# definition of property within model
class Thing(models.Model):
    ...
    role = models.IntegerField(
        'Role',
        default=0,
        choices=ROLE_CHOICES
    )
If I run this within the (django) interactive shell it behaves exactly as you would expect:
>>> from frankenstein.core.models import Thing
>>> thing = Thing()
>>> thing.role = 0
>>> thing.get_role_display()
u'Monster'
However, when I use exactly the same construct within a string formatting / logging
scenario I get the problem:
logger.info('New thing: <b>%s</b>', thing.get_role_display())
returns:
New thing: <b>0</b>
Help!
[UPDATE 1]
When I run the logging within the interactive shell I get the correct output:
>>> from frankenstein.core.models import Thing
>>> import logging
>>> thing = Thing()
>>> thing.role = 0
>>> logging.info('hello %s', thing.get_role_display())
INFO hello Monster
[UPDATE 2] Django internals
Following up on the answer from @joao-oliveira below, I have dug into the internals and uncovered the following.
The underlying _get_FIELD_display method in django.db.models looks like this:
def _get_FIELD_display(self, field):
    value = getattr(self, field.attname)
    return force_unicode(dict(field.flatchoices).get(value, value), strings_only=True)
If I put a breakpoint into the code, and then run ipdb I can see that I have the issue:
ipdb> thing.get_role_display()
u'1'
ipdb> thing._get_FIELD_display(thing._meta.get_field('role'))
u'1'
So, the fix hasn't changed anything. If I then try running through the _get_FIELD_display method code by hand, I get this:
ipdb> fld = thing._meta.get_field('role')
ipdb> fld.flatchoices
[(0, 'Monster'), (1, 'Mummy')]
ipdb> getattr(thing, fld.attname)
u'1'
ipdb> value = getattr(thing, fld.attname)
ipdb> dict(fld.flatchoices).get(value, value)
u'1'
Which is equivalent to saying:
ipdb> {0: 'Monster', 1: 'Mummy'}.get(u'1', u'1')
u'1'
So. The problem we have is that the method is using the string value u'1' to look up the corresponding description in the choices dictionary, but the dictionary keys are integers, and not strings. Hence we never get a match, but instead the default value, which is set to the existing value (the string).
If I manually force the cast to int, the code works as expected:
ipdb> dict(fld.flatchoices).get(int(value), value)
'Mummy'
ipdb> print 'w00t'
This is all great, but doesn't answer my original question as to why the get_foo_display method does return the right value most of the time. At some point the string (u'1') must be cast to the correct data type (1).
[UPDATE 3] The answer
Whilst an honourable mention must go to Joao for his insight, the bounty is going to Josh for pointing out the blunt fact that I am passing in the wrong value to begin with. I put this down to being an emigre from 'strongly-typed-world', where these things can't happen!
The code that I didn't include here is that the object is initialised from a django form, using the cleaned_data from a ChoiceField. The problem with this is that the output from a ChoiceField is a string, not an integer. The bit I missed is that in a loosely-typed language it is possible to set an integer property with a string, and for nothing bad to happen.
Having now looked into this, I see that I should have used the TypedChoiceField, to ensure that the output from cleaned_data is always an integer.
Thank you all.
I'm really sorry if this sounds condescending, but are you 100% sure that you're setting the value to the integer 1 and not the string '1'?
I've gone diving through the internals and running some tests and the only way that the issue you're experiencing makes sense is if you're setting the value to a string. See my simple test here:
>>> from flogger.models import TestUser
>>> t = TestUser()
>>> t.status = 1
>>> t.get_status_display()
u'Admin'
>>> t.status = '1'
>>> t.get_status_display()
u'1'
Examine your view code, or whatever code is actually setting the value, and examine the output of the field directly.
As you pasted from the internal model code:
def _get_FIELD_display(self, field):
    value = getattr(self, field.attname)
    return force_unicode(dict(field.flatchoices).get(value, value), strings_only=True)
It simply gets the current value of the field, and indexes into the dictionary, and returns the value of the attribute if a lookup isn't found.
I'm guessing there were no errors previously, because the value is coerced into an integer before being inserted into the database.
Edit:
Regarding your update mentioning the type system of python. Firstly, you should be using TypedChoiceField to ensure the form verifies the type that you expect. Secondly, python is a strongly typed language, but the IntegerField does its own coercing with int() when preparing for the database.
Variables are not typed, but the values within them are. I was actually surprised that the IntegerField was coercing the string to an int as well. Good lesson to learn here: check the basics first!
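The lookup behaviour discussed in this thread can be reproduced with a plain dictionary and no Django at all. The two functions below are simplified stand-ins written for illustration, not Django's actual implementation: display mimics the dict(...).get(value, value) step, and display_coerced shows the effect of casting first, which is roughly what a TypedChoiceField with coerce=int would guarantee upstream.

```python
ROLE_CHOICES = ((0, "Monster"), (1, "Mummy"))


def display(value, choices=ROLE_CHOICES):
    # Falls back to the raw value when the key is not found
    return dict(choices).get(value, value)


def display_coerced(value, choices=ROLE_CHOICES):
    # Cast to int before the lookup so "1" and 1 hit the same key
    return dict(choices).get(int(value), value)


print(display(1))            # integer key matches
print(display("1"))          # string key misses, raw value comes back
print(display_coerced("1"))  # coerced key matches
```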
I haven't tried your code or the @like-it answer, sorry, but _get_FIELD_display from models.Model is curried in the fields to set the get_FOO_display function, so that's probably why you're getting that output.
Try calling _get_FIELD_display directly:
logging.info('hello %s', thing._get_FIELD_display(thing._meta.get_field('role')))
Try this:
class Thing(models.Model):
    THING_ROLE_MONSTER = 0
    THING_ROLE_MUMMY = 1
    ROLE_CHOICES = (
        (THING_ROLE_MONSTER, u'Monster'),
        (THING_ROLE_MUMMY, u'Mummy'),
    )
    role = models.IntegerField('Role', default=0, choices=ROLE_CHOICES)

how to write a query to get find value in a json field in django

I have a json field in my database which is like
jsonfield = {'username':'chingo','reputation':'5'}
How can I write a query so that I can find out whether a username exists? Something like:
username = 'chingo'
query = User.objects.get(jsonfield['username']=username)
I know the above query is wrong, but I wanted to know if there is a way to access it.
If you are using the django-jsonfield package, then this is simple. Say you have a model like this:
from jsonfield import JSONField
class User(models.Model):
jsonfield = JSONField()
Then to search for records with a specific username, you can just do this:
User.objects.get(jsonfield__contains={'username':username})
Since Django 1.9, you have been able to use PostgreSQL's native JSONField. This makes searching JSON very simple. In your example, this query would work:
User.objects.get(jsonfield__username='chingo')
If you have an older version of Django, or you are using the Django JSONField library for compatibility with MySQL or something similar, you can still perform your query.
In the latter situation, jsonfield will be stored as a text field and mapped to a dict when brought into Django. In the database, your data will be stored like this
{"username":"chingo","reputation":"5"}
Therefore, you can simply search the text. Your query in this situation would be:
User.objects.get(jsonfield__contains='"username":"chingo"')
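One caveat with plain text matching is that the search string has to match the stored serialization character for character. The sketch below uses only the standard json module to illustrate this; how a particular JSONField library actually serializes its data is an assumption you would need to verify against your own table.

```python
import json

record = {"username": "chingo", "reputation": "5"}

# json.dumps puts a space after ':' and ',' by default
default_text = json.dumps(record)
# compact separators give the form shown above
compact_text = json.dumps(record, separators=(",", ":"))

needle = '"username":"chingo"'
print(needle in default_text)
print(needle in compact_text)
```

If the stored text uses the default separators, the needle would need the space as well ('"username": "chingo"').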
2019: As @freethebees points out, it's now as simple as:
User.objects.get(jsonfield__username='chingo')
But as the doc examples mention you can query deeply, and if the json is an array you can use an integer to index it:
https://docs.djangoproject.com/en/2.2/ref/contrib/postgres/fields/#querying-jsonfield
>>> Dog.objects.create(name='Rufus', data={
... 'breed': 'labrador',
... 'owner': {
... 'name': 'Bob',
... 'other_pets': [{
... 'name': 'Fishy',
... }],
... },
... })
>>> Dog.objects.create(name='Meg', data={'breed': 'collie', 'owner': None})
>>> Dog.objects.filter(data__breed='collie')
<QuerySet [<Dog: Meg>]>
>>> Dog.objects.filter(data__owner__name='Bob')
<QuerySet [<Dog: Rufus>]>
>>> Dog.objects.filter(data__owner__other_pets__0__name='Fishy')
<QuerySet [<Dog: Rufus>]>
Although this is for postgres, I believe it works the same in other DBs like MySQL
Postgres: https://docs.djangoproject.com/en/2.2/ref/contrib/postgres/fields/#querying-jsonfield
MySQL: https://django-mysql.readthedocs.io/en/latest/model_fields/json_field.html#querying-jsonfield
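To make the double-underscore path syntax concrete, here is a small pure-Python sketch of how such a path walks nested data. The lookup helper is hypothetical and is not how Django implements these queries (Django translates them to SQL); it only illustrates that string segments select dictionary keys and integer segments index into arrays.

```python
def lookup(data, path):
    """Follow a path such as 'owner__other_pets__0__name' through nested data."""
    current = data
    for part in path.split("__"):
        if isinstance(current, list):
            try:
                current = current[int(part)]  # integer segment indexes the array
            except (ValueError, IndexError):
                return None
        elif isinstance(current, dict):
            if part not in current:
                return None
            current = current[part]
        else:
            return None  # cannot descend into a scalar or None
    return current


rufus = {"breed": "labrador", "owner": {"name": "Bob", "other_pets": [{"name": "Fishy"}]}}
meg = {"breed": "collie", "owner": None}

print(lookup(rufus, "owner__other_pets__0__name"))
print(lookup(meg, "owner__name"))
```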
This usage is somewhat of an anti-pattern. Its performance will also be unpredictable, and it is likely to be error-prone.
As a rule, don't use a jsonfield when you need to look things up by its inner fields. Use what the RDBMS provides, or MongoDB (which internally operates on the faster BSON), as Daniel pointed out.
Because the JSON format is deterministic,
you could achieve it by using contains (regex has issues when dealing with multiple '\' and is even slower). I don't think it's good to use username in this way, so use name instead:
def make_cond(name, value):
    from django.utils import simplejson
    cond = simplejson.dumps({name:value})[1:-1] # remove '{' and '}'
    return ' ' + cond # avoid '\"'
User.objects.get(jsonfield__contains=make_cond(name, value))
It works as long as
the jsonfield using the same dump utility (the simplejson here)
name and value are not too special (I don't know of any edge case so far, maybe someone could point one out)
your jsonfield data is not corrupt (unlikely though)
Actually, I'm working on an editable jsonfield and thinking about whether to support such operations. The argument against, as said above, is that it feels like black magic.
If you use PostgreSQL, you can use raw SQL to solve the problem.
username = 'chingo'
SQL_QUERY = "SELECT true FROM you_table WHERE jsonfield::json->>'username' = '%s'"
User.objects.extra(where=[SQL_QUERY % username]).get()
where you_table is the name of the table in your database.
Any method that treats JSON as plain text looks like a very bad approach.
So I also think that you need a better database schema.
Here is the way I have found out that will solve your problem:
search_filter = '"username":"{0}"'.format(username)
query = User.objects.get(jsonfield__contains=search_filter)
Hope this helps.
You can't do that. Use normal database fields for structured data, not JSON blobs.
If you need to search on JSON data, consider using a NoSQL database like MongoDB.