I'm trying to delete all occurrences of a certain key in a JSONField when a certain Key object is deleted.
What I've been trying is simply popping all occurrences of the given key from the JSON field. However, saving a JSONField with a popped key doesn't seem to work - the data isn't changed on the Element objects. Is there a way to do this?
class Element(models.Model):
    data = models.JSONField(default=dict, blank=True)

class Key(models.Model):
    [...]

    def delete(self, *args, **kwargs):
        to_update = Element.objects.filter(data__has_key=self.slug)
        for element in to_update:
            element.data.pop(self.slug)
        Element.objects.bulk_update(to_update, ["data"])
        super().delete(*args, **kwargs)
Edit: I just realized that this code actually seems to work - but just sometimes.
I have fields in multiple related models whose values are fully derived from other fields, both in the model being saved and in related models. I wanted to automate their maintenance so that their values are always current/valid, so I wrote a base class that each model inherits from. It overrides .save() and .delete().
It pretty much works, except when multiple updates are triggered via changes to a through model of an M:M relationship between models named Infusate and Tracer (the through model is named InfusateTracer). So for example, I have a test that creates 2 InfusateTracer records, which triggers updates to Infusate:
glu_t = Tracer.objects.create(compound=glu)
c16_t = Tracer.objects.create(compound=c16)
io = Infusate.objects.create(short_name="ti")
InfusateTracer.objects.create(infusate=io, tracer=glu_t, concentration=1.0)
InfusateTracer.objects.create(infusate=io, tracer=c16_t, concentration=2.0)
print(f"Name: {infusate.name}")
Infusate.objects.get(name="ti{C16:0-[5,6-13C5,17O1];glucose-[2,3-13C5,4-17O1]}")
The save() override looks like this:
def save(self, *args, **kwargs):
    # Set the changed value triggering this update so that the derived value
    # of the automatically updated field reflects the new values:
    super().save(*args, **kwargs)
    # Update the fields that change due to the above change (if any)
    self.update_decorated_fields()
    # Note, I cannot call save again because I get a duplicate exception, so
    # `update_decorated_fields` uses `setattr`:
    # super().save(*args, **kwargs)
    # Percolate changes up to the parents (if any)
    self.call_parent_updaters()
The automatically maintained field updates are performed here. Note that the fields to update, the function that generates their value, and the link to the parent are all maintained in a global returned by get_my_updaters(), whose values are populated by a decorator I wrote that is applied to the updating functions:
def update_decorated_fields(self):
    for updater_dict in self.get_my_updaters():
        update_fun = getattr(self, updater_dict["function"])
        update_fld = updater_dict["update_field"]
        if update_fld is not None:
            current_val = None
            # ... brevity edit
            new_val = update_fun()
            setattr(self, update_fld, new_val)
            print(f"Auto-updated {self.__class__.__name__}.{update_fld} using {update_fun.__qualname__} from [{current_val}] to [{new_val}]")
And in the test code example at the top of this post, where the InfusateTracer linking records are created, this is the method responsible for the updates that are not fully happening:
def call_parent_updaters(self):
    parents = []
    for updater_dict in self.get_my_updaters():
        update_fun = getattr(self, updater_dict["function"])
        parent_fld = updater_dict["parent_field"]
        # ... brevity edit
        if parent_inst is not None and parent_inst not in parents:
            parents.append(parent_inst)
    for parent_inst in parents:
        if isinstance(parent_inst, MaintainedModel):
            parent_inst.save()
        elif parent_inst.__class__.__name__ == "ManyRelatedManager":
            if parent_inst.count() > 0 and isinstance(
                parent_inst.first(), MaintainedModel
            ):
                for mm_parent_inst in parent_inst.all():
                    mm_parent_inst.save()
And here's the relevant ordered debug output:
Auto-updated Infusate.name using Infusate._name from [ti] to [ti{glucose-[2,3-13C5,4-17O1]}]
Auto-updated Infusate.name using Infusate._name from [ti{glucose-[2,3-13C5,4-17O1]}] to [ti{C16:0-[5,6-13C5,17O1];glucose-[2,3-13C5,4-17O1]}]
Name: ti{glucose-[2,3-13C5,4-17O1]}
DataRepo.models.infusate.Infusate.DoesNotExist: Infusate matching query does not exist.
Note that the output Name: ti{glucose-[2,3-13C5,4-17O1]} is incomplete (even though the debug output above it is complete: ti{C16:0-[5,6-13C5,17O1];glucose-[2,3-13C5,4-17O1]}). It contains the information resulting from the creation of the first through record:
InfusateTracer.objects.create(infusate=io, tracer=glu_t, concentration=1.0)
But the change from the subsequent through record, created by:
InfusateTracer.objects.create(infusate=io, tracer=c16_t, concentration=2.0)
...never makes it into the Infusate record's name field (even though all of the Auto-updated debug output is correct and is what I expected to see). The final name should be composed of values gathered from 7 different records, as shown in the last Auto-updated debug line (1 Infusate record, 2 Tracer records, and 4 TracerLabel records)...
Is this due to asynchronous execution or is this because I should be using something other than setattr to save the changes? I've tested this many times and the result is always the same.
Incidentally, I lobbied our team not to have these automatically maintained fields at all, because of their potential to become invalid from DB changes, but the lab people apparently like having them because that's how the suppliers name the compounds, and they want to be able to copy/paste them in searches, etc.
The problem here is a misconception about how changes are applied, when they are available for constructing the new derived field value, and when the super().save() method should be called.
Here, I am creating a record:
io = Infusate.objects.create(short_name="ti")
That is related to these 2 records (also being created):
glu_t = Tracer.objects.create(compound=glu)
c16_t = Tracer.objects.create(compound=c16)
Then, those records are linked together in a through model:
InfusateTracer.objects.create(infusate=io, tracer=glu_t, concentration=1.0)
InfusateTracer.objects.create(infusate=io, tracer=c16_t, concentration=2.0)
I had thought (incorrectly) that I had to call super().save() so that when the field values are gathered together to compose the name field, those incoming changes would be included in the name.
However, the self object is what is used to retrieve those values; it doesn't matter that they haven't been saved yet.
At this point, it's useful to include some of the gaps in the supplied code in the question. This is a portion of the Infusate model:
class Infusate(MaintainedModel):
    id = models.AutoField(primary_key=True)
    name = models.CharField(...)
    short_name = models.CharField(...)
    tracers = models.ManyToManyField(
        Tracer,
        through="InfusateTracer",
    )

    @field_updater_function(generation=0, update_field_name="name")
    def _name(self):
        if self.tracers is None or self.tracers.count() == 0:
            return self.short_name
        return (
            self.short_name
            + "{"
            + ";".join(sorted(map(lambda o: o._name(), self.tracers.all())))
            + "}"
        )
And this was an error I had inferred (incorrectly) to mean that the record had to have been saved before I could access the values:
ValueError: "<Infusate: >" needs to have a value for field "id" before this many-to-many relationship can be used.
when I had tried the following version of my save override:
def save(self, *args, **kwargs):
    self.update_decorated_fields()
    super().save(*args, **kwargs)
    self.call_parent_updaters()
But what this really meant was that I had to test something other than self.tracers is None to see whether any M:M links exist. We can simply check self.id: if it's None, we can infer that the M:M relationship can't be used yet. So the answer to this question is simply to edit the save method override to:
def save(self, *args, **kwargs):
    self.update_decorated_fields()
    super().save(*args, **kwargs)
    self.call_parent_updaters()
and edit the method that generates the value for the field update to:
@field_updater_function(generation=0, update_field_name="name")
def _name(self):
    if self.id is None or self.tracers is None or self.tracers.count() == 0:
        return self.short_name
    return (
        self.short_name
        + "{"
        + ";".join(sorted(map(lambda o: o._name(), self.tracers.all())))
        + "}"
    )
How does one create a DateField that automatically increments by 1 day, the way the pk field does?
For example, I would create a new object whose date is 16/04/2017, the next object's date would be 17/04/2017, even if they are both submitted on the same day.
How would I do this?
How about overriding the model's save method like this:
from datetime import datetime, timedelta

from django.db import models

class MyModel(models.Model):
    date = models.DateField()  # the below method will NOT work if auto_now/auto_now_add are set to True

    def save(self, *args, **kwargs):
        # count how many objects are already saved with a date on or after
        # the date of this current object
        date_gte_count = MyModel.objects.filter(date__gte=self.date).count()
        if date_gte_count:
            # there are some objects saved with the same or a greater date.
            # Increase the day by this number.
            self.date += timedelta(days=date_gte_count)
        # save object in db
        super().save(*args, **kwargs)
Of course, the above can also be implemented using Django signals - the pre_save one.
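A rough sketch of that signal-based variant (assuming the MyModel class defined above; the import path is a placeholder):

from datetime import timedelta

from django.db.models.signals import pre_save
from django.dispatch import receiver

from myapp.models import MyModel  # placeholder import path

@receiver(pre_save, sender=MyModel)
def bump_date(sender, instance, **kwargs):
    # Same idea as the save() override above, run just before every save:
    # push the date forward by the number of rows at or after the incoming date.
    date_gte_count = MyModel.objects.filter(date__gte=instance.date).count()
    if date_gte_count:
        instance.date += timedelta(days=date_gte_count)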
So I worked this out using parts of Nik_m's answer and some of my own knowledge.
I essentially made a while loop that keeps iterating and adding a day, as opposed to Nik_m's original answer, which doesn't work after the third object due to the lack of iteration.
def save(self, *args, **kwargs):
    same_date_obj = Challenge.objects.filter(date=self.date)
    if same_date_obj.exists():
        while True:
            if Challenge.objects.filter(date=self.date).exists():
                self.date += timedelta(days=1)
            else:
                break
    super().save(*args, **kwargs)
EDIT: This answer is no longer valid; it requires a while loop and thus an indefinite number of queries. Nik_m's modified answer is better.
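For reference, another way to avoid the repeated existence checks is a single aggregate lookup. This is just a sketch (not from either answer), reusing the MyModel name from the earlier answer:

from datetime import timedelta

from django.db.models import Max

def save(self, *args, **kwargs):
    # One query: find the latest date already stored; if the incoming date
    # doesn't go past it, move to the day after it.
    latest = MyModel.objects.aggregate(latest=Max('date'))['latest']
    if latest is not None and self.date <= latest:
        self.date = latest + timedelta(days=1)
    super().save(*args, **kwargs)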
Edit: These different types were just because of the Django method:
request.POST.get("attribute")
which, from the JSON data, returns unicode. The solution was to parse these values at the beginning.
I have got a big problem, and I don't understand where it comes from.
On my Score model, which saves scores for a game, I need to compare the current score with the old one before saving. My error is that the types of my fields are different even though my object types are identical.
Maybe some code could explain:
import sys

from django.core.exceptions import ObjectDoesNotExist
from django.db import models

class Score(models.Model):
    map = models.ForeignKey(Map, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    score = models.FloatField()

    class Meta:
        unique_together = ('map', 'user')

    def save(self, *args, **kwargs):
        try:
            oldScore = Score.objects.get(map=self.map, user=self.user)
        except ObjectDoesNotExist:
            oldScore = None
        if oldScore is not None:
            if oldScore.score < self.score:
                print >> sys.stderr, type(oldScore), type(self)
                print >> sys.stderr, type(oldScore.score), type(self.score)
                oldScore.delete()
            else:
                return False
        super(Score, self).save(*args, **kwargs)
        return True

    def __unicode__(self):
        return str(self.map) + ' - ' + self.user.username + " : " + str(self.score)
And here is how I create the score and save it:
score = Score(map=map, user=user, score=score)
saved = score.save()
The result of the debug prints:
<class 'main.models.score.Score'> <class 'main.models.score.Score'>
<type 'float'> <type 'unicode'>
I would like to compare my old score with the new one, but I can't because of these different types. I know I could do some type conversions, but I'd like to know why this is happening; maybe I have failed on something stupid :s
ps: I'm on Python 2.7 and Django 1.9.2
Thanks for helping me :)
That is some magic done by the model's metaclass. See, the model fields are defined as a Field class (or a child of it, e.g. FloatField). But when you work with an instance of the model, you don't want a FloatField in the .score property, you want the actual value there, right? And that is done by the ModelBase metaclass when the model's instance is created.
Now when you are saving the value, it's completely OK that the type of score is unicode - let's say you received the data via a form, and all the data you receive is unicode. The value is converted (and validated) when saving. Django looks at what kind of data is expected (float) and tries to convert the value. If that doesn't work, it will raise an exception; otherwise the converted value will be stored.
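For illustration, the conversion Django performs can be reproduced by hand with the field's to_python() method (this is just a sketch, not part of the original answer):

from django.db import models

raw = u"12.5"  # e.g. what request.POST.get() hands you on Python 2
value = models.FloatField().to_python(raw)  # converts (and validates) to a float
assert isinstance(value, float)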
So what you want to do in your save method is this:
def save(self, *args, **kwargs):
    if self.pk:  # the model has a non-empty primary key, so it's in the db already
        oldScore = Score.objects.get(pk=self.pk)
        if oldScore.score > float(self.score):
            # the old score was higher, don't do anything
            return False
    super(Score, self).save(*args, **kwargs)
    return True
OK, I need a little help here.
I have a model with a field slug = models.SlugField(unique=True), and I am trying to set this field on save() by appending 1 to the slug if the slug already exists (then 2, and so on).
I want to take race conditions into account.
def set_uniqslug(self, slug, i=0):
    new_slug = u"{}{}".format(slug, str(i) if i else '')
    try:
        with transaction.atomic():
            self.slug = slugify(new_slug.lower())
            self.save()
            return self
    except IntegrityError:
        i += 1
        return self.set_uniqslug(slug, i)
def save(self, *args, **kwargs):
    if not self.pk:
        self.set_uniqslug(self.name.lower())  # <--- but it does "save" above.
        # I want something like:
        # self.slug = self.get_uniqslug(self.name.lower())
    super(Company, self).save(*args, **kwargs)
My problem is that if I call set_uniqslug(), it needs to try to save just to find out whether there is an IntegrityError; in my code, it goes into an infinite loop.
How can I know, without saving, whether there would be an IntegrityError, and then just return the unique slug back to the save() method?
Update:
I tried this:
with transaction.atomic():
    if Company.objects.filter(slug=new_slug).exists():
        i += 1
        return self.set_uniqslug(slug, i)
    return new_slug
It is working, but locking around a READ action gives me a stomachache. Am I not blocking other queries, or doing some other bad stuff, by doing this?
Your check-and-set version will probably not work. That will depend on your database and its implementation of the transaction isolation levels; but taking PostgreSQL as an example, the default READ COMMITTED isolation level will not prevent another transaction from inserting a row with the same slug in between your check and set.
So use your original, optimistic locking idea. As Hugo Rodger-Brown pointed out, you can avoid the infinite loop by calling the superclass's save().
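A sketch of what that optimistic approach could look like (field names and max_length are assumptions based on the question; this is not the asker's or answerer's actual code):

from django.db import IntegrityError, models, transaction
from django.utils.text import slugify

class Company(models.Model):
    name = models.CharField(max_length=255)
    slug = models.SlugField(unique=True)

    def save(self, *args, **kwargs):
        if not self.pk and not self.slug:
            self.slug = slugify(self.name.lower())
        i = 0
        while True:
            try:
                # Attempt the insert; the unique constraint is the final arbiter.
                with transaction.atomic():
                    return super(Company, self).save(*args, **kwargs)
            except IntegrityError:
                # Another transaction grabbed this slug; try the next suffix.
                i += 1
                self.slug = slugify(u"{}{}".format(self.name.lower(), i))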
Finally, you might want to consider an alternative slug format. Many times the slug will incorporate the database id (similar to StackOverflow itself, actually), which eliminates the possibility of duplicate slugs.
I have a very large database (6 GB) that I would like to use Django REST Framework with. In particular, I have a model that has a ForeignKey to the django.contrib.auth.models.User table (not so big) and a ForeignKey to a BIG table (let's call it Products). The model can be seen below:
class ShoppingBag(models.Model):
    user = models.ForeignKey('auth.User', related_name='+')
    product = models.ForeignKey('myapp.Product', related_name='+')
    quantity = models.SmallIntegerField(default=1)
Again, there are 6GB of Products.
The serializer is as follows:
class ShoppingBagSerializer(serializers.ModelSerializer):
    product = serializers.RelatedField(many=False)
    user = serializers.RelatedField(many=False)

    class Meta:
        model = ShoppingBag
        fields = ('product', 'user', 'quantity')
So far this is great - I can do a GET on the list and on individual shopping bags, and everything is fine. For reference, the queries (using a query logger) look something like this:
SELECT * FROM myapp_product WHERE product_id=1254
SELECT * FROM auth_user WHERE user_id=12
SELECT * FROM myapp_product WHERE product_id=1404
SELECT * FROM auth_user WHERE user_id=12
...
For as many shopping bags are getting returned.
But I would like to be able to POST to create new shopping bags, but serializers.RelatedField is read-only. Let's make it read-write:
class ShoppingBagSerializer(serializers.ModelSerializer):
    product = serializers.PrimaryKeyRelatedField(many=False)
    user = serializers.PrimaryKeyRelatedField(many=False)
    ...
Now things get bad... GET requests to the list action take > 5 minutes and I noticed that my server's memory jumps up to ~6GB; why?! Well, back to the SQL queries and now I see:
SELECT * FROM myapp_products;
SELECT * FROM auth_user;
Ok, so that's not good. Clearly we're doing "prefetch related" or "select_related" or something like that in order to get access to all the products; but this table is HUGE.
Further inspection reveals where this happens on Line 68 of relations.py in DRF:
def initialize(self, parent, field_name):
    super(RelatedField, self).initialize(parent, field_name)
    if self.queryset is None and not self.read_only:
        manager = getattr(self.parent.opts.model, self.source or field_name)
        if hasattr(manager, 'related'):  # Forward
            self.queryset = manager.related.model._default_manager.all()
        else:  # Reverse
            self.queryset = manager.field.rel.to._default_manager.all()
If not readonly, self.queryset = ALL!!
So, I'm pretty sure this is where my problem is, and I need a way to say "don't select everything here", but I'm not 100% sure that this is the issue or where to deal with it. It seems like everything should be memory-safe with pagination, but this is simply not the case. I'd appreciate any advice.
In the end, we had to create our own PrimaryKeyRelatedField class to override the default behavior in Django REST Framework. Basically, we ensured that the queryset was empty until we wanted to look up the object, and then we performed the lookup. This was extremely annoying, and I hope the Django REST Framework folks take note of this!
Our final solution:
class ProductField(serializers.PrimaryKeyRelatedField):
    many = False

    def __init__(self, *args, **kwargs):
        kwargs['queryset'] = Product.objects.none()  # Hack to ensure ALL products are not loaded
        super(ProductField, self).__init__(*args, **kwargs)

    def field_to_native(self, obj, field_name):
        return unicode(obj)

    def from_native(self, data):
        """
        Perform query lookup here.
        """
        try:
            return Product.objects.get(pk=data)
        except Product.DoesNotExist:
            msg = self.error_messages['does_not_exist'] % smart_text(data)
            raise ValidationError(msg)
        except (TypeError, ValueError):
            msg = self.error_messages['incorrect_type'] % type(data)
            raise ValidationError(msg)
And then our serializer is as follows:
class ShoppingBagSerializer(serializers.ModelSerializer):
    product = ProductField()
    ...
This hack ensures the entire table isn't loaded into memory; instead, one-off selects are performed based on the incoming data. It's not as computationally efficient, but it also doesn't blast our server with 5-second database queries whose results get loaded into memory!