Bulk create model objects in django - django

I have a lot of objects to save in database, and so I want to create Model instances with that.
With django, I can create all the models instances, with MyModel(data), and then I want to save them all.
Currently, I have something like that:
for item in items:
object = MyModel(name=item.name)
object.save()
I'm wondering if I can save a list of objects directly, eg:
objects = []
for item in items:
objects.append(MyModel(name=item.name))
objects.save_all()
How to save all the objects in one transaction?

as of the django development, there exists bulk_create as an object manager method which takes as input an array of objects created using the class constructor. check out django docs

Use bulk_create() method. It's standard in Django now.
Example:
Entry.objects.bulk_create([
Entry(headline="Django 1.0 Released"),
Entry(headline="Django 1.1 Announced"),
Entry(headline="Breaking: Django is awesome")
])

worked for me to use manual transaction handling for the loop(postgres 9.1):
from django.db import transaction
with transaction.atomic():
for item in items:
MyModel.objects.create(name=item.name)
in fact it's not the same, as 'native' database bulk insert, but it allows you to avoid/descrease transport/orms operations/sql query analyse costs

name = request.data.get('name')
period = request.data.get('period')
email = request.data.get('email')
prefix = request.data.get('prefix')
bulk_number = int(request.data.get('bulk_number'))
bulk_list = list()
for _ in range(bulk_number):
code = code_prefix + uuid.uuid4().hex.upper()
bulk_list.append(
DjangoModel(name=name, code=code, period=period, user=email))
bulk_msj = DjangoModel.objects.bulk_create(bulk_list)

Here is how to bulk-create entities from column-separated file, leaving aside all unquoting and un-escaping routines:
SomeModel(Model):
#classmethod
def from_file(model, file_obj, headers, delimiter):
model.objects.bulk_create([
model(**dict(zip(headers, line.split(delimiter))))
for line in file_obj],
batch_size=None)

Using create will cause one query per new item. If you want to reduce the number of INSERT queries, you'll need to use something else.
I've had some success using the Bulk Insert snippet, even though the snippet is quite old.
Perhaps there are some changes required to get it working again.
http://djangosnippets.org/snippets/446/

Check out this blog post on the bulkops module.
On my django 1.3 app, I have experienced significant speedup.

bulk_create() method is one of the ways to insert multiple records in the database table. How the bulk_create()
**
Event.objects.bulk_create([
Event(event_name="Event WF -001",event_type = "sensor_value"),
Entry(event_name="Event WT -002", event_type = "geozone"),
Entry(event_name="Event WD -001", event_type = "outage") ])
**

for a single line implementation, you can use a lambda expression in a map
map(lambda x:MyModel.objects.get_or_create(name=x), items)
Here, lambda matches each item in items list to x and create a Database record if necessary.
Lambda Documentation

The easiest way is to use the create Manager method, which creates and saves the object in a single step.
for item in items:
MyModel.objects.create(name=item.name)

Related

Effecient Bulk Update of Model Records in Django

I'm building a Django app that will periodically take information from an external source, and use it to update model objects.
What I want to to be able to do is create a QuerySet which has all the objects which might match the final list. Then check which model objects need to be created, updated, and deleted. And then (ideally) perform the update in the fewest number of transactions. And without performing any unnecessary DB operations.
Using create_or_update gets me most of the way to what I want to do.
jobs = get_current_jobs(host, user)
for host, user, name, defaults in jobs:
obj, _ = Job.upate_or_create(host=host, user=user, name=name, defaults=defaults)
The problem with this approach is that it doesn't delete anything that no longer exists.
I could just delete everything up front, or do something dumb like
to_delete = set(Job.objects.filter(host=host, user=user)) - set(current)
(Which is an option) but I feel like there must already be an elegant solution that doesn't require either deleting everything, or reading everything into memory.
You should use Redis for storage and use this python package in your code. For example:
import redis
import requests
pool = redis.StrictRedis('localhost')
time_in_seconds = 3600 # the time period you want to keep your data
response = requests.get("url_to_ext_source")
pool.set("api_response", response.json(), ex=time_in_seconds)

dynamic condition for relationship(sqlalchemy) in Flask-Admin

I'm using sqlalchemy and have two models, Article and Tag, it's a many-to-many relation.
When I add articles using Flask-Admin, I want just part of tags available (related on user permission) instead of all tags.
any idea? Thanks
Probably the best way to do this is to use dynamic relationship loaders. Simply use lazy='dynamic' in your relationship definition:
posts = relationship(Post, lazy="dynamic")
This returns you a query object instead of a collection of objects, so you can then query it directly:
posts = jack.posts.filter(Post.headline=='this is a post')
You could also achieve what you want with discriminator columns or something, but that is likely overkill.
sounds like you need ModelView.get_query:
class MyView(ModelView):
def get_query(self,*args,**kwargs):
return super(MyView,self).get_query(*args,**kwargs).filter_by(current_user.can_view=True)

MongoEngine get_or_create Alternatives

I've been using the get_or_create method with MongoEngine in a Django app. Today, I noticed there were a few duplicate entries. I came across this in the MongoEngine API Reference for get_or_create:
This requires two separate operations and therefore a race condition exists. Because there are no transactions in mongoDB other approaches should be investigated, to ensure you don’t accidentally duplicate data when using this method. This is now scheduled to be removed before 1.0
Should I be using something like this?:
from models import Post
post = Post(name='hello')
try:
Posts.objects.get(name=post.name)
print "exists"
except:
post.save()
print "saved"
Will that solve my problem?
Is there a better way?
To perform an upsert use the following syntax:
Posts.objects(name="hello").update(set__X=Y, upsert=True)
That will add a post with the name "hello" and where X = Y if it doesn't already exist, otherwise it will update an existing post just setting X = Y.
If you need pass a dictionery, can do this:
post = Post.objects(name="hello").first() or Post(name="hello")
then you can update with something like this:
# data = dictionary_with_data
for field, value in data.items():
post[field] = value
post.save()

Django: How to access the model id's within an AJAX script?

I was wondering what is the correct approach,
Do I create HiddenInput fields in my ModelForm and from the
View I pass in the primaryKey for the models I am about to edit into
the hiddenInput fields and then grab those hiddenInput fields from
the AJAX script to use it like this?
item.load(
"/bookmark/save/" + hidden_input_field_1,
null,
function () {
$("#save-form").submit(bookmark_save);
}
);
Or is there is some more clever way of doing it and I have no idea?
Thanks
It depends upon how you want to implement.
The basic idea is to edit 1. you need to get the existing instance, 2. Save provided information into this object.
For #1 you can do it multiple ways, like passing ID or any other primary key like attribute in url like http://myserver/edit_object/1 , Or pass ID as hidden input then you have to do it through templates.
For #2, I think you would already know this. Do something like
inst = MyModel.objects.get(id=input_id) # input_id taken as per #1
myform = MyForm(request.POST, instance=inst)
if myform.is_valid():
saved_inst = myform.save()
I just asked in the django IRC room and it says:
since js isn't processed by the django template engine, this is not
possible.
Hence the id or the object passed in from django view can't be accessed within AJAX script.

Django: Querying comments based on object field

I've been using the built-in Django comments system which has been working great. On a particular page I need to list the latest X comments which I've just been fetching with:
latest_comments =
Comment.objects.filter(is_public=True, is_removed=False)
.order_by('submit_date').reverse()[:5]
However I've now introduced a Boolean field 'published' into the parent object of the comments, and I want to include that in the query above. I've tried using the content_type and object_pk fields but I'm not really getting anywhere. Normally you'd do something like:
Comment.objects.filter(blogPost__published=True)
But as it is not stored like that I am not sure how to proceed.
posts_ids = BlogPost.objects.filter(is_published=True).values_list('id', flat=True) #return [3,4,5,...]
ctype = ContentType.objects.get_for_model(BlogPost)
latest_comments = Comment.objects.filter(is_public=True, is_removed=False, content_type=ctype, content_object__in=posts_ids).order_by('-submit_date')[:5]
Comments use GenericForeignKey to store the relation to parent object. Because of the way generic relations work related lookups using __<field> syntax are not supported.
You can accomplish the desired behaviour using the 'in' lookup, however it'll require lot of comparisons when there'll be a lot of BlogPosts.
ids = BlogPost.objects.filter(published=True).values_list('id', flat=True) # Get list of ids, you would probably want to limit number of items returned here
content_type = ContentType.objects.get_for_model(BlogPost) # Becasue we filter only comments for BlogPost
latest_comments = Comment.objects.filter(content_type=content_type, object_pk__in=ids, is_public=True, is_removed=False, ).order_by('submit_date').reverse()[:5]
See the Comment model doc for the description of all fields.
You just cannot do that in one query. Comments use GenericForeignKey. Documentation says:
Due to the way GenericForeignKey is implemented, you cannot use such
fields directly with filters (filter() and exclude(), for example) via
the database API.