Importing a CSV using celery - django

I want to let my users upload a CSV of contact data that will populate a model called contacts. I have used django-csv-importer and it seems to work OK. However, I would like to use something like Celery so that users can upload the file and not have to wait (at the moment an import can take 5 minutes).
Are there any projects that do what django-csv-importer does but with Celery integration? If so, could someone give me an example, or point me to a better way if there is one?
Many thanks.

Happily, I've worked with the author of django-csv-importer and can report that there's a newer version in the form of django-adaptors (https://github.com/anthony-tresontani/django-adaptors). It's the same project, renamed, so it might have some new features.
As for your specific question, joshua's answer is correct. But if you want a ridiculously rich implementation complete with audit trails, take a look at this: http://codeinthehole.com/writing/use-models-for-uploads/

In tasks.py:
from celery.task import task

@task
def import_csv(filename):
    # Runs in a Celery worker, so the request does not wait for the import
    my_csv_list = MyCsvModel.import_data(data=open(filename))
    ...
Then just call import_csv.delay(filename) in your view.
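To make the shape of the work inside such a task more concrete, here is a minimal sketch that uses only the standard library. It is not django-csv-importer's API: parse_contacts and the name/email columns are hypothetical, and import_csv is left as a plain function so the snippet runs without Celery installed.

```python
import csv
import io


def _clean(value):
    """Strip strings; leave None (missing column) or lists (extra columns) alone."""
    return value.strip() if isinstance(value, str) else value


def parse_contacts(csv_file):
    """Return one dict per non-blank CSV row, keyed by the header names."""
    contacts = []
    for row in csv.DictReader(csv_file):
        cleaned = {key: _clean(value) for key, value in row.items()}
        # Skip rows in which every column is empty.
        if any(cleaned.values()):
            contacts.append(cleaned)
    return contacts


def import_csv(filename):
    """The body a task would run: open the saved upload and parse it."""
    with open(filename, newline="") as handle:
        return parse_contacts(handle)


# Usage with an in-memory file standing in for an upload.
sample = io.StringIO("name,email\nAda,ada@example.com\n,\nBob,bob@example.com\n")
parsed = parse_contacts(sample)
```

In a Celery setup, import_csv would carry the task decorator and be queued with import_csv.delay(filename). Passing the path rather than an open file keeps the task argument serializable.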

How to invoke QueryPagination in dart? AWS Amplify

I'm using Amplify with Flutter. I want to write a query and limit the data being queried. I looked into the documentation (https://docs.amplify.aws/lib/datastore/data-access/q/platform/flutter#pagination) and found this code snippet:
List posts = await Amplify.DataStore.query(Post.classType,
    pagination: new QueryPagination(page:0, limit:100));
But unlike the snippet, I'm not able to invoke QueryPagination to set the page and limit attributes. I viewed the source of the pagination parameter of query and found that the QueryPagination class is defined, but I don't know how to invoke it.
Thank you for reading, please help me out
For anyone else having the same issue, this should help: try importing it manually:
import 'package:amplify_datastore_plugin_interface/amplify_datastore_plugin_interface.dart';
I found the solution here: https://github.com/aws-amplify/amplify-flutter/issues/500
I experienced a similar issue.
Import either
import 'package:amplify_datastore/amplify_datastore.dart';
or
import 'package:amplify_datastore_plugin_interface/amplify_datastore_plugin_interface.dart';
If the issue still persists, restart your IDE. This was the solution that eventually worked for me.

django update_or_create(), see what got updated

My Django app uses update_or_create() to update a bunch of records. In some cases only a few records out of a very large set actually change, and it would be nice to know what got updated within those records. Is it possible to know what got updated (i.e. the fields whose values changed)? If not, does anyone have ideas for workarounds to achieve that?
This will be invoked from the shell, so ideally it would be nice to be prompted for confirmation just before a value is being changed within update_or_create(), but if not that, knowing what got changed will also help.
Update (more context): Thought I'd give more context here. The data in this Django app gets updated through various means (through users coming on the web site, through the admin page, through scripts (run from the shell) that populate data from a csv etc.). The above question is important mostly for the shell scripts that update data from csvs, hence a solution at the database/trigger/signal level may not be helpful here (I guess).
This is what I ended up doing:
for row in reader:
    school_obj0, created = Org.objects.get_or_create(school_id=row[0])
    if school_obj0.name != row[1]:
        print(school_obj0.name, '==>', row[1])
        confirmation = input('proceed? [y/n]: ')
        if confirmation == 'y':
            school_obj1, created = Org.objects.update_or_create(
                school_id=row[0], defaults={"name": row[1]})
Happy to know about improvements to this approach (please see the update in the question with more context)
"This will be invoked from the shell, so ideally it would be nice to be prompted for confirmation just before a value is being changed"
Unfortunately, databases don't work like that. It's the responsibility of the application to provide this functionality, and Django isn't an application. You can, however, use Django to write an application that provides it.
As for finding out whether an object was updated or created, that's what the return value gives you: a tuple (object, created) whose second value is a boolean, True if the object was created and False if an existing one was updated.
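Since the created flag does not say which fields changed, one workaround is to compare the stored object with the incoming values before calling update_or_create(). The sketch below is an illustration rather than a Django feature: changed_fields is a hypothetical helper, and a SimpleNamespace stands in for a model instance so the snippet runs without a database.

```python
from types import SimpleNamespace


def changed_fields(obj, new_values):
    """Return {field: (old, new)} for every field whose value would change."""
    changes = {}
    for field, new in new_values.items():
        old = getattr(obj, field)
        if old != new:
            changes[field] = (old, new)
    return changes


# Stand-in for a model instance; a real one would come from the ORM.
school = SimpleNamespace(school_id="42", name="Old Name", city="Springfield")
diff = changed_fields(school, {"name": "New Name", "city": "Springfield"})

for field, (old, new) in diff.items():
    print(f"{field}: {old!r} ==> {new!r}")
```

An empty result means the row can be skipped. A non-empty one can be printed for confirmation, and the same dict of new values can then be passed as defaults to update_or_create().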

Delete row from database when date passes

In a database, I have a field called date. Is there a way to delete a row when the date passes, so that it doesn't show up anymore? I've tried comparing it to today's date in the view, but this wouldn't happen every day, and people would still see it on the first page load. Any ideas?
Removing something from your database is not safe for many reasons, ranging from permissions to on_delete logic. If you are not sure that deletion is really required, just mark the row as active=false instead.
I would not recommend cron, since it is hard to maintain: you have to set up different tasks on different environments manually, copy these files somewhere into your VCS, and work with bash instead of Python.
Also, when talking about events, I would not recommend storing something like this in your database, since it is not controlled by VCS and is hard to maintain.
If your app is pretty simple, schedule is an option.
But if you are looking for some extra info, like:
What rows were deleted?
Were there any exceptions?
then you can move to the more complex Celery with Beat turned on. Extra dependencies (like Redis or RabbitMQ) are the main disadvantage.
Docs:
celery beat
Related:
How do I get a Cron like scheduler in Python?
I believe the best way would be to use a cron job, or to add a condition in the view so that it shows only rows on or after the said date.
I would recommend you use a MySQL event, since this will run constantly, unlike triggers, which are only fired on database operations. You want this to occur outside of anything happening in the application, just based on time, so a MySQL event will work for this scenario. See the full tutorial here: http://www.sitepoint.com/working-with-mysql-events/
I had an easier approach; I guess you could call it "hard-coded". I made a function called deleteevent, which had the following code:
from datetime import date, timedelta

def deleteevent():
    yesterday = date.today() - timedelta(1)
    if Events.objects.filter(event_date=yesterday).count():
        Events.objects.filter(event_date=yesterday).delete()
Then, in every other function I had, I called this at the beginning, so the event would be deleted before the page loaded.
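One caveat with matching only yesterday's date is that rows survive if the function does not run on a given day. A comparison against "before today" catches them regardless. The sketch below shows both ideas from this thread, hiding past rows in the query and purging them, using the standard-library sqlite3 module as a stand-in for the ORM. The events table, its columns, and the function names are assumptions made for illustration.

```python
import sqlite3
from datetime import date


def visible_events(conn, today):
    """Return events dated today or later; past rows are simply not selected."""
    rows = conn.execute(
        "SELECT name, event_date FROM events "
        "WHERE event_date >= ? ORDER BY event_date",
        (today.isoformat(),),
    )
    return rows.fetchall()


def purge_past_events(conn, today):
    """Delete every event dated before today and return how many were removed."""
    cursor = conn.execute(
        "DELETE FROM events WHERE event_date < ?", (today.isoformat(),)
    )
    conn.commit()
    return cursor.rowcount


# In-memory database with ISO-formatted dates, which sort correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, event_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("old", "2024-01-01"), ("older", "2023-12-25"), ("upcoming", "2024-01-10")],
)
today = date(2024, 1, 5)
shown = visible_events(conn, today)
removed = purge_past_events(conn, today)
```

With the Django ORM, the matching lookups would be event_date__gte and event_date__lt on the queryset filter. Filtering in the view means visitors never see a stale row, even if the purge runs late.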

Is there a signal or anything similar to a "pre_select" in django?

I'm creating a system in django and it'd be really helpful to have a signal that is called every time a SQL "select" query is done on the database. In other words, does anyone know if there is something like a "pre_select" or "post_select" signal method?
I found the signal "connection_created" in the Django docs, but couldn't find any clues on how to use it, much less on accessing the model that triggered it. The official documentation just says that it exists but doesn't give a simple usage example... =/
EDIT:
connection_created only fires when the connection is created (as its name says), so I'm still without a solution =/.
An example of what I want would be the execution of this queries on distinct objects:
ExampleObject1.objects.filter(attribute=somevalue)
ExampleObject2.objects.filter(attribute=somevalue)
ExampleObject3.objects.filter(attribute=somevalue)
So a function would be called, receiving the data from each of them just before each query is sent to the database, in order to process the data, log it, etc.
I imagine some functionality like that exists in Django, because the Django logging system appears to use something similar.
Any help is welcome. Thanks in advance!
From http://dabapps.com/blog/logging-sql-queries-django-13/
It's not in the form of a signal, but it allows you to track all queries. Tracking specific selects should be doable by providing customized log handlers.
import logging
l = logging.getLogger('django.db.backends')
l.setLevel(logging.DEBUG)
l.addHandler(logging.StreamHandler())
#make your queries now...
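A customized handler of the kind mentioned above could look like the following sketch. SelectQueryHandler is a hypothetical name. The sketch relies on Django attaching sql and duration attributes to records on the 'django.db.backends' logger, which it only emits when DEBUG is True. The two logger.debug calls at the end simulate such records so the snippet runs without Django.

```python
import logging


class SelectQueryHandler(logging.Handler):
    """Collect SELECT statements logged on 'django.db.backends'."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.selects = []

    def emit(self, record):
        # Prefer the raw SQL attached to the record; fall back to the message.
        sql = getattr(record, "sql", None) or record.getMessage()
        if sql.lstrip().upper().startswith("SELECT"):
            self.selects.append((sql, getattr(record, "duration", None)))


logger = logging.getLogger("django.db.backends")
logger.setLevel(logging.DEBUG)
handler = SelectQueryHandler()
logger.addHandler(handler)

# Simulated records, shaped like the ones Django emits for each query.
logger.debug(
    "(0.001) SELECT 1; args=()",
    extra={"sql": "SELECT 1", "duration": 0.001},
)
logger.debug(
    "(0.002) UPDATE t SET x = 1; args=()",
    extra={"sql": "UPDATE t SET x = 1", "duration": 0.002},
)
```

After this runs, handler.selects holds only the SELECT entry. Note that the handler sees a query after it has executed, so this is closer to a "post_select" hook than a "pre_select" one.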

How to time Django queries

I've always used Python's timeit library to time my little Python programs.
Now I'm developing a Django app and I was wondering how to time my Django functions, especially queries.
For example, I have a def index(request) in my views.py which does a bunch of stuff when I load the index page.
How can I use timeit to time this particular function without altering my existing functions too much?
If your Django project has DEBUG enabled, you can see your database queries (and their times) using:
>>> from django.db import connection
>>> connection.queries
I know this won't satisfy your need to profile functions, but hope it helps for the queries part!
The debug toolbar is what you want; it helps you time each of your queries.
Alternatively, this snippet works too:
http://djangosnippets.org/snippets/93/
The best way is to use the Debug Toolbar. You also get some additional functionality for query optimization, which will help you optimize your DB queries.
Here is another solution: you can use connection.queries. It lists the SQL statements executed so far, each with its execution time. After reading the time of the previous query, you can clear the list with reset_queries(). Calling reset_queries() is not mandatory.
Suppose you have a Model named Device. You can measure the query time like this:
>>> from django.db import connection, reset_queries
>>> from appname.models import Device
>>> devices = list(Device.objects.all())  # list() forces the lazy queryset to run
>>> connection.queries
>>> reset_queries()
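Each entry in connection.queries is a dict with "sql" and "time" keys, where the time is a string in seconds. A small helper can total them, as in this sketch. summarize_queries is a hypothetical name and the sample data is invented to mirror that shape, so the snippet runs outside a Django shell.

```python
def summarize_queries(queries):
    """Return (count, total_seconds, slowest_entry) for connection.queries-style data."""
    total = sum(float(q["time"]) for q in queries)
    slowest = max(queries, key=lambda q: float(q["time"]), default=None)
    return len(queries), total, slowest


# Sample data shaped like connection.queries entries.
sample_queries = [
    {"sql": "SELECT * FROM app_device", "time": "0.004"},
    {"sql": "SELECT COUNT(*) FROM app_device", "time": "0.001"},
]
count, total, slowest = summarize_queries(sample_queries)
```

In a shell session, the call would be summarize_queries(connection.queries) after running the code being measured.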
Anyone stumbling onto this: check out Sentry's approach.
https://github.com/getsentry/sentry-python/blob/master/sentry_sdk/integrations/django/__init__.py#L476
You can replace execute and executemany with your own functions that track the time it takes for execute to return.
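The idea of swapping in timed versions of execute and executemany can be sketched with the standard-library sqlite3 module, which accepts a cursor subclass through the factory argument. This is neither Sentry's code nor Django's cursor wrapper. TimingCursor and its shared timings list are assumptions made for illustration.

```python
import sqlite3
import time


class TimingCursor(sqlite3.Cursor):
    """Cursor whose execute/executemany record how long each call took."""

    timings = []  # shared across cursors: (sql, seconds) tuples

    def execute(self, sql, parameters=()):
        start = time.perf_counter()
        try:
            return super().execute(sql, parameters)
        finally:
            # Recorded even if the statement raises.
            self.timings.append((sql, time.perf_counter() - start))

    def executemany(self, sql, seq_of_parameters):
        start = time.perf_counter()
        try:
            return super().executemany(sql, seq_of_parameters)
        finally:
            self.timings.append((sql, time.perf_counter() - start))


conn = sqlite3.connect(":memory:")
cur = conn.cursor(factory=TimingCursor)
cur.execute("CREATE TABLE device (name TEXT)")
cur.executemany("INSERT INTO device VALUES (?)", [("a",), ("b",)])
cur.execute("SELECT name FROM device ORDER BY name")
rows = cur.fetchall()
```

Within Django itself (2.0 and later), connection.execute_wrapper() is the documented hook for installing a wrapper around query execution, which avoids patching cursor methods by hand.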
A simple approach is to create a custom context manager that starts a timer and, on exit, appends the elapsed time to a list you pass to it.
Then you can just check the list.
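A minimal sketch of such a context manager, assuming a plain list as the results container and a label argument (both hypothetical) to tell the measurements apart:

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(results, label):
    """Append (label, elapsed_seconds) to results when the block exits."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        results.append((label, time.perf_counter() - start))


timings = []
with timed(timings, "sum"):
    total = sum(range(100_000))
with timed(timings, "sleep"):
    time.sleep(0.01)

for label, seconds in timings:
    print(f"{label}: {seconds:.4f}s")
```

Inside a view, the with block would wrap the queryset evaluation or any other section of interest, which leaves the function's signature and return value untouched.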