I'm thinking of creating a log system for my Django web application. The web application is quite comprehensive in its use (it covers all aspects of a business's processes), so I'd like to track every event that happens. Specifically, I'd like to log every view that runs, not just the "main" ones, and potentially log what is happening within the view as it's executed.
While I'm in the "idea" stage of the logging system, I've quickly hit a few questions that leave me unsure how to proceed. Here are the main questions I have:
I'm thinking of logging all of the events in the same MySQL database where the main web app holds its data. The concern I have is bloating the MySQL database into a massive DB. Also, if the DB crashes or is destroyed somehow (yes, I have backups), I'll lose my log too, which blows away any ability to track down the problem. Do I use a separate DB or just go with text files?
How granular do I go? Initially I was thinking of simply logging things like "Date - In view myView". However, as I think about it, it would be nice to log all the stuff that happens within the view. Doing this could make the log massive, and it would also make my code ugly with so many log entry lines mixed into it. I mean this kind of detail:
Date - entered view myView
Date - in view myView, retrieved object myObject from the DB
Date - in view myView, setting myObject field myField to myNewValue
Date - leaving myView
Those are my main thoughts at this point. Any advice on this front?
Thanks
I think the best and most correct way is to create your own custom middleware, where you can log virtually everything you need.
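For example, a minimal old-style middleware along those lines might look like this (all names are mine, and you'd add it to MIDDLEWARE_CLASSES):

import time
import logging

logger = logging.getLogger('request_logger')

class RequestLoggingMiddleware(object):
    def process_view(self, request, view_func, view_args, view_kwargs):
        request._log_start = time.time()
        request._log_view = '%s.%s' % (view_func.__module__, view_func.__name__)
        return None  # let the view run normally

    def process_response(self, request, response):
        if hasattr(request, '_log_start'):
            elapsed_ms = int((time.time() - request._log_start) * 1000)
            logger.info('view %s %s %dms',
                        getattr(request, '_log_view', request.path),
                        response.status_code, elapsed_ms)
        return response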
Here are some links on the subject:
middleware snippets
http://djangosnippets.org/snippets/2624/
http://djangosnippets.org/snippets/290/
http://djangosnippets.org/snippets/264/
django-logging-middleware (pretty old, but may give you an idea)
django-request
django.db.backends logging (see the settings sketch after this list)
Is there a Django middleware/plugin that logs all my requests in an organized fashion?
Django verbose request logging
log all sql queries
django orm, how to view (or log) the executed query?
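On the django.db.backends point, you can switch on SQL query logging purely through settings (note that this logger only emits queries when DEBUG is True):

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {'class': 'logging.StreamHandler'},
    },
    'loggers': {
        'django.db.backends': {
            'handlers': ['console'],
            'level': 'DEBUG',
        },
    },
}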
Also, consider using Sentry, an error logging and aggregation platform, instead of writing logs into the database. FYI, see using a database for logging.
If you want to log every action taken in every view, you can, for example, replace "entered view A" and "exited view A" with a single line along these lines: view A - 147ms.
As alecxe stated, you can log requests/SQL, and there are plenty of ways to do it with middleware. As for database (object) actions, you can tie into individual saves, updates and deletes with signals.
For bulk updates and deletes, you could (it's not a clean way but it would work) monkey-patch manager and queryset methods to add logging.
This way you can log actions rather than SQL.
I would see lines like this:
[2013/09/11 15:11:12.0153] view app.module.view 200 148ms
[2013/09/11 15:11:12.0189] orm save:auth.User,id=1 3ms
This is a quick and dirty proposal, but maybe it's worth it.
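A sketch of the signal-based part (the logger name and line format are mine, matching the sample lines above):

import logging
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver

logger = logging.getLogger('orm_logger')

@receiver(post_save)  # no sender given, so this fires for every model
def log_save(sender, instance, created, **kwargs):
    logger.info('orm %s:%s.%s,id=%s', 'create' if created else 'save',
                sender._meta.app_label, sender._meta.object_name, instance.pk)

@receiver(post_delete)
def log_delete(sender, instance, **kwargs):
    logger.info('orm delete:%s.%s,id=%s',
                sender._meta.app_label, sender._meta.object_name, instance.pk)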
Related
I'm developing a small website with Flask and Flask-Login. I need an admin view to monitor all users' online status. I added an is_online column to the user DB collection and tried to update it, but I didn't find any callback to handle session expiry. How do I solve this, or is there a better idea?
Thanks!
FYI, Counting Online Users with Redis | Flask (A Python Microframework) - http://flask.pocoo.org/snippets/71/.
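The pattern in that snippet boils down to refreshing an expiring Redis key on every request. Here's a rough sketch of the same idea (not the snippet verbatim; it assumes a local Redis instance and that the user id is kept in the session):

import time
import redis
from flask import Flask, session

r = redis.StrictRedis()
app = Flask(__name__)
app.secret_key = 'change-me'  # required for sessions
ONLINE_WINDOW = 5 * 60  # seconds of inactivity before a user counts as offline

@app.before_request
def mark_online():
    user_id = session.get('user_id')  # assumes the user id is stored in the session
    if user_id is not None:
        # the key simply expires once the user goes idle
        r.setex('online:%s' % user_id, ONLINE_WINDOW, int(time.time()))

def online_user_ids():
    # KEYS is fine for a small site; prefer SCAN at any real scale
    return [key.split(':', 1)[1] for key in r.keys('online:*')]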
You could get away with checking whether the user's last activity time is older than the session lifetime.
If that's the case, you go on and update their is_online status.
The way I handled the problem in my application was this: since I had a last_activity field in the DB for each user (to log when they did what they did), I could check that value against the session lifetime.
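In code, that check might look like this (the last_activity field and the lifetime value are assumptions taken from the description above):

from datetime import datetime, timedelta

SESSION_LIFETIME = timedelta(minutes=30)  # match your PERMANENT_SESSION_LIFETIME

def is_online(user):
    # 'user' is assumed to have a last_activity datetime column
    if user.last_activity is None:
        return False
    return datetime.utcnow() - user.last_activity < SESSION_LIFETIME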
One very crude method is to create a separate stack that pushes and pops as users log in and out. Assuming that the session id and user id in your DB are not tied together (i.e., you have a separate session id and user id), you can maintain a ledger of sorts and push and pop as sessions are created and destroyed.
You will have to put special emphasis on users running multiple sessions on multiple devices... which is why I put in the caveat that this is a rather crude method.
I have a multi-database Django project (Python 2.7, Django 1.4), and I need the database used by every query triggered in a request to be selected through the URL of that request.
Ex:
url='db1.myproject.com'
Would trigger the use of 'db1' in every transaction, like this:
MyModel.objects.using('db1').all()
That is where my problem resides.
It's virtually impossible to include the '.using(dbname)' call in every use of 'Model.objects' (this project already has two dozen models, and it's built with scaling in mind, so it could have hundreds, and almost every model has, or might have, an implementation of post_save/post_delete that acts as a database trigger).
Is there a way of telling Django that it should change the default database for a given request?
I already have a custom QuerySet and a custom Manager, so if I could tell THEM which database to use, that could do the trick too (but I can't find anywhere how to get request data inside them).
Any help would be appreciated.
I think my answer to a similar question here will do what you need:
https://stackoverflow.com/a/21382032/202168
...although I'm a bit worried you may have chosen an unconventional or 'wrong' way of doing database load balancing; maybe this article can help with some better ideas:
http://engineering.hackerearth.com/2013/10/07/scaling-database-with-django-and-haproxy/
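I can't reproduce the linked answer here, but the usual pattern for per-request database selection is a thread-local that middleware writes and a database router reads. A rough sketch under that assumption (all names are mine; wire them up via MIDDLEWARE_CLASSES and DATABASE_ROUTERS):

import threading

_local = threading.local()

class UrlDatabaseMiddleware(object):
    def process_request(self, request):
        # 'db1.myproject.com' -> 'db1'; assumes the subdomain matches a DATABASES alias
        _local.db_alias = request.get_host().split('.')[0]

class UrlDatabaseRouter(object):
    def db_for_read(self, model, **hints):
        return getattr(_local, 'db_alias', None)  # None means "no opinion"

    def db_for_write(self, model, **hints):
        return getattr(_local, 'db_alias', None)

With the router in place, saves and deletes fired from post_save/post_delete handlers during the request hit the same database without any explicit .using() calls.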
I have a simple Celery task that writes some progress data to the database. I need to read this progress update in a Django view to pass it on to the user.
I used my own tables to write the progress and read it with AJAX polling from the client side. Now it's not working and I don't know the reason.
My database backend is PostgreSQL. I tried changing the transaction isolation level using the following (in the read view):
from django.db import connections

# 4 is psycopg2's ISOLATION_LEVEL_READ_UNCOMMITTED
connections['default'].connection.set_isolation_level(4)
I am not sure whether this changes the isolation level for a new connection to the database or for the one the current transaction is using, but it doesn't seem to work either way: no progress data can be read until the task has finished and the transaction is committed.
Here is the second method I tried.
I also found update_state. I write all the progress updates using update_state, but they don't seem to actually be written to the database. I run celerycam and configured Celery to send events with the -E argument.
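For what it's worth, update_state only persists anywhere if a result backend is configured (CELERY_RESULT_BACKEND); with one in place, the pattern looks roughly like this (process(), the broker/backend URLs and the URL wiring are all placeholders of mine):

# tasks.py
from celery import Celery, current_task

app = Celery('tasks', broker='redis://localhost', backend='redis://localhost')

@app.task
def long_job(items):
    total = len(items)
    for i, item in enumerate(items):
        process(item)  # hypothetical unit of work
        current_task.update_state(state='PROGRESS',
                                  meta={'current': i + 1, 'total': total})

# views.py: the view the AJAX poll hits
import json
from celery.result import AsyncResult
from django.http import HttpResponse

def task_progress(request, task_id):
    result = AsyncResult(task_id)
    data = result.info if result.state == 'PROGRESS' else {}
    return HttpResponse(json.dumps({'state': result.state, 'data': data}),
                        content_type='application/json')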
I want to know the proper way to update progress data and retrieve it.
Thank you.
After some Googling I found out that "READ UNCOMMITTED" is not implemented in PostgreSQL and most probably won't be implemented in the future.
I also found an extension that allows you to read dirty data; it's part of a larger project, but it forced me to use raw SQL to get the data I wanted.
I have an authentication backend based off a legacy database. When someone logs in using that database and there isn't a corresponding User record, I create one. What I'm wondering is if there is some way to alert the Django system to this fact, so that for example I can redirect the brand-new user to a different page.
The only thing I can think of is adding a flag to the users' profile record called something like is_new which is tested once and then set to False as soon as they're redirected.
Basically, I'm wondering if someone else has figured this out so I don't have to reinvent the wheel.
I found the easiest way to accomplish this is to do exactly as you've said. I had a similar requirement on one of my projects: we needed to show a "Don't forget to update your profile" message to any new member until they had visited their profile for the first time. The client wanted it quickly, so we added a 'visited_profile' field to the user's profile and defaulted it to False.
We settled on this because it was super fast to implement, didn't require tinkering with the registration process, worked with existing users, and didn't require extra queries on every page load (since the user and user profile are retrieved on every page already). It took us all of 10 minutes to add the field, run the South migration and put an if tag into the template.
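In code, that approach is tiny. A sketch (the field name comes from the answer above; everything else is illustrative, and it assumes AUTH_PROFILE_MODULE points at this model):

from django.db import models
from django.contrib.auth.models import User

class UserProfile(models.Model):
    user = models.OneToOneField(User)
    visited_profile = models.BooleanField(default=False)

# and in the template:
# {% if not user.get_profile.visited_profile %}Don't forget to update your profile!{% endif %}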
There are two methods that I know of to determine whether an object has been created:
1) When using get_or_create, a tuple is returned of the form (obj, created), where created is a boolean indicating, obviously enough, whether the object was created or not.
2) The post_save signal passes a created parameter, also a boolean, also indicating whether the object was created or not.
At the simplest level, you can use either of these two hooks to set a session variable that you can then check and redirect on accordingly.
If you can get by with it, you could also directly redirect either after calling get_or_create or in the post_save signal.
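For illustration, here's how option 1 might look inside a legacy-database auth backend; legacy_check() and the session flag are hypothetical names of mine:

from django.contrib.auth.models import User

class LegacyBackend(object):
    def authenticate(self, username=None, password=None):
        if not legacy_check(username, password):  # hypothetical check against the legacy DB
            return None
        user, created = User.objects.get_or_create(username=username)
        user.is_new = created  # plain attribute; the same instance reaches the login view
        return user

    def get_user(self, user_id):
        try:
            return User.objects.get(pk=user_id)
        except User.DoesNotExist:
            return None

# in the login view, after authenticate() succeeds:
#     if getattr(user, 'is_new', False):
#         request.session['is_new_user'] = True  # or redirect straight away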
You can use a file-based cache to store the users that aren't yet saved to the database. When the user logs in for the second time, you can look in the cache, find the user object, and save it to the database for good.
Here's some info on django caching: http://docs.djangoproject.com/en/dev/topics/cache/?from=olddocs
PS: don't use Memcached, because it will lose all its data if the machine crashes or shuts down.
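Concretely, the file-based variant might look like this (the key names, data and timeout are illustrative):

# settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/var/tmp/django_cache',  # any directory the app can write to
    }
}

# in the auth backend:
from django.core.cache import cache
cache.set('pending-user:%s' % username, legacy_user_data, 60 * 60 * 24 * 30)  # first login: stash for 30 days
pending = cache.get('pending-user:%s' % username)  # second login: fetch and save to the DB for good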
I'm building a small social site, and I want to implement an activity stream in a user's profile to display events like commenting, joining a group, posting something, and so on. Basically, I'm trying to make something similar to a Reddit profile that shows a range of user activity.
I'm not sure how I'd do this in Django, though. I thought of maybe making an "Activity" model that's OneToOne with the user's account and updating it through middleware.
Does anyone here have a suggestion, or a way I could actually implement this nicely?
You pretty much need to use an explicit Activity model, then create instances of those records in the view functions that perform the action.
I think you'll find that any other more automatic way of tracking activity would be too inflexible: it would record events at the wrong level of detail, and prevent you from describing events in a way that the user wants to see them.
In my opinion, you should do exactly what you're saying: create an Activity model with a ForeignKey to User, and populate it when the things you find 'interesting' happen.
This practice, even if redundant, will speed up your page generation; you can add a custom field to hold the text you want to display, and you can also keep track of what generated the Activity.
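A sketch of what that model might look like (the field names are mine):

from django.db import models
from django.contrib.auth.models import User

class Activity(models.Model):
    user = models.ForeignKey(User, related_name='activities')
    verb = models.CharField(max_length=50)  # e.g. 'commented', 'joined-group'
    description = models.TextField()        # the display text mentioned above
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        ordering = ['-created_at']

# recorded explicitly in the view that performs the action:
# Activity.objects.create(user=request.user, verb='joined-group',
#                         description='Joined %s' % group.name)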