Postgres has a lot of activity going on despite me not doing anything - django

I am running a local development server. I am working on the project until it is ready for deployment. I checked the postgres admin page and I noticed that I have a lot of transactions running in the background.
I am not using the site/making queries, and am wondering what is causing this. (I am also the only user)
Why is this?

You'll need to find out what is going on yourself.
First, you can try SELECT * FROM pg_stat_activity (doc). It shows the last statement executed by each user.
With some luck, you'll find out what is going on.
If that is not enough, use pg_stat_statements.
It is a little bit more complicated to install (load in postgresql.conf then CREATE EXTENSION pg_stat_statements) but you will not miss any of the queries.

Related

How can I make a SQL application work offline?

I'm making a c++/Qt application. It connects to a small online database to find different information. I need to make sure that the application works offline. So I would like to do the following
On start up of application:
- Check if internet connection is available
- if available connect to online database, download database to local (for next time no internet is available)
- if not available connect to the kocally stored version of the database
My problem is I can't find a simple solution how to "download" the database. The user will not update the database, so there is no need for syncing when online again, just the ability to download the newest version of the database, whenever online. It is a MS SQL server that the application uses.
My only idea for a solution is to have an SQLite db in the application, and then write a script that clears the SQLite database and then puts everything from the online server into it, but this requires that I write a script that goes through all of the databse. There must be a better solution. I'm also not sure how this solution should work if the database structure changes. A solution for this could just be to send out a update for the application if the structure changes with a new SQLite db with the new structure.
I tried searching for a solution, but I could not find anything that are simple. Since I don't neew syncing back and forth, I thought there must be a simple solution. Any help pointing me in the right direction is appreciated.

Django + Ec2 -> Display "runserver" console output

First off, apologies since I am about 150% positive there are a million better ways to phrase this question but I'm hitting a mental block as to how to describe this.
Using: Django, AWS EC2, Apache
Currently, I have a persistent Django app running on an AWS EC2 instance that is automatically run via WSGI. I want to know if there is a way to view the output you would normally see when you run "python manage.py runserver" from terminal.
That said, it's probably infinitely better to just ask using an image.
Something like this. http://puu.sh/gM9Md/c953d65398.png (Link because I can't imbed images yet)
I feel like the answer is extremely simple and I'm simply overlooking it, but I've been trying to figure this out for a good week now. I honestly just think I suck at googling for the right term.
Thanks in advance.
You are looking for Apache access logs. The location of the log file changes from distribution to distribution, but assuming that you are using Ubuntu:
tail /var/log/apache2/access.log
If you want to watch it live, you can use:
tail -f /var/log/apache2/access.log
You can press Ctrl+C to stop.

postgres + GeoDajango - data doesn't seem to save

I have a small python script that pushes data to my django postgres db.
It imports the relevant model from a django project and uses the .save function to save the data to the db without issue.
Yesterday the system was running fine. I started and stopped both my django project and the python script many times over the course of the day, but never rebooted or powered off my computer, until the end of the day.
Today I have discovered that the data is no longer in the db!
This seems silly, as I probably forgotten to do something obvious, but I thought that when the save function is called from a model, the data is committed to the db.
So this answer is "where to start troubleshooting problems like this" since the question is quite vague and we don't have enough info to troubleshoot effectively.
If this ever happens again, the first thing to do is to turn on statement logging for PostgreSQL and look at the statements as they come in. This should show you begin and commit statements as well as the queries. It's virtually impossible to troubleshoot this sort of problem without access to the queries. Things to look for include missing COMMITs, and missing statements.
After that, the next thing to do is to look at the circumstances under which your computer rebooted. Is it possible it did so before an expected commit? Or did it lose power and not have the transaction log flushed to disk in time?
Those two should rule out just about all possible causes on the db side in a development environment. In a production environment for old versions of PostgreSQL you do want to verify that the system has autovacuum running properly and that you aren't getting warnings about xid wraparound. In newer versions this is not a problem because PostgreSQL will refuse to accept queries when approaching xid wraparound.

Rather than using crontab, can Django execute something automatically at a predefined time

How to make Django execute something automatically at a particular time.?
For example, my django application has to ftp upload to remote servers at pre defined times. The ftp server addresses, usernames, passwords, time, day and frequency has been defined in a django model.
I want to run a file upload automatically based on the values stored in the model.
One way to do is to write a python script and add it to the crontab. This script runs every minute and keeps an eye on the time values defined in the model.
Other thing that I can roughly think of is maybe django signals. I'm not sure if they can handle this issue. Is there a way to generate signals at predefined times (Haven't read indepth about them yet).
Just for the record - there is also celery which allows to schedule messages for the future dispatch. It's, however, a different beast than cron, as it requires/uses RabbitMQ and is meant for message queues.
I have been thinking about this recently and have found django-cron which seems as though it would do what you want.
Edit: Also if you are not specifically looking for Django based solution, I have recently used scheduler.py, which is a small single file script which works well and is simple to use.
I've had really good experiences with django-chronograph.
You need to set one crontab task: to call the chronograph python management command, which then runs other custom management commands, based on an admin-tweakable schedule
The problem you're describing is best solved using cron, not Django directly. Since it seems that you need to store data about your ftp uploads in your database (using Django to access it for logs or graphs or whatever), you can make a python script that uses Django which runs via cron.
James Bennett wrote a great article on how to do this which you can read in full here: http://www.b-list.org/weblog/2007/sep/22/standalone-django-scripts/
The main gist of it is that, you can write standalone django scripts that cron can launch and run periodically, and these scripts can fully utilize your Django database, models, and anything else they want to. This gives you the flexibility to run whatever code you need and populate your database, while not trying to make Django do something it wasn't meant to do (Django is a web framework, and is event-driven, not time-driven).
Best of luck!

How do I run one version of a web app while developing the next version?

I just finished a Django app that I want to get some outside user feedback on. I'd like to launch one version and then fork a private version so I can incorporate feedback and add more features. I'm planning to do lots of small iterations of this process. I'm new to web development; how do websites typically do this? Is it simply a matter of copying my Django project folder to another directory, launching the server there, and continuing my dev work in the original directory? Or would I want to use a version control system instead? My intuition is that it's the latter, but if so, it seems like a huge topic with many uses (e.g. collaboration, which doesn't apply here) and I don't really know where to start.
1) Seperate URLs www.yoursite.com vs test.yoursite.com. you can also do www.yoursite.com and www.yoursite.com/development, etc.. You could also create a /beta or /staging..
2) Keep seperate databases, one for production, and one for development. Write a script that will copy your live database into a dev database. Keep one database for each type of site you create. (You may want to create a beta or staging database for your tester).. Do your own work in the dev database. If you change the database structure, save the changes as a .sql file that can be loaded and run on the live site database when you turn those changes live.
3) Merge features into your different sites with version control. I am currently playing with a subversion setup for web apps that has my stable (trunk), one for staging, and one for development. Development tags + branches get merged into staging, and then staging tags/branches get merged into stable. Version control will let you manage your source code in any way you want. You will have to find a methodology that works for you and use it.
4) Consider build automation. It will publish your site for you automatically. Take a look at http://ant.apache.org/. It can drive a lot of automatically checking out your code and uploading it to each specific site as you might need.
5) Toy of the month: There is a utility called cUrl that you may find valuable. It does a lot from the command line. This might be okay for you to do in case you don't want to use all or any of Ant.
Good luck!
You would typically use version control, and have two domains: your-site.com and test.your-site.com. Then your-site.com would always update to trunk which is the current latest, shipping version. You would do your development in a branch of trunk and test.your-site.com would update to that. Then you periodically merge changes from your development branch to trunk.
Jas Panesar has the best answer if you are asking this from a development standpoint, certainly. That is, if you're just asking how to easily keep your new developments separate from the site that is already running. However, if your question was actually asking how to run both versions simultaniously, then here's my two cents.
Your setup has a lot to do with this, but I always recommend running process-based web servers in the first place. That is, not to use threaded servers (less relevant to this question) and not embedding in the web server (that is, not using mod_python, which is the relevant part here). So, you have one or more processes getting HTTP requests from your web server (Apache, Nginx, Lighttpd, etc.). Now, when you want to try something out live, without affecting your normal running site, you can bring up a process serving requests that never gets the regular requests proxied to it like the others do. That is, normal users don't see it.
You can setup a subdomain that points to this one, and you can install middleware that redirects "special" user to the beta version. This allows you to unroll new features to some users, but not others.
Now, the biggest issues come with database changes. Schema migration is a big deal and something most of us never pay attention to. I think that running side-by-side is great, because it forces you to do schema migrations correctly. That is, you can't just shut everything down and run lengthy schema changes before bringing it back up. You'd never see any remotely important site doing that.
The key is those small steps. You need to always have two versions of your code able to access the same database, so changes you make for the new code need to not break the old code. This breaks down into a few steps you can always make:
You can add a column with a default value, or that is optional. The new code can use it, and the old code can ignore it.
You can update the live version with code that knows to use a new column, at which point you can make it required.
You can make the new version ignore a column, and when it becomes the main version, you can delete that column.
You can make these small steps to migrate between any schemas. You can iteratively add a new column that replaces an old one, roll out the new code, and remove the old column, all without interrupting service.
That said, its your first web app? You can probably break it. You probably have few users :-) But, it is fantastic you're even asking this question. Many "professionals" fair to ever ask it, and even then fewer answer it.
What I do is have an export a copy of my SVN repository and put the files on the live production server, and then keep a virtual machine with a development working copy, and submit the changes to the repo when Im done.