I have a small Python script that pushes data to my Django project's Postgres database.
It imports the relevant model from the Django project and calls .save() to save the data to the db without issue.
Yesterday the system was running fine. I started and stopped both my django project and the python script many times over the course of the day, but never rebooted or powered off my computer, until the end of the day.
Today I have discovered that the data is no longer in the db!
This seems silly, as I have probably forgotten to do something obvious, but I thought that when save() is called on a model, the data is committed to the db.
So this answer covers where to start troubleshooting problems like this, since the question is quite vague and we don't have enough information to troubleshoot effectively.
If this ever happens again, the first thing to do is to turn on statement logging for PostgreSQL and look at the statements as they come in. This should show you BEGIN and COMMIT statements as well as the queries. It's virtually impossible to troubleshoot this sort of problem without access to the queries. Things to look for include missing COMMITs and missing statements.
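For example, statement logging can be enabled with a couple of postgresql.conf settings (a sketch; adjust the prefix to taste):

```ini
# postgresql.conf -- log every statement, including BEGIN/COMMIT
log_statement = 'all'
# include timestamp, process id, and transaction id in each log line
log_line_prefix = '%m [%p] xid=%x '
```

These take effect on a configuration reload (SELECT pg_reload_conf(); or pg_ctl reload); no restart is needed.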
After that, the next thing to do is to look at the circumstances under which your computer rebooted. Is it possible it did so before an expected commit? Or did it lose power and not have the transaction log flushed to disk in time?
Those two should rule out just about all possible causes on the db side in a development environment. In a production environment for old versions of PostgreSQL you do want to verify that the system has autovacuum running properly and that you aren't getting warnings about xid wraparound. In newer versions this is not a problem because PostgreSQL will refuse to accept queries when approaching xid wraparound.
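As an illustration of the "missing COMMIT" failure mode (using Python's built-in sqlite3 rather than PostgreSQL, but the principle is the same): data written inside a transaction that is never committed disappears when the connection closes.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Writer: inserts a row but never commits.
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE student (name TEXT)")
conn.commit()                      # the schema change IS committed
conn.execute("INSERT INTO student VALUES ('alice')")
conn.close()                       # pending transaction is rolled back

# Reader: the uncommitted row is gone.
conn = sqlite3.connect(path)
rows = conn.execute("SELECT * FROM student").fetchall()
print(rows)                        # []
```

Django normally runs in autocommit mode, so each save() is committed immediately; but anything that opens a transaction (for example a transaction.atomic block that never exits cleanly) can reproduce exactly this behavior.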
Related
I am running a local development server. I am working on the project until it is ready for deployment. I checked the postgres admin page and I noticed that I have a lot of transactions running in the background.
I am not using the site or making queries, so I am wondering what is causing this. (I am also the only user.)
Why is this?
You'll need to find out what is going on yourself.
First, you can try SELECT * FROM pg_stat_activity (see the PostgreSQL docs). It shows one row per server process, with the current or most recent statement executed by each connection.
With some luck, you'll find out what is going on.
If that is not enough, use pg_stat_statements.
It is a little more complicated to install (load it via shared_preload_libraries in postgresql.conf, then run CREATE EXTENSION pg_stat_statements), but you will not miss any of the queries.
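The setup, roughly (the preload line needs a full server restart, not just a reload):

```ini
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
```

After restarting, run CREATE EXTENSION pg_stat_statements; in the target database, then query the view, e.g. SELECT query, calls FROM pg_stat_statements ORDER BY calls DESC; (exact column names vary a little between PostgreSQL versions).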
I am having trouble understanding how to synchronise my development and production environments.
I have a production and development branch in git, with the production branch being of course what the server's copy is.
My sqlite database is currently under version control (which I now gather it shouldn't be; however, I am not sure how I would sync my copies of the project otherwise).
When I want to make a change, I commit and push the server's copy to production and then pull it down to my local machine. I then make a change (which can include database changes). But when it comes to getting those changes back into production, I am not sure how to get them onto my server without potentially overwriting changes that have occurred there since I started.
How can I handle local changes to the database when changes may also have occurred on the server in the meantime? I have been searching for a while and thought that South might address this kind of problem, but I gather it is an outdated solution.
Thanks for your help
Well, it's definitely the wrong way. You should never share a database between environments. Using the same database engine in production and development is a good approach, but that doesn't mean you need to share the DB itself, as you are doing with sqlite3.
Many developers use sqlite3 in development and another DB engine in production. This is acceptable but not recommended, because of differences between database engines.
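A common workflow sketch (the file and app names are placeholders; this assumes a standard Django project layout): keep the database file out of version control and let migrations carry the schema instead, with fixtures for any shared data.

```shell
# stop tracking the database file (keep migrations tracked!)
echo "db.sqlite3" >> .gitignore
git rm --cached db.sqlite3

# the schema travels as migration files
python manage.py makemigrations
git add myapp/migrations/ && git commit -m "Add migrations"

# on the other environment, rebuild the schema from migrations
python manage.py migrate

# if you need sample data too, move it with fixtures
python manage.py dumpdata myapp > fixtures.json   # on one machine
python manage.py loaddata fixtures.json           # on the other
```

This way both environments can regenerate an equivalent database from the code alone, and concurrent changes are merged through git like any other source file.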
I have a Django 1.9 website running on Ubuntu, and I very often face a strange issue: some error vanishes when I run a clone of the project locally on my PC at 127.0.0.1:8000. Locating the error in such cases is EXTREMELY time-consuming, and I wonder what the best practices are for debugging a large project, especially when the website is already partially in use.
Just to be as specific as possible, I provide a step-by-step description of what goes wrong.
Step 1. I open some URL, say, 10.8.0.1:8000/show_students/
Step 2. Do some action on the page, say, save a student profile. The operation does not end successfully, yielding an error.
Step 3. I copy-paste the project directory located on the remote server onto a local directory on my PC and try to run the CLONE. I see that the error does not take place.
Real-life example:
task_email_recipients = TaskEmailRecipients.objects.get(
    task_type=task_instance.type,
    legal_entity_own=legal_entity_own_instance,
)
This line throws an exception saying that LegalEntityOwn has no field named (yes, I did not omit anything; it really is an empty string after "field named").
If I run the same view from 127.0.0.1, the error goes away.
What should my next steps be?
BTW, I use Eclipse if this makes any difference. And I have MS Windows 10 on my local PC.
Summing up, my goal is to debug the project run from 10.8.0.1
UPDATE for Paul Becotte's comment
When running the project, it gives a warning that I have always ignored:
You have unapplied migrations; your app may not work properly until
they are applied. Run 'python manage.py migrate' to apply them.
So, let me explain a few concepts.
A. Source Control (Git) lets you keep track of all the changes to your source code. This is pretty important so that you can feel confident that you are running the same version of your code on your development machine as your deployed server without trying to do something like copy the files back and forth. A command like git status can show you if you changed something and maybe give you tips on what is different between the two environments. If you are not using git, you should immediately start!
B. Migrations are like source control for the schema of your database. A SQL database like MySQL or Postgres has a fixed schema: you have THIS many tables, with THESE names, and table A has three columns, one called Name and one called ID, and so forth. Migrations are designed to give you visibility into that schema. Instead of logging into the database and running CREATE TABLE A ..., you build a migration file that contains the necessary commands and stamps the database with a version number. Then you run those files, so that if two databases are on the same version, you know they have the same structure (which allows you to get your local database to match your deployed one). Django has a helpful migration system all built in; manage.py migrate applies all of the migration files to the current database. If you are getting the error message you listed, there is basically no chance that your app IS going to work properly, because your database schema, somewhere, is out of sync with your model files. Based on your very limited description, you added a field to a model that now exists in your local database but does not exist in your production database.
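Concretely, the usual cycle after changing a model looks like this (a sketch; app names are placeholders):

```shell
python manage.py makemigrations   # generate migration files from model changes
git add myapp/migrations/         # migrations are source code: commit them
python manage.py migrate          # apply them to the local database
python manage.py showmigrations   # [X] applied vs [ ] pending, per app
```

Running showmigrations on both machines is a quick way to see exactly where the two databases have diverged.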
C. I mentioned a deploy script- this is a single command you can run to get your code running on your remote server so that you are sure it happens the same way every time. In this case it might be something like...
ssh production
git pull
python manage.py migrate
uwsgi
Set up a script like that so that you know what is going on, and you can rule out accidentally skipping steps as an error vector.
I have a site running on an EC2 image that I have been updating for over a year. This week I have been busy building a new image to move to 64-bit instances. I've got everything installed and the code running, and I'm testing the site under the new setup. I started getting lots of weird problems and eventually realized they only happen when memcached is running.
Essentially, memcached is sending the wrong entries back. It works if I use other Django-supported caches, such as locmem:// or file://, but it fails with memcached. Most of it seems to work, but in a few specific places, even in the template cache tags, it returns not just the wrong values but entirely different types.
It could be a problem with the way memcached was installed. I presume 64-bit memcached needs to be installed differently than 32-bit.
The memcached Google Group might be a better place to ask.
I just finished a Django app that I want to get some outside user feedback on. I'd like to launch one version and then fork a private version so I can incorporate feedback and add more features. I'm planning to do lots of small iterations of this process. I'm new to web development; how do websites typically do this? Is it simply a matter of copying my Django project folder to another directory, launching the server there, and continuing my dev work in the original directory? Or would I want to use a version control system instead? My intuition is that it's the latter, but if so, it seems like a huge topic with many uses (e.g. collaboration, which doesn't apply here) and I don't really know where to start.
1) Separate URLs: www.yoursite.com vs. test.yoursite.com. You can also do www.yoursite.com and www.yoursite.com/development, etc., or create a /beta or /staging.
2) Keep separate databases: one for production and one for development. Write a script that copies your live database into a dev database, and keep one database for each type of site you create (you may want a beta or staging database for your testers). Do your own work in the dev database. If you change the database structure, save the changes as a .sql file that can be loaded and run on the live site's database when you push those changes live.
3) Merge features into your different sites with version control. I am currently playing with a Subversion setup for web apps that has a stable branch (trunk), one for staging, and one for development. Development tags and branches get merged into staging, and then staging tags/branches get merged into stable. Version control lets you manage your source code any way you want; you will have to find a methodology that works for you and use it.
4) Consider build automation, which can publish your site for you automatically. Take a look at http://ant.apache.org/. It can automate checking out your code and uploading it to each specific site as needed.
5) Toy of the month: there is a utility called curl that you may find valuable. It does a lot from the command line, and might suit you if you don't want to use all (or any) of Ant.
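For item 2, the copy script can be as simple as the following sketch (assuming a Postgres backend; the database names and host are placeholders):

```shell
# refresh the dev database from a snapshot of production
pg_dump -h prod-host -U app mysite_prod > snapshot.sql
dropdb mysite_dev && createdb mysite_dev
psql mysite_dev < snapshot.sql
```

Running it before each development cycle keeps the dev data realistic without ever touching the live database.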
Good luck!
You would typically use version control, and have two domains: your-site.com and test.your-site.com. Then your-site.com would always update to trunk which is the current latest, shipping version. You would do your development in a branch of trunk and test.your-site.com would update to that. Then you periodically merge changes from your development branch to trunk.
Jas Panesar has the best answer if you are asking this from a development standpoint, certainly. That is, if you're just asking how to easily keep your new developments separate from the site that is already running. However, if your question was actually asking how to run both versions simultaneously, then here's my two cents.
Your setup has a lot to do with this, but I always recommend running process-based web servers in the first place. That is, not to use threaded servers (less relevant to this question) and not embedding in the web server (that is, not using mod_python, which is the relevant part here). So, you have one or more processes getting HTTP requests from your web server (Apache, Nginx, Lighttpd, etc.). Now, when you want to try something out live, without affecting your normal running site, you can bring up a process serving requests that never gets the regular requests proxied to it like the others do. That is, normal users don't see it.
You can set up a subdomain that points to this process, and you can install middleware that redirects "special" users to the beta version. This lets you roll out new features to some users but not others.
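A minimal sketch of that idea as plain WSGI middleware (framework-agnostic; the cookie name and the two apps here are assumptions for illustration):

```python
def make_beta_router(stable_app, beta_app, cookie_name="beta"):
    """WSGI middleware: route requests carrying a beta cookie to beta_app."""
    def router(environ, start_response):
        cookies = environ.get("HTTP_COOKIE", "")
        app = beta_app if cookie_name + "=1" in cookies else stable_app
        return app(environ, start_response)
    return router

# Two trivial WSGI apps standing in for the stable and beta versions.
def stable(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"stable"]

def beta(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"beta"]

app = make_beta_router(stable, beta)

# Quick check without a real server:
def call(environ):
    body = app(environ, lambda status, headers: None)
    return b"".join(body)

print(call({"HTTP_COOKIE": ""}))        # b'stable'
print(call({"HTTP_COOKIE": "beta=1"}))  # b'beta'
```

In a real deployment the same routing decision would usually live in the front-end proxy (Nginx, Apache) rather than in Python, but the principle is identical.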
Now, the biggest issues come with database changes. Schema migration is a big deal and something most of us never pay attention to. I think that running side-by-side is great, because it forces you to do schema migrations correctly. That is, you can't just shut everything down and run lengthy schema changes before bringing it back up. You'd never see any remotely important site doing that.
The key is those small steps. You need to always have two versions of your code able to access the same database, so changes you make for the new code need to not break the old code. This breaks down into a few steps you can always make:
You can add a column with a default value, or that is optional. The new code can use it, and the old code can ignore it.
You can update the live version with code that knows to use a new column, at which point you can make it required.
You can make the new version ignore a column, and when it becomes the main version, you can delete that column.
You can make these small steps to migrate between any schemas. You can iteratively add a new column that replaces an old one, roll out the new code, and remove the old column, all without interrupting service.
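The add-a-column step can be sketched with sqlite3 from the standard library (a stand-in for whatever engine you run): the old code's query keeps working before and after the schema change, because the new column has a default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

def old_code(conn):
    # the old version only knows about 'name'
    return conn.execute("SELECT name FROM users").fetchall()

# step 1: add the column with a default -- old code is unaffected
conn.execute("ALTER TABLE users ADD COLUMN email TEXT DEFAULT ''")

def new_code(conn):
    # the new version reads the new column too
    return conn.execute("SELECT name, email FROM users").fetchall()

print(old_code(conn))   # [('alice',)]
print(new_code(conn))   # [('alice', '')]
```

Because both versions of the code can run against the same schema at every intermediate point, the cutover never requires downtime.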
That said, it's your first web app? You can probably break it. You probably have few users :-) But it is fantastic that you're even asking this question. Many "professionals" fail to ever ask it, and fewer still answer it.
What I do is export a copy of my SVN repository and put the files on the live production server, then keep a virtual machine with a development working copy and commit the changes to the repo when I'm done.