Why is list(Model.objects.all()) 10x slower against an Oracle database as compared to Postgres? - django

On a Django application at my job, we develop locally using Postgres, but our dev/test/prod servers all use Oracle.
Using essentially the same data (loaded via fixtures), this command:
list(Person.objects.all())
runs about 10x slower against Oracle databases than against Postgres. I checked django.db.connection.queries, and the time it takes to actually query the Oracle database does not account for the difference.
Is this cx_Oracle? Is it avoidable? How do I track down this problem? I'm willing to hand-write whatever SQL it takes to get this up to a reasonable speed. It's taking something like 5 seconds vs. 0.5 seconds.
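One way to see where the time goes is to compare wall-clock time with the time the database itself reports, then profile the remainder. A rough sketch (Person is the model from the question, the app name is assumed, and connection.queries is only populated when DEBUG=True):

import cProfile
import time

from django.db import connection, reset_queries

from myapp.models import Person  # app name assumed; Person is from the question

reset_queries()
start = time.time()
people = list(Person.objects.all())
wall = time.time() - start

# Time the database itself reports spending (requires DEBUG=True in settings).
db_time = sum(float(q["time"]) for q in connection.queries)
print(f"rows={len(people)} wall={wall:.2f}s db={db_time:.2f}s overhead={wall - db_time:.2f}s")

# If the overhead dominates, profile to see whether the time is spent in the
# Oracle backend / cx_Oracle row fetching and type conversion rather than in the query.
cProfile.run("list(Person.objects.all())", sort="cumulative")

If most of the extra time shows up in the backend's row-fetching and conversion code rather than in the query itself, that points at the driver layer rather than the SQL.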

Related

RDS slow queries after a few weeks

I had a problem with very slow queries on an Aurora RDS instance. In my local XAMPP setup the same queries took a few seconds, but on RDS some of them took about 6 minutes. Trying to offer a fast solution for my customer, I migrated the database from Aurora RDS to a regular MariaDB RDS instance (probably overkill, but I needed to do something). After the migration, query times were similar to the local environment, acceptable times, but it has been three weeks since the migration and the queries are very slow again, about 6 minutes.
CPU usage rises to almost 50% and there is only one connection to the database, the one running the slow query.
I've checked and modified parameters in RDS but they had no visible effect on query speed.
I've also optimized the tables, and at first it makes a difference, but after a few minutes the queries are slow again.
I was thinking about a log table or something similar that could have grown too much over the three weeks and is therefore slowing my queries, but it's just an idea (a quick way to check this is sketched below). At this point I'm a bit lost.
I'm using a db.r3.large instance with MariaDB.
Tables involved in these slow queries range from 63 to 200K+ rows.
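One way to test the log-table hunch is to ask the server which tables are the largest. A rough sketch, using the pymysql driver as a stand-in client (connection details are placeholders; the SQL is plain information_schema and works on MariaDB/MySQL):

import pymysql  # assumed client library; any MySQL/MariaDB client works the same way

conn = pymysql.connect(host="my-rds-endpoint", user="admin", password="...", database="mydb")
with conn.cursor() as cur:
    # Largest tables by on-disk size; a runaway log table would show up here.
    cur.execute("""
        SELECT table_name,
               table_rows,
               ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
        FROM information_schema.tables
        WHERE table_schema = DATABASE()
        ORDER BY (data_length + index_length) DESC
        LIMIT 10
    """)
    for name, rows, size_mb in cur.fetchall():
        print(f"{name}: ~{rows} rows, {size_mb} MB")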

PostgreSQL skipping some/many write queries when used with Python crawlers

I am using Postgres paired with Django (Python), and the application crawls the internet for a specific kind of data. When the crawlers find anything of use, they write it to the database. The crawlers are fast and they hit the database through get_or_create (which in Django checks whether the data is already in the database and, if it is not, inserts a new row), so all the cores on my machine are pegged at almost 99%. In that situation, when I trace the process, Postgres is skipping the write for some or many instances. What can be the reason? Any recommendations in terms of a change in architecture?
I traced the crawler procedure manually: the process printed the data it found, but the data was not added to Postgres. That confirmed the data was getting lost.
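For what it's worth, a common way writes appear to vanish in this kind of setup: under heavy concurrency, two workers can both pass get_or_create's initial SELECT, one INSERT wins, and the other raises IntegrityError (assuming a unique constraint on the lookup field); if that exception is swallowed somewhere, the row looks "skipped". A hedged sketch of a more defensive write, with illustrative model and field names:

from django.db import IntegrityError, transaction

from myapp.models import Page  # illustrative model; assumes a unique constraint on "url"


def save_result(url, payload):
    try:
        with transaction.atomic():
            return Page.objects.get_or_create(url=url, defaults={"payload": payload})
    except IntegrityError:
        # Another crawler inserted the same row between the SELECT and the INSERT;
        # fetch the existing row instead of silently losing the write.
        return Page.objects.get(url=url), False

It is also worth checking the crawlers' own error handling and whether the writes happen inside a transaction that later rolls back.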

Collecting Relational Data and Adding to a Database Periodically with Python

I have a project that:
fetches data from active directory
fetches data from different services based on active directory data
aggregates data
about 50,000 rows have to be added to the database every 15 minutes
I'm using PostgreSQL as the database and Django as the ORM, but I'm not sure Django is the right tool for such a project. I have to drop and re-add 50,000 rows of data and I'm worried about performance.
Is there another way to do such a process?
50k rows/15m is nothing to worry about.
But I'd make sure to use bulk_create to avoid 50k round trips to the database, which might be a problem depending on your database networking setup.
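A minimal sketch of that suggestion, assuming a model named Record holds the aggregated rows (both names are illustrative):

from django.db import transaction

from myapp.models import Record  # illustrative model for the aggregated rows


def replace_batch(rows):
    # Drop the previous batch and insert the new one without 50k round trips.
    with transaction.atomic():
        Record.objects.all().delete()
        Record.objects.bulk_create(
            [Record(**row) for row in rows],  # rows: an iterable of dicts from the aggregation step
            batch_size=1000,                  # chunk the INSERTs so each statement stays reasonable
        )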
For sure there are other ways, if that's what you're asking. But the Django ORM is quite flexible overall, and if you write your queries carefully there will be no significant overhead. 50,000 rows in 15 minutes is not really that big. I am using the Django ORM with PostgreSQL to process millions of records a day.
You can write a custom Django management command for this purpose, then call it like:
python manage.py collectdata
Here is the documentation link
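A skeleton of such a command, with illustrative module and helper names (fetch_and_aggregate and replace_batch stand in for the AD/service calls and the delete-plus-bulk_create step sketched above):

# myapp/management/commands/collectdata.py
from django.core.management.base import BaseCommand

from myapp.pipeline import fetch_and_aggregate, replace_batch  # assumed helpers


class Command(BaseCommand):
    help = "Fetch, aggregate, and reload the 15-minute data batch."

    def handle(self, *args, **options):
        rows = fetch_and_aggregate()
        replace_batch(rows)
        self.stdout.write(self.style.SUCCESS(f"loaded {len(rows)} rows"))

The command can then be run from cron (or any scheduler) every 15 minutes.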

PostgreSQL in-memory database with Django

For performance reasons I would like to execute an optimization algorithm against an in-memory database in Django (I'm likely to execute a lot of queries). I know it's possible to use SQLite in memory (How to run Django's test database only in memory?), but I would rather use PostgreSQL because our prod database is a PostgreSQL one.
Does anyone know how to tell Django to create the PostgreSQL database in memory?
Thanks in advance
This is premature optimization. PostgreSQL is very, very fast even when running on a plain spinning disk, provided you use the right indexes. If you don't persist the data on disk, you are opening yourself up to a world of pain.
If, on the other hand, you want to speed up your tests by running an in-memory PostgreSQL database, you can try non-durability optimizations like these:
http://www.postgresql.org/docs/9.1/static/non-durability.html
The most drastic suggestion on that page is to use a ramdisk. Here's how to set one up. After following the OS/PostgreSQL steps, edit Django's settings.py and point Django at the tablespace you created there (sketched below).
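A sketch of the Django side, assuming a PostgreSQL tablespace named ramdisk_ts has already been created on the ramdisk (e.g. CREATE TABLESPACE ramdisk_ts LOCATION '/mnt/ramdisk'); the connection details are placeholders:

# settings.py
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "optimizer_db",
        "USER": "myuser",
        "PASSWORD": "...",
        "HOST": "localhost",
        "PORT": "5432",
    }
}

# Django assigns new tables/indexes to a tablespace through these settings
# (or per model via Meta.db_tablespace), not inside DATABASES itself.
DEFAULT_TABLESPACE = "ramdisk_ts"
DEFAULT_INDEX_TABLESPACE = "ramdisk_ts"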
Last but not least: This is just a complete waste of time.
This is not possible. You cannot use PostgreSQL exclusively in memory; at least not without defeating the purpose of using PostgreSQL over something else. An in-memory data store like Redis running alongside PostgreSQL is the best you can do.
Also, at this point, the configuration is far out of Django's hands. It will have to be done outside of Django's environment.
It appears you may be missing some understanding about how these systems work. Build your app first to verify that everything is working, then worry about optimizing the database in such ways.

Django-cms and SQLite performance

I have a corporate website built with django-cms and a SQLite database that gets only a few updates per month. The majority of operations are reads. It will see less than 5000k requests per day.
My deployment is on a cPanel server with Apache and WSGI. I first need to know whether I should be worried about using SQLite, and whether PostgreSQL in this situation would be faster and consume fewer resources (it is already installed and running on the server).
This site uses an 11 MB SQLite file. Is this file kept in memory?
SQLite will be faster than PostgreSQL except in cases of high contention. If you only update a few times per month, then SQLite will almost certainly be faster than PostgreSQL.
You may still choose to use PostgreSQL for other reasons (i.e. if you need network access to your data), but probably not for performance reasons.