sqlalchemy - data doesn't get pushed to the database on commit but is present in the session (in memory) - python-2.7

I am adding data through SQLAlchemy, but sometimes the data does not get updated or inserted in the database, even though the commit succeeds and I can see the data in the session's in-memory state, i.e.
session.identity_map
I am running SQLAlchemy 1.3.3 with Python 2.7 on Ubuntu 18.04.
from sqlalchemy.orm import Session
from . import Errors as ExecuteErrors

class Errors(object):
    def __init__(self, sqlalchemy_engine, d):
        self.sqlalchemy_engine = sqlalchemy_engine
        self.d = d

    def upsert(self, error):
        session = Session(self.sqlalchemy_engine)
        row = session.query(ExecuteErrors).filter_by(**{'c_name': error['c_name'], 'c_type': error['c_type'],
                                                         'f_name': error['f_name']}).scalar()
        session.close()
        if row:
            self.update(error)
        else:
            self.insert(error)

    def insert(self, error):
        e = ExecuteErrors(**{'c_name': error['c_name'], 'c_type': error['c_type'], 'f_name': error['f_name'],
                             'msg': error['msg'], 'details': error['details']})
        session = Session(self.sqlalchemy_engine, expire_on_commit=False)
        session.add(e)
        session.identity_map
        session.commit()
        session.close()

    def update(self, error):
        session = Session(self.sqlalchemy_engine, expire_on_commit=False)
        session.query(ExecuteErrors).filter_by(**{'c_name': error['c_name'], 'c_type': error['c_type'],
                                                  'f_name': error['f_name']}).update({'msg': error['msg'], 'details': error['details']})
        session.commit()
        session.close()

    def get_errors(self):
        session = Session(self.sqlalchemy_engine)
        e = session.query(ExecuteErrors).all()
        session.close()
        return e

    def clear(self):
        session = Session(self.sqlalchemy_engine)
        session.query(ExecuteErrors).delete()
        session.commit()
        session.close()
Calling this with:
e = Error(engine, 'emp')
e.upsert({'c_name':'filter','c_type':'task','f_name':'f1','msg':'TypeError','details':'xyz'})
This should add a row to the database or update an existing row with the new data.
It works for some inserts and not for others.

You could find a possible workaround by explicitly flushing your session when needed.
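For instance, a minimal sketch of that workaround applied to the insert method from the question (whether it helps depends on what is actually swallowing the write):

def insert(self, error):
    e = ExecuteErrors(**{'c_name': error['c_name'], 'c_type': error['c_type'], 'f_name': error['f_name'],
                         'msg': error['msg'], 'details': error['details']})
    session = Session(self.sqlalchemy_engine, expire_on_commit=False)
    try:
        session.add(e)
        session.flush()   # emit the INSERT explicitly; any database error surfaces here
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()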
That said, I think you should reconsider the way you're using sessions. Sessions are intended to manage database connections, but you're using them as if they were the actual connections.
IMHO, a better way would be to create a session at Error instantiation and use it when needed in all your methods.
An even better way would be to create a session at the beginning of your "calling" module and pass it to the Error instantiation, and to any other object which needs database access.
Doing this, you may even see better performance, and it may solve your problem.
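As a rough sketch of that approach (reusing the ExecuteErrors model from the question; the attribute names on the row object are assumed to match the model's columns):

class Errors(object):
    def __init__(self, session, d):
        self.session = session  # one session, owned by the calling module
        self.d = d

    def upsert(self, error):
        keys = {'c_name': error['c_name'], 'c_type': error['c_type'], 'f_name': error['f_name']}
        row = self.session.query(ExecuteErrors).filter_by(**keys).scalar()
        if row:
            row.msg = error['msg']
            row.details = error['details']
        else:
            self.session.add(ExecuteErrors(msg=error['msg'], details=error['details'], **keys))
        self.session.commit()

# in the calling module
session = Session(engine)
e = Errors(session, 'emp')
e.upsert({'c_name': 'filter', 'c_type': 'task', 'f_name': 'f1', 'msg': 'TypeError', 'details': 'xyz'})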
More details about how to manage sessions are in the SQLAlchemy docs.
EDIT: In addition, the SQLAlchemy docs list some potential problems when it is used with SQLite. One of them could be the cause of your problem.

Related

Problem with Django Tests and Trigram Similarity

I have a Django application that executes a full-text search on a database. The view that executes this query is my search_view (I'm omitting some parts for the sake of simplicity). It just retrieves the results of the search on my Post model and sends them to the template:
def search_view(request):
    posts = m.Post.objects.all()
    query = request.GET.get('q')
    search_query = SearchQuery(query, config='english')
    qs = Post.objects.annotate(
        rank=SearchRank(F('vector_column'), search_query) + TrigramSimilarity('post_title', query)
    ).filter(rank__gte=0.15).order_by('-rank')
    context = {
        'results': qs
    }
    return render(request, 'core/search.html', context)
The application is working just fine. The problem is with a test I created. Here is my tests.py:
class SearchViewTests(TestCase):
    def test_search_without_results(self):
        """
        If the user's query did not retrieve anything
        show him a message informing that
        """
        response = self.client.get(reverse('core:search') + '?q=eksjeispowjskdjies')
        self.assertEqual(response.status_code, 200)
        self.assertContains(response.content, "We didn\'t find anything on our database. We\'re sorry")
This test raises an ProgrammingError exception:
django.db.utils.ProgrammingError: function similarity(character varying, unknown) does not exist
LINE 1: ...plainto_tsquery('english'::regconfig, 'eksjeispowjskdjies')) + SIMILARITY...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
I understand this exception very well, because I have run into it before. The SIMILARITY function in Postgres accepts two arguments, and both need to be of type TEXT. The exception is raised because the second argument (my query term) is of type UNKNOWN, therefore the function won't work and Django raises the exception. And I don't understand why, because the actual search is working! Even in the shell it works perfectly:
In [1]: from django.test import Client
In [2]: c = Client()
In [3]: response = c.get(reverse('core:search') + '?page=1&q=eksjeispowjskdjies')
In [4]: response
Out[4]: <HttpResponse status_code=200, "text/html; charset=utf-8">
Any ideas why the test doesn't work, while the actual execution of the app and the console test both work?
I had the same problem and this is how I solved it in my case:
First of all, the problem was that when Django creates the test database that it is going to use for tests, it does not actually run all of your migrations. It simply creates the tables based on your models.
This means that migrations that create some extension in your database, like pg_trgm, do not run when creating the test database.
One way to overcome this is to use a fixture in your conftest.py file which will create said extensions before any tests run.
So, in your conftest.py file add the following:
import pytest
from django.db import connection

# the following fixture is used to add the pg_trgm extension to the test database
@pytest.fixture(scope="session", autouse=True)
def django_db_setup(django_db_setup, django_db_blocker):
    """Test session DB setup."""
    with django_db_blocker.unblock():
        with connection.cursor() as cursor:
            cursor.execute("CREATE EXTENSION IF NOT EXISTS pg_trgm;")
You can of course replace pg_trgm with any other extension you require.
PS: You must make sure the extension you are trying to use works for the test database you have chosen. In order to change the database used by Django you can change the value of
DATABASES = {'default': env.db('your_database_connection_uri')}
in your application's settings.py.

Psycopg2 & Flask - tying connection to before_request & teardown_appcontext

Cheers guys,
While refactoring my Flask app, I got stuck at tying the db connection to @app.before_request and closing it at @app.teardown_appcontext. I am using plain Psycopg2 and the app factory pattern.
First I created a function to call within the app factory so I could use @app, as suggested by Miguel Grinberg here:
def create_app(test_config=None):
    app = Flask(__name__, instance_relative_config=True)
    --
    from shop.db import connect_and_close_db
    connect_and_close_db(app)
    --
    return app
Then I tried this pattern suggested on http://flask.pocoo.org/docs/1.0/appcontext/#storing-data:
def connect_and_close_db(app):
    @app.before_request
    def get_db_test():
        conn_string = "dbname=testdb user=testuser password=test host=localhost"
        if 'db' not in g:
            g.db = psycopg2.connect(conn_string)
        return g.db

    @app.teardown_appcontext
    def close_connection(exception):
        db = g.pop('db', None)
        if db is not None:
            db.close()
It resulted in:
TypeError: 'psycopg2.extensions.connection' object is not callable
Does anyone have an idea what happened and how to make it work?
Furthermore, I wonder how I would access the connection object to create a cursor, once its creation is tied to before_request.
This solution is probably far from perfect, and it's not really DRY. I'd welcome comments, or other answers that build on this.
To implement this with raw psycopg2, you probably need to take a look at connection pooling. There's also a good guide on how to implement this outside of Flask.
The basic idea is to create your connection pool first. You want this to be established when the Flask application initializes (this could be within the Python interpreter or via a gunicorn worker, of which there may be several - in which case each worker has its own connection pool). I chose to store the returned pool in the config:
from flask import Flask, g, jsonify
import psycopg2
from psycopg2 import pool

app = Flask(__name__)
app.config['postgreSQL_pool'] = psycopg2.pool.SimpleConnectionPool(1, 20,
                                                                   user="postgres",
                                                                   password="very_secret",
                                                                   host="127.0.0.1",
                                                                   port="5432",
                                                                   database="postgres")
Note the first two arguments to SimpleConnectionPool are the min & max connections. That's the number of connections going to your database server, between 1 & 20 in this case.
Next define a get_db function:
def get_db():
    if 'db' not in g:
        g.db = app.config['postgreSQL_pool'].getconn()
    return g.db
The SimpleConnectionPool.getconn() method used here simply returns a connection from the pool, which we assign to g.db and return. This means that when we call get_db() anywhere in the code it returns the same connection, or takes one from the pool if none is present. There's no need for a before_request decorator.
Now define your teardown function:
@app.teardown_appcontext
def close_conn(e):
    db = g.pop('db', None)
    if db is not None:
        app.config['postgreSQL_pool'].putconn(db)
This runs when the application context is destroyed, and uses SimpleConnectionPool.putconn() to put away the connection.
Finally define a route:
@app.route('/')
def index():
    db = get_db()
    cursor = db.cursor()
    cursor.execute("select 1;")
    result = cursor.fetchall()
    print(result)
    cursor.close()
    return jsonify(result)
This code works for me, tested against postgres running in a docker container. A few things which probably should be improved:
This view isn't very DRY. Perhaps you could move some of this into the get_db function so it returns a cursor (see the sketch after this list).
When the Python interpreter exits, you should also find a way to close the connections with app.config['postgreSQL_pool'].closeall().
Although tested, some kind of way to monitor the pool would be good, so that you could watch pool/db connections under load and make sure the pooler behaves as expected.
In another land, the sqlalchemy.scoped_session documentation explains more things relating to this, with some theory on how its 'sessions' work in relation to requests. They have implemented it in such a way that you can call Session.query('SELECT 1') and it will create the session if it doesn't already exist.
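For the first point, a possible sketch (get_cursor is a hypothetical helper built on the get_db function above, not part of the original answer; it borrows the per-request connection and closes only the cursor):

from contextlib import contextmanager

@contextmanager
def get_cursor():
    # hand out a cursor on the per-request pooled connection
    cursor = get_db().cursor()
    try:
        yield cursor
    finally:
        cursor.close()

@app.route('/short')
def short_index():
    with get_cursor() as cursor:
        cursor.execute("select 1;")
        result = cursor.fetchall()
    return jsonify(result)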
EDIT: Here's a gist with your app factory pattern, and sample usage in the comment.
Currently I am using this pattern:
(I'll edit this answer eventually if I come up with a better solution.)
This is the main script in which we use the database. It uses two functions from config: get_db() to get a connection from the pool and put_db() to return a connection to the pool:
from config import get_db, put_db
from threading import Thread
from time import sleep

def select():
    db = get_db()
    sleep(1)
    cursor = db.cursor()
    # Print the select result and the db connection address in memory
    # to see if the second thread gets a connection at another address
    cursor.execute("SELECT 'It works %s'", (id(db),))
    print(cursor.fetchone())
    cursor.close()
    put_db(db)

Thread(target=select).start()
Thread(target=select).start()
print('Main thread')
This is config.py:
import sys
import os
import psycopg2
from psycopg2 import pool
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())

def get_db(key=None):
    return getattr(get_db, 'pool').getconn(key)

def put_db(conn, key=None):
    getattr(get_db, 'pool').putconn(conn, key=key)

# We need to init the connection pool in the main thread in order for everything to work.
# The pool is stored as an attribute of the get_db function object.
try:
    setattr(get_db, 'pool', psycopg2.pool.ThreadedConnectionPool(1, 20, os.getenv("DB")))
    print('Initialized db')
except psycopg2.OperationalError as e:
    print(e)
    sys.exit(0)
Also, if you are curious, there is an .env file containing the db connection string in the DB environment variable:
DB="dbname=postgres user=postgres password=1234 host=127.0.0.1 port=5433"
(The .env file is loaded using the dotenv module in config.py.)
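If you also want the pool closed cleanly when the interpreter exits (the closeall() point raised in the previous answer), one small, hedged addition to config.py could be:

import atexit

# assumption: close every pooled connection when the process exits
atexit.register(lambda: getattr(get_db, 'pool').closeall())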

Dynamically set Django settings variables from Database

I am currently trying to build an application that manages multiple databases. Since the app will be managing data in 30+ databases, I am attempting to generate DATABASE_ROUTERS in the settings file. I cannot directly import the db model into the settings file; I get this error:
django.core.exceptions.AppRegistryNotReady: Apps aren't loaded yet.
This error makes sense. Is there a way I can control the sequence of events so that I have access to the database before all of the settings are established on execution? My goal is to automate database connections, pulling relevant data from a DB and generating the DATABASE_ROUTERS and DATABASES within the settings file. Is this even possible? Is there a package that I can download that does exactly this?
If you do not know what I am asking, please do not downvote; just ask me to elaborate.
I was able to figure out how to query the data I needed from my database and import it into the settings file. I created the script below. Keep in mind this can be improved; it is just something I modified from here. It directly queries data from my test db (sqlite3). I use PostgreSQL in production, and this script should work, with some modification, with PostgreSQL.
As you can see below, I am storing the data in dictionaries that are then collected in a list. I then import that list of dictionaries into my settings file. From there I can loop through the list and create my DATABASE_ROUTERS and DATABASES dynamically from the database (see the sketch after the script). I was also able to generate router classes in my routers.py file by importing the same list. Please comment below if you need me to elaborate further.
import sqlite3
from sqlite3 import Error

dbs = []

def create_connection(db_file):
    """ create a database connection to the SQLite database
        specified by the db_file
    :param db_file: database file
    :return: Connection object or None
    """
    try:
        conn = sqlite3.connect(db_file)
        return conn
    except Error as e:
        print(e)
        return None

def select_all_data(conn):
    """
    Query all rows in the table
    :param conn: the Connection object
    :return:
    """
    cur = conn.cursor()
    cur.execute("SELECT * FROM fund_table")
    rows = cur.fetchall()
    for row in rows:
        print(row)

def select_name_and_db(conn):
    """
    Query table by fund_name and db_name
    :param conn: the Connection object
    :return:
    """
    cur = conn.cursor()
    cur.execute("SELECT fund_name, db_name FROM fund_table")
    rows = cur.fetchall()
    for row in rows:
        dbs.append({"fund_name": row[0], "db_name": row[1]})
    return dbs

def main():
    database = "edb.sqlite3"
    # create a database connection
    conn = create_connection(database)
    with conn:
        # select_all_data(conn)
        select_name_and_db(conn)

main()
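To make the last step concrete, a hedged sketch of what the settings side might then look like (the module name db_config, the router naming scheme, and the connection details are assumptions for illustration, not part of the original script):

# settings.py (sketch)
from db_config import dbs  # the list built by the script above

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'primary',
        # credentials omitted
    },
}
DATABASE_ROUTERS = []

for entry in dbs:
    # one connection and one router per fund found in fund_table
    DATABASES[entry['fund_name']] = {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': entry['db_name'],
        # credentials omitted
    }
    DATABASE_ROUTERS.append('routers.%sRouter' % entry['fund_name'].capitalize())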
Make one function that loads these variables and make it async, so that you load them after your app is ready, but I'm not sure whether this will work properly.
https://hackernoon.com/asynchronous-python-45df84b82434
A dirty solution is to make one settings file for each DB and load your settings based on which DB you are going to work with...

Enable integrity checking with sqlite in django

In my Django project, I use a MySQL db for production and SQLite for tests.
The problem is, some of my code relies on model integrity checking. It works well with MySQL, but integrity errors are not thrown when the same code is executed in tests.
I know that foreign key checking must be activated in SQLite:
PRAGMA foreign_keys = 1;
However, I don't know where the best place is to do this activation (same question here).
Moreover, the following code won't work:
def test_method(self):
    from django.db import connection
    cursor = connection.cursor()
    cursor.execute('PRAGMA foreign_keys = ON')
    c = cursor.execute('PRAGMA foreign_keys')
    print c.fetchone()
>>> (0,)
Any ideas?
So, I finally found the correct answer. All I had to do was add this code to the __init__.py file of one of my installed apps:
from django.db.backends.signals import connection_created

def activate_foreign_keys(sender, connection, **kwargs):
    """Enable integrity constraint with sqlite."""
    if connection.vendor == 'sqlite':
        cursor = connection.cursor()
        cursor.execute('PRAGMA foreign_keys = ON;')

connection_created.connect(activate_foreign_keys)
You could use django signals, listening to post_syncdb.
from django.db.models.signals import post_syncdb

def set_pragma_on(sender, **kwargs):
    "your code here"

post_syncdb.connect(set_pragma_on)
This ensures that whenever syncdb is run (and syncdb is run when creating the test database), your SQLite database has the pragma set to on. You should check which database you are using in the above method set_pragma_on, as in the sketch below.
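A possible sketch of that handler, borrowing the vendor check from the accepted answer above (the assumption is that the default connection is the one being synced):

from django.db import connection
from django.db.models.signals import post_syncdb

def set_pragma_on(sender, **kwargs):
    # only relevant when the (test) database is SQLite
    if connection.vendor == 'sqlite':
        cursor = connection.cursor()
        cursor.execute('PRAGMA foreign_keys = ON;')

post_syncdb.connect(set_pragma_on)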

Syncing data between devel/live databases in Django

With Django's new multi-db functionality in the development version, I've been trying to create a management command that lets me synchronize the data from the live site down to a developer machine for extended testing. (Having actual data, particularly user-entered data, allows me to test a broader range of inputs.)
Right now I've got a "mostly" working command. It can sync "simple" model data, but the problem I'm having is that it ignores ManyToMany fields, and I don't see any reason for it to do so. Does anyone have any ideas on either how to fix that or a better way to handle this? Should I be exporting that first query to a fixture and then re-importing it?
from django.core.management.base import LabelCommand
from django.db.utils import IntegrityError
from django.db import models
from django.conf import settings

LIVE_DATABASE_KEY = 'live'

class Command(LabelCommand):
    help = ("Synchronizes the data between the local machine and the live server")
    args = "APP_NAME"
    label = 'application name'
    requires_model_validation = False
    can_import_settings = True

    def handle_label(self, label, **options):
        # Make sure we're running the command on a developer machine and that we've got the right settings
        db_settings = getattr(settings, 'DATABASES', {})
        if not LIVE_DATABASE_KEY in db_settings:
            print 'Could not find "%s" in database settings.' % LIVE_DATABASE_KEY
            return
        if db_settings.get('default') == db_settings.get(LIVE_DATABASE_KEY):
            print 'Data cannot synchronize with self. This command must be run on a non-production server.'
            return

        # Fetch all models for the given app
        try:
            app = models.get_app(label)
            app_models = models.get_models(app)
        except:
            print "The app '%s' could not be found or models could not be loaded for it." % label

        for model in app_models:
            print 'Syncing %s.%s ...' % (model._meta.app_label, model._meta.object_name)
            # Query each model from the live site
            qs = model.objects.all().using(LIVE_DATABASE_KEY)
            # ...and save it to the local database
            for record in qs:
                try:
                    record.save(using='default')
                except IntegrityError:
                    # Skip as the record probably already exists
                    pass
Django command extension's Dumpscript should help a lot.
This doesn't answer your question exactly but why not just do a db dump and a db restore?