Django settings.py

At my project's login, some settings.py environment variables are loaded to enable certain behaviors:
unit_id = settings.COMPANY
When another user logged into the system changes the value of this variable through a function, the change is reflected for all other users who are already active:
settings.COMPANY = "coke"
In this case, all users will see "coke" in settings.COMPANY. I believed this change would only exist in memory and would only apply to the user session in question, because I did not write to the physical file.
I wonder if this is how Django handles the settings.py environment variables: Does it propagate dynamically to all instances opened by all users?
This variable is accessed by context_processors.py, below:
def units(request):
    unit_id = settings.COMPANY
    return {'unit_id': unit_id}

You should not change settings at runtime.
This is (mainly) because Django doesn't know anything about its runtime, so it's entirely possible to run multiple instances of the same Django installation. Changing a setting like this will not propagate to any other processes.
I wonder if this is how Django handles the settings.py environment
variables: Does it propagate dynamically to all instances opened by
all users?
Django doesn't run an instance for every user. There are one or more processes (more than one if, for example, you use something like gunicorn or multiple servers behind a load balancer) listening on a certain port.
To have some changeable setting, you could specify a default value, but you should store something like the active company in the database.
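A minimal sketch of that idea, assuming a hypothetical single-row SiteConfiguration model (the model name and field are not from the question):

from django.db import models

class SiteConfiguration(models.Model):
    # hypothetical model holding the currently active company
    company = models.CharField(max_length=100, default='acme')

The context processor can then read the stored value per request instead of mutating settings.COMPANY:

from django.conf import settings
from .models import SiteConfiguration

def units(request):
    config = SiteConfiguration.objects.first()
    company = config.company if config else getattr(settings, 'COMPANY', '')
    return {'unit_id': company}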

Related

Load data in memory during server startup in Django app

I am creating a Django app wherein I need to load and keep some data in memory when the server starts, for quick access. To achieve this I am using Django's AppConfig class. The code looks something like this:
from django.apps import AppConfig
from django.core.cache import cache

class myAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # import the model here so the app registry is fully loaded first
        from .models import myModel
        data = myModel.objects.values('A', 'B', 'C')
        cache.set('mykey', data)
The problem is that this data in the cache will expire after some time. One alternative is to increase the TIMEOUT value. But I want this to be available in memory all the time. Is there some other configuration or approach which I can use to achieve this?
If you are using Redis, it does provide persistent data storage.
Check it out at https://redis.io/topics/persistence
You have to set it up in the Redis configuration.
There are two methods to keep the data persistent in the cache after a reboot (RDB snapshots and AOF); both save the data to disk, in different formats. I suggest you set up AOF for your purpose, since it records every write operation. However, it will use more disk space.
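On the Django side, a rough sketch of a cache configuration that never expires the key (the backend path assumes Django 4+'s built-in Redis backend; older projects typically use the third-party django-redis package, and the Redis URL shown is a placeholder):

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
        # TIMEOUT=None tells Django never to expire cached keys;
        # surviving a Redis restart is then a matter of Redis persistence,
        # e.g. enabling AOF with "appendonly yes" in redis.conf.
        'TIMEOUT': None,
    }
}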

In Django how can I influence the connection process to the DB?

What is the best method to change the way Django connects to the DB (i.e. MySQL)?
For instance what should I do if I needed django to connect to the db via ssh tunnel, whose settings may change dynamically? (I'm planning to use sshtunnel)
I understand I should subclass django.db.backends.mysql.base.DatabaseWrapper and probably call super() from / modify get_new_connection(self, conn_params)? (See the example below.)
But then how would I plug this custom class into the settings, as it seems that the settings expect a path to the module rather than the class?
Something along these lines:
import json

from django.db.backends.mysql.base import DatabaseWrapper

class myDatabaseWrapper(DatabaseWrapper):
    """Oversimplified example."""

    def get_new_connection(self, conn_params):
        with open('path/to/json.js', 'rt') as file:
            my_conn_params = json.load(file)
        conn_params.update(my_conn_params)
        return super().get_new_connection(conn_params)
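Regarding the settings part of the question: the ENGINE value is indeed a dotted path to a backend package, not to a class; Django imports base from that package and expects it to define a class named DatabaseWrapper. A rough sketch, where the package name myproject.db.tunneled_mysql is an assumption:

# myproject/db/tunneled_mysql/base.py  (hypothetical package)
import json

from django.db.backends.mysql import base

class DatabaseWrapper(base.DatabaseWrapper):
    # the class must be named DatabaseWrapper for Django to find it
    def get_new_connection(self, conn_params):
        # merge in the dynamic tunnel parameters before connecting
        with open('path/to/json.js', 'rt') as file:
            conn_params.update(json.load(file))
        return super().get_new_connection(conn_params)

# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'myproject.db.tunneled_mysql',  # the package, not the class
        # ... the usual NAME / USER / PASSWORD / HOST entries ...
    },
}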

Django - Settings: sensitive settings and different environments (dev, prod, staging). What is the recommended way to do this?

I am using Django version 2.1.7.
I have read a lot of articles and questions about this, and found people using various methods.
The approaches generally used are:
1) setting environment variables
2) multiple settings files
3) loading configuration variables from a file, e.g. using django-environ
Approach 1: I will not be using this, because I don't want to use environment variables to store values.
Approach 3: uses a 3rd-party library, and sometimes such libraries are no longer maintained. But if there is a standard way to do this, I am fine with it.
Approach 2: Here I am not sure how to store the sensitive data separately.
So can someone guide me in this regard?
I'm using a combination of all:
When launching gunicorn as a service in systemd, you can set your environment variables using the conf file in your .service.d directory:
[Service]
Environment="DJANGO_SETTINGS_MODULE=myapp.settings.production"
Environment="DATABASE_NAME=MYDB"
Environment="DATABASE_USER=MYUSER"
Environment="DATABASE_PASSWORD=MYPASSWORD"
...
That file is fetched from S3 (where it is encrypted) when the instance is launched. I have a startup script using aws-cli that does that.
In order to be able to run management commands, e.g. to migrate your database in production, you also need these variables, so if the environment variables aren't set, I fetch them directly from this file. So in my settings, I have something like this:
import os
import re

from django.core.exceptions import ImproperlyConfigured

def get_env_variable(var_name):
    try:
        return os.environ[var_name]
    except KeyError:
        env = fetch_env_from_gunicorn()
        return env[var_name]

def fetch_env_from_gunicorn():
    gunicorn_conf_directory = '/etc/systemd/system/gunicorn.service.d'
    gunicorn_env_filepath = '%s/gunicorn.conf' % (gunicorn_conf_directory,)
    if not os.path.isfile(gunicorn_env_filepath):
        raise ImproperlyConfigured('No environment settings found in gunicorn configuration')
    # matches lines of the form: Environment="KEY=VALUE"
    key_value_reg_ex = re.compile(r'^Environment\s*=\s*\"(.+?)\s*=\s*(.+?)\"$')
    environment_vars = {}
    with open(gunicorn_env_filepath) as f:
        for line in f:
            match = key_value_reg_ex.search(line)
            if match:
                environment_vars[match.group(1)] = match.group(2)
    return environment_vars

...

DATABASES = {
    ...
    'PASSWORD': get_env_variable('DATABASE_PASSWORD'),
    ...}
Note that for all management commands I run on my hosted instances, I need to explicitly pass --settings=myapp.settings.production so Django knows which settings file to use.
Different settings for different types of environments. That's because things like DEBUG but also mail settings, authentication settings, etc... are too different between the environments to use one single settings file.
I have default.py settings in a settings directory with all common settings, and then production.py for example starts with from .default import * and just overrides what it needs to override like DEBUG = False and different DATABASES (to use the above mechanism), LOGGING etc...
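For illustration, a minimal sketch of such a production.py (the engine, host and allowed hosts shown are placeholders, and it assumes the get_env_variable helper shown earlier lives in the shared default settings or another importable module):

# myapp/settings/production.py
from .default import *  # common settings: INSTALLED_APPS, MIDDLEWARE, TEMPLATES, ...

DEBUG = False
ALLOWED_HOSTS = ['www.example.com']  # placeholder

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',  # placeholder engine
        'NAME': get_env_variable('DATABASE_NAME'),
        'USER': get_env_variable('DATABASE_USER'),
        'PASSWORD': get_env_variable('DATABASE_PASSWORD'),
        'HOST': 'localhost',  # placeholder
    }
}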
With this, you have the following security risks:
Anyone with ssh access and permissions for the systemd directory can read the secrets. Make sure you have your instances in a VPN and restrict ssh access to specific IP addresses for example. Also I only allow ssh with ssh keys, not username/password, so no risk of stealing passwords.
Anyone with ssh access to the django app directory could run a shell and from django.conf import settings to then read the settings. Same as above, should be restricted anyway.
Anyone with access to your storage bucket (e.g. S3) can read the file. Again, this can be restricted easily.

Is it dangerous to use dynamic database TEST_NAME in Django?

We are a small team of developers working on an application using the PostgreSQL database backend. Each of us has a separate working directory and virtualenv, but we share the same PostgreSQL database server, even Jenkins is on the same machine.
So I am trying to figure out a way to allow us to run tests on the same project in parallel without running into a database name collision. Furthermore, sometimes a Jenkins build would fail mid-way and the test database doesn't get dropped in the end, such that subsequent Jenkins build could get confused by the existing database and fail automatically.
What I've decided to try is this:
import os
from datetime import datetime

DATABASES = {
    'default': {
        # the usual lines ...
        'TEST_NAME': '{user}_{epoch_ts}_awesome_app'.format(
            user=os.environ.get('USER', os.environ['LOGNAME']),
            # this gives the number of seconds since the UNIX epoch
            epoch_ts=int((datetime.utcnow() - datetime.utcfromtimestamp(0)).total_seconds()),
        ),
        # etc
    }
}
So the test database name at each test run most probably will be unique, using the username and the timestamp. This way Jenkins can even run builds in parallel, I think.
It seems to work so far. But could it be dangerous? I'm guessing we're safe as long as we don't try to import the project settings module directly and only use django.conf.settings because that should be singleton-like and evaluated only once, right?
I'm doing something similar and have not run into any issue. The usual precautions should be followed:
Don't access settings directly.
Don't cause the values in settings to be evaluated in a module's top level. See the doc for details. For instance, don't do this:
from django.conf import settings

# This is bad because settings might still be in the process of being
# configured at this stage.
blah = settings.BLAH

def some_view():
    # This is okay because by the time views are called by Django,
    # the settings are supposed to be all configured.
    blah = settings.BLAH
Don't do anything that accesses the database in a module's top level. The doc warns:
If your code attempts to access the database when its modules are compiled, this will occur before the test database is set up, with potentially unexpected results. For example, if you have a database query in module-level code and a real database exists, production data could pollute your tests. It is a bad idea to have such import-time database queries in your code anyway - rewrite your code so that it doesn’t do this.
Instead of the time, you could use the Jenkins executor number (available in the environment); that would be unique enough and you wouldn't have to worry about it changing.
As a bonus, you could then use --keepdb to avoid rebuilding the database from scratch each time... On the downside, failed and corrupted databases would have to be dropped separately (perhaps the settings.py can print out the database name that it's using, to facilitate manual dropping).
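A sketch of that variant, keeping the question's TEST_NAME key (newer Django versions spell this TEST = {'NAME': ...}); EXECUTOR_NUMBER is the variable Jenkins sets for each executor, and the fallback values are assumptions:

import os

DATABASES = {
    'default': {
        # the usual lines ...
        'TEST_NAME': '{user}_exec{executor}_awesome_app'.format(
            user=os.environ.get('USER', os.environ.get('LOGNAME', 'dev')),
            # stable per executor, so --keepdb can reuse the same database
            executor=os.environ.get('EXECUTOR_NUMBER', '0'),
        ),
        # etc
    }
}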

Django Celery Database for Models on Producer and Worker

I want to develop an application which uses Django as the frontend and Celery to do background stuff.
Now, sometimes Celery workers on different machines need database access to my django frontend machine (two different servers).
They need to know some realtime stuff and to run the django-app with
python manage.py celeryd
they need access to a database with all models available.
Do I have to access my MySQL database through a direct connection? That is, do I have to allow the user "my-django-app" access not only from localhost on my frontend machine but also from my other worker servers' IPs?
Is this the "right" way, or am I missing something? I just thought it isn't really safe (without SSL), but maybe that's just the way it has to be.
Thanks for your responses!
They will need access to the database. That access will be through a database backend, which can be one that ships with Django or one from a third party.
One thing I've done in my Django site's settings.py is load database access info from a file in /etc. This way the access setup (database host, port, username, password) can be different for each machine, and sensitive info like the password isn't in my project's repository. You might want to restrict access to the workers in a similar manner, by making them connect with a different username.
You could also pass in the database connection information, or even just a key or path to a configuration file, via environment variables, and handle it in settings.py.
For example, here's how I pull in my database configuration file:
import os

g = {}
dbSetup = {}
execfile(os.environ['DB_CONFIG'], g, dbSetup)
if 'databases' in dbSetup:
    DATABASES = dbSetup['databases']
else:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            # ...
        }
    }
Needless to say, you need to make sure that the file in DB_CONFIG is not accessible to any user besides the db admins and Django itself. The default case should refer Django to a developer's own test database. There may also be a better solution using the ast module instead of execfile, but I haven't researched it yet.
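For what it's worth, a sketch of that ast-based alternative, under the assumption that the config file contains nothing but a Python dict literal:

import ast
import os

# DB_CONFIG points at a file containing only a dict literal, e.g.
# {'default': {'ENGINE': 'django.db.backends.mysql', ...}}
with open(os.environ['DB_CONFIG']) as f:
    DATABASES = ast.literal_eval(f.read())

Unlike execfile, ast.literal_eval only evaluates literals, so the config file cannot execute arbitrary code.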
Another thing I do is use separate users for DB admin tasks vs. everything else. In my manage.py, I added the following preamble:
# Find a database configuration, if there is one, and set it in the environment.
adminDBConfFile = '/etc/django/db_admin.py'
dbConfFile = '/etc/django/db_regular.py'
import sys
import os

def goodFile(path):
    return os.path.isfile(path) and os.access(path, os.R_OK)

if len(sys.argv) >= 2 and sys.argv[1] in ["syncdb", "dbshell", "migrate"] \
        and goodFile(adminDBConfFile):
    os.environ['DB_CONFIG'] = adminDBConfFile
elif goodFile(dbConfFile):
    os.environ['DB_CONFIG'] = dbConfFile
Where the config in /etc/django/db_regular.py is for a user with access to only the Django database with SELECT, INSERT, UPDATE, and DELETE, and /etc/django/db_admin.py is for a user with these permissions plus CREATE, DROP, INDEX, ALTER, and LOCK TABLES. (The migrate command is from South.) This gives me some protection from Django code messing with my schema at runtime, and it limits the damage an SQL injection attack can cause (though you should still check and filter all user input).
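For clarity, the configuration files themselves would just be plain Python defining a databases dict that the execfile call above picks up (all values here are placeholders):

# /etc/django/db_regular.py -- readable only by the Django service user
databases = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'myproject',
        'USER': 'django_regular',    # SELECT/INSERT/UPDATE/DELETE only
        'PASSWORD': 'placeholder',
        'HOST': 'db.internal',
        'PORT': '3306',
    }
}
# /etc/django/db_admin.py looks the same but uses the admin credentials
# (the user that also has CREATE, DROP, INDEX, ALTER and LOCK TABLES).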
This isn't a solution to your exact problem, but it might give you some ideas for ways to smarten up Django's database access setup for your purposes.