Gunicorn: reflect changed code dynamically - Django

I am developing a Django web application where a user can modify the code of certain classes, in the application itself, through the UI using the Ace editor (think of it as GitLab/GitHub, where you can change code online). These classes are then run by Django and a Celery worker at some point.
Once code changes are saved, they are not picked up by Django because of Gunicorn, but everything works fine with Celery because it is a different process. (Running locally with runserver works fine, and changes are picked up by both Django and Celery.)
Is there a way to make Gunicorn reflect changes in the directory that contains these classes without reloading the whole application? And if reloading is necessary, is there a way to reload Gunicorn's workers one by one, without any downtime?
The gunicorn command:
/usr/local/bin/gunicorn config.wsgi --bind 0.0.0.0:5000 --chdir=/app
The wsgi configuration file:
import os
import sys

from django.core.wsgi import get_wsgi_application

app_path = os.path.abspath(os.path.join(
    os.path.dirname(os.path.abspath(__file__)), os.pardir))
sys.path.append(os.path.join(app_path, 'an_application'))

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "config.settings.production")

application = get_wsgi_application()

The --reload option is "intended for development". There's no strong wording saying you shouldn't use it in production; the reason you shouldn't is that people make typos, a change in one file may require several other changes in others, and so on. So you can make your site inaccessible, and then you don't have a working app with which to fix it again.
For a dev, that's no problem, as you watch the logs/output in your shell and restart it. This is why @Krzysztof's suggestion is the best one: push the code changes to your repo, let them go through CI/CD, and switch over the pod. If CI fails, then CD won't happen, so you're good.
Of course, that's a scope far too large for a Q&A site.
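For the second part of the question, it may be worth noting (this is not from the original answer) that Gunicorn's master process handles SIGHUP by starting fresh workers and gracefully shutting down the old ones, which amounts to a rolling reload without dropped connections. Assuming Gunicorn was started with --pid and an illustrative PID file path:
/bin/kill -HUP $(cat /run/gunicorn.pid)
Note this won't pick up new code if you run with --preload, since the application is then loaded once in the master process.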

Why not save the code in a separate text file or database? The relevant method can simply load the code dynamically as a string and execute it using exec().
Let's say you have a function function1 that can be edited by a user. When the user submits the changes, process the input (separate out the functions so that you know which function has which definition), and save them all individually, like function1, function2, etc., in a database or a text file as strings.
Once you need to execute function1, just load the value you saved and use exec to execute the code.
This way, you won't need to reload Gunicorn, since all workers will always fetch the updated function definition at run time!
Something along the lines of:
def function1_original():
    # load the function definition from storage
    with open("function1.txt", "r") as f:
        source = f.read()
    # execute the string; this just loads the function definition.
    # An explicit namespace is needed because in Python 3 exec()
    # cannot create local variables inside a function.
    namespace = {}
    exec(source, namespace)
    # this executes the user-defined function
    namespace["function1"]()
So the user will define:
def function1():
    # user-defined code
    # blah blah
    ...
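If you save the definitions in the database rather than a text file, the same idea looks roughly like this (a minimal sketch; the UserFunction model and run_user_function helper are hypothetical names, not from the answer above):

from django.db import models

class UserFunction(models.Model):
    # hypothetical model storing user-edited source, e.g. "function1"
    name = models.CharField(max_length=100, unique=True)
    source = models.TextField()

def run_user_function(name):
    # fetch the latest source on every call, so edits apply immediately
    source = UserFunction.objects.get(name=name).source
    namespace = {}
    exec(source, namespace)   # defines the function inside `namespace`
    return namespace[name]()  # call it

Be aware that exec() runs arbitrary user code with the privileges of the worker process, so this should only ever be exposed to trusted users.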

I was able to solve this by changing the extension of the Python scripts to anything but .py.
Then I loaded these files using the following function:
from importlib import util
from importlib.machinery import SourceFileLoader
from os import path

def load_module(module_name):
    # the directory and extension here are placeholders for your own layout
    module_path = path.join(path.dirname(__file__),
                            "path/to/your/files/{}.anyextension".format(module_name))
    spec = util.spec_from_loader(module_name,
                                 SourceFileLoader(module_name, module_path))
    module = util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
In this case, the files are not kept in RAM by Gunicorn, and I was able to apply changes on the fly without needing the eval or exec functions.
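Usage is then an ordinary call at request time, for example (the module name and the evaluate() function are hypothetical):

rules = load_module("rules")  # re-reads rules.anyextension from disk
rules.evaluate()              # runs whatever the user last saved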

Related

How to use threading with Django and Gunicorn

I am trying to use the threading library inside a Django application that uses Gunicorn. When I run my code locally everything is fine, but as soon as I call the view in production I get a context error. I believe this is due to Gunicorn.
Here is the error:
RuntimeError: cannot exit context: thread state references a different context object
Here is my code:
import threading

t = threading.Thread(target=myFunction, args=[arg1])
t.daemon = True  # equivalent to the deprecated t.setDaemon(True)
t.start()
I'm posting the solution I found, as I could not find any reference to this exact issue and its resolution. It turns out the issue was not with Python or Django but rather with Gunicorn itself. In order to use threading, I had to add the --threads param to the service file.
/usr/bin/gunicorn3 --name=my_app --pythonpath=/home/django/myenv --bind unix:/home/django/myenv/my_app/gunicorn.socket my_app.wsgi:application --workers=4 --threads=2 --worker-class=gthread
I also set the worker class to gthread.

Shared global variables among apps

Suppose I have two apps: data and visual.
App data executes a database retrieval upon starting up. This thread and this doc advise on how and where to place code that executes once on startup. So, in app data:
# apps.py
from django.apps import AppConfig

global_df  # want to declare a global variable to be shared across all apps here

class DataConfig(AppConfig):
    # ...
    def ready(self):
        from .models import MyModel
        ...
        df = retrieve_db()  # retrieve model instances from database
        ...
        return df
In the code above, I am looking to execute ready() once on starting up and hand df off to a shared global variable (in this case, global_df). App visual should be able to access (through an import, maybe) this global_df. But any further modification of global_df should be done only in app data.
This thread advises placing any global variable in the app's __init__.py file, but it mentions that this only works for environment variables.
Two questions:
1 - How and where do I declare such a global variable?
2 - On starting up, how do I pass the output of a function that executes only once to this global variable?
Some threads discuss Redis for caching purposes, but I am not looking for that solution, as it seems like overkill for the problem I am having.
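For what it's worth, a common pattern here (a sketch under the question's own assumptions, not from the thread; note it shares state within a process only, so under Gunicorn each worker holds its own copy) is to keep the variable in a plain module and assign it from ready():

# data/state.py -- plain module that owns the shared reference
global_df = None

# data/apps.py
from django.apps import AppConfig

class DataConfig(AppConfig):
    name = 'data'

    def ready(self):
        # runs once per process at startup; assign rather than return
        from . import state
        state.global_df = retrieve_db()  # retrieve_db() as in the question

# visual/views.py -- read-only access from the other app
from data import state

def my_view(request):
    df = state.global_df
    ...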

Google App Engine deferred.defer task not getting executed

I have a Google App Engine Standard Environment application that has been working fine for a year or more, that, quite suddenly, refuses to enqueue new deferred tasks using deferred.defer.
Here's the Python 2.7 code that is making the deferred call:
# Find any inventory items that reference the product, and change them too.
# Because this could take some time, we'll do it as a deferred task, and only
# if needed.
if upd:
    updater = deferredtasks.InvUpdate()
    deferred.defer(updater.run, product_key)
My app.yaml file has the necessary bits to support deferred.defer:
- url: /_ah/queue/deferred
  script: google.appengine.ext.deferred.deferred.application
  login: admin

builtins:
- deferred: on
And my deferred task has logging in it so I should see it running when it does:
#-------------------------------------------------------------------------------
# DEFERRED routine that updates the inventory items for a particular product.
# Should be called when ANY changes are made to the product, because it should
# trigger a re-download of the inventory record for that product to the iPad.
#-------------------------------------------------------------------------------
class InvUpdate(object):
    def __init__(self):
        self.to_put = []
        self.product_key = None
        self.updcount = 0

    def run(self, product_key, batch_size=100):
        updproduct = product_key.get()
        if not updproduct:
            logging.error("DEFERRED ERROR: Product key passed in does not exist")
            return
        logging.info(u"DEFERRED BEGIN: beginning inventory update for: {}".format(updproduct.name))
        self.product_key = product_key
        self._continue(None, batch_size)
    ...
When I run this in the development environment on my development box, everything works fine. Once I deploy it to the App Engine server, the inventory updates never get done (i.e. the deferred task is not executed), and there are no errors (and in fact no other logging from the deferred task) in the log files on the server. I know that with the sudden push to get everybody onto Python 3 as quickly as possible, the deferred.defer library has been marked as not recommended because it only works with the Python 2.7 environment, and I planned on moving to task queues for this, but I wasn't expecting deferred.defer to suddenly stop working in the existing Python environment.
Any insight would be greatly appreciated!
I'm pretty sure you can't pass an instance method to the App Engine task queue, because that instance will not exist when your task runs, since it will be running in a different process. I actually don't understand how your task ever worked when running remotely in the first place (and running locally is not an accurate representation of how things will run remotely).
Try changing your code to this:
if upd:
    deferred.defer(deferredtasks.InvUpdate.run_cls, product_key)
and then InvUpdate stays the same but gains a new function, run_cls:
class InvUpdate(object):
    @classmethod
    def run_cls(cls, product_key):
        cls().run(product_key)
And I'm still in the process of migrating to Cloud Tasks, and my deferred tasks still work.
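Another option that sidesteps the instance question entirely is to defer a plain module-level function, since deferred.defer pickles module-level functions by reference (update_inventory here is a hypothetical name, not from the answer):

# in deferredtasks.py
def update_inventory(product_key):
    # a module-level function can be enqueued safely
    InvUpdate().run(product_key)

# at the call site
deferred.defer(deferredtasks.update_inventory, product_key)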

How do I get SQL logging in Rails 4 WEBrick for a non-development environment?

I'm using Rails 4.1.2. I have some environments which are exact copies of my development environment. In other words, I created them by simply copying config/environments/development.rb to a file with a different name (e.g., destaging.rb). They differ only in the connection information in database.yml.
If I issue RAILS_ENV=destaging rails s or rails s -e destaging at the command line, everything works just as I desire, except that I get no SQL logging to STDOUT, which is a bummer.
Since my destaging environment is absolutely identical to my development environment except for different connection settings in database.yml, I suspect that something is looking for an environment named development and enabling SQL logging to STDOUT only if an environment with that name is active. How can I enable SQL logging to STDOUT for other environments launched through WEBrick?
For posterity, I've discovered how to do this. First, I'm running Ruby 2.1.2 with Rails 4.1.2. If that is not your environment, your mileage may vary, though I suspect the solution will be very similar.
So, first you must modify bin/rails. Open this file and change it as follows (I have posted the entire file, minus the shebang, for clarity).
begin
  load File.expand_path("../spring", __FILE__)
rescue LoadError
end

APP_PATH = File.expand_path('../../config/application', __FILE__)
require_relative '../config/boot'

# Here comes the important part
require 'rails/commands/server'

class Rails::Server::Options
  def parse_with_logging!(args)
    options = parse_without_logging!(args)
    options[:log_stdout] = true # Or whatever condition you want
    options
  end

  alias_method_chain :parse!, :logging
end

require 'rails/commands'
Since require 'rails/commands' executes the server immediately, monkey-patching after that line does not work; it is simply ignored. If you try to monkey-patch before you require the commands, it explodes because the Rails::Server::Options class has not yet been defined. Thus, we have to pre-emptively require rails/commands/server so we can alias its parse! method.
Monkey-patching should almost always be a last resort, IMHO. However, I see no alternative in this case. If anyone has a better idea, I'd love to hear it.
I also encountered this problem with the same versions of Rails and Ruby, using a non-standard environment name (in your case, "destaging"). However, I did not want it to affect all environments, nor lose any more time to not getting work done, so I simply changed the way I start the server:
(tail -F log/destaging.log &) && rails s
Then, to restart the server, Ctrl-C as usual and rails s again. The tail will keep going in the background, and for all intents and purposes the experience will be like it was before this stopped working.

Why does reverse() prepend a server path?

I have several instances of my project running on my server, like so:
http://0.0.0.0/one
http://0.0.0.0/two
I also have an activation view that is accessible via:
http://0.0.0.0/one/activate/u/1/c/123
When I do reverse() on this view from the Django shell, the URL given to me is:
/activate/u/1/c/123
So it does not include the /one server path. However, when I use reverse() to look up the path of the page to send in an email somewhere else in the project, reverse() seems to return the full server path plus the view path, like so:
/one/activate/u/1/c/123
Does anyone have any idea of why this is happening?
reverse() is supposed to include this server path, so that you can just use it in a link and it'll work without having to change anything else in your code. But manage.py shell doesn't set the appropriate path prefix; that code happens in the WSGI/etc. handler. This is Django bug #16734 (which I incidentally reported :p).
You can work around this by calling django.core.urlresolvers.set_script_prefix manually, presumably in your settings.py. For example:
# when running through wsgi, this will get overridden,
# but it's needed for manage.py
from django.core.urlresolvers import set_script_prefix
set_script_prefix('/one/')
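A quick way to see the effect from manage.py shell (the URL pattern name and kwargs here are hypothetical):

from django.core.urlresolvers import reverse, set_script_prefix

set_script_prefix('/one/')
print(reverse('activation', kwargs={'user_id': 1, 'code': '123'}))
# -> '/one/activate/u/1/c/123', given a matching URL pattern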