Error in event loop with Flask, gremlin-python and uWSGI

I'm running into a problem using Flask with a Gremlin database (an Amazon Neptune database) behind uWSGI. Everything works fine in my unit tests, which use the test_client provided by Flask. However, in production we use uWSGI, and there I get the following error:
There is no current event loop in thread 'uWSGIWorker4Core1'.
My app code creates a connection to the database before each request and assigns it to the Flask g object. During teardown, the database connection is removed. The error happens when the app tries to close the connection.
from flask import Flask, g
from gremlin_python.structure.graph import Graph
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal
app = Flask(__name__, instance_relative_config=True)
@app.before_request
def _db_connect():
    if not hasattr(g, 'graph_conn'):
        g.graph_conn = DriverRemoteConnection(app.config['DATABASE_HOST'], 'g')
        g.gg = traversal().withRemote(g.graph_conn)

# This hook ensures that the connection is closed when we've finished
# processing the request.
@app.teardown_appcontext
def _db_close(exc):
    if hasattr(g, 'graph_conn'):
        g.graph_conn.close()  # <- ERROR THROWN AT THIS LINE
        del g.graph_conn
The uWSGI config does use multiple threads:
[uwsgi]
http = 0.0.0.0:3031
manage-script-name = true
module = dogmaserver:app
processes = 4
threads = 2
offload-threads = 2
stats = 0.0.0.0:9191
But my understanding of how Flask's g object works is that it all stays on the same thread. Can anyone let me know what I'm missing?
I'm using Flask 1.0.2, gremlinpython 3.4.11 and uWSGI 2.0.17.1.

As a workaround, I removed the threads configuration option from uWSGI, so each process runs only a single thread.
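For reference, the workaround amounts to dropping the threads option from the [uwsgi] block shown above, leaving something like this (a sketch; all other values unchanged):
[uwsgi]
http = 0.0.0.0:3031
manage-script-name = true
module = dogmaserver:app
processes = 4
offload-threads = 2
stats = 0.0.0.0:9191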

Related

Socket handshake error when using gunicorn

I have a Flask app that processes a WebSocket stream of audio from Twilio.
The app works fine without gunicorn, but when I start it with gunicorn I only get the first message of the socket (connect) and an unsuccessful handshake. Here is how the app looks:
from flask import Flask
from flask_sockets import Sockets
from geventwebsocket.handler import WebSocketHandler
from gevent import pywsgi
...
app = Flask(__name__)
sockets = Sockets(app)
...
@sockets.route('/media')
def media(ws):
    ...
if __name__ == '__main__':
    server = pywsgi.WSGIServer(('', HTTP_SERVER_PORT), app, handler_class=WebSocketHandler)
    server.serve_forever()
When I start the app directly using python flaskapp.py it works ok.
When I start it using gunicorn by writing:
gunicorn -k flask_sockets.worker --bind 0.0.0.0:5055 --log-level=debug flaskapp:app
this is where the connection "hangs" and gets no further than the initial connection, apparently because the handshake fails.
It's important to note that I haven't "gevent monkey patched" the code, but I'm not sure if it has anything to do with the problem.
Any idea will be much appreciated!
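For reference, the gevent monkey patching mentioned above is normally applied at the very top of the entry module, before any other imports; a minimal sketch, not part of the original app:
# Sketch only: standard gevent monkey patching, done before any other import.
from gevent import monkey
monkey.patch_all()

from flask import Flask
# ... rest of the app as above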
Don't have the ability to test this right now, but perhaps try with:
from flask import Flask
from flask_sockets import Sockets
from geventwebsocket.handler import WebSocketHandler
from gevent import pywsgi
...
app = Flask(__name__)
sockets = Sockets(app)
...
@sockets.route('/media')
def media(ws):
    ...
server = pywsgi.WSGIServer(('', HTTP_SERVER_PORT), app, handler_class=WebSocketHandler)
if __name__ == '__main__':
    server.serve_forever()
Then change the launch command to:
gunicorn -k flask_sockets.worker --bind 0.0.0.0:5055 --log-level=debug flaskapp:server
(Gunicorn needs to import the server object, which can't live inside that final if statement, because that code only runs when the script is launched with python directly.)

Why do I get the error "A valid Flask application was not obtained from..." when I use the flask cli to run my app?

I'm trying to use the Flask CLI to start my application, i.e. flask run. I use the FLASK_APP environment variable to point to my application, i.e. export FLASK_APP=package_name.wsgi:app
In my wsgi.py file, I create the app with a factory function, i.e. app = create_app(config), and my create_app function looks like this:
def create_app(config_object=LocalConfig):
    app = connexion.App(config_object.API_NAME,
                        specification_dir=config_object.API_SWAGGER_DIR,
                        debug=config_object.DEBUG)
    app.app.config.from_object(config_object)
    app.app.json_encoder = JSONEncoder
    app.add_api(config_object.API_SWAGGER_YAML,
                strict_validation=config_object.API_SWAGGER_STRICT,
                validate_responses=config_object.API_SWAGGER_VALIDATE)
    app = register_extensions(app)
    app = register_blueprints(app)
    return app
However, the application doesn't start; I get the error:
A valid Flask application was not obtained from "package_name.wsgi:app".
Why is this?
I can start my app normally when I use gunicorn, i.e. gunicorn package_name.wsgi:app
My create_app function didn't return an object of class flask.app.Flask but an object of class connexion.apps.flask_app.FlaskApp, because I am using the connexion framework.
In my wsgi.py file, I could simply set:
application = create_app(config)
app = application.app
I didn't even have to set export FLASK_APP=package_name.wsgi:app anymore; autodiscovery worked as long as the flask run command was executed in the folder containing the wsgi.py file.
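Put together, the wsgi.py described above might look roughly like this (a sketch; the import paths and the config object name are assumptions, not from the original post):
# wsgi.py (sketch; module paths are hypothetical)
from package_name.app import create_app       # wherever the factory actually lives
from package_name.config import LocalConfig   # hypothetical config object

application = create_app(LocalConfig)  # connexion.App instance
app = application.app                  # underlying flask.app.Flask, picked up by `flask run`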
application = create_app(config)
app = application.app
App = app.app
For me, I just created another variable (App) pointing to the underlying Flask object and set the environment variable as below:
export FLASK_APP=__main__:App
flask shell

celery parallel tasking error 'no result backend configured'

I'm running django-celery 3.1.16, Celery 3.1.17 and Django 1.4.16, and trying to run some parallel tasks using 3 workers and collect the results using the following:
from celery import group
positions = []
jobs = group(celery_calculate_something.s(data.id) for data in a_very_big_list)
results = jobs.apply_async()
positions.extend(results.get())
The task celery_calculate_something returns an object to place in the results list:
@app.task(ignore_result=False)
def celery_calculate_something(id):
    <do stuff>
No matter what I try, I always get the same result when calling get() on results:
No result backend configured. Please see the documentation for more information.
However, the result backend IS configured - I have many other tasks with ignore_result=False merrily adding rows to the task meta table in django-celery. It must be something to do with using the results returned from group(). I should note the backend is not set explicitly in settings; django-celery seems to set it automatically for you.
I have the worker collecting events using:
manage.py celery worker -l info -E
and celerycam running with
python manage.py celerycam
Inspecting the returned results object (an instance of GroupResult), I can see that its backend attribute is an instance of DisabledBackend. Is this the problem? What have I misunderstood?
You did not configure the result backend. Basically, you need tables to store the results: since you have django-celery, add it to INSTALLED_APPS in your settings.py file and then run the migrations (python manage.py migrate). After that, open your celery.py file and set the backend to djcelery.backends.database:DatabaseBackend. Here's an example:
app = Celery('almanet',
             broker='amqp://guest@localhost//',
             backend='djcelery.backends.database:DatabaseBackend',
             include=['alm_crm.tasks'])  # references your tasks; don't forget to use the full absolute path
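For completeness, the INSTALLED_APPS change mentioned above is just a one-line addition (a sketch; the rest of your settings file stays as it is), followed by python manage.py migrate to create the result tables:
# settings.py (sketch)
INSTALLED_APPS = (
    # ... your other apps ...
    'djcelery',  # django-celery; its migrations create the task result tables
)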
After that you can import the result module (from celery import result). Now you can save the group result and restore it later by its id:
from celery import group, result

positions = []
jobs = group(celery_calculate_something.s(data.id) for data in a_very_big_list)
results = jobs.apply_async()
results.save()
some_task_result = result.GroupResult.restore(results.id)
print(some_task_result.ready())

Running periodic tasks with django and celery

I'm trying to create a simple periodic background task using the Django-Celery-RabbitMQ combination. I installed Django 1.3.1 and downloaded and set up djcelery. Here is what my settings.py file looks like:
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "/"
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
....
import djcelery
djcelery.setup_loader()
...
INSTALLED_APPS = (
    'djcelery',
)
And I put a 'tasks.py' file in my application folder with the following contents:
from celery.task import PeriodicTask
from celery.registry import tasks
from datetime import timedelta
from datetime import datetime
class MyTask(PeriodicTask):
    run_every = timedelta(minutes=1)

    def run(self, **kwargs):
        self.get_logger().info("Time now: " + str(datetime.now()))
        print("Time now: " + str(datetime.now()))

tasks.register(MyTask)
And then I start up my django server (local development instance):
python manage.py runserver
Then I start up the celerybeat process:
python manage.py celerybeat --logfile=<path_to_log_file> -l DEBUG
I can see entries like this in the log:
[2012-04-29 07:50:54,671: DEBUG/MainProcess] tasks.MyTask sent. id->72a5963c-6e15-4fc5-a078-dd26da663323
And I can also see the corresponding entries getting created in the database, but I can't find where the text I specified in the run function of the MyTask class is being logged.
I tried fiddling with the logging settings and tried using the Django logger instead of the Celery logger, but to no avail. I'm not even sure my task is getting executed. If I print any debug information in the task, where does it go?
Also, this is the first time I'm working with any kind of message queuing system. It looks like the task will get executed as part of the celerybeat process, outside the Django web framework. Will I still be able to access all the Django models I created?
Thanks,
Venkat.
celerybeat is only the scheduler: it pushes tasks onto the queue when they are due, but it does not execute them. Your task instances are stored on the RabbitMQ server. You need to run the celeryd daemon to actually execute your tasks:
python manage.py celeryd --logfile=<path_to_log_file> -l DEBUG
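(As an aside, the beat scheduler can also be run embedded in the worker process during development, so a single command covers both; a sketch, assuming the -B/--beat option is available in your celeryd version:)
python manage.py celeryd -B --logfile=<path_to_log_file> -l DEBUG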
Also, if you are using RabbitMQ, I recommend enabling the RabbitMQ management plugin:
rabbitmq-plugins list
rabbitmq-plugins enable rabbitmq_management
service rabbitmq-server restart
It will be available at http://<host>:55672/ (login: guest, password: guest). There you can check how many tasks are currently in your RabbitMQ instance.
You should check the RabbitMQ logs, since Celery sends the tasks to RabbitMQ and it should execute them, so all the prints from the tasks should end up in the RabbitMQ logs.

django with twisted web - wsgi and vhost

I have a project which has a directory setup like:
myproject
    someapp
    sites
        foo
            settings.py - site specific
    settings.py - global
I am using Twisted's web.wsgi to serve this project. The problem I am running into is setting up the correct environment.
import sys
import os
from twisted.application import internet, service
from twisted.web import server, resource, wsgi, static, vhost
from twisted.python import threadpool
from twisted.internet import reactor
from django.core.handlers.wsgi import WSGIHandler
from django.core.management import setup_environ,ManagementUtility
sys.path.append(os.path.abspath("."))
sys.path.append(os.path.abspath("../"))
DIRNAME= os.path.dirname(__file__)
SITE_OVERLOADS = os.path.join(DIRNAME,'sites')
def import_module(name):
    mod = __import__(name)
    components = name.split('.')
    for comp in components[1:]:
        mod = getattr(mod, comp)
    return mod

def buildServer():
    hosts = [d for d in os.listdir(SITE_OVERLOADS) if not os.path.isfile(d) and d != ".svn"]
    root = vhost.NameVirtualHost()
    pool = threadpool.ThreadPool()
    pool.start()
    reactor.addSystemEventTrigger('after', 'shutdown', pool.stop)
    for host in hosts:
        settings = os.path.join(SITE_OVERLOADS, "%s/settings.py" % host)
        if os.path.exists(settings):
            sm = "myproject.sites.%s.settings" % host
            settings_module = import_module(sm)
            domain = settings_module.DOMAIN
            setup_environ(settings_module)
            utility = ManagementUtility()
            command = utility.fetch_command('runserver')
            command.validate()
            wsgi_resource = wsgi.WSGIResource(reactor, pool, WSGIHandler())
            root.addHost(domain, wsgi_resource)
    return root
root = buildServer()
site = server.Site(root)
application = service.Application('MyProject')
sc = service.IServiceCollection(application)
i = internet.TCPServer(8001, site)
i.setServiceParent(sc)
I am trying to set up vhosts for each site that has a settings module in the "sites" subdirectory. However, it appears that the settings are being shared across sites.
Django projects within the same Python process will share the same settings. You will need to spawn them as separate processes in order for them to use separate settings modules.
Since your goal is a bunch of shared-nothing virtual hosts, you probably won't benefit from trying to set up your processes in anything but the simplest way. So, how about changing your .tac file to just launch a server for a single virtual host, starting up a lot of instances (manually, with a shell script, with another simple Python script, etc), and then putting a reverse proxy (nginx, Apache, even another Twisted Web process) in front of all of those processes?
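A single-site .tac along those lines might look roughly like this (a sketch derived from the question's code; the settings module and port are assumptions and would differ per instance):
# single_site.tac (sketch; run one instance per site, each with its own settings and port)
from twisted.application import internet, service
from twisted.web import server, wsgi
from twisted.python import threadpool
from twisted.internet import reactor
from django.core.handlers.wsgi import WSGIHandler
from django.core.management import setup_environ

from myproject.sites.foo import settings as settings_module  # per-instance assumption

setup_environ(settings_module)

# One thread pool per process for the WSGI resource.
pool = threadpool.ThreadPool()
pool.start()
reactor.addSystemEventTrigger('after', 'shutdown', pool.stop)

resource = wsgi.WSGIResource(reactor, pool, WSGIHandler())
site = server.Site(resource)

application = service.Application('MyProject-foo')
sc = service.IServiceCollection(application)
i = internet.TCPServer(8001, site)  # per-instance port; a reverse proxy sits in front
i.setServiceParent(sc)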
You could do this all with Twisted, and it might even confer some advantages, but for just getting started you would probably rather focus on your site than on minor tweaks to your deployment process. If it becomes a problem that things aren't more integrated, then that would be the time to revisit the issue and try to improve on your solution.