Scale Gevent Socketio - django

I currently have a site set up using Django, and I have added gevent-socketio to provide a chat function. I need to scale it, as there are already quite a few users on the site, and I can't find a way to do so.
I tried https://github.com/abourget/gevent-socketio/tree/master/examples/django_chat/chat
I am using Gunicorn with the socketio.sgunicorn.GeventSocketIOWorker worker class, so at first I thought of simply increasing the worker count. Unfortunately that fails intermittently. I have since started rewriting it to use Redis, based on a few sources I found, with one worker on each server behind a load balancer. However, this seems to have the same problem. I am wondering if there is some issue in the gevent-socketio code itself that prevents it from scaling.
Here is how far I have got; this is just the submit-message code.
from django.conf import settings
from redis import Redis
import simplejson


def redis_client():
    """Get a Redis client."""
    return Redis(settings.REDIS_HOST, settings.REDIS_PORT, settings.REDIS_DB)


class PubSub(object):
    """
    Very simple Pub/Sub pattern wrapper
    using simplified Redis Pub/Sub functionality.

    Usage (publisher)::

        import redis
        r = redis.Redis()
        q = PubSub(r, "channel")
        q.publish("test data")

    Usage (listener)::

        import redis
        r = redis.Redis()
        q = PubSub(r, "channel")

        def handler(data):
            print "Data received: %r" % data

        q.subscribe(handler)
    """
    def __init__(self, redis, channel="default"):
        self.redis = redis
        self.channel = channel

    def publish(self, data):
        self.redis.publish(self.channel, simplejson.dumps(data))

    def subscribe(self, handler):
        redis = self.redis.pubsub()
        redis.subscribe(self.channel)
        for data_raw in redis.listen():
            if data_raw['type'] != "message":
                continue
            data = simplejson.loads(data_raw["data"])
            handler(data)
from socketio.namespace import BaseNamespace
from socketio.sdjango import namespace
from supremo.utils import redis_client, PubSub
from gevent import Greenlet


@namespace('/chat')
class ChatNamespace(BaseNamespace):

    nicknames = []
    r = redis_client()
    q = PubSub(r, "channel")

    def initialize(self):
        # Set up the Redis listener
        def handler(data):
            self.emit('receive_message', data)

        greenlet = Greenlet.spawn(self.q.subscribe, handler)

    def on_submit_message(self, msg):
        self.q.publish(msg)

I used parts of the code from https://github.com/fcurella/django-push-demo and gevent-socketio 0.3.5rc1 instead of rc2, and it is now working with multiple workers and load balancing.
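For reference, running several of these workers looks roughly like this; the project module path myproject.wsgi:application and the worker count are placeholders, adjust them for your own project:

gunicorn --workers 4 \
         --worker-class socketio.sgunicorn.GeventSocketIOWorker \
         --bind 0.0.0.0:8000 \
         myproject.wsgi:application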

Related

Psycopg2 & Flask - tying connection to before_request & teardown_appcontext

Cheers guys,
while refactoring my Flask app I got stuck trying to tie the db connection to @app.before_request and close it at @app.teardown_appcontext. I am using plain psycopg2 and the app factory pattern.
First I created a function to call within the app factory so I could use @app, as suggested by Miguel Grinberg here:
def create_app(test_config=None):
    app = Flask(__name__, instance_relative_config=True)
    # --
    from shop.db import connect_and_close_db
    connect_and_close_db(app)
    # --
    return app
Then I tried this pattern suggested on http://flask.pocoo.org/docs/1.0/appcontext/#storing-data:
def connect_and_close_db(app):
    @app.before_request
    def get_db_test():
        conn_string = "dbname=testdb user=testuser password=test host=localhost"
        if 'db' not in g:
            g.db = psycopg2.connect(conn_string)
        return g.db

    @app.teardown_appcontext
    def close_connection(exception):
        db = g.pop('db', None)
        if db is not None:
            db.close()
It resulted in:
TypeError: 'psycopg2.extensions.connection' object is not callable
Does anyone have an idea what happened and how to make it work?
Furthermore, I wonder how I would access the connection object to create a cursor once its creation is tied to before_request.
This solution is probably far from perfect, and it's not really DRY. I'd welcome comments, or other answers that build on this.
To implement this with raw psycopg2, you probably need to take a look at the connection pooler. There's also a good guide on how to implement this outside of Flask.
The basic idea is to create your connection pool first. You want it to be established when the Flask application initializes (this could be within the Python interpreter or via a Gunicorn worker, of which there may be several, in which case each worker gets its own connection pool). I chose to store the returned pool in the config:
from flask import Flask, g, jsonify
import psycopg2
from psycopg2 import pool

app = Flask(__name__)

app.config['postgreSQL_pool'] = psycopg2.pool.SimpleConnectionPool(
    1, 20,
    user="postgres",
    password="very_secret",
    host="127.0.0.1",
    port="5432",
    database="postgres")
Note the first two arguments to SimpleConnectionPool are the min & max connections. That's the number of connections going to your database server, between 1 & 20 in this case.
Next define a get_db function:
def get_db():
    if 'db' not in g:
        g.db = app.config['postgreSQL_pool'].getconn()
    return g.db
The SimpleConnectionPool.getconn() method used here simply returns a connection from the pool, which we assign to g.db and return. This means that when we call get_db() anywhere in the code, it returns the same connection, or fetches one from the pool if not present. There's no need for a before_request decorator.
Now define your teardown function:
@app.teardown_appcontext
def close_conn(e):
    db = g.pop('db', None)
    if db is not None:
        app.config['postgreSQL_pool'].putconn(db)
This runs when the application context is destroyed and uses SimpleConnectionPool.putconn() to return the connection to the pool.
Finally define a route:
@app.route('/')
def index():
    db = get_db()
    cursor = db.cursor()
    cursor.execute("select 1;")
    result = cursor.fetchall()
    print(result)
    cursor.close()
    return jsonify(result)
This code works for me, tested against Postgres running in a Docker container. A few things which probably should be improved:
This view isn't very DRY. Perhaps you could move some of this into the get_db function so it returns a cursor (see the sketch after this list).
When the Python interpreter exits, you should also find a way to close the connections with app.config['postgreSQL_pool'].closeall().
Although this is tested, some way to monitor the pool would be good, so that you could watch pool/db connections under load and make sure the pooler behaves as expected.
On a related note, the sqlalchemy.scoped_session documentation explains more about this, with some theory on how its 'sessions' work in relation to requests. They have implemented it in such a way that you can call Session.query('SELECT 1') and it will create the session if it doesn't already exist.
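For the DRY point above, a minimal sketch of one option, assuming the same app.config['postgreSQL_pool'] as in the answer (the helper name get_db_cursor is made up here):

from contextlib import contextmanager


@contextmanager
def get_db_cursor(commit=False):
    # Hypothetical helper: borrow a connection from the pool, hand out a
    # cursor, and always return the connection to the pool on exit.
    conn = app.config['postgreSQL_pool'].getconn()
    try:
        cursor = conn.cursor()
        yield cursor
        if commit:
            conn.commit()
        cursor.close()
    finally:
        app.config['postgreSQL_pool'].putconn(conn)

The route body would then shrink to something like: with get_db_cursor() as cursor: cursor.execute("select 1;").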
EDIT: Here's a gist with your app factory pattern, and sample usage in the comment.
Currently I am using this pattern:
(I'll edit this answer eventually if I come up with a better solution.)
This is the main script in which we use the database. It uses two functions from config: get_db() to get a connection from the pool and put_db() to return the connection to the pool:
from config import get_db, put_db
from threading import Thread
from time import sleep


def select():
    db = get_db()
    sleep(1)
    cursor = db.cursor()
    # Print the select result and the db connection address in memory,
    # to see whether the second thread gets a connection at another address
    cursor.execute("SELECT 'It works %s'", (id(db),))
    print(cursor.fetchone())
    cursor.close()
    put_db(db)


Thread(target=select).start()
Thread(target=select).start()
print('Main thread')
This is config.py:
import sys
import os
import psycopg2
from psycopg2 import pool
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())


def get_db(key=None):
    return getattr(get_db, 'pool').getconn(key)


def put_db(conn, key=None):
    getattr(get_db, 'pool').putconn(conn, key=key)


# We need to initialize the connection pool in the main thread for everything to work.
# The pool is stored as an attribute on the get_db function object.
try:
    setattr(get_db, 'pool', psycopg2.pool.ThreadedConnectionPool(1, 20, os.getenv("DB")))
    print('Initialized db')
except psycopg2.OperationalError as e:
    print(e)
    sys.exit(0)
And, if you are curious, there is an .env file containing the db connection string in the DB environment variable:
DB="dbname=postgres user=postgres password=1234 host=127.0.0.1 port=5433"
(.env file is loaded using dotenv module in config.py)

Start and stop a periodic background task with Django

I would like to build a Bitcoin notification with Django. I managed to get a working Telegram bot that sends the Bitcoin stats when I ask it to. Now I would like it to send me a message if Bitcoin reaches a specific value. There are some tutorials on running a Python script on a server, but not with Django. I read some answers and descriptions about Django Channels but couldn't adapt them to my project.
I would like to send, via Telegram, a command with an amount and a duration. Django would then start a background process with these values and with the values of the channel I'm sending from. If the amount is reached within the duration, Django sends a message back to my channel. This should also be possible for more than one person.
Is this possible with Django out of the box, maybe with decorators, or do I need django-channels or something else?
Edit 2018-08-10:
Maybe my code explains a little bit better what I want to do.
import requests
import json
from datetime import datetime

from django.shortcuts import render
from django.http import HttpResponse
from django.conf import settings
from django.views.generic import TemplateView
from django.views.decorators.csrf import csrf_exempt


class AboutView(TemplateView):
    template_name = 'telapi/about.html'


bot_token = settings.BOT_TOKEN


def get_url(method):
    return 'https://api.telegram.org/bot{}/{}'.format(bot_token, method)


def process_message(update):
    data = {}
    data['chat_id'] = update['message']['from']['id']
    data['text'] = "I can hear you!"
    r = requests.post(get_url('sendMessage'), data=data)


@csrf_exempt
def process_update(request, r_bot_token):
    '''Method that is called from the Telegram bot.'''
    if request.method == 'POST' and r_bot_token == bot_token:
        update = json.loads(request.body.decode('utf-8'))
        if 'message' in update:
            if update['message']['text'] == 'give me news':
                new_bitcoin_price(update)
            else:
                process_message(update)
    return HttpResponse(status=200)


bitconin_api_uri = 'https://api.coinmarketcap.com/v2/ticker/1/?convert=EUR'
# response = requests.get(bitconin_api_uri)


def get_latest_bitcoin_price():
    response = requests.get(bitconin_api_uri)
    response_json = response.json()
    euro_price = float(response_json['data']['quotes']['EUR']['price'])
    timestamp = int(response_json['metadata']['timestamp'])
    date = datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')
    return euro_price, date


def new_bitcoin_price(update):
    data = {}
    data['chat_id'] = update['message']['from']['id']
    euro_price, date = get_latest_bitcoin_price()
    data['text'] = "Aktuell ({}) beträgt der Preis {:.2f}€".format(
        date, euro_price)
    r = requests.post(get_url('sendMessage'), data=data)
Edit 2018-08-13:
I think the solution would be celery-beat and channels. Does anyone know a good tutorial?
One of my teammates uses django-celery-beat (available at https://github.com/celery/django-celery-beat) for this and has given me excellent feedback on it. You can schedule the Celery tasks using crontab syntax.
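To illustrate the crontab syntax, here is a minimal sketch using plain Celery beat configuration; the project name, task path, and schedule below are assumptions, and django-celery-beat lets you store the equivalent schedule in the database via the Django admin instead:

from celery import Celery
from celery.schedules import crontab

app = Celery('myproject')  # hypothetical Celery app

app.conf.beat_schedule = {
    'check-bitcoin-price-every-5-minutes': {
        'task': 'myapp.tasks.check_bitcoin_price',  # hypothetical task path
        'schedule': crontab(minute='*/5'),          # run every 5 minutes
    },
}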
I had the same issue; there are several typical approaches: Celery, Django Channels, etc.
But you can avoid them all with a simple approach: https://docs.djangoproject.com/en/2.1/howto/custom-management-commands/
I have used Django commands in my project to run periodic tasks that rebuild user statistics:
Implement your own application command; for example, if your application name is myapp and you have placed my_periodic_task.py in the myapp/management/commands folder, you can run your task once by typing python manage.py my_periodic_task (a minimal command skeleton is sketched after these steps).
Place a new file, for example background.py, beside the manage.py file, with code like this:
import os
from time import sleep
from subprocess import call

BASE = os.path.dirname(__file__)
MANAGE_BASE = os.path.join(BASE, 'manage.py')

while True:
    sleep(YOUR_TIMEOUT)  # seconds between runs
    call(['python', MANAGE_BASE, 'my_periodic_task'])
Run your server, for example: python background.py & python manage.py runserver 0.0.0.0:8000
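For completeness, a minimal sketch of what the command itself could look like (the app name and the task body are placeholders):

# myapp/management/commands/my_periodic_task.py
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = 'Example periodic task, e.g. rebuild user statistics'

    def handle(self, *args, **options):
        # do the actual periodic work here
        self.stdout.write('my_periodic_task executed')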

Rabbitmq listener using pika in django

I have a Django application and I want to consume messages from RabbitMQ. I want the listener to start consuming when I start the Django server. I am using the pika library to connect to RabbitMQ. Providing some code example would really help.
First you need to run your consumer when the Django project starts:
https://docs.djangoproject.com/en/2.0/ref/applications/#django.apps.AppConfig.ready
def ready(self):
    if not settings.IS_ACCEPTANCE_TESTING and not settings.IS_UNITTESTING:
        consumer = AMQPConsuming()
        consumer.daemon = True
        consumer.start()
Then, in any convenient place:
import threading

import pika
from django.conf import settings


class AMQPConsuming(threading.Thread):

    def callback(self, ch, method, properties, body):
        # do something
        pass

    @staticmethod
    def _get_connection():
        parameters = pika.URLParameters(settings.RABBIT_URL)
        return pika.BlockingConnection(parameters)

    def run(self):
        connection = self._get_connection()
        channel = connection.channel()
        channel.queue_declare(queue='task_queue6')
        print('Hello world! :)')
        channel.basic_qos(prefetch_count=1)
        # consume from the queue declared above
        channel.basic_consume(self.callback, queue='task_queue6')
        channel.start_consuming()
This will help
http://www.rabbitmq.com/tutorials/tutorial-six-python.html

celery, flask sqlalchemy: DatabaseError: (DatabaseError) SSL error: decryption failed or bad record mac

Hi, I have a setup where I'm using Celery, Flask, and SQLAlchemy, and I am intermittently getting this error:
(psycopg2.DatabaseError) SSL error: decryption failed or bad record mac
I followed this post:
Celery + SQLAlchemy : DatabaseError: (DatabaseError) SSL error: decryption failed or bad record mac
and also a few more, and added prerun and postrun handlers:
@task_postrun.connect
def close_session(*args, **kwargs):
    # Flask-SQLAlchemy will automatically create new sessions for you from
    # a scoped session factory, given that we are maintaining the same app
    # context. This ensures tasks have a fresh session (e.g. session errors
    # won't propagate across tasks).
    d.session.remove()


@task_prerun.connect
def on_task_init(*args, **kwargs):
    d.engine.dispose()
But I'm still seeing this error. Has anyone solved this?
Note that I'm running this on AWS (with two servers accessing the same database). The database itself is hosted on its own server (not RDS). I believe the total number of Celery background tasks running is 6 (2+4). The Flask frontend is running under Gunicorn.
My related thread:
https://github.com/celery/celery/issues/3238#issuecomment-225975220
Here is my comment along with additional information:
I use Celery, SQLAlchemy and PostgreSQL on AWS and there is no such problem.
The only difference I can think of is that I have the database on RDS.
I think you can try switching to RDS temporarily, just to test whether the
issue is still present or not. If it disappears with RDS, then
you'll need to look into your PostgreSQL settings.
According to the RDS parameters, I have SSL enabled:
ssl = 1, Enables SSL connections.
ssl_ca_file = /rdsdbdata/rds-metadata/ca-cert.pem
ssl_cert_file = /rdsdbdata/rds-metadata/server-cert.pem
ssl_ciphers = false, Sets the list of allowed SSL ciphers.
ssl_key_file = /rdsdbdata/rds-metadata/server-key.pem
ssl_renegotiation_limit = 0, integer, (kB) Set the amount of traffic to send and receive before renegotiating the encryption keys.
As for the Celery initialization code, it is roughly this:
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker

import sqldb

engine = sqldb.get_engine()
cached_data = None


def do_the_work():
    global engine, cached_data
    if cached_data is not None:
        return cached_data
    db_session = None
    try:
        db_session = scoped_session(sessionmaker(
            autocommit=False, autoflush=False, bind=engine))
        data = sqldb.get_session().query(
            sqldb.system.MyModel).filter_by(
            my_type=sqldb.system.MyModel.TYPEA).all()
        cached_data = {}
        for row in data:
            ...  # put row into cached_data
    finally:
        if db_session is not None:
            db_session.remove()
    return cached_data
This do_the_work function is then called from the celery task.
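The task wrapper itself can be as simple as this sketch (the Celery app import path and the task name are assumptions, not the original code):

from myproject.celery import app  # hypothetical Celery app module


@app.task
def load_cached_data():
    # Runs do_the_work() inside a worker; the result is cached at module level
    return do_the_work()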
The sqldb.get_engine looks like this:
from sqlalchemy import create_engine

import config

_engine = None


def get_engine():
    global _engine
    if _engine:
        return _engine
    _engine = create_engine(config.SQL_DB_URL, echo=config.SQL_DB_ECHO)
    return _engine
Finally, the SQL_DB_URL and SQL_DB_ECHO in the config module are these:
SQL_DB_URL = 'postgresql+psycopg2://%s:%s@%s/%s' % (
    POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_HOST, POSTGRES_DB_NAME)
SQL_DB_ECHO = False

Emit/Broadcast Messages on REST Call in Python With Flask and Socket.IO

Background
The purpose of this project is to create an SMS-based kill switch for a program I have running locally. The plan is to create a WebSocket connection between the local program and an app hosted on Heroku. Using Twilio, receiving an SMS will trigger a POST request to this app. If it comes from a number on my whitelist, the application should send a command to the local program to shut down.
Problem
What can I do to find a reference to the namespace so that I can broadcast a message to all connected clients from a POST request?
Right now I am simply creating a new WebSocket client, connecting it, and sending the message, because I can't figure out how to get access to the namespace object in a way that lets me call an emit or broadcast.
Server Code
from gevent import monkey
from flask import Flask, Response, render_template, request
from socketio import socketio_manage
from socketio.namespace import BaseNamespace
from socketio.mixins import BroadcastMixin
from time import time
import twilio.twiml
from socketIO_client import SocketIO  # only necessary because of the hack solution
import socketIO_client

monkey.patch_all()

application = Flask(__name__)
application.debug = True
application.config['PORT'] = 5000

# White list
callers = {
    "+15555555555": "John Smith"
}

# Part of 'hack' solution
stop_namespace = None
socketIO = None


# Part of 'hack' solution
def on_connect(*args):
    global stop_namespace
    stop_namespace = socketIO.define(StopNamespace, '/chat')


# Part of 'hack' solution
class StopNamespace(socketIO_client.BaseNamespace):
    def on_connect(self):
        self.emit("join", 'server@email.com')
        print '[Connected]'


class ChatNamespace(BaseNamespace, BroadcastMixin):
    stats = {
        "people": []
    }

    def initialize(self):
        self.logger = application.logger
        self.log("Socketio session started")

    def log(self, message):
        self.logger.info("[{0}] {1}".format(self.socket.sessid, message))

    def report_stats(self):
        self.broadcast_event("stats", self.stats)

    def recv_connect(self):
        self.log("New connection")

    def recv_disconnect(self):
        self.log("Client disconnected")
        if self.session.has_key("email"):
            email = self.session['email']
            self.broadcast_event_not_me("debug", "%s left" % email)
            self.stats["people"] = filter(lambda e: e != email, self.stats["people"])
            self.report_stats()

    def on_join(self, email):
        self.log("%s joined chat" % email)
        self.session['email'] = email
        if not email in self.stats["people"]:
            self.stats["people"].append(email)
        self.report_stats()
        return True, email

    def on_message(self, message):
        message_data = {
            "sender": self.session["email"],
            "content": message,
            "sent": time() * 1000  # ms
        }
        self.broadcast_event_not_me("message", {"sender": self.session["email"], "content": message})
        return True, message_data


@application.route('/stop', methods=['GET', 'POST'])
def stop():
    '''Right here SHOULD simply be Namespace.broadcast("stop") or something.'''
    global socketIO
    if socketIO == None or not socketIO.connected:
        socketIO = SocketIO('http://0.0.0.0:5000')
        socketIO.on('connect', on_connect)
    global stop_namespace
    if stop_namespace == None:
        stop_namespace = socketIO.define(StopNamespace, '/chat')
    stop_namespace.emit("join", 'server@bayhill.com')
    stop_namespace.emit('message', 'STOP')
    return "Stop being processed."


@application.route('/', methods=['GET'])
def landing():
    return "This is Stop App"


@application.route('/socket.io/<path:remaining>')
def socketio(remaining):
    try:
        socketio_manage(request.environ, {'/chat': ChatNamespace}, request)
    except:
        application.logger.error("Exception while handling socketio connection",
                                 exc_info=True)
    return Response()
I borrowed heavily from the chatzilla project, which is admittedly pretty different because I am not really working with a browser.
Perhaps Socket.IO was a bad choice for web sockets and I should have used Tornado, but this seemed like it would work well, and this setup helped me easily separate the REST and WebSocket pieces.
I just use Flask-SocketIO for that.
from gevent import monkey
monkey.patch_all()

from flask import Flask
from flask.ext.socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)


@app.route('/trigger')
def trigger():
    socketio.emit('response',
                  {'data': 'someone triggered me'},
                  namespace='/global')
    return 'message sent via websocket'


if __name__ == '__main__':
    socketio.run(app)
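For completeness, a rough sketch of a client that would receive that broadcast, using the socketIO_client package that appears earlier in this thread. The host, port, namespace and event name are taken from the example above; whether this works depends on protocol compatibility with your Flask-SocketIO version, so treat it as an assumption:

from socketIO_client import SocketIO, BaseNamespace


class GlobalNamespace(BaseNamespace):
    def on_response(self, data):
        # Fires when the server emits 'response' on the '/global' namespace
        print(data)  # e.g. {'data': 'someone triggered me'}


socketIO = SocketIO('localhost', 5000)
socketIO.define(GlobalNamespace, '/global')
socketIO.wait(seconds=60)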