I'm trying to set up the python-telegram-bot library in webhook mode with Django. It should work as follows: on Django startup, I do some initial setup of python-telegram-bot and get a dispatcher object as a result. Django listens on the /telegram_hook URL and receives updates from Telegram's servers. What I want to do next is pass the updates to the process_update method of the dispatcher created on startup. It contains all the parsing logic and invokes the callbacks specified during setup.
The problem is that the dispatcher object needs to be saved globally. I know that global state is evil, but this isn't really global state because the dispatcher is immutable. However, I still don't know where to put it and how to ensure that it will be visible to all threads after the setup phase is finished. So the question is: how do I properly save the dispatcher after setup so I can invoke it from a Django viewset?
P.S. I know that I could use a built-in web server, use polling, or whatever. However, I have reasons to use Django, and I'd like to know how to deal with cases like this anyway, because this isn't the only situation I can imagine where I need to globally store an immutable object created on startup.
It looks like you need a thread-safe singleton, like this one: https://gist.github.com/werediver/4396488 or http://alacret.blogspot.ru/2015/04/python-thread-safe-singleton-pattern.html
import threading

# Based on the tornado.ioloop.IOLoop.instance() approach.
# See https://github.com/facebook/tornado
class SingletonMixin(object):
    __singleton_lock = threading.Lock()
    __singleton_instance = None

    @classmethod
    def instance(cls):
        # Double-checked locking: take the lock only while the
        # instance has not been created yet.
        if not cls.__singleton_instance:
            with cls.__singleton_lock:
                if not cls.__singleton_instance:
                    cls.__singleton_instance = super(SingletonMixin, cls).__new__(cls)
        return cls.__singleton_instance
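For the telegram-bot case, a minimal sketch of how this could be wired up (assuming a pre-v20 python-telegram-bot, where Dispatcher and process_update exist as described in the question; BotHolder, setup and telegram_hook are illustrative names, not part of either library):

import json

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from telegram import Bot, Update
from telegram.ext import Dispatcher

class BotHolder(SingletonMixin):
    def setup(self, token):
        # instance() allocates via __new__ and never calls __init__,
        # so initialization is an explicit one-time step, e.g. called
        # from AppConfig.ready() at Django startup.
        self.bot = Bot(token)
        # workers=0: updates are fed in synchronously from the view
        self.dispatcher = Dispatcher(self.bot, None, workers=0)
        # register your handlers on self.dispatcher here

@csrf_exempt  # Telegram's POSTs carry no CSRF token
def telegram_hook(request):
    holder = BotHolder.instance()
    update = Update.de_json(json.loads(request.body), holder.bot)
    holder.dispatcher.process_update(update)
    return HttpResponse('OK')

Since instance() always returns the same object, every worker thread serving /telegram_hook sees the dispatcher created during startup.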
Our callback system works as follows: during a request where more user input is needed, you run something like this:
def view(req):
    # do checks, maybe get a variable.
    bar = req.bar()

    def doit():
        foo = req.user
        do_the_things(foo, bar)

    req.confirm(doit, "Are you sure you want to do it")
From this, the server stores the function object in a dictionary, with a UID as a key that is sent to the client, where a confirmation dialog is shown. When OK is pressed, another request is sent to another view, which looks up the stored function object and runs it.
This works in a single-process deployment. However, with nginx and a process pool greater than 1, a different process may get the confirmation request; it doesn't have the stored function and thus cannot run it.
We've looked into ways to force nginx to use a certain process for certain requests, but haven't found a solution.
We've also looked into multiprocessing libraries and Celery; however, there doesn't seem to be a way to send a predefined function to another process.
Can anyone suggest a method that will allow us to store a function to run later when the request for continuing might come from a separate process?
There doesn't seem to be a good reason to use a callback defined as an inline function here.
The web is a stateless environment. You can never be certain of getting the same server process two requests in a row, and your code should never rely on keeping per-user state in process memory between requests.
Instead you need to put data into a data store of some kind. In this case, the session is the ideal place; you can store the IDs there, then redirect the user to a view that pops that key from the session and runs the process on the relevant IDs. Again, no need for an inline function at all.
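A minimal sketch of that session-based flow (the view names, the 'pending_ids' session key, and the get_objects helper are illustrative; do_the_things is the function from the question):

from django.shortcuts import redirect, render

def view(request):
    # store the IDs the deferred action needs in the session, which is
    # backed by a shared store (database/cache) visible to every
    # worker process
    request.session['pending_ids'] = [obj.pk for obj in get_objects(request)]
    return redirect('confirm')

def confirm(request):
    if request.method == 'POST':
        # any process can reconstruct the work from the session
        ids = request.session.pop('pending_ids', [])
        do_the_things(request.user, ids)
        return redirect('done')
    return render(request, 'confirm.html')

Whichever process handles the confirmation POST can rebuild the work from the session, so no function object ever needs to survive in memory.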
I was writing an application in Play 2.3.7, and when trying to create an actor (using Play's default Akka.system()) inside the overridden beforeStart method of the Global object, the application crashes with an infinite recursion of beforeStart calls, ultimately throwing an exception because the Global object is not yet initialized. If I create this actor inside the onStart method instead, everything goes well.
My "intuition" was: "OK, this actor must be ready before the application receives the first request, so it must be created in beforeStart, not in onStart".
When is Akka.system() ready to use?
Akka.system returns an ActorSystem held by the AkkaPlugin. Therefore, if you want to use it, you must do so after the AkkaPlugin has been initialized. The AkkaPlugin is given priority 1000, which means it's started after most other internal plugins (database, evolutions, ...). The Global plugin has priority 10000, which means the AkkaPlugin is available there (and in any plugin with priority > 1000).
Note the warning in the docs about beforeStart:
Called before the application starts.
Resources managed by plugins, such as database connections, are likely not available at this point.
You have to start this in onStart() because beforeStart() is called too early, long before anything like Akka (which is actually a plugin) or any database connections are created. In fact, the documentation for GlobalSettings states:
Resources managed by plugins, such as database connections, are likely not available at this point.
The general guidance (confirmed by this thread) is that onStart() is the place to create your actors. And in practice, that has worked for me as well.
My web app is based on (embedded) Jetty 9. The code that runs inside Jetty (i.e. from the *.war file) sometimes needs to execute an HTTP request back into Jetty and itself, completely asynchronously to "real" HTTP requests coming from the network.
I know what you will say, but this is the situation I ended up with after merging multiple disparate products into one container, and presently I cannot avoid it. A stop-gap measure is in place: it actually sends a network HTTP request back to itself (presently using the Jetty client, but that doesn't matter). However, not only does that add more overhead, it also doesn't allow us to pass actual object references, which we'd like to pass via, say, request attributes.
The desire is to be able to do something like constructing a new HttpServletRequest and HttpServletResponse pair and using a dispatcher to "include" (or similar) the other servlet we presently can only access via the network. We've built "dummy" implementations of those, but this fails at line 120 of Jetty's dispatcher with a NullPointerException:
public void include(ServletRequest request, ServletResponse response) throws ServletException, IOException
{
    Request baseRequest=(request instanceof Request)?((Request)request):HttpChannel.getCurrentHttpChannel().getRequest();
... this fails because our request is not an instance of Jetty's Request class, and getCurrentHttpChannel() returns null because the thread is a worker thread, not an HTTP-serving one, and does not have Jetty's thread locals set up.
I am contemplating options, but would like some guidance if anyone can offer it. Some things I am thinking of:
Actually use Jetty's Request class as a base. It is currently not visible to the web app (it's a container class), so we'd perhaps have to play with the classpath and class loaders. It may still be impossible (I don't know what to expect there).
Play with Jetty's thread locals and attempt to tell Jetty to set up the current thread as necessary. I don't know where to begin. UPDATE: I tried setServerClasses([]) and then set the current HttpChannel to one I 'stole' from another thread. It failed miserably: java.lang.IllegalAccessError: tried to access method org.eclipse.jetty.server.HttpChannel.setCurrentHttpChannel(Lorg/eclipse/jetty/server/HttpChannel;)V from class ...
Ideally, find a better/proper way of feeding a "top" request in without going via the network. Ideally it would execute on the same thread, but I'd be less concerned about that.
Remember that, unfortunately, I cannot avoid this at this time. I would much rather invoke code directly, but I cannot: the code I had to merge into mine is too big to handle at this time and too dependent on some third-party filters I can't modify (and which only work as filters, on real requests).
Please help!
I am running a Python application on App Engine using Django. Additionally, I am using a session-management library called gae-sessions. If threadsafe is set to "no", there is no problem, but when threadsafe is set to "yes", I occasionally see sessions being lost.
The issue I am seeing is that when threading is enabled, multiple requests are occasionally interleaved in the GAE-Sessions middleware.
Within the gae-sessions library there is a variable called _tls, which is a threading.local() variable. When a user makes an HTTP request to the website, a function called process_request() runs first, followed by a bunch of custom HTML generation for the current page, and then a function called process_response() runs. State is remembered between process_request and process_response in the _tls "thread safe" variable. I can check the uniqueness of the _tls variable by printing its value (e.g. "<thread._local object at 0xfc2e8de0>").
What I am occasionally witnessing is that multiple HTTP requests are being interleaved on what appears to be a single thread in the GAE-Sessions middleware (inferred to be a single thread by the fact that the thread-local objects have the same memory location, and by the fact that data from one request appears to be overwriting data from another request). Given User1 and User2 making requests at the same time, I have witnessed the following execution order:
User1 -> `process_request` is executed on thread A
User2 -> `process_request` is executed on thread A
User2 -> `process_response` is executed on thread A
User1 -> `process_response` is executed on thread A
Given the above scenario, the User2 session stomps on some internal variables and causes the session of User1 to be lost.
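To make the failure mode concrete, here is a simplified sketch of the pattern (illustrative names, not the actual gae-sessions code):

import threading

_tls = threading.local()

class BuggySessionMiddleware(object):
    def process_request(self, request):
        # state is stashed in a thread-local between the two hooks
        _tls.session_key = request.COOKIES.get('sid')

    def process_response(self, request, response):
        # if another request ran process_request on this same thread
        # in the meantime, this reads *that* request's key instead
        response['X-Debug-Session'] = str(getattr(_tls, 'session_key', None))
        return response

If process_request/process_response pairs from different requests can interleave on one thread, the thread-local is no safer than a plain module-level variable.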
So, my question is the following:
1) Is this interleaving of different requests in the middleware expected behaviour in App-Engine/Django/Python? (or am I totally confused, and there is something else going on here)
2) At what level is this interleaving happening (App-Engine/Django/Python)?
I am quite surprised by seeing this behaviour, and so would be interested to understand why/what is happening here.
I found the following links to be helpful in understanding what is happening:
http://blog.notdot.net/2011/10/Migrating-to-Python-2-7-part-1-Threadsafe
Is Django middleware thread safe?
http://blog.roseman.org.uk/2010/02/01/middleware-post-processing-django-gotcha/
Assuming that I am understanding everything correctly, the reason that the above happened is the following:
1) When Django is running, it runs most of the base functionality, including the Django middleware, in a parent (common) thread.
2) Individual requests are run in child threads, which can interact with the parent thread.
The result of the above is that requests (child threads) can indeed be interleaved within the middleware, and this is by design (running only a single copy of Django and the middleware saves memory, is more efficient, etc.). [See the first article I linked to in this answer for a quick description of how threading and child/parent processes interact.]
With respect to GAE-Sessions: the thread we were examining was the same for different requests because it was the parent thread (common to all children/requests), as opposed to the child threads we were actually on each time the middleware was entered.
GAE-Sessions was storing state data in the middleware, which could be overwritten by different requests, given the possible interleaving of the child threads within the parent (Django + middleware) thread. The fix I applied to GAE-Sessions was to store all state data on the request object, as opposed to within the middleware.
Fixes: previously, a writable reference to response-handler functions was stored on the DjangoSessionMiddleware object as self.response_handlers; I moved it to the request object as request.response_handlers. I also removed the _tls variable and moved the data it contained onto the request object.
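The shape of that fix, as a minimal sketch (illustrative names, not the actual gae-sessions code):

class FixedSessionMiddleware(object):
    def process_request(self, request):
        # per-request state lives on the request object itself, never
        # on self, in a module global, or in a thread-local shared by
        # interleaved requests
        request.response_handlers = []

    def process_response(self, request, response):
        # read the state back from the same request object, so one
        # request can no longer stomp on another's data
        for handler in getattr(request, 'response_handlers', []):
            response = handler(response)
        return response

Each request object is unique to its request, so interleaving on a shared thread no longer matters.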
django-piston appears to create a data attribute on the request object before it gets to the Handler phase. This data is available, for example, in the PUT and POST handlers by accessing request.data.
However, in the DELETE handler, the data is not available.
I would like to modify django-piston to make this data available, but I have no real idea where to start. Any ideas? Where does the data attribute originate?
I solved this for myself. The short, hacky answer is that the method
translate_mime(request)
from piston.utils needs to be run on the request to make the data attribute available.
The overall fix would be a change in the Piston source code itself, in resource.py, to execute the translate_mime method for DELETE actions as well; currently it only does so automatically for PUT and POST.
But, like I said, you can also just call translate_mime manually in the handler method, and it works fine.
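For example, a sketch of that manual call inside a handler (BaseHandler and translate_mime are real Piston names; NoteHandler and the reason field are illustrative):

from piston.handler import BaseHandler
from piston.utils import translate_mime

class NoteHandler(BaseHandler):
    allowed_methods = ('GET', 'POST', 'PUT', 'DELETE')

    def delete(self, request, note_id):
        # Piston only runs translate_mime automatically for PUT/POST,
        # so parse the request body ourselves before using request.data
        translate_mime(request)
        reason = request.data.get('reason')  # now available
        # ... delete the note identified by note_id, recording reason ...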