Running asynchronous code synchronously in separate thread - django

I'm using Django Channels to support websockets and am using their concept of a group to broadcast messages to multiple consumers in the same group. In order to send messages outside of a consumer, you need to call asynchronous methods in otherwise synchronous code. Unfortunately, this is presenting problems when testing.
I began by using loop.run_until_complete:
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.ensure_future(
    channel_layer.group_send(group_name, {'text': json.dumps(message),
                                          'type': 'receive_group_json'}),
    loop=loop))
Then the stacktrace showed that the thread did not have an event loop: RuntimeError: There is no current event loop in thread 'Thread-1'. To solve this, I added:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.run_until_complete(asyncio.ensure_future(
    channel_layer.group_send(group_name, {'text': json.dumps(message),
                                          'type': 'receive_group_json'}),
    loop=loop))
And now the stacktrace reads RuntimeError: Event loop is closed, although if I add a print statement, loop.is_closed() prints False.
For context, I'm using Django 2.0, Channels 2, and a redis backend.
Update: I tried running this in a Python interpreter (outside of py.test, to remove moving variables). When I ran the second code block, I did not get an Event loop is closed error (that may be due to something on pytest's end, whether it's timeouts, etc.). But I did not receive the group message in my client. I did, however, see a print statement:
({<Task finished coro=<RedisChannelLayer.group_send() done, defined at /Users/my/path/to/venv/lib/python3.6/site-packages/channels_redis/core.py:306> result=None>}, set())
Update 2: After flushing redis, I added a py.test fixture to flush it for every test function, as well as a session-scoped event loop. This time it yielded yet another print from RedisChannelLayer:
({<Task finished coro=<RedisChannelLayer.group_send() done, defined at /Users/my/path/to/venv/lib/python3.6/site-packages/channels_redis/core.py:306> exception=RuntimeError('Task <Task pending coro=<RedisChannelLayer.group_send() running at /Users/my/path/to/venv/lib/python3.6/site-packages/channels_redis/core.py:316>> got Future <Future pending> attached to a different loop',)>}, set())

If channel_layer expects to reside in its own event loop in another thread, you will need to get hold of that event loop object. Once you have it, you can submit coroutines to it and synchronize with your thread, like this:
def wait_for_coro(coro, loop):
    # Submit the coroutine to the event loop running in the other
    # thread and block until it completes.
    future = asyncio.run_coroutine_threadsafe(coro, loop)
    return future.result()

wait_for_coro(channel_layer.group_send(group_name, ...), channel_loop)
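If no such dedicated loop thread exists yet, you would have to create one yourself. A minimal sketch, where channel_loop and _run_loop are illustrative names rather than anything Channels provides:

import asyncio
import threading

channel_loop = asyncio.new_event_loop()

def _run_loop():
    # Run the dedicated event loop forever in a background thread;
    # run_coroutine_threadsafe can then target it from other threads.
    asyncio.set_event_loop(channel_loop)
    channel_loop.run_forever()

threading.Thread(target=_run_loop, daemon=True).start()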

By default, only the main thread gets an event loop; calling get_event_loop in other threads will fail.
If you need an event loop in another thread -- such as a thread handling an HTTP or WebSockets request -- you need to make it yourself with new_event_loop. After that you can use set_event_loop, and future get_event_loop calls will return the loop you created. I do this:
# get or create an event loop for the current thread
def get_thread_event_loop():
    try:
        loop = asyncio.get_event_loop()  # gets the previously set event loop, if possible
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    return loop
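A worker thread could then drive the coroutine itself. A hedged usage sketch, assuming channel_layer can safely be driven from this thread's loop (send_group_message is an illustrative name, not a Channels API):

def send_group_message(group_name, message):
    # Runs in a worker thread: get (or create) this thread's loop
    # and drive the coroutine to completion on it.
    loop = get_thread_event_loop()
    loop.run_until_complete(
        channel_layer.group_send(group_name, {'text': json.dumps(message),
                                              'type': 'receive_group_json'}))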

Related

How to integrate Cap'n'Proto threads with non Cap'n'Proto threads?

How do I properly integrate Cap'n'Proto client usage with surrounding multi-threaded code? The Cap'n'Proto docs say that each Cap'n'Proto interface is single-threaded with a dedicated event loop. Additionally, they recommend using Cap'n'Proto to communicate between threads. However, the docs don't seem to describe how non-Cap'n'Proto threads (e.g. the UI loop) could integrate with that. Even if I could integrate Cap'n'Proto event loops with the UI loop in some places, other models like thread pools (Android Binder, global libdispatch queues) seem more challenging.
I think the solution is to cache the thread executor for the client thread in a synchronized place that the non-capnp thread can access.
I believe, though, that the calling thread always needs to be on its own event loop as well to marry them, but I just want to make sure that's actually the case. My initial attempt to do that in a simple unit test is failing. I created a KjLooperEventPort class (following the structure of the node libuv adapter) to marry KJ & ALooper on Android.
Then my test code is:
TEST(KjLooper, CrossThreadPromise) {
  std::thread::id kjThreadId;
  ConditionVariable<const kj::Executor*> executorCv{nullptr};
  ConditionVariable<std::pair<bool, kj::Promise<void>>> looperThreadFinished{false, nullptr};

  std::thread looperThread([&] {
    auto looper = android::newLooper();
    android::KjLooperEventPort kjEventPort{looper};
    kj::WaitScope waitScope(kjEventPort.getKjLoop());

    auto finished = kj::newPromiseAndFulfiller<void>();
    looperThreadFinished.constructValueAndNotifyAll(true, kj::mv(finished.promise));

    executorCv.waitNotValue(nullptr);
    auto executor = executorCv.readCopy();
    kj::Promise<void> asyncPromise = executor->executeAsync([&] {
      ASSERT_EQ(std::this_thread::get_id(), kjThreadId);
    });

    asyncPromise = asyncPromise.then([tid = std::this_thread::get_id(), kjThreadId, &finished] {
      std::cerr << "Running promise completion on original thread\n";
      ASSERT_NE(tid, kjThreadId);
      ASSERT_EQ(std::this_thread::get_id(), tid);
      std::cerr << "Fulfilling\n";
      finished.fulfiller->fulfill();
      std::cerr << "Fulfilled\n";
    });

    asyncPromise.wait(waitScope);
  });

  std::thread kjThread([&] {
    kj::Promise<void> finished = kj::NEVER_DONE;
    looperThreadFinished.wait([&](auto& promise) {
      finished = kj::mv(promise.second);
      return promise.first;
    });

    auto ioContext = kj::setupAsyncIo();
    kjThreadId = std::this_thread::get_id();
    executorCv.setValueAndNotifyAll(&kj::getCurrentThreadExecutor());
    finished.wait(ioContext.waitScope);
  });

  looperThread.join();
  kjThread.join();
}
This crashes when fulfilling the promise back to the kj thread:
terminating with uncaught exception of type kj::ExceptionImpl: kj/async.c++:1269: failed: expected threadLocalEventLoop == &loop || threadLocalEventLoop == nullptr; Event armed from different thread than it was created in. You must use
Executor to queue events cross-thread.
Most Cap'n Proto RPC and KJ Promise-related objects can only be accessed in the thread that created them. Resolving a promise cross-thread, for example, will fail, as you saw.
Some ways you could solve this include:
You can use kj::Executor to schedule code to run on a different thread's event loop. The calling thread does NOT need to be a KJ event loop thread if you use executeSync() -- however, this function blocks until the other thread has had a chance to wake up and execute the function. I'm not sure how well this will perform in practice; if it's a problem, there is probably room to extend the Executor interface to handle this use case more efficiently.
You can communicate between threads by passing messages over pipes or socketpairs (but sending big messages this way would involve a lot of unnecessary copying to/from the socket buffer).
You could signal another thread's event loop to wake up using a pipe, signal, or (on Linux) eventfd, then have it look for messages in a mutex-protected queue. (But kj::Executor mostly obsoletes this technique.)
It's possible, though not easy, to adapt KJ's event loop to run on top of other event loops, so that both can run in the same thread. For example, node-capnp adapts KJ to run on top of libuv.

kafka-python: Closing the kafka producer with 0 vs inf secs timeout

I am trying to produce messages to a Kafka topic using kafka-python 2.0.1 on Python 2.7 (I can't use Python 3 due to some workplace-related limitations).
I created a class as below in a separate package, compiled it, and installed it in a virtual environment:
import json
from kafka import KafkaProducer

class KafkaSender(object):
    def __init__(self):
        self.producer = self.get_kafka_producer()

    def get_kafka_producer(self):
        return KafkaProducer(
            bootstrap_servers=['localhost:9092'],
            value_serializer=lambda x: json.dumps(x),
            request_timeout_ms=2000,
        )

    def send(self, data):
        self.producer.send("topicname", value=data)
My driver code is something like this:
from mypackage import KafkaSender
# driver code
data = {"a":"b"}
kafka_sender = KafkaSender()
kafka_sender.send(data)
Scenario 1:
I run this code; it runs just fine with no errors, but the message is not pushed to the topic. I have confirmed this, as the offset/lag does not increase on the topic. Also, nothing is logged at the consumer end.
Scenario 2:
I commented out/removed the initialization of the Kafka producer from the __init__ method.
I changed the sending line from self.producer.send("topicname", value=data) to self.get_kafka_producer().send("topicname", value=data), i.e. creating the Kafka producer not in advance (during class initialization) but right before sending the message to the topic. And when I ran the code, it worked perfectly: the message got published to the topic.
My intention in scenario 1 was to create a Kafka producer once and use it multiple times, not to create a Kafka producer every time I want to send a message; otherwise I might end up creating millions of Kafka producer objects if I need to send millions of messages.
Can you please help me understand why the Kafka producer is behaving this way?
NOTE: If I write the Kafka code and the driver code in the same file, it works fine. It fails only when I write the Kafka code in a separate package, compile it, and import it into my other project.
LOGS: https://www.diffchecker.com/dTtm3u2a
Update 1: 9th May 2020, 17:20:
Removed INFO logs from the question description. I enabled the DEBUG level and here is the difference between the debug logs between first scenario and the second scenario
https://www.diffchecker.com/dTtm3u2a
Update 2: 9th May 2020, 21:28:
Upon further debugging and looking at the kafka-python source code, I was able to deduce that in scenario 1 the Kafka sender was force-closed, while in scenario 2 the Kafka sender was closed gracefully.
def initiate_close(self):
    """Start closing the sender (won't complete until all data is sent)."""
    self._running = False
    self._accumulator.close()
    self.wakeup()

def force_close(self):
    """Closes the sender without sending out any pending messages."""
    self._force_close = True
    self.initiate_close()
Which of these runs depends on whether the Kafka producer's close() method is called with timeout 0 (forced close of the sender) or without a timeout (in which case the timeout takes the value float('inf') and a graceful close of the sender is performed).
The producer's close() method is called from the __del__ method, which runs at garbage-collection time. close(0) is called from a method registered with atexit, which runs when the interpreter terminates.
The question is: why is the interpreter terminating in scenario 1?
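For what it's worth, kafka-python's send() is asynchronous: it only enqueues the message in an in-memory accumulator, and a background sender thread transmits it later. A minimal sketch of one way to rule out this shutdown race in the driver code, using flush(), a real KafkaProducer method that blocks until all buffered records are sent:

from mypackage import KafkaSender

data = {"a": "b"}
kafka_sender = KafkaSender()
kafka_sender.send(data)
# send() only enqueues the message; block here until the background
# sender thread has actually delivered it before the process exits.
kafka_sender.producer.flush()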

Why are my requests handled by a single thread in spray-http?

I set up an HTTP server using spray-can, spray-http 1.3.2 and akka 2.3.6.
My application.conf doesn't have any akka (or spray) entries. My actor code:
class TestActor extends HttpServiceActor with ActorLogging with PlayJsonSupport {
  val route = get {
    path("clientapi" / "orders") {
      complete {{
        log.info("handling request")
        System.err.println("sleeping " + Thread.currentThread().getName)
        Thread.sleep(1000)
        System.err.println("woke up " + Thread.currentThread().getName)
        Seq[Int]()
      }}
    }
  }

  override def receive: Receive = runRoute(route)
}
started like this:
val restService = system.actorOf(Props(classOf[TestActor]), "rest-clientapi")
IO(Http) ! Http.Bind(restService, serviceHost, servicePort)
When I send 10 concurrent requests, they are all accepted immediately by spray and forwarded to different dispatcher actors (according to akka logging that I have since removed from application.conf lest it influence the result), but all are handled by the same thread, which sleeps, and only after waking up picks up the next request.
What should I add/change in the configuration? From what I've seen in reference.conf, the default executor is a fork-join-executor, so I'd expect all the requests to execute in parallel out of the box.
From your code I see that there is only one TestActor handling all requests, since you created just one with system.actorOf. Note that actorOf doesn't create a new actor per request; moreover, you have a val there, so it's only ever one actor. This actor handles requests sequentially, one by one, and your routes are processed inside this actor. There is no reason for the dispatcher to pick another thread while only one thread at a time is used by this single actor, so you see only one thread in the logs (though that's not guaranteed) - I assume it's the first thread in the pool.
The fork-join executor does nothing here except hand out the first (and always the same) free thread, as there are no other actors requiring threads in parallel with the current one. So it receives only one task at a time. Even "work stealing" doesn't kick in until you have some blocked thread (marked as a managed block) to "steal" resources from. Thread.sleep(1000) doesn't mark the thread automatically - you should surround it with scala.concurrent.blocking to take advantage of work stealing. Anyway, it will still be only one thread while you have only one actor.
If you need several actors to process the requests, just use an akka router actor (it has nothing in common with the spray router):
val restService = context.actorOf(RoundRobinPool(5).props(Props[TestActor]), "router")
That will create a pool (not a thread pool) of 5 actors to serve your requests.

Stopping a function in Python using a timeout

I'm writing a healthcheck endpoint for my web service.
The endpoint calls a series of functions, each of which returns True if the component is working correctly.
The system is considered to be working if all the components are working:
def is_health():
    healthy = all(r for r in (database(), cache(), worker(), storage()))
    return healthy
When things aren't working, the functions may take a long time to return. For example if the database is bogged down with slow queries, database() could take more than 30 seconds to return.
The healthcheck endpoint runs in the context of a Django view, running inside a uWSGI container. If the request / response cycle takes longer than 30 seconds, the request is harakiri-ed!
This is a huge bummer, because I lose all contextual information that I could have logged about which component took a long time.
What I'd really like, is for the component functions to run within a timeout or a deadline:
with timeout(seconds=30):
    database_result = database()
    cache_result = cache()
    worker_result = worker()
    storage_result = storage()
In my imagination, as the deadline / harakiri timeout approaches, I can abort the remaining health checks and just report the work I've completed.
What's the right way to handle this sort of thing?
I've looked at threading.Thread and Queue.Queue - the idea being that I create a work queue and a result queue, then use a thread to consume the work queue while placing the results in the result queue. Then I could use the thread's Thread.join function to stop processing the rest of the components.
The one challenge there is that I'm not sure how to hard-exit the thread - I wouldn't want it hanging around forever if it didn't complete its run.
Here is the code I've got so far. Am I on the right track?
import Queue
import threading
import time


class WorkThread(threading.Thread):
    def __init__(self, work_queue, result_queue):
        super(WorkThread, self).__init__()
        self.work_queue = work_queue
        self.result_queue = result_queue
        self._timeout = threading.Event()

    def timeout(self):
        self._timeout.set()

    def timed_out(self):
        return self._timeout.is_set()

    def run(self):
        while not self.timed_out():
            try:
                # Non-blocking get so the Queue.Empty handler can fire;
                # a blocking get() would hang forever on an empty queue.
                work_fn, work_arg = self.work_queue.get_nowait()
                retval = work_fn(work_arg)
                self.result_queue.put(retval)
            except Queue.Empty:
                break


def work(retval, timeout=1):
    time.sleep(timeout)
    return retval


def main():
    # Two work items that will take at least two seconds to complete.
    work_queue = Queue.Queue()
    work_queue.put_nowait([work, 1])
    work_queue.put_nowait([work, 2])
    result_queue = Queue.Queue()

    # Run the WorkThread. It should complete one item from the work queue
    # before it times out.
    t = WorkThread(work_queue=work_queue, result_queue=result_queue)
    t.start()
    t.join(timeout=1.1)
    t.timeout()

    results = []
    while True:
        try:
            result = result_queue.get_nowait()
            results.append(result)
        except Queue.Empty:
            break

    print results


if __name__ == "__main__":
    main()
Update
It seems like in Python you've got a few options for timeouts of this nature:
Use SIGALRM, which works great if you have full control of the signals used by the process, but is probably a mistake when you're running in a container like uWSGI (see the sketch after this list).
Threads, which give you limited timeout control. Depending on your container environment (like uWSGI) you might need to set options to enable them.
Subprocesses, which give you full timeout control, but you need to be conscious of how they might change how your service consumes resources.
Use existing network timeouts. For example, if part of your healthcheck is to use Celery workers, you could rely on AsyncResult's timeout parameter to bound execution.
Do nothing! Log at regular intervals. Analyze later.
I'm exploring the benefits of these different options more.
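To make the first option concrete, here is a minimal sketch of a signal-based context manager matching the with timeout(seconds=30) shape imagined above. It assumes Python 3 on Unix, must run on the main thread (a SIGALRM restriction), and HealthcheckTimeout is an illustrative name:

import signal
from contextlib import contextmanager

class HealthcheckTimeout(Exception):
    pass

@contextmanager
def timeout(seconds):
    # The handler interrupts whatever the main thread is doing
    # by raising an exception out of the current stack frame.
    def handle_alarm(signum, frame):
        raise HealthcheckTimeout("timed out after %d seconds" % seconds)

    old_handler = signal.signal(signal.SIGALRM, handle_alarm)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)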
Update #2
I put together a GitHub repo with quite a bit more information on the topic:
https://github.com/johnboxall/pytimeout
I'll type it up into an answer one day, but the TLDR is here:
https://github.com/johnboxall/pytimeout#recommendations

twisted self.transport.write working inside loop

I have the following client code, which sends some data to the server every 8 seconds:
class EchoClient(LineReceiver):
    def connectionMade(self):
        makeByteList()
        self.transport.write(binascii.unhexlify("7777"))
        while 1:
            print "hello"
            lep = random.randint(0, 4)
            print lep
            print binascii.unhexlify(sendHexBytes(lep))
            try:
                self.transport.write("Hello")
                self.transport.write(binascii.unhexlify(sendHexBytes(lep)))
            except Exception, ex1:
                print "Failed to send"
            time.sleep(8)

    def lineReceived(self, line):
        pass

    def dataReceived(self, data):
        print "receive:", data
Every statement inside the while loop executes except self.transport.write - the server doesn't receive any data. The self.transport.write outside the while loop doesn't take effect either. In both cases no exception is raised, but if I remove the while loop, the statement outside the loop executes correctly. Why is this happening? Please correct me where I'm making a mistake.
All methods in Twisted are asynchronous. Methods such as connectionMade and lineReceived all happen on the same thread. The Twisted reactor runs a loop (called an event loop) and it calls methods such as connectionMade and lineReceived when those events happen.
You have an infinite loop in connectionMade. Once Python gets into that loop, it can never get out. Twisted calls connectionMade when connection is established, and your code stays there forever. Twisted has no opportunity to actually write the data to the transport, or receive data, it is stuck in connectionMade!
When you write Twisted code, the important point that you must understand is that you may not block on the Twisted thread. For example, let's say I want to send a "Hello" 4 seconds after a client connects. I might write this:
class EchoClient(LineReceiver):
    def connectionMade(self):
        time.sleep(4)
        self.transport.write("Hello")
but this would be wrong. What happens if 2 clients connect at the same time? The first client will go into connectionMade, and my program will hang for 4 seconds until the "Hello" is sent.
The Twisted way to do this would be like this:
class EchoClient(LineReceiver):
    def connectionMade(self):
        reactor.callLater(4, self.sendHello)

    def sendHello(self):
        self.transport.write("Hello")
Now, when Twisted enters connectionMade, it calls reactor.callLater to schedule an event 4 seconds in the future. Then it exits connectionMade and continues doing all the other stuff it needs to do. Until you grasp the concept of async programming, you won't get far with Twisted. I suggest you read through the Twisted documentation.
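For the original every-8-seconds use case, a repeating timer is the natural fit. A minimal sketch using twisted.internet.task.LoopingCall (a real Twisted API); sendData is an illustrative name, and the payload here is just a placeholder:

from twisted.internet.task import LoopingCall

class EchoClient(LineReceiver):
    def connectionMade(self):
        # Schedule sendData every 8 seconds without ever blocking
        # the reactor thread.
        self.loop = LoopingCall(self.sendData)
        self.loop.start(8.0)

    def sendData(self):
        self.transport.write("Hello")

    def connectionLost(self, reason):
        # Stop the timer when the connection goes away.
        self.loop.stop()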
Finally, an unrelated note: if you use a LineReceiver, you should not implement your own dataReceived, or lineReceived will never be called. LineReceiver is a protocol that implements its own dataReceived, which buffers incoming data, breaks it up into lines, and calls lineReceived.