How to schedule SMS using Twilio w/o using Celery? - django

I want to schedule an SMS using Twilio in Python. After reading some articles I learned about Celery,
but I opted not to use Celery and to go with Python's threading module instead. The threading module works perfectly with a dummy function, but when calling
client.api.account.messages.create(
    to="+91xxxxxxxxx3",
    from_=settings.TWILIO_CALLER_ID,
    body=message)
it sends the SMS immediately instead of after the delay.
Here is my code:
from datetime import datetime as DT  # assumed: DT is datetime.datetime
from threading import Timer
from django.conf import settings
from twilio.rest import Client
account_sid = settings.TWILIO_ACCOUNT_SID
auth_token = settings.TWILIO_AUTH_TOKEN
message = to_do
client = Client(account_sid, auth_token)
# user_given_time() extracts the user-given time from the database; it works fine.
run_at = user_given_time()
# find the current datetime, truncated to whole seconds
now = DT.now()
now = DT.strptime(str(now), '%Y-%m-%d %H:%M:%S.%f')
now = now.replace(microsecond=0)
delay = (run_at - now).total_seconds()
Timer(delay, client.api.account.messages.create(
    to="+91xxxxxxxxx3",
    from_=settings.TWILIO_CALLER_ID,
    body=to_do)).start()
So the problem is that Twilio sends the SMS immediately, but I want it to be sent after the given delay.

You are calling the function before starting your Timer and then passing the Timer its return value. You need to pass Timer the function client.api.account.messages.create itself, with its keyword arguments as a separate kwargs argument, so the timer thread can call the function when the time comes:
Timer(delay, client.api.account.messages.create,
      kwargs={'to': "+91xxxxxxxxx3",
              'from_': settings.TWILIO_CALLER_ID,
              'body': to_do}).start()
See the documentation for Timer and notice that it takes args and kwargs parameters to pass to the provided function.
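The behaviour is easy to reproduce without Twilio. Below is a minimal sketch in which send_sms is a hypothetical stand-in for client.api.account.messages.create that only records when it was called; the phone number and the 0.3 second delay are placeholders.

```python
import time
from threading import Timer

calls = []  # (seconds since start, body) for each simulated send
start = time.monotonic()


def send_sms(to, body):
    """Hypothetical stand-in for client.api.account.messages.create."""
    calls.append((time.monotonic() - start, body))


delay = 0.3  # placeholder delay in seconds

# Pass the callable and its keyword arguments separately.
# The Timer thread calls send_sms(**kwargs) itself once `delay` has elapsed.
timer = Timer(delay, send_sms, kwargs={"to": "+10000000000", "body": "delayed"})
timer.start()

# Right after start() nothing has been sent yet.
sent_before_delay = len(calls)

timer.join()  # wait for the timer thread to fire (only needed for this demo)
elapsed, body = calls[0]
print("sent %r after %.2f s" % (body, elapsed))
```

Writing Timer(delay, send_sms(to=..., body=...)) instead would run send_sms while the arguments to Timer are being evaluated, which is what happens in the question.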

Timer shouldn't be used inside a web service, for this reason:
Web requests are served using multiple threads.
Either the request waits for the timer thread to fire, which makes it blocking, or the timer thread may never get to run.
Hence I would recommend using some kind of queue for this sort of work.

Related

kafka-python: Closing the kafka producer with 0 vs inf secs timeout

I am trying to produce messages to a Kafka topic using kafka-python 2.0.1 on Python 2.7 (I can't use Python 3 due to some workplace-related limitations).
I created the class below in a separate package, compiled the package, and installed it in a virtual environment:
import json
from kafka import KafkaProducer
class KafkaSender(object):
    def __init__(self):
        self.producer = self.get_kafka_producer()
    def get_kafka_producer(self):
        return KafkaProducer(
            bootstrap_servers=['localhost:9092'],
            value_serializer=lambda x: json.dumps(x),
            request_timeout_ms=2000,
        )
    def send(self, data):
        self.producer.send("topicname", value=data)
My driver code is something like this:
from mypackage import KafkaSender
# driver code
data = {"a":"b"}
kafka_sender = KafkaSender()
kafka_sender.send(data)
Scenario 1:
I run this code and it runs just fine, with no errors, but the message is not pushed to the topic. I have confirmed this because neither the offset nor the lag increases on the topic. Also, nothing is logged at the consumer end.
Scenario 2:
I commented out the initialization of the Kafka producer in the __init__ method.
I changed the sending line from
self.producer.send("topicname", value=data) to self.get_kafka_producer().send("topicname", value=data), i.e. the Kafka producer is created not in advance (during class initialization) but right before sending the message to the topic. When I ran the code, it worked perfectly and the message was published to the topic.
My intention with scenario 1 is to create a Kafka producer once and reuse it, rather than creating a new producer every time I want to send a message. Otherwise I might end up creating millions of Kafka producer objects if I need to send millions of messages.
Can you please help me understand why the Kafka producer behaves this way?
NOTE: If I write the Kafka code and the driver code in the same file, it works fine. It fails only when I write the Kafka code in a separate package, compile it, and import it into another project.
LOGS: https://www.diffchecker.com/dTtm3u2a
Update 1: 9th May 2020, 17:20:
Removed INFO logs from the question description. I enabled the DEBUG level and here is the difference between the debug logs between first scenario and the second scenario
https://www.diffchecker.com/dTtm3u2a
Update 2: 9th May 2020, 21:28:
Upon further debugging and looking at the kafka-python source code, I was able to deduce that in scenario 1 the Kafka sender was force-closed, while in scenario 2 it was closed gracefully.
def initiate_close(self):
    """Start closing the sender (won't complete until all data is sent)."""
    self._running = False
    self._accumulator.close()
    self.wakeup()
def force_close(self):
    """Closes the sender without sending out any pending messages."""
    self._force_close = True
    self.initiate_close()
Which one runs depends on whether the Kafka producer's close() method is called with a timeout of 0 (forced close of the sender) or without a timeout (in which case the timeout takes the value float('inf') and the sender is closed gracefully).
The Kafka producer's close() method is called from its __del__ method, which runs at garbage collection. close(0) is called from a method registered with atexit, which runs when the interpreter terminates.
The question is: why is the interpreter terminating in scenario 1?
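This does not explain why the two scenarios take different close paths, but since send() only queues the record for a background sender thread, one way to stop depending on which path runs at exit is to flush explicitly before the program ends. A minimal sketch, assuming the same KafkaSender shape as in the question: the producer argument and the close() method are additions made for illustration, and kafka is imported lazily so the class can be exercised with a stand-in producer.

```python
import json


class KafkaSender(object):
    def __init__(self, producer=None):
        # `producer` is a hypothetical injection point (handy for tests);
        # by default a real KafkaProducer is created once and reused.
        self.producer = producer if producer is not None else self.get_kafka_producer()

    def get_kafka_producer(self):
        from kafka import KafkaProducer  # lazy import; requires kafka-python
        return KafkaProducer(
            bootstrap_servers=['localhost:9092'],
            value_serializer=lambda x: json.dumps(x).encode('utf-8'),
            request_timeout_ms=2000,
        )

    def send(self, data):
        # Asynchronous: the record is only handed to the background sender.
        self.producer.send("topicname", value=data)

    def close(self):
        # Block until queued records have been sent, then close gracefully.
        self.producer.flush()
        self.producer.close()
```

In the driver code this would be used as kafka_sender.send(data) followed by kafka_sender.close() before the script ends.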

Using requests library in on_failure or on_success hook causes the task to retry indefinitely

This is what I have:
import requests
import youtube_dl # in case this matters
from celery import Task
class ErrorCatchingTask(Task):
    # Request = CustomRequest
    def on_failure(self, exc, task_id, args, kwargs, einfo):
        # If I comment this out, all is well
        r = requests.post(server + "/error_status/")
        ....
@app.task(base=ErrorCatchingTask, bind=True, ignore_result=True, max_retires=1)
def process(self, param_1, param_2, param_3):
    ...
    raise IndexError
    ...
The worker throws the exception and then seemingly spawns a new task with a different task ID: Received task: process[{task_id}
Here are a couple of things I've tried:
Importing Request (from celery.worker.request import Request) and overriding the on_failure and on_success functions there instead.
app.conf.broker_transport_options = {'visibility_timeout': 99999999999}
@app.task(base=ErrorCatchingTask, bind=True, ignore_result=True, max_retires=1)
Turning off DEBUG mode
Setting logging to info
Setting CELERY_IGNORE_RESULT to false (Can I use Python requests with celery?)
import requests as apicall, to rule out a namespace conflict
Monkey patching requests (Celery + Eventlet + non blocking requests)
Moving ErrorCatchingTask into a separate file
If I don't use any of the hook functions, the worker just throws the exception and stays idle until the next task is scheduled, which is what I expect even when I use the hooks. Is this a bug? I searched through the GitHub issues but couldn't find the same problem. How do you debug a problem like this?
Django 1.11.16
celery 4.2.1
My problem was resolved after I switched to grequests.
In my case, the Celery worker would reschedule the task as soon as conn.urlopen() was called in requests/adapters.py. Another behavior I observed was that if I had a worker from another project running on the same machine, the infinite rescheduling would sometimes stop. This was probably some locking mechanism, originally intended for another purpose, kicking in.
This led me to suspect that it is indeed a threading issue. After researching whether the requests library is thread-safe, I found people suggesting different things. In theory, monkey patching should have a similar effect to using grequests, but it is not the same, so just use the grequests or erequests library instead.
Celery Debugging instruction is here

Is it possible to put a function in timed loop using django-background-task

Say I want to execute a function every 5 minutes without using a cron job.
What I am thinking of doing is creating a Django background task that calls that function, and at the end of the function creating the task again with schedule set to, say, 60*5.
This effectively puts the function in a time-based loop.
I tried a few iterations, but I am getting import errors. Is this possible or not?
No, it's not possible, as it will effectively create cyclic import problems in Django: in tasks you have to import that function, and in the file for that function you have to import tasks.
So whatever strategy you take, you are going to land in the same problem.
I made something like this. Is this what you are looking for?
import threading
import time
def worker():
    """do your stuff"""
    return
threads = list()
while True:
    # wait five minutes, then run worker() in a new thread
    time.sleep(300)
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()
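A timed loop of this kind can also be written so that it can be stopped cleanly. The following is a minimal standard-library sketch and is not specific to django-background-task: run_every and its parameters are hypothetical names, and a threading.Event serves as an interruptible sleep.

```python
import threading


def run_every(interval, func, stop_event):
    """Call func() every `interval` seconds until stop_event is set."""
    def loop():
        # Event.wait returns False when the timeout expires and True once
        # the event is set, so it works as a sleep that can be interrupted.
        while not stop_event.wait(interval):
            func()

    thread = threading.Thread(target=loop)
    thread.daemon = True  # do not keep the process alive just for this loop
    thread.start()
    return thread
```

For a five-minute loop this would be called as run_every(300, worker, stop) with stop = threading.Event(); calling stop.set() ends the loop.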

Stopping a function in Python using a timeout

I'm writing a healthcheck endpoint for my web service.
The endpoint calls a series of functions, each of which returns True if its component is working correctly.
The system is considered to be working if all the components are working:
def is_health():
    healthy = all(r for r in (database(), cache(), worker(), storage()))
    return healthy
When things aren't working, the functions may take a long time to return. For example if the database is bogged down with slow queries, database() could take more than 30 seconds to return.
The healthcheck endpoint runs in the context of a Django view, running inside a uWSGI container. If the request / response cycle takes longer than 30 seconds, the request is harakiri-ed!
This is a huge bummer, because I lose all contextual information that I could have logged about which component took a long time.
What I'd really like, is for the component functions to run within a timeout or a deadline:
with timeout(seconds=30):
    database_result = database()
    cache_result = cache()
    worker_result = worker()
    storage_result = storage()
In my imagination, as the deadline / harakiri timeout approaches, I can abort the remaining health checks and just report the work I've completed.
What's the right way to handle this sort of thing?
I've looked at threading.Thread and Queue.Queue - the idea being that I create a work and result queue, and then use a thread to consume the work queue while placing the results in result queue. Then I could use the thread's Thread.join function to stop processing the rest of the components.
The one challenge there is that I'm not sure how to hard exit the thread - I wouldn't want it hanging around forever if it didn't complete its run.
Here is the code I've got so far. Am I on the right track?
import Queue
import threading
import time
class WorkThread(threading.Thread):
    def __init__(self, work_queue, result_queue):
        super(WorkThread, self).__init__()
        self.work_queue = work_queue
        self.result_queue = result_queue
        self._timeout = threading.Event()
    def timeout(self):
        self._timeout.set()
    def timed_out(self):
        return self._timeout.is_set()
    def run(self):
        while not self.timed_out():
            try:
                work_fn, work_arg = self.work_queue.get()
                retval = work_fn(work_arg)
                self.result_queue.put(retval)
            except (Queue.Empty):
                break
def work(retval, timeout=1):
    time.sleep(timeout)
    return retval
def main():
    # Two work items that will take at least two seconds to complete.
    work_queue = Queue.Queue()
    work_queue.put_nowait([work, 1])
    work_queue.put_nowait([work, 2])
    result_queue = Queue.Queue()
    # Run the `WorkThread`. It should complete one item from the work queue
    # before it times out.
    t = WorkThread(work_queue=work_queue, result_queue=result_queue)
    t.start()
    t.join(timeout=1.1)
    t.timeout()
    results = []
    while True:
        try:
            result = result_queue.get_nowait()
            results.append(result)
        except (Queue.Empty):
            break
    print results
if __name__ == "__main__":
    main()
Update
It seems like in Python you've got a few options for timeouts of this nature:
Use SIGALRM signals, which work great if you have full control of the signals used by the process but are probably a mistake when you're running in a container like uWSGI.
Threads, which give you limited timeout control. Depending on your container environment (like uWSGI) you might need to set options to enable them.
Subprocesses, which give you full timeout control, but you need to be conscious of how they might change how your service consumes resources.
Use existing network timeouts. For example, if part of your healthcheck is to use Celery workers, you could rely on AsyncResult's timeout parameter to bound execution.
Do nothing! Log at regular intervals. Analyze later.
I'm exploring the benefits of these different options more.
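The threads option above can be sketched with concurrent.futures. This is only an illustration under some assumptions: it targets Python 3 rather than the Python 2 code in the question, run_checks and its parameters are hypothetical names, and a check that misses the deadline is abandoned rather than killed, so its thread keeps running in the background.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_checks(checks, deadline_seconds):
    """Run every check in its own thread and collect results until a shared deadline.

    `checks` maps a component name to a zero-argument callable.
    Returns name -> result, with None for checks that missed the deadline
    and False for checks that raised an exception.
    """
    deadline = time.monotonic() + deadline_seconds
    executor = ThreadPoolExecutor(max_workers=max(1, len(checks)))
    futures = {name: executor.submit(fn) for name, fn in checks.items()}
    results = {}
    for name, future in futures.items():
        remaining = max(0.0, deadline - time.monotonic())
        try:
            results[name] = future.result(timeout=remaining)
        except FutureTimeout:
            results[name] = None   # still running; log it as slow
        except Exception:
            results[name] = False  # the check itself failed
    executor.shutdown(wait=False)  # do not block on abandoned checks
    return results
```

Called as run_checks({"database": database, "cache": cache, "worker": worker, "storage": storage}, deadline_seconds=25), this would return in time to log which components were slow before a 30 second harakiri limit.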
Update #2
I put together a GitHub repo with quite a bit more information on the topic:
https://github.com/johnboxall/pytimeout
I'll type it up into an answer one day, but the TL;DR is here:
https://github.com/johnboxall/pytimeout#recommendations

Celery Storing unrecoverable task failures for later resubmission

I'm using the djkombu transport for my local development, but I will probably be using amqp (rabbit) in production.
I'd like to be able to iterate over failures of a particular type and resubmit. This would be in the case of something failing on a server or some edge case bug triggered by some new variation in data.
So I could be resubmitting jobs up to 12 hours later after some bug is fixed or a third party site is back up.
My question is: Is there a way to access old failed jobs via the result backend and simply resubmit them with the same params etc?
You can probably access old jobs using:
CELERY_RESULT_BACKEND = "database"
and in your code:
from djcelery.models import TaskMeta
task = TaskMeta.objects.filter(task_id='af3185c9-4174-4bca-0101-860ce6621234')[0]
but I'm not sure you can find the arguments that the task is being started with ... Maybe something with TaskState...
I've never used it this way. But you might want to consider the task.retry feature?
An example from celery docs:
@task()
def task(*args):
    try:
        some_work()
    except SomeException as exc:
        # Retry in 24 hours.
        raise task.retry(args=args, countdown=60 * 60 * 24, exc=exc)
From IRC
<asksol> dpn`: task args and kwargs are not stored with the result
<asksol> dpn`: but you can create your own model and store it there
(for example using the task_sent signal)
<asksol> we don't store anything when the task is sent, only send a
message. but it's very easy to do yourself
This was what I was expecting, but hoped to avoid.
At least I have an answer now :)
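The suggestion from IRC, storing the arguments yourself, can be sketched without Celery. This is an illustration only: FailedCall, record_failures and resubmit are hypothetical names, and an in-memory list stands in for a database model. With Celery, the recording step would live in a signal handler or a task base class instead of a decorator.

```python
import functools
from collections import namedtuple

FailedCall = namedtuple("FailedCall", "name args kwargs error")

failed_calls = []  # stand-in for a database table of failed jobs
registry = {}      # task name -> wrapped callable, used for resubmission


def record_failures(func):
    """Store the name and arguments of every call that raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            failed_calls.append(FailedCall(func.__name__, args, kwargs, repr(exc)))
            raise

    registry[func.__name__] = wrapper
    return wrapper


def resubmit(error_substring):
    """Re-run stored failures whose error text contains error_substring."""
    to_retry = [call for call in failed_calls if error_substring in call.error]
    for call in to_retry:
        failed_calls.remove(call)
        try:
            registry[call.name](*call.args, **call.kwargs)
        except Exception:
            pass  # the wrapper has already recorded the new failure
    return len(to_retry)
```

Filtering on the stored error text is one simple way to iterate over failures of a particular type once the underlying bug is fixed or the third-party site is back up.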