Pathos multiprocessing pool hangs - django

I'm trying to use multiprocessing inside a Docker container. However, I'm facing two issues.
(I'm using Python 2.7)
Creating ProcessingPool()/Pool() (I tried both) takes an abnormally long time, maybe over a minute or two.
After it processes the function, it hangs.
I'm basically trying to run a very simple case inside my container. Here's what I have:
from pathos.multiprocessing import ProcessingPool
import multiprocessing

class MultiprocessClassExample():
    # ...
    def worker(self, number):
        return "Printing number %s" % (number)
    # ...
    def generateNumber(self):
        PROCESSES = multiprocessing.cpu_count() - 1
        NUMBER = ['One', 'Two', 'Three', 'Four', 'Five']
        result = ProcessingPool(PROCESSES).map(self.worker, NUMBER)
        print("Finished processing.")
        print(result)
and I call it using the following code:
MultiprocessClassExample().generateNumber()
Now, this seems straightforward enough. I ran this in a Jupyter notebook and it ran without an issue. I also tried running Python inside my Docker container and running the above code there, and it went fine. So I'm assuming the problem has to do with the complete code that I have. Obviously I didn't write out all the code, but that's the main section I'm trying to handle right now.
I would expect the above code to work there as well. However, the first thing I notice is that when I call ProcessingPool(), it takes a long time. I tried the regular multiprocessing.Pool() before and had the same effect, whereas in the notebook it ran quickly and smoothly.
After waiting several minutes, it prints:
Printing number One
Printing number Two
Printing number Three
Printing number Four
Printing number Five
and that's it. It never prints out "Finished processing." and it just hangs there.
But when the print statements appear, I notice that several debug messages appear at the same time. They say:
[CRITICAL] WORKER TIMEOUT
[WARNING] Worker graceful timeout
[INFO] Worker exiting
[INFO] Booting worker with pid:
Any suggestions would be greatly appreciated.
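Note: the messages above look like a WSGI server's worker-timeout logs rather than output from the pool itself; gunicorn, for example, logs [CRITICAL] WORKER TIMEOUT when a request runs past its timeout and then kills and reboots the worker. If that is the case here, a longer worker timeout is one thing to experiment with. The sketch below is only a guess at a mitigation, and the file name and values are assumptions:

# gunicorn.conf.py (hypothetical) -- gunicorn config files are plain Python.
# Assumption: gunicorn is the server producing the WORKER TIMEOUT log lines.
timeout = 120            # seconds a worker may stay silent before being killed and rebooted
graceful_timeout = 30    # extra time a worker gets to finish shutting down
workers = 2              # number of gunicorn worker processes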

Related

Why is Celery not executing tasks in parallel in Django?

I am having an issue with Celery; I will explain with the code below.
def samplefunction(request):
    print("This is a samplefunction")
    a, b = 5, 6
    myceleryfunction.delay(a, b)
    return Response({"msg": "process execution started"})

@celery_app.task(name="sample celery", base=something)
def myceleryfunction(a, b):
    c = a + b
    my_obj = MyModel()
    my_obj.value = c
    my_obj.save()
In my case, when one person calls the Celery task, it works perfectly.
But if many people send requests, it processes them one by one.
So imagine that my Celery function "myceleryfunction" takes 3 minutes to complete the background task.
If 10 requests come in at the same time, the last one completes with a 30-minute delay.
How can I solve this issue, or is there any alternative?
Thank you
I'm assuming you are running a single worker with default settings for the worker.
This will have the worker running with worker_pool=prefork and worker_concurrency=<nr of CPUs>
If the machine it runs on only has a single CPU, you won't get any parallel running tasks.
To get parallelisation you can:
- set worker_concurrency to something > 1; this will use multiple processes in the same worker
- start additional workers
- use celery multi to start multiple workers
- when running the worker in a Docker container, add replicas of the container
See Concurrency for more info.
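For example, here is a minimal sketch of the first option, raising the concurrency of a single worker. The project name proj, the Redis broker URL, and the value 4 are assumptions, not taken from the question:

# celery.py (hypothetical project layout) -- configure one worker to run
# several prefork processes so tasks from different requests run in parallel.
from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')  # broker URL is an assumption
app.conf.worker_concurrency = 4   # up to 4 tasks run at the same time in this worker
app.conf.worker_pool = 'prefork'  # the default pool, listed here for clarity

# Roughly equivalent on the command line:
#   celery -A proj worker --concurrency=4 --loglevel=info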

Django/Celery 4.3 - jobs seem to fail randomly

These are the tasks in tasks.py:
from celery import shared_task

@shared_task
def add(x, y):
    return x * y

@shared_task
def verify_external_video(video_id, media_id, video_type):
    return True
I am calling verify_external_video 1000+ times from a custom Django command that I run from the CLI:
verify_external_video.delay("1", "2", "3")
In Flower, I am then monitoring the success or failure of the jobs. A random number of jobs fail, others succeed...
Those that fail do so for two reasons that I just cannot understand:
NotRegistered('lstv_api_v1.tasks.verify_external_video')
If it's not registered, why are 371 of them succeeding?
and...
TypeError: verify_external_video() takes 1 positional argument but 3 were given
Again, a mystery, as I quit Celery and Flower and ran them AGAIN from scratch before running my CLI Django command. There is no code anywhere in which verify_external_video() takes 1 parameter. And if that were the case... why are SOME of the calls successful?
This type of failure isn't sequential. I can have 3 successful jobs, followed by one that does not succeed, followed by success again, so it's not a timing issue.
I'm at a loss here.
In short: I had a number of rogue Celery processes still running around from previous "violent" Ctrl-C's, which prevented graceful termination of what was running; those stale workers were still consuming from the queue with old code, which is why only some of the tasks failed.
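If you suspect the same thing, a quick sanity check is to look for leftover worker processes before starting a fresh one. A minimal sketch, assuming psutil is installed (inspecting the process list by hand works just as well):

# List any Celery worker processes that survived an earlier Ctrl-C.
# Assumption: psutil is available; otherwise check the process list manually.
import psutil

for proc in psutil.process_iter(['pid', 'cmdline']):
    cmdline = ' '.join(proc.info['cmdline'] or [])
    if 'celery' in cmdline and 'worker' in cmdline:
        print(proc.info['pid'], cmdline)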

How to run 10 processes at a time from a list of 1000 processes in python 2.7

import multiprocessing

def get_url(url):
    # conditions
    pass

threads = []
for url in urls:  # urls is the list of URLs to process
    thread = multiprocessing.Process(target=get_url, args=(url,))
    threads.append(thread)
for st in threads:
    st.start()
Now I want to execute 10 requests at a time; once those 10 are completed, pick another 10, and so on. I was going through the documentation but haven't found a matching use case. I am using this module for the first time. Any help would be appreciated.
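One common way to get this behaviour, sketched here as a suggestion rather than a definitive answer, is to let multiprocessing.Pool do the batching: with 10 worker processes, at most 10 URLs are handled at once and the rest wait for a free worker. The urls list below is a placeholder:

import multiprocessing

def get_url(url):
    # placeholder for the real fetching logic
    return url

if __name__ == '__main__':
    urls = ['http://example.com/page%d' % i for i in range(1000)]  # hypothetical input
    pool = multiprocessing.Pool(processes=10)  # only 10 worker processes exist
    results = pool.map(get_url, urls)          # each worker picks up the next URL as it becomes free
    pool.close()
    pool.join()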

APScheduler not executing the job at the specified time

I wrote code to gather data from an online source at 1-hour intervals, starting from 12 o'clock. I have Python 2.7.12 on a Mac with APScheduler version 3.3.0.
My code consists of two functions, as below:
1- A Main function, executed every hour using the 'cron' scheduling type
2- A Check function, executed every 2 minutes using the 'interval' scheduling type
from apscheduler.schedulers.background import BackgroundScheduler

def Main():
    # do main stuff
    pass

def Check():
    # check what has been done in Main
    pass

scheduler = BackgroundScheduler()
scheduler.add_job(Main, 'cron', month='*', day='*', day_of_week='*', hour='0-24', minute='0')
scheduler.add_job(Check, 'interval', minutes=2)
scheduler.start()
I have run this code in Python 3.5 and it works perfectly: the Main function starts when the minute hits 0 and the Check function runs every 2 minutes.
However, when I run the code in Python 2.7, the Main function starts immediately.
How can I fix this problem?
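One way to narrow this down, offered only as a debugging sketch rather than an explanation of the Python 2.7 behaviour, is to ask APScheduler when it actually plans to fire each job right after starting the scheduler:

# Debugging sketch: right after scheduler.start(), list every job with its
# trigger and its next scheduled run time. If the cron job's next run is not
# on the hour, the trigger is the problem; if it is, something else is calling Main directly.
scheduler.print_jobs()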

Django: Gracefully restart nginx + fastcgi sites to reflect code changes?

Common situation: I have a client on my server who may update some of the code in his Python project. He can SSH into his shell and pull from his repository and all is fine -- but the code is held in memory (as far as I know), so I need to actually kill the fastcgi process and restart it for the code change to take effect.
I know I can gracefully restart fcgi, but I don't want to have to do this manually. I want my client to update the code and, within 5 minutes or so, have the new code running under the fcgi process.
Thanks
First off, if uptime is important to you, I'd suggest making the client do it. It can be as simple as giving him a command called deploy-code. With your method, if there is an error in his code, fixing it requires a 10-minute turnaround (read: downtime), assuming he gets it right the second time.
That said, if you actually want to do this, you should create a daemon which will look for files modified within the last 5 minutes. If it detects one, it will execute the reboot command.
Code might look something like:
import os, time

CODE_DIR = '/tmp/foo'

while True:
    restarted = False
    time.sleep(5 * 60)  # check every 5 minutes
    for root, dirs, files in os.walk(CODE_DIR):
        if restarted:
            break
        for filename in files:
            if restarted:
                break
            updated_on = os.path.getmtime(os.path.join(root, filename))
            current_time = time.time()
            if current_time - updated_on <= 6 * 60:  # 6 min
                # 6 min could offer false negatives, but that's better
                # than false positives
                restarted = True
                print "We should execute the restart command here."