Recently, I have been going though celery & kombu documentation as i need them integrated in one of my projects. I have a basic understanding of how this should work but documentation examples using different brokers have me confused.
Here is the scenario:
Within my application i have two views ViewA and ViewB both of them does some expensive processing, so i wanted to have them use celery tasks for processing. So this is what i did.
views.py
def ViewA(request):
tasks.do_task_a.apply_async(args=[a, b])
def ViewB(request):
tasks.do_task_b.apply_async(args=[a, b])
tasks.py
#task()
def do_task_a(a, b):
# Do something Expensive
#task()
def do_task_b(a, b):
# Do something Expensive here too
Until now, everything is working fine. The problem is that do_task_a creates a txt file on the system, which i need to use in do_task_b. Now, in the do_task_b method i can check for the file existence and call the tasks retry method [which is what i am doing right now] if the file does not exist.
Here, I would rather want to take a different approach (i.e. where messaging comes in). I would want do_task_a to send a message to do_task_b once the file has been created instead of looping the retry method until the file is created.
I read through the documentation of celery and kombu and updated my settings as follows.
BROKER_URL = "django://"
CELERY_RESULT_BACKEND = "database"
CELERY_RESULT_DBURI = "sqlite:///celery"
TASK_RETRY_DELAY = 30 #Define Time in Seconds
DATABASE_ROUTERS = ['portal.db_routers.CeleryRouter']
CELERY_QUEUES = (
Queue('filecreation', exchange=exchanges.genex, routing_key='file.create'),
)
CELERY_ROUTES = ('celeryconf.routers.CeleryTaskRouter',)
and i am stuck here.
don't know where to go from here.
What should i do next to make do_task_a to broadcast a message to do_task_b on file creation ? and what should i do to make do_task_b receive (consume) the message and process the code further ??
Any Ideas and suggestions are welcome.
This is a good example for using Celery's callback/linking function.
Celery supports linking tasks together so that one task follows another.
You can read more about it here
apply_async() functions has two optional arguments
+link : excute the linked function on success
+link_error : excute the linked function on an error
#task
def add(a, b):
return a + b
#task
def total(numbers):
return sum(numbers)
#task
def error_handler(uuid):
result = AsyncResult(uuid)
exc = result.get(propagate=False)
print('Task %r raised exception: %r\n%r' % (exc, result.traceback))
Now in your calling function do something like
def main():
#for error_handling
add.apply_async((2, 2), link_error=error_handler.subtask())
#for linking 2 tasks
add.apply_async((2, 2), link=add.subtask((8, )))
# output 12
#what you can do is your case is something like this.
if user_requires:
add.apply_async((2, 2), link=add.subtask((8, )))
else:
add.apply_async((2, 2))
Hope this is helpful
Related
I'm trying to figure out the best solution to increase the speed of my lambda function by running my code in parallel as always my loop doing same thing over and over again is there a solution ? or way?
This is prime example for multithreading.
Taken from python std lib:
import concurrent.futures
import urllib.request
URLS = ['http://www.foxnews.com/',
'http://www.cnn.com/',
'http://europe.wsj.com/',
'http://www.bbc.co.uk/',
'http://some-made-up-domain.com/']
# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
with urllib.request.urlopen(url, timeout=timeout) as conn:
return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
# Start the load operations and mark each future with its URL
future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
for future in concurrent.futures.as_completed(future_to_url):
url = future_to_url[future]
try:
data = future.result()
except Exception as exc:
print('%r generated an exception: %s' % (url, exc))
else:
print('%r page is %d bytes' % (url, len(data)))
You can easily adapt it for your needs.
One caveat is that you cannot open more than 1000 threads in AWS Lambda, so be mindful of that.
I am running a rosnode with a kalman filter running. The kalman filter is an object with states that get updated as time plays out. Conventionally, a ros node has a run(self) method that runs at a specified frequency using the while condition
while not rospy.is_shutdown():
do this
Going through each loop my kalman filter object updates. I just want to be able to save the kalman filter object when the node is shutdown either some external condition or when the user presses ctrl+C. I am not able to do this. In the run(self) method, I tried
while not rospy.is_shutdown():
do this
# save in file
output = pathlib.Path('path/to/location')
results_path = output.with_suffix('.npz')
with open(results_path, 'xb') as results_file:
np.savez(results_file,kfObj=kf_list)
But it has not worked. Is it not executing the save command? If ctrl+C is pressed does it stop short of executing it? Whats the way to do it?
Check out the atexit module:
http://docs.python.org/library/atexit.html
import atexit
def exit_handler():
output = pathlib.Path('path/to/location')
results_path = output.with_suffix('.npz')
with open(results_path, 'xb') as results_file:
np.savez(results_file,kfObj=kf_list
atexit.register(exit_handler)
Just be aware that this works great for normal termination of the script, but it won't get called in all cases (e.g. fatal internal errors).
Why not try the following python example class structure engaging shutdown hooks :
import rospy
class Hardware_Interface:
def __init__(self, selectedBoard):
...
# Housekeeping, cleanup at the end
rospy.on_shutdown(self.shutdown)
# Get the connection settings from the parameter server
self.port = rospy.get_param("~"+self.board+"-port", "/dev/ttyACM0")
# Get the prefix
self.prefix = rospy.get_param("~"+self.board+"-prefix", "travel")
# Overall loop rate
self.rate = int(rospy.get_param("~rate", 5))
self.period = rospy.Duration(1/float(self.rate))
...
def shutdown(self):
rospy.loginfo("Shutting down Hardware Interface Node...")
try:
rospy.loginfo("Stopping the robot...")
self.controller.send(0, 0, 0, 0)
#self.cmd_vel_pub.publish(Twist())
rospy.sleep(2)
except:
rospy.loginfo("Cannot stop!")
try:
self.controller.close()
except:
pass
finally:
rospy.loginfo("Serial port closed.")
os._exit(0)
This just an extract from a personal script, please modify it to your needs. I imagine that the on_shutdown will do the trick. Another similar approach comes from my friends in the Robot Ignite Academy in The Construct and seems like that
#!/usr/bin/env python
import rospy
from geometry_msgs.msg import Twist
class my_class():
def __init__(self):
...
self.cmd = Twist()
self.ctrl_c = False
self.rate = rospy.Rate(10) # 10hz
rospy.on_shutdown(self.shutdownhook)
def publish_once_in_cmd_vel(self):
while not self.ctrl_c:
...
def shutdownhook(self):
# works better than the rospy.is_shutdown()
self.ctrl_c = True
def move_something(self, linear_speed=0.2, angular_speed=0.2):
self.cmd.linear.x = linear_speed
self.cmd.angular.z = angular_speed
self.publish_once_in_cmd_vel()
if __name__ == '__main__':
rospy.init_node('class_test', anonymous=True)
...
This is obviously a sample of their code (for more please join the academy)
in my Django project with celery, I have celery task function that needs to be received all incoming tasks but starts it step by step like Singleton.
I can do this like:
#shared_task(bind=True)
def make_some_task(self, event_id):
lock_name = os.path.join(settings.BASE_DIR, 'create_lock')
is_exists = os.path.exists(lock_name)
while is_exists:
time.sleep(10)
with open('create_lock', 'w') as file:
file.write('locked')
..... do some staff.....
os.remove(lock_name)
but I think this is not the correct way to use this inside Celery, I think must be the better way to implement this
I have a tool, where i am implementing upnp discovery of devices connected in network.
For that i have written a script and used datagram class in it.
Implementation:
whenever scan button is pressed on tool, it will run that upnp script and will list the devices in the box created in tool.
This was working fine.
But when i again press the scan button, it gives me following error:
Traceback (most recent call last):
File "tool\ui\main.py", line 508, in updateDevices
upnp_script.main("server", localHostAddress)
File "tool\ui\upnp_script.py", line 90, in main
reactor.run()
File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 1191, in run
self.startRunning(installSignalHandlers=installSignalHandlers)
File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 1171, in startRunning
ReactorBase.startRunning(self)
File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 683, in startRunning
raise error.ReactorNotRestartable()
twisted.internet.error.ReactorNotRestartable
Main function of upnp script:
def main(mode, iface):
klass = Server if mode == 'server' else Client
obj = klass
obj(iface)
reactor.run()
There is server class which is sending M-search command(upnp) for discovering devices.
MS = 'M-SEARCH * HTTP/1.1\r\nHOST: %s:%d\r\nMAN: "ssdp:discover"\r\nMX: 2\r\nST: ssdp:all\r\n\r\n' % (SSDP_ADDR, SSDP_PORT)
In server class constructor, after sending m-search i am stooping reactor
reactor.callLater(10, reactor.stop)
From google i found that, we cannot restart a reactor beacause it is its limitation.
http://twistedmatrix.com/trac/wiki/FrequentlyAskedQuestions#WhycanttheTwistedsreactorberestarted
Please guide me how can i modify my code so that i am able to scan devices more than 1 time and don't get this "reactor not restartable error"
In response to "Please guide me how can i modify my code...", you haven't provided enough code that I would know how to specifically guide you, I would need to understand the (twisted part) of the logic around your scan/search.
If I were to offer a generic design/pattern/mental-model for the "twisted reactor" though, I would say think of it as your programs main loop. (thinking about the reactor that way is what makes the problem obvious to me anyway...)
I.E. most long running programs have a form something like
def main():
while(True):
check_and_update_some_stuff()
sleep 10
That same code in twisted is more like:
def main():
# the LoopingCall adds the given function to the reactor loop
l = task.LoopingCall(check_and_update_some_stuff)
l.start(10.0)
reactor.run() # <--- this is the endless while loop
If you think of the reactor as "the endless loop that makes up the main() of my program" then you'll understand why no-one is bothering to add support for "restarting" the reactor. Why would you want to restart an endless loop? Instead of stopping the core of your program, you should instead only surgically stop the task inside that is complete, leaving the main loop untouched.
You seem to be implying that the current code will keep "sending m-search"s endlessly when the reactor is running. So change your sending code so it stops repeating the "send" (... I can't tell you how to do this because you didn't provide code, but for instance, a LoopingCall can be turned off by calling its .stop method.
Runnable example as follows:
#!/usr/bin/python
from twisted.internet import task
from twisted.internet import reactor
from twisted.internet.protocol import Protocol, ServerFactory
class PollingIOThingy(object):
def __init__(self):
self.sendingcallback = None # Note I'm pushing sendToAll into here in main()
self.l = None # Also being pushed in from main()
self.iotries = 0
def pollingtry(self):
self.iotries += 1
if self.iotries > 5:
print "stoping this task"
self.l.stop()
return()
print "Polling runs: " + str(self.iotries)
if self.sendingcallback:
self.sendingcallback("Polling runs: " + str(self.iotries) + "\n")
class MyClientConnections(Protocol):
def connectionMade(self):
print "Got new client!"
self.factory.clients.append(self)
def connectionLost(self, reason):
print "Lost a client!"
self.factory.clients.remove(self)
class MyServerFactory(ServerFactory):
protocol = MyClientConnections
def __init__(self):
self.clients = []
def sendToAll(self, message):
for c in self.clients:
c.transport.write(message)
# Normally I would define a class of ServerFactory here but I'm going to
# hack it into main() as they do in the twisted chat, to make things shorter
def main():
client_connection_factory = MyServerFactory()
polling_stuff = PollingIOThingy()
# the following line is what this example is all about:
polling_stuff.sendingcallback = client_connection_factory.sendToAll
# push the client connections send def into my polling class
# if you want to run something ever second (instead of 1 second after
# the end of your last code run, which could vary) do:
l = task.LoopingCall(polling_stuff.pollingtry)
polling_stuff.l = l
l.start(1.0)
# from: https://twistedmatrix.com/documents/12.3.0/core/howto/time.html
reactor.listenTCP(5000, client_connection_factory)
reactor.run()
if __name__ == '__main__':
main()
This script has extra cruft in it that you might not care about, so just focus on the self.l.stop() in PollingIOThingys polling try method and the l related stuff in main() to illustrates the point.
(this code comes from SO: Persistent connection in twisted check that question if you want to know what the extra bits are about)
I am using Python 2.7 and Jenkins.
I am writing some code in Python that will perform a checkin and wait/poll for Jenkins job to be complete. I would like some thoughts on around how I achieve it.
Python function to create a check-in in Perforce-> This can be easily done as P4 has CLI
Python code to detect when a build got triggered -> I have the changelist and the job number. How do I poll the Jenkins API for the build log to check if it has the appropriate changelists? The output of this step is a build url which is carrying out the job
How do I wait till the Jenkins job is complete?
Can I use snippets from the Jenkins Rest API or from Python Jenkins module?
If you need to know if the job is finished, the buildNumber and buildTimestamp are not enough.
This is the gist of how I find out if a job is complete, I have it in ruby but not python so perhaps someone could update this into real code.
lastBuild = get jenkins/job/myJob/lastBuild/buildNumber
get jenkins/job/myJob/lastBuild/build?token=gogogo
currentBuild = get jenkins/job/myJob/lastBuild/buildNumber
while currentBuild == lastBuild
sleep 1
thisBuild = get jenkins/job/myJob/lastBuild/buildNumber
buildInfo = get jenkins/job/myJob/[thisBuild]/api/xml?depth=0
while buildInfo["freeStyleBuild/building"] == true
buildInfo = get jenkins/job/myJob/[thisBuild]/api/xml?depth=0
sleep 1
ie. I found I needed to A) wait until the build starts (new build number) and B) wait until the building finishes (building is false).
You can query the last build timestamp to determine if the build finished. Compare it to what it was just before you triggered the build, and see when it changes. To get the timestamp, add /lastBuild/buildTimestamp to your job URL
As a matter of fact, in your Jenkins, add /lastBuild/api/ to any Job, and you will see a lot of API information. It even has Python API, but I not familiar with that so can't help you further
However, if you were using XML, you can add lastBuild/api/xml?depth=0 and inside the XML, you can see the <changeSet> object with list of revisions/commit messages that triggered the build
Simple solution using invoke and block_until_complete methods (tested with Python 3.7)
import jenkinsapi
from jenkinsapi.jenkins import Jenkins
...
server = Jenkins(jenkinsUrl, username=jenkinsUser,
password=jenkinsToken, ssl_verify=sslVerifyFlag)
job = server.create_job(jobName, None)
queue = job.invoke()
queue.block_until_complete()
Inpsired by a test method in pycontribs
This snippet starts build job and wait until job is done.
It is easy to start the job but we need some kind of logic to know when job is done. First we need to wait for job ID to be applied and than we can query job for details:
from jenkinsapi import jenkins
server = jenkins.Jenkins(jenkinsurl, username=username, password='******')
job = server.get_job(j_name)
prev_id = job.get_last_buildnumber()
server.build_job(j_name)
while True:
print('Waiting for build to start...')
if prev_id != job.get_last_buildnumber():
break
time.sleep(3)
print('Running...')
last_build = job.get_last_build()
while last_build.is_running():
time.sleep(1)
print(str(last_build.get_status()))
Don't know if this was available at the time of the question, but jenkinsapi module's Job.invoke() and/or Jenkins.build_job() return a QueueItem object, which can block_until_building(), or block_until_complete()
jobq = server.build_job(job_name, job_params)
jobq.block_until_building()
print("Job %s (%s) is building." % (jobq.get_job_name(), jobq.get_build_number()))
jobq.block_until_complete(5) # check every 5s instead of the default 15
print("Job complete, %s" % jobq.get_build().get_status())
Was going through the same problem and this worked for me, using python3 and python-jenkins.
while "".join([d['color'] for d in j.get_jobs() if d['name'] == "job_name"]) == 'blue_anime':
print('Job is Running')
time.sleep(1)
print('Job Over!!')
Working Github Script: Link
This is working for me
#!/usr/bin/env python
import jenkins
import time
server = jenkins.Jenkins('https://jenkinsurl/', username='xxxxx', password='xxxxxx')
j_name = 'test'
server.build_job(j_name, {'testparam1': 'test', 'testparam2': 'test'})
while True:
print('Running....')
if server.get_job_info(j_name)['lastCompletedBuild']['number'] == server.get_job_info(j_name)['lastBuild']['number']:
print "Last ID %s, Current ID %s" % (server.get_job_info(j_name)['lastCompletedBuild']['number'], server.get_job_info(j_name)['lastBuild']['number'])
break
time.sleep(3)
print('Stop....')
console_output = server.get_build_console_output(j_name, server.get_job_info(j_name)['lastBuild']['number'])
print console_output
the issue main issue that the build_job doesn't return the number of the job, returns the number of a queue item (that only last 5 min). so the trick is
build_job
get the queue number,
with the queue number get the job_number
now we know the name of the job and the job number
get_job_info and loop the jobs till we find one with our job number
check the status
so i made a function for it with time_out
import time
from datetime import datetime, timedelta
import jenkins
def launch_job(jenkins_connection, job_name, parameters={}, wait=False, interval=30, time_out=7200):
"""
Create a jenkins job and waits for the job to finish
:param jenkins_connection: jenkins server jenkins object
:param job_name: the name of job we want to create and see if finish string
:param parameters: the parameters of the job to build directory
:param wait: if we want to wait for the job to finish or not bool
:param interval: how often we want to monitor seconds int
:param time_out: break the loop after certain X seconds int
:return: build job number int
"""
# we lunch the job and returns a queue_id
job_id = jenkins_connection.build_job(job_name, parameters)
# from the queue_id we get the job number that was created
queue_job = jenkins_connection.get_queue_item(job_id, depth=0)
build_number = queue_job["executable"]["number"]
print(f"job_name: {job_name} build_number: {build_number}")
if wait is True:
now = datetime.now()
later = now + timedelta(seconds=time_out)
while True:
# we check current time vs the timeout(later)
if datetime.now() > later:
raise ValueError(f"Job: {job_name}:{build_number} is running for more than {time_out} we"
f"stop monitoring the job, you can check it in Jenkins")
b = jenkins_connection.get_job_info(job_name, depth=1, fetch_all_builds=False)
for i in b["builds"]:
loop_id = i["id"]
if int(loop_id) == build_number:
result = (i["result"])
print(f"result: {result}") # in the json looks like null
if result is not None:
return i
# break
time.sleep(interval)
# return result
return build_number
after we ask jenkins to build the job>get queue#>get job#> loop the info and get the status till change from None to something else.
if works will return the directory with the information of that job. (hope the jenkins library could implement something like this.)