I want to create a CFschedule task to run every 30 seconds and was wondering what would happen if a Task tries to fire up while the previous task is still running - does it wait or die or ???
cfschedule doesnt care much about the content of your jobs, think of it like cron daemon on Linux. Its just the scheduling agent.
If you need to ensure only one job of a certain type is running its up to you to implement some kind of busy/locking behavior, probably using some kind of a datastore such as a relational database, plain files (ala the presence of a file indicates in-process, then delete it right before job completion, this would be a good use of the onComplete handler), redis, etc.
Related
I'm developing a Django app which relies heavily on Celery task scheduling, using Redis as backend. Tasks can be set to run at a large periods of time, as well as in a few seconds/minutes.
I've read about Redis visibility timeout and consequences of scheduling tasks with timedelta greater than visibility timeout (I'm also in the process of dealing with it in a previous project), so I'm interested if there's anything neater than my solution, which is to have another "helper" task run 5 minutes before the "main" one needs to be executed, scheduling the "main" task to run in required time, storing task id in DB, and then checking in "main" task if the stored task id is the one that is being run. The last part (with task id storing) is required as multiple runs of "helper" task could spawn a lot of "main" task instances, but with this approach each will have different task id.
I really hate how that approach sounds and how it works, as if the task is scheduled to be run a month from current time, "helper" and "main" tasks are executed up to a hundred times.
I also know that it's an open issue, so I'm interested in more a neat workaround than a solution itself.
Having tested available options, in my opinion only using RabbitMQ as broker solves the whole problem.
Although it's a viable option for me, lack of some of redis configuration parameters (e.g. pool size) makes it unusable for those who are using hosting services with some limit on opened broker connection.
We've got a little java scheduler running on AWS ECS. It's doing what cron used to do on our old monolith. it fires up (fargate) tasks in docker containers. We've got a task that runs every hour and it's quite important to us. I want to know if it crashes or fails to run for any reason (eg the java scheduler fails, or someone turns the task off).
I'm looking for a service that will alert me if it's not notified. I want to call the notification system every time the script runs successfully. Then if the alert system doesn't get the "OK" notification as expected, it shoots off an alert.
I figure this kind of service must exist, and I don't want to re-invent the wheel trying to build it myself. I guess my question is, what's it called? And where can I go to get that kind of thing? (we're using AWS obviously and we've got a pagerDuty account).
We use this approach for these types of problems. First, the task has to write a timestamp to a file in S3 or EFS. This file is the external evidence that the task ran to completion. Then you need an http based service that will read that file and calculate if the time stamp is valid ie has been updated in the last hour. This could be a simple php or nodejs script. This process is exposed to the public web eg https://example.com/heartbeat.php. This script returns a http response code of 200 if the timestamp file is present and valid, or a 500 if not. Then we use StatusCake to monitor the url, and notify us via its Pager Duty integration if there is an incident. We usually include a message in the response so a human can see the nature of the error.
This may seem tedious, but it is foolproof. Any failure anywhere along the line will be immediately notified. StatusCake has a great free service level. This approach can be used to monitor any critical task in same way. We've learned the hard way that critical cron type tasks and processes can fail for any number of reasons, and you want to know before it becomes customer critical. 24x7x365 monitoring of these types of tasks is necessary, and helps us sleep better at night.
Note: We always have a daily system test event that triggers a Pager Duty notification at 9am each day. For the truly paranoid, this assures that pager duty itself has not failed in some way eg misconfiguratiion etc. Our support team knows if they don't get a test alert each day, there is a problem in the notification system itself. The tech on duty has to awknowlege the incident as per SOP. If they do not awknowlege, then it escalates to the next tier, and we know we have to have a talk about response times. It keeps people on their toes. This is the final piece to insure you have robust monitoring infrastructure.
OpsGene has a heartbeat service which is basically a watch dog timer. You can configure it to call you if you don't ping them in x number of minutes.
Unfortunately I would not recommend them. I have been using them for 4 years and they have changed their account system twice and left my paid account orphaned silently. I have to find a new vendor as soon as I have some free time.
I use Django-Celery +rabbitmq to execute some asyn tasks,I define a queue 'sendmail' to execute send email task,send mail is triggered by a specific task(this task has own queue), but now I encounter a problem,after the specific task finish, the mail sometimes send at once, sometimes need 5-20minutes.I want to know what reason caused it.
Django-celery will package the taskname and param as message to rabbitmq when call task.delay().
I want to know when the message go to the rabbitmq, but use web management tool only can see total messages,can't see the every message's detail, especially the time the message reached. Django-celery log can only see the work got from broker time and execute task time.I want to know all related timepoint to sure which step the time main consumed.
Django-Celery does (I believe) report task data on a per-task basis. When you sync your database, it crates a bunch of monitoring tables which are accessible via the admin. However, in order for these tasks to be recorded in these tables, you need to run the celerycam program in the django context (python ./manage.py celerycam). The celerycam program will take "snapshots" of your tasks every second or so (by default) and record information about them. Another useful tool for monitoring is the celerymon program (which also has to run in the django context). This is a command line ncurses program that reports real-time information about tasks as they occur. Finally, rabbitmqctrl has a bunch of options that might help with monitoring.
This is a particularly useful page in the docs:
http://celery.github.com/celery/userguide/monitoring.html
Anyway, this is what I use to monitor my tasks when using celery.
I've only heard about tools like Celery, but I don't know if it fits my needs and is the best solution I can have.
Imagine a game like Travian. We initiate building and we have to wait N seconds until the construction is finished. When and how should we complete the construction?
Solution 1: Check if there are active construction every time the page loads. If queries like that takes some time we can make them asynchronous. If there are some - then complete.
However, in this way we are constantly waiting for the user to reload the page. Sure, we can use cronjob to check for constructions to be completed from time to time, but cronjobs execute once in a minute or less often. Constructions / attacks etc. must be executed as precisely as possible.
The solution above works, but has some cons. What are better and RELIABLE ways to perform actions like those I mentioned.
Moreover, let's assume that resources needs to be regenerated at X per hour speed and we need to regenerate them very precisely and pretty often. How can I achieve this without waiting for the page to be refreshed?
Finally, solution shall work in Webfaction hosting or any other shared hosting. I've heard that Celery doesn't work in Webfaction or am I mistaken?
Yes, celery have periodic tasks with seconds:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html
Also you can run tasks in time with celery's crontab
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html#crontab-schedules
Also if you need to check resources count I think it's common part for every request, so your response should looks like
{
"header": {"resources": {"wood":1, "stone":500}}
"data": {.. you real data shoud be here...}
}
You need to add header to response that will contain common information like resources count, unread messages etc and handle it properly on client.
To improve it you can use nginx + ssl + memcache backend.
I'm working on a script that creates a MySQL dump via <cfexecute> and then FTPs the SQL script to another server. I've resorted to checking once per second to see if the filesize has changed, and if it has not changed within the past five seconds I assume it has completed.
This is fine for the current application, but eventually I would like to be able to import the SQL script on the second server and provide some sort of notification that it has completed.
Is there some way to track the status of a running process?
If not, is there a way to accomplish a full DB export and import via ColdFusion alone?
Actually you may not realize it, but when you call <cfexecute> without passing a timeout attribute it defaults to '0' timeout. And if you read the docs on <cfexecute> you'd see:
If the value is 0:
ColdFusion starts a process and returns immediately. ColdFusion may
return control to the calling page
before any program output displays. To
ensure that program output displays,
set the value to 2 or higher.
So I would suggest passing a higher value for timeout which will cause ColdFusion to wait for mysqldump to complete before moving on.
Reference
Check out Event Gateways[1] for one way to deal with asynchronous operations. There's a Directory Watcher gateway that comes with CF as an example.[2]
Barring that, create some sort of batch processing facility using CF Scheduled Tasks. Add the job to a database table and have a scheduled task periodically pull jobs out of the table and execute them, reporting on the result. A second scheduled task can detect that the first completed and carry out the next step of the process.
[1] http://help.adobe.com/en_US/ColdFusion/9.0/CFMLRef/WSc3ff6d0ea77859461172e0811cbec214e3-7fa7.html
[2] http://help.adobe.com/en_US/ColdFusion/9.0/Developing/WSc3ff6d0ea77859461172e0811cbec22c24-77f7.html