Airflow ignore LatestOnlyOperator - airflow-scheduler

I use LatestOnlyOperator as the first task of my scheduled DAG.
How can I rerun a dagrun that is not the latest one?

LatestOnlyOperator does not skip externally triggered DAG runs.
Go to the Tree View in the Airflow UI and click the relevant dagrun (the upper circle, not any of the squares), then click Edit, check External Trigger, and Save.
After that, click the dagrun again and choose Clear.
From the LatestOnlyOperator source:
# If the DAG Run is externally triggered, then return without
# skipping downstream tasks
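That check can be sketched in plain Python (a simplified model for illustration; the real operator inspects the current DagRun object rather than taking booleans):

```python
# Simplified model of LatestOnlyOperator's decision, for illustration only.
# The real operator inspects the current DagRun; these booleans stand in for it.
def should_skip_downstream(externally_triggered, is_latest_run):
    if externally_triggered:
        # Externally triggered runs are never skipped, which is why marking
        # "External Trigger" on an old dagrun lets a cleared rerun proceed.
        return False
    return not is_latest_run

print(should_skip_downstream(True, False))   # False: rerun proceeds
print(should_skip_downstream(False, False))  # True: downstream skipped
```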


Tasks are not performed using the admin interface

I use django-celery-beat to create the task. Everything is registered and connected; the usual Django tasks defined in beat_schedule work. I add a new task that I don't register in beat_schedule:
@app.task
def say_hi():
    print("hello test")
I go into the admin and add a Periodic task; the task is visible in Task (registered), I select it, set the interval to every minute, and save. Task (registered) is then cleared and its value appears in Task (custom).
The task itself never starts executing and the print is not displayed in the console, but Last Run Datetime is updated in the admin. What could be wrong?

Google App Engine, tasks in Task Queue are not executed automatically

My tasks are added to the Task Queue, but nothing is executed automatically. I need to click the "Run now" button to run the tasks, and then they execute without problems. Have I missed some configuration?
I use the default queue configuration, on standard App Engine with Python 2.7.
from google.appengine.api import taskqueue

taskqueue.add(
    url='/inserturl',
    params={'name': 'tablename'})
UPDATE
This documentation covers the API you are mentioning. The idea is the same: you need to specify when you want the task to be executed. In this case you have different options, such as countdown or eta. Here is the specific documentation for the method you are using to add a task to the queue (taskqueue.add).
ORIGINAL ANSWER
If you follow this tutorial to create queues and tasks, you will see it is based on the following GitHub repo. The file where the tasks are created (create_app_engine_queue_task.py) is where you should specify the time at which the task must be executed. In the tutorial, the task is finally created with the following command:
python create_app_engine_queue_task.py --project=$PROJECT_ID --location=$LOCATION_ID --queue=$QUEUE_ID --payload=hello
However, this is missing the time at which you want to execute the task; it should look like this:
python create_app_engine_queue_task.py --project=$PROJECT_ID --location=$LOCATION_ID --queue=$QUEUE_ID --payload=hello --in_seconds=["countdown" for when the task will be executed, in seconds]
Basically, the key is in this part of the code in create_app_engine_queue_task.py:
if in_seconds is not None:
    # Convert "seconds from now" into an rfc3339 datetime string.
    d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)

    # Create Timestamp protobuf.
    timestamp = timestamp_pb2.Timestamp()
    timestamp.FromDatetime(d)

    # Add the timestamp to the tasks.
    task['schedule_time'] = timestamp
If you create the task now and go to your console, you will see your task execute and disappear from the queue after the number of seconds you specified.
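The conversion in that snippet can be reproduced with the standard library alone (timestamp_pb2 is only needed to wrap the result in a protobuf); a minimal sketch:

```python
import datetime

def schedule_time(in_seconds):
    """Compute the UTC datetime "in_seconds" from now, as the sample's
    --in_seconds flag does before building the Timestamp protobuf."""
    return datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)

d = schedule_time(30)
print(d.isoformat() + "Z")  # e.g. 2024-01-01T12:00:30.000000Z
```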

Dataprep: No scheduled destinations set. Create an output to set a destination

What does this error mean?
No scheduled destinations set. Create an output to set a destination.
I am getting this error in Dataprep when I attempt to create a run schedule for my jobs. They work perfectly when I simply hit Run, but this error appears when I want to have them scheduled.
As per the Dataprep documentation (emphasis mine):
To add a scheduled execution of the recipes in your flow:
1. Define the scheduled time and interval of execution at the flow level. See Add Schedule Dialog. After the schedule has been created, you can review, edit, or delete the schedule through the Clock icon.
2. Define the scheduled destinations for each recipe through its output object. These destinations are targets for the scheduled job. See View for Outputs below.
You'll find detailed instructions on how to set these up here.

Airflow - Why is there no trigger rule "one_done"? Am I missing something?

I am trying to implement this flow in an Airflow DAG:
Task 1: check if a file exists in S3 (S3 key sensor). If no new file is found, skip to task 4.
Task 2: if task 1 meets the criteria, delete the existing file in the local folder.
Task 3: once task 2 is finished, download the S3 file into the local folder.
Task 4: in either case, update the table (using the only file in the folder).
I am not sure what trigger rule to set on task 4. If I use one_failed, the task obviously won't be executed when the file exists.
If I use all_done, it won't be executed either, because whichever path is taken, the DAG will be skipping tasks (that's the whole point).
How should I go about it? I think I am missing something here...
Thanks everyone.
UPDATE
It also seems that my S3KeySensor is not triggering a "failed" status when it times out. It appears in yellow even though the log shows "Snap, time is out".
It should be triggering a failure, according to the documentation:
" Sensor operators keep executing at a time interval and succeed when
a criteria is met and fail if and when they time out."
The message "These tasks are deadlocked: {...}" appears in the console and the DAG does not keep running, so I can't get task 4 to run. I am also trying it with a backfill using the same start and end date; is this correct?
Okay, it seems that Airflow can't have "empty paths". So you just have to add a dummy branch-false task and set trigger rule one_success on task 4.
Simple as that.
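A plain-Python sketch of why the dummy branch unblocks task 4 (a simplified model of the one_success rule for illustration, not Airflow's actual implementation):

```python
# Simplified model of the "one_success" trigger rule: task 4 fires as soon
# as at least one of its upstream tasks succeeded.
def one_success(upstream_states):
    return "success" in upstream_states

# Path A: file found, download succeeds, dummy branch is skipped.
print(one_success(["success", "skipped"]))   # True: task 4 runs
# Path B: no file, real branch skipped, dummy branch succeeds.
print(one_success(["skipped", "success"]))   # True: task 4 runs
# Without the dummy branch, path B leaves every upstream skipped,
# so task 4 can never fire and the run deadlocks.
print(one_success(["skipped", "skipped"]))   # False: deadlock
```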

Sitecore Scheduled Job: Unable to run

I am very new to Sitecore. I am trying to create a task: after creating the task I configured the command and task items in the Content Editor, but I still don't see a Run Now option for my task there. I also need help finding where the logs of scheduled jobs are written.
There are two places where you can define a custom task:
In the database
In a config file
If you decide to go with the first option:
A task item must be created under /sitecore/system/tasks/schedules in the "core" database (default behavior).
No matter what schedule you set on that item, it will never be executed unless you have the right DatabaseAgent looking after that task item.
The DatabaseAgent periodically checks task items and, if a task must be executed (based on the value set in the Scheduling field), it executes the actual code.
By default the DatabaseAgent is called every 10 minutes.
If you decide to go with the second option, check this article first.
In short, you need to define your task class and start method in the config files (check out the /sitecore/admin/showconfig.aspx page, to make sure config changes are applied successfully)
<scheduling>
  <!-- Time between checking for scheduled tasks waiting to execute -->
  <frequency>00:00:05</frequency>
  <agent type="NameSpace.TaskClass" method="MethodName" interval="00:10:00"/>
</scheduling>
As specified in the other answers, you can use a config file or the database to execute your task. However, it seems that you want to run it manually.
I have a custom module on the Sitecore Marketplace which allows you to select the task you want to run. Here is the link.
In brief, go to the Sitecore Control Panel, click Administration, and then click Run Agent.
It will open a window where you can select the task. I am assuming that the task you have implemented does not take details of the item you are on when triggering the job.