How to make branches in an Apache Airflow DAG? - airflow-scheduler

I have a DAG like this (this is semi-pseudocode); I want to execute the tasks in different branches based on their output.
# This is a method that returns a or b
def dosth():
    .....
    return a or b

t1 = PythonOperator(
    't1',
    python_callable=dosth
)

branchA = BashOperator(
    'branchA', ....
)

branchB = BashOperator(
    'branchB', ....
)
What I want: if dosth returns a, the DAG should execute the task in branchA; if it returns b, it should execute the task in branchB. Does anyone know how we can approach this?

Check this doc about Branching: https://airflow.apache.org/docs/stable/concepts.html?highlight=branch#branching
You need to use BranchPythonOperator, where you can specify the condition to be evaluated to decide which task should run next.
Example based on your semi-pseudocode:
def dosth():
    if some_condition:
        return 'branchA'
    else:
        return 'branchB'

t1 = BranchPythonOperator(
    task_id='t1',
    provide_context=True,
    python_callable=dosth,
    dag=dag)

branchA = BashOperator(
    'branchA', ....
)

branchB = BashOperator(
    'branchB', ....
)

# both branches must be direct downstreams of the branching task
t1 >> [branchA, branchB]
The function you pass to python_callable should return the task_id of the next task that should run.
Another Example:
def branch_func(**kwargs):
    ti = kwargs['ti']
    xcom_value = int(ti.xcom_pull(task_ids='start_task'))
    if xcom_value >= 5:
        return 'continue_task'
    else:
        return 'stop_task'

start_op = BashOperator(
    task_id='start_task',
    bash_command="echo 5",
    xcom_push=True,
    dag=dag)

branch_op = BranchPythonOperator(
    task_id='branch_task',
    provide_context=True,
    python_callable=branch_func,
    dag=dag)

continue_op = DummyOperator(task_id='continue_task', dag=dag)
stop_op = DummyOperator(task_id='stop_task', dag=dag)

start_op >> branch_op >> [continue_op, stop_op]
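As a side note, if the branches later rejoin into a common downstream task, that join task is skipped by default because one of its upstream tasks gets skipped. A minimal sketch of a join, assuming a recent Airflow where the trigger rule is named none_failed_min_one_success (older releases call it none_failed_or_skipped):

join_op = DummyOperator(
    task_id='join_task',
    trigger_rule='none_failed_min_one_success',  # assumption: newer Airflow naming
    dag=dag)

# the join runs after whichever branch was not skipped
[continue_op, stop_op] >> join_op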

Related

How to use a returned value outside an operator/tasks

I have a DAG like the one below and I need to use the value returned from a Python operator outside the tasks. How do I achieve this?
dag = DAG(
    dag_id='example_batch_submit_job',
    schedule_interval=None,
    start_date=datetime(2022, 7, 27),
    tags=['batch_job'],
    catchup=False)

def get_inputs(**kwargs):
    num_jobs = kwargs['dag_run'].conf['num_jobs']
    return num_jobs

run_this = PythonOperator(
    task_id='get_input',
    provide_context=True,
    python_callable=get_inputs,
    dag=dag,
)

jobs = num_jobs  # <------ How do I pass the returned value here?
for job in jobs:
    submit_batch_job = BatchOperator(
        task_id=f'submit_batch_job_{job}',
        job_name=JOB_NAME,
        job_queue=JOB_QUEUE,
        job_definition=JOB_DEFINITION,
        parameters={}
    )
For Airflow < 2.3.0:
from airflow.decorators import task
from airflow.operators.python import get_current_context

@task
def make_list(count):
    context = get_current_context()
    for i in range(count):
        t = BatchOperator(
            task_id=f"submit_batch_job_{i}",
            job_name=JOB_NAME,
            job_queue=JOB_QUEUE,
            job_definition=JOB_DEFINITION,
            parameters={},
            overrides={},
        )
        t.execute(context)

job_list = make_list("{{ ti.xcom_pull(task_ids='get_input', key='return_value') }}")

run_this >> job_list
For Airflow >= 2.3.0:
You can use Dynamic Task Mapping, which creates the number of tasks dynamically according to a parameter resolved at execution time.
@task
def make_list(count):
    return [i for i in range(count)]

job_list = make_list("{{ ti.xcom_pull(task_ids='get_input', key='return_value') }}")

batch = BatchOperator.partial(
    task_id="submit_batch_job",
    job_name=JOB_NAME,
    job_queue=JOB_QUEUE,
    job_definition=JOB_DEFINITION,
    parameters={}
).expand(job_id=job_list)

run_this >> job_list >> batch
Also, note that num_jobs is a str unless you set render_template_as_native_obj=True on your DAG; if you don't, you just need to cast it: int(count).
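A minimal sketch of that cast inside make_list, assuming render_template_as_native_obj is left unset so the templated value arrives as a string:

@task
def make_list(count):
    # count is the rendered template string, e.g. "3", so cast it before building the list
    return list(range(int(count)))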

Batch Prediction Job non-blocking

I am running a Vertex AI batch prediction job using the Python API.
The function I am using is from the Google Cloud docs:
import datetime
from typing import Sequence, Union

from google.cloud import aiplatform
from google.oauth2 import service_account

def create_batch_prediction_job_dedicated_resources_sample(
    key_path,
    project: str,
    location: str,
    model_display_name: str,
    job_display_name: str,
    gcs_source: Union[str, Sequence[str]],
    gcs_destination: str,
    machine_type: str = "n1-standard-2",
    sync: bool = True,
):
    credentials = service_account.Credentials.from_service_account_file(
        key_path)
    # Initialize an aiplatform object
    aiplatform.init(project=project, location=location, credentials=credentials)
    # Get a list of Models by Model name
    models = aiplatform.Model.list(filter=f'display_name="{model_display_name}"')
    model_resource_name = models[0].resource_name
    # Get the model
    my_model = aiplatform.Model(model_resource_name)

    batch_prediction_job = my_model.batch_predict(
        job_display_name=job_display_name,
        gcs_source=gcs_source,
        gcs_destination_prefix=gcs_destination,
        machine_type=machine_type,
        sync=sync,
    )

    # batch_prediction_job.wait_for_resource_creation()
    batch_prediction_job.wait()
    print(batch_prediction_job.display_name)
    print(batch_prediction_job.resource_name)
    print(batch_prediction_job.state)
    return batch_prediction_job
datetime_today = datetime.datetime.now()
model_display_name = 'test_model'
key_path = 'vertex_key.json'
project = 'my_project'
location = 'asia-south1'
job_display_name = 'batch_prediction_' + str(datetime_today)
model_name = '1234'
gcs_source = 'gs://my_bucket/Cleaned_Data/user_item_pairs.jsonl'
gcs_destination = 'gs://my_bucket/prediction'
create_batch_prediction_job_dedicated_resources_sample(key_path, project, location,
                                                       model_display_name, job_display_name,
                                                       gcs_source, gcs_destination)
OUTPUT:
92 current state:
JobState.JOB_STATE_RUNNING
INFO:google.cloud.aiplatform.jobs:BatchPredictionJob projects/my_project/locations/asia-south1/batchPredictionJobs/37737350127597649
The above output is printed to the terminal over and over, every few seconds.
The issue I have is that the Python program calling this function keeps running until it is force-stopped. I have tried both batch_prediction_job.wait() and batch_prediction_job.wait_for_resource_creation(), with the same result.
How do I start a batch_prediction_job without waiting for it to complete, so that the program terminates just after the job has been created?
I gave you the wrong instruction in the comments: change the parameter to sync=False and the function should return right after it is executed. The sync parameter controls "whether this function call should be synchronous (wait for pipeline run to finish before terminating) or asynchronous (return immediately)".
sync=False
def create_batch_prediction_job_dedicated_resources_sample(
    # ...
    sync: bool = False,
):
UPDATE - Adding more details:
I tested this in my notebook and it is working. You have to change to sync=False AND remove/comment the following lines:
    # batch_prediction_job.wait()
    # print(batch_prediction_job.display_name)
    # print(batch_prediction_job.resource_name)
    # print(batch_prediction_job.state)
Your code, edited:
def create_batch_prediction_job_dedicated_resources_sample(
    key_path,
    project: str,
    location: str,
    model_display_name: str,
    job_display_name: str,
    gcs_source: Union[str, Sequence[str]],
    gcs_destination: str,
    machine_type: str = "n1-standard-2",
    sync: bool = False,
):
    credentials = service_account.Credentials.from_service_account_file(key_path)
    # Initialize an aiplatform object
    aiplatform.init(project=project, location=location, credentials=credentials)
    # Get a list of Models by Model name
    models = aiplatform.Model.list(filter=f'display_name="{model_display_name}"')
    model_resource_name = models[0].resource_name
    # Get the model
    my_model = aiplatform.Model(model_resource_name)

    batch_prediction_job = my_model.batch_predict(
        job_display_name=job_display_name,
        gcs_source=gcs_source,
        gcs_destination_prefix=gcs_destination,
        machine_type=machine_type,
        sync=sync,
    )
    return batch_prediction_job

datetime_today = datetime.datetime.now()

model_display_name = 'test_model'
key_path = 'vertex_key.json'
project = '<my_project_name>'
location = 'asia-south1'
job_display_name = 'batch_prediction_' + str(datetime_today)
model_name = '1234'
gcs_source = 'gs://<my_bucket_name>/Cleaned_Data/user_item_pairs.jsonl'
gcs_destination = 'gs://<my_bucket_name>/prediction'

create_batch_prediction_job_dedicated_resources_sample(key_path,
                                                       project, location,
                                                       model_display_name,
                                                       job_display_name,
                                                       gcs_source,
                                                       gcs_destination,
                                                       sync=False,)
Results with sync=False vs. sync=True: (screenshots were attached in the original answer).
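If you later want to check on the job without blocking, a minimal sketch that only reuses the calls already shown above (wait_for_resource_creation, resource_name, state):

job = create_batch_prediction_job_dedicated_resources_sample(
    key_path, project, location, model_display_name,
    job_display_name, gcs_source, gcs_destination, sync=False)

# returns as soon as the job resource exists, without waiting for the job to finish
job.wait_for_resource_creation()
print(job.resource_name, job.state)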

Flask app is keep on loading at the time of prediction(TensorRT)

This is in continuation of the question
Facing issue while running Flask app with TensorRt model on jetson nano
That is resolved, but when I run the Flask app it keeps loading and does not show the video.
code:
def callback():
    cuda.init()
    device = cuda.Device(0)
    ctx = device.make_context()
    onnx_model_path = './some.onnx'
    fp16_mode = False
    int8_mode = False
    trt_engine_path = './model_fp16_{}_int8_{}.trt'.format(fp16_mode, int8_mode)
    max_batch_size = 1
    engine = get_engine(max_batch_size, onnx_model_path, trt_engine_path, fp16_mode, int8_mode)
    context = engine.create_execution_context()
    inputs, outputs, bindings, stream = allocate_buffers(engine)
    ctx.pop()
    ## callback function ends

worker_thread = threading.Thread(target=callback())
worker_thread.start()

trt_outputs = do_inference(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    print("start in do_inferece")
    # Transfer data from CPU to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    print("before run infernce in do_inferece")
    context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    print("before output in do_inferece")
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    print("before stream synchronize in do_inferece")
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    print("before return in do_inferece")
    return [out.host for out in outputs]
Your worker_thread creates the context required for do_inference. You should call the do_inference method inside callback():
def callback():
    cuda.init()
    device = cuda.Device(0)
    ctx = device.make_context()
    onnx_model_path = './some.onnx'
    fp16_mode = False
    int8_mode = False
    trt_engine_path = './model_fp16_{}_int8_{}.trt'.format(fp16_mode, int8_mode)
    max_batch_size = 1
    engine = get_engine(max_batch_size, onnx_model_path, trt_engine_path, fp16_mode, int8_mode)
    context = engine.create_execution_context()
    inputs, outputs, bindings, stream = allocate_buffers(engine)
    trt_outputs = do_inference(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
    # post-process the trt_outputs
    ctx.pop()
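One more thing worth noting about the original snippet: threading.Thread(target=callback()) calls callback immediately on the main thread and passes its return value (None) to the Thread; pass the function object itself instead:

import threading

# pass the function, not its result, so it actually runs on the worker thread
worker_thread = threading.Thread(target=callback)
worker_thread.start()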

Django Viewflow - Return Handler Response

Following is my flow:
class APLWorkflow(Flow):
    start = (
        flow.StartFunction(function1)
        .Next(this.request_quotes)
    )
    request_quotes = (
        flow.Handler(function2)
        .Next(this.move_package)
    )
    move_package = (
        flow.Handler(function3)
        .Next(this.shipment_create)
    )
    shipment_create = (
        flow.Function(function4)
        .Next(this.end)
    )
    end = flow.End()
Following are my util functions:
def function1():
    return 1

def function2():
    return 2

def function3():
    return 3

def function4():
    return 4
The problem is that when I start the flow, it runs perfectly well. However, the response returned is that of the start node, not the last executed node.
Following is my code:
activation.prepare()
response = APLWorkflow.start.run(**some_kwargs)
activation.done()  # stops the flow at `move_package`.
print(response)  # prints 1, not 3.
How do I return the response of the last executed node, in this case the Handler (move_package)?

Airflow scheduling task "t2failure" depends on task t1

t1 = BashOperator(
    task_id='task_1',
    bash_command="Rscript Failure.R",
    dag=dag)

t2 = BashOperator(
    task_id='task_2',
    bash_command="Rscript Success.R",
    dag=dag)

t1fail = BashOperator(
    trigger_rule=TriggerRule.ONE_FAILED,
    task_id='task_1fail',
    bash_command="echo task1 failed",
    dag=dag)

t2fail = BashOperator(
    trigger_rule=TriggerRule.ONE_FAILED,
    task_id='task_2fail',
    bash_command="echo task2 failed",
    dag=dag)

tSuccess = BashOperator(
    task_id='t_Success',
    bash_command="echo task1 failed",
    dag=dag)

t2.set_upstream(t1)
tSuccess.set_upstream([t1, t2])
t2fail.set_upstream(t2)
t1fail.set_upstream(t1)
When task t1 fails to execute, the ideal situation is that t1fail is called.
But what I get is that task t2fail is also called, since its trigger rule is set to ONE_FAILED. Is there a way I can make task t2fail start only when task t2 actually runs and fails?