Synchronous + Asynchronous steps in step function - amazon-web-services

I am very new to Step Functions and still exploring them.
I have a workflow something like this:
--Steps A to C are synchronous.
Step A
if (response is X)
Step B
else
Step C
--Need to return the response to the user here, then run the two steps below asynchronously so that the caller of the step function is not blocked.
Step D
Step E
Is it possible to achieve this? My understanding is that I append .sync to steps A, B and C and append nothing to D and E, and it should work. Am I missing anything here?
Note that all steps will be executed by activity workers only.

We can take one of two approaches.
Break the step function into two. The first three steps go into an Express step function and the last two steps into a regular (Standard) step function.
OR
Keep just one step function. Wherever we call this step function, we need to wait for the first three steps to complete before moving forward. This can be done by calling get-execution-history in a loop to grab the output of the intermediate step. Here is an answer with this approach.
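The polling loop from the second approach could look roughly like the sketch below. It is only an illustration under stated assumptions: the state names StepB and StepC, the example ARN and the FakeStepFunctions class are hypothetical, and the function accepts any client object that exposes get_execution_history (for instance a boto3 Step Functions client) so that the loop can be tried without AWS access.

```python
import time


def wait_for_state_output(sfn_client, execution_arn, state_names,
                          poll_seconds=1.0, max_polls=60):
    """Poll the execution history until one of `state_names` has exited.

    `sfn_client` is any object exposing get_execution_history(**kwargs),
    for example boto3.client("stepfunctions"). Returns the raw output string
    of the first matching state, or None if nothing matched in time.
    """
    for _ in range(max_polls):
        kwargs = {"executionArn": execution_arn}
        while True:
            page = sfn_client.get_execution_history(**kwargs)
            for event in page.get("events", []):
                details = event.get("stateExitedEventDetails")
                if details and details.get("name") in state_names:
                    return details.get("output")
            token = page.get("nextToken")
            if not token:
                break  # no more pages in this snapshot of the history
            kwargs["nextToken"] = token
        time.sleep(poll_seconds)
    return None


class FakeStepFunctions:
    """Hypothetical stand-in for the real client, used only to exercise the loop."""

    def __init__(self):
        self.calls = 0

    def get_execution_history(self, **kwargs):
        self.calls += 1
        events = [{"type": "ExecutionStarted"}]
        if self.calls >= 3:
            # From the third poll on, pretend that step C has finished.
            events.append({
                "type": "TaskStateExited",
                "stateExitedEventDetails": {
                    "name": "StepC",
                    "output": '{"status": "done"}',
                },
            })
        return {"events": events}


fake = FakeStepFunctions()
result = wait_for_state_output(
    fake, "arn:aws:states:example", {"StepB", "StepC"}, poll_seconds=0.01
)
print(result)
```

The caller returns to the user as soon as the function yields the output of B or C, while the execution itself carries on with D and E in the background.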

Related

Parallel processes with dependency

There are two parallel processes. Each process has two steps. The second step of the first process is always executed after the first step. The second step of the second process is performed only under a certain condition.
Activity diagram:
How can I reflect an additional condition: to complete the second step of the second process, the first step of the first process must be completed?
I managed:
Flaws:
No match between fork and join
If the condition of the second process is not met, the token “hangs” before join
Looking at your solution once more, I think you saw issues where there are none. You are worried about the hanging token, but that is not an issue in this case. If P22 is bypassed, the token from P11 will go directly down to the join node. P11 and P12 will also pass their tokens down with no issue, thereby creating that ghost token which gets stuck in the middle right join. Since the lower join now has two tokens, it will continue to the end, where the activity is terminated. At that point any free-running tokens (and even active actions) are terminated as well. All good.
I leave my former answer for further inspiration. But basically they will all be implemented in similar ways since they represent a gateway.
Original answer
I guess that using an event would be the best way:
This way D can only start (and finish) once the event, which is sent after A's completion, has been received.
Another way would be to use an object that stores the finalization of action A and which is read by D.
Note that the diagonal connectors through "a ready" are ObjectFlows, which UML by default does not distinguish visually (unlike SysML).
P. 374 of UML 2.5 states
Object tokens pass over ObjectFlows, carrying data through an Activity via their values, or carrying no data (null tokens). A null token can still be passed along an ObjectFlow and used like any other token. For example, an Action can output a null token to explicitly indicate that it did not produce an optional value, and a downstream DecisionNode (see sub clause 15.3) can test for this and branch accordingly.
So you can see it as a buffer holding a token; no real data needs to be stored. Basically that's the same as an event. Implementation-wise you would use a semaphore or a stream to realize that, but of course at this level you would not care too much about such details.
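As a rough illustration of the semaphore/event idea at implementation level, the sketch below uses a threading.Event as the signal sent after the first step of the first process. The step names P11, P12, P21 and P22 follow the naming used in the answer (P21 is assumed to be the first step of the second process), and the `condition` flag is a hypothetical stand-in for the guard on P22.

```python
import threading

log = []
log_lock = threading.Lock()
p11_done = threading.Event()  # plays the role of the event / null token


def record(name):
    # Append under a lock so the log order reflects real completion order.
    with log_lock:
        log.append(name)


def process_1():
    record("P11")
    p11_done.set()       # signal that P11 has completed
    record("P12")


def process_2(condition):
    record("P21")
    if condition:
        p11_done.wait()  # P22 may start only once P11 has completed
        record("P22")
    # If the condition is false, P22 is bypassed and nothing waits.


def run(condition):
    log.clear()
    p11_done.clear()
    threads = [
        threading.Thread(target=process_1),
        threading.Thread(target=process_2, args=(condition,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()         # the join: both processes must finish
    return list(log)


with_p22 = run(True)
without_p22 = run(False)
print(with_p22)
print(without_p22)
```

The interleaving of the two processes varies from run to run, but P22 never appears before P11 in the log, and when the condition is false nothing is left waiting on the event.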

R Studio: Mutating a column variable based on two selection conditions

The dataframe above represents a repeated-measures design, each participant took part in both task A and B. The condition determines which order the tasks occurred - if in condition 1, then Task A came first followed by task B, and vice-versa for condition 2.
I would like to mutate a new column in my dataframe called 'First Task'. This column must represent the scores from the task that always occurred first. For example, participant 1001 was in condition 1, so their score from task A should go into this first task column. For participant 1002, in condition 2, their score from task B should go into the first task column, and so on.
After scouring possible threads (which have always solved every need I have!) I considered using the mutate function combined with case_when (group == 1), but from there I am not sure how to properly pipe something along the lines of "select score from Task A". Alternatively, I considered using if or ifelse, which is probably the more likely piece of code for something like this?
I am after an elegant piece of code like this, as opposed to creating a new dataframe. I would greatly appreciate any thoughts or ideas on this. Let me know if this is clear (note I have simplified this image as an example to make the question clearer).
Many Thanks community

Can we send output from 1 branch to another in parallel type step function execution?

I need the output from the 2nd branch to be included as input to the 1st branch's next step, a bit like multiple inheritance. Please check the diagram to get a clear picture.
You do indeed want an inner parallel block to reconcile the two branches.
Sorry for the bad quality of my drawing.
And in this specific context, it could also mean that the step should be outside of the parallel block, if you want to have one single parallel block.

PDI - Block this step until steps finished not working

Why does my Block this step until steps finished not work? It should wait for all my insert steps before running the rest of them. Any suggestions?
All table input steps will run in parallel when you execute the transformation.
If you want to hold back the table input execution, I suggest adding one constant (i.e. 1) before the block-until step, and in the table input step adding a condition like where 1 = ? with the execute for each row option enabled.
You are possibly confusing blocking the data flow and finishing the connection. See there.
As far as I can tell from your questions over the last three months, you should really have a look here and there.
And try to move to writing Jobs (kjb) to orchestrate your transformations (ktr).

kettle etl transformation hop between steps doesn't work

I am using PDI 6 and new to PDI. I created these two tables:
create table test11 (
    a int
);
create table test12 (
    b int
);
I created a simple transformation in PDI with just two steps.
In first step:
insert into test11 (a)
select 1 as c;
In second step:
insert into test12 (b)
select 9 where 1 in (select a from test11);
I was hoping the second step would execute AFTER the first step, so that the value 9 would be inserted. But when I run it, nothing gets inserted into table test12. It looks to me as if the two steps are executed in parallel. To prove this, I eliminated the second step and put its SQL into step 1 like this:
insert into test11 (a)
select 1 as c;
insert into test12 (b)
select 9 where 1 in (select a from test11);
and it worked. So why? I thought each step was a separate unit, so the next step would wait until the previous one finished, but apparently it does not?
In PDI transformations, step initialization and execution happen in parallel. So if you have multiple steps in a single transformation, these steps will be executed in parallel and the data movement happens in round-robin fashion (by default). This is the main reason why your two Execute SQL steps do not work as expected: both steps are executed in parallel. The same is not the case with PDI jobs. Jobs run sequentially unless they are configured to run in parallel.
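To see why the order matters for these two statements, here is a small sketch that runs them in both orders. It uses an in-memory SQLite database purely as a stand-in for the real database and does not reproduce PDI's parallel execution; it only shows that the second insert adds a row only if the first one has already run.

```python
import sqlite3

# The two statements from the question, one per transformation step.
STEP_1 = "insert into test11 (a) select 1 as c"
STEP_2 = "insert into test12 (b) select 9 where 1 in (select a from test11)"


def run_steps(order):
    """Create fresh tables, run the statements in `order`, return test12 rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table test11 (a int)")
    conn.execute("create table test12 (b int)")
    for statement in order:
        conn.execute(statement)
    rows = conn.execute("select b from test12").fetchall()
    conn.close()
    return rows


step2_first = run_steps([STEP_2, STEP_1])  # what can happen when steps race
step1_first = run_steps([STEP_1, STEP_2])  # the order the question expected

print(step2_first)  # []
print(step1_first)  # [(9,)]
```

When both steps start at the same time, nothing guarantees the second ordering, which matches the empty test12 table seen in the question.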
Now for your question, you can try to do any one of the below steps:
Create two separate transformations, one per SQL step, and place them inside a job. Execute the job in sequence.
You can try using the Block this step until steps finish step in the transformation, which waits for a particular step to finish executing. This is one way to avoid parallelism in transformations. The design of your transformation will be similar to the one below:
The data grids are dummy input steps. There is no need to assign any data to them.
Hope this helps :)