I have an AWS environment with 3 CodePipelines. Let's call them P1, P2 and P3. I'd like to run them in a specific sequence, and that sequence is determined by a Lambda function.
This Lambda function does some calculation and determines the sequence in which the pipelines need to run. So it could be:
P1 > P2 > P3
P3 > P2 > P1
P2 > P3 > P1
Each pipeline must finish successfully before the next one is run. How can I achieve this?
At first I tried to do it all in that same Lambda function, but Lambda has a 15-minute timeout. We don't know how long each pipeline is going to take; all together they could take ~30 minutes.
Also, since the sequence is dynamic, I couldn't just have one pipeline export a file to S3 and use that as a source for the next one!
Any suggestions?
CodePipeline emits events to EventBridge, so you can use that.
Basically, you would set up event rules that trigger the execution of the subsequent pipeline based on the successful completion of the previous pipeline.
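For example, a rough boto3 sketch of one such rule (pipeline names, the account ID, and the role ARN are placeholders; with a dynamic order you'd create or enable the rules from your Lambda rather than hard-code them):

```
import json
import boto3

events = boto3.client("events")

# Fire when pipeline P1 finishes successfully.
events.put_rule(
    Name="run-P2-after-P1",
    EventPattern=json.dumps({
        "source": ["aws.codepipeline"],
        "detail-type": ["CodePipeline Pipeline Execution State Change"],
        "detail": {"pipeline": ["P1"], "state": ["SUCCEEDED"]},
    }),
    State="ENABLED",
)

# Start P2 as the rule's target; the role must allow codepipeline:StartPipelineExecution.
events.put_targets(
    Rule="run-P2-after-P1",
    Targets=[{
        "Id": "start-P2",
        "Arn": "arn:aws:codepipeline:us-east-1:123456789012:P2",
        "RoleArn": "arn:aws:iam::123456789012:role/eventbridge-start-pipeline",
    }],
)
```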
Generally, EventBridge would do the trick (as mentioned by @Marcin); however, it didn't work for my particular scenario.
I resolved the issue by introducing a Step Functions state machine between the Lambda function and the pipelines.
The Lambda function passes a parameter to the Step Function. Based on that, the state machine takes a different path (a different order of pipelines to execute).
It also has a Wait state, as well as GetPipelineExecution to check status. Together they ensure that one pipeline ends successfully before the next one is triggered.
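A minimal sketch of the two Lambda tasks the state machine loops over (function and field names are my own; the Choice branches, Wait state, and status-check loop around them live in the state machine definition):

```
import boto3

cp = boto3.client("codepipeline")

def start_pipeline(event, context):
    """Task state: kick off the next pipeline in the chosen order."""
    name = event["pipeline"]                      # e.g. "P1"; supplied by the Choice branch
    resp = cp.start_pipeline_execution(name=name)
    return {"pipeline": name, "executionId": resp["pipelineExecutionId"]}

def check_pipeline(event, context):
    """Task state: polled after a Wait state until the execution finishes."""
    resp = cp.get_pipeline_execution(
        pipelineName=event["pipeline"],
        pipelineExecutionId=event["executionId"],
    )
    # Status is one of InProgress, Succeeded, Failed, Stopped, etc.
    return {**event, "status": resp["pipelineExecution"]["status"]}
```

The state machine then branches on the returned status: Succeeded moves on to the next pipeline in the chosen order, InProgress loops back to the Wait state, and anything else fails the execution.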
Hope this helps someone in future :)
Related
I'm trying to build a process like this:
In state1, it will trigger 10 Lambdas, and only when ALL 10 of those Lambdas respond or call back with their taskToken will it proceed to the next state, state2.
How to design this process?
This is a perfect scenario for the Map state. You can pass in an array of lambda function names, then add a Lambda task and use the Parameters block to set the function dynamically. And if you want them to run one at a time instead of in parallel, you can set MaxConcurrency.
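A minimal sketch of that shape, written as a Python dict you can json.dumps into a state machine definition (the input field functionNames is an assumption, and I've used the waitForTaskToken integration since the question mentions task tokens):

```
import json

definition = {
    "StartAt": "InvokeAll",
    "States": {
        "InvokeAll": {
            "Type": "Map",
            "ItemsPath": "$.functionNames",   # array of Lambda function names in the input
            "MaxConcurrency": 0,              # 0 = unlimited; set 1 to run one at a time
            "Iterator": {
                "StartAt": "InvokeOne",
                "States": {
                    "InvokeOne": {
                        "Type": "Task",
                        "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
                        "Parameters": {
                            "FunctionName.$": "$",                      # current array element
                            "Payload": {"taskToken.$": "$$.Task.Token"},
                        },
                        "End": True,
                    }
                },
            },
            "Next": "State2",
        },
        "State2": {"Type": "Pass", "End": True},
    },
}

# Each invoked Lambda must eventually call SendTaskSuccess with its token,
# so State2 only runs once all of them have reported back.
print(json.dumps(definition, indent=2))
```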
I would like to run one Lambda function that returns a list of parameters. Based on the number of parameters, I would like to trigger other Lambda functions to finish the processing individually (e.g. 100 independent sub-Lambda functions).
I would like to know how this can be done. It would be great if there are some examples on GitHub. Thanks a lot.
There are three options: first, call Invoke using the AWS SDK for your language.
Second, use Step Functions.
Third, write each parameter onto an SQS queue, and configure the second Lambda to be triggered by that queue. This is the approach that I'd use.
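A rough sketch of that third option (the queue URL and the compute_parameters helper are placeholders):

```
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-items"  # placeholder

def fan_out(event, context):
    """First Lambda: compute the parameter list, then push one message per parameter.
    The second Lambda is configured with this queue as its event source."""
    parameters = compute_parameters(event)        # hypothetical helper
    for start in range(0, len(parameters), 10):   # send_message_batch takes at most 10 entries
        entries = [
            {"Id": str(i), "MessageBody": json.dumps(param)}
            for i, param in enumerate(parameters[start:start + 10])
        ]
        sqs.send_message_batch(QueueUrl=QUEUE_URL, Entries=entries)
```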
I have nearly 1000 items in my DB. I have to run the same operation on each item. The issue is that the operation calls a third-party service with a 1-second rate limit per operation. Until now I was able to do the entire thing inside a Lambda function, but it is now getting close to the 15-minute (900-second) timeout limit.
I was wondering what the best way to split this would be. Can I dump each item (or batches of items) into SQS and have a Lambda function process them sequentially? From what I understood, that isn't the recommended approach, as I can't delay invocations for long enough. Or I would have to call a Lambda from within a Lambda, which also sounds weird.
Is AWS Step Functions the way to go here? I haven't used that service yet, so I was wondering if there are other options too. I am also using the serverless framework for doing this if it is of any significance.
Both methods you mentioned are options that would work. Within Lambda you could add a delay (sleep) after one item has been processed and then trigger another Lambda invocation following the delay. You'll be paying for that dead time, of course, if you use this approach, so Step Functions may be a more elegant solution. One Lambda can certainly invoke another, even invoking itself. If you invoke the next Lambda asynchronously, the initial function will finish while the newly invoked function starts to run. This article on Asynchronous invocation will be useful for that approach. Essentially, each Lambda invocation would be responsible for processing one item, delaying long enough to accommodate the service limit, and then invoking the Lambda for the next item.
If anything goes wrong you'd want to build appropriate exception handling so a problem with one item either halts the rest or allows the rest of the chain to continue, depending on what is appropriate for your use case.
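A minimal sketch of that chaining approach, assuming the remaining items are passed along in the payload under an items key and process is your (hypothetical) per-item work:

```
import json
import time
import boto3

lambda_client = boto3.client("lambda")

def handler(event, context):
    """Process one item, wait out the rate limit, then hand off the rest
    asynchronously so each invocation stays well under the 15-minute cap."""
    items = event["items"]                 # remaining items; field name is an assumption
    process(items[0])                      # hypothetical per-item work against the service
    time.sleep(1)                          # respect the third-party 1-second rate limit
    remaining = items[1:]
    if remaining:
        lambda_client.invoke(
            FunctionName=context.function_name,   # invoke this same function again
            InvocationType="Event",               # asynchronous: don't wait for it to finish
            Payload=json.dumps({"items": remaining}),
        )
```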
Step Functions would also work well for this use case. With options like Wait states and a loop, you could achieve the same result. For example, your Step Functions flow could invoke one Lambda that processes an item and returns the next item, then run a Wait step, then process the next item, and so on until you reach the end. Alternatively, you could use a Map state that runs a Lambda task followed by a Wait task:
The Map state ("Type": "Map") can be used to run a set of steps for each element of an input array. While the Parallel state executes multiple branches of steps using the same input, a Map state will execute the same steps for multiple entries of an array in the state input.
This article on Iterating a Loop Using Lambda is also useful.
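Something like this, as a rough state-machine sketch expressed as a Python dict (state names, the items field, and the function ARN are placeholders):

```
import json

definition = {
    "StartAt": "ForEachItem",
    "States": {
        "ForEachItem": {
            "Type": "Map",
            "ItemsPath": "$.items",
            "MaxConcurrency": 1,              # serial, to respect the 1-second rate limit
            "Iterator": {
                "StartAt": "ProcessItem",
                "States": {
                    "ProcessItem": {
                        "Type": "Task",
                        "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-item",
                        "Next": "Throttle",
                    },
                    "Throttle": {"Type": "Wait", "Seconds": 1, "End": True},
                },
            },
            "End": True,
        }
    },
}

print(json.dumps(definition, indent=2))
```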
If you want the messages to be processed serially and are happy to dump the messages into SQS, set both the concurrency of the Lambda and the batch size property of the SQS event that triggers the function to 1.
Make it a FIFO queue so that messages don't potentially get processed more than once, if that is important.
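In boto3 terms that would look roughly like this (queue and function names are placeholders; the Serverless Framework exposes equivalent settings):

```
import boto3

sqs = boto3.client("sqs")
lambda_client = boto3.client("lambda")

# FIFO queue so a message isn't processed more than once.
sqs.create_queue(
    QueueName="items.fifo",
    Attributes={"FifoQueue": "true", "ContentBasedDeduplication": "true"},
)

# Only ever run one copy of the worker at a time...
lambda_client.put_function_concurrency(
    FunctionName="process-item",
    ReservedConcurrentExecutions=1,
)

# ...and hand it one message per invocation.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-1:123456789012:items.fifo",
    FunctionName="process-item",
    BatchSize=1,
)
```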
I have a step function with three states currently.
Pass state -> Wait for 9 hours -> x Lambda Task. - (a)
I want to update the state machine adding another task at the end, so effectively the machine would look like this:
Pass state -> Wait for 9 hours -> x Lambda Task -> y Lambda task. - (b)
Is there a way I can edit (a) into (b) so that all the running executions get updated with it?
Or is the ideal way to abort all the (a) executions and supply the same data to run (b)? If so, what would be the correct, and possibly easiest, way to do this using SFN tools?
As mentioned in the Step Functions docs:
Running executions will continue to use the previous definition and roleArn
However, if you change your Lambda function, running executions (even those started before the Lambda function was changed) will run the new Lambda code when they reach that state (assuming you are not using a versioned ARN of the Lambda function in your state machine). You can change Lambda x to call Lambda y while you are in the migration phase, which will ensure all running executions also run y.
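If you do go the abort-and-resubmit route from the question, a rough boto3 sketch (the state machine ARNs are placeholders; note each resubmitted execution starts over from the beginning, including the 9-hour Wait):

```
import boto3

sfn = boto3.client("stepfunctions")
OLD_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:machine-a"  # (a)
NEW_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:machine-b"  # (b)

paginator = sfn.get_paginator("list_executions")
for page in paginator.paginate(stateMachineArn=OLD_ARN, statusFilter="RUNNING"):
    for execution in page["executions"]:
        # Grab the original input, stop the old execution, restart on the new definition.
        detail = sfn.describe_execution(executionArn=execution["executionArn"])
        sfn.stop_execution(
            executionArn=execution["executionArn"],
            cause="Superseded by updated state machine",
        )
        sfn.start_execution(stateMachineArn=NEW_ARN, input=detail["input"])
```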
It appears to me that the function of a snapshot dependency completely supersedes that of a finished build trigger in TeamCity. Can anyone explain the effect of these methods further, if they result in different chain behaviour? As an example, suppose I had a build chain of A->B.
Does the chain actually behave any differently between these three setups?
Setup 1: Single finished build trigger of A in B.
Setup 2: Single snapshot dependency of A in B.
Setup 3: Both finished build trigger of A AND snapshot dependency of A defined in B.
I understand that one can kind of treat a Snapshot Dependency as an "AND" operation over all the dependees, while a Finished Build Trigger works like an "OR" operation amongst the dependees. But in the context of a sequential chain, is there any difference?
Thanks,
Scott
A "Snapshot Dependency" and "Finished Build" trigger are very different. one is basically a "push" operation while the other is a "pull" operation, respectively.
Setup 1:
If I have build configs A and B where B has a "Finished Build" trigger on A, then the opposite behavior is true. Triggering B will have no affect on A, but triggering A will effectively trigger B once it has finished.
Setup 2:
If I have the exact same setup but instead B has a snapshot dependency on A, then whenever B is triggered, A will run first (or at least check whether it needs to run) before B runs. If only A is triggered, then B will not be triggered.
Setup 3:
Setup 3 is slightly different because it doesn't JUST depend on the "Finished Build" trigger or the snapshot dependency; it ALSO depends on the initial trigger (VCS, scheduled, or whatever). For example, if you have a VCS trigger on A, and B has both the "Finished Build" trigger and the snapshot dependency on A, then you effectively get the behavior of Setup 1: A will get triggered on VCS changes and B will be triggered AFTER A (using the same snapshot). In fact, without the snapshot dependency it is not guaranteed that B will use the same snapshot as A, which may or may not be what you want.
So in general, when you want a "left-to-right" trigger process, you use BOTH finished build triggers and snapshot dependencies to guarantee the sanctity of the build collateral.
If, on the other hand, you have your initial trigger (VCS or scheduled or whatever) set up on B, then having the "finished build" trigger is somewhat nullified, because B will always be triggered first (but not run), and then it will trigger all of its dependencies and automatically run after they finish.
Hope that helps. Thanks!
As you said, there's a big difference if a config snapshot-depends on multiple other configs (Z snapshot-depending on both X and Y). But you're not interested in that...
It's true to say that in the trivial A->B scenario Setups 1 .. 3 are close to equivalent. Of course, only if the events that trigger A and B are one-to-one (e.g. both A and B are triggered on the same VCS root; or they use different VCS roots but are only triggered manually). If this is true, then your A->B chain is super-trivial and might be possible to implement within a single build configuration.
Other subtle differences that come to mind:
Passing parameters down the chain.
Suppose A and B share some user-defined parameter "foo". The A->B snapshot dependency lets you define %foo% in A and reuse it in B using %dep.A.foo%. That's really convenient because you don't need to worry about keeping the value of %foo% in sync between A and B.
Now suppose that you want to manually trigger the A->B chain with a custom value of %foo% (you'll specify the value in the "Run..." dialog).
Since TC cannot pass the value up the chain (from B to A), you must really trigger A and specify the custom value there. When A finishes, it will trigger B, which will pick up the custom value. That's Setup 3.
You can't achieve the same with Setup 2, i.e. by triggering B with the custom value. The custom value would have no way of getting across to A.
Scheduling.
Suppose you have a scarce resource, and B cannot possibly run for every commit. You end up with each run of B "containing" (covering) multiple VCS commits. At the same time, A has no problems running for every commit.
In Setups 1 and 3, if you have a VCS trigger on A, you'll end up with
a run of A for every commit
a run of B with "aggregated" commits (each run covering multiple changes)
In Setup 2, if you have a VCS trigger on B only, you'll end up with aggregated commits in both A and B. Which may or may not be what you want...
Different VCS roots.
If A and B have different VCS roots, then Setup 1 (with VCS trigger on A only) and Setup 2 (with VCS trigger on B only) will behave quite differently.
In general, I agree the "pull" nature of snapshot dependencies (Setup 2) is much more appealing. Combine with the trigger if needed (Setup 3).