How to edit an already deployed pipeline in Data Fusion? - google-cloud-platform

I am trying to edit a pipeline that is already deployed. I understand that we can duplicate the pipeline and rename it, but how can I make a change in the existing pipeline? Renaming would also require a change in the production scheduling jobs.

There is one way, through the HTTP calls executor:
Open https://<cdf instance url ..datafusion.googleusercontent.com>/cdap/httpexecutor
Select PUT (to change the pipeline code) from the drop-down and give the path
namespaces/<namespace_name>/apps/<pipeline_name>
Go to the body part and paste the new pipeline code (i.e. the exported JSON of the updated pipeline).
Click on SEND and the response should come back as "Deploy Complete" with status code 200.
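If you prefer to script this instead of clicking through the httpexecutor UI, the same PUT can be sent from Python. This is only a sketch: the instance URL, namespace, pipeline name and the gcloud-based token retrieval are assumptions to replace with your own values.

import json
import subprocess
import requests

# Placeholders: use your instance's API endpoint, namespace and pipeline name
CDAP_ENDPOINT = "https://<cdf instance url>.datafusion.googleusercontent.com/api"
NAMESPACE = "default"
PIPELINE_NAME = "my_pipeline"

# Assumes gcloud is installed and authenticated; any other way of obtaining
# a bearer token for the instance works as well
token = subprocess.check_output(
    ["gcloud", "auth", "print-access-token"], text=True
).strip()

# Exported JSON of the updated pipeline
with open("updated_pipeline.json") as f:
    pipeline_spec = json.load(f)

resp = requests.put(
    f"{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{PIPELINE_NAME}",
    headers={"Authorization": f"Bearer {token}"},
    json=pipeline_spec,
)
print(resp.status_code, resp.text)  # expect 200 and "Deploy Complete"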

Related

Multiple configuration files for a code pipeline action

I want to run a CFN stack from CodePipeline, but the parameters are scattered across different outputs from previous actions. The only way I can think of is to run a Lambda action which loads all of them and creates a single new output out of them.
Is there something more straightforward?

Run a "PUT" request using Rest-Api on Azure DevOps

I'm trying to run a PUT request using Postman to change the retention rules of a specific build definition, in Azure DevOps, and change the daysToKeep value.
But I keep getting the error:
"The request specifies pipeline ID 1722 but the supplied pipeline has ID 0."
Any idea where I went wrong?
In order to change/update any parameter on the build definition, first run a GET request.
The resulting JSON output should be used as the body for the PUT request.
Use this body and change/update the relevant parameter(s) you need...
It is very important to increase the "revision" parameter, located in the root of the JSON output, by 1 (for example, if the current value is 97, for the next run it should be 98).
Special thanks to Tinxuanna for directing me to this solution!
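For reference, here is a minimal Python sketch of that GET-then-PUT flow. The organization, project, definition ID, PAT environment variable and api-version are assumptions to adapt to your setup.

import os
import requests

ORG = "my-org"          # placeholder organization
PROJECT = "my-project"  # placeholder project
DEFINITION_ID = 1722
PAT = os.environ["AZDO_PAT"]  # personal access token

url = (f"https://dev.azure.com/{ORG}/{PROJECT}"
       f"/_apis/build/definitions/{DEFINITION_ID}?api-version=6.0")
auth = ("", PAT)  # the PAT goes in the password slot of basic auth

# 1. GET the current definition; its JSON becomes the PUT body
definition = requests.get(url, auth=auth).json()

# 2. Change the parameter(s) you need, e.g. retention daysToKeep
for rule in definition.get("retentionRules", []):
    rule["daysToKeep"] = 30

# 3. Bump the revision, as described above
definition["revision"] += 1

# 4. PUT the whole definition back
resp = requests.put(url, auth=auth, json=definition)
print(resp.status_code)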
@ShaiO Please take a look at this link: List pipelines Azure DevOps. It may help you because it gets a list of the pipelines.
Then you can create a pipeline following the documentation.

Perform cloud formation only if any changes in lambda using AWS Code Pipeline

I am using AWS CodePipeline to perform CloudFormation. My source code is committed in a GitHub repository. Whenever a commit happens in my GitHub repository, AWS CodePipeline starts its execution and performs the CloudFormation. These functionalities are working fine.
In my project I have multiple modules. So if a user modifies only one module, all of the modules' Lambdas are updated. Is there any way to restrict this using AWS CodePipeline?
My Code Pipeline has 3 stages.
Source
Build
Deploy
The following is the snapshot of my code pipeline.
We had a similar issue and eventually came to the conclusion that this is not exactly possible. Unless you separate your modules into different repos and make separate pipelines for each of them, it is always going to execute everything.
The good thing is that each execution of the pipeline does not entirely redeploy everything when the CloudFormation is executed. In the deploy stage you can add a Create Changeset action, which detects what has changed since the previous CloudFormation deployment, redeploys only those parts and does not touch anything else; see the sketch below.
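As an illustration, this is roughly what that change-set pair looks like in a pipeline definition, written as the Python dictionaries that boto3's create_pipeline/update_pipeline accept. The stack, change-set and artifact names are placeholders, and required details such as RoleArn are omitted.

# Sketch of a Deploy stage built around a change set: the first action
# computes the change set, the second executes it
deploy_stage = {
    "name": "Deploy",
    "actions": [
        {
            "name": "CreateChangeSet",
            "actionTypeId": {"category": "Deploy", "owner": "AWS",
                             "provider": "CloudFormation", "version": "1"},
            "configuration": {
                "ActionMode": "CHANGE_SET_REPLACE",
                "StackName": "my-stack",
                "ChangeSetName": "pipeline-changeset",
                "TemplatePath": "BuildOutput::packaged-template.yaml",
                "Capabilities": "CAPABILITY_IAM",
            },
            "inputArtifacts": [{"name": "BuildOutput"}],
            "runOrder": 1,
        },
        {
            "name": "ExecuteChangeSet",
            "actionTypeId": {"category": "Deploy", "owner": "AWS",
                             "provider": "CloudFormation", "version": "1"},
            "configuration": {
                "ActionMode": "CHANGE_SET_EXECUTE",
                "StackName": "my-stack",
                "ChangeSetName": "pipeline-changeset",
            },
            "runOrder": 2,
        },
    ],
}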
This is the exact issue we faced recently, and while I see comments mentioning that it isn't possible to achieve with a single repository, I have found a workaround!
Generally, the CodePipeline is triggered by a CloudWatch event listening to the GitHub/CodeCommit repository. Rather than triggering the pipeline, I made the CloudWatch event trigger a Lambda function. In the Lambda we can write the logic to execute only the pipeline(s) of the modules that have changes. This works really nicely and provides a lot of control over the pipeline execution. This way multiple pipelines can be created from a single repository, solving the problem mentioned in the question.
Lambda logic can be something like:
import boto3

# Map top-level project folders in the repo to the pipelines they should trigger
project_pipeline_mapping = {
    "CodeQuality_ScoreCard": "test-pipeline-code-quality",
    "ProductQuality_ScoreCard": "test-product-quality-pipeline"
}

files_to_ignore = ["readme.md"]

codecommit_client = boto3.client('codecommit')
codepipeline_client = boto3.client('codepipeline')

def lambda_handler(event, context):
    projects_changed = []

    # Extract the old and new commit ids from the CloudWatch event
    print("\n EVENT::: ", event)
    old_commit_id = event["detail"]["oldCommitId"]
    new_commit_id = event["detail"]["commitId"]

    # Get the differences between the two commits
    codecommit_response = codecommit_client.get_differences(
        repositoryName="ScorecardAPI",
        beforeCommitSpecifier=str(old_commit_id),
        afterCommitSpecifier=str(new_commit_id)
    )
    print("\n Code commit response: ", codecommit_response)

    # Search the commit differences for files that trigger executions
    for difference in codecommit_response["differences"]:
        file_name = difference["afterBlob"]["path"]
        if file_name in files_to_ignore:
            continue
        project_name = file_name.split('/')[0]
        print("\nChanged project: ", project_name)
        # If the project corresponds to a pipeline, remember it
        if project_name in project_pipeline_mapping:
            projects_changed.append(project_name)

    # Drop duplicates while preserving order
    projects_changed = list(dict.fromkeys(projects_changed))
    print("pipeline(s) to be executed: ", projects_changed)

    # Start only the pipelines whose modules changed
    for project in projects_changed:
        codepipeline_response = codepipeline_client.start_pipeline_execution(
            name=project_pipeline_mapping[project]
        )
Check the AWS blog post on this topic: Customizing triggers for AWS CodePipeline with AWS Lambda and Amazon CloudWatch Events
Why not model this as a pipeline per module?

AWS Data Pipeline: how to add steps other than data nodes and activities

EDIT
What I'm really asking here is whether most folks use the "Architect" GUI to build their pipelines, or whether they just use JSON. Is JSON the only way to access some functionality?
/EDIT
I'm just getting started in AWS, so I'm hoping someone here can help me out.
I've used the template for "Load S3 Data into RDS MySQL table" to create a basic pipeline that does a very simple insert:
For learning purposes I want to recreate that pipeline from scratch, but I can't figure out how to add anything to the pipeline that isn't an activity or a data node. Does this have to be done through the CLI? When I try to use the "Add" button in Architect I only see options for activities and data nodes.
TaskRunners, Preconditions, Databases, Actions and Resources can be added to the Pipeline only from their respective Activities and Datanodes.
For example, an RDSDatabase can be added to the Pipeline from a SqlActivity, SqlDataNode or MySqlDataNode:
Add SqlActivity -- Choose Database -- Create new: Database : adds a Database object to the Pipeline.
Database -- Choose Type -- Select type: RDSDatabase
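If you would rather skip Architect entirely, the same objects can be defined in the JSON-style definition and pushed with the CLI or SDK. Below is a rough boto3 sketch, not a complete definition (the Default/Schedule objects are omitted); the pipeline ID, connection details and object names are placeholders, and the field names follow the AWS Data Pipeline object reference.

import boto3

dp = boto3.client('datapipeline')

# A SqlActivity plus the RdsDatabase and resource it references, expressed as
# pipeline objects instead of being created through the Architect GUI
pipeline_objects = [
    {
        "id": "MyRdsDatabase",
        "name": "MyRdsDatabase",
        "fields": [
            {"key": "type", "stringValue": "RdsDatabase"},
            {"key": "rdsInstanceId", "stringValue": "my-rds-instance"},
            {"key": "username", "stringValue": "admin"},
            {"key": "*password", "stringValue": "REPLACE_ME"},
        ],
    },
    {
        "id": "MySqlActivity",
        "name": "MySqlActivity",
        "fields": [
            {"key": "type", "stringValue": "SqlActivity"},
            {"key": "database", "refValue": "MyRdsDatabase"},
            {"key": "script", "stringValue": "INSERT INTO mytable VALUES (1);"},
            {"key": "runsOn", "refValue": "MyEc2Resource"},
        ],
    },
    {
        "id": "MyEc2Resource",
        "name": "MyEc2Resource",
        "fields": [
            {"key": "type", "stringValue": "Ec2Resource"},
        ],
    },
]

# Upload the definition to an existing pipeline (placeholder ID)
dp.put_pipeline_definition(
    pipelineId="df-EXAMPLE1234",
    pipelineObjects=pipeline_objects,
)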

Importing data from Excel sheet to DynamoDB table

I am having a problem importing data from an Excel sheet to an Amazon DynamoDB table. I have the Excel sheet in an Amazon S3 bucket and I want to import data from this sheet to a table in DynamoDB.
Currently I am following Import and Export DynamoDB Data Using AWS Data Pipeline, but my pipeline is not working normally.
It gives me a WAITING_FOR_RUNNER status and after some time the status changes to CANCELED. Please suggest what I am doing wrong, or is there any other way to import data from an Excel sheet to a DynamoDB table?
The potential reasons are as follows:
Reason 1:
If your pipeline is in the SCHEDULED state and one or more tasks appear stuck in the WAITING_FOR_RUNNER state, ensure that you set a valid value for either the runsOn or workerGroup fields for those tasks. If both values are empty or missing, the task cannot start because there is no association between the task and a worker to perform the tasks. In this situation, you've defined work but haven't defined what computer will do that work. If applicable, verify that the workerGroup value assigned to the pipeline component is exactly the same name and case as the workerGroup value that you configured for Task Runner.
Reason 2:
Another potential cause of this problem is that the endpoint and access key provided to Task Runner are not the same as the AWS Data Pipeline console or the computer where the AWS Data Pipeline CLI tools are installed. You might have created new pipelines with no visible errors, but Task Runner polls the wrong location due to the difference in credentials, or polls the correct location with insufficient permissions to identify and run the work specified by the pipeline definition.
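To illustrate Reason 1, the import activity in the pipeline definition has to point at something that can run it: either a managed resource via runsOn, or your own Task Runner via workerGroup. A hedged fragment in the same pipeline-object format as above; the object names are placeholders borrowed from the DynamoDB import template.

# The activity must reference a resource (runsOn) or a worker group
# (workerGroup); with neither, it stays in WAITING_FOR_RUNNER
table_load_activity = {
    "id": "TableLoadActivity",
    "name": "TableLoadActivity",
    "fields": [
        {"key": "type", "stringValue": "EmrActivity"},
        {"key": "runsOn", "refValue": "EmrClusterForLoad"},
        # ...or, if you run Task Runner yourself, use this instead:
        # {"key": "workerGroup", "stringValue": "myWorkerGroup"},
    ],
}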