Google Deployment Manager - BigTable example

Google Deployment Manager - BigTable example - google-cloud-platform

I have been trying this example provided in the Google's Deployment Manager GitHub project.
It works, yet I am not sure what is the purpose of creating three instances named instance_create, instance_update and instance_delete.
For example, taken from the link:
instance_create = {
'name':
'instance_create',
'action':
'gcp-types/bigtableadmin-v2:bigtableadmin.projects.instances.create',
'properties': {
'parent': project_path,
'instanceId': instance_name,
'clusters': copy.deepcopy(initial_cluster),
'instance': context.properties['instance']
},
'metadata': {
'runtimePolicy': ['CREATE']
}
}
What is the purpose of `action` and `metadata`.`runtimePolicy`? I have tried to find it in the documentation but failed miserably.
Why there are three `BigTable` instances there?

You are right, the documentation is missing the information, which would answer your questions regarding these parameters.
However, it helps knowing what's going on in the Depoyment Manager example you linked.
First of all, the following line in the config.yaml is where the things get tricky:
resources:
- name: my-bigtable
type: bigtable.py
This line will do a call to the bigtable.py python file, which sets the resource type of the deployment to that which are in it, under the GenerateConfig function. See how this is done here.
The resources are returned as {'resources': resources} at the end of it, being the resources variable a list of templates created there.
These templates have different name identifiers, which are set by the "name" tag.
So you are not creating three different instances with the name of instance_create, instance_update and instance_delete in this file, but you are creating three templates with those names, that will later be appended to the resources list, and later returned to the config.yaml resources.type tag.
These templates then will be sequentially build and executed by the deployment manager, once the create command is used. Note that they might appear out of order, this is due not using a schema.
It's easier to see this structure in a .yaml file format, for example, built with jinja, the template you posted would be:
resources:
- action: gcp-types/bigtableadmin-v2:bigtableadmin.projects.instances.create
name: instance_create
metadata:
runtimePolicy:
- CREATE
properties:
clusters:
initial:
defaultStorageType: HDD
location: projects/<PROJECT_ID>/locations/<PROJECT_LOCATION>
serveNodes: 4
instance:
displayName: My BigTable Instance.
type: PRODUCTION
instanceId: my-instance
parent: projects/<PROJECT_ID>
Notice that the parameters under properties are the fields in the request body to bigtableadmin.projects.instances.create (which is nesting a clusters object parameters and a instance object parameters). Note that the InstanceId under properties is always the same, hence the BigTable instance, on which the templates do the calls, is always the same one.
The thing is that, not only the example you linked creates various templates to be run in the same script, but that the resource type for each template is a call to the BigTable API.
Normally the template resources are specified with the type tag, but since you are calling a resource that is directly running an API call (i.e. instead of just specifying gcp-types/bigtableadmin-v2, you are specifying bigtableadmin-v2:bigtableadmin.projects.instances.create), the action tag is used. I haven't found this difference on usage documented anywhere, but it needs to be specified like that.
You will know if you are calling an API 'endpoint' directly if the resource ends with either create/update/delete.
Finally, the I have investigated in my side, and the metadata.runtimePolicy is tied to the fact that the resource type is an API call (like in the previous point). Once again, I haven't found this documented anywhere.
However, since this is a requirement, you will always have to set the correct value in this field. It basically boils down to have metadata.runtimePolicy set to this values, depending on which type of API call you do:
create -> ['CREATE']
update -> ['UPDATE_ON_CHANGE']
delete -> ['DELETE']
Summarizing:
You are not creating three different instances, but three different templates, which do the work on the same BigTable instance.
You need to change the resource type flag to action if you are calling an API endpoint (create/update/delete), instead of just naming the base API.
The metadata.runtimePolicy value is a requirement when doing a call to one of the aforenamed endpoints.

Related

Storing parameterized values in cloud formation and referencing it

Is there a way to store variables in Cloudformation?
I've created a resource with a name which is a stage specific name in the following form:
DeliveryStreamName: {'Fn::Sub': ['firehose-events-${Stage}', 'Stage': {'Ref' : 'Stage' }]}
Now if I've to create a cloudwatch alarm on that resource I'm again following the same pattern:
Dimensions:
- Value: {'Fn::Sub': ['firehose-events-${Stage}', 'Stage': {'Ref' : 'Stage' }]}
Instead if I could store the whole value in one variable, it would be much easier for me to refer it.
I thought initially storing it in parameters, like this:
Parameters:
FirehoseEvent: {Type:String, Default: 'firehose-events-${Stage}'}
But the stage value doesn't seem to get passed in here. And there is no non default value either for this resource name.
The other option I considered was using mapping, but that defeats the purpose of using ${Stage}.
Is there some other way which I've missed?

Sadly you haven't missed anything. Parameters can't reference other parameters in their definition.
The only way I can think of doing what you which would be through a custom macro. In its simplest form the macro would just perform traditional find-and-replace type of template processing.
However, the time required to develop such macro could be not worth its benefits, at least in this simple example you've provided in the question.

Difference between an Output & an Export

In CloudFormation we have the ability to output some values from a template so that they can be retrieved by other processes, stacks, etc. This is typically the name of something, maybe a URL or something generated during stack creation (deployment), etc.
We also have the ability to 'export' from a template. What is the difference between returning a value as an 'output' vs as an 'export'?

Regular output values can't be references from other stacks. They can be useful when you chain or nest your stacks and their scope/visibility is local. Exported outputs are visible globally within account and region, and can be used by any future stack you are going to deploy.
Chaining
When you chain your stacks, you deploy one stack, take it outputs, and use as input parameters to the second stack you are going to deploy.
For example, let's say you have two templates called instance.yaml and eip.yaml. The instance.yaml outputs its instance-id (no export), while eip.yaml takes instance id as an input parameter.
To deploy them both, you have to chain them:
Deploy instance.yaml and wait for its completion.
Note it outputs values (i.e. instance-id) - usually done programmatically, not manually.
Deploy eip.yaml and pass instance-id as its input parameter.
Nesting
When you nest stacks you will have a parent template and a child template. Child stack will be created from inside of the parent stack. In this case the child stack will produce some outputs (not exports) for the parent stack to use.
For example, lets use again instance.yaml and eip.yaml. But this time eip.yaml will be parent and instance.yaml will be child. Also eip.yaml does not take any input parameters, but instance.yaml outputs its instance-id (not export)
In this case, to deploy them you do the following:
Upload parrent template (eip.yaml) to s3
In eip.yaml create the child instance stack using AWS::CloudFormation::Stack and the s3 url from step 1.
This way eip.yaml will be able to access the instance-id from the outputs of the nested stack using GetAtt.
Cross-referencing
When you cross-reference stacks, you have one stack that exports it outputs so that they can be used by any other stack in the same region and account.
For example, lets use again instance.yaml and eip.yaml. instance.yaml is going to export its output (instance-id). To use the instance-id eip.yaml will have to use ImportValue in its template without the need for any input parameters or nested stacks.
In this case, to deploy them you do the following:
Deploy instance.yaml and wait till it completes.
Deploy eip.yaml which will import the instance-id.
Altough cross-referencing seems very useful, it has one major issue, which is that its very difficult to update or delete cross-referenced stacks:
After another stack imports an output value, you can't delete the stack that is exporting the output value or modify the exported output value. All of the imports must be removed before you can delete the exporting stack or modify the output value.
This is very problematic if you are starting your design and your templates can change often.
When to use which?
Use cross-references (exported values) when you have some global resources that are going to be shared among many stacks in a given region and account. Also they should not change often as they are difficult to modify. Common examples are: a global bucket for centralized logging location, a VPC.
Use nested stack (not exported outputs) when you have some common components that you often deploy, but each time they can be a bit different. Examples are: ALB, a bastion host instance, vpc interface endpoint.
Finally, chained stacks (not exported outputs) are useful for designing loosely-coupled templates, where you can mix and match templates based on new requirements.

Short answer from here, use export between stacks, and use output with nested stacks.
Export
To share information between stacks, export a stack's output values.
Other stacks that are in the same AWS account and region can import
the exported values.
Output
With nested stacks, you deploy and manage all resources from a single
stack. You can use outputs from one stack in the nested stack group as
inputs to another stack in the group. This differs from exporting
values.

AWS CDK generated resource identifiers are horrible and not readable. Any way to fix this?

Anyone, that has used AWS CDK suffers from horrible resource identifiers.
Examples of Stacks/Nested Stacks names:
Or examples of resource names:
These identifiers are horrible to read. Is there any work-around to override these identifiers?
I have tried to set ids / names / identifiers / alies of the resources. However it seems that cdk or cloudformation itself is generating these strings.
Thank you for suggestions!

All of resources(or at least for most that I know) could be named manually.
For AWS::EC2::SecurityGroup that would be Properties -> GroupName
AWS::CloudWatch::Alarm - Properties -> AlarmName
AWS::Lambda::Function - Properties -> FunctionName
etc.
But for some of them that would lead to consequences - you won't be able to update some of them, because they might need recreation (and the name is already occupied). So in general it's not a good practice.
And obviously you won't be able to create a full env duplicate not changing some parameter for the generated name like this:
FunctionName: !Sub '${InstanceName}-your-resourse-constant-name-${Environment}'
If you don't specify the naming it would create a name like this:
${stackName}-${resourceNameInCF}-${someHashCode}, but in your case it seems you have nested stacks and it becomes pretty unreadable, especially with long names because of the names chaining.

Yeah, this is a good question. There's two types of IDs here:
Logical ID
Physical ID
Typically the Physical ID can be specified as a property on the resources. For instance, if you are using the CDK, you can set the functionName property when creating your Lambda (as below).
The Logical ID is also added when creating the resource and as you mentioned, however, the Logical ID is derived from a combination of what you specify and where it is within your stack. So, for example, if you have a stack that uses constructs, then this ID will be prefixed with the construct's Logical ID as well... and it's definitely not very readable.
I'd be very careful changing these IDs, especially if you have already deployed the stack, but if you really want to override them then you could do something like this in the CDK (TypeScript):
import {
CfnResource,
} from "#aws-cdk/core";
import {
Function,
Runtime,
Code,
} from "#aws-cdk/aws-lambda";
const consumerLambda = new Function(this, 'LogicalIdOnResource', {
runtime: Runtime.NODEJS_12_X,
handler: 'index.handler',
code: Code.fromAsset(path.join(__dirname, 'lambda-handler')),
functionName: 'ds-di-kafka-consumer-lambda' // PhysicalIdOnResource
});
// Override Logical ID
(consumerLambda.node.defaultChild as CfnResource).overrideLogicalId(
'Consumer'
);
Which looks like this on CloudFormation:

How to work around Cfn action's character limit in CodePipeline

Using the AWS CDK, I have a CodePipeline that produces build artifacts for 5 different Lambda functions, and then passes those artifacts as parameters to a CloudFormation template. The basic setup is the same as this example, and the CloudFormation deploy action looks basically like this:
new CloudFormationCreateUpdateStackAction({
actionName: 'Lambda_CFN_Deploy',
templatePath: cdkBuildOutput.atPath('LambdaStack.template.json'),
stackName: 'LambdaDeploymentStack',
adminPermissions: true,
parameterOverrides: {
...props.lambdaCode.assign(lambdaBuildOutput.s3Location),
// more parameter overrides here
},
extraInputs: [lambdaBuildOutput],
})
However, when I try to deploy, I get this error:
1 validation error detected: Value at 'pipeline.stages.3.member.actions.1.member.configuration' failed to satisfy constraint:
Map value must satisfy constraint:
[Member must have length less than or equal to 1000, Member must have length greater than or equal to 1]
The CodePipeline documentation specifies that values in the Configuration property of the ActionDeclaration can be up to 1000 characters. If I look at the YAML output from cdk synth, the ParameterOverrides property comes out to 1351 characters. So that's a problem.
How can I work around this issue? I may need to add more Lambda functions in the future, so this problem will only get worse. Part of the problem is that the CDK code inserts 'LambdaSourceBucketNameParameter' and 'LambdaSourceObjectKeyParameter' in each bucket/object pair name in the configuration output, putting me at 61 * 5 = 305 characters lost just to being verbose. Could I get part of the way there by overriding those generated names?

I got some assistance from a CDK maintainer here, which let me get well under the 1000-character limit. Reproducing the workaround here:
LambdaSourceBucketNameParameter and LambdaSourceObjectKeyParameter are just the default parameter names. You can create your own:
lambda.Code.fromCfnParameters({
bucketNameParam: new CfnParameter(this, 'A'),
objectKeyParam: new CfnParameter(this, 'B'),
});
You can also name Artifacts explicitly, thus saving a lot of characters over the defaults:
const sourceOutput = new codepipeline.Artifact('S');
EDIT 10-Jan-2020
I finally got a response from AWS Support regarding the issue:
I've queried the CodePipeline team and searched though the current development workflows and couldn't find any current activity related to increasing the limit for parameters passed to a CloudFormation stack or any alternative method for this action, so we have issued a feature request based on your request for our development team.
I'm not able to provide an estimated time for this feature to be available, but you can follow the release on new features through the CloudFormation and CodePipeline official pages to check when the new feature will be available.
So for now, it looks like the CfnParameter workaround is the best option.

AWS Cloudformation Nested Templates

I am trying to create a nested topology from 4 existing templates. These templates do the following:
1: deploys a policy and a role.
2: deploys an EC2 instance.
3: deploys an ELB.
4: deploys an RDS instance.
All of them are "linked" by using outputs. All of the parameters are also contained within these.
Now I want to create a fifth template (master) and treat the other 4 templates as child.
However I am not too sure about the minimum code that I need in the master template:
Parameters: these are defined within the child so I don't need them here, do I?
Resources: point to the 4 child templates by providing the S3 URL where they're stored.
DependsOn clause: I need this as the child templates need to be deployed in sequential order.
Outputs: not too sure what to include here, shall I leave the outputs on the child and define here only the master's?
The master I think it should be small but not too sure if I am missing something. Another question, do I need to change anything on the child templates?
Any help would be much appreciated.

A handful of questions here, so I'll address what I can :)
For the master, or parent template, I'd recommend including all Parameters that the child stacks will need.
When you want to make any updates in the future to any of the child stacks, you'll want to initiate that from the parent stack.
According to the docs:
Certain stack operations, such as stack updates, should be initiated
from the root stack rather than performed directly on nested stacks
themselves.
So your parent template could have a lot of parameters depending on how many parameters need to be passed directly to the child templates.
Depending on how the child stacks use the Outputs from the other child stacks, you may not need to use the DependsOn to enforce ordering, since Cloudformation is smart enough to figure out Implicit Dependencies (see docs discussing DependsOn). It certainly won't hurt to include these, but the DependsOn attribute isn't needed for most situations.
You'll want to make sure the child stacks have an Outputs section so that other child stacks can use them. Pay close attention to the Return values for AWS::CloudFormation::Stack

If you have many dependent stacks, it is much easier to run everything for example from Ansible. Add outputs in each CF template, then just write simple playbook that will run your templates in desired order. Please take a look at https://docs.ansible.com/ansible/devel/modules/cloudformation_module.html

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js