I have a project in Google cloud using the following resources
-BigQuery, Google functions (Python), google storage, Cloud Scheduler
is it possible to save the whole project as code and share it, so someone else can just use that code and deploy it using his own tenant ?
the reason, I am asking, I have published all the code and SQL queries in Github, but some users find it very hard to reproduce, they are not necessarily very familiar with Google Cloud, in an ideal situation, they need just to get a file and click deploy ?
When you create a solution for GCP we will commonly find that it consists of code, data and configuration. The code and data you can save in a source repository like GitHub ... but what of the configuration? What if your "solution" expects to have BQ datasets and tables or GCS buckets or Scheduler jobs defined? This is where you can create "Infrastructure As Code" definitions. Google supports its own IaC technology called Deployment Manager but you can also use the popular Terraform as it too has a GCP provider. The definitions for these IaC coordinators are typically text / yaml files that you can also package with your code. Sprinkle in some Make, Chef, Puppet for building apps and pushing code to deployment environments and you have a "build it from source" story. Study also the concepts of CI/CD and you will commonly find that the steps you perform for building CI/CD overlap with the steps for trivial deployment.
There are also projects such as terraformer that can do some kind of a job of reverse engineering an existing configuration to create IaC description that, when run elsewhere, will recreate the configuration.
Related
The problem
I'm approaching AWS, and the first test project will be a website, but i'm struggling on how to approach the resource and the tools to accomplish this.
AWS documentation is not really beginner-friendly, so to me it is like to being punched in the face at the first boxe training session.
First attempt
I've installed bot AWS and SAM cli tools, so what I would expect is to be able to create an empty stack at first and adding the resource one by one as the specifications are given/outlined, but instead what I see is that i need to give a template to the tool to create the new stack, but that means I need to know how to write it beforehand and therefore the template specifications for each resource type.
Second attempt
This lead me to create the stack and the related resources from the online console to get the final stack template, but then I need to test every new resource or any updated resource locally, so I have to copy the template from the online console to my machine and run the cli tools with this, but obviously it is not the desired development flow.
What I expected
Coming from a standard/classical web development I would expect to be able to create the project locally, test the related resources locally, version it, and delegate the deployment to the pipeline.
So what?
All this made me understand that "probably" I'm missing somenthing on how to use the aws cli tools and how the development for an aws-hosted application is meant to be done.
I'm not seeking for a guide on specific resource types like every single tutorial I've found online, but something on a higher level on how to handle a project development on aws, best practices and stuffs like that, I can then dig deeper on any resource later when needed.
AWS's Cloud Development Kit ticks the boxes on your specific criteria.
Caveat: the CDK has a learning curve in line with its power and flexibility. There are much easier ways to deploy a web app on AWS, like the higher-level AWS Amplify framework, with abstractions tailored to front-end devs who want to minimise the mental energy spent on the underlying infrastructure.
Each of the squillion AWS and 3rd Party deploy tools is great for somebody. Nevertheless, looking at your explicit requirements in "What I expected", we can get close to the CDK as an objective answer:
Coming from a standard/classical web development
So you know JS/Python. With the CDK, you code infrastructure as functions and classes, rather than 500 lines of YAML as with SAM. The CDK's reference implementation is in Typescript. JS/Python are also supported. There are step-by-step AWS online workshops for these and the other supported languages.
create the project locally
Most of your work will be done locally in your language of choice, with a cdk deploy CLI command to
bundle the deployment artefacts and send them up to the cloud.
test the related resources locally
The CDK has built-in testing and assertion support.
version it
"Deterministic deploy" is a CDK design goal. Commit your code and the generated deployment artefacts so you have change control over your infrastructure.
delegate the deployment to the pipeline
The CDK has good pipeline support: i.e. a push to the remote main branch can kick off a deploy.
AWS SAM is actually a good option if you are just trying to get your feet wet with AWS. SAM is an open-source wrapper around the aws-cli, which allows you to create aws resources like Lambda in say ~10 lines of code vs ~100 lines if you were to use the aws-cli directly. Yes, you'll need to learn SAM specific things like SAMtemplate and SAM-cli but it is pretty straightforward using this doc.
Once you get the hang of it, it would be easier to start looking under the hood of what/how SAM is doing things and get into the weeds with aws-cli if you wanted. Which will then allow you to build out custom solutions (using aws-cli) for your complex use cases that SAM may not support. Caveat: SAM is still pretty new and has open issues that could be a blocker for advanced features/complex use cases.
Currently we're having a dev environment in a gcp project. We're using GDM templates and other stuffs along with repo in bitbucket. Whenever we push any changes in bitbucket it builds and deploy to this dev environment. Suddenly, we've decided to have a new gcp project as test environment and we want to deploy automatically to this environment like dev environment. Our preference will be to deploy to this environment from the cloudbuild execution in dev environment. Can you suggest us any guideline that'll help us to set up things in one place that'll automatically deploy this in multiple projects as multiple environments automatically?
You can use Terraform to achieve this.
There's a lot of information on how to start here.
However, I would suggest having projects in separate deployments. This way you limit the blast radius and protect production from errors occurring in other environments.
You need separate calls for separate projects. Just like almost all Google API resources deploymentmanager/deployments lives inside a project (https://www.googleapis.com/deploymentmanager/v2/projects/[PROJECT]/global/deployments), thus you cannot deploy to multiple projects in one call.
I'm presently looking into GCP's Deployment Manager to deploy new projects, VMs and Cloud Storage buckets.
We need a web front end that authenticated users can connect to in order to deploy the required infrastructure, though I'm not sure what Dev Ops tools are recommended to work with this system. We have an instance of Jenkins and Octopus Deploy, though I see on Google's Configuration Management page (https://cloud.google.com/solutions/configuration-management) they suggest other tools like Ansible, Chef, Puppet and Saltstack.
I'm supposing that through one of these I can update something simple like a name variable in the config.yaml file and deploy a project.
Could I also ensure a chosen name for a project, VM or Cloud Storage bucket fits with a specific naming convention with one of these systems?
Which system do others use and why?
I use Deployment Manager, as all 3rd party tools are reliant upon the presence of GCP APIs, as well as trusting that those APIs are in line with the actual functionality of the underlying GCP tech.
GCP is decidedly behind the curve on API development, which means that even if you wanted to use TF or whatever, at some point you're going to be stuck inside the SDK, anyway. So that's why I went with Deployment Manager, as much as I wanted to have my whole infra/app deployment use other tools that I was more comfortable with.
To specifically answer your question about validating naming schema, what you would probably want to do is write a wrapper script that uses the gcloud deployment-manager subcommand. Do your validation in the wrapper script, then run the gcloud deployment-manager stuff.
Word of warning about Deployment Manager: it makes troubleshooting very difficult. Very often it will obscure the error that can help you actually establish the root cause of a problem. I can't tell you how many times somebody in my office has shouted "UGGH! Shut UP with your Error 400!" I hope that Google takes note from my pointed survey feedback and refactors DM to pass the original error through.
Anyway, hope this helps. GCP has come a long way, but they've still got work to do.
I am just getting started with both, GCP & Google Cloud Data Fusion. Just viewed the intro video. I see that pipelines can be exported. I was wondering how we might promote a pipeline from say, Dev to Prod env? My guess is that after some testing, the exported file is copied to the Prod branch on Git, from where we need to invoke the APIs to deploy it? Also, what about connection details, how do we avoid hard-coding the source/destination configurations & credentials?
Yes. You would have to export and re-import the pipeline.
About the first question, if you have different environments for development and production, you can export your pipeline and import it in the correct environment.
I didn't understand the second question very well. In the official Data Fusion plugins there is a standard way to provide your credentials. If you need a better answer, please explain a little more carefully your doubt.
Is there a way to duplicate an entire project?
The project contains:
2x Cloud SQL: main + backup
1x Cloud Storage
4x Google Compute Engine
We have an exactly the same project already built up and configured, so it would be much easier for us if we could just make a copy of those.
The projects are not under the same account.
There is no such a way to replicate as-is a project.
However, you can use Terraformer starting from your current project: this CLI tool will generate Terraform template files starting from the existing infrastructure (reverse Terraform). Then, you can use these files to re-create the target resources within a second GCP project in a programmatic fashion (see https://cloud.google.com/community/tutorials/getting-started-on-gcp-with-terraform).
Disclaimer: Comments and opinions are my own and not the views of my employer.