I am just getting started with both, GCP & Google Cloud Data Fusion. Just viewed the intro video. I see that pipelines can be exported. I was wondering how we might promote a pipeline from say, Dev to Prod env? My guess is that after some testing, the exported file is copied to the Prod branch on Git, from where we need to invoke the APIs to deploy it? Also, what about connection details, how do we avoid hard-coding the source/destination configurations & credentials?
Yes. You would have to export and re-import the pipeline.
About the first question, if you have different environments for development and production, you can export your pipeline and import it in the correct environment.
I didn't understand the second question very well. In the official Data Fusion plugins there is a standard way to provide your credentials. If you need a better answer, please explain a little more carefully your doubt.
Related
The problem
I'm approaching AWS, and the first test project will be a website, but i'm struggling on how to approach the resource and the tools to accomplish this.
AWS documentation is not really beginner-friendly, so to me it is like to being punched in the face at the first boxe training session.
First attempt
I've installed bot AWS and SAM cli tools, so what I would expect is to be able to create an empty stack at first and adding the resource one by one as the specifications are given/outlined, but instead what I see is that i need to give a template to the tool to create the new stack, but that means I need to know how to write it beforehand and therefore the template specifications for each resource type.
Second attempt
This lead me to create the stack and the related resources from the online console to get the final stack template, but then I need to test every new resource or any updated resource locally, so I have to copy the template from the online console to my machine and run the cli tools with this, but obviously it is not the desired development flow.
What I expected
Coming from a standard/classical web development I would expect to be able to create the project locally, test the related resources locally, version it, and delegate the deployment to the pipeline.
So what?
All this made me understand that "probably" I'm missing somenthing on how to use the aws cli tools and how the development for an aws-hosted application is meant to be done.
I'm not seeking for a guide on specific resource types like every single tutorial I've found online, but something on a higher level on how to handle a project development on aws, best practices and stuffs like that, I can then dig deeper on any resource later when needed.
AWS's Cloud Development Kit ticks the boxes on your specific criteria.
Caveat: the CDK has a learning curve in line with its power and flexibility. There are much easier ways to deploy a web app on AWS, like the higher-level AWS Amplify framework, with abstractions tailored to front-end devs who want to minimise the mental energy spent on the underlying infrastructure.
Each of the squillion AWS and 3rd Party deploy tools is great for somebody. Nevertheless, looking at your explicit requirements in "What I expected", we can get close to the CDK as an objective answer:
Coming from a standard/classical web development
So you know JS/Python. With the CDK, you code infrastructure as functions and classes, rather than 500 lines of YAML as with SAM. The CDK's reference implementation is in Typescript. JS/Python are also supported. There are step-by-step AWS online workshops for these and the other supported languages.
create the project locally
Most of your work will be done locally in your language of choice, with a cdk deploy CLI command to
bundle the deployment artefacts and send them up to the cloud.
test the related resources locally
The CDK has built-in testing and assertion support.
version it
"Deterministic deploy" is a CDK design goal. Commit your code and the generated deployment artefacts so you have change control over your infrastructure.
delegate the deployment to the pipeline
The CDK has good pipeline support: i.e. a push to the remote main branch can kick off a deploy.
AWS SAM is actually a good option if you are just trying to get your feet wet with AWS. SAM is an open-source wrapper around the aws-cli, which allows you to create aws resources like Lambda in say ~10 lines of code vs ~100 lines if you were to use the aws-cli directly. Yes, you'll need to learn SAM specific things like SAMtemplate and SAM-cli but it is pretty straightforward using this doc.
Once you get the hang of it, it would be easier to start looking under the hood of what/how SAM is doing things and get into the weeds with aws-cli if you wanted. Which will then allow you to build out custom solutions (using aws-cli) for your complex use cases that SAM may not support. Caveat: SAM is still pretty new and has open issues that could be a blocker for advanced features/complex use cases.
I am going to create environments. For now i have gcp machine and i run jupyter in there. Everytime, i need start it, and with 3 people it is hard to work in same environment. I know, there is docker, jupyter hub, but did not find and suitable roadmap to create dev/prod environment.
My aim to create dev and production environment. Everything should be on GCP.
Any suggested path ?
Thanks
You can take a look at the best practices for enterprise organizations. In order to properly split resources it's often advised to use different projects. However, depending on the GCP product, you could also use versions, such as with App Engine (see this StackOverflow thread).
I'm presently looking into GCP's Deployment Manager to deploy new projects, VMs and Cloud Storage buckets.
We need a web front end that authenticated users can connect to in order to deploy the required infrastructure, though I'm not sure what Dev Ops tools are recommended to work with this system. We have an instance of Jenkins and Octopus Deploy, though I see on Google's Configuration Management page (https://cloud.google.com/solutions/configuration-management) they suggest other tools like Ansible, Chef, Puppet and Saltstack.
I'm supposing that through one of these I can update something simple like a name variable in the config.yaml file and deploy a project.
Could I also ensure a chosen name for a project, VM or Cloud Storage bucket fits with a specific naming convention with one of these systems?
Which system do others use and why?
I use Deployment Manager, as all 3rd party tools are reliant upon the presence of GCP APIs, as well as trusting that those APIs are in line with the actual functionality of the underlying GCP tech.
GCP is decidedly behind the curve on API development, which means that even if you wanted to use TF or whatever, at some point you're going to be stuck inside the SDK, anyway. So that's why I went with Deployment Manager, as much as I wanted to have my whole infra/app deployment use other tools that I was more comfortable with.
To specifically answer your question about validating naming schema, what you would probably want to do is write a wrapper script that uses the gcloud deployment-manager subcommand. Do your validation in the wrapper script, then run the gcloud deployment-manager stuff.
Word of warning about Deployment Manager: it makes troubleshooting very difficult. Very often it will obscure the error that can help you actually establish the root cause of a problem. I can't tell you how many times somebody in my office has shouted "UGGH! Shut UP with your Error 400!" I hope that Google takes note from my pointed survey feedback and refactors DM to pass the original error through.
Anyway, hope this helps. GCP has come a long way, but they've still got work to do.
I have a project in Google cloud using the following resources
-BigQuery, Google functions (Python), google storage, Cloud Scheduler
is it possible to save the whole project as code and share it, so someone else can just use that code and deploy it using his own tenant ?
the reason, I am asking, I have published all the code and SQL queries in Github, but some users find it very hard to reproduce, they are not necessarily very familiar with Google Cloud, in an ideal situation, they need just to get a file and click deploy ?
When you create a solution for GCP we will commonly find that it consists of code, data and configuration. The code and data you can save in a source repository like GitHub ... but what of the configuration? What if your "solution" expects to have BQ datasets and tables or GCS buckets or Scheduler jobs defined? This is where you can create "Infrastructure As Code" definitions. Google supports its own IaC technology called Deployment Manager but you can also use the popular Terraform as it too has a GCP provider. The definitions for these IaC coordinators are typically text / yaml files that you can also package with your code. Sprinkle in some Make, Chef, Puppet for building apps and pushing code to deployment environments and you have a "build it from source" story. Study also the concepts of CI/CD and you will commonly find that the steps you perform for building CI/CD overlap with the steps for trivial deployment.
There are also projects such as terraformer that can do some kind of a job of reverse engineering an existing configuration to create IaC description that, when run elsewhere, will recreate the configuration.
I have got a running AMI Ubuntu instance with CI/CD pipeline with Jenkins, Tomcat, Selenium, Sonar, Nagios etc .I have used elastic ip directly in different configuration to get it working.
Is there any easy and direct way to export this image to my local Ubuntu apart from AWS export/import service.
For what it's worth I don't agree with the down votes. Cloud is fast becoming "infrastructure as code" and I place emphasis on the code aspect. As the differences between programming and infrastructure management begin to fade (see FaaS - functions as a service) there needs to be room for having that conversation in this forum IMHO.
That said, let's move on to your questions. AWS provides instance export functionality though this solution may not fit all use cases. Two links below illustrate how to perform that export. YMMV. Good luck!
http://docs.aws.amazon.com/cli/latest/reference/ec2/create-instance-export-task.html
http://docs.aws.amazon.com/vm-import/latest/userguide/vmexport.html