How to use Google Compute Python API to create custom machine type or instance with GPU? - google-cloud-platform

I am just starting to look into GCP for cloud computing. So far I have been using AWS with the boto3 library, and I am trying to use the Google Python client API to launch instances.
An example I came across in their docs specifies the instance machine type as:
machine_type = "zones/%s/machineTypes/n1-standard-1" % zone
and it is then passed into the configuration as:
config = {
    'name': name,
    'machineType': machine_type,
    # ... more fields, as in the docs example ...
}
I wonder how one goes about specifying machines with GPUs, custom RAM, processors, etc. from the Python API?

The Python API is basically a wrapper around the REST API, so in the example code you are using, the config object is being built using the same schema as would be passed in the insert request.
Reading that document shows that the guestAccelerators structure is the relevant one for GPUs.
Custom RAM and CPUs are more interesting. There is a format for specifying a custom machine type name (you can see it in the gcloud documentation for creating a machine type). The format is:
[GENERATION]custom-[NUMBER_OF_CPUs]-[RAM_IN_MB]
Generation refers to the "n1" or "n2" in the predefined names. For n1 this block is empty; for n2 the prefix is "n2-". That said, experimenting with gcloud seems to indicate that an "n1-" prefix also works as you would expect.
So, for a 1-CPU n1 machine with 5 GB of RAM, it would be custom-1-5120. This is what you would replace the n1-standard-1 in your example with.
You are, of course, subject to the limits on specifying a custom machine type, such as the fact that RAM must be a multiple of 256 MB.
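Putting the two together, here is a rough sketch of what the config from the question might look like with a custom machine type and a GPU attached. The accelerator type below is an illustrative placeholder, and note that GPU instances must be set to terminate on host maintenance:

machine_type = "zones/%s/machineTypes/custom-1-5120" % zone  # 1 vCPU, 5 GB RAM

config = {
    'name': name,
    'machineType': machine_type,
    # guestAccelerators attaches GPUs; the type here is just an example.
    'guestAccelerators': [{
        'acceleratorType': "zones/%s/acceleratorTypes/nvidia-tesla-t4" % zone,
        'acceleratorCount': 1,
    }],
    # GPU instances cannot live-migrate, so this scheduling setting is required.
    'scheduling': {'onHostMaintenance': 'TERMINATE'},
    # ... disks, networkInterfaces, etc. as in the docs example ...
}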
Finally, there's a neat little feature at the bottom of the console "create instance" page: clicking the relevant link will show you the exact REST object you need to create the machine you have defined in the console at that very moment, which can be very useful for seeing how a particular parameter is used.

You can create a Compute Engine instance using the Compute Engine API, specifically the insert API request. This accepts a JSON payload in a REST request describing the desired VM instance. A full specification of the request is found in the docs. It includes:
machineType - specs of different (common) machines including CPUs and memory
disks - specs of disks to be added including size and type
guestAccelerators - specs for GPUs to add
many more options ...
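For completeness, a minimal sketch of driving that request from Python with the googleapiclient library; project, zone, and config are placeholders matching the docs example from the question:

import googleapiclient.discovery

compute = googleapiclient.discovery.build('compute', 'v1')

# Submit the insert request; it returns a zonal operation you can poll
# until the instance is ready.
operation = compute.instances().insert(
    project=project,
    zone=zone,
    body=config,
).execute()
print(operation['name'])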
You can also create a template describing the machine structure you want, and then simplify instance creation by naming the template to use, thereby abstracting the configuration details out of code and into configuration.
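As a hedged sketch of that idea: the insert call accepts a sourceInstanceTemplate parameter, so only the per-instance details need to live in code ('my-template' is a placeholder name):

# Continuing the sketch above (same compute client), this time creating an
# instance from a pre-defined template that carries the configuration.
operation = compute.instances().insert(
    project=project,
    zone=zone,
    sourceInstanceTemplate='global/instanceTemplates/my-template',
    body={'name': name},
).execute()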
Beyond using REST requests (which can be made from Python), you also have the capability to create Compute Engine instances from:
GCP Console - web interface
gcloud - command line (which I suspect can also be driven from within Python)
Deployment Manager - configuration driven deployment which includes Python as a template language
Terraform - popular environment for creating Infrastructure as Code environments

Related

How to manage multiple Environments within one project (GCP/AWS)

I am building a lab utility to deploy my team's development environments (testing / stress, etc.).
At present, the pipeline is as follows:
Trigger the pipeline via an HTTP request whose args contain the distribution, web server, and web server version; these are passed as ARGs to multi-stage Dockerfiles.
Docker Buildx builds the container (if it doesn't exist in ECR).
A pipeline job pushes that container to ECR (if it doesn't already exist).
Terraform deploys the container using Fargate, and sets up VPCs and an ALB to handle ingress externally.
FQDN / TLS is then provisioned on ...com
Previously, when I've made tools like this that create environments, they were managed and deleted solely at the project level: each environment had its own project, since that is best practice for isolation and billing-tracking purposes. However, given my company's organisational security constraints, I am limited to a single project in which I must create all the resources.
This means I have to find a way to manage/deploy 30 (the max) environments in one project without it being a bit of a clustered duck.
More or less, I am looking for a way to keep track of environments and tear them down (autonomously), along with their associated resources, by a certain identifier; most likely these environments can be separated by resource tags/groups.
It appears CDKTF/Pulumi look like a neat way of achieving some form of "high-level" structure, but I am struggling to find ways to use them to do what I want. If anyone can recommend an approach, it'd be appreciated.
I have not tried anything yet, mainly because this is something that requires planning before I start work on it (I don't like reaching dead ends, ha).
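For what it's worth, a minimal sketch of the tag-based idea using boto3's Resource Groups Tagging API; the 'Environment' tag key is an assumption, the idea being that everything Terraform creates is tagged with the environment's identifier:

import boto3

tagging = boto3.client("resourcegroupstaggingapi")

def resources_for_env(env_id):
    # Collect the ARNs of every resource tagged with this environment id,
    # as the starting point for an automated teardown.
    arns = []
    paginator = tagging.get_paginator("get_resources")
    for page in paginator.paginate(
        TagFilters=[{"Key": "Environment", "Values": [env_id]}]
    ):
        arns.extend(m["ResourceARN"] for m in page["ResourceTagMappingList"])
    return arns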

The environment variable "GOOGLE_APPLICATION_CREDENTIALS" in Google machines

Background
I have a virtual machine running code that uses the Google SDK for different products (like Google Pub/Sub). According to Google documentation, my machine should have an environment variable called GOOGLE_APPLICATION_CREDENTIALS whose value points to a plain-text file holding the service account key for the application.
I have done it and it's working for me.
The Problem
It sounds like an unsafe practice to store such a key, in plain text, inside a virtual machine. If the machine is hacked, this key will be one of the attacker's first targets.
I expected to find a solution to "hide" this key file, or at least encrypt it with a key that my application can read.
I found some code examples (C#) that allow the programmer to pass the credentials manually to the SDK functions. But it's not a standard way to do it, and it changes from one product to another (it seems impossible in some products).
What is the best practice to do it?
Have a good read of the following:
https://cloud.google.com/docs/authentication/production
This describes a concept called "Application Default Credentials" (ADC). The concept is that a Compute Engine instance (a virtual machine) has a default service account (which you can configure) associated with it. Applications running on the instance can thus make requests to other GCP services, and those requests will implicitly appear to come from the service account configured against the instance.
The key phrase in the article is:
If the environment variable GOOGLE_APPLICATION_CREDENTIALS isn't set, ADC uses the default service account that Compute Engine, Google Kubernetes Engine, App Engine, Cloud Run, and Cloud Functions provide.
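As a minimal sketch of what this means in practice (assuming the google-cloud-pubsub library and placeholder project/topic names), no key file or environment variable is needed on the VM; the client picks up the default service account on its own:

from google.cloud import pubsub_v1

# No credentials argument and no GOOGLE_APPLICATION_CREDENTIALS needed:
# on a Compute Engine VM, ADC falls back to the VM's default service account.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "my-topic")
future = publisher.publish(topic_path, b"hello from ADC")
print(future.result())  # the message id, once the publish succeeds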

Creating a duplicate of a VM

I'm preparing to get in to the world of cloud computing.
My first question is:
Is it possible to programmatically create a new, or duplicate an existing VM from my server?
Project Background
I provide a file processing service, and as it's been growing I need to offer a better service.
Project Requirement
Machine specs:
HDD: Min 16 GB
CPU: Min 1 core
RAM: Min 2 GB
GPU: Min CUDA 10.1 compatible
What I'm thinking is the following steps:
User uploads a file
A dedicated VM is created for that specific file inside Google Cloud Compute
The file is sent to the VM
File is processed using an Anaconda environment
Results are downloaded to local server
Dedicated VM is removed
Results are served to user
How is this accomplished?
PS: I'm looking for resources and advice. Not code.
Your question is a perfect formulation of the concept of Google Cloud Run. At the highest level concept, you create a Docker image (think of it like a VM) and then register that Docker image with GCP Cloud Run. When a trigger occurs, GCP will spin up an instance of that Docker container and pass in information about the cause of that trigger (a file created in GCS or a REST request or others ...). What you do in your container is up to you. You have full power of the Linux environment (under Docker) to do as you like. When your request ends, the container is spun down. You are only billed for the compute resources you use. If your container (VM) isn't being used, you pay nothing until the next trigger.
An alternative to Cloud Run is Cloud Functions. This is a higher level abstraction where instead of providing a Docker container, you provide the body of a function (JavaScript, Java, Python or others) and the request is passed to that function when a trigger occurs. Which you use is mostly personal choice (you didn't elaborate on "File is processed").
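As a rough sketch of the Cloud Functions route (the body is a placeholder, since "File is processed" wasn't elaborated), a background function triggered by a file landing in a GCS bucket looks like this:

def handle_upload(event, context):
    # Triggered by a google.storage.object.finalize event on a bucket.
    bucket = event["bucket"]
    name = event["name"]
    print("Processing gs://%s/%s" % (bucket, name))
    # ... download the file, process it, upload the results ...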
References:
Cloud Run
Cloud Functions

Kubernetes: how to properly change apiserver runtime settings

I'm using kube-aws to run a Kubernetes cluster on AWS, and everything works as expected.
Now, I realize that cron jobs aren't turned on in the version I'm using (v1.7.10_coreos.0), while the documentation for Kubernetes only states the following:
For previous versions of cluster (< 1.8) you need to explicitly enable batch/v2alpha1 API by passing --runtime-config=batch/v2alpha1=true to the API server (see Turn on or off an API version for your cluster for more).
And the documentation directed to in that text only states this (it's the actual, full documentation):
Specific API versions can be turned on or off by passing --runtime-config=api/ flag while bringing up the API server. For example: to turn off v1 API, pass --runtime-config=api/v1=false. runtime-config also supports 2 special keys: api/all and api/legacy to control all and legacy APIs respectively. For example, for turning off all API versions except v1, pass --runtime-config=api/all=false,api/v1=true. For the purposes of these flags, legacy APIs are those APIs which have been explicitly deprecated (e.g. v1beta3).
I have been unsuccessful in finding information about how to change the configuration of a running cluster, and I, of course, don't want to try to re-run the command on api-server.
Note that kube-aws still uses hyperkube, not kubeadm. Also, the /etc/kubernetes/manifests directory only contains the ssl directory.
The setting I want to apply is this: --runtime-config=batch/v2alpha1=true
What is the proper way, preferably using kubectl, to apply this setting and have the apiservers restarted?
Thanks.
batch/v2alpha1=true is set by default in kube-aws; you can find it in the kube-aws source.
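If you want to verify what the running apiserver exposes without touching its flags, here is a small sketch using the official kubernetes Python client (assuming a working kubeconfig) that lists the enabled API groups and versions, much like kubectl api-versions:

from kubernetes import client, config

config.load_kube_config()
# Print every API group/version the apiserver currently serves;
# batch/v2alpha1 appearing here means the runtime-config flag is in effect.
for group in client.ApisApi().get_api_versions().groups:
    for v in group.versions:
        print("%s/%s" % (group.name, v.version))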

Why ec2 instances get metadata over REST?

EC2 documentation says that it is possible to get instance-specific metadata, such as instance-id, ami-id, etc., over a REST call to 169.254.169.254. My question is: since this information is available to the instance at launch time, why doesn't AWS just write it to a file in the /etc dir?
There are a few reasons:
Some of the information is dynamic, meaning it changes over time. Using a REST interface allows software that wants the information to get the current values whenever it needs them.
Amazon adds new metadata and new metadata versions from time to time. Using a REST interface lets you query the data and version you want without needing to bake the new data schema and version into the OS AMI.
The information needs to come from somewhere. Even if it's placed in a file in the OS, it has to get there somehow; a REST interface is simply that conduit.
That data does actually exist in the file system, though its location can vary from one AMI type to another. Using the RESTful interface gives you a more reliable way of getting at the data.
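To illustrate, a minimal sketch of reading the metadata from Python with the requests library, using the IMDSv2 session-token step that current AWS guidance prefers:

import requests

# IMDSv2: fetch a short-lived session token first, then use it for reads.
token = requests.put(
    "http://169.254.169.254/latest/api/token",
    headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    timeout=2,
).text
instance_id = requests.get(
    "http://169.254.169.254/latest/meta-data/instance-id",
    headers={"X-aws-ec2-metadata-token": token},
    timeout=2,
).text
print(instance_id)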