Background
I have a virtual machine running code that uses the Google SDK for different products (like Google Pub/Sub). According to the Google documentation, my machine should have an environment variable called GOOGLE_APPLICATION_CREDENTIALS whose value points to a plain-text file holding the service account key of the application.
I have done this, and it works for me.
The Problem
Storing such a key in plain text inside a virtual machine sounds like an unsafe practice. If the machine is compromised, this key will be one of the attacker's first targets.
I expected to find a solution that "hides" this key file, or at least encrypts it with a key that my application can read.
I found some code examples (C#) that let the programmer pass the credentials manually to the SDK functions, but that is not the standard way to do it: it changes from one product to another, and seems impossible with some products.
What is the best practice here?
Have a good read of the following:
https://cloud.google.com/docs/authentication/production
This describes a concept called "Application Default Credentials" (ADC). The idea is that a Compute Engine instance (a virtual machine) has a default service account associated with it (which you can configure). Applications running on that instance can then make requests to other GCP services, and those requests implicitly appear to come from the service account configured on the instance.
The key phrase in the article is:
If the environment variable GOOGLE_APPLICATION_CREDENTIALS isn't set, ADC uses the default service account that Compute Engine, Google Kubernetes Engine, App Engine, Cloud Run, and Cloud Functions provide.
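For example, on a Compute Engine VM with a service account attached, a client can be constructed without referencing any key file at all. A minimal sketch in Python with the Pub/Sub client library (the project and topic names are placeholders; the other client libraries behave the same way):

from google.cloud import pubsub_v1

# No key file and no GOOGLE_APPLICATION_CREDENTIALS are needed here: on a
# Compute Engine VM, ADC obtains tokens for the VM's attached service
# account from the metadata server automatically.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "my-topic")  # placeholders
publisher.publish(topic_path, b"hello").result()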
Related
I am new to Google Cloud and this is my first experience with this platform (before, I was using Azure).
I am working on a C# project that has a requirement to save images online, and for that I created a Cloud Storage bucket.
Now, to use the service, I found out that I have to download a service account credential file and set the path to that file in the environment variable.
This is working fine:
RxStorageClient = StorageClient.Create();
But the problem is that my whole solution is a collection of 27 different projects, there are multiple Cloud Storage accounts involved, and I also want to use them with Docker.
So I was wondering: is there any alternative to this service account system, like the API keys or connection strings that Azure provides?
I saw that this initialization function has some other options to authenticate, but I didn't see any example:
RxStorageClient = StorageClient.Create();
Can anyone please provide a proper example of connecting to the Cloud Storage service without this service account file system?
Instead of relying on the environment variable, you can download a credential file for each project you need to access and pass it to the client explicitly.
So, for example, if you have three projects whose storage you want to access, you'd need code paths that initialize the StorageClient with the appropriate service account key for each of those projects.
StorageClient.Create() can take an optional GoogleCredential object to authorize it (if you don't specify one, it uses the Application Default Credentials, which the GOOGLE_APPLICATION_CREDENTIALS environment variable is one way of setting).
So on GoogleCredential, check out the static FromFile(String) method, where the String is the path to the service account JSON file.
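To illustrate the pattern, here is a sketch in Python with the google-cloud-storage library (the key file path and project ID are placeholders); the C# shape is analogous, passing the result of GoogleCredential.FromFile(path) into StorageClient.Create():

from google.cloud import storage
from google.oauth2 import service_account

# Load credentials explicitly from a service account key file instead of
# relying on the GOOGLE_APPLICATION_CREDENTIALS environment variable.
creds = service_account.Credentials.from_service_account_file(
    "/path/to/project-a-key.json")  # placeholder path
client = storage.Client(project="project-a", credentials=creds)  # placeholder project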
There are no such examples. Service accounts are absolutely required, even if hidden from view, to deal with Google Cloud products. They're part of the IAM system for authenticating and authorizing various pieces of software for use with various products. I strongly suggest that you become familiar with the mechanisms of providing a service account to a given program. For code running outside of Google Cloud compute and serverless products, the current preferred solution is using an environment variable to point to a file that contains the credentials. For code running on Google Cloud (like Cloud Run, Compute Engine, or Cloud Functions), a service account can be provided by configuration, so that the code doesn't need to do anything special.
I am just looking into using GCP for cloud computing. So far I have been using AWS with the boto3 library, and I was trying to use the Google Python client API to launch instances.
An example I came across in their docs specifies the instance machine type as:
machine_type = "zones/%s/machineTypes/n1-standard-1" % zone
and then it is passed into the configuration as:
config = {
'name': name,
'machineType': machine_type,
....
I wonder, how does one go about specifying machines with GPUs, custom RAM, custom CPU counts, etc. from the Python API?
The Python API is basically a wrapper around the REST API, so in the example code you are using, the config object is built using the same schema as the body passed to the insert request.
Reading that document shows that the guestAccelerators structure is the relevant one for GPUs.
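For example, extending the config dict from the question with a GPU might look like the sketch below (the accelerator type is just an example; check which types your zone offers, e.g. with `gcloud compute accelerator-types list`):

config = {
    'name': name,
    'machineType': machine_type,
    'guestAccelerators': [{
        'acceleratorType': 'zones/%s/acceleratorTypes/nvidia-tesla-t4' % zone,
        'acceleratorCount': 1,
    }],
    # GPU instances cannot live-migrate, so host maintenance must terminate them.
    'scheduling': {'onHostMaintenance': 'TERMINATE'},
    # ... disks, networkInterfaces, etc. as before
}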
Custom RAM and CPUs are more interesting. There is a format for specifying a custom machine type name (you can see it in the gcloud documentation for creating a machine type). The format is:
[GENERATION]custom-[NUMBER_OF_CPUs]-[RAM_IN_MB]
Generation refers to the "n1" or "n2" in the predefined names. For n1, this block is empty; for n2, the prefix is "n2-". That said, experimenting with gcloud suggests that an "n1-" prefix also works as you would expect.
So, for a 1-CPU n1 machine with 5 GB of RAM, the name would be custom-1-5120. This is what you would replace the n1-standard-1 with in your example.
You are, of course, subject to the limits on how a custom machine can be specified, such as the fact that the RAM must be a multiple of 256 MB.
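Putting it together, this is what the machine_type line from the question would become (5120 MB = 5 * 1024 MB, which satisfies the 256 MB rule):

# 1 vCPU, 5 GB RAM, n1 generation (empty generation prefix)
machine_type = "zones/%s/machineTypes/custom-1-5120" % zone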
Finally, there's a neat little feature at the bottom of the console's "create instance" page: clicking on the relevant link will show you the exact REST object needed to create the machine you have defined in the console at that very moment, so it can be very useful for seeing how a particular parameter is used.
You can create a Compute Engine instance using the Compute Engine API, specifically the insert API request. This accepts a JSON payload in a REST request that describes the desired VM instance (a minimal sketch follows the list below). A full specification of the request is found in the docs. It includes:
machineType - specs of different (common) machines including CPUs and memory
disks - specs of disks to be added including size and type
guestAccelerators - specs for GPUs to add
many more options ...
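As a minimal sketch, an insert call through the Python client library might look like this (the project ID, zone, image, and instance name are placeholders):

from googleapiclient import discovery

# Builds a client for the Compute Engine REST API; auth comes from ADC.
compute = discovery.build('compute', 'v1')

config = {
    'name': 'example-instance',
    'machineType': 'zones/us-central1-a/machineTypes/n1-standard-1',
    'disks': [{
        'boot': True,
        'autoDelete': True,
        'initializeParams': {
            'sourceImage': 'projects/debian-cloud/global/images/family/debian-11',
        },
    }],
    'networkInterfaces': [{'network': 'global/networks/default'}],
}

operation = compute.instances().insert(
    project='my-project', zone='us-central1-a', body=config).execute()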
One can also create a template describing the machine structure you want, and then simplify the creation of an instance by naming the template to use, thereby abstracting the configuration details out of code and into configuration.
Beyond using REST requests (which can be made from Python), you also have the ability to create Compute Engine instances from:
GCP Console - web interface
gcloud - command line (which I suspect can also be driven from within Python)
Deployment Manager - configuration driven deployment which includes Python as a template language
Terraform - popular environment for creating Infrastructure as Code environments
In Cloud Shell, which is part of Google Cloud, I set the environment variable GOOGLE_APPLICATION_CREDENTIALS because it is needed for a PHP NLP project (info: https://cloud.google.com/natural-language/docs/quickstart-client-libraries#client-libraries-install-php). My project worked fine, but I noticed that the GOOGLE_APPLICATION_CREDENTIALS variable lasts on my system for only one day. This is the third time I have had to set it, and my project doesn't work when the required variable is missing. Am I doing something wrong?
EDIT:
It is the default OS (Debian) that you get when you create a new app on Google App Engine.
When I type help in Cloud Shell I get info including:
Your 5GB home directory will persist across sessions, but the VM is ephemeral and will be reset
approximately 20 minutes after your session ends. No system-wide change will persist beyond that.
You are completely right: Cloud Shell runs on an ephemeral instance that is reset some minutes after the session ends, which is why you are losing the content of the environment variable you mentioned.
The documentation about limitations in Cloud Shell clearly states that it is intended for interactive use only, and any non-interactive session or intensive usage can be automatically terminated with (or without) a warning.
Therefore, understanding from your question that you have a background script working with Cloud Natural Language, I would strongly advise you to move to a "real" Compute Engine instance, where you will have much more control over what is happening. This also gives you more flexibility, such as being able to use a bigger machine type, given that Cloud Shell runs on a g1-small GCE instance which, in general, is not enough to run an application. Depending on your use case, you may even consider App Engine.
That being said, I have found that when constructing the LanguageClient instance, you can also avoid Application Default Credentials and instead use the keyFile or keyFilePath options (explained in the PHP Client Library reference) to pass the path to the JSON key directly in your code, instead of reading it from the environment variable.
Assuming you are using Linux, make sure that:
The system is not being restarted, and if it is, make sure the environment variable is set permanently (see how to set permanent environment variables), for example by exporting it from a shell startup file such as ~/.bashrc; in Cloud Shell the home directory persists across sessions, so the variable would be re-set each time a new session starts.
How do we define credentials in a Java program that connects to Google Cloud Platform to execute code?
There is a standard way of setting the GOOGLE_APPLICATION_CREDENTIALS environment variable, but I want to define the credentials in code. Any suggestions?
Thanks for your response. Understood: defining credentials in code is not recommended by GCP, so I will use ADC (Application Default Credentials).
Adding more info:
Providing credentials to your application
GCP client libraries use a strategy called Application Default Credentials (ADC) to find your application's credentials. When your code uses a client library, the strategy checks for your credentials in the following order:
First, ADC checks to see if the environment variable GOOGLE_APPLICATION_CREDENTIALS is set. If the variable is set, ADC uses the service account file that the variable points to.
If the environment variable isn't set, ADC uses the default service account that Compute Engine, Kubernetes Engine, App Engine, and Cloud Functions provide, for applications that run on those services.
If ADC can't use either of the above credentials, an error occurs.
The following code example illustrates this strategy. The example doesn't explicitly specify the application credentials. However, ADC is able to implicitly find the credentials as long as the GOOGLE_APPLICATION_CREDENTIALS environment variable is set, or as long as the application is running on Compute Engine, Kubernetes Engine, App Engine, or Cloud Functions.
Java Code:
import com.google.api.gax.paging.Page;
import com.google.cloud.storage.Bucket;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

static void authImplicit() {
  // If you don't specify credentials when constructing the client, the client
  // library looks for credentials via the environment variable
  // GOOGLE_APPLICATION_CREDENTIALS, falling back to the default service account
  // on Compute Engine, Kubernetes Engine, App Engine, and Cloud Functions.
  Storage storage = StorageOptions.getDefaultInstance().getService();

  System.out.println("Buckets:");
  Page<Bucket> buckets = storage.list();
  for (Bucket bucket : buckets.iterateAll()) {
    System.out.println(bucket.toString());
  }
}
You can find all these details at this GCP link: https://cloud.google.com/docs/authentication/production#auth-cloud-app-engine-java
I'm working on a web app and I want to migrate it to a virtual machine scale set in the Windows Azure cloud. I'm new to cloud computing, and so far I haven't found a proper tutorial about virtual machine scale sets. Please can someone help me with this?
A few things to consider..
You could build a custom VM which contains the complete app, or you could use VM extensions to deploy the app on a platform image each time a new VM in the scale set is deployed. See: https://msftstack.wordpress.com/2016/04/20/deploying-applications-in-azure-vm-scale-sets/ for some thoughts on this. Ultimately it might depend on how much you need to install over a base image, and how fast you want scaling to be.
Do you need autoscale based on resource usage or do you plan to manually increase/decrease the number of VMs in the set? See https://azure.microsoft.com/en-us/documentation/articles/virtual-machine-scale-sets-windows-autoscale/
A good way to get started with scale sets is to deploy an existing template directly from Azure Quick start templates. Look at https://github.com/Azure/azure-quickstart-templates and search for vmss. These templates will give you an idea of some of the options you have.
To learn the basics about VM Scale Sets, start with the documentation page: https://azure.microsoft.com/documentation/services/virtual-machine-scale-sets/ and the GA announcement: https://azure.microsoft.com/en-us/blog/azure-virtual-machine-scale-sets-ga/
Also look at higher-level services like the Azure Web App service if you haven't already; the advantage of a higher-level service is that some of the basic web app operations are taken care of for you: https://azure.microsoft.com/en-us/services/app-service/web/