Dataproc: Submit a Spark Job through REST API

We are using Google Cloud Platform for big-data analytics. For processing we currently use Google Cloud Dataproc and Spark Streaming.
I want to submit a Spark job using the REST API, but when I call the URI with the API key, I get the error below:
{
"error": {
"code": 403,
"message": "The caller does not have permission",
"status": "PERMISSION_DENIED"
}
}
URI :- https://dataproc.googleapis.com/v1/projects/orion-0010/regions/us-central1-f/clusters/spark-recon-1?key=AIzaSyA8C2lF9kT*************SGxAipL0
I created the API key from the Google console > API manager.

While API keys can be used for associating calls with a developer project, they're not actually used for authorization. Dataproc's REST API, like most other billable REST APIs within Google Cloud Platform, uses oauth2 for authentication and authorization. If you want to call the API programmatically, you'll likely want to use one of the client libraries, such as the Java SDK for Dataproc, which provides convenience wrappers around the low-level JSON protocols and gives you handy thick libraries for using oauth2 credentials.
You can also experiment with the direct REST API using Google's API explorer where you'll need to click the button on the top right that says "Authorize requests using OAuth 2.0".
I also noticed you used us-central1-f under the regions/ path for the Dataproc URI. Note that Dataproc's regions don't map one-to-one with Compute Engine zones or regions; rather, each Dataproc region contains multiple Compute Engine zones or regions. Currently there is only one Dataproc region available publicly, which is called global and is capable of deploying clusters into all Compute Engine zones.
For an easy illustration of using an oauth2 access token, you can simply use curl along with gcloud if you have the gcloud CLI installed:
PROJECT='<YOUR PROJECT HERE>'
ACCESS_TOKEN=$(gcloud auth application-default print-access-token)
curl \
--header "Authorization: Bearer ${ACCESS_TOKEN}" \
--header "Content-Type: application/json" \
https://dataproc.googleapis.com/v1/projects/${PROJECT}/regions/global/clusters
Keep in mind that the ACCESS_TOKEN printed by gcloud here expires by nature (typically after about an hour). The key concept is that the token you pass along in HTTP headers for each request will generally be a "short-lived" token, and by design you'll have code which separately fetches new tokens whenever the access tokens expire, using a "refresh token"; this helps protect against accidentally compromising long-lived credentials. This "refresh" flow is part of what the thick auth libraries handle under the hood.
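To connect this back to the original question (submitting a Spark job), here is a minimal Python sketch of the same bearer-token pattern against the Dataproc v1 jobs:submit endpoint. It only builds the request and does not send it. The function name, the main class and the jar URI are hypothetical placeholders, and the access token is assumed to come from gcloud or an auth library as described above.

```python
import json
import urllib.request


def build_spark_submit_request(project, region, cluster, access_token,
                               main_class, jar_uris):
    """Build (but do not send) a Dataproc v1 jobs:submit request."""
    url = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project}/regions/{region}/jobs:submit"
    )
    body = {
        "job": {
            # Which cluster should run the job.
            "placement": {"clusterName": cluster},
            # Spark-specific settings; values are supplied by the caller.
            "sparkJob": {"mainClass": main_class, "jarFileUris": jar_uris},
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # OAuth2 access token, not an API key.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    req = build_spark_submit_request(
        project="<YOUR PROJECT HERE>",
        region="global",
        cluster="spark-recon-1",
        access_token="<ACCESS_TOKEN>",
        main_class="org.example.Main",          # hypothetical class
        jar_uris=["gs://my-bucket/my-app.jar"],  # hypothetical jar
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to swap in a fresh token when the old one expires.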

Related

Access GCP cloud Storage bucket from docker container without using gcloud utility

I have a Dockerfile that needs to access a GCP bucket. I do not want to authenticate to GCP using a service account key and the gcloud utility (gcloud auth activate-service-account <<gcp account>> --key-file=<<serviceaccount>>.json) due to security violations.
I want to use a different authentication approach with security compliance.
Could anyone please help with the same?
You can try API authentication to access GCP bucket instead of using ‘gsutil’ and client library authentication.
To make requests using OAuth 2.0 to either the Cloud Storage XML API or JSON API, include your application's access token in the Authorization header in every request that requires authentication. You can generate an access token from the OAuth 2.0 Playground.
Authorization: Bearer OAUTH2_TOKEN
Use the list method of the Objects resource.
GET /storage/v1/b/example-bucket/o HTTP/1.1
Host: www.googleapis.com
Authorization: Bearer ya29.AHES…….gSwg
To authorize requests from the command line or for testing, you can use the curl command with the following syntax:
curl -H "Authorization: Bearer OAUTH2_TOKEN" "https://storage.googleapis.com/storage/v1/b/BUCKET_NAME/o"
For local testing, you can use the ‘gcloud auth application-default print-access-token’ command to generate a token.
Due to the complexity of managing and refreshing access tokens and the security risk when dealing directly with cryptographic applications, we strongly encourage you to use a verified client library.
See the Cloud Storage authentication documentation for additional information.
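As a rough Python equivalent of the curl call above, the following sketch builds the objects list request with a bearer token and extracts object names from a response body. The function names are made up for this example, and the token is assumed to be obtained separately (for instance from the gcloud command mentioned above). Nothing is sent over the network here.

```python
import json
import urllib.parse
import urllib.request


def build_list_objects_request(bucket, access_token, prefix=None):
    """Build (but do not send) a Cloud Storage JSON API objects list request."""
    url = (
        "https://storage.googleapis.com/storage/v1/b/"
        + urllib.parse.quote(bucket, safe="")
        + "/o"
    )
    if prefix:
        # Optional filter: only list objects whose names start with prefix.
        url += "?" + urllib.parse.urlencode({"prefix": prefix})
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


def object_names(response_body):
    """Return the object names found in an objects list JSON response."""
    return [item["name"] for item in json.loads(response_body).get("items", [])]


if __name__ == "__main__":
    req = build_list_objects_request("BUCKET_NAME", "OAUTH2_TOKEN")
    print(req.full_url)
    # To actually send it:
    # with urllib.request.urlopen(req) as resp:
    #     print(object_names(resp.read()))
```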

How do I allow a team member to use my Google Cloud Speech-to-Text API account?

I enabled the Google Cloud Speech-to-Text API, but I would like to allow a team member to use it on my account.
I went into IAM to add a new user, but I don't see any roles related to Cloud Speech-to-Text API. What IAM role(s) do I need to select to allow the new team member access to the API?
The Speech-to-Text API is a special (old?) API at Google and doesn't require a role. The API URL also doesn't require a project definition. So you need an account linked to the project to be able to reach the Speech-to-Text API.
For this, a service account is the account to use, so the users need to use a service account to reach the API. To avoid generating a service account key file (a potential source of security issues), prefer impersonation.
With the gcloud CLI you can do this to generate a valid access token on behalf of the service account:
gcloud auth print-access-token --impersonate-service-account=<the service account to impersonate>
So, in your API call, from your computer (I mean with your own user credentials), you can do this:
curl -d '<your request JSON>' -H "content-type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token --impersonate-service-account=<the service account to impersonate>)" \
https://speech.googleapis.com/v1/speech:recognize
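The same flow can be sketched in Python: shell out to gcloud for an impersonated token, then build the speech:recognize request. This assumes gcloud is installed and that your user is allowed to impersonate the service account. The function names and the gs:// URI are hypothetical, and the request body shown is a minimal one (language code plus an audio URI).

```python
import json
import subprocess
import urllib.request


def impersonation_token_command(service_account):
    """Return the gcloud command that prints an impersonated access token."""
    return [
        "gcloud", "auth", "print-access-token",
        f"--impersonate-service-account={service_account}",
    ]


def fetch_impersonated_token(service_account):
    """Run gcloud and return the access token it prints on stdout."""
    result = subprocess.run(
        impersonation_token_command(service_account),
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def build_recognize_request(access_token, gcs_uri, language_code="en-US"):
    """Build (but do not send) a Speech-to-Text v1 speech:recognize request."""
    body = {
        "config": {"languageCode": language_code},
        "audio": {"uri": gcs_uri},
    }
    return urllib.request.Request(
        "https://speech.googleapis.com/v1/speech:recognize",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder token; in practice use fetch_impersonated_token(...).
    req = build_recognize_request("<ACCESS_TOKEN>", "gs://my-bucket/audio.flac")
    print(req.get_method(), req.full_url)
```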

GCP Python SDK - Enable APIs without gcloud

I have used the documentation here https://cloud.google.com/endpoints/docs/frameworks/enable-api#gcloud to enable new APIs via the gcloud CLI tool. I have a service account with the owner role over my project. Is it possible, from within the Python SDK, to enable APIs without using gcloud? Or is it possible to call the gcloud CLI tool from the Python SDK?
You can, by using the discovery API client library for the Service Usage API.
You can also perform an authenticated call to the Service Usage API enable service endpoint.
Let me know if you have issues and want a code example.
According to the official documentation:
Enabling APIs
You can enable Cloud APIs for a project using the console
You can also enable and disable Cloud APIs using Cloud SDK and Service Usage API
Therefore I would recommend making an HTTP request using Python to this URL: POST https://serviceusage.googleapis.com/v1/{name=*/*/services/*}:enable
This is the curl command I used to enable analytics.googleapis.com API
OAUTH_TOKEN=`gcloud auth print-access-token`
curl -X POST -H "Authorization: Bearer $OAUTH_TOKEN" -H "Content-Type: application/json" https://serviceusage.googleapis.com/v1/projects/your-project/services/analytics.googleapis.com:enable
Response body
.......
"state": "ENABLED",
"parent": "projects/xxxxxxx"
.
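Since the question asks for Python, here is a minimal standard-library sketch of the same POST that the curl command performs. It only builds the request. The function name is invented for this example, and the access token is assumed to be obtained elsewhere (for instance from the service account's credentials via an auth library).

```python
import urllib.request


def build_enable_service_request(project_id, service_name, access_token):
    """Build (but do not send) a Service Usage v1 services.enable request."""
    url = (
        "https://serviceusage.googleapis.com/v1/"
        f"projects/{project_id}/services/{service_name}:enable"
    )
    return urllib.request.Request(
        url,
        data=b"{}",  # the enable call takes an empty JSON body
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_enable_service_request(
        "your-project", "analytics.googleapis.com", "<OAUTH_TOKEN>"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```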

Make authenticated requests with a slackbot

I followed the tutorial on the Google Cloud Run page and I have created a small, private Google Cloud Run API. Now I can use curl as described here to make requests to my API:
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" SERVICE_URL
So far so good. Now I would like to build a slackbot. The slackbot should respond to slashcommands and whenever a certain slashcommand is issued it should 1) authenticate itself with the API and then 2) issue a command.
Is that possible? I looked around in the entire Slack API documentation, but could not find an example in which a Slack Bot had to authenticate itself with another service. Could someone maybe point me to a guide/tutorial where the author implemented a private API in the Google Cloud that is called from a slackbot?
It's not possible. Instead of giving Slack the ability to make an authenticated request to your Cloud Run instance, configure it to allow unauthenticated access and instead validate that the event from Slack is valid by validating the token provided in the request.
This is described in Slack's Events API documentation:
token: The shared-private callback token that authenticates this callback to the application as having come from Slack. Match this against what you were given when the subscription was created. If it does not match, do not process the event and discard it.
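The token check described in that quote could look roughly like this in Python. This is a sketch: the function names are hypothetical, the payload is assumed to be already parsed into a dict with a top-level "token" field, and the expected token is assumed to be loaded from your own configuration.

```python
import hmac


def is_valid_slack_token(received_token, expected_token):
    """Compare the callback token with the one issued at subscription time."""
    if not isinstance(received_token, str) or not isinstance(expected_token, str):
        return False
    if not received_token or not expected_token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        received_token.encode("utf-8"), expected_token.encode("utf-8")
    )


def handle_callback(payload, expected_token):
    """Return the event to process, or None if the callback must be discarded."""
    if not is_valid_slack_token(payload.get("token"), expected_token):
        return None
    return payload.get("event")


if __name__ == "__main__":
    expected = "token-from-slack-app-settings"  # placeholder value
    good = {"token": expected, "event": {"type": "app_mention"}}
    bad = {"token": "something-else", "event": {"type": "app_mention"}}
    print(handle_callback(good, expected))
    print(handle_callback(bad, expected))
```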

Getting Google IdToken for service account?

I have a backend that is serving android clients, authenticating them with IdToken sent from the android app.
Now I need to authenticate a service running on AWS that is using my APIs. So I figured a service account would do the trick, using the private PEM file to create an IdToken and send it along just as the Android clients do. But I find no way of obtaining an IdToken with these credentials. Is this possible (preferably in Node.js)?
Or am I on the wrong path here?
I know this is older, but I found this question and it didn't lead me to the answer I ended up with.
I followed the guide in https://cloud.google.com/endpoints/docs/openapi/service-account-authentication#using_a_google_id_token with some mix of https://cloud.google.com/iap/docs/authentication-howto, which mentioned that the key to this was to include a target_audience claim in the generated JWT.
So, essentially I made a JWT that looked like:
{
"exp": 1547576771,
"iat": 1547575906,
"aud":"https://www.googleapis.com/oauth2/v4/token",
"target_audience": "https://example.com/",
"iss": EMAIL OF SERVICE ACCOUNT
}
and posted that to https://www.googleapis.com/oauth2/v4/token with params grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer and assertion=<THE JWT>
Without target_audience the endpoint gave me an access token, but with it I got an id_token instead.
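A rough Python sketch of the pieces described above: the claim set with target_audience, the unsigned header.payload part of the JWT, and the form body for the token endpoint. The RS256 signature over the signing input is deliberately left out, since it needs the service account's private key and a crypto library. Function names and the default lifetime are assumptions made for this example.

```python
import base64
import json
import time
import urllib.parse

TOKEN_URL = "https://www.googleapis.com/oauth2/v4/token"


def b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_claims(sa_email, target_audience, now=None, lifetime=3600):
    """Claim set that asks the token endpoint for an id_token."""
    now = int(time.time()) if now is None else int(now)
    return {
        "iss": sa_email,
        "aud": TOKEN_URL,
        "target_audience": target_audience,
        "iat": now,
        "exp": now + lifetime,
    }


def signing_input(claims):
    """Return '<header>.<payload>'; an RS256 signature must still be appended."""
    header = {"alg": "RS256", "typ": "JWT"}
    encode = lambda obj: b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
    return encode(header) + "." + encode(claims)


def token_request_body(signed_jwt):
    """Form-encoded body to POST to TOKEN_URL."""
    return urllib.parse.urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": signed_jwt,
    })


if __name__ == "__main__":
    claims = build_claims("<EMAIL OF SERVICE ACCOUNT>", "https://example.com/")
    print(signing_input(claims))
```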
Greetings from 2020.
I had problems in Java getting an ID_TOKEN for a Google service account. My project was two years old and I was using GoogleCredentials with the fromStream method and a JSON credential, but this class didn't give me an ID_TOKEN, only an access_token that is not in JWT format.
I solved it because in recent years Google updated its Java code for authentication; to get an ID_TOKEN you must use this library: https://github.com/googleapis/google-auth-library-java
<dependency>
<groupId>com.google.auth</groupId>
<artifactId>google-auth-library-oauth2-http</artifactId>
<version>0.20.0</version>
</dependency>
And then use ServiceAccountCredentials:
String credPath = "/path/to/svc_account.json";
ServiceAccountCredentials sourceCredentials = ServiceAccountCredentials
.fromStream(new FileInputStream(credPath));
When you create this object, it can authenticate with Google and obtain an access_token, refreshing it as needed.
To extract the ID_TOKEN you must use this function:
String audience = "http://localhost"; //Your server domain
IdToken idToken = sourceCredentials.idTokenWithAudience(audience, new ArrayList<IdTokenProvider.Option>());
String id_token = idToken.getTokenValue();
And with this you have a JWT token.
I hope this helps people like me who are trying to get an ID_TOKEN.
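To check that the value you got back really is a JWT with the audience you asked for, you can inspect its payload. The following Python sketch only decodes the middle segment and does not verify the signature, so it is for debugging rather than for authenticating callers. The function name is made up for this example.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    # Restore the base64 padding that JWTs strip.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


if __name__ == "__main__":
    # Build a fake, unsigned token purely to demonstrate the decoding.
    fake_claims = {"aud": "http://localhost"}
    segment = base64.urlsafe_b64encode(
        json.dumps(fake_claims).encode("utf-8")
    ).rstrip(b"=").decode("ascii")
    print(decode_jwt_payload("e30." + segment + ".signature"))
```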
You cannot use service accounts generated for Google Cloud APIs to directly authenticate against your own APIs. How will you know which service account private keys are valid and which have been revoked? Google does not expose this information.
Service accounts are rather meant for delegation of credentials. When you access a Google Cloud Platform service, you authenticate with your Google account credentials. You will not want to provision those very same credentials everywhere your running code needs to access any of the Google Cloud services (i.e. Cloud APIs). Instead, you create service accounts whose scope can be reduced to a subset of the scope of your Google account credentials. This way a particular piece of code can be limited to a small set of APIs.
Service Accounts
A service account is a special account that can be used by services
and applications running on your Google Compute Engine instance to
interact with other Google Cloud Platform APIs. Applications can use
service account credentials to authorize themselves to a set of APIs
and perform actions within the permissions granted to the service
account and virtual machine instance.
What are service accounts?
Service accounts authenticate applications running on your virtual
machine instances to other Google Cloud Platform services. For
example, if you write an application that reads and writes files on
Google Cloud Storage, it must first authenticate to the Google Cloud
Storage API. You can create a service account and grant the service
account access to the Cloud Storage API. Then, you would update your
application code to pass the service account credentials to the Cloud
Storage API. In this way, your application authenticates seamlessly to
the API without embedding any secret keys or user credentials in your
instance, image, or application code.
I know where your confusion stems from: service accounts also use the same OAuth model you are used to.
You can use service accounts to get access tokens and refresh them as needed, but the scope of authentication is at the very maximum limited to the surface of the Google Cloud APIs. You will not be able to mix and match your APIs with that.
The alternative is to either build your own authentication model (your current one is not so clear from your question, where you say you are authenticating them with an IdToken sent from the Android app) or rely on something like Cloud Endpoints, with which you create and manage APIs along with API keys for authentication.
As you already mentioned in one of your comments, you can follow the Service-to-Service authentication guide which describes how you can use Google Cloud Service accounts to authenticate with your APIs running on Google Cloud Endpoint.
It supports using Google ID JWT tokens. The caller will have to send the JWT to Google Token endpoints to obtain a Google ID token and then use this Google ID token in all of your requests. This approach also has the advantage that you only have to whitelist the Google ID token server in your API configuration.