Google Cloud Natural Language API - Authorization - google-cloud-platform

I'm trying to use the Cloud Natural Language API with the code given in this link (the first or second example).
I get the below error.
google.api_core.exceptions.PermissionDenied: 403 Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the language.googleapis.com. We recommend configuring the billing/quota_project setting in gcloud or using a service account through the auth/impersonate_service_account setting. For more information about service accounts and how to use them in your application, see https://cloud.google.com/docs/authentication/.
I know that we need to point the code at the service account's JSON key file, as described in the Google docs (link). But they don't say anything about handling the API key, and isn't putting an API key in code discouraged anyway? So what is the right way?
I tried using a curl command with the API key and it works fine.
curl "https://language.googleapis.com/v1/documents:analyzeEntities?key=${API_KEY}" -s -X POST -H "Content-Type: application/json" --data-binary @request.json
However, I don't know how to incorporate the API key (which I have) into the code present in the first link I posted above.
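For reference, the curl call above maps to roughly the following Python sketch using the requests library; the API key value and the sample document are placeholders, not code from the linked guide:

import requests

API_KEY = "your-api-key"  # placeholder
url = f"https://language.googleapis.com/v1/documents:analyzeEntities?key={API_KEY}"
body = {
    "document": {
        "type": "PLAIN_TEXT",
        "content": "Michelangelo Caravaggio, Italian painter.",  # placeholder text
    },
    "encodingType": "UTF8",
}
# Send the same request the curl command sends, authenticated with the API key.
resp = requests.post(url, json=body)
resp.raise_for_status()
print(resp.json())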

Did you go through all the steps provided by Google? You probably missed one of the steps outlined below.
Create a project in GCP.
Enable billing for that project.
Enable the Natural Language API in that project.
gcloud services enable language.googleapis.com --project [PROJECT_ID]
Use your own account OR a service account key (with owner permissions).
gcloud auth application-default login
OR
export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/my-key.json"
Set the project ID through the command line.
gcloud config set project [PROJECT_ID]
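With the steps above in place (for example, GOOGLE_APPLICATION_CREDENTIALS pointing at a service account key), the client library picks up the credentials automatically, so no key file or API key has to appear in the code. A minimal sketch, with placeholder text:

from google.cloud import language_v1

# The client resolves Application Default Credentials on its own
# (service account key file or gcloud application-default credentials).
client = language_v1.LanguageServiceClient()
document = language_v1.Document(
    content="Michelangelo Caravaggio, Italian painter.",  # placeholder text
    type_=language_v1.Document.Type.PLAIN_TEXT,
)
response = client.analyze_entities(request={"document": document})
for entity in response.entities:
    print(entity.name, entity.type_)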

Related

Add gcloud "Google Auth Library" to App Access Control on Google Workspace Admin

I use gcloud auth application-default login to authenticate my local applications to Google Cloud Platform services.
One of them is a Python app that runs queries against many BigQuery tables, and they all work well except for one table whose values come from a Drive spreadsheet (if you open the table details in the Console, it shows the link to the Drive spreadsheet).
I did some research and found that, in order for it to work, I have to explicitly request the scope https://www.googleapis.com/auth/drive. So I did it like this, and it worked for the production environment since the credentials are loaded from Kubernetes Workload Identity:
import google.auth

credentials, _ = google.auth.default(scopes=[
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/drive",
])
But for running it locally I need to specify the scopes to my gcloud auth application-default login like this:
gcloud auth application-default login --scopes=https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/cloud-platform
I ran the command and was prompted with this authorization error:
Access blocked: Authorization Error
myuser@domain.com
Access to your account data is restricted by policies within your organization. Please contact the administrator of your organization for more information.
If you are a developer of Google Auth Library, see error details.
Error 400: admin_policy_enforced
This error happens because the Google Workspace admin restricts access to all applications except the ones whitelisted in the "App Access Control" section.
I contacted my Workspace admin and we went to "App Access Control" to add gcloud to the app whitelist. We searched for anything related, like "Google Cloud Platform" or "Google Auth Library", to add to the whitelist, but nothing came up. I tried removing the restriction and it works, but that's not acceptable for our security standards.
Is there any way to get around this issue?

Receiving HTTP 401 when accessing Cloud Composer's Airflow Rest API

I am trying to invoke Airflow 2.0's Stable REST API from Cloud Composer Version 1 via a Python script and encountered an HTTP 401 error while following Triggering DAGs with Cloud Functions and Access the Airflow REST API.
The service account has the following list of permissions:
roles/iam.serviceAccountUser (Service Account User)
roles/composer.user (Composer User)
roles/iap.httpsResourceAccessor (IAP-Secured Web App User, added when the application returned a 403, which was unusual as the guides did not specify the need for such a permission)
I am not sure what is wrong with my configuration; I have tried giving the service account the Editor role and roles/iap.tunnelResourceAccessor (IAP-Secured Tunnel User) & roles/composer.admin (Composer Administrator), but to no avail.
EDIT:
I found the source of my problems: The Airflow Database did not have the credentials of the service account in the users table. However, this is unusual as I currently have a service account (the first I created) whose details were added automatically to the table. Subsequent service accounts were not added to the users table when they tried to initially access the REST API, thus returning the 401. I am not sure of a way to create users without passwords since the Airflow web server is protected by IAP.
Thanks to answers posted by @Adrie Bennadji and @ewertonvsilva, I was able to diagnose the HTTP 401 issue.
The email field in some of Airflow's user-related database tables has a limit of 64 characters (Type: character varying(64)), as noted in Understanding the Airflow Metadata Database.
Coincidentally, my first service account had an email whose character length was just over 64 characters.
When I tried running the command: gcloud composer environments run <instance-name> --location=<location> users -- create --use-random-password --username "accounts.google.com:<service_accounts_uid>" --role Op --email <service-account-username>@<...>.iam.gserviceaccount.com -f Service -l Account as suggested by @ewertonvsilva to add my other service accounts, they failed with the following error: (psycopg2.errors.StringDataRightTruncation) value too long for type character varying(64).
As a result, I created new service accounts with shorter emails and these were authenticated automatically. I was also able to add these new service accounts with shorter emails to Airflow manually via the gcloud command and authenticate them. I also discovered that the failure to add the user upon first access to the REST API was actually logged in Cloud Logging. However, at that time I was not aware of how Cloud Composer handled new users accessing the REST API, and the HTTP 401 error was a red herring.
Thus, the solution is to ensure that your service account's email is shorter than 64 characters.
ewertonvsilva's solution worked for me (manually adding the service account to Airflow using gcloud composer environments run <instance-name> --location=<location> users -- create ... )
At first it didn't work but changing the username to accounts.google.com:<service_accounts_uid> made it work.
Sorry for not commenting, not enough reputation.
Based on @Adrien Bennadji's feedback, I'm posting the final answer.
Create the service accounts with the proper permissions for cloud composer;
Via gcloud console, add the users in airflow database manually:
gcloud composer environments run <instance-name> --location=<location> users -- create --use-random-password --username "accounts.google.com:<service_accounts_uid>" --role Op --email <service-account-username>@<...>.iam.gserviceaccount.com -f Service -l Account
And then, list the users with: gcloud composer environments run <env_name> --location=<env_loc> users -- list
use: accounts.google.com:<service_accounts_uid> for the username.
Copying my answer from https://stackoverflow.com/a/70217282/9583820
It looks like instead of creating Airflow accounts with
gcloud composer environments run
you can just use GCP service accounts with emails shorter than 64 characters.
It will work automatically under these conditions.
TL;DR version:
In order to make Airflow Stable API work at GCP Composer:
Set "api-auth_backend" to "airflow.composer.api.backend.composer_auth"
Make sure your service account email length is <64 symbols
Make sure your service account has required permissions (Composer User role should be sufficient)
Longread:
We have been using Airflow for a while now; we started with version 1.x.x and the "experimental" (now deprecated) APIs.
To authorize, we use a "Bearer" token obtained with a service account:
import requests
from google.auth.transport.requests import Request
from google.oauth2 import id_token

# Obtain an OpenID Connect (OIDC) token from the metadata server or using a service account.
google_open_id_connect_token = id_token.fetch_id_token(Request(), client_id)

# Fetch the Identity-Aware Proxy-protected URL, including an
# Authorization header containing "Bearer " followed by a
# Google-issued OpenID Connect token for the service account.
resp = requests.request(
    method, url,
    headers={'Authorization': 'Bearer {}'.format(
        google_open_id_connect_token)}, **kwargs)
Now we are migrating to Airflow 2.x.x and faced the exact same issue:
403 FORBIDDEN.
Our environment details are:
composer-1.17.3-airflow-2.1.2 (Google Cloud Platform)
"api-auth_backend" is set to "airflow.api.auth.backend.default".
Documentation claims that:
After you set the api-auth_backend configuration option to airflow.api.auth.backend.default, the Airflow web server accepts all API requests without authentication.
However, this does not seem to be true.
Experimenting, we found that if "api-auth_backend" is set to "airflow.composer.api.backend.composer_auth", the Stable REST API (Airflow 2.x.x) starts to work.
But there is another caveat: for us, some of our service accounts did work, and some did not.
The ones that did not work were throwing a "401 Unauthorized" error.
We figured out that accounts with an email longer than 64 characters were throwing the error. The same was observed in this answer.
So after setting "api-auth_backend" to "airflow.composer.api.backend.composer_auth" and making sure that our service account email is shorter than 64 characters, our old Airflow 1.x.x code started to work for authentication. Then we needed to make changes (API URLs and response handling) and the stable Airflow (2.x.x) API started to work for us in the same way as it did for Airflow 1.x.x.
UPD: this is a defect in Airflow and will be fixed here:
https://github.com/apache/airflow/pull/19932
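For context, a hedged sketch of how the Bearer-token approach above can be pointed at the stable REST API to trigger a DAG; the webserver URL, client_id and dag_id below are placeholders, not values from the original answer:

import requests
from google.auth.transport.requests import Request
from google.oauth2 import id_token

# Placeholders: the Composer Airflow webserver URL and the environment's IAP OAuth client ID.
webserver_url = "https://<your-airflow-webserver>"
client_id = "<iap-oauth-client-id>.apps.googleusercontent.com"
dag_id = "example_dag"

# Obtain an OIDC token for the IAP-protected webserver.
token = id_token.fetch_id_token(Request(), client_id)

# Stable REST API (Airflow 2.x): create a DAG run.
resp = requests.post(
    f"{webserver_url}/api/v1/dags/{dag_id}/dagRuns",
    headers={"Authorization": f"Bearer {token}"},
    json={"conf": {}},
)
resp.raise_for_status()
print(resp.json())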
I was trying to invoke Airflow 2.0's Stable REST API from Cloud Composer Version 2 via a Python script and encountered an HTTP 401 error while following Triggering DAGs with Cloud Functions and accessing the Airflow REST API.
I used this image version: composer-2.1.2-airflow-2.3.4
I also followed these 2 guides:
Triggering Cloud Composer DAGs with Cloud Functions (Composer 2 + Airflow 2)
Access the Airflow REST API Cloud Composer 2
But I was always stuck with Error 401 when I tried to run the DAG via the Cloud Function.
However, when the DAG was executed from the Airflow UI, it was successful (Trigger DAG in the Airflow UI).
For me the following solution worked:
In the airflow.cfg, set the following settings:
api - auth_backends=airflow.composer.api.backend.composer_auth,airflow.api.auth.backend.session
api - composer_auth_user_registration_role = Op (default)
api - enable_experimental_api = False (default)
webserver - rbac_user_registration_role = Op (default)
Service Account:
The service account email total length is <64 symbols.
The account has these roles:
Cloud Composer v2 API Service Agent Extension, Composer User
Airflow UI
Add the service account to the Airflow Users via Airflow UI
(Security -> List Users with username) = accounts.google.com:<service account uid>, and assign the role of Op to it.
You can get the UID via the Cloud Shell command (see above), or just navigate to the IAM & Admin page on Google Cloud -> Service Accounts -> click on the service account and read the Unique ID from the Details page.
And now, IMPORTANT!: SET THE ACCOUNT ACTIVE! (In the Airflow UI, check the box "is Active?" to true).
This last step of setting it active was not described anywhere, and for a long time I just assumed it gets set active when there is an open session (when it makes the calls), but that is not the case. The account has to be set active manually.
After that, everything worked fine :)
Other remarks: As I joined a new company, I also had to check some other stuff (maybe this is not related to your problem, but it's good to know anyway - maybe others can use this). I use Cloud Build to deploy the Cloud Functions and the DAGs in the Airflow, so I also had to check the following:
Cloud Source Repository (https://source.cloud.google.com/) is in sync with the GitHub Repository. If not: Disconnect the repository and reconnect again.
The GCS bucket that is created when the Composer 2 environment is set up for the very first time has a subfolder "/dags/". I had to manually add the subfolder "/dags/dataflow/" so the deployed Dataflow pipeline code could be uploaded to "/dags/dataflow/".
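For completeness, a minimal sketch of calling the stable REST API on Composer 2 with composer_auth, where the caller's own Google credentials (for example the Cloud Function's service account) are used directly; the webserver URL and dag_id are placeholders:

import google.auth
from google.auth.transport.requests import AuthorizedSession

# Placeholders: take the Airflow webserver URL from the Composer environment details.
web_server_url = "https://<your-composer-2-airflow-webserver>"
dag_id = "example_dag"

# With composer_auth there is no separate IAP client_id; the Google credentials of the
# caller (here: Application Default Credentials) authenticate the request.
credentials, _ = google.auth.default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
session = AuthorizedSession(credentials)

resp = session.request(
    "POST",
    f"{web_server_url}/api/v1/dags/{dag_id}/dagRuns",
    json={"conf": {}},
)
resp.raise_for_status()
print(resp.json())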

How does gcloud manage to work with service-account-only APIs?

I'm trying to understand how gcloud manages to work with APIs that require a service account to access them. For example, accessing the Speech API with your user (not service account) credentials results in "403 Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the speech.googleapis.com".
However, when I run gcloud ml speech recognize gs://cloud-samples-tests/speech/brooklyn.flac --language-code=en-US it works just fine, even though I didn't set any dedicated service account keys as described in the quickstart [1], and I even disabled all service accounts in the project just to be sure.
So again,
gcloud ml speech recognize gs://cloud-samples-tests/speech/brooklyn.flac --language-code=en-US - works
curl -s -H "Content-Type: application/json" -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) https://speech.googleapis.com/v1/speech:recognize -d @sync-request.json as per [2] fails with the 403 error above
Question: how does gcloud manage to work without me providing it with a dedicated service account?
[1] https://cloud.google.com/speech-to-text/docs/quickstart-gcloud
Google credentials are provided by two types of account: service accounts and regular (human?) accounts (e.g. @gmail.com, @your.email). The types of tokens issued for these accounts differ but they both authenticate to services.
When you use gcloud you're often using human accounts that you have previously authenticated using gcloud auth login and may be shown with gcloud auth list.
In this case, you're also using these gcloud human credentials with curl because you acquire an access token using gcloud auth print-access-token. Your two examples effectively authenticated using the same (probably human) account.
When you want to have one service authenticate to another, it is recommended that you use service accounts. There's no human in the process to support OAuth's three-legged auth and so service accounts use two-legged auth (see link).
Generally with Cloud Platform services, credentials must also have IAM roles that describe permissions that authorize their use, but some Cloud Platform services (IIRC ML) do not (yet) implement IAM and so authorization is achieved solely using OAuth scopes and you can use vanilla service accounts that have no specific IAM bindings assigned.
NOTE: it is possible to authenticate gcloud with service accounts too (gcloud auth activate-service-account), but IIRC this approach is being discouraged.
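To illustrate the difference, here is a hedged sketch of calling the Speech API with a service account credential instead of the gcloud user credential; the key file path is a placeholder, and the sample audio is the same public file used above:

import requests
from google.auth.transport.requests import Request
from google.oauth2 import service_account

# Placeholder path to a service account key file.
creds = service_account.Credentials.from_service_account_file(
    "/path/to/service-account-key.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
creds.refresh(Request())  # fetch an access token for the service account

payload = {
    "config": {"languageCode": "en-US"},
    "audio": {"uri": "gs://cloud-samples-tests/speech/brooklyn.flac"},
}
resp = requests.post(
    "https://speech.googleapis.com/v1/speech:recognize",
    headers={"Authorization": f"Bearer {creds.token}"},
    json=payload,
)
print(resp.status_code, resp.json())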

Difference between "gcloud auth application-default login" and "gcloud auth login"

What is the difference between gcloud auth application-default login vs gcloud auth login?
Despite the definitions below, it is still hard to differentiate them.
gcloud auth application-default login :
acquire new user credentials to use for Application Default Credentials
gcloud auth login :
authorize gcloud to access the Cloud Platform with Google user credentials
When should I use one over the other?
The difference is the use cases:
As a developer, I want to interact with GCP via gcloud.
gcloud auth login
This obtains your credentials and stores them in ~/.config/gcloud/. Now you can run gcloud commands from your terminal and it will find your credentials automatically. Any code/SDK will not automatically pick up your creds in this case.
Reference: https://cloud.google.com/sdk/gcloud/reference/auth/login
As a developer, I want my code to interact with GCP via SDK.
gcloud auth application-default login
This obtains your credentials via a web flow and stores them in 'the well-known location for Application Default Credentials'. Now any code/SDK you run will be able to find the credentials automatically. This is a good stand-in when you want to locally test code which would normally run on a server and use a server-side credentials file.
Reference: https://cloud.google.com/sdk/gcloud/reference/auth/application-default/login
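As a quick illustration (a minimal sketch, not taken from the linked reference): after gcloud auth application-default login, client code can locate the credentials without any explicit key file:

import google.auth

# google.auth.default() searches the well-known ADC locations:
# the GOOGLE_APPLICATION_CREDENTIALS file, then
# ~/.config/gcloud/application_default_credentials.json, then the metadata
# server when running on GCP.
credentials, project_id = google.auth.default()
print(project_id, type(credentials))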
Edit (09/19/2019):
As Kent contributed in his comment below, 'the well-known location for Application Default Credentials' is a file named application_default_credentials.json located in your local ~/.config/gcloud/ directory. I've added an additional link below to an article by Theodore Sui and Daniel De Leo which goes into greater detail about the different authentication methods.
Article:
https://medium.com/google-cloud/local-remote-authentication-with-google-cloud-platform-afe3aa017b95
I'm adding this as an answer because I don't have the reputation to comment. I think @Himal's answer is spot on, but I'd like to clarify that when it says code/SDK, we should read it as code or a language SDK (Java/Ruby/Python) vs. the gcloud SDK (which is also referred to as the Cloud SDK). This confused me a bit because I had the same doubts.
So,
gcloud auth login -> login for the gcloud SDK
gcloud auth application-default login -> login for any code running on the computer (language SDKs within an application)
There is also a give-away in the OAuth authentication screen in the browser windows that open up:
gcloud auth login asks you to choose an account to continue to give access to 'google cloud sdk'.
gcloud auth application-default login asks you to give access to google auth library instead.

gcloud ml language Request had insufficient authentication scopes

For a relatively small academic research project, I am trying to use Google Cloud Natural Language API.
From what I understood on the Authentication Overview, it looks like an API key would be the best and simplest approach to authentication, rather than a service account or user account.
Creating the key was easy enough. But now I am stuck on how to actually use it in conjunction with gcloud commands on an Ubuntu VM instance on Google cloud compute engine.
When I try to run the simple example on the Natural Language Quickstart Guide, I get this error:
gcloud ml language analyze-entities --content="Michelangelo Caravaggio, Italian painter, is known for 'The Calling of Saint Matthew'."
ERROR: (gcloud.ml.language.analyze-entities) PERMISSION_DENIED:
Request had insufficient authentication scopes.
The documentation and Q&A I've seen related to this error are related to service accounts or user accounts, but I am trying to just use the "simple" API key.
The documentation for Using an API key shows how to do so via REST. But, for now as a "quick" test to see if I have the Natural Language API working, I want to just do a simple test with gcloud on the command line. I looked through the gcloud documentation, but could not find anything about specifying an API key string.
How can I run the above command with gcloud and authenticate with my API key?
If this API key turns out to be more of a hassle, I may consider switching to a service account.
Any help would be greatly appreciated...
Got this to work by:
From Google Cloud console:
Compute Engine -> VM instances
Click the name of an existing VM, which brings up the VM instance details page, then click the "Edit" link near the top of the page. (Note that the VM has to be stopped before its access scopes can be edited.)
Then modify the Cloud API access scopes to allow full access to all Cloud APIs.
If you are using a GCE VM the easiest way to authenticate to the Cloud APIs is to use the VM's service account. When you create the VM you can specify what scopes to authorize for the service account. The simplest solution is to provision a VM with Cloud Platform scope. Using gcloud
gcloud --project=$PROJECT compute instances create $VM --zone=$ZONE --machine-type=$MACHINE --scopes=cloud-platform
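To verify that the scope change took effect, one option (a small sketch that assumes it runs on the VM itself) is to ask the metadata server which scopes the VM's default service account was granted:

import requests

# The GCE metadata server reports the OAuth scopes attached to the VM's service account.
resp = requests.get(
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes",
    headers={"Metadata-Flavor": "Google"},
)
print(resp.text)  # should list https://www.googleapis.com/auth/cloud-platform after the change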