Upload GCS File to Google Drive (Airflow) - google-cloud-platform

I am using airflow.providers.google.suite.transfers.gcs_to_gdrive.GCSToGoogleDriveOperator
to upload a file from GCS to Google Drive.
I am getting 403 Forbidden with reason "insufficientPermissions".
This is the log output. I am not sure where the issue is or what is causing it; any help is highly appreciated!
{gcs_to_gdrive.py:151} INFO - Executing copy of gs://adwords_data/conversion_data.csv to gdrive://google_adwords/
{gcs.py:328} INFO - File downloaded to /tmp/tmpzmk4n2z9
{http.py:126} WARNING - Encountered 403 Forbidden with reason "insufficientPermissions"
Code
from airflow.providers.google.suite.transfers.gcs_to_gdrive import GCSToGoogleDriveOperator

copy_google_adwords_from_gcs_to_google_drive = GCSToGoogleDriveOperator(
    task_id="copy_google_adwords_from_gcs_to_google_drive",
    source_bucket="{}".format(gcs_to_gdrive_bucket),
    source_object="conversion_data.csv",
    destination_object="adwords_data/",
    gcp_conn_id="google_cloud_default",
    dag=dag,
)
In the google_cloud_default connection (gcp_conn_id) I have added the scope https://www.googleapis.com/auth/drive

In your Airflow connection, in the Scopes field, try adding:
https://www.googleapis.com/auth/drive, https://www.googleapis.com/auth/cloud-platform
You'll also need to ensure the service account has the correct roles assigned, and that the Drive folder is shared with the service account's email address.
You can add roles to the service account under IAM & Admin > IAM: click edit next to the service account and add the roles you need.
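A minimal sketch of one way to set those scopes, assuming the connection is supplied through an environment variable and that your provider version uses the extra__google_cloud_platform__* extras keys (the key path and scope list below are placeholders, not the asker's real values):
import os

# Hedged sketch: redefine google_cloud_default with both scopes attached.
# Extras key names follow the google provider's connection-URI convention.
os.environ["AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT"] = (
    "google-cloud-platform://?"
    "extra__google_cloud_platform__key_path=%2Fpath%2Fto%2Fkeyfile.json"
    "&extra__google_cloud_platform__scope="
    "https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%2C"
    "https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform"
)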

Related

can't download blob from Google Vault API export

I can't get this to work either. The Google API example documentation states the following (see below). I am able to authenticate using a service account, access the bucket, and see the blobs, but if I use any blob method, e.g. blob.exists() or blob.download_from_filename(), I get a 403 Forbidden error. I have added Storage Admin privileges to both the authenticated user account and the service account, but still get this error. The documentation below doesn't mention anything about using a service account to access the blob, but I don't know how to instantiate a storage client with the user account instead of a service account. Does anyone have an example of this?
import os

from google.cloud import storage


def download_exports(service, matter_id):
    """Google Cloud Storage client is authenticated by running
    `gcloud auth application-default login` and expects a billing-enabled
    project in the ENV variable `GOOGLE_CLOUD_PROJECT`."""
    gcpClient = storage.Client()
    matter_id = os.environ['MATTERID']
    for export in service.matters().exports().list(
            matterId=matter_id).execute()['exports']:
        if 'cloudStorageSink' in export:
            directory = export['name']
            if not os.path.exists(directory):
                os.makedirs(directory)
            print(export['id'])
            for sinkFile in export['cloudStorageSink']['files']:
                filename = '%s/%s' % (directory, sinkFile['objectName'].split('/')[-1])
                objectURI = 'gs://%s/%s' % (sinkFile['bucketName'],
                                            sinkFile['objectName'])
                print('get %s to %s' % (objectURI, filename))
                gcpClient.download_blob_to_file(objectURI, open(filename, 'wb+'))
O.K., I figured out the problem. I worked around this by using the default storage service account, instead of creating a new service account.
# Use the default service account
gcpClient = storage.Client()
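For anyone who still wants the user-credential route the question asks about, a hedged sketch (the scope is an assumption) would be to build the client from application default credentials obtained with `gcloud auth application-default login`:
# Hedged sketch: authenticate as the logged-in user via application default
# credentials rather than a service-account key file.
import google.auth
from google.cloud import storage

credentials, project = google.auth.default(
    scopes=["https://www.googleapis.com/auth/devstorage.read_only"]
)
gcpClient = storage.Client(project=project, credentials=credentials)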

Revoke/Invalidate the oauth2 refresh token generated in a .boto file (gsutil config)

I'm using a user account to access the Google Play Console and download several monthly reports.
So, I developed a web app to fetch these reports (CSV files) automatically; they are hosted in a Google Cloud Storage bucket (gs://uri).
For that, a .boto file was created using the "gsutil config" command, in which I obtained an OAuth2 refresh token through the user account:
[Credentials]
# Google OAuth2 credentials (for "gs://" URIs):
# The following OAuth2 account is authorized for scope(s):
# https://www.googleapis.com/auth/cloud-platform
# https://www.googleapis.com/auth/accounts.reauth
gs_oauth2_refresh_token = [*************]
This works fine, but (always a but...) I don't know how I can revoke/invalidate this refresh token. I can generate a new refresh token with the "gsutil config" command, but doing so does not revoke/invalidate the previous one.
Maybe someone on this planet could help me with this ;)
Many thanks in advance.
R.
You can send a request to the URL https://oauth2.googleapis.com/revoke?token={token}.
This is mentioned in Google's documentation on how to use OAuth2 tokens in web services:
https://developers.google.com/identity/protocols/oauth2/web-server#tokenrevoke
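A minimal sketch of that revoke call from Python (the token value is a placeholder you would read out of your .boto file):
# Hedged sketch of the revoke request described in the linked docs.
import requests

resp = requests.post(
    "https://oauth2.googleapis.com/revoke",
    params={"token": "<gs_oauth2_refresh_token from your .boto file>"},
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
print(resp.status_code)  # 200 means the refresh token was revoked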

Google Cloud Platform - can I discover reason for 403 response?

I am using https://dataproc.googleapis.com/v1/projects/{projectId}/regions/{region}/clusters to create GCP Dataproc clusters as described at https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters/create.
I am using service account credentials that have been exported into a JSON keyfile. That service account (myserviceaccount@projectA.iam.gserviceaccount.com) exists in projectA and I have been able to use it to successfully create Dataproc clusters in projectA.
I now need to use the same service account to create Dataproc clusters in projectB. I'm running exactly the same code using exactly the same credentials; the only difference is the project that I'm creating the cluster in. I have granted myserviceaccount@projectA.iam.gserviceaccount.com exactly the same permissions in projectB as it has in projectA, but when I try to create the cluster it fails:
2019-03-22 10:58:47 INFO: _retrieve_discovery_doc():272: URL being requested: GET https://www.googleapis.com/discovery/v1/apis/dataproc/v1/rest
2019-03-22 10:58:54 INFO: method():873: URL being requested: GET https://dataproc.googleapis.com/v1/projects/dh-coop-no-test-35889/regions/europe-west1/clusters?alt=json
2019-03-22 10:58:54 INFO: new_request():157: Attempting refresh to obtain initial access_token
2019-03-22 10:58:54 DEBUG: make_signed_jwt():100: [b'blahblahblah', b'blahblahblah']
2019-03-22 10:58:54 INFO: _do_refresh_request():777: Refreshing access_token
2019-03-22 10:58:55 WARNING: _should_retry_response():121: Encountered 403 Forbidden with reason "forbidden"
So, that service account is forbidden from creating clusters in projectB, but I don't get any information about why. I am hoping there are some audit logs that explain more about why the request was forbidden but I've looked in https://console.cloud.google.com/logs/viewer?project=projectB and can't find any.
Can someone tell me where I can get more information to diagnose why this request is failing?
Re-pasting from the comments for easier readability/visibility: one way to get more information on the failed request is to set up gcloud to use the service account and repeat the operation there. Running gcloud commands with --log-http may also give additional information.
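Another option, sketched below under the assumption that the same JSON keyfile is exported as application default credentials, is to catch the googleapiclient HttpError and print its body, which usually names the missing permission more precisely than the one-line 403 warning (project ID and region are placeholders):
# Hedged sketch: inspect the error body returned by the Dataproc API.
import json

from googleapiclient import discovery
from googleapiclient.errors import HttpError

dataproc = discovery.build("dataproc", "v1")  # uses application default credentials
try:
    dataproc.projects().regions().clusters().list(
        projectId="projectB", region="europe-west1"
    ).execute()
except HttpError as e:
    # The JSON body includes an "error.message" field with the denial details.
    print(json.loads(e.content)["error"]["message"])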

PermissionDenied: 403 IAM permission 'dialogflow.intents.list'

I'm trying to get the list of the intents in my Dialogflow agent using Dialogflow's V2 APIs but have been getting the following error:
PermissionDenied: 403 IAM permission 'dialogflow.intents.list' on 'projects/xxxx/agent' denied.
I followed these steps:
I created a new agent (with V2 APIs enabled) and a new service account for it.
I downloaded the JSON key and set my GOOGLE_APPLICATION_CREDENTIALS variable to its path.
Following is my code:
import os

import dialogflow_v2 as dialogflow

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/home/user/folder/service-account-key.json"

client = dialogflow.IntentsClient()
parent = client.project_agent_path('[PROJECT_ID]')
for element in client.list_intents(parent):
    pass
I have made various agents and service accounts and even changed the role from Admin to Client, but I can't figure out a solution. I tried the following solution but it didn't work:
Tried Solution: DialogFlow PermissionDenied: 403 IAM permission 'dialogflow.sessions.detectIntent'
There is no need to create a new agent. You can edit the existing agent's IAM.
In Dialogflow's console, go to settings ⚙ > under the general tab, you'll see the project ID section with a Google Cloud link to open the Google Cloud console > Open Google Cloud.
In Google Cloud, go to IAM & Admin > IAM, under the Members tab. Find the account used by your agent and click edit.
Give admin permissions to that account so it is allowed to list intents.
The problem lies in the IAM section of GCP. You are probably making the request with a role that does not have the necessary permissions.
Look into your key.json file, which contains the field "client_email".
Proceed to the IAM page and grant the member with that email a role that has the required capabilities (e.g. Admin).
This solved my problem.
In Dialogflow's console, go to settings ⚙ > under the general tab, you'll see the project ID section with a Google Cloud link to open the Google Cloud console > Open Google Cloud.
(Optional) In the Cloud console, go to the menu icon > APIs & Services > Library. Select the APIs you need (if any) and click Enable.
In Cloud Console > under the menu icon ☰ > APIs & Services > Credentials > Create Credentials > Service Account Key.
Under Create service account key, select New Service Account from the dropdown and enter a project name and for role choose Owner > Create.
JSON private key file will be downloaded to your local machine that you will need.
For Javascript:
In the index.js file you can do service account auth with JWT:
const serviceAccount = {}; // Starts with {"type": "service_account",...
// Set up Google Calendar Service account credentials
const serviceAccountAuth = new google.auth.JWT({
  email: serviceAccount.client_email,
  key: serviceAccount.private_key,
  scopes: 'https://www.googleapis.com/auth/xxxxxxx'
});
For Python:
There's a Google Auth Python Library available via pip install google-auth; you can read more in its documentation.
When you create the IntentsClient, pass the key file explicitly, for example:
key_file_path = "/home/user/folder/service-account-key.json"
# Build the client directly from the service account key file.
client = dialogflow.IntentsClient.from_service_account_file(key_file_path)
This error message is usually thrown when the application is not being authenticated correctly, due to reasons such as missing files, invalid credential paths, or incorrect environment variable assignments, among other causes. Keep in mind that when you set an environment variable value in a session, it is reset every time the session is dropped.
Based on this, I recommend that you validate that the credential file and file path are correctly assigned, and follow the Obtaining and providing service account credentials manually guide to specify your service account file directly in your code; this way, you set it permanently and can verify that you are passing the service credentials correctly.
Passing the path to the service account key in code example:
def explicit():
    from google.cloud import storage

    # Explicitly use service account credentials by specifying the private key
    # file.
    storage_client = storage.Client.from_service_account_json('service_account.json')

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    print(buckets)
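If you prefer the same explicit-credentials pattern for the Dialogflow client itself, a hedged sketch (the key path and project ID are placeholders) would be:
# Hedged sketch: load the key explicitly instead of relying on
# GOOGLE_APPLICATION_CREDENTIALS.
import dialogflow_v2 as dialogflow
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "/home/user/folder/service-account-key.json"
)
client = dialogflow.IntentsClient(credentials=credentials)
parent = client.project_agent_path("your-project-id")
for intent in client.list_intents(parent):
    print(intent.display_name)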
Also try creating the project in the Dialogflow console:
https://dialogflow.cloud.google.com/
You need to set the following as environment variables:
googleProjectID: "",
dialogFlowSessionID: "anything",
dialogFlowSessionLanguageCode: "en-US",
googleClientEmail: "",
googlePrivateKey:
I think you might have missed the Enable the API section in the documentation setup.
Here is that link:
https://cloud.google.com/dialogflow/cx/docs/quick/setup#api
After clicking the link, select the chatbot project you created and follow the instructions given there.
The permissions that I have given for that project are Owner and Editor.
After this, try the code in this link:
https://cloud.google.com/dialogflow/es/docs/quick/api#detect_intent
You should get a response from your chatbot
Hope this helps!

Access Denied on Windows for AWS Credentials file

No matter what I try it seems my web service cannot access my .aws/credentials file.
I always get this error:
System.UnauthorizedAccessException: Access to the path '{PATH}' is denied.
Here is what I have tried:
Moved the file from the default directory to the website root
Changed the website app pool to run as my user account
Gave Everyone full control of the folder and the file
Verified that when I put the same key and secret into the web.config the call works
Tried removing the region from the config
Tried removing the path from the config
Here is my config (note if I don't provide the path, even when in the default location, it says no credentials file was found)
<add key="AWSProfileName" value="default" />
<add key="AWSRegion" value="us-east-1"/>
<add key="AWSProfilesLocation" value="{PATH}" />
In the AWS Toolkit I have a 'default' profile set up as well that has rights, but that does not help here.
I have even tried the legacy format called out in the AWS docs. What am I missing? It seems I have followed everything AWS calls out in their docs.
I am using Castle Windsor DI, so could that be getting in the way?
container.Register(
    Component.For<IAmazonDynamoDB>()
        .ImplementedBy<AmazonDynamoDBClient>()
        .DependsOn(Dependency.OnValue<RegionEndpoint>(RegionEndpoint.USEast1))
        .LifestylePerWebRequest());

container.Register(
    Component.For<IDynamoDBContext>()
        .ImplementedBy<DynamoDBContext>()
        .DependsOn(Dependency.OnComponent<IAmazonDynamoDB, AmazonDynamoDBClient>())
        .DependsOn(Dependency.OnValue<DynamoDBContextConfig>(
            new DynamoDBContextConfig
            {
                TableNamePrefix = configurationManager.GetRequiredAppSetting<string>(Constants.Web.AppSettings.AwsDynamoDbPrefix),
                Conversion = DynamoDBEntryConversion.V2
            }))
        .LifestylePerWebRequest());
The problem you have is that the path ~\.aws\credentials is only defined when you are logged in as a user.
A Windows service such as IIS is not logged in as the user that created the credentials file, so that path is not accessible to the service. The service does not know which user's profile to look in: if your user name is john, the path would be c:\users\john\.aws\credentials, but the Windows service knows nothing about your identity.
Note: I believe - but I am not 100% sure - that a Windows service will look in c:\.aws for credentials. I have used this path in the past, but I cannot find Amazon reference documentation to support it. I no longer store credentials on my EC2 instances, so I am out of touch on the location c:\.aws.
You have a number of choices. Create the credentials as usual, then create a directory outside of your IIS installation, such as c:\.aws, copy ~\.aws to it, and specify the full path in your programs.
A much better and more secure method, if you are running your services on AWS, is to use an IAM role. Create a role with the desired permissions and attach it to your EC2 instance; all AWS SDKs and tools know how to obtain credentials from the instance metadata.
There are many more methods, such as EC2 Parameter Store. Storing credentials on your instances or inside your program is not a good idea.
[Edit after thinking more about the error message]
You may have an issue where IIS does not have access rights to the location where the credentials are stored.
Open Windows Explorer and locate the folder containing your credentials file. Right-click this folder, select Properties, and click the Security tab. From there, choose Edit, then Add. The following users must be added and given at least Read permissions: IUSR and IIS_IUSRS. You may also need to grant "List folder contents".