I am trying to get a service account to create blobs in Google Cloud Storage
from within a Python script, but I am having issues with the credentials.
1) I create the service account for my project and then download the key file in json:
"home/user/.config/gcloud/service_admin.json"
2) I give the service account the necessary credentials (via gcloud in a subprocess)
roles/viewer, roles/storage.admin, roles/resourcemanager.projectCreator, roles/billing.user
Then I would like to access a bucket in GCS
from google.cloud import storage
import google.auth
credentials, project = google.auth.default()
client = storage.Client('myproject', credentials=credentials)
bucket = client.get_bucket('my_bucket')
Unfortunately, this results in:
google.api_core.exceptions.Forbidden: 403 GET
https://www.googleapis.com/storage/v1/b/my_bucket?projection=noAcl:
s_account#myproject.iam.gserviceaccount.com does not have
storage.buckets.get access to my_bucket
I have somewhat better luck if I set the environment variable
export GOOGLE_APPLICATION_CREDENTIALS="home/user/.config/gcloud/service_admin.json"
and rerun the script. However, I want it all to run in one single instance of the script that creates the accounts and continues to create the necessary files in the buckets. How can I access my_bucket if I know where my json credential file is.
Try this example from the Documentation for Server to Server Authentication:
from google.cloud import storage
# Explicitly use service account credentials by specifying the private key file.
storage_client = storage.Client.from_service_account_json('service_account.json')
# Make an authenticated API request
buckets = list(storage_client.list_buckets())
print(buckets)
This way you point the file containing the key of the Service Account directly in your code.
Related
I am accessing data stored in GCS bucket while running Python within a container in a GKE node within the same project.
I can run gsutil ls without problems, but when I try to access the bucket with Python, I get a permission error:
raise exceptions.from_http_response(response)
google.api_core.exceptions.Forbidden: 403 GET https://storage.googleapis.com/storage/v1/b/xxxxxxxx/o?maxResults=1&projection=noAcl&prefix=test%2F&prettyPrint=false: Caller does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).
I am listing the GCS bucket using the answer from #Robino in this post. For brevity, I copied it here:
import google.cloud.storage as gcs
client = gcs.Client()
BUCKET_NAME = "abc"
blobs = client.list_blobs(
BUCKET_NAME,
prefix="xyz/", # <- you need the trailing slash
delimiter="/",
max_results=1,
)
next(blobs, ...) # Force blobs to load.
The problem was that the authentication configs were not picked by python. Calling gcloud auth application-default login solved it.
So I would like to know if there is a way to authenticate gcloud utility command via an access token?
E.g. if I obtained an access token via gcloud auth print-access-token and then on another computer, do gcloud auth ${access_token}
Is that possible?
Gcloud auth tokens expire in 60 minutes by default I think.
To provide long running access to another person or device you would use IAM to provide access to their account to perform the function you need them to do (by them doing their own gcloud auth).
If that's not an option you could create a service account, export a key for that service account, and provide the key to them which they could authenticate from the console/terminal by setting the GOOGLE_APPLICATION_CREDENTIALS variable prior to performing gcloud commands.
e.g.
export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"
Resources
Grant an IAM role by using the Google Cloud console
Creating and managing service accounts
Getting started with authentication - Setting the environment variable
To use an access token directly is possible in three ways as outlined here, and summarized by me below:
Declaring the CLOUDSDK_AUTH_ACCESS_TOKEN environment variable, see https://cloud.google.com/sdk/docs/authorizing
--access_token_file flag, see https://cloud.google.com/sdk/gcloud/reference#--access-token-file.
Configuration of access_token_file, see https://cloud.google.com/sdk/gcloud/reference/config/set and search for access_token_file
Practical experience
I tried some things using Google Cloud SDK 413.0.0 and the Python client google-storage-client and learned somethings of relevance.
1. Options work with gcloud, but not all clients
All of the options below worked to setup the gcloud CLI with credentials.
export CLOUDSDK_AUTH_ACCESS_TOKEN=<access token>
gcloud config set auth/access_token_file $(pwd)/my-access-token.txt
gcloud storage ls <bucket> --access_token_file=$(pwd)/my-access-token.txt
2. Python clients don't respect options listed above
In this github issue, support for CLOUDSDK_AUTH_ACCESS_TOKEN by Google's Python libraries is requested.
However, they can consume access tokens directly like this:
# Example on using a temporary GCP access token.
#
# To acquire a token from some location, run
#
# gcloud auth print-access-token
#
# To use it with python libraries like google-cloud-storage, first create a
# credentials object to pass to the client.
#
import getpass
from google.cloud import storage
from google.oauth2.credentials import Credentials
# import an access token
# - option 1: read an access token from a file
with open("my-access-token.txt") as f:
access_token = f.read().strip()
# - option 2: read an access token from user input
access_token = getpass.getpass("Enter access token: ")
# setup a storage client using credentials
credentials = Credentials(access_token)
storage_client = storage.Client(credentials=credentials)
# test the storage client by trying to list content in a google storage bucket
bucket_name = "something" # don't include gs:// here
blobs = list(storage_client.list_blobs(bucket_name))
print(len(blobs))
I am trying to sign URL in GCP storage from AWS EC2 or Lambda, I have generated a JSON file for permissions providing my AWS account ID and role which is given to EC2 or Lambda. When I call sign URL even with storage admin or owner permission I get: Error: The caller does not have permission.
I used the code provided by GCP documentation.
const {Storage} = require('#google-cloud/storage');
const storage = new Storage();
const options = {
version: 'v4',
action: 'read',
expires: Date.now() + 15 * 60 * 1000, // 15 minutes
};
// Get a v4 signed URL for reading the file
const [url] = await storage
.bucket(bucketName)
.file(fileName)
.getSignedUrl(options);
Can anybody tell me what did I miss? What is wrong?
Seems the pro
*** update.
I am creating a service account, granting this service account storage admin to my project, then creating pull in Workload Identity Pools, setting AWS and my AWS account ID, then granting access by my AWS identities matching role, downloading JSON, and putting environment variables - GOOGLE_APPLICATION_CREDENTIALS - path to my JSON file and GOOGLE_CLOUD_PROJECT - my project ID. How to correctly load that clientLibraryConfig.json file to run functions I need?
update ** 2
my clientLibraryConfig JSON has the following content..
{
"type": "external_account",
"audience": "..",
"subject_token_type": "..",
"service_account_impersonation_url": "..",
"token_url": "..",
"credential_source": {
"environment_id": "aws1",
"region_url": "..",
"url": "..",
"regional_cred_verification_url": ".."
}
}
How can I generate an access token in node js SDK from this config file to access GCP storage from AWS ec2?
You have to set up the following permissions for the IAM service account:
Storage Object Creator: This is to create signed URLs.
Service Account Token Creator role: This role enables impersonation
of service accounts to create OAuth2 access tokens, sign blobs, or sign JWTs.
Also, you can try to run locally in GCP to sign the URL with the service account.
You can use an existing private key for a service account. The key can be in JSON or PKCS12 format.
Use the command gsutil signurl and pass the path to the private key from the previous step, along with the name of the bucket and object.
For example, if you use a key stored in the folder Desktop, the following command will generate a signed URL for users to view the object cat.jpegfor for 10 minutes.
gsutil signurl -d 10m Desktop/private-key.json gs://example-bucket/cat.jpeg
If successful, the response should look like this:
URL HTTP Method Expiration Signed URL
gs://example-bucket/cat.jpeg GET 2018-10-26 15:19:52 https://storage.googleapis.
com/example-bucket/cat.jpeg?x-goog-signature=2d2a6f5055eb004b8690b9479883292ae74
50cdc15f17d7f99bc49b916f9e7429106ed7e5858ae6b4ab0bbbdb1a8ccc364dad3a0da2caebd308
87a70c5b2569d089ceb8afbde3eed4dff5116f0db5483998c175980991fe899fbd2cd8cb813b0016
5e8d56e0a8aa7b3d7a12ee1baa8400611040f05b50a1a8eab5ba223fe5375747748de950ec7a4dc5
0f8382a6ffd49941c42498d7daa703d9a414d4475154d0e7edaa92d4f2507d92c1f7e811a7cab64d
f68b5df4857589259d8d0bdb5dc752bdf07bd162d98ff2924f2e4a26fa6b3cede73ad5333c47d146
a21c2ab2d97115986a12c28ff37346d6c2ca83e5618ec8ad95632710b489b75c35697d781c38e&
x-goog-algorithm=GOOG4-RSA-SHA256&x-goog-credential=example%40example-project.
iam.gserviceaccount.com%2F20181026%2Fus%2Fstorage%2Fgoog4_request&x-goog-date=
20201026T211942Z&x-goog-expires=3600&x-goog-signedheaders=host
The signed URL is the string that starts with https://storage.googleapis.com, and it is likely to span multiple lines. Anyone can use the URL to access the associated resource (in this case, cat.jpeg) during the designated time frame (in this case, 10 minutes).
So if this works locally, then you can start configuring Workload Identity Federation to impersonate your service account. In this link, you will find a guide to deploy it.
To access resources from AWS using your Workload Identity Federation you will need to review if the following requirements have been already configured:
The workload identity pool has been created.
AWS has been added as an identity provider in the workload identity
pool (The Google organization policy needs to allow federation from
AWS).
The permissions to impersonate a service account have been granted to the external account.
I will add this guide to configure the Workload Identity Federation.
Once the previous requirements have been completed, you will need to generate the service account credential, this file will only contain non sensitive metadata in order to instruct the library on how to retrieve external subject tokens and exchange them for service accounts tokens, as you mentioned the file could be an config.json and could be generated running the following command:
# Generate an AWS configuration file.
gcloud iam workload-identity-pools create-cred-config \
projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$POOL_ID/providers/$AWS_PROVIDER_ID \
--service-account $SERVICE_ACCOUNT_EMAIL \
--aws \
--output-file /path/to/generated/config.json
Where the following variables need to be substituted:
$PROJECT_NUMBER: The Google Cloud project number.
$POOL_ID:The workload identity pool ID.
$AWS_PROVIDER_ID: The AWS provider ID.
$SERVICE_ACCOUNT_EMAIL: The email of the service account to
impersonate.
Once you generate the JSON credentials configuration file for your external identity, you can store the path at the GOOGLE_APPLICATION_CREDENTIALS environment variable.
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/config.json
So, with this, the library can automatically choose the right type of client and initialize the credential from the configuration file. Please note that the service account will also need the roles/browser when using external identities with Application Default Credentials in Node.js or you can pass the project ID to avoid the need to grant roles/browser to the service account as is shown in the bellow code:
async function main() {
const auth = new GoogleAuth({
scopes: 'https://www.googleapis.com/auth/cloud-platform'
// Pass the project ID explicitly to avoid the need to grant `roles/browser` to the service account
// or enable Cloud Resource Manager API on the project.
projectId: 'CLOUD_RESOURCE_PROJECT_ID',
});
const client = await auth.getClient();
const projectId = await auth.getProjectId();
// List all buckets in a project.
const url = `https://storage.googleapis.com/storage/v1/b?project=${projectId}`;
const res = await client.request({ url });
console.log(res.data);
}
I am running something that uses AWS services on a production server. The most often provided solutions for providing credentials to session are one of:
from boto3 import Session
session = Session(profile_name='my_aws_profile')
OR
from boto import Session
session = Session(
aws_access_key_id="AWS_ACCESS_KEY",
aws_secret_access_key="AWS_SECRET_ACCESS_KEY"
)
What are my options so that I can
share the code without sharing my credentials, and
specify the path of my aws credentials file instead of assuming that it has to be ~/.aws/credentials?
The documentation lists all the ways Boto can find AWS credentials: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#configuring-credentials
In particular, the best practice would be to put credentials in environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. Boto will pick those up automatically.
save your credentials as environment variables
if you want another location for your credentials, save it in the config file and change set the AWS_CONFIG_FILE location to your desired path. If you have credentials stored in credentials file and config file, the one in the credentials file takes precedence.
I've come across very weird permission issue. I'm trying to upload a file to s3, here's my function
def UploadFile(FileName, S3FileName):
session = boto3.session.Session()
s3 = session.resource('s3')
s3.meta.client.upload_file(FileName, "MyBucketName", S3FileName)
I did configure aws-cli on the server. This function works fine when I log into server and launch python interpreter but fails when called from my django rest api with:
An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
No idea why the same function works when called from interpreter and fails when called from django. Both are in the same virtual environment. Any suggestions?
According to the boto3 docs, boto3 is looking for credentials in the following places:
Passing credentials as parameters in the boto.client() method
Passing credentials as parameters when creating a Session object
Environment variables
Shared credential file (~/.aws/credentials)
AWS config file (~/.aws/config)
Assume Role provider
Boto2 config file (/etc/boto.cfg and ~/.boto)
Instance metadata service on an Amazon EC2 instance that has an IAM role configured.
Note that many of these places are paths with "~" in them. "~" refers to the current user's home directory. Most likely, your REST API is running under a different system user than you are using to test your code.
The proper solution is to use IAM roles, as this allows your server to have S3 access without you needing to give it IAM credentials. However, if that doesn't work for your setup, you should put the IAM credentials in the /etc/boto.cfg file as that is user agnostic.