Using the BigQuery Java client, I need to join Table A (in project A, dataset A) with Table B (in project B, dataset B).
I am able to run the query in the BigQuery console and get cross-project access to the tables by specifying the fully qualified table ID, i.e. project.dataset.table.
Is it possible to add both projects A and B to the same service account, so that the client can be initialized with a single Google service account configuration and query tables from both projects?
Thanks.
Yes, it is possible to add the same service account to multiple projects.
Once you have created your service account in one project, copy its e-mail address. Navigate to the Cloud IAM page of your second project and add the service account as a member with the necessary BigQuery role.
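The console steps above can also be scripted with gcloud. A sketch, assuming placeholder project IDs (project-a, project-b) and a service account named bq-reader created in project-a:

```shell
# Placeholder IDs: the service account lives in project-a and is also
# granted read access in project-b.
SA_EMAIL="bq-reader@project-a.iam.gserviceaccount.com"

# Read access in the service account's own project
gcloud projects add-iam-policy-binding project-a \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/bigquery.dataViewer"

# Read access for the same account in the second project
gcloud projects add-iam-policy-binding project-b \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/bigquery.dataViewer"

# Also needed: permission to run query jobs in the project the client bills to
gcloud projects add-iam-policy-binding project-a \
  --member="serviceAccount:${SA_EMAIL}" \
  --role="roles/bigquery.jobUser"
```

With those grants in place, a single Java client initialized with that one service account key can join `project-a.dataset_a.table_a` against `project-b.dataset_b.table_b`.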
There is a project in BigQuery called project1, and it has a dataset called config_prd. I am trying to create a table in the dataset if it does not exist and then update the table each time I trigger the pipeline. Creating and updating the tables are Airflow tasks.
At the moment the DAG fails because
Access Denied: Table project1:config_prd.table1: User does not have bigquery.tables.get
permission for table project1:config_prd.table1.
My Question:
So the Airflow service account needs that permission to check whether the table exists. How can I give the Airflow account the Data Viewer permission on the config_prd dataset?
My suggested solution:
Go to the GCP console > APIs & Services > Credentials > under the Service Accounts section I can see an email address:
airflow@project1.iam.gserviceaccount.com
I have to copy this email address, go to the GCP console > IAM & Admin > IAM > Add
member > enter the email address with the Viewer role.
Please let me know if this is correct.
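If the console steps above are on the right track, the equivalent gcloud command is sketched below, reusing the project and account names from the question. Note that roles/bigquery.dataViewer is usually a better fit than the broad project-wide Viewer role:

```shell
# Sketch: grants BigQuery read access across all datasets in project1.
# To scope access to the config_prd dataset only, use the dataset's
# "Share" / permissions dialog in the BigQuery console instead of a
# project-level binding.
gcloud projects add-iam-policy-binding project1 \
  --member="serviceAccount:airflow@project1.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"
```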
My other question: we have several projects on GCP. Should I do this for every single project?
Another question: how was Airflow previously able to update other tables in the dataset?
Using Google Cloud, there is a BigQuery view that queries tables in two projects.
On the project where the view is located, we wish to run a query against it from Airflow/Composer. Currently it fails with a 403.
AFAIK it will use the default Composer service account; however, that account doesn't have access to the second project used in the SQL of the view.
How do I give composer's service account access to the second project?
Think about a service account like a user account: you have a user email that you authorize on different projects and components. It is exactly the same with a service account email.
A service account belongs to a project; a user account belongs to a domain name/organisation. No real difference in the end.
So, you can use a service account email like any user account:
Grant it authorizations in any project
Add it to Google Groups
Even grant it the Viewer or Editor role on a G Suite document (Sheets, Docs, Slides, ...) to allow it to access and read/update the document, like any user!
EDIT
With Airflow, you can define connections and a default connection. You can use these connections in your DAG and thus use the service account that you want.
I think you have to add the service account to the project's IAM policy.
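For the Composer case above, a minimal sketch (all IDs are placeholders): grant the environment's service account read access in the second project referenced by the view's SQL.

```shell
# Placeholder IDs: Composer runs in project-one; the view also reads project-two.
# Unless a custom service account was configured, a Composer environment uses the
# Compute Engine default service account (<PROJECT_NUMBER>-compute@developer.gserviceaccount.com).
COMPOSER_SA="123456789-compute@developer.gserviceaccount.com"

gcloud projects add-iam-policy-binding project-two \
  --member="serviceAccount:${COMPOSER_SA}" \
  --role="roles/bigquery.dataViewer"
```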
I want to build a service where each customer/company provides a Google Cloud Storage bucket plus a Firestore database, and I want to perform some operations on the bucket files and Firestore (read/write), but I'm not sure what the best way is to get access to their resources.
[my gc project] -> [customer 1 gc project: bucket + firestore]
-> [customer 2 gc project: bucket + firestore]
-> [customer n gc project: bucket + firestore]
Solutions I can imagine:
Request access with OAuth, but then it's more like the user gives me the permissions, not the company
The customer creates a service account and gives me the JSON key
I create a service account for each customer and they have to add it to their project; I don't know if that's possible, and I think there is a limit of about 100 service accounts per project
I create one service account and each customer has to add it to their projects
Some other requirements:
I need access to the customer project in a way that lets me run scheduled jobs in the background
I have to access the customer project from Google Cloud Functions
What would be the best fit for me, or am I missing something?
If the projects will be created by you on their behalf, I suggest creating an organization. In an organization, projects are grouped into folders, similar to a file system. You can then grant access at the folder level and it applies to all the projects inside: https://cloud.google.com/iam/docs/resource-hierarchy-access-control
Otherwise, you will have to ask each customer for a service account key, manually or with a script (your second bullet), or create one unique service account and have each customer add it to their own project (your third bullet).
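Under the organization approach, a single IAM binding at the folder level covers every customer project inside it. A sketch, where the folder ID, service account, and roles are all placeholders chosen for this bucket + Firestore scenario:

```shell
# Placeholder folder ID and service account; pick the narrowest roles that
# cover your actual operations.
SA="worker@my-project.iam.gserviceaccount.com"

# Read/write objects in every bucket under the folder's projects
gcloud resource-manager folders add-iam-policy-binding 123456789 \
  --member="serviceAccount:${SA}" \
  --role="roles/storage.objectAdmin"

# Read/write Firestore documents in those projects
gcloud resource-manager folders add-iam-policy-binding 123456789 \
  --member="serviceAccount:${SA}" \
  --role="roles/datastore.user"
```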
I'm implementing a Cloud Dataflow job on GCP that needs to deal with two GCP projects.
Both input and output are BigQuery partitioned tables.
The issue I'm facing is that I must read data from project A and write it into project B.
I haven't seen anything related to cross-project service accounts, and I can't give Dataflow two different credential keys either, which is a bit annoying.
I don't know if someone else has gone through that kind of architecture, or how you dealt with it.
I think you can accomplish this with the following steps:
Create a dedicated service account in the project running the Dataflow job.
Grant the service account the Dataflow Worker and BigQuery Job User roles. The service account might need additional roles based on the full resource needs of the Dataflow job.
In Project A, grant the service account the BigQuery Data Viewer role to either the entire project or to specific datasets.
In Project B, grant the service account the BigQuery Data Editor role to either the entire project or to specific datasets.
When you start the Dataflow job, override the service account pipeline option, supplying the new service account (--serviceAccount in the Java SDK, --service_account_email in the Python SDK).
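The steps above can be sketched with gcloud; all project IDs and the account name are placeholders, with project-host standing for the project that runs the Dataflow job:

```shell
SA="df-worker@project-host.iam.gserviceaccount.com"

# 1. Dedicated service account in the project running the job
gcloud iam service-accounts create df-worker --project=project-host

# 2. Dataflow Worker and BigQuery Job User in the host project
gcloud projects add-iam-policy-binding project-host \
  --member="serviceAccount:${SA}" --role="roles/dataflow.worker"
gcloud projects add-iam-policy-binding project-host \
  --member="serviceAccount:${SA}" --role="roles/bigquery.jobUser"

# 3. Read access in project A, write access in project B
gcloud projects add-iam-policy-binding project-a \
  --member="serviceAccount:${SA}" --role="roles/bigquery.dataViewer"
gcloud projects add-iam-policy-binding project-b \
  --member="serviceAccount:${SA}" --role="roles/bigquery.dataEditor"

# 4. Launch the pipeline with the worker service account override, e.g.
#    --serviceAccount=df-worker@project-host.iam.gserviceaccount.com        (Java SDK)
#    --service_account_email=df-worker@project-host.iam.gserviceaccount.com (Python SDK)
```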
It is very simple: you need to give the required permissions/access to your service account in both projects.
So you only need one service account that has the required access/permissions in both projects.
Hope it helps.
I have a BigQuery table that references a Google Sheets document that my service account can't access. Currently I'm getting
"Error: Not found: Files /gdrive/id/xxxx"
I'm using the Node.js 3.0.0 client library to access the BigQuery API.
The service account has the BigQuery Data Editor, Job User, and User roles, and I have explicitly shared the Google Sheet with the service account.
For clarity, the account has no issues querying other tables that it has access to, just these external tables.
Any thoughts on what else I might need to do?
After going back through this from the start, it turns out the issue was that the Google Docs editors were disabled at an organisation level. This affects both normal users and service accounts. Ensuring that domain-wide delegation was in place also helped resolve this.