Every time I write a query using the bq CLI tool I have to specify --use_legacy_sql=false. I need to configure this as a default. Google's documentation states it must be set in the file $HOME/.bigqueryrc, but there is no such file. In fact, I couldn't find a file named .bigqueryrc anywhere on the server. Please help me save a few seconds every time I write a query. Thanks.
Following the comment conversation above: the file under discussion, .bigqueryrc, can be created manually and populated with default flag values as described in the "Setting default values for command-line flags" BigQuery documentation.
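For reference, a minimal .bigqueryrc following the format described in that documentation could look like the sketch below: global flags first, then a bracketed section per command (my-project is just a placeholder project ID).
--project_id=my-project

[query]
--use_legacy_sql=false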
I am trying to figure out how to specify the dataset location in a BigQuery API query using v0.27 of the BigQuery API.
I have a dataset located in northamerica-northeast1 and the BigQuery API is returning 404 errors since this is not the default multi-regional location "US."
I am using the run_async_query method to execute my queries, but based on the documentation I am unsure how to add a location to this call to make it location-aware.
I have also previously tried updating my client instantiation like this:
from google.cloud import bigquery

def _get_client(self):
    # Request BigQuery, cloud-platform, and Drive scopes.
    bigquery.Client.SCOPE = (
        'https://www.googleapis.com/auth/bigquery',
        'https://www.googleapis.com/auth/cloud-platform',
        'https://www.googleapis.com/auth/drive')
    client = bigquery.Client.from_service_account_json(_KEY_FILE)
    # Attempt to pin the client to the configured dataset location.
    if self._params['bq_data_location'].strip():
        client.location = self._params['bq_data_location']
    return client
However, it does not appear that this is the correct way to inform the BigQuery API of a dataset location.
For additional context, in the SQL that I am passing to the BigQuery API I am already specifying PROJECT_ID.DATASET_ID.TABLE_ID; however, this does not seem to be sufficient to find regional data.
Furthermore, I am making this request from Google App Engine using the CRMint open source data flow platform.
Can you please help me with an example of how location can be added to the BigQuery API for v0.27 so that the API does not return 404?
Thank you!
From the code sample it seems you're likely talking about google-cloud-bigquery 0.27, which was released in Aug 2017 and predates location support (as well as many other features).
Your best bet is to update that dependency to something more recent.
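For example, once upgraded, recent versions of google-cloud-bigquery let you pass the dataset location directly when running a query. A minimal sketch, assuming an upgraded library (the location argument does not exist in 0.27); the query and region string are placeholders, and _KEY_FILE is the key file from your existing code:
from google.cloud import bigquery

client = bigquery.Client.from_service_account_json(_KEY_FILE)

# Tell the API where the dataset lives; without this, it defaults to the
# US multi-region and returns 404 for regional datasets.
query_job = client.query(
    "SELECT COUNT(*) FROM `PROJECT_ID.DATASET_ID.TABLE_ID`",
    location="northamerica-northeast1",
)
for row in query_job.result():
    print(row)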
Reference: https://cloud.google.com/logging/docs/reference/libraries#log-sinks
Apparently you can create a log sink and export logs to Cloud Storage.
But how does it work? What will be the exported format of those logs? Will I get 1 file per log entry? Will it be like a .txt file? What is the exported format?
From the documentation, this is not clear to me.
The log format is JSON, with one entry per line, and follows this format. It is not really clear in the documentation, but you can find the information here.
Here is what I've found out:
When you select your project's bucket, a folder is created with the same name as your logName (in my case, logName == admin-logs), and inside it there are folders for year, month, and day. For example, today is 2021-02-26 and this was the result:
As per the file, a JSON file is created. From the file name, I'm guessing the sink export happens once every hour.
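If it helps, here is a minimal sketch that reads those exported files back and parses each line as a JSON log entry. It assumes google-cloud-storage is installed, and "my-logs-bucket" and the "admin-logs/2021/02/26/" prefix are placeholders matching the folder layout described above.
import json
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-logs-bucket")

# Each exported file is newline-delimited JSON: one LogEntry object per line.
for blob in bucket.list_blobs(prefix="admin-logs/2021/02/26/"):
    for line in blob.download_as_text().splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        print(entry.get("timestamp"), entry.get("severity"), entry.get("logName"))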
Currently my VDB DDL file is getting quite big. I want to split it into different files using the following.
IMPORT FROM REPOSITORY "DDL-FILE"
INTO test OPTIONS ("ddl-file" '/path/to/schema1.ddl')
However, this does not seem to work.
Can the DDL file path be relative? If so, how?
Can the schema test be VIRTUAL?
Does "DDL-FILE" refer to "ddl-file"?
What should I put in my main VDB DDL and what should I put in my extra DDLs? Should the extra DDLs contain server configuration details, or should they be defined as a VDB?
I would like to see a working example on how to use this.
This will be used in a Teiid Spring Boot project, where you can only load one main VDB file; having one very large DDL file is not workable.
I tried multiple approaches, but none seem to work, either giving me a null pointer exception with no error codes or error codes that tell me nothing.
Also the syntax in Teiid 9.3 seems different:
IMPORT FOREIGN SCHEMA public
FROM REPOSITORY DDL-FILE
INTO test OPTIONS ("ddl-file" '/path/to/schema.ddl')
This feature is currently not implemented in Teiid Spring Boot. This issue is captured in https://issues.redhat.com/browse/TEIIDSB-219
Update: I added the needed code to master; it should be available with the 1.7 release. Meanwhile, you can build the master branch and test it out.
Is there a way to bulk tag bigquery tables with python google.cloud.datacatalog?
If you want to take a look at sample code that uses the Python google.cloud.datacatalog client library, I've put together an open source utilities script that creates Tags in bulk using a CSV as the source. If you want to use a different source, you may use this script as a reference; hope it helps.
create bulk tags from csv
For this purpose you may consider using the DataCatalogClient() client from the google.cloud.datacatalog_v1 module, part of the PyPI google-cloud-datacatalog package, which leverages the Google Cloud Data Catalog API service.
First, enable the Data Catalog and BigQuery APIs in your project.
Install the Python Cloud Client Library for the Data Catalog API:
pip install --upgrade google-cloud-datacatalog
Set up authentication by exporting the GOOGLE_APPLICATION_CREDENTIALS environment variable, pointing it to the JSON file that contains your service account key:
export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/[FILE_NAME].json"
Refer to this example from the official documentation, which clearly shows how to create a Data Catalog tag template and attach the appropriate tag fields to the target BigQuery table using the create_tag_template() function.
If you have any doubts, feel free to extend your initial question or add a comment below this answer, so we can address your particular use case according to your needs.
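For the bulk part specifically, here is a minimal sketch, assuming a recent google-cloud-datacatalog, an already-created tag template named my_template in us-central1 with a string field "owner", and placeholder project/table names. It loops over BigQuery tables and attaches a tag to each one:
from google.cloud import datacatalog_v1

datacatalog = datacatalog_v1.DataCatalogClient()

project_id = "my-project"  # placeholder project
template_name = datacatalog.tag_template_path(project_id, "us-central1", "my_template")

# Tables to tag in bulk (placeholder dataset.table names).
tables = ["my_dataset.table_a", "my_dataset.table_b"]

for table in tables:
    dataset_id, table_id = table.split(".")
    # Look up the Data Catalog entry that represents the BigQuery table.
    resource = (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )
    entry = datacatalog.lookup_entry(request={"linked_resource": resource})

    # Build a tag from the existing template and attach it to the entry.
    # Assumes the template defines a string field with ID "owner".
    tag = datacatalog_v1.types.Tag()
    tag.template = template_name
    tag.fields["owner"] = datacatalog_v1.types.TagField()
    tag.fields["owner"].string_value = "data-team"
    datacatalog.create_tag(parent=entry.name, tag=tag)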
I'm using Siddhi to create an app which also interacts with a PostgreSQL DB. Although I'm not sure, I believe there is a bug with making multiple updates on the same PG table within a single event (i.e., upon receiving an event, update a record in the table and then create another one in the same table); it seems the batch updates are causing some problems. So, I just want to give it a try after disabling batchUpdate (it is enabled by default). I just don't know how to configure this using the siddhi-sdk (via the IntelliJ plugin). There are two related tickets:
https://github.com/wso2-extensions/siddhi-store-rdbms/issues/43
https://github.com/wso2/product-sp/issues/472
Until these are documented, I'd like to get a quick response on how to set these fields.
Best regards...
When batchEnable is set to true, the extension performs insert/update operations on batches of events instead of performing those operations on each and every single event. Simply put, this was introduced to improve performance.
The default value of this parameter is currently set to "true".
However, the batchEnable configuration is done through a system parameter named "{{RDBMS-Name}}.batchEnable", which has to be configured in the WSO2 Stream Processor's deployment.yaml.
If you want to override this property in Product-SP, please find the steps below.
Open the deployment.yaml file located in {Product-SP-Home}/conf/editor/
Insert the following lines in the file.
siddhi:
  extensions:
    extension:
      name: store
      namespace: rdbms
      properties:
        PostgreSQL.batchEnable: true
But currently there is no way to overwrite those system configurations from the Siddhi app level. Since you are using the SDK, what you can do is change the default value of the above parameter to "false".
Please find the steps below to do it.
Find the siddhi-store-rdbms-4.x.xx.jar file in the Siddhi SDK. This is located in {siddhi-sdk-home}/lib/.
Open the jar file using an archive manager and open the rdbms-table-config.xml file located inside it with a text editor.
Change <batchEnable>true</batchEnable> to <batchEnable>false</batchEnable> under the <database name="PostgreSQL"> tag and save it.
Thanks Raveen. With a simple dash (-) before "extension", I was able to set the config:
siddhi:
  extensions:
    - extension:
        name: store
        namespace: rdbms
        properties:
          PostgreSQL.batchEnable: false