I have a question pertaining to Google BigQuery tables. We are currently looking to query the BigQuery table based on the file uploaded that day into Cloud Storage.
Meaning:
Every day I have to load that day's Cloud Storage data into the BigQuery table.
When I query:
select * from BQT where load_date = <TODAY'S DATE>
Can we achieve this without adding a date field to the file?
If you just don't want to add a date column, append the current date as a suffix to your table name, like BQT_20200112, when the GCS file is uploaded.
Then you can query the table for a specific date using the _TABLE_SUFFIX pseudo-column.
Below is an example query using _TABLE_SUFFIX:
SELECT
field1,
field2,
field3
FROM
`your_dataset.BQT_*`
WHERE
_TABLE_SUFFIX = '20200112'
As you can see, you don't need to add an additional field like load_date when you query the tables using a date suffix and the wildcard symbol.
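If the daily load is scripted, here is a minimal sketch using the google-cloud-bigquery Python client that loads each day's file into a date-suffixed table (the project, dataset, bucket, and file names are hypothetical):

from datetime import date
from google.cloud import bigquery

client = bigquery.Client()

# Date-suffixed destination table, e.g. BQT_20200112 (names are hypothetical).
table_id = "your_project.your_dataset.BQT_" + date.today().strftime("%Y%m%d")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,  # infer the schema from the file
)

# Load today's GCS file into the date-suffixed table; the load job
# creates the table if it does not exist yet.
load_job = client.load_table_from_uri(
    "gs://your-bucket/daily-upload.csv", table_id, job_config=job_config
)
load_job.result()  # wait for the load job to finish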
I have two tables: one contains user data and the other is my dates table.
Is it possible to copy each row from the users table for each date from the dates table, like below?
It is possible using Power Query transformations:
Add a custom column whose formula is the name of the second table.
Then expand that column.
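This is effectively a cross join. If you would rather reproduce it in code than in the Power Query UI, here is a minimal pandas sketch (table and column names are hypothetical; how="cross" requires pandas 1.2+):

import pandas as pd

# Hypothetical stand-ins for the users and dates tables.
users = pd.DataFrame({"user": ["Alice", "Bob"]})
dates = pd.DataFrame({"date": pd.date_range("2020-01-01", periods=3)})

# Cross join: every user row is repeated for every date row, which is
# exactly what "add custom column, then expand" does in Power Query.
result = users.merge(dates, how="cross")
print(result)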
I'm new to SQLAlchemy. I have a raw SQL query which I need to execute by passing bind parameters. For the rows resulting from the query, I need to update a particular column value. How do I do this in an efficient way?
Below are the columns in my table metrics:
id, total, pass, fail, category, ref_id
query = "Select * from table where id in(select max(id) from table ...)"
sql = text(query)
result = db.engine.execute(sql, CATEGORY=category)
for row in result:
//update here
So I have this complex query that I need to execute as an inline query. Let's say I get three rows from my query and I need to update ref_id for all three rows with a value. How can I achieve this, preferably as a bulk update?
I am using Python 2.7, SQLAlchemy==0.9.9, and SQLAlchemy-Utils==0.29.8.
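A minimal sketch of one way to do this on that stack: collect the ids from the first query, then issue a single UPDATE with an IN clause. The metrics table is from the question; new_ref_id and the bind-parameter names are hypothetical.

from sqlalchemy import text

# Collect the ids of the rows returned by the first query.
ids = [row.id for row in result]

if ids:
    # One named bind parameter per id, e.g. :id0, :id1, :id2 ...
    placeholders = ", ".join(":id%d" % i for i in range(len(ids)))
    params = dict(("id%d" % i, v) for i, v in enumerate(ids))
    params["ref"] = new_ref_id  # hypothetical new value for ref_id

    # A single bulk UPDATE instead of one statement per row.
    db.engine.execute(
        text("UPDATE metrics SET ref_id = :ref WHERE id IN (%s)" % placeholders),
        **params
    )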
Is there any way to add a date picker to a report, and based on the date picker value, get live data from SQL Server on demand?
Thanks in advance.
You can pass the date filters in the URL of your report. This means adding a query string parameter like this:
?filter=SalesData/SaleDate ge 2018-01-01 and SalesData/SaleDate le 2018-12-31
This filters the SalesData table on the SaleDate field to show only data for 2018.
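Put together, a complete filtered report URL (the group and report IDs are placeholders) would look like this:

https://app.powerbi.com/groups/xxxxx/reports/xxxxx?filter=SalesData/SaleDate ge 2018-01-01 and SalesData/SaleDate le 2018-12-31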
You can create a helper report with a date table and a slicer, which will generate the actual link to your report. To do this, create a new calendar table (from Modeling -> New Table) with the dates you want, e.g.:
Date = CALENDAR (DATE(2000;1;1); DATE(2030;12;31))
Create a new measure to calculate the filtered URL:
ReportUrl = "https://app.powerbi.com/groups/xxxxx/reports/xxxxx?filter=SalesData/SaleDate ge " &
FORMAT(MIN('Date'[Date]); "yyyy-MM-dd") &
" and SalesData/SaleDate se " & FORMAT(MAX('Date'[Date]); "yyyy-MM-dd")
Do not forget to replace xxxxx with the actual group and report IDs. Make sure the measure's data category is Web URL to make it a hyperlink, and show it somewhere the user can click on it.
I'm trying to remove a column from a partitioned table in BigQuery using this command:
bq query --destination_table [DATASET].[TABLE_NAME] --replace --use_legacy_sql=false 'SELECT * EXCEPT(column) FROM [DATASET].[TABLE_NAME]'
As a result, the unwanted column is removed and the schema is changed, but the data is no longer partitioned.
Any suggestions on how to keep the data partitioned after the column is removed? The docs are clear only for non-partitioned tables.
There are two workarounds you can use:
Use a column-partitioned table, which means it is partitioned on the value of a regular column. You can create a new column-partitioned table and copy the data while deleting the unwanted column:
bq mk --time_partitioning_field=pt --schema=... [DATASET].[TABLE_NAME2]
bq query --destination_table=[DATASET].[TABLE_NAME2] "SELECT _PARTITIONTIME as pt, * EXCEPT(column) from [DATASET].[TABLE_NAME]"
You can also still use day-partitioned (ingestion-time partitioned) tables, but copy the data using DML. You can set or copy the _PARTITIONTIME column inside a DML INSERT statement, which is not possible with a regular SELECT. Here is an example:
INSERT INTO
dataset1.table1 (_partitiontime,
a,
b)
SELECT
TIMESTAMP(DATE "2008-12-25") AS _partitiontime,
"a" AS a,
"b" AS b
This requires DML over partitioned tables, which is currently in alpha: https://issuetracker.google.com/issues/36383555
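For completeness, here is a minimal sketch that runs such a partition-preserving copy through the google-cloud-bigquery Python client, this time copying _PARTITIONTIME from the source table instead of setting a literal (dataset, table, and column names are hypothetical):

from google.cloud import bigquery

client = bigquery.Client()

# Copy all rows into the new table, carrying _PARTITIONTIME along and
# leaving out the unwanted column (names are hypothetical).
sql = """
INSERT INTO `your_dataset.table_new` (_partitiontime, a, b)
SELECT _PARTITIONTIME, a, b
FROM `your_dataset.table_old`
"""
client.query(sql).result()  # wait for the DML job to complete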
BigQuery now supports DROP COLUMN in partitioned tables:
ALTER TABLE mydataset.mytable
DROP COLUMN column
It's in beta at the time of writing, but it worked for me.
I have 2 tables; let's name them table1 and table2. Both of them have credit_id, loan_id, and Date fields. The credit_id field in table1 needs to be updated with the corresponding values from table2, linking the data by the Date and loan_id fields. To do so, I made a query like:
proc sql;
UPDATE a
SET a.credit_id = b.credit_id
FROM table1 a, table2 b
WHERE (a.Date = b.Date) AND (a.loan_id = b.loan_id);
quit;
According to my googling, this query should work in many SQL environments, but it seems that SAS is an exception here: the FROM part appears to be ignored.
How do I update the needed field then?
I can't comment on the SQL, but you can do the same thing using a data step:
data table1;
update table1 table2(keep = date loan_id credit_id);
by date loan_id;
run;
This requires that:
No two rows in the same table have the same date and loan_id, and
Both tables are sorted/indexed by date and loan_id
You need the keep= option on the transaction dataset to prevent it from updating or creating any other variables on the master dataset. There are also several other ways you could do this, e.g. using the modify or merge statements.