SQLAlchemy: update result set of a raw query - python-2.7

I am new to SQLAlchemy. I have a raw SQL query which I need to execute by passing bind parameters. For the rows returned by the query, I need to update a particular column value. How do I do this in an efficient way?
Below are the columns in my metrics table:
TABLE
id,total,pass,fail,category,ref_id
query = "Select * from table where id in(select max(id) from table ...)"
sql = text(query)
result = db.engine.execute(sql, CATEGORY=category)
for row in result:
//update here
So I have this complex query that I need to execute as an inline query. Let's say I get three rows from my query and I need to update ref_id for all three rows with a value. How can I achieve this, preferably as a bulk update?
I am using Python 2.7, SQLAlchemy==0.9.9, SQLAlchemy-Utils==0.29.8.
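One way to do this efficiently (a minimal sketch, not tested against SQLAlchemy 0.9.9 specifically; the metrics table name, the new_ref_id value, and the row_id parameter name are assumptions for illustration) is to run the SELECT once, collect the matching rows, and then issue the UPDATE as a single execute() call with a list of parameter dictionaries, which SQLAlchemy sends as an executemany batch instead of one statement per row:

from sqlalchemy import text

# Re-use the (elided) filter from the question to find the rows of interest.
select_sql = text("SELECT id FROM metrics WHERE id IN (SELECT max(id) FROM metrics ...)")
rows = db.engine.execute(select_sql, CATEGORY=category).fetchall()

# One parameter dict per row; new_ref_id is whatever value you want to set.
params = [{"row_id": row.id, "ref_id": new_ref_id} for row in rows]

# Passing a list of dicts makes execute() run an executemany-style bulk UPDATE.
update_sql = text("UPDATE metrics SET ref_id = :ref_id WHERE id = :row_id")
if params:
    db.engine.execute(update_sql, params)

If the same ref_id applies to every matched row, you can skip the SELECT entirely and run a single UPDATE whose WHERE clause contains the same subquery.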

Related

Update multiple rows with differing values against a subset of a table

We are trying to update around 3000 rows in a table of roughly 300,000+ entries, with each update value being unique. The problem is that it takes around an hour, and we want to see how we can optimize the query. Is there any way to create a temporary subset of the table that still references the original table?
Currently, we are doing this (x3000):
UPDATE table SET col1 = newVal1 WHERE col1 = oldVal1
UPDATE table SET col1 = newVal2 WHERE col1 = oldVal2
UPDATE table SET col1 = newVal3 WHERE col1 = oldVal3
.... etc (x3000)
There is probably an issue with updating a column that we are also searching on, which puts more load on the query, but this is a necessity as it's the only way to track the exact row.
Using MySQL Workbench and the InnoDB engine.
We are debating whether indexing col1 would be effective, since we are also writing over col1.
We have also tried to use a View of just the rows we want to update, but there is no difference in how long it takes.
Any suggestions?
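One way to act on the question's own idea of a temporary subset (a sketch only; the updates_tmp name and the VARCHAR(255) type are assumptions, adjust to your actual column type) is to bulk-load the 3000 old/new pairs into a temporary table and apply them with a single multi-table UPDATE, so MySQL executes one statement instead of 3000:

CREATE TEMPORARY TABLE updates_tmp (
    old_val VARCHAR(255) NOT NULL,
    new_val VARCHAR(255) NOT NULL,
    PRIMARY KEY (old_val)
);

-- Load the pairs in bulk (multi-row INSERT shown; LOAD DATA also works).
INSERT INTO updates_tmp (old_val, new_val) VALUES
    ('oldVal1', 'newVal1'),
    ('oldVal2', 'newVal2'),
    ('oldVal3', 'newVal3');

-- One UPDATE joined against the temporary table replaces the 3000 statements.
UPDATE `table` t
JOIN updates_tmp u ON t.col1 = u.old_val
SET t.col1 = u.new_val;

An index on col1 should still help here: it is used to locate the rows for the join, and the cost of rewriting the indexed values is paid either way.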

How to add column with query folding using snowflake connector

I am trying to add a new column to a Power Query result that is the result of subtracting one column from another. According to the Power BI documentation, basic arithmetic is supported with query folding, but for some reason it is showing a failure to query fold. I also tried simply adding a column populated with the number 1 and it still was not working. Is there some trick to getting query folding of a new column to work on Snowflake?
If the computation is based only on data from the source, then it could be computed during table import as a SQL statement:
SELECT col1, col2, col1 + col2 AS computed_total
FROM my_table_name
EDIT:
The problem with this solution is that native SQL statements for Snowflake are only supported in Power BI Desktop, and I want to have this stored in a dataflow (i.e. the Power BI web client) for reusability and other reasons.
Option 1:
Create a view instead of a table at the source:
CREATE OR REPLACE VIEW my_view
AS
SELECT col1, col2, col1 + col2 AS computed_total
FROM my_table_name;
Option 2:
Add a computed column to the table:
ALTER TABLE my_table_name
ADD COLUMN computed_total NUMBER(38,4) AS (col1 + col2);

Query Google Big Query Table by today's date

I have a query pertaining to Google BigQuery tables. We are currently looking to query the BigQuery table based on the file uploaded that day into Cloud Storage.
Meaning:
I have to load the data into the BigQuery table based on every day's data in Cloud Storage.
When I query:
select * from BQT where load_date = <TODAY'S DATE>
Can we achieve this without adding the date field to the file?
If you just don't want to add a date column, append the current date as a suffix to your table name, like BQT_20200112, when the GCS file is uploaded.
Then you can query the table for a specific date with the _TABLE_SUFFIX syntax.
Below is an example query using _TABLE_SUFFIX:
SELECT
field1,
field2,
field3
FROM
`your_dataset.BQT_*`
WHERE
_TABLE_SUFFIX = '20200112'
As you can see, you don't need to add an additional field like load_date when you query the tables using the date suffix and the wildcard symbol.

Add column with hardcoded values in PowerBI

It's probably extremely simple but I can't find an answer. I have created a new column and I would like to use the DAX syntax to fill the column with hardcoded values.
I can write this: Column = 10 and I will get a column of 10s but let's say my table has 3 rows and I would like to insert a column with [10, 17, 155]. How can I do that?
Try using the DATATABLE function:
Table = DATATABLE("Column Name",INTEGER,{{10},{17},{155}})
You can also put more columns with their own data if you want to; check this:
https://learn.microsoft.com/en-us/dax/datatable-function
Assuming your table has a primary key column, say, ID, you could create a new table with just the column you want to manually input.
ID Value
---------
1 10
2 17
3 155
You can create this table either through the Enter Data button or create it using the DAX DATATABLE function as #Deltapimol suggests.
Once you have this table you can create a relationship to your existing table in the data model at which point you can either use this new table in your report to get the values you need or if you really need them in the existing table for some reason, you can pull them over using the RELATED function in a calculated column.
Table1 = GENERATESERIES(1, 3)
Table2 = DATATABLE(
    "ID", INTEGER,
    "Value", INTEGER,
    {{1, 10},{2, 17},{3, 155}}
)
Now you can create a relationship from Table1 to Table2[ID] and then define a calculated column on Table1 as follows:
ValueFromTable2 = RELATED(Table2[Value])
If you don't want to create a relationship, then you could use the LOOKUPVALUE function instead in a calculated column on Table1.
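A sketch of that alternative, using the same hypothetical Table1/Table2 names as above (note that GENERATESERIES names its output column [Value], so Table1[Value] is the ID being looked up):

ValueFromTable2 =
LOOKUPVALUE(
    Table2[Value],    -- column to return
    Table2[ID],       -- column to search
    Table1[Value]     -- search value from the current row of Table1
)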

Is it possible to remove a column from a partitioned table in Google BigQuery?

I'm trying to remove a column from a partitioned table in BigQuery using this command
bq query --destination_table [DATASET].[TABLE_NAME] --replace --use_legacy_sql=false 'SELECT * EXCEPT(column) FROM [DATASET].[TABLE_NAME]'
As a result the unwanted column is removed and the schema is changed, but the data is no longer partitioned.
Any suggestion on how to keep the data partitioned after the column is removed? The docs are clear only for non-partitioned tables.
There are two workarounds you can use:
Use a column-partitioned table, which means it is partitioned on the value of a regular column in the table. You can create a new column-partitioned table and copy the data, dropping the unwanted column:
bq mk --time_partitioning_field=pt --schema=... [DATASET].[TABLE_NAME2]
bq query --destination_table=[DATASET].[TABLE_NAME2] "SELECT _PARTITIONTIME as pt, * EXCEPT(column) from [DATASET].[TABLE_NAME]"
You can also still use day-partitioned tables, but copy the data using DML. You can set or copy the _PARTITIONTIME column inside the DML INSERT statement, which is not possible with a regular SELECT. Here is an example:
INSERT INTO dataset1.table1 (_partitiontime, a, b)
SELECT
  TIMESTAMP(DATE "2008-12-25") AS _partitiontime,
  "a" AS a,
  "b" AS b
This requires DML over partitioned tables, which is currently in alpha: https://issuetracker.google.com/issues/36383555
BigQuery now supports DROP COLUMN in partitioned tables:
ALTER TABLE mydataset.mytable
DROP COLUMN column
It's in beta at the time of writing, but it worked for me.