I am trying to update a bunch of rows in a psql database, and would like to do it in one generated sql statement if possible. I am able to generate batch insert statements which look similar to this:
INSERT INTO my_table (col1, col2, col3)
VALUES (v11, v12, v13), (v21, v22, v23), ...
However, I am not sure how to do this with an update statement instead. I could issue one SQL statement for each row I want to update, but this seems unnecessary and slower than having just one statement.
P.S. all rows have an id column so I can reference them through that
Near Bottom of Page
I was able to find the answer at the above link. It looks similar to:
UPDATE my_table
SET x = case
when y = '1' then '1.1'
when y = '2' then '1.2'
end
WHERE y='1' OR y='2';
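For anyone generating this kind of statement from code, here is a minimal sketch of how the CASE-based update above could be built with bound parameters. It uses Python's built-in sqlite3 module as a stand-in so it runs without a server; the helper name build_case_update and the sample data are made up for illustration. With a PostgreSQL driver such as psycopg2 the placeholder would be %s instead of ?.

```python
import sqlite3


def build_case_update(table, key_col, target_col, new_values):
    """Return (sql, params) for one UPDATE that sets target_col per key.

    Hypothetical helper: table and column names are interpolated directly,
    so they must come from trusted code, never from user input.
    """
    if not new_values:
        raise ValueError("new_values must not be empty")
    keys = list(new_values)
    whens = " ".join(f"WHEN {key_col} = ? THEN ?" for _ in keys)
    placeholders = ", ".join("?" for _ in keys)
    sql = (
        f"UPDATE {table} SET {target_col} = CASE {whens} END "
        f"WHERE {key_col} IN ({placeholders})"
    )
    # Parameters: (key, value) pairs for the CASE arms, then the keys for IN.
    params = [p for k in keys for p in (k, new_values[k])] + keys
    return sql, params


# In-memory database with made-up rows, only to show the statement running.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (y TEXT PRIMARY KEY, x TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("1", "a"), ("2", "b"), ("3", "c")],
)

sql, params = build_case_update("my_table", "y", "x", {"1": "1.1", "2": "1.2"})
cur = conn.execute(sql, params)
conn.commit()

rows = conn.execute("SELECT y, x FROM my_table ORDER BY y").fetchall()
print(cur.rowcount, rows)
```

The WHERE clause matters: without it, every row that matches no WHEN arm would have the column set to NULL.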
I am trying to insert in database(Oracle) in python with cx_oracle. I need to select from table and insert into another table.
insert_select_string = "INSERT INTO wf_measure_details(PARENT_JOB_ID, STAGE_JOB_ID, MEASURE_VALS, STEP_LEVEL, OOZIE_JOB_ID, CREATE_TIME_TS) \
select PARENT_JOB_ID, STAGE_JOB_ID, MEASURE_VALS, STEP_LEVEL, OOZIE_JOB_ID, CREATE_TIME_TS from wf_measure_details_stag where oozie_job_id = '{0}'.format(self.DAG_id)"
conn.executemany(insert_select_string)
conn.commit()
insert_count = conn.rowcount
But I am getting the error below. I have no parameters to pass, because the data comes from the select query.
Required argument 'parameters' (pos 2) not found
Please suggest how to solve this
As mentioned by Chris in the comments to your question, you want to use cursor.execute() instead of cursor.executemany(). You also want to use bind variables instead of interpolated parameters in order to improve performance and reduce security risks. Take a look at the documentation. In your case you would want something like this (untested):
cursor.execute("""
INSERT INTO wf_measure_details(PARENT_JOB_ID, STAGE_JOB_ID,
MEASURE_VALS, STEP_LEVEL, OOZIE_JOB_ID, CREATE_TIME_TS)
select PARENT_JOB_ID, STAGE_JOB_ID, MEASURE_VALS, STEP_LEVEL,
OOZIE_JOB_ID, CREATE_TIME_TS
from wf_measure_details_stag
where oozie_job_id = :id""",
id=self.DAG_id)
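To see the bind-variable pattern run end to end without an Oracle instance, here is a rough sketch using Python's built-in sqlite3 module as a stand-in for cx_Oracle (an assumption made only so the example is runnable). The tables are cut down to two columns and the value of dag_id is made up. sqlite3 accepts the same :name placeholder style, but takes the values as a dict rather than as keyword arguments.

```python
import sqlite3

# sqlite3 stands in for cx_Oracle here; tables are reduced to two columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE wf_measure_details_stag (oozie_job_id TEXT, measure_vals TEXT);
    CREATE TABLE wf_measure_details (oozie_job_id TEXT, measure_vals TEXT);
    INSERT INTO wf_measure_details_stag VALUES
        ('job-1', 'a'), ('job-1', 'b'), ('job-2', 'c');
    """
)

dag_id = "job-1"  # hypothetical value standing in for self.DAG_id

cursor = conn.cursor()
# One statement, one execute() call: the rows come from the SELECT,
# so there is no parameter sequence for executemany() to iterate over.
cursor.execute(
    """
    INSERT INTO wf_measure_details (oozie_job_id, measure_vals)
    SELECT oozie_job_id, measure_vals
    FROM wf_measure_details_stag
    WHERE oozie_job_id = :id
    """,
    {"id": dag_id},
)
insert_count = cursor.rowcount  # read from the cursor that ran the statement
conn.commit()

copied = conn.execute("SELECT COUNT(*) FROM wf_measure_details").fetchone()[0]
print(insert_count, copied)
```

Note that rowcount is read from the cursor, not from the connection.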
I have the following query in M:
= Table.Combine({
Table.Distinct(Table.SelectColumns(Tab1,{"item"})),
Table.Distinct(Table.SelectColumns(Tab2,{"Column1"}))
})
Is it possible to get it working without prior changing column names?
I want to get something similar to SQL syntax:
select item from Tab1 union all
select Column1 from Tab2
If you need just one column from each table then you may use this code:
= Table.FromList(List.Distinct(Tab1[item])
& List.Distinct(Tab2[Column1]))
If you use M (as in your example, or the Append Queries option), the column names must be the same, otherwise it won't work.
But it works in DAX with the command
=UNION(Table1; Table2)
https://learn.microsoft.com/en-us/dax/union-function-dax
It's not possible in Power Query M. Table.Combine makes a union of the columns whose names match. If you want to keep everything in the same step, you can put the rename step in place of Tab2, as you did with Table.SelectColumns.
Matching on column names is how the union lines the columns up correctly.
I hope you can manage it in a single step, if that's what you want.
I have the following tables: evaluations, evaluation_options and options. I am trying to create an evaluation and an evaluation_option on one page.
To create the evaluation_option I need the evaluation_id, which only exists after the evaluation has been created. I am getting the option_id from a List of Values.
At this point, I am not sure how to get this done, as I am new to PL/SQL and SQL.
For this, I wrote a dynamic query that inserts into both tables. I don't think this is the best way of getting the job done, and I am open to doing this the right way.
This is my code:
DECLARE
row_id evaluations.id%TYPE;
BEGIN
INSERT INTO EVALUATIONS (class_student_rotations_id, strengths,
suggestions) VALUES (:P12_CLASS_STUDENT_ROTATIONS_ID, :P12_STRENGTHS,
:P12_SUGGESTIONS);
SELECT id into row_id FROM EVALUATIONS WHERE ROWID=(select max(rowid)
from EVALUATIONS);
INSERT ALL
INTO evaluation_options (option_id, evaluation_id) VALUES
(:P12_APPLICATION_OF_BASICS, row_id)
SELECT * FROM DUAL;
END;
From
https://cloud.google.com/bigquery/docs/partitioned-tables:
you can shard tables using a time-based naming approach such as [PREFIX]_YYYYMMDD
This enables me to do:
SELECT count(*) FROM `xxx.xxx.xxx_*`
and query across all the shards. Is there a special notation that queries only the latest shard? For example say I had:
xxx_20180726
xxx_20180801
could I do something along the lines of
SELECT count(*) FROM `xxx.xxx.xxx_{{ latest }}`
to query xxx_20180801?
SINGLE QUERY INSPIRED BY Mikhail Berlyant:
SELECT count(*) as c FROM `XXX.PREFIX_*` WHERE _TABLE_SUFFIX IN ( SELECT
SUBSTR(MAX(table_id), LENGTH('PREFIX_') + 1)
FROM
`XXX.__TABLES_SUMMARY__`
WHERE
table_id LIKE 'PREFIX_%')
If you do care about cost (meaning how many tables will be scanned by your query), the only way is to do it in two steps, as below.
First query
#standardSQL
SELECT SUBSTR(MAX(table_id), LENGTH('PREFIX') + 2)
FROM `xxx.xxx.__TABLES_SUMMARY__`
WHERE table_id LIKE 'PREFIX%'
Second Query
#standardSQL
SELECT COUNT(*)
FROM `xxx.xxx.PREFIX_*`
WHERE _TABLE_SUFFIX = '<result of first query>'
So, if the result of the first query is 20180801, the second query will look like this:
#standardSQL
SELECT COUNT(*)
FROM `xxx.xxx.PREFIX_*`
WHERE _TABLE_SUFFIX = '20180801'
If you don't care about cost and just need the result, you can easily combine the above two queries into one. But remember: even though the result will come from the last table only, the cost will be as if you queried all tables that match xxx.xxx.PREFIX_*
Forgot to mention (even though it should be obvious): when you have only COUNT(1) in your SELECT, the cost will be 0 (zero) for both options. In reality, though, you will most likely select something more valuable than just COUNT(1).
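If the two steps are driven from a script, the glue between them is small. Below is a rough sketch in plain Python of just the string handling: it mimics what MAX(table_id) plus SUBSTR does in the first query and builds the text of the second query. The function names, the table list and the project.dataset name are made up for illustration, and no BigQuery client is called; in practice the table ids would come from the first query's result.

```python
def latest_suffix(table_ids, prefix):
    """Return the suffix of the greatest table id that starts with prefix."""
    matching = [t for t in table_ids if t.startswith(prefix)]
    if not matching:
        raise ValueError(f"no table starts with {prefix!r}")
    # YYYYMMDD suffixes sort chronologically as plain strings,
    # which is the same reason MAX(table_id) works in the first query.
    return max(matching)[len(prefix):]


def count_query(dataset, prefix, suffix):
    """Build the second query, restricted to one shard via _TABLE_SUFFIX."""
    return (
        f"SELECT COUNT(*) FROM `{dataset}.{prefix}*` "
        f"WHERE _TABLE_SUFFIX = '{suffix}'"
    )


# Hypothetical table ids, as the first query might list them.
tables = ["xxx_20180726", "xxx_20180801", "other_20190101"]

suffix = latest_suffix(tables, "xxx_")
query = count_query("project.dataset", "xxx_", suffix)
print(suffix)
print(query)
```

Because the prefix passed in includes the trailing underscore, the suffix that comes back lines up with what _TABLE_SUFFIX holds for the `xxx_*` wildcard.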
I know this is kind of an old thread, but I was surprised that no one has offered an answer using variables.
"Héctor Neri" already mentioned this in the comments, but I thought it would be better to have an actual answer with sample code posted.
#standardSQL
DECLARE SHARD_DATE STRING;
SET SHARD_DATE=(
SELECT MAX(REPLACE(table_name,'{TABLE}_',''))
FROM `{PRJ}.{DATASET}.INFORMATION_SCHEMA.TABLES`
WHERE table_name LIKE '{TABLE}_20%'
);
SELECT * FROM `{PRJ}.{DATASET}.{TABLE}_*`
WHERE _TABLE_SUFFIX = SHARD_DATE
Make sure to replace {PRJ}, {DATASET}, and {TABLE} values with your table location.
If you run this on BigQuery Web UI, you will see this message:
WARNING: Could not compute bytes processed estimate for script.
But after running the script you can see that the variable properly reduces the table scan to the latest shard and does not cause any extra cost.
I have a table that houses information uploaded from a template (via another application). I noticed that the year was wrong (a code issue in the application), which caused about 3000 rows with incorrect dates. My question is: how would I write a query to replace all the 20150101 (incorrect date) values with 20160101 (correct date)? I am pretty sure it's an UPDATE statement, but I am not a SQL programmer so I am a tad lost. I am using the latest SSMS.
Table: TRANS_USER_FORECAST_EDITS_FROM_EXCEL
Column Name: mo_day_year
Do a SELECT first to see how many records you need to update:
SELECT * FROM TRANS_USER_FORECAST_EDITS_FROM_EXCEL
WHERE mo_day_year = '20150101'
Then note the result and run the query below to update all the records:
BEGIN TRAN
UPDATE TRANS_USER_FORECAST_EDITS_FROM_EXCEL SET mo_day_year = '20160101'
WHERE mo_day_year = '20150101'
COMMIT
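The same check-then-update idea can also be scripted, so that the commit only happens when the number of updated rows matches the count you saw first. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so it runs anywhere); the short table name forecast_edits and the sample rows are made up.

```python
import sqlite3

# In-memory stand-in for the real table, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE forecast_edits (id INTEGER PRIMARY KEY, mo_day_year TEXT)"
)
conn.executemany(
    "INSERT INTO forecast_edits (mo_day_year) VALUES (?)",
    [("20150101",), ("20150101",), ("20160201",)],
)
conn.commit()

old, new = "20150101", "20160101"

# Step 1: count the rows that should change.
expected = conn.execute(
    "SELECT COUNT(*) FROM forecast_edits WHERE mo_day_year = ?", (old,)
).fetchone()[0]

# Step 2: update inside a transaction; commit only if the count matches.
cur = conn.execute(
    "UPDATE forecast_edits SET mo_day_year = ? WHERE mo_day_year = ?",
    (new, old),
)
if cur.rowcount == expected:
    conn.commit()
else:
    conn.rollback()

remaining = conn.execute(
    "SELECT COUNT(*) FROM forecast_edits WHERE mo_day_year = ?", (old,)
).fetchone()[0]
print(expected, cur.rowcount, remaining)
```

If the two numbers ever disagree, the rollback leaves the table exactly as it was.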
As you noted, it's indeed an update statement:
UPDATE TRANS_USER_FORECAST_EDITS_FROM_EXCEL
SET mo_day_year = 20160101
WHERE mo_day_year = 20150101