I'm doing a count from table1 whose records/rows don't exist in table2
Here is the query:
select count(1) from table1
where not exists (select 1 from table2 where
table1.col1 = table2.col1
and table2.id=1)
I need to see the records that are missing in table2 , whose id in table2=1, and these records should be available in table1. The PK here is col1.
The query returns me 0. But if I do an excel sheet comparing by removing both the tables to excel. I can find 1591 records that are missing from table1 and are available in table2.
Your query is working fine.
You query finds records that EXISTS in table1 but not in table2
You have found with excel records that does NOT EXISTS in table1 and EXISTS in table2
If you'd like to find these records with SQL than your query should be:
select count(1) from table2
where table2.id=1 and table2.col1 not in (select col1 from table1)
or with not exists version of this query:
select count(1) from table2
where table2.id=1 and
not exists (select 1 from table1 where table1.col1=table2.col1)
I didn't test the queries.
Related
I have an AWS database with multiple tables that I am trying to get the row counts for in a single query.
The ideal query output would be:
table_name row_count
table2_name row_count
etc...
So far I've been able to either get all the table names from the database or all the rowcounts of the tables (in random order), but not both in the same query.
This query returns a column of all the table names that exist in the database:
SELECT table_name FROM information_schema.tables WHERE table_schema = '<database_name>';
This query returns all the row counts for the tables:
SELECT COUNT(*) FROM table_name
UNION ALL
SELECT COUNT(*) FROM table2_name
UNION ALL
etc..for the rest of the tables
The issue with this query is that is displays the row counts in a random order that doesn't correspond with the order of the tables in the query, and so I don't know which row count goes with which table - hence why I need both the table names and row counts.
Simply add the names of the tables as literals in your queries:
SELECT 'table_name' AS table_name, COUNT(*) AS row_count FROM table_name
UNION ALL
SELECT 'table_name2' AS table_name, COUNT(*) AS row_count FROM table_name2
UNION ALL
…
The following query generates the UNION query to produce counts of all records.
The problem to solve is that (as of December 2022) INFORMATION_SCHEMA.TABLES incorrectly defines every table and view as a BASE TABLE so you will need some logic to eliminate the views.
In Data Warehousing it is common practise to record snapshots of the record counts of landing tables at frequent intervals. Any unexpected deviations from expected counts can be used for reporting/alerting
WITH Table_List AS (
SELECT table_schema,table_name, CONCAT('SELECT CURRENT_DATE AS run_date, ''',table_name, ''' AS table_name, COUNT(*) AS Records FROM "',table_schema,'"."', table_name, '"') AS BaseSQL
FROM INFORMATION_SCHEMA.TABLES
WHERE
table_schema = 'YOUR_DB_NAME' -- Change this
AND table_name LIKE 'YOUR TABLE PATTERN%' -- Change or remove this line
)
, Total_Records AS (
SELECT COUNT(*) AS Table_Count
FROM Table_List
)
SELECT
CASE WHEN ROW_NUMBER() OVER (ORDER BY table_name) = Table_Count
THEN BaseSQL
ELSE CONCAT(BaseSql, ' UNION ALL') END AS All_Table_Record_count_SQL
FROM Table_List CROSS JOIN Total_Records
ORDER BY table_name;
How to do a subquery in the FROM clause in cube.js. There is only one table.
SELECT col_a FROM
(SELECT col_a, col_b FROM table_name WHERE col_c=some_val ORDER BY time DESC)
WHERE col_b=some_other_value
if you are in the same Table you could define a cube.js segment in your schema then specify the segment you are querying in your JSON Query. Refer this page to in the Docs.
I got this statement, which works in Oracle:
update table a set
a.attribute =
(select
round(sum(r.attribute1),4)
from table2 p, table3 r
where 1 = 1
and some joins
)
where 1 = 1
and a.attribute3 > 10
;
Now I would like to do the same statement in Exasol DB. But I got error [Code: 0, SQL State: 0A000] Feature not supported: this kind of correlated subselect (Session: 1665921074538906818)
After some research, I found out you need to write the query in following syntax:
UPDATE table a
set a.attribute = r.attribute2
FROM table a, table2 p, table3 r
where 1 = 1
and some joins
and a.attribute3 > 10;
The problem is I can't take sum of r.attribute2. So I get unstable set of rows. Is there any way to do the first query in Exasol DB?
Thanks for help guys!
Following SQL UPDATE statement will work for cases if JOIN between table1 and table2 are 1-to-1 (or if there is a 1-to-1 relation between target table and resultset of JOINs)
In this case target table val column is updated otherwise an error is returned
UPDATE table1 AS a
SET a.val = table2.val
FROM table1, table2
WHERE table1.id = table2.id;
On the other hand, if the join is causing multiple returns for single table1 rows, then the unstable error raised.
If you want to sum the column values of the multiplying rows, maybe following approach can help
First sum all rows of table2 in bases of table1 and use this sub-select as a new temp table, then use this in UPDATE FROM statement
UPDATE table1 AS a
SET a.val = table2.val
FROM table1
INNER JOIN (
select id, sum(val) val from table2 group by id
) table2
ON table1.id = table2.id;
I tried to solve the issue using two tables
In your case probably you will use table2 and table3 in the subselect statement
I hope this is the answer you were looking for
I have 2 tables that contains var values like (ID, date, var1,var2,var3....)
I need to get the data from table2 and add it to table1 for which (ID or date) does not exist in table1.
I am using the below code in sql to get new ID's from tab2 to tab1:
INSERT INTO table1
SELECT * FROM table2 a
WHERE ID not in(select ID from table1 where ID=a.ID)
Here is the code to add new date for existing ID's in tab2 to tab1 :
INSERT INTO table1
SELECT * FROM table2 a
WHERE date not in(select date from table1 where ID=a.ID)
I don't know how to do this in proc sql.
Please share an effective method to do this task.
To insert the new ID I used:
proc sql;
create table lookup as
select a.ID
from table1 a inner join table2 b
on a.ID = b.ID
;
insert into table1
select * from table2 a
where a.ID not in (select ID from lookup)
;
quit;
This works well. But, it failed to insert a date for existing IDs.
Please suggest some ideas to complete this step.
Thanks in advance!
SAS SQL is Similar to the SQl you have written.
The SAME insert statements can be warped as proc sqls in SAS and they work like charm.
If your SQL did work then the following will work too.
PROC SQL;
INSERT INTO work.table1
SELECT * FROM work.table2 a
WHERE ID not in(select ID from work.table1 where ID=a.ID);
INSERT INTO work.table1
SELECT * FROM work.table2 a
WHERE date not in(select date from work.table1 where ID=a.ID)
QUIT;
I am trying to join two datasets. The first dataset1 has two columns item and price. The second dataset2 has three columns - item, customerid, and qty. I need to only include the unique rows from dataset1 that are not in dataset2. While trying to implement this code, I get the error:
Error: Unresolved reference to table/correlation name i.
I am unsure how to fix this error, thanks.
PROC SQL;
create table a as
select *
from dataset1 as i
except corr
select *
from dataset2 as p
where i.item = p.item;
describe table a;
QUIT;
EXCEPT is used to select records in the first set that do not exist in the second set. So if what you want is, to quote you, select records from dataset1 that do not appear in dataset2, you don't need the where clause:
PROC SQL;
create table a as
select *
from dataset1 as i
except corr
select *
from dataset2 as p
;
QUIT;
If however, like that where clause would suggest, you actually want to select records from dataset1 where the value of item is not found in dataset2, you could do this
proc sql;
select *
from dataset1 i
where not exists (select *
from dataset2 p
where i.item=p.item
)
;
quit;
EDIT: following your latest comment, and if you reaaaally need your query to feature an except, this should get you your result
proc sql;
create table a as
select t1.*
from dataset1 t1
inner join (select *
from dataset1 as i
except corr
select *
from dataset2 as p
) t2
on t1.item=t2.item
;
quit;
Even though this will do the same as the query above (with not exists) or, now that I think of it (stupid me), as this:
proc sql;
create table a as
select *
from dataset1
where item not in (select distinct item from dataset2)
;
quit;