Query breaks whenever I add MAX() functions or grouping statements

Query breaks whenever I add MAX() functions or grouping statements - c++

I have some tables laid out like so:
Airplane
(airplaneID number(2) primary key, airplaneName char(20), cruisingRange number(5));
Flights
(airplaneID number (2), flightNo number(4) primary key,
fromAirport char(20), toAirport char(20), distance number(4), depart timestamp,
arrives timestamp, foreign key (airplaneID) references Airplane);
Employees
(employeeID number(10) primary key, employeeName char(18), salary number(7));
Certified
(employeeID number(10), airplaneID number(2),
foreign key (airplaneID) references Airplane,
foreign key (employeeID) references Employees );
And I need to write a query to get the following information:
For each pilot who is certified for at least 4 airplanes, find the
employeeName and the maximum cruisingRange of the airplanes for which
that pilot is certified.
The query I have written is this:
SELECT Employees.employeeName, MAX(Airplane.cruisingRange)
FROM Employees
JOIN Certified ON Employees.employeeID = Certified.employeeID
JOIN Airplane ON Airplane.airplaneID = Certified.airplaneID
GROUP BY Employees.employeeName
HAVING COUNT(*) > 3
Lastly, this is the function that executes and reads in the query information:
void prepareAndExecuteIt() {
// Prepare the query
//sqlQueryToRun.len = strlen((char *) sqlQueryToRun.arr);
exec sql PREPARE dbVariableToHoldQuery FROM :sqlQueryToRun;
/* The declare statement, below, associates a cursor with a
* PREPAREd statement.
* The cursor name, like the statement
* name, does not appear in the Declare Section.
* A single cursor name can not be declared more than once.
*/
exec sql DECLARE cursorToHoldResultTuples cursor FOR dbVariableToHoldQuery;
exec sql OPEN cursorToHoldResultTuples;
int i = 0;
exec sql WHENEVER NOT FOUND DO break;
while(1){
exec sql FETCH cursorToHoldResultTuples INTO empName, cruiseRange;
printf("%s\t", empName);
printf("%s\n", cruiseRange);
i++;
// This is temporary while I debug so it doesn't just loop on forever when the query breaks.
if (i > 500){
printf("Entered break statement\n");
break;
}
}
exec sql CLOSE cursorToHoldResultTuples;
}
The query works until I add the MAX(), GROUP BY, and HAVING statements. Then it just reads in nothing infinitely. I don't know if this is an issue with the way I've written my query or if it's an issue with the C++ code that executes it. I'm using the ProC interface to access an Oracle database. Any ideas as to what's going wrong?

You can't mix implicit and explicit joins. I suggest
SELECT Employees.employeeName, MAX(Airplane.cruisingRange)
FROM Employees
JOIN Certified ON Employees.employeeID = Certified.employeeID
JOIN Airplane ON Airplane.airplaneID = Certified.airplaneID
GROUP BY Employees.employeeName
HAVING COUNT(*) > 3
which works fine.
db<>fiddle here

Related

How do I create a table for all tables in MySQL

I’m using MySQL for C++ and I want to create a new table for all the tables in my second database. The code I have now is:
CREATE TABLE new_table LIKE original_table;
INSERT INTO new_table SELECT * FROM original_table;
I want to this to work like a loop where all the tables and data in those tables are created for every table and piece of data there is in my second database. Can someone help me?

We can use a stored procedure to do the job in a loop. I just wrote the code and tested it in workbench. Got all my tables(excluding view) from sakila database to my sakila_copy database:
use testdb;
delimiter //
drop procedure if exists copy_tables //
create procedure copy_tables(old_db varchar(20),new_db varchar(20))
begin
declare tb_name varchar(30);
declare fin bool default false;
declare c cursor for select table_name from information_schema.tables where table_schema=old_db and table_type='BASE TABLE';
declare continue handler for not found set fin=true;
open c;
lp:loop
fetch c into tb_name;
if fin=true then
leave lp;
end if;
set #create_stmt=concat('create table ',new_db,'.',tb_name,' like ',old_db,'.',tb_name,';') ;
prepare ddl from #create_stmt;
execute ddl;
deallocate prepare ddl;
set #insert_stmt=concat('insert into ',new_db,'.',tb_name,' select * from ',old_db,'.',tb_name,';');
prepare dml from #insert_stmt;
execute dml;
deallocate prepare dml;
end loop lp;
close c;
end//
delimiter ;
create database sakila_copy;
call testdb.copy_tables('sakila','sakila_copy');
-- after the call, check the tables in sakila_copy to find the new tables
show tables in sakila_copy;
Note: As I stated before, only base tables are copied. I deliberately skipped views, as they provide logical access to tables and hold no data themselves.

Redshift Pivot Function

I've got a similar table which I'm trying to pivot in Redshift:
UUID
Key
Value
a123
Key1
Val1
b123
Key2
Val2
c123
Key3
Val3
Currently I'm using following code to pivot it and it works fine. However, when I replace the IN part with subquery it throws an error.
select *
from (select UUID ,"Key", value from tbl) PIVOT (max(value) for "key" in (
'Key1',
'Key2',
'Key3
))
Question: What's the best way to replace the IN part with sub query which takes distinct values from Key column?
What I am trying to achieve;
select *
from (select UUID ,"Key", value from tbl) PIVOT (max(value) for "key" in (
select distinct "keys" from tbl
))

From the Redshift documentation - "The PIVOT IN list values cannot be column references or sub-queries. Each value must be type compatible with the FOR column reference." See: https://docs.aws.amazon.com/redshift/latest/dg/r_FROM_clause-pivot-unpivot-examples.html
So I think this will need to be done as a sequence of 2 queries. You likely can do this in a stored procedure if you need it as a single command.
Updated with requested stored procedure with results to a cursor example:
In order to make this supportable by you I'll add some background info and description of how this works. First off a stored procedure cannot produce results strait to your bench. It can either store the results in a (temp) table or to a named cursor. A cursor is just storing the results of a query on the leader node where they wait to be fetched. The lifespan of the cursor is the current transaction so a commit or rollback will delete the cursor.
Here's what you want to happen as individual SQL statements but first lets set up the test data:
create table test (UUID varchar(16), Key varchar(16), Value varchar(16));
insert into test values
('a123', 'Key1', 'Val1'),
('b123', 'Key2', 'Val2'),
('c123', 'Key3', 'Val3');
The actions you want to perform are first to create a string for the PIVOT clause IN list like so:
select '\'' || listagg(distinct "key",'\',\'') || '\'' from test;
Then you want to take this string and insert it into your PIVOT query which should look like this:
select *
from (select UUID, "Key", value from test)
PIVOT (max(value) for "key" in ( 'Key1', 'Key2', 'Key3')
);
But doing this in the bench will mean taking the result of one query and copy/paste-ing into a second query and you want this to happen automatically. Unfortunately Redshift does allow sub-queries in PIVOT statement for the reason given above.
We can take the result of one query and use it to construct and run another query in a stored procedure. Here's such a store procedure:
CREATE OR REPLACE procedure pivot_on_all_keys(curs1 INOUT refcursor)
AS
$$
DECLARE
row record;
BEGIN
select into row '\'' || listagg(distinct "key",'\',\'') || '\'' as keys from test;
OPEN curs1 for EXECUTE 'select *
from (select UUID, "Key", value from test)
PIVOT (max(value) for "key" in ( ' || row.keys || ' )
);';
END;
$$ LANGUAGE plpgsql;
What this procedure does is define and populate a "record" (1 row of data) called "row" with the result of the query that produces the IN list. Next it opens a cursor, whose name is provided by the calling command, with the contents of the PIVOT query which uses the IN list from the record "row". Done.
When executed (by running call) this function will produce a cursor on the leader node that contains the result of the PIVOT query. In this stored procedure the name of the cursor to create is passed to the function as a string.
call pivot_on_all_keys('mycursor');
All that needs to be done at this point is to "fetch" the data from the named cursor. This is done with the FETCH command.
fetch all from mycursor;
I prototyped this on a single node Redshift cluster and "FETCH ALL" is not supported at this configuration so I had to use "FETCH 1000". So if you are also on a single node cluster you will need to use:
fetch 1000 from mycursor;
The last point to note is that the cursor "mycursor" now exists and if you tried to rerun the stored procedure it will fail. You could pass a different name to the procedure (making another cursor) or you could end the transaction (END, COMMIT, or ROLLBACK) or you could close the cursor using CLOSE. Once the cursor is destroyed you can use the same name for a new cursor. If you wanted this to be repeatable you could run this batch of commands:
call pivot_on_all_keys('mycursor'); fetch all from mycursor; close mycursor;
Remember that the cursor has a lifespan of the current transaction so any action that ends the transaction will destroy the cursor. If you have AUTOCOMMIT enable in your bench this will insert COMMITs destroying the cursor (you can run the CALL and FETCH in a batch to prevent this in many benches). Also some commands perform an implicit COMMIT and will also destroy the cursor (like TRUNCATE).
For these reasons, and depending on what else you need to do around the PIVOT query, you may want to have the stored procedure write to a temp table instead of a cursor. Then the temp table can be queried for the results. A temp table has a lifespan of the session so is a little stickier but is a little less efficient as a table needs to be created, the result of the PIVOT query needs to be written to the compute nodes, and then the results have to be sent to the leader node to produce the desired output. Just need to pick the right tool for the job.
===================================
To populate a table within a stored procedure you can just execute the commands. The whole thing will look like:
CREATE OR REPLACE procedure pivot_on_all_keys()
AS
$$
DECLARE
row record;
BEGIN
select into row '\'' || listagg(distinct "key",'\',\'') || '\'' as keys from test;
EXECUTE 'drop table if exists test_stage;';
EXECUTE 'create table test_stage AS select *
from (select UUID, "Key", value from test)
PIVOT (max(value) for "key" in ( ' || row.keys || ' )
);';
END;
$$ LANGUAGE plpgsql;
call pivot_on_all_keys();
select * from test_stage;
If you want this new table to have keys for optimizing downstream queries you will want to create the table in one statement then insert into it but this is quickie path.

A little off-topic, but I wonder why Amazon couldn't introduce a simpler syntax for pivot. IMO, if GROUP BY is replaced by PIVOT BY, it can give enough hint to the interpreter to transform rows into columns. For example:
SELECT partname, avg(price) as avg_price FROM Part GROUP BY partname;
can be written as:
SELECT partname, avg(price) as avg_price FROM Part PIVOT BY partname;
Even multi-level pivoting can also be handled in the same syntax.
SELECT year, partname, avg(price) as avg_price FROM Part PIVOT BY year, partname;

Update a table by using multiple joins

I am using Java DB (Java DB is Oracle's supported version of Apache Derby and contains the same binaries as Apache Derby. source: http://www.oracle.com/technetwork/java/javadb/overview/faqs-jsp-156714.html#1q2).
I am trying to update a column in one table, however I need to join that table with 2 other tables within the same database to get accurate results (not my design, nor my choice).
Below are my three tables, ADSID is a key linking Vehicles and Customers and ADDRESS and ZIP in Salesresp are used to link it to Customers. (Other fields left out for the sake of brevity.)
Salesresp(address, zip, prevsale)
Customers(adsid, address, zipcode)
Vehicles(adsid, selldate)
The goal is to find customers in the SalesResp table that have previously purchased a vehicle before the given date. They are identified by address and adsid in Customers and Vechiles respectively.
I have seen updates to a column with a single join and in fact asked a question about one of my own update/joins here (UPDATE with INNER JOIN). But now I need to take it that one step further and use both tables to get all the information.
I can get a multi-JOIN SELECT statement to work:
SELECT * FROM salesresp
INNER JOIN customers ON (SALESRESP.ZIP = customers.ZIPCODE) AND
(SALESRESP.ADDRESS = customers.ADDRESS)
INNER JOIN vehicles ON (Vehicles.ADSId =Customers.ADSId )
WHERE (VEHICLES.SELLDATE<'2013-09-24');
However I cannot get a multi-JOIN UPDATE statement to work.
I have attempted to try the update like this:
UPDATE salesresp SET PREVSALE = (SELECT SALESRESP.address FROM SALESRESP
WHERE SALESRESP.address IN (SELECT customers.address FROM customers
WHERE customers.adsid IN (SELECT vehicles.adsid FROM vehicles
WHERE vehicles.SELLDATE < '2013-09-24')));
And I am given this error: "Error code 30000, SQL state 21000: Scalar subquery is only allowed to return a single row".
But if I change that first "=" to a "IN" it gives me a syntax error for having encountered "IN" (Error code 30000, SQL state 42X01).
I also attempted to do more blatant inner joins, but upon attempting to execute this code I got the the same error as above: "Error code 30000, SQL state 42X01" with it complaining about my use of the "FROM" keyword.
update salesresp set prevsale = vehicles.selldate
from salesresp sr
inner join vehicles v
on sr.prevsale = v.selldate
inner join customers c
on v.adsid = c.adsid
where v.selldate < '2013-09-24';
And in a different configuration:
update salesresp
inner join customer on salesresp.address = customer.address
inner join vehicles on customer.adsid = vehicles.ADSID
set salesresp.SELLDATE = vehicles.selldate where vehicles.selldate < '2013-09-24';
Where it finds the "INNER" distasteful: Error code 30000, SQL state 42X01: Syntax error: Encountered "inner" at line 3, column 1.
What do I need to do to get this multi-join update query to work? Or is it simply not possible with this database?
Any advice is appreciated.

If I were you I would:
1) Turn off autocommit (if you haven't already)
2) Craft a select/join which returns a set of columns that identifies the record you want to update E.g. select c1, c2, ... from A join B join C... WHERE ...
3) Issue the update. E.g. update salesrep SET CX = cx where C1 = c1 AND C2 = c2 AND...
(Having an index on C1, C2, ... will boost performance)
4) Commit.
That way you don't have worry about mixing the update and the join, and doing it within a txn ensures that nothing can change the result of the join before your update goes through.

sql Column with multiple values (query implementation in a cpp file )

I am using this link.
I have connected my cpp file with Eclipse to my Database with 3 tables (two simple tables
Person and Item
and a third one PersonItem that connects them). In the third table I use one simple primary and then two foreign keys like that:
CREATE TABLE PersonsItems(PersonsItemsId int not null auto_increment primary key,
Person_Id int not null,
Item_id int not null,
constraint fk_Person_id foreign key (Person_Id) references Person(PersonId),
constraint fk_Item_id foreign key (Item_id) references Items(ItemId));
So, then with embedded sql in c I want a Person to have multiple items.
My code:
mysql_query(connection, \
"INSERT INTO PersonsItems(PersonsItemsId, Person_Id, Item_id) VALUES (1,1,5), (1,1,8);");
printf("%ld PersonsItems Row(s) Updated!\n", (long) mysql_affected_rows(connection));
//SELECT newly inserted record.
mysql_query(connection, \
"SELECT Order_id FROM PersonsItems");
//Resource struct with rows of returned data.
resource = mysql_use_result(connection);
// Fetch multiple results
while((result = mysql_fetch_row(resource))) {
printf("%s %s\n",result[0], result[1]);
}
My result is
-1 PersonsItems Row(s) Updated!
5
but with VALUES (1,1,5), (1,1,8);
I would like that to be
-1 PersonsItems Row(s) Updated!
5 8
Can somone tell me why is this not happening?
Kind regards.

I suspect this is because your first insert is failing with the following error:
Duplicate entry '1' for key 'PRIMARY'
Because you are trying to insert 1 twice into the PersonsItemsId which is the primary key so has to be unique (it is also auto_increment so there is no need to specify a value at all);
This is why rows affected is -1, and why in this line:
printf("%s %s\n",result[0], result[1]);
you are only seeing 5 because the first statement failed after the values (1,1,5) had already been inserted, so there is still one row of data in the table.
I think to get the behaviour you are expecting you need to use the ON DUPLICATE KEY UPDATE syntax:
INSERT INTO PersonsItems(PersonsItemsId, Person_Id, order_id)
VALUES (1,1,5), (1,1,8)
ON DUPLICATE KEY UPDATE Person_id = VALUES(person_Id), Order_ID = VALUES(Order_ID);
Example on SQL Fiddle
Or do not specify the value for personsItemsID and let auto_increment do its thing:
INSERT INTO PersonsItems( Person_Id, order_id)
VALUES (1,5), (1,8);
Example on SQL Fiddle

I think you have a typo or mistake in your two queries.
You are inserting "PersonsItemsId, Person_Id, Item_id"
INSERT INTO PersonsItems(PersonsItemsId, Person_Id, Item_id) VALUES (1,1,5), (1,1,8)
and then your select statement selects "Order_id".
SELECT Order_id FROM PersonsItems
In order to achieve 5, 8 as you request, your second query needs to be:
SELECT Item_id FROM PersonsItems
Edit to add:
Your primary key is autoincrement so you don't need to pass it to your insert statement (in fact it will error as you pass 1 twice).
You only need to insert your other columns:
INSERT INTO PersonsItems(Person_Id, Item_id) VALUES (1,5), (1,8)

Nested statements in sqlite

I'm using the sqlite3 library in c++ to query the database from *.sqlite file. can you write a query statement in sqlite3 like:
char* sql = "select name from table id = (select full_name from second_table where column = 4);"
The second statement should return an id to complete the query statement with first statement.

Yes you can, just make sure that the nested query doesn't return more than one row. Add a LIMIT 1 to the end of the nested query to fix this. Also make sure that it always returns a row, or else the main query will not work.
If you want to match several rows in the nested query, then you can use either IN, like so:
char* sql = "select name from table WHERE id IN (select full_name from second_table where column = 4);"
or you can use JOIN:
char* sql = "select name from table JOIN second_table ON table.id = second_table.full_name WHERE second_table.column = 4"
Note that the IN method can be very slow, and that JOIN can be very fast, if you index on the right columns

On a sidenote, you can use SQLiteadmin (http://sqliteadmin.orbmu2k.de/) to view the database and make queries directly in it (useful for testing etc).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Query breaks whenever I add MAX() functions or grouping statements - c++

Related

How do I create a table for all tables in MySQL

Redshift Pivot Function

Update a table by using multiple joins

sql Column with multiple values (query implementation in a cpp file )

Nested statements in sqlite

Categories

Resources