Size of data obtained from SQL query via ODBC API


Does anybody know how I can get the number of the elements (rows*cols) returned after I do an SQL query? If that can't be done, then is there something that's going to be relatively representative of the size of data I get back?
I'm trying to make a status bar that indicates how much of the returned data I have processed, so I want to be somewhere relatively close. Any ideas?
Please note that SQLRowCount only returns the number of rows affected by an UPDATE, INSERT, or DELETE statement, not the number of rows returned from a SELECT statement (as far as I can tell). So I can't simply multiply it by the column count from SQLNumResultCols.
My last option is to have a status bar that goes back and forth, indicating that data is being processed.

That is frequently a problem when you want to reserve dynamic memory to hold the entire result set.
One technique is to return the count as part of the result set.
WITH
data AS
(
SELECT interesting-data
FROM interesting-table
WHERE some-condition
)
SELECT COUNT(*) OVER () AS row_count, data.*
FROM data
If you don't know beforehand which columns you are selecting, or you use a * as in the example above, then the number of columns can be selected from the USER_TAB_COLS table:
SELECT COUNT(*)
FROM USER_TAB_COLS
WHERE TABLE_NAME = 'interesting-table'

SQLRowCount can return the number of rows for SELECT queries if the driver supports it. Many drivers don't, however, because it can be expensive for the server to compute this. If you want to guarantee you always have a count, you must use COUNT(*), thus forcing the server into doing the potentially time-consuming calculation (or causing it to delay returning any results until the entire result is known).
My suggestion would be to attempt SQLRowCount, so that the server or driver can decide if the number of rows is easily computable. If it returns a value, then multiply by the result from SQLNumResultCols. Otherwise, if it returns -1, use the back and forth status bar. Sometimes this is better because you can appear more responsive to the user.
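The decision described above can be sketched as a small helper. This is only an illustration: ProgressTotal and progressTotal are hypothetical names, not part of ODBC, and the two arguments are assumed to be the values your code has already read from SQLRowCount and SQLNumResultCols.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of ODBC): turns the two driver-reported
// numbers into a total for a progress bar.
struct ProgressTotal {
    bool known;          // false -> fall back to the back-and-forth bar
    std::int64_t cells;  // rows * columns, meaningful only when known is true
};

// rowCount: the value written by SQLRowCount (-1 when the driver cannot tell).
// colCount: the value written by SQLNumResultCols.
inline ProgressTotal progressTotal(std::int64_t rowCount, int colCount) {
    assert(colCount >= 0);  // a column count is never negative
    if (rowCount < 0) {
        // The driver did not compute a row count for this SELECT.
        return ProgressTotal{false, 0};
    }
    return ProgressTotal{true, rowCount * colCount};
}
```

If known comes back false, the caller would show the indeterminate bar; otherwise cells can serve as the maximum of a regular progress bar.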

Related

Fastest way to select several inserted rows

I have a table in a database which stores items. Each item has a unique ID, which the DB generates upon insertion (auto-increment).
A user may perform a specific task that will add X items to the database, however my program (C++ server application using MySQL connector) should return the IDs that the database generated right away. For example, if I add 6 items, the server must return 6 new unique IDs to the client.
What is the fastest/cleanest way to do such thing? So far I have been doing INSERT followed by SELECT for each new item OR INSERT followed by last_insert_id, however if there are 50 items to add it will take a few seconds at least which is not good at all for user experience.
sql_task.query("INSERT INTO `ItemDB` (`ItemName`, `Type`, `Time`) VALUES ('%s', '%d', '%d')", strName.c_str(), uiType, uiTime);
Getting the ID:
uint64_t item_id { sql_task.last_id() }; //This calls mysql_insert_id

I believe you need to rethink your design slightly. Let's use the analogy of a sales order. With a sales order (or invoice #) the user gets an invoice number (auto_incr) as well as multiple line item numbers (also auto_inc).
The sales order and all of the line items are selected for insert (from the GUI) and the inserts are performed. First, the sales order row is inserted and its id is saved in a variable for subsequent calls to insert the line items. But the line items are then just inserted without immediate return of their auto_inc id values. The application is merely returned the sales order number in the end. How your app uses that sales order number in subsequent calls is up to you. But it does not need to be immediate to retrieve all the X or 50 rows at once, as it has the sales order number iced and saved somewhere. Let's call that sales order number XYZ.
When you actually need the information, an example call could look like
select lineItemId
from lineItems
where salesOrderNumber=XYZ
order by lineItemId
You need to remember that in a multi-user system that there is no guarantee of receiving a contiguous block of numbers. Nor should it matter to you, as they are all attached appropriately with the correct sales order number.
Again, the above is just an analogy, used for illustration purposes.

That's a common but hard-to-solve problem. I'm not sure about MySQL, but PostgreSQL uses sequences to generate automatic IDs. Frameworks that insert data (object-relational mappers) use that when they expect to insert many values: they query the sequence directly for a batch of IDs and then insert new rows using those already known IDs. That way, there is no need for an additional query after each insert to get the ID.
The downside is that the relation ID - insertion time can be non monotonic when different writers intermix their inserts. It is not a problem for the database, but some (poorly written?) program could expect it is.

As your ID is auto-incremented, you can do with only two SELECT queries, one before and one after the INSERT queries:
SELECT AUTO_INCREMENT FROM information_schema.tables WHERE table_name = 'dbTable' AND table_schema = DATABASE();
--
-- INSERT INTO dbTable... (one or many, does not matter);
--
SELECT LAST_INSERT_ID() AS lastID;
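This gives you the first and last inserted IDs, from which you can easily work out how many rows were inserted and which IDs they received. Note that this only holds if no other session inserts into the table in between, and that after a multi-row INSERT, LAST_INSERT_ID() returns the ID generated for the first row of that statement, not the last.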
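As a rough sketch of the calculation, the helper below expands the two values into the list of IDs. insertedIds is a hypothetical name, and the code assumes single-row INSERT statements, an increment step of 1 and no concurrent writers, none of which MySQL guarantees in general.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// firstId: AUTO_INCREMENT value read before the inserts.
// lastId:  LAST_INSERT_ID() read after the last single-row insert.
// Assumption: IDs advance by 1 and nobody else inserted in between.
inline std::vector<std::uint64_t> insertedIds(std::uint64_t firstId,
                                              std::uint64_t lastId) {
    std::vector<std::uint64_t> ids;
    if (lastId < firstId) {
        return ids;  // nothing was inserted by this session
    }
    ids.reserve(static_cast<std::size_t>(lastId - firstId + 1));
    // Loop ends on equality so it also terminates when lastId is the max value.
    for (std::uint64_t id = firstId;; ++id) {
        ids.push_back(id);
        if (id == lastId) {
            break;
        }
    }
    assert(!ids.empty());
    return ids;
}
```

With firstId 101 and lastId 106 this yields the six IDs 101 through 106, which the server could hand back to the client in one response.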

Select Statement Vs Find in Ax

While writing code, we can use a select statement, a select with a field list, or the find method on the table to fetch records.
I wonder which of these gives better performance.

It really depends on what you actually need.
find() methods must return the whole table buffer, that means, all of the columns are projected into the buffer returned by it, so you have the complete record selected. But sometimes you only need a single column, or just a few. In such cases it can be a waste to select the whole record, since you won't use the columns selected anyway.
So if you're dealing with a table that has lots of columns and you only need a few of them, consider writing a specific select statement for that, listing the columns you need.
Also, keep in mind that select statements that only project a few columns should not be made public. That means that you should NOT extract such statements into a method, because imagine the surprise of someone consuming that method and trying to figure out why column X was empty...

You can look at the find() method on the table and you will find the same kind of select statement there.
It may be the same select statement as your own, in which case the performance will be the same.
Or it may be a different select statement than your own, in which case the performance will depend on the indexes on the table, the select statement itself, the collected statistics and so on.
But there is no magic here. All of them are just select statements, no matter which method you use.

mysql++ (mysqlpp): how to get number of rows in result prior to iteration using fetch_row through UseQueryResult

Is there an API call provided by mysql++ to get the number of rows returned by the result?
I have code structured as follows:
// ...
mysqlpp::Query q = conn.query(queryString);
if(mysqlpp::UseQueryResult res = q.use()){
// some code
while(mysqlpp::Row row = res.fetch_row()){
}
}
My previous question here would be solved easily if there were a function that returns the number of rows in the result. I could use it to allocate memory of that size and fill it in as I iterate row by row.

In case anyone runs into this:
I quote the user manual:
The most direct way to retrieve a result set is to use Query::store(). This returns a StoreQueryResult object,
which derives from std::vector, making it a random-access container of Rows. In turn,
each Row object is like a std::vector of String objects, one for each field in the result set. Therefore, you can
treat StoreQueryResult as a two-dimensional array: you can get the 5th field on the 2nd row by simply saying
result[1][4]. You can also access row elements by field name, like this: result[2]["price"].
AND
A less direct way of working with query results is to use Query::use(), which returns a UseQueryResult object.
This class acts like an STL input iterator rather than a std::vector: you walk through your result set processing
one row at a time, always going forward. You can’t seek around in the result set, and you can’t know how many
results are in the set until you find the end. In payment for that inconvenience, you get better memory efficiency,
because the entire result set doesn’t need to be stored in RAM. This is very useful when you need large result sets.
A suggestion found here: http://lists.mysql.com/plusplus/9047
is to use the COUNT(*) query and fetch that result and then use Query.use again. To avoid inconsistent count, one can wrap the two queries in one transaction as follows:
START TRANSACTION;
SELECT COUNT(*) FROM myTable;
SELECT * FROM myTable;
COMMIT;
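The allocate-then-fill pattern this enables might look like the sketch below. It does not use MySQL++ itself: collectRows is a hypothetical helper, and fetchNext is a stand-in callable for a forward-only cursor such as the fetch_row() loop in the question, with rows reduced to plain strings for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// fetchNext(out) is assumed to fill `out` with the next row and return
// false once the result set is exhausted (a stand-in for fetch_row()).
template <typename FetchNext>
std::vector<std::string> collectRows(std::size_t expectedCount,
                                     FetchNext fetchNext) {
    std::vector<std::string> rows;
    rows.reserve(expectedCount);  // one allocation up front, sized by COUNT(*)
    assert(rows.capacity() >= expectedCount);
    std::string row;
    while (fetchNext(row)) {
        rows.push_back(row);  // still grows safely if the count was too low
    }
    return rows;
}
```

Because push_back grows the vector when needed, a stale count costs only an extra reallocation rather than correctness.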

Getting generated auto-increment ID without second query (MySQL)

I have been searching for a while on how to get the generated auto-increment ID from an "INSERT INTO ... (...) VALUES (...)". Even on stackoverflow, I only find the answer of using a "SELECT LAST_INSERT_ID()" in a subsequent query. I find this solution unsatisfactory for a number of reasons:
1) This will effectively double the queries sent to the database, especially since it is mostly handling inserts.
2) What will happen if more than one thread accesses the database at the same time? What if more than one application accesses the database at the same time? It seems to me the values are bound to become erroneous.
It's hard for me to believe that the MySQL C++ Connector wouldn't offer the feature that the Java Connector as well as the PHP Connector offer.

An example taken from http://forums.mysql.com/read.php?167,294960,295250
sql::Statement* stmt = conn->createStatement();
sql::ResultSet* res = stmt->executeQuery("SELECT @@identity AS id");
res->next();
my_ulong retVal = res->getInt64("id");
In a nutshell, if your ID column is not an auto_increment column then you can as well use
SELECT @@identity AS id
EDIT:
I'm not sure what you mean by second query/round trip. At first I thought you were looking for a different way to get the ID of the last inserted row, but it looks like you are more interested in whether you can save the round trip?
If that's the case, then I completely agree with @WhozCraig; you can put both your queries in a single statement, like insert into tab values ....; select last_insert_id(), which will be a single call
OR
you can have stored procedure like below to do the same and save the round trip
create procedure myproc()
begin
insert into mytab values ...;
select last_insert_id();
end
Let me know if this is not what you are trying to achieve.

ColdFusion 9 - Top n random query results

I've got a series of queries that I do to get 5 results at random, the problem is that it is taking a while to get through them, mostly because it involves a loop to assign a rand value that I can order by (which Railo can do in-query)
I was wondering if anyone has dealt with this and knows of a way of speeding it up.
I'm below 200ms, which isn't bad but I'm sure it can be sped up.

You probably don't need to use QoQ at all.
One option might be to write your original query as:
SELECT TOP 5 whatever,you,need
FROM table
ORDER BY rand()
Update the syntax depending on which database server you're using.
Another option, which could be done for both regular queries and QoQ, would be:
select only the primary keys
shuffle the array (i.e. createObject("java","java.util.Collections").shuffle(Array))
use the first five items in the array to select the fields you need.
No looping or updating, just two simple selects.
Of course if your primary key is just an auto-incrementing integer, you might get away with SELECT MAX(Id) then use RandRange to pick your five items.
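The shuffle-and-take-five idea is not specific to ColdFusion. As an illustration only, here is the same step sketched in C++ with std::shuffle; pickRandomKeys is a hypothetical name and the keys are assumed to be integers already fetched by the first select.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Shuffles a copy of the primary keys and keeps at most n of them.
// `keys` is taken by value, so the caller's list is left untouched.
template <typename Rng>
std::vector<int> pickRandomKeys(std::vector<int> keys, std::size_t n, Rng& rng) {
    std::shuffle(keys.begin(), keys.end(), rng);
    if (keys.size() > n) {
        keys.resize(n);  // keep only the first n shuffled keys
    }
    assert(keys.size() <= n);
    return keys;
}
```

The returned keys would then go into the second select, for example in a WHERE id IN (...) clause.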

For Microsoft SQL Server (v2005+) this query syntax will get 5 random records:
SELECT TOP 5 *
FROM table
ORDER BY NEWID()

I'm on Railo (ColdFusion 9) and neither TOP nor NEWID() works in a Query of Query (QoQ). If you happen to fall into this use case, and you must act upon a QoQ, then here's a solution:
<cfquery name="randomizedQueryObject" dbtype="query" maxrows="10">
SELECT *, RAND() as rand
FROM someQueryObject
ORDER BY rand
</cfquery>
This returns 10 random items from a larger result set and works in a QoQ. Short and simple.
