MySQL check if table has correct schema - c++

I am currently developing server software in C++ with a MySQL data backend. I am using the official MySQL/connector library from Oracle to work with MySQL. The connection itself is working and I'm not having any issues with that.
My problem is that the database and the table schemas tend to change every once in a while because new tables and columns keep getting added. Also, existing columns may be changed for the same reason. To make sure I recognize outdated server software quickly, I wanted to add a warning when the database schema has changed.
My first idea was to hardcode how the database (and its tables and such) should look and then check whether the current database matches the hardcoded data. But I have no clue how to achieve that.
In summary I want to be able to detect whether
A table has been added or removed
A column in a table has been altered
A column in a table has been added or removed
with as little C++ code as possible. Also it should be quite easy to maintain.
Additional information will be added when required.

I would suggest the following approach:
1) Fork and execute the mysql command line client. Set up a pair of pipes to mysql's standard input and output.
2) At this point you should be able to execute simple commands by piping them to mysql via the standard input pipe, and read the output from the standard output pipe.
You will need to make careful notes as to the output format of each mysql command, so that you know when you have finished reading its output and can send the next command.
3) As the first order of business, execute:
show tables;
The output that comes back will list all tables in the database. Parsing the output into a list of table names is trivial. Then execute, for each table:
show create table <tablename>;
The resulting output shows all of the table's fields, its keys, and its constraints, which is pretty much the entire table schema. Lather, rinse, repeat for every table.
4) In this manner you can capture a basic schema of the entire database, for comparison purposes. If necessary, use the same approach to capture the triggers, and other objects. You'll likely need to do some minor massaging of the data, and exclude a few bits. "show create table", for example, will include the current AUTO_INCREMENT values, which you can ignore.
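As a rough sketch of steps 1 to 3 above, here is a simplified POSIX version that uses popen instead of a hand-rolled fork/pipe pair, since these particular statements only need mysql's output; the credentials and database name are placeholders:

#include <cstdio>
#include <string>
#include <vector>

// Run one statement through the mysql command line client in batch mode and
// return its output, one line per row. -B gives tab-separated batch output,
// -N suppresses the column-name header line.
static std::vector<std::string> RunMysql(const std::string& statement) {
    std::vector<std::string> lines;
    std::string cmd = "mysql -B -N -u myuser -pmypassword mydb -e \"" + statement + "\"";
    FILE* pipe = popen(cmd.c_str(), "r");
    if (!pipe)
        return lines;
    char buf[4096];
    while (fgets(buf, sizeof(buf), pipe)) {
        std::string line(buf);
        while (!line.empty() && (line.back() == '\n' || line.back() == '\r'))
            line.pop_back();
        lines.push_back(line);
    }
    pclose(pipe);
    return lines;
}

// Usage: list every table, then dump each table's DDL for comparison.
// for (const std::string& table : RunMysql("show tables;"))
//     std::vector<std::string> ddl = RunMysql("show create table " + table + ";");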
This general approach, of driving a mysql process via its standard input and output, is a bit wobbly, of course. With a little bit of work you can use MySQL's native client library instead, execute all of these commands directly, and capture their results. This should be more reliable.
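Since you are already using Connector/C++, the library-based variant could look roughly like the following. This is a minimal sketch assuming the JDBC-style Connector/C++ API, an already-open connection, and a hypothetical hardcoded map of expected "show create table" output; adjust the connection details and the normalization to your own schema.

#include <iostream>
#include <map>
#include <memory>
#include <regex>
#include <set>
#include <string>
#include <cppconn/driver.h>
#include <cppconn/connection.h>
#include <cppconn/statement.h>
#include <cppconn/resultset.h>

// Hypothetical hardcoded snapshot: table name -> expected "show create table" DDL.
static const std::map<std::string, std::string> kExpectedSchema = {
    // { "users", "CREATE TABLE `users` (...)" },
};

// Strip parts that change at runtime, e.g. the current AUTO_INCREMENT counter.
static std::string Normalize(std::string ddl) {
    return std::regex_replace(ddl, std::regex(" AUTO_INCREMENT=\\d+"), "");
}

bool SchemaMatches(sql::Connection& con) {
    bool ok = true;
    std::unique_ptr<sql::Statement> stmt(con.createStatement());

    // Collect the tables that actually exist.
    std::set<std::string> actualTables;
    std::unique_ptr<sql::ResultSet> tables(stmt->executeQuery("SHOW TABLES"));
    while (tables->next())
        actualTables.insert(tables->getString(1));

    // Detect added or removed tables.
    for (const auto& expected : kExpectedSchema)
        if (!actualTables.count(expected.first)) {
            std::cerr << "missing table: " << expected.first << "\n"; ok = false;
        }
    for (const auto& name : actualTables)
        if (!kExpectedSchema.count(name)) {
            std::cerr << "unexpected table: " << name << "\n"; ok = false;
        }

    // Detect altered, added or removed columns by comparing the normalized DDL.
    for (const auto& expected : kExpectedSchema) {
        if (!actualTables.count(expected.first)) continue;
        std::unique_ptr<sql::ResultSet> res(
            stmt->executeQuery("SHOW CREATE TABLE `" + expected.first + "`"));
        if (res->next() && Normalize(res->getString(2)) != Normalize(expected.second)) {
            std::cerr << "schema mismatch in table: " << expected.first << "\n"; ok = false;
        }
    }
    return ok;
}

// Usage (connection details are placeholders):
// sql::Driver* driver = get_driver_instance();
// std::unique_ptr<sql::Connection> con(driver->connect("tcp://127.0.0.1:3306", "user", "pass"));
// con->setSchema("mydb");
// if (!SchemaMatches(*con)) std::cerr << "outdated server software?\n";

If comparing whole DDL strings feels too coarse, you can instead query INFORMATION_SCHEMA.COLUMNS and compare column names and types field by field.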

Related

Comparing two SQLite databases in C++

I have two C++ functions, which each construct an SQLite database.
The first function constructs database version 1 and then upgrades it to the newest version by adding all tables/columns that have been added to the database since the first version. The second function constructs a database that is already in the newest version. As a result, each function gives one database that has all the necessary tables and columns, but no values.
I wish to write a unit test that compares the results of those two functions. I want to test that they have exactly the same tables and columns, and that all columns have the same CHECK and NOT NULL constraints. I only need to compare columns and tables, because the databases have no values in them at this point.
I would prefer to get the differences in a human readable form (to place them in an error message), but a boolean value (different/not different) is also fine.
How can I do that, given that both databases are in different variables and I cannot combine them?
There are other questions that suggest external applications for this, but can I do it in a simple way in C++? One possibility is to execute some SQL commands for each database, and compare the results in a for loop, but which commands do I need?
You can query the sqlite_master table to get the SQL that was used to create each table and compare that:
SELECT name, type, sql FROM sqlite_master;
For more information on sqlite_master, consult the SQLite documentation.
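Building on that, here is a minimal sketch using the SQLite C API, assuming the two databases are already open as sqlite3* handles (the function names are made up for illustration). It reads name, type, and sql from sqlite_master in each database into a map and prints anything that is missing or differs:

#include <sqlite3.h>
#include <iostream>
#include <map>
#include <string>

// Read "(type) (name)" -> creation SQL for every object in sqlite_master.
static std::map<std::string, std::string> ReadSchema(sqlite3* db) {
    std::map<std::string, std::string> schema;
    const char* query = "SELECT name, type, sql FROM sqlite_master;";
    sqlite3_stmt* stmt = nullptr;
    if (sqlite3_prepare_v2(db, query, -1, &stmt, nullptr) != SQLITE_OK)
        return schema;
    while (sqlite3_step(stmt) == SQLITE_ROW) {
        const char* name = reinterpret_cast<const char*>(sqlite3_column_text(stmt, 0));
        const char* type = reinterpret_cast<const char*>(sqlite3_column_text(stmt, 1));
        const char* sql  = reinterpret_cast<const char*>(sqlite3_column_text(stmt, 2));
        schema[std::string(type ? type : "") + " " + (name ? name : "")] = sql ? sql : "";
    }
    sqlite3_finalize(stmt);
    return schema;
}

// Returns true if both databases define exactly the same objects, and prints a
// human-readable difference for anything that does not match.
bool SchemasEqual(sqlite3* upgraded, sqlite3* fresh) {
    auto a = ReadSchema(upgraded);
    auto b = ReadSchema(fresh);
    bool equal = true;
    for (const auto& entry : a) {
        auto it = b.find(entry.first);
        if (it == b.end()) {
            std::cerr << "only in upgraded db: " << entry.first << "\n"; equal = false;
        } else if (it->second != entry.second) {
            std::cerr << "differs: " << entry.first << "\n  " << entry.second
                      << "\n  " << it->second << "\n"; equal = false;
        }
    }
    for (const auto& entry : b)
        if (!a.count(entry.first)) {
            std::cerr << "only in fresh db: " << entry.first << "\n"; equal = false;
        }
    return equal;
}

Note that sqlite_master stores the CREATE statements exactly as they were written, so the two construction paths need to produce the same SQL text (or you need to normalize whitespace) for a plain string comparison to be meaningful.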

SAS/ACCESS and data step on external DB

I have the following concern regarding SAS/ACCESS facility.
Let's imagine that we have an external DB (e.g. Oracle) which we have assigned to a certain libname.
Next, we do a simple operation on one of the tables within this DB, e.g.:
data db.table_new;
set db.table_old(keep=var1 var2 var3);
if var1>0 then new_var1=5;
run;
My question is the following:
Will the whole table table_old be pulled from external DB to SAS Server in order to process the data?
Will SAS/ACCESS transform the data step into a DBMS operation or SQL, so that the whole processing will be performed outside SAS?
The documentation is unclear about it. See page 62.
Usually the rule of thumb is: if the SAS functions used in the DATA step can be converted to native DB SQL functions, then SAS will let the DB server do the data processing. In your case, this seems to be the situation.
You can answer this question for any piece of code by turning on the following tracing options:
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
When you run the data step, check the log. You will see information about whether SAS is able to successfully translate the code or not. If it was unsuccessful, you will see:
ACCESS ENGINE: SQL statement was not passed to the DBMS, SAS will do the processing.
If this occurs, SAS will usually send out a select * to the server and pull everything before filtering. When you see that message, try doing explicit pass-through, or redesign your query so that it can do everything on the server. It is possible to bring down the SAS server, or severely degrade performance on the Oracle server, if the table is large enough.
Some common functions you'll want to avoid using directly in the query, especially with Oracle:
datepart()
intnx()
intck()
today()
put()
input()
If I have to use any of those functions, I usually play it safe and create macro variables for the static values beforehand (e.g. today()), filter the raw data at the lowest level first to get it into the SAS server, or use explicit SQL pass-through.
In summary, I would say it depends on your method. On the second page of Chapter 1 of the SAS/Access 9.2 document in your above link, there are two methods (along with the older DBLOAD procedure) of the SAS/ACCESS facility:
LIBNAME reference - assign SAS librefs to DBMS objects such as schemas and databases; you can then work with the table or view as you would with a SAS data set...You can use such SAS procedures as PROC SQL or DATA step programming on any libref that references DBMS data.
SQL Pass-through facility - to interact with a data source using its native SQL syntax without leaving your SAS session. SQL statements are passed directly to the data source for processing...The DBMS optimizer can take advantage of indexes on DBMS columns to process a query more quickly and efficiently.
Hence, with the first method SAS handles the processing, and with the second the DBMS handles the processing. Like most clients (a Java or C# program, a Python script, or a PHP web page) that connect to external RDBMS sources, unless a direct ODBC/OLEDB or other API connection is explicitly employed and the request sent that way, processing is handled on the front end (i.e., calculating parameters) and the end result is updated to the backend via transactions. All of SAS's libraries live in memory (or on temporary disk) during the session and, depending on the code, SAS either handles the data itself and passes results to the external source, or passes the data handling entirely to the other source.
Comparative Example: Microsoft Access
One good comparative example would be Microsoft Access, which, like SAS, provides a linked-table connection and pass-through queries for any ODBC-compliant RDBMS, including SQL Server, Oracle, MySQL, etc. It is often a misnomer to tag Access as a database when it is actually a GUI program and a collection of objects, one of which is the default Windows JET/ACE engine (a .dll file) that is not at all restricted to Access but available to all Office programs. Notice the word default, as this engine can be switched out for any ODBC database source.
Linked tables are essentially Access GUI objects (specifically, special tabledefs), not unlike SAS's libname refs, that are loaded into a JET/ACE table container with the data pointing externally. One can then use a linked table like any other local Access table and query it with anything in the ACE SQL dialect. This special linked table (much like a SAS libname ref, established via ODBC or another connection type) points to the external source, and the driver translates the query commands for it. Therefore, the exact same query against an Access linked table may perform differently than the same query run natively on the RDBMS.
Analogy
I imagine SAS behaves the same way: it exists as a front end, with libname refs as local objects holding pointers to the backend. All data step handling is processed locally, and the result sets are simply imported or extracted by the engine. To use an analogy: the database is the home, and SAS is the garbage man, home decorator, or move-in helper. SAS (like Java's JDBC, PHP's PDO, Python's cursors, R's libraries) knocks on the door, which the database answers (annoyed by so many requests): "Hey buddy, we need to take out the garbage and here are the exact items...or we need to remodel the basement and here are the exact specs...or we have new furniture in the truck ready for drop-off...with credentials signed, please carry out immediately." And in both tools, pass-through methods are requests carried out by the backend engine: SAS just leaves instructions, maybe a note on the door (without that exactness), for the homeowner to carry out.

Changing Length of Siebel Column

Suppose we have an existing Siebel column, and this column also has a corresponding mapped EIM column. If I change the length of this Siebel base table's column from varchar(100) to varchar(200) by running an ALTER query directly on the backend, how will this impact the EIM process? Will the import process be successful?
Regards,
Robin
If you are interested in knowing conceptually, here are the implications that I can foresee.
a) A table column added using ALTER TABLE is virtually useless, as the application won't be able to use it because its definition is missing from the Siebel Repository.
b) If you change the length of an existing column, the application will still use the length defined in the Siebel Repository.
c) The EIM process will ignore your new column length, as it loads the data dictionary before running the job.
d) And finally, during code migration you will have to run the ALTER TABLE every time, since the DDLSync process cannot take care of your scenario.
I would advise you not to alter the length of an existing vanilla table column, and instead extend the database table by adding a new column. As the other poster mentioned, you should do this using Siebel Tools. You will then also need to add a reference to this new field in the EIM components (this you also do using Siebel Tools).
This is a best practice. If your client ever had a Siebel code review done by Oracle, you would be told to do what I described above (not what you were considering doing).
Changing the column length using the ALTER TABLE command will only change it in the database layer, which will have no repercussions from a Siebel standpoint. The EIM tables will still be valid, as they will use the column length mentioned in the repository, which is applied by Tools. If you don't change it in Tools and apply the table, I don't think the changes will work.
I would not recommend that you do this. In this case, probably nothing will go wrong: EIM columns will load data that is up to 100 characters long, but from the GUI you could insert up to 200 characters. Still, something unexpected can go wrong; we would need to know your application better to answer this question.

borland builder c++ oracle question

I have a Borland C++ Builder 6 application calling an Oracle 10g database, operating over a LAN. When the application in question makes a simple db select, e.g.
select table_name from element_tablenames where element_id = 10023842
the following is recorded as happening in Oracle (from the performance logs)
select table_name
from element_tablenames
where element_id = 10023842
then immediately (and not from C++ source code but perhaps deeper)
select table_name, element_tablenames.ROWID
from element_tablenames
where element_id = 10023842
The select statement is only called once on the TADODbQuery object, yet two queries are being performed: one to parse, and the other adding the ROWID for execution.
Over a WAN and many, many queries this is obviously a problem to the user.
Does anyone know why this might be happening, and can someone suggest a solution?
Agree with Robert.
The ROWID uniquely identifies a row in a table so that the returned record can be applied back to the database with any changes (or as a DELETE).
Is there a way to identify a particular column (or set of columns) as a primary key so that it can be used to identify a row without using a ROWID?
I don't know exactly where the ROWID is coming from; it could be either the TAdoQuery implementation or the Oracle driver. But I am sure I found the reason.
From the Oracle docs:
If the database table does not contain a primary key, the ROWID must be selected explicitly when populating DataTable.
So I suspect your table does not have a primary key; either add one or add the ROWID to your select explicitly.
Either way this will solve the duplicate query problem.
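If adding a primary key is not an option, here is a minimal C++Builder sketch of the second workaround, selecting the ROWID yourself so the provider does not need a second statement (the component name ADOQuery1 is illustrative):

// Select the ROWID explicitly so the provider does not issue a
// second query just to fetch it.
ADOQuery1->Close();
ADOQuery1->SQL->Text =
    "select table_name, element_tablenames.ROWID "
    "from element_tablenames "
    "where element_id = :element_id";
ADOQuery1->Parameters->ParamByName("element_id")->Value = 10023842;
ADOQuery1->Open();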
Since you are concerned about performance, some general notes: using TAdoQuery you can set the CursorType to optimize for different behaviors. This article covers it from a TAdoQuery perspective. MSDN also has an article that covers it from a general ADO perspective. Finally, the specifications from the Oracle driver can be useful.
I would recommend setting the cursor to either of the following, as they are the only ones supported by Oracle:
ctStatic - Bi-directional query produced.
ctOpenForwardOnly - Unidirectional query produced, fastest but can't call Prior
You can also play with CursorLocation to see how it affects your speed.
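For reference, a short C++Builder sketch of those two settings (again, the component name is illustrative); both must be set before the dataset is opened:

ADOQuery1->Close();
// Forward-only cursors are the fastest but cannot navigate backwards (no Prior).
ADOQuery1->CursorType = ctOpenForwardOnly;
// A forward-only cursor needs a server-side cursor; with clUseClient the
// provider falls back to a static cursor. Measure both and compare.
ADOQuery1->CursorLocation = clUseServer;
ADOQuery1->Open();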

Tell SAS not to add newly generated tables on the Process Flow

I have SAS code that creates a lot of intermediary tables for my calculations. Thing is, I don't really care about these tables after the job is done; I only care about the final results.
But every time I run this code, SAS adds all the generated tables to my process flow, turning it into a huge mess (I am talking here about 40+ intermediary tables).
Is there a way to tell SAS not to add some tables to the process flow? Or at least to tell it not to add any tables at all? I am using SAS Enterprise Guide 4.1.
Thanks in advance
Under SAS 9.1.x and 9.2.x (for Windows), it's possible to suppress the display of datasets in SAS client environments by prefixing the dataset name with "_TO". So in your code and/or tasks, you could call all your intermediate datasets _TO<DataSetName>, and they won't clutter up your process flow. But they will still be there and can be referenced in code and tasks.
If you do this and you're using tasks, note that it might be tricky to work out how to use the output data from a task as the input for another, if you can't see the dataset to select it. If you have trouble with this, comment on this post and we can address that.
Note that this "_TO" prefix thing is an undocumented, "hidden" feature that is to be deprecated in 9.3 - see this blog for details.
If you set the option "Maximum Number of output data sets to add to the project" (under Results General) to zero, it will not add any datasets to the project, but they'll still be available to view from the Server -> Library view (they'll be added to the flow at the point you request them).
I know this question is a year and a half old now, but if you are working with intermediate tables that can be deleted after you get the final results, SAS EG has a built-in macro you can use for deleting these tables:
%_eg_conditional_dropds([table1], [table2], ... ,[table-n]);