What is the difference between HSQLDB and JavaDB? Which one is suitable for unit testing?

Could you tell me the differences between HSQLDB and JavaDB? And which one should I use in unit testing, assuming that I only use standard features? Thanks.

HSQLDB is faster, therefore it is better suited for unit testing.
The H2 Database is even better (in my view): it's as fast as HSQLDB, and supports compatibility modes for various databases (MySQL, Oracle,...). So if you need to use database specific features in the future, chances are that you can still test it with H2. But my view is a bit biased (see my profile).

HSQLDB has features that are useful for testing and not available in JavaDB. These include:
* database script and log files in SQL text format, which allow a quick check of test runs
* user-defined SQL functions, which let you very simply write an equivalent to any function supported by another database
* a very extensive set of SQL features and functions, for example date and interval arithmetic as supported by DB2 and Oracle (also PostgreSQL and MySQL), and functions such as TO_DATE
BTW, the above features are not available in H2 (mentioned in another answer).


What are some specific reasons why one would use SphinxAPI over SphinxQL?

Are there any capabilities that one inherently lacks that the other doesn't?
SphinxQL (according to benchmarks on the Sphinx blog) returns results faster than SphinxAPI for interpreted languages, and the premise of such a comparison would presumably be that both offer the same functionality.
Why the API then?
Any clarity on this issue is much appreciated.
(This is about the C++ based open source search engine)
I just found a satisfactory answer:
SphinxQL is simply a language for querying Sphinx.
SphinxAPI is a framework that allows you to compute results based on the queries.
The queries could still be via SphinxQL or they could be via the API's syntax...it doesn't matter...SphinxQL and the SphinxAPI are different objects that accomplish different things (as highlighted above)
SphinxAPI is the legacy interface. That is why, in an existing production system, I'd rather keep going with the API than switch to SphinxQL. But for new projects SphinxQL is the only choice, as it evolves more quickly and gets all features first. Another big advantage is that with SphinxQL you are not tied to whoever develops the API for a language or platform that is not officially supported; instead you can use any MySQL client or library.

Is there a database access library for C and/or C++ with a similar interface to Perl's DBI?

I'm willing to write a subset of Perl's DBI interface for libodbc (or unixODBC) in C++.
I believe doing so will allow me to concentrate better on my goal.
BTW, I'd prefer to avoid reinventing the wheel if something similar is already out there.
NVM, no odbc interface, but it is DBI like (seeing as DBI doesn't use odbc except in DBD::ODBC)
libdbi - http://libdbi.sourceforge.net/
libdbi implements a database-independent abstraction layer in C, similar to the DBI/DBD layer in Perl. Writing one generic set of code, programmers can leverage the power of multiple databases and multiple simultaneous database connections by using this framework.
In order to utilize the libdbi framework, you need to install drivers for a particular type of database. The drivers officially supported by libdbi are split off into the libdbi-drivers project. The current version of libdbi (0.8.3) is supposed to work with any 0.8.x release of libdbi-drivers. Currently the following database engines are supported:
* Firebird/Interbase
* FreeTDS (provides access to MS SQL Server and Sybase)
* PostgreSQL
* SQLite/SQLite3
I don't know a DB API that looks like DBI. Go for it - but add it to the libodbc project as a wrapper API rather than start a brand new project.
Good luck.

Comparing the schema of two databases for integration testing

We use NHibernate generated schema to run unit tests against a database (integration tests I guess they are). I wondered if it was feasible to compare the generated schema against our development database. This would tell us when we had misspelt column names in our mappings or other issues like that. It would also go a long way toward keeping keys and the like consistent across the two.
Is this kind of automated compare feasible? What is the best way to go about doing it?
If you are unable to find a solution using NHibernate, you could look into something like RedGate's SQL Compare tool. This tool makes it incredibly easy to perform comparisons on different databases and see the schema differences. They also have a software development kit that allows you to leverage the power of SQL Compare in your own applications (something I have not yet gotten into, but would love to if the need ever arose).
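Independent of any particular tool, the core of such a comparison is a set difference over schema metadata. Below is a minimal C++ sketch of that idea, assuming the column names and types of one table have already been read from each database (for example from the `INFORMATION_SCHEMA.COLUMNS` view on engines that provide it). The names `Columns`, `SchemaDiff`, `compareColumns` and `assertSameColumns` are made up for illustration and do not come from NHibernate or SQL Compare.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Column name -> declared type for a single table (hypothetical shape).
using Columns = std::map<std::string, std::string>;

struct SchemaDiff {
    std::vector<std::string> missingInActual;  // expected, but absent
    std::vector<std::string> extraInActual;    // present, but not expected
    std::vector<std::string> typeMismatch;     // same name, different type
    bool empty() const {
        return missingInActual.empty() && extraInActual.empty() &&
               typeMismatch.empty();
    }
};

// Compares the generated schema (expected) with the development one (actual).
SchemaDiff compareColumns(const Columns& expected, const Columns& actual) {
    SchemaDiff diff;
    for (const auto& col : expected) {
        const auto it = actual.find(col.first);
        if (it == actual.end()) {
            diff.missingInActual.push_back(col.first);
        } else if (it->second != col.second) {
            diff.typeMismatch.push_back(col.first);
        }
    }
    for (const auto& col : actual) {
        if (expected.count(col.first) == 0) {
            diff.extraInActual.push_back(col.first);
        }
    }
    return diff;
}

// Convenience for test code: abort (in debug builds) if the schemas differ.
inline void assertSameColumns(const Columns& expected, const Columns& actual) {
    assert(compareColumns(expected, actual).empty());
}
```

A misspelt column in a mapping would show up as one entry in `missingInActual` and one in `extraInActual`. Keys and constraints could be compared the same way once they are loaded into a similar structure.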

Query building in a database agnostic way

In a C++ application that can use just about any relational database, what would be the best way of generating queries that can be easily extended to allow for a database engine's eccentricities?
In other words, the code may need to retrieve data in a way that is not consistent among the various database engines. What's the best way to design the code on the client side to generate queries in a way that will make supporting a new database engine a relatively painless affair?
For example, if I have (MFC) code that looks like this:
CString query = "SELECT id FROM table";
results = dbConnection->Query(query);
and we decide to support some database that uses, um, "AVEC" instead of "FROM". Now whenever the user uses that database engine, this query will fail.
Options so far:
Worst option: have the code making the query check the database type.
Better option: Create query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.
Betterer option: Create a query builder class that allows the caller to construct queries without using any SQL directly. Once the query is completed, the caller can invoke a "Generate" method which returns a query string appropriate for the active database engine.
Best option: ??
Note: The database engine itself is abstracted away through some thin layers of our own creation. The queries themselves are the only remaining problem.
I've decided to go with the "better" option (query "selector") for two reasons.
Debugging: As mentioned below, debugging is going to be slightly easier with the selector approach since the queries are pre-built and listed out in a readable form in code.
Flexibility: It occurred to me that there are some databases which might have vastly better and completely different ways of solving a particular query. For example, with Access I perform a complicated query on multiple tables each time because I have to, but on SQL Server I'd like to set up a view. Selecting from the view and from several tables are completely different queries (I think) and this query selector would handle it easily.
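For readers who want to picture the query "selector", here is a minimal sketch under stated assumptions. `Engine`, `QueryId`, `QuerySelector`, and the table and view names are all hypothetical and not part of MFC or any database library. Each query code has a portable default, and an engine may register completely different text for the same code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical names throughout; nothing here comes from a real library.
enum class Engine { Default, Access, SqlServer };
enum class QueryId { ActiveCustomerIds };

class QuerySelector {
public:
    explicit QuerySelector(Engine engine) : engine_(engine) {
        // Portable text, used unless the active engine has its own entry.
        add(Engine::Default, QueryId::ActiveCustomerIds,
            "SELECT c.id FROM customers c "
            "INNER JOIN orders o ON o.customer_id = c.id WHERE o.is_open = 1");
        // Assumption: on SQL Server a view already does the join.
        add(Engine::SqlServer, QueryId::ActiveCustomerIds,
            "SELECT id FROM v_active_customers");
    }

    // Returns the engine-specific text if present, otherwise the default.
    const std::string& get(QueryId id) const {
        auto it = queries_.find(std::make_pair(engine_, id));
        if (it == queries_.end()) {
            it = queries_.find(std::make_pair(Engine::Default, id));
        }
        assert(it != queries_.end() && "every QueryId needs a default query");
        return it->second;
    }

private:
    void add(Engine e, QueryId id, std::string sql) {
        queries_[std::make_pair(e, id)] = std::move(sql);
    }

    Engine engine_;
    std::map<std::pair<Engine, QueryId>, std::string> queries_;
};
```

With this shape, `QuerySelector(Engine::Access).get(QueryId::ActiveCustomerIds)` falls back to the multi-table query, while the SQL Server instance selects from the view. All SQL stays listed in one readable place, which is the debugging benefit mentioned above.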
You need your own query-writing object, which can be inherited from by database-specific implementations.
So you would do something like:
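#include <memory>
// The base-class pointer hides which database-specific subclass is in use.
std::unique_ptr<DbAgnosticQueryObject> query(new PostgresSQLQuery());
// and so on
CString queryString = query->toString();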
It can get pretty complicated in there once you go past simple selects from a single table. There are already ORM packages out there that deal with a lot of these nuances; it may be worth looking at them instead of writing your own.
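To make the inheritance idea concrete, here is a small sketch, assuming a query that selects one column with a row limit. The class names are invented for illustration. The dialect difference it shows is real: PostgreSQL and MySQL use a trailing `LIMIT`, while SQL Server uses `TOP` after `SELECT`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical classes for illustration only.
class SelectQuery {
public:
    SelectQuery(std::string column, std::string table, int limit)
        : column_(std::move(column)), table_(std::move(table)), limit_(limit) {
        assert(limit_ > 0);
    }
    virtual ~SelectQuery() = default;

    // Each dialect decides how "first N rows" is spelled.
    virtual std::string toString() const = 0;

protected:
    std::string column_;
    std::string table_;
    int limit_;
};

// PostgreSQL (and MySQL) put a LIMIT clause at the end.
class PostgresSelectQuery : public SelectQuery {
public:
    using SelectQuery::SelectQuery;
    std::string toString() const override {
        return "SELECT " + column_ + " FROM " + table_ +
               " LIMIT " + std::to_string(limit_);
    }
};

// SQL Server puts TOP (n) right after SELECT.
class SqlServerSelectQuery : public SelectQuery {
public:
    using SelectQuery::SelectQuery;
    std::string toString() const override {
        return "SELECT TOP (" + std::to_string(limit_) + ") " + column_ +
               " FROM " + table_;
    }
};
```

Calling code only holds a `SelectQuery` reference or pointer and never branches on the engine. Note that this sketch does no identifier quoting or parameter binding, which a real builder would need.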
Best option: Pick a database, and code to it.
How often are you going to up and swap out the database on the back end of a production system? And even if you did, you'd have a lot more to worry about than just minor syntax issues. (Major stuff like join syntax, even datatypes can differ widely between databases.)
Now, if you are designing a commercial application where you want the customer to be able to use one of several back-end options when they implement it, then you may have to specify "we support Oracle, MS SQL, or MySQL" and code to those specific options.
All of your options can be reduced to
Worst option: have the code making the query check the database type.
It's just a matter of where you're putting the logic to check the database type.
The option that I've seen work best in practice is
Better option: Create query request method on the db connection object that takes a unique query "code" and returns the appropriate query based on the database engine in use.
In my experience it is much easier to test queries independently from the rest of your code. It gets a lot harder if you have objects that are piecing together queries from bits of syntax, because then you have to test the query-creation code and the query itself.
If you pull all of your SQL out into separate files that are written and maintained by hand, you can have someone who is an expert in SQL write them (you can still automate the testing of these queries). If you try to write query-generating functions you'll essentially have a C++ expert writing SQL.
Choose an ORM, and start mapping.
If you are to support more than one DB, your problem is only going to get worse.
And just think of the DBs that are coming: cloud DBs with no (or close to no) SQL, and object databases.
Take your queries outside the code - put them in the DB or in a resource file and allow overrides for different database engines.
If you use SPs it's potentially even easier, since the SPs abstract away your database differences.
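One way the resource-file idea could look, as a rough sketch: the file format below (`name = SQL` for the default, `name@engine = SQL` for an override) is invented for this example, as is the function name `loadQueries`. The loader returns the text to use for the engine that is active.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Trim spaces, tabs and carriage returns from both ends.
static std::string trim(const std::string& s) {
    const auto b = s.find_first_not_of(" \t\r");
    if (b == std::string::npos) return "";
    const auto e = s.find_last_not_of(" \t\r");
    return s.substr(b, e - b + 1);
}

// Reads lines of the (made-up) form
//   name = SQL            default text
//   name@engine = SQL     override for one engine
// and returns name -> SQL for the given engine.
std::map<std::string, std::string> loadQueries(std::istream& in,
                                               const std::string& engine) {
    assert(!engine.empty());
    std::map<std::string, std::string> defaults, overrides;
    std::string line;
    while (std::getline(in, line)) {
        const auto eq = line.find('=');           // first '=' ends the key
        if (eq == std::string::npos) continue;    // skip blank/malformed lines
        const std::string key = trim(line.substr(0, eq));
        const std::string sql = trim(line.substr(eq + 1));
        const auto at = key.find('@');
        if (at == std::string::npos) {
            defaults[key] = sql;
        } else if (key.substr(at + 1) == engine) {
            overrides[key.substr(0, at)] = sql;
        }
    }
    // Engine-specific entries win over the defaults.
    for (const auto& kv : overrides) defaults[kv.first] = kv.second;
    return defaults;
}
```

Because only the first `=` on a line separates the key from the text, SQL such as `WHERE active = 1` passes through untouched. Supporting a new engine then means adding override lines to the file rather than changing code.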
I would think that what you would want to do, if you needed the ability to support multiple databases, would be to create a data provider interface (or abstract class) and associated concrete implementations. The data provider would need to support your standard query operators and other common, supported functionality required to support your query operations (have a look at IEnumerable extension methods in .NET 3.5). Each concrete provider would then translate these into specific queries based on the target database engine.
Essentially, what you do is create a database abstraction layer and have your code interact with it. If you can find one of these for C++, it would probably be worth buying instead of writing. You may also want to look for Inversion of Control (IoC) containers for C++ that would basically do this and more. I know of several for Java and C#, but I'm not familiar with any for C++.

Automated integration testing a C++ app with a database

I am introducing automated integration testing to a mature application that until now has only been manually tested.
The app is Windows based and talks to a MySQL database.
What is the best way (including details of any tools recommended) to keep tests independent of each other in terms of the database transactions that will occur?
(Modifications to the app source for this particular purpose are not an option.)
How are you verifying the results?
If you need to query the DB (and it sounds like you probably do) for results then I agree with Kris K, except I would endeavor to rebuild the DB after every test case, not just every suite.
This helps avoid dangerous interacting tests.
As for tools, I would recommend CppUnit. You aren't really doing unit tests, but it shouldn't matter as the xUnit framework should give you the set up and teardown framework you'll need to automatically set up your test fixture.
Obviously this can result in slow-running tests, depending on your database size, population etc. You may be able to attach/detach databases rather than dropping/rebuilding.
If you're interested in further research, check out XUnit Test Patterns. It's a fine book and a good website for this kind of thing.
And thanks for automating :)
You can dump/restore the database for each test suite, etc. Since you are automating this, it may be something in the setup/teardown functionality.
I used to restore the database in the SetUp function of the database related unit test class. This way it was ensured that each test runs under the same conditions.
You may want to prepare special database content for the tests, i.e. with less data than the current production version (to keep the restore times reasonable).
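The restore-in-SetUp pattern can be sketched without any test framework. In the toy example below, the "database" is an in-memory map standing in for the real MySQL instance, and the fixture constructor plays the role of SetUp. In a real suite the restore step would reload a prepared dump instead; that part, and all the names used here, are assumptions for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for the real database: table name -> row count.
using FakeDb = std::map<std::string, int>;

class DbFixture {
public:
    // "SetUp": every test starts from the same known content.
    DbFixture(FakeDb& db, const FakeDb& baseline) : db_(db) {
        assert(!baseline.empty());  // an empty baseline is almost surely a mistake
        db_ = baseline;
    }
    FakeDb& db() { return db_; }

private:
    FakeDb& db_;
};

// Two tests that would interfere with each other without the restore.
inline int testInsertOrder(FakeDb& shared, const FakeDb& baseline) {
    DbFixture fx(shared, baseline);
    fx.db()["orders"] += 1;       // the test modifies data
    return fx.db()["orders"];
}

inline int testCountOrders(FakeDb& shared, const FakeDb& baseline) {
    DbFixture fx(shared, baseline);
    return fx.db()["orders"];     // must not see the row added by the other test
}
```

Whatever order the two tests run in, each sees the baseline content, which is the property the restore in SetUp is meant to guarantee.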
The best environment for such testing, I believe, is VMWare or an equivalent. Set up your database, transaction log and so on, then record the whole lot - database as well as configuration. Then to re-test, reload the image and database and kick off the tests. This still requires maintenance of the tests as the system changes, but at least the tests are repeatable, which is one of your greatest challenges in integration testing.
For test automation, many people use Perl, but we've found that Perl programs grow like Topsy and become convoluted. The use of Python as a scripting language (we run C++ tests) is worthwhile if you're trying to build a series of structured tests.
As Kris K. says, dumping and restoring the database between each test will probably be the way to go.
Since you are looking at doing testing external to the App I would look to build the testing framework in a language where you can take advantage of better testing tools.
If you built the testing framework in Java you could take advantage of JUnit and potentially even something like FitNesse.
Don't think that just because the application under test is C++ that means you are stuck using C++ for your automated testing.
Please try AnyDbTest; I think it is the very tool you are looking for (www.anydbtest.com).
1. Test cases are written in XML, not Java/C++/C#/VB code, so there is no need for expensive programming tools.
2. Supports all popular databases, such as Oracle, SQL Server, and MySQL.
3. Supports many kinds of assertions, such as StrictEqual, SetEqual, IsSupersetOf, Overlaps, and RecordCountEqual. In addition, most assertions can be prefixed with a logical NOT operator.
4. Allows using an Excel spreadsheet or XML as the source of the data for the tests. An Excel spreadsheet makes it easy to create, edit, and maintain the test data.
5. Supports a sandbox test model: if a test runs in the sandbox, all database operations on each DB are rolled back, meaning any changes are undone.
6. Allows pumping data from one database or Excel file into the target database in the test initialization and finalization phases. This is an easy way to prepare the test data.
7. Unique cross-database-type testing, which means the target and reference result sets can come from two different databases, even if one is SQL Server and the other is Oracle.
8. Set-style comparison for record sets. AnyDbTest will tell you the intersection, surplus, or absence between the two record sets.
9. Sequential-style comparison for record sets or scalar values, meaning the two result sets are compared in their original order.
10. Allows exporting the result set of a SQL statement to an XML/Excel file.