Is it possible to use two heterogeneous databases as a single transactional one? - Teiid

My scenario has two different versions of the same system with different database structures, where 1.0 is in production and 2.0 is in development.
Version 2.0 needs to go into production using the data and structure of version 1.0 for a certain period of time. Our team would prefer not to change the data structure of 2.0; the question is whether, through a VDB, it would be possible for 2.0 to manipulate the 1.0 database, performing both queries and transactions.
Our knowledge of Teiid is still quite limited, so we would like advice on whether Teiid is a viable option for our need and whether it is a way forward.

Yes, you can use a VDB to provide a stable interface to a single source or a set of sources. As long as the sources support XA, you can perform transactionally safe reads/writes across them.
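For illustration, here is a minimal sketch of how version 2.0 could talk to such a VDB through Teiid's JDBC driver as if it were one ordinary database. The VDB name, host/port, credentials and table names below are made-up placeholders, not part of the original question, and the Teiid JDBC driver is assumed to be on the classpath.

```java
// Minimal sketch: querying a Teiid VDB over JDBC as if it were a single database.
// "SystemVDB", host/port, credentials and the Customers table are illustrative only.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class VdbClient {
    public static void main(String[] args) throws Exception {
        // Teiid JDBC URL format: jdbc:teiid:<vdb-name>@mm://<host>:<port>
        String url = "jdbc:teiid:SystemVDB@mm://localhost:31000";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT id, name FROM Customers WHERE status = ?")) {
            // The application sees the virtual schema, not the physical 1.0 tables.
            ps.setString(1, "ACTIVE");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                }
            }
        }
    }
}
```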

Related

PostgreSQL 9.1: watching for schema changes

ALL,
I am writing software in C++ that will connect to a database, perform some operations, and then disconnect. The program will be used with different DBMSs and must be cross-platform. The software should check for schema changes and, if there is a table creation/modification/deletion, act accordingly.
One of the challenges I'm currently facing is this:
I'm trying to test the software on one of the old Mac computers with PostgreSQL 9.1 installed. Newer versions of PostgreSQL support writing a function that watches for schema changes and notifies the client, but that feature is only available from version 9.3 onwards.
Is there an easy and simple way to get such notifications for PG 9.1 and 9.2? Or is the only way to work with the log file and poll it?
TIA!
There's also information_schema, which presents a standardized representation of the shape of the database; you could query that and compare before-and-after results.
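As a hedged illustration of that approach, the sketch below snapshots information_schema.columns via JDBC and diffs two snapshots; the same query works from the asker's C++ program through ODBC or libpq. The connection details and the 'public' schema are placeholder assumptions.

```java
// Sketch: snapshot information_schema.columns and diff two snapshots to detect
// table/column creation, modification or deletion. Connection details are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashSet;
import java.util.Set;

public class SchemaDiff {
    static Set<String> snapshot(Connection conn) throws Exception {
        Set<String> shape = new HashSet<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT table_name, column_name, data_type " +
                 "FROM information_schema.columns WHERE table_schema = 'public'")) {
            while (rs.next()) {
                shape.add(rs.getString(1) + "." + rs.getString(2) + ":" + rs.getString(3));
            }
        }
        return shape;
    }

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "password")) {
            Set<String> before = snapshot(conn);
            // ... poll again later ...
            Set<String> after = snapshot(conn);

            Set<String> added = new HashSet<>(after);
            added.removeAll(before);      // new tables/columns
            Set<String> removed = new HashSet<>(before);
            removed.removeAll(after);     // dropped tables/columns
            System.out.println("Added: " + added + ", removed: " + removed);
        }
    }
}
```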

NoSQL with analytic functions

I'm searching for a NoSQL system (preferably open source) that supports analytic functions (AF for short) the way Oracle/SQL Server/Postgres do. I didn't find any with built-in support. I've read something about Hive, but it doesn't have the actual AF features (windows, first/last values, ntiles, lag, lead and so on), just histograms and ngrams. Also, some NoSQL systems (Redis, for example) support map/reduce, but I'm not sure whether AF can be replaced with it.
I want to make a performance comparison to choose between Postgres and a NoSQL system.
So, in short:
Searching for NoSQL systems with AF
Can I rely on map/reduce to replace AF? Is it fast, reliable and easy to work with?
ps. I tried to make my question more constructive.
Once you've really understood how MapReduce works, you can do amazing things with a few lines of code.
Here is a nice video course:
http://code.google.com/intl/fr/edu/submissions/mapreduce-minilecture/listing.html
The real difficulty factor will be between functions that you can implement with a single MapReduce and those that will need chained MapReduces. Moreover, some nice MapReduce implementations (like CouchDB) don't allow you to chain MapReduces (easily).
Some functions require knowledge of all the existing data, when they involve some kind of aggregation (avg, median, standard deviation) or some ordering (first, last).
If you want a distributed NoSQL solution that supports AF out of the box, the system will need to rely on some centralized indexing and metadata to keep information about the data on all nodes, thus having a master node and probably a single point of failure.
You have to ask what you expect to accomplish using NoSQL. Do you want schemaless tables? Distributed data? Better raw performance for very simple queries?
Depending on your needs, I see three main alternatives here:
1 - use a distributed NoSQL store with no single point of failure (e.g. Cassandra) to store your data and use map/reduce to process it and produce the results for the desired function (almost every major NoSQL solution supports Hadoop). The caveat is that map/reduce queries are not real-time (they can take minutes or hours to execute) and require extra setup and learning.
2 - use a traditional RDBMS that supports multiple servers, like MySQL Cluster
3 - use a NoSQL store with a master/slave topology that supports ad-hoc and aggregation queries, like MongoDB
As for the second question: yes, you can rely on M/R to replace AF. You can do almost anything with M/R.
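As a rough illustration of that point, the sketch below emulates an analytic aggregate such as AVG() OVER (PARTITION BY region) with an explicit map phase, shuffle and reduce phase in plain Java. The data and field names are invented; a real cluster (e.g. Hadoop) would run the phases across nodes, but the shape of the computation is the same.

```java
// Illustrative only: per-group average expressed as map -> shuffle -> reduce.
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapReduceAvg {
    public static void main(String[] args) {
        // Input "rows": region and sale amount (made-up sample data).
        Object[][] sales = {
            {"north", 10.0}, {"north", 30.0}, {"south", 5.0}, {"south", 15.0}
        };

        // Map phase: emit one (key, value) pair per input row.
        List<Map.Entry<String, Double>> emitted = new ArrayList<>();
        for (Object[] row : sales) {
            emitted.add(new SimpleEntry<>((String) row[0], (Double) row[1]));
        }

        // Shuffle: group values by key (the framework does this in a real cluster).
        Map<String, List<Double>> grouped = new HashMap<>();
        for (Map.Entry<String, Double> e : emitted) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }

        // Reduce phase: one aggregate per key, here the average.
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            double sum = 0;
            for (double v : e.getValue()) sum += v;
            System.out.println(e.getKey() + " avg=" + sum / e.getValue().size());
        }
    }
}
```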

Database version deployment: Entity Framework Migrations vs. SSDT DacPacs

I have a data-centered application with SQL Server. The environments in which it'll be deployed are not under our control and there's no DBA there (they are all small businesses), so we need the process of distributing each application/database update to be as automatic as possible.
Besides the normal changes between versions of an application (somewhat unpredictable at times), we already know that we'll need to distribute some new seed data with each version. Sometimes this seed data will be related to other data in our system. For instance: maybe we'll need to insert 2 new rows of master data during the v2-v3 update process, and another 5 rows during the v5-v6 update process.
EF
We have checked Entity Framework DB Migrations (available for existing databases without Code-First since the 4.3.1 release), which represent the traditional sequential scripts in a more automatic and controlled way (like Fluent Migrations).
SSDT
On the other hand, with a different philosophy, we have checked SSDT and its dacpacs, snapshots and pre- and post-deployment scripts.
The questions are:
Which of these technologies / philosophies is more appropriate for the case described?
Any other technology / philosophy that could be used?
Any other advice?
Thanks in advance.
That's an interesting question. Here at Red Gate we're hoping to tackle this issue later this year, as we have many customers asking about how we might provide a simple deployment package. We do have SQL Packager, which essentially wraps a SQL script into an exe.
I would say that dacpacs are designed to cover the use case you describe. However, as far as I understand, they work by generating a deployment script dynamically when applied to the target. The drawback is that you won't have the warm fuzzy feeling that you might get when deploying a pre-tested SQL script.
I've not tried updating data with dacpacs before, so I'd be interested to know how well this works. As far as I recall, it truncates the target tables and repopulates them.
I have no experience with EF migrations so I'd be curious to read any answers on this topic.
We'll probably adopt a hybrid solution. We'd like not to abandon the idea of deployment packages, but on the other hand, due to the nature of our applications (small businesses as end users, no DBA, no obligation to upgrade, so multiple "live" database versions coexisting), we can't give up full control of the migration process either, including schema and data. In our case, pre- and post-deployment scripts may not be enough (or at least not comfortable enough) for a full migration the way EF Migrations are. Changes like adding/removing seed data, changing a "one to many" to a "many to many" relationship, or even radical database schema changes (and, consequently, data migrations to the new schema from any previously released schema) may be part of our daily work once our first version is released.
So we'll probably use EF Migrations, with their "Up" and "Down" methods for each version release. In principle, each "Up" will invoke a dacpac with the latest database snapshot (and each "Down", the previous one), each with its own deployment parameters for that specific migration. EF Migrations will handle the versioning line, and maybe also some complex parts of the data migration.
We feel more secure with this hybrid approach. We missed automation and schema-change detection in Entity Framework Migrations as much as we missed versioning-line control in the dacpac approach.

In-memory data structure

I have a distributed application: a set of processes, spread across multiple computers, communicating with each other. I have a data structure which is modified by these processes, and it is not stored in a database.
Now the question is: how do I maintain the same view of this data structure across all processes?
I.e., at any point in time all processes should see the same data structure.
You say that you don't have a database. That's a shame, because database authors have solved your problem. You would need to incorporate the equivalent technology into your project. And obviously, the fastest and simplest way to incorporate the technology of databases is to incorporate a database.
Redis is designed to solve your problem. It is a key-value store for sharing data between programs running on different machines. It is a server you run somewhere, and your programs all connect to it using the client library it provides.
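As a hedged sketch of what that looks like in practice, the snippet below shares a single counter between processes through Redis using the Jedis client. The host, key name and the WATCH/MULTI optimistic-locking pattern are illustrative choices, not the only way to do it.

```java
// Sketch: several processes sharing one value through Redis via the Jedis client.
// Host and key name are placeholders; a retry loop is omitted for brevity.
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class SharedCounter {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("redis-host", 6379)) {
            // Every process reads and writes the same key, so they all see one value.
            jedis.watch("shared:counter");                  // optimistic lock
            String current = jedis.get("shared:counter");
            long next = (current == null ? 0 : Long.parseLong(current)) + 1;

            Transaction tx = jedis.multi();
            tx.set("shared:counter", Long.toString(next));
            if (tx.exec() == null) {
                // A null result means another process changed the key first;
                // real code would retry the read-modify-write cycle here.
                System.out.println("Conflict detected, retry needed");
            }
        }
    }
}
```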
You can also use a database such as MySQL, but with in-memory tables.
If your data structure does not fit the key-value or relational models very well, you have the same kind of situation as multi-player games. It is non-trivial to keep multi-player games in sync, but it can be done, and here is an excellent introduction as to how: gafferongames.com
I would recommend something like the Data Distribution Service (DDS) platform for this (an open-source implementation is OpenDDS). Its key selling point is that it is designed to propagate changes to data to everyone interested in those changes. And performance isn't bad either.
Commercial implementations of this protocol are used in a variety of real-time systems, mostly military grade applications.
More options to consider: distributed caches (such as memcached). Though I've not played with this myself, it looks quite straightforward to get up and running.

What is the difference between HSQLDB and JavaDB? Which one is suitable for unit testing?

Could you tell me the differences between HSQLDB and JavaDB? And which one should I use in unit testing, assuming that I only use standard features? Thanks.
HSQLDB is faster, therefore it is better suited for unit testing.
The H2 Database is even better (in my view): it's as fast as HSQLDB, and supports compatibility modes for various databases (MySQL, Oracle,...). So if you need to use database specific features in the future, chances are that you can still test it with H2. But my view is a bit biased (see my profile).
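For illustration, here is a minimal sketch of a unit test against an in-memory HSQLDB instance (assuming JUnit 4 and the HSQLDB jar on the test classpath); the same idea works with H2 by swapping the JDBC URL, e.g. "jdbc:h2:mem:testdb".

```java
// Sketch of a unit test backed by an in-memory HSQLDB database.
// "mem:" databases live only for the duration of the JVM/test run.
import static org.junit.Assert.assertEquals;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.junit.Test;

public class CustomerDaoTest {
    @Test
    public void insertsAndReadsBack() throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:hsqldb:mem:testdb", "SA", "");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE customer (id INT PRIMARY KEY, name VARCHAR(50))");
            st.execute("INSERT INTO customer VALUES (1, 'Alice')");
            try (ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM customer")) {
                rs.next();
                assertEquals(1, rs.getInt(1));
            }
        }
    }
}
```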
HSQLDB has features that are useful for testing and not available in JavaDB. These include:
a database script and log in SQL text format, which allow a quick check of test runs
user-defined SQL functions that let you write an equivalent to any function supported by another database very simply
a very extensive set of SQL features and functions, for example date and interval arithmetic as supported by DB2 and Oracle (also PostgreSQL and MySQL), and functions such as TO_DATE
BTW, the above features are not available in H2 (mentioned in another answer).