Registry design - c++

I need to design a application registry S/W component using C++. Basically, this needs to support addition and deletion of key/values. Dynamic updates need to be supported (for example, when a new application get installed).
Is there a design pattern which closely matches the given problem?
Though I have formulated a rough sketch of the APIs this component needs to support, it would be helpful to have a look at alternative (perhaps better) ways of design.
If there are some typical problems associated with registry design (may be some thread issues which I might have overlooked), I want to make sure I have circumvented those.

Is there a design pattern which closely matches the given problem?
You are probably looking at more than one: A proxy for the entire registry, iterator etc. comes to mind.
If there are some typical problems associated with registry design
You will probably need transactional semantics. Rollback too!
Do you need to save snapshots from time to time? Then you will need an archiving module.
Synchronization: Multiple writes to the registry need to be taken care of.

Related

Need explanation for 'convention over configuration' [duplicate]

What are the benefits of the "Convention over Configuration" paradigm in web development? And are there cases where sticking with it don't make sense?
Thanks
Convention states that 90% of the time it will be a certain way. When you deviate from that convention then you can make changes...versus forcing each and every user to understand each and every configuration parameter. The idea is that if you need it to differ you will search it out at that point in time versus trying to wrap your head around all the configuration parameters when it often times has no real value.
IMHO it always makes sense. Making convention the priority over explicit configuration is ideal. Again if someone has a concern, they will force themselves to investigate the need.
I think the benefit is simple: No configuration necessary. You don't need to define locations for this-or-that type of resource, for example, for the app/framework to find them itself.
As for cases where it does not make sense: any situation where it will be fairly frequent that alternative configurations would be required, or where it makes sense that a developer/admin would need to 'opt-in' to some behavior explicitly (for example, to prevent unintended and unexpected side-effects that could have security implications).
The benefit of convention over configuration paradigm in web development the productivity since you won't be required to configured to set all the rules and there are less decision that a programmer has to make. This is evident when using the .NET Framework.
The most obvious benefit is that you will have to write lesser code. Let's take case of Java Persistence API. When you define a POJO having attributes and corresponding setters/getters, it's a simple class. But the moment you annotate it with #javax.persistence.Entity it becomes an entity object (table) which can get persisted in DB. Now this was achieved by just a simple annotation, no other config file.
Another plus point is, all your logic is at one place and in one language (i.e. you get rid of separate xml).
I think this wikipedia article has explained it very well:
Convention over configuration (also known as coding by convention) is
a software design paradigm used by software frameworks that attempts
to decrease the number of decisions that a developer using the
framework is required to make without necessarily losing flexibility.
The concept was introduced by David Heinemeier Hansson to describe the
philosophy of the Ruby on Rails web framework, but is related to
earlier ideas like the concept of "sensible defaults" and the
principle of least astonishment in user interface design.
The phrase essentially means a developer only needs to specify
unconventional aspects of the application. For example, if there is a
class Sales in the model, the corresponding table in the database is
called "sales" by default. It is only if one deviates from this
convention, such as the table "product sales", that one needs to write
code regarding these names.
When the convention implemented by the tool matches the desired
behavior, it behaves as expected without having to write configuration
files. Only when the desired behavior deviates from the implemented
convention is explicit configuration required.

Does three-tier architecture ever work?

We are building three-tier architectures for over a decade now. Dividing presentation-, logic- and data-tier is supposed to allow us to exchange each layer individually, should the need ever arise, be it through changed requirements or new technologies.
I have never seen it working in practice...
Mostly because (at least) one of the following reasons:
The three tiers concept was only visible in the source code (e.g. package naming in Java) which was then deployed as one, tied together package.
The code representing each layer was nicely bundled in its own deployable format but then thrown into the same process (e.g. an "enterprise container").
Each layer was run in its own process, sometimes even on different machines but through the static nature they were connected to each other, replacing one of them meant breaking all of them.
Thus what you usually end up with, in is a monolithic, tightly coupled system that does not deliver what it's architecture promised.
I therefore think "three-tier architecture" is a total misnomer. The true benefit it brings is that the code is logically sound. But that's at "write time", not at "run time". A better name would be something like "layered by responsibility". In any case, the "architecture" word is misleading.
What are your thoughts on this? How could working three-tier architecture be achieved? By that I mean one which holds its promises: Allowing to plug out a layer without affecting the other ones. The system should survive that and be in a well defined state afterwards.
Thanks!
The true purpose of layered architectures (both logical and physical tiers) isn't to make it easy to replace a layer (which is quite rare), but to make it easy to make changes within a layer without affecting the others (and as Ben notes, to facilitate scalability, consistency, and security) - which works all the time all around us.
One example of a 3-tier architecture is a typical database-driven web application:
End-user's web browser
Server-side web application logic
Database engine
In every system, there is the nice, elegant architecture dreamed up at the beginning, then the hairy mess when its finally in production, full of hundreds of bug fixes and special case handlers, and other typical nasty changes made to address specific issues not realized during the design.
I don't think the problems you've described are specific to three-teir architecture at all.
If you haven't seen it working, you may just have bad luck. I've worked on projects that serve several UIs (presentation) from one web service (logic). In addition, we swapped data providers via configuration (data) so we could use a low-cost database while developing and Oracle in higher environments.
Sure, there's always some duplication - maybe you add validation in the UI for responsiveness and then validate again in the logic layer - but overall, a clean separation is possible and nice to work with.
Once you accept that n-tier's major benefits--namely scalability, logical consistency, security--could not easily be achieved through other means, the question of whether or not any of the tiers can be replaced outright without breaking the the others becomes more like asking whether there's any icing on the cake.
Any operating system will have a similar kind of architecture, or else it won't work. The presentation layer is independent of the hardware layer, which is abstracted into drivers that implement a certain interface. The data is handled using logic that changes depending on the type of data being read (think NTFS vs. FAT32 vs. EXT3 vs. CD-ROM). Linux can run on just about any hardware you can throw at it and it will still look and behave the same because the abstractions between the layers insulate each other from changes within a single layer.
One of the biggest practical benefits of the 3-tier approach is that it makes it easy to split up work. You can easily have a DBA and a business anylist or two building a data layer, a traditional programmer building the server side app code, and a graphic designer/ web designer building the UI. The three teams still need to communicate, of course, but this allows for much smoother development in most cases. In this regard, I see the 3-tier approach working reliably everyday, and this enough for me, even if I cannot count on "interchangeable parts", so to speak.

What's a pattern for getting two "deep" parts of a multi-threaded program talking to each other?

I have this general problem in design, refactoring or "triage":
I have an existing multi-threaded C++ application which searches for data using a number of plugin libraries. With the current search interface, a given plugin receives a search string and a pointer to a QList object. Running on a different thread, the plugin goes out and searches various data sources (locally and on the web) and adds the objects of interest to the list. When the plugin returns, the main program, still on the separate thread, adds this data to the local data store (with further processing), guarding this insertion point using a mutex. Thus each plugin can return data asynchronously.
The QT-base plugin library is based on message passing. There are a fair number of plugins which are already written and tested for the application and they work fairly well.
I would like to write some more plugins and leverage the existing application.
The problem is that the new plugins will need more information from the application. They will to need intermittent access to the local data store itself as they search. So to get this, they would need direct or indirect access both the hash array storing the data and the mutex which guards multiple access to the store. I assume the access would be encapsulated by adding an extra method in a "catalog" object.
I can see three ways to write these new plugins.
When loading a plugin, pass them
a pointer to my "catalog" at the
start. This becomes an extra,
"invisible" interface for the new
plugins. This seems quick, easy,
completely wrong according to OO but
I can't see what the future problems would be.
Add a method/message to the
existing interface so I have a
second function which could be
called for the new plugin libraries,
the message would pass a pointer to
the catalog to the plugins. This
would be easy for the plugins but it
would complicate my main code and
seems generally bad.
Redesign the plugin interface.
This seems "best" according to OO,
could have other added benefits but
would require all sorts of
rewriting.
So, my questions are
A. Can anyone tell me the concrete dangers of option 1?
B. Is there a known pattern that fits this kind of problem?
Edit1:
A typical function for calling the plugin routines looks like:
elsewhere(spec){
QList<CatItem> results;
plugins->getResult(spec, &results);
use_list(results);
}
...
void PluginHandler::getResults(QString* spec, QList<CatItem>* results)
{
if (id->count() == 0) return;
foreach(PluginInfo info, plugins) {
if (info.loaded)
info.obj->msg(MSG_GET_RESULTS, (void*) spec, (void*) results);
}
}
It's a repeated through-out the code. I'd rather extend it than break it.
Why is it "completely wrong according to OO"? If your plugin needs access to that object, and it doesn't violate any abstraction you want to preserve, it is the correct solution.
To me it seems like you blew your abstractions the moment you decided that your plugin needs access to the list itself. You just blew up your entire application's architecture. Are you sure you need access to the actual list itself? Why? What do you need from it? Can that information be provided in a more sensible way? One which doesn't 1) increase contention over a shared resource (and increase the risk of subtle multithreading bugs like race conditions and deadlocks), and 2) doesn't undermine the architecture of the rest of the app (which specifically preserves a separation between the list and its clients, to allow asynchronicity)
If you think it's bad OO, then it is because of what you're fundamentally trying to do (violate the basic architecture of your application), not how you're doing it.
Well, option 1 is option 3, in the end. You are redesigning your plugin API to receive extra data from the main app.
It's a simple redesign that, as long as the 'catalog' is well implemented and hide every implementation detail of your hash and mutex backing store, is not bad, and can serve the purpose well enough IMO.
Now if the catalog leaks implementation details then you would better use messages to query the store, receiving responses with the needed data.
Sorry, I just re-read your question 3 times and I think my answer may have been too simple.
Is your "Catalog" an independent object? If not, you could wrap it as it's own object. The Catalog should be completely safe (including threadsafe)--or better yet immutable.
With this done, it would be perfectly valid OO to pass your catalog to the new plugins. If you are worried about passing them through many layers, you can create a factory for the catalog.
Sorry if I'm still misunderstanding something, but I don't see anything wrong with this approach. If your catalog is an object outside your control, however, such as a database object or collection then you really HAVE to encapsulate it in something you can control with a nice, clean interface.
If your Catalog is used by many pieces across your program, you might look at a factory (which, at it's simplest degrades to a Singleton). Using a factory you should be able to summon your Catalog with a Catalog.getType("Clothes"); or whatever. That way you are giving out the same object to everyone who wants one without passing it around.
(this is very similar to a singleton, by the way, but coding it as a factory reminds you that there will almost certainly be more than one--also remember to allow a Catalog.setType("Clothes", ...); for testing.

Open source libraries for abstracting database access in C++?

I'm looking for options for abstracting database server details away from my application (in c++), I'd like to write my code to be independent of the actual database backend. I know MySQL has a nice library, but I don't want to be tied to a single database implementation. Are there good options for this?
SOCI is good. Supports multiple databases, works well, modern C++ style API, works with boost.
My opinion is to forget about a cross-database driver, and focus on finding or creating a cross-database Data Access Layer. A few reaons:
Complex queries (read: anything that's not a toy) invariably end up using one or two database-specific features. LIMIT and OFFSET for example, commonly used for paging, isn't universal.
Sooner or later you'll want bulk insertion, and you'll want it to be as fast as possible, because 3 hours is better than 6 hours. Every database has a different "optimum" way to do this, so your DAL will need to special-case this anyways.
Different databases may expose different constraint mechanisms—even custom column types—that can be be worth taking advantage of where possible (PostgreSQL is wonderful for this).
If you want to do any application level caching, you'll need a DAL anyways.
So, go ahead and use libmysql by itself - just hide it behind a compiler firewall in your DAL, and be prepared to swap it out later. You can protect yourself from shifting infrastructure without having to use a lowest-common-denominator SQL wrapper.
If that doesn't jive with you, check out SQLAPI++.
many apps use odbc (via unixODBC for instance), there's also otl. on windows you could use ado.net from managed c++ or the old ado com interfaces...
Qt provides a database abstraction layer. See: http://doc.trolltech.com/4.6/qsqldatabase.html.
libodbc++ provides a pretty good API.
Also the big guys Qt (see Kyle Lutz' answer) & wxWidgets have db abstraction layers, so it may be a good idea to use them if you plan to use/you're already using any other parts of those frameworks.
OpenDBX and libzdb are two lightweight candidates. Libgda for GNOME.

What is a good design for an extensible query interface?

Our application exposes queries by way of web services, and what we've found is that our clients often want custom queries, either by way of further limiting the results returned by specifying additional criteria, or by asking for things that we don't already expose.
Now, we can take the approach of creating new methods for each of these new methods, but that's somewhat inconvenient; deployment of our application at a client site usually requires weeks of staged integration testing. We've proposed a named query mechanism, where the application administrator would define queries by name that are parameterized, and a corresponding web service that simply invokes these parameters. However, I can't help but think that someone has solved this problem before, so I'd like some input from the SO community on possible designs.
Thanks!
Updates
The specification pattern is a good one, but our application deals with enough data that we want to push as much of the querying work down into an RDBMS, which can do a better job of optimizing the query plan than we would ever want to. Moreover, we support three RDBMS backends, so we're stuck using a greatest-common-denominator approach: we use as much capability as the least functional database can provide.
I would also recommend to consider the "Specification Pattern" in this type of applications as a design decision for your backend. Check the following posts about "Specification Pattern":
http://www.mattberther.com/2005/03/25/the-specification-pattern-a-primer/
http://devlicio.us/blogs/jeff_perrin/archive/2006/12/13/the-specification-pattern.aspx
Take a look at Hibernates Criteria API and use it or build some similar
functionality for Your users.
If it's worth the effort, provide a tree-like interface for grouping criterias. ("all criteria of a group must match" / "one criteria must match" / "negate")
Advantages:
Easy to build.
User parameters are possible.
Powerful queries are possible.
You can apply restrictions like SELECT ... FROM table WHERE someRestriction AND (user-provided criteria)
Since we really don't know which how your users use your interface it seems a little premature to give a technical advice on something that feels a lot closer to "Inmates are running the Asylum" problem.
There are some very good advice and common ways to solve this i technical aspects but do they work for your users? Maybe the really don't give a crap about your problem but rather have a fine working one button solution? (Or more like google?)