Database connection pooling for c++ recommendation engine

Database connection pooling for c++ recommendation engine - c++

I am considering to implement a recommendation engine for a small size website.
The website will employ LAMP stack, and for some reasons the recommendation engine must be written in C++. It consists of an On-line Component and Off-line Component, both need to connect to MySQL. The difference is that On-line Component will need a connection pool, whereas several persistent connections or even connect as required would be sufficient for the Off-line Component, since it does not require real time performance in a concurrent requests scenario as in On-line Component.
On-line Component is to be wrapped as a web service via Apache AXIS2. The PHP frontend app on Apache http server retrieves recommendation data from this web service module.
There are two DB connection options for On-line Component I can think of:
1. Use ODBC connection pool, I think unixODBC might be a candidate.
2. Use connection pool APIs that come as a part of Apache HTTP server. mod_dbd would be a choice. http://httpd.apache.org/docs/2.2/mod/mod_dbd.html
As for Off-line Component, a simple DB connection option is direct connection using ODBC.
Due to lack of web app design experience, I have the following questions:
Option 1 for On-line Component is a tightly coupled design without taking advantage of pooling APIs in Apache HTTP server. But if I choose Option 2 (3-tiered architecture), as a standalone component apart from Apache HTTP server, how to use its connection pool APIs? A Java application can be deployed as a WAR file and contained in a servlet container such as tomcat(See Mahout in Action, section 5.5), is there any similar approach for my C++ recommendation engine?
I am not sure if I made a proper prototype.
Any suggestions will be appreciated:)
Thanks,
Mike

Related

C++: Application architecture with API

My application it's C++ service. And I need to add API for it. I consider that it will be XML/JSON RPC based API. How should I design a program for reusing existing code base and provide API.
I see following options:
My application will work via RPC layer. Seems that it's bad option due to low performance;
Before starting of service I will fork it and run my application in the first process and RPC server in the second; Seems ok, but how to restart RPC server in this case?
I guess there is a well known pattern for such issues.
Thanks.

If you can use a web server, then the FastCGI concept might be what you're looking for. One of the main duties of FastCGI is to allow you to put on a public API (from the web server) that internally calls the "real" application, in your case the resident C++ service. So all work is done at the web server to create the public API using any technology you wish, and little or no code changes done in your C++ service.

Efficient C++ software stack for web development

What C++ software stack do developers use to create custom fast, responsive and not very resource hungry web services?

I'd recommend you to take a look on CppCMS:
http://cppcms.com
It exactly fits the situation you had described:
performance-oriented (preferably web service) software stack
for C++ web development.
It should have a low memory footprint
work on UNIX (FreeBSD) and Linux systems
perform well under high server load and be able to handle many requests with great efficiency
[as I plan to use it in a virtual environment] where resources will be to some extent limited.
So far I have only come across Staff WSF, Boost, Poco libraries. The latter two could be used to implement a custom web server...
The problem that web server is about 2% of web development there are so much stuff to handle:
web templates
sessions
cache
forms
security-security-security - which is far from being trivial
And much more, that is why you need web frameworks.

You could write an apache module, and put all your processing code in there.
Or there's CppCMS, or Treefrog or for writing web services (not web sites) use gSOAP or Apache Axis
But ultimately, there's no "easy to use framework" because C++ developers like to build apps from smaller components. There's no Ruby-style framework, but there is all manner of libraries for handling xml or whatever, and Apache offers the http protocol bits in the module spec so you can build up your app quite happily using whatever pieces make sense to you. Now whether there's a market for bundling this up to make something easier to use is another matter.
Personally, the best web app system I wrote (for a company) used a very think web layer in the web server (IIS and ASP, but this applies to any webserver, use php for example) that did nothing except act as a gateway to pass the data from the requests through to a C++ service. The C++ service could then be written completely as a normal C++ command line server with well-defined entry points, using as thin an RPC system as possible (shared memory, but you may want to check out ZeroMQ), which not only increased security but allowed us to easily scale by shifting the services to app servers and running the web servers on different hardware. It was also really easy to test.

Porting native C++ server with XML commands and huge in-memory project data to webservice. How todo and which Framework?

we have an existing C++ (STL, boost, Qt4) server application which communicates by xml commands over tcp sockets with the clients. The interface is proprietary, but not so far from SOAP, but with one exception. We have large set of master- and (calculated) project data, which is initialize and calculated at server startup (and updated later on). Building and calculating this data on every service request is much to expensive.
Now we want to port some (in future all) commands to web-services. We found/evaluating the following C++ frameworks:
Staff
gSOAP
WSO2
Roguewave Hydra
Now my questions ;-)
How is the standard way to build a web-service which have access to permanent in-memory project data, which be initialized at server startup (not web-service request)? Are the any resources on the net?
Can above be solved with gSOAP own HTTP Get plugins?
Have i forgot a recommendable C++/web-service framework?
Thanks in advance Lars

memcached or a real database of some sort. Depends really on how much concurrent usage there is and will the dataset be modifiable by multiple clients.
Others, no idea..

Opening up TCP Sockets in Java EE Webapplication

We have to communicate with a C++ component from a Java EE web application and my proposal involved using JMS server to communicate with the C++ component which is located on other machine.
However the developer of the C++ component wants me to open up TCP/IP sockets from the webapplication and communicate over XML. My view is that socket programming in web application is error prone and will not scale well since there is a limited amount of sockets that can be opened up.
Please let me have your architecture/design preference on using JMS vs TCP/IP sockets.
Thank you

Of course it's case by case. But give HTTP a serious chance. It is a good way to cross platform boundaries. It gives you ways to swap out the backend easily and there are many ways to scale it. I've used it from various platforms to hit centralized authentication service written in modern language. I've also done the opposite by putting frontend to a legacy code by turning it into a web server.
The best part about HTTP is that it's a standard protocol, so almost any platform is able to serve it and consume it out of the box. HTTP(S) or TCP takes care of many of the issues like reliability and security.

Converting existing C++ web service to a load balanced server?

We have a C++ (SOAP-based) web service deployed Using Systinet C++ Server, that has a single port for all the incoming connections from Java front-end.
However recently in production environment when it was tested with around 150 connections, the service went down and hence I wonder how to achieve load-balancing in a C++ SOAP-based web service?

The service is accessed as SOAP/HTTP?
Then you create several instances of you services and put some kind of router between your clients and the web service to distribute the requests across the instances. Often people use dedicated hardware routers for that purpose.
Note that this is often not truly load "balancing", in that the router can be pretty dumb, for example just using a simple round-robin alrgorithm. Such simple appraoches can be pretty effective.
I hope that your services are stateless, that simplifies things. If indiviual clients must maintain affinity to a particualr instance thing get a little tricker.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js