How do I implement a robust Data Persistence Layer in C++?

This is my first time building a large program in C++ for my company, and I want a good pattern for connecting to my MySQL database. I have these problems:
- I can't decide which pattern to use: DAO, Repository, Unit of Work, Factory...
- I can't find good examples of a data access pattern in C++. I know it should be language-independent, but I couldn't find even a robust DAO example with good exception handling. The examples usually have only two classes (obj1->obj2) in a very small environment.
If anyone knows good sources or has any tips, I would be very glad =D
Thanks in advance.

My advice is to search for a C++ ORM (Object-Relational Mapping) library. There are plenty of ORM or DAL solutions for Java, such as Hibernate and DataNucleus.
We use DataNucleus and are happy with it, but I don't think DataNucleus supports C++. IMHO, creating a DAL from scratch is unnecessary.

I had the same problem years ago. The list of ORMs for C++ on Wikipedia is very short, and the most promising product is under the GPL unless you buy a license.
We decided to develop our own ORM. There are several enterprise design patterns for this. We chose the approach ODB uses: your tables are described by simple classes, and a database manager handles persistence and access to the objects. The most costly task is writing your own query interface (if you don't want to type plain SQL in your code).
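
For what it's worth, here is a minimal sketch of the Repository/DAO idea with a single exception type for the whole layer. Every name in it (Customer, PersistenceError, CustomerRepository, InMemoryCustomerRepository) is hypothetical and made up for illustration, not taken from any library. The in-memory class is only a stand-in: a MySQL-backed class would implement the same interface and translate driver errors into PersistenceError.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical domain object; the fields are illustrative only.
struct Customer {
    int id;
    std::string name;
};

// One exception type for the whole layer, so callers never see driver-specific errors.
class PersistenceError : public std::runtime_error {
public:
    using std::runtime_error::runtime_error;
};

// Abstract repository: business code depends on this interface only.
class CustomerRepository {
public:
    virtual ~CustomerRepository() = default;
    virtual void save(const Customer& c) = 0;                // insert or update
    virtual std::optional<Customer> find(int id) const = 0;  // empty if not found
    virtual void remove(int id) = 0;                         // throws if id is unknown
};

// In-memory stand-in for a real database-backed implementation.
class InMemoryCustomerRepository : public CustomerRepository {
public:
    void save(const Customer& c) override {
        if (c.name.empty()) throw PersistenceError("customer name must not be empty");
        rows_.insert_or_assign(c.id, c);
        assert(rows_.count(c.id) == 1);  // postcondition: the row is stored
    }
    std::optional<Customer> find(int id) const override {
        auto it = rows_.find(id);
        if (it == rows_.end()) return std::nullopt;
        return it->second;
    }
    void remove(int id) override {
        if (rows_.erase(id) == 0)
            throw PersistenceError("no customer with id " + std::to_string(id));
    }

private:
    std::map<int, Customer> rows_;
};
```

Code that uses the layer takes a CustomerRepository& and catches PersistenceError, so swapping the in-memory class for a real database class does not touch the business logic.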

Related

Simple GUI interface and web services for database

I've looked around and haven't found quite what I'm looking for. I help teach a health IT course that often involves doctors and nurses who don't have much programming experience. We are planning to teach students about extending clinical information systems by developing a database and exposing it for queries and use through Web services. This is something that is usually done by people with technical experience. We would like to teach students this concept with hands-on exercises, but we would like them to focus more on the concept than on coding. Is there a free or open source database program that allows users to create a simple database, develop a simple GUI for their database (for hypothetical data collection), and expose their data using Web services? Mind you, this needs to be simple enough for non-programmers with minimal coding experience to use.
Thank you for your comments and recommendations!

Well, this is a really broad question. The word "database" doesn't say much about the final task and is really generic, but I suggest Base from the LibreOffice suite or its counterpart from OpenOffice. Both are basically the same product if you only care about the core features.

Starting with Data Mining

I have started learning Data Mining and wish to create a small project in C++/Java that lets me use a data source, say from Twitter, and then publish a particular set of results (e.g. all the news items on a feed). How should I go about it? Where should I start?

This is a really broad question, so it's hard to answer. Here are some things to consider:
Where are you going to get the data? You mention Twitter, but you'll still need to collect the data in some way. There are probably libraries out there for listening to Twitter streams, or you could probably buy the data if someone is selling it.
Where are you going to store the data? Depending on how much you'll have and what you plan to do with it, a traditional relational database may or may not be the best fit. You may be better off with something that supports running MapReduce jobs out of the box.
Based on the answers to those questions, the choice of programming languages and libraries will be easier to make.
If you're really set on Java, then I think a Hadoop cluster is probably what you want to start out with. It supports writing mapreduce jobs in Java, and works as an effective platform for other systems such as HBase, a column-oriented datastore.
If your data are going to be fairly regular (that is, not much variation in structure from one record to the next), maybe Hive would be a better fit. With Hive, you can write SQL-like queries, given only data files as input. I've never used Mahout, but I understand that its machine learning capabilities are suited for data mining tasks.
These are just some ideas that come to mind. There are lots of options out there and choosing between them has as much to do with the particular problem you're trying to solve and your own personal tastes as anything else.
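
In case the term is new: a MapReduce job is a map step that emits key/value pairs, a grouping step, and a reduce step that combines the values for each key. The sketch below runs the classic word count in a single process in plain C++. It is only an illustration of the idea and does not use the Hadoop API; the function names (map_record, reduce_counts, word_count) are made up for this example.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, int>;

// Map step: emit (word, 1) for every whitespace-separated word in one record.
std::vector<KeyValue> map_record(const std::string& record) {
    std::vector<KeyValue> out;
    std::istringstream in(record);
    std::string word;
    while (in >> word) out.emplace_back(word, 1);
    return out;
}

// Reduce step: sum the values emitted for one key.
int reduce_counts(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Driver: map every record, group the pairs by key, then reduce each group.
std::map<std::string, int> word_count(const std::vector<std::string>& records) {
    std::map<std::string, std::vector<int>> groups;
    for (const auto& record : records)
        for (const auto& [key, value] : map_record(record)) groups[key].push_back(value);

    std::map<std::string, int> result;
    for (const auto& [key, values] : groups) {
        assert(!values.empty());  // a key only exists because something was emitted for it
        result[key] = reduce_counts(values);
    }
    return result;
}
```

On a real cluster the map and reduce functions look much the same; the framework takes over the grouping and distributes the work across machines.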

If you just want to start learning about Data Mining, there are two books that I particularly enjoy:
Pattern Recognition and Machine Learning. Christopher M. Bishop. Springer.
And this one, which is for free:
http://infolab.stanford.edu/~ullman/mmds.html

Good references for you are:
AI course taught by people who actually know the subject, Weka website, Machine Learning datasets, Even more datasets, Framework for supporting the mining of larger datasets.
The first link is a good introduction to AI taught by Peter Norvig and Sebastian Thrun, Google's Research Director and the creator of Stanley (the autonomous car), respectively.
The second link gets you to the Weka website. Download the software, which is pretty intuitive, and get the book. Make sure you understand all the concepts: what data mining is, what machine learning is, what the most common tasks are, and what the rationales behind them are. Play a lot with the examples (the software package bundles some datasets) until you understand what generated the results.
Next, move on to real datasets and play with them. When tackling massive datasets, you may run into performance issues with Weka, which in my experience is more of a learning tool. So I recommend taking a look at the fifth link, which will get you to the Apache Mahout website.
It's far from being a simple topic, however, it's quite interesting.

I can tell you how I did it.
1) I got the data using twitter4j.
2) I analyzed the data using JUNG.
You have to define a class representing edges and a class representing vertices.
These classes will contain the attributes of the edges and vertices.
3) Then there is a simple function to add an edge, g.addEdge(edgeFromV1ToV2, V1, V2), or to add a vertex, g.addVertex(V).
The class that defines edges or vertices is easy to create. As an example:
`public class MyEdge {
    int id;  // attribute carried by the edge
}`
The same is done for vertices.
Today I would do it with R, but if you don't want to learn a new programming language, just import JUNG, which is a Java library.

Data mining is a broad field with many different techniques: classification, clustering, association and pattern mining, outlier detection, etc.
You should first decide what you want to do and then decide which algorithm you need.
If you are new to data mining, I would recommend reading some books, such as Introduction to Data Mining by Tan, Steinbach and Kumar.

I would suggest using Python or R for the data mining process. Doing the work in Java or C is a bit more difficult, in the sense that you need to write a lot more code.

How does one port c++ functions to the internet?

I have a few years of experience programming in C++ and a little less than that using Qt. I built a data mining application using Qt and I want to make it available online. Unfortunately, I know close to nothing about web programming. Firstly, how easy or hard is this to do, and what is the best way to go about it?
Supposing I am looking to hire someone to make me a secure, long-term, extensible, website for an online software service, what skill set should I be looking for?
Edit:
I want to make my question a little more specific:
How can I take a bunch of working c++ functions and port the code so I can run it server side on a website?
Once this is done, would it be easy to make changes to the c++ code and have the algorithm automatically update on the site?
What technologies would be involved? Are there any cloud computing platforms that would be good for something like this?
@Niklaos: what does it mean to build a library, and how does one do that?

You might want to have a look at Wt[1]. It's a C++ web framework which is programmed more or less like a desktop GUI application. One of the use cases quoted is bringing legacy apps to the web.
[1] http://www.webtoolkit.eu

Port the functions to Java. This is easily done from C++, and you can even find some tools to help. Don't trust them implicitly, but they could give you a boost.
See longer answer below.
Wrap them in a web application and deploy it on Google App Engine.
The Java version of a library would be a jar file.
If you really want to be able to update the algorithm implementation dynamically, you could implement the functions in Groovy and upload changes through a form in your web app, either as files or as a big text block. You would need to consider version control.

The effort and skill set involved depend on how you wrote your code. If it is in a self-contained library and has a clean (re-entrant, thread-safe) API, you could probably hire a web developer (HTML/PHP/ASP etc.) to write the UI for the library at a relatively small cost. The skills required would depend on the technologies you want to use. For Windows development I would suggest C#/ASP. The applicant would need to know how to interface with native libraries from a managed language. This assumes that you don't mind the costs of Windows deployment for your application.
On the other hand, if the library is complex or needs to be rewritten to support the extensibility you are looking for, asking here will not get you much.
BTW: here is a great article on marshalling, if you choose to implement using C#/ASP:
http://msdn.microsoft.com/en-us/magazine/cc164193.aspx
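
To make "a clean (re-entrant, thread-safe) API" a bit more concrete, here is a rough sketch of wrapping an existing C++ function behind a plain C interface, which is the shape a managed language can call into once the code is built as a shared library. The names (compute_mean, dm_mean, the dm_status codes) are hypothetical, and compute_mean is only a stand-in for a real algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Stand-in for one of the existing C++ functions (hypothetical).
double compute_mean(const std::vector<double>& values) {
    if (values.empty()) throw std::invalid_argument("mean of empty input");
    return std::accumulate(values.begin(), values.end(), 0.0) /
           static_cast<double>(values.size());
}

// Status codes returned across the library boundary instead of exceptions.
enum dm_status { DM_OK = 0, DM_INVALID_ARGUMENT = 1, DM_INTERNAL_ERROR = 2 };

// C-compatible entry point: no global state, so it is re-entrant, and no
// exception is allowed to escape into the caller.
extern "C" int dm_mean(const double* data, std::size_t count, double* result) {
    if (data == nullptr || result == nullptr) return DM_INVALID_ARGUMENT;
    try {
        *result = compute_mean(std::vector<double>(data, data + count));
        return DM_OK;
    } catch (const std::invalid_argument&) {
        return DM_INVALID_ARGUMENT;
    } catch (...) {
        return DM_INTERNAL_ERROR;
    }
}
```

"Building a library" then means compiling this into a .dll or .so rather than an executable, so the web layer loads it and calls dm_mean. Updating the algorithm comes down to rebuilding and redeploying that one file, as long as the exported signatures stay the same.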

First, DO NOT USE PHP :D
I used it for some projects (the last one with the Symfony framework) and I almost shot myself!
If you are very familiar with C++, ASP.NET could be a good solution, because if you like C++ you are going to love C#.
Anyway, I have personally been using Ruby on Rails for 6 months now and I LOVE IT. I won't write you a book here, but the framework is pure gold!
The only problem is that Ruby is a very special language. You will probably be a bit lost at the beginning. But like everyone, you will learn to love it.
But that was only the server side. There are 3 technologies you won't be able to avoid if you want to start developing web applications.
HTML, CSS and JavaScript are present everywhere. This is why I think you should start with HTML and CSS, then JavaScript (with jQuery).
When you've got some basics in these 3 technologies, you should be able to choose the server-side language.
But I've got to tell you one thing: it's not going to be easy!
PS: Ruby on Rails can use HAML and SASS. These 2 languages replace HTML and CSS; you should have a quick look at them because they are awesome.

Are there cross-platform tools to write XSS attacks directly to the database?

I've recently found this blog entry on a tool that writes XSS attacks directly to the database. It looks like a terribly good way to scan my applications for weaknesses.
I've tried to run it on Mono, since my development platform is Linux. Unfortunately it crashes with a System.ArgumentNullException deep inside Microsoft.Practices.EnterpriseLibrary and I seem to be unable to find sufficient information about the software (it seems to be a single-shot project, with no homepage and no further development).
Is anyone aware of a similar tool? Preferably it should be:
cross-platform (Java, Python, .NET/Mono, even cross-platform C is ok)
open source (I really like being able to audit my security tools)
able to talk to a wide range of DB products (the big ones are most important: MySQL, Oracle, SQL Server, ...)
Edit: I'd like to clarify my goal: I'd like a tool that directly writes the result of a successful XSS/SQL injection attack into the database. The idea is that I want to check that every place in my app does correct output encoding. Detecting and avoiding the data getting there in the first place is an entirely different thing (and might not be possible when I display data that's written to the DB by a third-party application).
Edit 2: Corneliu Tusnea, the author of the tool I linked to above, has since released the tool as free software on codeplex: http://xssattack.codeplex.com/
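
To illustrate what I mean by checking output encoding, here is a small sketch. The probe string and both function names are made up for this example; it only covers HTML text and quoted attribute contexts, not JavaScript or URL contexts. The idea is that a row containing the probe is written to the database, and any rendered page that still contains the probe unencoded points to an output spot that needs fixing.

```cpp
#include <cassert>
#include <string>

// Encode the five characters that are significant in HTML text and quoted attributes.
std::string html_escape(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (char ch : input) {
        switch (ch) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&#39;";  break;
            default:   out += ch;       break;
        }
    }
    return out;
}

// Hypothetical probe value that the tool would write into a database column.
const std::string kProbe = "<script>alert('xss-probe-1')</script>";

// True if the rendered page still contains the probe unencoded.
bool contains_raw_probe(const std::string& rendered_html) {
    return rendered_html.find(kProbe) != std::string::npos;
}
```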

I think Metasploit has most of the attributes you are looking for. It may even be the only one that has everything you specify, since all the others I can think of are closed source. There are a few existing modules that deal with XSS, and one in particular that you should take a peek at: HTTP Microsoft SQL Injection Table XSS Infection. From the sound of it, that module is capable of doing exactly what you want to do.
The framework is written in Ruby, I believe, and is supposed to be easy to extend with your own modules, which you may need or want to do.
I hope that helps.
http://www.metasploit.com/

Not sure if this is what you're after; it's a parameter fuzzer for HTTP/HTTPS.
I haven't used it in a while, but IIRC it acts as a proxy between you and the web application in question. It inserts XSS/SQL injection attack strings into any input fields and then judges whether the response was "interesting" or not, and thus whether the application is vulnerable or not.
http://www.owasp.org/index.php/Category:OWASP_WebScarab_Project
From your question I'm guessing it is a type of fuzzer you're looking for, and one specifically for XSS and web applications. If I'm right, then that might help you!
It's part of the Open Web Application Security Project (OWASP) that "jah" has linked you to above.

There are some Firefox plugins to do some XSS testing here:
http://labs.securitycompass.com/index.php/exploit-me/

A friend of mine keeps saying that php-ids is pretty good. I haven't tried it myself, but it sounds as if it could approximately match your description:
Open Source (LGPL),
Cross Platform - PHP is not in your list, but maybe it's ok?
Detects "all sorts of XSS, SQL Injection, header injection, directory traversal, RFE/LFI, DoS and LDAP attacks" (this is from the FAQ)
Logs to databases.

I don't think there is such a tool, other than the one you pointed us to. I think there's a good reason for that: It's probably not the best way to test that each and every output is properly encoded for the applicable context.
From reading about that tool it seems the premise is to insert random xss vectors into the database and then you browse your application to see if any of those vectors succeed. This is rather a hit and miss methodology, to say the least.
A much better idea, I think, would be to perform code reviews.
You may find it helpful to have a look at some of the resources available at http://owasp.org - namely the Application Security Verification Standard (ASVS), the Testing Guide and the Code Review Guide.

When I start developing software, which should come first: UI design or database design?

I tried to design the UI with some UI mockup software, but I found it hard to settle all the details of the design, since the database wasn't designed yet.
But if I design the software first, the same problem occurs: I don't have the UI yet, so how can I create a good UI?

UI first.
Mock up an elegant and easy-to-use user interface (and workflow) thinking from the point of view of the user, and only then think about the underlying database / data structures you'll need to bring that UI to life.
If you can't design your UI because you haven't yet designed your database, you're doing it wrong IMHO. How many annoying pieces of software have you used that suffered from letting the database design drive the UI design?
Edit: As others have pointed out, you need to start with use cases / user stories. The UI design and database design, whichever order you do them, should only happen after you know what your software is trying to do, and for whom.
Edit by Bryan Oakley:
(source: gapingvoid.com)

Put the user at the place he deserves. Design UI first.
Database is only a consequence of user needs.

Use cases first, neither UI nor database.

If you're trying to solve a problem in an object-oriented language, it's recommended that you start thinking about the objects involved. Don't worry about the database or the UI until you've got a solid domain model nailed down that addresses all the use cases.
You don't worry about the database or the UI at first. You can serialize objects to the file system if you need persistence and don't have a database. Being able to drive your app with a command line UI is a good exercise for guaranteeing that you have a good MVC separation.
Start with the objects.
UPDATE:
The one advantage of this approach is that it doesn't prejudice the UI with a particular database design or vice versa. The objects are agnostic about the other two layers. You aren't required to have a UI or relational database at all. You're just getting the objects right. Once you have that, you can create any UI or persistence scheme you like, confident that the domain model can handle the problem you've been asked to solve.
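
As a rough sketch of the "serialize objects to the file system" idea: the Task type and the one-line text format below are made up for illustration, and the sketch assumes titles never contain a newline. The domain object knows nothing about any database or UI, and the read/write functions take plain streams, so a file, a string buffer or a command-line pipe all work the same way.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical domain object with no knowledge of any database or UI.
struct Task {
    int id = 0;
    bool done = false;
    std::string title;
};

// Write one task per line: "<id> <done> <title>".
void write_task(std::ostream& out, const Task& t) {
    out << t.id << ' ' << (t.done ? 1 : 0) << ' ' << t.title << '\n';
}

// Read one task; returns false at end of input or on a malformed line.
bool read_task(std::istream& in, Task& t) {
    int done = 0;
    if (!(in >> t.id >> done)) return false;
    in.get();  // skip the single space before the title
    if (!std::getline(in, t.title)) return false;
    t.done = (done != 0);
    return true;
}

// Round trip through an in-memory stream; std::ofstream and std::ifstream
// can be passed to the same two functions to use a real file.
Task round_trip(const Task& original) {
    std::stringstream buffer;
    write_task(buffer, original);
    Task copy;
    bool ok = read_task(buffer, copy);
    assert(ok);
    (void)ok;  // keeps release builds free of an unused-variable warning
    return copy;
}
```

Once a real database comes along, only write_task and read_task need a counterpart; Task itself stays as it is.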

All the above answers point you in the right direction. That said, I would follow the SDLC thoroughly. It helps you understand why a solution is needed for the problem at hand. Then comes requirements gathering, followed by the design of either the UI or the underlying structure that supports the UI. It's a procedure, but you will benefit in the end.

Your question is very subjective.
My opinion (and it is just that) is that database and underlying structure should come first. It can often help to put down the keyboard and mouse and compile some notes on paper.
Lay out goals like what you want your application to do, list the features you require and then start thinking about how you'll build it.
This method works for me in application design.

Usually you need to manipulate some data in the solutions you develop, so you should first consider how this data is organised; stabilizing this layer at the beginning is fundamental. I agree with duffymo's comment about designing the business objects first if you are in an OO world. Mapping these objects to the DB will be part of your work. Then you add business functionality and work on the presentation layer. Of course you will have to do some refactoring from time to time, but usually the refactoring impacts the business and presentation layers more than the database.
Read this, it is a good technique:
DDD - Domain-Driven Design

Would you build a house without a foundation? Database design isn't the fun part but it is the foundation of most business apps and if you get it wrong it becomes the most costly to fix and the most costly to maintain.
That said, I note that there is no reason you can't work on both together as they intertwine. But before you can do either, you need to understand the requirements and the business you are writing the app for.
