Simple GUI interface and web services for database - web-services

I've looked around and haven't found just quite what I'm looking for. I help teach a health IT course that often involves doctors and nurses who don't have much experience programming. We are planning to teach students about extending clinical information systems by developing a database and exposing it for queries and use using Web services. This is something which is often done by those with technical experience. We would like to teach students this concept with hands on exercises. However we would like the students to focus more on the concept than coding. Is there a free or open source database program that allows users to create a simple database, develop a simple GUI for their database (for hypothetical data collection), and expose their data using Web services. Mind you, this needs to be simple enough for non-programmers to be able to use with minimal coding experience.
Thank you for your comments and recommendations!

well, really broad question, the database word doesn't say much about the final task and it's really generic, but I suggest Base from the libreoffice suite or the equivalent counterpart from openoffice and both are basically the same product if you care about the core features.

Related

How do I implement a robust Data Persistence Layer in C++?

I am at the first time creating a huge program in C++ for my company and I want to create a good pattern to connect into my MySql db. I have these problems:
-I can't decide which pattern should be used, DAO,
Repository, UnitOfWork, Factory..
-I can't find a good examples of data access pattern in C++, I know it should be independently of language but I couldn't find even a robust
DAO pattern example with a good exception handling etc... Commonly are
only two classes (obj1->obj2) on very small environment.
If someone knows good sources or any tips I will be very glad =D
Thanks in advance.
My advice is search for c++ ORM(Object Relational Mapping) there are plenty ORM or DAL solutions on java like Hibernate, Datanucleus, SQLite ..
We are using Datanucleus and we are happy with it but i dont think datanucleus have a support for c++. Imho creating DAL form scratch is unnecessary.
I had the same problem years ago. The list of ORM for C++ in Wikipedia is very short and the most promising product is under GPL or you have to buy it.
We decide to develop our own ORM. There are several enterprise design patterns for it. We choose the way obd uses: Your tables was described from simple classes. The persistence and access of objects are handled from an database-manager. The most costliest todo is to write your own query-interface (if you don't wand to type clear sql in your code).

Starting with Data Mining

I have started learning Data Mining and wish to create a small project in C++/Java that allows me to utilize a database, say from twitter and then publish a particular set of results (for eg. all the news items on a feed). I want to know how to go about it? Where should I start?
This is a really broad question, so it's hard to answer. Here are some things to consider:
Where are you going to get the data? You mention twitter, but you'll still need to collect the data in some way. There are probably libraries out there for listening to twitter streams, or you could probably buy the data if someone is selling it.
Where are you going to store the data? Depending on how much you'll have and what you plan to do with it, a traditional relational database may or may not be the best fit. You may be better off with something that supports running mapreduce jobs out-of-the box.
Based on the answers to those questions, the choice of programming languages and libraries will be easier to make.
If you're really set on Java, then I think a Hadoop cluster is probably what you want to start out with. It supports writing mapreduce jobs in Java, and works as an effective platform for other systems such as HBase, a column-oriented datastore.
If your data are going to be fairly regular (that is, not much variation in structure from one record to the next), maybe Hive would be a better fit. With Hive, you can write SQL-like queries, given only data files as input. I've never used Mahout, but I understand that its machine learning capabilities are suited for data mining tasks.
These are just some ideas that come to mind. There are lots of options out there and choosing between them has as much to do with the particular problem you're trying to solve and your own personal tastes as anything else.
If you just want to start learning about Data Mining there are two books that I particularly really enjoy:
Pattern Recognition and Machine Learning. Christopher M. Bishop. Springer.
And this one, which is for free:
http://infolab.stanford.edu/~ullman/mmds.html
Good references for you are
AI course taught by people who actually know the subject,Weka website, Machine Learning datasets, Even more datasets, Framework for supporting the mining of larger datasets.
The first link is a good introduction on AI taught by Peter Norvig and Sebastian Thrun, Google's Research Director, and Stanley's creator (the autonomous car), respectively.
The second link you get you to Weka website. Download the software - which is pretty intuitive - and get the book. Make sure you understand all the concepts: what's data mining, what's machine learning, what are the most common tasks, and what are the rationales behind them. Play a lot with the examples - the software package bundles some datasets - until you understand what generated the results.
Next, go to real datasets and play with them. When tackling massive datasets, you may face several performance issues with Weka - which is more of a learning tool as far as my experience can tell. Thus I recommend you to take a look at the fifth link, which will get you to Apache Mahout website.
It's far from being a simple topic, however, it's quite interesting.
I can tell you how I did it.
1) I got the data using twitter4j.
2) I analyzed the data using JUNG.
You have to define a class representing edges and a class representing vertices.
These classes will contain the attributes of the edges and vertices.
3) Then, there is a simple function to add an edge g.addedge(V1,V2,edgeFromV1ToV2) or to add a vertex g.addVertex(V).
The class that defines edges or vertices is easy to create. As an example :
`public class MyEdge {
int Id;
}`
The same is done for vertices.
Today I would do it with R, but if you don't want to learn a new programming language, just import jung which is a java library.
Data mining is broad fields with many different techniques; classification, clustering, association and pattern mining, outlier detection, etc.
You should first decide what you want to do and then decide wich algorithm you need.
If you are new to data mining, I would recommend to read some books like Introduction to Data Mining by Tan, Steinbach and Kumar.
I would like to suggest you to use python or R for data mining process. Doing work with java or c , it bit difficult in the sense you need to do a lot coding

Implementing a user registry in a c++ based server

So I'm planning to develop a Community feature for a game I'm developing. Currently, the high score server, which I want to integrate this user registry with is developed in pure c++.
Is there a c++ library for developing user registries? Currently, I am thinking of implementing the user registry by saving to the file system, but if there are libraries for this it'd be even better.
This sounds like a very open ended question.
If you intend to have quite a lot of users and possibly do reporting on data or have to manage pieces of data manually.... you may want to look into sqlite. There would be slightly less administration and overhead than using a fully featured DBMS.
Are you sure you want the high score server in C++? Seems a bit out of place for me.
I would suggest something higher like PHP or Python with a SQLite database. Undoubtly example scripts for user registration can be found on the internet.

How does one port c++ functions to the internet?

I have a few years experience programming c++ and a little less then that using Qt. I built a data mining software using Qt and I want to make it available online. Unfortunately, I know close to nothing about web programming. Firstly, how easy or hard is this to do and what is the best way to go about it?
Supposing I am looking to hire someone to make me a secure, long-term, extensible, website for an online software service, what skill set should I be looking for?
Edit:
I want to make my question a little more specific:
How can I take a bunch of working c++ functions and port the code so I can run it server side on a website?
Once this is done, would it be easy to make changes to the c++ code and have the algorithm automatically update on the site?
What technologies would be involved? Are there any cloud computing platforms that would be good for something like this?
#Niklaos-what does it mean to build a library and how does one do that?
You might want to have a look at Wt[1]. Its a C++ web framework which is programmed more or less like a desktop GUI application. One of the use cases quoted is to bring legacy apps into the web.
[1] http://www.webtoolkit.eu
Port the functions to Java, easily done from C++, you can even find some tools to help - don't trust them implicitly but they could provide a boost.
See longer answer below.
Wrap them in a web application, and deploy them on Google App-Engine.
Java version of a library would be a jar file.
If you really want to be able to update the algorithm implementation dynamically, then you could implement them in Groovy, and upload changes through a form on your webapp, either as files or as a big text block, need to consider version control.
The effort/skillset involved to perform the task depends on how your wrote your code. If it is in a self-contained library, and has a clean (re-entrant, thread safe) API, you could probably hire a web developer (html/php/asp etc) to write the UI interface to the library for a relatively small cost. The skills required would be dependant on the technologies you wanted to use. For Windows development I would suggest C#/ASP. The applicant would require knowledge of interfacing with native libraries from a managed language. This is assuming that you dont mind the costs of Windows deployment for your application.
On the otherhand, if the library is complex or needs to be re-written to support the extensibility you are looking for, asking here will not get you much.
BTW: here is a great article on Marshalling if you chose to implement using C#/ASP
http://msdn.microsoft.com/en-us/magazine/cc164193.aspx
First, DO NOT USE PHP :D
I used it for some projects (the last one with symphony framework) and i almost shoot my self !
If you are very familiar with C++, ASP .NET could be a good solution because if you like C++ you are going to love C#.
Any ways, I personally use Ruby on Rails for 6 months now and I LOVE IT. I won't write you a book here but the framework is pure gold !
The only problem is that Ruby is a very special language. You will probably be a bit lost a the beginning. But as every one you will learn to love it.
But that was only for the server side. Indeed, there 3 technologies you won't be able to avoid if you want to start to develop web applications.
HTML, CSS and JavaScript are presents every where. This is why i'm thinking you should start by HTML and CSS then JavaScript (with jQuery).
When you've got some basics with these 3 technologies you should be able to choose the server side language.
But you've got to tell you one thing, it's not going to be easy !
PS : Ruby on Rails uses HAML and SASS. These 2 languages replaces HTML and CSS you should have a look at them quickly because they are awesome.

Data mining/BI/Analytics/ML : Can a mathematically challenged person move into this field?

I have recently become interested in the field(s) of data mining and machine learning. The idea of going through huge datasets and trying to correlate hidden patterns and trends is fascinating. So far I have done the following
Used Weka to load simple data sets and generate decision trees
Continously read books, wiki's, blogs and SO on the same
Started playing around SQL Server DM and Python API's
Have an idea on options of freely available data sets on the web(freedb, UN etc)
What is hindering me is the minute I try to go beyond classification/associsciation and into priori/apriori algorithms I am stuck because understanding mathematical equations and logic is not(to put it modestly) one of my strong points.
So my question would be are there anybody in the Data mining field(in the role of product owner or builder) who are not naturally mathematicians? If so, how would you approach in undestanding the field since free tools like Weka and Rapid-miner both expects some mathematical/statistical background?
P.S: Excuse me if I made some mistake in the query like mixing Data mining and analytics when they are separate as I am still getting my feet wet. I hope my core question is clear.
Well, being able to do some analysis of what the data mining models are showing is absolutely vital. However, these days all of the math and statistics are taken care of by the data mining models. You don't need to understand the math behind them (although it helps).
For example, you can look through the SQL Server Analysis Services Data Mining Algorithms and see that even the technical reference is how to use these implementations, not how to recreate them.
If you can understand the business cases and you can understand what the data mining is telling you, there's really no need to delve into the math behind it.
As for some of the free tools, I've never used them, so I can't speak to them. However, I'm a big fan of SSAS and those data mining models, which don't require an extensive mathematical background.
As Eric says, and as far as you only intend to use the existing algorithms and APIs and make sense from them, I don't see problems with the required math/statistics skill set (anyway, you'll need some previous basic knowledge/level).
Now, if you intend to do research or if you want to improve or modify existing algorithms, or why not, create your own algorithms, then math and statistics is a MUST. I just started doing some research in this area, and I'm still trying to fill my skills gap =)