What is a proper way to approach the following scheduling task?

I am thinking about the following scheduling problem:
I have X people.
I have Y meeting slots with Z meeting roles available in every meeting.
For some roles, the same person may combine two of them in a single meeting, but most are one person = one role.
For each person x in X, I know a set of facts about them:
a) The last date they attended the meeting and had a specific role (historical);
b) Their availability for any meeting y in Y;
c) Their specific preference for the roles z in Z or a set of roles (no specific dates) for the group of meetings.
I'd like to build a scheduler with the following objectives in mind:
a) All meeting roles are filled.
b) Preferences are accommodated if possible;
c) Distribution of people / roles should be uniform (i.e. if one person is scheduled for every meeting and another just for one meeting once in a while -- it's unacceptable; if one person is scheduled for the same role over, and over, and over again -- it's unacceptable).
Now, I have a gut feeling that the task is not easy at all :), so my specific questions are:
What language would be better suited for the task? (Somehow I feel Prolog can deal with it, but I am not entirely sure.)
What is the proper approach to solve this task and how close can I get to my objectives in #4 above?
Any good read on the kind of problem I am looking to solve?
Thank you!
P.S. If you are curious, the use case is scheduling a roster for a set of Toastmasters meetings (I am too lazy to do it by hand and I'd like the computer to help me with this task, at least partially).

A rule engine, like Drools Expert or Prolog, is good for defining the constraints (= score function). However, it's terrible at finding the best solution.
Since your problem is probably NP-complete (especially if the meetings need to be put into a timeslot and/or one person can't attend two meetings at the same time), you need to use a planning optimization algorithm on top of that, such as construction heuristics and metaheuristics. Take a look at the curriculum course example in Drools Planner (Java, open source, ASL).
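To make the "score function plus construction heuristic plus metaheuristic" idea concrete, here is a minimal Python sketch. It is not Drools Planner code. The function names (`penalty`, `build_greedy`, `improve`), the penalty weights and the sample data are all assumptions made up for illustration. The hard constraints (every role filled, one role per person per meeting, availability respected) are kept valid at every step, and the soft objectives are folded into a single penalty that a simple hill climber tries to lower.

```python
import random

def penalty(roster, people, preferences):
    """Soft-constraint score, lower is better. The weights are arbitrary."""
    load = {p: 0 for p in people}
    pairs = {}
    unmet = 0
    for meeting in roster:
        for role, person in meeting.items():
            load[person] += 1
            pairs[(person, role)] = pairs.get((person, role), 0) + 1
            if preferences.get(person) and role not in preferences[person]:
                unmet += 1
    spread = max(load.values()) - min(load.values())        # uneven workload
    repeats = sum(n - 1 for n in pairs.values() if n > 1)   # same role again
    return 10 * spread + 5 * repeats + unmet

def build_greedy(people, roles, availability):
    """Construction heuristic: give each role to the least-loaded available person."""
    load = {p: 0 for p in people}
    roster = []
    for m, available in enumerate(availability):
        meeting = {}
        for role in roles:
            candidates = [p for p in people
                          if p in available and p not in meeting.values()]
            if not candidates:
                raise ValueError(f"meeting {m}: nobody left for {role}")
            chosen = min(candidates, key=lambda p: load[p])
            meeting[role] = chosen
            load[chosen] += 1
        roster.append(meeting)
    return roster

def improve(roster, people, availability, preferences, steps=2000, seed=0):
    """Hill climbing: keep random changes that do not worsen the penalty."""
    rng = random.Random(seed)
    best = penalty(roster, people, preferences)
    for _ in range(steps):
        m = rng.randrange(len(roster))
        role = rng.choice(list(roster[m]))
        person = rng.choice(people)
        if person not in availability[m]:
            continue
        old = dict(roster[m])
        if person in old.values():
            # Already in this meeting: swap the two roles instead.
            other = next(r for r, p in old.items() if p == person)
            roster[m][other] = old[role]
        roster[m][role] = person
        new = penalty(roster, people, preferences)
        if new <= best:
            best = new
        else:
            roster[m] = old   # undo the move
    return roster

# Hypothetical sample data.
people = ["Ann", "Bob", "Cid", "Dee"]
roles = ["Toastmaster", "Speaker"]
availability = [set(people), {"Ann", "Bob", "Cid"}, set(people), {"Bob", "Cid", "Dee"}]
preferences = {"Ann": {"Speaker"}, "Dee": {"Toastmaster"}}
roster = improve(build_greedy(people, roles, availability),
                 people, availability, preferences)
```

Plain hill climbing gets stuck in local optima. Metaheuristics such as tabu search or simulated annealing differ mainly in when they accept a worsening move, so they would replace the `if new <= best` test.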

From my point of view, the language you are going to program in doesn't really matter that much: for simple problems the language to use is more of a personal preference instead of an exact science. If you like/want to learn Python, use that. If you "feel like" Prolog today, use that.
What will be a factor in your choice though is how you want to preserve and present your data. From your question it can be told that you need the following:
A database (or at least, a persistent resource) to store your available participants and roles, past and future meetings storing the roles for every participant, and some way to schedule availability.
Some way to present your data (command line, GUI, or website).
Some business logic that describes the way of assigning roles, criteria for the attendance and such.
You will want to use some third-party components for most of these, since your time is to be spent on the added value of your product; creating a shiny ORM or GUI toolkit is not your goal in this. So the programming language you will choose should have a proper support for these items (especially the first two). I can't say it for Prolog, but Python will have you fully covered in these areas. I think it goes beyond the scope of this question to suggest specific toolkits, so I'll leave it at that for now.
After this step, you analyze your problem, which you seem to have done quite nicely already. So, start implementing it. To be able to verify your specific use cases, it sounds like you could benefit from some Test or Behavior Driven Design, so you may want to read up on that.
For learning the language, just search StackOverflow for "[language] tutorial": there are already plenty of answers linking to very nice resources for getting started with any language you will choose.
Final advice: perseverance is the hardest part, so try to set yourself some goals or milestones, or try to involve other people in one way or another. That way you'll enlarge the possibility of following through with creating a nice piece of software.

Even though I'm a Python fan, I'd strongly suggest Prolog for this task. I'm familiar with Prolog, and this problem is definitely solved more nicely with it. But it depends on how you will use the program. It's your choice: decide based on whether Python or Prolog is easier for you to install (if you just run it on your local PC, it doesn't matter that much, I guess), or on other requirements you have.
It's fairly simple with Prolog, if you know Prolog. Once you have learnt it, you can solve this with some thinking and without much trouble, I guess (if you really understood Prolog!).
Basically, you should start with Prolog, of course. I'd suggest using SWI-Prolog; it's one of the most commonly used Prolog implementations. Also, there is a nice tutorial for it: http://www.learnprolognow.org/
It seems to me, but I'm not 100% sure, that you are not familiar with Prolog yet. You need time to learn Prolog first, so it also depends on how soon you need to have your program. It's possible to get through the tutorial in less than a month, as far as I remember. Of course, this depends heavily on how much time you invest per day; you can do it in less or even more time.
Prolog is based on rules. Each of your requirements can be expressed as a rule. Once you have your set of rules, you can ask which combinations (of persons and meeting room) conform to all those rules. For the historical data of the different persons, you could use a small database.
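For readers who don't know Prolog yet, the "state the rules, then ask which combinations satisfy them" idea can be imitated in Python with plain generate-and-test. This is only an illustration of the principle, not how Prolog searches internally. The names, the two rules and the sample history are assumptions invented for the sketch.

```python
from itertools import permutations

# Hypothetical facts for one meeting.
people = ["Ann", "Bob", "Cid"]
roles = ["Toastmaster", "Speaker", "Evaluator"]
available = {"Ann", "Bob", "Cid"}
last_role = {"Ann": "Speaker", "Bob": "Toastmaster"}  # historical data

def everyone_available(assignment):
    """Rule: only people who can attend may be given a role."""
    return all(person in available for person in assignment.values())

def no_repeated_role(assignment):
    """Rule: nobody gets the role they had last time."""
    return all(last_role.get(person) != role
               for role, person in assignment.items())

rules = [everyone_available, no_repeated_role]

def solutions(people, roles, rules):
    """Yield every role -> person assignment that satisfies all the rules."""
    for combo in permutations(people, len(roles)):
        assignment = dict(zip(roles, combo))
        if all(rule(assignment) for rule in rules):
            yield assignment

valid = list(solutions(people, roles, rules))
```

With these sample facts, three of the six possible assignments survive. Brute-force enumeration grows factorially, which is why a Prolog system or a constraint solver is the more practical tool once the roster gets bigger.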

This sounds like an optimization problem, and I agree with Geoffrey that it would be an NP-complete problem. I recently developed a scheduling algorithm for a university that does final exam scheduling. I used a genetic algorithm with domain-specific heuristics to solve that problem. My implementation performed nicely with a student count of 3000+ and a course count of 500; it took about 2 hours to find a near-optimal solution.

I agree with people who suggest Prolog for this task; I would suggest taking a look
at ECLiPSe (it is, besides being a Prolog implementation, a constraint programming
language which has more powerful problem-solving capabilities than plain Prolog).
ECLiPSe now has a very nice introduction, with many examples and very much to the point,
available as a free PDF, written by Antoni Niederlinski:
http://www.anclp.pl/
Among the examples on ECLiPSe site, I found the following which seems to be relevant: http://eclipseclp.org/examples/roster.ecl.txt.
ECLiPSe is thoroughly documented and, according to this documentation,
can also be integrated with C++/Java.

How do you understand a large chunk of code?

I am a fresh college grad student that just started my job. In my ramp up period, I need to learn a lot of product code. There are some design docs but they do not help much.
Can you provide some general techniques to browse and understand huge product code (specifically C++)?
Run it through doxygen. This will generate html documentation which will be helpful even if the code does not have proper doxygen-style comments.
Another good advice is to look through the unit tests, if there are any. If there are no unit tests, a good way to understand the code is to write your own unit tests. The effort to do this will pay for itself many times over.
Use every method available to you (in no particular priority):
Use the product itself and understand what it does
Talk to the devs that maintained it or have worked with it previously
Debug through it and see how data flows and how classes interact ("when I click this button, what exactly happens, who is responsible?")
Look at architecture, UML, or class diagrams
One of my favorites: create your own diagrams of class hierarchies, class interactions, general control flow, high-level components, process/DLL interactions, object lifetimes and management
If they're not totally out-of-date, read the dev/test/user specs (goes well with #1)
Read the documentation on it
Most of all: be tenacious and persistent. If you don't put in the work, don't expect to understand it. If you don't understand something, dig and dig until you do. Software is not magic, it's just hard work :)
Some people will tell you to start with the data structures, but in a large system even that's not terribly helpful much of the time. I can think of four major points:
Take your time. Often, it's more like a whole series of gestalt shifts than it is a single, linear, gradual understanding. So be patient.
No matter how big it is, you should be able to put a breakpoint in and walk it in a debugger. Even in a large, complicated, multi-threaded system, you should be able to walk through and see what's happening.
Ask for bugs, and start fixing them, no matter how crazy they seem. It's akin to dropping yourself into a foreign country; you'll pick up the language eventually.
Find a mentor. A jungle guide is invaluable.
I think there have been a few good responses already. My 2c worth...
Not sure what you class as huge (10 KLOC, 1000 KLOC, 10000 KLOC, etc), but one would hope that this is broken down in some way and is not a monolithic single program. Perhaps your management has some guidance on which 'module(s)' you are most likely to be spending time in at the moment. Hopefully this can help break down the problem scope.
Firstly, before you try to understand the code try to understand the product. What does it do? Then how does it do it? What does it interact with? Then how does it interact? etc...
When getting to the code try to understand the high level design and philosophy first, and work on the breadth before the depth. I agree with some of the above re fixing some bugs, but I also strongly suggest you continue to get a handle on the high level even if you need to get into the details to fix some bugs.
I also agree with the above in terms of generating some diagrams for yourself if you can't find any already in existence. And then share them, perhaps on a team/product wiki? I'm curious as to why the existing doco does not help very much. Typically this is because this type of doco was generated from the early concepts and the product no longer bears any similarity to it, but if this is not the case, then what can you contribute to this issue? One assumes that where you are today someone else will be in short enough order, and you are in an ideal position to know what essential doco is missing!
If the product is actually 'huge' then you have to accept that you will never be able to hold all of it in your head, so the best thing you can do is be familiar enough to know where to start looking (comes back to understanding the product, and approaching code breadth first).
This is obviously a pretty common question, and it's similar to this one (and the questions related to it): How to understand the design and code flow of any product quickly?
Dig through some of those answers / comments, for starters. Else, we'll just end up repeating them. :)

(student) interview questions - programming for a robotics lab

My robotics lab is looking for programmers to work on some projects we have at the moment.
We nailed down the requirements (mainly C++ and experience with OpenGL and 3D), but due to obvious money constraints we can't afford to hire Great Developers. Instead we're going to settle for Talented Students, offering them projects for their dissertation/thesis and hoping for some fresh ideas and creativity from their side. We can also afford to pay students that just graduated (first job experience).
So my question is:
In your experience, how did you spot a Talented Student (computer scientist or engineer)? What questions did you ask? What else did help you in finding a candidate that turned out to be a Good Programmer? (note: they might not know much about a specific language, but might have the ability to learn pretty fast)
or, if you were the interviewee,
Which questions were asked that made you jump on the bandwagon? Or, if you had an awful experience, what - in retrospect - was an obvious warning signal that you ignored?
Please note that I am not looking for an argumentative answer. We can talk all day long about what's best for us and never agree.
Instead, I'm looking for tales from your experience. Anecdotes, stories, hints, everything will help.
Background:
A bit more background: working for academia here is slightly different than working for the private sector (here = Italy). There are no 'deadlines' to 'sell' a product; instead, it's all proof-of-concept based. Nothing you start working on has the guarantee to be functional.
A comic best describes it: reinventing the wheel
I am considering doing Coding Questions for their interview, but all my colleagues are scoffing at me (too scary, nobody will ever come to work for us again, nobody really knows how to code, etc).
Coding-wise, programming done by researchers is ... ugly. I am fighting to get a version control system in constant use, people have to be chased down to report bugs and document their code, everything is coded-so-that-it-works and rarely we go back to old code to 'fix bugs'. Basically once it's somewhat working, the project is closed and people go work on another project.
Lots has been reinvented and rewritten over and over again (just because nobody knew it was already there). People come and go, future is uncertain, but we play with robots so it's very cool :)
Furthermore, being really understaffed, nobody can follow you and guide you in your project. At best you're the one that has to come up with a plan, background literature and a working prototype.
Hence, we are looking for people that:
have some background to get started
can be highly independent
do want to learn and build their own expertise in new fields
Actually, here's my best advice:
Recruit among your students.
Since you work for an academic institution I assume that either you or your colleagues teach. This provides you with a wealth of information about what potential recruits are capable of -- how fast they learn, how motivated they are, what they are good and bad at, what the code they turn in for the lab assignments and projects looks like, etc.
Firstly, in industry, coding questions are very much the norm - I'd be worried if coding questions weren't asked!
I've been responsible for doing technical interviews for about ten years now. And yes, I ask coding questions. But the questions themselves aren't really the point. What I'm more interested in is getting an idea of whether the candidate can think, and articulate their thoughts.
One question I ask (assuming the simple earlier stuff has gone well) is about inheritance hierarchies. There is no one right answer, although there are a lot of wrong ones. But what's important is how they approach the question, and the points they come up with in favour of one design over another.
Background knowledge is useful, in that it shows that they have an interest in the area in which you're working - but really knowledge can be gained. Intelligence is much more important.
However it's quite possible to have intelligent people who are impossible to work with! I haven't figured out how to determine who they are.
I have done such a project when I was a student, i.e. a 4-month project, working half-time. It was not about robots, per se.
I think that the most obvious requirement is motivation/passion. Since they'll be mostly on their own you will need them to be somewhat independent and able to think for themselves, this requires motivation first and foremost.
In order to determine whether the candidate is motivated or not, begin by asking them about the project itself. If they only gave it a cursory glance, they're likely not motivated. Also look at their experience / courses: optional courses in CS, projects they've done, etc... anything indicating that they really care about CS / development in general and are not there just because they've heard it paid well.
Then comes the question of ability. Like you said it might not be easy to spot those who will be smart enough to figure stuff out by themselves and DO things. Once again, you can ask them about past projects, having them detail the issues they faced and how they solved it.
Finally, I agree with you that some demonstration of their abilities is in order. They might be a bit tense initially, so I would do this at the end, once the interview is already going, you might have had a chance to get them to relax with the previous questions this way.
You don't necessarily need them to do coding questions; I think it's mostly about reasoning. Try to pick problems related to your area of work, for example one you really had in the past, and get them to analyze the problem. If possible, they should take the lead and ask you questions about the problem itself here.
We've had an issue with the robot not being able to analyze the images the camera took: it could not correctly determine the moving objects itself. Do you have an idea how you would do that?
Then you'll need to get them to think about a solution. You need a whiteboard here, and ask them to think aloud so that you can follow their reasoning. You'll probably need to nudge them a bit from time to time to keep them on track. Their reaction to your input is also a key point here, since you want them to be able to accept criticism and build on it; otherwise you might have issues directing them afterward.
Frankly, try to avoid asking them for the quicksort algorithm, or the introsort, or the radix sort... If they need sorting, they'll just fire up their computer and browse the internet. On the other hand, getting them to analyze an existing algorithm unknown to them (for example, the median of 5 sort) and checking that they understand why it works, could be worth it. If they need to work on their own, they'll need to be able to learn on their own too :)
As others say, try to hire someone that's motivated!
For master thesis students I put more emphasis on knowing the basic skills (programming, knowing how to use version control) as they don't stay on long enough to learn everything along the way.
If they're going to work mostly on their own and you have no special requirements on language, I wouldn't focus much on language questions. But every decent programmer knows at least one language fairly well, so get a sample of their prior work or make them code some simple application to test that they don't suck.
I'd focus more on algorithm and data structures. Ask rudimentary stuff that every programmer should know - when to use a list and when to use a vector, why summing a row-major matrix by iterating over the columns first is bad, basic complexity analysis questions, etc. That will sort out many of the bad weeds.
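As an illustration of the row-major question: in row-major storage (the layout C and C++ use for built-in 2D arrays), element (r, c) of a matrix with `cols` columns sits at flat offset `r * cols + c`. The small Python sketch below only does that address arithmetic; it does not measure timing. The function names are made up for the example. It shows that looping over columns in the outer loop makes consecutive accesses jump by `cols` elements instead of 1, which is what hurts cache locality on large matrices.

```python
def flat_offsets(rows, cols, by_rows):
    """Offsets into row-major storage, in the order the nested loops visit them."""
    if by_rows:
        # Outer loop over rows, inner loop over columns.
        return [r * cols + c for r in range(rows) for c in range(cols)]
    # Outer loop over columns, inner loop over rows.
    return [r * cols + c for c in range(cols) for r in range(rows)]

def strides(offsets):
    """Distance in elements between consecutive memory accesses."""
    return [b - a for a, b in zip(offsets, offsets[1:])]

row_first = strides(flat_offsets(3, 4, by_rows=True))
col_first = strides(flat_offsets(3, 4, by_rows=False))
# row_first is all 1s: memory is read sequentially.
# col_first is mostly jumps of 4 (= cols), with a jump back at each new column.
```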
Ask some design questions too perhaps, e.g. what is "coupling" and why is it bad, ask them if they know what a design pattern is, etc.
Check that the applicants have a solid grasp of linear algebra and coordinate system changes in particular if they're going to work with any 3D stuff like OpenGL. In my experience, learning the API is simple, wrapping your head around how the transformations work less so.
Obviously, if you expect them to perform any real robotics-specific work, you should check for that knowledge as well. E.g. estimation (understanding simple EKF and particle filters is a requirement in my book), control theory, pattern recognition, machine learning, vision, or whatever is useful for the particular task.
If I were hiring someone for theoretical work I would perhaps loosen up on the CS/programming skills and focus more on math knowledge. Someone with solid math skills will pick up the CS easily, and programming is just programming.
Ask for references or to see some of the prior work. Many great students already have some interesting project to show after graduation.
I'm not sure how common this is at universities in general, but I would look for a games programming (or robotics) course on the transcript where the candidate, as a student, succeeded in completing a project with a team. I would ask the candidate to describe how that project worked (important technical details) and the role they had in the team. The only way to really tell if someone is good at something is to see what happens when they try it, and since you're in academia, recruiting students, that should not be a problem.

Questions to answer before proposing to use a new language?

What are the technical questions I simply must have answers for before I approach someone about introducing a new language?
I'm looking for the list of technical questions that without a really good answer, I should not even waste anyone's time by proposing that we use language X.
PS: (def X clojure)
A crash course in politics for engineers...
Despite all the mission-statement baloney meant to sound noble and emphasize community support, the real purpose of every business is return on investment or, equivalently, maximizing shareholder value. If it's a government agency, it's kind of still the same question but the legal owners will have no direct influence and instead you will have proxy owners, such as higher agencies or powerful individual officials.
Decisions, however, are almost always made by agents, and so the principal-agent problem (also called the agency dilemma) appears; the agents (the management) will make a decision in their interest, and not necessarily according to the shareholder's interest as is theoretically required. In a government agency this is almost 100% of the consideration.
Sadly, this stirs in all the Dilbert and Parkinson's Law complexities.
The best you can conclude is that decisions will be justified on the basis of risk, cost, and benefit, but will tend to be made on the basis of what credit and blame is in store for the agent and understood by the agent, which is a narrow risk consideration of questionable value to the principal but at least an identifiable one.
So, we should now apply this to the language question. Your manager is likely to avoid threats, risks, scandals, and controversies. His application of the principal's concerns will be mainly through the constraints of budgets and expectations. Here are some examples that should be mostly self-explanatory.
If you want to use Java or PHP:
Everyone is doing it this way
This is the industry-standard approach for this type of problem
This is the low-risk approach
Similar systems have been done many times in Java/PHP
(That's the "no one ever got fired for buying IBM" argument.)
If you want to use Ruby:
Ruby is in the Tiobe top-ten (not quite an industry standard, so this is the best you can do)
PHP and Java are higher-cost technologies (he probably has a budget as an attempt to mitigate the principal-agent problem)
PHP and Java are going to be out of fashion "soon" (maybe not, but phrased as a "risk of appearing to stupidly use old tech", and implying the lack of later credit and recognition)
Ruby is an advanced language with powerful abstractions for cost-effective development (a weak argument for the agent, but offers the possibility of credit. The least effective of all the arguments.)
If you want to use Clojure:
You better prototype the system on weekends and evenings and present it as a solved problem.
Emphasize parallel Java / Clojure development ("if necessary the entire system can be written in Clojure Java")
Make all the Java arguments and then say something about "the best of both worlds"
Productivity with a language is neither the only factor, nor a simple scalar in itself. Important questions include:
How easy is it to learn the language, if it's not already familiar to people on the team?
How easy is it to become expert at the language?
Does the team have access to one or more language experts who have the bandwidth to do the necessary mentoring?
Are good learning materials (books, blogs, tutorials) and support channels (fora, IRC, mailing lists) available?
Does the language (or some framework in that language) allow a competent programmer to write the software faster than what you're using now?
How maintainable is the language? How readable is the syntax to a competent programmer encountering someone else's code for the first time? (Think of APL and Perl.)
Is the language somehow better applicable to your problem domain than what you're using now (e.g., functional languages for distributed computing)?
How well does the language/platform meet business needs not related to development speed (e.g., performance, scalability)?
What are the available tools like, and what do they cost? Is there a debugger available? An IDE? Refactoring and unit test support built into the IDE? Build management and deployment tools?
So much depends on what you're currently using, what you're switching to and why that it's difficult to answer. But these are always important:
What can I do if I choose a new language that I could not do before?
What could I do faster than I can currently with the new language?
How will the rest of the team cope with the introduction of the new language?
If I left, could someone else new to the language pick up where I left off without too many problems?
What is the business case?
It comes down to ROI (Return On Investment).
It is not only about an individual's productivity but:
the whole team
impact on product lifecycle
maintainability
etc.
How easy is it to pick up? I find this is not that important.
Does it have IDE support? Pretty important, but you can work without it.
Is there a debugger available? I think this is the most important question I would ask. Once you have a working debugger, you can usually get anything done.
We hired a team this year and decided to use Clojure as our weapon of choice. The team's background was primarily Java-based but also a wide variety of other languages for hobby work.
The criteria we considered were:
Can we leverage the Java/JVM background of the team and integrate with an existing Java-based product?
Can we achieve performance on par with Java?
Can we build thread-safe concurrent maintainable programs?
Can we leverage a higher level of abstraction?
Can we hire/train people to work in the language?
Can we maintain a large codebase in the language?
Are sufficient tools available to work effectively in the language?
Is there an active community of people growing the language and libs?
We seriously considered Groovy, Scala, and Clojure. I really enjoy Groovy for lightweight apps but I had serious questions about performance. Scala and Clojure both have lots to offer on all of the points above. In the end, our problem domain involves a lot of symbolic manipulation and we felt that Clojure would be a better match but I suspect Scala also would work well.
What will your new language offer that an existing language doesn't already?
We have languages that do just about everything in every way today. So before introducing a new language, make sure there isn't one already existing that does everything your new language does. And make sure you know exactly what features your new language will offer that aren't offered in the same combination or at all by other languages.
Unless of course you're just doing this for your own education - in which case forget this question and have at it!
How will this improve my productivity?
If this cannot be answered pack up and go home.
What's the point? / Why?
How will it make my job easier?
Q1: Can I hire people with these skills?
Q2: When I call our outsourcing partner account managers, and ask how much would a typical fixed-cost project cost, if done in the usual way, or done using language X, is the multiplier more than 1?
Q3: Does everyone else in my department also have a favorite language that does about the same job as my favorite language, and should their favorite languages be used as well? What are the practical consequences of this?
A good question to ask is what is the size of the community around the language/framework. For instance, ruby/rails has a significant community around it, which would make me more comfortable that I would not be "the first kid on the block" to have to deal with a particular problem.
Why limit yourself to one language? Figure out which problems are solved best by which language and offer up services. If the bandwidth between the services is too high, then migrate the problematic services together based on which language solves both best.

Relational databases application [closed]

When developing an application which mostly interacts with a database, what is a good way to start? The application requires a lot of filtering based on user input, sorting and structuring.
The best way to start is by figuring out "user stories" (or "use cases" -- but the "story" approach tends to really work great and start dragging stakeholders into the shared storytelling...!-); on top of that, design the database schema as the best-normalized idea you can find to satisfy all data layer needs of the user stories.
Thirdly, you may sketch layers such as views on top of the schema; fourthly, and optionally, triggers and stored procedures that might live in the DB to ensure consistency and ease of use for higher layers (but, no matter how strongly DBAs will push you towards those, don't accept their assurances that they're a MUST: they aren't -- if your storage layer is well designed in terms of normalization and maybe useful views on top, non-storage-layer functionality CAN always reside elsewhere, it's an issue of convenience and performance, NOT logical consistency, completeness, correctness).
I think the business layer and user-experience layers should come after. I realize that's a controversial position, but my point is that the user stories (and implied business-rules that come with them) have ALREADY told you a LOT about the business and user layers -- so, "nailing down" (relatively speaking -- agility and "embrace change!" should always rule;-) the data storage layer is the next order of business, and refining ("drilling down") the higher layers can and should come after.
When you get to the database layer you'll want to handle the database access via stored procedures. This will help give you additional protection against SQL Injection attacks, and make it much easier to push logic changes to the database layer.
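A note on the injection point: the protection comes from passing user input as bound parameters rather than splicing it into the SQL text. A stored procedure that concatenates its arguments into dynamic SQL is still exposed, and a parameterized query is protected even without stored procedures. The sketch below uses Python's built-in `sqlite3` module (SQLite has no stored procedures, so this shows parameter binding instead). The table, the rows and the function names are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT, role TEXT)")
conn.executemany("INSERT INTO members VALUES (?, ?)",
                 [("Ann", "admin"), ("Bob", "user")])

def find_unsafe(name):
    # String concatenation: the input becomes part of the SQL text itself.
    sql = "SELECT name FROM members WHERE name = '" + name + "'"
    return conn.execute(sql).fetchall()

def find_safe(name):
    # Bound parameter: the input is only ever treated as a value.
    return conn.execute("SELECT name FROM members WHERE name = ?",
                        (name,)).fetchall()

attack = "x' OR '1'='1"
leaked = find_unsafe(attack)   # the WHERE clause is now always true
blocked = find_safe(attack)    # no member has that literal name
```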
If it's mostly users interacting with data, you can design using a form perspective.
What forms are needed for user input?
What forms are needed for output reports?
Once you've determined that, the use of the forms will dictate the business logic needed to be coded behind the scenes. You'll take the inputs, create the set of procedures or methods to deal with them, and output what is necessary. Once you know the inputs and outputs, you will be able to easily design the necessary functions.
The scope of the question is very broad. You are expecting me to tell you what to do; I can only do a good job of telling you how to do things. Do investigate using Hibernate/Spring. Since most of your operations look like querying the db, Hibernate should help. Make sure the tables are sufficiently indexed so that queries filtered on the indexed fields run faster. The challenging task is designing your DB layer, which will be the glue between your application and the db. Design your db layer to be generic enough that it can build queries based on the params that you pass to it. Then move on to developing the presentation layer above it. Developing your application layer by layer helps, since it forces you to decouple the db logic from the presentation logic. When you develop the db layer, assume that not just your presentation layer but any client can call it. This will help you design applications that are scalable and adaptable to new requirements.
So, bottom line: start with the DB, then the DB integration layer, then the controller, and last the presentation layer.
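As a rough idea of what "a db layer that builds queries from the params you pass to it" could look like, here is a minimal sketch in Python with the standard `sqlite3` module (not Hibernate, which does far more). The `orders` table, the `ALLOWED_FILTERS` whitelist, and the `build_select` and `fetch` helpers are all hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical whitelist: table name -> columns callers may filter on.
ALLOWED_FILTERS = {"orders": {"id", "customer", "status"}}

def build_select(table, filters):
    """Return (sql, params) for a SELECT with equality filters."""
    if table not in ALLOWED_FILTERS:
        raise ValueError(f"unknown table: {table}")
    unknown = set(filters) - ALLOWED_FILTERS[table]
    if unknown:
        raise ValueError(f"cannot filter on: {sorted(unknown)}")
    sql = f"SELECT * FROM {table}"
    columns = sorted(filters)  # stable order keeps the SQL text predictable
    if columns:
        sql += " WHERE " + " AND ".join(f"{c} = ?" for c in columns)
    return sql, [filters[c] for c in columns]

def fetch(conn, table, **filters):
    # Any client (UI, batch job, test) calls this; none of them write SQL.
    sql, params = build_select(table, filters)
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, status TEXT)"
)
# Index the column that queries are expected to filter on most often.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
conn.executemany(
    "INSERT INTO orders (customer, status) VALUES (?, ?)",
    [("acme", "open"), ("acme", "shipped"), ("globex", "open")],
)

rows = fetch(conn, "orders", customer="acme", status="open")
```

Column names come only from the whitelist and values are always bound, so callers can pass arbitrary filter values without the layer ever splicing them into the SQL text.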
For the purpose of discussion, I'm going to assume that you are working with a starting application that doesn't have a pre-existing database. If this is false, I'd probably move the order of steps around quite a bit.
1 - Understand the Universe
First, you've got to get a sense of what's around you so you can really understand the problem that you are trying to solve.
User stories or use cases are often a good starting point. Start with the tasks the user will try to do, and evaluate how frequently they are likely to do them. I like to start with screen mockups as well; with or without lots of hands-on time with users, I find that having a screen gives our team something really finite to argue about.
What other tools exist in this sphere? These days, it seems to me that users never use just one tool; they swap around a lot. You need to know two main things about the other tools your users use:
(1) What will they be using as part of the process, alongside your tool? Consider direct input/output needs: what might they want to cut/copy/paste from or to? Which tools might you want to offer file upload/download for, in specific formats? Which tools are they using alongside yours that you might want to share terminology, layout, color coding, icons or other GUI elements with? Focus especially on the edges of the tools. A real gotcha I hit in a recent project was emulating the databases of previous tools: it turned out that we had massive database shift, and we would likely have been better off starting fresh.
(2) What (if anything) are you replacing or competing with? Steal the good stuff; dump and improve the bad stuff. Asking users is always best. If you can't, at least understanding the management initiative is important: is this tool replacing a horrible legacy tool? It may be legacy, but there may be the One True Feature that has kept the tool in business all these years...
At this stage, I find that things are really mushy - there's some screen shots, some writing, some schemas or ICDs - but not a really gelled clue.
2 - Logical Entities
Or at least that's what the OO books call it.
I don't care much for all the writing I see on this task - but I find that for any given system, I have one true diagram that I draw over and over. It's usually about 3-10 boxes, and hopefully less than an exponentially large number of lines connecting them.
The earlier you can get that diagram the better.
It doesn't matter to me if it's in UML, a database logical model, something older, or on the back of a napkin (as long as the napkin is shrouded in plastic and hung where everyone can see it).
The earlier you can make this diagram correctly, the better.
After the diagram is made, you can start working on the follow-on work that may be more official.
I think it's a chicken and egg question on whether you start with your data or you start with your screens and business logic. I know that you certainly want to optimize for database sizing and searchability... but how do you know exactly what your database needs are without screens and interfaces giving you a sense for the data?
In practice, I think this is an ever-churning cycle. You do a little bit everywhere, and then you change it all.
Even if you don't get to do a formal agile lifecycle, I think your best bet is to view design as agile -- it will take many repetitions and arguments before you really feel it's "right".
The most important thing to keep in mind is that your first, and most likely your second and third, attempts at designing the database will be wrong in some way. That might sound negative, maybe even a little rash (it's certainly more towards the 'agile' software design philosophy), but it's an important thing to keep in mind.
You still need to do your analysis thoroughly, of course. Try to implement one feature at a time, but try to get all layers working first. That way you won't have to do too much rework when the specs change and you understand the issues better. Once you have a lot of data loaded into a system, changing things becomes increasingly difficult.
The main benefit of this approach is that you find out quickly where your design is broken and where you haven't separated your design layers correctly. One trick I find extremely useful is to do both a SQLite and a MySQL version, so that seamless switching between the two is possible. Because the two use different dialects of SQL, this highlights where you have too tight a coupling between the layers.
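A small sketch of the two-backend trick, in Python: the upper layer talks to a narrow storage interface, and each backend keeps its own SQL to itself. A real MySQL version needs a running server, so a plain in-memory class stands in for the second backend here. All class and function names (`SqliteNotes`, `MemoryNotes`, `business_logic`) are invented for the example.

```python
import sqlite3

class SqliteNotes:
    """Storage backed by SQLite; all SQL lives inside this class."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
        )

    def add(self, body):
        cur = self.conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
        return cur.lastrowid

    def get(self, note_id):
        row = self.conn.execute(
            "SELECT body FROM notes WHERE id = ?", (note_id,)
        ).fetchone()
        return row[0] if row else None

class MemoryNotes:
    """Stand-in for a second backend (e.g. MySQL): same interface, no SQL."""

    def __init__(self):
        self.rows = {}

    def add(self, body):
        note_id = len(self.rows) + 1
        self.rows[note_id] = body
        return note_id

    def get(self, note_id):
        return self.rows.get(note_id)

def business_logic(store):
    # Upper layer: knows only add() and get(), never a SQL string.
    note_id = store.add("first draft")
    return store.get(note_id)

results = [
    business_logic(SqliteNotes(sqlite3.connect(":memory:"))),
    business_logic(MemoryNotes()),
]
```

If `business_logic` ever needs to know which backend it was handed, that is the kind of tight coupling the two-version exercise is meant to expose.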
A good start would be to get familiar with Multitier architecture.
Then you design your presentation layer.
In your business logic layer, implement all logic.
And finally you implement your data access layer.
Try to set up a prototype with something that is more productive than C++, for example Ruby, Python and, well, maybe even PHP.
If the prototype works and you see that your data model is okay but your queries are too slow, then you can start using C++.
But as your question suggests, you have more options than data, and in this case the speed of a scripting language should be enough.

Improving and publishing an application. Need some advice

Last term (August - December 2008) some classmates and I wrote an application in C++. Nothing spectacular, it is an ORM for Sqlite3. We implemented some stuff like reflection to make it work and release the end user from the ugly stuff. Personally, I think we did a nice job, and that our ORM could actually be useful for someone (even though it's written specifically for Sqlite3, it's easily adaptable to other databases).
Consequently, I've come to the conclusion that it should be published somewhere (SourceForge most likely) as an open source project. But, as it was a term project, there are some things that need to be addressed before doing that. Namely, it has some memory leaks that should be fixed, and some parts of the code could be refactored to make everyone's life easier in the future.
I would like to know more experienced C++ programmers' opinions on some issues:
Is it worth rewriting some parts to apply new technologies (for example, boost)?
Should our ORM be adapted to the latest C++ standard? Is there any benefit in doing this?
How will we know when our code is ready for release?
What are the chances that this ORM will be forgotten into the mists of the internet? (i.e. is it worth publishing it beyond personal pride as a programmer?)
Right now I can't think of many more questions, but I would like to read about similar experiences.
EDIT: I should probably translate my code + comments to English, right? (self question)
Thanks in advance.
I guess I am "more experienced" with regard to your particular question. I co-developed an open source web application language & template system a lot like ColdFusion back in the early days of web design before Java or ASP were around. You can still see it at http://www.steelblue.com/ if you are interested. It's still used at the company I was at when it was developed, but I don't think anywhere else.
What I found is that unless you are already well connected and people are watching what you are doing, getting people to use your open source code is just about as hard as selling someone your closed source program. You really need to advocate for your project, and it should have some kind of unique selling proposition that distinguishes it from the competition.
So, that's the unsolicited advice. Here are some specific answers to the questions you had...all purely my opinion, of course.
I wouldn't rewrite any code unless you have a feature you want to put in. That feature might be compatibility with specific platforms or compilers. It might be to support a new db datatype or smarter indices or whatever. If you are going to put some more serious work into the application, think about a roadmap of what you can realistically accomplish in the next iteration and what choices will make the app the "most better" at the end of your cycle.
Release the code as soon as it is usable for a specific purpose, any purpose. Two reasons. First off, there might be someone who wants it for that purpose right now. If it's not available, they will use something else. Also, if it's open source, they might contribute back to the project. Second, the sooner you find out how much people want to use the code, the better. Either it will be more popular than you expect and you can get excited about continuing the development....or....you will find that no one is even visiting your web page to see what you've got. In either case, better to know sooner than later what people really want from your project so you can take that into account when planning new releases.
About the "forgotten into the mists." I think most projects are. I don't want to be a downer, but looking at Wikipedia, there were 5 C++ ORM tools popular enough to get a mention and they were all open source. As I said above, unless you can sell your idea to people, they are going to go with another proven open source solution. For someone to choose you over them, three things have to happen: 1. They need a feature you have that the others don't. 2. They find your project web site and it demonstrates the superiority of your code. 3. They trust your code enough to give it a shot.
On the other hand, if you are in this for the long haul and want to continue development, things get easier over time. Eventually the project will get all the basics covered and you can start developing those new features that aren't in the other solutions. Also, the longer you are in active development, the more trustworthy the project will seem. Finally, you will get more experience in the niche. 2 years from now you will be better positioned to say where your effort will have the most impact on bettering the project.
A final thought: If you are enjoying it, learning from it, and it's not getting in the way of you keeping food on the table, it's a good use of your time.
Good luck!
-Al
Regarding the open source part:
If you really want to make it an open source project, you really should publish it regardless of its current state - fully working and debugged - or half working and full of memory leaks.
Just, if its state is bad, make sure to document that, and give it a suitable version number (less than one?). Then others may view your code, suggest improvements, join your team, etc...
My--rather random--thoughts on the matter (in the order I think is most important):
How will we know when our code is ready for release?
Like Liran Orevi said: if you're going open source, release early. Document it reasonably well, and take the time to provide a road map of planned or hoped-for future improvements (these are an invitation for people to help you, so note which ones have no one working on them).
Is it worth rewriting some parts to apply new technologies (for example, boost).
Should our ORM be adapted to latest C++ standard? Is there any benefit in doing this?
SQLite relies on a fairly limited base. Maybe you don't want your tool to demand a much heavier environment. If the code is not currently a tangled and unmaintainable mess, you might want to avoid boost and the newest frills. Once you have a stable release (1.0 at least) you can start thinking about the improvements that can be made for version 2.
What are the chances that this ORM will be forgotten into the mists of the internet? (i.e is it worth publishing it beyond personal pride as a programmer?)
Most things end up in the big /dev/null in the sky, and there is only one way to find out... If it goes anywhere at all, you win. If it doesn't, it was a modest investment, and maybe you learned something while you were at it.