How do you understand a large chunk of code? - c++

I am a fresh college graduate who just started my first job. During my ramp-up period, I need to learn a lot of product code. There are some design docs, but they do not help much.
Can you provide some general techniques to browse and understand huge product code (specifically C++)?

Run it through doxygen. This will generate html documentation which will be helpful even if the code does not have proper doxygen-style comments.
Another good piece of advice is to look through the unit tests, if there are any. If there are no unit tests, a good way to understand the code is to write your own unit tests. The effort to do this will pay for itself many times over.
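To make the "write your own unit tests" idea concrete, here is a minimal sketch of what such a test can look like. The function `legacy_format_price` is invented for the example and stands in for whatever routine you are reading; in real life you would link against the product code and probably use the test framework your team already has rather than plain `assert`.

```cpp
#include <cassert>
#include <string>

// Stand-in for a legacy routine you are trying to understand.
// The name and behaviour are invented for this example.
std::string legacy_format_price(int cents) {
    std::string sign = cents < 0 ? "-" : "";
    int abs_cents = cents < 0 ? -cents : cents;
    std::string frac = std::to_string(abs_cents % 100);
    if (frac.size() < 2) frac = "0" + frac;
    return sign + std::to_string(abs_cents / 100) + "." + frac;
}

// Each assertion records one thing learned by running the code,
// so the understanding is re-checked on every build.
void test_legacy_format_price() {
    assert(legacy_format_price(0) == "0.00");
    assert(legacy_format_price(5) == "0.05");
    assert(legacy_format_price(1999) == "19.99");
    assert(legacy_format_price(-250) == "-2.50");
}
```

Calling `test_legacy_format_price()` from a small test driver is enough to start with. The point is less the framework than the habit: every time you figure out what a piece of code does, pin it down in a test.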

Use every method available to you (in no particular priority):
Use the product itself and understand what it does
Talk to the devs that maintained it or have worked with it previously
Debug through it and see how data flows and how classes interact ("when I click this button, what exactly happens, who is responsible?")
Look at architecture, UML, or class diagrams
One of my favorites: create your own diagrams of class hierarchies, class interactions, general control flow, high-level components, process/DLL interactions, object lifetimes and management
If they're not totally out-of-date, read the dev/test/user specs (goes well with #1)
Read the documentation on it
Most of all: be tenacious and persistent. If you don't put in the work, don't expect to understand it. If you don't understand something, dig and dig until you do. Software is not magic, it's just hard work :)

Some people will tell you to start with the data structures, but in a large system even that's not terribly helpful much of the time. I can think of four major points:
Take your time. Often, it's more like a whole series of gestalt shifts than it is a single, linear, gradual understanding. So be patient.
No matter how big it is, you should be able to put a breakpoint in and walk it in a debugger. Even in a large, complicated, multi-threaded system, you should be able to walk through and see what's happening.
Ask for bugs, and start fixing them, no matter how crazy they seem. It's akin to dropping yourself into a foreign country; you'll pick up the language eventually.
Find a mentor. A jungle guide is invaluable.

I think there have been a few good responses already. My 2c worth...
Not sure what you class as huge (10 KLOC, 1000 KLOC, 10000 KLOC, etc), but one would hope that this is broken down in some way and is not a monolithic single program. Perhaps your management has some guidance on which 'module(s)' you are most likely to be spending time in at the moment. Hopefully this can help break down the problem scope.
Firstly, before you try to understand the code try to understand the product. What does it do? Then how does it do it? What does it interact with? Then how does it interact? etc...
When getting to the code try to understand the high level design and philosophy first, and work on the breadth before the depth. I agree with some of the above re fixing some bugs, but I also strongly suggest you continue to get a handle on the high level even if you need to get into the details to fix some bugs.
I also agree with the above in terms of generating some diagrams for yourself if you can't find any already in existence. And then share them, perhaps a team/product wiki? I'm curious as to why the existing doco does not help very much. Typically this is because this type of doco was generated from the early concepts and the product no longer bears any similarity, but if this is not the case then what can you contribute to this issue. One assumes that where you are today someone else will be in short enough order, and you are in an ideal position to know what essential doco is missing!
If the product is actually 'huge' then you have to accept that you will never be able to hold all of it in your head, so the best thing you can do is be familiar enough to know where to start looking (comes back to understanding the product, and approaching code breadth first).

This is obviously a pretty common question, and it's similar to this one (and the questions related to it): How to understand the design and code flow of any product quickly?
Dig through some of those answers / comments, for starters. Else, we'll just end up repeating them. :)

Related

Our code sucks and I'm powerless to fix it. Help! [closed]

Closed 10 years ago.
Our code sucks. Actually, let me clarify that. Our old code sucks. It's difficult to debug and is full of abstractions that few people understand or even remember. Just yesterday I spent an hour debugging in an area that I've worked for over a year and found myself thinking, "Wow, this is really painful." It's not anyone's fault - I'm sure it all made perfect sense initially. The worst part is usually It Just Works...provided you don't ask it to do anything outside of its comfort zone.
Our new code is pretty good. I think we're doing a lot of good things there. It's clear, consistent, and (hopefully) maintainable. We've got a Hudson server running for continuous integration and we have the beginnings of a unit test suite in place. The problem is our management is laser-focused on writing New Code. There's no time to give Old Code (or even old New Code) the TLC it so desperately needs. At any given moment our scrum backlog (for six developers) has about 140 items and around a dozen defects. And those numbers aren't changing much. We're adding things as fast as we can burn them down.
So what can I do to avoid the headaches of marathon debugging sessions mired in the depths of Old Code? Every sprint is filled to the brim with new development and showstopper defects. Specifically...
What can I do to help maintenance and refactoring tasks get high enough priority to be worked?
Are there any C++-specific strategies you employ to help prevent New Code from rotting so quickly?
Your management may be focused on getting working features into the product, and keeping them working. In this case, you will need to make a business case for refactoring the old stuff, in that by X investment of time and effort you can reduce necessary maintenance time by Y over period Z. Or your management may be fundamentally clueless (this happens, but less often than most developers seem to think), in which case you'll never get permission.
You need to see the business point of view. It doesn't matter to the end user whether the code is ugly or elegant, only what the software does. The cost of bad code is potential unreliability and additional difficulty in changing it; the emotional distress it causes to the programmer is rarely considered.
If you can't get permission to go in and refactor, you can always try it on your own, a little bit at a time. Whenever you fix a bug, do a little rewriting to make things clearer. This may turn out to be faster than the minimum possible fix, particularly in verifying that the code now works. Even if it isn't, it's usually possible to take a little more time on a bug fix without getting into trouble. Just don't get carried away.
If you can leave the code just a little better each time you go in, you'll feel a lot better about it.
Stand Up Meetings
I might go to my mechanic, and we have a little stand-up meeting in the morning:
I tell him I want my wheels aligned,
my tires rotated, and my oil changed.
I mention that "Oh by the way my
brakes felt a little soft on the way
in. Could [he] take a look at them?
How soon can I get my car back because
I need to get back to work?"
He pops his head under my car, pops
back up and says my brakes are leaking
oil and starting to fail. He will need
a part that will arrive at 10:30am.
His man won't finish before lunch, but
I should get my car back by 1:30pm or
so. He's booked solid so he won't be
able to do any of the other stuff
today, and I will have to book another
appointment.
I ask if he can do the other stuff and
I come back for the brake. He tells me
he really can't let me drive out of
there without fixing the brakes
because they might cause an accident,
but if I want to go to another
mechanic, he can call for a tow.
Since the car will be done so shortly
after lunch, I ask if his man can take
a late lunch so I can get my car back
an hour earlier.
He tells me his men come in at 8am and
often work into the evening.
They earn every break they
get, and his man deserves to take his
lunch with everyone else.
None of that is what I wanted to hear. I wanted to hear that I would drive out of there in a half hour with my wheels, tires and oil done.
My mechanic was just straight up and honest with me. Are you straight up and honest with your management? Or do you avoid telling them things they don't want to hear?
Unit Testing
I wouldn't touch a line of code I didn't understand, and I wouldn't check in a new line of code I didn't test thoroughly. (At least, not intentionally.)
Your question seems to imply that somehow a large corpus of poorly documented code made it past review without any unit tests. Maybe you participated in that, and maybe you didn't. Everyone involved needs to accept responsibility for that--including management. Regardless, what's done is done. You cannot go back and change it.
However, right now, in the present time, it is everybody's responsibility to stop the behavior that led to the problem in the first place. You say you spent a year working in code that you find difficult to understand and that has no unit tests. During that year, as you worked hard to improve your understanding, how many unit tests did you write to document and to verify that understanding?
As you struggled through the code slowly gaining understanding, how many comments did you add so you wouldn't have to struggle next time?
Scrum Backlog
Personally, I think the term "Scrum backlog" is a misnomer. A list of things to do is just a list--a shopping list if you will. I had a list when I went to the mechanic. My stand up meeting with the mechanic was really more of a sprint planning meeting.
A sprint planning meeting is a negotiation. If your management is time boxing without that negotiation, they aren't managing anything. They are simply trying to cram 10 lbs of shit into a 5 lb sack, and it's your responsibility to tell them so.
When you show up to a sprint planning meeting, you are expected to commit to a body of work, and it's your responsibility to prepare for that. Preparation means having some idea of what you will have to do to complete each item on the list--including the time it takes to understand obscure code and the time it takes to write unit tests.
If someone invites you to a planning meeting where you won't have time to prepare, decline the meeting and suggest when to reschedule so you will have time.
If you have an existing body of code with no unit tests and a feature might conceivably affect the operation of that code, you need to write unit tests for as much of the old code as might be affected. When you commit to writing the feature, you are committing to doing that work. If that leaves you too little time to commit to some other feature, just say so. Don't commit to the other feature.
When you commit to fix a defect, you commit to testing your work. Obviously, that means writing a unit test for the defect. But if it involves old code with no unit tests, it also means writing unit tests for things that aren't broken yet, but might break due to your change. How else will you test the fix?
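As a rough sketch of what that looks like in C++: one test reproduces the reported defect, and a second pins down the behaviour that was not broken but could break because of the fix. The `average` function and the defect (division by zero on empty input) are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical routine after the fix. The reported defect was a
// division by zero when the input was empty.
double average(const std::vector<int>& values) {
    if (values.empty()) return 0.0;  // the fix
    long long sum = 0;
    for (int v : values) sum += v;
    return static_cast<double>(sum) / static_cast<double>(values.size());
}

// Test for the defect itself: fails before the fix, passes after.
void test_average_empty_input_regression() {
    assert(average({}) == 0.0);
}

// Tests for things that were not broken, but that the fix could break.
void test_average_existing_behaviour() {
    assert(average({4}) == 4.0);
    assert(average({1, 2}) == 1.5);
    assert(average({-3, 3}) == 0.0);
}
```

The second test function is the part that tends to get skipped under time pressure, and it is the part that stops the defect list from refilling.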
If your defect list remains a constant size, your team regresses as much as it fixes. Politely explain to whomever needs to understand that unit tests prevent the regressions that currently keep your defect list from shrinking.
If you fail to write those unit tests because you commit to too many features, whose responsibility is that?
Refactoring
When you refactor code, you have to test all of it, and that means writing unit tests for all of it. If you have a large body of code with no unit tests, you will have to write all of those unit tests before you refactor.
I suggest you hold off on refactoring until those unit tests are in place. In the meantime, if you insist on including unit tests in your estimates for the work you commit to, eventually all those unit tests will be there. And then you can refactor.
The one exception to that is refactoring for testability. You may find that some of the code was not designed for test and that you have to refactor for things like dependency injection before you can create your unit tests. When you commit to writing the feature that requires the unit test, you commit to making the code testable. Include that in your estimate when you commit to the feature.
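Here is a small sketch of refactoring for testability through dependency injection. All names (`Clock`, `Session`, `FakeClock`) are hypothetical. The assumption is that the legacy class read the system clock directly, so its expiry logic could not be tested without actually waiting.

```cpp
#include <cassert>
#include <chrono>

// The seam: the class under test now asks an injected Clock for the time.
struct Clock {
    virtual ~Clock() = default;
    virtual std::chrono::seconds now() const = 0;
};

// Production implementation backed by the real clock.
struct SystemClock : Clock {
    std::chrono::seconds now() const override {
        using namespace std::chrono;
        return duration_cast<seconds>(steady_clock::now().time_since_epoch());
    }
};

class Session {
public:
    Session(const Clock& clock, std::chrono::seconds ttl)
        : clock_(clock), expires_at_(clock.now() + ttl) {}
    bool expired() const { return clock_.now() >= expires_at_; }
private:
    const Clock& clock_;
    std::chrono::seconds expires_at_;
};

// Test double: time moves only when the test says so.
struct FakeClock : Clock {
    std::chrono::seconds current{0};
    std::chrono::seconds now() const override { return current; }
};

void test_session_expires_after_ttl() {
    FakeClock clock;
    Session session(clock, std::chrono::seconds(30));
    assert(!session.expired());
    clock.current = std::chrono::seconds(29);
    assert(!session.expired());
    clock.current = std::chrono::seconds(30);
    assert(session.expired());
}
```

Production code passes a `SystemClock`, tests pass a `FakeClock`. The behaviour of `Session` does not change, which is what makes this kind of refactoring reasonably safe to do before the tests exist.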
Commitment + Responsibility = Power
You say you are powerless. When you accept responsibility and commit to doing what needs doing, I think you will find you have all the power you need.
P.S. If anyone complains about anybody "wasting time" writing multiple unit tests when fixing a single defect, show them this video on the 80:20 rule and pound "defect clusters" into their brains.
It is hard to tell much from the information you give. One question I would have: a logical reason to be writing new code is to replace the old code. If that is what you are doing, abandon the old code.
Is it also old code that has showstopper defects? If so where are they coming from? Old code does not have "showstopper" defects, it just grinds closer and closer to a halt usually. It is old code after all - it should have the same old defects and the same old limitations, not stuff that has to be looked at right away. Showstopper defects are new code defects. It sounds like there is active development going on in the old code.
If you are writing all this new code on top of old code that sucks, with no plans to fix it once and for all, sorry, there is only so much you can do when you are too busy burying yourself to dig yourself out.
If the latter is the case, you should recognize where you are headed, and try to detach a little. It is all going to collapse eventually; if you plan on being around, save your strength for a worthwhile battle.
In the meantime try to pick up some design patterns. There are several that can at least help shield your new code from the old stuff, but still, ultimately it is just hard to write good code against bad code.
And your sprints sound maybe confused. Is there not an overall direction? That should determine how much backlog you have, although things can change month to month, is there not a clear sense of moving towards some final goal?
And new code rotting? The way you prevent that is you have a meaningful design, a meaningful direction, and a quality team that is committed to both the quality of their work and the vision of the design. If you have that, discipline is what maintains quality. If you don't have that, sorry, you were basically writing code with no purpose already. It was rotten on the vine.
Not being critical, just trying to be honest. Take a deep breath. Slow down. You seem like you need it. Look at what you have written here. It tells nothing. You talk of refactor, scrums, showstoppers, defects, old code, new code. What does any of that mean? It is all jumbled up.
What about "new initiatives versus legacy systems"? "Need to refactor early sprint cycle code in terms of latest understanding etc." Are showstoppers in fact "Early components of the current enterprise initiatives have been released but are experiencing problems and no time is budgeted because of new development".
These would be meaningful concepts. You've given us nothing. I understand it is intense. My sprints are crazy too; we add a lot of backlog items because we could not get many requirements up front (a lot of my new requirements result from having to also contend with external regulatory bodies, and the normal business process is not always available).
But at the same time I am ground down by the sheer magnitude of what has to be done and the time to do it. Everything that is added to my backlog needs to be there. It is crazy, but at the same time I have a very clear idea of where I have been, where I need to go, and why the road is getting harder.
Step back, clear your thoughts, figure out the same - where you have been and where you are going. Because if you know that, it sure is not obvious. If you cannot communicate anything your peers can understand, how far are you going to get with a business manager?
Old code always sucks. There are probably some rare exceptions written by people with names like Kernighan or Thompson but, for the typical "code written in an office" stuff, over time it's gonna stink. Developers get more experienced. Newer practices, such as continuous integration, change the game. Stuff gets forgotten. New maintainers fail to grasp designs and wish for re-writes. So best accept this as normal.
Some random things that might help...
Talk about it with your team. Share your experiences and your concerns, while avoiding "man your old code sucks" (for obvious reasons) and see what the consensus is. You're probably not alone.
Forget about your managers. Don't expose them to this level of detail - they don't need to think about new vs. old code and probably won't understand if they do. This is a problem for your team to tackle and, if necessary, to make your PO aware of.
Be open to the possibility that you may be able to throw stuff out. Some of that old code probably relates to features that are no longer being used or failed to be adopted by users in the first place. To make this work for you, you really need to go a level higher and think in terms of where the code really delivers user or business value vs. where it's just a ball of mud that no one is brave enough to take a decision on. Who dares, wins.
Relax your view of architectural consistency. There's always a way to tap into a working system with new code somewhere, and that may allow you to slowly migrate to a newer, smarter approach, while preserving the old long enough not to break existing things.
Overall, winning in this kind of situation is less about coding skills and much more about smart choices and handling the human aspects.
Hope that helps.
I recommend keeping track of how many bugs and code changes involve your "old code" and present this to either your manager or to your fellow developers at your next team meeting. With this in hand it should be simple enough to convince them that more needs to be done to refactor your "old code" and bring it up to par with your "new code".
It would also be prudent to document the parts of your "old code" that are most difficult to understand. These would also be the parts of your "old code" that you should be refactoring first once you get the approval.
Something to try: group your classes into - say - worst 10%, best 10%, and the rest. Base the grouping on length, cyclomatic complexity, test coverage - whatever tools are handy and comfortable to you. Deliver the lists to your management, saying, "I predict the majority of bugs over the next quarter will be found in the first set." Then sit back and watch - and be right. Now you've got some credibility, some leverage when you say, "I'd like to invest some resources in making our bad code better, to reduce bugs and maintenance costs - and I know where to invest that energy, see?"
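If you already export per-class metrics from some tool, the ranking itself is a few lines. The sketch below is only an illustration: the `ClassMetrics` fields and especially the weighting in `risk_score` are arbitrary assumptions, not a recognised formula, so swap in whatever your team trusts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-class numbers, e.g. exported from a metrics tool.
struct ClassMetrics {
    std::string name;
    int lines;
    int cyclomatic_complexity;
    double test_coverage;  // 0.0 (untested) .. 1.0 (fully covered)
};

// Arbitrary illustrative score: complex, untested, long classes rank worst.
double risk_score(const ClassMetrics& m) {
    return m.cyclomatic_complexity * (1.0 - m.test_coverage) + m.lines / 1000.0;
}

// Returns the names of the worst `fraction` of classes, at least one
// if the input is non-empty.
std::vector<std::string> worst_fraction(std::vector<ClassMetrics> classes,
                                        double fraction) {
    std::sort(classes.begin(), classes.end(),
              [](const ClassMetrics& a, const ClassMetrics& b) {
                  return risk_score(a) > risk_score(b);
              });
    std::size_t count = static_cast<std::size_t>(classes.size() * fraction);
    if (count == 0 && !classes.empty()) count = 1;
    std::vector<std::string> names;
    for (std::size_t i = 0; i < count; ++i) names.push_back(classes[i].name);
    return names;
}
```

Calling `worst_fraction(all_classes, 0.10)` gives you the first list; reversing the comparison gives you the best 10%. Write the prediction down with a date so nobody can argue about it later.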
You could create diagrams and sketches of how the new code works and how the classes and functions are related to one another. You could use FreeMind or maybe Dia. And I definitely agree with Documenting and commenting your code.
I once had a problem with this too. I wrote a font class for J2ME for my own language. It was awful for these reasons that maybe you might also see in your code.
No Comments or documentation
Less object oriented
bad variable / function names
...
But after a few months I was forced to write the whole thing again. Since then I've learned to use meaningful variable names, even when they are sometimes VERY long, to write more comments than code, and to use diagrams for the project's classes and their relationships.
I don't know if this is a real answer, but it definitely worked for me. For old code you might actually have to reread the whole thing and add comments as you recall what each part does.
Hope it helped.
Talk to your Product Owner! Explain that time invested in refactoring the old code will bring him benefit of higher team velocity on new features once this obstacle is removed.
Other than the approaches mentioned above which are good, you can also try these:
For keeping future code clean
Try pair programming, at least for parts that make sense. It's an effective way of making reviewed, refactored code a habit.
Try to get refactoring onto the definition of "done". Then it will be part of the estimation process and allotted accordingly. So the definition of done might include: coded, unit tested, functionally tested, performance tested, code reviewed, refactored, and integrated (or something like this).
For Cleaning up the old code:
Unit tests are great for helping you refactor and figure out how things work.
I agree with the comments that a business case needs to be made for large-scale refactoring. But, small-scale refactoring could be easily included in the estimate and will provide immediate return. i.e.: I spend 2 hours rewriting a piece but I would have spent that time looking for bugs anyway.
You may also want to consider getting the product owner and scrummaster to capture a separate velocity for the old code vs the new code, and use that accordingly.
If there's a desired new feature and you can delineate a non-overwhelming hunk of code that is in the way, then you might be able to get management's blessing to replace the old code with new code that has the desired new feature. When I did this, I had to write a somewhat ugly shim layer to meet the old interfaces of the part of the software I wasn't going to touch. And a test harness that could exercise the existing code and exercise the new code to make sure the new code, as seen through the shim layer, could fool the rest of the application into thinking nothing had changed. By reworking the portion we reworked, we were able to show huge performance benefits, compatibility with desired new hardware, reduction in each of our field site's needs for expertise in administering space for the application - and the new code was much more maintainable. That last point mattered not a whit to the users, but the other advantages from the rework were attractive enough to "sell" the users on the merits of a somewhat painful database conversion.
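A stripped-down sketch of the shim-plus-harness idea, with everything invented for the example: `old_lookup` stands in for the old interface that the untouched code keeps calling, `Store` for the new implementation, `shim_lookup` for the ugly adapter, and `same_behaviour` for the harness that feeds both the same input.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Old interface (hypothetical). Returns 0 and fills `out` on success;
// -1 if the key is unknown or the value does not fit in `out_size`
// bytes including the terminator.
int old_lookup(const char* key, char* out, int out_size) {
    const char* value = nullptr;
    if (std::strcmp(key, "host") == 0) value = "localhost";
    if (std::strcmp(key, "port") == 0) value = "8080";
    if (value == nullptr) return -1;
    if (static_cast<int>(std::strlen(value)) >= out_size) return -1;
    std::strcpy(out, value);
    return 0;
}

// New implementation with a cleaner interface (also hypothetical).
class Store {
public:
    // Returns a pointer to the value, or nullptr if the key is unknown.
    const std::string* find(const std::string& key) const {
        auto it = data_.find(key);
        return it == data_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, std::string> data_{{"host", "localhost"},
                                             {"port", "8080"}};
};

// Shim: old signature and error convention, new code behind it.
int shim_lookup(const char* key, char* out, int out_size) {
    static const Store store{};
    const std::string* value = store.find(key);
    if (value == nullptr) return -1;
    if (static_cast<int>(value->size()) >= out_size) return -1;
    std::memcpy(out, value->c_str(), value->size() + 1);
    return 0;
}

// Harness: same input to both, require identical return code and output.
bool same_behaviour(const char* key, int out_size) {
    const int kBufSize = 64;
    assert(out_size <= kBufSize);
    char old_buf[kBufSize] = {0};
    char new_buf[kBufSize] = {0};
    int old_rc = old_lookup(key, old_buf, out_size);
    int new_rc = shim_lookup(key, new_buf, out_size);
    return old_rc == new_rc && std::strcmp(old_buf, new_buf) == 0;
}
```

In practice the harness would replay recorded production inputs rather than a handful of keys, and it would include the awkward cases (unknown key, buffer too small) since that is where old and new code usually disagree.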
Another more modest success story: we had a decent trouble tracking system that had literally years of history. There was a subsystem of our application that was famed for the speed with which it would burn out maintenance programmers. Clearly (well, clearly in my mind) it was in need of a major re-write, but management wasn't enthused about that. We were able to dig through the history in the trouble tracking data to show the staffing level that had gone into maintaining this module, and for all that effort, the trouble tickets per month against that module continued to arrive at a constant rate. When faced with actual data like that, even the reluctant managers who had long been tight-fisted about staffing re-work of that subsystem could see the merit of assigning staff to rework that module.
The approach as before was to leave the input and output of that module alone. The good news was that throwing virtual memory at the new code with its fancy new data structures did give a noticeable performance improvement to the module. The bad news is that we were nearly done with the re-implementation before we really understood what was wrong in the original implementation such that it did work most of the time, but managed to fail on some of the transactions on some days. The first cut faithfully reproduced those bugs, but the bugs were easier to understand in the reworked code so we now had a shot at really fixing the real problem. In retrospect, maybe we'd have been smarter to have captured data that produced the problems and have taken better care to make sure the reworked version didn't reproduce that problem. But, the truth is, nobody understood the problem until we were quite far along on the re-write. So, the re-write gave improved performance to the users and improved understanding to the current programmers, such that the real problem could really be resolved at last.
A fail example: There was yet another incredibly ugly module that persistently was a sore spot. Alas, I wasn't clever enough to be able to understand the de facto interfaces to this particular wretched hive of scum and villainy, at least not in the time frame of the nominal release schedule. I'd like to believe that given more time we could have figured out a suitable plan for re-working that piece of the system too, and maybe once we understood it, we could even identify user-desired improvements that we could fit into the re-write. But I can't promise that you'll find a prize in every box. If the box is entirely obscure to you, slicing away a chunk of it and replacing that piece with clean code is hard to do. The guy who had charge of that module was probably the one best positioned to figure out a plan of attack, but he saw the frequent crashes and calls from the field for assistance as "job security". I don't think management ever really recognized that he needed to be eased aside for someone with a hunger for change, but that's probably what was needed.
Drew

(student) interview questions - programming for a robotics lab

My robotics lab is looking for programmers to work on some projects we have at the moment.
We nailed down the requirements (mainly, c++ and experience with openGL and 3D), but due to obvious money constraints we can't afford to hire Great Developers. Instead we're going to settle for Talented Students, offering them projects for their dissertation/thesis and hoping for some fresh ideas and creativity from their side. We can also afford to pay students that just graduated (first job experience).
So my question is:
In your experience, how did you spot a Talented Student (computer scientist or engineer)? What questions did you ask? What else did help you in finding a candidate that turned out to be a Good Programmer? (note: they might not know much about a specific language, but might have the ability to learn pretty fast)
or, if you were the interviewee,
Which questions were asked that made you jump on the bandwagon? Or, if you had an awful experience, what - in retrospect - was an obvious warning signal that you ignored?
Please note that I am not looking for an argumentative answer. We can talk all day long about what's best for us and never agree.
Instead, I'm looking for tales from your experience. Anecdotes, stories, hints, everything will help.
Background:
A bit more background: working for academia here is slightly different than working for the private sector (here = Italy). There are no 'deadlines' to 'sell' a product; instead, it's all proof-of-concept based. Nothing you start working on has the guarantee to be functional.
A comic best describes it: reinventing the wheel
I am considering doing Coding Questions for their interview, but all my colleagues are scoffing at me (too scary, nobody will ever come to work for us again, nobody really knows how to code, etc).
Coding-wise, programming done by researchers is ... ugly. I am fighting to get a version control system in constant use, people have to be chased down to report bugs and document their code, everything is coded-so-that-it-works and rarely we go back to old code to 'fix bugs'. Basically once it's somewhat working, the project is closed and people go work on another project.
Lots has been reinvented and rewritten over and over again (just because nobody knew it was already there). People come and go, future is uncertain, but we play with robots so it's very cool :)
Furthermore, being really understaffed, nobody can follow you and guide you in your project. At best you're the one that has to come up with a plan, background literature and a working prototype.
Hence, we are looking for people that:
have some background to get started
can be highly independent
do want to learn and build their own expertise in new fields
Actually, here's my best advice:
Recruit among your students.
Since you work for an academic institution I assume that either you or your colleagues teach. This provides you with a wealth of information about what potential recruits are capable of -- how fast they learn, how motivated they are, what they are good and bad at, what the code they turn in for lab assignments and projects looks like, etc.
Firstly, in industry, coding questions are very much the norm - I'd be worried if coding questions weren't asked!
I've been responsible for doing technical interviews for about ten years now. And yes, I ask coding questions. But the questions themselves aren't really the point. What I'm more interested in is getting an idea of whether the candidate can think, and articulate their thoughts.
One question I ask (assuming the simple earlier stuff has gone well) is about inheritance hierarchies. There is no one right answer, although there are a lot of wrong ones. But what's important is how they approach the question, and the points they come up with in favour of one design over another.
Background knowledge is useful, in that it shows that they have an interest in the area in which you're working - but really knowledge can be gained. Intelligence is much more important.
However it's quite possible to have intelligent people who are impossible to work with! I haven't figured out how to determine who they are.
I have done such a project when I was a student: ie a 4-months project, working half-time. It was not about robots, per se.
I think that the most obvious requirement is motivation/passion. Since they'll be mostly on their own you will need them to be somewhat independent and able to think for themselves, this requires motivation first and foremost.
In order to determine whether the candidate is motivated or not, begin by asking them about the project itself. If they only gave it a cursory glance, they're likely not motivated. Also look at their experience / courses: optional courses in CS, projects they've done, etc... anything indicating that they really care about CS / development in general and are not there just because they've heard it paid well.
Then comes the question of ability. Like you said it might not be easy to spot those who will be smart enough to figure stuff out by themselves and DO things. Once again, you can ask them about past projects, having them detail the issues they faced and how they solved it.
Finally, I agree with you that some demonstration of their abilities is in order. They might be a bit tense initially, so I would do this at the end, once the interview is already going, you might have had a chance to get them to relax with the previous questions this way.
You don't necessarily need them to do coding questions; I think it's mostly about reasoning. Try to pick problems related to your area of work, for example one you really had in the past, and get them to analyze the problem. If possible they should take the lead and ask you questions about the problem itself here.
We've had an issue with the robot not being able to analyze the images the camera took: it could not correctly determine the moving objects on its own. Do you have an idea how you would do that?
Then you'll need to get them to think about a solution. You need a whiteboard here, and ask them to think aloud so that you can follow their reasoning. You'll probably need to nudge them a bit from time to time to keep them on track. Their reaction to your input is also a key point here, since you want them to be able to accept criticism and build on it; otherwise you might have issues directing them afterward.
Frankly, try to avoid asking them for the quicksort algorithm, or the introsort, or the radix sort... If they need sorting, they'll just fire up their computer and browse the internet. On the other hand, getting them to analyze an existing algorithm unknown to them (for example, the median of 5 sort) and checking that they understand why it works, could be worth it. If they need to work on their own, they'll need to be able to learn on their own too :)
As others say, try to hire someone that's motivated!
For master thesis students I put more emphasis on knowing the basic skills (programming, knowing how to use version control) as they don't stay on long enough to learn everything along the way.
If they're going to work mostly on their own and you have no special requirements on language I wouldn't focus much on language questions. But every decent programmer knows at least one language fairly well, so get a sample of their prior work or make them code some simple application to test that they don't suck.
I'd focus more on algorithm and data structures. Ask rudimentary stuff that every programmer should know - when to use a list and when to use a vector, why summing a row-major matrix by iterating over the columns first is bad, basic complexity analysis questions, etc. That will sort out many of the bad weeds.
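To make the row-major question concrete, here is a minimal sketch one could put in front of a candidate. The `Matrix` type and both function names are made up for illustration. Both functions return the same sum; only the memory access pattern differs, and the actual speed gap depends on the matrix size and the hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative dense matrix stored row-major:
// element (r, c) lives at data[r * cols + c].
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c, double fill)
        : rows(r), cols(c), data(r * c, fill) {}

    double at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Inner loop walks adjacent memory, which is friendly to the cache.
double sum_row_first(const Matrix& m) {
    double total = 0.0;
    for (std::size_t r = 0; r < m.rows; ++r)
        for (std::size_t c = 0; c < m.cols; ++c)
            total += m.at(r, c);
    return total;
}

// Inner loop jumps `cols` elements on every step, so for large
// matrices most accesses tend to land on a different cache line.
double sum_column_first(const Matrix& m) {
    double total = 0.0;
    for (std::size_t c = 0; c < m.cols; ++c)
        for (std::size_t r = 0; r < m.rows; ++r)
            total += m.at(r, c);
    return total;
}
```

A good follow-up is to ask which of the two they would expect to be slower on a matrix that does not fit in cache, and why.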
Ask some design questions too perhaps, e.g. what is "coupling" and why is it bad, ask them if they know what a design pattern is, etc.
Check that the applicants have a solid grasp of linear algebra and coordinate system changes in particular if they're going to work with any 3D stuff like OpenGL. In my experience, learning the API is simple, wrapping your head around how the transformations work less so.
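A small whiteboard-style check for this, sketched under a few assumptions: 2D homogeneous coordinates, column vectors, and integer matrices (which is enough for a 90 degree rotation). All names are illustrative and not taken from OpenGL or any other API. The point is that the order of composition matters.

```cpp
#include <array>
#include <cassert>

using Mat3 = std::array<std::array<int, 3>, 3>;
using Vec3 = std::array<int, 3>;  // homogeneous 2D point (x, y, 1)

// Standard matrix product a * b.
Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out{};  // zero-initialized
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

// Applies m to the column vector v.
Vec3 apply(const Mat3& m, const Vec3& v) {
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        for (int k = 0; k < 3; ++k)
            out[i] += m[i][k] * v[k];
    return out;
}

// Rotation by 90 degrees counter-clockwise about the origin.
Mat3 rotate90() {
    Mat3 m = {{ {0, -1, 0}, {1, 0, 0}, {0, 0, 1} }};
    return m;
}

// Translation by (dx, dy).
Mat3 translate(int dx, int dy) {
    Mat3 m = {{ {1, 0, dx}, {0, 1, dy}, {0, 0, 1} }};
    return m;
}
```

With column vectors, `multiply(translate(2, 0), rotate90())` rotates first and then translates, so the point (1, 0) ends up at (2, 1). Swapping the two factors translates first and sends the same point to (0, 3). A candidate who can explain that difference probably understands coordinate system changes.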
Obviously, if you expect them to perform any real robotics-specific work you should check for that knowledge as well. E.g. estimation (understanding simple EKF and particle filters is a requirement in my book), control theory, pattern recognition, machine learning, vision, or whatever is useful for the particular task.
If I were hiring someone for theoretical work I would perhaps loosen up on the CS/programming skills and focus more on math knowledge. Someone with solid math skills will pick up the CS easily, and programming is just programming.
Ask for references or to see some of the prior work. Many great students already have some interesting project to show after graduation.
I'm not sure how common this is at universities in general, but I would look for a games programming (or robotics) course on the transcript where the candidate, as a student, succeeded in completing a project with a team. I would ask the candidate to describe how that project worked (important technical details) and the role he had in the team. The only way to really tell if someone is good at something is to see what happens when they try it, and since you're in academia, recruiting students, that should not be a problem.

Any advice for a developer given the task of enhancing & refactoring a business critical application?

Recently I inherited a business critical project at work to "enhance". The code has been worked on and passed through many hands over the past five years. Consultants and full-time employees who are no longer with the company have butchered this very delicate and overly sensitive application. Most of us have to deal with legacy code or this type of project... it's part of being a developer... but...
There are zero unit tests and zero system tests. Logic is inter-mingled (and sometimes duplicated for no reason) between stored procedures, views (yes, I said views) and code. Documentation? Yeah, right.
I am scared. Yes, very scared to make even the most minimal "tweak" or refactoring. One little mishap, and there would be major income loss and potential legal issues for my employer.
So, any advice? My first thought would be to begin writing assertions/unit tests against the existing code. However, that can only go so far because there is a lot of logic embedded in stored procedures. (I know it's possible to test stored procedures, but historically it's been much more difficult than unit testing source code logic).
Another or additional approach would be to compare the database state before and after the application has performed a function, make some code changes, then repeat the database state comparison and check that the results match.
I just rewrote thousands of lines of the most complex subsystem of an enterprise filesystem to make it multi-threaded, so all of this comes from experience. If the rewrite is justified (it is if the rewrite is being done to significantly enhance capabilities, or if existing code is coming in the way of putting in more enhancements), then here are the pointers:
You need to be confident in your own abilities first of all to do this. That comes only if you have enough prior experience with the technologies involved.
Communicate, communicate, communicate. Let all involved stake-holders know, this is a mess, this is risky, this cannot be done in a hurry, this will need to be done piece-meal - attack one area at a time.
Understand the system inside out. Document every nuance, trick and hack. Document the overall design. Ask any old-timers about historical reasons for the existence of any code you cannot justify. These are the mines you don't want to step on - you might think those are useless pieces of code and then regret later after getting rid of them.
Unit test. Run the system through any test suite that already exists; if there is none, first write tests for the existing code.
Spew debugging code all over the place during the rewrite - asserts, logging, console prints (you should have the ability to turn them on and off, as well as specify different levels of output, i.e. control verbosity). This is a must in my experience, and helps tremendously during a rewrite.
When going through the code, make a list of all things that need to be done - things you need to find out, things you need to write tests for, things you need to ask questions about, notes to remind you how to refactor some piece of code, anything that can affect your rewrite... you cannot afford to forget anything! I do this using Outlook Tasks (just make sure whatever you use is always in front of you - this is the first app I open as soon as I sit down at the desk). If I get interrupted, I write down anything that I have been thinking about and hints about where to continue after coming back to the task.
Try avoiding hacks in your rewrite (that's one of the reasons you are rewriting it). Think about tough problems you encounter. Discuss them with other people and bounce your ideas off them (nothing beats this), and put in clean solutions. Look at all the tasks you put into the todo list - make a 10,000-foot picture of the existing design, then decide what the rewrite would look like (in terms of modules, sub-modules, how they fit together etc.).
Tackle the toughest problems before any other. That'll save you from running into problems you cannot solve near the end of the tunnel, and save you from taking any steps backward. Of course, you need to know what the toughest problems will be - so again, better document everything first during your forays into existing code.
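The pointer above about debugging output that can be switched on and off with different verbosity levels could look roughly like this. It is only a sketch: the `DebugLog` class and the `LogLevel` names are invented for illustration, and a real rewrite might prefer macros (so disabled calls compile away) or an existing logging library.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Illustrative verbosity levels; higher means chattier.
enum class LogLevel { Off = 0, Error = 1, Info = 2, Trace = 3 };

class DebugLog {
public:
    explicit DebugLog(std::ostream& sink, LogLevel level = LogLevel::Off)
        : sink_(sink), level_(level) {}

    // Verbosity can be changed at runtime, e.g. from a config flag.
    void set_level(LogLevel level) { level_ = level; }

    // Writes the message only if its level is within the current verbosity.
    void write(LogLevel level, const std::string& message) {
        if (level == LogLevel::Off || level > level_) return;
        sink_ << "[" << static_cast<int>(level) << "] " << message << '\n';
    }

private:
    std::ostream& sink_;
    LogLevel level_;
};
```

Constructing it as `DebugLog log(std::cerr, LogLevel::Info);` would print errors and info messages to the console while dropping trace output; setting the level to `LogLevel::Off` silences everything without touching the call sites.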
Get a very firm list of requirements.
Make sure you have implicit requirements as well as explicit ones - i.e. what programs it has to work with, and how.
Write all scenarios and use cases for how it is currently being used.
Write a lot of unit tests.
Write a lot of integration tests to test the integration of the program with existing programs it has to work with.
Talk to everyone who uses the program to find out more implicit requirements.
Test, test, test changes before moving into production.
CYA :)
Two things, beyond #Sudhanshu's great list (and, to some extent, disagreeing with his #8):
First, be aware that untested code is buggy code - what you are starting with almost certainly does not work correctly, for any definition of "correct" other than "works just like the unmodified code". That is, be prepared to find unexpected behavior in the system, to ask experts in the system about that behavior, and for them to conclude that it's not working the way it should. Prepare them for it too - warn them that without tests or other documentation, there's no reason to think it works the way they think it's working.
Next: refactor the low-hanging fruit. Take it easy, take it slow, take it very carefully. Notice something easy in the code - duplication, say - and test the hell out of whatever methods contain the duplication, then eliminate it. Lather, rinse, repeat. Don't write tests for everything before making changes, but write tests for whatever you're changing. This way, it stays releasable at every stage and you are continuously adding value, continuously improving the code base.
I said "two things", but I guess I'll add a third: Manage expectations. Let your customer know how scared you are of this task; let them know how bad what they've got is. Let them know how slow progress will be, and let them know you'll keep them informed of that progress (and, of course, do it). Your customer may think s/he's asking for "just a little fix" - and the functionality may indeed change only a little - but that doesn't mean it's not going to be a lot of work and a lot of time. You understand that; your customer needs to, too.
I've had this problem before and I've asked around (before the days of stack overflow) and this book has always been recommended to me. http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052
Ask yourself this: what are you trying to achieve? What is your mission? How much time do you have? What is the measurement for success? What risks are there? How do you mitigate and deal with them?
Don't touch anything unless you know what it is you're trying to achieve.
The code might be "bad" but what does that mean? The code works, right? So if you rewrite the code so it does the same thing, you'll have spent a lot of time rewriting something, introducing bugs along the way, just so the code does the same thing? To what end?
The simplest thing you can do is document what the system does. And I don't mean write mind-numbing Word documents no one will ever read. I mean writing tests on key functionality, refactoring the code if necessary to allow such tests to be written.
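One way to read "tests as documentation" is a characterization test: record what the code does today and fail when that changes. A minimal sketch follows; `legacy_shipping_fee` is a made-up stand-in for some undocumented routine, and the recorded values are whatever the current code returns, not what a spec says it should return.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an undocumented legacy routine (amounts in cents).
int legacy_shipping_fee(int order_total_cents) {
    if (order_total_cents <= 0) return 0;
    if (order_total_cents < 5000) return 499;
    return order_total_cents > 20000 ? 0 : 299;
}

struct RecordedCase {
    int input;
    int observed_output;  // captured by running the current code, not derived from a spec
};

// Returns how many recorded cases no longer match the current behavior.
int count_behavior_changes(const std::vector<RecordedCase>& cases) {
    int changes = 0;
    for (const RecordedCase& c : cases)
        if (legacy_shipping_fee(c.input) != c.observed_output) ++changes;
    return changes;
}
```

The recorded cases double as documentation of the boundaries (here 0, 5000 and 20000), and any refactoring that makes `count_behavior_changes` return something other than zero needs either a fix or a conscious decision that the old behavior was a bug.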
You said you are scared to touch the code because of legal issues, income loss and zero documentation. So do you understand the code? The first thing you should do is document it and make sure you understand it before you even think about refactoring. Once you have done that and identified the problem areas make a list of your refactoring proposals in the order of maximum benefit with minimum changes and attack it incrementally. Refactoring makes additional sense if: the expected lifespan of the code will be long, new features will be added, bug fixes are numerous. As for testing the database state - I worked on a project recently where that is exactly what we did with success.
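The database state comparison can be sketched without any database at all, assuming you can dump the relevant tables into key/value form. Everything here is hypothetical: `Snapshot`, `SnapshotDiff` and `diff_snapshots` are illustrative names, and how the rows get exported and serialized is left out on purpose.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Assumed dump format: "table:primary_key" -> serialized row contents.
using Snapshot = std::map<std::string, std::string>;

struct SnapshotDiff {
    std::vector<std::string> added;
    std::vector<std::string> removed;
    std::vector<std::string> changed;

    bool empty() const {
        return added.empty() && removed.empty() && changed.empty();
    }
};

// Lists the keys that appear, disappear, or change between two dumps.
SnapshotDiff diff_snapshots(const Snapshot& before, const Snapshot& after) {
    SnapshotDiff diff;
    for (const auto& row : before) {
        auto it = after.find(row.first);
        if (it == after.end()) {
            diff.removed.push_back(row.first);
        } else if (it->second != row.second) {
            diff.changed.push_back(row.first);
        }
    }
    for (const auto& row : after) {
        if (before.find(row.first) == before.end()) diff.added.push_back(row.first);
    }
    return diff;
}
```

In practice you would run the same operation against the same starting data with the old and the new code, dump the resulting state each time, and expect `diff_snapshots(old_result, new_result).empty()` to hold. Columns such as timestamps or generated ids usually need to be excluded or normalized first.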
Is it possible to get a separation of the DB and non-DB parts, so that a DBA can take on the challenge of the stored procedures and databases themselves freeing you up to work on the other parts of the system? This also presumes that there is a DBA who can step up and take that part of the application.
If that isn't possible, then I'd suggest seeing how big the codebase is and whether it is possible to get some assistance so it isn't all on you. While this could be seen as side-stepping responsibility, the point is that things usually shouldn't be in just one person's hands, as people can disappear at times.
Good luck!

Where to get peer review of code and how to get my code attention?

I'm just now learning to program at age 17. It's hard for me to talk to other programmers as I'm just out of high school (which means I can't take programming courses). I know that I write terrible code, and not like Jeff Atwood terrible code, my code actually sucks. So where can I post some of my code and get real programmers to review it? I know if I had a question I could ask it on StackOverflow, but I want to post a whole class and get a review on it.
The real problem here is that I'm not going to be writing the next great piece of Software. I'm going to be writing a really useless class, which will serve no other purpose than to teach me how to program. This code will never be used, ever! EVER! How can I get an advanced (or even intermediate) programmer to look at my code?
Thanks in advance! ;-)
Look to the open source community. There are plenty of existing and new projects that would love an eager (if inexperienced) developer to offer support.
Going this route offers two advantages:
You get to see great code in action and learn from it
Any changes you submit will be reviewed by an experienced developer and they will often give you excellent suggestions as to how to improve your code before it will be accepted
Start by choosing a project in your language (there are a bunch in c++) and check out the code. You don't need to understand it all, but you must be able to understand at least a portion of it.
If the project looks way too complicated, keep looking. Younger projects tend to have less code that you need to learn.
If you can't get great programmers to look at your code, do the next best thing: look at theirs!
Look for a bunch of code snippets that do the same (simple) thing. Before you look at them too closely, write your own code to perform the same task. Compare all of the snippets with your own (and each other!) and try to figure out the reasons for the differences.
I recommend looking for code from well established projects. Code from tutorials often ignores important details for the sake of simplicity.
Why don't you try RefactorMyCode?
I would try not to write useless code, but attempt to solve some particular problem. Your learning will be more advanced if you are learning in the context of a real-world scenario. It doesn't have to be a big business domain; could even be a game or a shareware utility.
As for getting your code reviewed, the open source community is a good way to go as The Lame Duck says - in fact you're guaranteed it gets some form of review if you actually contribute to a project. Other avenues to explore: your local C++ users' group, checking out a co-op program available through a junior college, or engaging someone in a company that sponsors interns.
I haven't tried sites such as RefactorMyCode as suggested by Gilad Naor, but that seems promising. And, yes, StackOverflow is a good place for bite-sized chunks of code. If you do that, explain what you are trying to do, and why you are trying to do it that way, and ask if there's a better approach. Good luck!
I think the best way to learn is the way I learned (I may be biased): trial and error. I just wrote programs all the time, teaching myself as I went. I'd write terrible code, and I would wrestle with making it do what I wanted. Often it would make me give up on that particular project. But on the next project, I'd take a different approach, and it would work better. Repeat ad nauseam. Once you know where the rough spots are in your designs, you'll be able to ask specific questions on places like SO, or, better yet IMHO, come up with better designs yourself. I independently invented all the major design patterns just through frustration at the solutions I'd created in the past. I think this gives me a valuable perspective, since for most people design patterns are just a "best practice", but I know the pain that comes with using other designs, and I can see signs of bad designs in code very easily (it takes one to know one). This last skill is one that I often see lacking in other programmers... they can't see why their design is deficient and they should use something else.
You could always try a site like Project Euler, where there are a whole load of problems that will test your skills and a whole bunch of solutions to those problems, submitted by others. Project Euler tends to focus on algorithms rather than higher level programming constructs, but I imagine that there are others in a similar vein.
Do something fun and don't worry too much about code style yet. I started out with BASIC on Commodore 64 without even realizing that there was such a thing as clean code vs dirty code. If I had worried a lot about that then, it might have hindered me from progressing. You always learn best when doing it playfully.
Maybe a bit late, but since StackExchange now has Code Review, it's worth an answer:
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.
Here is the link: Code Review Stack Exchange

Improving and publishing an application. Need some advice

Last term (August - December 2008) some classmates and I wrote an application in C++. Nothing spectacular, it is an ORM for Sqlite3. We implemented some stuff like reflection to make it work and spare the end user the ugly stuff. Personally, I think we did a nice job, and that our ORM could actually be useful for someone (even though it's written specifically for Sqlite3, it's easily adaptable to other databases).
Consequently, I've come to the conclusion that it should be published somewhere (sourceforge most likely) as an open source project. But, as it was a term project, there are some things that need to be addressed before doing that. Namely, it has some memory leaks that should be fixed, and some parts of the code could be refactored to make everyone's life easier in the future.
I would like to know the opinion of more experienced C++ programmers on some issues:
Is it worth rewriting some parts to apply new technologies (for example, boost)?
Should our ORM be adapted to the latest C++ standard? Is there any benefit in doing this?
How will we know when our code is ready for release?
What are the chances that this ORM will be forgotten into the mists of the internet? (i.e. is it worth publishing it beyond personal pride as a programmer?)
Right now I can't think of many more questions, but I would like to read about similar experiences.
EDIT: I should probably translate my code + comments to English, right? (self question)
Thanks in advance.
I guess I am "more experienced" with regard to your particular question. I co-developed an open source web application language & template system a lot like ColdFusion back in the early days of web design before Java or ASP were around. You can still see it at http://www.steelblue.com/ if you are interested. It's still used at the company I was at when it was developed, but I don't think anywhere else.
What I found is that unless you are already well connected and people are watching what you are doing, getting people to use your open source code is just about as hard as selling someone your closed source program. You really need to advocate for your project and it should have some kind of unique selling proposition that distinguishes it from the competition.
So, that's the unsolicited advice. Here are some specific answers to the questions you had...all purely my opinion, of course.
I wouldn't rewrite any code unless you have a feature you want to put in. That feature might be compatibility with specific platforms or compilers. It might be to support a new db datatype or smarter indices or whatever. If you are going to put some more serious work into the application, think about a roadmap of what you can realistically accomplish in the next iteration and what choices will make the app the "most better" at the end of your cycle.
Release the code as soon as it is usable for a specific purpose, any purpose. Two reasons. First off, there might be someone who wants it for that purpose right now. If it's not available, they will use something else. Also, if it's open source, they might contribute back to the project. Second, the sooner you find out how much people want to use the code, the better. Either it will be more popular than you expect and you can get excited about continuing the development....or....you will find that no one is even visiting your web page to see what you've got. In either case, better to know sooner than later what people really want from your project so you can take that into account when planning new releases.
About the "forgotten into the mists." I think most projects are. I don't want to be a downer, but looking at Wikipedia, there were 5 C++ ORM tools popular enough to get a mention and they were all open source. As I said above, unless you can sell your idea to people, they are going to go with another proven open source solution. For someone to choose you over them, three things have to happen: 1. They need a feature you have that the others don't. 2. They find your project web site and it demonstrates the superiority of your code. 3. They trust your code enough to give it a shot.
On the other hand, if you are in this for the long haul and want to continue development, things get easier over time. Eventually the project will get all the basics covered and you can start developing those new features that aren't in the other solutions. Also, the longer you are in active development the more trustworthy the project will seem. Finally, you will get more experience in the niche. 2 years from now you will be better positioned to say where your effort will have the most impact on bettering the project.
A final thought: If you are enjoying it, learning from it, and it's not getting in the way of you keeping food on the table, it's a good use of your time.
Good luck!
-Al
Regarding the open source part:
If you really want to make it an open source project, you really should publish it regardless of its current state - fully working and debugged - or half working and full of memory leaks.
Just, if its state is bad, make sure to document it, and give it a suitable version number (less than one?). Then others may view your code, suggest improvements, join your team, etc...
My--rather random--thoughts on the matter (in the order I think is most important):
How will we know when our code is ready for release?
Like Liran Orevi said: if you're going open source, release early. Document it reasonably well, and take the time to provide a road map of planned or hoped for future improvements (these are an invitation for people to help you, so note which ones have no one working on them).
Is it worth rewriting some parts to apply new technologies (for example, boost).
Should our ORM be adapted to latest C++ standard? Is there any benefit in doing this?
SQLite relies on a fairly limited base. Maybe you don't want your tool to demand a much heavier environment. If the code is not currently a tangled and unmaintainable mess, you might want to avoid boost and the newest frills. Once you have a stable release (1.0 at least) you can start thinking about the improvements that can be made for version 2.
What are the chances that this ORM will be forgotten into the mists of the internet? (i.e is it worth publishing it beyond personal pride as a programmer?)
Most things end up in the big /dev/null in the sky, and there is only one way to find out... If it goes anywhere at all, you win. If it doesn't it was a modest investment, and maybe you learned something while you were at it.