Related
I face a situation where we have many very long methods, 1000 lines or more.
To give you some more detail, we have a list of incoming high level commands, and each generates results in a longer (sometime huge) list of lower level commands. There's a factory creating an instance of a class for each incoming command. Each class has a process method, where all the lower level commands are generated added in sequence. As I said, these sequences of commands and their parameters cause quite often the process methods to reach thousands of lines.
There are a lot of repetitions. Many command patterns are shared between different commands, but the code is repeated over and over. That leads me to think refactoring would be a very good idea.
On the contrary, the specs we have come exactly in the same form as the current code. Very long list of commands for each incoming one. When I've tried some refactoring, I've started to feel uncomfortable with the specs. I miss the obvious analogy between the specs and code, and lose time digging into newly created common classes.
Then here the question: in general, do you think such very long methods would always need refactoring, or in a similar case it would be acceptable?
(unfortunately refactoring the specs is not an option)
edit:
I have removed every reference to "generate" cause it was actually confusing. It's not auto generated code.
class InCmd001 {
OutMsg process ( InMsg& inMsg ) {
OutMsg outMsg = OutMsg::Create();
OutCmd001 outCmd001 = OutCmd001::Create();
outCmd001.SetA( param.getA() );
outCmd001.SetB( inMsg.getB() );
outMsg.addCmd( outCmd001 );
OutCmd016 outCmd016 = OutCmd016::Create();
outCmd016.SetF( param.getF() );
outMsg.addCmd( outCmd016 );
OutCmd007 outCmd007 = OutCmd007::Create();
outCmd007.SetR( inMsg.getR() );
outMsg.addCmd( outCmd007 );
// ......
return outMsg;
}
}
here the example of one incoming command class (manually written in pseudo c++)
Code never needs refactoring. The code either works, or it doesn't. And if it works, the code doesn't need anything.
The need for refactoring comes from you, the programmer. The person reading, writing, maintaining and extending the code.
If you have trouble understanding the code, it needs to be refactored. If you would be more productive by cleaning up and refactoring the code, it needs to be refactored.
In general, I'd say it's a good idea for your own sake to refactor 1000+ line functions. But you're not doing it because the code needs it. You're doing it because that makes it easier for you to understand the code, test its correctness, and add new functionality.
On the other hand, if the code is automatically generated by another tool, you'll never need to read it or edit it. So what'd be the point in refactoring it?
I understand exactly where you're coming from, and can see exactly why you've structured your code the way it is, but it needs to change.
The uncertainty you feel when you attempt to refactor can be ameliorated by writing unit tests. If you've tests specific to each spec, then the code for each spec can be refactored until you're blue in the face, and you can have confidence in it.
A second option, is it possible to automatically generate your code from a data structure?
If you've a core suite of classes that do the donkey work and edge cases, you can auto-generate the repetitive 1000 line methods as often as you wish.
However, there are exceptions to every rule.
If the methods are a literal interpretation of the spec (very little additional logic), and the specs change infrequently, and the "common" portions (i.e. bits that happen to be the same right now) of the specs change at different times, and you're not going to be asked to get a 10x performance gain out of the code anytime soon, then (and only then) . . . you may be better off with what you have.
. . . but on the whole, refactor.
Yes, always. 1000 lines is at least 10x longer than any function should ever be, and I'm tempted to say 100x, except that when dealing with input parsing and validation it can become natural to write functions with 20 or so lines.
Edit: Just re-read your question and I'm not clear on one point - are you talking about machine generated code that no-one has to touch? In which case I would leave things as they are.
Refectoring is not the same as writing from scratch. While you should never write code like this, before you refactor it, you need to consider the costs of refactoring in terms of time spent, the associated risks in terms of breaking code that already works, and the net benefits in terms of future time saved. Refactor only if the net benefits outweigh the associated costs and risks.
Sometimes wrapping and rewriting can be a safer and more cost effective solution, even if it appears expensive at first glance.
Long methods need refactoring if they are maintained (and thus need to be understood) by humans.
As a rule of thumb, code for humans first. I don't agree with the common idea that functions need to be short. I think what you need to aim at is when a human reads your code they grok it quickly.
To this effect it's a good idea to simplify things as much as possible--but not more than that. It's a good idea to delegate roughly one task for each function. There is no rule as for what "roughly one task" means: you'll have to use your own judgement for that. But do recognize that a function split into too many other functions itself reduces readability. Think about the human being who reads your function for the first time: they would have to follow one function call after another, constantly context-switching and maintaining a stack in their mind. This is a task for machines, not for humans.
Find the balance.
Here, you see how important naming things is. You will see it is not that easy to choose names for variables and functions, it takes time, but on the other hand it can save a lot of confusion on the human reader's side. Again, find the balance between saving your time and the time of the friendly humans who will follow you.
As for repetition, it's a bad idea. It's something that needs to be fixed, just like a memory leak. It's a ticking bomb.
As others have said before me, changing code can be expensive. You need to do the thinking as for whether it will pay off to spend all this time and effort, facing the risks of change, for a better code. You will possibly lose lots of time and make yourself one headache after another now, in order to possibly save lots of time and headache later.
Take a look at the related question How many lines of code is too many?. There are quite a few tidbits of wisdom throughout the answers there.
To repost a quote (although I'll attempt to comment on it a little more here)... A while back, I read this passage from Ovid's journal:
I recently wrote some code for
Class::Sniff which would detect "long
methods" and report them as a code
smell. I even wrote a blog post about
how I did this (quelle surprise, eh?).
That's when Ben Tilly asked an
embarrassingly obvious question: how
do I know that long methods are a code
smell?
I threw out the usual justifications,
but he wouldn't let up. He wanted
information and he cited the excellent
book Code Complete as a
counter-argument. I got down my copy
of this book and started reading "How
Long Should A Routine Be" (page 175,
second edition). The author, Steve
McConnell, argues that routines should
not be longer than 200 lines. Holy
crud! That's waaaaaay to long. If a
routine is longer than about 20 or 30
lines, I reckon it's time to break it
up.
Regrettably, McConnell has the cheek
to cite six separate studies, all of
which found that longer routines were
not only not correlated with a greater
defect rate, but were also often
cheaper to develop and easier to
comprehend. As a result, the latest
version of Class::Sniff on github now
documents that longer routines may not
be a code smell after all. Ben was
right. I was wrong.
(The rest of the post, on TDD, is worth reading as well.)
Coming from the "shorter methods are better" camp, this gave me a lot to think about.
Previously my large methods were generally limited to "I need inlining here, and the compiler is being uncooperative", or "for one reason or another the giant switch block really does run faster than the dispatch table", or "this stuff is only called exactly in sequence and I really really don't want function call overhead here". All relatively rare cases.
In your situation, though, I'd have a large bias toward not touching things: refactoring carries some inherent risk, and it may currently outweigh the reward. (Disclaimer: I'm slightly paranoid; I'm usually the guy who ends up fixing the crashes.)
Consider spending your efforts on tests, asserts, or documentation that can strengthen the existing code and tilt the risk/reward scale before any attempt to refactor: invariant checks, bound function analysis, and pre/postcondition tests; any other useful concepts from DBC; maybe even a parallel implementation in another language (maybe something message oriented like Erlang would give you a better perspective, given your code sample) or even some sort of formal logical representation of the spec you're trying to follow if you have some time to burn.
Any of these kinds of efforts generally have a few results, even if you don't get to refactor the code: you learn something, you increase your (and your organization's) understanding of and ability to use the code and specifications, you might find a few holes that really do need to be filled now, and you become more confident in your ability to make a change with less chance of disastrous consequences.
As you gain a better understanding of the problem domain, you may find that there are different ways to refactor you hadn't thought of previously.
This isn't to say "thou shalt have a full-coverage test suite, and DBC asserts, and a formal logical spec". It's just that you are in a typically imperfect situation, and diversifying a bit -- looking for novel ways to approach the problems you find (maintainability? fuzzy spec? ease of learning the system?) -- may give you a small bit of forward progress and some increased confidence, after which you can take larger steps.
So think less from the "too many lines is a problem" perspective and more from the "this might be a code smell, what problems is it going to cause for us, and is there anything easy and/or rewarding we can do about it?"
Leaving it cooking on the backburner for a bit -- coming back and revisiting it as time and coincidence allows (e.g. "I'm working near the code today, maybe I'll wander over and see if I can't document the assumptions a bit better...") may produce good results. Then again, getting royally ticked off and deciding something must be done about the situation is also effective.
Have I managed to be wishy-washy enough here? My point, I think, is that the code smells, the patterns/antipatterns, the best practices, etc -- they're there to serve you. Experiment to get used to them, and then take what makes sense for your current situation, and leave the rest.
I think you first need to "refactor" the specs. If there are repetitions in the spec it also will become easier to read, if it makes use of some "basic building blocks".
Edit: As long as you cannot refactor the specs, I wouldn't change the code.
Coding style guides are all made for easier code maintenance, but in your special case the ease of maintenance is achieved by following the spec.
Some people here asked if the code is generated. In my opinion it does not matter: If the code follows the spec "line by line" it makes no difference if the code is generated or hand-written.
1000 thousand lines of code is nothing. We have functions that are 6 to 12 thousand lines long. Of course those functions are so big, that literally things get lost in there, and no tool can help us even look at high level abstractions of them. the code is now unfortunately incomprehensible.
My opinion of functions that are that big, is that they were not written by brilliant programmers but by incompetent hacks who shouldn't be left anywhere near a computer - but should be fired and left flipping burgers at McDonald's. Such code wreaks havok by leaving behind features that cannot be added to or improved upon. (too bad for the customer). The code is so brittle that it cannot be modified by anyone - even the original authors.
And yes, those methods should be refactored, or thrown away.
Do you ever have to read or maintain the generated code?
If yes, then I'd think some refactoring might be in order.
If no, then the higher-level language is really the language you're working with -- the C++ is just an intermediate representation on the way to the compiler -- and refactoring might not be necessary.
Looks to me that you've implemented a separate language within your application - have you considered going that way?
It has been my understanding that it's recommended that any method over 100 lines of code be refactored.
I think some rules may be a little different in his era when code is most commonly viewed in an IDE. If the code does not contain exploitable repetition, such that there are 1,000 lines which are going to be referenced once each, and which share a significant number of variables in a clear fashion, dividing the code into 100-line routines each of which is called once may not be that much of an improvement over having a well-formatted 1,000-line module which includes #region tags or the equivalent to allow outline-style viewing.
My philosophy is that certain layouts of code generally imply certain things. To my mind, when a piece of code is placed into its own routine, that suggests that the code will be usable in more than one context (exception: callback handlers and the like in languages which don't support anonymous methods). If code segment #1 leaves an object in an obscure state which is only usable by code segment #2, and code segment #2 is only usable on a data object which is left in the state created by #1, then absent some compelling reason to put the segments in different routines, they should appear in the same routine. If a program puts objects through a chain of obscure states extending for many hundreds of lines of code, it might be good to rework the design of the code to subdivide the operation into smaller pieces which have more "natural" pre- and post- conditions, but absent some compelling reason to do so, I would not favor splitting up the code without changing the design.
For further reading, I highly recommend the long, insightful, entertaining, and sometimes bitter discussion of this topic over on the Portland Pattern Repository.
I've seen cases where it is not the case (for example, creating an Excel spreadsheet in .Net often requires a lot of line of code for the formating of the sheet), but most of the time, the best thing would be to indeed refactor it.
I personally try to make a function small enough so it all appears on my screen (without affecting the readability of course).
1000 lines? Definitely they need to be refactored. Also not that, for example, default maximum number of executable statements is 30 in Checkstyle, well-known coding standard checker.
If you refactor, when you refactor, add some comments to explain what the heck it's doing.
If it had comments, it would be much less likely a candidate for refactoring, because it would already be easier to read and follow for someone starting from scratch.
Then here the question: in general, do
you think such very long methods would
always need refactoring,
if you ask in general, we will say Yes .
or in a
similar case it would be acceptable?
(unfortunately refactoring the specs
is not an option)
Sometimes are acceptable, but is very unusual, I will give you a pair of examples:
There are some 8 bit microcontrollers called Microchip PIC, that have only a fixed 8 level stack, so you can't nest more than 8 calls, then care must be taken to avoid "stack overflow", so in this special case having many small function (nested) is not the best way to go.
Other example is when doing optimization of code (at very low level) so you have to take account the jump and context saving cost. Use it with care.
EDIT:
Even in generated code, you could need to refactorize the way its generated, for example for memory saving, energy saving, generate human readable, beauty, who knows, etc..
There has been very good general advise, here a practical recommendation for your sample:
common patterns can be isolated in plain feeder methods:
void AddSimpleTransform(OutMsg & msg, InMsg const & inMsg,
int rotateBy, int foldBy, int gonkBy = 0)
{
// create & add up to three messages
}
You might even improve that by making this a member of OutMsg, and using a fluent interface, such that you can write
OutMsg msg;
msg.AddSimpleTransform(inMsg, 12, 17)
.Staple("print")
.AddArtificialRust(0.02);
which can be an additional improvement under circumstances.
I'm a student who's learning C++ at school now. We are using Dev-C++ to make little, short exercises. Sometimes I find it hard to know where I made a mistake or what's really happing in the program. Our teacher taught us to make drawings. They can be useful when working with Linked Lists and Pointers but sometimes my drawing itself is wrong.
(example of a drawing that visualizes a linked list: nl.wikibooks.org/wiki/Bestand:GelinkteLijst.png )
Is there any software that could interpret my C++ code/program and visualize it (making the drawings for me)?
I found this: link text
other links:
cs.ru.ac.za/research/g05v0090/images/screen1.png and
cs.ru.ac.za/research/g05v0090/index.html
That looks like what I need but is not available for any download. I tried to contact that person but got no answer.
Does anybody know such software? Could be useful for other students also I guess...
Kind regards,
juFo
This is unrelated to the actual title but I'd like to make a simple suggestion concerning how to understand what's happening in the program.
I don't know if you've looked at a debugger but it's a great tool that can definitely vastly improve your understanding of what's going on. Depending on your IDE, it'll have more or less features, some of them should include:
seeing the current call stack (allows you to understand what function is calling what)
seeing the current accessible variables along with their values
allowing you to walk step by step and see how each value changes
and many, many more.
So I'd advise you to spend some time learning all about the particular debugger for your IDE, and start to use all of these features. There's sometimes a lot more stuff then simply clicking on Next. Some things may include dynamic code evaluation, going back in time, etc.
Have a look at DDD. It is a graphical front-end for debuggers.
Try debuggers in general to understand what your program is doing, they can walk you through your code step-by-step.
Doxygen has, if I recall, a basic form of this but it's really only a minor feature of a much bigger library, so that may be overkill for what you want. (Though it's a great program for documentation!)
Reverse engineering the code to some sort of diagram, will have limited benefit IMO. A better approach to understanding program flow is to step the code in the debugger. If you don't yet use a debugger, you should; it is the more appropriate tool for this particular problem.
Reverse engineering code to diagrams is useful when reusing or maintaining undocumented or poorly documented legacy code, but it seldom exposes the design intent of the code, since it lacks the abstraction that you would use if you were designing the code. You should not have to resort to such things on new code you have just written yourself! Moreover, tools that do this even moderately well are expensive.
Should you be thinking you can avoid design, and just hand in an automatically generated diagram, don't. It will be more than obvious that it is an automatically generated diagram!
I am currently working on a large project, and I spend most of the time debugging. While debugging is a normal process, there are bugs, that are unstable, and these bugs are the greatest pain for the developer. The program does not work, well, sometimes... Sometimes it does, and there is nothing you can do about it.
What can be done about these bugs? Most common debugging tools (interactive debuggers, watches, log messages) may lead you nowhere, because the bug will disappear ... just to appear once again, later. That is why I am asking for some heuristics: what are the most common reasons for such bugs? What suspicious code should we investigate to locate such a bugs?
Let me start the list:
using uninitialized variables.
Common misprints like mMember =
mMember;
thread synchronization.
Sometimes it can be a matter of
luck;
working with non-smart
pointers, dereferencing invalid
ones;
what else?
IME the underlying problem in many projects is that developers use low-level features of C++ like manual memory management, C-style string handling, etc. even though they are very rarely ever necessary (and then only well encapsulated in classes). This leads to memory corruption, invalid pointers, buffer overflows, resource leaks and whatnot. All the while nice and clean high-level constructs are available.
I was part of the team for a large (several MLoC) application for several years and the number of crashing bugs for different parts of the application nicely correlated to the programming style used within these parts. When asked why they wouldn't change their programming style some of the culprits answered that their style in general yields more performance. (Not only is this wrong, it's also a fact that customers rather have a more stable but slower program than a fast one that keeps crashing on them. Also, most of their code wasn't even required to be fast...)
As for multi-threading: I don't feel expert enough to offer solutions here, but I think Herb Sutter's Effective Concurrency columns are a very worthwhile read on the subject.
Edit to address the discussions in the comments:
I did not write that "C-style string handling is not more performant". (Certainly a lot of negation in this sentence, but since I feel misread, I try to be precise.) What I said is that high level constructs are not in general less performant: std::vector isn't in general slower than manually doing dynamically allocated C arrays, since it is a dynamically allocated C array. Of course, there are cases where something coded according to special requirements will perform better than any general solution -- but that doesn't necessarily mean you'll have to resort to manual memory management. This is why I wrote that, if such things are necessary, then only well-encapsulated in classes.
But what's even more important: in most code the difference doesn't matter. Whether a button depresses 0.01secs after someone clicked it or 0.05secs simply doesn't matter, so even a factor 5 speed gain is irrelevant in the button's code. Whether the code crashes, however, always matters.
To sum up my argument: First make it work correctly. This is best done using well-proven off-the-shelf building blocks. Then measure. Then improve performance where it matters, using well-proven off-the-shelf idioms.
I was actually going to post a question that asked exactly the opposite - do others find, as I do, that you spend almost no time using the debugger when working with C++? I honestly cannot remember the last time I used one - it must have been about six months ago.
Frankly, if you spend most of the time in the debugger, I think there is something very wrong with your basic coding practices.
Race conditions.
These are one of the few things that still sends a shiver down my spine when it comes up in debugging (or in the issue tracker). Inherently horrible to debug, and extremely easy to create. The three most common causes of bugs in my C++ software have been race conditions, reliance on uninitialised memory, and reliance on static constructor order.
And if you don't know what race conditions are, chances are they're the cause of your instability ;)
If you are really in a position where you already have bad code that breaks, the best plan is probably to throw as many tools at it as you can (OS/lib-level memory checking, automated testing, logging, core dumps, etc) to find the problem areas. Then rewrite the code to do something more deterministic. Most of the bugs come from people doing things that mostly work most of the time, but C++ offers stronger guarantees if you use the right tools and approaches.
Haven't seen this one mentioned yet:
Inheriting from a class that does not have a virtual destructor.
Reading from uncached memory while a cache line is being written back over the memory (This is a right bastard to find).
Buffer overwrites
Stack overflows!
The only 3 i can think of at the mo ... may edit later :)
buffer overflows
using pointers to deleted objects
returning invalid references or references to out of scope objects
unhandled exceptions
resource leaks (not only memory)
infinite recursion
dynamic libraries version mismatch
Not really a C++ issue but seen in a C/C++ project.
The trickiest issue I had to deal with was an initialization issue when starting up the OS on our platform that lead to unusual crashes. It took years before we found out what happened. Before that we ran the system overnight and if it didn't crash, then it was normally okay.
Luckily, the OS isn't sold anymore.
addresses and memory used before allocation or after deallocation, segmentation faults, arrayoutofbounds, offset, threadlocks, unintelligible operator overloading, inline assembly, void exit and void in general where return values are desired complicates where math.h functions are worth a look since all math.h functions both have working arguments and return values compared to other library overly void, emptiness tests, nils, nulls and voids. 4 general conventions I recommend are return values, arguments, ternary choices and invertible changes. Faultprone to avoid are vectors (use arrays instead) void with empty arguments and in my subjective opinion I avoid the switch statement in favor of more intelligible or readable if...elseif or more abstract "is".
C++ also has rather lousy forward compatibility compared to scripts and interpreted, to try a decade old Java it still runs unchanged and safe in later vm.
I don't mean external tools. I think of architectural patterns, language constructs, habits. I am mostly interested in C++
Automated Unit Testing .
There's an oft-unappreciated technique that I like to call The QA Team that can do wonders for weeding out bugs before they reach production.
It's been my experience (and is often quoted in textbooks) that programmers don't make the best testers, despite what they may think, because they tend to test to behaviour they already know to be true from their coding. On top of that, they're often not very good at putting themelves in the shoes of the end user (if it's that kind of app), and so are likely to neglect UI formatting/alignment/usability issues.
Yes, unit testing is immensely important and I'm sure others can give you better tips than I on that, but don't neglect your system/integration testing. :)
..and hey, it's a language independent technique!
Code Review, Unit Testing, and Continuous Integration may all help.
I find the following rather handy.
1) ASSERTs.
2) A debug logger that can output to the debug spew, console or file.
3) Memory tracking tools.
4) Unit testing.
5) Smart pointers.
Im sure there are tonnes of others but I can't think of them off the top of my head :)
RAII to avoid resource leakage errors.
Strive for simplicity and conciseness.
Never leave cases where your code behavior is undefined.
Look for opportunities to leverage the type system and have the compiler check as much as possible at compile time. Templates and code generation are your friends as long as you keep your common sense.
Minimize the number of singletons and global variables.
Use RAII !
Use assertions !
Automatic testing of some nominal and all corner cases.
Avoid last minute changes like the plague.
I use thinking.
Reducing variables scope to as narrow as possible. Less variables in outer scope - less chances to plant and hide an error.
I found that, the more is done and checked at compile time, the less can possibly go wrong at run-time. So I try to leverage techniques that allow stricter checking at compile-time. That's one of the reason I went into template-meta programming. If you do something wrong, it doesn't compile and thus never leaves your desk (and thus never arrives at the customer's).
I find many problems before i start testing at all using
asserts
Testing it with actual, realistic data from the start. And testing is necessary not only while writing the code, but it should start early in the design phase. Find out what your worst use cases will be like, and make sure your design can handle it. If your design feels good and elegant even against these use cases, it might actually be good.
Automated tests are great for making sure the code you write is correct. However, before you get to writing code, you have to make sure you're building the right things.
Learning functional programming helps somehow.
HERE
Learn you a haskell for great good.
Model-View-Controller, and in general anything with contracts and interfaces that can be unit-tested automatically.
I agree with many of the other answers here.
Specific to C++, the use of 'const' and avoiding raw pointers (in favor of references and smart pointers) when possible has helped me find errors at compile time.
Also, having a "no warnings" policy helps find errors.
Requirements.
From my experience, having full and complete requirements is the number one step in creating bug-free software. You can't write complete and correct software if you don't know what it's supposed to do. You can't write proper tests for software if you don't know what it's supposed to do; you'll miss a fair amount of stuff you should test. Also, the simple process of writing the requirements helps you to flesh them out. You find so many issues and problems before you ever write the first line of code.
I find peer progamming tends to help avoid a lot of the silly mistakes, and al ot of the time generates discussions which uncover flaws. Plus with someone free to think about the why you are doing something, it tends to make everything cleaner.
Code reviews; I've personally found lots of bugs in my colleagues' code and they have found bugs in mine.
Code reviews, early and often, will help you to both understand each others' code (which helps for maintenance), and spot bugs.
The sooner you spot a bug the easier it is to fix. So do them as soon as you can.
Of course pair programming takes this to an extreme.
Using an IDE like IntelliJ that inspects my code as I write it and flags dodgy code as I write it.
Unit Testing followed by Continious Integration.
Book suggestions: "Code Complete" and "Release it" are two must-read books on this topic.
In addition to the already mentioned things I believe that some features introduced with C++0x will help avoiding certain bugs. Features like strongly-typed enums, for-in loops and deleteing standard functions of objects come to mind.
In general strong typing is the way to go imho
Coding style consistency across a project.
Not just spaces vs. tab issues, but the way that code is used. There is always more than one way to do things. When the same thing gets done differently in different places, it makes catching common errors more difficult.
It's already been mentioned here, but I'll say it again because I believe this cannot be said enough:
Unnecessary complexity is the arch nemesis of good engineering.
Keep it simple. If things start looking complicated, stop and ask yourself why and what you can do to break the problem down into smaller, simpler chunks.
Hire someone that test/validate your software.
We have a guy that use our software before any of our customer. He finds bugs that our automated tests processes do not find, because he thinks as a customer not as a software developper. This guy also gives support to our customers, because he knows very well the software from the customer point of view. INVALUABLE.
all kinds of 'trace'.
Something not mentioned yet - when there's even semi-complex logic going on, name your variables and functions as accurately as you can (but not too long). This will make incongruencies in their interactions with each other, and with what they're supposed to be doing stand out better. The 'meaning', or language-parsing part of your brain will have more to grab on to. I find that with vaguely named things, your brain sort of glosses over what's really there and sees what is /supposed to/ be happening rather than what actually is.
Also, make code clean, it helps to keep your brain from getting fuzzy.
Test-driven development combined with pair programming seems to work quite well on keeping some bugs down. Getting the tests created early helps work out some of the design as well as giving some confidence should someone else have to work with the code.
Creating a string representation of class state, and printing those out to console.
Note that in some cases single line-string won't be enough, you will have to code small printing loop, that would create multi-line representation of class state.
Once you have "visualized" your program in such a way you can start to search errors in it. When you know which variable contained wrong value in the end, it's easy to place asserts everywhere where this variable is assigned or modified. This way you can pin point the exact place of error, and fix it without using the step-by-step debugging (which is rather slow way to find bugs imo).
Just yesterday found a really nasty bug without debugging a single line:
vector<string> vec;
vec.push_back("test1");
vec.push_back(vec[0]); // second element is not "test1" after this, it's empty string
I just kept placing assert-statements and restarting the program, until multi-line representation of program's state was correct.
I have to interview some C++ candidates over the next few weeks and as the most senior programmer in the company I'm expected to try and figure out whether these people know what they are doing.
So has anybody got any suggestions?
Personally I hate being left in a room to fill out some C++ questions so I'd rather do a more complex test that I can chat with the interviewee about their approaches and so forth as we go along. ie it doesn't matter whether they get the right answers or not its how they approach the problem that interests me. I don't care whether they understand obscure features of the language but I do care that they have a good solid understanding of pointers as well as understanding the under lying differences between pointers and references. I would also love to see how they approach optimisation of a given problem because solid fast code is a must, in my opinion.
So any suggestions along these lines would be greatly appreciated!
I wouldn't make them write code. Instead, I'd give them a couple of code snippets to review.
For example, the first would be about design by contract: See if they know what preconditions, postconditions and invariants are. Do a couple of small mistakes, such as never initializing an integer field but asserting that it is >= 0 in the invariant, and see if they spot them.
The second would be to give them bool contains(char * inString, char c). Implement it with a trivial loop. Then ask whether there are any mistakes. Of course, your code here does not check for null in the input parameter inString (even if the very previous question talked about preconditions!). Also, the loop finishes at character 0. Of course, the candidate should spot the possible problems and insist on using std::string instead of this char * crap. It's important because if they do complain, you'll know that they won't add their own char *'s to new code.
An alternative which addresses containers: give them a std::vector<int> and code which searches for prime numbers or counts the odd numbers or something. Make some small mistake. See if they find any issues and they understand the code. Ask in which situation a std::set would be better (when you are going to search elements quite systematically and original order of entrance doesn't matter.).
Discuss everything live, letting them think a couple minutes. Capture the essence of what they say. Don't focus on "coverage" (how many things they spot) because some people may be stressed. Listen to what they actually say, and see if it makes any sense.
I disagree with writing code in interviews. I'd never ask anyone to write code. I know my handwritten code would probably suck in a situation like that. Actually, I have seldom been asked to do so, but when I have, I haven't been hired.
This one is a great complex task, even though it is looking quite harmless.
I believe that a C++ programmer needs more than just generic programming skills, because...
In C++ it's harder to shoot yourself in the foot, but when you do, you blow off your whole leg.
Writing bug-free, maintainable C++ code places a much higher demand on a few areas than most languages.
One thing I'll call "pedanticness". You know how some people can spot spelling errors in something at a glance? A C++ programmer needs to be able to spot simple bugs while they read or write code (whether the code is their own or not). A programmer who relies on the "compile and test" technique just to get rid of simple bugs is incompatible with the C++ language, because those bugs don't always lead to immediate failure in C++.
C++ programmers also need a good knowledge of low-level stuff. Pointers, memory allocators, blocking, deadlocks. And "nitty gritty" C++ issues, like multiple inheritance and method hiding and such, particularly if they need to maintain other people's code.
Finally, C++ programmers need to be able to write code that's easy for other people to use. Can they design stuff well?
A good test for the first two areas is "Here's some C++ code I got off the internet. Find the bugs, and identify the unneccessary bits." (There's lots of really bad C++ code available on the internet, and often the programmer does unnecessary things due to a faulty understanding of how to be "safe" in C++.)
The last area you can test with more generic interview questions.
A few questions can allow you to know a lot about a candidate:
Differences between a pointer and a reference, when would you use each?
Why would you make a destructor virtual?
Where is the construction order of a class attributes defined?
Copy constructor and operator=. When would you implement them? When would you make them private?
When would you use smart pointers? what things would you take into account to decide which?
Where else have you seen RAII idiom?
When would you make a parameter const? when a method?
When would you make an attribute mutable?
What is virtual inheritance?
What is template specialization?
What are traits?
What are policies?
What is SFINAE?
What do you know about C++Ox standard?
What boost libraries have you used?
What C++ books have you read? (Sutter? Alexandrescu?)
Some short exercises (no more than 10 minutes) about STL containers, memory management, slicing, etc. would also be useful. I would allow him to do it in a computer with a ready environment. It's important to observe the agility.
Checkout Joel's Guerrilla guide to interviewing. Seems a lot like what you are looking for.
"Write a program that receives 3 integers in the range of 0..2^32-1, and validates if they represent valid edges of a triangle".
It seems to be a simple question. The input is considered valid if the sum of any two edges is greater than the third edge. However, there are some pitfalls, that a good programmer will handle:
The correct type to use should be unsigned long. Many "programmers" will fail here.
Zero values should be considered as non-valid.
Overflow should be avoided: "if (a+b <= c) return false" is problematic since a+b may cause an overflow.
if (a <= c-b) is also bad solution since c-b may be negative. Not a good thing for unsigned types.
if (c > b) { if (a <= c-b) return false; } else { if (a <= b-c) return false; } This looks much better, but it will not work correctly if (a >= b+c).
A good programmer must be detail oriented. This simple exercise will help checking if he is.
Depending on what your organisation's pre-screening is like, assume that the person knows nothing at all about C++ and has just put in on their CV because it makes them look supertechnical. Seriously. Start with something simple, like reversing a string. I have had candidates who couldn't even write a function prototype for this !!
Do not forget to also test for code bigotry. I know I don't want anyone working for or with me that isn't a flexible and consequently practical programmer both in their attitude to the programming language, but also in their approach to problem solving.
Denying any type of preconceptions
Understanding the value of the
exceptions in any Best Practices
Being capable of refusing long term
habits in favor of something else if
the need arises
These are characteristics dear to me. The manner of testing for these is not ideal if the interviews aren't lengthy or don't involve presenting code. But showing code snippets with purposely debatable techniques while offering a use case scenario and asking the candidate how they feel about the solution is one way.
This article offers some general ideas that are relevant regardless of what language you're working with.
Don't test only the C++ and overall technical skills! Those are of course important, but they are nothing if people don't listen, don't answer properly or don't follow the commitments they made.
Check at most for the ability to clearly communicate. If people cant tell you what roughly they did in their former jobs within a few minutes, they will also be unable to report about their work at your place etc.
In a recent company we invited people for interviews in groups of about 3 people together. They were surprised, but nobody was angry about that. It was very interesting, because people had to communicate not only with us, but also with others in the same position. In case we were interested further, we arranged a second interview.
You can choose potentially problematic task and see how they approach it. Ask them to write a smart pointer for example, you'll see if they understand pointers, references and templates in one step :) Usually they are stressed so they will do mistakes, those mistakes might help you find out how good they problem solving skills are, what paths would they use to fix a mistake and so on. The only problem with this approach is that sometimes interviewee just don't know anything about the task and you would have to quickly figure out something easier. If they do perfect code you can discuss their choices but when there's nothing to look at it is depressing for both of you.
Here is my answer to a similar question geared towards C#, but notice that my answer is language agnostic. My interview question is, in fact, in C. I rarely interview a person with the goal of finding out if they can program. I want to find out if they can think, problem solve, collaborate, communicate, understand something new, and so on. In the meantime, I circle around trying to see if they "get it" in terms of the big picture of software engineering. I use programming questions because that's a common basis and an easy ruse.
Get Codility.com to screen out non-programming programmers, this will get you a limited number of mostly reasoable candidates. Sit for an hour with each of them and try to build something together (a micro web server, a script for processing some of your data, a simple GUI). Pay attention to communication skills, i.e. how much effort does it take to understand the candidate. Ask the candidate for recommendation of books related to the subject (C++ software development in your case). Follow Guerilla Guide to Interviewing, i.e. answer yourself honestly, if the person is smart and gets things done. Good luck.
Check 10 C++ Interview Questions by Tests4Geeks.
It's an addition to their pre-interview C++ test and it has really usefull questions. Many people have been working on these interview questions so it's quite balanced and has no tricky or syntax questions.
Idea is quite simple - first you weed out incompetent candidates using the test, then you use article questions in real-life interview.
Whatever you do, pairing would be a good idea. Come up with a good program and pair with the guy and work towards solving the problem. IMHO, that a very good idea
So has anybody got any suggestions?
I'd recommend getting a copy of this:
http://www.amazon.co.uk/Programming-Interviews-Exposed-Secrets-Programmer/dp/047012167X/ref=sr_1_1?ie=UTF8&s=books&qid=1252499175&sr=8-1
ie it doesn't matter whether they get the right answers or not its how they approach the problem that interests me
You could ask the candidate to come up with a UML design to a common problem. If they show you a design pattern, then you can talk through the pros/cons of the pattern. You could then ask them to produce some code for one of the classes.
This would help you determine their technical knowledge level and their communication abilities.
I do care that they have a good solid understanding of pointers as well as understanding the under lying differences between pointers and references
Linked list problems are good for determining whether a candidate has a solid grasp of pointers.
As for references, you could show them some code that does not use references correctly, and ask them to describe the problem.
e.g show them a class definition that contains a reference member variable, and the implementation of the constructor with the reference initialization missing.
I would also love to see how they approach optimisation of a given problem because solid fast code is a must, in my opinion.
I'd start off simple...
Show them a code example that passes strings to a function by value. (the strings should not be modified in the function). You should check they correct the code to pass the strings by const reference.
After this, you could show a constructor that uses assignment instead of initialization (for objects). Ask them to improve it.
Lastly, ask them simple questions about data structure selection.
e.g. When they should use a list rather than a vector.
If you feel they have a grasp of the fundamentals you could either ask how they approach optimization problems (discuss profilers etc), or ask them to optimize something less obvious.
Take a look into this C++ test. They have a questions about differences between pointers and references as you require.
Here is full list of topics:
Fundamentals: References & Pointers, Const Correctness, Explicit
Standard Library
Class Design, Overloading
Virtual Functions, Polymorphism, Inheritance
Memory Management, Exception Safety
Miscellaneous: Perfect Forwarding, Auto, Flow Control, Macros
These guys are really serious about their questions, they also made the great list of C++ interview question which you might ask your candidates:
https://tests4geeks.com/cpp-interview-questions/