Secret to achieve good OO Design [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I am a C++ programmer, and I am looking forward to learning and mastering OO design. I have searched a lot and, as we all know, there is a great deal of material (books, tutorials, etc.) on how to achieve a good OO design. Of course, I understand that a good design is something that comes only with plenty of experience, individual talent, brilliance, or even mere luck (exaggeration!).
But surely it all starts with a solid beginning and strong basics. Can someone point me to the right material for starting this quest, from the stage of identifying objects and classes to the stage of using design patterns?
That said, I am a programmer but I have no experience in designing. Can someone help me with this transition from programmer to designer?
Any tips, suggestions, or advice will be helpful.
[Edit] Thanks for the links and answers, I need to get into them :) As I mentioned before, I am a C++ programmer and I understand the basic OO concepts such as inheritance, abstraction, and polymorphism, and having written code in C++ I understand a few of the design patterns as well. What I don't understand is the basic thought process with which one should approach a requirement: the nitty-gritty of how to decide which classes should be made, and how to define the relationships they should have among themselves. Knowing the concepts (to some extent) but not knowing how to apply them is the problem I seem to have :( Any suggestions about that?

1. (Very) simple, but not simplistic, design (simple enough design, if you prefer): K.I.S.S.
2. Prefer flat hierarchies; avoid deep hierarchies.
3. Separation of concerns is essential.
4. Consider "paradigms" other than OO when it doesn't seem simple or elegant enough.
5. More generally: D.R.Y. and Y.A.G.N.I. help you achieve 1.

There is no secret. It's sweat, not magic.
It's not about doing one thing right. It's balancing many things that must not go wrong. Sometimes, they work in sync, sometimes, they work against each other. Design is only one group of these aspects. The best design doesn't help if the project fails (e.g. because it never ships).
The first rule I'd put forward is:
1. There are no absolutes
This follows directly from the "many things to balance" point. D.R.Y., Y.A.G.N.I., etc. are guidelines; strictly following them cannot guarantee good design, and if followed to the letter they may make your project fail.
Example: D.R.Y. is one of the most fundamental principles, yet studies show that the complexity of small code snippets increases by a factor of 3 or more when they get isolated, due to pre/post condition checking, error handling, and generalization to multiple related cases. So the principle needs to be weakened to "D.R.Y. (at least, not too much)"; knowing when to apply it and when not to is the hard part.
The second rule is not a very common one:
2. An interface must be simpler than the implementation
Sounds too trivial to be catchy. Yet there's much to say about it:
The premise of OO was to manage program sizes that could not be managed with structured programming anymore. The primary mechanism is to encapsulate complexity: we can hide complexity behind a simpler interface, and then forget about that complexity.
Interface complexity involves the documentation, error handling specifications, performance guarantees (or their absence), etc. This means that e.g. reducing the interface declaration by introducing special cases isn't a reduction in complexity - just a shuffle.
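To make rule 2 concrete, here is a minimal sketch. The `WordCounter` class and its member names are hypothetical, invented only for illustration: the caller sees two member functions, while tokenizing, case folding, and storage stay hidden behind them.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical example: callers see two member functions; tokenizing,
// case folding and storage stay private details they can forget about.
class WordCounter {
public:
    // Counts every alphabetic word in `text`, ignoring case.
    void add(const std::string& text) {
        std::string word;
        for (unsigned char c : text) {
            if (std::isalpha(c)) {
                word += static_cast<char>(std::tolower(c));
            } else {
                flush(word);
            }
        }
        flush(word);
    }

    // Returns how often `word` (given in lower case) has been seen so far.
    std::size_t count(const std::string& word) const {
        auto it = counts_.find(word);
        return it == counts_.end() ? 0 : it->second;
    }

private:
    // Records the pending word, if any, and resets the buffer.
    void flush(std::string& word) {
        if (!word.empty()) {
            ++counts_[word];
            word.clear();
        }
    }

    std::map<std::string, std::size_t> counts_;
};
```

If the interface had to document the tokenizing rules case by case, it would stop being simpler than the implementation, which is the shuffle the paragraph above warns about.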
3-N Here's where I put most of the other mentions, which have been explained very well already.
Separation of Concerns, K.I.S.S., SOLID principles, D.R.Y., roughly in that order.
How to build software according to these guidelines?
The above guidelines help in evaluating a piece of code. Unfortunately, there's no recipe for how to get there. "Experienced" means that you have a good feel for how to structure your software, and some decisions just feel bad. Maybe all the principles are just rationalizations after the fact.
The general path is to break down a system into responsibilities until the individual pieces are manageable.
There are formal processes for that, but these just work around the fact that what makes a good, isolated component is a subjective decision. But in the end, that's what we get paid for.
If you have a rough idea of the whole system, it isn't wrong to start with one of these pieces as a seed, and grow them into a "core". Top-down and bottom-up aren't antipodes.
Practice, practice, practice. Build a small program, make it run, change requirements, get it to run again. The "changing requirements" part you don't need to train a lot, we have customers for that.
Post-project reviews: try to get used to them even for your personal projects. After it's sealed and done, evaluate what was good and what was bad. Assume the source code will be thrown away, i.e. don't see that session as "what should be fixed?"
Conway's Law says that "A system reflects the structure of the organization that built it." That applies to most complex software I've seen, and formal studies seem to confirm that. We can derive a few bits of information from that:
If structure is important, so are the people you work with.
Or maybe structure isn't that important. There is not one right structure (just many wrong ones to avoid).

I'm going to quote Marcus Baker talking about how to achieve good OO design in a forum post here: http://www.sitepoint.com/forums/showpost.php?p=4671598&postcount=24
1) Take one thread of a use case.
2) Implement it any old how.
3) Take another thread.
4) Implement it any old how.
5) Look for commonality.
6) Factor the code so that commonality is collected into functions. Aim for clarity of code. No globals, pass everything.
7) Any block of code that is unclear, group into a function as well.
8) Implement another thread any old how, but use your existing functions if they are instant drop-ins.
9) Once working, factor again to remove duplication. By now you may find you are passing similar lumps of stuff around from function to function. To remove duplication, move these into objects.
10) Implement another thread once your code is perfect.
11) Refactor to avoid duplication until bored.
Now the OO bit...
12) By now some candidate higher roles should be emerging. Group those functions into roles by class.
13) Refactor again with the aim of clarity. No class bigger than a couple of pages of code, no method longer than 5 lines. No inheritance unless the variation is just a few lines of code.
From this point on you can cycle for a bit...
14) Implement a thread of use case any old how.
15) Refactor as above. Refactoring includes renaming objects and classes as their meanings evolve.
16) Repeat until bored.
Now the patterns stuff!
Once you have a couple of dozen classes and quite a bit of functionality up and running, you may notice some classes have very similar code, but no obvious split (no, don't use inheritance). At this point consult the patterns books for options on removing the duplication. Hint: you probably want "Strategy".
The following repeats...
17) Implement another thread of a use case any old how.
18) Refactor methods down to five lines or less, classes down to 2 pages or less (preferably a lot less), look for patterns to remove higher level duplication if it really makes the code cleaner.
19) Repeat until your top level constructors either have lots of parameters, or you find yourself using "new" a lot to create objects inside other objects (this is bad).
Now we need to clean up the dependencies. Any class that does work should not use "new" to create things inside itself. Pass those sub objects from outside. Classes which do no mechanical work are allowed to use the "new" operator. They just assemble stuff - we'll call them factories. A factory is a role in itself. A class should have just one role, thus factories should be separate classes.
20) Factor out the factories.
Now we repeat again...
21) Implement another thread of a use case any old how.
22) Refactor methods down to five lines or less, classes down to 2 pages or less (preferably a lot less), look for patterns to remove higher level duplication if it really makes the code cleaner, make sure you use separate classes for factories.
23) Repeat until your top level classes have an excessive number of parameters (say 8+).
You've probably finished by now. If not, look up the dependency injection pattern...
24) Create (only) your top level classes with a dependency injector.
Then you can repeat again...
25) Implement another thread of a use case any old how.
26) Refactor methods down to five lines or less, classes down to 2 pages or less (preferably a lot less), look for patterns to remove higher level duplication if it really makes the code cleaner, make sure you use separate classes for factories, pass the top level dependencies (including the factories) via DI.
27) Repeat.
At any stage in this heuristic you will probably want to take a look at test driven development. At the very least it will stop regressions while you refactor.
Obviously, this is a pretty simple process, and the information contained therein shouldn't be applied to every situation, but I feel like Marcus gets it right, especially with regards to the process one should use to design OO code. After a while, you'll start doing it naturally, it'll just become second nature. But while learning to do so, this is a great set of steps to follow.
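Steps 19 and 20 of the quoted heuristic (working classes receive their collaborators from outside; only factories assemble objects) can be sketched in C++ roughly as follows. All class names here (`Logger`, `MemoryLogger`, `OrderProcessor`, `ProcessorFactory`) are hypothetical and not part of the original post.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical role: something that can record a message.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void log(const std::string& message) = 0;
};

// One concrete logger that keeps messages in memory.
class MemoryLogger : public Logger {
public:
    void log(const std::string& message) override { lines_.push_back(message); }
    const std::vector<std::string>& lines() const { return lines_; }

private:
    std::vector<std::string> lines_;
};

// A class that does work: it receives its collaborator from outside
// and never constructs one itself.
class OrderProcessor {
public:
    explicit OrderProcessor(std::shared_ptr<Logger> logger)
        : logger_(std::move(logger)) {}

    void process(const std::string& orderId) {
        logger_->log("processed " + orderId);
    }

private:
    std::shared_ptr<Logger> logger_;
};

// A factory: it does no work of its own, it only assembles objects.
struct ProcessorFactory {
    static OrderProcessor create(std::shared_ptr<Logger> logger) {
        return OrderProcessor(std::move(logger));
    }
};
```

Because `OrderProcessor` never creates its own logger, a test can hand it a `MemoryLogger` and inspect what was written, which is one practical payoff of cleaning up the dependencies.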

As you said, there is nothing like experience. You can read every book on the planet about this and you'll still not be as good as if you practice.
Understanding the theory is good, but in my humble opinion, there is nothing like experience. I think the best way to learn and understand things completely is to apply them in some project(s).
There you'll face difficulties, you'll learn to solve them, sometimes perhaps with a bad solution : but still you'll learn. And if at any time something bothers you and you can't find how to solve it nicely, we'll be here on SO to help you ! :)

I can recommend the book "Head First Design Patterns" (search Amazon). It is a good starting point before seriously diving into the Gang of Four's bible, and it shows design principles and the most-used patterns.

In a nutshell : Code, criticize, look for a well-known solution, implement it, and back to first step till you're (more or less) satisfied.
As often, the answer to this kind of question is: it depends. In this case, it depends on how you learn things. I'll tell you what works for me, for I face the very problem you describe, but it won't work for everybody and I would not say it's "the perfect answer".
I begin coding something, not too simple, not too complex. Then I look at the code and think: all right, what is wrong? For that, you can use the first three "SOLID principles":
Single responsibility (are all your classes serving a unique purpose?)
Open/Closed principle (if you want to add a service, your classes can be extended with inheritance, but there is no need to alter the basic functions of your current classes).
Liskov Substitution (all right, this one I can't explain simply, and I'd advise reading about it).
Don't try to master those and understand everything about them. Just use them as guidelines to criticize your code. Think chiefly about "what if I want to do this now?" and "what if I work for a client, and he wants to add this or that?". If your code is perfectly adaptable to any situation (which is almost impossible), you might have reached a very good design.
If it's not, consider a solution. What would you do? Try to come up with an idea. Then read about design patterns and find one that could answer your problem. See if it matches your idea; often, it's the idea you had, but better expressed and developed. Now, try to implement it. It's going to take time, you'll often fail, it's frustrating, and that's normal.
Design is about experience, but experience is acquired by criticizing your own code. That's how you'll understand design, not as a cool thing to know, but as the basis for solid code. It's not enough to know "all right, good code has this and that". It's much better to have experienced why, to have failed and seen what was wrong. The trouble with design patterns is that they are very abstract. My method is a way (probably not the only one nor the best) to make them less abstract to you.
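As a small illustration of the three principles listed above, here is a sketch with hypothetical `Shape`, `Rectangle`, and `Square` classes (none of them come from the answer). It is only one way to read the principles, not a definitive treatment.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical base type: every shape can report its area.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }

private:
    double w_, h_;
};

// Added later without editing Shape, Rectangle or totalArea (open/closed).
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }

private:
    double side_;
};

// Single responsibility: this function only sums areas. It works with any
// Shape that honours the area() contract (Liskov substitution).
double totalArea(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double sum = 0.0;
    for (const auto& s : shapes) sum += s->area();
    return sum;
}
```

The "what if a client wants to add this?" question from the answer maps directly onto `Square`: if adding it had required changes inside `totalArea`, that would be the signal to revisit the design.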

No-solo-work. Good designs are seldom created by a single person only. Talk to your colleagues. Discuss your design with others. And learn.
Don't be too smart. A complex hierarchy with 10 levels of inheritance is seldom a good design. Make sure that you can clearly explain how your design works. If you can't explain the basic principles in 5 minutes, your design is probably too complex.
Learn tricks from the masters: Alexandrescu, Meyers, Sutter, GoF.
Prefer extensibility over perfection. Source code written today will be insufficient in 3 years time. If you write your code to be perfect now, but inextensible, you will have a problem later. If you write your code to be extensible (but not perfect), you will still be able to adapt it later.

The core concepts in my mind are:
Encapsulation - Keep as much of your object as possible hidden from both the prying eyes and sticky fingers of the outside world.
Abstraction - Hide as much of the inner workings of your object as possible from the simple minds of the code that needs to use your objects.
All the other concepts, such as inheritance, polymorphism, and design patterns, are about incorporating the two concepts above while still having objects that can solve real-world problems.
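A minimal sketch of the encapsulation point, using a hypothetical `Account` class: because the balance is private, outside code cannot break the rule that it never goes negative.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical account type: the balance is private, so the only way to
// change it is through functions that keep it non-negative.
class Account {
public:
    // Adds a positive amount; rejects anything else with an exception.
    void deposit(long cents) {
        if (cents <= 0) throw std::invalid_argument("deposit must be positive");
        balance_ += cents;
    }

    // Returns false and leaves the balance untouched if the amount is
    // invalid or the funds are insufficient.
    bool withdraw(long cents) {
        if (cents <= 0 || cents > balance_) return false;
        balance_ -= cents;
        return true;
    }

    long balance() const { return balance_; }

private:
    long balance_ = 0;  // invariant: never negative
};
```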

Related

How to select the right architectural/design patterns

I am doing my own research project, and I am struggling with the right choice of architectural/design patterns.
In this project, after the "system" starts, I need to do something in the background (tasks, processing, displaying data and so on) and at the same time be able to interact with the system using, for example, the keyboard to send commands like "give me the status of this particular object" or "what is the data in this object".
So my question is: what software architectural/design patterns can be applied to this particular project? How should the interaction between classes/objects be organized? How should the objects be created?
Can, for example, "event-driven architecture" or "Microkernel" be applied here? Some references to useful resources will be very much appreciated!
Thank you very much in advance!

Careful with design patterns. If you sprinkle them throughout your code hoping that everything will work great, you'll soon have an unreadable, boilerplate-filled mess. They are recipes, not solutions.
My advice to you is pick a piece of paper and a pencil and start drawing all the entities of your domain, with all their requisites, and see how they relate. If you want to get somewhat serious about it, you can do something like this.
When defining your entities, strive for high cohesion and loose coupling.
High cohesion means that you should keep similar functionalities together. In a very simple example, if you have a class that reads stuff from a file and processes it, the class has low cohesion, since reading and processing are two very distinct functionalities. In this case, you would want a class for each functionality.
As for loose coupling, it means that your entities should be independent of each other. Using the example above, suppose that you are now the proud owner of two highly cohesive classes: one that reads stuff from a file (Reader), and one that processes that stuff (Processor). Now, suppose that the Processor class has an instance of the Reader class and calls it in order to get its input. In this case, we can say that the two classes are tightly coupled, since Processor won't work without Reader. In the OOP world, the solution for this is typically the use of interfaces. You can find a neat example here.
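One possible C++ rendering of the Reader/Processor example is sketched below. The interface name `LineSource`, the in-memory `StringReader`, and the `countNonEmpty` operation are assumptions made for illustration; the point is only that `Processor` depends on an interface rather than on a concrete reader.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical interface: anything that can hand out lines of input.
class LineSource {
public:
    virtual ~LineSource() = default;
    // Stores the next line in `line`; returns false when input is exhausted.
    virtual bool nextLine(std::string& line) = 0;
};

// One possible Reader: reads from an in-memory string. A file-backed
// reader would implement the same interface.
class StringReader : public LineSource {
public:
    explicit StringReader(const std::string& text) : in_(text) {}

    bool nextLine(std::string& line) override {
        return static_cast<bool>(std::getline(in_, line));
    }

private:
    std::istringstream in_;
};

// The Processor depends only on the interface, not on a concrete Reader.
class Processor {
public:
    // Counts the non-empty lines delivered by `source`.
    std::size_t countNonEmpty(LineSource& source) const {
        std::size_t n = 0;
        std::string line;
        while (source.nextLine(line)) {
            if (!line.empty()) ++n;
        }
        return n;
    }
};
```

With this split, swapping the file reader for a network or test source requires no change to `Processor`.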
After defining an initial model of your domain and gathering as much knowledge about it as you can, you can start to think about the implementation's architecture. This is where you can start thinking about the architectural patterns. Event-driven architecture, clean architecture, MVP, MVVM... It will all depend on your domain. It is your job to know which pattern will fit best. Spoiler alert: this can be extremely hard to do correctly even for experienced engineers, so don't be afraid to fail.
Finally, leave the design patterns for the implementation stage. Their use completely depends on your implementation problems and decisions. Also, DON'T FORCE THEM. Ideally, you will solve a problem and, IF APPLICABLE, you'll see a pattern emerging. Trust me, the last thing you want is to have a case of design patternitis. Anyway, if you need literature on patterns, I totally recommend this book. It's great no matter your level as an engineer.
Further reading:
SOLID principles
Onion Architecture
Clean architecture
Good luck!

You have a background task, and it can be used for a message pump/event queue indeed. Then your foreground task would send requests to this background thread and asynchronously wait for the result.
Have a look at the book "Patterns for Parallel Programming".
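The arrangement described above (a background thread draining a request queue while the foreground waits asynchronously for results) could look roughly like this in standard C++. `BackgroundWorker` and `submit` are hypothetical names, and the sketch restricts requests to callables returning `int` to stay short.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical worker: one background thread drains a queue of requests.
class BackgroundWorker {
public:
    BackgroundWorker() : thread_([this] { run(); }) {}

    // Signals the thread to stop; requests already queued are still run.
    ~BackgroundWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wakeup_.notify_one();
        thread_.join();
    }

    // Queues a request and returns a future the caller can wait on later.
    std::future<int> submit(std::function<int()> request) {
        auto task = std::make_shared<std::packaged_task<int()>>(std::move(request));
        std::future<int> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push([task] { (*task)(); });
        }
        wakeup_.notify_one();
        return result;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wakeup_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // only reached when stopping
                job = std::move(queue_.front());
                queue_.pop();
            }
            job();  // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the other members exist first
};
```

A keyboard command such as "give me the status of this object" would then become a `submit` call whose future is read once the answer is needed. On some toolchains this needs to be linked with thread support (for example `-pthread`).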

It is much better to consult a book on design patterns. I really like this one.
For example, if you need to get some data from a particular object, you may need the Observer Pattern to work for you and as soon as the object has the data, you (or another object) get to know this data and can work with it, with another pattern (strategy might work, it really depends on what you have to do).
If you have to do some things at the same time, check also the Singleton pattern (well, check the most important ones!).
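For the Observer idea mentioned above, a minimal sketch using callbacks is shown below. `Sensor`, `subscribe`, and `setReading` are hypothetical names; the classic pattern uses an observer interface, and `std::function` is swapped in here only to keep the example short.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical subject: holds a value and notifies observers when it changes.
class Sensor {
public:
    using Observer = std::function<void(int)>;

    // Registers a callback that is invoked on every new reading.
    void subscribe(Observer observer) { observers_.push_back(std::move(observer)); }

    // Stores the new reading and pushes it to every subscriber.
    void setReading(int value) {
        reading_ = value;
        for (const auto& notify : observers_) notify(value);
    }

    int reading() const { return reading_; }

private:
    int reading_ = 0;
    std::vector<Observer> observers_;
};
```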

Symptoms and alternatives to overused OOP

Lately I am losing my trust in OOP. I have already seen many complaints about common OOP misuses or just simple overuse. I do not mean the common confusion between is-a and has-a relationship. I mean stuff like the problems of ORM when dealing with relational databases, the excessive use of inheritance from C# and also several years of looking at code with the same false encapsulation belief that Scott Meyers mentions in the item 23 of Effective C++.
I am interested in learning more about this and non OOP software patterns that can solve certain problems better than their OOP counterparts. I am convinced that out there there are many people giving good advice on how to use this as an advantage with non pure OOP languages such as C++.
Does anyone know any good reference (author, book, article) to get started?
Please, notice that I am looking for two related but different things:
Common misuses of OOP concepts (like item 23)
Patterns where OOP is not the best solution (with alternatives)

Well, I can recommend the book Agile Principles, Patterns, and Practices in C#.
The examples are in C#, of course, but the ideas in the book are universal. It not only covers Agile but also focuses on bad practices and shows by example how to convert bad code into good code. It also contains descriptions of many design patterns and shows how to implement them in a semi-real example of a payroll application.

This has to be done but if you truly want to get away from OOP or at least take a look at concepts which are not OOP but are used with great effectiveness: Learn you a Haskell. Try a new programming paradigm and then start seeing where you can apply much of the concepts back to OOP languages. This addresses your second bullet, not in a direct way but trust me, it'll help more than you can think.

It's a bit odd that you mention C#. It has very powerful keywords to keep the usual inheritance misery in check. The first one ought to be the internal keyword. The notion of restricting the visibility to a module. That concept is completely absent in C++, the build model just doesn't support it. Otherwise a great concept, "I only trust the members of my team to get it right". Of course you do.
Then there's the slammer one, the sealed keyword. Extraordinarily powerful, "the buck stops here, don't mess with me". Used with surgical precision in the .NET framework, I've never yet found a case where sealed was used inappropriately. Also missing in C++, but with obscure ways to get that working.
But yes, the WPF object model sucks fairly heavily. Inheriting 6 levels deep and using backdoors like a dependency property is offensive. Inheritance is hard, let's go shopping.
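For readers on a newer toolchain: since C++11 the `final` specifier covers much of what `sealed` does in C#, so the obscure workarounds are no longer required there. The sketch below uses hypothetical `Widget`, `Button`, and `Panel` classes, and assumes a C++14 compiler for `std::is_final`.

```cpp
#include <cassert>
#include <type_traits>

class Widget {
public:
    virtual ~Widget() = default;
    virtual int cost() const { return 1; }
};

// `final` on the class: nothing may derive from Button.
class Button final : public Widget {
public:
    int cost() const override { return 2; }
};

class Panel : public Widget {
public:
    // `final` on a single virtual function: subclasses of Panel may exist,
    // but they cannot override cost() again.
    int cost() const final { return 3; }
};

// std::is_final is available since C++14.
static_assert(std::is_final<Button>::value, "Button cannot be a base class");
static_assert(!std::is_final<Panel>::value, "Panel can still be derived from");
```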

I would say to look at game engines. For the most part, OOP has a tendency to cause slight performance decreases, and the gaming industry is seemingly obsessed with eliminating minor slowdowns (and sometimes ignoring large ones). As such, their code, though usually written in a language that supports OOP, will end up using only those elements of OOP that are necessary for clean code / ease of maintenance that also balances performance.
EDIT:
Having said that, I don't know if I would really go look at Unreal. They do some strange things for the sake of making their content pipeline easier for developers... it makes their code... well, look if you really want to know.

One common overuse is forcing OOP on programs/scripts that take some input, turn it into output, then exit (and do not receive input from anywhere else during the process). A procedural approach is much cleaner in these cases.
Typical example of this is forcing OOP in PHP scripts.

Do very long methods always need refactoring?

I face a situation where we have many very long methods, 1000 lines or more.
To give you some more detail, we have a list of incoming high-level commands, and each results in a longer (sometimes huge) list of lower-level commands. There's a factory creating an instance of a class for each incoming command. Each class has a process method, where all the lower-level commands are added in sequence. As I said, these sequences of commands and their parameters quite often cause the process methods to reach thousands of lines.
There are a lot of repetitions. Many command patterns are shared between different commands, but the code is repeated over and over. That leads me to think refactoring would be a very good idea.
On the contrary, the specs we have come exactly in the same form as the current code. Very long list of commands for each incoming one. When I've tried some refactoring, I've started to feel uncomfortable with the specs. I miss the obvious analogy between the specs and code, and lose time digging into newly created common classes.
So here is the question: in general, do you think such very long methods always need refactoring, or would they be acceptable in a case like this?
(unfortunately refactoring the specs is not an option)
edit:
I have removed every reference to "generate" because it was actually confusing. It's not auto-generated code.
    class InCmd001 {
    public:
        // Note: `param` is not declared in this excerpt; it is assumed to be
        // a member or otherwise in scope in the real code.
        OutMsg process(InMsg& inMsg) {
            OutMsg outMsg = OutMsg::Create();

            OutCmd001 outCmd001 = OutCmd001::Create();
            outCmd001.SetA(param.getA());
            outCmd001.SetB(inMsg.getB());
            outMsg.addCmd(outCmd001);

            OutCmd016 outCmd016 = OutCmd016::Create();
            outCmd016.SetF(param.getF());
            outMsg.addCmd(outCmd016);

            OutCmd007 outCmd007 = OutCmd007::Create();
            outCmd007.SetR(inMsg.getR());
            outMsg.addCmd(outCmd007);
            // ......
            return outMsg;
        }
    };
The above is an example of one incoming command class (hand-written in pseudo C++).

Code never needs refactoring. The code either works, or it doesn't. And if it works, the code doesn't need anything.
The need for refactoring comes from you, the programmer. The person reading, writing, maintaining and extending the code.
If you have trouble understanding the code, it needs to be refactored. If you would be more productive by cleaning up and refactoring the code, it needs to be refactored.
In general, I'd say it's a good idea for your own sake to refactor 1000+ line functions. But you're not doing it because the code needs it. You're doing it because that makes it easier for you to understand the code, test its correctness, and add new functionality.
On the other hand, if the code is automatically generated by another tool, you'll never need to read it or edit it. So what'd be the point in refactoring it?

I understand exactly where you're coming from, and can see exactly why you've structured your code the way it is, but it needs to change.
The uncertainty you feel when you attempt to refactor can be ameliorated by writing unit tests. If you've tests specific to each spec, then the code for each spec can be refactored until you're blue in the face, and you can have confidence in it.
A second option, is it possible to automatically generate your code from a data structure?
If you've a core suite of classes that do the donkey work and edge cases, you can auto-generate the repetitive 1000 line methods as often as you wish.
However, there are exceptions to every rule.
If the methods are a literal interpretation of the spec (very little additional logic), and the specs change infrequently, and the "common" portions (i.e. bits that happen to be the same right now) of the specs change at different times, and you're not going to be asked to get a 10x performance gain out of the code anytime soon, then (and only then) . . . you may be better off with what you have.
. . . but on the whole, refactor.
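The "drive it from a data structure" option might be sketched as follows. Note the swap: this sketch interprets a spec table at run time instead of generating source code from it, and every type and name in it (`Fields`, `OutCmd`, `SpecRow`, `kInCmd001Spec`) is a hypothetical stand-in for the question's real message classes. It also simplifies by copying all parameters from a single input.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the question's messages: field name -> value.
using Fields = std::map<std::string, int>;

struct OutCmd {
    int id;         // e.g. 1 for OutCmd001
    Fields fields;  // parameters set on the command
};

// One row of the spec: which command to emit and which input fields to copy.
struct SpecRow {
    int cmdId;
    std::vector<std::string> copiedFields;
};

// The spec for one incoming command, kept as data in the same order as the
// written spec. The ids and field names here are illustrative only.
const std::vector<SpecRow> kInCmd001Spec = {
    {1, {"A", "B"}},
    {16, {"F"}},
    {7, {"R"}},
};

// Core routine shared by every incoming command: walks the table and
// builds the outgoing command list.
std::vector<OutCmd> process(const std::vector<SpecRow>& spec, const Fields& in) {
    std::vector<OutCmd> out;
    for (const SpecRow& row : spec) {
        OutCmd cmd{row.cmdId, {}};
        for (const std::string& name : row.copiedFields) {
            cmd.fields[name] = in.at(name);  // throws if the input lacks the field
        }
        out.push_back(cmd);
    }
    return out;
}
```

The table keeps the line-by-line correspondence with the spec that the question values, while the repeated mechanics live in one place.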

Yes, always. 1000 lines is at least 10x longer than any function should ever be, and I'm tempted to say 100x, except that when dealing with input parsing and validation it can become natural to write functions with 20 or so lines.
Edit: Just re-read your question and I'm not clear on one point - are you talking about machine generated code that no-one has to touch? In which case I would leave things as they are.

Refactoring is not the same as writing from scratch. While you should never write code like this, before you refactor it, you need to consider the costs of refactoring in terms of time spent, the associated risks in terms of breaking code that already works, and the net benefits in terms of future time saved. Refactor only if the net benefits outweigh the associated costs and risks.
Sometimes wrapping and rewriting can be a safer and more cost effective solution, even if it appears expensive at first glance.

Long methods need refactoring if they are maintained (and thus need to be understood) by humans.

As a rule of thumb, code for humans first. I don't agree with the common idea that functions need to be short. I think what you need to aim at is when a human reads your code they grok it quickly.
To this effect it's a good idea to simplify things as much as possible--but not more than that. It's a good idea to delegate roughly one task for each function. There is no rule as for what "roughly one task" means: you'll have to use your own judgement for that. But do recognize that a function split into too many other functions itself reduces readability. Think about the human being who reads your function for the first time: they would have to follow one function call after another, constantly context-switching and maintaining a stack in their mind. This is a task for machines, not for humans.
Find the balance.
Here, you see how important naming things is. You will see it is not that easy to choose names for variables and functions, it takes time, but on the other hand it can save a lot of confusion on the human reader's side. Again, find the balance between saving your time and the time of the friendly humans who will follow you.
As for repetition, it's a bad idea. It's something that needs to be fixed, just like a memory leak. It's a ticking bomb.
As others have said before me, changing code can be expensive. You need to do the thinking as for whether it will pay off to spend all this time and effort, facing the risks of change, for a better code. You will possibly lose lots of time and make yourself one headache after another now, in order to possibly save lots of time and headache later.

Take a look at the related question How many lines of code is too many?. There are quite a few tidbits of wisdom throughout the answers there.
To repost a quote (although I'll attempt to comment on it a little more here)... A while back, I read this passage from Ovid's journal:
I recently wrote some code for Class::Sniff which would detect "long methods" and report them as a code smell. I even wrote a blog post about how I did this (quelle surprise, eh?). That's when Ben Tilly asked an embarrassingly obvious question: how do I know that long methods are a code smell?
I threw out the usual justifications, but he wouldn't let up. He wanted information and he cited the excellent book Code Complete as a counter-argument. I got down my copy of this book and started reading "How Long Should A Routine Be" (page 175, second edition). The author, Steve McConnell, argues that routines should not be longer than 200 lines. Holy crud! That's waaaaaay to long. If a routine is longer than about 20 or 30 lines, I reckon it's time to break it up.
Regrettably, McConnell has the cheek to cite six separate studies, all of which found that longer routines were not only not correlated with a greater defect rate, but were also often cheaper to develop and easier to comprehend. As a result, the latest version of Class::Sniff on github now documents that longer routines may not be a code smell after all. Ben was right. I was wrong.
(The rest of the post, on TDD, is worth reading as well.)
Coming from the "shorter methods are better" camp, this gave me a lot to think about.
Previously my large methods were generally limited to "I need inlining here, and the compiler is being uncooperative", or "for one reason or another the giant switch block really does run faster than the dispatch table", or "this stuff is only called exactly in sequence and I really really don't want function call overhead here". All relatively rare cases.
In your situation, though, I'd have a large bias toward not touching things: refactoring carries some inherent risk, and it may currently outweigh the reward. (Disclaimer: I'm slightly paranoid; I'm usually the guy who ends up fixing the crashes.)
Consider spending your efforts on tests, asserts, or documentation that can strengthen the existing code and tilt the risk/reward scale before any attempt to refactor: invariant checks, bound function analysis, and pre/postcondition tests; any other useful concepts from DBC; maybe even a parallel implementation in another language (maybe something message oriented like Erlang would give you a better perspective, given your code sample) or even some sort of formal logical representation of the spec you're trying to follow if you have some time to burn.
Any of these kinds of efforts generally have a few results, even if you don't get to refactor the code: you learn something, you increase your (and your organization's) understanding of and ability to use the code and specifications, you might find a few holes that really do need to be filled now, and you become more confident in your ability to make a change with less chance of disastrous consequences.
As you gain a better understanding of the problem domain, you may find that there are different ways to refactor you hadn't thought of previously.
This isn't to say "thou shalt have a full-coverage test suite, and DBC asserts, and a formal logical spec". It's just that you are in a typically imperfect situation, and diversifying a bit -- looking for novel ways to approach the problems you find (maintainability? fuzzy spec? ease of learning the system?) -- may give you a small bit of forward progress and some increased confidence, after which you can take larger steps.
So think less from the "too many lines is a problem" perspective and more from the "this might be a code smell; what problems is it going to cause for us, and is there anything easy and/or rewarding we can do about it?" perspective.
Leaving it cooking on the backburner for a bit -- coming back and revisiting it as time and coincidence allows (e.g. "I'm working near the code today, maybe I'll wander over and see if I can't document the assumptions a bit better...") may produce good results. Then again, getting royally ticked off and deciding something must be done about the situation is also effective.
Have I managed to be wishy-washy enough here? My point, I think, is that the code smells, the patterns/antipatterns, the best practices, etc -- they're there to serve you. Experiment to get used to them, and then take what makes sense for your current situation, and leave the rest.

I think you first need to "refactor" the specs. If there are repetitions in the spec, it too will become easier to read once it makes use of some "basic building blocks".
Edit: As long as you cannot refactor the specs, I wouldn't change the code.
Coding style guides are all made for easier code maintenance, but in your special case the ease of maintenance is achieved by following the spec.
Some people here asked if the code is generated. In my opinion it does not matter: If the code follows the spec "line by line" it makes no difference if the code is generated or hand-written.

1000 lines of code is nothing. We have functions that are 6 to 12 thousand lines long. Of course, those functions are so big that things literally get lost in there, and no tool can help us look at even high-level abstractions of them. The code is now, unfortunately, incomprehensible.
My opinion of functions that big is that they were written not by brilliant programmers but by incompetent hacks who shouldn't be let anywhere near a computer - they should be fired and left flipping burgers at McDonald's. Such code wreaks havoc by leaving behind features that cannot be added to or improved upon (too bad for the customer). The code is so brittle that it cannot be modified by anyone - even the original authors.
And yes, those methods should be refactored, or thrown away.

Do you ever have to read or maintain the generated code?
If yes, then I'd think some refactoring might be in order.
If no, then the higher-level language is really the language you're working with -- the C++ is just an intermediate representation on the way to the compiler -- and refactoring might not be necessary.

Looks to me that you've implemented a separate language within your application - have you considered going that way?

It has been my understanding that it's recommended that any method over 100 lines of code be refactored.

I think some rules may be a little different in this era, when code is most commonly viewed in an IDE. If the code does not contain exploitable repetition, such that there are 1,000 lines which are going to be referenced once each, and which share a significant number of variables in a clear fashion, dividing the code into 100-line routines each of which is called once may not be that much of an improvement over having a well-formatted 1,000-line module which includes #region tags or the equivalent to allow outline-style viewing.
My philosophy is that certain layouts of code generally imply certain things. To my mind, when a piece of code is placed into its own routine, that suggests that the code will be usable in more than one context (exception: callback handlers and the like in languages which don't support anonymous methods). If code segment #1 leaves an object in an obscure state which is only usable by code segment #2, and code segment #2 is only usable on a data object which is left in the state created by #1, then absent some compelling reason to put the segments in different routines, they should appear in the same routine. If a program puts objects through a chain of obscure states extending for many hundreds of lines of code, it might be good to rework the design of the code to subdivide the operation into smaller pieces which have more "natural" pre- and post- conditions, but absent some compelling reason to do so, I would not favor splitting up the code without changing the design.

For further reading, I highly recommend the long, insightful, entertaining, and sometimes bitter discussion of this topic over on the Portland Pattern Repository.

I've seen cases where that doesn't hold (for example, creating an Excel spreadsheet in .Net often requires a lot of lines of code for the formatting of the sheet), but most of the time the best thing would indeed be to refactor it.
I personally try to make a function small enough so it all appears on my screen (without affecting the readability of course).

1000 lines? They definitely need to be refactored. Also note that, for example, the default maximum number of executable statements is 30 in Checkstyle, a well-known coding standard checker.

If you refactor, when you refactor, add some comments to explain what the heck it's doing.
If it had comments, it would be much less likely a candidate for refactoring, because it would already be easier to read and follow for someone starting from scratch.

Then here the question: in general, do you think such very long methods would always need refactoring,
If you ask in general, we will say yes.
or in a similar case it would be acceptable? (unfortunately refactoring the specs is not an option)
Sometimes they are acceptable, but that is very unusual. I will give you a pair of examples:
Some 8-bit Microchip PIC microcontrollers have only a fixed 8-level stack, so you can't nest more than 8 calls. Care must be taken to avoid "stack overflow", so in this special case having many small (nested) functions is not the best way to go.
Another example is low-level code optimization, where you have to take into account the cost of jumps and context saving. Use it with care.
EDIT:
Even with generated code, you may need to refactor the way it is generated, for example to save memory or energy, or to make the output human-readable or prettier, who knows.

There has been very good general advice; here is a practical recommendation for your sample:
Common patterns can be isolated in plain feeder methods:
void AddSimpleTransform(OutMsg & msg, InMsg const & inMsg,
int rotateBy, int foldBy, int gonkBy = 0)
{
// create & add up to three messages
}
You might even improve that by making this a member of OutMsg, and using a fluent interface, such that you can write
OutMsg msg;
msg.AddSimpleTransform(inMsg, 12, 17)
.Staple("print")
.AddArtificialRust(0.02);
which can be an additional improvement under some circumstances.
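A minimal sketch of how such a fluent interface could look. `OutMsg`, `InMsg`, `AddSimpleTransform`, `Staple` and `AddArtificialRust` are the names used above; the method bodies and the `steps` log are made up purely for illustration, since the real message types aren't shown in the question.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in: the real InMsg is not shown in the question.
struct InMsg {
    int id = 0;
};

class OutMsg {
public:
    // Each method returns *this so that calls can be chained.
    OutMsg& AddSimpleTransform(const InMsg& inMsg, int rotateBy, int foldBy, int gonkBy = 0) {
        steps.push_back("transform(" + std::to_string(inMsg.id) + "," +
                        std::to_string(rotateBy) + "," + std::to_string(foldBy) + "," +
                        std::to_string(gonkBy) + ")");
        return *this;
    }

    OutMsg& Staple(const std::string& what) {
        steps.push_back("staple(" + what + ")");
        return *this;
    }

    OutMsg& AddArtificialRust(double amount) {
        steps.push_back(amount > 0.0 ? "rust" : "no rust");
        return *this;
    }

    // Illustrative log of what was added, in call order.
    std::vector<std::string> steps;
};
```

The only trick is returning a reference to the object itself from every method; whether that reads better than separate statements is a matter of taste.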

More on the mediator pattern and OO design

So, I've come back to ask, once more, a patterns-related question. This may be too generic to answer, but my problem is this (I am programming and applying concepts that I learn as I go along):
I have several structures within structures (note, I'm using the word structure in the general sense, not in the strict C struct sense (whoa, what a tongue twister)), and quite a bit of complicated inter-communications going on. Using the example of one of my earlier questions, I have Unit objects, UnitStatistics objects, General objects, Army objects, Soldier objects, Battle objects, and the list goes on, some organized in a tree structure.
After researching a little bit and asking around, I decided to use the mediator pattern because the interdependencies were becoming a trifle too much, and the classes were starting to appear too tightly coupled (yes, another term which I just learned and am too happy about not to use it somewhere). The pattern makes perfect sense and it should straighten some of the chaotic spaghetti that I currently have boiling in my project pot.
But well, I guess I haven't learned yet enough about OO design. My question is this (finally. PS, I hope it makes sense): should I have one central mediator that deals with all communications within the program, and is it even possible? Or should I have, say, an abstract mediator and one subclassed mediator per structure type that deals with communication of a particular set of classes, e.g. a concrete mediator per army which helps out the army, its general, its units, etc.
I'm leaning more towards the second option, but I really am no expert when it comes to OO design. So the third question is: what should I read to learn more about this kind of subject? (I've looked at Head First's Design Patterns and the GoF book, but they're more of a "learn the vocabulary" kind of book than a "learn how to use your vocabulary" kind of book, which is what I need in this case.)
As always, thanks for any and all help (including the witty comments).

I don't think you've provided enough info above to be able to make an informed decision as to which is best.
From looking at your other questions it seems that most of the communication occurs between components within an Army. You don't mention much occurring between one Army and another. In that case it would seem to make sense to have each Mediator instance coordinate communication between the components comprising a single Army, i.e. the Generals, Soldiers etc. So if you have 10 Armies then you will have 10 ArmyMediators.
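To make the "one mediator per Army" idea concrete, here is a rough sketch. `ArmyMediator` and `Soldier` are names from the discussion; everything else (`enlist`, `broadcast`, `report`, the `inbox`) is invented for the example and is not taken from the asker's code.

```cpp
#include <string>
#include <utility>
#include <vector>

class ArmyMediator;  // defined below

// Hypothetical colleague: it knows only its mediator, not the other members.
class Soldier {
public:
    Soldier(std::string name, ArmyMediator& mediator)
        : name_(std::move(name)), mediator_(mediator) {}

    void report(const std::string& text);  // sends via the mediator
    void receive(const std::string& text) { inbox.push_back(text); }
    const std::string& name() const { return name_; }

    std::vector<std::string> inbox;  // messages received so far

private:
    std::string name_;
    ArmyMediator& mediator_;
};

// One instance per army; routes messages among that army's members only.
class ArmyMediator {
public:
    void enlist(Soldier& soldier) { members_.push_back(&soldier); }

    void broadcast(const Soldier& from, const std::string& text) {
        for (Soldier* member : members_) {
            if (member != &from) member->receive(from.name() + ": " + text);
        }
    }

private:
    std::vector<Soldier*> members_;  // non-owning pointers
};

inline void Soldier::report(const std::string& text) {
    mediator_.broadcast(*this, text);
}
```

With two `ArmyMediator` instances, a report from a soldier in one army reaches only the soldiers enlisted with the same mediator.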
If you really want to learn O-O Design you're going to have to try things out and run the risk of getting it wrong from time to time. I think you'll learn just as much, if not more, from having to refactor a design that doesn't quite model the problem correctly into one that does, as you will from getting the design right the first time around.
Often you just won't have enough information up front to be able to choose the right design from the go anyway. Just choose the simplest one that works for now, and improve it later when you have a better idea of the requirements and/or the shortcomings of the current design.
Regarding books, personally I think the GoF book is more useful if you focus less on the specific set of patterns they describe, and focus more on the overall approach of breaking classes down into smaller reusable components, each of which typically encapsulates a single unit of functionality.

I can't answer your question directly, because I have never used that design pattern. However, whenever I have this problem of message passing between various objects, I use the signal-slot pattern. Usually I use Qt's, but my second option is Boost's. They both solve the problem by having a single, global message passing handler. They are also both type-safe and quite efficient, both in terms of CPU cycles and in productivity. Because they are so flexible, i.e. any object can emit any kind of signal, and any other object can receive any signal, you'll end up solving, I think, what you describe.
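For readers who haven't seen the pattern, here is a hand-rolled sketch of the idea in plain C++. It is deliberately not the Qt or Boost API; `Signal`, `connect`, `emitSignal` and the `Soldier`/`General` members are all made up for the illustration.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hand-rolled stand-in for a signal; not the Qt or Boost API.
template <typename... Args>
class Signal {
public:
    // Register a slot: any callable that accepts Args...
    void connect(std::function<void(Args...)> slot) {
        slots_.push_back(std::move(slot));
    }

    // Call every connected slot, in connection order.
    void emitSignal(Args... args) const {
        for (const auto& slot : slots_) slot(args...);
    }

private:
    std::vector<std::function<void(Args...)>> slots_;
};

// Hypothetical game objects, loosely based on the question.
struct Soldier {
    Signal<const std::string&, int> damaged;  // (name, remaining health)
    std::string name;
    int health = 100;

    void hit(int amount) {
        health -= amount;
        damaged.emitSignal(name, health);
    }
};

struct General {
    std::vector<std::string> reports;

    void onDamaged(const std::string& who, int health) {
        reports.push_back(who + " at " + std::to_string(health));
    }
};
```

The sender never names its receivers: a `General` is hooked up from outside with something like `soldier.damaged.connect(...)`, which is what keeps the classes decoupled.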
Sorry if I just made things worse by not choosing any of the 2 option, but instead adding a 3rd!

In order to use Mediator you need to determine:
(1) What does the group of objects, which need mediation, consist of?
(2) Among these, which are the ones that have a common interface?
The Mediator design pattern relies on the group of objects that are to be mediated to have a "common interface"; i.e., same base class: the widgets in the GoF book example inherit from same Widget base, etc.
So, for your application:
(1) Which are the structures (Soldier, General, Army, Unit, etc.) that need mediation between each other?
(2) Which ones of those (Soldier, General, Army, Unit, etc.) have a common base?
This should help you determine, as a first step, an outline of the participants in the Mediator design pattern. You may find out that some structures in (1) fall outside of (2). Then, you may need to force them to adhere to a common interface, too, if you can change that or if you can afford to make that change... (it may turn out to be too much redesign work, and it violates the Open-Closed principle: your design should be, as much as possible, open to adding new features but closed to modifying existing ones).
If you discover that (1) and (2) above result in a partition of separate groups, each with its own mediator, then the number of these partitions dictates the number of different types of mediators. Now, should these different mediators have a common interface of their own? Maybe, maybe not. Polymorphism is a way of handling complexity by grouping different entities under a common interface such that they can be handled as a group rather than individually. So, would there be any benefit to grouping all these supposedly different types of mediators under a common interface (like the DialogDirector in the GoF book example)? Possibly, if:
(a) You may have to use a heterogeneous collection of mediators;
or
(b) You envision in the future that these mediators will evolve (and they probably will). Hence providing an abstract interface allows you to derive more evolved versions of mediators without affecting existent ones or their colleagues (the clients of the mediators).
So, without knowing more, I'd have to guess that, yes, it's probably better to use abstract mediators and to subclass them, for each group partition, just to prepare yourself for future changes without having to redesign your mediators (remember the Open-Closed principle).
Hope this helps.

C++ interview - testing potential candidates

I have to interview some C++ candidates over the next few weeks and as the most senior programmer in the company I'm expected to try and figure out whether these people know what they are doing.
So has anybody got any suggestions?
Personally I hate being left in a room to fill out some C++ questions, so I'd rather do a more complex test where I can chat with the interviewee about their approaches and so forth as we go along, i.e. it doesn't matter whether they get the right answers or not; it's how they approach the problem that interests me. I don't care whether they understand obscure features of the language, but I do care that they have a good solid understanding of pointers, as well as understanding the underlying differences between pointers and references. I would also love to see how they approach optimisation of a given problem because solid fast code is a must, in my opinion.
So any suggestions along these lines would be greatly appreciated!

I wouldn't make them write code. Instead, I'd give them a couple of code snippets to review.
For example, the first would be about design by contract: see if they know what preconditions, postconditions and invariants are. Make a couple of small mistakes, such as never initializing an integer field but asserting that it is >= 0 in the invariant, and see if they spot them.
The second would be to give them bool contains(char * inString, char c). Implement it with a trivial loop. Then ask whether there are any mistakes. Of course, your code here does not check for null in the input parameter inString (even if the very previous question talked about preconditions!). Also, the loop finishes at character 0. Of course, the candidate should spot the possible problems and insist on using std::string instead of this char * crap. It's important because if they do complain, you'll know that they won't add their own char *'s to new code.
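A sketch of what that second snippet might look like, together with the sort of `std::string` alternative a candidate could propose. Only the `contains` signature comes from the description above; the loop body and the `containsStr` name are assumptions made for the example.

```cpp
#include <string>

// The kind of snippet to hand over for review (deliberately flawed):
// inString is never checked for null and is needlessly non-const, and
// because the loop stops at the terminator, searching for '\0' never succeeds.
bool contains(char* inString, char c) {
    for (char* p = inString; *p != '\0'; ++p) {
        if (*p == c) return true;
    }
    return false;
}

// What a candidate might suggest instead: no null pointer to forget about.
bool containsStr(const std::string& s, char c) {
    return s.find(c) != std::string::npos;
}
```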
An alternative which addresses containers: give them a std::vector<int> and code which searches for prime numbers or counts the odd numbers or something. Make some small mistake. See if they find any issues and whether they understand the code. Ask in which situation a std::set would be better (when you are going to search for elements quite systematically and the original order of insertion doesn't matter).
Discuss everything live, letting them think a couple minutes. Capture the essence of what they say. Don't focus on "coverage" (how many things they spot) because some people may be stressed. Listen to what they actually say, and see if it makes any sense.
I disagree with writing code in interviews. I'd never ask anyone to write code. I know my handwritten code would probably suck in a situation like that. Actually, I have seldom been asked to do so, but when I have, I haven't been hired.

This one is a great complex task, even though it is looking quite harmless.

I believe that a C++ programmer needs more than just generic programming skills, because...
In C++ it's harder to shoot yourself in the foot, but when you do, you blow off your whole leg.
Writing bug-free, maintainable C++ code places a much higher demand on a few areas than most languages.
One thing I'll call "pedanticness". You know how some people can spot spelling errors in something at a glance? A C++ programmer needs to be able to spot simple bugs while they read or write code (whether the code is their own or not). A programmer who relies on the "compile and test" technique just to get rid of simple bugs is incompatible with the C++ language, because those bugs don't always lead to immediate failure in C++.
C++ programmers also need a good knowledge of low-level stuff. Pointers, memory allocators, blocking, deadlocks. And "nitty gritty" C++ issues, like multiple inheritance and method hiding and such, particularly if they need to maintain other people's code.
Finally, C++ programmers need to be able to write code that's easy for other people to use. Can they design stuff well?
A good test for the first two areas is "Here's some C++ code I got off the internet. Find the bugs, and identify the unnecessary bits." (There's lots of really bad C++ code available on the internet, and often the programmer does unnecessary things due to a faulty understanding of how to be "safe" in C++.)
The last area you can test with more generic interview questions.

A few questions can allow you to know a lot about a candidate:
Differences between a pointer and a reference, when would you use each?
Why would you make a destructor virtual?
Where is the construction order of a class's attributes defined?
Copy constructor and operator=. When would you implement them? When would you make them private?
When would you use smart pointers? What things would you take into account to decide which?
Where else have you seen the RAII idiom?
When would you make a parameter const? When a method?
When would you make an attribute mutable?
What is virtual inheritance?
What is template specialization?
What are traits?
What are policies?
What is SFINAE?
What do you know about the C++0x standard?
What boost libraries have you used?
What C++ books have you read? (Sutter? Alexandrescu?)
Some short exercises (no more than 10 minutes) about STL containers, memory management, slicing, etc. would also be useful. I would let the candidate do them on a computer with a ready environment. It's important to observe their agility.
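As an illustration of the kind of answer the virtual destructor question is after, here is a small sketch. The `Shape`/`Circle` names and the log are made up for the example; the point is only that the derived destructor runs when an object is destroyed through a base-class pointer.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical log so that the destruction order can be observed.
std::vector<std::string>& destructionLog() {
    static std::vector<std::string> log;
    return log;
}

struct Shape {
    // Virtual, so deleting through a Shape* also runs the derived destructor.
    virtual ~Shape() { destructionLog().push_back("~Shape"); }
};

struct Circle : Shape {
    ~Circle() override { destructionLog().push_back("~Circle"); }
};

// Creates a Circle and destroys it through a base-class pointer.
void destroyThroughBase() {
    std::unique_ptr<Shape> shape(new Circle);
}  // shape goes out of scope here: ~Circle runs first, then ~Shape
```

Without `virtual` on `~Shape`, deleting a `Circle` through a `Shape*` would be undefined behaviour, which is usually what the interviewer wants to hear.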

Checkout Joel's Guerrilla guide to interviewing. Seems a lot like what you are looking for.

"Write a program that receives 3 integers in the range of 0..2^32-1, and validates if they represent valid edges of a triangle".
It seems to be a simple question. The input is considered valid if the sum of any two edges is greater than the third edge. However, there are some pitfalls, that a good programmer will handle:
The correct type to use should be unsigned long. Many "programmers" will fail here.
Zero values should be considered as non-valid.
Overflow should be avoided: "if (a+b <= c) return false" is problematic since a+b may cause an overflow.
if (a <= c-b) is also a bad solution since c-b may be negative. Not a good thing for unsigned types.
if (c > b) { if (a <= c-b) return false; } else { if (a <= b-c) return false; } This looks much better, but it will not work correctly if (a >= b+c).
A good programmer must be detail oriented. This simple exercise will help checking if he is.
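One possible overflow-safe solution, as a sketch. Two things here are assumptions rather than part of the exercise: the function name `isValidTriangle`, and the use of `std::uint32_t` in place of `unsigned long` so that the 0..2^32-1 range is covered exactly on any platform.

```cpp
#include <cstdint>
#include <utility>

// Returns true if a, b and c can be the edge lengths of a triangle.
// Assumption: a fixed-width 32-bit type stands in for "unsigned long".
bool isValidTriangle(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    if (a == 0 || b == 0 || c == 0) return false;  // zero-length edges are invalid

    // Move the largest edge into c; only "a + b > c" then needs checking,
    // because the other two inequalities hold automatically.
    if (a > c) std::swap(a, c);
    if (b > c) std::swap(b, c);

    // "a + b > c" rewritten as "a > c - b": since c >= b the subtraction
    // cannot wrap around, and no addition is performed at all.
    return a > c - b;
}
```

Ordering the edges first is what removes the need for the nested branches: once c is the largest edge, a single subtraction is enough.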

Depending on what your organisation's pre-screening is like, assume that the person knows nothing at all about C++ and has just put it on their CV because it makes them look supertechnical. Seriously. Start with something simple, like reversing a string. I have had candidates who couldn't even write a function prototype for this!

Do not forget to also test for code bigotry. I know I don't want anyone working for or with me that isn't a flexible and consequently practical programmer both in their attitude to the programming language, but also in their approach to problem solving.
Denying any type of preconceptions
Understanding the value of the exceptions in any Best Practices
Being capable of refusing long term habits in favor of something else if the need arises
These are characteristics dear to me. The manner of testing for these is not ideal if the interviews aren't lengthy or don't involve presenting code. But showing code snippets with purposely debatable techniques while offering a use case scenario and asking the candidate how they feel about the solution is one way.

This article offers some general ideas that are relevant regardless of what language you're working with.

Don't test only the C++ and overall technical skills! Those are of course important, but they are nothing if people don't listen, don't answer properly or don't follow the commitments they made.
Check above all for the ability to communicate clearly. If people can't tell you roughly what they did in their former jobs within a few minutes, they will also be unable to report on their work at your place.
In a recent company we invited people for interviews in groups of about 3 people together. They were surprised, but nobody was angry about that. It was very interesting, because people had to communicate not only with us, but also with others in the same position. In case we were interested further, we arranged a second interview.

You can choose a potentially problematic task and see how they approach it. Ask them to write a smart pointer, for example; you'll see if they understand pointers, references and templates in one step :) Usually they are stressed, so they will make mistakes; those mistakes might help you find out how good their problem-solving skills are, what paths they would take to fix a mistake, and so on. The only problem with this approach is that sometimes the interviewee just doesn't know anything about the task, and you have to quickly come up with something easier. If they write perfect code you can discuss their choices, but when there's nothing to look at it is depressing for both of you.
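For reference, this is roughly the level of answer one might hope for, as a sketch only. The `ScopedPtr` name and its exact interface are made up here; production code would simply use `std::unique_ptr`.

```cpp
#include <utility>

// Minimal single-owner smart pointer, roughly what an interviewee might write.
// The name ScopedPtr is hypothetical; real code would use std::unique_ptr.
template <typename T>
class ScopedPtr {
public:
    explicit ScopedPtr(T* ptr = nullptr) : ptr_(ptr) {}
    ~ScopedPtr() { delete ptr_; }

    // Copying would lead to a double delete, so it is disabled.
    ScopedPtr(const ScopedPtr&) = delete;
    ScopedPtr& operator=(const ScopedPtr&) = delete;

    // Moving transfers ownership and leaves the source empty.
    ScopedPtr(ScopedPtr&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    ScopedPtr& operator=(ScopedPtr&& other) noexcept {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;
        }
        return *this;
    }

    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
    T* get() const { return ptr_; }
    explicit operator bool() const { return ptr_ != nullptr; }

private:
    T* ptr_;  // owned; may be null
};
```

Good follow-up topics are why copying is deleted, what the moved-from object looks like, and what would change for a reference-counted version.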

Here is my answer to a similar question geared towards C#, but notice that my answer is language agnostic. My interview question is, in fact, in C. I rarely interview a person with the goal of finding out if they can program. I want to find out if they can think, problem solve, collaborate, communicate, understand something new, and so on. In the meantime, I circle around trying to see if they "get it" in terms of the big picture of software engineering. I use programming questions because that's a common basis and an easy ruse.

Get Codility.com to screen out non-programming programmers; this will get you a limited number of mostly reasonable candidates. Sit for an hour with each of them and try to build something together (a micro web server, a script for processing some of your data, a simple GUI). Pay attention to communication skills, i.e. how much effort it takes to understand the candidate. Ask the candidate to recommend books related to the subject (C++ software development in your case). Follow the Guerrilla Guide to Interviewing, i.e. answer honestly for yourself whether the person is smart and gets things done. Good luck.

Check 10 C++ Interview Questions by Tests4Geeks.
It's an addition to their pre-interview C++ test and it has really useful questions. Many people have worked on these interview questions, so the set is quite balanced and has no trick or syntax questions.
The idea is quite simple: first you weed out incompetent candidates using the test, then you use the questions from the article in the real-life interview.

Whatever you do, pairing would be a good idea. Come up with a good program, pair with the candidate, and work towards solving the problem together. IMHO, that's a very good idea.

So has anybody got any suggestions?
I'd recommend getting a copy of this:
http://www.amazon.co.uk/Programming-Interviews-Exposed-Secrets-Programmer/dp/047012167X/ref=sr_1_1?ie=UTF8&s=books&qid=1252499175&sr=8-1
ie it doesn't matter whether they get the right answers or not its how they approach the problem that interests me
You could ask the candidate to come up with a UML design to a common problem. If they show you a design pattern, then you can talk through the pros/cons of the pattern. You could then ask them to produce some code for one of the classes.
This would help you determine their technical knowledge level and their communication abilities.
I do care that they have a good solid understanding of pointers as well as understanding the underlying differences between pointers and references
Linked list problems are good for determining whether a candidate has a solid grasp of pointers.
As for references, you could show them some code that does not use references correctly, and ask them to describe the problem.
E.g. show them a class definition that contains a reference member variable, and the implementation of the constructor with the reference initialization missing.
I would also love to see how they approach optimisation of a given problem because solid fast code is a must, in my opinion.
I'd start off simple...
Show them a code example that passes strings to a function by value. (the strings should not be modified in the function). You should check they correct the code to pass the strings by const reference.
After this, you could show a constructor that uses assignment instead of initialization (for objects). Ask them to improve it.
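The two review items above could look something like this. All names (`sameLengthByValue`, `sameLength`, `SlowPerson`, `Person`) are invented for the example; each "before" version is what you would show the candidate, each "after" version is the fix you hope they suggest.

```cpp
#include <string>

// Before: both strings are copied on every call, although neither is modified.
bool sameLengthByValue(std::string a, std::string b) {
    return a.size() == b.size();
}

// After: const references avoid the copies.
bool sameLength(const std::string& a, const std::string& b) {
    return a.size() == b.size();
}

// Before: name_ is default-constructed first and then assigned to.
class SlowPerson {
public:
    explicit SlowPerson(const std::string& name) { name_ = name; }
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// After: the member initializer list constructs name_ directly from the argument.
class Person {
public:
    explicit Person(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }

private:
    std::string name_;
};
```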
Lastly, ask them simple questions about data structure selection.
e.g. When they should use a list rather than a vector.
If you feel they have a grasp of the fundamentals you could either ask how they approach optimization problems (discuss profilers etc), or ask them to optimize something less obvious.

Take a look at this C++ test. They have questions about the differences between pointers and references, as you require.
Here is full list of topics:
Fundamentals: References & Pointers, Const Correctness, Explicit
Standard Library
Class Design, Overloading
Virtual Functions, Polymorphism, Inheritance
Memory Management, Exception Safety
Miscellaneous: Perfect Forwarding, Auto, Flow Control, Macros
These guys are really serious about their questions; they have also put together a great list of C++ interview questions which you might ask your candidates:
https://tests4geeks.com/cpp-interview-questions/
