Write C++ in a graphical scratch-like way? - c++

I am considering the possibility of designing an application that would allow people to develop C++ code graphically. I was amazed when I discovered Scratch (see site and tutorial videos).
I believe most of C++ can be represented graphically, with the exceptions of preprocessor instructions and possibly function pointers.
What C++ features do you think could be (or not be) represented by graphical items?
What would be the pros and cons of such an application ? How much simpler would it be than "plain" C++?
RECAP and MORE:
Pros:
intuitive
simple for small applications
helps avoid typos
Cons:
may become unreadable for large (medium?) - sized applications
manual coding is faster for experienced programmers
C++ is too complicated a language for such an approach
Considering that we -at my work- already have quite a bit of existing C++ code, I am not looking for a completely new way of programming. I am considering an alternate way of programming that is fully compatible with legacy code. Some kind of "viral language" that people would use for new code and, hopefully, would eventually use to replace existing code as well (where it could be useful).
How do you feel towards this viral approach?
When it comes to manual vs graphical programming, I tend to agree with your answers. This is why, ideally, I'll find a way to let the user always choose between typing and graphical programming. A line-by-line parser (+partial interpreter) might be able to convert typed code into graphical design. It is possible. Let's all cross our fingers.
Are there caveats to providing both typing and graphical programming capabilities that I should think about and analyze carefully?
I have already worked on template classes (and more generally type-level C++) and their graphical representation.
See there for an example of graphical representation of template classes. Boxes represent classes or class templates. First top node is the class itself, the next ones (if any) are typedef instructions inside the class. Bottom nodes are template arguments. Edges, of course, connect classes to template arguments for instantiations.
I already have a prototype for working on such type-level diagrams.
If you feel this way of representing template classes is plain wrong, don't hesitate to say so and why!

Much as I like Scratch, it is still much quicker for an experienced programmer to write code using a text editor than it is to drag blocks around, This has been proved time and again with any number of graphical programming environments.

Writing code is the easiest part of a developers day. I don't think we need more help with that. Reading, understanding, maintaining, comparing, annotating, documenting, and validating is where - despite a gargantuan amount of tools and frameworks - we still are lacking.
To dissect your pros:
Intuitive and simple for small applications - replace that with "misleading". It makes it look simple, but it isn't: As long as it is simple, VB.NET is simpler. When it gets complicated, visual design would get in the way.
Help avoid typos - that's what a good style, consistency and last not least intellisense are for. The things you need anyway when things aren't simple anymore.
Wrong level
You are thinking on the wrong level: C++ statements are not reusable, robust components, they are more like a big bag of gears that need to be put together correctly. C++ with it's complexity and exceptions (to rules) isn't even particulary suited.
If you want to make things easy, you need reusable components at a much higher level. Even if you have these, plugging them together is not simple. Despite years of struggle, and many attempts in many environments, this sometimes works and often fails.
Viral - You are correct IMO about that requriement: allow incremental adoption. This is closely related to switching smoothly between source code and visual representation, which in turn probably means you must be able to generate the visual representation from modified source code.
IDE Support - here's where most language-centered approaches go astray. A modern IDE is more than just a text editor and a compiler. What about debugging your graph - with breakpoints, data inspection etc? Will profilers, leak detectors etc. highlight nodes in your graph? Will source control give me a Visual Diff of yesterday's graph vs. today's?
Maybe you are on to something, despite all my "no"s: a better way to visualize code, a way to put different filters on it so that I see just what I need to see.

The early versions of C++ were originally written so that they compiled to C, then the C was compiled as normal.
What it sounds like you are describing is a graphical language that is compiled to C++, which will then be compiled as normal.
So really you are not creating a graphical C++, you are creating a new language that happens to be graphical. Nothing wrong with that, but don't let C++ restrict what you do, because eventually you may want to compile the graphical language straight to machine code, or even to something like CIL, Java ByteCode, or whatever else tickles your fancy.
Other graphical languages you may want to check out are LabVIEW, and more generally the category of visual programming languages.
Good luck in your efforts.

The complexity of a nontrivial program is usually too high to be represented with graphical symbols, which are low in their information content. Unless your approach is markedly different in some way, I am skeptical that this would be of value based on past efforts.
So, practically speaking, his will be useful only for instructional purposes and very simple programs. But that would still be a great target market for a product like this. sometimes people have trouble grasping the fundamentals, and a visual model might be just the thing to help things click.

Interesting idea. I doubt I'd use it though. I tend to prefer coding in a flat text editor, not even an IDE, and for tough problems I prefer a pad of paper. Most of the really experienced programmers I know work this way, Maybe it's because we grew up in a different environment, but I think it's also because of the way we think about programming. As you get more experience, you start seeing the code in your head more clearly than any GUI tool will show it to you.
As for your question, I'd nominate templates as one of the harder / more interesting sort of thing to try to represent well. They are ubiquitous and carry information that you won't have access to as you are designing your tool. Getting that to the user in a useful way should pose an interesting challenge.

What C++ features do you think could be [...] represented by graphical items?
Object Oriented Design. Hence classes, inheritance, polymorphism, mutability, const-ness etc. And, templates.
What would be the pros and cons of such an application?
It may be easier for beginners to start writing programs. For the experienced, it may be get rid of the boring parts of programming.
Think of any other code generator. They create a framework for you to write the more involved portion(s). They also lead to bloated-code (think of any WYSIWYG HTML editor).
The biggest challenge, as I see it, is that any such UI necessarily hinders the user's imagination.
How much simpler would it be than "plain" c++ ?
It can be a real pain, when you wade through truckloads of errors which is typical of code generators.
Further, since a lot of code is generated, you have no idea of what is going on -- debugging becomes difficult.
Also, for the experienced there may be some irritation to find that the generated code is not per their preferred coding style.

I prefer hot-keys instead graphical menus and buttons.
And I think same thing will happen with graphical development tool. Many peoples will prefer manual codding.
But, source code visualizer - should be nice thing.

I like the idea, but I suspect there comes a point where things get far too complicated to be represented graphically.
However, given recent experience at work; it would be useful to give such a graphical interface to a non-techie person to use to create basic drag-and-drop programs, leaving myself free to get on with some "proper" programming ;-) If it can do the job of allowing somebody non-skilled to build something functional it can be a very good thing (even if programming logic escapes them)
There comes a point in such a system where it becomes easier to define what you want to do using literal C++ code, rather than have a user interface getting in the way; it can get frustrating to the sessioned programmer knowing the precise code that needs to be written but then only being limited to the design GUI. I'm specifically thinking about a more common application, such as html editors/designers in which they allow newbies to build their websites without knowing any html at all.
It would be interesting to see how such a system would handle the dynamic allocation of memory, and the different states of a program as time progressed; I suspect that there are some very basic programming concepts that may be difficult to represent graphically.. Polymorphism..? Virtual Classes, LinkList, Stacks/Circular Queues. I wonder for a moment how you would explain an image compression algorithm (such as jpg) successfully too without the help of a gigantic display screen.
I also wonder if such a system would even go to such a low level, and whether you would be dealing with abstracted concepts and the compiler would be working out the best way to do something.

I've been working on a new model-driven software development paradigm named ABSE (http://www.abse.info) that supports end-user programming: It's a template-based system that can be complemented with transformation code. I also have an IDE (named AtomWeaver) implementing ABSE that is in pre-alpha stage right now.
With AtomWeaver, as an expert/architect, you build your knowledge Templates, and then the developers (or end-users if you make your meta-models simpler) can just "assemble" systems by building blocks, and then filling template parameters in form-style editors.
At the end, pressing the "Generate" button will create the final system as specified by the architect/expert.

I'm surprised you think function pointers would be a particular problem. How about anything at all to do with pointers?
A programming language can be represented by a hierarchy of nodes - that's exactly what the compiler turns it into. It is very strange that the UI for editing programs is still a sequence of characters that get parsed, because the degrees of freedom in the editor is way larger than the available set of allowed choices. But intellisense helps to reduce this problem a lot.
C++ would be a strange choice to base such a system on.

I think the major problem of this kind of IDEs are that the code generated becomes unmantainable easily.
This happened to Delphi. It's a really nice tool to develop some kind of applications, however, when we start adding complex relationships between the components, start adding Design Patterns, etc. the code grows to an unmantainable size.
I believe it's also because graphical tools don't apply the concept of MVC (or if they do, it's only in the way that the IDE understands).
It can be really helpful for prototypes and very small applications that don't tend to grow, otherwise it can become a mess for the developer(s)

Related

Porting C++ w/ MFC to a Touch Screen

Looking for a GUI framework to use with C++ to "modernize" an existing interface for touchscreen.
I'm a novice programmer with a background in C++/Java that was just assigned a project that involves taking an existing C++ program using MFC (3 data views, multiple text/radio controlled dialog boxes, etc.) and redesigning the interface to be "touch-screen friendly" such as larger button controls, sliders and whatnot.
I've been given pretty free-reign with instructions to make a "more modern looking interface" as opposed to the typical bare-bones MFC I've been given. I know I have quite a bit to learn either way, so any suggestions are helpful.
So far the options I've come up with are:
MFC just tweak existing controls to accommodate touch input, keep crappy looking interface.
Managed C++ or C++/CLI figure out how to keep the underlying C++ structure while being able to design a new interface with WPF or Windows Forms.
Qt completely new to me, but seems a promising alternative.
Really I just need to find a way to make this program look like it wasn't written over 10 years ago, and so far in teaching myself MFC, it doesn't seem that flexible in terms of incorporating any sort of graphic-design/themes.
Are there other alternatives I should be looking into? Is there more to MFC than meets the eye and I just need to learn more about it? As I said, any suggestions of things to look into are helpful.
What you've listed as #1 could really be either of two approaches. One (call it 1a) is to continue using the same version of VC++ and MFC as the original, and do only the bare minimum of editing to increase the sizes of the controls where needed. Short of encountering some fairly bad luck in how the existing code is written, this might not involve any real programming at all, and would be relatively quick and easy.
The second (call it 1b) would be to do the update using the current version of VC++, MFC, etc. This would probably involve some code updates, but probably not anything terrible (though if it's much more than 10 years old, significant code updates could be needed as well). With some care, you may be able to update the UI quite a bit (e.g., change from menus to ribbons, include color theme support), still with fairly minimal investment.
Your #2 could easily end up as nearly a complete rewrite. Despite superficial similarities, C++/CLI is a completely different language from C++. The only way this really makes sense is if you have quite a bit of non-UI code you can leave alone completely, and use C++/CLI exclusively as a "bridge" between existing C++ and a .NET UI (and the UI is fairly minimal, so the rather mediocre tool support for C++/CLI doesn't cause a big problem). If your UI is any more than fairly trivial, C# has enough better tool support that it may easily be a better choice than C++/CLI.
Your #3 will probably require only marginally less rewriting than 2, at least of the code that's related to the UI. While the code would remain C++, Qt is quite a lot different from MFC. The big advantage would be if you (might soon) want to support something other than Windows (e.g., iOS or Android). For portability, Qt has a huge advantage over any of the alternatives you've named.
A lot comes back to a question of how much C++ code you can retain intact. If you have a lot of code in an "engine" that's cleanly separated from the UI, and a fairly simple UI, then retaining that existing code make a lot of sense, and rewriting the UI from the ground up isn't all that terrible of a problem. On the other hand, if the UI and business logic code are heavily intertwined (fairly common) then any more than really minor tweaks to the UI is likely to me significant rewriting of the rest of the code as well.
Conclusion: if the UI is heavily intertwined with other code, your only real choice is between 1a and 1b. If the UI is easily detached from other code, the choice between 2 and 3 is (at least to me) primarily one of whether portability is at all likely to matter (now or any time soon).

Best C++ programming practices in professional game development environments? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
I'm well educated in the C++ programming language regarding syntax, STL, etc. I haven't done any true projects with it yet other than some college courses. My goal is to start writing progams but attempt to keep it to industry best practices. One of my main goals is to work in the game industry. A door is opening which lead me to ask these questions.
Are C++ game developers usually limited on using all the C++ features? Is it true more or less for each platform?
Is STL used these days in the environment, or should I avoid it?
Do they treat C++ as "everything must be an object" or is it a mix of paradigms?
Any features they tend to not use?
Is C still used, and is it good to show a project completed in it?
I did talk to a famous game designer and he said they don't use STL and said to use basic encapsulation with classes, but he said not to use all other advanced C++ features. The idea is to keep it simple. Is this common?
Thanks in advance!
Bear in mind that "games development" covers everything from phones to websites to PSPs and DS to Wii, Xbox, PS3 and PC/Mac, so the answers you get may vary from one product/developer/platform to another.
1) Yes. Games development can mean that you have to use a particular compiler for your target platform, which may not support some of the "newer" features of C++. You may be restricted on which libraries the development studio will allow you to use. In addition, hardware restrictions on some platforms may influence how you code (e.g. on some systems the cost of virtual functions can be excessively high compared to a PC, so your designs can be restricted away from pure OO for performance reasons. Some systems have limited resources (memory, disk, etc) which will influence your design a lot. On the Playstation3 you may need to write vector-processing SPU tasks rather than traditional single/multithreaded code. etc).
2) Yes. (it's commonly used, as there's usually no point wasting time rolling your own, probably less efficient, equivalents)
3) A mix. Generally in games the key goals are (in rough order): Deliver the impossible on time (why are you going home now, it's only 10pm?), deliver high performance, and try not to crash.
4,6) Generally keep the code simple and flexible so you can develop fast and change to suit the producer's latest vision. You usually have tight deadlines so a lot of design and careful coding tends to be difficult to achieve. The industry unfortunately tends to work on the "churn out the code quickly so we can throw it away and start on the next product", although it is slowly coming around to the idea of re-usable libraries (and more methodical professional practices like unit tests, etc).
A typical scenario is to be asked to code up a quick and dirty demo for the boss, and then when you think you'll be able to throw it away and write it properly, the boss says "we don't have time to rewrite something that already works, start on the next level". So if you're asked to be quick'n'dirty, try not to burn too many bridges or you'll regret it.
5) You may use C, but it depends on the platform and the development house. These days most devlopment will be C++ even if parts of the code are effectively "little more than C" in terms of syntax and features used.
These questions are not at all unique to game development. You will find they vary as much by field as they do by company. I can think of two game design firms that use STL off the top of my head, and also of a few that don't.
1) Define "all C++ features". They typically don't use goto, if that's what you mean.
2) Yes STL is used. You shouldn't avoid it, or not avoid it - use the tools you need to get the job done every time you have a job that needs to get done. Keeping what you use in line with other ways similar problems were tackled -within that project or context- is importent.
3) Depends where you are, both in games and out of games. Some are graduated C guys who want things more C like. Others are C#/Java guys who are more OO (Object Obsessed).
4) goto
5) Yes, and yes. Knowing more than one language is always a good thing. Even if it's just C to C++.
6) Again, use the tools available to solve your problems. Keeping it simple is good when you can, sure.
1) No limitation really, except that you should follow programming guides so as to use them properly (i.e. avoid dinosaurs when using goto's).
2) Yes, its used, and Boost also (which is kinda the preview of the next STL).
3) Mix, always a mix. To say that C++ is only object-oriented is to misunderstand the language. "Everything is an object" is really a Java (or things like it) thing; check out C++'s concept of POD, as an example of how not everything is an object.
4) same as 1)
5) It is used, but as far as I've seen, no new project is being created in C.
6) No... he kinda said the opposite of what I've just said.
Generally, some parts of C++ are limited for patform and performance reasons.
Exceptions and RTTI are normally disabled. Exceptions are expensive when they unwind the stack, and RTTI uses too much memory. Lack of exceptions is particularly a pain because rather than come up with another error handling system, programmers tend to just return NULL. :)
Floats are used almost exclusively over doubles, for performance reasons.
Virtual functions and inheritence can be used, but the virtual table lookup can be costly so sometimes it's optimised out.
The STL can be used, but you normally have to write your own allocators to avoid fragging memory (see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html). It's more common to see custom containers - of varying quality. I've never come across boost being used in a console project.
Most code is OO, almost to a fault.
Best C++ programming practices in professional game development environments?
The question is too broad.
Are C++ game developers usually limited on using all the C++ features? Is it true more or less for each platform?
No, or at least extremely unlikely. Depends on platform/compiler.
Is STL used these days in the environment, or should I avoid it?
You shouldn't avoid it, but:
you'll definitely need a profiler if you want stl containers
STL containers are extremely slow in debug builds. This might be a problem, because if you're frequently accessing them, there will be a huge performance drop (say, from 300 fps to 50) in debug build (when compared to release build). Debug code is already slower, and stl containers have large amount of "security check" in debug build.
Do they treat C++ as "everything must be an object"
Every idea is bad if you take it to extremes. "Everything must be an object" is one of those.
Any features they tend to not use?
Why? Unless there is a compiler bug or performance loss there is no point in avoiding feature. Besides, there are commercial 3D games written in java, so I see no point in avoiding feature of compiled language.
Is C still used, and is it good to show a project completed in it?
Some opensource projects may use C, but I can't remember a game completely written in it. There is Q3 engine, but afaik it is mix of C/C++.
I did talk to a famous game designer and he said they don't use STL and said to use basic encapsulation with classes, but he said not to use all other advanced C++ features.
They could decide to avoid stl simply because containers are slow in debug build. It is unclear what exactly you call "advanced C++ features". Also, keep in mind that he is designer, not programmer.
The idea is to keep it simple. Is this common?
The idea may or may not be common, but it is a correct idea for a small teams. With limited human resources everything should be made as simple as possible - to reduce development costs. If you stick to concept like "everything should be an object", decide to build "nice-looking" class hierarchy, etc, AND if you have limited resources, then you are doomed - It is very possible that you'll never finish your project and will be rewriting code forever and trying to decide what should be derived from what. Needless project flexibility and extensibility may kill small project. Game engine doesn't have to be complicated.
Large teams may afford complex solutions, but complexity leads to additional bugs. For example, you could look at sims 3 - it has a VERY advanced system (object "plugins", game resource management, GPU resource management, content "streaming", etc, and I think they even put game logic into a few .NET modules that are stored within game archive...), but extensibility of the system resulted in mountain of bugs after certain updates.
Electronic Arts as one example uses STL, but felt compelled to create their own version of it called EASTL.
Are C++ game developers usually limited on using all the C++ features? Is it true more or less for each platform? These days, I think it's pretty good on the main console platforms but if you were developing for mobile devices I doubt it's quite so even
Is STL used these days in the environment, or should I avoid it? It is used and should be used... but some people don't trust it because in the past STL support was much less even, and sometimes had performance problems
Do they treat C++ as "everything must be an object" or is it a mix of paradigms? Far too general. Varies from person to person and company to company. On average, game developers have a reputation for being less focused on software engineering and more on hacking code, but that's far too generalised to really be of use. However spouting for 30min about design patterns and frameworks and abstractions in an interview is probably not a wonderful plan
Any features they tend to not use? STL still has a bad rep amongst those who worked on older platforms. Exceptions also and RTTI, probably sicne they weren't well supported (and maybe are still not)
Is C still used, and is it good to show a project completed in it? I'm sure it is, and if you have a project then show it, but it's not anywhere near as widely used as C++
I did talk to a famous game designer and he said they don't use STL and said to use basic encapsulation with classes, but he said not to use all other advanced C++ features. The idea is to keep it simple. Is this common?
That view is pretty common but it doesn't mean it's right. You might want to tread carefully to avoid upsetting a 20-year veteran who has not moved with the times
I'm going to assume you're referring to people who develop full-price games for consoles and PC.
"Are C++ game developers usually limited on using all the C++ features?"
I think most are avoiding exceptions, RTTI, and many avoid the STL part of the standard library. They're probably using templates, though not much metaprogramming. I don't think this varies much across platform. But it does vary from company to company.
"Is STL used these days in the environment, or should I avoid it?"
It is used by many, but not all. Don't avoid it - it's a useful tool that is very good at the tasks it is designed for. Just make sure you can get by without it.
"Do they treat C++ as "everything must be an object" or is it a mix of paradigms?"
Different places will work in different ways. There's nothing intrinsic to object orientation in C++ that makes it good or bad for game development. Bear in mind that there is often legacy code or 3rd party libraries that are written in C or designed to be operated from C, so it's very rare that you'll have C++ throughout.
"Any features they tend to not use?"
I think we covered this above.
"Is C still used, and is it good to show a project completed in it?"
Yes, and no. Life is too short to spend on mastering every related language and dialect before you get your first job. Showing the ability to use a C-style library (eg. SDL) should suffice. Obviously if you set your goals on joining a company that just uses C, you will have to consider that.
"I did talk to a famous game designer and he said they don't use STL and said to use basic encapsulation with classes, but he said not to use all other advanced C++ features. The idea is to keep it simple. Is this common?"
See above. Incidentally, game "designers" typically aren't programmers, so watch your terminology. If you apply for a game designer role, you won't usually be showing programming skills.

Visualizing C++ to help understanding it

I'm a student who's learning C++ at school now. We are using Dev-C++ to make little, short exercises. Sometimes I find it hard to know where I made a mistake or what's really happing in the program. Our teacher taught us to make drawings. They can be useful when working with Linked Lists and Pointers but sometimes my drawing itself is wrong.
(example of a drawing that visualizes a linked list: nl.wikibooks.org/wiki/Bestand:GelinkteLijst.png )
Is there any software that could interpret my C++ code/program and visualize it (making the drawings for me)?
I found this: link text
other links:
cs.ru.ac.za/research/g05v0090/images/screen1.png and
cs.ru.ac.za/research/g05v0090/index.html
That looks like what I need but is not available for any download. I tried to contact that person but got no answer.
Does anybody know such software? Could be useful for other students also I guess...
Kind regards,
juFo
This is unrelated to the actual title but I'd like to make a simple suggestion concerning how to understand what's happening in the program.
I don't know if you've looked at a debugger but it's a great tool that can definitely vastly improve your understanding of what's going on. Depending on your IDE, it'll have more or less features, some of them should include:
seeing the current call stack (allows you to understand what function is calling what)
seeing the current accessible variables along with their values
allowing you to walk step by step and see how each value changes
and many, many more.
So I'd advise you to spend some time learning all about the particular debugger for your IDE, and start to use all of these features. There's sometimes a lot more stuff then simply clicking on Next. Some things may include dynamic code evaluation, going back in time, etc.
Have a look at DDD. It is a graphical front-end for debuggers.
Try debuggers in general to understand what your program is doing, they can walk you through your code step-by-step.
Doxygen has, if I recall, a basic form of this but it's really only a minor feature of a much bigger library, so that may be overkill for what you want. (Though it's a great program for documentation!)
Reverse engineering the code to some sort of diagram, will have limited benefit IMO. A better approach to understanding program flow is to step the code in the debugger. If you don't yet use a debugger, you should; it is the more appropriate tool for this particular problem.
Reverse engineering code to diagrams is useful when reusing or maintaining undocumented or poorly documented legacy code, but it seldom exposes the design intent of the code, since it lacks the abstraction that you would use if you were designing the code. You should not have to resort to such things on new code you have just written yourself! Moreover, tools that do this even moderately well are expensive.
Should you be thinking you can avoid design, and just hand in an automatically generated diagram, don't. It will be more than obvious that it is an automatically generated diagram!

Design and Readbility

I am working on a project written in C++ which involves modification of existing code. The code uses object oriented principles(design patterns) heavily and also complicated stuff like smart pointers.
While trying to understand the code using gdb,I had to be very careful about the various polymorphic functions being called by the various subclasses.
Everyone knows that the intent of using design patterns and other complicated stuff in your code is to make it more reusable i.e maintainable but I personally feel that, it is much easier to understand and debug a procedure oriented code as you definitely know which function will actually be called.
Any insights or tips to handle such situations is greatly appreciated.
P.S: I am relatively less experienced with OOP and large projects.
gdb is not a tool for understanding code, it is a low-level debugging tool. Especially when using C++ as a higher level language on a larger project, it's not going to be easy to get the big picture from stepping through code in a debugger.
If you consider smart pointers and design patterns to be 'complicated stuff' then I respectfully suggest that you study their use until they don't seem complicated. They should be used to make things simpler, not more complex.
While procedural code may be simple to understand in the small, using object oriented design principals can provide the abstractions required to build a very large project without it turning into unmaintainable spaghetti.
For large projects, reading code is a much more important skill than operating a debugger. If a function is operating on a polymorphic base class then you need to read the code and understand what abstract operations it is performing. If there is an issue with a derived class' behaviour, then you need to look at the overrides to see if these are consistent with the base class contract.
If and only if you have a specific question about a specific circumstance that the debugger can answer should you step through code in a debugger. Questions might be something like 'Is this overriden function being called?'. This can be answered by putting a breakpoint in the overriden function and stepping over the call which you believe should be calling the overriden function to see if the breakpoint is hit.
Port it into Doxygen as a first step.
Modifying comments should have no effect on the code.
Doxygen will allow you to get an overview of the structure of the program.
Over time, as you figure out more about the program, you add comments that get picked up by Doxygen. The quality of the document grows over time, and will be helpful to the next poor SOB that gets stuck with the program
There is an excellent book called Object-Oriented Reengineering Patterns that, in a first part, provides patterns on how to understand legacy code (e.g. "refactor to understand").
A pdf version of the book is available for free at http://scg.unibe.ch/download/oorp/
Diagrams. Does your IDE have a tool that can reverse-engineer class diagrams from the code? That may help you understand the relationships between classes. Also, have the other developers actually written documentation on what they are doing and why? Is there a decisions document explaining why they designed and built in the way they did (Ok, sometimes this is not necessary - but if it exists, it would also help).
Also, do you know WHAT design patterns were used? Do they have names? Can you look them up and find other simpler examples of them? Maybe try writing a small app that also implements the design pattern, just to try it for yourself. That can also improve understanding.
I generally do the following:
Draw a simplified class diagram
Write some pseudocode
Ask a developer who is likely to be familiar with the code layout

Converting C source to C++

How would you go about converting a reasonably large (>300K), fairly mature C codebase to C++?
The kind of C I have in mind is split into files roughly corresponding to modules (i.e. less granular than a typical OO class-based decomposition), using internal linkage in lieu private functions and data, and external linkage for public functions and data. Global variables are used extensively for communication between the modules. There is a very extensive integration test suite available, but no unit (i.e. module) level tests.
I have in mind a general strategy:
Compile everything in C++'s C subset and get that working.
Convert modules into huge classes, so that all the cross-references are scoped by a class name, but leaving all functions and data as static members, and get that working.
Convert huge classes into instances with appropriate constructors and initialized cross-references; replace static member accesses with indirect accesses as appropriate; and get that working.
Now, approach the project as an ill-factored OO application, and write unit tests where dependencies are tractable, and decompose into separate classes where they are not; the goal here would be to move from one working program to another at each transformation.
Obviously, this would be quite a bit of work. Are there any case studies / war stories out there on this kind of translation? Alternative strategies? Other useful advice?
Note 1: the program is a compiler, and probably millions of other programs rely on its behaviour not changing, so wholesale rewriting is pretty much not an option.
Note 2: the source is nearly 20 years old, and has perhaps 30% code churn (lines modified + added / previous total lines) per year. It is heavily maintained and extended, in other words. Thus, one of the goals would be to increase mantainability.
[For the sake of the question, assume that translation into C++ is mandatory, and that leaving it in C is not an option. The point of adding this condition is to weed out the "leave it in C" answers.]
Having just started on pretty much the same thing a few months ago (on a ten-year-old commercial project, originally written with the "C++ is nothing but C with smart structs" philosophy), I would suggest using the same strategy you'd use to eat an elephant: take it one bite at a time. :-)
As much as possible, split it up into stages that can be done with minimal effects on other parts. Building a facade system, as Federico Ramponi suggested, is a good start -- once everything has a C++ facade and is communicating through it, you can change the internals of the modules with fair certainty that they can't affect anything outside them.
We already had a partial C++ interface system in place (due to previous smaller refactoring efforts), so this approach wasn't difficult in our case. Once we had everything communicating as C++ objects (which took a few weeks, working on a completely separate source-code branch and integrating all changes to the main branch as they were approved), it was very seldom that we couldn't compile a totally working version before we left for the day.
The change-over isn't complete yet -- we've paused twice for interim releases (we aim for a point-release every few weeks), but it's well on the way, and no customer has complained about any problems. Our QA people have only found one problem that I recall, too. :-)
What about:
Compiling everything in C++'s C subset and get that working, and
Implementing a set of facades leaving the C code unaltered?
Why is "translation into C++ mandatory"? You can wrap the C code without the pain of converting it into huge classes and so on.
Your application has lots of folks working on it, and a need to not-be-broken.
If you are serious about large scale conversion to an OO style, what
you need is massive transformation tools to automate the work.
The basic idea is to designate groups of data as classes, and then
get the tool to refactor the code to move that data into classes,
move functions on just that data into those classes,
and revise all accesses to that data to calls on the classes.
You can do an automated preanalysis to form statistic clusters to get some ideas,
but you'll still need an applicaiton aware engineer to decide what
data elements should be grouped.
A tool that is capable of doing this task is our DMS Software Reengineering
Toolkit.
DMS has strong C parsers for reading your code, captures the C code
as compiler abstract syntax trees, (and unlike a conventional compiler)
can compute flow analyses across your entire 300K SLOC.
DMS has a C++ front end that can be used as the "back" end;
one writes transformations that map C syntax to C++ syntax.
A major C++ reengineering task on a large avionics system gives
some idea of what using DMS for this kind of activity is like.
See technical papers at
www.semdesigns.com/Products/DMS/DMSToolkit.html,
specifically
Re-engineering C++ Component Models Via Automatic Program Transformation
This process is not for the faint of heart. But than anybody
that would consider manual refactoring of a large application
is already not afraid of hard work.
Yes, I'm associated with the company, being its chief architect.
I would write C++ classes over the C interface. Not touching the C code will decrease the chance of messing up and quicken the process significantly.
Once you have your C++ interface up; then it is a trivial task of copy+pasting the code into your classes. As you mentioned - during this step it is vital to do unit testing.
GCC is currently in midtransition to C++ from C. They started by moving everything into the common subset of C and C++, obviously. As they did so, they added warnings to GCC for everything they found, found under -Wc++-compat. That should get you on the first part of your journey.
For the latter parts, once you actually have everything compiling with a C++ compiler, I would focus on replacing things that have idiomatic C++ counterparts. For example, if you're using lists, maps, sets, bitvectors, hashtables, etc, which are defined using C macros, you will likely gain a lot by moving these to C++. Likewise with OO, you'll likely find benefits where you are already using a C OO idiom (like struct inheritence), and where C++ will afford greater clarity and better type checking on your code.
Your list looks okay except I would suggest reviewing the test suite first and trying to get that as tight as possible before doing any coding.
Let's throw another stupid idea:
Compile everything in C++'s C subset and get that working.
Start with a module, convert it in a huge class, then in an instance, and build a C interface (identical to the one you started from) out of that instance. Let the remaining C code work with that C interface.
Refactor as needed, growing the OO subsystem out of C code one module at a time, and drop parts of the C interface when they become useless.
Probably two things to consider besides how you want to start are on what you want to focus, and where you want to stop.
You state that there is a large code churn, this may be a key to focus your efforts. I suggest you pick the parts of your code where a lot of maintenance is needed, the mature/stable parts are apparently working well enough, so it is better to leave them as they are, except probably for some window dressing with facades etc.
Where you want to stop depends on what the reason is for wanting to convert to C++. This can hardly be a goal in itself. If it is due to some 3rd party dependency, focus your efforts on the interface to that component.
The software I work on is a huge, old code base which has been 'converted' from C to C++ years ago now. I think it was because the GUI was converted to Qt. Even now it still mostly looks like a C program with classes. Breaking the dependencies caused by public data members, and refactoring the huge classes with procedural monster methods into smaller methods and classes never has really taken off, I think for the following reasons:
There is no need to change code that is working and that does not need to be enhanced. Doing so introduces new bugs without adding functionality, and end users don't appreciate that;
It is very, very hard to do refactor reliably. Many pieces of code are so large and also so vital that people hardly dare touching it. We have a fairly extensive suite of functional tests, but sufficient code coverage information is hard to get. As a result, it is difficult to establish whether there are already sufficient tests in place to detect problems during refactoring;
The ROI is difficult to establish. The end user will not benefit from refactoring, so it must be in reduced maintenance cost, which will increase initially because by refactoring you introduce new bugs in mature, i.e. fairly bug-free code. And the refactoring itself will be costly as well ...
NB. I suppose you know the "Working effectively with Legacy code" book?
You mention that your tool is a compiler, and that: "Actually, pattern matching, not just type matching, in the multiple dispatch would be even better".
You might want to take a look at maketea. It provides pattern matching for ASTs, as well as the AST definition from an abstract grammar, and visitors, tranformers, etc.
If you have a small or academic project (say, less than 10,000 lines), a rewrite is probably your best option. You can factor it however you want, and it won't take too much time.
If you have a real-world application, I'd suggest getting it to compile as C++ (which usually means primarily fixing up function prototypes and the like), then work on refactoring and OO wrapping. Of course, I don't subscribe to the philosophy that code needs to be OO structured in order to be acceptable C++ code. I'd do a piece-by-piece conversion, rewriting and refactoring as you need to (for functionality or for incorporating unit testing).
Here's what I would do:
Since the code is 20 years old, scrap down the parser/syntax analyzer and replace it with one of the newer lex/yacc/bison(or anything similar) etc based C++ code, much more maintainable and easier to understand. Faster to develop too if you have a BNF handy.
Once this is retrofitted to the old code, start wrapping modules into classes. Replace global/shared variables with interfaces.
Now what you have will be a compiler in C++ (not quite though).
Draw a class diagram of all the classes in your system, and see how they are communicating.
Draw another one using the same classes and see how they ought to communicate.
Refactor the code to transform the first diagram to the second. (this might be messy and tricky)
Remember to use C++ code for all new code added.
If you have some time left, try replacing data structures one by one to use the more standardized STL or Boost.