Doxygen, too heavy to maintain? [closed] - c++

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
I am currently starting using doxygen to document my source code. I have notice that the syntax is very heavy, every time I modify the source code, I also need to change the comment and I really have the impression to pass too much time modifying the comment for every change I make in the source code.
Do you have some tips to document my source code efficiently ?
Does some editor (or plugin for existing editor) for doxygen to do the following exist?
automatically track unsynchronized code/comment and warn the programmer about it.
automatically add doxygen comment format (template with parameter name in it for example) in the source code (template) for every new item
PS: I am working on a C/C++ project.

Is it the Doxygen syntax you find difficult? Or is it the fact that you have to comment all of the functions now.
If it's the former, there may be a different tool that fits your coding style better. Keep in mind that Doxygen supports multiple commenting styles, so experiment until you find one you like.
If it's the latter, then tough it out. As a good programming practice, every public-facing function should have a comment header that explains:
What the function does
The parameters
The return codes
Any major warnings/limitations about the function.
This is true regardless of the documentation tool you use.
My big tip: Avoid the temptation to comment too much. Describe what you need, and no more. Doxygen gives you lots of tags, but you don't have to use them all.
You don't always need a #brief and a detailed description.
You don't need to put the function name in the comments.
You don't need to comment the function prototype AND implementation.
You don't need the file name at the top of every file.
You don't need a version history in the comments. (You're using a version control tool, right?)
You don't need a "last modified date" or similar.
As for your question: Doxygen has some configuration options to trigger warnings when the comments don't match the code. You can integrate this into your build process, and scan the Doxygen output for any warnings. This is the best way I have found to catch deviations in the code vs comments.

I feel you get back what you put into comments, 5 minutes commenting a class well will in 3 months time when the class needs to be changed by someone else other than the original author (indeed sometimes by the original author) will take much less time to get to grips with.
I second the documentation support mentioned by David, in eclipse when you refactor parameter names it will rename the parameter in your docs section for example. I am not sure I would want it doing anything else automatically to be honest.
"every time I modify the source code, I also need to change the comment" Could be you're documenting too much. You should only have to change the documentation of a function if the change to it requires you to change every caller in some way (or if not actually change, at least check to make sure they weren't relying on obsolete behaviour), of if you're introducing new functionality that a new caller will rely on. So in theory it shouldn't be a massive overhead. Small changes, like optimizations and bugfixes within the function, usually don't need documenting.

There is seriously something wrong about the way you use comments, or how you develop.
Doxygen comments are used for external/internal documentation on interfaces. If your interfaces change extremely fast, then you should probably sit down and think about the architecture layout first.
If you are using doxygen to document the internal flow of functions, then you should maybe rethink this approach (and even then, these comments shouldn't change that much). When you have a function that calculates some value then a comment /// Calculating xy using the abc algorithm is definitely something that shouldn't change each day.

It really depends on how much information you put in your documentation.
My functions generally smaller now due to unit testing and thus the documentation is correspondingly small. Also when documenting the class itself I always have a small snippet of code to show how the class is supposed to use. I find those are the hardest to maintain but worth it so you don't get juniors bugging you asking how they use the class.
Tips:
Only document your public interfaces.
Only do minimal documentation about what the function does.
Try to use an editor that has support (most do) or has a plugin.
You'll be glad when you come to edit your code in 6 months time...

Judge yourself if the below style fits your needs. It is C-affine tho, but maybe you can draw enough from it for your ends.
///\file
/// Brief description goes here
bool /// #retval 0 This is somewhat inconsistent. \n Doxygen allows several retval-descriptions but
/// #retval 1 will not do so for parameters. (see below)
PLL_start(
unsigned char busywait, ///< 0: Comment parameters w/o repeating them anywhere. If you delete this param, the
/// comment will go also. \n
/// 1: Unluckily, to structure the input ranges, one has to use formatting commands like \\n \n
/// 2-32767: whatever
int param) ///< 0: crash \n
/// 1: boom \n
/// 2: bang!
{
/**
* Here goes the long explanation. If this changes frequently, you have other more grave problems than
* documentation. NO DOCUMENTATION OF PARAMETERS OR RETURN VALUES HERE! REDUNDANCY IS VICIOUS!
* - Explain in list form.
* - It really helps the maintainer to grok the code faster.
*
*#attention Explain misuses which aren't caught at runtime.
*
*#pre Context:
* - This function expects only a small stack ...
* - The call is either blocking or non-blocking. No access to global data.
*
*#post The Foobar Interrupt is enabled. Used system resources:
* - FOO registers
* - BAR interrupts
*/
/**#post This is another postcondition. */
}

Use non-documenting comment for new features and early stages of your code. When you find your code is ready to publish, you may update docs. Also avoid repetition of argument or function names.

Use Doxygen's strengths - it will generate class and method descriptions without you adding comments (just not by default - set EXTRACT_ALL=YES).
I don't document every parameter because I think their names should do that for them (*).
I am against the auto-documenting plugins most people recommend because they create generic comments you will then have to maintain.
I want comments to be meaningful - if I see a comment it stands out and I will pay attention.
(*) the exception is when interfaces in public code are very stable and some people will benefit from additional explanations, even then I try to avoid documenting.

Not exactly what you're searching but this Vim plugin can generate Doxygen stub on top of your definitions. It work pretty well.

In my professional software experience, every time a source file is modified, a comment must be entered describing the change. These change comments are usually not in the Doxygen comment areas (unless changes are made to an interface).
I highly suggest you make commenting your code a habit. Not only is this good for when other people have to maintain or assist you with your code, but it helps when you have abandon a source file for a while (such as when management tells you to switch projects). I have found that writing comments as I code helps me understand the algorithms better.

In addition to Doxygen I think you should take a look at Code Rocket.
It actually documents what's happening "inside" your methods by trawling through the actual code and comments they contain - therefore not just limiting itself to the function comment headers, which can get out of date with what the function contents actually are.
It will automatically provide you with flowchart and pseudocode visualizations of your method contents as a form of documentation.

Related

Understanding large code [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
The question of understanding a large code has previously been well answered. But I feel I should ask this question again to ask the problems I have been facing.
I have just started a student job. I am a beginner programmer and just learned about classes two months back. At the job though, I have been handed a code that is part of a big software. I understand what that code is supposed to do (to read a file). But after spending a few weeks trying to understand the code and modify it to achieve our desired results, I have come to the conclusion that I need to understand each line of that code. The code is about 1300 lines.
Now when i start reading the code, I find that, for example, a variable is defined as:
VarType VarName
Now VarType is not a type like int or float. It is a user defined type so i have to go the class to see what this type is.
In the next line, I see a function being called, like points.interpolate(x);
Now i have to go into another class and see what the interpolate function does.
This happens a lot which means even if I try to understand a small part of the code, I have to go to 3 or 4 different classes and keep them in mind all at one time without losing the main objective and that is tough.
I may not be a skilled programmer but I want to be able to do this. Can I have some suggestions how i should approach this?
Also (I will sound really stupid when I ask this) what is a debugger? I hope this gives you an idea of where I stand (and the need to ask this question again). :(
With any luck, those functions and classes should have at least some documentation to describe what they do. You do not need to do know how they work to understand what they do. When you see the use of interpolate, don't start looking at how it works, otherwise you end up in a deep depth-first-search through the code base. Instead, read its documentation, and that should tell you everything you need to know to understand the code that uses it.
If there is no documentation, I feel for you. I can suggest two tips:
Make general assumptions about what a function or class will do from its name, return type and arguments and the surrounding code that uses it until something happens that contradicts those assumptions. I can make a pretty good guess about what interpolate does without reading how it works. This only works when the names of the functions or classes are sufficiently self-documenting.
If you need a deep understanding of how some code works, start from the bottom and work upwards. Doing this means that you won't end up having to remember where you were in some high level code as you search through the code base. Get a good understanding of the low level fundamental classes before you attempt to understand the high level application of those types.
This also means that you will understand the functions and classes in a generic sense, rather than in the context of the code that led you to them. When you find points.interpolate(x), instead of wondering what interpolate does to these specific points with this specific x argument, find out what it does in general. Later, you will be able to apply your new-found knowledge to any code that uses the same function.
Nonetheless, I wouldn't worry about 1300 lines of code. That's basically a small project. It's only larger than examples and college assignments. If you take these tips into account, that amount of code should be easily manageable.
A debugger is a program that helps you debug your code. Common features of debuggers allow you to step through your code line-by-line and watch as the values of variables change. You can also set up breakpoints in your code that are of interest and the debugger will let you know when it's hit them. Some debuggers even let you change code while executing. There are many different debuggers that all have different sets of features.
Try making assumptions about what the code does based on its title. For example, assume that the interpolate function correctly interpolates your point; only go digging in that bit of code if the output looks suspicious.
First, consider getting an editor/IDE that has the following features:
parens/brackets/braces matching
collapsing/uncollapsing of blocks of code between curly braces
type highlighting (in tooltips)
macro expansion (in tooltips or in a separate window/panel)
function prototype expansion (in tooltips or in a separate window/panel)
quick navigation to types, functions and classes and back
opening the same file in multiple windows/panels at different positions
search for all mentions/uses of a specific type, variable, function or class and presentation of that as a list
call tree/graph construction/navigation
regex search in addition to simple search
bookmarks?
Source Insight is one of such tools. There must be others.
Second, consider annotating the code as you go through it. While doing this, note (write down) the following:
invariants (what's always true or must always be true)
assumptions (what may not be true, e.g. missing checks/validations or unwarranted expectations), think "what if"
objectives (the what) of a piece of code
peculiarities/details of implementation (the how; e.g. whether exceptions are thrown and which, which error codes are returned and when)
a simplified call tree/graph to see the code flow
do the same for data flow
Draw diagrams (in ASCII or on paper/board); I sometimes photograph my papers or the board. Specifically, draw block diagrams and state machines.
Work with code at different levels of abstraction/detail. Zoom in to see the details, zoom out to see the structure. Collapse/uncollapse blocks of code and branches of the call tree/graph.
Also, have a checklist of what you are going to do. Check the items you've done. Add more as necessary. Assign priorities to work items, if it's appropriate.
A debugger is a program that lets you execute your program step by step and examine its state (variables). It also lets you modify the state and that may be useful at times too.
You may use a debugger to understand your code if you're not very well familiar with it or with the programming language.
Another thing that may come in handy is writing tests or input data test sets for your program. They may reveal problems and limitations in terms of logic and performance.
Also, don't neglect documentation and people! If there's something or someone that can give you more information about the project/code, use that something or someone. Ask for advice.
I know this sounds like a lot, but you'll end up doing some of this at some point anyway. Just wait for a big enough project. :)
You may basically needs to understand what is the functionality of a function being called at first, then understand what is input and output to that function, for example, if you really needs to understand how interpolate is done, you can then go to the details. Usually, the name of the functions are self-explainable, you can get a feeling about what the function does from its name if the code is well written.
Another thing you may want to try is to run some toy examples to go through the code, you can use some of the debuggers or IDE that can help you navigate through the code. Understanding large-scale code takes time and experience, just be patience.
"Try the Debugger Approach"
[Update : A debugger is a special program that lets you pause a running program to examine the state of program (Variable Values/Which function is running/Who is the parent function etc.,)]
The way I do it is by Step Debugging the code, for the usecase I want to understand.
If you are using an Advanced/Mordern IDE then setting breakpoints at the entry point (like main() or a point of interest) is fairly easy. And from there on just enter into the function you want to examine or overstep the function.
To give you a step by step approach
Setup a break point in the main() methods (entry points) starting expression.
Run the program with debugging active
The program will break at the break point.
Now, if step over until you come across a function/expression that seems interesting. (say, your points.interpolate(x); ) function
Step into the function, and examine the program state like the variables and function stack, in live.
Avoid complex system Libraries. Just Step over/Step out. (Example: Avoid something like MathLib.boringComputaion() )
Repeat until the program exits.
I found out that this way of learning is very rapid and gives you a quick understanding of any complex/large piece of software.
Use Eclipse, or if you cant then try GDB if its C/C++. Every popular programming language has a decent Debugger.
Understand the basic debugging operations like will be a benifit:
Setting-up a breakpoint.
Stopping at a breakpoint.
Examine/Watch Variables.
Examine Function Stack (the hierarchy of function calls)
Single-Step - Stepping to next Line in Code.
Step-Into a function.
Step-Out of a function.
Step-over a function.
Jumping to the next breakpoint (point of interest).
Hope, it helps!
Many great answer have already been given. I thought to add my understanding as a former student (not too long ago) and what I learned to help me understand code. This particularly helped me because I began a project to convert a database I wrote in Java many years ago to c++.
1. **Code Reading** - Do not underestimate this important task. The ability to write code
does not always translate into the ability to read it -- and reading it can be more
frustrating than writing it.
Take your time and carefully discover what each line of the codes does. This will certainly help you avoid making assumptions unless you come across code that you are familiar with and can gloss over it.
2. Don't hesitate to follow references, locate declarations, and uncover definitions of
code elements you are reading. What you learn about how a particular variable,
method call, or class are defined all contribute to learning and ultimately to you
being able to perform your task.
This is particularly important because detective, and effective detective work, are essential parts of being bale to understand the small parts of the code so that you can, in the future, grasp the larger parts with less difficulty.
Others have already posted information about what a debugger is and you will find it is an invaluable asset at tracking down code errors and, I think, helps with code reading, knowledge gain, and understanding so you can be a successful programmer.
Here is a link to a debugger tutorial utilizing Visual Studio and may give you a strong understanding of at least the process at hand.

Dual purpose code commenting(users & maintainers)...HOW? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
I am writing a C++ static library and I have been commenting with doxygen comments in the implementation files. I have never really had to care very much about documentation but I am working on something now that needs to be documented well for the users and also I am trying to replace my previous bad habit of just wanting to code and not document with better software engineering practices.
Anyway, I realized the other day that I need a couple different types of documentation, one type for users of the library(doxygen manual) and then comments for myself or a future maintainer that deal more with implementation details.
One of my solutions is to put the doxygen comments for file, class, and methods at the bottom of the implementation file. There they would be out of the way and I could include normal comments in/around the method definitions to benefit a programmer. I know it's more work but it seems like the best way for me to achieve the two separate types of commenting/documentation. Do you agree or have any solutions/principles that might be helpful. I looked around the site but couldn't really find any threads that dealt with this.
Also, I don't really want to litter the interface file with comments because I feel like it's better to let the interface speak for itself. I would rather the manual be the place a user can look if they need a deeper understanding of the library interface. Am I on the right track here?
Any thoughts or comments are much appreciated.
edit:
Thanks everyone for your comments. I have learned alot from hearing them. I think I have a better uderstanding of how to go about separating my user manual from the code comments that will be useful to a maintainer. I like the idea that #jalf has about having a "prose" style manual that helps explain how to use the library. I really think this is better than just a reference manual. That being said...I also feel like the reference manual might really come in handy. I think I will combine his advice with the thoughts of others and try to create a hybrid.(A prose manual(using the doxygen tags like page, section, subsection) that links to the reference manual.) Another suggestion I liked from #jalf was the idea of the code not having a whole manual interleaved into it. I can avoid this by placing all of my doxygen comments at the bottom of the implementation file. That leaves the headers clean and the implementation clean to place comments useful for someone maintaining the implementation. We will see if this works out in reality. These are just my thoughts on what I have learned so far. I am not positive my approach is going to work well or even be practical. Only time will tell.
I generally believe that comments for users should not be inline in the code, as doxygen comments or anything like that. It should be a separate document, in prose form. As a user of the library, I don't need to, or want to, know what each parameter for a function means. Hopefully, that's obvious. I need to know what the function does. And I need to know why it does it and when to call it. And I need to know what pre- and postconditions apply. What assumptions does the function make when I call it, and what guarantees does it provide when it returns?
Library users don't need comments, they need documentation. Describe how the library is structured and how it works and how to use it, and do so outside the code, in an actual text document.
Of course, the code may still contain comments directed at maintainers, explaining why the implementation looks the way it does, or how it works if it's not obvious. But the documentation that the library user needs should not be in the code.
I think the best approach is to use Doxygen for header files to describe (to the users) how to use each class/method and to use comments within the .cpp files to describe the implementation details.
Well done, Doxygen commenting can be very useful both when reading code and when reading generated HTML. All the difficulty lies in Well done.
My approach is as following:
For users of library, I put Doxygen comments in header files for explaining what is the purpose of that function and how to use it by detailing all arguments, return values and possible side effects. I try to format it such that generated documentation is a reference manual.
For maintainers, I put basic (not Doxygen) comments in implementation files whenever self-commenting code is not enough.
Moreover, I write a special introductory file (apart from the code) in Doxygen format for explaining to new users of libray how to use the various features of the library, in the form of a user's guide which points to details of reference manual. This intro appears as the front page of the Doxygen generated documentation.
Doxygen allows the creation of two versions of the documentation (one for users and one for "internal use") through the \internal command and the INTERNAL_DOCS option. It is also possible to have a finer grained control with conditional sections (see the \if command and the ENABLED_SECTIONS option.)
As others have already noted, it is also useful to provide users (and also maintainers sometimes) something at a higher level than strictly code comments. Doxygen can also be used for that, with the \mainpage, \page, [sub[sub]]section and \par commands
I recommend you to take a look at this paper: http://www.literateprogramming.com/knuthweb.pdf
I normally applied those ideas to my projects (using Doxygen). It also helps in keeping the doc up to date because it is not necessary to leave the IDE, so one can make annotations while coding and, later on, revise the final pdf document to see what needs to be updated or more detailed.
In my experience, Doxygen requires some work so that the pdf look nice, the graphs and pics in place, etc. but once you find your ways and learn the limitations of the tool, it gets the job done quite well.
My suggestion, besides what Kyle Lutz and Eric Malefant have already said, is to put long explanations about related classes in its own file (I use a header file for that) and add references to other parts using Doxygen tags. You only need to include those headers in the Doxygen configuration file (using pattern matching). This avoids cluttering your headers too much.
There is no quick easy answer, good documentation is hard.
I personally feel a layered model is best.
high level docs in prose. Pictures and videos are very appropriate.
reference level docs should Doxygen (well done doxygen, not just off hand comments).
maintainer docs should not show up in the reference docs, but they could still be doxygen as pointed out by by Éric.
I really like the documentation style used in RakNet. The author uses extensive Doxygen comments and provides a generated reference manual. He also provides some plain html tutorials. Best of all he supplies video walk-throughs of some of the more complicated features.
Another good example is SFML. The quality isn't as good as RakNet but it's still very good. He provides a good overview page in the doxygen generated documentation. There are a few plain html tutorials and a plain html Features/Overview page.
I prefer these styles as Doxygen generated documentation is generally too low level when I'm just starting out, but perfectly concise once I'm in the groove.

How to manage special cases and heuristics

I often have code based on a specific well defined algorithm. This gets well commented and seems proper. For most data sets, the algorithm works great.
But then the edge cases, the special cases, the heuristics get added to solve particular problems with particular sets of data. As number of special cases grow, the comments get more and more hazy. I fear going back and looking at this code in a year or so and trying to remember why each particular special case or heuristic was added.
I sometimes wish there was a way to embed or link graphics in the source code, so I could say effectively, "in the graph of this data set, this particular feature here was causing the routine to trigger incorrectly, so that's why this piece of code was added".
What are some best-practices to handle situations like this?
Special cases seem to be always required to handle these unusual/edge cases. How can they be managed to keep the code relatively readable and understandable?
Consider an example dealing with feature recognition from photos (not exactly what I'm working on, but the analogy seems apt). When I find a particular picture for which the general algorithm fails and a special case is needed, I record as best I can that information in a comment, (or as someone suggested below, a descriptive function name). But what is often missing is a permanent link to the particular data file that exhibits the behavior in question. While my comment should describe the issue, and would probably say "see file foo.jp for an example of this behavior", this file is never in the source tree, and can easily get lost.
In cases like this, do people add data files to the source tree for reference?
Martin Fowler said in his refactoring book that when you feel the need to add a comment to your code, first see if you can encapsulate that code into a method and give the method a name that would replace the comment.
so as an abstract you could create a method named.
private bool ConditionXAndYHaveOccurred(object param)
{
// code to check for conditions x and y
return result;
}
private object ApplySolutionForEdgeCaseWhenXAndYHappen(object param)
{
//modify param to solve for edge case
return param;
}
Then you can write code like
if(ConditionXAndYHaveOccurred(myObject))
{
myObject = ApplySolutionForEdgeCaseWhenXAndYHappen(myObject);
}
Not a hard and fast concrete example, but it would help with readability in a year or two.
Unit testing can help here. Having tests that actually simulate the special cases can often serve as documentation on why the code does what it does. This can often be better then just describing the issue in a comment.
Not that this replaces moving the special case handling to their own functions and decent comments...
If you have a knowledge base or a wiki for the project, you could add the graph in it, linking to it in the method as per Matthew's Fowler quote and also in the source control commit message for the edge case change.
//See description at KB#2312
private object SolveXAndYEdgeCase(object param)
{
//modify param to solve for edge case
return param;
}
Commit Message: Solution for X and Y edge case, see description at KB#2312
It is more work, but a way to document cases more thoroughly than mere test cases or comments could. Even though one might argue that test cases should be documentation enough, you might not want store the whole failing data set in it, for instance.
Remember, vague problems lead to vague solutions.
I'm not usually an advocate of test driven development and similar styles that stress tests too much, but this seems to be a perfect case where a bunch of unit test can help a lot. And not even in the first place to catch bugs from later changes, but simply to document all the special cases that need to be addressed.
A few good unit test with comments in them are in itself the best description of the special cases. And the commenting of the code itself gets easier too. One can simply point to some unit tests that illustrate the problem that is being solved at that point in the code.
About the
I sometimes wish there was a way to embed or link graphics in the source code,
so I could say effectively, "in the graph of this data set, this particular
feature here was causing the routine to trigger incorrectly, so that's why this piece of
code was added".
part:
If the "graphic" that you want to embed is a graph, and if you use Doxygen, you can embed dot commands in your comment to generate a graph in the documentation:
/**
If we have a subgraph looking like this:
\dot
digraph g{
A->B;
A->C;
B->C;
}
\enddot
the usual method does not work well and we use this heuristic instead.
*/
Don Knuth invented literate programming to make it easy for your program documentation to include plots, graphs, charts, mathematical equations, and whatever else you need to make it understood. A literate program is a great way to explain why something is the way it is and how it got that way over time. There are many, many literate-programming tools; the "noweb" tool is one of the simplest and is shipped with some Linux distributions.
Without knowing the specific nature of your problem is not easy to give an answer, but in my own experience, handling of special cases on hard code must be avoided. Haven't you thought about implementing a rules engine or something like that for handling special cases outside your main processing algorithm?
It sounds like you need more thorough documentation than just code comments. That way someone could look up the function in question in the documentation and be presented with an example picture that requires a special case.

C/C++ Header file documentation [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
What do you think is best practice when creating public header files in C++?
Should header files contain no, brief or massive documentation? I've seen everything from almost no documentation (relying on some external documentation) to large specifications of invariants, valid parameters, return values etc. I'm not sure exactly what I prefer, large documentation is nice since you've always access to it from your editor, on the other hand a header file with very brief documentation can often show a complete interface on one or two pages of text giving a much better overview of what's possible to do with a class.
Let's say I go with something like brief or massive documentation. I want something similar to javadoc where I document return values, parameters etc. What's the best convention for that in c++? As far as I can remember doxygen does good stuff with java doc-style documentation, but are there any other conventions and tools for this I should be aware of before going for javadoc style documentation?
Usually I put documentation for the interface (parameters, return value, what the function does) in the interface file (.h), and the documentation for the implementation (how the function does) in the implementation file (.c, .cpp, .m).
I write an overview of the class just before its declaration, so the reader has immediate basic information.
The tool I use is Doxygen.
I would definetely have some documentation in the header files themselves. It greatly improves debugging to have the information next to the code, and not in separate documents. As a rule of thumb, I would document the API (return values, argument, state changes, etc) next to the code, and high-level architectural overviews in separate documents (to give a broader view of how everything is put together; it's hard to place this together with the code, since it usually references several classes at once).
Doxygen is fine from my experience.
I believe Doxygen is the most common platform for generating documentation, and as far as I know, it's more or less able to cover JavaDoc-notation (not limited to of course). I've used doxygen for C, with OK results, I do think it's more suitable for C++ though. You might want to look into robodoc as well, although I think Occam's Razor will tell you to rather go for Doxygen.
Regarding how much documentation, I've never been a documentation-fan myself, but whether I like it or not, having more documentation always beats having no documentation. I'd put the API-documentation in the header file, and the implementation documentation in the implementation (stands to reason, doesn't it?). :) That way, IDEs have the chance to pick it up and show it during autocompletion (Eclipse does this for JavaDocs, for example, perhaps also for doxygen?), which shouldn't be underestimated. JavaDoc has this additional quirk that it uses the first sentence (i.e. up to the first full stop) as a brief description, don't know if Doxygen does this though, but if so, it should be taken into consideration when writing the documentation.
Having a lot of documentation runs the risk of being out of date, however, keeping the documentation close to the code will give people a chance to keep it up to date, so you should definately keep it in the source/header files. What shouldn't be forgotten though is the production of documentation. Yes, some people will use the documentation directly (through the IDE or whatever, or just reading the header file), but some people prefer other ways, so you should probably consider putting your (regularly updated) API documentation online, all nice and browsable, as well as perhaps producing man-files if you're targeting *nix-based developers.
That's my two cents.
Put enough into the code that it can stand alone. Nearly every project I've been in where the documentation was separate, it got out of date or wasn't done, partly that if it's a separate document it becomes a separate task, partly as management didn't allow for it as a task in budgetting. Documenting inline as part of the flow works much better in my experience.
Write the documentation in a form which most editors recognise is a block; for C++ this seems to be /* rather than //. That way you can fold it and just see the declarations.
Maybe you would be interested in gtk-doc. It can be "a bit awkward to setup and use" but you can get a nice API documentation from source code, looking like this:
String Utility Functions

Comments in source code [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
How to keep the source code well documented/commented? Is there a tool to generate a skeleton for comments on the Unix platform for C++?
In general, how many lines of comments is recommended for a file with around 100 lines of code?
Generally, it's best to let the code itself explain what it does, whereas the comments are there to describe why it's like that. There is no number to stick to. If your 100 lines speak for themselves, don't comment at all or just provide a summary at the beginning. If there is some knowledge involved that's beyond what the code does, explain it in a comment.
If you're code is too complicated to explain itself, then that may be a reason to refactor.
This way, when you change the implementation you don't need to change the comments as well, as your comments do not duplicate the code. Since the reasons for the design seldom change it's safe to document them in comments for clarity.
Personally I think skeleton comments are a horrible, horrible idea. I understand that sometimes it's nice to save couple of keystrokes and perhaps get argument signatures in comment... but resulting n+1 empty useless comments (when editor has added boilerplates and coder has left them as is) are just more irritating.
I do think comments are needed, at any rate -- if only code one writes is too trivial to ned explanation, chances are code in question is useless (i.e. could have been automated and needn't be hand-written). I tend to comment my code reasonably well because I have learnt that usually I need it myself first. That others can use them is just an added bonus.
In general, how many lines of comments is recommended for a file with around 100 lines of code?
Enough to make your intent clear and to explain any unfamiliar idioms used. There's no rule of thumb, because no two 100 lines of code are the same.
For example, in C#, a property can be given setters and getters like this:
public int ID { get; set; }
Now I hadn't even seen any C# until I joined StackOverflow two weeks ago, but that needs no comment even for me. Commenting that with
// accessor and setter for ID property
would just be noise. Similarly,
for( int i = m ; i < n; ++i) { // "loop from m to n" is a pointless comment
char* p = getP() ; // set p to getP, pure noise.
if( p ) // if p does not eqaul null, pure noise
int a &= 0x3; // a is bitwise or'd with 0x303, pure noise
// mask off all but two least significant bits,
//less noisy but still bad
// get remainder of a / 4, finally a useful comment
Again, any competent coder can read the code to see what it's doing. Any coder with basic experience knows that if( p ) is a common idiom for if( p != 0), which doesn't need explaining. But no one can read your intent unless you comment it.
Comment what you're trying to do, your reason for doing it, not what the code is plainly doing.
On edit: you'll note that after 11 days, no one has commented on intentional error in one of my example comments. That just underscores that that comment is pure noise.
I think this question has a lot of good relevant answers for a similar question: Self-documenting code
As for tools for creating comments, it depends on the editor you're using and the platform. Visual studio automatically creates space for comments, at least it does for C# sometimes. There are also tools that use comments to generate documentation. As for lines counts, I think that's irrelevant. Be as concise and clear as possible.
I think a good guideline is to comment every class and method with a general description of what each is for, especially if you are using an HTML documentation generation tool. Other than that, I try to keep comments to a minimum - only comment code that could potentially be confusing, or require interpretation of intent. Try to write your code in a way that doesn't require comments.
I don't think there is really a metric that you can apply to comments/lines of code, it just depends on the code.
My personal ideal is to write enough commentary so that reading just the comments explains how and why a function is intended to be used. How it works, should usually come out from well chosen variable names and clear implementation.
One way to achieve that, at least on the comment side, is to use a tool such as Doxygen from the beginning. Start coding each new function by writing the comment describing what it is for and how it should be used.
Get Doxygen configured well, have document generation included as a build step, and read the resulting documentation.
The only comment template that might be helpful would be one that sketches in the barest beginning of the Doxygen comment block, but even that might be too much. You want the generated documentation to explain what is important without cluttering it with worthless placeholder text that will never get rewritten.
This is a subject which can be taken to extremes (like many things these days). Enforcing a strong policy sometimes can risk devaluing the exercise (i.e. comments for comment's sake) most of the time, IMHO.
Sometimes an overreaching policy makes sense (e.g. "all public functions must have comment blocks") with exceptions - why bother for generated code?
Commenting should come naturally - should compliment readble code alongside meaningful variable, property and function names (etc).
I don't think there is a useful or accurate measurement of X comments per Y lines of code. You will likely get a good sense of balance through peer reviews (e.g. "this code here should have a comment explaining it's purpose").
I'm not sure about auto-comment tools for C/C++ but the .Net equivalent would have to be GhostDoc. Again, these tools only help define a comment structure - meaning still needs to be added by a developer, or someone who has to interpret the point of the code or design.
Commenting code is essential if your auto generating your documentation (we use doxygen). Otherwise it's best to keep it to a minimum.
We use a skeleton for every method in the .cpp file.
//**************************************************************************************************
//
/// #brief
/// #details
/// #param
/// #return
/// #sa
//
//**************************************************************************************************
but this is purely due to our documentation needs
The rules I try to follow:
write code that is auto-documented: nice and clear variable names,
resist the temptation of clever hacks, etc. This advice depends a
lot on the programming language you use: it is is much easier to
follow with Python than with C.
comment at the beginning to guide the reader so that they know
immediately what they are to expect.
comment what is not obvious from the code. If you had trouble
writing a piece of code, it may mean it desserves a comment.
the API of a library is a special case: it requires
documentation (and to put it in the code is often a good idea,
especially with tools like Doxygen). Just do
not confuse this documentation intended for users with the one
which will be useful for the maintainers of the library.
comment what cannot be in the code, such as policy requirments that
explain why things are the way they are.
comment background information such a the reference to a scientific
article which describes the clever algorithm you use, or the RFC
standardizing the network protocol you implement.
comment the hacks! Everyone is sometimes forced to use hacks or
workarounds but be nice for the future maintainer, comment it. Read
"Technical debt".
And don't comment the rest. Quantitative rules like "20 % of the lines
must be comments" are plainly stupid and clearly intended only for
PHBs.
I'm not aware of any tool but I feel it's always good to have some comments in the code if it is to be maintained by someone else in the future. At least, it's good to have header blocks for classes and methods detailing what the class is meant for and what the method does. But yes, it is good to keep the comments as minimal as possible.
I prefer to use comments to explain
what a class\function is intended to do,
what it is not supposed to do,
any assumptions I make that the users of the class\function should adhere to.
For the users of vi editor the following plug-in is very helpful. We can define templates for class comments, function comments etc.
csupport plug in
There are no good rules in terms of comment/code ratios. It totally depends on the complexity of your code.
I do follow one (and only one) rule with respect to comments (I like to stay flexible).
The code show how things are done, the comments show what is done.
Some code doesn't need comments at all, due to it's obviousness: this can often be achieved by use of good variable names. Mostly, I'll comment a function then comment major blocks withing the function.
I consider this bad:
// Process list by running through the whole list,
// processing each node within the list.
//
void processList (tNode *s) {
while (s != NULL) { // Run until reached end of list.
processNode (s); // Process the node.
s = s->nxt; // Move to next node.
}
}
since all you're doing there is writing the code thrice. I would prefer something like:
// Process list (or rest of list if you pass a non-start node).
//
void processList (tNode *currentNode) {
// Run through the list, processing each node.
while (currentNode != NULL) {
processNode (currentNode);
currentNode = currentNode->nextNode;
}
}
You guys may argue about but i realy believe in it:
Usually , You don't have to write comments. Simply as that. The code has to be written in such way that is explain itself , if it doesn't explain itself and you have to write comments , then something is wrong.
There are however some exceptional cases:
You have to write something that is VERY cryptic to gain performence. So here you may need to write some explanation.
You provide a library to some other group/company , It is better you document its API.
There are too many novice programemers in your organization.
I wouldn't be so rude to say that comments are excuse for badly programmed code like some people above, nor to say you don't need them.
It also depends on your editor and how do you like to see your code in it, and how would you like others to do that.
For instance, I like to create regions in C#. Regions are named collapsable areas of code, in some way commented code containers. That way, when I look at the editor, I actually look at pseudo code.
#region Connect to the database
// ....
#endregion
#region Prepare tables
#region Cache tables ...
#endregion
#region Fix column names ...
#endregion
#endregion
This kind of code is more readable then anything else I know but ofcourse, it needs editor supporing custom folding with names. (like Visual Studio editor, VIM... ). Somebody will say that you can achieve the similar if you put regions into procedures but first, you can't always do that, second, you have to jump to the procedure to see its code. If you simply set hotkies to open/collapse the region you can quickly see the code in it while scrolling and reading text and generally quickly move over the hierarchy of regions.
About line comments, it would be good to write code that autodocuments itself, but unfortunatelly, this can't be said in generall. This ofcourse depends on projects, its domain and its complexity.
As a last note, I fully suggest in-code documentation via portable and language indepent tool, like for instance NaturalDocs that can be made to work with any language around with natural syntax that doesn't include XML or any kind of special formating (hence the name) plus it doesn't need to be installed more then once.
And if there is a guy that don't like comments, he can always remove them using some simple tool. I even integrated such tool in my editor and comments are gone via simple menu click. So, comments can't harm the code in any way that can't be fixed very fast.
I say that generally comments are a bad smell. But inline code documentation is great. I've elaborated more on the subject over at robowiki.net:
Why Comments are Bad
I concur with everyone about self-documenting code. And I also agree about the need for special comments when it comes to documentation generation. A short comment at the top of each method/class is useful, especially if your IDE can use it for tooltips in code completion (like Visual Studio).
Another reason for comments that I don't see mentioned here is in type-unsafe languages like JavaScript or PHP. You can specify data types that way, although hungarian notation can help there as well (one of the rare cases for using it properly, I think).
Also, tools like PHPLint can use some special type-related comments to check your code for type-safety.
There are no metrics you can sensibly use for comments. You should never say x lines of code must have y comments, because then you will end up with silly useless comments that simply restate code, and these will degrade the quality of your code.
100 lines of code should have as few comments as possible.
Personally, having used them in the past, I'd not use things like doxygen to document internal code to the extent of every function and every parameter needing tagged descriptions because with well factored code you have many functions and with good names, most often these tagged descriptions don't say any more than the parameter name itself.
My opinion - comments in source code is evil. Code should be self documented. Developers usually forget about reading and updating them.
As sad Martin Fowler: "if you need comment for lines block - just make new function" (this not quote - this phrase as I remember it).
It will be better to keep separated documentation for utility modules, basic principles of your project, organization of libraries, some algorithms and design ideas.
Almost forget: I was used code comments once. It was MFC/COM - project and I leave links from MSDN howto articles near non-trivial solutions/workarounds.
100 lines of source code - should be understandable if not - it should be separated or reorganized on few functions - which will be more understandable.
Is there a tool to generate a skeleton
for comments on the Unix platform for
C++?
Vim have plugins for inserting doxygen comments template, if you really need this.
Source code should always be documented where needed. People have argued for what and what not to document. However I wanted to attribute with one more note.
Let's say I have implemented a method that return a/b
So as the programmer I am a great citizen, and I will hint the user what to expect.
/**
* Will return 0 if b is 0, to prevent the world from exploding.
*/
float divide(float a, float b) {
if (b == 0) return 0;
return a/b;
}
I know, this is pretty obvious that nobody would ever create such a method. But this can be reflected to other issues, where users of an API can't figure out what a function expects.