Doxygen parsing ampersands for ascii chars - c++

I've been using Doxygen to document my project but I've ran into some problems.
My documentation is written in a language which apostrophes are often used. Although my language config parameter is properly set, when Doxygen generates the HTML output, it can't parse apostrophes so the code is shown instead of the correct character.
So, in the HTML documentation:
This should be the text: Vector d'Individus
But instead, it shows this: Vector d'Individus
That's strange, but searching the code in the HTML file, I found that what happens is that instead of using an ampersand to write the ' code, it uses the ampersand code. Well, seeing the code is easier to see:
<div class="ttdoc">Vector d&#39;Individus ... </div>
One other thing is to note that this only happens with the text inside tooltips...
But not on other places (same code, same class)...
What can I do to solve this?
Thanks!

Apostrophes in code comments must be encoded with the correct glyph for doxygen to parse it correctly. This seems particularly true for the SOURCE_TOOLTIPS popups. The correct glyph is \u2019, standing for RIGHT SINGLE QUOTATION MARK. If the keyboard you are using is not providing this glyph, you may write a temporary symbol (e.g. ') and batch replace it afterwards with an unicode capable auxiliary tool, for example: perl -pC -e "s/'/\x{2019}/g" < infile > outfile. Hope it helps.

Regarding the answer from ramkinobit, this is not necessary, doxygen can use for e.g. the Right Single quote: ā€™ (see doxygen documentation chapter "HTML commands").
Regarding the apostrophe the OP asks for one can use (the doxygen extension) &apos; (see also doxygen documentation chapter "HTML commands")).
There was a double 'HTML escape' in doxygen resulting in the behavior as observed for the single quote i.e. displaying '.
I've just pushed a proposed patch to github (pull request 784, https://github.com/doxygen/doxygen/pull/784).
EDIT 07/07/2018 (alternative) patch has been integrated in main branch on github.

Related

What does ā€œ\pā€ in comments means?

During reading LLVM source code, I find something different in comments, e.g.
/// If \p DebugLogging is true, we'll log our progress to llvm::dbgs().
What does \p means here?
LLVM uses Doxygen for generating documentation, the /// marker is one of the many ways of creating a special comment block that Doxygen will parse to form documentation.
Within a special comment block, \p is simply one of the mark-up commands, this particular one renders the following word in typewriter font (fixed rather than proportional). The \c option is an alias for the same thing.
3 slashes is one of the ways that doxygen comments are identified.
The \p tag has some meaning, see it's documentation: https://www.doxygen.nl/manual/commands.html#cmdp
Displays the parameter using a typewriter font. You can use this command to refer to member function parameters in the running text.
I agree . These seems to be Doxygen commands to format typewritten fonts but since its in comments its not showing the 'font format' but the character itself.
Comments are not touched or processed by Doxygen. They have their own formatting. The /c /p precedes some important keywords (methods, members, parameters etc) only and not arbitrary. The author in all good intentions wanted people to identify the keywords but in comments, all are equal.

How to set up clang-format comment pragmas so multiline doxygen comments don't get touched?

I am trying to introduce clang-format to a couple of our projects at work (C and C++), but I am having trouble getting it to format multi-line Doxygen comments the way I want.
All comments have the same format:
/*! #brief Some text
*
* Some more text
*
* #verbatim
*
* A very long line of text that exceeds the clang-format column width but should not be touched
*
* #endverbatim
*/
I want clang-format to leave the verbatim blocks alone and not reflow them. I am using clang-format-6.0
Turning ReflowComments off is not an option as non-doxygen comments must be taken care of by clang-format
I have tried various regular expressions in the CommentPragmas config item but to no avail:
#verbatim(.*\n)*.*#endverbatim to treat the entire verbatim block as a comment pragma. This is the ideal situation, as any other part of the Doxygen comment I do not mind being broken into multiple lines.
#brief(.*\n)+ to match the entire comment block as the pragma. I've also tried this with an arbitrary token at the end of the comment to act as an explicit end-of-block marker. This isn't ideal as it doesn't force the non-verbatim part of the comment to conform, but is a compromise I'm willing to live with if I have to.
Various other regexes I've seen in other discussions, adapted to fit our Doxygen markup.
All I've managed to get it to do so far is to leave the first line of the multi-line comment alone, if it happens to exceed the column limit. However, any following long line is still broken up.
The only other tool I have left in my box is to use // clang-format off and // clang-format on around these comments but again I'd like to avoid it if I can because:
a) it'll be quite tedious to add them throughout the code base
b) I'll have to surround the entire comments with these, rather than just the verbatim blocks (I haven't figured out if you can disable it just for a portion of a multi-line comment - I've only managed to get it working for an entire one, and even if that was possible the clang-format directives would end up in the generated Doxygen docs which is unacceptable)
c) I don't really like the way it looks in code.
Any help is appreciated. Thanks.
Ran into this issue also, and the only work around found was to use clang-format on/off.
clang-format re flowing comments tends to:
break #page, #section, etc titles, and links generated from them (in rare cases).
break #startuml blocks, which have a specific syntax.
break #verbatim blocks.
See an example of usage in MySQL:
https://github.com/mysql/mysql-server/blob/8.0/storage/perfschema/pfs.cc
Update:
Filed a feature request on clang-format itself:
https://bugs.llvm.org/show_bug.cgi?id=44486

Ignore multiline comments git diff

I'm trying to find the significant differences in C/C++ source code in which only source code changes. I know you can use the git diff -G<regex> but it seems very limiting in the kind of regexes that can be run. For example, it doesn't seem to offer a way to ignore multiline comments in C/C++.
Is there any way in git or preferably libgit2 to ignore comments (including multiline), whitespaces, etc. before a diff is run? Or a way of determining if a line from the diff output is a comment or not?
git diff -w to ignore whitespace differences.
You cannot ignore multiline comments because git is a versioning tool, not a language dependent interpreter. It doesn't know your code is C++. It does not parse files for semantics, so it cannot interpret what is comment and what isn't. In particular, it relies on diff (or a configured difftool) to compare text files and it expects a line-by-line comparison.
I agree with #andrew-c that what you are really asking is to compare the two pieces of code without comments. More specifically helpful, you are asking to compare the lines of code where all multiline comments have been turned into empty lines. You keep the blank lines there so you have the correct line numbers to reference on a normal copy.
So you could manually convert the two code states to blank out multiline comments... or you might look at building your own diff wrapper that did the stripping for you. But the latter is not likely to be worth the effort.
You can achieve this using git attributes and diff filters as described in Viewing git filters output when using meld as a diff tool to call a sed script, which however is pretty complex on its own if you want it to handle all cases like comment delimiters inside string literals etc.

Textile: using code and footnotes

Hi I'm using Redmine to write a wiki of my software. I need to put some notes next to a code section like this:
class.method()[1]
Where the "one" is a link to my note at the end of the page.
I've tried to use any method defined in the Textile syntax but it seems that it doesn't work. In fact when you use the code tag '# #' any other tag stops working.
It's good even if I can use the link tag [[ ]] but only if it is like this google.com
Thanks for any help,
Alessandro
Redmine uses Coderay to parse the code sections in the Wiki. Take a look at the documentation for the different languages. Otherwise I would suggest using comments instead of footnotes or in worst case line references to the code.
The footnote will only work if there is an alphanumeric character directly before the opening square bracktet:
this[1], whereas this()[2] or this [3]
fn1. will work
fn2. won't work.
fn3. won't work.
At least this is true with Redmine 3.1. See this issue for more information.
Note that you need blank lines between fn1., fn2. and fn3. to get a correct rendering.
This extension for Redmine supports footnotes and custom styles in the wiki.
Redmine supports Textile markup syntax. Textile has support for footnotes. As noted on ticket ticket #974, this is the syntax for using footnotes in Redmine:
Text with a footnote[1]
fn1. and here the actual footnote.

Converting C++ code to HTML safe

I decided to try http://www.screwturn.eu/ wiki as a code snippet storage utility. So far I am very impressed, but what irkes me is that when I copy paste my code that I want to save, '<'s and '[' (http://en.wikipedia.org/wiki/Character_encodings_in_HTML#Character_references) invariably screw up the output as the wiki interprets them as either wiki or HTML tags.
Does anyone know a way around this? Or failing that, know of a simple utility that would take C++ code and convert it to HTML safe code?
You can use the ##...## tag to escape the code and automatically wrap it in PRE tags.
Surround your code in <nowiki> .. </nowiki> tags.
I don't know of utilities, but I'm sure you could write a very simple app that does a find/replace. To display angle brackets, you just need to replace them with > and < respectively. As for the square brackets, that is a wiki specific problem with the markdown methinks.
Dario Solera wrote "You can use the ##...## tag to escape the code and automatically wrap it in PRE tags."
If you don't want it wrapped just use: <esc></esc>
List of characters that need escaping:
< (less-than sign)
& (ampersand)
[ (opening square bracket)
Have you tried wrapping your code in html pre or code tags before pasting? Both allow any special characters (such as '<') to be used without being interpreted as html. pre also honors the formatting of the contents.
example
<pre>
if (foo <= bar) {
do_something();
}
</pre>
To post C++ code on a web page, you should convert it to valid HTML first, which will usually require the use of HTML character entities, as others have noted. This is not limited to replacing < and > with < and >. Consider the following code:
unsigned int maskedValue = value&mask;
Uh-oh, does the HTML DTD contain an entity called &mask;? Better replace & with & as well.
Going in an alternate direction, you can get rid of [ and ] by replacing them with the trigraphs ??( and ??). In C++, trigraphs and digraphs are sequences of characters that can be used to represent specific characters that are not available in all character sets. They are unlikely to be recognized by most C++ programmers though.