LLDB summary strings without quotes - c++

Let's say that I have a C++ class that contains two C strings, like below.
class PathExample {
    char* partA; // E.g. "/some/folder/"
    char* partB; // E.g. "SomeFile.txt"
};
I can make an lldb summary string for it:
type summary add PathExample --summary-string "${var.partA}${var.partB}"
However, this adds unnecessary and confusing quotes: "/some/folder/""SomeFile.txt".
How can I format the type summary string so that it does not use quotes, or at least concatenates the strings before adding quotes? E.g. "/some/folder/SomeFile.txt"

"Remove leading or trailing quote in the summary value" before adding to the output is not supported by the summary string formatting options. We're trying to keep those options fairly streamlined, and that's a bit too much of a special-purpose feature.
The thing that allows us to keep the summary string version fairly restrained is that you can always write a Python summary, which allows you to format up the output in whatever way you like. There's an example that's somewhat like what you want in the section on Python scripting:
https://lldb.llvm.org/use/variable.html#python-scripting
You wouldn't use GetValueAsUnsigned as that example does. The C-string rendering of char * types is actually done by a built-in summary, so you would use "SBValue.GetSummary" to get the string value. That's actually the same thing that's substituted into the summary string so it also has the quotes on it. But in Python it's trivial to strip the leading and trailing quotes before concatenating the two strings.
Note, though it's convenient for playing around with, you don't have to define the Python summary callback inline as shown in the example. You can put a function with the correct signature in a .py file somewhere, use command script import <path to .py file> and then import it using the -F option to type summary add. Remember to use the full name of the function (module_name.func_name) when you specify it. I have a bunch of these in a ~/.lldb directory and command script import them in my ~/.lldbinit.
help type summary add has some more details on how to do this.
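A minimal sketch of such a Python summary, assuming the PathExample class from the question. The function names (path_example_summary, _unquoted_summary) are made up for illustration; only SBValue.GetChildMemberWithName and SBValue.GetSummary are LLDB API calls.

```python
def _unquoted_summary(child):
    """Return the child's summary with one pair of surrounding double quotes removed."""
    summary = child.GetSummary()  # e.g. '"/some/folder/"', or None if no summary exists
    if summary is None:
        return ""
    if len(summary) >= 2 and summary[0] == '"' and summary[-1] == '"':
        return summary[1:-1]
    return summary


def path_example_summary(valobj, internal_dict):
    """Summary callback: concatenate partA and partB inside a single pair of quotes."""
    part_a = valobj.GetChildMemberWithName("partA")
    part_b = valobj.GetChildMemberWithName("partB")
    return '"' + _unquoted_summary(part_a) + _unquoted_summary(part_b) + '"'
```

If this were saved as path_summary.py (a hypothetical file name), it could be loaded with command script import /path/to/path_summary.py and attached with type summary add -F path_summary.path_example_summary PathExample.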

Scanning a language with non-delimited strings with nested tokens

I want to create a lexer/parser for a language that has non-delimited strings.
Which part of the language is a string is defined by the command preceding it.
For example it has statements that look like this:
pause 5
alert Hello world[CRLF] this contains 'pause' once (1)
Alert in this instance can end with any string, including keywords and numbers.
Further complicating things, the text can contain tags like [CRLF] that I want to separate too.
Ideally I'd want this to be broken up into:
[PAUSE][INT 5]
[ALERT][STR "Hello world"][CRLF][STR " this contains 'pause' once (1)"]
I'm currently using flex but from what I've gathered this kind of thing isn't possible with flex.
How can I achieve what I want here?
(Since one of your tags is "regex", I'll suggest a non-flex approach.)
From the example, it seems like you could just:
match each line against ^(\w+) (.+) to obtain command and arguments-text, and then
get individual arguments by splitting the arguments-text on (\[\w+\]) (assuming your regex library's split function can return both the splitter-strings and the split-strings).
It's possible your actual situation is more complex and something like flex makes more sense, but I'm not really seeing it so far.
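The two steps above can be sketched in Python, whose re.split keeps the splitter strings when the pattern contains a capture group. The token shapes, the TEXT_COMMANDS set, and the rule that an all-digit argument of a non-text command becomes an INT are assumptions made for this illustration, not something the question specifies.

```python
import re

LINE_RE = re.compile(r"^(\w+) (.+)$")   # command, then the raw arguments text
TAG_RE = re.compile(r"(\[\w+\])")       # capture group so split() keeps the tags

# Assumption: commands whose arguments are always free text.
TEXT_COMMANDS = {"alert"}


def tokenize_line(line):
    """Split one statement into (kind, value) tokens."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("not a '<command> <arguments>' line: %r" % line)
    command, args = match.groups()
    tokens = [(command.upper(), None)]
    for piece in TAG_RE.split(args):
        if not piece:
            continue  # split() yields empty strings next to leading/trailing tags
        if TAG_RE.fullmatch(piece):
            tokens.append((piece[1:-1], None))    # "[CRLF]" -> ("CRLF", None)
        elif command not in TEXT_COMMANDS and piece.isdigit():
            tokens.append(("INT", int(piece)))
        else:
            tokens.append(("STR", piece))
    return tokens


if __name__ == "__main__":
    print(tokenize_line("pause 5"))
    print(tokenize_line("alert Hello world[CRLF] this contains 'pause' once (1)"))
```

Because the command is matched first, keywords and numbers inside the alert text are never looked at as tokens; they simply stay part of the STR pieces.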

Doxygen parsing ampersands for ascii chars

I've been using Doxygen to document my project, but I've run into some problems.
My documentation is written in a language in which apostrophes are often used. Although my language config parameter is properly set, when Doxygen generates the HTML output it doesn't handle apostrophes correctly, so the character code is shown instead of the character itself.
So, in the HTML documentation:
This should be the text: Vector d'Individus
But instead, it shows this: Vector d&#39;Individus
That's strange, but searching the HTML file, I found that instead of using an ampersand to write the &#39; code, it uses the code for the ampersand itself. It is easier to see in the code:
<div class="ttdoc">Vector d&amp;#39;Individus ... </div>
One other thing is to note that this only happens with the text inside tooltips...
But not on other places (same code, same class)...
What can I do to solve this?
Thanks!
Apostrophes in code comments must be encoded with the correct glyph for doxygen to parse them correctly. This seems particularly true for the SOURCE_TOOLTIPS popups. The correct glyph is \u2019, RIGHT SINGLE QUOTATION MARK. If your keyboard does not provide this glyph, you may write a temporary symbol (e.g. ') and batch replace it afterwards with a Unicode-capable auxiliary tool, for example: perl -pC -e "s/'/\x{2019}/g" < infile > outfile. Hope it helps.
Regarding the answer from ramkinobit: this is not necessary. Doxygen can use, for example, the right single quote &rsquo; (see the doxygen documentation chapter "HTML commands").
For the apostrophe the OP asks about, one can use the doxygen extension &apos; (see also the doxygen documentation chapter "HTML commands").
There was a double 'HTML escape' in doxygen, resulting in the behavior observed for the single quote, i.e. displaying &#39;.
I've just pushed a proposed patch to github (pull request 784, https://github.com/doxygen/doxygen/pull/784).
EDIT 07/07/2018: an (alternative) patch has been integrated into the main branch on GitHub.

How do I mark the end of a #ref reference?

I'm using Doxygen to document C++ code, and am writing a substantial amount of Doxygen doc for the code. In one place I'm making a list of groups in the code, and would like it to appear as follows:
Control Module: the module that controls everything
Slave Module: the module that is the slave of the Control Module
My documentation source looks like this:
- #ref CM: the module that controls everything
- #ref SM: the module that is the slave of the #CM
But, problem: Doxygen seems to be reading the reference name as CM:, not CM, and thus can't find the reference. So, somehow I need to tell Doxygen where the reference name ends. (For example, if I were using Bash, and wanted to echo a variable string with an "s" as a suffix, I'd use echo "${NOUN}s".)
As a workaround, I could add a space between the name and the subsequent colon, but that makes the resulting doc harder to read and I'd like to avoid it.
Under Special Commands, the Doxygen manual includes the following hopeful-sounding information:
Some commands have one or more arguments. Each argument has a certain range:
If <sharp> braces are used the argument is a single word.
If (round) braces are used the argument extends until the end of the line on which the command was found.
If {curly} braces are used the argument extends until the next paragraph. Paragraphs are delimited by a blank line or by a section indicator.
OK, that's all fine and good, but the documentation doesn't say, and I can't figure out, where those braces are supposed to go. Around the argument alone? Around the entire command and argument? Neither works, and I can't come up with an alternative that does work.
So, how do I indicate the end of a reference name to Doxygen? And if braces are the answer, where do they go?
This works for Doxygen version 1.8.11:
\ref name "":
Apparently, the empty string triggers a fall-back to use the name argument before it.
The Doxygen documentation you quote is describing the syntax of the Doxygen documentation, not of sources to be parsed by your use of Doxygen.
In other words, if <sharp> braces are used when describing a command, it takes a single word; and so on.
Looking at the documentation of #ref:
\ref <name> ["(text)"]
The name argument is in "sharp braces," and so it's just a single word. Unfortunately, Doxygen seems to interpret : as part of that word. Your best bet would be to introduce a space:
#ref CM : the ...
You could also try whether a zero-width character would break the word recognition:
#ref CM‌: the ...

camelCase to underscore in vi(m)

If for some reason I want to selectively convert camelCase named things to being underscore separated in vim, how could I go about doing so?
Currently I've found that I can do a search /s[a-z][A-Z] and record a macro to add an underscore and convert to lower case, but I'm curious as to if I can do it with something like :
%s/([a-z])([A-Z])/\1\u\2/gc
Thanks in advance!
EDIT: I figured out the answer for camelCase (which is what I really needed), but can someone else answer how to change CamelCase to camel_case?
You might want to try out the Abolish plugin by Tim Pope. It provides a few shortcuts to coerce from one style to another. For example, starting with:
MixedCase
Typing crc [mnemonic: CoeRce to Camelcase] would give you:
mixedCase
Typing crs [mnemonic: CoeRce to Snake_case] would give you:
mixed_case
And typing crm [mnemonic: CoeRce to MixedCase] would take you back to:
MixedCase
If you also install repeat.vim, then you can repeat the coercion commands by pressing the dot key.
This is a bit long, but seems to do the job:
:%s/\<\u\|\l\u/\= join(split(tolower(submatch(0)), '\zs'), '_')/gc
I suppose I should have just kept trying for about 5 more minutes. Well... if anyone is curious:
%s/\(\l\)\(\u\)/\1\_\l\2/gc does the trick.
Actually, I realized this works for camelCase, but not CamelCase, which could also be useful for someone.
I whipped up a plugin that does this.
https://github.com/chiedojohn/vim-case-convert
To convert the case, select a block of text in visual mode and then enter one of the following (self-explanatory) commands:
:CamelToHyphen
:CamelToSnake
:HyphenToCamel
:HyphenToSnake
:SnakeToCamel
:SnakeToHyphen
To convert all occurrences in your document, run one of the following commands:
:CamelToHyphenAll
:CamelToSnakeAll
:HyphenToCamelAll
:HyphenToSnakeAll
:SnakeToCamelAll
:SnakeToHyphenAll
Add a bang (e.g. :CamelToHyphen!) to any of the above commands to bypass the prompts before each conversion.
You may not want to do that, though, as the plugin can't tell the difference between variables and other text in your file.
For the CamelCase case: %s#\(\<\u\|\l\)\(\l\+\)\(\u\)#\l\1\2_\l\3#gc
Tip: the regex delimiters can be altered as in my example to make it (somewhat) more legible.
I have an API for various development oriented processing. Among other things, it provides a few functions for transforming names between (configurable) conventions (variable <-> attribute <-> getter <-> setter <-> constant <-> parameter <-> ...) and styles (camelcase (low/high) <-> underscores). These conversion functions have been wrapped into a plugin.
The plugin + API can be fetched from here: https://github.com/LucHermitte/lh-dev. For this name conversion task, it requires lh-vim-lib.
It can be used the following way:
put the cursor on the symbol you want to rename
type :NameConvert + the type of conversion you wish (here : underscore). NB: this command supports auto-completion.
et voilà!

C++ Runtime string formatting

Usually I use streams for formatting stuff; however, in this case I don't know the format until runtime.
I want to be able to take something like the following format string:
Hello {0}! Your last login was on {1,date:dd/mm/yy}.
...and feed in the variables "Fire Lancer" and 1247859223, and end up with the following formatted string:
Hello Fire Lancer! Your last login was on 17/07/09.
In other languages I use, there is built-in support for this kind of thing, e.g. Python's format string method. However, in C++ there doesn't seem to be any such functionality, except the C print methods, which are not very safe.
Also, this is for a high-performance program, so whatever solution I use needs to parse the format string once and store it (e.g. maybe a Parse method that returns a FormatString object with a Format(string) method), not reparse the string every time the format method is called...
Your format string looks very much like those used in ICU MessageFormat. Did you consider using it?
Boost Formatting does that for you:
http://www.boost.org/doc/libs/1_39_0/libs/format/doc/format.html
Check out this question and answer for examples of usage:
boost::format will do the positional arguments portion, but not the date formatting...