I'm working on a project that used BOOST_CHECK_EXCEPTION in unit tests. The first argument is a code block. It works well when the code under test has no commas. Once the code gets a comma that is not inside parentheses (e.g. constructor call with braces and multiple arguments), BOOST_CHECK_EXCEPTION stops working. The preprocessor treats the comma as an argument separator. The preprocessor is aware of parentheses, but not of braces.
So the code blocks that contain unparenthesized commas are defined as lambdas outside BOOST_CHECK_EXCEPTION. That works, but I'm looking for a solution that keeps BOOST_CHECK_EXCEPTION invocations more uniform. After all, commas can appear and disappear from the expressions as the code is being developed.
First of all, just delaying comma expansion until after BOOST_CHECK_EXCEPTION is expanded doesn't work. The implementation of BOOST_CHECK_EXCEPTION (BOOST_CHECK_THROW_IMPL) would still reject the extra arguments. That means BOOST_PP_COMMA is not going to help.
One approach I considered is having a CODE_WRAPPER macro that would take the code block and wrap it into code that includes parentheses. Those parentheses need to survive all preprocessor expansion. for and while put parentheses around code, but I was unable to put code blocks inside them. Likewise, I could not get a code block inside a function call. They all expect an expression.
One approach that works is a statement expression. It is a GNU extension, so it limits the code to gcc and clang, which is undesirable.
Boost documentation recommends do {...} while(0), but it doesn't fix the issue with commas. https://www.boost.org/doc/libs/1_68_0/libs/test/doc/html/boost_test/utf_reference/testing_tool_ref/assertion_boost_level_exception.html
Now I'm thinking of wrapping BOOST_CHECK_EXCEPTION inside a macro that would define a lambda transparently for the caller. And I'm surprised that I don't see much help online. I feel I'm missing something obvious.
Is there any easy way to use BOOST_CHECK_EXCEPTION with code blocks that include unparenthesized commas?
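A minimal sketch of the lambda-wrapping idea, assuming C++11 variadic macros are acceptable. STAND_IN_CHECK_EXCEPTION and CHECK_EXCEPTION_BLOCK are hypothetical names, not part of Boost.Test. The stand-in only mimics the three-argument shape of BOOST_CHECK_EXCEPTION so that the snippet compiles without Boost. The code block moves to the last position, is gathered through __VA_ARGS__, and is re-emitted inside a parenthesized, immediately invoked lambda, so its commas are shielded when the inner macro's arguments are split.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Outcome of the most recent check (illustration only).
bool last_check_passed = false;

// Hypothetical stand-in with the argument shape of
// BOOST_CHECK_EXCEPTION(statement, exception_type, predicate).
#define STAND_IN_CHECK_EXCEPTION(statement, exception_type, predicate) \
    do {                                                                \
        last_check_passed = false;                                      \
        try { statement; }                                              \
        catch (const exception_type& ex) {                              \
            last_check_passed = predicate(ex);                          \
        }                                                               \
    } while (0)

// Hypothetical wrapper: the code block comes last and is gathered by
// __VA_ARGS__, commas included. The outer parentheses around the lambda
// call keep those commas from splitting the inner macro's arguments.
#define CHECK_EXCEPTION_BLOCK(exception_type, predicate, ...) \
    STAND_IN_CHECK_EXCEPTION(([&]() { __VA_ARGS__; }()), exception_type, predicate)

// Predicate used by the demo below.
bool mentions_bad(const std::runtime_error& e) {
    return std::string(e.what()).find("bad") != std::string::npos;
}

bool demo_block_with_commas() {
    CHECK_EXCEPTION_BLOCK(std::runtime_error, mentions_bad,
        std::pair<int, int> p{1, 2};   // two unparenthesized commas
        if (p.first < p.second) throw std::runtime_error("bad order")
    );
    assert(last_check_passed);  // exception thrown and accepted by the predicate
    return last_check_passed;
}
```

With Boost.Test, the stand-in would presumably be replaced by BOOST_CHECK_EXCEPTION itself; that substitution is an assumption and has not been checked against any particular Boost version. The cost of this design is that the argument order differs from the Boost macro, since the variadic part has to come last.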
Is there a way to get rid of, or find by linting (or maybe with sed or a regex), those nasty situations where you have just one line of code after an if/for statement, without curly braces? Like this one:
if(condition)
    return;
As for why I would want to get rid of that, there are lots of reasons given in this thread:
What's the purpose of using braces (i.e. {}) for a single-line if or loop?
I maintain some legacy code and deal with some not-really-finished code from other people, and from time to time I stumble on situations where this code style works like a trip wire when debugging:
if(condition_for_early_return)
    LOG("Im here") // surprise surprise, I just broke control logic
    return;
Also, I've seen code like that:
if(condition)
<tabs> do_smth();
<spaces> do_smth_else();
Of course the if guards only the first do_smth(), and the compiler is not confused. But because the do_ functions are visually aligned, I wonder whether this is intended behaviour or a bug that was never found in this legacy code.
I know cppcheck does not catch these situations; I already tried that.
Do you have any way of finding these traps automatically?
GCC has:
-Wmisleading-indentation (C and C++ only)
Warn when the indentation of the code does not reflect the block
structure. Specifically, a warning is issued for if, else, while, and
for clauses with a guarded statement that does not use braces,
followed by an unguarded statement with the same indentation.
In the following example, the call to “bar” is misleadingly indented
as if it were guarded by the “if” conditional.
  if (some_condition ())
    foo ();
    bar ();  /* Gotcha: this is not guarded by the "if". */
In the case of mixed tabs and spaces, the warning uses the -ftabstop=
option to determine if the statements line up (defaulting to 8).
The warning is not issued for code involving multiline preprocessor
logic such as the following example.
  if (flagA)
    foo (0);
#if SOME_CONDITION_THAT_DOES_NOT_HOLD
  if (flagB)
#endif
    foo (1);
The warning is not issued after a #line directive, since this
typically indicates autogenerated code, and no assumptions can be made
about the layout of the file that the directive references.
Note that this warning is enabled by -Wall in C and C++.
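To make the trap concrete, here is a small sketch (the function names misleading and braced are made up for illustration). The two increments stand in for foo() and bar() from the GCC example. GCC is expected to flag the second increment in misleading() when -Wmisleading-indentation or -Wall is on, while the braced version has block structure that matches its layout.

```cpp
#include <cassert>

// Returns how many of the two "calls" ran.
// The second increment is indented like the first but is NOT guarded by the if.
int misleading(bool some_condition) {
    int calls = 0;
    if (some_condition)
        ++calls;   // stands in for foo(): guarded
        ++calls;   // stands in for bar(): always runs, despite the indentation
    return calls;
}

// With braces, both increments really are guarded.
int braced(bool some_condition) {
    int calls = 0;
    if (some_condition) {
        ++calls;
        ++calls;
    }
    return calls;
}

void compare_versions() {
    assert(misleading(false) == 1);  // the "bar" stand-in ran anyway
    assert(braced(false) == 0);      // nothing ran
}
```

Both versions behave the same when the condition is true, which is one reason this kind of bug can survive in legacy code for a long time.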
Alternatively, clang-format provides:
InsertBraces (Boolean), available since clang-format 15
Insert braces after control statements (if, else, for, do, and while)
in C++ unless the control statements are inside macro definitions or
the braces would enclose preprocessor directives.
But the documentation also carries a warning:
Setting this option to true could lead to incorrect code formatting
due to clang-format’s lack of complete semantic information. As such,
extra care should be taken to review code changes made by this option.
A quick google showed that SonarSource's linter has a rule for this: https://rules.sonarsource.com/cpp/RSPEC-121
I believe there's a way to use this tool for free to some extent, so it should work.
Alternatively, clang-format supports the option to just add them since v15: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#insertbraces
Clang-tidy actually has a check for that (readability-braces-around-statements), and if I remember correctly, it can also fix things automatically for you (you should of course then go through all the changes manually and make sure they're correct).
Coming from the Ruby world, I instantly understood why Crystal chose not to implement a for method. But then I was surprised to see that Crystal does implement a for method for macros. I was even more surprised to find that macros don't allow an enumerable (.each, etc) syntax (i.e. {% ["one", "two", "three"].each do |value| %} isn't valid macro syntax).
Is there a logical reason for this syntax difference? It's possible that the answer is simply ~"because the devs decided that macro syntax looks like x, and non-macro syntax looks like y", but I'm guessing that there is more to it than that (an arbitrary syntax inconsistency seems like a flaw).
Thanks!
The main reason is that when the parser parses foo.bar do |arg| ... end, it expects an expression after |arg|, not %}, which is a parse error. So to allow that we'd need to enhance the parser (which is already quite complex) to take that into account. for was chosen because of this, but also to make it clear that this is not regular Crystal but a different thing (it's an interpreted subset of Crystal and the standard library).
Another reason is that if each and other iteration methods are allowed, why not while and until? That could allow endless loops in macros, which with just for aren't possible, so you can guarantee a macro finishes executing. Which... is actually not true given that we have run inside macros.
So I think I'm not opposed to changing the language to allow each, each_with_index, etc., inside macros, to allow that syntax, and eventually to remove for from the macro language. Opening an issue requesting this is a good step in this direction.
I'm studying the C++ standard to learn the exact behaviour of the preprocessor (I need to implement some sort of C++ preprocessor). From what I understand, the example I made up (to aid my understanding) below should be valid:
#define dds(x) f(x,
#define f(a,b) a+b
dds(eoe)
su)
I expect the first function-like macro invocation dds(eoe) to be replaced by f(eoe, (note the comma within the replacement list), which is then treated as f(eoe,su) when the input is rescanned.
But a test with VC++2010 gave me this (I told the VC++ to output the preprocessed file):
eoe+et_leoe+et_l
su)
This is counter-intuitive and is obviously incorrect. Is it a bug in VC++2010 or my misunderstanding of the C++ standard? In particular, is it incorrect to put a comma at the end of the replacement list like I did? My understanding of the C++ standard grammar is that any preprocessing-tokens are allowed there.
EDIT:
I don't have GCC or other versions of VC++. Could someone help me verify this with those compilers?
My answer is valid for the C preprocessor, but according to Is a C++ preprocessor identical to a C preprocessor?, the differences are not relevant for this case.
From C, A Reference Manual, 5th edition:
When a functionlike macro call is encountered, the entire macro call is
replaced, after parameter processing, by a copy of the body. Parameter
processing proceeds as follows. Actual argument token strings are
associated with the corresponding formal parameter names. A copy of
the body is then made in which every occurrence of a formal parameter
name is replaced by a copy of the actual parameter token sequence
associated with it. This copy of the body then replaces the macro
call.
[...] Once a macro call has been expanded, the scan for macro calls
resumes at the beginning of the expansion so that names of macros may
be recognized within the expansion for the purpose of further macro
replacement.
Note the words within the expansion. That's what makes your example invalid. Now, combine it with this: UPDATE: read comments below.
[...] The macro is invoked by writing its name, a left parenthesis,
then one actual argument token sequence for each formal parameter,
then a right parenthesis. The actual argument token sequences are
separated by commas.
Basically, it all boils down to whether the preprocessor will rescan for further macro invocations only within the previous expansion, or if it will keep reading tokens that show up even after the expansion.
This may be hard to think about, but I believe that what should happen with your example is that the macro name f is recognized during rescanning, and since subsequent token processing reveals a macro invocation for f(), your example is correct and should output what you expect. GCC and clang give the correct output, and according to this reasoning, this would also be valid (and yield equivalent outputs):
#define dds f
#define f(a,b) a+b
dds(eoe,su)
And indeed, the preprocessing output is the same in both examples. As for the output you get with VC++, I'd say you found a bug.
This is consistent with C99 section 6.10.3.4, as well as C++ standard section 16.3.4, Rescanning and further replacement:
After all parameters in the replacement list have been substituted and # and ##
processing has taken place, all placemarker preprocessing tokens are removed. Then, the
resulting preprocessing token sequence is rescanned, along with all subsequent
preprocessing tokens of the source file, for more macro names to replace.
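The rescanning rule quoted above can be checked directly with the macros from the question. This is only a sketch: the surrounding function sum_via_rescan and the values passed to it are made up for the test, and the expected result assumes a preprocessor that rescans together with the subsequent tokens, as GCC and clang are reported to do.

```cpp
#include <cassert>

#define dds(x) f(x,
#define f(a,b) a+b

// dds(eoe) is replaced by `f(eoe,`. During rescanning, the following
// tokens `su)` complete the invocation f(eoe, su), which yields eoe+su.
int sum_via_rescan(int eoe, int su) {
    return dds(eoe)
    su);
}

// Remove the short macro names so they cannot affect later code.
#undef dds
#undef f

void check_rescan() {
    assert(sum_via_rescan(2, 3) == 5);
}
```

A preprocessor that only rescanned within the expansion itself would leave `f(eoe,` unfinished, and the function above would not compile.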
To the best of my understanding there is nothing in the [cpp.subst/rescan] portions of the standard that makes what you do illegal, and clang and gcc are right in expanding it as eoe+su, and the MSC (Visual C++) behaviour has to be reported as a bug.
I failed to make it work but I managed to find an ugly MSC workaround for you, using variadics - you may find it helpful, or you may not, but in any event it is:
#define f(a,b) (a+b
#define dds(...) f(__VA_ARGS__)
It is expanded as:
(eoe+
su)
Of course, this won't work with gcc and clang.
Well, the problem I see is that the preprocessor does the following:
dds(x) becomes f(x,
However, f(x, is defined as well (even though it's defined as f(a,b)), so f(x, expands to x+ garbage.
So dds(x) finally transforms into x + garbage (because you defined f(something, ).
Your dds(eoe) actually expands into a+b where a is eoe and b is et_l.
And it does that twice for whatever reason :).
The scenario you made is compiler-specific; it depends on how the preprocessor chooses to handle macro expansion.
Lisp/Clojure code have consistency in their syntax and it is a plus point as one doesn't need to understand various different constructs.
But at times it is easier to understand a piece of code just from the different syntax being used (this is a switch case, this is the pattern matching construct, etc.) without actually reading the text.
I started out with Clojure a couple of months ago, and I have realized I can't understand the code without reading the name of the form and then googling whether it is a macro or a function and how it works.
So it turns out that a piece of Clojure code, irrespective of the uniformity of the syntax, isn't uniform.
It may seem like a function but if at all it is a macro then it might not be evaluating all its arguments.
Is there a naming convention or indentation style that all macros use so it is easier for someone to grasp by the name what is going on ?
The most useful intuition in my opinion comes from understanding the purpose of a given operator / Var. Well-designed macros simply could not be written as functions and still offer the same functionality with the same syntax, for if they could, they would in fact be written as functions (see the "well-designed" part above!). [1] So, if you're dealing with a construct which couldn't possibly be a regular function, then you know it isn't; otherwise it likely is.
Additionally, the usual ways of learning about the Vars exported by a library tell you whether you're dealing with a macro or a function up front. That is true of doc ((doc foo) says that foo is a macro near the top of its output if that is indeed the case), source (since it gives you the entire code) and M-. (jump to definition in Emacs with nrepl.el or swank-clojure; M-, jumps back). Documentation may be expected to mention what is a macro and what isn't (except that's not necessarily true of docstrings, since all usual ways of accessing a docstring already tell you whether you're dealing with a macro, as explained above).
If you're skimming a body of code with the intention of forming a rough understanding of what it probably does on the assumption that the various operators perform the functions suggested by their names, then either (1) the names are suggestive enough and you get an idea of what's intended by the code, so you don't even need to care which operators happen to be macros, or (2) the names are not suggestive enough, so you'll need to dive into the docs or the source for some of the operators anyway, and then the first thing you'll learn is which of them are registered as macros.
Finally, there is no single naming style for macros, although there are certain conventions specific to particular use cases. For example with-foo-style constructs tend to be convenience macros whose purpose is to simplify dealing with resources of type foo; dofoo-style constructs tend to be macros which take a body of expressions to be executed (how many times and with which additional context set up depends on the macro; the most basic member of this family, do, is actually a special form rather than a macro); deffoo-style constructs introduce new Vars or type-like entities.
It's worth pointing out that similar patterns are sometimes broken. For instance, most threading constructs (-> & Co.) are macros, but xml-> from clojure.data.zip.xml is a function. That makes perfect sense when one considers the functionality provided, which brings us back to the point about the purpose of an operator being the most useful source of intuition.
[1] There might be some exceptions to this rule. One would expect these to be documented. Some projects are of course not documented at all (or very nearly so); here the issue goes away completely, since one must go to the source to make sense of things anyway.
There are two attributes that typically distinguish a macro (or sometimes special form) from a function:
When the form does some sort of binding (i.e. declaring new identifiers for later use)
When some of the arguments are evaluated lazily
Examples of the first case are let, letfn, binding and with-local-vars. Strangely though, defn is written in clojure.core as a plain fn that is only flagged as a macro afterwards; I'm pretty sure that has to do with Clojure's bootstrapping process (defn is defined before defmacro is defined).
Examples of the second would be and, or and lazy-seq. In all these constructs, the arguments are evaluated lazily by either putting them in conditional branches (like if) or moving them inside a function body.
Both of those attributes are really just manifestations of the macro manipulating the Clojure syntax. I don't think the threading macros (-> and ->>) fit very well into either of those categories, but the nil-safe versions (-?> and -?>>) kind of fall under having lazy arguments.
As far as I know there is no enforced naming convention.
As a rule of thumb, functions are preferred wherever possible, but macros can sometimes be spotted when they follow the pattern def<something> for setting up a something or with-<resource> for doing something with an open resource.
Because of this, you may find Clojure's doc macro helpful. It will tell you whether a form is a macro, a function or a special form, and will also give its arg list and doc string (if present). For example:
(use 'clojure.repl)
(doc and)
This will print the following to the REPL:
clojure.core/and
([] [x] [x & next])
Macro
Evaluates exprs one at a time, from left to right. If a form
returns logical false (nil or false), and returns that value and
doesn't evaluate any of the other expressions, otherwise it returns
the value of the last expr. (and) returns true.
Some editors (e.g. emacs) will provide this documentation as a pop-up or on a key combination, which makes accessing it (and reading) much faster.
You can't have code outside of functions except for declarations, definitions and preprocessor directives.
Is that statement accurate, or is there something I'm missing? I'm teaching my nephew to program, and he was trying to put a while loop before main. He's pretty young, I want to give him a hard simple rule that he can understand.
Not quite -- you can also put expressions in global variable declarations:
int myGlobalVar = 3 + SomeFunction(4) - anotherGlobalVar;
But you can only put expressions here, which have to evaluate to the value you're initializing the global with. You cannot put full statements (no blocks of code, no if statements, no loops, etc.). This code will get executed before main() gets a chance to run, so be careful with what you do here. I'd recommend against calling functions in global initializers unless you can't avoid it.
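A small sketch of the point above. The body of SomeFunction and the value of anotherGlobalVar are made up for illustration, and the order of initialization shown holds because both globals are defined in the same translation unit, in this order.

```cpp
#include <cassert>

// Hypothetical helper; the original answer does not show its body.
int SomeFunction(int x) { return 2 * x; }

int anotherGlobalVar = 1;  // initialized from a constant

// An expression in a global initializer: evaluated before main() starts.
// 3 + SomeFunction(4) - anotherGlobalVar = 3 + 8 - 1
int myGlobalVar = 3 + SomeFunction(4) - anotherGlobalVar;

// A statement at this point, for example a while loop, would not compile:
// only declarations and definitions may appear at namespace scope.

void check_globals() {
    assert(myGlobalVar == 10);
}
```

Across several source files the relative order of such initializers is not specified, which is one reason to be careful about calling functions from them.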
For your nephew:
no, you can't do it.
For yourself:
The compiler's input is technically what you get after the preprocessor is run. So, let's leave the preprocessor out. After it has worked, you get a C++ program, which is a sequence of declarations. Some declarations may also be definitions, and some definitions (like function definitions) may have statements inside them.
HTH
Yes, you can't stick random executable code outside functions.
Yes, every kind of statement that does something must reside inside a context that can use it (this doesn't apply to variable initialization).
This is because C++ is a structured programming language that encloses its behaviour inside procedures, as opposed to unstructured ones in which you have just one level of code and no scopes.
Well, there's namespaces...and the stuff Adam Rosenfield mentioned...and there's also exception try/catch that can be sort of external to functions. Unfortunately, I can't remember the syntax and can't find it with google.
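The try/catch construct alluded to here is presumably the function-try-block, where try precedes the function body and the handlers follow it. A minimal sketch (checked_half is a made-up example function):

```cpp
#include <cassert>
#include <stdexcept>

// Function-try-block: `try` comes before the body, the handler after it.
// The handler still belongs to the function, so this is not code outside
// of functions, even though the try keyword sits outside the braces.
int checked_half(int value)
try {
    if (value % 2 != 0) {
        throw std::invalid_argument("odd value");
    }
    return value / 2;
}
catch (const std::invalid_argument&) {
    return -1;  // the handler of an ordinary function may return a value
}

void check_function_try_block() {
    assert(checked_half(8) == 4);
    assert(checked_half(7) == -1);
}
```

The form is mostly seen on constructors, where it can catch exceptions thrown from the member initializer list.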