C++ Scope of Static Variable In a Static Member Function

I have a simple object which does some parsing. Inside, there is a parse function containing a static variable that is used to limit the number of error messages printed to the user:
struct CMYParsePrimitive {
static bool Parse(const std::string &s_line)
{
// do the parsing
static bool b_warned = false;
if(!b_warned) {
b_warned = true;
std::cerr << "error: token XYZ is deprecated" << std::endl;
}
// print a warning about using this token (only once)
return true;
}
};
Now these parse primitives are passed in a typelist to a parser specialization. Another interface tells the parser which token types should be parsed using which parse primitives.
My issue is that the warning should be displayed at most once per application run. But in my case it is sometimes displayed multiple times; it seems to be once per parser instance rather than once per application instance.
I'm using Visual Studio 2008. I imagine this might be a bug or a deviation from the standard? Does anyone have any idea why this happens?

I failed to notice that the function is also a template. My bad. It is instantiated twice in the code with different template arguments, and each instantiation has its own copy of the static variable, hence the warning is sometimes printed twice. The real code looks more like this:
struct CMYParsePrimitive {
template <class CSink>
static bool Parse(const std::string &s_line, CSink &sink)
{
// do the parsing, results passed down to "sink"
static bool b_warned = false;
if(!b_warned) {
b_warned = true;
std::cerr << "error: token XYZ is deprecated" << std::endl;
}
// print a warning about using this token (only once)
return true;
}
};
So there are, for example, CMYParsePrimitive::Parse<PreviewParser>::b_warned, which lets the warning be printed once when used by PreviewParser, and also CMYParsePrimitive::Parse<Parser>::b_warned, which lets it be printed once more when used by Parser.
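One way to get a single flag for the whole program is to move the static variable out of the template and into an ordinary function. The following is only a sketch: the helper name FirstDeprecationWarning is hypothetical and not part of the original code, and the parsing itself is left out.

```cpp
#include <iostream>
#include <string>

// Hypothetical non-template helper: only one copy of this function exists
// in the program, so there is only one b_warned flag.
inline bool FirstDeprecationWarning()
{
    static bool b_warned = false;
    const bool b_first = !b_warned;
    b_warned = true;
    return b_first; // true on the first call only
}

struct CMYParsePrimitive {
    template <class CSink>
    static bool Parse(const std::string &s_line, CSink &sink)
    {
        // the parsing is omitted in this sketch
        (void)s_line;
        (void)sink;
        // every instantiation of Parse shares the helper's flag
        if(FirstDeprecationWarning())
            std::cerr << "error: token XYZ is deprecated" << std::endl;
        return true;
    }
};
```

With this arrangement, calling Parse with two different sink types still prints the message only once, because both instantiations call the same non-template function.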

Related

How does Clang's "did you mean ...?" variable name correction algorithm work?

I am compiling C++ code with Clang. (Apple clang version 12.0.5 (clang-1205.0.22.11)).
Clang can give tips in case you misspell a variable:
#include <iostream>
int main() {
int my_int;
std::cout << my_it << std::endl;
}
spellcheck-test.cpp:5:18: error: use of undeclared identifier 'my_it'; did you mean 'my_int'?
std::cout << my_it << std::endl;
^~~~~
my_int
spellcheck-test.cpp:4:9: note: 'my_int' declared here
int my_int;
^
1 error generated.
My question is:
What is the criterion Clang uses to determine when to suggest another variable?
My experimentation suggests it is quite sophisticated:
If there is another similarly named variable that you might have meant (e.g. int my_in;) it does not give a suggestion
If the suggested variable has the wrong type for the operation (e.g. by trying to print my_it.size() instead) it does not give a suggestion
Whether or not it gives the suggestion depends on a non-trivial comparison of variable names: it allows for both deletions and insertions of characters, and longer variable names allow for more insertions/deletions to be considered "similar".
You are unlikely to find this documented, but since Clang is open source you can turn to the source to try to figure it out.
Clangd?
The particular diagnostic (from DiagnosticSemaKinds.td):
def err_undeclared_var_use_suggest : Error<
"use of undeclared identifier %0; did you mean %1?">;
is, as far as I can find, referred to from clang-tools-extra/clangd/IncludeFixer.cpp:
// Try to fix unresolved name caused by missing declaration.
// E.g.
// clang::SourceManager SM;
// ~~~~~~~~~~~~~
// UnresolvedName
// or
// namespace clang { SourceManager SM; }
// ~~~~~~~~~~~~~
// UnresolvedName
// We only attempt to recover a diagnostic if it has the same location as
// the last seen unresolved name.
if (DiagLevel >= DiagnosticsEngine::Error &&
LastUnresolvedName->Loc == Info.getLocation())
return fixUnresolvedName();
Now, clangd is a language server and, to be honest, I don't know whether this is actually used by the Clang compiler frontend to yield certain diagnostics, but you're free to continue down the rabbit hole to tie these details together. The fixUnresolvedName call above eventually performs a fuzzy search:
if (llvm::Optional<const SymbolSlab *> Syms = fuzzyFindCached(Req))
return fixesForSymbols(**Syms);
If you want to dig into the details, I would recommend starting with the fuzzyFindCached function:
llvm::Optional<const SymbolSlab *>
IncludeFixer::fuzzyFindCached(const FuzzyFindRequest &Req) const {
auto ReqStr = llvm::formatv("{0}", toJSON(Req)).str();
auto I = FuzzyFindCache.find(ReqStr);
if (I != FuzzyFindCache.end())
return &I->second;
if (IndexRequestCount >= IndexRequestLimit)
return llvm::None;
IndexRequestCount++;
SymbolSlab::Builder Matches;
Index.fuzzyFind(Req, [&](const Symbol &Sym) {
if (Sym.Name != Req.Query)
return;
if (!Sym.IncludeHeaders.empty())
Matches.insert(Sym);
});
auto Syms = std::move(Matches).build();
auto E = FuzzyFindCache.try_emplace(ReqStr, std::move(Syms));
return &E.first->second;
}
along with the type of its single function parameter, FuzzyFindRequest in clang-tools-extra/clangd/index/Index.h:
struct FuzzyFindRequest {
/// A query string for the fuzzy find. This is matched against symbols'
/// un-qualified identifiers and should not contain qualifiers like "::".
std::string Query;
/// If this is non-empty, symbols must be in at least one of the scopes
/// (e.g. namespaces) excluding nested scopes. For example, if a scope "xyz::"
/// is provided, the matched symbols must be defined in namespace xyz but not
/// namespace xyz::abc.
///
/// The global scope is "", a top level scope is "foo::", etc.
std::vector<std::string> Scopes;
/// If set to true, allow symbols from any scope. Scopes explicitly listed
/// above will be ranked higher.
bool AnyScope = false;
/// The number of top candidates to return. The index may choose to
/// return more than this, e.g. if it doesn't know which candidates are best.
llvm::Optional<uint32_t> Limit;
/// If set to true, only symbols for completion support will be considered.
bool RestrictForCodeCompletion = false;
/// Contextually relevant files (e.g. the file we're code-completing in).
/// Paths should be absolute.
std::vector<std::string> ProximityPaths;
/// Preferred types of symbols. These are raw representation of `OpaqueType`.
std::vector<std::string> PreferredTypes;
bool operator==(const FuzzyFindRequest &Req) const {
return std::tie(Query, Scopes, Limit, RestrictForCodeCompletion,
ProximityPaths, PreferredTypes) ==
std::tie(Req.Query, Req.Scopes, Req.Limit,
Req.RestrictForCodeCompletion, Req.ProximityPaths,
Req.PreferredTypes);
}
bool operator!=(const FuzzyFindRequest &Req) const { return !(*this == Req); }
};
Other rabbit holes?
The following commit may be another place to start from:
[Frontend] Allow attaching an external sema source to compiler instance and extra diags to TypoCorrections
This can be used to append alternative typo corrections to an existing
diag. include-fixer can use it to suggest includes to be added.
Differential Revision: https://reviews.llvm.org/D26745
from which we may end up in clang/include/clang/Sema/TypoCorrection.h, which sounds more likely to be what the compiler frontend itself uses than the clangd code above (clangd being one of the Clang extra tools). E.g.:
/// Gets the "edit distance" of the typo correction from the typo.
/// If Normalized is true, scale the distance down by the CharDistanceWeight
/// to return the edit distance in terms of single-character edits.
unsigned getEditDistance(bool Normalized = true) const {
if (CharDistance > MaximumDistance || QualifierDistance > MaximumDistance ||
CallbackDistance > MaximumDistance)
return InvalidDistance;
unsigned ED =
CharDistance * CharDistanceWeight +
QualifierDistance * QualifierDistanceWeight +
CallbackDistance * CallbackDistanceWeight;
if (ED > MaximumDistance)
return InvalidDistance;
// Half the CharDistanceWeight is added to ED to simulate rounding since
// integer division truncates the value (i.e. round-to-nearest-int instead
// of round-to-zero).
return Normalized ? NormalizeEditDistance(ED) : ED;
}
used in clang/lib/Sema/SemaDecl.cpp:
// Callback to only accept typo corrections that have a non-zero edit distance.
// Also only accept corrections that have the same parent decl.
class DifferentNameValidatorCCC final : public CorrectionCandidateCallback {
public:
DifferentNameValidatorCCC(ASTContext &Context, FunctionDecl *TypoFD,
CXXRecordDecl *Parent)
: Context(Context), OriginalFD(TypoFD),
ExpectedParent(Parent ? Parent->getCanonicalDecl() : nullptr) {}
bool ValidateCandidate(const TypoCorrection &candidate) override {
if (candidate.getEditDistance() == 0)
return false;
// ...
}
// ...
};
I would recommend checking out this 10-year-old blog post by Chris Lattner for a general idea of Clang's error recovery mechanisms.
On Clang's Spell Checker, he writes:
One of the more visible things that Clang includes is a spell checker (also on reddit). The spell checker kicks in when you use an identifier that Clang doesn't know: it checks against other close identifiers and suggests what you probably meant.
...
Clang uses the well known Levenshtein distance function to compute the best match out of the possible candidates.
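To make the idea concrete, here is a minimal sketch of the textbook Levenshtein distance together with a toy candidate picker. It is not Clang's implementation: the names EditDistance and SuggestCorrection, the max_distance parameter, and the rule of giving up on a tie are all assumptions made for illustration. The tie rule is only meant to mirror the observation in the question that no suggestion appears when two similar names exist.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Textbook dynamic-programming Levenshtein distance: the minimum number of
// single-character insertions, deletions and substitutions turning a into b.
inline std::size_t EditDistance(const std::string &a, const std::string &b)
{
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = j; // turning "" into the first j characters of b
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i; // turning the first i characters of a into ""
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t subst =
                prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        prev.swap(cur);
    }
    return prev[b.size()];
}

// Hypothetical picker: returns the unique closest candidate within
// max_distance, or an empty string if there is none or the best is tied.
inline std::string SuggestCorrection(const std::string &typo,
                                     const std::vector<std::string> &candidates,
                                     std::size_t max_distance)
{
    std::string best;
    std::size_t best_distance = max_distance + 1;
    bool tie = false;
    for (const std::string &c : candidates) {
        const std::size_t d = EditDistance(typo, c);
        if (d < best_distance) {
            best = c;
            best_distance = d;
            tie = false;
        } else if (d == best_distance) {
            tie = true;
        }
    }
    return (tie || best_distance > max_distance) ? std::string() : best;
}
```

With these definitions, my_it is one edit away from both my_int (an insertion) and my_in (a substitution), so the picker suggests my_int when it is the only close name and nothing when both are declared.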

std::experimental::source_location at compile time

std::experimental::source_location will probably be added to the C++ standard at some point. I'm wondering if it is possible to get the location information into the compile-time realm. Essentially, I want a function that returns different types when called from different source locations. Something like this, although it doesn't compile because the location object isn't constexpr, as it's a function argument:
#include <experimental/source_location>
#include <iostream>
#include <type_traits>
using namespace std::experimental;
constexpr auto line (const source_location& location = source_location::current())
{
return std::integral_constant<int, location.line()>{};
}
int main()
{
constexpr auto ll = line();
std::cout << ll.value << '\n';
}
This doesn't compile, with a message about
expansion of [...] is not a constant expression
regarding the return std::integral_constant<int, location.line()>{} line. What good is it to have the methods of source_location be constexpr if I can't use them?
As Justin pointed out, the issue with your code is that function arguments are not constexpr. The problem of using source_location in a constexpr function in a more useful way is, however, mentioned in the constexpr! functions proposal, which says:
The "Library Fundamentals v. 2" TS contains a "magic" source_location
class to get information similar to the __FILE__ and __LINE__ macros
and the __func__ variable (see N4529 for the current draft, and N4129
for some design notes). Unfortunately, because the "value" of a
source_location is frozen at the point source_location::current() is
invoked, composing code making use of this magic class is tricky:
Typically, a function wanting to track its point of invocation has to
add a defaulted parameter as follows:
void my_log_function(char const *msg,
source_location src_loc
= source_location::current()) {
// ...
}
This idiom ensures that the value of the source_location::current()
invocation is sampled where my_log_function is called instead of where
it is defined.
Immediate (i.e., constexpr!) functions, however, create a clean
separation between the compilation process and the constexpr
evaluation process (see also P0992). Thus, we can make
source_location::current() an immediate function, and wrap it as
needed in other immediate functions: The value produced will
correspond to the source location of the "root" immediate function
call. For example:
constexpr! src_line() {
return source_location::current().line();
}
void some_code() {
std::cout << src_line() << '\n'; // This line number is output.
}
So this is currently an open problem.
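In the meantime, one workaround that is not taken from the proposal quoted above is to fall back on the __LINE__ macro, which is already a constant expression at the point of use. The sketch below assumes a hypothetical macro name, CURRENT_LINE_CONSTANT, and only covers the line number, not the file or function name.

```cpp
#include <type_traits>

// Hypothetical macro: expands at the point of use, so __LINE__ is the line
// of the "call", and each line yields a distinct integral_constant type.
#define CURRENT_LINE_CONSTANT() (std::integral_constant<int, __LINE__>{})

inline int line_gap()
{
    auto first = CURRENT_LINE_CONSTANT();
    auto second = CURRENT_LINE_CONSTANT();
    // The two objects have different types because they come from
    // different source lines.
    static_assert(!std::is_same<decltype(first), decltype(second)>::value,
                  "each source line yields its own type");
    return second.value - first.value; // the two lines are adjacent
}
```

Being a macro, it cannot be wrapped in an ordinary function without freezing the line at the wrapper's definition, which is the same composition problem the proposal describes.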

c++ Google test (gtest): how to create custom asserts and expects?

I am using gtest to create unit tests for my C++ program. In my tests I have to write a lot of checks like this:
ASSERT_TRUE(myObject.IsValid());
EXPECT_EQ(myObject.GetSomeAttribute(), expectedValue);
I have to write both checks because if I omit the ASSERT_TRUE and myObject happens not to be valid, then the myObject.GetSomeAttribute() call crashes. That's not good, even in tests.
What I want is to write something like:
EXPECT_XXX_EQ(myObject.GetSomeAttribute(), expectedValue);
This line of code should do approximately the same as the original two lines (with the optional bonus that if myObject is not valid, this will be reported and GetSomeAttribute() will not be called, but the test will continue running).
How can I write such custom assert/expect?
From the Advanced Guide, we can see that there are a couple of ways we could do this.
The easiest way is by using assertions in a subroutine:
template<typename T>
void AssertAttributeEquals(MyObject const& obj, T value) {
ASSERT_TRUE(obj.IsValid());
// googletest has the assumption that you put the
// expected value first
EXPECT_EQ(value, obj.GetSomeAttribute());
}
And you can call it like so:
AssertAttributeEquals(myObject, expectedValue);
Although you may want to use SCOPED_TRACE to get a better message on failure:
{
SCOPED_TRACE("whatever message you want");
AssertAttributeEquals(myObject, expectedValue);
}
Alternatively, you can use a function that returns an AssertionResult:
template<typename T>
::testing::AssertionResult AttributeEquals(MyObject const& obj, T value) {
if (!obj.IsValid()) {
// If MyObject is streamable, then we probably want to include it
// in the error message.
return ::testing::AssertionFailure() << obj << " is not valid";
}
auto attr = obj.GetSomeAttribute();
if (attr == value) {
return ::testing::AssertionSuccess();
} else {
return ::testing::AssertionFailure() << attr << " not equal to " << value;
}
}
This can be used like so:
EXPECT_TRUE(AttributeEquals(myObject, expectedValue));
This second technique has the benefit of producing nice error messages even if you don't use SCOPED_TRACE.

Getting rid of an ugly C construct

I have inherited a (large) piece of code with an error tracking mechanism: a boolean variable is passed to every method that is called, and when an error occurs at some stage of execution, the method stops and returns, sometimes with a default value.
Something like (BEFORE):
#include <iostream.h>
int fun1(int par1, bool& psuccess)
{
if(par1 == 42) return 43;
psuccess = false;
return -1;
}
int funtoo(int a, bool& psuccess)
{
int t = fun1(a, psuccess);
if(!psuccess)
{
return -1;
}
return 42;
}
void funthree(int b, bool& psuccess)
{
int h = funtoo(b, psuccess);
if(!psuccess)
{
return;
}
cout << "Yuppi" << b;
}
int main()
{
bool success = true;
funthree(43, success);
if(!success)
{
cout<< "Life, universe and everything have no meaning";
}
}
Please note that this is a mixture of C and C++ code, exactly the way the project is.
Now comes a piece of C magic: "someone" somewhere defined a macro:
#define SUCCES_OR_RETURN if(!psuccess) return
And the program above becomes (AFTER):
#include<iostream.h>
int fun1(int par1, bool& psuccess)
{
if(par1 == 42) return 43;
psuccess = false;
return -1;
}
int funtoo(int a, bool& psuccess)
{
int t = fun1(a, psuccess);
SUCCES_OR_RETURN -1;
return 42;
}
void funthree(int b, bool& psuccess)
{
int h = funtoo(b, psuccess);
SUCCES_OR_RETURN ;
cout << "Yuppi" << b;
}
int main()
{
bool success = true;
funthree(43, success);
if(!success)
{
cout<< "Life, universe and everything have no meaning";
}
}
The question: I am wondering whether there is a nicer way to handle this kind of error tracking, or whether I have to live with this. I personally don't like the abuse of the C macro SUCCES_OR_RETURN, i.e. that it is sometimes used with a value and sometimes without, so that it feels like a real return statement, but I did not find any better solution to this ancient design.
Please note that the platform imposes several restrictions, but regardless I am willing to hear opinions about these:
Throwing exceptions. The code is a mixture of C and C++ functions calling each other, and the compiler sort of does not support throw (it accepts the syntax but does nothing with it, just emits a warning). This would be the standard way of solving this problem in a C++ environment.
C++11 features. This goes to a tiny embedded platform with an obscure and ancient "almost" C++ compiler which wasn't made to support the latest C++ features. However, for future reference I am curious whether C++11 offers anything.
Template magic. The compiler has problems understanding complex templates, but again I am willing to see any solutions that you can come up with.
Edit
Also, as @BlueMoon suggested in the comments, creating a global variable does not work, since at the very beginning of the function call chain the success variable is a member variable of a class, and several objects of this class are created, each of which needs to report its own success status :)
There's a great breakdown of hybrid C and C++ error handling strategies here:
http://blog.sduto.it/2014/05/a-c-error-handling-style-that-plays.html
To quote the linked article, your options largely boil down to:
Return an error code from functions that can fail.
Provide a function like Windows's GetLastError() or OpenGL's glGetError() to retrieve the most recently occurring error code.
Provide a global (well, hopefully, thread-local) variable containing the most recent error, like POSIX's errno.
Provide a function to return more information about an error, possibly in conjunction with one of the above approaches, like POSIX's strerror function.
Allow the client to register a callback when an error occurs, like GLFW's glfwSetErrorCallback.
Use an OS-specific mechanism like structured exception handling.
Write errors out to a log file, stderr, or somewhere else.
Just assert() or somehow else terminate the program when an error occurs.
It seems like the author of the code you have inherited picked a rather strange approach; passing a pointer to a boolean [sic] for the function to work with seems rather unusual.
The article has some great examples; personally I like this style:
libfoo_widget_container_t container = NULL;
libfoo_error_details_t error = NULL;
if (libfoo_create_widgets(12, &container, &error) != libfoo_success) {
printf("Error creating widgets: %s\n", libfoo_error_details_c_str(error));
libfoo_error_details_free(error);
abort(); // goodbye, cruel world!
}
Here you get a bit of everything: a passed-in pointer to an error type and a comparison against a success constant (rather than 0|1, a painful dichotomy between C and the rest of the world!).
I don't think it would be too much of a push to say that your macro could be better implemented with a goto. In any case, if a function uses SUCCES_OR_RETURN more than once, it might be a clue that the function is doing too much. Complex cleanup or return logic might be a code smell; you can read more here: http://eli.thegreenplace.net/2009/04/27/using-goto-for-error-handling-in-c/
I have seen this style of error handling before. I call it error-oblivious manual pseudo-exceptions.
The code flow is mostly error-oblivious: you can call 3 functions in a row with the same error flag, then look at the error flag to see if any errors have occurred.
The error flag acts as a pseudo-exception, where once set we start "skipping" over normal code flow, but this is done manually instead of automatically.
If you do something and do not care if an error occurs, you can just drop the error produced and proceed on.
The ICU library handles errors in a similar way.
A more C++1y way to do this while minimizing structural differences would be to modify code to return an expected object.
An expected<T, Err> is expected to be a T, and if something went wrong it is instead an Err type. This can be implemented as a hybrid of boost::variant and C++1y's std::optional. If you overloaded most arithmetic operations on expected< T, Err > + U to return expected< decltype( std::declval<T&>() + std::declval<U>() ), Err > and used auto carefully, you could allow at least arithmetic expressions to keep their structure. You'd then check for the error after the fact.
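As a rough illustration of the idea, here is a deliberately simplified expected type applied to two of the functions from the question. This is a sketch under stated assumptions, not the proposed library type: it stores both a T and an Err instead of using a variant, so both must be default-constructible, and the failure() factory and the error message are made up for the example.

```cpp
#include <string>
#include <utility>

// Simplified stand-in for expected<T, Err> (assumption: T and Err are
// default-constructible; a real implementation would use a variant).
template <class T, class Err>
class expected {
public:
    expected(T value) : has_value_(true), value_(std::move(value)), error_() {}

    // Hypothetical factory for the error case.
    static expected failure(Err error)
    {
        expected e;
        e.error_ = std::move(error);
        return e;
    }

    explicit operator bool() const { return has_value_; }
    const T &value() const { return value_; }
    const Err &error() const { return error_; }

private:
    expected() : has_value_(false), value_(), error_() {}
    bool has_value_;
    T value_;
    Err error_;
};

expected<int, std::string> fun1(int par1)
{
    if (par1 == 42) return 43;
    return expected<int, std::string>::failure("par1 was not 42");
}

expected<int, std::string> funtoo(int a)
{
    expected<int, std::string> t = fun1(a);
    if (!t) return t; // pass the error up unchanged
    return 42;
}
```

The bool& parameter disappears, and the caller tests the returned object instead, for example `if (!funtoo(43)) { /* inspect .error() */ }`.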
On the other hand, if the error return values are predictable based on their type, you could create a type that when cast to a given type produced an error value. Modify functions returning void to return an error object of some kind while you are at it. And now every function can
if (berror) return error_flag_value{};
which at least gets rid of that strange ; or -1; issue.
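A minimal sketch of such a type follows, applied to the functions from the question. It assumes -1 is the error value for every int-returning function, introduces a hypothetical status struct to replace the void return type, and uses C++11 brace syntax as the line above does.

```cpp
// Hypothetical result type for functions that used to return void.
struct status {
    bool ok;
};

// Converts to the error value of whatever return type is expected.
struct error_flag_value {
    operator int() const { return -1; }               // assumed int error value
    operator status() const { return status{false}; } // error for former void functions
};

int fun1(int par1, bool &berror)
{
    if (par1 == 42) return 43;
    berror = true;
    return error_flag_value{};
}

int funtoo(int a, bool &berror)
{
    fun1(a, berror);
    if (berror) return error_flag_value{};
    return 42;
}

status funthree(int b, bool &berror)
{
    funtoo(b, berror);
    if (berror) return error_flag_value{};
    return status{true};
}
```

Every early exit now reads the same regardless of the function's return type, so the macro no longer needs a trailing value or a bare semicolon.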
If you want to go full C++, the answer would be to replace the "invalid return values" with exceptions...
#include <iostream>
#include <exception>
using std::exception;
struct error : exception { const char* what() const noexcept override { return "unsuccessful"; } };
int fun1(int par1) {
if( par1 == 42 ) return 43;
throw error();
}
int funtoo(int a) {
fun1(a);
return 42;
}
void funthree(int b) {
funtoo(b);
std::cout << "Yuppi " << b << "\n";
}
int main() {
try {
funthree(42);
} catch(exception& e) {
std::cout << "Life has no meaning, because " << e.what() << "\n";
}
}
This prints Yuppi 42 (if you change the call funthree(42) to funthree(43), it prints Life has no meaning, because unsuccessful...)
(live at coliru)

How can I get more details about errors generated during protobuf parsing? (C++)

I am new to protobuf (C++) and my code fails during parse of my messages. How can I get more details about the errors that occurred?
Example
The following snippet illustrates the problem:
const bool ok=my_message.ParseFromCodedStream(&stream);
if(ok){
std::cout<< "message parsed. evidence:\n"<< my_message.DebugString();
}
else{
std::cerr<< "error parsing protobuf\n";
//HOW CAN I GET A REASON FOR THE FAILURE HERE?
}
If you look inside the protobuf code, you will find it uses its own logging system, based on macros. By default all these messages go to stderr, but you can capture them in your program with SetLogHandler():
typedef void LogHandler(LogLevel level, const char* filename, int line,
const std::string& message);
One possible solution is to make your own errno-like mechanism (sorry for the C++11-ishness):
typedef std::tuple<LogLevel, std::string, int, std::string> LogMessage; // C++11
typedef std::list<LogMessage> LogStack;
namespace {
LogStack stack;
bool my_errno;
} // namespace
void MyLogHandler(LogLevel level, const char* filename, int line,
const std::string& message) {
stack.emplace_back(level, filename, line, message); // C++11; builds the tuple in place
my_errno = true;
}
// Call this once at startup, from inside a function (e.g. at the top of main):
protobuf::SetLogHandler(MyLogHandler);
bool GetError(LogStack* my_stack) {
if (my_errno && my_stack) {
// Dump collected logs.
my_stack->assign(stack.begin(), stack.end());
}
stack.clear();
bool old_errno = my_errno;
my_errno = false;
return old_errno;
}
And use it in your code:
...
else {
std::cerr<< "error parsing protobuf" << std::endl;
LogStack my_stack;
if (GetError(&my_stack)) {
// Handle your errors here.
}
}
The main drawback of my sample code is that it doesn't work well with multiple threads, but you can fix that on your own.
Sometimes error information will be printed to the console, but that's it. There's no way to get extra error info through the API.
That said, there are only two kinds of errors anyway:
A required field was missing. (Information should be printed to the console in this case.)
The data is corrupt. It was not generated by a valid protobuf implementation at all -- it's not even a different type of protobuf, it's simply not a protobuf.
If you are seeing the latter case, you need to compare your data on the sending and receiving side and figure out why it's different. Remember that the data you feed to the protobuf parser not only must be the same bytes, but it must end at the same place -- the protobuf parser does not know where the message ends except by receiving EOF. This means that if you are writing multiple messages to a stream, you need to write the size before the data, and make sure to read only that many bytes on the receiving end before passing on to the protobuf parser.