Clang for fuzzy parsing C++

Clang for fuzzy parsing C++ - c++

Is it at all possible to parse C++ with incomplete declarations with clang with its existing libclang API ? I.e. parse .cpp file without including all the headers, deducing declarations on the fly. so, e.g. The following text:
A B::Foo(){return stuff();}
Will detect unknown symbol A, call my callback that deducts A is a class using my magic heuristic, then call this callback the same way with B and Foo and stuff. In the end I want to be able to infer that I saw a member Foo of class B returning A, and stuff is a function.. Or something to that effect.
context: I wanna see if I can do sensible syntax highlighting and on the fly code analysis without parsing all the headers very quickly.
[EDIT] To clarify, I'm looking for very heavily restricted C++ parsing, possibly with some heuristic to lift some of the restrictions.
C++ grammar is full of context dependencies. Is Foo() a function call or a construction of a temporary of class Foo? Is Foo<Bar> stuff; a template Foo<Bar> instantiation and declaration of variable stuff, or is it weird-looking 2 calls to overloaded operator < and operator > ? It's only possible to tell in context, and context often comes from parsing the headers.
What I'm looking for is a way to plug my custom convention rules. E.g. I know that I don't overload Win32 symbols, so I can safely assume that CreateFile is always a function, and I even know its signature. I also know that all my classes start with a capital letter and are nouns, and functions are usually verbs, so I can reasonably guess that Foo and Bar are class names. In a more complex scenario, I know I don't write side-effect-free expressions like a < b > c; so I can assume that a is always a template instantiation. And so on.
So, the question is whether it's possible to use Clang API to call back every time it encounters an unknown symbol, and give it an answer using my own non-C++ heuristic. If my heuristic fails, then the parse fails, obviously. And I'm not talking about parsing Boost library :) I'm talking about very simple C++, probably without templates, restricted to some minimum that clang can handle in this case.

I know the question is fairly old, but have a look here :
LibFuzzy is a library for heuristically parsing C++ based on Clang's
Lexer. The fuzzy parser is fault-tolerant, works without knowledge of
the build system and on incomplete source files. As the parser
necessarily makes guesses, the resulting syntax tree may be partially
wrong.
It is a sub-project from clang-highlight, an (experimental?) tool which seems to be no longer developed.
I'm only interested in the fuzzy parsing part and forked it on my github page where I fixed several minor issues and made the tool autonomous (it can be compiled outside clang's source tree). Don't try to compile it with C++14 (which G++ 6's default mode), because there will be conflicts with make_unique.
According to this page, clang-format has its own fuzzy parser (and is actively developed), but the parser was (is ?) more tighly coupled to the tool.

Unless you heavily restrict the code that people are allowed to write, it is basically impossible to do a good job of parsing C++ (and hence syntax highlighting beyond keywords/regular expressions) without parsing all the headers. The pre-processor is particularly good at screwing things up for you.
There are some thoughts on the difficulties of fuzzy parsing (in the context of visual studio) here which might be of interest: http://blogs.msdn.com/b/vcblog/archive/2011/03/03/10136696.aspx

Another solution which I think will suit more the OP than fuzzy parsing.
When parsing, clang maintains Semantic information through the Sema part of the analyzer. When encountering an unknown symbol, Sema will fallback to ExternalSemaSource to get some information about this symbol. Through this, you could implement what you want.
Here is a quick example how to set up it. It is not entirely functional (I'm not doing anything in the LookupUnqualified method), you might need to do further investigations and I think it is a good start.
// Declares clang::SyntaxOnlyAction.
#include <clang/Frontend/FrontendActions.h>
#include <clang/Tooling/CommonOptionsParser.h>
#include <clang/Tooling/Tooling.h>
#include <llvm/Support/CommandLine.h>
#include <clang/AST/AST.h>
#include <clang/AST/ASTConsumer.h>
#include <clang/AST/RecursiveASTVisitor.h>
#include <clang/Frontend/ASTConsumers.h>
#include <clang/Frontend/FrontendActions.h>
#include <clang/Frontend/CompilerInstance.h>
#include <clang/Tooling/CommonOptionsParser.h>
#include <clang/Tooling/Tooling.h>
#include <clang/Rewrite/Core/Rewriter.h>
#include <llvm/Support/raw_ostream.h>
#include <clang/Sema/ExternalSemaSource.h>
#include <clang/Sema/Sema.h>
#include "clang/Basic/DiagnosticOptions.h"
#include "clang/Frontend/TextDiagnosticPrinter.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Basic/TargetOptions.h"
#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/FileManager.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Lex/Preprocessor.h"
#include "clang/Basic/Diagnostic.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/ASTConsumer.h"
#include "clang/Parse/Parser.h"
#include "clang/Parse/ParseAST.h"
#include <clang/Sema/Lookup.h>
#include <iostream>
using namespace clang;
using namespace clang::tooling;
using namespace llvm;
class ExampleVisitor : public RecursiveASTVisitor<ExampleVisitor> {
private:
ASTContext *astContext;
public:
explicit ExampleVisitor(CompilerInstance *CI, StringRef file)
: astContext(&(CI->getASTContext())) {}
virtual bool VisitVarDecl(VarDecl *d) {
std::cout << d->getNameAsString() << "#\n";
return true;
}
};
class ExampleASTConsumer : public ASTConsumer {
private:
ExampleVisitor visitor;
public:
explicit ExampleASTConsumer(CompilerInstance *CI, StringRef file)
: visitor(CI, file) {}
virtual void HandleTranslationUnit(ASTContext &Context) {
// de cette façon, on applique le visiteur sur l'ensemble de la translation
// unit
visitor.TraverseDecl(Context.getTranslationUnitDecl());
}
};
class DynamicIDHandler : public clang::ExternalSemaSource {
public:
DynamicIDHandler(clang::Sema *Sema)
: m_Sema(Sema), m_Context(Sema->getASTContext()) {}
~DynamicIDHandler() = default;
/// \brief Provides last resort lookup for failed unqualified lookups
///
/// If there is failed lookup, tell sema to create an artificial declaration
/// which is of dependent type. So the lookup result is marked as dependent
/// and the diagnostics are suppressed. After that is's an interpreter's
/// responsibility to fix all these fake declarations and lookups.
/// It is done by the DynamicExprTransformer.
///
/// #param[out] R The recovered symbol.
/// #param[in] S The scope in which the lookup failed.
virtual bool LookupUnqualified(clang::LookupResult &R, clang::Scope *S) {
DeclarationName Name = R.getLookupName();
std::cout << Name.getAsString() << "\n";
// IdentifierInfo *II = Name.getAsIdentifierInfo();
// SourceLocation Loc = R.getNameLoc();
// VarDecl *Result =
// // VarDecl::Create(m_Context, R.getSema().getFunctionLevelDeclContext(),
// // Loc, Loc, II, m_Context.DependentTy,
// // /*TypeSourceInfo*/ 0, SC_None, SC_None);
// if (Result) {
// R.addDecl(Result);
// // Say that we can handle the situation. Clang should try to recover
// return true;
// } else{
// return false;
// }
return false;
}
private:
clang::Sema *m_Sema;
clang::ASTContext &m_Context;
};
// *****************************************************************************/
LangOptions getFormattingLangOpts(bool Cpp03 = false) {
LangOptions LangOpts;
LangOpts.CPlusPlus = 1;
LangOpts.CPlusPlus11 = Cpp03 ? 0 : 1;
LangOpts.CPlusPlus14 = Cpp03 ? 0 : 1;
LangOpts.LineComment = 1;
LangOpts.Bool = 1;
LangOpts.ObjC1 = 1;
LangOpts.ObjC2 = 1;
return LangOpts;
}
int main() {
using clang::CompilerInstance;
using clang::TargetOptions;
using clang::TargetInfo;
using clang::FileEntry;
using clang::Token;
using clang::ASTContext;
using clang::ASTConsumer;
using clang::Parser;
using clang::DiagnosticOptions;
using clang::TextDiagnosticPrinter;
CompilerInstance ci;
ci.getLangOpts() = getFormattingLangOpts(false);
DiagnosticOptions diagnosticOptions;
ci.createDiagnostics();
std::shared_ptr<clang::TargetOptions> pto = std::make_shared<clang::TargetOptions>();
pto->Triple = llvm::sys::getDefaultTargetTriple();
TargetInfo *pti = TargetInfo::CreateTargetInfo(ci.getDiagnostics(), pto);
ci.setTarget(pti);
ci.createFileManager();
ci.createSourceManager(ci.getFileManager());
ci.createPreprocessor(clang::TU_Complete);
ci.getPreprocessorOpts().UsePredefines = false;
ci.createASTContext();
ci.setASTConsumer(
llvm::make_unique<ExampleASTConsumer>(&ci, "../src/test.cpp"));
ci.createSema(TU_Complete, nullptr);
auto &sema = ci.getSema();
sema.Initialize();
DynamicIDHandler handler(&sema);
sema.addExternalSource(&handler);
const FileEntry *pFile = ci.getFileManager().getFile("../src/test.cpp");
ci.getSourceManager().setMainFileID(ci.getSourceManager().createFileID(
pFile, clang::SourceLocation(), clang::SrcMgr::C_User));
ci.getDiagnosticClient().BeginSourceFile(ci.getLangOpts(),
&ci.getPreprocessor());
clang::ParseAST(sema,true,false);
ci.getDiagnosticClient().EndSourceFile();
return 0;
}
The idea and the DynamicIDHandler class are from cling project where unknown symbols are variable (hence the comments and the code).

OP doesn't want "fuzzy parsing". What he wants is full context-free parsing of the C++ source code, without any requirement for name and type resolution.
He plans to make educated guesses about the types based on the result of the parse.
Clang proper tangles parsing and name/type resolution, which means it must have all that background type information available when it parses. Other answers suggest a LibFuzzy that produces incorrect parse trees, and some fuzzy parser for clang-format about which I know nothing. If one insists on producing a classic AST, none of these solutions will produce the "right" tree in the face of ambiguous parses.
Our DMS Software Reengineering Toolkit with its C++ front end can parse C++ source without the type information, and produces accurate "ASTs"; these are actually abstract syntax dags where forks in trees represent different possible interpretations of the source code according to a language-precise grammar (ambiguous (sub)parses).
What Clang tries to do is avoid producing these multiple sub-parses by using type information as it parses. What DMS does is produce the ambiguous parses, and in (optional) post-parsing (attribute-grammar-evaluation) pass, collect symbol table information and eliminate the sub-parses which are inconsistent with the types; for well-formed programs, this produces a plain AST with no ambiguities left.
If OP wants to make heuristic guesses about the type information, he will need to know these possible interpretations. If they are eliminated in advance, he cannot straightforwardly guess what types might be needed. An interesting possibility is the idea of modifying the attribute grammar (provided in source form as part of the DMS's C++ front end), which already knows all of C++ type rules, to do this with partial information. That would be an enormous head start over building the heuristic analyzer from scratch, given that it has to know about 600 pages of arcane name and type resolution rules from the standard.
You can see examples of the (dag) produced by DMS's parser.

Related

ROS2 C++ Message Builder Pattern: error use of auto before deduction of auto

On Ubuntu 20.04, ROS2 Foxy, GCC version 9.4.0, C++ rclcpp node.
The completely undocumented builder pattern for messages works as follows. (Odometry for example)
#include <nav_msgs/msg/odometry.hpp>
// https://docs.ros2.org/foxy/api/nav_msgs/msg/Odometry.html
nav_msgs::msg::Odometry const odom = nav_msgs::build<nav_msgs::msg::Odometry>()
.header(std_msgs::msg::Header()) .child_frame_id("")
.pose(geometry_msgs::msg::PoseWithCovariance())
.twist(geometry_msgs::msg::TwistWithCovariance());
// This works // Excuse typos
This is opposed to the "classic" style (since ROS committee won't allow initialization with arguments):
nav_msgs::msg::Odometry odom; // Notice can't make it const
odom.header = std_msgs::msg::Header();
odom.header.stamp = builtin_interfaces::msg::Time(); // or = node.get_clock()->now();
odom.header.frame_id = "...";
// ... etc, fill in other odometry members
Problem: But a lot of the time I get a cryptic error, and I can't discern why.
Ex, if I tried:
odom.header = std_msgs::build<std_msgs::msg::Header>() // Error here, on ')'
.stamp(node.get_clock()->now()).frame_id("map");
error: use of auto std_msgs::build() [with MessageType = std_msgs::msg::Header_<std::allocator<void> >] before deduction of auto
I tried to make the type as explicit as possible; casting to target type, creating an intermediary variable of the target type, no change in error.
To even understand that the builder pattern exists, unless you found some cryptic scrawl on the internet (like so), you have to dig into the generated header files, so I did.
You will find
/// vim /opt/ros/foxy/include/std_msgs/msg/header.hpp
// generated from rosidl_generator_cpp/resource/idl.hpp.em
// generated code does not contain a copyright notice
#ifndef STD_MSGS__MSG__HEADER_HPP_
#define STD_MSGS__MSG__HEADER_HPP_
#include "std_msgs/msg/detail/header__struct.hpp"
#include "std_msgs/msg/detail/header__builder.hpp"
#include "std_msgs/msg/detail/header__traits.hpp"
#endif // STD_MSGS__MSG__HEADER_HPP_
/// vim /opt/ros/foxy/include/std_msgs/msg/detail/header__builder.hpp
// ...
namespace std_msgs {
namespace msg {
namespace builder {
// ... bunch of builder classes calling the next, ex:
class Init_Header_stamp { ... };
} // namespace builder
} // namespace msg
template<typename MessageType>
auto build();
template<>
inline
auto build<::std_msgs::msg::Header>()
{
return std_msgs::msg::builder::Init_Header_stamp();
}
} // namespace std_msgs
Using this as a guide, I wondered if I repeated / copy-pasted the template declaration in my code, and replaced auto with the actual type(s), would that work... but at this point my c++ fu was exhausted, and all the cobbled substitutions just gave more weird errors.
Thoughts?

Solution:
The parent type includes the child type definitions, but not the builder definitions.
In this example, to include Odometry and then use the builder pattern on Header, you also need to #include <std_msgs/msg/header.hpp>.
Why: (no, really, why was it implemented like this, where is the documentation)
/// In vim /opt/ros/foxy/include/nav_msgs/msg/detail/odometry__struct.hpp
// Include directives for member types
// Member 'header'
#include "std_msgs/msg/detail/header__struct.hpp"
// Member 'pose'
#include "geometry_msgs/msg/detail/pose_with_covariance__struct.hpp"
// Member 'twist'
#include "geometry_msgs/msg/detail/twist_with_covariance__struct.hpp"
/// Notice, doesn't include the full child header, which also includes __builder.hpp,
/// but just the sub-header with type definitions, __struct.hpp!
So include every header manually if you use it's builder pattern.

VSCode: Set C/C++ preprocessor macro for static analysis

I am developing a library which lets user set a crucial type alias, or do it through preprocessor directives.
This type alias (or the directive) is undeclared in the library, by design. Thus, when developing my code I get annoying error messages and squiggles for this undeclared type. This could be avoided if I declare a temporary type for it somewhere. However, I do not like the idea of declaring it when I work with the code and then remove it when I publish it. It is also bug prone, since I could easily forget to remove it.
My question is: Can I define preprocessor directives for VS Code's static analysis (IntelliSense? C/C++ Extension)?
That would let me consider the analysis like what a well defined type alias would produce. And avoid annoying error messages/squiggles.
Minimal reproducable example:
Online compiler example
tCore.hpp
#pragma once
#include <string>
// User is responsible of declaring the tCore type
// tCore interface methods
template<typename TCore>
std::string printImpl();
tPrint.hpp
#pragma once
#include <iostream>
class tPrint {
public:
tPrint() = default;
void print() const {
std::cout << printImpl<tCore>() << std::endl; // <-- Static analysis error here!
}
};
tFoo.hpp - tCore candidate
#pragma once
#include <string>
#include "tCore.hpp"
struct tFoo {};
template<>
std::string printImpl<tFoo>() {
return "tFoo";
}
main.cpp
#include <tFoo.hpp>
using tCore = tFoo;
int main() {
tPrint p{};
p.print(); // -> "tFoo"
return 0;
}

I found out it was IntelliSense causing the error through the C/C++ Extension.
I also found an option of adding compiler arguments to IntelliSense, which is exactly what I was looking for.
Either through the UI:
Press
F1 -> > C/C++: Edit Configurations (UI) -> Scroll down to Defines
Or via the JSON :
c_cpp_properties.json configurations has a field defines which holds any compiler arguments.

Extending the class definition upon including a file inside its block

I am drafting some modification on a C++ I have been given. In order to separate my contributions for the time being, I would like to extend the class definition in a original header file by including another file.
In terms of pseudo code it would be something like this
class OldClass : public OldSuperClass
{
private:
blah blah
protected:
blah blah
public:
blah blah
#include "NewStuff.h" // this is the action
blah blah
};
This was inspired by the explanation that
the #include directive causes the preprocessor to include another file, usually a header file" (C++ Pocket Reference, O'Reilly).
The file NewStuff contains vanilla definitions like
#ifndef _JNew_
#define _JNew_
#include <iomanip> // this is the mistake (see comments and answers)
const int apha = 1;
double beta;
blah blah
inline double HalfDif (double a, double b)
{
return (a - b) * 0.5;
}
#endif
I would want these to become public members of OldClass. At compile time, however, I receive errors of the type
error: expected unqualified-id at end of input
error: expected unqualified-id before ‘namespace’
Searching across the internet fora, these seem to indicate that there is a syntax problem with braces and semicolons.
However, in this case the error is triggered by the inclusion, and I am not familiar with the rules that I have apparently infringed.
What is wrong with the action above, and how can it be made to work? As a temporary hack would actually be quite handy.

What you have done can work. Just consider that #includes are simply about textual replacement. What you can put in the header is for example:
// xy.h
int x;
So when it is expanded in:
struct foo {
#include "xy.h"
};
The result becomes
struct foo {
// xy.h
int x;
};
Your code is invalid after replacing the #include with the contents of the header.
However, even if you fix that, this is uncommon and will make a good surprise for many reading the code (real life: surprises yeah, coding: surprises meh). Headers are not made to be used like this. At least use a different extension like .incl. But even then it is not a "nice" solution. It will result in a mess pretty fast once you have to apply modifications in different places in the code. If you want to keep your modifications apart from the rest of the code you should use a version control system where you can create your working branch without disturbing others working on other branches.

Is it bad to use #include in the middle of code?

I keep reading that it's bad to do so, but I don't feel those answers fully answer my specific question.
It seems like it could be really useful in some cases. I want to do something like the following:
class Example {
private:
int val;
public:
void my_function() {
#if defined (__AVX2__)
#include <function_internal_code_using_avx2.h>
#else
#include <function_internal_code_without_avx2.h>
#endif
}
};
If using #include in the middle of code is bad in this example, what would be a good practice approach for to achieve what I'm trying to do? That is, I'm trying to differentiate a member function implementation in cases where avx2 is and isn't available to be compiled.

No it is not intrinsically bad. #include was meant to allow include anywhere. It's just that it's uncommon to use it like this, and this goes against the principle of least astonishment.
The good practices that were developed around includes are all based on the assumption of an inclusion at the start of a compilation unit and in principle outside any namespace.
This is certainly why the C++ core guidelines recommend not to do it, being understood that they have normal reusable headers in mind:
SF.4: Include .h files before other declarations in a file
Reason
Minimize context dependencies and increase readability.
Additional remarks: How to solve your underlying problem
Not sure about the full context. But first of all, I wouldn't put the function body in the class definition. This would better encapsulate the implementation specific details for the class consumers, which should not need to know.
Then you could use conditional compilation in the body, or much better opt for some policy based design, using templates to configure the classes to be used at compile time.

I agree with what #Christophe said. In your case I would write the following code
Write a header commonInterface.h
#pragma once
#if defined (__AVX2__)
void commonInterface (...) {
#include <function_internal_code_using_avx2.h>
}
#else
void commonInterface (...) {
#include <function_internal_code_without_avx2.h>
}
#endif
so you hide the #if defined in the header and still have good readable code in the implementation file.
#include <commonInterface>
class Example {
private:
int val;
public:
void my_function() {
commonInterface(...);
}
};

#ifdef __AVX2__
# include <my_function_avx2.h>
#else
# include <my_function.h>
#endif
class Example {
int val;
public:
void my_function() {
# ifdef __AVX2__
my_function_avx2(this);
# else
my_function(this);
# endif
}
};

Whether it is good or bad really depends on the context.
The technique is often used if you have to write a great amount of boilerplate code. For example, the clang compiler uses it all over the place to match/make use of all possible types, identifiers, tokens, and so on. Here is an example, and here another one.
If you want to define a function differently depending on certain compile-time known parameters, it's seen cleaner to put the definitions where they belong to be.
You should not split up a definition of foo into two seperate files and choose the right one at compile time, as it increases the overhead for the programmer (which is often not just you) to understand your code.
Consider the following snippet which is, at least in my opinion, much more expressive:
// platform.hpp
constexpr static bool use_avx2 = #if defined (__AVX2__)
true;
#else
false;
#endif
// example.hpp
class Example {
private:
int val;
public:
void my_function() {
if constexpr(use_avx2) {
// code of "functional_internal_code_using_avx2.h"
}
else {
// code of "functional_internal_code_without_avx2.h"
}
};
The code can be improved further by generalizing more, adding layers of abstractions that "just define the algorithm" instead of both the algorithm and platform-specific weirdness.
Another important argument against your solution is the fact that both functional_internal_code_using_avx2.h and functional_internal_code_without_avx2.h require special attention:
They do not build without example.h and it is not obvious without opening any of the files that they require it. So, specific flags/treatment when building the project have to be added, which is difficult to maintain as soon as you use more than one such functional_internal_code-files.

I am not sure what you the bigger picture is in your case, so whatever follows should be taken with a grain of salt.
Anyway: #include COULD happen anywhere in the code, BUT you could think of it as a way of separating code / avoiding redundancy. For definitions, this is already well covered by other means. For declarations, it is the standard approach.
Now, this #includes are placed at the beginning as a courtesy to the reader who can catch up more quickly on what to expect in the code to follow, even for #ifdef guarded code.
In your case, it looks like you want a different implementation of the same functionality. The to-go approach in this case would be to link a different portion of code (containing a different implementation), rather than importing a different declaration.
Instead, if you want to really have a different signature based on your #ifdef then I would not see a more effective way than having #ifdef in the middle of the code. BUT, I would not consider this a good design choice!

I define this as bad coding for me. It makes code hard to read.
My approach would be to create a base class as an abstract interface and create specialized implementations and then create the needed class.
E.g.:
class base_functions_t
{
public:
virtual void function1() = 0;
}
class base_functions_avx2_t : public base_functions_t
{
public:
virtual void function1()
{
// code here
}
}
class base_functions_sse2_t : public base_functions_t
{
public:
virtual void function1()
{
// code here
}
}
Then you can have a pointer to your base_functions_t and instanciate different versions. E.g.:
base_functions_t *func;
if (avx2)
{
func = new base_functions_avx2_t();
}
else
{
func = new base_functions_sse2_t();
}
func->function1();

As a general rule I would say that it's best to put headers that define interfaces first in your implementation files.
There are of course also headers that don't define any interfaces. I'm thinking mainly of headers that use macro hackery and are intended to be included one or more times. This type of header typically doesn't have include guards. An example would be <cassert>. This allows you to write code something like this
#define NDEBUG 1
#include <cassert>
void foo() {
// do some stuff
assert(some_condition);
}
#undef NDEBUG
#include <cassert>
void bar() {
assert(another_condition);
}
If you only include <cassert> at the start of your file you will have no granularity for asserts in your implementation file other than all on or all off. See here for more discussion on this technique.
If you do go down the path of using conditional inclusion as per your example then I would strongly recommend that you use an editor like Eclipse or Netbeans that can do inline preprocessor expansion and visualization. Otherwise the loss of locality that inclusion brings can severely hurt readability.

How to overcome the namespace evil of C++ header files?

With one of my projects I will head into the C++ field. Basically I am coming
from a Java background and was wondering how the concept of Java packages
is realized in the C++ world. This led me to the C++ concept of namespaces.
I am absolutely fine with namespaces so far but when it comes to header files
things are becoming kind of inefficient with respect to fully qualified class
names, using-directives and using-declarations.
A very good description of the issue is this article by Herb Sutter.
As I understand it this all boils down to: If you write a header file always
use fully qualified type names to refer to types from other namespaces.
This is almost unacceptable. As a C++ header commonly provides the declaration
of a class, a maximum of readability has top priority. Fully qualifying each
type from a different namespace creates a lot of visual noise, finally
diminishing readability of the header to a degree which raises the question
whether to use namespaces at all.
Nevertheless I want to take advantage of C++ namespaces and so put some thought into
the question: How to overcome the namespace evil of C++ header files? After
some research I think typedefs could be a valid cure to this problem.
Following you will find a C++ sample program which demonstrates how I would
like to use public class scoped typedefs to import types from other namespaces.
The program is syntactically correct and compiles fine on MinGW W64. So far so
good, but I am not sure whether this approach happily removes the using keyword
from the header but brings in another problem which I am simply not aware of.
Just something tricky like the things described by Herb Sutter.
That is I kindly ask everybody who has a thorough understanding of C++ to
review the code below and let me know whether this should work or not. Thanks
for your thoughts.
MyFirstClass.hpp
#ifndef MYFIRSTCLASS_HPP_
#define MYFIRSTCLASS_HPP_
namespace com {
namespace company {
namespace package1 {
class MyFirstClass
{
public:
MyFirstClass();
~MyFirstClass();
private:
};
} // namespace package1
} // namespace company
} // namespace com
#endif /* MYFIRSTCLASS_HPP_ */
MyFirstClass.cpp
#include "MyFirstClass.hpp"
using com::company::package1::MyFirstClass;
MyFirstClass::MyFirstClass()
{
}
MyFirstClass::~MyFirstClass()
{
}
MySecondClass.hpp
#ifndef MYSECONDCLASS_HPP_
#define MYSECONDCLASS_HPP_
#include <string>
#include "MyFirstClass.hpp"
namespace com {
namespace company {
namespace package2 {
/*
* Do not write using-declarations in header files according to
* Herb Sutter's Namespace Rule #2.
*
* using std::string; // bad
* using com::company::package1::MyFirstClass; // bad
*/
class MySecondClass{
public:
/*
* Public class-scoped typedefs instead of using-declarations in
* namespace package2. Consequently we can avoid fully qualified
* type names in the remainder of the class declaration. This
* yields maximum readability and shows cleanly the types imported
* from other namespaces.
*/
typedef std::string String;
typedef com::company::package1::MyFirstClass MyFirstClass;
MySecondClass();
~MySecondClass();
String getText() const; // no std::string required
void setText(String as_text); // no std::string required
void setMyFirstInstance(MyFirstClass anv_instance); // no com::company:: ...
MyFirstClass getMyFirstInstance() const; // no com::company:: ...
private:
String is_text; // no std::string required
MyFirstClass inv_myFirstInstance; // no com::company:: ...
};
} // namespace package2
} // namespace company
} // namespace com
#endif /* MYSECONDCLASS_HPP_ */
MySecondClass.cpp
#include "MySecondClass.hpp"
/*
* According to Herb Sutter's "A Good Long-Term Solution" it is fine
* to write using declarations in a translation unit, as long as they
* appear after all #includes.
*/
using com::company::package2::MySecondClass; // OK because in cpp file and
// no more #includes following
MySecondClass::MySecondClass()
{
}
MySecondClass::~MySecondClass()
{
}
/*
* As we have already imported all types through the class scoped typedefs
* in our header file, we are now able to simply reuse the typedef types
* in the translation unit as well. This pattern shortens all type names
* down to a maximum of "ClassName::TypedefTypeName" in the translation unit -
* e.g. below we can simply write "MySecondClass::String". At the same time the
* class declaration in the header file now governs all type imports from other
* namespaces which again enforces the DRY - Don't Repeat Yourself - principle.
*/
// Simply reuse typedefs from MySecondClass
MySecondClass::String MySecondClass::getText() const
{
return this->is_text;
}
// Simply reuse typedefs from MySecondClass
void MySecondClass::setText(String as_text)
{
this->is_text = as_text;
}
// Simply reuse typedefs from MySecondClass
void MySecondClass::setMyFirstInstance(MyFirstClass anv_instance)
{
this->inv_myFirstInstance = anv_instance;
}
// Simply reuse typedefs from MySecondClass
MySecondClass::MyFirstClass MySecondClass::getMyFirstInstance() const
{
return this->inv_myFirstInstance;
}
Main.cpp
#include <cstdio>
#include "MySecondClass.hpp"
using com::company::package2::MySecondClass; // OK because in cpp file and
// no more #includes following
int main()
{
// Again MySecondClass provides all types which are imported from
// other namespaces and are part of its interface through public
// class scoped typedefs
MySecondClass *lpnv_mySecCls = new MySecondClass();
// Again simply reuse typedefs from MySecondClass
MySecondClass::String ls_text = "Hello World!";
MySecondClass::MyFirstClass *lpnv_myFirClsf =
new MySecondClass::MyFirstClass();
lpnv_mySecCls->setMyFirstInstance(*lpnv_myFirClsf);
lpnv_mySecCls->setText(ls_text);
printf("Greetings: %s\n", lpnv_mySecCls->getText().c_str());
lpnv_mySecCls->setText("Goodbye World!");
printf("Greetings: %s\n", lpnv_mySecCls->getText().c_str());
getchar();
delete lpnv_myFirClsf;
delete lpnv_mySecCls;
return 0;
}

Pain is mitigated by reducing complexity. You're bending C++ into Java. (That works just as bad as trying the other way.)
Some hints:
Remove the "com" namespace level. (This is just a java-ism that you don't need)
Drop the "company" namespace, maybe replace by "product" or "library" namespace (i.e. boost, Qt, OSG, etc). Just pick something that's unique w.r.t. the other libs you're using.
You don't need to fully declare names that are in the same namespace you're in (caveat emptor: template classe, see comment). Just avoid any using namespace directives in the headers. (And use with care in C++ files, if at all. Inside functions is preferred.)
Consider namespace aliases (in functions/cpp files), i.e namespace bll = boost::lambda;. This creates shortcuts that are quite neat.
Also, by hiding private members/types using the pimpl pattern, your header have less types to expose.
P.S: Thanks to #KillianDS a few good tips in comments (that were deleted when I edited them into the question.)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js