What are C++ modules and how do they differ from namespaces?

I was looking at the libstdc++ documentation at http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a01618.html and found it arranged into "modules" such as Algorithms, Strings, etc.
I have multiple questions:
Since this is documentation auto-generated by Doxygen, which part of the libstdc++ source code or configuration makes Doxygen "aware" of the different modules and their contents/dependencies?
What are modules, and how do they differ from namespaces?
I did a Google search on C++ modules and found that modules are defined by "export modulename", but I could not find any export definition in the libstdc++ source code. Does the word "Modules" in the above documentation refer to a different construct than export?
Do developers typically divide their source code into modules for large projects?
Where can I learn about modules, so that I can organize my source code into modules and namespaces?

It looks to me like you're running into two entirely separate things that happen to use the same name.
The "modules" you're seeing in the documentation seem to be just a post-hoc classification of the algorithms and such. It may be open to argument that they should correspond closely to namespaces, but in the case of the standard library, essentially everything is in one giant namespace. If it were being designed using namespaces from the beginning it might not be that way, but that's not how things happened. In any case, the classification applies to the documentation, not to the code itself. Somebody else producing similar documentation might decide to divide it up into different modules, still without affecting the code.
During the C++11 standardization effort, there was a proposal to add something else (that also went by the name modules) to the C++ language itself. This proposal was removed, primarily in the interest of finishing the standard sooner. The latter differed from namespaces quite a bit, and is the one that used "export" for a module name. It's dead and gone (at least for now) though, so I won't go into a lot more detail about it here. If you're curious, you can read Daveed Vandevoorde's paper about it.
Update: The committee added modules to C++20. What was added is at least somewhat different from anything anybody would have known about in 2012 when this question was asked, but it is at least pretty much the same general idea as the modules that were proposed for C++11. It's a bit much to add onto a 10-year-old answer, but here's a link to at least some information about them:
https://en.cppreference.com/w/cpp/language/modules
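To give a flavor, here's a minimal sketch of a C++20 module (the file names, the module name, and the greet function are invented for illustration; the interface-unit file extension varies by compiler):

    // greeter.cppm -- a module interface unit
    module;                   // global module fragment: classic #includes go here
    #include <iostream>
    export module greeter;    // this file defines the module named 'greeter'

    // Only names marked 'export' are visible to code that imports the module.
    export void greet() {
        std::cout << "hello from a module\n";
    }

    // main.cpp -- consumes the module; no header is included
    import greeter;

    int main() {
        greet();
    }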

The modules you see in the documentation are created by Doxygen and are not a part of C++. Certain classes in the libstdc++ library are grouped together into modules using the \ingroup Doxygen command.
See: http://www.doxygen.nl/manual/grouping.html for more information on creating modules/groups in Doxygen.
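As a rough illustration (the group name and functions are made up; \defgroup, \ingroup, and the @{ / @} markers are the actual Doxygen commands):

    // string_utils.h
    #include <string>

    /** @defgroup strings String utilities
     *  Helpers for working with text.
     *  @{
     */

    /// Trim whitespace from both ends of a string.
    std::string trim(const std::string& s);

    /** @} */  // closes the 'strings' group

    /// Declared outside the @{ ... @} block, but pulled into the same
    /// documentation group by the ingroup command.
    /// @ingroup strings
    std::string to_upper(const std::string& s);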

Related

Do concepts alleviate the need of defining classes in header files?

Bjarne Stroustrup has mentioned multiple times the disadvantage of having to define templates in header files.
Example: https://youtu.be/HddFGPTAmtU
My question is: is this now solved by the new concepts feature in C++20?
I cannot really find anything regarding that, and Bjarne has not said anything about it since, as far as I know.
Concepts themselves don't eliminate this, but C++ Modules will. C++ Modules are separate from Concepts: you can use modules without using concepts. But given that Concepts encourage the creation of generic code, Modules will be a much-needed addition.
You'll be able to use Modules side by side with #include: you can use one, the other, or both as fits your needs. Modules will speed up the compilation of code significantly, and with modules you'll be able to put templates (and concepts) in a .cpp file without even having a header file.
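As a sketch of what that could look like (the module name, concept, and function are invented; exact file extensions and compiler support vary):

    // mathmod.cppm -- module interface unit
    module;
    #include <concepts>
    export module mathmod;

    // A concept and a template, both defined and exported right here --
    // no header file involved.
    export template <typename T>
    concept Numeric = std::integral<T> || std::floating_point<T>;

    export template <Numeric T>
    T square(T x) { return x * x; }

    // user.cpp -- instantiates the template through the module
    import mathmod;

    int main() {
        return square(3) == 9 ? 0 : 1;
    }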
Concepts are about constraining templates and template-related entities. Concepts do not, and have never tried to, address the restriction of having to define templates in header files.
Modules, on the other hand, do try to address that issue. Not by letting you define templates in source files, but by adding a totally new encapsulation layer to the language and hopefully reducing compile times as a result. But while Concepts have already been added to the working draft for C++20, Modules have not. It looks like Modules could make C++20, but it is not yet clear whether they will. We'll see.

C++ modules: module implementation units to avoid unnecessary recompilation?

I recently watched a video from CppCon 2017: Boris Kolpackov, "Building C++ Modules":
https://www.youtube.com/watch?v=E8EbDcLQAoc
At approximately 31:35 he starts explaining that we should still use header/source splitting, and gives three reasons. The first reason:
If you have both declarations and definitions in the same place, then when you touch this module, all other modules that depend on the module interface (BMI) will be recompiled.
And that I didn't like at all. It sounded like we are still in the 90s and compilers cannot be smart enough to see the difference between BMI-related changes and implementation-related changes. As I see it, compilers should be able to quickly scan each module and generate only the BMI from it; if the BMI has not changed, other modules that depend on it need not be recompiled.
Or am I missing something?
The author of that talk later said the recompilation issue is a matter of implementation. Quoting the article Common C++ Modules TS Misconceptions by Boris Kolpackov:
It turns out a lot of people would like to get rid of the header/source split (or, in modules terms, interface/implementation split) and keep everything in a single file. You can do that in Modules TS: with modules (unlike headers) you can define non-inline functions and variables in module interface units. So if you want to keep everything in a single file, you can.
and
Now, keeping everything in a single file may have negative build performance implications but it appears a smart enough build system in cooperation with the compiler should be able to overcome this. See this discussion for details.
Quoting Gor Nishanov (the Project Editor of Coroutines TS) from the linked thread:
That is up to you how to structure your code. Module TS does not impose on you how you decompose your module into individual files. If you want, you can implement your entire module in the interface file (thus you will have C# like experience), or you can partition your module into an interface and one or more implementation files.
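Sketched in code, the interface/implementation split he describes might look like this (file names and the frobnicate function are invented):

    // widget.cppm -- module interface unit: what importers see (the BMI)
    export module widget;

    export int frobnicate(int x);   // exported declaration only

    // widget.cpp -- module implementation unit of the same module
    module widget;                  // no 'export' keyword here

    // Editing this body changes no declaration in the interface,
    // so the BMI stays the same and importers need not recompile.
    int frobnicate(int x) {
        return 2 * x + 1;
    }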
The Project Editor of Modules TS, Gabriel Dos Reis, commented on the MSVC implementation:
Ideally, only semantics-relevant changes should trigger recompilation keyed on the IFC.
(As a side note, the Modules TS has now been approved and sent to ISO for publication.)

Discovering Symbol Usage

Issue
I have recently found myself working with a large, unfamiliar, multi-department C++ codebase in need of better organization. I would like to discover a way to map which symbols are used by which source files for any given header. The hope is that if only one department uses a given function, it can be moved out of the shared area and into that department's area.
Attempts
My first thought was to use the symbol table: i.e., compile the project and dump the symbols for each object file. From there I figured I could simply write a script to check whether the symbols from my header file were used. While this approach seems viable, it would require me to create a list of the symbols I am looking for from the headers. With my limited knowledge, I am unsure how to automate such a process, and with hundreds of header files to test, doing it manually is out of the question.
Questions
Is my approach valid? If so...
What can I use to generate the symbol names from my header file?
If not...
What else can I do?
Additionally, while I am using Linux, most of the development teams work in Windows-only environments. What utilities could I use on both platforms?
Any and all help is greatly appreciated.
When I need to clean up APIs I sometimes use information from callcatcher. It basically builds a database of all symbols while compiling and allows you to determine what symbols are used in some build product.
I sometimes also use DXR (code on github, an example installation) to browse what code defined where is used how. In contrast to callcatcher with DXR you can drill down to much finer detail. Setting up DXR is pretty heavy duty, but might be worth it if you have enough code to work with.
On the other side of the spectrum there are tools like cscope. Even though it doesn't work super nicely with C++ code, it is still very useful. If you deal with more than a couple hundred kloc, though, you will quickly feel limited.
If I had to pick only one of these tools and were working on a large code base (>1 Mloc), I would definitely pick DXR.
You can get a reasonable start on the information that you've described by using Doxygen.
Even for source that doesn't contain Doxygen-formatted comments, the documentation created can contain a list of the places (i.e., source files) where a particular symbol is used.
And, as Doxygen can be used to generate HTML documentation, navigating through your source tree becomes trivial. It can be even better if you enable the dot functionality to generate relationship diagrams for the classes in your source tree.
Very old-school, simple, and possibly Unix-only, but are you aware of etags? There's also GNU Global, which I think is similar.
The GNU Global link refers to the "comparison with similar tools" discussion here, which might also be useful.

What generic template processor should I use?

This is a potentially dangerous question because interdisciplinary questions and answers will be biased, but I'll have a stab at it anyway. All in good spirit!
So, here we go. I'm writing a major editing mode for Emacs for a language that it has almost no support for yet. And I'm at the point where I have to decide on a way to generate project files. Below is an outline of the task ahead:
The templates have to represent project directory tree, not only single files.
The resulting files are of various formats, potentially including SGML-like languages, but not limited to this variety. They also have to generate C-like source code, eLisp source code, and plain-text files like README, for example.
The templates must be processed in a batch upon user-initiated action (as in: the user wants to create a project, so several files must be created in the user-appointed directory). It may be beneficial to have the ability to supervise the creation, but this is less important than the ability to run the process entirely automatically.
Bonus features:
The template language already has a user base (with the potential for reuse of existing templates).
The templates can be used for code snippets (they contain blanks which are filled in interactively once the user invokes a code-generating routine while editing the file).
Obvious things like cross-platform support and ease of use, both through a graphical interface and the command line.
I did some research, but I won't share my results (yet) so as not to bias the answers. The problem with answering this question is not that the answer is hard to find, but that it is hard to choose one from many.
I'm developing a system based on Mustache for exactly the use case that you've described. The template language itself is a very simple extension of Mustache called Groome.
I also released a command-line tool called Molt that renders Groome templates. I'd be curious to know if it does everything that you need. I'm still adding features to the tool and haven't yet announced it. Thanks.
I set out to solve a similar problem several years back, when I wanted to use Emacs to generate code out of a UML diagram (cogre), and also to generate Makefiles from project specifications. I first tried to use Tempo, but when I tried to get the templates to nest, I ran into problems. I also looked into skeleton, but that didn't quite fit the plan either.
I ended up using Google Templates for a little bit, and liked the syntax, and developed SRecode instead, and just borrowed the good bits from Google templates. SRecode was written specifically for machine-generated code. The interaction for template insertion (aka what Tempo was written for) isn't first class in SRecode. For generating code from a data structure, however, it is very robust, and has a lot of features, and automatically filled variables. It works closely with your major mode, and allows many nested templates, with control over the nested dictionary values. There is a subsystem that will use Semantic tags and generate code from them for a couple of languages. That means you can parse code in one language with Semantic, and generate code in another language with SRecode using those tags. Nifty! Many parts of the CEDET reference manuals were built that way.
The templates themselves allow looping, if statements, and include statements. There are a couple of examples in SRecode for making an 'application', such as the comment writer, and EDE uses it to create Makefiles, which is almost exactly what you are trying to do.
Another option is Generator, which offers “language-agnostic project bootstrapping with an emphasis on simplicity”. Installation requires Node.js and npm.
Generator’s emphasis on simplicity means it is very easy to learn how to make a template. Generator also saves you from having to reference templates by file paths – it looks for templates in ~/.generator.
However, there is no way to write README or LICENSE files for the template itself without those files being copied to the generated project. Also, post-generation commands written in the Makefile will be copied to the generated Makefile, even after they are no longer of use. Finally, the ad-hoc templating language doesn’t provide a way to escape its __lowercasevariables__ – though I can’t think of a language where that limitation would be a problem.

Where do I learn "what I need to know" about C++ compilers?

I'm just starting to explore C++, so forgive the newbiness of this question. I also beg your indulgence on how open ended this question is. I think it could be broken down, but I think that this information belongs in the same place.
(FYI -- I am working predominantly with the Qt SDK and mingw32-make right now, and I seem to have configured them correctly for my machine.)
I knew that there was a lot in the language which is compiler-driven -- I've heard about preprocessor directives, but it seems like someone could write books about the different C++ compilers and their respective parameters. In addition, there are commands which apparently precede make (like qmake, for example -- is this something found only in Qt?).
I would like to know if there is any place which gives me an overview of what compilers are out there, and what their different options are. I'd also like to know how each of them views Makefiles (it seems that there is a difference in syntax between them?).
If there is no website covering "Everything you need to know about C++ compilers but were afraid to ask," what would be the best way to go about learning the answers to these questions?
Concerning the "numerous options of the various compilers"
A piece of good news: you needn't worry about the detail of most of these options. You will, in due time, delve into this, only for the very compiler you use, and maybe only for the options that pertain to a particular set of features. But as a novice, generally trust the default options or the ones supplied with the make files.
The broad categories of these features (and I may be missing a few) are:
pre-processor defines (now, you may need a few of these)
code generation (target CPU, FPU usage...)
optimization (hints for the compiler to favor speed over size and such)
inclusion of debug info (which is extra data left in the object/binary and which enables the debugger to know where each line of code starts, what the variables names are etc.)
directives for the linker
output type (exe, library, memory maps...)
C/C++ language compliance and warnings (compatibility with previous version of the compiler, compliance to current and past C Standards, warning about common possible bug-indicative patterns...)
compile-time verbosity and help
Concerning an inventory of compilers with their options and features
I know of no such list, but I'm sure it probably exists on the web. However, I suggest that, as a novice, you worry little about these "details" and use whatever free compiler you can find (gcc is certainly a great choice), building experience with the language and the build process. C professionals may argue, with good reason and at length, on the merits of various compilers and their associated runtimes etc., but for generic purposes (and then some) the free stuff is all that is needed.
Concerning the build process
The most trivial applications, such as those made of a single unit of compilation (read: a single C/C++ source file), can be built with a simple batch file where the various compiler and linker options are hardcoded, and where the name of the file is specified on the command line.
For all other cases, it is very important to codify the build process so that it can be done
a) automatically and
b) reliably, i.e. with repeatability.
The "recipe" associated with this build process is often encapsulated in a make file or as the complexity grows, possibly several make files, possibly "bundled together in a script/bat file.
This (make file syntax) you need to get familiar with, even if you use alternatives to make/nmake, such as Apache Ant; the reason is that many (most?) source code packages include a make file.
In a nutshell, make files are text files that allow defining targets and the associated commands to build each target. Each target is associated with its dependencies, which allows the make logic to decide which targets are out of date and should be rebuilt, and, before rebuilding them, which of their dependencies should also be rebuilt. That way, when you modify, say, an include file (and if the make file is properly configured), any C file that uses this header will be recompiled, and any binary which links with the corresponding obj file will be rebuilt as well. make also includes options to force all targets to be rebuilt, which is sometimes handy to be sure that you truly have a current build (for example, in case some dependencies of a given object are not declared in the make file).
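For instance, a minimal make file along these lines (file names and flags are only an example; in real make files the recipe lines must begin with a tab) ties targets, dependencies, and commands together:

    # Flags illustrating the option categories discussed earlier:
    # warnings, debug info, optimization.
    CXX      = g++
    CXXFLAGS = -Wall -Wextra -g -O2

    # The binary depends on the object files...
    app: main.o util.o
            $(CXX) $(CXXFLAGS) -o app main.o util.o

    # ...and each object file depends on its source and the headers it
    # includes, so touching util.h rebuilds both objects and the binary.
    main.o: main.cpp util.h
            $(CXX) $(CXXFLAGS) -c main.cpp

    util.o: util.cpp util.h
            $(CXX) $(CXXFLAGS) -c util.cpp

    # 'make clean' removes intermediates; 'make -B' forces a full rebuild.
    clean:
            rm -f app main.o util.o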
On the Pre-processor:
The pre-processor is the first step toward compiling, although it is technically not part of the compilation. The purposes of this step are:
to remove any comments and extraneous whitespace
to substitute any macro reference with the relevant C/C++ syntax. Some macros, for example, are used to define constant values, such as, say, some email address used in the program; during pre-processing, any reference to this constant value (by convention, such constants are named with ALL_CAPS_AND_UNDERSCORES) is replaced by the actual C string literal containing the email address.
to exclude all conditional-compilation branches that are not relevant (#ifdef and the like)
What's important to know about the pre-processor is that the pre-processor directives are NOT part of the C language proper, and they serve several important functions, such as the conditional compiling mentioned earlier (used, for example, to have multiple versions of the program, say for different operating systems, or indeed for different compilers).
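A tiny illustration (the macro and the messages are invented):

    #include <iostream>

    // Constant substitution: every use of ADMIN_EMAIL below is replaced
    // with the string literal before compilation proper begins.
    #define ADMIN_EMAIL "admin@example.com"

    int main() {
        // Conditional compiling: only one branch survives pre-processing.
    #ifdef _WIN32
        std::cout << "Windows build, contact " << ADMIN_EMAIL << "\n";
    #else
        std::cout << "Non-Windows build, contact " << ADMIN_EMAIL << "\n";
    #endif
    }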
Taking it from there...
After this manifesto of mine... I encourage you to read but little more, and to dive into programming and building binaries. It is a very good idea to try to get a broad picture of the framework etc., but this can be overdone, a bit akin to the exchange student who stays in his/her room reading the Webster dictionary to be "prepared" for meeting native speakers, rather than just "doing it!".
Ideally you shouldn't need to care which C++ compiler you are using. Compatibility with the standard has gotten much better in recent years (even from Microsoft).
Compiler flags obviously differ, but the same features are generally available; it's just a differently named option to, e.g., set the warning level on GCC and MS cl.
The build system is independent of the compiler; you can use any make with any compiler.
That is a lot of questions in one.
C++ compilers are a lot like hammers: They come in all sizes and shapes, with different abilities and features, intended for different types of users, and at different price points; ultimately they all are for doing the same basic task as the others.
Some are intended for highly specialized applications, like high-performance graphics, and have numerous extensions and libraries to assist the engineer with those types of problems. Others are meant for general purpose use, and aren't necessarily always the greatest for extreme work.
The technique for using each type of hammer varies from model to model—and version to version—but they all have a lot in common. The macro preprocessor is a standard part of C and C++ compilers.
A brief comparison of many C++ compilers is here. Also check out the list of C compilers, since many programs don't use any C++-specific features and can be compiled as ordinary C.
C++ compilers don't "view" makefiles. The rules of a makefile may invoke a C++ compiler, but also may "compile" assembly language modules (assembling), process other languages, build libraries, link modules, and/or post-process object modules. Makefiles often contain rules for cleaning up intermediate files, establishing debug environments, obtaining source code, etc., etc. Compilation is one link in a long chain of steps to develop software.
Also, many development environments abstract the makefile into a "project file" which is used by an integrated development environment (IDE) in an attempt to simplify or automate many programming tasks. See a comparison here.
As for learning: choose a specific problem to solve and dive in. The target platform (Linux/Windows/etc.) and problem space will narrow the choices pretty well. Which you choose is often linked to other considerations, such as working for a particular company, or being part of a team. C++ has something like 95% commonality among all its flavors. Learn any one of them well, and learning the next is a piece of cake.