In C++, your classes are often divided into two parts, being the header-file and the actual implementation. In my (unexperienced) opinion, this is awful. It requires me to do all sorts of unnecessary book-keeping, clutters up my project directory and goes against everything I've learned about software development (double implementation). Languages where you only deal with the implementation, such as Java or Python, are much nicer to work with.
I've always learned that the reason to use them was to significantly decrease compilation time. However, wouldn't a modern IDE (CLion in my case) or even the compiler be smart enough to either:
Keep some sort of "shadow"-header file, which would automatically be updated whenever a definition is changed in the implementation?
Automatically split it into the header and implementation during compile time, allowing you to only have to deal with one file? (Something that Lazy C++ seems to do)
Or are there any plugins available that offer this kind of behaviour? C++ modules also seem to offer a solution to this problem, but their current status/support is unclear to me and to make matters worse there seem to be two competing standards (Clang's and Microsoft's).
Unfortunately it is not that simple. Header/source file separation C++ inherited from C due to preprocessor, that both share. Automatic generation of a header file is not possible in general, first of all that separation is not trivial, second header file often has preprocessor code that manually written and generates compilation code. Third almost all templated code goes to a header file due to process of compilation and rules of visibility. Changing all of that would require breaking compatibility with existing code, amount of which is significant and nobody wants to do that. More easy would be to create yet another language (like D) but many people would not want to migrate due to various reasons. We know that committee is working on modules and if they manage to make them work without breaking compatibility, that would be helpful for many of us. But again this is not trivial task at all, the way you describe it would only work in certain environments (when you limit yourself) but cannot be applied to everybody.
Will modules make template compilation faster? Templates (usually) have to be header only, and end up residing in the translation unit of the #includer.
Related: Do precompiled headers make template compilation faster?
According to modules proposal, from the very paper you cited, it's the first of the three primary goals for adding modules:
1 Introduction
Modules are a mechanism to package libraries and encapsulate their implementations.
They differ from the traditional approach of translation units and header files primarily in
that all entities are defined in just one place (even classes, templates, etc.). This paper
proposes a module mechanism (somewhat similar to that of Modula-2) with three
primary goals:
Significantly improve build times of large projects
Enable a better separation between interface and implementation
Provide a viable transition path for existing libraries
While these are the driving goals, the proposal also resolves a number of other longstanding practical C++ issues (initialization ordering, run-time performance, etc.).
So, how can they accomplish those goals? Well, from section 4.1:
Since header files are typically included in many other files, the
growth in build cycles is generally superlinear with respect to the total amount of source
code. If the issue is not addressed, it is likely to become worse as the use of templates
increases and more powerful declarative facilities (like concepts, contract programming,
etc.) are added to the language.
Modules address this issue by replacing the textual inclusion mechanism (whose
processing time is roughly proportional to the amount of code included) by a precompiled
module attachment mechanism (whose processing time—when properly implemented—
is roughly proportional to the number of imported declarations). The property that client
translation units need not be recompiled when private module definitions change can be
retained.
In other words, at the very least, the time taken to parse these templates is only done once instead of N times, which is already a huge improvement.
Later sections describe improvements for things like explicit instantiation. The one thing this doesn't directly improve is automatic template instantiation, as section 5.8 acknowledges. Here all that can be guaranteed is exactly the same benefit you already get from precompiled headers: "Both modules Set and Reset must instantiate Lib::S and in fact both expose this instantiation in their interface file." But the proposal then gives some possible technical solutions to the ODR problems, at least some of which also solve the multiple-instantiation problem and may not be possible in today's world. For example, the kind of queried instantiation suggested has been tried repeatedly and it's generally considered too hard to get right with today's model, but modules might make it feasible. There's no proof that it's impossible to get right today, just experience that it's hard, and there's no proof that it would be easier with modules, just the plausible possibility that it might be.
And that fits with a general implication that's never quite stated in the proposal, but is there in the background: Making compilation simpler means we might get new optimizations in the process (directly, because it's easier to reason about what's happening, or indirectly, because more people work on the problem once it's not such a huge pain).
In summary, modules can and will definitely make template compilation faster, if for no other reason than that the template definitions themselves only have to be parsed once. They may allow for other benefits that are either impossible or more difficult to do without modules, but that may not be guaranteeable.
I don't know about modules, but I do know that gcc even now provides precompiled headers, as do many other compilers. A precompiled header can contain a very efficient machine-readable version of a template description, so when that is available upon inclusion of a header, many compiling steps can be skipped which would normally be required for a source-text-only uncompiled header.
The modules paper talks about precompiled interface files, so I assume that current precompiled headers and new precompiled interface files will provide comparable peformance. Creating such a file from a plain text portable module description will probably be more efficient as it can save time due to restrictions of the language syntax. And it will be more standardized, so more headers will get the benefit of precompilation. Current projects seldom precompile more than one header, and cross-project precompiled headers are even rarer in my experience.
Do precompiled headers make template compilation faster?
No; it makes templates not compile. Which is the entire point of both PCHs and modules: to stop compiling everything.
The idea is to turn "load C++ text and compile" into "load C++ symbols." Modules are a generalized form of PCHs.
Now, you still have the cost of instantiating templates (unless they were instantiated within a PCH/module). But the cost of compiling the C++ template code is removed.
I'm developing a C++ library. It got me thinking of the ways Java and C# handle including different components of the libraries. For example, Java uses "import" to allow use of classes from other packages, while C# simply uses "using" to import entire modules.
My questions is, would it be a good idea to #include everything in the library in one massive include and then just use the using directive to import specific classes and modules? Or would this just be down right crazy?
EDIT:
Good responses so far, here are a few mitigating factors which I feel add to this idea:
1) Internal #includes are kept as normal (short and to the point)
2) The file which includes everything is optionally supplied with the library to those who wish to use it3) You could optionally make the big include file part of the pre-compiled header
You're confusing the purpose of #include statements in C++. They do not behave like import statements in Java or using statements in C#. #include does what it says; namely, loads and parses the entire indicated file as part of the current translation unit. The reason for the separate includes is to not have to spend compilation time parsing the entire standard library in every file. In contrast, the statements you're trying to make #include behave like are merely for programmer organization purposes.
#include is for management of the compilation process; not for separating uses. (In fact, you cannot use seperate headers to enforce seperate uses because to do so would violate the one definition rule)
tl;dr -> No, you shouldn't do that. #include as little as possible. When your project becomes large, you'll thank yourself when you're not waiting many hours to compile your project.
I would personally recommend only including the headers when you need them to explicitly show which functionalities your file requires. At the same time, doing so will prevent you from gaining access to functionalities you might no necessarily want, e.g functions unrelated to the goal of the file. Sure, this is no big deal, but I think that it's easier to maintain and change code when you don't have access to unnecessary functions/classes; it just makes it more straightforward.
I might be downvoted for this, but I think you bring up an interesting idea. It would probably slow down compilation a bit, but I think the concept is neat.
As long as you used using sparingly — only for the namespaces you need — other developers would be able to get an idea of what classes were used in a file by glancing at the top. It wouldn't be as granular as seeing a list of #included files, but is seeing a list of included header files really very useful? I don't think so.
Just make sure that all of the header files all use inclusion guards, of course. :)
As said by #Billy ONeal, the main thing is that #include is a preprocessor directive that causes a "^C, ^V" (copy-paste) of code that leads to a compile time increase.
The best considered policy in C++ is to forward declare all possible classes in ".h" files and just include them in the ".cpp" file. It isolates dependencies, as a C/C++ project will be cascadingly rebuilt if a dependent include file is changed.
Of course M$ compilers and its precompiled headers tend to do the opposite, enclosing to what you suggest. But anyone that tried to port code across those compilers is well aware of how smelly it can go.
Some libraries like Qt make extensive use of forward declarations. Take a look on it to see if you like its taste.
I think it will be confusing. When you write C++ you should avoid making it look like Java or C# (or C :-). I for one would really wonder why you did that.
Supplying an include-all file isn't really that helpful either, as a user could easily create one herself, with the parts of the library actually used. Could then be added to a precompiled header, if one is used.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
Many languages, such as Java, C#, do not separate declaration from implementation. C# has a concept of partial class, but implementation and declaration still remain in the same file.
Why doesn't C++ have the same model? Is it more practical to have header files?
I am referring to current and upcoming versions of C++ standard.
Backwards Compatibility - Header files are not eliminated because it would break Backwards Compatibility.
Header files allow for independent compilation. You don't need to access or even have the implementation files to compile a file. This can make for easier distributed builds.
This also allows SDKs to be done a little easier. You can provide just the headers and some libraries. There are, of course, ways around this which other languages use.
Even Bjarne Stroustrup has called header files a kludge.
But without a standard binary format which includes the necessary metadata (like Java class files, or .Net PE files) I don't see any way to implement the feature. A stripped ELF or a.out binary doesn't have much of the information you would need to extract. And I don't think that the information is ever stored in Windows XCOFF files.
I routinely flip between C# and C++, and the lack of header files in C# is one of my biggest pet peeves. I can look at a header file and learn all I need to know about a class - what it's member functions are called, their calling syntax, etc - without having to wade through pages of the code that implements the class.
And yes, I know about partial classes and #regions, but it's not the same. Partial classes actually make the problem worse, because a class definition is spread across several files. As far as #regions go, they never seem to be expanded in the manner I'd like for what I'm doing at the moment, so I have to spend time expanding those little plus's until I get the view right.
Perhaps if Visual Studio's intellisense worked better for C++, I wouldn't have a compelling reason to have to refer to .h files so often, but even in VS2008, C++'s intellisense can't touch C#'s
C was made to make writing a compiler easily. It does a LOT of stuff based on that one principle. Pointers only exist to make writing a compiler easier, as do header files. Many of the things carried over to C++ are based on compatibility with these features implemented to make compiler writing easier.
It's a good idea actually. When C was created, C and Unix were kind of a pair. C ported Unix, Unix ran C. In this way, C and Unix could quickly spread from platform to platform whereas an OS based on assembly had to be completely re-written to be ported.
The concept of specifying an interface in one file and the implementation in another isn't a bad idea at all, but that's not what C header files are. They are simply a way to limit the number of passes a compiler has to make through your source code and allow some limited abstraction of the contract between files so they can communicate.
These items, pointers, header files, etc... don't really offer any advantage over another system. By putting more effort into the compiler, you can compile a reference object as easily as a pointer to the exact same object code. This is what C++ does now.
C is a great, simple language. It had a very limited feature set, and you could write a compiler without much effort. Porting it is generally trivial! I'm not trying to say it's a bad language or anything, it's just that C's primary goals when it was created may leave remnants in the language that are more or less unnecessary now, but are going to be kept around for compatibility.
It seems like some people don't really believe that C was written to port Unix, so here: (from)
The first version of UNIX was written
in assembler language, but Thompson's
intention was that it would be written
in a high-level language.
Thompson first tried in 1971 to use
Fortran on the PDP-7, but gave up
after the first day. Then he wrote a
very simple language he called B,
which he got going on the PDP-7. It
worked, but there were problems.
First, because the implementation was
interpreted, it was always going to be
slow. Second, the basic notions of B,
which was based on the word-oriented
BCPL, just were not right for a
byte-oriented machine like the new
PDP-11.
Ritchie used the PDP-11 to add types
to B, which for a while was called NB
for "New B," and then he started to
write a compiler for it. "So that the
first phase of C was really these two
phases in short succession of, first,
some language changes from B, really,
adding the type structure without too
much change in the syntax; and doing
the compiler," Ritchie said.
"The second phase was slower," he said
of rewriting UNIX in C. Thompson
started in the summer of 1972 but had
two problems: figuring out how to run
the basic co-routines, that is, how to
switch control from one process to
another; and the difficulty in getting
the proper data structure, since the
original version of C did not have
structures.
"The combination of the things caused
Ken to give up over the summer,"
Ritchie said. "Over the year, I added
structures and probably made the
compiler code somewhat better --
better code -- and so over the next
summer, that was when we made the
concerted effort and actually did redo
the whole operating system in C."
Here is a perfect example of what I mean. From the comments:
Pointers only exist to make writing a compiler easier? No. Pointers exist because they're the simplest possible abstraction over the idea of indirection. – Adam Rosenfield (an hour ago)
You are right. In order to implement indirection, pointers are the simplest possible abstraction to implement. In no way are they the simplest possible to comprehend or use. Arrays are much easier.
The problem? To implement arrays as efficiently as pointers you have to pretty much add a HUGE pile of code to your compiler.
There is no reason they couldn't have designed C without pointers, but with code like this:
int i=0;
while(src[++i])
dest[i]=src[i];
it will take a lot of effort (on the compilers part) to factor out the explicit i+src and i+dest additions and make it create the same code that this would make:
while(*(dest++) = *(src++))
;
Factoring out that variable "i" after the fact is HARD. New compilers can do it, but back then it just wasn't possible, and the OS running on that crappy hardware needed little optimizations like that.
Now few systems need that kind of optimization (I work on one of the slowest platforms around--cable set-top boxes, and most of our stuff is in Java) and in the rare case where you might need it, the new C compilers should be smart enough to make that kind of conversion on its own.
In The Design and Evolution of C++, Stroustrup gives out one more reason...
The same header file can have two or more implementation files which can be simultaneously worked-upon by more than one programmer without the need of a source-control system.
This might seem odd these days, but I guess it was an important issue when C++ was invented.
If you want C++ without header files then I have good news for you.
It already exists and is called D (http://www.digitalmars.com/d/index.html)
Technically D seems to be a lot nicer than C++ but it is just not mainstream enough for use in many applications at the moment.
One of C++'s goals is to be a superset of C, and it's difficult for it to do so if it cannot support header files. And, by extension, if you wish to excise header files you may as well consider excising CPP (the pre-processor, not plus-plus) altogether; both C# and Java do not specify macro pre-processors with their standards (but it should be noted in some cases they can be and even are used even with these languages).
As C++ is designed right now, you need prototypes -- just as in C -- to statically check any compiled code that references external functions and classes. Without header files, you would have to type out these class definitions and function declarations prior to using them. For C++ not to use header files, you'd have to add a feature in the language that would support something like Java's import keyword. That'd be a major addition, and change; to answer your question of if it'd be practical: I don't think so--not at all.
Many people are aware of shortcomings of header files and there are ideas to introduce more powerful module system to C++.
You might want to take a look at Modules in C++ (Revision 5) by Daveed Vandevoorde.
Well, C++ per se shouldn't eliminate header files because of backwards compatibility. However, I do think they're a silly idea in general. If you want to distribute a closed-source lib, this information can be extracted automatically. If you want to understand how to use a class w/o looking at the implementation, that's what documentation generators are for, and they do a heck of a lot better a job.
There is value in defining the class interface in a separate component to the implementation file.
It can be done with interfaces, but if you go down that road, then you are implicitly saying that classes are deficient in terms of separating implementation from contract.
Modula 2 had the right idea, definition modules and implementation modules. http://www.modula2.org/reference/modules.php
Java/C#'s answer is an implicit implementation of the same (albeit object-oriented.)
Header files are a kludge, because header files express implementation detail (such as private variables.)
In moving over to Java and C#, I find that if a language requires IDE support for development (such that public class interfaces are navigable in class browsers), then this is maybe a statement that the code doesn't stand on its own merits as being particularly readable.
I find the mix of interface with implementation detail quite horrendous.
Crucially, the lack of ability to document the public class signature in a concise well-commented file independent of implementation indicates to me that the language design is written for convenience of authorship, rather convenience of maintenance. Well I'm rambling about Java and C# now.
One advantage of this separation is that it is easy to view only the interface, without requiring an advanced editor.
No language exists without header files. It's a myth.
Look at any proprietary library distribution for Java (I have no C# experience to speak of, but I'd expect it's the same). They don't give you the complete source file; they just give you a file with every method's implementation blanked ({} or {return null;} or the like) and everything they can get away with hiding hidden. You can't call that anything but a header.
There is no technical reason, however, why a C or C++ compiler could count everything in an appropriately-marked file as extern unless that file is being compiled directly. However, the costs for compilation would be immense because neither C nor C++ is fast to parse, and that's a very important consideration. Any more complex method of melding headers and source would quickly encounter technical issues like the need for the compiler to know an object's layout.
If you want the reason why this will never happen: it would break pretty much all existing C++ software. If you look at some of the C++ committee design documentation, they looked at various alternatives to see how much code it would break.
It would be far easier to change the switch statement into something halfway intelligent. That would break only a little code. It's still not going to happen.
EDITED FOR NEW IDEA:
The difference between C++ and Java that makes C++ header files necessary is that C++ objects are not necessarily pointers. In Java, all class instances are referred to by pointer, although it doesn't look that way. C++ has objects allocated on the heap and the stack. This means C++ needs a way of knowing how big an object will be, and where the data members are in memory.
Header files are an integral part of the language. Without header files, all static libraries, dynamic libraries, pretty much any pre-compiled library becomes useless. Header files also make it easier to document everything, and make it possible to look over a library/file's API without going over every single bit of code.
They also make it easier to organize your program. Yes, you have to be constantly switching from source to header, but they also allow you define internal and private APIs inside the implementations. For example:
MySource.h:
extern int my_library_entry_point(int api_to_use, ...);
MySource.c:
int private_function_that_CANNOT_be_public();
int my_library_entry_point(int api_to_use, ...){
// [...] Do stuff
}
int private_function_that_CANNOT_be_public() {
}
If you #include <MySource.h>, then you get my_library_entry_point.
If you #include <MySource.c>, then you also get private_function_that_CANNOT_be_public.
You see how that could be a very bad thing if you had a function to get a list of passwords, or a function which implemented your encryption algorithm, or a function that would expose the internals of an OS, or a function that overrode privileges, etc.
Oh Yes!
After coding in Java and C# it's really annoying to have 2 files for every classes. So I was thinking how can I merge them without breaking existing code.
In fact, it's really easy. Just put the definition (implementation) inside an #ifdef section and add a define on the compiler command line to compile that file. That's it.
Here is an example:
/* File ClassA.cpp */
#ifndef _ClassA_
#define _ClassA_
#include "ClassB.cpp"
#include "InterfaceC.cpp"
class ClassA : public InterfaceC
{
public:
ClassA(void);
virtual ~ClassA(void);
virtual void methodC();
private:
ClassB b;
};
#endif
#ifdef compiling_ClassA
ClassA::ClassA(void)
{
}
ClassA::~ClassA(void)
{
}
void ClassA::methodC()
{
}
#endif
On the command line, compile that file with
-D compiling_ClassA
The other files that need to include ClassA can just do
#include "ClassA.cpp"
Of course the addition of the define on the command line can easily be added with a macro expansion (Visual Studio compiler) or with an automatic variables (gnu make) and using the same nomenclature for the define name.
Still I don't get the point of some statements. Separation of API and implementation is a very good thing, but header files are not API. There are private fields there. If you add or remove private field you change implementation and not API.
C++ seems to be rather grouchy when declaring templates across multiple files. More specifically, when working with templated classes, the linker expect all method definitions for the class in a single compiler object file. When you take into account headers, other declarations, inheritance, etc., things get really messy.
Are there any general advice or workarounds for organizing or redistributing templated member definitions across multiple files?
Are there any general advice or workarounds for organizing or redistributing templated member definitions across multiple files?
Yes; don't.
The C++ spec permits a compiler to be able to "see" the entire template (declaration and definition) at the point of instantiation, and (due to the complexities of any implementation) most compilers retain this requirement. The upshot is that #inclusion of any template header must also #include any and all source required to instantiate the template.
The easiest way to deal with this is to dump everything into the header, inline where posible, out-of-line where necessary.
If you really regard this as an unacceptable affront, a common option is to split the template into the usual header/implementation pair, and then #include the implementation file at the end of the header.
C++'s "export" feature may or may not provide another workaround. The feature is poorly supported and poorly defined; although it in principle should permit some kind of separate compilation of templates, it doesn't necessarily obviate the demand that the compiler be able to see the entire template body.
Across how many files? If you just want to separate class definitions from implementation then try this article in the C++ faqs. That's about the only way I know of that works at the moment, but some IDEs (Eclipse CDT for example) won't link this method properly and you may get a lot of errors. However, writing your own makefiles or using Visual C++ has always worked for me :-)
When/if your compiler supports C++0x, the extern keyword can be used to separate template declarations from definitions.
See here for a brief explanation.
Also, section 6.3, "The Separation Model," of C++ Templates: The Complete Guide by David Vandevoorde and Nicolai M. Josuttis describes other options.