Related
In C++, your classes are often divided into two parts, being the header-file and the actual implementation. In my (unexperienced) opinion, this is awful. It requires me to do all sorts of unnecessary book-keeping, clutters up my project directory and goes against everything I've learned about software development (double implementation). Languages where you only deal with the implementation, such as Java or Python, are much nicer to work with.
I've always learned that the reason to use them was to significantly decrease compilation time. However, wouldn't a modern IDE (CLion in my case) or even the compiler be smart enough to either:
Keep some sort of "shadow"-header file, which would automatically be updated whenever a definition is changed in the implementation?
Automatically split it into the header and implementation during compile time, allowing you to only have to deal with one file? (Something that Lazy C++ seems to do)
Or are there any plugins available that offer this kind of behaviour? C++ modules also seem to offer a solution to this problem, but their current status/support is unclear to me and to make matters worse there seem to be two competing standards (Clang's and Microsoft's).
Unfortunately it is not that simple. Header/source file separation C++ inherited from C due to preprocessor, that both share. Automatic generation of a header file is not possible in general, first of all that separation is not trivial, second header file often has preprocessor code that manually written and generates compilation code. Third almost all templated code goes to a header file due to process of compilation and rules of visibility. Changing all of that would require breaking compatibility with existing code, amount of which is significant and nobody wants to do that. More easy would be to create yet another language (like D) but many people would not want to migrate due to various reasons. We know that committee is working on modules and if they manage to make them work without breaking compatibility, that would be helpful for many of us. But again this is not trivial task at all, the way you describe it would only work in certain environments (when you limit yourself) but cannot be applied to everybody.
Imagine a project with the development stretched over 10+ years timespan. Some parts are in C, some are in C++ and all of the code uses global functions and global variables. The architecture was designed inherently single threaded and kept growing that way. But now we consider utilizing many-core architectures.
Now one idea being evaluated is to refactor a part of the code into a library, to make it possible to create more than one instance, so that they can run in separate threads and don’t interfere with each other.
The proposal that gains the most traction at this point is to wrap all the library files into namespaces with macro defines, like:
namespace VARIANT {
// all the code
}
Then define the VARIANT in a header or on project level. This will make it possible to have different contexts within different namespaces. And the selling point is that this approach will require minimal code change and has low risk of introducing any regression.
But if at some point we need to make the behavior of Variant1 different from Variant2, things will get tricky, since there’s no way to compare the value of a macro define with a string in a preprocessor macro.
Is there a more elegant way to achieve this?
Another variant might be spotting all global variables and making them thread_local. Requires either C++11 or at least compiler extensions providing the same (__thread using older GCC).
If I read this question right, you even don't need to convert your C files into C++ files (which your approach requires as C does not support namespaces...), but you need C11 for.
Refactoring an old project and make it multithreading is not so simple. First of all you have mixture of C and C++ codes and you cannot blindly follow C++ approach here. Instead of namespace you need to thing on the below areas:-
Find out all the code blocks, container like list, large array of objects etc which need synchronization.
Find out interdependency of threads and how will you control them. For example one thread will generate a report and insert into a table and second one need that information to generate its final report, now you need to find out these kind of dependency among the threads in your code base and need to find out their control mechanism.
Old style of multithreading in C++ was very tricky hence you need to migrate your code to C++11 where implementation of multithreading is much easier.
As you said that in your current project there are lots of global variable, you need to think properly how you are going to share these variables amongst different threads and how will you synchronize access of these variables.
These are some hints you need to consider lots of areas in advance before starting refactoring else all your efforts end in smoke.
GOOD LUCK for your plan.
Just do it in steps, testing each time:
1) typedef a struct with all the globals in it. malloc one, and edit the existing code to reference it. Test - should work exactly the same as with the globals.
2) Create one thread to run one instance of the code. Test - should work exactly the same as with the globals.
3) Try multiple threads.
One step at a time...
Please try very hard to not attempt any bodges!
I am developing an interface that can be used as dynamic loading.Also it should be compiler independent.So i wanted to export Interfaces.
I am facing now the following problems..
Problem 1: The interface functions are taking some custom data types (basically classes or structures) as In\Out parameters.I want to initialise members of these classes with default values using constructors.If i do this it is not possible to load my library dynamically and it becomes compiler dependent. How to solve this.
Problem 2: Some interfaces returns lists(or maps) of element to client.I am using std containers for this purpose.But this also once again compiler dependent(and compiler version also some times).
Thanks.
Code compiled differently can only work together if it adopts the same Application Binary Interface (ABI) for the set of types used for parameters and return value. ABI's are significant at a much deeper level - name mangling, virtual dispatch tables etc., but my point's that if there's one your compilers support allowing calling of functions with simple types, you can at least think about hacking together some support for more complex types like compiler-specific implementations of Standard containers and user-defined types.
You'll have to research what ABI support your compilers provide, and infer what you can about what they'll continue to provide.
If you want to support other types beyond what the relevant ABI standardises, options include:
use simpler types to expose internals of more complex types
pass [const] char* and size_t extracted by my_std_string.data() or &my_std_string[0] and my_std_string.size(), similarly for std::vector
serialise the data and deserialise it using the data structures of the receiver (can be slow)
provide a set of function pointers to simple accessor/mutator functions implemented by the object that created the data type
e.g. the way the classic C qsort function accepts a pointer to an element comparison function
As I usually have a multithreading focus, I'm mostly going to bark about your second problem.
You already realized that passing elements of a container over an API seems to be compiler dependent. It's actually worse: it's header file & C++-library dependent, so at least for Linux you're already stuck with two different sets: libstc++ (originating from gcc) and libcxx (originating from clang).
Because part of the containers is header files and part is library code, getting things ABI-independent is close to impossible.
My bigger worry is that you actually thought of passing container elements around. This is a huge threadsafety issue: the STL containers are not threadsafe - by design.
By passing references over the interface, you are passing "pointers to encapsulated knowledge" around - the users of your API could make assumptions of your internal structures and start modifying the data pointed to. That is usually already really bad in a singlethreaded environment, but gets worse in a multithreaded environment.
Secondly, pointers you provided could get stale, not good either.
Make sure to return copies of your inner knowledge to prevent user modification of your structures.
Passing things const is not enough: const can be cast away and you still expose your innards.
So my suggestion: hide the data types, only pass simple types and/or structs that you fully control (i.e. are not dependent on STL or boost).
Designing an API with the widest ABI compatibility is an extremely complex subject, even more so when C++ is involved instead of C.
Yet there are more theoretical-type issues that aren't really quite as bad as they sound in practice. For example, in theory, calling conventions and structure padding/alignment sound like they could be major headaches. In practice they aren't so much, and you can even resolve such issues in hindsight by specifying additional build instructions to third parties or decorating your SDK functions with macros indicating the appropriate calling convention. By "not so bad" here, I mean that they can trip you up but they won't have you going back to the drawing board and redesigning your entire SDK in response.
The "practical" issues I want to focus on are issues that can have you revisiting the drawing board and redoing the entire SDK. My list is also not exhaustive, but are some of the ones I think you should really keep in mind first.
You can also treat your SDK as consisting of two parts: a dynamically-linked part that actually exports functionality whose implementation is hidden from clients, and a statically (internally) linked convenience library part that adds C++ wrappers on top. If you treat your SDK as having these two distinct parts, you're allowed a lot more liberty in the statically-linked library to use a lot more C++ mechanisms.
So, let's get started with those practical headache inducers:
1. The binary layout of a vtable is not necessarily consistent across compilers.
This is, in my opinion, one of the biggest gotchas. We're usually looking at 2 main ways to access functionality from one module to another at runtime: function pointers (including those provided by dylib symbol lookup) and interfaces containing virtual functions. The latter can be so much more convenient in C++ (both for implementor and client using the interface), yet unfortunately using virtual functions in an API that aims to be binary compatible with the widest range of compilers is like playing minesweeper through a land of gotchas.
I would recommend avoiding virtual functions outright for this purpose unless your team consists of minesweeper experts who know all of these gotchas. It's useful to try to fall in love with C again for those public interface parts and start building a fondness for these kinds of interfaces consisting of function pointers:
struct Interface
{
void* opaque_private_data;
void (*func1)(struct Interface* self, ...);
void (*func2)(struct Interface* self, ...);
void (*func3)(struct Interface* self, ...);
};
These present far fewer gotchas and are nowhere near as fragile against changes (ex: you're perfectly allowed to do things like add more function pointers to the bottom of the structure without affecting ABI).
2. Stub libs for dylib symbol lookup are linker-specific (as are all static libs in general).
This might not seem like a big deal until combined with #1. When you toss out virtual functions for the purpose of exporting interfaces, then the next big temptation is to often export whole classes or select methods through a dylib.
Unfortunately doing this with manual symbol lookup can become very unwieldy very quickly, so the temptation is to often do this automatically by simply linking to the appropriate stub.
Yet this too can become unwieldy when your goal is to support as many compilers/linkers as possible. In such a case, you may have to possess many compilers and build and distribute different stubs for each possibility.
So this can kind of push you into a corner where it's no longer very practical export class definitions anymore. At this point you might simply export free-standing functions with C linkage (to avoid C++ name mangling which is another potential source of headaches).
One of the things that should be obvious already is that we're getting nudged more and more towards favoring a C or C-like API if our goal is universal binary compatibility without opening up too many cans of worms.
3. Different modules have 'different heaps'.
If you allocate memory in one module and try to deallocate it in another, then you're trying to free memory from a mismatching heap and will invoke undefined behavior.
Even in plain old C, it's easy to forget this rule and malloc in one exported function only to return a pointer to it with the expectation that the client accessing the memory from a different module will free it when done. This once again invokes undefined behavior, and we have to export a second function to indirectly free the memory from the same module that allocated it.
This can become a much bigger gotcha in C++ where we often have class templates that have internal linkage that implicitly do memory management. For example, even if we roll our own std::vector-like sequence like List<T>, we can run into a scenario where a client creates a list, passes it to our API by reference where we use functions that can allocate/deallocate memory (like push_back or insert) and butt heads with this mismatching heap/free store issue. So even this hand-rolled container should ensure that it allocates and deallocates memory from the same central location if it's going to be passed around across modules, and placement new will become your friend when implementing such containers.
4. Passing/returning C++ standard objects is not ABI-compatible.
This includes C++ standard containers as you have already guessed. There's no really practical way to ensure that one compiler will use a compatible representation of something like std::vector when including <vector> as another. So passing/returning such standard objects whose representation is outside of your control is generally out of the question if you're targeting wide binary compatibility.
These don't even necessarily have compatible representations within two projects built by the same compiler, as their representations can vary in incompatible ways based on build settings.
This might make you think that you should now roll all kinds of containers by hand, but I would suggest a KISS approach here. If you're returning a variable number of elements as a result from a function, then we don't need a wide range of container types. We only need one dynamic array kind of container, and it doesn't even have to be a growable sequence, just something with proper copy, move, and destruction semantics.
It might seem nicer and could save some cycles if you just returned a set or a map in a function that computes one, but I'd suggest forgetting about returning these more sophisticated structures and convert to/from this basic dynamic array kind of representation. It's rarely the bottleneck you might think it would be to transfer to/from contiguous representations, and if you actually do run into a hotspot as a result of this which you actually gained from a legit profiling session of a real world use case, then you can always add more to your SDK in a very discrete and selective fashion.
You can also always wrap those more sophisticated containers like map into a C-like function pointer interface that treats the handle to the map as opaque, hidden away from clients. For heftier data structures like a binary search tree, paying the cost of one level of indirection is generally very negligible (for simpler structures like a random-access contiguous sequence, it generally isn't quite as negligible, especially if your read operations like operator[] involve indirect calls).
Another thing worth noting is that everything I've discussed so far relates to the exported, dynamically-linked side of your SDK. The static convenience library that is internally linked is free to receive and return standard objects to make things convenient for the third party using your library, provided that you're not actually passing/returning them in your exported interfaces. You can even avoid rolling your own containers outright and just take a C-style mindset to your exported interfaces, returning raw pointers to T* that needs to be freed while your convenience library does that automatically and transfers the contents to std::vector<T>, e.g.
5. Throwing exceptions across module boundaries is undefined.
We should generally not be throwing exceptions from one module to be caught in another when we cannot ensure compatible build settings in the two modules, let alone the same compiler. So throwing exceptions from your API to indicate input errors is generally out of the question in this case.
Instead we should catch all possible exceptions at the entry points to our module to avoid leaking them into the outside world, and translate all such exceptions into error codes.
The statically-linked convenience library can still call one of your exported functions, check the error code, and in the case of failure, throw an exception. This is perfectly fine here since that convenience library is internally linked to the module of the third party using this library, so it's effectively throwing the exception from the third party module to be caught by the same third party module.
Conclusion
While this is, by no means, an exhaustive list, these are some caveats that can, when unheeded, cause some of the biggest issues at the broadest level of your API design. These kinds of design-level issues can be exponentially more expensive to fix in hindsight than implementation-type issues, so they should generally have the highest priority.
If you're new to these subjects, you can't go too far wrong favoring a C or very C-like API. You can still use a lot of C++ implementing it and can also build a C++ convenience library back on top (your clients don't even have to use anything but the C++ interfaces provided by that internally-linked convenience library).
With C, you're typically looking at more work at the baseline level, but potentially far fewer of those disastrous design-level gotchas. With C++, you're looking at less work at the baseline level, but far more potentially disastrous surprise scenarios. If you favor the latter route, you generally want to ensure that your team's expertise with ABI issues is higher with a larger coding standards document dedicating large sections to these potential ABI gotchas.
For your specific questions:
Problem 1: The interface functions are taking some custom data types
(basically classes or structures) as In\Out parameters.I want to
initialise members of these classes with default values using
constructors.If i do this it is not possible to load my library
dynamically and it becomes compiler dependent. How to solve this.
This is where that statically-linked convenience library can come in handy. You can statically link all that convenient code like a class with constructors and still pass in its data in a more raw, primitive kind of form to the exported interfaces. Another option is to selectively inline or statically link the constructor so that its code is not exported as with the rest of the class, but you probably don't want to be exporting classes as indicated above if your goal is max binary compatibility and don't want too many gotchas.
Problem 2: Some interfaces returns lists(or maps) of element to
client.I am using std containers for this purpose.But this also once
again compiler dependent(and compiler version also some times).
Here we have to forgo those standard container goodies at least at the exported API level. You can still utilize them at the convenience library level which has internal linkage.
Is it possible to implement monkey patching in C++?
Or any other similar approach to that?
Thanks.
Not portably so, and due to the dangers for larger projects you better have good reason.
The Preprocessor is probably the best candidate, due to it's ignorance of the language itself. It can be used to rename attributes, methods and other symbol names - but the replacement is global at least for a single #include or sequence of code.
I've used that before to beat "library diamonds" into submission - Library A and B both importing an OS library S, but in different ways so that some symbols of S would be identically named but different. (namespaces were out of the question, for they'd have much more far-reaching consequences).
Similary, you can replace symbol names with compatible-but-superior classes.
e.g. in VC, #import generates an import library that uses _bstr_t as type adapter. In one project I've successfully replaced these _bstr_t uses with a compatible-enough class that interoperated better with other code, just be #define'ing _bstr_t as my replacement class for the #import.
Patching the Virtual Method Table - either replacing the entire VMT or individual methods - is somethign else I've come across. It requires good understanding of how your compiler implements VMTs. I wouldn't do that in a real life project, because it depends on compiler internals, and you don't get any warning when thigns have changed. It's a fun exercise to learn about the implementation details of C++, though. One application would be switching at runtime from an initializer/loader stub to a full - or even data-dependent - implementation.
Generating code on the fly is common in certain scenarios, such as forwarding/filtering COM Interface calls or mapping OS Window Handles to library objects. I'm not sure if this is still "monkey-patching", as it isn't really toying with the language itself.
To add to other answers, consider that any function exposed through a shared object or DLL (depending on platform) can be overridden at run-time. Linux provides the LD_PRELOAD environment variable, which can specify a shared object to load after all others, which can be used to override arbitrary function definitions. It's actually about the best way to provide a "mock object" for unit-testing purposes, since it is not really invasive. However, unlike other forms of monkey-patching, be aware that a change like this is global. You can't specify one particular call to be different, without impacting other calls.
Considering the "guerilla third-party library use" aspect of monkey-patching, C++ offers a number of facilities:
const_cast lets you work around zealous const declarations.
#define private public prior to header inclusion lets you access private members.
subclassing and use Parent::protected_field lets you access protected members.
you can redefine a number of things at link time.
If the third party content you're working around is provided already compiled, though, most of the things feasible in dynamic languages isn't as easy, and often isn't possible at all.
I suppose it depends what you want to do. If you've already linked your program, you're gonna have a hard time replacing anything (short of actually changing the instructions in memory, which might be a stretch as well). However, before this happens, there are options. If you have a dynamically linked program, you can alter the way the linker operates (e.g. LD_LIBRARY_PATH environment variable) and have it link something else than the intended library.
Have a look at valgrind for example, which replaces (among alot of other magic stuff it's dealing with) the standard memory allocation mechanisms.
As monkey patching refers to dynamically changing code, I can't imagine how this could be implemented in C++...
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
Many languages, such as Java, C#, do not separate declaration from implementation. C# has a concept of partial class, but implementation and declaration still remain in the same file.
Why doesn't C++ have the same model? Is it more practical to have header files?
I am referring to current and upcoming versions of C++ standard.
Backwards Compatibility - Header files are not eliminated because it would break Backwards Compatibility.
Header files allow for independent compilation. You don't need to access or even have the implementation files to compile a file. This can make for easier distributed builds.
This also allows SDKs to be done a little easier. You can provide just the headers and some libraries. There are, of course, ways around this which other languages use.
Even Bjarne Stroustrup has called header files a kludge.
But without a standard binary format which includes the necessary metadata (like Java class files, or .Net PE files) I don't see any way to implement the feature. A stripped ELF or a.out binary doesn't have much of the information you would need to extract. And I don't think that the information is ever stored in Windows XCOFF files.
I routinely flip between C# and C++, and the lack of header files in C# is one of my biggest pet peeves. I can look at a header file and learn all I need to know about a class - what it's member functions are called, their calling syntax, etc - without having to wade through pages of the code that implements the class.
And yes, I know about partial classes and #regions, but it's not the same. Partial classes actually make the problem worse, because a class definition is spread across several files. As far as #regions go, they never seem to be expanded in the manner I'd like for what I'm doing at the moment, so I have to spend time expanding those little plus's until I get the view right.
Perhaps if Visual Studio's intellisense worked better for C++, I wouldn't have a compelling reason to have to refer to .h files so often, but even in VS2008, C++'s intellisense can't touch C#'s
C was made to make writing a compiler easily. It does a LOT of stuff based on that one principle. Pointers only exist to make writing a compiler easier, as do header files. Many of the things carried over to C++ are based on compatibility with these features implemented to make compiler writing easier.
It's a good idea actually. When C was created, C and Unix were kind of a pair. C ported Unix, Unix ran C. In this way, C and Unix could quickly spread from platform to platform whereas an OS based on assembly had to be completely re-written to be ported.
The concept of specifying an interface in one file and the implementation in another isn't a bad idea at all, but that's not what C header files are. They are simply a way to limit the number of passes a compiler has to make through your source code and allow some limited abstraction of the contract between files so they can communicate.
These items, pointers, header files, etc... don't really offer any advantage over another system. By putting more effort into the compiler, you can compile a reference object as easily as a pointer to the exact same object code. This is what C++ does now.
C is a great, simple language. It had a very limited feature set, and you could write a compiler without much effort. Porting it is generally trivial! I'm not trying to say it's a bad language or anything, it's just that C's primary goals when it was created may leave remnants in the language that are more or less unnecessary now, but are going to be kept around for compatibility.
It seems like some people don't really believe that C was written to port Unix, so here: (from)
The first version of UNIX was written
in assembler language, but Thompson's
intention was that it would be written
in a high-level language.
Thompson first tried in 1971 to use
Fortran on the PDP-7, but gave up
after the first day. Then he wrote a
very simple language he called B,
which he got going on the PDP-7. It
worked, but there were problems.
First, because the implementation was
interpreted, it was always going to be
slow. Second, the basic notions of B,
which was based on the word-oriented
BCPL, just were not right for a
byte-oriented machine like the new
PDP-11.
Ritchie used the PDP-11 to add types
to B, which for a while was called NB
for "New B," and then he started to
write a compiler for it. "So that the
first phase of C was really these two
phases in short succession of, first,
some language changes from B, really,
adding the type structure without too
much change in the syntax; and doing
the compiler," Ritchie said.
"The second phase was slower," he said
of rewriting UNIX in C. Thompson
started in the summer of 1972 but had
two problems: figuring out how to run
the basic co-routines, that is, how to
switch control from one process to
another; and the difficulty in getting
the proper data structure, since the
original version of C did not have
structures.
"The combination of the things caused
Ken to give up over the summer,"
Ritchie said. "Over the year, I added
structures and probably made the
compiler code somewhat better --
better code -- and so over the next
summer, that was when we made the
concerted effort and actually did redo
the whole operating system in C."
Here is a perfect example of what I mean. From the comments:
Pointers only exist to make writing a compiler easier? No. Pointers exist because they're the simplest possible abstraction over the idea of indirection. – Adam Rosenfield (an hour ago)
You are right. In order to implement indirection, pointers are the simplest possible abstraction to implement. In no way are they the simplest possible to comprehend or use. Arrays are much easier.
The problem? To implement arrays as efficiently as pointers you have to pretty much add a HUGE pile of code to your compiler.
There is no reason they couldn't have designed C without pointers, but with code like this:
int i=0;
while(src[++i])
dest[i]=src[i];
it will take a lot of effort (on the compilers part) to factor out the explicit i+src and i+dest additions and make it create the same code that this would make:
while(*(dest++) = *(src++))
;
Factoring out that variable "i" after the fact is HARD. New compilers can do it, but back then it just wasn't possible, and the OS running on that crappy hardware needed little optimizations like that.
Now few systems need that kind of optimization (I work on one of the slowest platforms around--cable set-top boxes, and most of our stuff is in Java) and in the rare case where you might need it, the new C compilers should be smart enough to make that kind of conversion on its own.
In The Design and Evolution of C++, Stroustrup gives out one more reason...
The same header file can have two or more implementation files which can be simultaneously worked-upon by more than one programmer without the need of a source-control system.
This might seem odd these days, but I guess it was an important issue when C++ was invented.
If you want C++ without header files then I have good news for you.
It already exists and is called D (http://www.digitalmars.com/d/index.html)
Technically D seems to be a lot nicer than C++ but it is just not mainstream enough for use in many applications at the moment.
One of C++'s goals is to be a superset of C, and it's difficult for it to do so if it cannot support header files. And, by extension, if you wish to excise header files you may as well consider excising CPP (the pre-processor, not plus-plus) altogether; both C# and Java do not specify macro pre-processors with their standards (but it should be noted in some cases they can be and even are used even with these languages).
As C++ is designed right now, you need prototypes -- just as in C -- to statically check any compiled code that references external functions and classes. Without header files, you would have to type out these class definitions and function declarations prior to using them. For C++ not to use header files, you'd have to add a feature in the language that would support something like Java's import keyword. That'd be a major addition, and change; to answer your question of if it'd be practical: I don't think so--not at all.
Many people are aware of shortcomings of header files and there are ideas to introduce more powerful module system to C++.
You might want to take a look at Modules in C++ (Revision 5) by Daveed Vandevoorde.
Well, C++ per se shouldn't eliminate header files because of backwards compatibility. However, I do think they're a silly idea in general. If you want to distribute a closed-source lib, this information can be extracted automatically. If you want to understand how to use a class w/o looking at the implementation, that's what documentation generators are for, and they do a heck of a lot better a job.
There is value in defining the class interface in a separate component to the implementation file.
It can be done with interfaces, but if you go down that road, then you are implicitly saying that classes are deficient in terms of separating implementation from contract.
Modula 2 had the right idea, definition modules and implementation modules. http://www.modula2.org/reference/modules.php
Java/C#'s answer is an implicit implementation of the same (albeit object-oriented.)
Header files are a kludge, because header files express implementation detail (such as private variables.)
In moving over to Java and C#, I find that if a language requires IDE support for development (such that public class interfaces are navigable in class browsers), then this is maybe a statement that the code doesn't stand on its own merits as being particularly readable.
I find the mix of interface with implementation detail quite horrendous.
Crucially, the lack of ability to document the public class signature in a concise well-commented file independent of implementation indicates to me that the language design is written for convenience of authorship, rather convenience of maintenance. Well I'm rambling about Java and C# now.
One advantage of this separation is that it is easy to view only the interface, without requiring an advanced editor.
No language exists without header files. It's a myth.
Look at any proprietary library distribution for Java (I have no C# experience to speak of, but I'd expect it's the same). They don't give you the complete source file; they just give you a file with every method's implementation blanked ({} or {return null;} or the like) and everything they can get away with hiding hidden. You can't call that anything but a header.
There is no technical reason, however, why a C or C++ compiler could count everything in an appropriately-marked file as extern unless that file is being compiled directly. However, the costs for compilation would be immense because neither C nor C++ is fast to parse, and that's a very important consideration. Any more complex method of melding headers and source would quickly encounter technical issues like the need for the compiler to know an object's layout.
If you want the reason why this will never happen: it would break pretty much all existing C++ software. If you look at some of the C++ committee design documentation, they looked at various alternatives to see how much code it would break.
It would be far easier to change the switch statement into something halfway intelligent. That would break only a little code. It's still not going to happen.
EDITED FOR NEW IDEA:
The difference between C++ and Java that makes C++ header files necessary is that C++ objects are not necessarily pointers. In Java, all class instances are referred to by pointer, although it doesn't look that way. C++ has objects allocated on the heap and the stack. This means C++ needs a way of knowing how big an object will be, and where the data members are in memory.
Header files are an integral part of the language. Without header files, all static libraries, dynamic libraries, pretty much any pre-compiled library becomes useless. Header files also make it easier to document everything, and make it possible to look over a library/file's API without going over every single bit of code.
They also make it easier to organize your program. Yes, you have to be constantly switching from source to header, but they also allow you define internal and private APIs inside the implementations. For example:
MySource.h:
extern int my_library_entry_point(int api_to_use, ...);
MySource.c:
int private_function_that_CANNOT_be_public();
int my_library_entry_point(int api_to_use, ...){
// [...] Do stuff
}
int private_function_that_CANNOT_be_public() {
}
If you #include <MySource.h>, then you get my_library_entry_point.
If you #include <MySource.c>, then you also get private_function_that_CANNOT_be_public.
You see how that could be a very bad thing if you had a function to get a list of passwords, or a function which implemented your encryption algorithm, or a function that would expose the internals of an OS, or a function that overrode privileges, etc.
Oh Yes!
After coding in Java and C# it's really annoying to have 2 files for every classes. So I was thinking how can I merge them without breaking existing code.
In fact, it's really easy. Just put the definition (implementation) inside an #ifdef section and add a define on the compiler command line to compile that file. That's it.
Here is an example:
/* File ClassA.cpp */
#ifndef _ClassA_
#define _ClassA_
#include "ClassB.cpp"
#include "InterfaceC.cpp"
class ClassA : public InterfaceC
{
public:
ClassA(void);
virtual ~ClassA(void);
virtual void methodC();
private:
ClassB b;
};
#endif
#ifdef compiling_ClassA
ClassA::ClassA(void)
{
}
ClassA::~ClassA(void)
{
}
void ClassA::methodC()
{
}
#endif
On the command line, compile that file with
-D compiling_ClassA
The other files that need to include ClassA can just do
#include "ClassA.cpp"
Of course the addition of the define on the command line can easily be added with a macro expansion (Visual Studio compiler) or with an automatic variables (gnu make) and using the same nomenclature for the define name.
Still I don't get the point of some statements. Separation of API and implementation is a very good thing, but header files are not API. There are private fields there. If you add or remove private field you change implementation and not API.