C++ 20? modules - no exporting of privates

C++ 20? modules - no exporting of privates - c++

Watched 2 CppCon Gabriel Dos Reis' talks about modules TS.
And as I remember he didn't want to export any private members/functions, so other modules won't be able to use them at all.
I understand his position, but won't it conflict with future C++ reflection? For instance when C++ has a reflection that can enumerate all class functions, shouldn't it be able to enumerate private functions too?
I know, accessing private functions is bad, but in rare extreme cases it is necessary (how const_cast is sometimes necessary, despite the fact that a developer who writes it should feel terrible)
Edit:
And there is at least one exception for "not exposing privates" - if classes use "non virtual interface" pattern

Modules do not (as presently defined) change how C++ works at a fundamental level. It changes a few things with what names are accessible and from where they are accessible. But it doesn't change how the language works with those names.
So if static reflection comes along and permits you to talk about private members of a class, then you can talk about private members of a class. Whether that class definition came to you through a #include directive or a module inclusion is irrelevant.
To allow what Gabriel Dos Reis wants would require an explicit exception to the reflection rules to be made. And that is unlikely to happen.
And it's not really that important either. While being able to remove from module files private members (and any non-exported types they work with) would make module files quite a bit smaller, I don't think this extra module file size will be that big of a deal. The main thing it would allow you to do is to make certain changes to module source code that wouldn't require recompiling modules that include your module. But a well-modularized codebase ought to have relatively quick compilation anyway. So while it would be nice, it's hardly essential.

Related

It is impossible to hide implementation of templates (for protection of intellectual property) from outside eyes. Am I right?

I have read the Why can templates only be implemented in the header file? and Why can’t I separate the definition of my templates class from its declaration and put it inside a .cpp file?
If I create the templates then I am to provide access also to their cpp-files additionally to their h-files, or write the definitions directly in the header file.
Therefore, if I want to allow to use fully my templates in other applications, then I can't hide their implementation from outside eyes (for protection of intellectual property). Am I right?

In general you're right... the implementation must be exposed.
If your client only needs to instantiate them for a finite set of specific types that they can list for you, you can provide them with a pre-compiled object/library containing the implementations of the instantiations for just those types: see https://isocpp.org/wiki/faq/templates#separate-template-fn-defn-from-decl
Obfuscation is another possibility - let them see the code, but make it confusing and unmaintainable.
If neither of those options suit, consider whether you can provide a templated adapter that creates a run-time polymorphic interface over their user-provided type, capturing the specific set of functions your algorithms need. Accept those adapters as a front-end to your code. This does have runtime costs.

Intellectual property is mostly protected by legal means, not technical ones.
(e.g. the technical possibility to read some header files do not give me the right to use it, or to copy its code elsewhere)
However, you might consider code obfuscation techniques. You might even customize your recent GCC for that purpose, i.e. write your MELT free software extension. It could mean weeks (or months) of work.
Alternatively, consider publishing your header-only template C++ library as free software... (perhaps with GPL license).
But IANAL. You should ask your lawyer.

C++ : avoiding compiler dependencies vs. avoiding pointer overuse

I know it's considered old-fashioned / out-of-date style to overuse pointers where they aren't necessary. However, I'm finding this ideal conflicts with another consideration of avoiding compiler dependencies.
Specifically, I can use forward declarations in a header file and avoid #include statements if member variables are pointers. But then this leads me to member variables of my own classes to be pointers, even when there's not really a good reason to do so.
Incidentally, I find using the Qt framework (which I enjoy) leads me to program in this java-esque everything-on-the-heap programming style since that's the way the interface is setup.
How do I weigh these two competing considerations?

It depends. Reducing dependencies is definitely a good thing,
per se, but it must be weighed against all of the other issues.
Using the compilation firewall idiom, for example, can move the
dependencies out of the header file, at the cost of one
allocation.
As for what QT does: it's a GUI framework, which (usually---I've
not looked at QT) means lots of polymorphism, and that most
classes have identity, and cannot be copied. In such cases, you
usually do have to use dynamic allocation and work with
pointers. The rule to avoid pointers mainly concerns objects
with value semantics.
(And by the way, there's nothing "old-fashioned" or
"out-of-date" about using too many pointers. It's been the rule
since I started using C++, 25 years ago.)

Qt needs it because it is a library that can be dynamically loaded. Users can compile and link without having to worry about implementation details. You can at runtime use many versions of Qt without having to recompile. This is pretty powerful and flexible. This wouldn't be possible if private object instances were used inside classes.

How to implement monkey patch in C++?

Is it possible to implement monkey patching in C++?
Or any other similar approach to that?
Thanks.

Not portably so, and due to the dangers for larger projects you better have good reason.
The Preprocessor is probably the best candidate, due to it's ignorance of the language itself. It can be used to rename attributes, methods and other symbol names - but the replacement is global at least for a single #include or sequence of code.
I've used that before to beat "library diamonds" into submission - Library A and B both importing an OS library S, but in different ways so that some symbols of S would be identically named but different. (namespaces were out of the question, for they'd have much more far-reaching consequences).
Similary, you can replace symbol names with compatible-but-superior classes.
e.g. in VC, #import generates an import library that uses _bstr_t as type adapter. In one project I've successfully replaced these _bstr_t uses with a compatible-enough class that interoperated better with other code, just be #define'ing _bstr_t as my replacement class for the #import.
Patching the Virtual Method Table - either replacing the entire VMT or individual methods - is somethign else I've come across. It requires good understanding of how your compiler implements VMTs. I wouldn't do that in a real life project, because it depends on compiler internals, and you don't get any warning when thigns have changed. It's a fun exercise to learn about the implementation details of C++, though. One application would be switching at runtime from an initializer/loader stub to a full - or even data-dependent - implementation.
Generating code on the fly is common in certain scenarios, such as forwarding/filtering COM Interface calls or mapping OS Window Handles to library objects. I'm not sure if this is still "monkey-patching", as it isn't really toying with the language itself.

To add to other answers, consider that any function exposed through a shared object or DLL (depending on platform) can be overridden at run-time. Linux provides the LD_PRELOAD environment variable, which can specify a shared object to load after all others, which can be used to override arbitrary function definitions. It's actually about the best way to provide a "mock object" for unit-testing purposes, since it is not really invasive. However, unlike other forms of monkey-patching, be aware that a change like this is global. You can't specify one particular call to be different, without impacting other calls.

Considering the "guerilla third-party library use" aspect of monkey-patching, C++ offers a number of facilities:
const_cast lets you work around zealous const declarations.
#define private public prior to header inclusion lets you access private members.
subclassing and use Parent::protected_field lets you access protected members.
you can redefine a number of things at link time.
If the third party content you're working around is provided already compiled, though, most of the things feasible in dynamic languages isn't as easy, and often isn't possible at all.

I suppose it depends what you want to do. If you've already linked your program, you're gonna have a hard time replacing anything (short of actually changing the instructions in memory, which might be a stretch as well). However, before this happens, there are options. If you have a dynamically linked program, you can alter the way the linker operates (e.g. LD_LIBRARY_PATH environment variable) and have it link something else than the intended library.
Have a look at valgrind for example, which replaces (among alot of other magic stuff it's dealing with) the standard memory allocation mechanisms.

As monkey patching refers to dynamically changing code, I can't imagine how this could be implemented in C++...

Should C++ eliminate header files? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
Many languages, such as Java, C#, do not separate declaration from implementation. C# has a concept of partial class, but implementation and declaration still remain in the same file.
Why doesn't C++ have the same model? Is it more practical to have header files?
I am referring to current and upcoming versions of C++ standard.

Backwards Compatibility - Header files are not eliminated because it would break Backwards Compatibility.

Header files allow for independent compilation. You don't need to access or even have the implementation files to compile a file. This can make for easier distributed builds.
This also allows SDKs to be done a little easier. You can provide just the headers and some libraries. There are, of course, ways around this which other languages use.

Even Bjarne Stroustrup has called header files a kludge.
But without a standard binary format which includes the necessary metadata (like Java class files, or .Net PE files) I don't see any way to implement the feature. A stripped ELF or a.out binary doesn't have much of the information you would need to extract. And I don't think that the information is ever stored in Windows XCOFF files.

I routinely flip between C# and C++, and the lack of header files in C# is one of my biggest pet peeves. I can look at a header file and learn all I need to know about a class - what it's member functions are called, their calling syntax, etc - without having to wade through pages of the code that implements the class.
And yes, I know about partial classes and #regions, but it's not the same. Partial classes actually make the problem worse, because a class definition is spread across several files. As far as #regions go, they never seem to be expanded in the manner I'd like for what I'm doing at the moment, so I have to spend time expanding those little plus's until I get the view right.
Perhaps if Visual Studio's intellisense worked better for C++, I wouldn't have a compelling reason to have to refer to .h files so often, but even in VS2008, C++'s intellisense can't touch C#'s

C was made to make writing a compiler easily. It does a LOT of stuff based on that one principle. Pointers only exist to make writing a compiler easier, as do header files. Many of the things carried over to C++ are based on compatibility with these features implemented to make compiler writing easier.
It's a good idea actually. When C was created, C and Unix were kind of a pair. C ported Unix, Unix ran C. In this way, C and Unix could quickly spread from platform to platform whereas an OS based on assembly had to be completely re-written to be ported.
The concept of specifying an interface in one file and the implementation in another isn't a bad idea at all, but that's not what C header files are. They are simply a way to limit the number of passes a compiler has to make through your source code and allow some limited abstraction of the contract between files so they can communicate.
These items, pointers, header files, etc... don't really offer any advantage over another system. By putting more effort into the compiler, you can compile a reference object as easily as a pointer to the exact same object code. This is what C++ does now.
C is a great, simple language. It had a very limited feature set, and you could write a compiler without much effort. Porting it is generally trivial! I'm not trying to say it's a bad language or anything, it's just that C's primary goals when it was created may leave remnants in the language that are more or less unnecessary now, but are going to be kept around for compatibility.
It seems like some people don't really believe that C was written to port Unix, so here: (from)
The first version of UNIX was written
in assembler language, but Thompson's
intention was that it would be written
in a high-level language.
Thompson first tried in 1971 to use
Fortran on the PDP-7, but gave up
after the first day. Then he wrote a
very simple language he called B,
which he got going on the PDP-7. It
worked, but there were problems.
First, because the implementation was
interpreted, it was always going to be
slow. Second, the basic notions of B,
which was based on the word-oriented
BCPL, just were not right for a
byte-oriented machine like the new
PDP-11.
Ritchie used the PDP-11 to add types
to B, which for a while was called NB
for "New B," and then he started to
write a compiler for it. "So that the
first phase of C was really these two
phases in short succession of, first,
some language changes from B, really,
adding the type structure without too
much change in the syntax; and doing
the compiler," Ritchie said.
"The second phase was slower," he said
of rewriting UNIX in C. Thompson
started in the summer of 1972 but had
two problems: figuring out how to run
the basic co-routines, that is, how to
switch control from one process to
another; and the difficulty in getting
the proper data structure, since the
original version of C did not have
structures.
"The combination of the things caused
Ken to give up over the summer,"
Ritchie said. "Over the year, I added
structures and probably made the
compiler code somewhat better --
better code -- and so over the next
summer, that was when we made the
concerted effort and actually did redo
the whole operating system in C."
Here is a perfect example of what I mean. From the comments:
Pointers only exist to make writing a compiler easier? No. Pointers exist because they're the simplest possible abstraction over the idea of indirection. – Adam Rosenfield (an hour ago)
You are right. In order to implement indirection, pointers are the simplest possible abstraction to implement. In no way are they the simplest possible to comprehend or use. Arrays are much easier.
The problem? To implement arrays as efficiently as pointers you have to pretty much add a HUGE pile of code to your compiler.
There is no reason they couldn't have designed C without pointers, but with code like this:
int i=0;
while(src[++i])
dest[i]=src[i];
it will take a lot of effort (on the compilers part) to factor out the explicit i+src and i+dest additions and make it create the same code that this would make:
while(*(dest++) = *(src++))
;
Factoring out that variable "i" after the fact is HARD. New compilers can do it, but back then it just wasn't possible, and the OS running on that crappy hardware needed little optimizations like that.
Now few systems need that kind of optimization (I work on one of the slowest platforms around--cable set-top boxes, and most of our stuff is in Java) and in the rare case where you might need it, the new C compilers should be smart enough to make that kind of conversion on its own.

In The Design and Evolution of C++, Stroustrup gives out one more reason...
The same header file can have two or more implementation files which can be simultaneously worked-upon by more than one programmer without the need of a source-control system.
This might seem odd these days, but I guess it was an important issue when C++ was invented.

If you want C++ without header files then I have good news for you.
It already exists and is called D (http://www.digitalmars.com/d/index.html)
Technically D seems to be a lot nicer than C++ but it is just not mainstream enough for use in many applications at the moment.

One of C++'s goals is to be a superset of C, and it's difficult for it to do so if it cannot support header files. And, by extension, if you wish to excise header files you may as well consider excising CPP (the pre-processor, not plus-plus) altogether; both C# and Java do not specify macro pre-processors with their standards (but it should be noted in some cases they can be and even are used even with these languages).
As C++ is designed right now, you need prototypes -- just as in C -- to statically check any compiled code that references external functions and classes. Without header files, you would have to type out these class definitions and function declarations prior to using them. For C++ not to use header files, you'd have to add a feature in the language that would support something like Java's import keyword. That'd be a major addition, and change; to answer your question of if it'd be practical: I don't think so--not at all.

Many people are aware of shortcomings of header files and there are ideas to introduce more powerful module system to C++.
You might want to take a look at Modules in C++ (Revision 5) by Daveed Vandevoorde.

Well, C++ per se shouldn't eliminate header files because of backwards compatibility. However, I do think they're a silly idea in general. If you want to distribute a closed-source lib, this information can be extracted automatically. If you want to understand how to use a class w/o looking at the implementation, that's what documentation generators are for, and they do a heck of a lot better a job.

There is value in defining the class interface in a separate component to the implementation file.
It can be done with interfaces, but if you go down that road, then you are implicitly saying that classes are deficient in terms of separating implementation from contract.
Modula 2 had the right idea, definition modules and implementation modules. http://www.modula2.org/reference/modules.php
Java/C#'s answer is an implicit implementation of the same (albeit object-oriented.)
Header files are a kludge, because header files express implementation detail (such as private variables.)
In moving over to Java and C#, I find that if a language requires IDE support for development (such that public class interfaces are navigable in class browsers), then this is maybe a statement that the code doesn't stand on its own merits as being particularly readable.
I find the mix of interface with implementation detail quite horrendous.
Crucially, the lack of ability to document the public class signature in a concise well-commented file independent of implementation indicates to me that the language design is written for convenience of authorship, rather convenience of maintenance. Well I'm rambling about Java and C# now.

One advantage of this separation is that it is easy to view only the interface, without requiring an advanced editor.

No language exists without header files. It's a myth.
Look at any proprietary library distribution for Java (I have no C# experience to speak of, but I'd expect it's the same). They don't give you the complete source file; they just give you a file with every method's implementation blanked ({} or {return null;} or the like) and everything they can get away with hiding hidden. You can't call that anything but a header.
There is no technical reason, however, why a C or C++ compiler could count everything in an appropriately-marked file as extern unless that file is being compiled directly. However, the costs for compilation would be immense because neither C nor C++ is fast to parse, and that's a very important consideration. Any more complex method of melding headers and source would quickly encounter technical issues like the need for the compiler to know an object's layout.

If you want the reason why this will never happen: it would break pretty much all existing C++ software. If you look at some of the C++ committee design documentation, they looked at various alternatives to see how much code it would break.
It would be far easier to change the switch statement into something halfway intelligent. That would break only a little code. It's still not going to happen.
EDITED FOR NEW IDEA:
The difference between C++ and Java that makes C++ header files necessary is that C++ objects are not necessarily pointers. In Java, all class instances are referred to by pointer, although it doesn't look that way. C++ has objects allocated on the heap and the stack. This means C++ needs a way of knowing how big an object will be, and where the data members are in memory.

Header files are an integral part of the language. Without header files, all static libraries, dynamic libraries, pretty much any pre-compiled library becomes useless. Header files also make it easier to document everything, and make it possible to look over a library/file's API without going over every single bit of code.
They also make it easier to organize your program. Yes, you have to be constantly switching from source to header, but they also allow you define internal and private APIs inside the implementations. For example:
MySource.h:
extern int my_library_entry_point(int api_to_use, ...);
MySource.c:
int private_function_that_CANNOT_be_public();
int my_library_entry_point(int api_to_use, ...){
// [...] Do stuff
}
int private_function_that_CANNOT_be_public() {
}
If you #include <MySource.h>, then you get my_library_entry_point.
If you #include <MySource.c>, then you also get private_function_that_CANNOT_be_public.
You see how that could be a very bad thing if you had a function to get a list of passwords, or a function which implemented your encryption algorithm, or a function that would expose the internals of an OS, or a function that overrode privileges, etc.

Oh Yes!
After coding in Java and C# it's really annoying to have 2 files for every classes. So I was thinking how can I merge them without breaking existing code.
In fact, it's really easy. Just put the definition (implementation) inside an #ifdef section and add a define on the compiler command line to compile that file. That's it.
Here is an example:
/* File ClassA.cpp */
#ifndef _ClassA_
#define _ClassA_
#include "ClassB.cpp"
#include "InterfaceC.cpp"
class ClassA : public InterfaceC
{
public:
ClassA(void);
virtual ~ClassA(void);
virtual void methodC();
private:
ClassB b;
};
#endif
#ifdef compiling_ClassA
ClassA::ClassA(void)
{
}
ClassA::~ClassA(void)
{
}
void ClassA::methodC()
{
}
#endif
On the command line, compile that file with
-D compiling_ClassA
The other files that need to include ClassA can just do
#include "ClassA.cpp"
Of course the addition of the define on the command line can easily be added with a macro expansion (Visual Studio compiler) or with an automatic variables (gnu make) and using the same nomenclature for the define name.

Still I don't get the point of some statements. Separation of API and implementation is a very good thing, but header files are not API. There are private fields there. If you add or remove private field you change implementation and not API.

Is it correct to export data members? (C++)

As the title suggests, is it correct or valid to import/export static data from within a C++ class?
I found out my problem - the author of the class I was looking at was trying to export writable static data which isn't supported on this platform.
Many thanks for the responses however.

An exported C++ class means that the DLL clients have to use the same compiler as the DLL because of name mangling and other issues. This is actually a pretty big problem, I once had to write C wrappers to a bunch of C++ classes because the client programs had switched to MSVC9, while the DLL itself was using MSVC71. [There were some other complications with switching the DLL to MSVC90]. Since then I've been quite skeptical about this business of exporting classes, and prefer to write a C wrapper for everything.
Now, if you're willing to pay the price of exporting classes, I'd say that exporting static data doesn't make the problem any worse. Arguably, among the kinds of things you could export, it is safest to export static constants. Even so, I'd rather not do it, because like Timo says, you're now locked into this implementation.
One of the frameworks that I worked on required its clients to provide a set of error code constants. Over time, we found that using a simple bunch of constants was too brittle, and we switched to an OO design. We had a default implementation that would return the common error codes, but each of the error codes was accessed using a virtual function which could be overridden by individual clients - and they used it from some advanced device specific error handling. This solution proved far more scalable than the one based on exporting constants.
I'd suggest that you think long and hard about how you expect the component to evolve before you exporting static variables.

Is it correct inasmuch as it'll work and do what you expect it to? Assuming that you are talking about using _declspec(dllexport/dllimport) on a class or class member, yes, you can do that and it should give you the expected result - the static data would be accessible outside your dll and other C++ code could access it provided that that C++ access specification (public/protected/private) wouldn't block outside access in the first place.
Is it a good idea? Personally I don't think so as you would be exposing class internals not only within your library but to the outside world, which means that it will be pretty much impossible to change what is an implementation detail at the end of the day. Ask yourself if you are 100% certain if the interface of this class and large parts of its implementation will never, ever change...

dllexport (or import) on a class's (non-static) data member does nothing. Exported "things" are either functions or global data (though this is a questionable design choice). dllexport on a class is just a shortcut for saying "export all of these functions".

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js