Objects of a class share same code segment for methods? - c++

For example we have code
class MyClass
{
private:
int data;
public:
int getData()
{
return data;
}
};
int main()
{
MyClass A, B, C;
return 0;
}
Since A, B and C are objects of MyClass, all have their own memory.
My question is that, are all of these objects share same memory for methods of class ( getData() in this case) or all objects have separate code segment for each object.?
Tnahks in advance....

The C++ Standard has nothing to say on the subject. If your architecture supports multiple code segments, then whether multiple segments are used is down to the implementation of the compiler and linker you are using. It's highly unlikely that any implementation would create separate segments for each class or object, however. Or indeed produce separate code for each object - methods belong to classes, not individual objects.

They usually share the same code segment.

Same.
You could be interested in knowledge of how things in C++ are implemented under the hood.

In general with classes and objects the following is how it works:
A class is a description of data and operations on that data.
An object is a specification of the type of object (the class it represents) and the values of the attributes. Because the type of the object is saved the compiler will know where to find the methods that are being called on the object. So even though a new copy of the data is created when a new object is made, the methods still are fixed.

In your example, MyClass::getData() is 'inline', so in that case the location of the instructions for each instance may well be in different locations under some circumstances.
Most compilers only truly in-line code when optimisation is enabled (and even then may choose not to do so). However, if this in-line code were defined in a header file, the compiler would necessarily generate code for each compilation unit in which the class were used, even if it were not in-lined within the compilation unit. The linker may or may not then optimise out the multiple instances of the code.
For code not defined inline all instances will generally share the same code unless the optimiser decides to inline the code; which unless the linker is very smart, will only happen when the class is instantiated and used in the same compilation unit in which it was defined.

Related

Why aren't C++ constructors capable of adding in new member variables?

It would make more intuitive sense to me if you could add in member variables in the constructor. This way, the class can adapt to changing input.
In C++ an object always has a fixed size. If constructors can add members at runtime, that guarantee goes out the window. In addition, in C++ all objects of the same type have the same size. Since a class can have multiple different constructors, the different constructors could specify different sizes.
This single, fixed size is the magic sauce that makes a number of C++'s high-performance tricks work, and in C++ convenience often gives way to speed. For example, an array of objects actually holds the objects. Not references to the objects, literally the objects. It can do this because everything in the array is the same size and the compiler can generate all of the indexing at compile time. CPUs love this because access is dead predictable and it can make full use of caches (assuming the access patterns you write allow it to do so). The more that's known and fixed at compile time, the more optimization opportunities the compiler has.
What you can do is add a member like a std::map or std::unordered_map that maps an identifier to its data. If all data is of the same type, this can be as easy as
std::map<std::string, int> members;
and access looks something like
members["hit points"] -= damage;
Note that while the map is inside the object, these mapped variables are not "inside" the object, They have to first be looked up in the map and then the data needs to be loaded from wherever it resides in in dynamic memory. This can slow down access considerably compared to a member that is known at compile time and reduced to an offset from the beginning of the object at a memory location that was probably already loaded into cache with the rest of the object.
Let's say we add that to the standard. Let's say we decide to introduce new keyword append to create new class member from within constructor. Then, one could have a following class:
struct A
{
A(int n) {
append int x = n;
}
A(std::string s) {
append std::string str = s;
}
};
Now, what is the sizeof(A)? Is it sizeof(int) or sizeof(std::string)? Remember that sizeof is a compile time operation. Compiler must be able to know that, it cannot be deferred to runtime.
And one more example:
void foo(A a)
{
std::cout << a.x; //should this compile?
std::cout << a.str; //or should this compile?
}
How would compiler know if a has member x or member str accessible to foo? Compilation in C++ is done in translation units, with each translation unit being compiled completely separately from the others. If foo() is defined in foo.cpp and it is called from main.cpp, compiler would have no idea which operation is valid. Moreover, both could be valid, just for different A objects.
C++ has many ways to add some flexibility to amount of members in classes, notably inheritance (to add new members) and templates (to create members of different type in the same class template). There is no need to try to introduce mechanisms from interpreted languages like Python.

Can I use a slim version of my header to be included with the library?

What I mean is my real header file can look like this:
#include "some_internal_class.h"
class MyLibrary {
Type private_member;
void private_function();
public:
MyLibrary();
void function_to_be_called_by_library_users();
};
Now I want to produce a dynamic library containing all the necessary definitions. and I want to ship with it a single header instead of shipping every single header I have in my library.
So I was thinking I could create a slim version of my header like so:
class MyLibrary {
public:
MyLibrary();
void function_to_be_called_by_library_users();
};
Headers are just declarations anyway right? they're never passed to the compiler. And I've declared what the user will be using.
Is that possible? If not, why not?
This is a One Definition Rule violation. The moment you deviate by a single token.
[basic.def.odr]/6
There can be more than one definition of a class type, [...] in a
program provided that each definition appears in a different
translation unit, and provided the definitions satisfy the following
requirements. Given such an entity named D defined in more than one
translation unit, then
each definition of D shall consist of the same sequence of tokens; and
Your program may easily break if you violate the ODR like that. And your build system isn't at all obligated to even warn you about it.
You cannot define a class twice. It breaks the One Definition Rule (ODT). MyLibrary does that, unfortunately.
they're never passed to the compiler
They will. Members of a class must be known at compile time, so that the compiler can determine the class's size.
Header are just declarations anyway right? they're never passed to the
compiler. And I've declared what the user will be using.
No. Headers are part of source code and are compiled together with source files. They contain the information necessary for a compiler to understand how to work with code (in your case, with class MyLibrary).
As an example, you want library users to be able to create objects of class MyLibrary, so you export the constructor. However, this is not sufficient: the compiler needs to know the size of the object to be created, which is impossible unless you specify all the fields.
In practice, deciding what to expose to library users and what to hide as implementation details is a hard question, which requires detailed inspection of the library usage and semantics. If you really want to hide the class internals as implementation detail, here are some common options:
The pimpl idiom is a common solution. It enables you to work with the class as it is usually done, but the implementation details are nicely hidden.
Extract the interface into an abstract class with virtual functions, and use pointers (preferably smart pointers) to work with the objects.
Headers are just declarations anyway right? they're never passed to the compiler.
The moment you do a #include to a file, its content are copied and pasted into your source file exactly as they are.
So even though you don't pass them directly as compiler arguments, they're still part of your code and code in them will be compiled into your translation units.
Solutions by #lisyarus are pretty good.
But another option would be doing it the C way. Which is the most elegant in my opinion.
In C you give your users a handle, which will most likely be a pointer.
Your header would look something like this:
struct MyLibrary;
MyLibrary*
my_library_init();
void
my_library_destroy(MyLibrary*);
void
my_library_function_to_be_called_by_library_users(MyLibrary*);
A very small and simple interface that does not show your users anything you don't want them to see.
Another nice perk is that your build system will not have to recompile your whole program just because you added a field to the MyLibrary struct.
You have to watch out though, because now you have to call my_library_destroy which will carry the logic of your destructor.

Is acceptable usage access specifiers before each member of class in C++

I write some code c++
public class SomeClass
{
private:
int m_CurrentStatus;
int m_PreviouseStatus;
public:
int get_CurrentStatus() { return m_CurrentStatus; }
int get_PreviouseStatus() { return m_PreviouseStatus; }
}
in c# style
public class SomeClass
{
private: int m_CurrentStatus;
private: int m_PreviouseStatus;
public: int get_CurrentStatus() { return m_CurrentStatus; }
public: int get_PreviouseStatus() { return m_PreviouseStatus; }
}
Such usage access specifiers before each member of class is acceptable?
Or can trouble compiler spend more time for compilation or other effects?
Code successfully compiled with no warning.
What you describe is legal C++.
The impact on compilation times will depend on the compiler. However, practically, you will probably be hard pressed to detect any difference of compilation times.
There may be either impact or benefit on code readability - i.e. ability of a human to understand what is going on. Generally, people prefer "sections" (e.g. the declaration of several members following a single access modifier (public, private, or protected)) quite well, as long as the sections don't get too large (e.g. fill more than a screen when editing the code). So doing it the way you are may be unpopular with other developers. This is highly subjective though - different people will have different preferences. However, if you find other people objecting to your approach, listen to them - unless you are happy to be unpopular with other team members, lose jobs, etc etc.
There may or may not be an impact on layout of data members in the class. Different versions of the C++ standard make different guarantees but there are considerable freedoms for compilers to lay out classes differently. If you are writing code that relies on (or tests for) particular class layout (order of data members, offsets, etc) you might observe a difference. Or you might not. However, these things are permitted to vary between implementations (compilers) anyway so writing code that relies on specific layouts is often a bad idea anyway.
It's legal syntax and shouldn't upset the compiler, you can have as many public and private blocks as you want. I don't see that it's an improvement syntactically though.
It's certainly legal and free from compile time penalties to put access specifiers before each and every class member. You could also put 50 semicolons at the end of every line. It simply isn't canonical C++.
It is Legal and also very correct. You won't use an object oriented language if you want to have your members public. A basic usage of object oriented programming is information hiding.
The only thing that can trouble compiler in your case(in a matter of time), is that your function block is inside the class.
What do I mean?
If you have a main.cpp and a class.h , and your functions are declared and defined inside class.h, your compilation will take a little longer, rather than if your function body was in functions.cpp
It's perfectly legal syntax for C++,
except for the public keyword before class
and the missing semicolon (;) after the closing curly bracket of class.
BUT I've never seen something like that, I suppose it's surely not commonly used and, in my opinion, it adds nothing to code readability (on contrary, I personally find it more cumbersome to read).

Name of this C++ pattern and the reasoning behind it?

In my company's C++ codebase I see a lot of classes defined like this:
// FooApi.h
class FooApi {
public:
virtual void someFunction() = 0;
virtual void someOtherFunction() = 0;
// etc.
};
// Foo.h
class Foo : public FooApi {
public:
virtual void someFunction();
virtual void someOtherFunction();
};
Foo is this only class that inherits from FooApi and functions that take or return pointers to Foo objects use FooApi * instead. It seems to mainly be used for singleton classes.
Is this a common, named way to write C++ code? And what is the point in it? I don't see how having a separate, pure abstract class that just defines the class's interface is useful.
Edit[0]: Sorry, just to clarify, there is only one class deriving from FooApi and no intention to add others later.
Edit[1]: I understand the point of abstraction and inheritance in general but not this particular usage of inheritance.
The only reason that I can see why they would do this is for encapsulation purposes. The point here is that most other code in the code-base only requires inclusion of the "FooApi.h" / "BarApi.h" / "QuxxApi.h" headers. Only the parts of the code that create Foo objects would actually need to include the "Foo.h" header (and link with the object-file containing the definition of the class' functions). And for singletons, the only place where you would normally create a Foo object is in the "Foo.cpp" file (e.g., as a local static variable within a static member function of the Foo class, or something similar).
This is similar to using forward-declarations to avoid including the header that contains the actual class declaration. But when using forward-declarations, you still need to eventually include the header in order to be able to call any of the member functions. But when using this "abstract + actual" class pattern, you don't even need to include the "Foo.h" header to be able to call the member functions of FooApi.
In other words, this pattern provides very strong encapsulation of the Foo class' implementation (and complete declaration). You get roughly the same benefits as from using the Compiler Firewall idiom. Here is another interesting read on those issues.
I don't know the name of that pattern. It is not very common compared to the other two patterns I just mentioned (compiler firewall and forward declarations). This is probably because this method has quite a bit more run-time overhead than the other two methods.
This is for if the code is later added on to. Lets say NewFoo also extends/implements FooApi. All the current infrastructure will work with both Foo and NewFoo.
It's likely that this has been done for the same reason that pImpl ("pointer to implementation idiom", sometimes called "private implementation idiom") is used - to keep private implementation details out of the header, which means common build systems like make that use file timestamps to trigger code recompilation will not rebuild client code when only implementation has changed. Instead, the object containing the new implementation can be linked against existing client object(s), and indeed if the implementation is distributed in a shared object (aka dynamic link library / DLL) the client application can pick up a changed implementation library the next time it runs (or does a dlopen() or equivalent if it's linking at run-time). As well as facilitating distribution of updated implementation, it can reduce rebuilding times allowing a faster edit/test/edit/... cycle.
The cost of this is that implementations have to be accessed through out-of-line virtual dispatch, so there's a performance hit. This is typically insignificant, but if a trivial function like a get-int-member is called millions of times in a performance critical loop it may be of interest - each call can easily be an order of magnitude slower than inlined member access.
What's the "name" for it? Well, if you say you're using an "interface" most people will get the general idea. That term's a bit vague in C++, as some people use it whenever a base class has virtual methods, others expect that the base will be abstract, lack data members and/or private member functions and/or function definitions (other than the virtual destructor's). Expectations around the term "interface" are sometimes - for better or worse - influenced by Java's language keyword, which restricts the interface class to being abstract, containing no static methods or function definitions, with all functions being public, and only const final data members.
None of the well-known Gang of Four Design Patterns correspond to the usage you cite, and while doubtless lots of people have published (web- or otherwise) corresponding "patterns", they're probably not widely enough used (with the same meaning!) to be less confusing than "interface".
FooApi is a virtual base class, it provides the interface for concrete implementations (Foo).
The point is you can implement functionality in terms of FooApi and create multiple implementations that satisfy its interface and still work with your functionality. You see some advantage when you have multiple descendants - the functionality can work with multiple implementations. One might implement a different type of Foo or for a different platform.
Re-reading my answer, I don't think I should talk about OO ever again.

Is it safe to use strings as private data members in a class used across a DLL boundry?

My understanding is that exposing functions that take or return stl containers (such as std::string) across DLL boundaries can cause problems due to differences in STL implementations of those containers in the 2 binaries. But is it safe to export a class like:
class Customer
{
public:
wchar_t * getName() const;
private:
wstring mName;
};
Without some sort of hack, mName is not going to be usable by the executable, so it won't be able to execute methods on mName, nor construct/destruct this object.
My gut feeling is "don't do this, it's unsafe", but I can't figure out a good reason.
It is not a problem. Because it is trumped by the bigger problem, you cannot create an object of that class in code that lives in a module other than the one that contains the code for the class. Code in another module cannot accurately know the required object size, their implementation of the std::string class may well be different. Which, as declared, also affects the size of the Customer object. Even the same compiler cannot guarantee this, mixing optimized and debugging builds of these modules for example. Albeit that this is usually pretty easy to avoid.
So you must create a class factory for Customer objects, a factory that lives in that same module. Which then automatically implies that any code that touches the "mName" member also lives in the same module. And is therefore safe.
Next step then is to not expose Customer at all but expose an pure abstract base class (aka interface). Now you can prevent the client code from creating an instance of Customer and shoot their leg off. And you'll trivially hide the std::string as well. Interface-based programming techniques are common in module interop scenarios. Also the approach taken by COM.
As long as the allocator of instances of the class and deallocator are of the same settings, you should be ok, but you are right to avoid this.
Differences between the .exe and .dll as far as debug/release, code generation (Multi-threaded DLL vs. Single threaded) could cause problems in some scenarios.
I would recommend using abstract classes in the DLL interface with creation and deletion done solely inside the DLL.
Interfaces like:
class A {
protected:
virtual ~A() {}
public:
virtual void func() = 0;
};
//exported create/delete functions
A* create_A();
void destroy_A(A*);
DLL Implementation like:
class A_Impl : public A{
public:
~A_Impl() {}
void func() { do_something(); }
}
A* create_A() { return new A_Impl; }
void destroy_A(A* a) {
A_Impl* ai=static_cast<A_Impl*>(a);
delete ai;
}
Should be ok.
Even if your class has no data members, you cannot expect it to be usable from code compiled with a different compiler. There is no common ABI for C++ classes. You can expect differences in name mangling just for starters.
If you are prepared to constrain clients to use the same compiler as you, or provide source to allow clients to compile your code with their compiler, then you can do pretty much anything across your interface. Otherwise you should stick to C style interfaces.
If you want to provide an object oriented interface in a DLL that is truly safe, I would suggest building it on top of the COM object model. That's what it was designed for.
Any other attempt to share classes between code that is compiled by different compilers has the potential to fail. You may be able to get something that seems to work most of the time, but it can't be guaraneteed to work.
The chances are that at some point you're going to be relying on undefined behaviour in terms of calling conventions or class structure or memory allocation.
The C++ standard does not say anything about the ABI provided by implementations. Even on a single platform changing the compiler options may change binary layout or function interfaces.
Thus to ensure that standard types can be used across DLL boundaries it is your responsibility to ensure that either:
Resource Acquisition/Release for standard types is done by the same DLL. (Note: you can have multiple crt's in a process but a resource acquired by crt1.DLL must be released by crt1.DLL.)
This is not specific to C++. In C for example malloc/free, fopen/fclose call pairs must each go to a single C runtime.
This can be done by either of the below:
By explicitly exporting acquisition/release functions ( Photon's answer ). In this case you are forced to use a factory pattern and abstract types.Basically COM or a COM-clone
Forcing a group of DLL's to link against the same dynamic CRT. In this case you can safely export any kind of functions/classes.
There are also two "potential bug" (among others) you must take care, since they are related to what is "under" the language.
The first is that std::strng is a template, and hence it is instantiated in every translation unit. If they are all linked to a same module (exe or dll) the linker will resolve same functions as same code, and eventually inconsistent code (same function with different body) is treated as error.
But if they are linked to different module (and exe and a dll) there is nothing (compiler and linker) in common. So -depending on how the module where compiled- you may have different implementation of a same class with different member and memory layout (for example one may have some debugging or profiling added features the other has not). Accessing an object created on one side with methods compiled on the other side, if you have no other way to grant implementation consistency, may end in tears.
The second problem (more subtle) relates to allocation/deallocaion of memory: because of the way windows works, every module can have a distinct heap. But the standard C++ does not specify how new and delete take care about which heap an object comes from. And if the string buffer is allocated on one module, than moved to a string instance on another module, you risk (upon destruction) to give the memory back to the wrong heap (it depends on how new/delete and malloc/free are implemented respect to HeapAlloc/HeapFree: this merely relates to the level of "awarness" the STL implementation have respect to the underlying OS. The operation is not itself destructive -the operation just fails- but it leaks the origin's heap).
All that said, it is not impossible to pass a container. It is just up to you to grant a consistent implementation between the sides, since the compiler and linker have no way to cross check.