C++ Object vs Struct Memory Overhead - c++

I need to create very large arrays of RGB values for image processing. The actual operations that will be performed on them are simple (just orthogonal projection to see how similar two colors are), but every bit counts with regard to memory. I am thinking about storing images as a double pointer to structs with 3 chars in them, which I thought would be the most memory-efficient way, but I know it is usually recommended to use wrapper classes. My question is: how significant is the memory overhead of using a class vs a struct, and of using some sort of wrapper vs a double pointer?

There is absolutely no difference between
class X
{
public:
T1 x;
T2 y;
T3 z;
};
and
struct X
{
T1 x;
T2 y;
T3 z;
};
If you add virtual functions to the class, yes, it will add to the storage. But nothing else will make any difference between class and struct (in fact, it's possible to have virtual members in struct too - although it is typical to distinguish between struct and class by having member functions and (non-trivial) constructors only for classes).
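To make the storage point concrete, here is a minimal sketch. The type names (RgbStruct, RgbClass, RgbVirtual) and the helper virtual_overhead are made up for illustration, and the exact sizes are implementation-specific; the sketch only assumes a mainstream compiler that implements virtual functions with a hidden per-object pointer.

```cpp
#include <cstddef>

// Plain struct: three bytes of payload.
struct RgbStruct {
    unsigned char r, g, b;
};

// Same members declared in a class, plus a non-virtual member function.
// Non-virtual member functions do not take up space in each object.
class RgbClass {
public:
    unsigned char r, g, b;
    int dot(const RgbClass& o) const { return r * o.r + g * o.g + b * o.b; }
};

// Adding a virtual function typically adds a hidden vtable pointer per object.
class RgbVirtual {
public:
    virtual ~RgbVirtual() = default;
    unsigned char r, g, b;
};

// Extra bytes per object caused by making the type polymorphic
// (hypothetical helper, the value depends on the platform).
constexpr std::size_t virtual_overhead() {
    return sizeof(RgbVirtual) - sizeof(RgbClass);
}
```

For millions of pixels, the per-object pointer in RgbVirtual would likely dominate the three bytes of payload, which is why a plain struct or a class without virtual functions is the usual choice here.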

Related

2 C++ classes with same functionality but different variable names - which pattern to use?

I have 2 C++ classes which have the same variable types and need to perform exactly the same operations on these variables. However, the variable and function names of these 2 classes are different as they mean different things in their respective classes.
How can I achieve this without code duplication? Does inheritance or templates help here?
Here's a stripped down example.
class A
{
private:
float m;
float n;
public:
float foo() { return m + n; }
};
class B
{
private:
float p;
float q;
public:
float bar() { return p + q; }
};
In my case, the operations and variables are more complicated than the above toy example. The only difference between the 2 classes are the variable and function names. The rest is identical. How can I refactor this in C++?
There are several ways to deal with this.
Just use one class. If the functionality is the same and you are just changing names, then why do you need more than one class.
PROS: minimize code duplication
CONS: if these are for two different applications, you are violating the single responsibility principle
Create a base class with all the functions and then another for each case with the correct names.
PROS: minimize code duplication
CONS: may not be related by other than chance
Create a single class and typedef to create another name.
PROS: minimize code duplication
CONS: only the name of the 'class' is changed, not the method calls
Leave it as two classes
PROS: each class is responsible for its own non-related operations and acts independently of the other
CONS: possible code duplication
In general, without knowing the specifics of the classes, I would say that the best approach is to leave both classes as they are. If they are doing different things, and just happen to both have similar methods, that is chance and doesn't merit sacrificing readability and the single responsibility principle.
If they are related, then consider if it would follow an inherited relationship. If not, again, leave them as two classes.
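If the two classes really are related, the base-class option from the list above could look roughly like this. SumPair is a hypothetical name, and using private inheritance is my own assumption about how to keep the shared names out of each class's public interface; it is a sketch, not the only way to do it.

```cpp
// Shared implementation, written once. Only derived classes can use it.
class SumPair {
protected:
    SumPair(float a, float b) : a_(a), b_(b) {}
    float sum() const { return a_ + b_; }
private:
    float a_;
    float b_;
};

// Each class exposes the operation under the name that makes sense for it.
class A : private SumPair {
public:
    A(float m, float n) : SumPair(m, n) {}
    float foo() const { return sum(); }
};

class B : private SumPair {
public:
    B(float p, float q) : SumPair(p, q) {}
    float bar() const { return sum(); }
};
```

With private inheritance a caller cannot treat an A as a SumPair, so the two classes stay unrelated from the outside while sharing the code on the inside.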
Two reasonably-complex functions that are the same except for which variables are used? Sounds like a simple case where the commonality could be extracted into a separate function, which takes arguments for the things that differ. For example:
float DoComplicatedThing(float a, float b)
{
// whatever the complicated thing is
// ...
// but staying with your example it's just:
return a + b;
}
class A
{
private:
float m;
float n;
public:
float foo() { return DoComplicatedThing(m, n); }
};
class B
{
private:
float p;
float q;
public:
float bar() { return DoComplicatedThing(p, q); }
};
If there's a good reason to have A and B derive from a common base class, then DoComplicatedThing could potentially be a static function from that base class. Otherwise it could just be a standalone function without a class (as in my example).

std::variant vs pointer to base class for heterogeneous containers in C++

Let's assume this class hierarchy below.
class BaseClass {
public:
int x;
};
class SubClass1 : public BaseClass {
public:
double y;
};
class SubClass2 : public BaseClass {
public:
float z;
};
...
I want to make a heterogeneous container of these classes. Since the subclasses are derived from the base class I can make something like this:
std::vector<BaseClass*> container1;
But since C++17 I can also use std::variant like this:
std::vector<std::variant<SubClass1, SubClass2, ...>> container2;
What are the advantages/disadvantages of using one or the other? I am interested in the performance too.
Take into consideration that I am going to sort the container by x, and I also need to be able to find out the exact type of the elements. I am going to
Fill the container,
Sort it by x,
Iterate through all the elements, find out the type, use it accordingly,
Clear the container, then the cycle starts over again.
std::variant<A,B,C> holds one of a closed set of types. You can check whether it holds a given type with std::holds_alternative, or use std::visit to pass a visitor object with an overloaded operator(). There is likely no dynamic memory allocation, however, it is hard to extend: the class with the std::variant and any visitor classes will need to know the list of possible types.
On the other hand, BaseClass* holds an unbounded set of derived class types. You ought to be holding std::unique_ptr<BaseClass> or std::shared_ptr<BaseClass> to avoid the potential for memory leaks. To determine whether an instance of a specific type is stored, you must use dynamic_cast or a virtual function. This option requires dynamic memory allocation, but if all processing is via virtual functions, then the code that holds the container does not need to know the full list of types that could be stored.
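As a rough illustration of the std::variant side, here is a sketch of the fill / sort by x / inspect type cycle from the question. It assumes C++17; the alias Element and the helpers make1, make2, key_of, sort_by_x and type_name are hypothetical names introduced only for this example.

```cpp
#include <algorithm>
#include <string>
#include <variant>
#include <vector>

struct BaseClass { int x; };
struct SubClass1 : BaseClass { double y; };
struct SubClass2 : BaseClass { float z; };

// Closed set of alternatives: every possible type must be listed here.
using Element = std::variant<SubClass1, SubClass2>;

SubClass1 make1(int x, double y) { SubClass1 s{}; s.x = x; s.y = y; return s; }
SubClass2 make2(int x, float z)  { SubClass2 s{}; s.x = x; s.z = z; return s; }

// std::visit calls the lambda with whichever alternative is currently held.
int key_of(const Element& e) {
    return std::visit([](const auto& obj) { return obj.x; }, e);
}

// Objects are stored by value in the vector, so no per-element allocation.
void sort_by_x(std::vector<Element>& v) {
    std::sort(v.begin(), v.end(), [](const Element& a, const Element& b) {
        return key_of(a) < key_of(b);
    });
}

// std::holds_alternative answers "which exact type is stored?".
std::string type_name(const Element& e) {
    return std::holds_alternative<SubClass1>(e) ? "SubClass1" : "SubClass2";
}
```

Note that each element occupies at least as much space as the largest alternative, so sorting moves whole objects rather than pointers; whether that is faster or slower than the pointer version has to be measured.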
A problem with std::variant is that you need to specify a list of allowed types; if you add a future derived class you would have to add it to the type list. If you need a more dynamic implementation, you can look at std::any; I believe it can serve the purpose.
I also need to be able to find out the exact type of the elements.
For type recognition you can create an instanceof-like template, as seen in "C++ equivalent of instanceof". It is also said that needing such a mechanism sometimes reveals poor code design.
Performance cannot be predicted ahead of time, because it depends on the usage: it is a matter of testing the different implementations and seeing which one is faster.
Take into consideration that, I am going to sort the container by x
In this case you declare the variable public so sorting is no problem at all; you may want to consider declaring the variable protected or implementing a sorting mechanism in the base class.
What are the advantages/disadvantages of using one or the other?
The same as advantages/disadvantages of using pointers for runtime type resolution and templates for compile time type resolution. There are many things that you might compare. For example:
with pointers you might have memory violations if you misuse them
runtime resolution has additional overhead (but it also depends on how exactly you would use these classes: via a virtual function call, or just common member field access)
but
pointers have fixed size, and are probably smaller than the object of your class will be, so it might be better if you plan to copy your container often
I am interested in the performance too.
Then just measure the performance of your application and then decide. It is not a good practice to speculate which approach might be faster, because it strongly depends on the use case.
Take into consideration that, I am going to sort the container by x
and I also need to be able to find out the exact type of the elements.
In both cases you can find out the type. dynamic_cast in case of pointers, holds_alternative in case of std::variant. With std::variant all possible types must be explicitly specified. Accessing member field x will be almost the same in both cases (with the pointer it is pointer dereference + member access, with variant it is get + member access).
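For the pointer side of that comparison, a minimal sketch of type discovery with dynamic_cast might look like this. It assumes the base class is made polymorphic (dynamic_cast requires at least one virtual function, here the destructor); count_subclass1 is a hypothetical helper name.

```cpp
#include <memory>
#include <vector>

struct BaseClass {
    int x = 0;
    virtual ~BaseClass() = default;  // makes the hierarchy polymorphic
};
struct SubClass1 : BaseClass { double y = 0.0; };
struct SubClass2 : BaseClass { float z = 0.0f; };

// dynamic_cast returns nullptr when the object is not of the requested type.
int count_subclass1(const std::vector<std::unique_ptr<BaseClass>>& c) {
    int n = 0;
    for (const auto& p : c) {
        if (dynamic_cast<const SubClass1*>(p.get()) != nullptr) {
            ++n;
        }
    }
    return n;
}
```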
Sending data over a TCP connection was mentioned in the comments. In this case, it would probably make the most sense to use virtual dispatch.
class BaseClass {
public:
int x;
virtual ~BaseClass() = default; // objects are destroyed through base-class pointers
virtual void sendTo(Socket socket) const {
socket.send(x);
}
};
class SubClass1 final : public BaseClass {
public:
double y;
void sendTo(Socket socket) const override {
BaseClass::sendTo(socket);
socket.send(y);
}
};
class SubClass2 final : public BaseClass {
public:
float z;
void sendTo(Socket socket) const override {
BaseClass::sendTo(socket);
socket.send(z);
}
};
Then you can store pointers to the base class in a container, and manipulate the objects through the base class.
std::vector<std::unique_ptr<BaseClass>> container;
// fill the container
auto a = std::make_unique<SubClass1>();
a->x = 5;
a->y = 17.0;
container.push_back(std::move(a)); // unique_ptr is move-only
auto b = std::make_unique<SubClass2>();
b->x = 1;
b->z = 14.5;
container.push_back(std::move(b)); // unique_ptr is move-only
// sort by x
std::sort(container.begin(), container.end(), [](auto &lhs, auto &rhs) {
return lhs->x < rhs->x;
});
// send the data over the connection
for (auto &ptr : container) {
ptr->sendTo(socket);
}
It's not the same. std::variant is like a union with type safety. No more than one member can be visible at the same time.
// C++ 17
std::variant<int,float,char> v;
v = 5; // now contains int
int i = std::get<int>(v); // i = 5;
std::get<float>(v); // throws std::bad_variant_access
The other option is based on inheritance. All members are visible depending on which pointer you have.
Your selection will depend on if you want all the variables to be visible and what error reporting you want.
Related: don't use a vector of pointers. Use a vector of shared_ptr.
Unrelated: I'm not much of a supporter of the new variant as a replacement for unions. The point of the older C-style union was to be able to access all of its members at the same place in memory.

Alternative schemes for implementing vptr?

This question is not about the C++ language itself (i.e. not about the Standard) but about how to get a compiler to implement alternative schemes for virtual functions.
The general scheme for implementing virtual functions is using a pointer to a table of pointers.
class Base {
private:
int m;
public:
virtual void metha();
};
The equivalent in, say, C would be something like
struct Base {
void (**vtable)();
int m;
};
The first member is usually a pointer to a table of virtual functions (an area of memory over which the application has no control). In most cases this costs the size of a pointer before considering the members, so about 4 bytes with 32-bit addressing. If you created a list of 40k polymorphic objects in your application, that is around 40k x 4 bytes = 160k bytes before any member variables. I also know this happens to be the fastest and most common implementation among C++ compilers.
I know this is complicated by multiple inheritance (especially with virtual classes in them, ie diamond struct, etc).
An alternative way to do the same is to have the first variable be an index into a table of vptrs (the equivalent in C is below)
struct Base {
char classid; // the classid here is an index into an array of vtables
int m;
};
If the total number of classes in an application is less than 255 (including all possible template instantiations, etc.), then a char is good enough to hold an index, thereby reducing the size of all polymorphic classes in the application (I am excluding alignment issues, etc.).
My question is: is there any switch in GNU C++, LLVM, or any other compiler to do this, or otherwise reduce the size of polymorphic objects?
Edit: I understand the alignment issues pointed out. A further point: if this were on a 64-bit system (assuming a 64-bit vptr) with the members of each polymorphic object costing around 8 bytes, then the vptr accounts for 50% of the memory. This mostly relates to small polymorphic objects created en masse, so I am wondering if this scheme is possible for at least specific virtual objects, if not the whole application.
Your suggestion is interesting, but it won't work if the executable is made of several modules that pass objects among them. Given that they are compiled separately (say, as DLLs), if one module creates an object and passes it to another, and the other invokes a virtual method, how would it know which table the classid refers to? You won't be able to add another moduleid because the two modules might not know about each other when they are compiled. So unless you use pointers, I think it's a dead end...
A couple of observations:
Yes, a smaller value could be used to represent the class, but some processors require data to be aligned, so that saving in space may be lost to the requirement to align data values to e.g. 4-byte boundaries. Further, the class-id must be in a well-defined place for all members of a polymorphic inheritance tree, so it is likely to be ahead of other data, and alignment problems can't be avoided.
The cost of storing the pointer has been moved to the code, where every use of a polymorphic function requires code to translate the class-id to either a vtable pointer or some equivalent data structure. So it isn't free. Clearly the cost trade-off depends on the volume of code vs the number of objects.
If objects are allocated from the heap, there is usually space wasted in order to ensure objects are aligned to the worst-case boundary, so even if there is a small amount of code and a large number of polymorphic objects, the memory management overhead might be significantly bigger than the difference between a pointer and a char.
In order to allow programs to be independently compiled, the number of classes in the whole program, and hence the size of the class-id must be known at compile time, otherwise code can't be compiled to access it. This would be a significant overhead. It is simpler to fix it for the worst case, and simplify compilation and linking.
Please don't let me stop you trying, but there are quite a lot more issues to resolve using any technique which may use a variable size id to derive the function address.
I would strongly encourage you to look at Ian Piumarta's Cola (see also the Wikipedia article on Cola).
It actually takes a different approach, and uses the pointer in a much more flexible way, to build inheritance, or prototype-based, or any other mechanism the developer requires.
No, there is no such switch.
The LLVM/Clang codebase avoids virtual tables in classes that are allocated by the tens of thousands. This works well in a closed hierarchy, because a single enum can enumerate all possible classes, and each class is then linked to a value of the enum. The hierarchy is closed precisely because of the enum.
Then, virtuality is implemented by a switch on the enum, and appropriate casting before calling the method. Once again, closed. The switch has to be modified for each new class.
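A minimal sketch of that enum-plus-switch idea is below. It is not taken from the LLVM/Clang sources; the names Kind, Shape, Square, Rect and areaImpl are hypothetical, and the one-byte enum is my own assumption about how small the tag could be made.

```cpp
#include <type_traits>

// One enumerator per class in the (closed) hierarchy.
enum class Kind : unsigned char { Square, Rect };

class Shape {
public:
    Kind kind() const { return kind_; }
    int area() const;  // dispatches with a switch instead of a vtable
protected:
    explicit Shape(Kind k) : kind_(k) {}
private:
    Kind kind_;  // small tag stored in each object, no vtable pointer
};

class Square : public Shape {
public:
    explicit Square(int side) : Shape(Kind::Square), side_(side) {}
    int areaImpl() const { return side_ * side_; }
private:
    int side_;
};

class Rect : public Shape {
public:
    Rect(int w, int h) : Shape(Kind::Rect), w_(w), h_(h) {}
    int areaImpl() const { return w_ * h_; }
private:
    int w_;
    int h_;
};

// The switch must be extended for every new class: the hierarchy is closed.
inline int Shape::area() const {
    switch (kind_) {
    case Kind::Square: return static_cast<const Square*>(this)->areaImpl();
    case Kind::Rect:   return static_cast<const Rect*>(this)->areaImpl();
    }
    return 0;  // unreachable for valid kinds
}

static_assert(!std::is_polymorphic<Shape>::value, "no vtable pointer is added");
```

Because of alignment, the one-byte tag may still be padded up to the alignment of the following member, so the saving compared to a pointer depends on the members that come after it.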
A first alternative: external vpointer.
If you find yourself in a situation where the vpointer tax is paid far too often, that is, most of the objects are of known type, then you can externalize it.
class Interface {
public:
virtual ~Interface() {}
virtual Interface* clone() const = 0; // might be worth it
virtual void updateCount(int) = 0;
protected:
Interface() {} // needed: declaring the copy constructor suppresses the implicit default constructor
Interface(Interface const&) {}
Interface& operator=(Interface const&) { return *this; }
};
template <typename T>
class InterfaceBridge: public Interface {
public:
InterfaceBridge(T& t): t(t) {}
virtual InterfaceBridge* clone() const { return new InterfaceBridge(*this); }
virtual void updateCount(int i) { t.updateCount(i); }
private:
T& t; // value or reference ? Choose...
};
template <typename T>
InterfaceBridge<T> interface(T& t) { return InterfaceBridge<T>(t); }
Then, imagining a simple class:
class Counter {
public:
int getCount() const { return c; }
void updateCount(int i) { c = i; }
private:
int c;
};
You can store the objects in an array:
static Counter array[5];
assert(sizeof(array) == sizeof(int)*5); // no v-pointer
And still use them with polymorphic functions:
void five(Interface& i) { i.updateCount(5); }
InterfaceBridge<Counter> ib(array[3]); // create *one* v-pointer
five(ib);
assert(array[3].getCount() == 5);
The value vs reference is actually a design tension. In general, if you need to clone you need to store by value, and you need to clone when you store by base class (boost::ptr_vector for example). It is possible to actually provide both interfaces (and bridges):
Interface <--- ClonableInterface
| |
InterfaceB ClonableInterfaceB
It's just extra typing.
Another solution, much more involved.
A switch is implementable by a jump table. Such a table could perfectly be created at runtime, in a std::vector for example:
class Base {
public:
~Base() { VTables()[vpointer].dispose(*this); }
void updateCount(int i) {
VTables()[vpointer].updateCount(*this, i);
}
protected:
struct VTable {
typedef void (*Dispose)(Base&);
typedef void (*UpdateCount)(Base&, int);
Dispose dispose;
UpdateCount updateCount;
};
static void NoDispose(Base&) {}
static unsigned RegisterTable(VTable t) {
std::vector<VTable>& v = VTables();
v.push_back(t);
return v.size() - 1;
}
explicit Base(unsigned id): vpointer(id) {
assert(id < VTables().size());
}
private:
// Implement in .cpp or pay the cost of weak symbols.
// Returns a reference so that registrations persist in the single static table.
static std::vector<VTable>& VTables() { static std::vector<VTable> VT; return VT; }
unsigned vpointer;
};
And then, a Derived class:
class Derived: public Base {
public:
Derived(): Base(GetID()) {}
private:
static void UpdateCount(Base& b, int i) {
static_cast<Derived&>(b).count = i;
}
static unsigned GetID() {
static unsigned ID = RegisterTable(VTable({&NoDispose, &UpdateCount}));
return ID;
}
unsigned count;
};
Well, now you'll realize how great it is that the compiler does it for you, even at the cost of some overhead.
Oh, and because of alignment, as soon as a Derived class introduces a pointer, there is a risk that 4 bytes of padding are used between Base and the next attribute. You can use them by carefully selecting the first few attributes in Derived to avoid padding...
The short answer is that no, I don't know of any switch to do this with any common C++ compiler.
The longer answer is that to do this, you'd just about have to build most of the intelligence into the linker, so it could coordinate distributing the IDs across all the object files getting linked together.
I'd also point out that it wouldn't generally do a whole lot of good. At least in a typical case, you want each element in a struct/class at a "natural" boundary, meaning its starting address is a multiple of its size. Using your example of a class containing a single int, the compiler would allocate one byte for the vtable index, followed immediately by three bytes of padding so the next int would land at an address that was a multiple of four. The end result would be that objects of the class would occupy precisely the same amount of storage as if we used a pointer (on a system with 4-byte pointers).
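The padding argument can be checked with offsetof and sizeof. The sketch below mirrors the two C-style layouts from the question; WithIndex, WithPointer and padding_after_index are hypothetical names, and the concrete numbers depend on the platform's alignment rules.

```cpp
#include <cstddef>

// One-byte class index followed by an int, as proposed in the question.
struct WithIndex {
    unsigned char classid;
    int m;
};

// Conventional layout: a vtable pointer followed by an int.
struct WithPointer {
    void (**vtable)();
    int m;
};

// Bytes the compiler inserts between classid and m so that m is aligned.
constexpr std::size_t padding_after_index() {
    return offsetof(WithIndex, m) - sizeof(unsigned char);
}
```

On a typical platform with a 4-byte int, padding_after_index() is 3 and sizeof(WithIndex) is 8, which equals sizeof(WithPointer) with 4-byte pointers and is half of it with 8-byte pointers.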
I'd add that this is not a far-fetched exception either. For years, standard advice to minimize padding inserted into structs/classes has been to put the items expected to be largest at the beginning, and progress toward the smallest. That means in most code, you'd end up with those same three bytes of padding before the first explicitly defined member of the struct.
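As a small illustration of that ordering advice, here are two structs with the same members in different order. Loose, Tight and bytes_saved are hypothetical names; the actual sizes are implementation-specific, so the comments only say what is typical.

```cpp
#include <cstddef>

// Small members on both sides of a large one: padding is typically
// inserted after 'a' and again after 'c'.
struct Loose {
    char a;
    double b;
    char c;
};

// Largest member first: the two chars can share the tail of the struct.
struct Tight {
    double b;
    char a;
    char c;
};

// Bytes saved per object by reordering (typically 8 with an 8-byte double).
constexpr std::size_t bytes_saved() {
    return sizeof(Loose) - sizeof(Tight);
}
```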
To get any good from this, you'd have to be aware of it, and have a struct with (for example) three bytes of data you could move where you wanted. Then you'd move those to be the first items explicitly defined in the struct. Unfortunately, that would also mean that if you turned this switch off so you have a vtable pointer, you'd end up with the compiler inserting padding that might otherwise be unnecessary.
To summarize: it's not implemented, and if it were, it wouldn't usually accomplish much.

Order of fields in C/C++ structs

I have a situation similar to this one
struct Child
{
u16 x, y;
// other fields
};
struct Father
{
struct Child child1;
struct Child child2;
// other fields
};
Father tilemap[WIDTH][HEIGHT];
Now I just realized I would like to save four bytes for x,y which are set always to the same values for both children of the same father.
All around my code I pass around many Father* and many Child*, recovering coordinates with father->child1.x or child->x respectively. I would like to safely move the coordinates to the Father level, but I'm unsure about some facts.
Will the order of declared fields be respected versus any optimization or possible implementation of gcc/g++? Can I be confident that &father == &father.child1?
The real issue here is that I pass Child* without knowing whether it's the child1 or child2 field, so I cannot directly know the offset needed to recover the address of the father (and consequently the coordinates). I was thinking of using a bit at Child level to distinguish them, but will I then be able to easily recover the address of the father?
Any suggestion would be appreciated, thanks
EDIT: just as a further info, I'm using C++ as my main language but these structs don't contain ANY strange methods, just fields and empty constructor.
The general rules about field layout in C are:
The address of the first member is the same as the address of the struct itself. That is, the offsetof of the member field is 0.
The addresses of the members always increase in declaration order. That is, the offsetof of the n-th field is lower than that of the (n+1)-th member.
In C++, of course, that is only true if it is a standard layout type, that is roughly, a class or struct with no public/private/protected mixed members, no virtual functions and no members inherited from other classes.
Disclaimer: Partial answer. C++ only
Will the order of declared fields be respected versus any optimization
or possible implementation of gcc/g++?
The order of the members in the memory layout will not be tampered with by the compiler. It's the same order you declared the members in.
Can I be confident that &father == &father.child1?
In this particular case, yes. But it does not follow from the mere fact that child1 is the first member of father that &father == &father.child1. This is true only if father is a POD, which in this case it is.
The pertinent section of the C standard says this (emphasis mine):
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
The C++ standard makes the same promise:
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. —end note ]
So when you ask:
Can I be confident that &father == &father.child1?
The answer is yes.
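A quick way to convince yourself is to compare the two addresses directly. This is only a sketch: std::uint16_t stands in for the question's u16, the elided "other fields" are left out, and first_member_shares_address is a hypothetical helper name.

```cpp
#include <cstdint>

struct Child {
    std::uint16_t x, y;
};

struct Father {
    Child child1;
    Child child2;
};

// Compares the address of the struct with the address of its first member.
bool first_member_shares_address(const Father& f) {
    return static_cast<const void*>(&f) == static_cast<const void*>(&f.child1);
}
```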
Try the following
struct Child
{
int isChild1;
u16 x, y;
// other fields
};
...
Father *father_p;
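if (child_p->isChild1)
father_p = reinterpret_cast<Father *>(child_p); // child1 is the first member of Father
else
// step back by the byte offset of child2 inside Father (offsetof is in <cstddef>)
father_p = reinterpret_cast<Father *>(reinterpret_cast<char *>(child_p) - offsetof(Father, child2));
(*father_p).x = ... // whatever you want to do with coordinates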
You should be careful not to pass a Child which isn't incorporated in a corresponding Father, in which case you'll get a bogus address in father_p and (maybe) corrupt your memory.
Instead of relying on memory layout, have you considered using something along the lines of the Flyweight pattern?
Instead of storing two Childs in Father, you can store two stripped-down BasicChilds (that don't have x,y data) and generate full-fledged Childs on-the-fly as needed:
struct BasicChild
{
float foo;
float bar;
void printPosition(int x, int y) {std::cout << x << "," << y << "\n";}
};
struct Child
{
Child(BasicChild basicChild, int x, int y)
: basicChild_(basicChild), x_(x), y_(y){}
float foo() {return basicChild_.foo;}
float bar() {return basicChild_.bar;}
void printPosition() {basicChild_.printPosition(x_, y_);}
private:
int x_, y_;
BasicChild basicChild_;
};
struct Father
{
Child child1() {return Child(child1_, x_, y_);}
Child child2() {return Child(child2_, x_, y_);}
private:
int x_, y_;
BasicChild child1_;
BasicChild child2_;
};
If you decide to rely on any compiler specific behaviour then the best you can do is to add static asserts for any assumptions you are relying on. This way, if anything changes about the layout due to compiler upgrade, compile options, or pragmas elsewhere in code, you will be told at compile time straight away, and exactly what the problem is. This then would serve as a basis for porting to other platforms too if that is a requirement.
Examples in this Q: What does static_assert do, and what would you use it for?
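For the Father/Child case, such compile-time checks could look roughly like the sketch below. It assumes std::uint16_t in place of u16 and omits the elided "other fields"; child2_offset is a hypothetical helper, and the third assertion encodes an assumption (no padding between the two children) rather than a guarantee of the language.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct Child {
    std::uint16_t x, y;
};

struct Father {
    Child child1;
    Child child2;
};

// Fail the build as soon as any layout assumption stops holding.
static_assert(std::is_standard_layout<Father>::value,
              "Father must stay a standard-layout type");
static_assert(offsetof(Father, child1) == 0,
              "child1 is assumed to sit at the start of Father");
static_assert(offsetof(Father, child2) == sizeof(Child),
              "child2 is assumed to follow child1 without padding");

// Byte offset a Child* pointing at child2 would have to step back by.
constexpr std::size_t child2_offset() {
    return offsetof(Father, child2);
}
```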

a struct doesn't belong in an object oriented program

Or does it?
Should an object-oriented design use a language construct that exposes member data by default, if there is an equally useful construct that properly hides data members?
EDIT: One of the responders mentioned that if there's no invariant one can use a struct. That's an interesting observation: a struct is a data structure, i.e. it contains related data. If the data members in a struct are related, isn't there always an invariant?
In C++, structs and classes are identical except for the default public/privateness of their members. (This default is easily, and usually, overridden.)
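A tiny sketch of what "identical except for the defaults" means in practice; DataS, DataC and DerivedS are made-up names for illustration.

```cpp
// Members of a struct are public unless stated otherwise.
struct DataS {
    int v = 1;
};

// Members of a class are private unless stated otherwise.
class DataC {
    int v = 2;  // private by default
public:
    int get() const { return v; }
};

// The same default applies to inheritance: a struct inherits publicly.
struct DerivedS : DataS {};
```

Here s.v compiles for a DataS object, while c.v would be rejected for a DataC object; the two types hold the same single int, so nothing differs in storage.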
However, most programmers think of a struct as a "data object" and a class as an "interactive object". That's not a bad thing; and in fact should be taken advantage of. If something is just an inanimate lump of data (even maybe if it has a couple of inspector methods), use a struct for it; it'll save a bit of effort when a programmer is trying to see what it's for.
Don't be a hiding zealot. If your get/set methods do nothing but simply copy verbatim the value onto/from a hidden, private field, you've gained nothing over a public member and only complicate unnecessarily your class (and, depending on the intelligence of the compiler, slow its usage a bit).
There's a case for not allowing direct access when your setter methods do some validation, copy the data somewhere else, process it a bit before storing it, etc. Same in the case of getters that actually calculate the value they return from multiple internal sources, and hide the way it's derived (I believe Bertrand Meyer speaks a bit about this in his book)
Or if allowing the users of your class to directly change such a value would have unintended side effects or break an assumption some of your member classes have about the values. In those situations, by all means, do hide your values.
For instance, for a simple "Point" class, that only holds a couple coordinates and colour, and methods to "Plot" it and "Hide" it on screen, I would see no point in not allowing the user to directly set the values for its fields.
In C# for example I use structs for some simple better-left-as-values data types:
public struct Point
{
public int X;
public int Y;
}
and for any P/Invoke to libraries where the arguments are structs you'll have to use them for certain.
Do they belong in the general design of an application? Of course they do, use a struct when it makes sense to do so. Just like you'd use an enum with bit flags when it makes sense to do so instead of resorting to some complicated string parsing for storing combined values.
In C++, the difference between a struct and a class is the default visibility of its contents (i.e. public for a struct, and private for a class). I guess this difference was to keep C compatibility.
But semantically, I guess this is subject to interpretation.
An example of struct
In a struct, everything is public (by default), meaning the user can modify each data value as desired, and still the struct remains a valid object. Example of struct:
struct CPoint
{
int x ;
int y ;
CPoint() : x(0), y(0) {}
double getDistanceFromOrigin() const
{
return std::sqrt(static_cast<double>(x * x + y * y)) ; // needs <cmath>
}
} ;
inline CPoint operator + (const CPoint & lhs, const CPoint & rhs)
{
CPoint r(lhs) ;
r.x += rhs.x ;
r.y += rhs.y ;
return r ;
}
You can change the x value of a CPoint, and it still remains a valid CPoint.
Note that, unlike some believe, a C++ struct can (and should) have constructors, methods and non-member functions attached to its interface, as shown above.
An example of class
In a class, everything is private (by default), meaning the user can modify the data only through a well defined interface, because the class must keep its internals valid. Example of class:
class CString
{
public :
CString(const char * p) { /* etc. */ } ;
CString(const CString & p) { /* etc. */ } ;
const char * getString() const { return this->m_pString ; }
size_t getSize() const { return this->m_iSize ; }
void copy(const CString & p) { /* code for string copy */ }
void concat(const CString & p) { /* code for string concatenation */ }
private :
size_t m_iSize ;
char * m_pString ;
} ;
inline CString operator + (const CString & lhs, const CString & rhs)
{
CString r(lhs) ;
r.concat(rhs) ;
return r ;
}
You see that when you call concat, the pointer may need reallocation (to increase its capacity), and the size of the string must be updated at the same time. You can't let the user modify the string by hand and forget to update the size.
So, the class must protect its internals, and be sure everything will be correctly updated when needed.
Conclusion
For me, the difference between a struct and a class is the dependencies between the aggregated data.
If each and every piece of data is independent from all the others, then perhaps you should consider a struct (i.e., a class with public data member).
If not, or if in doubt, use a class.
Now, of course, in C#, the struct and class are two different types of objects (i.e. value types for structs, and reference types for classes). But this is off topic, I guess.
Technically, a struct is a class with the default visibility of public (a real class has a default visibility of private).
There is more of a distinction in common use.
A struct is normally just a collection of data, to be examined and processed by other code.
A class is normally more of a thing, maintaining some sort of control over its data, and with behavior specified by associated functions.
Typically, classes are more useful, but every so often there are uses for something like a C struct, and it's useful to have a notational difference to show it.
The matter is easy. If the class does have invariants to guarantee, you should never make the members constraining the invariant public.
If your struct is merely an aggregate of different objects, and doesn't have an invariant to hold, you are indeed free and encouraged to put its members public. That's the way std::pair<T, U> in C++ does it.
What's that invariant stuff?
Simple example: Consider you have a Point class whose x and y members must always be >= 0 . You can make an invariant stating
/* x >= 0 && y >= 0 for this class's objects. */
If you now make those members public, clients could simply change x and y, and your invariant could break easily. If the members, however, are allowed to contain all possible values fitting their own invariants respectively, you could of course just make those members public: You wouldn't add any protection to them anyway.
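One way to protect that invariant is sketched below. Throwing std::invalid_argument on a negative coordinate is my own assumption about how a violation should be reported, and the accessor names and the checked helper are hypothetical.

```cpp
#include <stdexcept>

// Invariant: x >= 0 && y >= 0 for every Point object.
class Point {
public:
    Point(int x, int y) : x_(checked(x)), y_(checked(y)) {}

    int x() const { return x_; }
    int y() const { return y_; }

    // Every write goes through the same check, so clients cannot break
    // the invariant; a rejected value leaves the object unchanged.
    void setX(int x) { x_ = checked(x); }
    void setY(int y) { y_ = checked(y); }

private:
    static int checked(int v) {
        if (v < 0) {
            throw std::invalid_argument("coordinate must be >= 0");
        }
        return v;
    }

    int x_;
    int y_;
};
```

If the members could hold any int without breaking anything, the setters would add nothing and public members would do just as well.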
A struct is essentially a model class but with different syntax.
public struct Point {
int x;
int y;
}
is logically the same as:
public class Point {
private int x;
private int y;
public void setX(int x) { this.x=x; }
public int getX() { return x; }
public void setY(int y) { this.y=y; }
public int getY() { return y; }
}
Both are a mutable model that holds pair of integer values called x and y. So I would say that it's a valid object oriented construct.
Yes. It's like a mini-class.
Yes, they do. They have different semantics than classes. A struct is generally considered and treated as a value type, while a class is generally considered and treated as a reference type. The difference is not that pronounced in everyday programming; however, it is an important difference when it comes to things like marshalling, COM interop and passing instances around.
I use structs regularly - mostly for data received from the network or hardware. They are usually wrapped in a class for use by higher level parts of the program.
My rule of thumb is a struct is always pure data, except for a constructor. Anything else is a class.
Most answers seem to be in favor of a struct as something to be acceptable and useful, as long as it does not have a behavior (i.e. methods). That seems fair enough.
However, you can never be sure that your object does not evolve into something that may need behavior, and hence some control over its data members. If you're lucky enough that you have control over all users of your struct, you can go over all uses of all data members. But what if you don't have access to all users?
A struct, as used in C or C++, and the struct used in C# ( or any .Net language ), are such different animals that they probably should not even have the same name... Just about any generalization about structs in one language can easily be false, or true for a completely unrelated reason, in the other.
If there is a need for invariant, make it a class. Otherwise, struct is OK.
See these similar questions:
When should you use a class vs a struct in C++?
What are the differences between struct and class in C++
plus:
According to Stroustrup in the C++ Programming Language:
Which style you use depends on circumstances and taste. I usually prefer to use struct for classes that have all data public. I think of such classes as "not quite proper types, just data structures."
Formally, in C++ a struct is a class with the visibility of its members set to public by default. By tradition structs are used to group collection of homogeneous data that have no particular reasons for being accessed by specific methods.
The public visibility of its members makes structs preferred to class to implement policy classes and metafunctions.
There's nothing wrong with using structs per se, but if you're finding yourself needing them, you should ask what's wrong with your analysis. Consider, eg, the Point class above: it gives some little improvement in readability, since you can always use
Point foo;
//...
foo.x = bar;
in place of, say, having a two element array as in
#define X 0
#define Y 1
//...
foo[X] = bar;
But classes are meant to hide details behind a contract. If your Point is in some normalized space, the values may range in the half-open interval [0.0..1.0); if it's a screen they may range in [0..1023]. If you use a struct in place of accessors, how will you keep someone from assigning foo.x = 1023 when x should be everywhere < 1.0?
The only reason C++ programmers used structs for Points is that back at the dawn of time --- 20 years ago, when C++ was new --- inlining wasn't handled very well, so that foo.setX(1023) actually took more instructions than foo.x = 1023. That's no longer true.
Structs are fine as long as they're kept small. In C#, they are value types, typically stored on the stack or inline in their containing object rather than as separate heap allocations, so you need to watch the size. They can come in handy for small data structures like Point, Size, etc. A class is usually the better choice though, being a reference type and all.