How to compare generic structs in C++? - c++

I want to compare structs in a generic way and I've done something like this (I cannot share the actual source, so ask for more details if necessary):
template<typename Data>
bool structCmp(Data data1, Data data2)
{
void* dataStart1 = (std::uint8_t*)&data1;
void* dataStart2 = (std::uint8_t*)&data2;
return memcmp(dataStart1, dataStart2, sizeof(Data)) == 0;
}
This mostly works as intended, except that it sometimes returns false even though the two struct instances have identical members (I've checked with the Eclipse debugger). After some searching I discovered that memcmp can fail because the struct is padded.
Is there a more proper way of comparing memory that is indifferent to padding? I'm not able to modify the structs (they're part of an API I'm using), and the many different structs have differing members, so they cannot be compared member by member in a generic way (to my knowledge).
Edit: I'm unfortunately stuck with C++11. Should've mentioned this earlier...

No, memcmp is not suitable for this. Reflection in C++ is also insufficient for this at this point (some experimental compilers may already support reflection strong enough to do it, and C++23 or a later standard might have the features you need).
Without built-in reflection, the easiest way to solve your problem is to do some manual reflection.
Take this:
struct some_struct {
int x;
double d1, d2;
char c;
};
We want to do the minimal amount of work needed to compare two of these.
If we have (C++14 or later, because of the deduced return type):
auto as_tie(some_struct const& s){
return std::tie( s.x, s.d1, s.d2, s.c );
}
or
auto as_tie(some_struct const& s)
-> decltype(std::tie( s.x, s.d1, s.d2, s.c ))
{
return std::tie( s.x, s.d1, s.d2, s.c );
}
for c++11, then:
template<class S>
bool are_equal( S const& lhs, S const& rhs ) {
return as_tie(lhs) == as_tie(rhs);
}
does a pretty decent job.
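As a minimal, self-contained sketch of the idea above in C++11: the helper make_with_fill and the example function are hypothetical names added for illustration only. They overwrite the whole object, padding included, with a byte pattern before setting the members, to imitate two objects whose padding differs.

```cpp
#include <cassert>
#include <cstring>
#include <tuple>

struct some_struct {
    int x;
    double d1, d2;
    char c;
};

// Manual "reflection": list the members once, in declaration order.
auto as_tie(some_struct const& s)
    -> decltype(std::tie(s.x, s.d1, s.d2, s.c))
{
    return std::tie(s.x, s.d1, s.d2, s.c);
}

template<class S>
bool are_equal(S const& lhs, S const& rhs) {
    return as_tie(lhs) == as_tie(rhs);
}

// Hypothetical helper: fill every byte of the object with `fill`,
// then give all members the same well-defined values.
inline some_struct make_with_fill(unsigned char fill) {
    some_struct s;
    std::memset(&s, fill, sizeof s);
    s.x = 1; s.d1 = 2.0; s.d2 = 3.0; s.c = 'a';
    return s;
}

inline void as_tie_example() {
    some_struct a = make_with_fill(0x00);
    some_struct b = make_with_fill(0xFF);
    assert(are_equal(a, b));   // only members are compared
    b.c = 'z';
    assert(!are_equal(a, b));  // a real member difference is detected
}
```

Because only the listed members take part in the comparison, the result does not depend on what the padding bytes happen to contain.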
We can extend this process to be recursive with a bit of work; instead of comparing ties, compare each element wrapped in a template, and that template's operator== recursively applies this rule (wrapping the element in as_tie to compare) unless the element already has a working ==, and handles arrays.
This will require a bit of a library (100ish lines of code?) together with writing a bit of manual per-member "reflection" data. If the number of structs you have is limited, it might be easier to write per-struct code manually.
There are probably ways to get
REFLECT( some_struct, x, d1, d2, c )
to generate the as_tie structure using horrible macros. But as_tie is simple enough. In c++11 the repetition is annoying; this is useful:
#define RETURNS(...) \
noexcept(noexcept(__VA_ARGS__)) \
-> decltype(__VA_ARGS__) \
{ return __VA_ARGS__; }
in this situation and many others. With RETURNS, writing as_tie is:
auto as_tie(some_struct const& s)
RETURNS( std::tie( s.x, s.d1, s.d2, s.c ) )
removing the repetition.
Here is a stab at making it recursive:
template<class T,
typename std::enable_if< !std::is_class<T>{}, bool>::type = true
>
auto refl_tie( T const& t )
RETURNS(std::tie(t))
template<class...Ts,
typename std::enable_if< (sizeof...(Ts) > 1), bool>::type = true
>
auto refl_tie( Ts const&... ts )
RETURNS(std::make_tuple(refl_tie(ts)...))
template<class T, std::size_t N>
auto refl_tie( T const(&t)[N] ) {
// lots of work in C++11 to support this case, todo.
// in C++17 I could just make a tie of each of the N elements of the array?
// in C++11 I might write a custom struct that supports an array
// reference/pointer of fixed size and implements =, ==, !=, <, etc.
}
struct foo {
int x;
};
struct bar {
foo f1, f2;
};
auto refl_tie( foo const& s )
RETURNS( refl_tie( s.x ) )
auto refl_tie( bar const& s )
RETURNS( refl_tie( s.f1, s.f2 ) )
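For the C++11 array case left as a todo above, one possible shape of the "custom struct" is sketched here. The names array_ref and make_array_ref are assumptions, not part of the original answer, and the elements are compared with their own operator== rather than recursively through refl_tie.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: a non-owning view of a fixed-size array that compares
// element by element instead of comparing addresses.
template<class T, std::size_t N>
struct array_ref {
    T const* data;

    friend bool operator==(array_ref const& lhs, array_ref const& rhs) {
        for (std::size_t i = 0; i < N; ++i)
            if (!(lhs.data[i] == rhs.data[i])) return false;
        return true;
    }
    friend bool operator!=(array_ref const& lhs, array_ref const& rhs) {
        return !(lhs == rhs);
    }
};

// Deduces the element type and the extent from the array argument.
template<class T, std::size_t N>
array_ref<T, N> make_array_ref(T const (&arr)[N]) {
    return array_ref<T, N>{arr};
}

inline void array_ref_example() {
    int a[3] = {1, 2, 3};
    int b[3] = {1, 2, 3};
    int c[3] = {1, 2, 4};
    assert(make_array_ref(a) == make_array_ref(b));  // same contents, different addresses
    assert(make_array_ref(a) != make_array_ref(c));
}
```

A wrapper like this could be returned from a refl_tie overload for arrays; making it recurse into element types would need additional work.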
c++17 refl_tie(array) (fully recursive, even supports arrays-of-arrays):
template<class T, std::size_t N, std::size_t...Is>
auto array_refl( T const(&t)[N], std::index_sequence<Is...> )
RETURNS( std::array<decltype( refl_tie(t[0]) ), N>{ refl_tie( t[Is] )... } )
template<class T, std::size_t N>
auto refl_tie( T(&t)[N] )
RETURNS( array_refl( t, std::make_index_sequence<N>{} ) )
Here I use a std::array of refl_tie. This is much faster than my previous tuple of refl_tie at compile time.
Also
template<class T,
typename std::enable_if< !std::is_class<T>{}, bool>::type = true
>
auto refl_tie( T const& t )
RETURNS(std::cref(t))
using std::cref here instead of std::tie could save on compile-time overhead, as cref is a much simpler class than tuple.
Finally, you should add
template<class T, std::size_t N, class...Ts>
auto refl_tie( T(&t)[N], Ts&&... ) = delete;
which will prevent array members from decaying to pointers and falling back on pointer-equality (which you probably don't want from arrays).
Without this, if you pass in an array of non-reflected structs, it falls back on the pointer-to-non-reflected-struct refl_tie, which compiles and returns nonsense.
With this, you end up with a compile-time error.
Support for recursion through library types is tricky. You could std::tie them:
template<class T, class A>
auto refl_tie( std::vector<T, A> const& v )
RETURNS( std::tie(v) )
but that doesn't support recursion through it.

You are right that padding gets in your way of comparing arbitrary types in this way.
There are measures you can take:
If you are in control of Data, then e.g. GCC has __attribute__((packed)). It has an impact on performance, but it might be worth a try. I have to admit, though, that I don't know whether packed lets you eliminate padding completely. The GCC documentation says:
This attribute, attached to struct or union type definition, specifies that each member of the structure or union is placed to minimize the memory required. When attached to an enum definition, it indicates that the smallest integral type should be used.
If you are not in control of Data then at least std::has_unique_object_representations<T> can tell you if your comparison will yield correct results:
If T is TriviallyCopyable and if any two objects of type T with the same value have the same object representation, provides the member constant value equal true. For any other type, value is false.
and further:
This trait was introduced to make it possible to determine whether a type can be correctly hashed by hashing its object representation as a byte array.
PS: I only addressed padding, but don't forget that types whose instances can compare equal while having different representations in memory are by no means rare (e.g. std::string, std::vector and many others).
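A small sketch of how the trait could guard a memcmp-based comparison. It requires C++17, so it is not available on a C++11-only toolchain; bytes_equal and two_ints are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Refuses to compile for types whose equal values may have different bytes
// (padding, floating-point members, and so on).
template<class T>
bool bytes_equal(T const& lhs, T const& rhs) {
    static_assert(std::has_unique_object_representations<T>::value,
                  "T may contain padding or other non-unique representations");
    return std::memcmp(&lhs, &rhs, sizeof(T)) == 0;
}

// Hypothetical example type; two ints leave no room for padding on
// mainstream platforms.
struct two_ints { int a; int b; };

inline void bytes_equal_example() {
    two_ints p{1, 2}, q{1, 2}, r{1, 3};
    assert(bytes_equal(p, q));
    assert(!bytes_equal(p, r));
}
```

A struct such as { char c; int i; } would typically trip the static_assert, which turns the silent wrong answer into a compile-time error.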

In short: Not possible in a generic way.
The problem with memcmp is that the padding may contain arbitrary data, and hence the memcmp may fail. If there were a way to find out where the padding is, you could zero out those bits and then compare the data representations; that would check for equality if the members are trivially comparable (which is not the case for e.g. std::string, since two strings can contain different pointers while the char arrays they point to are equal). But I know of no way to get at the padding of structs. You can try to tell your compiler to pack the structs, but this will make accesses slower and is not really guaranteed to work.
The cleanest way to implement this is to compare all members. Of course this is not really possible in a generic way (until we get compile time reflections and meta classes in C++23 or later). From C++20 onward, one could generate a default operator<=> but I think this would also only be possible as a member function so, again this is not really applicable. If you are lucky and all structs you want to compare have an operator== defined, you can of course just use that. But that is not guaranteed.
EDIT: Ok, there is actually a totally hacky and somewhat generic way for aggregates. (I only wrote the conversion to tuples, those have a default comparison operator). godbolt

C++20 supports defaulted comparisons:
#include <iostream>
#include <compare>
struct XYZ
{
int x;
char y;
long z;
auto operator<=>(const XYZ&) const = default;
};
int main()
{
XYZ obj1 = {4,5,6};
XYZ obj2 = {4,5,6};
if (obj1 == obj2)
{
std::cout << "objects are identical\n";
}
else
{
std::cout << "objects are not identical\n";
}
return 0;
}

Assuming POD data, default assignment operator copies only member bytes. (actually not 100% sure about that, don't take my word for it)
You can use this to your advantage:
template<typename Data>
bool structCmp(Data data1, Data data2) // Data is POD
{
Data tmp;
memcpy(&tmp, &data1, sizeof(Data)); // copy data1 including padding
tmp = data2; // copy data2 only members
return memcmp(&tmp, &data1, sizeof(Data)) == 0;
}

I believe you may be able to base a solution on Antony Polukhin's wonderfully devious voodoo in the magic_get library - for structs, not for complex classes.
With that library, we are able to iterate the different fields of a struct, with their appropriate type, in purely-general-templated code. Antony has used this, for example, to be able to stream arbitrary structs to an output stream with the correct types, completely generically. It stands to reason that comparison might also be a possible application of this approach.
... but you would need C++14. At least it's better than the C++17 and later suggestions in other answers :-P

Related

Tuple wrapper that works with get, tie, and other tuple operations

I have written a fancy "zip iterator" that already fulfils many roles (can be used in for_each, copy loops, container iterator range constructors etc...).
Under all the template code to work around the pairs/tuples involved, it comes down to the dereference operator of the iterator returning a tuple/pair of references and not a reference to a tuple/pair.
I want my iterator to work with std::sort, so I need to be able to do swap(*iter1, *iter2) and have the underlying values switched in the original containers being iterated over.
The code and a small demo can be viewed here (it's quite a bit to get through): http://coliru.stacked-crooked.com/a/4fe23b4458d2e692
Although libstdc++'s sort uses std::iter_swap which calls swap, e.g. libc++'s does not, and it just calls swap directly, so I would like a solution involving swap as the customization point.
What I have tried (and gotten oooooh so close to working) is, instead of returning std::pair/std::tuple from operator* as I am doing now, returning a simple wrapper type. The intent is to have the wrapper behave as if it were a std::pair/std::tuple, and to allow me to write a swap function for it.
It looked like this:
template<typename... ValueTypes>
struct TupleWrapper : public PairOrTuple_t<ValueTypes...>
{
using PairOrTuple_t<ValueTypes...>::operator=;
template<typename... TupleValueTypes>
operator PairOrTuple_t<TupleValueTypes...>() const
{
return static_cast<PairOrTuple_t<ValueTypes...>>(*this);
}
};
template<std::size_t Index, typename... ValueTypes>
decltype(auto) get(TupleWrapper<ValueTypes...>& tupleWrapper)
{
return std::get<Index>(tupleWrapper);
}
template<std::size_t Index, typename... ValueTypes>
decltype(auto) get(TupleWrapper<ValueTypes...>&& tupleWrapper)
{
return std::get<Index>(std::forward<TupleWrapper<ValueTypes...>>(tupleWrapper));
}
template<typename... ValueTypes,
std::size_t... Indices>
void swap(TupleWrapper<ValueTypes...> left,
TupleWrapper<ValueTypes...> right,
const std::index_sequence<Indices...>&)
{
(std::swap(std::get<Indices>(left), std::get<Indices>(right)), ...);
}
template<typename... ValueTypes>
void swap(TupleWrapper<ValueTypes...> left,
TupleWrapper<ValueTypes...> right)
{
swap(left, right, std::make_index_sequence<sizeof...(ValueTypes)>());
}
namespace std
{
template<typename... ValueTypes>
class tuple_size<utility::implementation::TupleWrapper<ValueTypes...>> : public tuple_size<utility::implementation::PairOrTuple_t<ValueTypes...>> {};
template<std::size_t Index, typename... ValueTypes>
class tuple_element<Index, utility::implementation::TupleWrapper<ValueTypes...>> : public tuple_element<Index, utility::implementation::PairOrTuple_t<ValueTypes...>> {};
}
Full code here: http://coliru.stacked-crooked.com/a/951cd639d95af130.
Returning this wrapper in operator* seems to compile (at least on GCC) but produces garbage.
On Clang's libc++, the std::tie fails to compile.
Two questions:
How can I get this to compile with libc++ (the magic seems to lie in the conversion operator of TupleWrapper?)
Why is the result wrong and what did I do wrong?
I know it's a lot of code, but well, I can't get it any shorter as all the tiny examples of swapping tuple wrappers worked fine for me.
1st problem
One of the issues is that the ZipIterator class does not satisfy the requirements of RandomAccessIterator.
std::sort requires RandomAccessIterators as its parameters
RandomAccessIterators must be BidirectionalIterators
BidirectionalIterators must be ForwardIterators
ForwardIterators have the condition that ::reference must be value_type& / const value_type&:
The type std::iterator_traits<It>::reference must be exactly
T& if It satisfies OutputIterator (It is mutable)
const T& otherwise (It is constant)
(where T is the type denoted by std::iterator_traits<It>::value_type)
which ZipIterator currently doesn't implement.
It works fine with std::for_each and similar functions that only require the iterator to satisfy the requirements of InputIterator / OutputIterator.
The reference type for an input iterator that is not also a LegacyForwardIterator does not have to be a reference type: dereferencing an input iterator may return a proxy object or value_type itself by value (as in the case of std::istreambuf_iterator).
tl;dr: ZipIterator can be used as an InputIterator / OutputIterator, but not as a ForwardIterator, which std::sort requires.
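The reference requirement quoted above can be checked at compile time. The trait below is a hypothetical helper written for this illustration (has_true_reference is not a standard name); std::vector<bool>::iterator serves as a well-known example of an iterator whose reference type is a proxy object.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical trait: true when iterator_traits<It>::reference is exactly
// value_type& or const value_type&, as the legacy ForwardIterator rules demand.
template<class It>
struct has_true_reference {
    using traits     = std::iterator_traits<It>;
    using value_type = typename traits::value_type;
    using reference  = typename traits::reference;
    static constexpr bool value =
        std::is_same<reference, value_type&>::value ||
        std::is_same<reference, value_type const&>::value;
};

static_assert(has_true_reference<std::vector<int>::iterator>::value,
              "vector<int>::iterator yields int&");
static_assert(has_true_reference<std::vector<int>::const_iterator>::value,
              "vector<int>::const_iterator yields const int&");
static_assert(!has_true_reference<std::vector<bool>::iterator>::value,
              "vector<bool>::iterator yields a proxy object");
```

An iterator that returns a tuple of references by value from operator* would fail this check in the same way as the std::vector<bool> iterator does.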
2nd problem
As @T.C. pointed out in their comment, std::sort is allowed to move values out of the container and then later move them back in.
The type of dereferenced RandomIt must meet the requirements of MoveAssignable and MoveConstructible.
which ZipIterator currently can't handle (it never copies / moves the referenced objects), so something like this doesn't work as expected:
std::vector<std::string> vector_of_strings{"one", "two", "three", "four"};
std::vector<int> vector_of_ints{1, 2, 3, 4};
auto first = zipBegin(vector_of_strings, vector_of_ints);
auto second = first + 1;
// swap two values via a temporary
auto temp = std::move(*first);
*first = std::move(*second);
*second = std::move(temp);
// Result:
/*
two, 2
two, 2
three, 3
four, 4
*/
(test on Godbolt)
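The same effect can be reproduced without the ZipIterator, using plain tuples of references. This is only a sketch of the mechanism; the function name is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Tries to swap two strings "via a temporary", where every object involved is
// a tuple of references rather than a tuple of values.
inline std::pair<std::string, std::string> swap_through_reference_tuples() {
    std::string a = "one";
    std::string b = "two";
    auto first  = std::tie(a);      // std::tuple<std::string&>
    auto second = std::tie(b);

    auto temp = std::move(first);   // temp is just another reference to a
    first  = std::move(second);     // assigns through the references: a = b
    second = std::move(temp);       // temp refers to a, which now holds "two"
    return {a, b};
}

inline void reference_tuple_example() {
    auto result = swap_through_reference_tuples();
    assert(result.first == "two");
    assert(result.second == "two");  // the value "one" has been lost
}
```

Moving a tuple of references never saves the referenced values, so the "temporary" does not preserve anything across the first assignment.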
Result
Unfortunately, with the current standard it is not possible to create an iterator that produces elements on the fly and can be used as a ForwardIterator (for example this question).
You could of course write your own algorithms that only require InputIterators / OutputIterators (or handle your ZipIterator differently).
For example, a simple bubble sort that only moves elements through swap (note that it still uses iterator arithmetic such as begin+j): (Godbolt)
template<class It>
void bubble_sort(It begin, It end) {
using std::swap;
int n = std::distance(begin, end);
for (int i = 0; i < n-1; i++) {
for (int j = 0; j < n-i-1; j++) {
if (*(begin+j) > *(begin+j+1))
swap(*(begin+j), *(begin+j+1));
}
}
}
Or change the ZipIterator class to satisfy RandomAccessIterator.
I unfortunately can't think of a way that would be possible without putting the tuples into a dynamically allocated structure like an array (which you're probably trying to avoid)
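For completeness, here is a sketch of that allocating fallback: sort a permutation of indices with std::sort, then rearrange both containers accordingly. This is a different technique from the zip iterator, and sort_together is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Sorts `keys` ascending and applies the same permutation to `values`.
// Uses O(n) extra memory for the index permutation and the rearranged copies.
template<class Key, class Value>
void sort_together(std::vector<Key>& keys, std::vector<Value>& values) {
    assert(keys.size() == values.size());

    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&keys](std::size_t l, std::size_t r) { return keys[l] < keys[r]; });

    std::vector<Key> sorted_keys;
    std::vector<Value> sorted_values;
    sorted_keys.reserve(keys.size());
    sorted_values.reserve(values.size());
    for (std::size_t i : order) {
        sorted_keys.push_back(std::move(keys[i]));
        sorted_values.push_back(std::move(values[i]));
    }
    keys.swap(sorted_keys);
    values.swap(sorted_values);
}
```

std::sort only ever sees a std::vector<std::size_t>, whose iterators meet all the requirements discussed above.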

Nice syntax to get sized reference to vector's/array's data?

I'm wondering if there's any std:: function to get a sized pointer/reference to a vector/array's underlying data? Something better than:
const size_t(&asArray1)[N] = *(size_t(*)[N]) vec.data();
const size_t(&asArray2)[arr.size()] = *(size_t(*)[arr.size()]) arr.data();
Clarification - something I could pass to the below:
template<size_t N>
void foo(size_t(&sizedArray)[N]) {}
Update -- SOLUTION:
Use helper functions defined once, that do the appropriate casting and leave the call-site cleaner... See my answer below for helper code.
Live demo: https://onlinegdb.com/S167RI20U
What you're asking for is to decide a type (which must be done at compile time) with information only available at runtime. It is impossible.
C++20 std::span hides the ugly syntax in its own non-explicit constructors, so just pass the container.
Before C++20 there are std::begin, std::end and std::size (the last one since C++17), which are not exactly what you need but may help.
All of this can be implemented on an older compiler; it does not require any special compiler support.
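As a rough illustration of the pre-C++20 route, a function can accept any range through std::begin and std::end instead of demanding a sized array reference. The function sum_all is a hypothetical example, not something the standard library provides.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Works for std::vector, std::array and raw arrays alike; the size is taken
// from the range at run time rather than from a template parameter N.
template<class Range>
std::size_t sum_all(Range const& range) {
    std::size_t total = 0;
    for (auto it = std::begin(range); it != std::end(range); ++it)
        total += *it;
    return total;
}

inline void sum_all_example() {
    std::vector<std::size_t> vec{1, 2, 3};
    std::array<std::size_t, 2> arr{{4, 5}};
    std::size_t raw[] = {6, 7};
    assert(sum_all(vec) == 6);
    assert(sum_all(arr) == 9);
    assert(sum_all(raw) == 13);
}
```

This sidesteps the cast entirely, at the price of no longer having N available as a compile-time constant inside the function.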
SOLUTION: Use helper functions defined once, that do the appropriate casting and leave the call-site cleaner... I'm surprised that helpers like these aren't available at least for std::array...
For vectors, I will admit that it's a rare use case to have a compile-time-known size for the vector, but I do have a use case where I know my vector contains at least as many elements as an array, and I want the first N of them to be processed through some algorithm. I can guarantee that that many elements are available, and include asserts to boot, etc...
Live demo: https://onlinegdb.com/S167RI20U
template<size_t N, typename T>
using CArrayPtr = T(*)[N];
template<size_t N, typename T>
auto& cArray(array<T, N>& arr) {
return *(CArrayPtr<N, T>)arr.data();
}
template<size_t N, typename T>
auto& cArray(vector<T>& vec) {
return *(CArrayPtr<N, T>)vec.data();
}

Why isn't std::variant allowed to equal compare with one of its alternative types?

For example, it would be very helpful to compare a std::variant<T1, T2> for equality with a T1 or T2. So far we can only compare with the same variant type.
A variant may have multiple duplicates of the same type. E.g. std::variant<int, int>.
A given instance of std::variant compares equal to another if and only if they hold the same variant alternative and said alternatives' values compare equal.
Thus, a std::variant<int, int> with index() 0 compares not equal to a std::variant<int, int> with index() 1, despite the active variant alternatives being of the same type and same value.
Because of this, a generic "compare to T" was not implemented by the standard. However, you are free to design your own overload of the comparison operators using the other helper utilities in the <variant> header (e.g. std::holds_alternative and std::get<T>).
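The index rule described above can be observed directly. This is a small C++17 sketch; the alias and the helper names are made up for the example.

```cpp
#include <cassert>
#include <variant>

using two_ints = std::variant<int, int>;

// Construct the variant with an explicitly chosen alternative.
inline two_ints as_first(int v)  { return two_ints(std::in_place_index<0>, v); }
inline two_ints as_second(int v) { return two_ints(std::in_place_index<1>, v); }

inline void variant_index_example() {
    assert(as_first(1) == as_first(1));    // same index, same value
    assert(as_first(1) != as_second(1));   // same type and value, different index
    assert(as_second(1).index() == 1);
}
```

A comparison against a bare int could not say which of the two alternatives it is meant to match.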
I can't answer the why part of the question but since you think it would be useful to be able to compare a std::variant<T1, T2> with a T1 or T2, perhaps this can help:
template<typename T, class... Types>
inline bool operator==(const T& t, const std::variant<Types...>& v) {
const T* c = std::get_if<T>(&v);
return c && *c == t; // true if v contains a T that compares equal to t
}
template<typename T, class... Types>
inline bool operator==(const std::variant<Types...>& v, const T& t) {
return t == v;
}
It is an arbitrary decision by the standards committee.
Ok, not quite arbitrary. The point is you have a scale* of strictness of comparison, with points such as:
Most-strict: Only variants can equal each other, and they need to match both in the sequence-of-alternatives (i.e. the type), the actual alternative (the index, really, since you can have multiple identical-type alternatives) and in value.
Less-Strict: Equality of both the variant alternative, as a type and the value, but not of the sequence-of-alternatives, nor the index within that sequence (so the same value within two distinct alternatives of the same type would be equal).
Most-relaxed: Equality of the value in the active alternative, with implicit conversion of one of the elements if relevant.
These are all valid choices. The C++ committee made the decision based on all sorts of extrinsic criteria. Try looking up the std::variant proposal, as perhaps it says what these criteria are.
(*) - A lattice actually.

C++ Type-erasure of a function template using lambdas

I'm trying to type erase an object and ran into a bit of an issue that I'm hoping someone here may have expertise in.
I haven't had a problem type-erasing arbitrary non-templated functions; so far what I have been doing is creating a custom static
"virtual table"-esque collection of function pointers. This is all managed with non-capturing lambdas, since
they decay into free-function pointers:
template<typename Value, typename Key>
class VTable {
Value (*at_function_ptr)(const void*, const Key&) = nullptr;
// ...
template<typename T>
static void build_vtable( VTable* table ) {
// normalizes the function into a simple 'Value (*)(const void*, const Key&)' type
static const auto at_function = []( const void* p, const Key& key ) {
return static_cast<const T*>(p)->at(key);
};
// ...
table->at_function_ptr = +at_function;
}
// ...
};
(There are more helper functions/aliases that are omitted for brevity)
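To make the idiom concrete, here is a reduced, self-contained sketch with a single erased operation. AnyLookup and its members are hypothetical names; the real class presumably has more entries and different ownership rules. This version is a non-owning view, so the wrapped object must outlive it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Non-owning, type-erased view of anything that provides `at(key)`.
template<typename Value, typename Key>
class AnyLookup {
public:
    template<typename T>
    explicit AnyLookup(const T& object)
        : object_(&object),
          // The non-capturing lambda converts to a plain function pointer;
          // T is remembered only inside the generated function.
          at_([](const void* p, const Key& key) -> Value {
              return static_cast<const T*>(p)->at(key);
          }) {}

    Value at(const Key& key) const { return at_(object_, key); }

private:
    const void* object_;
    Value (*at_)(const void*, const Key&);
};

inline void any_lookup_example() {
    std::map<std::string, int> ages{{"ann", 31}, {"bob", 27}};
    AnyLookup<int, std::string> lookup(ages);
    assert(lookup.at("ann") == 31);
    assert(lookup.at("bob") == 27);
}
```

The explicit `-> Value` return type keeps the lambda's signature identical to the stored pointer type even when T::at returns a reference or a convertible type.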
Sadly this same approach does not work with a function template.
My desire is for the type-erased class to have something akin to the following:
template<typename U>
U convert( const void* ptr )
{
return cast<U>( static_cast<const T*>( ptr ) );
}
where:
cast is a free function,
U is the type being casted to,
T is the underlying type erased type being casted from, and
ptr is the type-erased pointer that follows the same idiom above for the type erasure.
[Edit: The issue above is that T isn't known from the function convert; the only function that knows of T's type in the example is build_vtable. This may just require a design change]
The reason this has become challenging is that there does not appear to be any simple way to type erase both types
independently. The classical/idiomatic type-erasure technique of a base-class doesn't work here, since you can't have
a virtual template function. I have experimented with a visitor-like pattern with little success for similar
reasons to the above.
Does anyone with experience in type-erasure have any suggestions or techniques that can be used to achieve what
I'm trying to do? Preferably in standards-conforming c++14 code.
Or is there perhaps a design change that might facilitate the same concept desired here?
I've been searching around for this answer for a little while now, and haven't had much luck. There are a few cases that are similar to what I'm trying to do, but often with enough differences that the solutions don't seem to apply to the same problem (Please let me know if I'm wrong!).
It appears most readings/blogs on these topics tend to cover the basic type-erasure technique, but not what I'm looking for here!
Thanks!
Note: please do not recommend Boost. I am in an environment where I am unable to use their libraries, and do not
wish to introduce that dependency to the codebase.
Each distinct convert<U> is a distinct type erasure.
You can type erase a list of such functions, storing the method of doing it in each case. So suppose you have Us..., type erase all of convert<Us>....
If Us... is short this is easy.
If it is long, this is a pain.
It is possible that the majority of these are null (as in, the operation is illegal), so you can implement a sparse vtable that takes this into account, so that your vtable isn't large and full of zeros. This can be done by type erasing a function (using the standard vtable technique) that returns a reference (or a type-erased accessor) to said sparse vtable, which maps from std::type_index to a U-placement-constructor converter (one that writes to a void* in its signature). You then run that function, extract the entry, create a buffer to store the U in, and call the U-placement-constructor converter, passing in that buffer.
This all occurs in your type_erased_convert<U> function (which itself is not type-erased) so end users don't have to care about the internal details.
You know, simple.
The restriction is that the list of possible convert-to types U that are supported needs to be located prior to the location of type erasure. Personally, I would restrict type_erased_convert<U> to only being called on the same list of types U, and accept that this list must be fundamentally short.
Or you could create some other conversion graph that lets you plug a type into it and determine how to reach another type possibly through some common intermediary.
Or you could use a scripting or bytecode language that includes a full compiler during the execution phase, permitting the type-erased method to be compiled against a new completely independent type when called.
// needs <functional>, <new>, <string>, <type_traits>, <typeindex>, <typeinfo>, <unordered_map>
// cast<U>() and the can_convert<U(T)> trait are assumed to be defined elsewhere.
using constructor = std::function< void(void const*, void*) >;
using ctor_map = std::function< constructor( std::type_index ) >;
template<class...Us>
struct type_list {};
using target_types = type_list<int, double, std::string>;
template<class T, class U>
constructor do_convert( std::false_type ) { return {}; }
template<class T, class U>
constructor do_convert( std::true_type ) {
return []( void const* tin, void* uout ) {
new(uout) U(cast<U>( static_cast<const T*>( tin ) ));
};
}
template<class T, class...Us>
ctor_map get_ctor_map(type_list<Us...>) {
std::unordered_map< std::type_index, constructor > retval;
using discard = int[];
// only store an entry for the target types T can actually convert to
(void)discard{0,(void(
can_convert<Us(T)>{}
? (void(retval[typeid(Us)] = do_convert<T,Us>( can_convert<Us(T)>{} )), 0)
: 0
),0)...};
return [retval]( std::type_index index ) -> constructor {
auto it = retval.find(index);
if (it == retval.end()) return {};
return it->second;
};
}
template<class T>
ctor_map get_ctor_map() {
return get_ctor_map<T>(target_types{});
}
You can replace the unordered_map with a compact stack-based one when it is small. Note that std::function in MSVC is limited to about 64 bytes or so?
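A reduced, compilable sketch of the sparse-table idea follows, under several assumptions: static_cast stands in for the free function cast, the converter returns the pointer produced by placement new (unlike the void-returning signature above), and all names (converter_table, add_converter, erased_value, convert_to) are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <new>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// Placement-constructs the target object in dst and returns a pointer to it.
using converter = std::function<void*(const void* src, void* dst)>;
using converter_table = std::unordered_map<std::type_index, converter>;

// Registers "T converts to U" in a table that belongs to source type T.
template<class T, class U>
void add_converter(converter_table& table) {
    table[std::type_index(typeid(U))] = [](const void* src, void* dst) -> void* {
        return ::new (dst) U(static_cast<U>(*static_cast<const T*>(src)));
    };
}

// What survives type erasure: the object pointer and its sparse table.
struct erased_value {
    const void* object;
    const converter_table* table;
};

// Not type-erased itself: looks up U, converts into a local buffer, moves out.
template<class U>
bool convert_to(const erased_value& from, U& out) {
    auto it = from.table->find(std::type_index(typeid(U)));
    if (it == from.table->end()) return false;   // conversion not registered
    alignas(U) unsigned char buffer[sizeof(U)];
    U* made = static_cast<U*>(it->second(from.object, buffer));
    out = std::move(*made);
    made->~U();
    return true;
}

inline void convert_example() {
    converter_table int_table;
    add_converter<int, double>(int_table);
    add_converter<int, long>(int_table);
    int value = 42;
    erased_value erased{&value, &int_table};
    double d = 0.0;
    assert(convert_to(erased, d) && d == 42.0);
    std::string s;
    assert(!convert_to(erased, s));   // int -> std::string was never registered
}
```

Unregistered target types simply have no entry, which is what keeps the table sparse.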
If you don't want a fixed list of source/dest types, we can decouple this.
Expose the typeindex of the type stored within the type erasure container, and the ability to get at the void const* that points at it.
Create a type trait that maps a type T to the list of types Us... it supports conversion-to. Use the above technique to store these conversion functions in a (global) map. (Note that this map can be placed in static storage, as you can deduce the size of the buffer required etc. But using a static unordered_map is easier).
Create a second type trait that maps a type U to a list of types Ts... it supports conversion-from.
In both cases, a function convert_construct( T const* src, tag_t<U>, void* dest ) is called to do the actual conversion.
You'd start with a set of universal targets type_list<int, std::string, whatever>. A particular type would augment it by having a new list.
For a type T building its sparse conversion table we would attempt each target type. If an overload of convert_construct fails to be found, the map would not be populated for that case. (Generating compile time errors for types added explicitly to work with T is an option).
On the other end, when we call the type_erased_convert_to<U>( from ), we look for a different table that maps the type U cross typeindex to a U(*)(void const* src) converter. Both the from-T map gotten from the type-erased T and the to-U gotten in the wrapping code are consulted to find a converter.
Now, this doesn't permit certain kinds of conversion. For example, a type T that converts-from anything with a .data() -> U* and .size() -> size_t method needs to explicitly list every type it converts-from.
The next step would be to admit a multi-step conversion. A multi-step conversion is where you teach your T to convert-to some (set of) famous types, and we teach U to convert-from a similar (set of) famous types. (The fame of these types is optional, I'll admit; all you need to know is how to create and destroy them, what storage you need, and a way to match up the T-to and U-from options, to use them as an intermediary.)
This may seem over engineered. But the ability to convert-to std::int64_t and convert-from that to any signed integral type is an example of this (and similarly for uint64_t and unsigned).
Or the ability to convert-to a dictionary of key-value pairs, and then examine this dictionary on the other side to determine if we can convert-from it.
As you go down this path, you'll want to examine loose typing systems in various scripting and bytecode languages to pick up how they did it.

How to implement own function to std::vector?

I would like to add a function that returns the .size() value as an integer, instead of unsigned integer.
Edit: In response to the comments, here is a more detailed explanation:
I have code:
int something = 3;
if(arr.size() > something)
This produces a compiler warning, and I dislike adding (int) in every place where I have this.
So I thought it would be nice to have a sizei() function:
int something = 3;
if(arr.sizei() > something)
This wouldn't produce a warning.
So I don't want to create a separate function, but a member function of std::vector itself.
Edit: Seems like the only way to do this is to create another function, such as:
template <typename T>
inline int sizei(const T &arr){
return (int)arr.size();
}
On the positive side: this doesn't seem to increase my executable size at all.
First of all, why would you want that? I don't see any reason or advantage.
Anyway, you can do this:
template<typename T>
int size(const std::vector<T> &v) { return (int) v.size(); }
//use
std::vector<int> ints;
//...
int count = size(ints);
Still I don't see any point in doing that. You can simply write:
int count = (int) ints.size();
But I would still say it's not better than the following:
size_t count = ints.size(); //don't prefer anything over this. Always use size_t
Advice: avoid using int for size. Prefer size_t.
As for the edit in your question. Why don't you use size_t as:
size_t something = 3;
if(arr.size() > something)
No warning. In my opinion, if you choose the data type consistently throughout your program, you won't come across a situation where you have to compare an int with size_t, which is defined as an unsigned integral type.
Or, if there is some legacy code you have to work with that uses int for sizes, then I think it's better to use an explicit cast where you need it, instead of adding a function to the framework itself that hides the potential problem:
int something = /*some legacy code API call or something */;
if(arr.size() > (size_t) something)
//or even better;
size_t something = (size_t) /*some legacy code API call or something */;
if(arr.size() > something)
As a rule, in C and C++ you should never use an unsigned type such as size_t to restrict the domain. That's because (1) these languages provide no range checking, and (2) they do provide unreasonable implicit promotions. No range checking means (1) no advantage, and unreasonable implicit promotions mean (2) very undesirable disadvantages, so it's a plain stupid thing to do: no advantage, very undesirable disadvantages.
However, the standard libraries for these languages do that. They do it for historical reasons only, caught irreversibly in early decisions which at one time made sense. This has both extremely silly consequences such as C99 requiring 17 (!) bits for ptrdiff_t, and it has the aforementioned extremely undesirable consequences such as using inordinately much time on hunting down bugs resulting from implicit promotions (etc.). For example, in C++ you are practically guaranteed that std::string( "bah!" ).length() < -5 – which can easily trip you up and anyway is as silly as it is possible to design.
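The string-length example can be spelled out with the conversion made explicit. Both helper names are hypothetical: converted_less performs the cast that the compiler applies silently in a mixed comparison, and checked_less shows one way to guard against it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// What `size < limit` effectively does when size is unsigned and limit is int:
// the int is converted to the unsigned type first.
inline bool converted_less(std::size_t size, int limit) {
    return size < static_cast<std::size_t>(limit);
}

// A comparison that handles a negative limit before converting.
inline bool checked_less(std::size_t size, int limit) {
    return limit >= 0 && size < static_cast<std::size_t>(limit);
}

inline void mixed_comparison_example() {
    const std::size_t length = std::string("bah!").length();   // 4
    assert(converted_less(length, -5));   // -5 became a huge unsigned value
    assert(!checked_less(length, -5));    // 4 is not less than -5
    assert(checked_less(length, 5));
}
```

The first assertion is the "4 < -5" surprise from the paragraph above, written without relying on the implicit conversion.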
Now, you can't infuse new member functions in std::vector, but you can add a freestanding function. A good name is countOf. Template it so that it can be applied to just about anything (raw arrays, vectors, etc.).
The triad of functions startOf, endOf and countOf were, as far as I know, first identified by Dietmar Kuehl. C++0x will have std::begin and std::end, but AFAIK no corresponding std::size. In the meantime you can just define this support, which allows you to treat any kinds of container plus raw arrays the same.
An example implementation & further discussion is provided at my blog.
EDIT Adding some code, because it's requested in the comments.
Detection of suitable iterator type:
template< typename Type >
struct It
{
typedef typename Type::iterator T;
};
template< typename Type >
struct It< Type const >
{
typedef typename Type::const_iterator T;
};
template< typename ElemType, Size N >
struct It< ElemType[N] >
{
typedef ElemType* T;
};
And the countOf, startOf and endOf functions, using that deduced iterator type:
template< typename T >
inline Size countOf( T const& c ) { return static_cast<Size>( c.size() ); }
template< typename T, Size N >
inline Size countOf( T (&)[N] ) { return N; }
template< typename T >
inline typename It<T>::T startOf( T& c ) { return c.begin(); }
template< typename T, Size N >
inline T* startOf( T (&a)[N] ) { return a; }
template< typename T >
inline typename It<T>::T endOf( T& c ) { return c.end(); }
template< typename T, Size N >
inline T* endOf( T (&a)[N] ) { return a + N; }
where Size is a typedef for ptrdiff_t.
Note: in 64-bit Windows int (and even long) is 32-bit. Hence, int is in general not sufficient for a really large array. ptrdiff_t is guaranteed to be able to represent the difference between any two pointers, when that difference is well-defined.
Cheers & hth.
You can derive from vector as follows:
template<typename T>
class my_vector : public std::vector<T>
{
public:
// missing constructors!
int size() const
{
if (std::vector<T>::size() > INT_MAX) // INT_MAX comes from <climits>
throw std::range_error("too many elements in vector"); // <stdexcept>
return (int) std::vector<T>::size();
}
};
The down-side is that you'll have to define and forward constructors yourself.
I would favor using an explicit cast to int instead of a function: static_cast<int> (v.size()). Even better would be to always use size_t when dealing with memory sizes. For example, favor for (size_t i=0; i < v.size(); ++i) over for (int i=0; i < (int) v.size(); ++i). Use the right type for the job. You should not be comparing std::vector sizes with a signed type.
See the following references for why you should prefer size_t to int:
Using size_t appropriately can improve the portability, efficiency, or readability of your code. Maybe even all three.
unsigned int vs. size_t
Is it good practice to use size_t in C++?
When should I use std::size_t?
Quick answer for .size() is: no. For vectors, the possibilities are its storage value and the alloc method (default new/delete, not normally overridden) along with methods that utilize InputIterator.
Most are going to ask why you would want a different size_t. If it's just the annoying warnings, you can cast or use unsigned integers to iterate/check against size(). (If it's a lot of code, you're going to have to find/replace)... If it is about handling empty conditions, you could wrap the vector in a class with some smarts. As an aside, since I don't know your problem at hand, a good place to look for ideas and already implemented features is the standard library's algorithms, such as sort, for_each, find, and lots more.
For std algorithms, see: http://www.sgi.com/tech/stl/table_of_contents.html
While @Nawaz, in my opinion, provided the most appropriate answer, it isn't really possible to add an additional member to std::vector<>. @zvrba provided the only way that could be accomplished, but as stated in the comments there, the std container types do not have virtual destructors and are therefore not meant to be derived from.
However, you could implement a new type of vector using a container adaptor, like this:
template <class T>
class my_vector
{
public:
int size_i() const
{
return static_cast<int>(container_.size());
}
private:
std::vector<T> container_;
};
The drawback here is that you have to explicitly expose the functions of the container that you actually need to support. If you are using 'std::vector' normally throughout your code, this would likely be a significant change. See 'std::queue' for an implementation example of a container adaptor.