C++ sort algorithm segmentation fault

Can someone explain why the sort below causes seg faults? Is this a known bug with g++ (sorting vector of pointers)? I am compiling with g++ 4.5.2.
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
typedef vector<int> A;
bool face_cmp(const A *x, const A *y) {
return x != y;
}
int main(int argc, char* argv[]) {
vector<A *> vec;
for (int i=0; i<100; i++) {
vec.push_back( new vector<int>(i%100, i*i) );
}
vector<A *>::iterator it;
sort(vec.begin(), vec.end(), face_cmp);
return EXIT_SUCCESS;
}
Compiling on codepad gives:
/usr/local/lib/gcc/i686-pc-linux-gnu/4.1.2/../../../../include/c++/4.1.2/debug/safe_iterator.h:240:
error: attempt to decrement a dereferenceable (start-of-sequence)
iterator.
Objects involved in the operation:
iterator "this" # 0x0xbf4b0844 {
type = N11__gnu_debug14_Safe_iteratorIN9__gnu_cxx17__normal_iteratorIPPN15__gnu_debug_def6vectorIiSaIiEEEN10__gnu_norm6vectorIS7_SaIS7_EEEEENS4_IS7_SB_EEEE (mutable iterator);
state = dereferenceable (start-of-sequence);
references sequence with type `N15__gnu_debug_def6vectorIPNS0_IiSaIiEEESaIS3_EEE' # 0x0xbf4b0844
}
Thank you for all the quick replies. The original comp function was:
if (x == y) return false;
if (x->size() < y->size()) return true;
else if (x->size() > y->size()) return false;
else {
for (register int i=0; i<x->size(); i++) {
if ((*x)[i] < (*y)[i]) return true;
}
return false;
}
I just changed the first line and removed the rest. But it turns out it also suffers from not being a strict weak ordering (I forgot the case if (*x)[i] > (*y)[i]). I should probably have posted the entire function to begin with. Nevertheless, thanks again!!

The comparison function must define a strict weak ordering, which means (among other things) that a < b and b < a cannot both be true. Your comparison function does not have this property.
It does not define any "before-after" relationship, so it's no wonder that the algorithm relying on this property does not function properly.
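As an illustration, here is a minimal sketch of how the asker's original size-then-elements comparison could be completed so that it does define a "before-after" relationship. It is one possible reading of the intent, not the only valid ordering; the first differing element decides, which covers the forgotten "greater than" case.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<int> A;

// Orders by size first, then element by element.
// Returns true only when *x is strictly before *y.
bool face_cmp(const A* x, const A* y) {
    if (x == y) return false;  // irreflexive: an element is never before itself
    if (x->size() != y->size()) return x->size() < y->size();
    for (std::size_t i = 0; i < x->size(); ++i) {
        // The first differing element decides, in either direction.
        if ((*x)[i] != (*y)[i]) return (*x)[i] < (*y)[i];
    }
    return false;  // equal contents: neither is before the other
}
```

With this shape, face_cmp(a, b) and face_cmp(b, a) can never both be true.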

The third argument of std::sort should be a function (or function object) such that if compare(a, b) is true then compare(b, a) is false, but yours is not. So your program has undefined behavior and can give any result.
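The property can be checked directly for a given pair of values. The sketch below uses a hypothetical helper, violates_asymmetry (not part of the standard library), together with the `!=` comparator from the question, renamed broken_cmp here.

```cpp
#include <vector>

typedef std::vector<int> A;

// The comparator from the question: "different" is not an ordering.
bool broken_cmp(const A* x, const A* y) { return x != y; }

// Hypothetical helper: true when cmp claims both "a before b" and "b before a".
template <typename Cmp, typename T>
bool violates_asymmetry(Cmp cmp, const T& a, const T& b) {
    return cmp(a, b) && cmp(b, a);
}
```

For any two distinct pointers, violates_asymmetry(broken_cmp, p, q) returns true, which is exactly the situation std::sort is not required to cope with.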

No, your code is wrong. Comparison functions for std::sort must use < or its equivalent; using != is not correct. You probably want this:
bool face_cmp(const A *x, const A *y) {
return *x < *y;
}

Ensure that you're using only strictly greater than or strictly less than. DO NOT USE a comparison that includes equality (<= or >=). It will SEGFAULT with certain data sets:
// Good
bool face_cmp(const A *x, const A *y) {
return *x < *y;
}
// Also okay for reverse sorting
bool face_cmp(const A *x, const A *y) {
return *x > *y;
}
// This can SEGFAULT (depending on the data and the platform)
bool face_cmp(const A *x, const A *y) {
return *x <= *y;
}
The real danger with <= is the lack of repeatability. I had some C++ code that SEGFAULT'ed on Android, while happily running on my x86 PC. For me, the magic number was 68 elements, 67 was fine, 68 SEGFAULT'ed.
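The underlying rule is that a comparator must be irreflexive: comparing a value with itself has to return false. A minimal sketch of that check follows, using a hypothetical helper name. How a violation turns into a crash depends on the implementation; a plausible mechanism is an inner loop that relies on the comparator eventually returning false and so runs past the end of the range.

```cpp
#include <functional>

// Hypothetical helper: a strict weak ordering must never report x before x.
template <typename Cmp, typename T>
bool is_irreflexive_for(Cmp cmp, const T& x) {
    return !cmp(x, x);
}
```

std::less<int> passes this check for any value, while std::less_equal<int> fails it, which is why `<=` is not a valid comparator for std::sort.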

Define your comparison function as
bool face_cmp(const A *x, const A *y) {
return x < y;
}
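One caveat when ordering by address: the built-in < gives an unspecified result for pointers into unrelated objects, whereas std::less is specified to provide a strict total order over pointers. A hedged sketch, with by_address and sort_by_address as illustrative names:

```cpp
#include <algorithm>
#include <functional>
#include <vector>

typedef std::vector<int> A;

// Orders by address only, not by the contents of the vectors.
// std::less gives a total order even for pointers to unrelated objects.
bool by_address(const A* x, const A* y) {
    return std::less<const A*>()(x, y);
}

// Hypothetical helper: sort a vector of pointers by address.
void sort_by_address(std::vector<A*>& vec) {
    std::sort(vec.begin(), vec.end(), by_address);
}
```

This makes the sort well defined, but the resulting order depends on where the allocator placed each vector, so it is only useful when any consistent order will do.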

Related

Accessing std::map with custom struct as key type causes strange behaviour

I am trying to access a std::map with a simple custom key. Most of the time this works, but every once in a while, depending on the values given, it fails to access the mapped value.
Here is a test program that shows the issue in more detail:
#include <map>
#include <cstdint>
#include <cassert>
struct key_type
{
uint32_t a;
uint32_t b;
bool operator<(const key_type& value) const
{
if (value.a < a)
return true;
if (value.b < b)
return true;
return false;
}
key_type(uint32_t a, uint32_t b) : a(a), b(b)
{}
};
std::map<key_type, int*> test;
int get_int(uint32_t a, uint32_t b)
{
if (test.count(key_type(a, b)) == 0)
{
int* r = new int;
assert(r != nullptr);
key_type key = key_type(a, b);
test[key] = r;
assert(test[key] != nullptr);
}
return *test[key_type(a,b)];
}
Now I try to call get_int with two different sets of arguments. The first case works as expected.
int main(int argc, char* argv[])
{
get_int(2, 4);
get_int(3, 4);
get_int(4, 5);
get_int(2, 1);
get_int(120, 1);
return 0;
}
Now if I change the set of values a bit, everything explodes.
int main(int argc, char* argv[])
{
get_int(2, 4);
get_int(3, 4);
get_int(4, 5);
get_int(120, 1);
return 0;
}
The "assert(test[key] != nullptr);" fails. I can circumvent the actual problem, but I would like to know what happens under the surface to cause this behaviour.
Your comparison operator does not make much sense. The complement to
(value.a < a)
includes also cases where value.a > a.
You can fix this by making the entire body of the comparison operator:
return std::make_pair(a, b) < std::make_pair(value.a, value.b);
Even better would be to use std::tie:
return std::tie(a, b) < std::tie(value.a, value.b);
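Put together, a key type built on std::tie might look like the sketch below. The map holds plain int values here to keep the example short, and count_distinct_keys is a hypothetical helper that inserts the key values from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

struct key_type {
    std::uint32_t a;
    std::uint32_t b;
    key_type(std::uint32_t a_, std::uint32_t b_) : a(a_), b(b_) {}
    // Lexicographic: compare a first, and b only when the a values are equal.
    bool operator<(const key_type& other) const {
        return std::tie(a, b) < std::tie(other.a, other.b);
    }
};

// Hypothetical helper: inserts the keys from the failing test case
// (one of them twice) and reports how many distinct keys the map holds.
std::size_t count_distinct_keys() {
    std::map<key_type, int> m;
    m[key_type(2, 4)] = 0;
    m[key_type(3, 4)] = 1;
    m[key_type(4, 5)] = 2;
    m[key_type(120, 1)] = 3;
    m[key_type(2, 4)] = 4;  // same key again: overwrites, does not add
    return m.size();
}
```

std::tie builds tuples of references, so no copies of a and b are made for the comparison.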
Your operator< does not impose a Strict Weak Ordering™. Therefore, your attempt to use the map is undefined behaviour.
Basically, the operator doesn't actually produce a single ordering that orders all values of that type.
Consider:
bool operator<(const key_type& value) const
{
if (value.a != a)
return value.a < a;
if (value.b != b)
return value.b < b;
return false;
}
Your ordering is loosely weak. Read this article from Wikipedia and this one from Wolfram.
I hope you understand the importance of these articles, but regardless, look at the following case. According to your algorithm:
(3,2) < (4,1) returns true
and
(4,1) < (3,2) returns true
std::map requires a strict weak ordering, and the above will cause undefined behaviour.
To fix it, you must do the following:
if (a < value.a) return true;
if (a > value.a) return false;
if (b < value.b) return true;
return false;
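The counterexample can be reproduced in a few lines. In this sketch the question's operator is rewritten as a free function, broken_less, with lhs standing for the object the operator is called on and rhs for its argument; fixed_less is an illustrative name for the corrected comparison.

```cpp
#include <cstdint>

struct key_type {
    std::uint32_t a;
    std::uint32_t b;
};

// The logic of the operator from the question.
bool broken_less(const key_type& lhs, const key_type& rhs) {
    if (rhs.a < lhs.a) return true;
    if (rhs.b < lhs.b) return true;
    return false;
}

// Corrected logic: b is consulted only when the a values are equal.
bool fixed_less(const key_type& lhs, const key_type& rhs) {
    if (lhs.a != rhs.a) return lhs.a < rhs.a;
    return lhs.b < rhs.b;
}
```

For p = (3,2) and q = (4,1), broken_less returns true in both directions, while fixed_less returns true only for (p, q).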

Alternative array representation

I'm facing a problem in C++ for which I currently don't have an elegant solution. I'm receiving data in the following format:
typedef struct {
int x;
int y;
int z;
}Data3D;
vector<Data3D> v; // the way data is received (can be modified)
But the functions that do the computations receive parameters like this:
Compute(int *x, int *y, int *z, unsigned nPoints)
{...}
Is there a way to modify how the Data3D data is received so that the memory representation changes from:
XYZXYZXYZ
to
XXXYYYZZZ
What I'm looking for is some way of populating a data structure in a similar way we populate an array but that has the representation above (XXXYYYZZZ). Any custom data structures are welcome.
So I want to write something like (in the above example):
v[0].x = 1
v[0].y = 2
v[0].z = 0
v[1].x = 6
v[1].y = 7
v[1].z = 5
and to have the memory representation below
1,6...2,7....0,5
1,6 is the beginning of the x array
2,7 is the beginning of the y array
0,5 is the beginning of the z array
I know that this can be solved by using a temporary array but I'm interested to know if there are other methods for doing this.
Thanks,
Iulian
LATER EDIT:
Since there are some solutions that change only the declaration of Compute function without changing its code - this should be taken into account also. See the answers related to the solution that involves using an iterator.
Iterator-based solution
An elegant solution would be to make Compute() accept iterators instead of pointers. The iterators you provide will have an adequate ++ operator (see boost::iterator for an easy way to build them)
Compute(MyIterator x, MyIterator y, MyIterator z);
There are normally very few changes to make to the function body, since *x, x[i] or ++x will be handled by MyIterator to point to the right memory location.
Quick'n Dirty solution
A less elegant but more straightforward solution is to hold your Data in the following struct
typedef struct {
std::vector<int> x;
std::vector<int> y;
std::vector<int> z;
}DataArray3D;
When receiving the data fill your struct like
void Receive(const Data3D& data, DataArray3D& array)
{
array.x.push_back(data.x);
array.y.push_back(data.y);
array.z.push_back(data.z);
}
and call Compute like this (Compute itself is unchanged)
Compute(&array.x[0], &array.y[0], &array.z[0], array.x.size());
You could of course change your Compute function.
I assume that all operations done on your int* in Compute are dereference and increment operations.
I did not test it but you could pass in a structure like this
struct IntIterator
{
int* m_currentPos;
IntIterator(int* startPos):m_currentPos(startPos){};
IntIterator& operator++()
{
m_currentPos += 3;
return *this;
}
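IntIterator operator++(int)
{
// postfix: return a copy of the old position, as x++ does for a pointer
IntIterator old(*this);
m_currentPos += 3;
return old;
}
// returns a reference so that *it = value also works
int& operator*()
{
return *m_currentPos;
}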
int& operator[](const int index)
{
return m_currentPos[index*3];
}
};
And initialize it with this
std::vector<Data3D> v;
IntIterator it(&v[0].x);
Now all you need to do is change the type of your Compute function's arguments, and it should work. Of course, if other pointer arithmetic is used, it gets more complex.
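A variation on the same idea, sketched under stated assumptions: FieldIterator is a hypothetical class that steps by whole Data3D structs through a pointer to member, so it does not depend on the struct being exactly three ints wide. The Compute shown here is only a stand-in that sums all coordinates, since the real function body is not given.

```cpp
#include <vector>

struct Data3D { int x; int y; int z; };

// Walks one field (x, y or z) across an array of Data3D.
class FieldIterator {
public:
    FieldIterator(Data3D* p, int Data3D::*field) : p_(p), field_(field) {}
    int& operator*() const { return p_->*field_; }
    int& operator[](unsigned i) const { return p_[i].*field_; }
    FieldIterator& operator++() { ++p_; return *this; }
private:
    Data3D* p_;
    int Data3D::*field_;
};

// Stand-in for the real Compute: as a template it accepts either
// plain int* arguments or FieldIterator arguments unchanged.
template <typename It>
long Compute(It x, It y, It z, unsigned nPoints) {
    long total = 0;
    for (unsigned i = 0; i < nPoints; ++i) total += x[i] + y[i] + z[i];
    return total;
}
```

A caller would build the three iterators as FieldIterator(&v[0], &Data3D::x) and so on, and pass them where the int* arguments used to go.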
Reasonably elegant would be (not compiled/tested):
struct TempReprPoints
{
TempReprPoints(size_t size)
{
x.reserve(size); y.reserve(size); z.reserve(size);
}
TempReprPoints(const vector<Data3D> &v)
{
x.reserve(v.size()); y.reserve(v.size()); z.reserve(v.size());
for (size_t i = 0; i < v.size(); ++i ) push_back(v[i]);
}
void push_back(const Data3D& data)
{
x.push_back(data.x); y.push_back(data.y); z.push_back(data.z);
}
int* getX() { return &x[0]; }
int* getY() { return &y[0]; }
int* getZ() { return &z[0]; }
size_t size() { return x.size(); }
std::vector<int> x;
std::vector<int> y;
std::vector<int> z;
};
So you can fill it with a loop or even try to make the std::back_inserter work with it.
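For std::back_inserter to accept such a struct, the struct needs a push_back member and the nested type names the inserter looks up. The sketch below uses a cut-down hypothetical struct, SoAPoints, and a hypothetical helper, to_soa; the const_reference typedef is included on the assumption that some pre-C++11 library modes look for it instead of value_type.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

struct Data3D { int x; int y; int z; };

struct SoAPoints {
    typedef Data3D value_type;              // looked up by std::back_insert_iterator
    typedef const Data3D& const_reference;  // assumed to be needed by older library modes
    void push_back(const Data3D& d) {
        x.push_back(d.x); y.push_back(d.y); z.push_back(d.z);
    }
    std::size_t size() const { return x.size(); }
    std::vector<int> x, y, z;
};

// Hypothetical helper: converts XYZXYZ... input into three separate arrays.
SoAPoints to_soa(const std::vector<Data3D>& v) {
    SoAPoints out;
    std::copy(v.begin(), v.end(), std::back_inserter(out));
    return out;
}
```

After the copy, &out.x[0], &out.y[0] and &out.z[0] point at contiguous XXX, YYY and ZZZ arrays, provided the container is not empty.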
In order to get the syntax you want, you could use something like this.
struct Foo {
vector<int> x;
vector<int> y;
vector<int> z;
struct FooAccessor {
FooAccessor(Foo & f, int i) : x(f.x[i]), y(f.y[i]), z(f.z[i]) {}
int &x, &y, &z;
};
FooAccessor operator[](int i) {
return FooAccessor(*this, i);
}
};
int main() {
Foo f;
f.x.resize(10);
f.y.resize(10);
f.z.resize(10);
f[0].x = 1;
f[1].y = 2;
f[2].z = 3;
for (size_t p = 0; p < 10; ++p) {
cout << f.x[p] << "," << f.y[p] << "," << f.z[p] << endl;
}
}
I'd consider this an ugly solution - changing the way you access your data would likely be "better".

Comparison function in std::sort()

class node{
public:
unsigned long long int value;
int index;
};
bool comp2(node& a,node& b){
if(a.value < b.value) { return true; }
return false;
}
vector <node*>llist,rlist;
sort(llist.begin(),llist.end(),comp2);
The code above gave me a weird error, which was also reported at other lines (places later in the code), but when I changed the comp2 function to the following, all the errors disappeared.
bool comp2(node* a,node* b){
assert(a && b && "comp2 - null argument");
if(a->value < b->value){ return true; }
return false;
}
Any rationale on this ?
ERROR:/usr/include/c++/4.4/bits/stl_algo.h|124|error: invalid initialization of reference of type ‘node&’ from expression of type ‘node* const’|
If this (below) works, then the above should also work:
using namespace std;
void rep(int& a,int& b){
int c;
c=a;
a=b;
b=c;
}
int main(void){
int a=3,b=4;
rep(a,b);
cout<<a<<" "<<b;
return 0;
}
You have defined a std::vector of node *. Therefore, all the elements are node *, and all operations that the vector performs will be on node *. You can't give sort a comparison function of a different type.
The rationale is that your vector contains values of type node*, so the comparison function needs to compare values of type node*.
What you probably meant to say in the first place was vector<node>. If you wanted pointers to nodes, then the second comparison function is reasonable.
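Both variants side by side, as a minimal sketch. The function names are illustrative; the point is that the comparator's parameter type has to match the vector's element type.

```cpp
#include <algorithm>
#include <vector>

struct node {
    unsigned long long value;
    int index;
};

// For vector<node>: the elements are node objects, so take references.
bool by_value(const node& a, const node& b) { return a.value < b.value; }

// For vector<node*>: the elements are pointers, so take pointers.
bool by_pointee_value(const node* a, const node* b) { return a->value < b->value; }

void sort_nodes(std::vector<node>& v) {
    std::sort(v.begin(), v.end(), by_value);
}

void sort_node_ptrs(std::vector<node*>& v) {
    std::sort(v.begin(), v.end(), by_pointee_value);
}
```

Passing by_value to a sort over vector<node*> would reproduce the "invalid initialization of reference" error, because a node* cannot bind to a node reference.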

C++ Operator overloading - 'recreating the Vector'

I am currently in a college second-level programming course... We are working on operator overloading... to do this we are to rebuild the vector class...
I was building the class and found that most of it is based on the [] operator. When I was trying to implement the + operator I ran into a weird error that my professor has not seen before (apparently since the class switched IDEs from MinGW to VS Express...) (I am using Visual Studio Express 2008 C++ edition...)
Vector.h
#include <string>
#include <iostream>
using namespace std;
#ifndef _VECTOR_H
#define _VECTOR_H
const int DEFAULT_VECTOR_SIZE = 5;
class Vector
{
private:
int * data;
int size;
int comp;
public:
inline Vector (int Comp = 5,int Size = 0)
: comp(Comp), size(Size) { if (comp > 0) { data = new int [comp]; }
else { data = new int [DEFAULT_VECTOR_SIZE];
comp = DEFAULT_VECTOR_SIZE; }
}
int size_ () const { return size; }
int comp_ () const { return comp; }
bool push_back (int);
bool push_front (int);
void expand ();
void expand (int);
void clear ();
const string at (int);
int& operator[ ](int);
int& operator[ ](int) const;
Vector& operator+ (Vector&);
Vector& operator- (const Vector&);
bool operator== (const Vector&);
bool operator!= (const Vector&);
~Vector() { delete [] data; }
};
ostream& operator<< (ostream&, const Vector&);
#endif
Vector.cpp
#include <iostream>
#include <string>
#include "Vector.h"
using namespace std;
const string Vector::at(int i) {
this[i];
}
void Vector::expand() {
expand(size);
}
void Vector::expand(int n ) {
int * newdata = new int [comp * 2];
if (*data != NULL) {
for (int i = 0; i <= (comp); i++) {
newdata[i] = data[i];
}
newdata -= comp;
comp += n;
data = newdata;
delete newdata;
}
else if ( *data == NULL || comp == 0) {
data = new int [DEFAULT_VECTOR_SIZE];
comp = DEFAULT_VECTOR_SIZE;
size = 0;
}
}
bool Vector::push_back(int n) {
if (comp = 0) { expand(); }
for (int k = 0; k != 2; k++) {
if ( size != comp ){
data[size] = n;
size++;
return true;
}
else {
expand();
}
}
return false;
}
void Vector::clear() {
delete [] data;
comp = 0;
size = 0;
}
int& Vector::operator[] (int place) { return (data[place]); }
int& Vector::operator[] (int place) const { return (data[place]); }
Vector& Vector::operator+ (Vector& n) {
int temp_int = 0;
if (size > n.size_() || size == n.size_()) { temp_int = size; }
else if (size < n.size_()) { temp_int = n.size_(); }
Vector newone(temp_int);
int temp_2_int = 0;
for ( int j = 0; j <= temp_int &&
j <= n.size_() &&
j <= size;
j++) {
temp_2_int = n[j] + data[j];
newone[j] = temp_2_int;
}
////////////////////////////////////////////////////////////
return newone;
////////////////////////////////////////////////////////////
}
ostream& operator<< (ostream& out, const Vector& n) {
for (int i = 0; i <= n.size_(); i++) {
////////////////////////////////////////////////////////////
out << n[i] << " ";
////////////////////////////////////////////////////////////
}
return out;
}
Errors:
out << n[i] << " "; error C2678:
binary '[' : no operator found which
takes a left-hand operand of type
'const Vector' (or there is no
acceptable conversion)
return newone;
error C2106: '=' : left
operand must be l-value
As stated above, I am a student going into Computer Science as my selected major. I would appreciate tips, pointers, and better ways to do stuff :D
This:
int operator[ ](int);
is a non-const member function. It means that it cannot be called on a const Vector.
Usually, the subscript operator is implemented such that it returns a reference (if you return a value, like you are doing, you can't use it as an lvalue, e.g. you can't do newone[j] = temp_2_int; like you have in your code):
int& operator[](int);
In order to be able to call it on a const object, you should also provide a const version of the member function:
const int& operator[](int) const;
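A minimal sketch of the two overloads working together. IntBuffer is a hypothetical stand-in for the question's Vector, reduced to what the example needs, and sum is an illustrative function that only ever sees a const object.

```cpp
#include <cstddef>

class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~IntBuffer() { delete[] data_; }
    IntBuffer(const IntBuffer&) = delete;             // copying left out to keep the sketch short
    IntBuffer& operator=(const IntBuffer&) = delete;

    int& operator[](std::size_t i) { return data_[i]; }              // read and write
    const int& operator[](std::size_t i) const { return data_[i]; }  // read only
    std::size_t size() const { return size_; }
private:
    int* data_;
    std::size_t size_;
};

// Takes a const reference, so only the const overload can be called here.
int sum(const IntBuffer& b) {
    int total = 0;
    for (std::size_t i = 0; i < b.size(); ++i) total += b[i];
    return total;
}
```

The compiler picks the overload from the constness of the object: b[i] = 1 on a non-const IntBuffer uses the first, and b[i] inside sum uses the second.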
Since you ask for "tips, pointers, and better ways to do stuff:"
You cannot name your include guard _VECTOR_H. Names beginning with an underscore followed by a capital letter are reserved for the implementation. There are a lot of rules about underscores.
You should never use using namespace std in a header.
Your operator+ should take a const Vector& since it is not going to modify its argument.
Your at should return an int and should match the semantics of the C++ standard library containers (i.e., it should throw an exception if i is out of bounds). You need to use (*this)[i] to call your overloaded operator[].
You need to learn what the * operator does. In several places you've confused pointers and the objects to which they point.
Watch out for confusing = with == (e.g. in if (comp = 0)). The compiler will warn you about this. Don't ignore warnings.
Your logic will be much simpler if you guarantee that data is never NULL.
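To make the at() tip concrete, here is a hedged sketch of the bounds check, written as a free function over a raw array so that it stands alone. checked_at is a hypothetical name; inside the class, the same test would guard a call to (*this)[i].

```cpp
#include <stdexcept>

// Returns a reference to data[i], or throws std::out_of_range when i is
// not a valid index, mirroring what standard containers do in at().
int& checked_at(int* data, int size, int i) {
    if (i < 0 || i >= size) {
        throw std::out_of_range("at: index out of range");
    }
    return data[i];
}
```

Note that the valid indices run from 0 to size - 1, so loops written with i <= size step one element too far.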
Can't fit this into a comment on Neil's answer, so I'm gonna have to go into more detail here.
Regarding your expand() function. It looks like this function's job is to expand the internal storage, which has comp elements, by n elements, while maintaining the size of the Vector. So let's walk through what you have.
void Vector::expand(int n) {
int * newdata = new int [comp * 2];
Okay, you just created a new array that is twice as big as the old one. Error: Why doesn't the new size have anything to do with n?
if (*data != NULL) {
Error: *data is the first int element in your array. It's not a pointer. Why is it being compared to NULL?
Concept Error: Even if you said if (data != NULL), which could be a test to see if there is an array at all, at what point in time is data ever set to NULL? new [] doesn't return NULL if it's out of memory; it throws an exception.
for (int i = 0; i <= (comp); i++) {
newdata[i] = data[i];
}
Warning: You're copying the whole array, but only the first size elements are valid. The loop could just run up to size and you'd be fine.
newdata -= comp;
Error: Bad pointer math. newdata is set to a pointer to who knows where (comp ints back from the start of newdata?!), and almost certainly a pointer that will corrupt memory if given to delete [].
comp += n;
This is fine, for what it is.
data = newdata;
delete newdata;
}
Error: You stored a pointer and then immediately deleted its memory, making it an invalid pointer.
else if ( *data == NULL || comp == 0) {
data = new int [DEFAULT_VECTOR_SIZE];
comp = DEFAULT_VECTOR_SIZE;
size = 0;
}
}
Error: This should be in your constructor, not here. Again, nothing ever sets data to NULL, and *data is an int, not a pointer.
What this function should do:
create a new array of comp + n elements
copy size elements from the old array to the new one
delete the old array
set data to point to the new array
Good luck.
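The four steps can be sketched as follows. This is a free-function version that takes the question's data, size and comp members as parameters so that it can stand alone; it assumes n is positive and data points to an array of comp ints allocated with new[].

```cpp
#include <algorithm>

// Grows the storage by n elements while keeping the first size values.
void expand(int*& data, int size, int& comp, int n) {
    int* newdata = new int[comp + n];       // 1. new array of comp + n elements
    std::copy(data, data + size, newdata);  // 2. copy the size valid elements
    delete[] data;                          // 3. delete the old array
    data = newdata;                         // 4. point data at the new array
    comp += n;
}
```

The order matters: the old array is released only after its contents have been copied, and data is reassigned only after the old array has been released.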
Besides what others have already written about your operator[]():
Your operator+() takes the right-hand side per non-const reference - as if it would attempt to change it. However, with A+B everyone would expect B to remain unchanged.
Further, I would implement all binary operators that treat their operands equally (i.e., that change neither of them) as non-member functions. As member functions, the left-hand side (this) could be treated differently. (For example, it could be subject to overridden versions in derived classes.)
Then, IME it's always good to base operator+() on operator+=(). operator+=() does not treat its operands equally (it changes its left one), so it's best done as a member function. Once this is done, implementing operator+() on top of it is a piece of cake.
Finally, operator+() should never, never ever return a reference to an object. When you say A+B you expect this to return a new object, not to change some existing object and return a reference to that.
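A minimal sketch of that arrangement. Vec is a hypothetical stand-in that stores its elements in a std::vector<int> so that copying works without extra code, and the choice to treat missing elements of the shorter operand as 0 is an assumption, since the question does not say how operands of unequal length should add.

```cpp
#include <cstddef>
#include <vector>

class Vec {
public:
    explicit Vec(std::size_t n = 0) : data_(n) {}
    std::size_t size() const { return data_.size(); }
    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }

    // Member function: changes the left operand and returns a reference to it.
    Vec& operator+=(const Vec& rhs) {
        if (rhs.size() > size()) data_.resize(rhs.size());  // new elements start at 0
        for (std::size_t i = 0; i < rhs.size(); ++i) data_[i] += rhs[i];
        return *this;
    }
private:
    std::vector<int> data_;
};

// Non-member function: works on a copy of the left operand and returns
// a new object by value, leaving both arguments unchanged.
Vec operator+(Vec lhs, const Vec& rhs) {
    lhs += rhs;
    return lhs;
}
```

Taking lhs by value makes the copy at the call site, so a + b produces a fresh object and neither a nor b is modified.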
There are so many errors in your code that it is hard to know where to start. Here's one:
delete [] data;
*data = *newdata;
You delete a pointer and then immediately dereference it.
And:
const string Vector::at(int i) {
this[i];
}
This is (I think) a vector of ints. Why is this returning a string? And applying the [] operator to this does not call your operator[] overload; it treats this as an array, which it isn't.
You need to provide two versions of your operator[]. For accessing:
T operator[](std::size_t idx)const;
For writing to the element:
T& operator[](std::size_t idx);
In both of the above, replace T with the type of the elements. The reason you have this problem is that only functions that are marked "const" may be invoked on an object declared to be "const". Marking all non-mutating functions as "const" is definitely something you should do and is called "const-correctness". Since returning a reference to an element (necessary for writing) allows the underlying object to be mutated, that version of the function cannot be made "const". Therefore, a read-only "const" overload is needed.
You may also be interested in reading:
Const Correctness from the C++ FAQ Lite
Const Correctness in C++
int Vector::operator[] (int place) { return (data[place]); }
This should be
int Vector::operator[] (int place) const { return (data[place]); }
so that you will be able to do the [] operation on const vectors. The const after the function declaration means that the class instance (this) is treated as const Vector, meaning you won't be able to modify normal attributes. Or in other words: A method that only has read access to attributes.