Comparing strings in qsort - c++

Whenever I compare strings in qsort, the order is completely wrong. For example, the input is
45 4 9 22 2
but my output is
22 45 4 9 2
Here is my comparison function:
int cmpString(const void *a, const void *b) {
const Node *a1 = *(const Node **)a;
const Node *b1 = *(const Node **)b;
return a1->s.c_str() - b1->s.c_str();
}
And don't tell me to use sort(); I can't for this assignment.

This line is the main problem in your code:
return a1->s.c_str() - b1->s.c_str();
Here you are subtracting two pointers. The result depends on where the two character buffers happen to live in memory, not on what the strings contain, whereas a comparator is supposed to order the elements by their contents.
Instead, try this:
int length1 = a1->s.size();
int length2 = b1->s.size();
for (int i = 0; i < std::min(length1, length2); i++) { // std::min is declared in <algorithm>
    if (a1->s[i] != b1->s[i]) { // characters differ: return the difference of their character codes
        return a1->s[i] - b1->s[i];
    }
}
return length1 - length2; // equal so far, so the shorter string should appear first
Suggestion:
If you are coding in C++, then please use the standard library where you can. <algorithm> provides std::sort(), which does the same job without going through void *.
Update:
As rightly suggested by user4581301, you may use std::string::compare directly.
Like this:
return (a1->s).compare(b1->s);
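Putting the pieces together, a minimal sketch of the comparator could look like the following. The Node layout is an assumption (the question only shows that it has a std::string member named s), and cmpNumeric is a hypothetical extra for the case where the strings are meant to be ordered by numeric value rather than alphabetically.

```cpp
#include <cstdlib>
#include <string>

// Assumed layout: the question only shows a std::string member named s.
struct Node {
    std::string s;
};

// Lexicographic order by content: "2" < "22" < "4" < "45" < "9".
int cmpString(const void *a, const void *b)
{
    const Node *a1 = *(const Node *const *)a;
    const Node *b1 = *(const Node *const *)b;
    return a1->s.compare(b1->s);
}

// Hypothetical alternative: order by numeric value, 2 < 4 < 9 < 22 < 45.
int cmpNumeric(const void *a, const void *b)
{
    const Node *a1 = *(const Node *const *)a;
    const Node *b1 = *(const Node *const *)b;
    long x = std::strtol(a1->s.c_str(), nullptr, 10);
    long y = std::strtol(b1->s.c_str(), nullptr, 10);
    return (x > y) - (x < y);   // -1, 0 or 1, with no risk of overflow
}
```

With an array declared as Node *arr[n], the call would be qsort(arr, n, sizeof(Node *), cmpString). Sorting an array of pointers this way is fine; only the pointers are moved, never the std::string objects themselves.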

Related

Check isogram c++ using unordered set

I am using unordered_set in C++ to check isogram words.
struct CustomHasher {
size_t operator()(const char& c) const;
};
// This hashing function should take the given character c and return an integer
// representing the hash value. This will be computed by the position of a-z,
// where a=>0, b=>1, and so on.
size_t CustomHasher::operator()(const char& c) const {
size_t i = tolower(c) - 'a';
return i;
}
void add_multiset(const string& s,
unordered_multiset<char, CustomHasher>* ms) {
for (int i = 0; i < s.length(); i++)
ms->insert(tolower(s[i]));
}
// inside main function
unordered_multiset<char, CustomHasher> ms;
add_multiset("hello", &ms);
What is wrong with my code? When I check the output of ms.bucket('l') I expect to get 11, but instead I get 7.
Likewise, ms.bucket('o') gives 6 where I expect 14.
The bucket number is not necessarily equal to the hash. The container has to reduce each hash to an index smaller than bucket_count(), so for a nearly empty unordered_multiset you would expect many hashes to share the same bucket.
Since unordered_multiset is a class template, its implementation lives in the headers, so you can easily step into the code to find out exactly how it calculates the bucket number from the hash.
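To check the hasher itself rather than the bucket layout, one option is to call it directly through hash_function(). The sketch below repeats the question's hasher with the std:: qualifiers spelled out; the CharBag typedef is a name introduced here for brevity. How a hash is mapped to a bucket is left to the implementation, so the only thing the sketch relies on is that bucket(c) is smaller than bucket_count().

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>

struct CustomHasher {
    // Position in the alphabet: a => 0, b => 1, and so on.
    std::size_t operator()(const char &c) const {
        return static_cast<std::size_t>(
            std::tolower(static_cast<unsigned char>(c)) - 'a');
    }
};

// Hypothetical shorthand for the container type used in the question.
typedef std::unordered_multiset<char, CustomHasher> CharBag;

void add_multiset(const std::string &s, CharBag *ms) {
    for (std::size_t i = 0; i < s.length(); i++)
        ms->insert(static_cast<char>(
            std::tolower(static_cast<unsigned char>(s[i]))));
}

// The raw hash value, before the container maps it to a bucket.
std::size_t rawHash(const CharBag &ms, char c) {
    return ms.hash_function()(c);
}
```

After add_multiset("hello", &ms), rawHash(ms, 'l') is 11 and rawHash(ms, 'o') is 14, whatever ms.bucket() reports. For the isogram check itself, ms.count(c) is the relevant query: it is 2 for 'l' in "hello".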

unexpected value return by array variable

I am trying to pass an array to a function (getcreditcurve), and I expect the function to return an array. main is expected to send several such arrays to getcreditcurve, and for each one the function should return a new array computed with the logic shown in the function below. I am not getting an error, but I don't get the correct values. I expect I1+1 to be 3 * 0.0039 = 0.0117 and I2+2 to be 4 * 0.0060 = 0.0240; however, I get the following in the Excel output:
00D4F844 00D4F84C
Even if I change the print statement to
print << *(I1+1) << '\t' << *(I2+2) << endl;
I get the following Excel output:
-9.26E+61 -9.26E+61
Can somebody help with troubleshooting, please? Sorry, I went through other posts/questions on this site but was not able to find the simplest way to solve this issue. I am going to use this logic to build other projects, so I simplified the question just to resolve the main issue.
#include<iostream>
#include<cmath>
#include<fstream>
typedef double S1[5];
using namespace std;
double *getcreditcurve(double *);
int main()
{
S1 C1 = { 0.0029, 0.0039, 0.0046, 0.0052, 0.0057 };
S1 C2 = { 0.0020, 0.0050, 0.0060, 0.0070, 0.0080 };
typedef double *issuer;
issuer I1 = getcreditcurve(C1);
issuer I2 = getcreditcurve(C2);
ofstream print;
print.open("result1.xls");
print << (I1+1) << '\t' << (I2+2) << endl;
print.close();
return 0;
}
double *getcreditcurve(S1 ptr)
{
const int cp = 5;
typedef double curve[cp];
curve h;
h[0] = 2 * ptr[0];
h[1] = 3 * ptr[1];
h[2] = 4 * ptr[2];
h[3] = 5 * ptr[3];
h[4] = 6 * ptr[4];
return h;
}
If you want getcreditcurve to return an array, then try this:
const int cp = 5;
typedef double curve[cp];
curve getcreditcurve(S1 ptr) {
But that gives an error: ‘foo’ declared as function returning an array. Functions can't return C arrays. The good news is that if you fully embrace C++, you can return std::array instead.
#include <array>
const int cp = 5;
typedef std::array<double, cp> curve; // typedef takes the existing type first, then the new name
curve getcreditcurve(S1 ptr) {
But really, std::vector is probably much better as you have more flexibility about the size.
#include<vector>
std::vector<double> getcreditcurve(std::vector<double> ptr)
{
std::vector<double> h;
h.push_back(2 * ptr.at(0));
h.push_back(3 * ptr.at(1));
h.push_back(4 * ptr.at(2));
h.push_back(5 * ptr.at(3));
h.push_back(6 * ptr.at(4));
return h;
}
In fact, pretty much all problems with C arrays can be solved by std::vector. Then, in special situations, you can use std::array. But focus on std::vector for now.
It's not possible to return a C array from a function. There are other things you can return, such as std::vector or std::array. You should consider redesigning your application around those two instead.
But if you really need to use C arrays in C++, I suggest that instead of trying to return an array from getcreditcurve, you pass an extra array into getcreditcurve which will be used to store the result. This is called an output parameter.
void getcreditcurve(double*, double *);
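This would solve the "scope" problem: in the original code, getcreditcurve returns a pointer to its local array h, which no longer exists once the function returns, so main ends up reading garbage. With an output parameter, the caller (main) creates the array before calling getcreditcurve and passes it in. As a result, getcreditcurve does not have to take responsibility for creating (or destroying) any object.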
double I1[5];
getcreditcurve(I1, C1); // stores its result in `I1`
This might be the easiest option if you really really need to get this working as soon as possible.
If you are willing to make some further changes, you can make a much safer program. The short term goal is to abolish all uses of * (except to use it for multiplication).
C arrays cannot be passed by value, in particular they cannot be returned. There are other funky things you can do (passing special pointers), but the best thing to do with C arrays is to pass them by reference. In C++, reference-passing behaviour is quite consistent and works well.
// http://stackoverflow.com/questions/31362360/unexpected-value-return-by-array-variable
#include<iostream>
#include<cmath>
#include<fstream>
typedef double S1[5];
using namespace std;
/* In the following declaration, the two parameters
* are taken by reference (note the '&').
* This is almost always the best way to pass arrays.
*
* Also, this is a template where N is automatically
* set to the number of elements in the array. This nice
* automatic behaviour is possible only because the
* array is taken by reference.
*
* Finally, note that the second reference, for 'input',
* has 'const'. This is to emphasize that 'input' is for input,
* that getcreditcurve will not be allowed to modify the input argument.
*/
template<size_t N>
void getcreditcurve(double (&output)[N],const double (&input)[N]);
int main()
{
/* S1 is the type - array of five doubles */
/* Here declare and initialize C1 and C2 as two variables
* of this type */
S1 C1 = { 0.0029, 0.0039, 0.0046, 0.0052, 0.0057 };
S1 C2 = { 0.0020, 0.0050, 0.0060, 0.0070, 0.0080 };
// create the two output arrays first, within main
S1 I1;
S1 I2;
// call getcreditcurve, passing in the output and input arrays
getcreditcurve(I1,C1);
getcreditcurve(I2,C2);
ofstream print;
/* you can't create Excel(.xls) files in C++ easily
* Better to just create a .csv file instead
* csv = comma-separated values
*/
print.open("result1.csv");
print << I1[0] << ',' << I2[3] << endl;
print.close();
return 0;
}
template<size_t N>
void getcreditcurve(double (&output)[N],const double (&input)[N])
{
output[0] = 2 * input[0];
output[1] = 3 * input[1];
output[2] = 4 * input[2];
output[3] = 5 * input[3];
output[4] = 6 * input[4];
}
But seriously, you really should just ditch C arrays entirely. This is C++, not C. Use std::vector<double> instead.

Copying arrays via pointers in C++

I've never programmed in C++ before and I'm trying to figure out how to recursively pass segments of an array in a C++ method. I am trying to convert the following pseudo code into C++.
SlowSort(A[1...n])
if n = 2
if A[1] > A[2]
Swap(A[1], A[2])
else if n > 2
SlowSort(A[1...(2n/3)])
SlowSort(A[(n/3+1)... n])
SlowSort(A[1...(2n/3)])
The recursive calls are the bits I'm having a problem with. I was thinking about creating two new arrays that point to the wanted locations but don't know how to go about that, specifically doing that and defining the length of the array. I've tried googling it and searching this site, but there doesn't seem to be anything, that I understand, on it. Also, in case I fudged up somewhere in my code, here's what I have for the first bit.
int SlowSort(int A[])
{
int length = (sizeof(A)/sizeof(*A));
if(length ==2)
{
if(A[0] > A[1])
{
int temp = A[0];
A[0] = A[1];
A[1] = temp;
}
}
In short, how do I convert the else if statement into C++? An explanation would be nice too.
Thanks
You will want to pass indices into the array instead, and use those.
void SlowSort(int A[], int left, int right) // sorts A[left..right], both ends inclusive
{
    int n = right - left + 1; // number of elements in the range
    if (n == 2)
    {
        if (A[left] > A[right])
            std::swap(A[left], A[right]); // std::swap is declared in <utility>
    }
    else if (n > 2)
    {
        int third = n / 3;
        SlowSort(A, left, right - third); // first two thirds
        SlowSort(A, left + third, right); // last two thirds
        SlowSort(A, left, right - third); // first two thirds again
    }
}
Note that the 2n/3 in the pseudocode has to be rounded up for the algorithm to work, which is why the code takes n / 3 off the end of the range instead of computing 2 * n / 3 directly. The main idea is this: you don't make a copy of the array. Instead, always pass the same array, together with the range (i.e. the indices) you are sorting.
You can simply pass the required pointer using pointer arithmetic. For example, the following pseudocode
SlowSort(A[(n/3+1)... n])
could be written as
SlowSort( A + n/3, n - n/3 ); // the pseudocode is 1-based, C++ arrays are 0-based
So the function could be declared as
void SlowSort( int A[], size_t n );
As for this code snippet
int SlowSort(int A[])
{
int length = (sizeof(A)/sizeof(*A));
it does not do what you expect, because an array is implicitly converted to a pointer to its first element when it is passed as an argument to a function declared this way. sizeof(A) is therefore the size of a pointer, and the value of length will not be equal to the number of elements.
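The adjustment from array to pointer can be checked at compile time. The two function names below are made up for this sketch: the first shows what the parameter really is, the second shows that a reference to an array does keep the element count.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: the parameter is written as an array...
std::size_t lengthSeenInside(int A[])
{
    // ...but its type has already been adjusted to int*.
    static_assert(std::is_same<decltype(A), int *>::value,
                  "an array parameter is really a pointer");
    // This is what sizeof(A) / sizeof(*A) evaluates to inside the function.
    return sizeof(int *) / sizeof(int);
}

// Hypothetical helper: a reference to an array keeps the size N.
template <std::size_t N>
std::size_t realLength(int (&)[N])
{
    return N;
}
```

For int data[7], realLength(data) is 7, while lengthSeenInside(data) is the same small constant for every array you pass.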
This is pretty simple, since an array passed to a function is just a pointer to its first element.
Your code would look like this:
void slowSort(int array[], int length)
{
    if (length == 2)
    {
        if (array[0] > array[1])
        {
            int temp = array[0];
            array[0] = array[1];
            array[1] = temp;
        }
    }
    else if (length > 2)
    {
        int third = length / 3; // length - third is 2 * length / 3, rounded up
        slowSort(&array[0], length - third);
        slowSort(&array[third], length - third);
        slowSort(&array[0], length - third);
    }
}
The trick I use here is that I pass a pointer to the element I want to start with, together with the number of elements to sort from there.
This works because when you pass an array in C++ you just pass a pointer to its first element. Here I pass a pointer to an element in the middle of the array instead.
The modern C++ way to do this would be to pass iterators to the beginning and one-past-the-end of the range. In this case, the iterators are pointers.
void SlowSort(int* begin, int* end)
{
    unsigned length = end-begin;
    if(length == 2)
    {
        if(begin[0] > begin[1])
        {
            std::swap( begin[0], begin[1] ); // declared in <utility>
        }
    } else if(length>2) {
        unsigned third = length/3; // length-third is 2*length/3, rounded up
        SlowSort(begin, end-third);
        SlowSort(begin+third, end);
        SlowSort(begin, end-third);
    }
}
then, for the case of working with an entire array:
template<unsigned N>
void SlowSort( int(&Arr)[N] ) {
return SlowSort( Arr, Arr+N );
}
we dispatch it to the iterator version, relying on decaying of array-to-pointer. This has to be a template function, as we want it to work with multiple different array sizes.
Note that an int Arr[] is not an array. It is a different way to say int* Arr, left over as a legacy from C. In fact, as a parameter to a function, saying void foo( int A[27] ) results in void foo( int* A ): function parameters cannot be arrays.
They can, however, be references-to-arrays, which is what the above template function uses.
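The same idea carries over to containers such as std::vector if the function is written against iterators in general. The following is only a sketch under that assumption; slow_sort is a name chosen here to keep it apart from the functions above. It takes any random-access iterator pair, and the larger part is computed as the length minus a third so that 2n/3 is effectively rounded up.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Sorts the half-open range [begin, end) of random-access iterators.
template <typename RandomIt>
void slow_sort(RandomIt begin, RandomIt end)
{
    auto n = end - begin;                 // number of elements
    if (n == 2) {
        if (*(begin + 1) < *begin)
            std::iter_swap(begin, begin + 1);
    } else if (n > 2) {
        auto third = n / 3;               // rounded down
        slow_sort(begin, end - third);    // first 2n/3 elements (rounded up)
        slow_sort(begin + third, end);    // last 2n/3 elements (rounded up)
        slow_sort(begin, end - third);    // first 2n/3 elements again
    }
}
```

Both slow_sort(v.begin(), v.end()) for a std::vector<int> v and slow_sort(a, a + n) for a plain array a go through the same template.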

using qsort causing a segmentation fault

Well, as part of learning C++, my project has a restriction on it. I'm not allowed to use any libraries except the basic ones such as <cstring> and a few other necessities.
The project should take in input from a file that is an "n" number of columns of strings and be able to sort the output according to lexicographical ordering of any selected column. So for example, given the input
Cartwright Wendy 93
Williamson Mark 81
Thompson Mark 100
Anderson John 76
Turner Dennis 56
It should sort them by column. My search around StackOverflow turned up a question from someone else who had to do the exact same project a few years ago: "Qsort based on a column in a c-string?"
But in my case I just use a global variable for the column and get on with life. My problem came in when I am implementing the compare function for qsort
In my main method I call
qsort (data, i, sizeof(char*), compare);
where data is a char * data[] and i is the number of lines to compare. (5 in this case)
Below is my code for the compare method
int compare (const void * a, const void * b){
char* line1 = new char[1000]; char* line2 = new char[1000];
strcpy(line1, *((const char**) a));
strcpy(line2, *((const char**) b));
char* left = &(strtok(line1, " \t"))[column-1];
char* right = &(strtok(line2, " \t"))[column-1];
return strcmp(left, right);
}
The 1000s are there because I deliberately overgeneralized (bad coding on purpose) and assumed that no line will be longer than 1000 characters.
What confuses me is that when I use the debugger in Eclipse, I can see that it compares successfully the first time; then, on the second round, it hits a segmentation fault when it tries to compare.
I also tried to change the code for assigning left and right to what is below but that didn't help either
char* left = new char[100];
strcpy(left, &(strtok(line1, " \t"))[column-1]);
char* right = new char[100];
strcpy(right, &(strtok(line2, " \t"))[column-1]);
Please help me understand what is causing this segmentation fault. The first time it compares the two, left = "Williamson" and right = "Thompson". The second time it compares (and crashes trying) left = "Cartwright" and right = "Thompson"
char* line1 = new char[1000]; char* line2 = new char[1000];
This is not good at all. You're never freeing this, so you leak 2000 bytes every time your comparison function is called. Eventually this will lead to low-memory conditions and new will throw. (Or on Linux your process might get killed by the OOM-killer). It's also not very efficient when you could just have said char line1[1000], which is super-quick because it simply subtracts from the stack pointer rather than potentially traversing a free list or asking the kernel for more memory.
But really you could be doing the compare without modifying or copying the strings. For example:
static int
is_end_of_token(char ch)
{
// If the string has the terminating NUL character we consider it the end.
// If it has the ' ' or '\t' character we also consider it the end. This
// accomplishes the same thing as your strtok call, but WITHOUT modifying
// the source buffer.
return (!ch || ch == ' ' || ch == '\t');
}
int
compare(const void *a, const void *b)
{
const char *strA = *(const char**)a;
const char *strB = *(const char**)b;
// Loop while there is data left to compare...
while (!is_end_of_token(*strA) && !is_end_of_token(*strB))
{
if (*strA < *strB)
return -1; // String on left is smaller
else if (*strA > *strB)
return 1; // String on right is smaller
++strA;
++strB;
}
if (is_end_of_token(*strA) && is_end_of_token(*strB))
return 0; // both strings are finished, so they are equal.
else if (is_end_of_token(*strA))
return -1; // left string has ended, but right string still has chars
else
return 1; // right string has ended, but left string still has chars
}
But lastly... You're using std::string, you say? If that's the case, then assuming the memory passed to qsort is compatible with "const char **" is a little weird, and I would expect it to crash. In that case, maybe you should do something like:
int compare(const void *a, const void *b)
{
const char *strA = ((const std::string*)a)->c_str();
const char *strB = ((const std::string*)b)->c_str();
// ...
}
But really, if you are using C++ and not C, you should use std::sort.

how to improve natural sort program for decimals?

I have std::strings containing numbers in the leading section that I need to sort. The numbers can be integers or floats.
The vector<std::string> sort was not optimal; I found the following natural sort program, which was much better. I still have a small issue with numbers smaller than one, which do not sort quite right. Does anyone have a suggestion for improving it? We're using Visual Studio 2003.
The complete program follows.
TIA,
Bert
#include <list>
#include <string>
#include <iostream>
using namespace std;
class MyData
{
public:
string m_str;
MyData(string str) {
m_str = str;
}
long field1() const
{
int second = m_str.find_last_of("-");
int first = m_str.find_last_of("-", second-1);
return atol(m_str.substr(first+1, second-first-1).c_str());
}
long field2() const
{
return atol(m_str.substr(m_str.find_last_of("-")+1).c_str());
}
bool operator < (const MyData& rhs)
{
if (field1() < rhs.field1()) {
return true;
} else if (field1() > rhs.field1()) {
return false;
} else {
return field2() < rhs.field2();
}
}
};
int main()
{
// Create list
list<MyData> mylist;
mylist.push_front(MyData("93.33"));
mylist.push_front(MyData("0.18"));
mylist.push_front(MyData("485"));
mylist.push_front(MyData("7601"));
mylist.push_front(MyData("1001"));
mylist.push_front(MyData("0.26"));
mylist.push_front(MyData("0.26"));
// Sort the list
mylist.sort();
// Dump the list to check the result
for (list<MyData>::const_iterator elem = mylist.begin(); elem != mylist.end(); ++elem)
{
cout << (*elem).m_str << endl;
}
return 1;
}
GOT:
0.26
0.26
0.18
93.33
485
1001
7601
EXPECTED:
0.18
0.26
0.26
93.33
485
1001
7601
Use atof() instead of atol() to have the comparison take the fractional part of the number into account. You will also need to change the return types to doubles.
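As a rough sketch of that change, here is the question's class with the two accessors returning double and calling atof(). The parsing of the '-' separated fields is kept as in the question, only with std::string::size_type instead of int; everything else is unchanged apart from a const on operator <.

```cpp
#include <cstdlib>
#include <list>
#include <string>

class MyData
{
public:
    std::string m_str;
    explicit MyData(const std::string &str) : m_str(str) {}

    double field1() const
    {
        std::string::size_type second = m_str.find_last_of("-");
        std::string::size_type first = m_str.find_last_of("-", second - 1);
        // atof() keeps the fractional part, unlike atol().
        return std::atof(m_str.substr(first + 1, second - first - 1).c_str());
    }
    double field2() const
    {
        return std::atof(m_str.substr(m_str.find_last_of("-") + 1).c_str());
    }
    bool operator < (const MyData &rhs) const
    {
        if (field1() < rhs.field1()) return true;
        if (field1() > rhs.field1()) return false;
        return field2() < rhs.field2();
    }
};
```

With this version, mylist.sort() on the question's data puts 0.18 ahead of the two 0.26 entries.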
If it's just float strings, I'd rather suggest creating a table with two columns (the first column contains the original string, the second holds the string converted to float), sorting it by the float column, and then outputting/using the sorted string column.
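One way to sketch the two-column table is a vector of (value, string) pairs, which std::sort orders by the numeric value first. The function name sortByNumericValue is made up for this example, and the code sticks to C++03 syntax because the question mentions Visual Studio 2003.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the strings ordered by their numeric value.
std::vector<std::string> sortByNumericValue(const std::vector<std::string> &input)
{
    // Column 1: parsed value, column 2: original string.
    std::vector<std::pair<double, std::string> > table;
    for (std::size_t i = 0; i < input.size(); ++i)
        table.push_back(std::make_pair(std::atof(input[i].c_str()), input[i]));

    // Pairs compare by .first, then by .second for equal values.
    std::sort(table.begin(), table.end());

    std::vector<std::string> result;
    for (std::size_t i = 0; i < table.size(); ++i)
        result.push_back(table[i].second);
    return result;
}
```

Each string is converted exactly once, instead of on every comparison.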
If the data are all numbers, I would create a new class to contain the data.
It can have a string to hold the data, but it then allows you to have better methods to model behaviour, in this case especially to implement operator <.
The implementation could also use a library that calculates with exact precision, e.g. GNU Multiple Precision, which would do the comparison and the conversion from string (or, if the numbers do not have that many significant figures, you could use doubles).
I would compute the values once and store them.
Because they are not actually part of the object's state (they are just calculated values), mark them as mutable. Then they can also be set in const methods.
Also note that MyData is a friend of itself and thus can access the private members of another object of the same class. So there is no need for the extraneous accessor methods. Remember, accessor methods are there to protect other classes from changes in the implementation, not the class you are implementing.
The problem with the ordering is that atol() only reads the integer part (i.e. it stops at the '.' character). Thus all your numbers smaller than 1 have a value of zero for the comparison, and so they appear in an arbitrary order relative to each other. To compare against the full value you need to extract them as a floating point value (double).
class MyData
{
private:
mutable bool gotPos;
mutable double f1;
mutable double f2;
public:
/*
* Why is this public?
*/
std::string m_str;
MyData(std::string str)
:gotPos(false)
,m_str(str) // Use initializer list
{
// If you are always going to build f1,f2 then call BuildPos()
// here and then you don't need the test in the operator <
}
bool operator < (const MyData& rhs) const
{
if (!gotPos)
{ buildPos();
}
if (!rhs.gotPos)
{ rhs.buildPos();
}
if (f1 < rhs.f1) return true;
if (f1 > rhs.f1) return false;
return f2 < rhs.f2;
}
private:
void buildPos() const
{
int second = m_str.find_last_of("-");
int first = m_str.find_last_of("-", second-1);
// Use boost::lexical_cast (from <boost/lexical_cast.hpp>) as it
// handles doubles as well as integers.
f1 = boost::lexical_cast<double>(m_str.substr(first + 1, second-first - 1));
f2 = boost::lexical_cast<double>(m_str.substr(second + 1));
gotPos = true;
}
};