How to safely compare two unsigned integer counters? - c++

We have two unsigned counters, and we need to compare them to check for some error conditions:
uint32_t a, b;
// a increased in some conditions
// b increased in some conditions
if (a/2 > b) {
perror("Error happened!");
return -1;
}
The problem is that a and b will overflow some day. If a overflowed, it's still OK. But if b overflowed, it would be a false alarm. How to make this check bulletproof?
I know making a and b uint64_t would delay the false alarm, but it still would not completely fix the issue.
===============
Let me clarify a little bit: the counters are used to track memory allocations, and this problem is found in dmalloc/chunk.c:
#if LOG_PNT_SEEN_COUNT
/*
* We divide by 2 here because realloc which returns the same
* pointer will seen_c += 2. However, it will never be more than
* twice the iteration value. We divide by two to not overflow
* iter_c * 2.
*/
if (slot_p->sa_seen_c / 2 > _dmalloc_iter_c) {
dmalloc_errno = ERROR_SLOT_CORRUPT;
return 0;
}
#endif

I think you misinterpreted the comment in the code:
We divide by two to not overflow iter_c * 2.
No matter where the values are coming from, it is safe to write a/2, but it is not safe to write a*2. Whatever unsigned type you are using, you can always divide a number by two, while multiplying by two may overflow.
If the condition were written like this:
if (slot_p->sa_seen_c > _dmalloc_iter_c * 2) {
then roughly half of the input range would trigger the condition incorrectly. That being said, if you worry about counters overflowing, you could wrap them in a class:
#include <algorithm>

class Check { // renamed: a member function may not share its class's name
    unsigned a = 0;
    unsigned b = 0;
    bool odd = true;
    void normalize() {
        auto m = std::min(a, b);
        a -= m;
        b -= m;
    }
public:
    void incr_a() {
        if (odd) ++a; // only every second call counts, so a tracks "seen / 2"
        odd = !odd;
        normalize();
    }
    void incr_b() {
        ++b;
        normalize();
    }
    bool check() const { return a > b; }
};
Note that to avoid the overflow completely you have to take additional measures, but if a and b increase at more or less the same rate, this might already be fine.
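For illustration, a minimal usage sketch of the class above (hypothetical mapping: incr_a stands in for bumping sa_seen_c, called twice per iteration as realloc would, and incr_b for bumping _dmalloc_iter_c):
#include <cstdio>

int main() {
    Check c;
    for (int i = 0; i < 10; ++i) {
        c.incr_a(); // a "seen" event; double counting per iteration is fine,
        c.incr_a(); // since only every second call increments a
        c.incr_b(); // one iteration
    }
    std::printf("error: %s\n", c.check() ? "yes" : "no"); // prints "error: no"
}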

The posted code actually doesn’t seem to use counters that may wrap around.
What the comment in the code is saying is that it is safer to compare a/2 > b instead of a > 2*b, because the latter could potentially overflow while the former cannot. This is particularly true if the type of a is larger than the type of b.
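For illustration, a small hypothetical example showing how the multiplied form wraps around while the divided form stays safe:
#include <cstdint>
#include <cstdio>

int main() {
    uint32_t a = 5;
    uint32_t b = 0x80000001u; // b is huge, so a is clearly not greater than 2*b
    // The multiplication wraps: 2 * 0x80000001 is 0x100000002, which is 2 mod 2^32.
    std::printf("a > b * 2 : %s (wrong, due to overflow)\n", a > b * 2 ? "true" : "false");
    // The division cannot overflow, so this form gives the right answer.
    std::printf("a / 2 > b : %s (correct)\n", a / 2 > b ? "true" : "false");
}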

Note overflows as they occur.
uint32_t a, b;
bool aof = false;
bool bof = false;
if (condition_to_increase_a()) {
a++;
aof = a == 0;
}
if (condition_to_increase_b()) {
b++;
bof = b == 0;
}
if (!bof && a/2 + aof*0x80000000 > b) {
perror("Error happened!");
return -1;
}
Each of a, b independently has 2^32 + 1 different states, reflecting both its value and whether it has overflowed. Somehow, more than a uint32_t of information is needed: you could use uint64_t, variant code paths, or an auxiliary variable like the bool here.

Normalize the values as soon as they wrap, by forcing them both to wrap at the same time while maintaining the difference between the two.
Try something like this:
uint32_t a, b;
// a increased in some conditions
// b increased in some conditions
if (a == UINT32_MAX || b == UINT32_MAX) { // a or b is at the maximum value
if (a > b)
{
a = a-b; b = 0;
}
else
{
b = b-a; a = 0;
}
}
if (a/2 > b) {
perror("Error happened!");
return -1;
}

If even using 64 bits is not enough, then you need to code your own "var increase" method instead of overloading the ++ operator (which may mess up your code if you are not careful).
The method would just reset the variable to 0 or some other meaningful value.
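A minimal sketch of such a method (the name increase is made up): instead of letting the counter wrap silently, reset it when the maximum is reached.
#include <cstdint>

void increase(uint64_t &counter) {
    if (counter == UINT64_MAX)
        counter = 0; // or some other meaningful value, e.g. renormalize both counters
    else
        ++counter;
}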

If your intention is to ensure that action x happens no more than twice as often as action y, I would suggest doing something like:
uint32_t x_count = 0;
uint32_t scaled_y_count = 0;
void action_x(void)
{
if ((uint32_t)(scaled_y_count - x_count) > 0xFFFF0000u)
fault();
x_count++;
}
void action_y(void)
{
if ((uint32_t)(scaled_y_count - x_count) < 0xFFFF0000u)
scaled_y_count+=2;
}
In many cases, it may be desirable to reduce the constants in the comparison used when incrementing scaled_y_count so as to limit how many action_y operations can be "stored up". The above, however, should work precisely in cases where the operations remain anywhere close to balanced in a 2:1 ratio, even if the number of operations exceeds the range of uint32_t.
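To see why the wrap-around subtraction keeps working after the counters individually overflow, here is a small hypothetical check (the values are made up):
#include <cstdint>
#include <cassert>

int main() {
    // Even if x_count is about to wrap and scaled_y_count already has,
    // the difference computed in unsigned arithmetic stays meaningful
    // as long as the true distance fits in the type.
    uint32_t x_count = 0xFFFFFFF0u;
    uint32_t scaled_y_count = 0x00000010u; // wrapped past x_count by 0x20
    assert((uint32_t)(scaled_y_count - x_count) == 0x20u);
}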

Related

More general test for same order of magnitude than comparing floor(log10(abs(n)))

I am implementing an optimization algorithm and have different heuristics for the cases where either no lower and upper bounds for the solution are known, or the known bounds are of largely different magnitude.
To check, my first approach would be simply taking
if(abs(floor(log10(abs(LBD))) - floor(log10(abs(UBD)))) < 1 )
{ //(<1 e.g. for 6, 13)
//Bounds are sufficiently close for the serious stuff
}
else {
//We need some more black magic
}
But this requires the previous checks to be generalized to NaN and ±INFINITY.
Also, in the case where LBD is negative and UBD positive we can't assume that the above check alone assures us that they are anywhere close to being of equal order of magnitude.
Is there a dedicated approach to this or am I stuck with this hackery?
Thanks to geza I realized that the whole thing can be done without the log10:
A working solution is posted below, and a MWE including the log variant is posted on ideone.
#include <cmath>

template <typename T> double sgn(T val) {
    // (val == val) is false for NaN, so a NaN input yields NaN here
    return double((T(0) < val) - (val < T(0))) / (val == val);
}

bool closeEnough(double LBD, double UBD, unsigned maxOrderDiff = 1, unsigned cutoffOrder = 1) {
    double sgn_LBD = sgn(LBD);
    double sgn_UBD = sgn(UBD);
    double cutoff  = std::pow(10, cutoffOrder);
    double maxDiff = std::pow(10, maxOrderDiff);
    if (sgn_LBD == sgn_UBD) {
        if (std::abs(LBD) < cutoff && std::abs(UBD) < cutoff) return true;
        return LBD < UBD && std::abs(UBD) < std::abs(LBD) * maxDiff;
    } else if (sgn_UBD > 0) {
        return -LBD < cutoff && UBD < cutoff;
    }
    // If none of the above matches, LBD >= UBD or one of the two is NaN
    return false;
}
As a bonus it can take cutoffs, so if both bounds lie within [-10^cutoffOrder,+10^cutoffOrder] they are considered to be close enough!
The pow computation might also be unnecessary, but at least in my case this check is not in a critical code section.
If it were, I suppose you could just hard-code the cutoff and maxDiff.
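For example (assuming the fixed function above, with the default arguments cutoffOrder = 1 and maxOrderDiff = 1), these checks all pass:
#include <cassert>
#include <cmath>

int main() {
    assert(closeEnough(-6.0, 6.0));          // opposite signs, but both within [-10, 10]
    assert(closeEnough(20.0, 150.0));        // same sign, within one order of magnitude
    assert(!closeEnough(20.0, 5000.0));      // more than one order of magnitude apart
    assert(!closeEnough(std::nan(""), 1.0)); // NaN bounds are never "close enough"
}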

Try to reduce execution time but fail

Assume there is a set of points, and almost every point is inside a quadrilateral, but a few are not. I want to know which points are not inside the quadrilateral.
So the function looks like this.
bool isInside(Point a, Point b, Point c, Point d, Point p) // a, b, c, d are the vertices of the quadrilateral
{
if(orientation(a, b, p) < 0)
return false;
else if(orientation(b, c, p) < 0)
return false;
else if(orientation(c, d, p) < 0)
return false;
else if(orientation(d, a, p) < 0)
return false;
else
return true;
}
I wanted to reduce the number of calls to the orientation function, which looks like this:
int orientation(const Point& p, const Point& q, const Point& r)
{
double val = (q.x - p.x) * (r.y - p.y) - (q.y - p.y) * (r.x - p.x);
if (val == 0)
return 0; // collinear
return (val < 0) ? -1 : 1; // right or left
}
So I modified the function isInside like this.
bool isInside(Point a, Point b, Point c, Point d, Point p)
{
int result;
if(p.x <= b.x)
{
result = orientation(a, b, p);
}
else
{
result = orientation(b, c, p);
}
if(result == -1) return false;
if(p.x <= d.x)
{
result = orientation(a, d, p);
}
else
{
result = orientation(d, c, p);
}
return (result == 1) ? false : true; // a,d and d,c are the reversed edges, so +1 means outside
}
By this, the number of calls to the orientation function is reduced by almost half (with more than 100,000 points, that is a huge number). However, it seems not to affect the time taken, and sometimes the modified version even takes longer.
I don't understand how this can happen even though it eliminates so many function calls.
Compiler Optimizations
It would be a good idea to check whether or not you are building with optimizations enabled. If you are building your application in debug mode, the compiler may not be optimizing your code. If so, try building in release mode, which may enable optimizations or a higher optimization level. This way, you can potentially leave your code as is, without worrying much about hand-optimizing it (unless fast performance is absolutely necessary).
Quantitative Results
You could also add test code that will allow you to get quantitative performance results (running function x() n times takes m seconds, so each x() call takes m/n seconds). Then, you should be able to figure out which block of code is taking the most time.
An example of how you can go about doing the above (Without writing it for you) would look like:
#include <iostream>
#include <chrono>
//Doesn't matter where it is called, just using main as an example
int main(int argc, char *argv[])
{
    using std::chrono::high_resolution_clock;
    using std::chrono::duration_cast;
    using std::chrono::microseconds;
    int numRuns = 1000000; //Or passed in to allow changing # of runs
                           //without rebuilding: int numRuns = atoi(argv[1]);
    //Code to initialize Point a, b, c, d, and p.
    high_resolution_clock::time_point orien_start_time = high_resolution_clock::now();
    for(int i = 0; i < numRuns; ++i)
    {
        orientation(a, b, p); //Ignore the return value
    }
    high_resolution_clock::time_point orien_end_time = high_resolution_clock::now();
    high_resolution_clock::time_point inside_start_time = high_resolution_clock::now();
    for(int i = 0; i < numRuns; ++i)
    {
        isInside(a, b, c, d, p); //Ignore the return value
    }
    high_resolution_clock::time_point inside_end_time = high_resolution_clock::now();
    //Format and print/log the results, e.g.:
    //(note: with optimizations on, the compiler may elide the unused calls above;
    // accumulate the return values into a volatile variable for a fair test)
    std::cout << "orientation: "
              << duration_cast<microseconds>(orien_end_time - orien_start_time).count()
              << " us\n";
    std::cout << "isInside: "
              << duration_cast<microseconds>(inside_end_time - inside_start_time).count()
              << " us\n";
}
Then, with those time points, you can calculate how long each function takes to run. You can then use these numbers to pinpoint where exactly your application is slowing down. Going this route, you can test your old implementation vs. your new implementation, and see if the new way is in fact faster. You could even try different sets of Points, to see if that changes application performance (for example, try both functions with points p1 through p5, then try both again with p6 through p10).
Note: There are a lot of things that can affect application performance outside of the code you write, which is why I used one million for the hard-coded numRuns. If you go with a small number of iterations, your execution time per function call can swing pretty drastically depending on what else is running on your system. My recommendation for gathering quantitative results would be to run the test(s) on a freshly rebooted system where your application is the only user process running, so that it doesn't have to share as many resources with other applications.

Shortest way to calculate difference between two numbers?

I'm about to do this in C++, but I have had to do it in several languages; it's a fairly common and simple problem, and this is the last time. I've had enough of coding it the way I do, and I'm sure there must be a better method, so I'm posting here before I write out the same long-winded method in yet another language.
Consider the (lilies!) following code:
// I want the difference between these two values as a positive integer
int x = 7;
int y = 3;
int diff;
// This means you have to find the largest number first
// before making the subtract, to keep the answer positive
if (x>y) {
diff = (x-y);
} else if (y>x) {
diff = (y-x);
} else if (x==y) {
diff = 0;
}
This may sound petty but that seems like a lot to me, just to get the difference between two numbers. Is this in fact a completely reasonable way of doing things and I'm being unnecessarily pedantic, or is my spidey sense tingling with good reason?
Just get the absolute value of the difference:
#include <cstdlib>
int diff = std::abs(x-y);
Using the std::abs() function is one clear way to do this, as others here have suggested.
But perhaps you are interested in succinctly writing this function without library calls.
In that case
diff = x > y ? x - y : y - x;
is a short way.
In your comments, you suggested that you are interested in speed. In that case, you may be interested in ways of performing this operation that do not require branching. This link describes some.
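One branch-free variant, in the spirit of that link (a sketch that assumes two's-complement int, an arithmetic right shift, and that x - y does not overflow):
#include <climits>

int abs_diff(int x, int y) {
    int d = x - y;
    int mask = d >> (sizeof(int) * CHAR_BIT - 1); // all ones if d < 0, else zero
    return (d + mask) ^ mask;                     // negates d exactly when mask is all ones
}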
#include <cstdlib>
int main()
{
int x = 7;
int y = 3;
int diff = std::abs(x-y);
}
All the existing answers will overflow on extreme inputs, giving undefined behaviour. @craq pointed this out in a comment.
If you know that your values will fall within a narrow range, it may be fine to do as the other answers suggest, but to handle extreme inputs (i.e. to robustly handle any possible input values), you cannot simply subtract the values then apply the std::abs function. As craq rightly pointed out, the subtraction may overflow, causing undefined behaviour (consider INT_MIN - 1), and the std::abs call may also cause undefined behaviour (consider std::abs(INT_MIN)). It's no better to determine the min and max of the pair and to then perform the subtraction.
More generally, a signed int is unable to represent the maximum difference between two signed int values. The unsigned int type should be used for the output value.
I see 3 solutions. I've used the explicitly-sized integer types from stdint.h here, to close the door on uncertainties like whether long and int are the same size and range.
Solution 1. The low-level way.
// I'm unsure if it matters whether our target platform uses 2's complement,
// due to the way signed-to-unsigned conversions are defined in C and C++:
// > the value is converted by repeatedly adding or subtracting
// > one more than the maximum value that can be represented
// > in the new type until the value is in the range of the new type
uint32_t difference_int32(int32_t i, int32_t j) {
static_assert(
(-(int64_t)INT32_MIN) == (int64_t)INT32_MAX + 1,
"Unexpected numerical limits. This code assumes two's complement."
);
// Map the signed values across to the number-line of uint32_t.
// Preserves the greater-than relation, such that an input of INT32_MIN
// is mapped to 0, and an input of 0 is mapped to near the middle
// of the uint32_t number-line.
// Leverages the wrap-around behaviour of unsigned integer types.
// It would be more intuitive to set the offset to (uint32_t)(-1 * INT32_MIN)
// but that multiplication overflows the signed integer type,
// causing undefined behaviour. We get the right effect subtracting from zero.
const uint32_t offset = (uint32_t)0 - (uint32_t)(INT32_MIN);
const uint32_t i_u = (uint32_t)i + offset;
const uint32_t j_u = (uint32_t)j + offset;
const uint32_t ret = (i_u > j_u) ? (i_u - j_u) : (j_u - i_u);
return ret;
}
I tried a variation on this using bit-twiddling cleverness taken from https://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax but modern code-generators seem to generate worse code with this variation. (I've removed the static_assert and the comments.)
uint32_t difference_int32(int32_t i, int32_t j) {
const uint32_t offset = (uint32_t)0 - (uint32_t)(INT32_MIN);
const uint32_t i_u = (uint32_t)i + offset;
const uint32_t j_u = (uint32_t)j + offset;
// Surprisingly it helps code-gen in MSVC 2019 to manually factor-out
// the common subexpression. (Even with optimisation /O2)
const uint32_t t = (i_u ^ j_u) & -(i_u < j_u);
const uint32_t min = j_u ^ t; // min(i_u, j_u)
const uint32_t max = i_u ^ t; // max(i_u, j_u)
const uint32_t ret = max - min;
return ret;
}
Solution 2. The easy way. Avoid overflow by doing the work using a wider signed integer type. This approach can't be used if the input signed integer type is the largest signed integer type available.
uint32_t difference_int32(int32_t i, int32_t j) {
return (uint32_t)std::abs((int64_t)i - (int64_t)j);
}
Solution 3. The laborious way. Use flow-control to work through the different cases. Likely to be less efficient.
uint32_t difference_int32(int32_t i, int32_t j)
{ // This static assert should pass even on 1's complement.
// It's just about impossible that int32_t could ever be capable of representing
// *more* values than can uint32_t.
// Recall that in 2's complement it's the same number, but in 1's complement,
// uint32_t can represent one more value than can int32_t.
static_assert( // Must use int64_t to subtract negative number from INT32_MAX
((int64_t)INT32_MAX - (int64_t)INT32_MIN) <= (int64_t)UINT32_MAX,
"Unexpected numerical limits. Unable to represent greatest possible difference."
);
uint32_t ret;
if (i == j) {
ret = 0;
} else {
if (j > i) { // Swap them so that i > j
const int32_t i_orig = i;
i = j;
j = i_orig;
} // We may now safely assume i > j
uint32_t magnitude_of_greater; // The magnitude, i.e. abs()
bool greater_is_negative; // Zero is of course non-negative
uint32_t magnitude_of_lesser;
bool lesser_is_negative;
if (i >= 0) {
magnitude_of_greater = i;
greater_is_negative = false;
} else { // Here we know 'lesser' is also negative, but we'll keep it simple
// magnitude_of_greater = -i; // DANGEROUS, overflows if i == INT32_MIN.
magnitude_of_greater = (uint32_t)0 - (uint32_t)i;
greater_is_negative = true;
}
if (j >= 0) {
magnitude_of_lesser = j;
lesser_is_negative = false;
} else {
// magnitude_of_lesser = -j; // DANGEROUS, overflows if j == INT32_MIN.
magnitude_of_lesser = (uint32_t)0 - (uint32_t)j;
lesser_is_negative = true;
}
// Finally compute the difference between lesser and greater
if (!greater_is_negative && !lesser_is_negative) {
ret = magnitude_of_greater - magnitude_of_lesser;
} else if (greater_is_negative && lesser_is_negative) {
ret = magnitude_of_lesser - magnitude_of_greater;
} else { // One negative, one non-negative. Difference is sum of the magnitudes.
// This will never overflow.
ret = magnitude_of_lesser + magnitude_of_greater;
}
}
return ret;
}
Well, it depends on what you mean by shortest: the fastest runtime, the fastest compilation, the fewest lines, or the least memory? I'll assume you mean runtime.
#include <algorithm> // std::max/min
int diff = std::max(x,y)-std::min(x,y);
This does two comparisons and one subtraction (the subtraction is unavoidable, though it could be optimized through certain bitwise operations in specific cases; the compiler might actually do this for you). Also, if the compiler is smart enough, it could do only one comparison and reuse the result for the other: e.g. if x > y, the first comparison already tells you that min(x, y) is y, but I'm not sure if compilers take advantage of this.

Stack versus Integer

I've created a program to solve Cryptarithmetics for a class on Data Structures. The professor recommended that we utilize a stack consisting of linked nodes to keep track of which letters we replaced with which numbers, but I realized an integer could do the same trick. Instead of a stack {A, 1, B, 2, C, 3, D, 4} I could hold the same info in 1234.
My program, though, seems to run much more slowly than the estimation he gave us. Could someone explain why a stack would behave much more efficiently? I had assumed that, since I wouldn't be calling methods over and over again (push, pop, top, etc) and instead just add one to the 'solution' that mine would be faster.
This is not an open ended question, so do not close it. Although you can implement things in different ways, I want to know why, at the heart of C++, accessing data via a stack has performance benefits over storing it in an int and extracting digits with the mod operator.
Although this is homework, I don't actually need help, just very intrigued and curious.
Thanks and can't wait to learn something new!
EDIT (Adding some code)
letterAssignments is an int array of size 26. for a problem like SEND + MORE = MONEY, A isn't used so letterAssignments[0] is set to 11. All chars that are used are initialized to 10.
answerNum is a number with as many digits as there are unique characters (in this case, 8 digits).
int Cryptarithmetic::solve(){
while(!solved()){
for(size_t z = 0; z < 26; z++){
if(letterAssignments[z] != 11) letterAssignments[z] = 10;
}
if(answerNum < 1) return NULL;
size_t curAns = answerNum;
for(int i = 0; i < numDigits; i++){
if(nextUnassigned() != '$') {
size_t nextAssign = curAns % 10;
if(isAssigned(nextAssign)){
answerNum--;
continue;
}
assign(nextUnassigned(), nextAssign);
curAns /= 10;
}
}
answerNum--;
}
return answerNum;
}
Two helper methods in case you'd like to see them:
char Cryptarithmetic::nextUnassigned(){
char nextUnassigned = '$';
for(int i = 0; i < 26; i++) {
if(letterAssignments[i] == 10) return ('A' + i);
}
return nextUnassigned; // '$' signals that every letter is assigned
}
void Cryptarithmetic::assign(char letter, size_t val){
assert('A' <= letter && letter <= 'Z'); // valid letter
assert(letterAssignments[letter-'A'] != 11); // has this letter
assert(!isAssigned(val)); // not already assigned.
letterAssignments[letter-'A'] = val;
}
From the looks of things, the way you are doing things here is quite inefficient.
As a general rule, try to have the fewest for loops possible, since each one will slow down your implementation greatly.
For instance, if we strip all other code away, your program looks like
while(thing) {
for(z < 26) {
}
for(i < numDigits) {
for(i < 26) {
}
for(i < 26) {
}
}
}
This means that for each while loop you are doing ((26+26)*numDigits)+26 loop operations. That's assuming isAssigned() does not use a loop.
Ideally you want:
while(thing) {
for(i < numDigits) {
}
}
which I'm sure is possible with changes to your code.
This is why your implementation with the integer array is much slower than an implementation using the stack, which does not use the for(i < 26) loops (I assume).
In answer to your original question, however, storing an array of integers will always be faster than any struct you can come up with, simply because there are more overheads involved in assigning the memory, calling functions, etc.
But as with everything, implementation is the key difference between a slow program and a fast program.
The problem is that by counting you are also considering repetitions, whereas the problem presumably asks you to assign a different number to each different letter so that the numeric equation holds.
For example, for four letters you are testing 10*10*10*10 = 10000 letter-to-number mappings instead of 10*9*8*7 = 5040 of them (the more letters there are, the bigger the ratio between the two numbers becomes...).
The div instruction used by the mod operation is quite expensive. Using it for your purpose can easily be less efficient than a good stack implementation. Here is an instruction timings table: http://gmplib.org/~tege/x86-timing.pdf
You should also write unit tests for your int-based stack to make sure that it works as intended.
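For instance, a few hypothetical sanity checks for the digits-packed-into-an-int scheme (push is n = n * 10 + d, pop is n % 10 then n /= 10; note a leading digit of 0 cannot be represented this way):
#include <cassert>

int main() {
    int n = 0;
    n = n * 10 + 1;      // push 1
    n = n * 10 + 2;      // push 2
    assert(n % 10 == 2); // top is the last digit pushed
    n /= 10;             // pop
    assert(n % 10 == 1); // the previous digit is now on top
}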
Programming is often a trade of memory for time and vice versa.
Here you are packing data into an integer: you save memory but lose time.
Speed of course depends on the implementation of the stack. C++ is C with classes; if you are not using classes, it's basically C (as fast as C).
#include <cassert>

const int stack_size = 26;
struct Stack
{
    int _data[stack_size];
    int _stack_p;
    Stack()
        : _stack_p(0)
    {}
    inline void push(int val)
    {
        assert(_stack_p < stack_size); // this won't be overhead in release builds,
                                       // since asserts are compiled out with -DNDEBUG
        _data[_stack_p++] = val;
    }
    inline int pop()
    {
        assert(_stack_p > 0); // same thing. assert is very useful for tracing bugs
        return _data[--_stack_p];
    }
    inline int size() const
    {
        return _stack_p;
    }
    inline int val(int i) const
    {
        assert(i >= 0 && i < _stack_p);
        return _data[i];
    }
};
There is no overhead like a vtable pointer. Also, pop() and push() are very simple, so they will be inlined, meaning no function-call overhead. Using int as the stack element is also good for speed, because int is typically the processor's natural word size (no alignment issues, etc.).
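A quick usage sketch of the struct above:
int main() {
    Stack s;
    s.push(3);
    s.push(7);
    assert(s.size() == 2);
    assert(s.pop() == 7); // LIFO: the last value pushed comes out first
    assert(s.pop() == 3);
}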

Custom sorting, always force 0 to back of ascending order?

Premise
This problem has a known solution (shown below actually), I'm just wondering if anyone has a more elegant algorithm or any other ideas/suggestions on how to make this more readable, efficient, or robust.
Background
I have a list of sports competitions that I need to sort in an array. Due to the nature of this array's population, 95% of the time the list will be presorted, so I use an improved bubble sort to sort it (since it approaches O(n) on nearly-sorted lists).
The bubble sort has a helper function called CompareCompetitions that compares two competitions and returns >0 if comp1 is greater, <0 if comp2 is greater, 0 if the two are equal. The competitions are compared first by a priority field, then by game start time, and then by Home Team Name.
The priority field is the trick to this problem. It is an int that holds a positive value or 0. Competitions are sorted with priority 1 first, 2 second, and so on, with the exception that 0 or invalid values always come last.
e.g. the list of priorities
0, 0, 0, 2, 3, 1, 3, 0
would be sorted as
1, 2, 3, 3, 0, 0, 0, 0
The other little quirk, and this is important to the question, is that 95% of the time priority will be its default 0, because it is only changed if the user wants to manually change the sort order, which is rare. So the most frequent case in the compare function is that the priorities are equal and 0.
The Code
This is my existing compare algorithm.
int CompareCompetitions(const SWI_COMPETITION &comp1,const SWI_COMPETITION &comp2)
{
if(comp1.nPriority == comp2.nPriority)
{
//Priorities equal
//Compare start time
int ret = comp1.sStartTime24Hrs.CompareNoCase(comp2.sStartTime24Hrs);
if(ret != 0)
{
return ret; //return compare result
}else
{
//Equal so far
//Compare Home team Name
ret = comp1.sHLongName.CompareNoCase(comp2.sHLongName);
return ret;//Home team name is last field to sort by, return that value
}
}
else if(comp1.nPriority > comp2.nPriority)
{
if(comp2.nPriority <= 0)
return -1;
else
return 1;//comp1 has lower priority
}else /*(comp1.nPriority < comp2.nPriority)*/
{
if(comp1.nPriority <= 0)
return 1;
else
return -1;//comp1 one has higher priority
}
}
Question
How can this algorithm be improved?
And more importantly...
Is there a better way to force 0 to the back of the sort order?
I want to emphasize that this code seems to work just fine, but I am wondering if there is a more elegant or efficient algorithm that anyone can suggest. Remember that nPriority will almost always be 0, and the competitions will usually sort by start time or home team name, but priority must always override the other two.
Isn't it just this?
if (a==b) return other_data_compare(a, b);
if (a==0) return 1;
if (b==0) return -1;
return a - b;
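Spelled out for the structs in the question, where a and b stand for the two priorities (CompareOtherFields is a hypothetical helper wrapping the start-time and team-name comparisons), that might look like:
int CompareCompetitions(const SWI_COMPETITION &comp1, const SWI_COMPETITION &comp2)
{
    int a = comp1.nPriority;
    int b = comp2.nPriority;
    if (a == b) return CompareOtherFields(comp1, comp2);
    if (a == 0) return 1;  // zero priority always sorts last
    if (b == 0) return -1;
    return a - b;
}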
You can also reduce some of the code verbosity using the ternary operator, like this:
int CompareCompetitions(const SWI_COMPETITION &comp1,const SWI_COMPETITION &comp2)
{
if(comp1.nPriority == comp2.nPriority)
{
//Priorities equal
//Compare start time
int ret = comp1.sStartTime24Hrs.CompareNoCase(comp2.sStartTime24Hrs);
return ret != 0 ? ret : comp1.sHLongName.CompareNoCase(comp2.sHLongName);
}
else if(comp1.nPriority > comp2.nPriority)
return comp2.nPriority <= 0 ? -1 : 1;
else /*(comp1.nPriority < comp2.nPriority)*/
return comp1.nPriority <= 0 ? 1 : -1;
}
See?
This is much shorter and in my opinion easily read.
I know it's not what you asked for but it's also important.
Is it intended that, in the case where nPriority1 < 0 and nPriority2 < 0 but nPriority1 != nPriority2, the other data aren't compared?
If it isn't, I'd use something like
int nPriority1 = comp1.nPriority <= 0 ? INT_MAX : comp1.nPriority;
int nPriority2 = comp2.nPriority <= 0 ? INT_MAX : comp2.nPriority;
if (nPriority1 == nPriority2) {
// current code
} else {
return nPriority1 - nPriority2;
}
which will consider values less or equal to 0 the same as the maximum possible value.
(Note that optimizing for performance is probably not worthwhile if you consider that there are case-insensitive string comparisons in the most common path.)
If you can, it seems like modifying the priority scheme would be the most elegant, so that you could just sort normally. For example, instead of storing a default priority as 0, store it as 999, and cap user defined priorities at 998. Then you won't have to deal with the special case anymore, and your compare function can have a more straightforward structure, with no nesting of if's:
(pseudocode)
if (priority1 < priority2) return -1;
if (priority1 > priority2) return 1;
if (startTime1 < startTime2) return -1;
if (startTime1 > startTime2) return 1;
if (teamName1 < teamName2) return -1;
if (teamName1 > teamName2) return 1;
return 0; // exact match!
I think the inelegance you feel about your solution comes from duplicate code for the zero priority exception. The Pragmatic Programmer explains that each piece of information in your source should be defined in "one true" place. To the naive programmer reading your function, you want the exception to stand-out, separate from the other logic, in one place, so that it is readily understandable. How about this?
if(comp1.nPriority == comp2.nPriority)
{
// unchanged
}
else
{
int result, lowerPriority;
if(comp1.nPriority > comp2.nPriority)
{
result = 1;
lowerPriority = comp2.nPriority;
}
else
{
result = -1;
lowerPriority = comp1.nPriority;
}
// zero is an exception: always goes last
if(lowerPriority == 0)
result = -result;
return result;
}
I Java-ized it, but the approach will work fine in C++:
int CompareCompetitions(Competition comp1, Competition comp2) {
int n = comparePriorities(comp1.nPriority, comp2.nPriority);
if (n != 0)
return n;
n = comp1.sStartTime24Hrs.compareToIgnoreCase(comp2.sStartTime24Hrs);
if (n != 0)
return n;
n = comp1.sHLongName.compareToIgnoreCase(comp2.sHLongName);
return n;
}
private int comparePriorities(int a, int b) { // int, not Integer: == on Integer compares references
if (a == b)
return 0;
if (a <= 0)
return -1;
if (b <= 0)
return 1;
return a - b;
}
Basically, just extract the special-handling-for-zero behavior into its own function, and iterate along the fields in sort-priority order, returning as soon as you have a nonzero.
As long as the highest priority is not larger than INT_MAX/2, you could do
#include <climits>
const int bound = INT_MAX/2;
int pri1 = (comp1.nPriority + bound) % (bound + 1);
int pri2 = (comp2.nPriority + bound) % (bound + 1);
This will turn priority 0 into bound and shift all other priorities down by 1. The advantage is that you avoid comparisons and make the remainder of the code look more natural.
In response to your comment, here is a complete solution that avoids the translation in the 95% case where priorities are equal. Note, however, that your concern over this is misplaced since this tiny overhead is negligible with respect to the overall complexity of this case, since the equal-priorities case involves at the very least a function call to the time comparison method and at worst an additional call to the name comparator, which is surely at least an order of magnitude slower than whatever you do to compare the priorities. If you are really concerned about efficiency, go ahead and experiment. I predict that the difference between the worst-performing and best-performing suggestions made in this thread won't be more than 2%.
#include <climits>
int CompareCompetitions(const SWI_COMPETITION &comp1,const SWI_COMPETITION &comp2)
{
if(comp1.nPriority == comp2.nPriority) {
    if(int ret = comp1.sStartTime24Hrs.CompareNoCase(comp2.sStartTime24Hrs))
        return ret;
    else
        return comp1.sHLongName.CompareNoCase(comp2.sHLongName);
}
const int bound = INT_MAX/2;
int pri1 = (comp1.nPriority + bound) % (bound + 1);
int pri2 = (comp2.nPriority + bound) % (bound + 1);
return pri1 > pri2 ? 1 : -1;
}
Depending on your compiler/hardware, you might be able to squeeze out a few more cycles by replacing the last line with
return (pri1 > pri2) * 2 - 1;
or
return (pri1-pri2 > 0) * 2 - 1;
or (assuming 2's complement)
return ((pri1-pri2) >> (CHAR_BIT*sizeof(int) - 1)) | 1;
Final comment: Do you really want CompareCompetitions to return 1,-1,0 ? If all you need it for is bubble sort, you would be better off with a function returning a bool (true if comp1 is ">=" comp2 and false otherwise). This would simplify (albeit slightly) the code of CompareCompetitions as well as the code of the bubble sorter. On the other hand, it would make CompareCompetitions less general-purpose.
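A sketch of that bool-returning variant, reusing the three-way comparison above (suitable for a bubble sort's "already in order?" test; note that std::sort would instead need a strict "comes before" predicate):
bool InOrder(const SWI_COMPETITION &comp1, const SWI_COMPETITION &comp2)
{
    return CompareCompetitions(comp1, comp2) <= 0;
}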