Optimizing bubble sort - What am I missing? - c++

I'm trying to understand possible optimization methods for the bubble sort algorithm. I know there are better sorting methods, but I'm just curious.
To test efficiency I'm using std::chrono. The program sorts an int array of 10000 numbers 30 times and prints the average sorting time. The numbers are picked randomly (up to 10000) in every iteration. Here is the code, with no optimization:
#include <iostream>
#include <ctime>
#include <chrono>
using namespace std;
int main() {
//bubble sort
srand(time(NULL));
chrono::time_point<chrono::high_resolution_clock> start, end; // must match the clock used for now() below
const int n = 10000;
int i,j, last, tests = 30,arr[n];
long long total = 0;
bool out;
while (tests-->0) {
for (i = 0; i < n; i++) {
arr[i] = rand() % 1000;
}
j = n;
start = chrono::high_resolution_clock::now();
while(1){
out = 0;
for (i = 0; i < j - 1; i++) {
if (arr[i + 1] < arr[i]) {
swap(arr[i + 1], arr[i]);
out = 1;
}
}
if (!out) {
break;
}
//j--;
}
end = chrono::high_resolution_clock::now();
total += chrono::duration_cast<chrono::nanoseconds>(end - start).count();
cout << "Remaining :"<<tests << endl;
}
cout << "Average :" << total / static_cast<double>(30)/1000000000<<" seconds"; // tests(30) + nanosec -> sec
cin.sync();
cin.ignore();
return 0;
}
I get 0.17 seconds average sorting time.
If I uncomment the j--; line to avoid comparing numbers that are already sorted, I get an average of 0.12 seconds, which is understandable.
If I remember the last position where a swap took place, I know that after that index, elements are sorted, and can thus sort up to that position in further iterations. It's better explained in the second part of this post: https://stackoverflow.com/a/16196115/1967496.
This is the code that implements the new possible optimization:
#include <iostream>
#include <ctime>
#include <chrono>
using namespace std;
int main() {
//bubble sort
srand(time(NULL));
chrono::time_point<chrono::high_resolution_clock> start, end; // must match the clock used for now() below
const int n = 10000;
int i,j, last, tests = 30,arr[n];
long long total = 0;
bool out;
while (tests-->0) {
for (i = 0; i < n; i++) {
arr[i] = rand() % 1000;
}
j = n;
start = chrono::high_resolution_clock::now();
while(1){
out = 0;
for (i = 0; i < j - 1; i++) {
if (arr[i + 1] < arr[i]) {
swap(arr[i + 1], arr[i]);
out = 1;
last = i;
}
}
if (!out) {
break;
}
j = last + 1;
}
end = chrono::high_resolution_clock::now();
total += chrono::duration_cast<chrono::nanoseconds>(end - start).count();
cout << "Remaining :"<<tests << endl;
}
cout << "Average :" << total / static_cast<double>(30)/1000000000<<" seconds"; // tests(30) + nanosec -> sec
cin.sync();
cin.ignore();
return 0;
}
Note the two new lines, last = i; inside the if and j = last + 1; after the for loop. And here comes the problem: the average time is now back to around 0.17 seconds.
Is there a problem in my code, or am I missing something?
Update:
I sorted 10 times as many numbers and now get the following results:
No optimization: 19.3 seconds
First optimization (j--): 14.5 seconds
Second (supposed) optimization (j=last+1): 17.4 seconds
From my understanding, the second method should be in any case better than the first, but the numbers tell something else.

Well... the problem is that there might not be a right or wrong answer to this question.
First of all, sorting only 10000 elements is not much of an efficiency test. Try a much larger number of elements, maybe 500000 (although you will probably need to allocate the array dynamically for that).
Second, it might be the compiler. Compilers often optimize code so that the program runs faster, so the measured times may not reflect the source exactly as written.
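One way to take the compiler and the timer out of the picture is to count comparisons instead of measuring time. The sketch below is not from the original post: bubble_sort_count and the Bound enum are made-up names, and it only shows how much algorithmic work each variant does, not why one of them runs slower on a given machine.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical selector for how the inner loop's upper bound shrinks.
enum class Bound { None, Decrement, LastSwap };

// Sorts v in place and returns the number of element comparisons performed.
std::size_t bubble_sort_count(std::vector<int>& v, Bound mode) {
    std::size_t comparisons = 0;
    std::size_t j = v.size();  // elements at index >= j are already in place
    while (true) {
        bool swapped = false;
        std::size_t last = 0;
        for (std::size_t i = 0; i + 1 < j; ++i) {
            ++comparisons;
            if (v[i + 1] < v[i]) {
                std::swap(v[i + 1], v[i]);
                swapped = true;
                last = i;  // remember where the most recent swap happened
            }
        }
        if (!swapped) break;
        if (mode == Bound::Decrement) {
            --j;           // the "j--" variant
        } else if (mode == Bound::LastSwap) {
            j = last + 1;  // the "j = last + 1" variant
        }
    }
    return comparisons;
}
```

On the same input, the LastSwap variant never performs more comparisons than the Decrement variant, so if it still measures slower, the difference has to come from the cost per iteration (for example the extra store to last) rather than from the amount of comparing.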

Why my Shell sorting is so slow

I am trying to implement the Shell sort algorithm myself. I wrote my own code without looking at any code samples; I only watched a video describing the algorithm.
My sort works but is very slow (bubble sort 100 items - 0.007 s; shell sort 100 items - 4.83 s). How can I improve it?
void print(vector<float>vec)
{
for (float i : vec)
cout << i << " ";
cout << "\n\n";
}
void Shell_sorting(vector<float>&values)
{
int swapping = 0;
int step = values.size();
clock_t start;
double duration;
start = clock();
while (step/2 >= 1)
{
step /= 2;
for (int i = 0; i < values.size()-step; i++)
{
if ((i + step < values.size()))
{
if ((values[i + step] < values[i]))
{
swap(values[i], values[i + step]);
print(values);
++swapping;
int c = i;
while (c - step > 0)
{
if (values[c] < values[c - step])
{
swap(values[c], values[c - step]);
print(values);
++swapping;
c -= step;
}
else
break;
}
}
}
else
break;
}
}
duration = (clock() - start) / (double)CLOCKS_PER_SEC;
print(values);
cout << swapping << " " << duration;
print(values);
}
A better implementation could be:
#include <iostream>
#include <vector>
int main()
{
std::vector<int> vec = {
726,621,81,719,167,958,607,130,263,108,
134,235,508,407,153,162,849,923,996,975,
250,78,460,667,654,62,865,973,477,912,
580,996,156,615,542,655,240,847,613,497,
274,241,398,84,436,803,138,677,470,606,
226,593,620,396,460,448,198,958,566,599,
762,248,461,191,933,805,288,185,21,340,
458,592,703,303,509,55,190,318,310,189,
780,923,933,546,816,627,47,377,253,709,
992,421,587,768,908,261,946,75,682,948,
};
std::vector<int> gaps = {5, 2, 1};
int j;
for (int gap : gaps) {
for (int i = gap; i < vec.size(); i++)
{
j = i-gap;
while (j >= 0) {
if (vec[j+gap] < vec[j])
{
int temp = vec[j+gap];
vec[j+gap] = vec[j];
vec[j] = temp;
j = j-gap;
}
else break;
}
}
}
for (int item : vec) std::cout << item << " " << std::endl;
return 0;
}
I prefer to use a vector to store the gap values so that you do not need to compute the division (which is an expensive operation). Besides, this choice gives your code more flexibility.
The outer loop iterates over the gap values. Once a gap is chosen, you iterate over your vector, starting from vec[gap], and check whether there are elements smaller than it, according to the logic of Shell sort.
So, you start by setting j=i-gap and test the if condition. If it is true, swap the items and then repeat the while loop after decreasing j by gap. Note: vec[j+gap] is the element that was swapped in the previous loop cycle. If the condition is false, there's no reason to continue in the loop, so you can exit from it with a break.
On my machine, it took 0.002s calculated using the time shell command (the time includes the process of printing numbers).
P.S. To generate all those numbers and write them into the array, since I'm too lazy to write a random function, I used this link and then edited the output in the shell with:
sed -e 's/[[:space:]]/,/g' num | sed -e 's/$/,/'
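The gap-vector idea from this answer could also be wrapped in a reusable function. This is only a sketch under my own assumptions: shell_sort is a made-up name, and it requires the gap sequence to end with 1, because otherwise the result is not guaranteed to be fully sorted.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts v in place using the given gap sequence (largest gap first).
void shell_sort(std::vector<int>& v, const std::vector<std::size_t>& gaps) {
    // The final pass with gap 1 is an ordinary insertion sort that finishes the job.
    assert(!gaps.empty() && gaps.back() == 1);
    for (std::size_t gap : gaps) {
        for (std::size_t i = gap; i < v.size(); ++i) {
            // Walk backwards in steps of gap, swapping while the pair is out of order.
            for (std::size_t j = i; j >= gap && v[j] < v[j - gap]; j -= gap) {
                std::swap(v[j], v[j - gap]);
            }
        }
    }
}
```

Calling shell_sort(vec, {5, 2, 1}) uses the same gaps as the code above. Nothing is printed inside the loops, which matters: printing the whole vector after every swap, as the question's code does, will dominate the running time.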

optimizing code: fibonacci algorithm

I'm working on a Fibonacci algorithm for really big numbers (the 100,000th number). I need to make it run faster, though only by a couple of seconds, and I've run out of ideas. Is there any way to make it faster? Thanks for the help.
#include <iostream>
using namespace std;
int main() {
string elem_major = "1";
string elem_minor = "0";
short elem_maj_int;
short elem_min_int;
short sum;
int length = 1;
int ten = 0;
int n;
cin >> n;
for (int i = 1; i < n; i++)
{
for (int j = 0; j < length; j++)
{
elem_maj_int = short(elem_major[j] - 48);
elem_min_int = short(elem_minor[j] - 48);
sum = elem_maj_int + elem_min_int + ten;
ten = 0;
if (sum > 9)
{
sum -= 10;
ten = 1;
if (elem_major[j + 1] == NULL)
{
elem_major += "0";
elem_minor += "0";
length++;
}
}
elem_major[j] = char(sum + 48);
elem_minor[j] = char(elem_maj_int + 48);
}
}
for (int i = length-1; i >= 0; i--)
{
cout << elem_major[i];
}
return 0;
}
No matter how well you optimize the given code, without changing the underlying algorithm you can only improve it marginally. Your approach has linear complexity, and for big values it will quickly become slow. A faster way to compute Fibonacci numbers is exponentiation by squaring of the matrix:
0 1
1 1
This approach has logarithmic complexity, which is asymptotically better. Compute a few powers of this matrix and you'll notice that the (n+1)st Fibonacci number is in the lower right corner of the nth power.
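As a rough illustration of the matrix approach, here is a minimal sketch using 64-bit integers, so it only works up to n = 92. For the 100,000th number the same structure would need a big-integer type in place of std::uint64_t, and each multiplication would then cost more than constant time. The names Mat, mul and fib are made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// A 2x2 matrix stored row-major: {m00, m01, m10, m11}.
using Mat = std::array<std::uint64_t, 4>;

Mat mul(const Mat& a, const Mat& b) {
    return {a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
            a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3]};
}

// Returns F(n) with F(0) = 0, F(1) = 1, using exponentiation by squaring.
std::uint64_t fib(unsigned n) {
    assert(n <= 92);           // F(93) is the largest entry that still fits here
    Mat result = {1, 0, 0, 1}; // identity matrix
    Mat base = {0, 1, 1, 1};   // the matrix from the answer
    while (n > 0) {
        if (n & 1u) result = mul(result, base);
        n >>= 1;
        if (n > 0) base = mul(base, base); // skip the unused final squaring
    }
    // result is now {F(n-1), F(n), F(n), F(n+1)}; the off-diagonal entry is F(n).
    return result[1];
}
```

For example, fib(10) gives 55 after four loop iterations instead of ten additions.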
I suggest you use something like cpp-bigint (http://sourceforge.net/projects/cpp-bigint/) for your big numbers.
The code would look like this then
#include <iostream>
#include "bigint.h"
using namespace std;
int main() {
BigInt::Rossi num1(0);
BigInt::Rossi num2(1);
BigInt::Rossi num_next(1);
int n = 100000;
for (int i = 0; i < n - 1; ++i)
{
num_next = num1 + num2;
num1 = std::move(num2);
num2 = std::move(num_next);
}
cout << num_next.toStrDec() << endl;
return 0;
}
Quick benchmark on my machine:
time ./yourFib
real 0m8.310s
user 0m8.301s
sys 0m0.005s
time ./cppBigIntFib
real 0m2.004s
user 0m1.993s
sys 0m0.006s
I would save some precomputed points (especially since you are looking for really big numbers).
For example, say I saved the 500th and 501st Fibonacci numbers. Then if someone asks for the 600th, I would start computing from the 502nd rather than from the 1st. This would really save time.
Now the question is how many points to save, and how to select them.
The answer depends entirely on the application and the probable distribution of queries.
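The checkpoint idea might look roughly like this. It is a sketch with assumptions of my own: Checkpoints and fib_from_checkpoint are hypothetical names, and it uses std::uint64_t (so n is limited to 92) just to keep the example short; the real use case would store big-integer pairs instead.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical checkpoint table: index k -> (F(k), F(k+1)).
using Checkpoints = std::map<unsigned, std::pair<std::uint64_t, std::uint64_t>>;

// Computes F(n), resuming from the closest saved checkpoint at or below n.
std::uint64_t fib_from_checkpoint(const Checkpoints& saved, unsigned n) {
    assert(n <= 92);                 // keeps F(n+1) within 64 bits
    unsigned k = 0;
    std::uint64_t a = 0, b = 1;      // F(0), F(1): the fallback starting point
    auto it = saved.upper_bound(n);  // first checkpoint with index > n
    if (it != saved.begin()) {
        --it;                        // greatest checkpoint with index <= n
        k = it->first;
        a = it->second.first;
        b = it->second.second;
    }
    for (; k < n; ++k) {             // advance one index per step
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

With a checkpoint at index 50, asking for F(60) takes 10 additions instead of 60.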

Whats wrong in my code! I want to print Fibonacci Series with values less than 4000000

I think there is some problem in the implementation of my loop!
Here's my code.
#include <iostream>
using namespace std;
int main()
{
int i=2;
long long int FiboNo[100];
FiboNo[0] = 1;
FiboNo[1] = 2;
do{
FiboNo[i]=FiboNo[(i-1)]+FiboNo[(i-2)];
cout<<FiboNo[i]<<endl;
i++;
}while(FiboNo[i]<4000000);
return 0;
}
do {
FiboNo[i] = FiboNo[(i - 1)] + FiboNo[(i - 2)];
cout << FiboNo[i] << endl;
i++;
} while (FiboNo[i] < 4000000);
You are incrementing i before you compare.
do {
FiboNo[i] = FiboNo[(i - 1)] + FiboNo[(i - 2)];
cout << FiboNo[i] << endl;
} while (FiboNo[i++] < 4000000);
is what you want to do.
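Here's what's happening:
i is 2
FiboNo[2] is set to 3
now i is 3
FiboNo[3] has not been assigned yet (it is uninitialized, often 0)
So the condition never tests the value that was just computed. Even when FiboNo[someIndex] reaches the limit, the loop won't exit, because the value being compared is an unassigned element rather than the new Fibonacci number.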
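Note that a do-while which prints before it tests will still print the first term that reaches the limit. If only values below 4000000 are wanted, a loop that tests first may be simpler. The sketch below is one possible way to write it, not the asker's code: fibonacci_below is a made-up name, and it also returns the starting terms 1 and 2, which the original loop never printed.

```cpp
#include <vector>

// Returns the terms 1, 2, 3, 5, 8, ... that are strictly below limit.
// Assumes limit is small enough that a + b does not overflow long long.
std::vector<long long> fibonacci_below(long long limit) {
    std::vector<long long> terms;
    long long a = 1, b = 2;      // same starting values as in the question
    while (a < limit) {          // test the current term before using it
        terms.push_back(a);
        long long next = a + b;
        a = b;
        b = next;
    }
    return terms;
}
```

Because only the last two terms are needed, this also avoids the fixed-size FiboNo[100] array.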

How to produce random numbers so that their sum is equal to given number?

I want to produce X random numbers, each from the interval <0; Y> (given Y as a maximum of each number), but there is restriction that the sum of these numbers must be equal to Z.
Example:
5 random numbers, each at most 6, whose sum must be equal to 14, e.g.: 0, 2, 6, 4, 2
Is there already a C/C++ function that could do something like that?
Personally I couldn't come up with more than some ugly if-else constructs.
Since you don't need the generated sequence to be uniform, this could be one of the possible solutions:
#include <iostream>
#include <vector>
#include <cstdlib>
#include <ctime>
int irand(int min, int max) {
return ((double)rand() / ((double)RAND_MAX + 1.0)) * (max - min + 1) + min;
}
int main()
{
int COUNT = 5, // X
MAX_VAL = 6, // Y
MAX_SUM = 14; // Z
std::vector<int> buckets(COUNT, 0);
srand(time(0));
int remaining = MAX_SUM;
while (remaining > 0)
{
int rndBucketIdx = irand(0, COUNT-1);
if (buckets[rndBucketIdx] == MAX_VAL)
continue; // this bucket is already full
buckets[rndBucketIdx]++;
remaining--;
}
std::cout << "Printing sequence: ";
for (size_t i = 0; i < COUNT; ++i)
std::cout << buckets[i] << ' ';
}
which simply distributes the total sum across a bunch of buckets, one unit at a time, until it's gone :)
Example of output: Printing sequence: 4 4 1 0 5
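The same bucket idea can be written with the C++11 random header and a guard against impossible inputs. This is a sketch rather than a drop-in replacement: random_with_sum is a hypothetical name, and like the code above it makes no claim that the result is uniformly distributed over all valid sequences.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Returns count values, each in [0, max_val], whose sum is exactly total.
std::vector<int> random_with_sum(int count, int max_val, int total, std::mt19937& gen) {
    // If total exceeded count * max_val, the loop below would never terminate.
    assert(count > 0 && max_val >= 0 && total >= 0 && total <= count * max_val);
    std::vector<int> buckets(count, 0);
    std::uniform_int_distribution<int> pick(0, count - 1);
    for (int remaining = total; remaining > 0;) {
        int idx = pick(gen);
        if (buckets[idx] == max_val) continue; // this bucket is full, pick again
        ++buckets[idx];
        --remaining;
    }
    return buckets;
}
```

A call such as random_with_sum(5, 6, 14, gen), with gen seeded from std::random_device, corresponds to the X = 5, Y = 6, Z = 14 example from the question.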
NOTE: this solution was written when the question specified a "MAX SUM" parameter, implying a sum of less than that amount was equally acceptable. The question's now been edited based on the OP's comment that they meant the cumulative sum must actually hit that target. I'm not going to update this answer, but clearly it could trivially discard lesser totals at the last level of recursion.
This solution does a one-time population of a vector<vector<int>> with all the possible combinations of numbers solving the input criterion, then each time a new solution is wanted it picks one of those at random and shuffles the numbers into a random order (thereby picking a permutation of the combination).
It's a bit heavy weight - perhaps not suitable for the actual use that you mentioned after I'd started writing it ;-P - but it produces an even-weighted distribution, and you can easily do things like guarantee a combination won't be returned again until all other combinations have been returned (with a supporting shuffled vector of indices into the combinations).
#include <iostream>
#include <vector>
#include <algorithm>
#include <cstdlib>
using std::min;
using std::max;
using std::vector;
// print solutions...
void p(const vector<vector<int>>& vvi)
{
for (int i = 0; i < vvi.size(); ++i)
{
for (int j = 0; j < vvi[i].size(); ++j)
std::cout << vvi[i][j] << ' ';
std::cout << '\n';
}
}
// populate results with solutions...
void f(vector<vector<int>>& results, int n, int max_each, int max_total)
{
if (n == 0) return;
if (results.size() == 0)
{
for (int i = 0; i <= min(max_each, max_total); ++i)
results.push_back(vector<int>(2, i));
f(results, n - 1, max_each, max_total);
return;
}
vector<vector<int>> new_results;
for (int r = 0; r < results.size(); ++r)
{
int previous = *(results[r].rbegin() + 1);
int current_total = results[r].back();
int remaining = max_total - current_total;
for (int i = 0; i <= min(previous,min(max_each, remaining)); ++i)
{
vector<int> v = results[r];
v.back() = i;
v.push_back(current_total + i);
new_results.push_back(v);
}
}
results = new_results;
f(results, n - 1, max_each, max_total);
}
const vector<int>& once(vector<vector<int>>& solutions)
{
int which = std::rand() % solutions.size();
vector<int>& v = solutions[which];
std::random_shuffle(v.begin(), v.end() - 1);
return v;
}
int main()
{
vector<vector<int>> solutions;
f(solutions, 5, 6, 14);
std::cout << "All solution combinations...\n";
p(solutions);
std::cout << "------------------\n";
std::cout << "A few sample permutations...\n";
for (int n = 1; n <= 100; ++n)
{
const vector<int>& o = once(solutions);
for (int i = 0; i < o.size() - 1; ++i)
std::cout << o[i] << ' ';
std::cout << '\n';
}
}
#include<iostream>
#include <cstdlib> //rand ()
using namespace std;
int main()
{
int random ,x=5;
int max , totalMax=0 , sum=0;
cout<<"Enter the total maximum number : ";
cin>>totalMax;
cout<<"Enter the maximum number: ";
cin>>max;
srand(0);
for( int i=0; i<x ; i++)
{
random=rand()%max+1; //range from 1 to max
sum+=random;
if(sum>=totalMax)
{
sum-=random;
i--;
}
else
cout<<random<<' ';
}
cout<<endl<<"Reached total maximum number "<<totalMax<<endl;
}
I wrote this simple code.
I tested it using totalMax=14 and max=3 and it worked for me.
I hope it's what you asked for.
LiHo's answer looks pretty similar to my second suggestion, so I'll leave that, but here's an example of the first. It could probably be improved, but it shouldn't have any tragic bugs. Here's a live sample.
#include <algorithm>
#include <array>
#include <iostream>
#include <random>
int main() {
    std::random_device rd;
    std::mt19937 gen(rd());
    constexpr int MAX = 14;
    constexpr int LINES = 5;
    int sum{};
    int maxNum = 6;
    int minNum{};
    std::array<int, LINES> nums;
    for (int i = 0; i < LINES; ++i) {
        // never let a single number exceed what is still missing from MAX
        maxNum = std::min(maxNum, MAX - sum);
        // e.g., after 0 0, min is 2 because only 12/14 can be filled after
        int maxAfterThis = maxNum * (LINES - i - 1);
        minNum = std::min(maxNum, std::max(minNum, MAX - sum - maxAfterThis));
        std::uniform_int_distribution<> dist(minNum, maxNum);
        int num = dist(gen);
        nums[i] = num;
        sum += num;
    }
    // shuffle so the output order does not follow the generation order
    std::shuffle(std::begin(nums), std::end(nums), gen);
    for (int num : nums) std::cout << num << ' ';
    std::cout << '\n';
}
Creating that distribution every time could potentially slow it down (I don't know), but the range has to go in the constructor, and I'm not one to say how well distributed these numbers are. However, the logic is pretty simple. Aside from that, it uses the nice, shiny C++11 <random> header.
We just make sure no remaining number goes over MAX (14) and that MAX is reached by the end. minNum is the odd part, and that's due to how it progresses. It starts at zero and works its way up as needed (the second part to std::max is figuring out what would be needed if we got 6s for the rest), but we can't let it surpass maxNum. I'm open to a simpler method of calculating minNum if it exists.
Since you know how many numbers you need, generate them from the given distribution but without further conditions, store them, compute the actual sum, and scale them all up/down to get the desired sum.

Sieve Of Atkin is surprisingly slow

I recently became very interested in prime numbers and tried making programs to calculate them. I was able to make a sieve of Sundaram program that was able to calculate a million prime numbers in a couple of seconds. I believe that's pretty fast, but I wanted better. I went on to try to make a Sieve of Atkin, and slapped together working C++ code in 20 minutes after copying the pseudocode from Wikipedia.
I knew that it wouldn't be perfect because, after all, it's pseudocode. I was expecting at least better times than my Sundaram sieve though, but I was so wrong. It's very, very slow. I have looked it over many times but I cannot find any significant changes that could be made. When looking at my code, remember: I know it's inefficient, I know I used system commands, I know it's all over the place, but this isn't a project or anything important, it's for me.
#include <iostream>
#include <fstream>
#include <time.h>
#include <Windows.h>
#include <vector>
using namespace std;
int main(){
float limit;
float slimit;
long int n;
int counter = 0;
int squarenum;
int starttime;
int endtime;
vector <bool> primes;
ofstream save;
save.open("primes.txt");
save.clear();
cout << "Find all primes up to: " << endl;
cin >> limit;
slimit = sqrt(limit);
primes.resize(limit);
starttime = time(0);
// sets all values to false
for (int i = 0; i < limit; i++){
primes[i] = false;
}
//puts in possible primes
for (int x = 1; x <= slimit; x++){
for (int y = 1; y <= slimit; y++){
n = (4*x*x) + (y*y);
if (n <= limit && (n%12 == 1 || n%12 == 5)){
primes[n] = !primes[n];
}
n = (3*x*x) + (y*y);
if (n <= limit && n% 12 == 7){
primes[n] = !primes[n];
}
n = (3*x*x) - (y*y);
if ( x > y && n <= limit && n%12 == 11){
primes[n] = !primes[n];
}
}
}
//square number mark all multiples not prime
for (float i = 5; i < slimit; i++){
if (primes[i] == true){
for (long int k = i*i; k < limit; k = k + (i*i)){
primes[k] = false;
}
}
}
endtime = time(0);
cout << endl << "Calculations complete, saving in text document" << endl;
// loads to document
for (int i = 0 ; i < limit ; i++){
if (primes[i] == true){
save << counter << ") " << i << endl;
counter++;
}
}
save << "Found in " << endtime - starttime << " seconds" << endl;
save.close();
system("primes.txt");
system ("Pause");
return 0;
}
This isn't exactly an answer (IMO, you've already gotten an answer in the comments), but a quick standard for comparison. A sieve of Eratosthenes should find a million primes in well under a second on a reasonably modern machine.
#include <vector>
#include <iostream>
#include <time.h>
#include <locale>
unsigned long primes = 0;
int main() {
// empirically derived limit to get 1,000,000 primes
int number = 15485865;
clock_t start = clock();
std::vector<bool> sieve(number,false);
sieve[0] = sieve[1] = true;
for(int i = 2; i<number; i++) {
if(!sieve[i]) {
++primes;
for (int temp = 2*i; temp<number; temp += i)
sieve[temp] = true;
}
}
clock_t stop = clock();
std::cout.imbue(std::locale(""));
std::cout << "Total primes: " << primes << "\n";
std::cout << "Time: " << double(stop - start) / CLOCKS_PER_SEC << " seconds\n";
return 0;
}
Running this on my laptop, I get a result of:
Total primes: 1000000
Time: 0.106 seconds
Obviously, speed will vary somewhat with processor, clock speed, etc., but with anything reasonably modern, I'd still expect a time of less than a second. Of course, if you decide to write the primes out to a file, you can expect that to add some time, but even with that I'd expect a total time under a second--with my laptop's relatively slow hard drive, writing out the numbers only gets the total up to about 0.6 seconds.
vector<bool> is a bitset. It is expensive to update bitset values that are not in cache. Try vector<char> (or another byte-sized element type); it is much cheaper to write to.
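To try that suggestion in isolation, here is a small sieve of Eratosthenes that stores one byte per flag. It is a sketch for experimenting, not a tuned implementation: count_primes_below is a made-up name, and whether the byte-per-flag layout actually beats vector<bool> depends on the limit and on the machine's caches, so it is worth timing both.

```cpp
#include <cstddef>
#include <vector>

// Counts the primes strictly below limit using one char per flag.
std::size_t count_primes_below(std::size_t limit) {
    if (limit <= 2) return 0;               // no primes below 2
    std::vector<char> composite(limit, 0);  // 0 = not yet marked composite
    std::size_t count = 0;
    for (std::size_t i = 2; i < limit; ++i) {
        if (composite[i]) continue;
        ++count;
        // Start marking at i * i; the division guard avoids overflow in i * i.
        if (i <= (limit - 1) / i) {
            for (std::size_t m = i * i; m < limit; m += i) {
                composite[m] = 1;
            }
        }
    }
    return count;
}
```

The trade-off is memory: this uses eight times as much as the packed bitset for the same limit, in exchange for plain byte stores instead of read-modify-write on shared words.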