2 while loops vs if else statement in 1 while loop - c++

First a little introduction:
I'm a novice C++ programmer (I'm new to programming) writing a little multiplication tables practising program. The project started as a small program to teach myself the basics of programming and I keep adding new features as I learn more and more about programming. At first it just had basics like asking for input, loops and if-else statements. But now it uses vectors, reads and writes files, creates a directory etc.
You can see the code here: Project on Bitbucket
My program is now going to have 2 modes: practise a single multiplication table that the user can choose himself, or practise all multiplication tables mixed. Both modes work quite differently internally, so I developed the mixed mode as a separate program. That eased the development: I could just focus on writing the code itself instead of also worrying about how to integrate it into the existing code.
Below the code of the currently separate mixed mode program:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <cstdlib> // srand, rand
#include <ctime>   // time
using namespace std;

// Prints the next question for the selected table and returns the expected answer.
int newquestion(const vector<int>& remaining_multiplication_tables, const vector<int>& multiplication_tables, int table_selecter){
    cout << remaining_multiplication_tables[table_selecter] << " * " << multiplication_tables[remaining_multiplication_tables[table_selecter]-1] << " =" << "\n";
    return remaining_multiplication_tables[table_selecter] * multiplication_tables[remaining_multiplication_tables[table_selecter]-1];
}

int main(){
    int usersanswer_int;
    int cpu_answer;
    int table_selecter;
    string usersanswer;
    vector<int> remaining_multiplication_tables = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    vector<int> multiplication_tables(10, 1); // fill vector with 10 elements that contain the value '1'. This vector will store the "progress" of each multiplication_table.
    srand(time(0));
    table_selecter = rand() % remaining_multiplication_tables.size();
    cpu_answer = newquestion(remaining_multiplication_tables, multiplication_tables, table_selecter);
    while(remaining_multiplication_tables.size() != 0){
        if(!getline(cin, usersanswer)){
            break; // stop on end of input instead of looping forever
        }
        stringstream usersanswer_stream(usersanswer);
        usersanswer_stream >> usersanswer_int;
        if(usersanswer_int == cpu_answer){
            cout << "Your answer is correct! :)" << "\n";
            if(multiplication_tables[remaining_multiplication_tables[table_selecter]-1] == 10){
                // this table is finished: remove it from the remaining tables
                remaining_multiplication_tables.erase(remaining_multiplication_tables.begin() + table_selecter);
            }
            else{
                multiplication_tables[remaining_multiplication_tables[table_selecter]-1] += 1;
            }
            if(remaining_multiplication_tables.size() != 0){
                table_selecter = rand() % remaining_multiplication_tables.size();
                cpu_answer = newquestion(remaining_multiplication_tables, multiplication_tables, table_selecter);
            }
        }
        else{
            cout << "Unfortunately your answer isn't correct! :(" << "\n";
        }
    }
    return 0;
}
As you can see the newquestion function for the mixed mode is quite different. Also the while loop includes other mixed mode specific code.
Now if I want to integrate the mixed multiplication tables mode into the existing main program I have 2 choices:
-I can clutter up the while loop with if-else statements to check each time the loop runs whether mode == 10 (single multiplication table mode) or mode == 100 (mixed multiplication tables mode), and also place an if-else statement in the newquestion() function to check if mode == 10 or mode == 100.
-I can let the program check on startup whether the user chose single multiplication table or mixed multiplication tables mode and create 2 while loops and 2 newquestion() functions. That would look like this:
int newquestion_mixed(){
    //newquestion function for mixed mode
}
int newquestion_single(){
    //newquestion function for single mode
}
//initialization
if(mode == 10){
    //create necessary variables for single mode
    while(/* questions remain */){
        //single mode loop
    }
}
else{
    //create necessary variables for mixed mode
    while(/* questions remain */){
        //mixed mode loop
    }
}
Now why would I bother creating 2 separate loops and functions? Well isn't it inefficient if the program checks each time the loop runs (each time the user is asked a new question, for example: '5 * 3 =') which mode the user chose? I'm worried about the performance with this option. Now I hear you think: but why would you bother about performance for such a simple, little non-performance critical application with the extremely powerful processors today and the huge amounts of RAM? Well, as I said earlier this program is mainly about teaching myself a good coding style and learning how to program etc. So I want to teach myself the good habits from the beginning.
The 2 while loops and functions option is much more efficient and will use less CPU, but it takes more space and duplicates code. I don't know if this is good style either.
So basically I'm asking the experts what's the best style/way to handle this kind of thing. Also if you spot something bad in my code/bad style please tell me, I'm very open to feedback because I'm still a novice. ;)

First, a fundamental rule of programming is "don't prematurely optimize the code": don't fiddle around with little details before you have the code working correctly, and write code that expresses what you want done as clearly as possible. This is good coding style. To obsess over the details of "which is faster" (in a loop that spends most of its time waiting for the user to input some number) is not good coding style.
Once it's working correctly, analyse (using for example a profiler tool) where the code is spending its time (assuming performance is a major factor in the first place). Once you have located the major "hotspot", then try to make that better in some way. How you go about that depends very much on what that particular hot-spot code does.
Which performs best will depend heavily on the details of the code and the compiler (and which compiler optimizations are chosen). It is quite likely that having an if inside a while-loop will run slower, but modern compilers are quite clever, and I have certainly seen cases where the compiler hoists such a choice out of the loop, in cases where the conditions don't change. Having two while-loops is harder for the compiler to "make better", because it most likely won't see that you are doing the same thing in both loops [because the compiler works from the bottom of the parse-tree up, and it will optimize the inside of the while-loop first, then go out to the if-else side, and at that point it's "lost track" of what's going on inside each loop].
Which is clearer, to have one while loop with an if inside, or an if with two while-loops, that's another good question.
Of course, the object oriented solution is to have two classes - one for mixed, another for single - and just run one loop, that calls the relevant (virtual) member function of the object created based on an if-else statement before the loop.
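A minimal sketch of that object-oriented approach, assuming C++14 or later. All names here (Question, QuizMode, SingleTableMode, MixedMode, make_mode) are made up for illustration and are not taken from the asker's project; the mixed-mode bookkeeping simply mirrors the code in the question.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <vector>

// One multiplication question, e.g. 5 * 3.
struct Question {
    int left;
    int right;
    int answer() const { return left * right; }
};

// Hypothetical interface: the loop only talks to this type.
class QuizMode {
public:
    virtual ~QuizMode() = default;
    virtual bool finished() const = 0;
    virtual Question current() const = 0;  // only valid while !finished()
    virtual void advance() = 0;            // call after a correct answer
};

// Single mode: one table, factors 1 to 10 in order.
class SingleTableMode : public QuizMode {
public:
    explicit SingleTableMode(int table) : table_(table) {}
    bool finished() const override { return factor_ > 10; }
    Question current() const override { return {table_, factor_}; }
    void advance() override { ++factor_; }
private:
    int table_;
    int factor_ = 1;
};

// Mixed mode: same progress bookkeeping as the code in the question.
class MixedMode : public QuizMode {
public:
    MixedMode() : progress_(10, 1) {
        for (int t = 1; t <= 10; ++t) remaining_.push_back(t);
        pick();
    }
    bool finished() const override { return remaining_.empty(); }
    Question current() const override {
        int table = remaining_[selected_];
        return {table, progress_[table - 1]};
    }
    void advance() override {
        int table = remaining_[selected_];
        if (progress_[table - 1] == 10)
            remaining_.erase(remaining_.begin() + selected_);  // table done
        else
            ++progress_[table - 1];
        pick();
    }
private:
    void pick() {
        if (!remaining_.empty())
            selected_ = std::rand() % remaining_.size();
    }
    std::vector<int> remaining_;
    std::vector<int> progress_;
    std::size_t selected_ = 0;
};

// The mode is checked exactly once, before the loop starts.
std::unique_ptr<QuizMode> make_mode(int mode, int table) {
    if (mode == 10) return std::make_unique<SingleTableMode>(table);
    return std::make_unique<MixedMode>();
}
```

The main loop then becomes mode-agnostic: while the object returned by make_mode is not finished(), print current(), read the user's answer, and call advance() when it equals current().answer().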

Modern CPU branch predictors are so good that if during the loop the condition never changes, it will probably be as fast as having two while loops in each branch.

Related

Dealing with heavy profiling of execution times in C++

I'm currently working on a scientific computing project involving huge data and complex algorithms, so I need to do a lot of code profiling. I'm currently relying on <ctime> and clock_t to time the execution of my code. I'm perfectly happy with this solution... except that I'm basically timing everything and thus for every line of real code I have to call start_time_function123 = clock(), end_time_function123 = clock() and cout << "function123 execution time: " << (end_time_function123-start_time_function123) / CLOCKS_PER_SEC << endl. This leads to heavy code bloating and quickly makes my code unreadable. How would you deal with that?
The only solution I can think of would be to find an IDE allowing me to mark portions of my code (at different locations, even in different files) and to toggle hide/show all marked code with one button. This would allow me to hide the part of my code related to profiling most of the time and display it only whenever I want to.
Use an RAII type that marks code as timed.
#include <ctime>
#include <iostream>
struct timed {
    char const* name;
    clock_t start;
    timed(char const* name_to_record):
        name(name_to_record),
        start(clock())
    {}
    ~timed(){
        auto end = clock();
        // Cast to double: clock_t is typically an integer type, so plain division would truncate to whole seconds.
        std::cout << name << " execution time: " << static_cast<double>(end - start) / CLOCKS_PER_SEC << std::endl;
    }
};
Then use it:
void foo(){
    timed timer(__func__);
    // code
}
Far less noise.
You can augment this with non-scope-based finish operations. When doing heavy profiling, sometimes I like to include unique ids. Using cout, especially with endl, could result in the output dominating the timing; fast recording to a large buffer that is dumped out in an async manner may be optimal. If you need to time at the ms level, even allocation, locks and string manipulation should be avoided.
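One way the "record to a buffer, print later" idea could look is sketched below. It swaps clock() for std::chrono::steady_clock, which measures wall-clock rather than CPU time, and the names TimingSample, TimingLog and ScopedTimer are hypothetical. It assumes single-threaded use and that the log was reserved large enough that record() does not reallocate inside timed code.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// One recorded measurement; name must outlive the log (e.g. a string literal).
struct TimingSample {
    const char* name;
    double seconds;
};

// Hypothetical in-memory sink: nothing is printed while timing is active.
class TimingLog {
public:
    explicit TimingLog(std::size_t capacity) { samples_.reserve(capacity); }
    void record(const char* name, double seconds) {
        samples_.push_back({name, seconds});
    }
    const std::vector<TimingSample>& samples() const { return samples_; }
private:
    std::vector<TimingSample> samples_;
};

// RAII timer: starts in the constructor, records in the destructor.
class ScopedTimer {
public:
    ScopedTimer(TimingLog& log, const char* name)
        : log_(log), name_(name), start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        log_.record(name_, elapsed.count());
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
private:
    TimingLog& log_;
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};
```

A function would declare ScopedTimer timer(log, __func__); as its first statement, and the program would walk log.samples() once at the end to print or aggregate the results.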
You don't say so explicitly, but I assume you are looking for possible speedups - ways to reduce the time it takes.
You think you need to do this by measuring how much time different parts of it take. If you're interested, there's an orthogonal way to approach it.
Just get it running under a debugger (using a non-optimized debug build).
Manually interrupt it at random, by Ctrl-C, Ctrl-Break, or the IDE's "pause" button.
Display the call stack and carefully examine what the program is doing, at all levels.
Do this with a suspicion that whatever it's doing could be something wasteful that you could find a better way to do.
Then if you start it up again, and halt it again, and see it doing the same thing or something similar, you know you will get a substantial speedup if you fix it.
The fewer samples you took to see that thing twice, the more speedup you will get.
That's the random pausing technique, and the statistical rationale is here.
The reason you do it on a debug build is here.
After you've cut out the fat using this method, you can switch to an optimized build and get the extra margin it gives you.

Do variables affect performance?

I am using C++ with Qt 5.6. I have a simple console application in 2 styles as follows:
//First style
#include <QString>
#include <QTextStream>
QString x = "Hi!";
void func()
{
    QTextStream(stdout) << x;
}
int main()
{
    while (true)
    {
        func();
    }
}
//Second style
#include <QTextStream>
void func()
{
    QTextStream(stdout) << "Hi!";
}
int main()
{
    while (true)
    {
        func();
    }
}
Which will stress the CPU more and therefore perform worse? There might not be a big difference here, but at a larger scale, such as a server where a connection is made every 2 seconds, the situation is similar to the loop above with multiple variables (not the same variable and data), and slightly lower resource usage could add up to a real performance improvement. So does using a variable give any performance improvement, given that I use it only once in a function that is called repeatedly? Or does using a variable slow the program down, because it has to repeatedly look up where the value of "x" is stored in RAM and then retrieve the data?
Edit 1:
I will not be using the variable again in my code and we can consider that there is no compiler optimizations. #DrDonut the answer in the link you gave also doesn't answer is $array === (array) $array faster than is_array($array) i.e is it a micro-optimization and I am also asking is the second style a micro-optimization or does it harm the performance.
Your example is a poor test, both because of possible compiler optimizations and because it is not clear whether you will use this variable in different places or whether it is just test code that will be thrown away.
But generally you are optimizing in the wrong way. There is no point in optimizing a single variable or a single function. You should not guess where your program will spend its time; you should first write your program so that it works and looks OK.
After the program works, if you find its performance is bad, you should search for bottlenecks: places where the program spends a lot of time. They can be found with the help of profilers or in a debugger, not by guessing.
Once you have found them, optimize those critical places.
Read about premature optimization.

Visual studio: how to make C++ consoleApplication use more of CPU power?

So I am running a console project, but when the code is running I see in Task Manager that only 5% (2.8 GHz) of the CPU is being used. Of course, I am not exactly sure how the CPU distributes processing power in Windows to begin with, but for future reference I would like to know: if I had performance-demanding code and needed the answer faster, how would I do that?
Here is the code, if you would like to know:
#include "stdafx.h"
#include <iostream>
#include <string.h>
using namespace std;
void swap(char *x, char *y)
{
    char temp;
    temp = *x;
    *x = *y;
    *y = temp;
}
void permute(char *a, int l, int r)
{
    int i;
    if (l == r)
        cout << a << endl;
    else
    {
        for (i = l; i <= r; i++)
        {
            swap((a + l), (a + i));
            permute(a, l + 1, r);
            swap((a + l), (a + i));
        }
    }
}
int main()
{
    char Short[] = "ABCD";
    int n1 = strlen(Short);
    char Long[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    int n2 = strlen(Long);
    while(true)
    {
        cout << "Would you like to see the permutions of only a) ABCD or b) the whole alphabet?!\n(please enter a or b): ";
        char input;
        cin >> input;
        if (input == 'a')
        {
            cout << "The permutions of ABCD:\n";
            permute(Short, 0, n1 - 1);
            cout << "-----------------------------------";
        }
        else if (input == 'b')
        {
            cout << "The permutions of Alphabet:\n";
            permute(Long, 0, n2 - 1);
            cout << "-----------------------------------";
        }
        else
        {
            cout << "ERROR! : Enter either a or b.\n";
        }
    }
}
I found the code in a blog to show the permutations of "ABCD" as part of an assignment, but I also used it for the entire alphabet, and I wanted to know whether, for that use, there is a way to make the code use more CPU. (It's taking a much longer time than I expected.)
Learning to optimize code efficiently is a major challenge even for experienced coders, and there are volumes of books, articles, and presentations on the topic. As such, a complete treatment is well out of scope for a Stack Overflow question.
That said, here are a few principles:
Focus initially on the algorithm. You can write a messy bubble sort or an efficient one, but in most 'real world' cases quicksort will beat either handily. This is arguably the primary reason the field of computer science exists: the study and selection of algorithms and their theoretical performance.
Related to this, make sure you are comparing your implementation against a 'stock' algorithm when possible. For example, you should see how your implementation performs compared to using the C++11 std::shuffle from the <algorithm> header (with a generator from <random>).
Optimize the compiler settings first. Debug builds are never going to be fast, and they aren't supposed to be. Using inline can help, but only happens if the compiler is actually doing inline optimizations. For Visual C++, there are a number of different optimization settings you can try out, but remember that there are tradeoffs so /Ox (maximum optimization) may not always be the right choice, which is why most templates default to /O2 (maximize speed). In some cases, /O1 (minimize space) is actually better.
Always measure performance before and after optimization. Modern out-of-order CPUs are sophisticated systems, and they don't always do what you think they are doing. In many cases, what is a textbook optimization in code actually performs worse than the original code due to various pipelining and microarchitecture effects. The only way to know for sure is to use a good profiler, have solid test cases, and measure the impact of any optimization work. If it's slower on average than before, then revert to the 'unoptimized' version and try something else.
Focus optimization on the hotspots. This is the so-called '80/20' rule. In many applications the vast majority of the code is run rarely, so only a few areas of your application are actually spending enough time running to be worth optimizing.
As a corollary to this rule, having all of your code using extremely inefficient anti-patterns can really hurt the baseline performance of your entire application. For this reason, it's worth knowing how to write good code generally. The point of the 80/20 rule is to spend your limited time optimizing on the areas that will have the most impact rather than what you as the programmer assume matters.
All that said, in your case none of this matters. The vast majority of the CPU time is spent just creating your process and handling the serialized input and output. When dealing with an n of 4 or 26, it doesn't matter how bad or good your algorithm is. In other words, it is highly unlikely permute is your program's 'hotspot' unless you are working with tens of thousands or millions of characters.
NOTE: Yes I am oversimplifying the topic, but I'm concerned that
without this basic understanding, the more advanced topics will
actually lead to some disastrous program designs.
Maybe I'm missing something, but there also seems to be a misunderstanding regarding the link between CPU and efficiency in your mind.
Your program has N instructions, and the CPU will process those N instructions at relatively the same speed (3.56 GHz is about 3.56 billion instructions per second). That's the same (more or less), whether you're getting "5%" or "25%" use of the CPU from a single program. (I'll explain that percentage in a moment.)
The only way to get "faster" in terms of processor usage is, as erip said, with parallel computing techniques, which in a nutshell employ multiple CPUs to accomplish the task.
If you think of it like an assembly line, your one worker can only process one widget at a time. If your batch of widgets to him takes up 5% of his time, that means that in order to process ALL of your widgets one-by-one, he uses 5% of his time, and the other 95% is not needed for that batch (and he'll probably use it for some other batches other people assigned him.)
He cannot process more than one widget at a time, so that's as fast as he'll get with your batch. You might be able to make things appear faster by having him alternate between two different types of widgets, instead of finishing all of batch A before starting on batch B, but it will still take the same amount of time in the end to process both batches.
MASSIVE EXCEPTION: If he's spending 100% of his time on someone else's batch of widgets, you're literally going to have to cool your heels. That's not something you can do a thing about.
However, if you add another worker to that assembly line, they can process twice (roughly) the widgets in the same amount of time, because you are processing two widgets at once. When we say you have a "quad core processor", that basically means that you have four workers available (literally 4 CPUs). Each one can only process a single instruction at once, but by assigning more than one to the batch of widgets, you get it done faster.
All of this said, one must keep in mind that those CPUs are doing a lot - they run the entire computer. You want to try and keep those percentages down as much as possible, so your program is fast and responsive on any supported computer. Not all of your users will have 3.46 GHz quad-core machines, after all.
Surely the reason this program is not using all available CPU bandwidth is because it's emitting the permutation results to the screen once for each permutation. This will result in blocking I/O within the implementation of cout.
If you want 100% cpu use you'll want to separate computation from I/O. In this case you'd then need to either:
a) store the results for later output, or
b) communicate results across a thread boundary (which will itself have an efficiency cost because of the cost of acquiring mutexes and synchronising cache memory), or
c) a combination of the above (batching results and communicating them across the thread boundary)
For a quick check, you could comment out all the cout calls and see how much CPU use you get (as mentioned, it will be close to 100% divided by the number of CPUs on your computer).
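Option (a) could be sketched roughly as follows. This version swaps the recursive swap-based permute from the question for std::next_permutation and collects every line in one std::string, so the output is written in a single call; the function name all_permutations is made up for this example.

```cpp
#include <algorithm>
#include <string>

// Builds every permutation of `letters` into one buffer, one per line,
// in lexicographic order. No I/O happens while permuting.
std::string all_permutations(std::string letters) {
    std::sort(letters.begin(), letters.end());  // start from the first permutation
    std::string out;
    do {
        out += letters;
        out += '\n';
    } while (std::next_permutation(letters.begin(), letters.end()));
    return out;
}
```

Printing is then one statement, std::cout << all_permutations("ABCD");. This only works for short inputs: the 26-letter alphabet has 26! (roughly 4 x 10^26) permutations, which can be neither stored nor printed in any reasonable time, however the CPU is used.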

How to speed up program execution

This is a very simple question, but unfortunately, I am stuck and do not know what to do. My program is a simple program that keeps on accepting 3 numbers and outputs the largest of the 3. The program keeps on running until the user inputs a character.
As the title says, my question is how I can make this execute faster (there will be a large amount of input data). Any sort of help, which may include using a different algorithm, using different functions, or changing the entire code, is accepted.
I'm not very experienced in C++ Standard, and thus do not know about all the different functions available in the different libraries, so please do explain your reasons and if you're too busy, at least try and provide a link.
Here is my code
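#include <stdio.h>
int main()
{
    int a, b, c;
    /* scanf returns the number of values read; it returns EOF (non-zero) at end of input, so compare against 3 */
    while (scanf("%d %d %d", &a, &b, &c) == 3)
    {
        if (a >= b && a >= c)
            printf("%d\n", a);
        else if (b >= a && b >= c)
            printf("%d\n", b);
        else
            printf("%d\n", c);
    }
    return 0;
}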
How it works is very simple. The while loop will continue to execute until the user inputs a character. As I've explained earlier, the program accepts 3 numbers and outputs the largest. There is no other part of this code, this is all. I've tried to explain it as much as I can. If you need anything more from my side, please ask (I'll try as much as I can).
I am compiling on an internet platform using CPP 4.9.2 ( That's what is said over there )
Any sort of help will be highly appreciated. Thanks in advance
EDIT
The input is made by a computer, so there is no delay in input.
Also, I will accept answers in c and c++.
UPDATE
I would also like to ask if there are any general library functions or algorithms, or any other sort of advice (certain things we must do and what we must not do) to follow to speed up execution (not just for this code, but in general). Any help would be appreciated. (And sorry for asking such an awkward question without giving any reference material.)
Your "algorithm" is very simple and I would write it with the use of the max() function, just because it is better style.
But anyway...
What will take the most time is the scanf. This is your bottleneck. You should write your own read function which reads a huge block with fread and processes it. You may consider doing this asynchronously - but I wouldn't recommend this as a first step (some async implementations are indeed slower than the synchronous implementations).
So basically you do the following:
Read a huge block from file into memory (this is disk IO, so this is the bottleneck)
Parse that block and find your three integers (watch out for the block borders! the first two integers may lie within one block and the third lies in the next - or the block border splits your integer in the middle, so let your parser just catch those things)
Do your comparisons - these are extremely fast compared to the disk IO, so there is no need to improve them
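The parse-and-compare steps above might look something like this once the input is in memory. This is only a sketch under several assumptions: the whole input fits in one buffer (which sidesteps the block-border problem), any character that is not a digit or a minus sign is treated as a separator rather than ending the program, and there is no overflow checking. The names parse_ints and max_of_triples are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Parses decimal integers from a buffer that is already in memory.
std::vector<int> parse_ints(const std::string& buffer) {
    std::vector<int> values;
    const std::size_t n = buffer.size();
    std::size_t i = 0;
    while (i < n) {
        // Skip everything that cannot start a number.
        while (i < n && buffer[i] != '-' && (buffer[i] < '0' || buffer[i] > '9')) ++i;
        if (i >= n) break;
        bool negative = false;
        if (buffer[i] == '-') { negative = true; ++i; }
        if (i >= n || buffer[i] < '0' || buffer[i] > '9') continue;  // lone '-'
        int value = 0;
        while (i < n && buffer[i] >= '0' && buffer[i] <= '9') {
            value = value * 10 + (buffer[i] - '0');
            ++i;
        }
        values.push_back(negative ? -value : value);
    }
    return values;
}

// Returns the largest value of each complete group of three.
std::vector<int> max_of_triples(const std::vector<int>& values) {
    std::vector<int> result;
    for (std::size_t i = 0; i + 2 < values.size(); i += 3)
        result.push_back(std::max({values[i], values[i + 1], values[i + 2]}));
    return result;
}
```

The buffer itself would be filled by calling fread on stdin in a loop and appending each chunk to a std::string; the results can likewise be formatted into one output buffer and written with a single fwrite.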
Unless you have a guarantee that the three input numbers are all different, I'd worry about making the program get the correct output. As noted, there's almost nothing to speed up, other than input and output buffering, and maybe speeding up decimal conversions by using custom parsing and formatting code, instead of the general-purpose scanf and printf.
With strict > comparisons, input values a=5, b=5, c=1 make the code report 1 as the largest of the three, because neither of the first two tests succeeds. Using >= instead, as in the code shown in the question, fixes that.
You can minimize the number of comparisons by remembering previous results. You can do this with:
int d;
if (a >= b)
    if (a >= c)
        d = a;
    else
        d = c;
else
    if (b >= c)
        d = b;
    else
        d = c;
// then output d as your maximum
That does exactly 2 comparisons to find a value for d as max(a,b,c).
Your code uses at least two and maybe up to 4.

How to (computed) goto and longjmp in C++?

I don't usually code C++, but a strange comp sci friend of mine got sick of looking at my wonderful FORTRAN programs and challenged me to rewrite one of them in C++, since he likes my C++ codes better. (We're betting money here.) Exact terms being that it needs to be compilable in a modern C++ compiler. Maybe he hates a good conio.h - I don't know.
Now I realize there are perfectly good ways of writing in C++, but I'm going for a personal win here by trying to make my C++ version as FORTRAN-esque as possible. For bonus points, this might save me some time and effort when I'm converting code.
SO! This brings me to the following related queries:
On gotos:
How do you work a goto?
What are the constraints on gotos in C++?
Any concerns about scope? (I'm going to try to globally scope as much as possible, but you never know.)
If I use the GCC extension to goto to a void pointer array, are there any new concerns about undefined behavior, etc?
On longjmp:
How would you safely use a longjmp?
What are the constraints on longjmps in C++?
What does it do to scope?
Are there any specific moments when it looks like a longjmp should be safe but in fact it isn't that I should watch out for?
How would I simulate a computed goto with longjmp?
Is there any tangible benefit to using longjmp over goto if I only have one function in my program?
Right now my main concern is making a computed goto work for this. It looks like I'll probably use the longjmp to make this work because a void pointer array isn't a part of the C++ standard but a GCC specific extension.
I'll bite and take the downvote.
I seriously doubt that your friend will find Fortran written in C++ any easier (which is effectively what you'll get if you use goto and longjmp significantly) to read and he might even find it harder to follow. The C++ language is rather different from Fortran and I really don't think you should attempt a straight conversion from Fortran to C++. It will just make the C++ harder to maintain and you might as well stay with your existing codebase.
goto: You set up a label (my_label:) and then use the goto command goto my_label; which will cause your program flow to continue at the statement following the label. You can't jump past the initialization of a variable or between functions. You can't create an array of goto targets but you can create an array of object or function pointers to jump to.
longjmp: There is no reason to prefer longjmp over goto if you have only one function. But if you have only one function, again, you really aren't writing C++ and you'll be better off in the long run just maintaining your Fortran.
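The array-of-function-pointers alternative mentioned above could be sketched like this in standard C++. The handler functions and the dispatch name computed_goto are hypothetical; the 1-based index and the "out of range does nothing" behaviour are chosen to mimic how a Fortran computed GO TO falls through to the next statement when the index is out of range.

```cpp
#include <array>

// Hypothetical stand-ins for the code at three Fortran statement labels.
int statement_a() { return 12; }
int statement_b() { return 24; }
int statement_c() { return 36; }

// Dispatches on a 1-based index, like GO TO (a, b, c), INDEX.
// Returns 0 when the index is out of range (the "fall through" case).
int computed_goto(int index) {
    using Handler = int (*)();
    static const std::array<Handler, 3> targets = {{statement_a, statement_b, statement_c}};
    if (index < 1 || index > static_cast<int>(targets.size()))
        return 0;
    return targets[index - 1]();
}
```

Unlike a real goto, each target here is a separate function, so any state the "statements" share has to be passed in or made reachable some other way.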
You'll get plenty of haterade about using goto at all. Normally I'd jump right on the bandwagon, but in this particular case it sounds more like code golf to me. So here you go.
Use goto to move the instruction pointer to a "label" in your code, which is a C++ identifier followed by a colon. Here's a simple example of a working program:
#include <iostream>
using namespace std;
int main()
{
    int i = 0;
step:
    cout << "i = " << i;
    ++i;
    if( i < 10 )
        goto step;
}
In this case, step: is the label.
There are concerns about context.
You can only goto to a label within the current function.
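If your goto skips the initialization of a variable that is still in scope at the label, the program is ill-formed in C++ and the compiler will normally reject it; in C the jump is allowed, but reading the uninitialized variable is Undefined Behavior (code which will compile, but you can't say for sure what it will actually do).
You cannot goto into a try block or catch handler. However, you can goto out of a try block.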
You "can goto" with pointers etc provided the other concerns are met. If the pointer in question is in-scope at the call site and in-scope at the branch site, no problem.
I think this reference has most of the information you are looking for.
goto
longjmp
computed goto --> switch
Really, they share a (common, but not universal) underlying implementation as a jump table.
If I understand the original question, the question is actually an interesting one. Rewording the question (to what I think is an equivalent question): "How do you do a FORTRAN computed goto in C?"
First we need to know what a computed goto is: Here is a link to one explanation: http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/lrm0124.htm.
An example of a computed GOTO is:
GO TO (12,24,36), INDEX
Where 12, 24, and 36 are statement numbers. (C language labels could serve as an equivalent, but they are not the only thing that could.)
And where INDEX is a variable, but could be the result of a formula.
Here is one way (but not the only way) to do the same thing in C:
#include <stdio.h>
int SITU(int J, int K)
{
    int raw_value = (J * 5) + K;
    int index = (raw_value % 5) - 1;
    return index;
}
int main(void)
{
    int J = 5, K = 2;
    // fortran computed goto statement: GO TO (320,330,340,350,360), SITU(J,K) + 1
    switch (SITU(J,K) + 1)
    {
    case 0: // 320
        // code statement 320 goes here
        printf("Statement 320");
        break;
    case 1: // 330
        // code statement 330 goes here
        printf("Statement 330");
        break;
    case 2: // 340
        // code statement 340 goes here
        printf("Statement 340");
        break;
    case 3: // 350
        // code statement 350 goes here
        printf("Statement 350");
        break;
    case 4: // 360
        // code statement 360 goes here
        printf("Statement 360");
        break;
    }
    printf("\nPress Enter\n");
    getchar();
    return 0;
}
In this particular example, we see that you do not need C gotos to implement a FORTRAN computed goto!
Longjmp can get you out of a signal handler which can be nice - and it'll add some confusion as it will not reset automatic (stack-based) variables in the function it long jumps to defined prior to the setjmp line. :)
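The point about automatic variables can be illustrated with a small sketch. The function name count_jumps and the global retry_point are invented for the example. The counter is declared volatile because a non-volatile local that is modified between setjmp and longjmp has an indeterminate value after the jump; the jump also stays inside one function and skips no objects with destructors, which is about the only situation where longjmp is reasonably safe in C++.

```cpp
#include <csetjmp>

static std::jmp_buf retry_point;

// Jumps back to the setjmp call until `limit` passes have been made,
// then returns the number of passes.
int count_jumps(int limit) {
    volatile int attempts = 0;   // volatile: keeps its value across longjmp
    setjmp(retry_point);         // execution resumes here after each longjmp
    attempts = attempts + 1;     // not reset by the jump
    if (attempts < limit)
        std::longjmp(retry_point, 1);
    return attempts;
}
```

Calling count_jumps(3) goes through the body three times; if attempts were reset by the jump the way a fresh function call would reset it, the loop would never end.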
There is a GCC extension called Labels as Values that will help you code golf, essentially giving you computed goto. You can generate the labels automatically, of course. You will probably need to do that since you can't know how many bytes of machine code each line will generate.