C++ printing extremely faster than C

C++ printing extremely faster than C - c++

I am taking part in a small programming competition online. Basically what I do is solve a task, write an algorithm and send my code to be automatically evaluated by the competition holder's server.
The server accepts a wide variety of programming languages. All the tasks basically require the program to take input from the terminal and output a correct to the terminal as well. So on the competition holder's website I noticed that one of the languages they support is C++ and they use g++ to compile it. Well, since I'm not that fluent in C++ as opposed to C I thought I would return my answers in C.
This worked great for the first task. However in the second task I constantly hit the limit set for the execution time of the program (2 seconds)
This is my C code:
#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>
#include <math.h>
#include <stdlib.h>
uint8_t get_bit(uint64_t k) {
...
}
int main(int argc, char *argv[]) {
uint64_t n;
uint64_t k;
scanf("%u", &n);
uint64_t i;
for (i = 0; i < n; i++) {
scanf("%u", &k);
printf("%d\n", get_bit(k));
}
return 0;
}
So my algorithm is defined in get_bit.
The server runs 3 different tests on my program, with different values, mostly increasing to make the program run longer.
However, this code in C failed the tests due to taking more than 2 seconds to run. Trying different solutions for hours with no avail, I finally tried to submit my code as C++ with a little different printing methods.
Here is my C++ main (the rest of the program stayed mostly the same):
int main(int argc, char *argv[]) {
uint64_t n;
uint64_t k;
cin >> n;
uint64_t i;
for (i = 0; i < n; i++) {
cin >> k;
cout.operator<<(get_bit(k)) << endl;
}
return 0;
}
And when I submitted this code, all the tests ran perfectly in just a few hundred milliseconds each. Note that I did not alter my algorithm in get_bit but only the printing.
Why is printing in C++ so much faster than in C? (in my case up to 10x faster)
If it's possible, how can I achieve these speeds in C as well? As you might probably notice, I am not fluent in C++ and the previous code is mainly copy paste. For this reason I would much rather prefer to program in C.
Thank you in advance.

It is probably because your code is may be (see comments) incorrect. You cant use %u with scanf and 64-bit integer.
Check the third table here http://www.cplusplus.com/reference/cstdio/scanf/ . You should use sth like %llu.

Related

Some codes are not running properly in DEV C++

I have been using DEV C++ for the last 2 months and never faced such an issue. But today when I am running the follow code, the execution screen is displaying nothing.
#include <bits/stdc++.h>
using namespace std;
int main()
{
int n;
cout<<"ENTER THE LENGTH OF THE ARRAY:";
cin>>n;
int a[n];
cout<<"ENTER THE ELEMENTS OF THE ARRAY:";
for(int i=0; i<n; i++){
cin>>a[i];
}
const int N = 1e6+2;
int idx[N];
for(int i=0;i<N;i++){
idx[i]=-1;
}
int minidx = INT_MAX;
for(int i=0; i<n; i++){
if(idx[a[i]]!=-1){
minidx = min(minidx, idx[a[i]]);
}
else{
idx[a[i]]=i;
}
}
if(minidx == INT_MAX){
cout<<"NO REPEATING ELEMENT FOUND";
}
else{
cout<<"REPEATING ELEMENT FOUND AT:"<<minidx+1;
}
return 0;
}
This is the screenshot of the output screen.
Pls suggest me a solution to this problem.

I have been using DEV C++ for the last 2 months
Notice that DEV C++ is an old, non-standard conforming, and buggy compiler.
int n;
cout<<"ENTER THE LENGTH OF THE ARRAY:";
cin>>n;
int a[n];
What would happen if your user has input -1234? Arrays cannot have negative dimensions, and your code did not test that!
You deserve using a better C++ compiler
E.g. the ones (GCC or Clang) inside a Debian distribution, with a good source code editor such as GNU emacs and some good build automation tool like GNU make.
#include <bits/stdc++.h>
is wrong and non standard conforming, and is slowing down your compilation.
const int N = 1e6+2;
int idx[N];
This is not reasonable: you want more than four millions bytes on your call stack, which is often limited to a megabyte. My guess is that you are experimenting some stack overflow for your automatic variable idx
Consider using standard C++ containers, and upgrade your compiler to a decent one.
I recommend using a recent GCC (at end of 2020, this means GCC 10) compiler as g++ -Wall -Wextra -g since you want all warnings and debug info. You could even use static analysis options to g++
Once you got no warnings, use the GDB debugger to understand the behavior of your program.
Read a good C++ programming book
See this C++ reference, the documentation of your C++ compiler, and of your debugger.
Consider using tools like the Clang static analyzer on your code.
Refer to a recent C++ draft standard, such as n4659. Be afraid of undefined behavior. Take inspiration from existing C++ open source code on github or gitlab.
In a few months, if so allowed, consider coding your GCC plugin to find some bugs in your programs. Be aware of Rice's theorem.

DevC++ seems now supported by Embarcadero: https://github.com/Embarcadero/Dev-Cpp/releases checked it, and it is looking ok enough. I will assume that you are not using the 1991 version of this IDE and the compiler it came with; Of course, if you want to use it with Windows 98 then the old version might be an option,
Anyhow, I will try to list the issues in your code. Most of them are a trap for beginners:
int n;
cout<<"ENTER THE LENGTH OF THE ARRAY:";
cin>>n;
in this part, you are not forcing flushing before asking for user input. It is always a good practice to flush before user-input. How this can be an issue? Long story short you can get for example C/C++ printf() before scanf() issue
solution: c++ force std::cout flush (print to screen).
Let's continue you are using Variable-length array (VLA) here which is not supported by C++ stdandard,
int a[n];
You can check for more Why aren't variable-length arrays part of the C++ standard?
Recommend using std::vector a(n); there which it will allocate in the heap. If yo want the allocation in the stack, within standards then you either need to wait for "dynarray" (which may never be available in C++) or switch to C...
const int N = 1e6+2;
int idx[N];
for(int i=0;i<N;i++){
idx[i]=-1;
}
in the code below, you are trying to allocate a gigabyte of buffer in the stack. I suspect this is the reason why your program is not running properly. If you want a cheap hack then you can convert this allocation to "static int idx[N];" which will move this buffer to the heap. Beware using a static variable is not a good practice at all and not recommending it. Recommend either "new []" or "vector" there too,
for(int i=0; i<n; i++){
if(idx[a[i]]!=-1){
minidx = min(minidx, idx[a[i]]);
}
else{
idx[a[i]]=i;
}
}
idx[a[i]] has no boundary check. you should check the boundary for both arrays.
First of all, this is a security leak and a very bad-way of writing a program: Example of a buffer overflow leading to a security leak
Also, this part looks like it has a logic issue: Withing the boundary all the "idx" elements should be "-1" so you should never get into the else-condition (in a normal way)

Comparing 2 source codes with same logic

This is my code for "Life, the Universe, and Everything" at hackerearth
Why "my solution" is slower than the "best solution" even though the logic
used is same!
In fact I think my logic should be faster because I printed the answer when n!=42 but the "best solution" has to first check for n==42 and then insert element in array (executing the 'else' part).
Please tell me why "best solution" is faster.
my solution :-
#include <iostream>
using namespace std;
int main()
{
int n;
while(1)
{
scanf("%d",&n);
if(n!=42)
{
printf("%d\n",n);
}
else
break;
}
return 0;
}
Time Taken: 1.010669 seconds
---------------------------------------------------------
best solution:-
#include <iostream>
using namespace std;
int main()
{
int n,a[10001],i=0,j;
while(1)
{
cin>>n;
if(n==42)
break;
else
a[i++]=n;
}
for(j=0;j<i;j++)
cout<<a[j]<<endl;
return 0;
}
Time Taken: 1.004222 seconds

scanf and printf are extremely expensive functions in terms of processing. They have to interpret the format string (where you write "%d" there could be a huge variety of complex format placeholders). By the time this alone is done, the other solution has probably done it's processing already. The source code for the scanf function would be several hundred lines of code, and printf too.
On the other hand, in the best solution the compiler knows the expected types already and has to deal with a lot less computing that is done at run time, so the conversion from input string to integer and vice versa is far more effective.

haven't tested it yet, but your output include twice as much output to console.
for each number "best solution" prints, you add a \n too.
more characters = more time to process your stdout buffer.

Big difference (x9) in the execution time between almost identical code in C and C++

I was trying to solve this exercise from www.spoj.com : FCTRL - Factorial
You don't really have to read it, just do it if you are curious :)
First I implemented it in C++ (here is my solution):
#include <iostream>
using namespace std;
int main() {
unsigned int num_of_inputs;
unsigned int fact_num;
unsigned int num_of_trailing_zeros;
std::ios_base::sync_with_stdio(false); // turn off synchronization with the C library’s stdio buffers (from https://stackoverflow.com/a/22225421/5218277)
cin >> num_of_inputs;
while (num_of_inputs--)
{
cin >> fact_num;
num_of_trailing_zeros = 0;
for (unsigned int fives = 5; fives <= fact_num; fives *= 5)
num_of_trailing_zeros += fact_num/fives;
cout << num_of_trailing_zeros << "\n";
}
return 0;
}
I uploaded it as the solution for g++ 5.1
The result was: Time 0.18 Mem 3.3M
But then I saw some comments which claimed that their time execution was less than 0.1. Since I couldn't think about faster algorithm I tried to implement the same code in C:
#include <stdio.h>
int main() {
unsigned int num_of_inputs;
unsigned int fact_num;
unsigned int num_of_trailing_zeros;
scanf("%d", &num_of_inputs);
while (num_of_inputs--)
{
scanf("%d", &fact_num);
num_of_trailing_zeros = 0;
for (unsigned int fives = 5; fives <= fact_num; fives *= 5)
num_of_trailing_zeros += fact_num/fives;
printf("%d", num_of_trailing_zeros);
printf("%s","\n");
}
return 0;
}
I uploaded it as the solution for gcc 5.1
This time the result was: Time 0.02 Mem 2.1M
Now the code is almost the same, I added std::ios_base::sync_with_stdio(false); to the C++ code as was suggested here to turn off the synchronization with the C library’s stdio buffers. I also split the printf("%d\n", num_of_trailing_zeros); to printf("%d", num_of_trailing_zeros); printf("%s","\n"); to compensate for double call of operator<< in cout << num_of_trailing_zeros << "\n";.
But I still saw x9 better performance and lower memory usage in C vs. C++ code.
Why is that?
EDIT
I fixed unsigned long to unsigned int in the C code. It should have been unsigned int and the results which are shown above are related to the new (unsigned int) version.

Both programs do exactly the same thing. They use the same exact algorithm, and given its low complexity, their performance is mostly bound to efficiency of the input and output handling.
scanning the input with scanf("%d", &fact_num); on one side and cin >> fact_num; on the other does not seem very costly either way. In fact it should be less costly in C++ since the type of conversion is known at compile time and the correct parser can be invoked directly by the C++ compiler. The same holds for the output. You even make a point of writing a separate call for printf("%s","\n");, but the C compiler is good enough to compile this as a call to putchar('\n');.
So looking at the complexity of both the I/O and computation, the C++ version should be faster than the C version.
Completely disabling the buffering of stdout slows the C implementation to something even slower than the C++ version. Another test by AlexLop with an fflush(stdout); after the last printf yields similar performance as the C++ version. It is not as slow as completely disabling buffering because output is written to the system in small chunks instead of one byte at a time.
This seems to point to a specific behavior in your C++ library: I suspect your system's implementation of cin and cout flushes the output to cout when input is requested from cin. Some C libraries do this as well, but usually only when reading/writing to and from the terminal. The benchmarking done by the www.spoj.com site probably redirects input and output to and from files.
AlexLop did another test: reading all the inputs at once in a vector and subsequently computing and writing all output helps understanding why the C++ version is so much slower. It increases performance to that of the C version, this proves my point and removes suspicion on the C++ formatting code.
Another test by Blastfurnace, storing all outputs in an std::ostringstream and flushing that in one blast at the end, does improve the C++ performance to that of the basic C version. QED.
Interlacing input from cin and output to cout seems to cause very inefficient I/O handling, defeating the stream buffering scheme. reducing performance by a factor of 10.
PS: your algorithm is incorrect for fact_num >= UINT_MAX / 5 because fives *= 5 will overflow and wrap around before it becomes > fact_num. You can correct this by making fives an unsigned long or an unsigned long long if one of these types is larger than unsigned int. Also use %u as the scanf format. You are lucky the guys at www.spoj.com are not too strict in their benchmarks.
EDIT: As later explained by vitaux, this behavior is indeed mandated by the C++ standard. cin is tied to cout by default. An input operation from cin for which the input buffer needs refilling will cause cout to flush pending output. In the OP's implementation, cin seems to flush cout systematically, which is a bit overkill and visibly inefficient.
Ilya Popov provided a simple solution for this: cin can be untied from cout by casting another magical spell in addition to std::ios_base::sync_with_stdio(false);:
cin.tie(nullptr);
Also note that such forced flush also occurs when using std::endl instead of '\n' to produce an end of line on cout. Changing the output line to the more C++ idiomatic and innocent looking cout << num_of_trailing_zeros << endl; would degrade performance the same way.

Another trick to make iostreams faster when you use both cin and cout is to call
cin.tie(nullptr);
By default, when you input anything from cin, it flushes cout. It can significantly harm performance if you do interleaved input and output. This is done for the command line interface uses, where you show some prompt and then wait for data:
std::string name;
cout << "Enter your name:";
cin >> name;
In this case you want to make sure the prompt is actually shown before you start waiting for input. With the line above you break that tie, cin and cout become independent.
Since C++11, one more way to achieve better performance with iostreams is to use std::getline together with std::stoi, like this:
std::string line;
for (int i = 0; i < n && std::getline(std::cin, line); ++i)
{
int x = std::stoi(line);
}
This way can come close to C-style in performance, or even surpass scanf. Using getchar and especially getchar_unlocked together with handwritten parsing still provides better performance.
PS. I have written a post comparing several ways to input numbers in C++, useful for online judges, but it is only in Russian, sorry. The code samples and the final table, however, should be understandable.

The problem is that, quoting cppreference:
any input from std::cin, output to std::cerr, or program termination forces a call to std::cout.flush()
This is easy to test: if you replace
cin >> fact_num;
with
scanf("%d", &fact_num);
and same for cin >> num_of_inputs but keep cout you'll get pretty much the same performance in your C++ version (or, rather IOStream version) as in C one:
The same happens if you keep cin but replace
cout << num_of_trailing_zeros << "\n";
with
printf("%d", num_of_trailing_zeros);
printf("%s","\n");
A simple solution is to untie cout and cin as mentioned by Ilya Popov:
cin.tie(nullptr);
Standard library implementations are allowed to omit the call to flush in certain cases, but not always. Here's a quote from C++14 27.7.2.1.3 (thanks to chqrlie):
Class basic_istream::sentry : First, if is.tie() is not a null pointer, the function calls is.tie()->flush() to synchronize the output sequence with any associated external C stream. Except that this call can be suppressed if the put area of is.tie() is empty. Further an implementation is allowed to defer the call to flush until a call of is.rdbuf()->underflow() occurs. If no such call occurs before the sentry object is destroyed, the call to flush may be eliminated entirely.

cout giving Runtime Error

#include "iostream"
using namespace std;
int main(int argc, char const *argv[])
{
int n=100000;
int cost=6;
for (int i = 1; i <= n; ++i)
{
cout<<cost<<endl;
}
return 0;
}
The above program when compiled and run on ideone.com (online g++ compiler which uses SPOJ compiler) gives a Runtime Error. When the cout line is commented out, the program runs successfully. Can someone point out the reason for the same?

As pts pointed it out in his comment, ideone.com has a limit to the number of bytes you can print out. If you change n to 10000, the code runs fine.
The maximum n value that won't give compile error is 2^15 = 32768.
If you look carefully, you can see it terminates with signal:25, SIGXFSZ. You can take a look at this page to learn what signals mean.
SIGXFSZ 25 File size limit exceeded (4.2 BSD)

Theoretically this could overflow if int is two bytes on your platform (which the standard allows). But most likely the error is due to output size limits at ideone.com. Do learn to interpret error messages: they are your friends and are as least as important as desired program output.

cout or printf which of the two has a faster execution speed C++?

I have been coding in C++ for a long time. I always wondered which has a faster execution speed printf or cout?
Situation: I am designing an application in C++ and I have certain constraints such as time limit for execution. My application has loads printing commands on the console. So which one would be preferable printf or cout?

Each has its own overheads. Depending on what you print, either may be faster.
Here are two points that come to mind -
printf() has to parse the "format" string and act upon it, which adds a cost.
cout has a more complex inheritance hierarchy and passes around objects.
In practice, the difference shouldn't matter for all but the weirdest cases. If you think it really matters - measure!
EDIT -
Oh, heck, I don't believe I'm doing this, but for the record, on my very specific test case, with my very specific machine and its very specific load, compiling in Release using MSVC -
Printing 150,000 "Hello, World!"s (without using endl) takes about -
90ms for printf(), 79ms for cout.
Printing 150,000 random doubles takes about -
3450ms for printf(), 3420ms for cout.
(averaged over 10 runs).
The differences are so slim this probably means nothing...

Do you really need to care which has a faster execution speed? They are both used simply for printing text to the console/stdout, which typically isn't a task that demands ultra-high effiency. For that matter, I wouldn't imagine there to be a large difference in speed anyway (though one might expect printf to be marginally quicker because it lacks the minor complications of object-orientedness). Yet given that we're dealing with I/O operations here, even a minor difference would probably be swamped by the I/O overhead. Certainly, if you compared the equivalent methods for writing to files, that would be the case.
printf is simply the standard way to output text to stdout in C.
'cout' piping is simply the standard way to output text to stdout in C++.
Saying all this, there is a thread on the comp.lang.cc group discussing the same issue. Consensus does however seem to be that you should choose one over the other for reasons other than performance.

The reason C++ cout is slow is the default sync with stdio.
Try executing the following to deactivate this issue.
ios_base::sync_with_stdio(false)
http://www.cplusplus.com/reference/iostream/ios_base/sync_with_stdio/
http://msdn.microsoft.com/es-es/library/7yxhba01.aspx

On Windows at least, writing to the console is a huge bottleneck, so a "noisy" console mode program will be far slower than a silent one. So on that platform, slight differences in the library functions used to address the console will probably make no significant difference in practice.
On other platforms it may be different. Also it depends just how much console output you are doing, relative to other useful work.
Finally, it depends on your platform's implementation of the C and C++ I/O libraries.
So there is no general answer to this question.

Performance is a non-issue for comparison; can't think of anything where it actually counts (developing a console-program). However, there's a few points you should take into account:
Iostreams use operator chaining instead of va_args. This means that your program can't crash because you passed the wrong number of arguments. This can happen with printf.
Iostreams use operator overloading instead of va_args -- this means your program can't crash because you passed an int and it was expecting a string. This can happen with printf.
Iostreams don't have native support for format strings (which is the major root cause of #1 and #2). This is generally a good thing, but sometimes they're useful. The Boost format library brings this functionality to Iostreams for those who need it with defined behavior (throws an exception) rather than undefined behavior (as is the case with printf). This currently falls outside the standard.
Iostreams, unlike their printf equivilants, can handle variable length buffers directly themselves instead of you being forced to deal with hardcoded cruft.
Go for cout.

I recently was working on a C++ console application on windows that copied files using CopyFileEx and was echoing the 'to' and 'from' paths to the console for each copy and then displaying the average throughput at the end of the operation.
When I ran the console application using printf to echo out the strings I was getting 4mb/sec, when replacing the printf with std::cout the throughput dropped to 800kb/sec.
I was wondering why the std::cout call was so much more expensive and even went so far as to echo out the same string on each copy to get a better comparison on the calls. I did multiple runs to even out the comparison, but the 4x difference persisted.
Then I found this answer on stackoverflow..
Switching on buffering for stdout did the trick, now my throughput numbers for printf and std::cout are pretty much the same.
I have not dug any deeper into how printf and cout differ in console output buffering, but setting the output buffer before I begin writing to the console solved my problem.

Another Stack Overflow question addressed the relative speed of C-style formatted I/O vs. C++ iostreams:
Why is snprintf faster than ostringstream or is it?
http://www.fastformat.org/performance.html
Note, however, that the benchmarks discussed were for formatting to memory buffers. I'd guess that if you're actually performing the I/O to a console or file that the relative speed differences would be much smaller due to the I/O taking more of the overall time.

If you're using C++, you should use cout instead as printf belongs to the C family of functions. There are many improvements made for cout that you may benefit from. As for speed, it isn't an issue as console I/O is going to be slow anyway.

In practical terms I have always found printf to be faster than cout. But then again, cout does a lot more for you in terms of type safety. Also remember printf is a simple function whereas cout is an object based on a complex streams hierarchy, so it's not really fair to compare execution times.

To settle this:
#include <iostream>
#include <cstdio>
#include <ctime>
using namespace std;
int main( int argc, char * argcv[] ) {
const char * const s1 = "some text";
const char * const s2 = "some more text";
int x = 1, y = 2, z = 3;
const int BIG = 2000;
time_t now = time(0);
for ( int i = 0; i < BIG; i++ ) {
if ( argc == 1 ) {
cout << i << s1 << s2 << x << y << z << "\n";
}
else {
printf( "%d%s%s%d%d%d\n", i, s1, s2, x, y, z );
}
}
cout << (argc == 1 ? "cout " : "printf " ) << time(0) - now << endl;
}
produces identical timings for cout and printf.

Why don't you do an experiment? On average for me, printing the string helloperson;\n using printf takes, on average, 2 clock ticks, while cout using endl takes a huge amount of time - 1248996720685 clock ticks. Using cout with "\n" as the newline takes only 41981 clock ticks. The short URL for my code is below:
cpp.sh/94qoj
link may have expired.
To answer your question, printf is faster.
#include <iostream>
#include <string>
#include <ctime>
#include <stdio.h>
using namespace std;
int main()
{
clock_t one;
clock_t two;
clock_t averagePrintf;
clock_t averageCout;
clock_t averagedumbHybrid;
for (int j = 0; j < 100; j++) {
one = clock();
for (int d = 0; d < 20; d++) {
printf("helloperson;");
printf("\n");
}
two = clock();
averagePrintf += two-one;
one = clock();
for (int d = 0; d < 20; d++) {
cout << "helloperson;";
cout << endl;
}
two = clock();
averageCout += two-one;
one = clock();
for (int d = 0; d < 20; d++) {
cout << "helloperson;";
cout << "\n";
}
two = clock();
averagedumbHybrid += two-one;
}
averagePrintf /= 100;
averageCout /= 100;
averagedumbHybrid /= 100;
cout << "printf took " << averagePrintf << endl;
cout << "cout took " << averageCout << endl;
cout << "hybrid took " << averagedumbHybrid << endl;
}
Yes, I did use the word dumb. I first made it for myself, thinking that the results were crazy, so I searched it up, which ended up with me posting my code.
Hope it helps,
Ndrewffght

If you ever need to find out for performance reasons, something else is fundamentally wrong with your application - consider using some other logging facility or UI ;)

Under the hood, they will both use the same code, so speed differences will not matter.
If you are running on Windows only, the non-standard cprintf() might be faster as it bypasses a lot of the streams stuff.
However it is an odd requirement. Nobody can read that fast. Why not write output to a file, then the user can browse the file at their leisure?

Anecdotical evidence:
I've once designed a logging class to use ostream operators - the implementation was insanely slow (for huge amounts of data).
I didn't analyze that to much, so it might as well have been caused by not using ostreams correctly, or simply due to the amount of data logged to disk. (The class has been scrapped because of the performance problems and in practice printf / fmtmsg style was preferred.)
I agree with the other replies that in most cases, it doesn't matter. If output really is a problem, you should consider ways to avoid / delay it, as the actual display updates typically cost more than a correctly implemented string build. Thousands of lines scrolling by within milliseconds isn't very informative anyway.

You should never need to ask this question, as the user will only be able to read slower than both of them.
If you need fast execution, don't use either.
As others have mentioned, use some kind of logging if you need a record of the operations.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js