Source code of std:: namespace - c++

I want to use the string member of the std namespace (std::string), but I do not want the size of my file to increase by the size of the #include <string> file.
Is there anywhere that I can find the source code of the std:: namespace so I can just extract the string member and put it in my source so I don't get the increase in my binary size?
Please give me some suggestions, thank-you!

If you are using GCC on Linux (through g++), it is free software and the source code of the standard library is available, so you could study its <string> header (some of which might perhaps use compiler magic like builtins or attributes).
Notice that if you #include <string> and ask the compiler to optimize (for size using -Os, or for speed using -O2), it is very likely that unneeded functions won't come in. Lastly, libstdc++ is usually dynamically linked (as a shared object), so it does not consume space in every program (it is shared across several processes).
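If you want to convince yourself, a quick experiment (a sketch; flags and file names are illustrative) is to build a trivial program twice, with and without the include, and compare binary sizes with ls -l or size:

// size_test.cpp - build with e.g. g++ -Os size_test.cpp -o size_test,
// then again with the #include commented out, and compare the binaries.
// With libstdc++ dynamically linked, the difference should be negligible.
#include <string>

int main() {
    // <string> is included but never used, so the linker has no
    // reason to pull any of its code into the executable
    return 0;
}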

Related

Stop bycatch of <stdio.h> definitions caused by C++ includes

In a comment to another answer, I was shown a code example that seemingly used printf and puts without including <stdio.h> but the online compiler didn't complain.[1] To understand what's going on, I copied it to my local IDE.
Reduced to relevant includes and output, it's basically this:
#include <string>
#include <map>
#include <optional>

int main() {
    printf("Answer %d\n", 42);
    puts("Question?");
}
Experimenting with gcc 8.1.0 (packaged with Code::Blocks 20.03), I found out that the includes can be further reduced to
<string> or <map> or <optional> in C++17 (ISO/GCC)
<string> or <map> in C++11/C++14 (ISO/GCC)
<string> in C++98 (ISO/GCC)
Also a sample test - C++14 (gcc 8.3) - on ideone.com compiles and runs fine:
#include <iostream>

int main() {
    printf("printf without #include <stdio.h>\n");
    return 0;
}
This is also true for other definitions from <stdio.h> like FILE.
I found no information at cppreference.com
std::printf, std::fprintf, std::sprintf, std::snprintf - cppreference.com
std::puts - cppreference.com
printf, fprintf, sprintf, snprintf, printf_s, fprintf_s, sprintf_s, snprintf_s - cppreference.com
puts - cppreference.com
I also tried several web and SO searches but wasn't successful so far.
While it may be handy for small examples to get some powerful functions for free, a serious project may suffer from it: besides comparatively easy-to-fix compiler errors, I see the danger of serious runtime errors.
How can I effectively control/prevent this kind of inclusion?
[1] The referenced code now contains the include statement, but I'm pretty sure that it didn't at the stage I copied it .. or maybe I copied just a portion of it? ... anyway, the observed behavior is there as described above.
I am afraid you cannot.
The standard requires that the well-known include files declare the relevant names, but does not prevent them from including other files/names, if the library implementation finds it useful.
Said differently, after including iostream you are sure that all the names belonging to it are correctly declared, but you cannot know (except by examining the file itself) whether other names have been defined, or whether other standard files have been included. Here, your implementation chooses to automatically include stdio.h, but a different (standard library) implementation could choose not to. You have reached the world of unspecifiedness...
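The practical consequence: always include the header that is guaranteed to declare each name you use, and treat anything that happens to compile through a transitive include as an accident. A minimal sketch of the portable version of the code above:

#include <cstdio>   // guarantees std::printf and std::puts are declared
#include <string>   // may or may not drag in <cstdio> - don't rely on it

int main() {
    std::printf("Answer %d\n", 42);   // fine on every conforming implementation
    std::puts("Question?");
}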
Aside from any arguments about "purity", I don't see how including the declarations for these functions could in any way be harmful. As a matter of fact, not having them and letting the compiler implicitly assume int printf() – albeit only (pre-C99) C compilers do that, C++ won't – is asking for trouble.
printf is part of the C language standard. And although C++ isn't C, there's so much overlap between their ecosystems, that C++ considers printf as part of the implementation runtime environment as well, as it does with all the other functions from the C standard library.
And as such, all of these symbols are reserved. Outside of actually implementing the runtime library itself, you have no business defining symbols of those names yourself.
#include preprocessor directives are just inline text substitution with text pulled in from an additional file.
(extern) symbol declarations will not create symbols of those names, but they will make sure that you don't redefine those symbols on your own.
So I really don't see where your worries about runtime errors come from.
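To illustrate the point about declarations: since printf really exists in the implementation's runtime with a fixed signature, you could even declare it yourself instead of including any header. A sketch, not a recommendation - getting the signature wrong would be undefined behavior, which is exactly why including the header is the safer route:

// no #include at all: we declare the reserved name by hand.
// in practice printf has C language linkage, hence the extern "C".
extern "C" int printf(const char* format, ...);

int main() {
    printf("declared by hand: %d\n", 42);
    return 0;
}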

Does adding additional headers make programs slower?

For instance, would the following two programs have the same execution time?
#include <iostream>

int main()
{
    int a, b;
    std::cin >> a >> b;
    std::cout << a + b;
    return 0;
}
and
#include <ctime>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cassert>
#include <time.h>
#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    int a, b;
    std::cin >> a >> b;
    std::cout << a + b;
    return 0;
}
If so, is it good practice to always include a bunch of header files? And how can one test how long it takes to execute a program, using predefined input?
Does adding additional headers make programs slower?
No. Of course, someone will show up now with some corner case to refute this. But no, extra headers don't make C or C++ programs slower in general.
If so is it a good practice to always include a bunch of header files?
Don't include "a bunch." Include the ones you use. To include extra headers increases compilation time, and if the headers are from your own project, can result in recompiling many objects in your project whenever you touch any header.
How can one test how long it takes to execute a program?
With a stopwatch. Or time(). Or rdtsc. Or QueryPerformanceCounter(). Lots of ways.
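For a portable in-code measurement, <chrono> works too; a minimal sketch (the loop is just a stand-in for the work you want to time):

#include <chrono>
#include <iostream>

int main() {
    auto start = std::chrono::steady_clock::now();

    // ... the work you want to time goes here ...
    volatile long sum = 0;
    for (long i = 0; i < 10000000; ++i) sum += i;

    auto end = std::chrono::steady_clock::now();
    auto ms  = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    std::cout << "took " << ms.count() << " ms\n";
    return 0;
}

For timing a whole program against predefined input, the usual low-tech approach is the shell's time utility with redirected input, e.g. time ./prog < input.txt.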
For instance, would the following two programs have the same execution time?
Yes. Including additional header files doesn't affect the execution time of the program.
Header files are processed at compile time. So they (usually) don't slow down your code.
There could be corner cases where the inclusion of a particular header picks up a different implementation of some algorithm that is inherently slower than the one picked up without that header.
If so is it a good practice to always include a bunch of header files?
No. You should include the header file for every type you are using, no more, no less.
How can one test how long it takes to execute a program? Using predefined input.
There are several possibilities to do that. You can run your program in a profiling tool, or simply measure time yourself (in a script or such).
Does adding additional headers make programs slower?
For instance, would the following two programs have the same execution time?
Adding additional headers won't affect the runtime of your program. It will, however, affect the compile time, because the compiler now has to process these additional headers in your program.
If so is it a good practice to always include a bunch of header files?
It is best practice to include only the header files that you will be using in your project. Also, be careful not to include both the C version and the C++ version of the same header; you may run into issues.
How can one test how long it takes to execute a program? Using predefined input.
I'd recommend checking out the ctime library: http://www.cplusplus.com/reference/ctime/
Remember that execution time is specific to your machine.
I think it will make the program slower, because whenever you call a function like cout or cin, the compiler will look for it in the header files declared by the programmer.
A larger number of header files requires more time to find the function declaration.
Also, if including extra header files didn't increase compilation time, then IDEs (integrated development environments) would have omitted the header-inclusion system.
Hope that makes sense.

What can be compiled faster? Source and header for each method, or anything in one single file?

I am writing a compiler that generates C++ code at the end. I am currently wondering which approach compiles faster.
First notes about the compiler:
I don't have any classes/structs; they are optimized away inside the functions.
I don't include anything like #include <vector>; when I have to use functions such as printf from libraries, I put in the prototype manually. (My compiler does this for me.)
I have two options:
Option #1:
//.h
#include "A.h"
#include "B.h"
int function( /* parameters */ );
//.cpp
int function( /* parameters */ ) {
    // code
}
Every function has its own source and header.
Advantages:
I can make the compiler comment out includes of files that were already included before it. For example, if #include "B.h"'s content is already included by #include "A.h", then I can make it comment out the line #include "B.h". (Saves file reads.)
I can recognize unchanged methods/functions/files (when I regenerate my code and it finds files identical to ones from before) and recycle their object files. (Saves object compilation.)
Option #2:
int function( /* parameters */ );
int function2( /* parameters */ );
int function3( /* parameters */ );
// ...
int function( /* parameters */ ) {
    // code
}
// ...
All functions are declared once (those prototypes at the top) and compiled in that single file.
Advantages:
Single sequential read from the disk. (No hierarchy of including and multiple including from different objects.)
Single object to compile, excluding libraries.
At a single glance it looks like option #1 is faster, but some folks said they tried the second and it gave their project a boost in compile time. They didn't compare both options, and didn't give any proof for it.
Can I get an explanation of which one is faster, rather than benchmarks?
One of the most important factors is the ability to compile in parallel. As each Translation Unit is compiled in a sequential fashion (at least logically), there is limited opportunity for parallelization if you feed the C++ compiler just one big file.
The balancing force, as pointed out, is the startup cost of each compilation. You've got multiple CPU cores incurring the same startup cost.
As a result, parallelization stops being a win when adding an extra Translation Unit incurs more overhead cost than you save by using that extra core.
C++ is known to be slow to compile, notably because standard C++ headers (e.g. <vector>, <map> or other standard containers) bring in a lot of C++ code (and tokens) through internal headers.
Are your #include-d headers machine generated, or are they all common to your runtime? You might consider having a single header and pre-compile it (assuming a recent GCC or Clang/LLVM compiler), see this. Then you need to have only one single #include at top of every generated C++ file.
BTW, I am not sure that C++ is a good target language for some compiler. Generating C code (perhaps using Boehm's GC, like Bigloo does) is probably more relevant. Generating C++ code makes sense when you want to fit into some existing API, like I am doing in MELT fitting into GCC internals, and then I don't generate much code using C++ standard templates.
Lastly, when you generate C (and even more so when generating genuine C++), you really want the C or C++ compiler to optimize the generated code. Parsing time of the generated code does not matter that much (you might try the -ftime-report option to measure where g++ is spending its time). The newest GCC 5.1 has libgccjit, which might interest you.

Are there techniques to greatly improve C++ building time for 3D applications?

There are many slim laptops that are cheap and great to use. Programming has the advantage of being doable in any place where there is silence and comfort, since concentrating for long hours is an important factor in doing effective work.
I'm kinda old-fashioned as I like my statically compiled C or C++, and those languages can take pretty long to compile on those power-constrained laptops, especially C++11 and C++14.
I like to do 3D programming, and the libraries I use can be large and won't be forgiving: bullet physics, Ogre3D, SFML, not to mention the power hunger of modern IDEs.
There are several solutions to make building faster:
Solution A: Don't use those large libraries, and come up with something lighter on your own to relieve the compiler. Write appropriate makefiles, don't use an IDE.
Solution B: Set up a build server elsewhere, have a makefile set up on a muscled machine, and automatically download the resulting exe. I don't think this is a casual solution, as you have to target your laptop's CPU.
Solution C: use the unofficial C++ modules
???
Any other suggestions?
Compilation speed is something that can really be boosted, if you know how. It is always wise to think carefully about a project's design (especially in the case of large projects consisting of multiple modules) and modify it so that the compiler can produce output efficiently.
1. Precompiled headers.
A precompiled header is a normal header (.h file) that contains the most common declarations, typedefs and includes. During compilation, it is parsed only once - before any other source is compiled. During this process, the compiler generates data in some internal (most likely binary) format; it then uses this data to speed up code generation.
This is a sample:
#pragma once
#ifndef __Asx_Core_Prerequisites_H__
#define __Asx_Core_Prerequisites_H__
//Include common headers
#include "BaseConfig.h"
#include "Atomic.h"
#include "Limits.h"
#include "DebugDefs.h"
#include "CommonApi.h"
#include "Algorithms.h"
#include "HashCode.h"
#include "MemoryOverride.h"
#include "Result.h"
#include "ThreadBase.h"
//Others...
namespace Asx
{
    //Forward declare common types
    class String;
    class UnicodeString;

    //Declare global constants
    enum : Enum
    {
        ID_Auto = Limits<Enum>::Max_Value,
        ID_None = 0
    };

    enum : Size_t
    {
        Max_Size         = Limits<Size_t>::Max_Value,
        Invalid_Position = Limits<Size_t>::Max_Value
    };

    enum : Uint
    {
        Timeout_Infinite = Limits<Uint>::Max_Value
    };

    //Other things...
}
#endif /* __Asx_Core_Prerequisites_H__ */
In a project, when PCH is used, every source file usually contains an #include of this file (I don't know about others, but in VC++ this is actually a requirement - every source attached to a project configured for using PCH must start with: #include "PrecompiledHeaderName.h"). Configuration of precompiled headers is very platform-dependent and beyond the scope of this answer.
Note one important matter: things that are defined/included in the PCH should be changed only when absolutely necessary - every change can cause recompilation of the whole project (and other dependent modules)!
More about PCH:
Wiki
GCC Doc
Microsoft Doc
2. Forward declarations.
When you don't need the whole class definition, forward-declare it to remove unnecessary dependencies in your code. This also implies extensive use of pointers and references where possible. Example:
#include "BigDataType.h"
class Sample
{
protected:
BigDataType _data;
};
Do you really need to store _data as a value? Why not this way:

class BigDataType; //That's enough, #include not required

class Sample
{
protected:
    BigDataType* _data; //So much better now
};
This is especially profitable for large types.
3. Do not overuse templates.
Meta-programming is a very powerful tool in a developer's toolbox. But don't use templates when they are not necessary.
They are great for things like traits, compile-time evaluation, static reflection and so on. But they introduce a lot of trouble:
Error messages - if you have ever seen errors caused by improper usage of std:: iterators or containers (especially the complex ones, like std::unordered_map), then you know what this is all about.
Readability - complex templates can be very hard to read/modify/maintain.
Quirks - many of the techniques templates are used for are not so well known, so maintenance of such code can be even harder.
Compile time - the most important for us now:
Remember, if you define a function as:
template <class Tx, class Ty>
void sample(const Tx& xv, const Ty& yv)
{
    //body
}
it will be compiled for each distinct combination of Tx and Ty. If such a function is used often (and for many such combinations), it can really slow down the compilation process. Now imagine what will happen if you start to overuse templating for whole classes...
4. Using PIMPL idiom.
This is a very useful technique that allows us to:
hide implementation details
speed up code generation
make easy updates without breaking client code
How does it work? Consider a class that contains a lot of data (for example, one representing a person). It could look like this:
class Person
{
protected:
    string name;
    string surname;
    Date birth_date;
    Date registration_date;
    string email_address;
    //and so on...
};
Our application evolves and we need to extend/change Person's definition. We add some new fields, remove others... and everything crashes: the size of Person changes, names of fields change... cataclysm. In particular, all client code that depends on Person's definition needs to be changed/updated/fixed. Not good.
But we can do it the smart way - hide the details of Person:
class Person
{
protected:
    class Details;
    Details* details;
};
Now, we gain a few nice things:
clients cannot write code that depends on how Person is defined
no recompilation is needed as long as we don't modify the public interface used by client code
we reduce the compilation time, because the definitions of string and Date no longer need to be present (in the previous version, we had to include the appropriate headers for these types, which added additional dependencies).
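For completeness, here is a minimal sketch of what the matching source file could look like (Date.h and the field names are illustrative):

// Person.cpp - only this translation unit needs the heavy headers now
#include "Person.h"
#include <string>
#include "Date.h"   // hypothetical header for the Date type

// the hidden implementation; clients never see this definition
class Person::Details
{
public:
    std::string name;
    std::string surname;
    Date birth_date;
    Date registration_date;
    std::string email_address;
    //and so on...
};

Note that Person also needs its constructor and destructor defined in this file (to allocate and free the Details object); in particular, the destructor cannot be left to be generated in the header, where Details is still an incomplete type.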
5. #pragma once directive.
Although it may give no speed boost, it is clearer and less error-prone. It is basically the same thing as using include guards:
#ifndef __Asx_Core_Prerequisites_H__
#define __Asx_Core_Prerequisites_H__
//Content
#endif /* __Asx_Core_Prerequisites_H__ */
It prevents multiple parses of the same file. Although #pragma once is not standard (in fact, no pragma is - pragmas are reserved for compiler-specific directives), it is quite widely supported (examples: VC++, GCC, Clang, ICC) and can be used without worrying - compilers should ignore unknown pragmas (more or less silently).
6. Eliminating unnecessary dependencies.
Very important point! When code is being refactored, dependencies often change. For example, if you decide to do some optimizations and use pointers/references instead of values (see points 2 and 4 of this answer), some includes can become unnecessary. Consider:
#include "Time.h"
#include "Day.h"
#include "Month.h"
#include "Timezone.h"
class Date
{
protected:
Time time;
Day day;
Month month;
Uint16 year;
Timezone tz;
//...
};
This class has been changed to hide implementation details:
//These are no longer required!
//#include "Time.h"
//#include "Day.h"
//#include "Month.h"
//#include "Timezone.h"

class Date
{
protected:
    class Details;
    Details* details;
    //...
};
It is good to track such redundant includes, whether by using your brain, built-in tools (like the VS Dependency Visualizer) or external utilities (for example, GraphViz).
Visual Studio also has a very nice option - if you right-click on any file, you will see the option 'Generate Graph of include files' - it will generate a nice, readable graph that can be easily analyzed and used to track unnecessary dependencies.
(Sample graph generated for my String.h file - image omitted.)
As Mr. Yellow indicated in a comment, one of the best ways to improve compile times is to pay careful attention to your use of header files. In particular:
Use precompiled headers for any header that you don't expect to change, including operating system headers, third-party library headers, etc.
Reduce the number of headers included from other headers to the minimum necessary.
Determine whether an include is needed in the header or whether it can be moved to the cpp file. This sometimes causes a ripple effect, because someone else was depending on you to include the header for them, but it is better in the long term to move the include to the place where it's actually needed.
Using forward declared classes, etc. can often eliminate the need to include the header in which that class is declared. Of course, you still need to include the header in the cpp file, but that only happens once, as opposed to happening every time the corresponding header file is included.
Use #pragma once (if it is supported by your compiler) rather than include guard symbols. This means the compiler does not even need to open the header file to discover the include guard. (Of course many modern compilers figure that out for you anyway.)
Once you have your header files under control, check your make files to be sure you no longer have unnecessary dependencies. The goal is to rebuild everything you need to, but no more. Sometimes people err on the side of building too much because that is safer than building too little.
If you've tried all of the above, there's a commercial product that does wonders, assuming you have some available PCs on your LAN. We used to use it at a previous job. It's called Incredibuild (www.incredibuild.com) and it shrunk our build time from over an hour (C++) to about 10 minutes. From their website:
IncrediBuild accelerates build time through efficient parallel computing. By harnessing idle CPU resources on the network, IncrediBuild transforms a network of PCs and servers into a private computing cloud that can best be described as a “virtual supercomputer.” Processes are distributed to remote CPU resources for parallel processing, dramatically shortening build time by up to 90% or more.
Another point that's not mentioned in the other answers: Templates. Templates can be a nice tool, but they have fundamental drawbacks:
The template, and all the templates it depends upon, must be included. Forward declarations don't work.
Template code is frequently compiled several times. In how many .cpp files do you use an std::vector<>? That is how many times your compiler will need to compile it!
(I'm not advocating against the use of std::vector<>, on the contrary you should use it frequently; it's simply an example of a really frequently used template here.)
When you change the implementation of a template, you must recompile everything that uses that template.
With template heavy code, you often have relatively few compilation units, but each of them is huge. Of course, you can go all-template and have only a single .cpp file that pulls in everything. This would avoid multiple compiling of template code, however it renders make useless: any compilation will take as long as a compilation after a clean.
I would recommend going the opposite direction: Avoid template-heavy or template-only libraries, and avoid creating complex templates. The more interdependent your templates become, the more repeated compilation is done, and the more .cpp files need to be rebuilt when you change a template. Ideally any template you have should not make use of any other template (unless that other template is std::vector<>, of course...).
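One mitigation worth knowing about (since C++11) is explicit instantiation: compile a frequently used specialization exactly once, and tell every other translation unit not to instantiate it. A minimal sketch, where Widget and the file names are hypothetical:

// widget_vector.h
#include <vector>
#include "widget.h"

// explicit instantiation *declaration*: translation units including this
// header promise not to instantiate std::vector<Widget> themselves
extern template class std::vector<Widget>;

// widget_vector.cpp - the one place where the template code is generated
#include "widget_vector.h"

// explicit instantiation *definition*
template class std::vector<Widget>;

This keeps the convenience of the template while paying its compilation cost only once.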

no std namespace

We've got a reasonable sized C++ application that's pretty old at this stage, so it's got a few quirks.
One of these quirks is in how it deals with C++ compilers that use a pre-standardisation standard library. There's one header file that's supposed to resolve any differences between a standards-compliant compiler and this one non-compliant compiler. For various reasons we can't / don't want to stop supporting this compiler.
#include <vector>
#include <set>

#if defined(NO_STD_LIB)
#include <iostream.h>
#else
#include <iostream>
using std::string;
using std::cout;
using std::vector;
using std::set;
#endif
You use it as follows:

#include "stl.h"

int main() {
    vector<string> foo;
    // .....
    return 0;
}
There are 2 main issues with this approach:
Each compilation unit that includes stl.h has to compile lots of unneeded code (and we're trying to reduce compile times as much as possible at the minute)
The global namespace gets polluted with pretty much everything that would normally be in the std namespace.
I really would like to address both of these points as part of a code cleanup project. The first one really is the more important reason for doing this.
As we have to support this old compiler, our code will always have to avoid clashing names with things that it exposes in its standard lib, so point 2 isn't really relevant, though I'd like to see a solution that works when / if we can finally drop support for it.
My idea so far is to break up the super-header into a set of smaller headers, e.g. stl_vector, stl_iostream, stl_set, etc. This way we can include only the parts of the standard library that we're interested in. These filenames follow the pattern of the std headers, but with an easily searched-for prefix. So when the time comes to dump the offending compiler, it'll be simple to search for the prefix and remove it.
I think that will fix issue 1 easily enough.
My real problem is fixing issue 2. I thought of doing something like this:
#if defined(NO_STD_LIB)
#include <iostream.h>
#define std
#else
#include <iostream>
#endif
then we could code as follows:
#incude "stl_iostream"
int main() {
std::string foo("bar");
std::cout << foo << std::endl;
}
And that almost worked. Where there was no std namespace, the #define std made std::string decompose into ::string, and life was good.
Then I tried this with a .cc file that used the dreaded "using namespace std;" and I got a compile error, because that becomes "using namespace ;", so that obviously won't work.
Now obviously I could ban people from writing "using namespace std;", but as much as it should be avoided in headers, it's sometimes useful in .cc files where you're making heavy use of lots of STL classes.
So, finally, to the question: is there a standard idiom for dealing with this issue? And if there isn't, what tricks do you use to support compilers that use a pre-standard standard library?
I've thought of using precompiled headers to solve the compilation-speed issue, but we target different compilers and the effort of getting this working across all of them may mean it's not worth our time doing it.
Answers that advise me to drop the non-conforming compiler may be popular, but won't be accepted as this is something that we can't do just now.
You can try:
#if defined(NO_STD_LIB)
namespace std {
    using ::string;
    using ::cout;
    using ::vector;
    using ::set;
}
#endif
Then std::string will work.
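Combined with the split-header idea from the question, each stl_* wrapper could then look roughly like this (a sketch; whether the old library declares these exact global names is something you would have to verify against that compiler):

// stl_iostream
#if defined(NO_STD_LIB)
#include <iostream.h>
namespace std {
    using ::cin;
    using ::cout;
    using ::endl;
}
#else
#include <iostream>
#endif

Client code then writes std::cout unconditionally, and "using namespace std;" also compiles on both compilers, because std is now a real namespace either way.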
It would have been much better if a using namespace ::; directive existed in the language; however, it doesn't.