Considering the following setup, I have ran into a quite strange phenomena, which I can not really explain. Using Visual Studio 2005, the following piece of code results in crash. I would like to really know the reason.
playground.cpp
static int local=-1;
#include "common.h"
int main(int arg)
{
setit();
docastorUpdate();
return 0;
}
common.h
#include <stdio.h>
#include <iostream>
void docastorUpdate();
static int *gemini;
inline void setit()
{
gemini = &local;
}
castor.cpp
static int local = 2;
#include "common.h"
void docastorUpdate() {
setit();
// crashing here, dereferencing a null pointer
std::cout << "castor:" << *gemini << std::endl;
}
The thing is, that the crash disappears when
I move the inline function setit() to an unnamed namespace
I make it static
To put it in a nutshell, I would need help to understand the reasons. Any suggestion is appreciated! (I am aware that, this solution is not one of the best partices, just being curious.)
This breaks because you are violating the one-definition rule. The one-definition rule says that in a program, across all translation units, there is only one definition of any given function. inline is sort of an exception to this rule, and it more or less means "dear compiler, there will be several definitions of this function, but they will all be the same, I promise".
static, when used here for local, means "dear compiler, this is an internal detail that only this translation unit will ever see; please do not confuse it with variables named local from other translation units"
So you promised the compiler that all definitions of setit will be the same, and asked the compiler to give each translation unit its very own local variable.
However, since the setit function uses whatever variable named local is in scope, the end result is two different definitions of setit, each one using a different variable. You just broke your promise. The compiler trusted you and the result was a totally messed up program. It thought it could do certain things with the code based on your promises, but since you broke them behind its back, those things it tried to do with the code didn't work at all.
Your code invokes undefined behaviour, in a very subtle way.
[C++11: 7.1.2/4]: An inline function shall be defined in every translation unit in which it is ODR-used and shall have exactly the same definition in every case. [..]
Although the definition looks the same because it's lexically copy-pasted into each translation unit by your #include, it's not because &local isn't taking the address of the same variable, in each case.
As a result, anything can happen when you run your program, including transferring all your hard-earned savings into my bank account, or taking away all of Jon Skeet's rep.
This is why making the function non-inline solves the problem; also, putting it in an unnamed namespace makes it a different function in each translation unit.
you can avoid using static in all the places, you can use extern in your case:
common.h:
extern int *gemini;
common.cpp;
int *gemini = nullptr;
also avoid using local like that , instead you can do this:
inline void setit(int * p)
{
gemini = p;
}
void docastorUpdate()
{
static int local = 2;
setit(&local);
std::cout << "castor:" << *gemini << std::endl;
}
Related
I will admit I have not kept up with the latest C/C++ releases but I was wondering why having a function prototype in a function is valid code? Is it related to lambda usage?
Here is sample code - this will compile/run on Visual Studio 2019 and g++ 5.4.0
int main()
{
int func(bool test);
return 0;
}
A code block may contain any number of declarations. And because a function prototype is a declaration, it may appear in a block.
Granted, it doesn't make much sense to do so logistically as opposed to declaring a function at file scope, but it's syntactically correct.
In that example the declaration is pointless. But in a more complex example it's not:
int main() {
int func(bool test);
func(true);
return 0;
}
That code is equivalent to the more usual formulation:
int func(bool test);
int main() {
int func(bool test);
func(true);
return 0;
}
except that the first one introduces the name func only inside the scope of main; the second one introduces the name into the global scope.
I occasionally use the first form when I don't want to scroll through the source file to figure out where to put the declaration; putting it in the function where it's going to be used is a quick-and-dirty solution. And if it's temporary code (adding debugging output, for example), it makes it easier to remove it all afterwards. But, in general, having declarations at global scope is simpler to deal with. After all, you might want to call that same function from somewhere else, too, and having the global declaration means you don't have to repeat it.
I was taught that functions need declarations to be called. To illustrate, the following example would give me an error as there is no declaration for the function sum:
#include <iostream>
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
int sum(int x, int y) {
return x + y;
}
// main.cpp:4:36: error: use of undeclared identifier 'sum'
// std::cout << "The result is " << sum(1, 2);
// ^
// 1 error generated.
To fix this, I'd add the declaration:
#include <iostream>
int sum(int x, int y); // declaration
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
int sum(int x, int y) {
return x + y;
}
Why the main function doesn't need the declaration, as other functions like sum need?
A definition of a function is also a declaration of a function.
The purpose of a declaring a function is to make it known to the compiler. Declaring a function without defining it allows a function to be used in places where it is inconvenient to define it. For example:
If a function is used in a source file (A) other than the one it is defined in (B), we need to declare it in A (usually via a header that A includes, such as B.h).
If two or more functions may call each other, then we cannot define all those functions before the others—one of them has to be first. So declarations can be provided first, with definitions coming afterward.
Many people prefer to put “higher level” routines earlier in a source file and subroutines later. Since those “higher level” routines call various subroutines, the subroutines must be declared earlier.
In C++, a user program never calls main, so it never needs a declaration before the definition. (Note that you could provide one if you wished. There is nothing special about a declaration of main in this regard.) In C, a program can call main. In that case, it does require that a declaration be visible before the call.
Note that main does need to be known to the code that calls it. This is special code in what is typically called the C++ runtime startup code. The linker includes that code for you automatically when you are linking a C++ program with the appropriate linker options. Whatever language that code is written in, it has whatever declaration of main it needs in order to call it properly.
I was taught that functions need declarations to be called.
Indeed. A function must be declared before it can be called.
why we don't add a declaration for the main function?
Well, you didn't call main function. In fact, you must not call main at all1, so there is never a need to declare main before anything.
Technically though, all definitions are also declarations, so your definition of main also declares main.
Footnote 1: The C++ standard says it's undefined behaviour to call main from within the program.
This allows C++ implementations to put special run-once startup code at the top of main, if they aren't able to have it run earlier from hooks in the startup code that normally calls main. Some real implementations do in fact do this, e.g. calling a fast-math function that sets some FPU flags like denormals-are-zero.
On a hypothetical implementation, calling main could result in fun things like re-running constructors for all static variables, re-initializing the data structures used by new/delete to keep track of allocations, or other total breakage of your program. Or it might not cause any problem at all. Undefined behaviour doesn't mean it has to fail on every implementation.
The prototype is required if you want to call the function, but it's not yet available, like sum in your case.
You must not call main yourself, so there is no need to have a prototype. It's even a bad a idea to write a prototype.
No, the compiler does not need a forward declaration for main().
main() is a special function in C++.
Some important things to remember about main() are:
The linker requires that one and only one main() function exist when creating an executable program.
The compiler expects a main() function in one of the following two forms:
int main () { /* body */ }
int main (int argc, char *argv[]) { /* body */ }
where body is zero or more statements
An additional acceptable form is implementation specific and provides a list of the environment variables at the time the function is called:
int main (int argc, char* argv[], char *envp[]) { /* body */ }
The coder must provide the 'definition' of main using one of these acceptable forms, but the coder does not need to provide a declaration. The coded definiton is accepted by the compiler as the declaration of main().
If no return statement is provided, the compiler will provide a return 0; as the last statement in the function body.
As an aside, there is sometimes confusion about whether a C++ program can make a call to main(). This is not recommended. The C++17 draft states that main() "shall not be used within a program." In other words, cannot be called from within a program. See e.g. Working Draft Standard for C++ Programming Language, dated "2017-03-21", Paragraph 6.6.1.3, page 66. I realize that some compilers support this (including mine), but the next version of the compiler could modify or remove that behavior as the standard uses the term "shall not".
It is illegal to call main from inside your program. That means the only thing that is going to call it is the runtime and the compiler/linker can handle setting that up.This means you do not need a prototype for main.
A definition of a function also implicitly declares it. If you need to reference a function before it is defined you need to declare it before you use it.
So writing the following is also valid:
int sum(int x, int y) {
return x + y;
}
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
If you use a declaration in one file to make a function known to the compiler before it is defined, then its definition has to be known at linking time:
main.cpp
int sum(int x, int y);
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
sum.cpp
int sum(int x, int y) {
return x + y;
}
Or sum could have its origin in a library, so you do not even compile it yourself.
The main function is not used/referenced in your code anywhere, so there is no need to add the declaration of main anywhere.
Before and after your main function the c++ library might execute some init and cleanup steps, and will call your main function. If that part of the library would be represented as c++ code then it would contain a declaration of int main() so that that it could be compiled. That code could look like this:
int main();
int __main() {
__startup_runtime();
main();
__cleanup_runtime();
}
But then you again have the same problem with __main so at some point there is no c++ anymore and a certain function (main) just represents the entry point of your code.
Nope. You can't call it anyway.
You only need forward declarations for functions called before they are defined. You need external declarations (which look exactly like forward declarations on purpose) for functions defined in other files.
But you can't call main in C++ so you don't need one. This is because the C++ compiler is allowed to modify main to do global initialization.
[I looked at crt0.c and it does have a declaration for main but that's neither here nor there].
Consider the following code:
file_1.hpp:
typedef void (*func_ptr)(void);
func_ptr file1_get_function(void);
file1.cpp:
// file_1.cpp
#include "file_1.hpp"
static void some_func(void)
{
do_stuff();
}
func_ptr file1_get_function(void)
{
return some_func;
}
file2.cpp
#include "file1.hpp"
void file2_func(void)
{
func_ptr function_pointer_to_file1 = file1_get_function();
function_pointer_to_file1();
}
While I believe the above example is technically possible - to call a function with internal linkage only via a function pointer, is it bad practice to do so? Could there be some funky compiler optimizations that take place (auto inline, for instance) that would make this situation problematic?
There's no problem, this is fine. In fact , IMHO, it is a good practice which lets your function be called without polluting the space of externally visible symbols.
It would also be appropriate to use this technique in the context of a function lookup table, e.g. a calculator which passes in a string representing an operator name, and expects back a function pointer to the function for doing that operation.
The compiler/linker isn't allowed to make optimizations which break correct code and this is correct code.
Historical note: back in C89, externally visible symbols had to be unique on the first 6 characters; this was relaxed in C99 and also commonly by compiler extension.
In order for this to work, you have to expose some portion of it as external and that's the clue most compilers will need.
Is there a chance that there's a broken compiler out there that will make mincemeat of this strange practice because they didn't foresee someone doing it? I can't answer that.
I can only think of false reasons to want to do this though: Finger print hiding, which fails because you have to expose it in the function pointer decl, unless you are planning to cast your way around things, in which case the question is "how badly is this going to hurt".
The other reason would be facading callbacks - you have some super-sensitive static local function in module m and you now want to expose the functionality in another module for callback purposes, but you want to audit that so you want a facade:
static void voodoo_function() {
}
fnptr get_voodoo_function(const char* file, int line) {
// you tagged the question as C++, so C++ io it is.
std::cout << "requested voodoo function from " << file << ":" << line << "\n";
return voodoo_function;
}
...
// question tagged as c++, so I'm using c++ syntax
auto* fn = get_voodoo_function(__FILE__, __LINE__);
but that's not really helping much, you really want a wrapper around execution of the function.
At the end of the day, there is a much simpler way to expose a function pointer. Provide an accessor function.
static void voodoo_function() {}
void do_voodoo_function() {
// provide external access to voodoo
voodoo_function();
}
Because here you provide the compiler with an optimization opportunity - when you link, if you specify whole program optimization, it can detect that this is a facade that it can eliminate, because you let it worry about function pointers.
But is there a really compelling reason not just to remove the static from infront of voodoo_function other than not exposing the internal name for it? And if so, why is the internal name so precious that you would go to these lengths to hide that?
static void ban_account_if_user_is_ugly() {
...;
}
fnptr do_that_thing() {
ban_account_if_user_is_ugly();
}
vs
void do_that_thing() { // ban account if user is ugly
...
}
--- EDIT ---
Conversion. Your function pointer is int(*)(int) but your static function is unsigned int(*)(unsigned int) and you don't want to have to cast it.
Again: Just providing a facade function would solve the problem, and it will transform into a function pointer later. Converting it to a function pointer by hand can only be a stumbling block for the compiler's whole program optimization.
But if you're casting, lets consider this:
// v1
fnptr get_fn_ptr() {
// brute force cast because otherwise it's 'hassle'
return (fnptr)(static_fn);
}
int facade_fn(int i) {
auto ui = static_cast<unsigned int>(i);
auto result = static_fn(ui);
return static_cast<int>(result);
}
Ok unsigned to signed, not a big deal. And then someone comes along and changes what fnptr needs to be to void(int, float);. One of the above becomes a weird runtime crash and one becomes a compile error.
What if I define main as a reference to function?
#include<iostream>
#include<cstring>
using namespace std;
int main1()
{
cout << "Hello World from main1 function!" << endl;
return 0;
}
int (&main)() = main1;
What will happen? I tested in an online compiler with error "Segmentation fault":
here
And under VC++ 2013 it will create a program crashing at run-time!
A code calling the data of the function pointer as a code will be compiled which will immediately crash on launch.
I would also like an ISO C++ standard quote about this.
The concept will be useful if you want to define either of 2 entry-points depending on some macro like this:
int main1();
int main2();
#ifdef _0_ENTRY
int (&main)() = main1;
#else
int (&main)() = main2;
#endif
That's not a conformant C++ program. C++ requires that (section 3.6.1)
A program shall contain a global function called main
Your program contains a global not-a-function called main, which introduces a name conflict with the main function that is required.
One justification for this would be it allows the hosted environment to, during program startup, make a function call to main. It is not equivalent to the source string main(args) which could be a function call, a function pointer dereference, use of operator() on a function object, or construction of an instance of a type main. Nope, main must be a function.
One additional thing to note is that the C++ Standard never says what the type of main actually is, and prevents you from observing it. So implementations can (and do!) rewrite the signature, for example adding any of int argc, char** argv, char** envp that you have omitted. Clearly it couldn't know to do this for your main1 and main2.
This would be useful if you want to define either of 2 entry-points depending on some macro
No, not really. You should do this:
int main1();
int main2();
#ifdef _0_ENTRY
int main() { return main1(); }
#else
int main() { return main2(); }
#endif
This is soon going to become clearly ill-formed, thanks to the resolution of CWG issue 1886, currently in "tentatively ready" status, which adds, among other things, the following to [basic.start.main]:
A program that declares a variable main at global scope or that
declares the name main with C language linkage (in any namespace) is
ill-formed.
What will happen in practice is highly dependent on the implementation.
In your case your compiler apparently implements that reference as a "pointer in disguise". In addition to that, the pointer has external linkage. I.e. your program exports an external symbol called main, which is actually associated with memory location in data segment occupied by a pointer. The linker, without looking too much into it, records that memory location as the program's entry point.
Later, trying to use that location as an entry point causes segmentation fault. Firstly, there's no meaningful code at that location. Secondly, a mere attempt to pass control to a location inside a data segment might trigger the "data execution protection" mechanisms of your platform.
By doing this you apparently hoped that the reference will get optimized out, i.e. that main will become just another name for main1. In your case it didn't happen. The reference survived as an independent external object.
Looks like you already answered the question about what happens.
As far as why, in your code main is a reference/pointer to a function. That is different than a function. And I would expect a segment fault if the code is calling a pointer instead of a function.
I made a header file sampleheader.h with following code:
namespace sample_namespace
{
int add(int n1,int n2)
{
return (n1 + n2);
}
}
Now my main.cpp file is:
#include <iostream>
#include "sampleheader.h"
using namespace std;
int add(int n1, int n2)
{
return (n1 + n2);
}
int main(void)
{
using namespace sample_namespace;
//cout<<add(5,7);
}
The project is built with no warning if i leave that line commented.It is understandable because the local add() function is defined in the global scope and add() function is made visible in the scope of main(). So no name conflict happens.
However, if I remove the comments, I get the following error:
"Ambiguous call to overloaded function"
First of all there should be no name conflict at all as explained by me above (if I'm right). But, if at all there is a name conflict why is it that it is notified by compiler only when I call the function. Such type of error should be shown as soon as a name conflicts (if at all).
Any help is appreciated.
Well, you explained it yourself.
When you have both int add() and int sample_namespace::add() in scope, the call is ambiguous. The statement add() could mean either of them.
There is no conflict in the functions co-existing because one is int add() and the other is int sample_namespace::add(). This is the entire purpose of namespaces.
You just have to be clear when you write code that uses them. If you get rid of your using namespace directives and always write explicit code, then you won't run into a problem:
#include <iostream>
namespace sample_namespace {
int add(int n1, int n2) {
return (n1 + n2);
}
}
int add(int n1, int n2) {
return (n1 + n2);
}
int main() {
std::cout << sample_namespace::add(5,7);
}
(Also, defining non-inline functions in headers is A Bad Idea™.)
"using namespace" does not hide other implementations. What it does in this case is make it ambiguous.
You have both sample_namespace::add and ::add with the same signature. Since you don't explicitly say which one to use, the compiler can't tell.
Declaring both functions is legal and unambiguous (which is why the compiler only complains when you use the function).
The reason why it's legal to declare both functions like that is because you can call both unambiguously, by qualifying them. That's why you get an error only when you make an unqualified call. Up to that point, there is not even a theoretical problem.
Intentionally, using namespace X; shouldn't introduce ambiguity errors by itself. That would very much defeat the purpose of namespaces. The common using namespace std; introduces many names that you'd reasonbly define yourself as well.
You cannot shadow symbols like that, only variable names (which is a warning on most compilers, i.e. a bad idea). using directives aren't precise in any sense, which I think is the assumption that most people make.
using directives will import a symbol into the scope. for instance, you can do things like this:
namespace foo {
using namespace std;
}
foo::string val;
This sort of practice leads to some very annoying compiler errors to track down. Your simple example is clear where the error is. If you try doing it in a few hundred thousand lines of code, you will be much less pleased.
I would advise you not to acquire the habit of relying on the using directive. If you must, just do one type at a time.
using std::string;
Basically, this keeps you honest. And if there is a naming conflict, grep will take 30 seconds to uncover where the problem is.
If you just spend a couple weeks typing std:: in front of your types, you will get used to it. I promise that the amount of time you think you are saving is greatly exaggerated.
I think this is because of compiler-optimisation. When your string is commented - he saw, that there are some function, but you don't use it, so he discard it. And when he saw that function useful - he want to insert function call, but can't chose which one you want.