Is there any type of casting (implicit, explicit) done in this line:
int *p= "hello there";
Moreover, what is the correctness of such a statement from a C/C++ point of view?
Note: it compiles on AVR-GCC and fails with other compilers.
"Is there any type of casting (implicit, explicit) done in this line:"
int *p= "hello there";
No, there isn't any casting done.
But in C++ you can use a reinterpret_cast<int*> along with a const_cast<char*>, as follows:
int *p= reinterpret_cast<int*>(const_cast<char*>("hello there"));
See a working demo.
"Moreover what do you think about the correctness of such statement from (C,C++) point of view."
Well, that's totally left up to the user of such a statement. It's certainly not recommended (because you're invoking behavior that is not specified1), and it's dangerous if you're not 100% sure about what you're doing.
But at least both languages allow you to do so.
A slightly shorter variant for C is:
int *p= (int*)"hello there";
"Note: it compiles on AVR-GCC and fails with other compilers."
Well, I don't know the version you're using for the AVR cross toolchain, but any halfway up-to-date version of GCC should give an error for this statement, like:
prog.cpp:3: error: cannot convert 'const char*' to 'int*' in initialization
See this demo.
I'm using a fairly old version of GCC to make up that demo: g++ 4.3.2. The latest one is 5.x if I'm informed correctly, and I'm pretty sure that it's possible to build a cross toolchain for AVR with it.
1 Which is subtly different from undefined behavior. It just depends on whether your hardware goes along with these pointer addresses, and on the alignment settings.
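To make that footnote concrete, here is a small C++11 sketch (illustrative only) that checks at run time whether the literal happens to be suitably aligned for int:
#include <cstdint>
#include <cstdio>

int main()
{
    const char *s = "hello there";
    // String literals only guarantee char alignment; int typically needs
    // stricter alignment, so the reinterpret_cast above may produce a
    // misaligned pointer on hardware less forgiving than the AVR.
    bool ok = reinterpret_cast<std::uintptr_t>(s) % alignof(int) == 0;
    std::printf("aligned for int: %s\n", ok ? "yes" : "no");
    return 0;
}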
To answer your question (ignoring the 'why do it' comments, which are correct):
You wrote
int *p = "hello there";
This code is illegal in C and C++. We might guess that AVR-GCC pretends that you had written:
int *p = (int *)"hello there";
You could perhaps try to confirm this by writing:
char *q = "hello there";
int *p = q;
and seeing if p and q hold the same address. (although that would not be completely conclusive either way).
It would be better to find out how to invoke your compiler in a mode where it will report errors or warnings for illegal code. Normally for gcc, -Wall -pedantic does this, and usually you'd also specify a standard version such as -std=c99 or -std=gnu99.
Supposing the code is treated as int *p = (int *)"hello there";. This code may cause undefined or unexpected behaviour because the char array might not be correctly aligned for int.
Also, it will cause undefined behaviour if you read or write through p due to violating the strict aliasing rule, which says that reading or writing as int is only permitted for variables declared as int, or malloc'd space which (if being read) already has had an int written.
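If the actual goal is to inspect the bytes of the literal as an int, a sketch of the well-defined way is to copy the bytes instead of reinterpreting the pointer:
#include <cstdio>
#include <cstring>

int main()
{
    const char buf[] = "hello there";
    int n;
    // memcpy may read any object representation, so this avoids both the
    // alignment problem and the strict aliasing violation.
    std::memcpy(&n, buf, sizeof n);
    std::printf("first %zu bytes as int: %d\n", sizeof n, n);
    return 0;
}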
Related
C++ compilers happily compile this code, with no warning:
int ival = 8;
const char *strval = "x";
const char *badresult = ival + strval;
Here we add a char* pointer value (strval) to an int value (ival) and store the result in a char* pointer (badresult). Of course, the content of badresult will be total garbage and the app might crash on this line, or later when it tries to use badresult elsewhere.
The problem is that it is very easy to make such mistakes in real life. The one I caught in my code looked like this:
message += port + "\n";
(where message is a string type that handles the result with its operator+= function; port is an int, and "\n" is of course a const char pointer).
Is there any way to disable this kind of behavior and trigger an error at compile time?
I don't see any normal use case for adding char* to int and I would like a solution to prevent this kind of mistakes in my large code base.
When using classes, we can create private operators and use the explicit keyword to disable unneeded conversions/casts, however now we are talking about basic types (char* and int).
One solution is to use clang, as it has a flag to enable a warning for this.
However I can't use clang, so I am seeking a solution that triggers a compiler error (some kind of operator overload, or mangling with some defines to prevent such constructs, or any other idea).
Is there any way to disable this kind of behavior and trigger an error at compile time?
Not in general, because your code is very similar to the following, legitimate, code:
int ival = 3;
const char *strval = "abcd";
const char *goodresult = ival + strval;
Here goodresult is pointing to the last letter d of strval.
BTW, on Linux, getpid(2) is known to return a positive integer. So you could imagine:
int ival = (getpid()>=0)?3:1000;
const char *strval = "abcd";
const char *goodresult = ival + strval;
which is morally the same as the previous example (so we humans know that ival is always 3). But teaching the compiler that getpid() does not return a negative value is tricky in practice (the return type pid_t of getpid is some signed integer, and has to be signed to be usable by fork(2), which could give -1). And you could imagine more weird examples!
You want compile-time detection of buffer overflow (or more generally of undefined behavior), and in general that is equivalent to the halting problem (it is an unsolvable problem). So it is impossible in general.
Of course, one could claim that a clever compiler could warn for your particular case, but then there is a concern about what cases should be useful to warn.
You might try some static source program analysis tools, perhaps the Clang static analyzer or Frama-C (with its recent Frama-C++ variant), or some costly proprietary tools like Coverity and many others. These tools don't detect all errors statically, and they take much more time to run than an optimizing compiler.
You could (for example) write your own GCC plugin to detect such mistakes (that means developing your own static source code analyzer). You'll spend months in writing it. Are you sure it is worth the effort?
However I can't use clang,
Why? You could ask permission to use the clang static analyzer (or some other one), during development (not for production). If your manager refuses that, it becomes a management problem, not a technical one.
I don't see any normal use case for adding char* to int
You need more imagination. Think of something like
puts("non-empty" + (isempty(x)?4:0));
Ok that is not very readable code, but it is legitimate. In the previous century, when memory was costly, some people used to code that way.
Today you'll code perhaps
if (isempty(x))
puts("empty");
else
puts("non-empty")
and the cute thing is that a clever compiler could probably optimize the latter into the equivalent of the former (according to the as-if rule).
No way. It is valid syntax, and very useful in many cases.
Just think: if you meant to write int b=a+10 but wrote int b=a+00 by mistake, the compiler won't know it is an error.
However, you can consider using C++ classes. Most C++ classes are well designed to prevent such obvious mistakes.
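To make that concrete, here is a minimal sketch (SafeString is a hypothetical wrapper, not a standard type). Note that plain std::string would actually accept message += port through its operator+=(char) overload, which is exactly the trap from the question; deleting the int overload turns it into a compile error:
#include <string>

struct SafeString {
    std::string value;
    SafeString& operator+=(const char* s) { value += s; return *this; }
    SafeString& operator+=(const std::string& s) { value += s; return *this; }
    SafeString& operator+=(int) = delete; // catches "message += port" at compile time
};

int main()
{
    SafeString message;
    int port = 8080;
    message += "port: ";
    // message += port;               // error: use of deleted function
    message += std::to_string(port);  // the intended fix
    return 0;
}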
In the first example in your question, really, compilers should issue a warning. Compilers can trivially see that the addition resolves to 8 + "x", and clang does indeed optimise it to a constant. I see the fact that it doesn't warn about this as a compiler bug. Although compilers are not required to warn about this, clang goes through great efforts to provide useful diagnostics, and it would be an improvement to diagnose this as well.
In the second example, as Matteo Italia pointed out, clang does already provide a warning option for this, enabled by default: -Wstring-plus-int. You can turn specific warnings into errors by using -Werror=<warning-option>, so in this case -Werror=string-plus-int.
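A sketch of the invocation, assuming the first snippet is saved as bad.cpp (the exact diagnostic text may differ between clang versions):
$ clang++ -c -Werror=string-plus-int bad.cpp
bad.cpp:3:33: error: adding 'int' to a string does not append to the string [-Werror,-Wstring-plus-int]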
In the following code:
#include <iostream>
int main()
{
const long l = 4294967296;
int i = l;
return i; //just to silence the compiler
}
the compiler warns about implicit conversion (using -Wall and -std=c++14) as following:
warning: implicit conversion from 'const long' to 'int' changes value from 4294967296 to 0 [-Wconstant-conversion]
which is ok. But there is no warning if the conversion is from double to int, as in the following code:
#include <iostream>
int main()
{
const double d = 4294967296.0;
int i = d;
return i; //just to silence the compiler
}
Why does the compiler react differently in these situations?
Note 1: clang version is 3.6.2-svn240577-1~exp1
Note 2: I've tested it with many other versions of gcc, clang and icc thanks to Compiler Explorer (gcc.godbolt.org). All tested versions of gcc (with the exception of 5.x) and icc threw the warning. No clang version did.
The conversion from double to an integer type changes the value "by design" (think of 3.141592654 converted to an int).
The conversion from long int to int, instead, may work or may be undefined behavior, depending on the platform and on the value (the only guarantee is that an int is not bigger than a long int; they may well be the same size).
In other words, the problems in conversions between integer types are incidental artifacts of the implementation, not by-design decisions. Warning about them is better, especially when it can be detected at compile time that something won't work because of those limitations.
Note also that even conversion from double to int is legal and well defined (if done within boundaries) and an implementation is not required to warn about it even when the loss of precision can be seen at compile time. Compilers that warn too much even when the use could be meaningful can be a problem (you just disable warnings or even worse get the habit of accepting a non-clean build as normal).
These implicit conversion rules may add up with other C++ wrinkles getting to truly odd-looking and hard to justify behaviors like:
std::string s;
s = 3.141592654; // No warnings, no errors (last time I checked)
Don't try to use too much logic with C++. Reading specs works better.
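As a side note, if you want a hard compile-time error for such lossy conversions rather than hoping for a warning, C++11 brace-initialization rejects narrowing; a minimal sketch (assuming a 32-bit int):
int main()
{
    const long l = 4294967296;
    const double d = 4294967296.0;
    int i{l}; // ill-formed: narrowing conversion of 'l' from 'const long' to 'int'
    int j{d}; // ill-formed: a floating-integral conversion is always narrowing
    return 0;
}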
Well, by reading the great article "What Every C Programmer Should Know About Undefined Behavior" (especially part #3/3) on the LLVM Project Blog, written by Chris Lattner, the main author of LLVM, I could better understand Clang's approach to handling undefined behavior.
So, in the name of optimization and compile-time economy ("ultimate performance"):
Keep in mind though that the compiler is limited by not having dynamic
information and by being limited to what it can without burning lots
of compile time.
Clang doesn't run all related undefined-behavior checks by default, but it does generate warnings for many classes of undefined behavior (including dereference of null, oversized shifts, etc.) that are obvious in the code, to catch some common mistakes.
Beyond this, Clang and LLVM provide tools like the Clang Static Analyzer, the KLEE project, and -fcatch-undefined-behavior (now the UndefinedBehaviorSanitizer, UBSan) to catch these possible bugs.
Running UBSan on the presented code (compile with clang++ and the -fsanitize=undefined argument), the bug will be caught as follows:
runtime error: value 4.29497e+09 is outside the range of representable values of type 'int'
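A sketch of the invocation, assuming the second snippet from the question is saved as conv.cpp (the file/line prefix is illustrative):
$ clang++ -fsanitize=undefined conv.cpp -o conv && ./conv
conv.cpp:5:13: runtime error: value 4.29497e+09 is outside the range of representable values of type 'int'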
I'm currently learning pointers and my professor provided this piece of code as an example:
//We cannot predict the behavior of this program!
#include <iostream>
using namespace std;
int main()
{
char * s = "My String";
char s2[] = {'a', 'b', 'c', '\0'};
cout << s2 << endl;
return 0;
}
He wrote in the comments that we can't predict the behavior of the program. What exactly makes it unpredictable though? I see nothing wrong with it.
The behaviour of the program is non-existent, because it is ill-formed.
char* s = "My String";
This is illegal. Prior to 2011, it had been deprecated for 12 years.
The correct line is:
const char* s = "My String";
Other than that, the program is fine. Your professor should drink less whiskey!
The answer is: it depends on what C++ standard you're compiling against. All the code is perfectly well-formed across all standards‡ with the exception of this line:
char * s = "My String";
Now, the string literal has type const char[10] and we're trying to initialize a non-const pointer to it. For any type other than the char family of string literals, such an initialization was always illegal. For example:
const int arr[] = {1};
int *p = arr; // nope!
However, in pre-C++11, for string literals, there was an exception in §4.2/2:
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; [...]. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ]
So in C++03, the code is perfectly fine (though deprecated), and has clear, predictable behavior.
In C++11, that block does not exist - there is no such exception for string literals converted to char*, and so the code is just as ill-formed as the int* example I just provided. The compiler is obligated to issue a diagnostic, and ideally in cases such as this that are clear violations of the C++ type system, we would expect a good compiler to not just be conforming in this regard (e.g. by issuing a warning) but to fail outright.
The code should ideally not compile - but does on both gcc and clang (I assume because there's probably lots of code out there that would be broken with little gain, despite this type system hole being deprecated for over a decade). The code is ill-formed, and thus it does not make sense to reason about what the behavior of the code might be. But considering this specific case and the history of it being previously allowed, I do not believe it to be an unreasonable stretch to interpret the resulting code as if it were an implicit const_cast, something like:
const int arr[] = {1};
int *p = const_cast<int*>(arr); // OK, technically
With that, the rest of the program is perfectly fine, as you never actually touch s again. Reading a created-const object via a non-const pointer is perfectly OK. Writing a created-const object via such a pointer is undefined behavior:
std::cout << *p; // fine, prints 1
*p = 5; // will compile, but undefined behavior, which
// certainly qualifies as "unpredictable"
As there is no modification via s anywhere in your code, the program is fine in C++03, should fail to compile in C++11 but does anyway - and given that the compilers allow it, there's still no undefined behavior in it†. With allowances that the compilers are still [incorrectly] interpreting the C++03 rules, I see nothing that would lead to "unpredictable" behavior. Write to s though, and all bets are off. In both C++03 and C++11.
†Though, again, by definition ill-formed code yields no expectation of reasonable behavior
‡Except not, see Matt McNabb's answer
Other answers have covered that this program is ill-formed in C++11 due to the assignment of a const char array to a char *.
However the program was ill-formed prior to C++11 also.
The operator<< overloads are in <ostream>. The requirement for iostream to include ostream was added in C++11.
Historically, most implementations had iostream include ostream anyway, perhaps for ease of implementation or perhaps in order to provide a better QoI.
But it would be conforming for iostream to only define the ostream class without defining the operator<< overloads.
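In other words, the strictly portable pre-C++11 version of the program would include <ostream> itself rather than rely on <iostream> pulling it in; a minimal sketch:
#include <iostream>
#include <ostream> // guarantees the operator<< overloads before C++11

int main()
{
    const char s2[] = {'a', 'b', 'c', '\0'};
    std::cout << s2 << std::endl;
    return 0;
}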
The only slightly wrong thing that I see with this program is that you're not supposed to assign a string literal to a mutable char pointer, though this is often accepted as a compiler extension.
Otherwise, this program appears well-defined to me:
The rules that dictate how character arrays become character pointers when passed as parameters (such as with cout << s2) are well-defined.
The array is null-terminated, which is a condition for operator<< with a char* (or a const char*).
#include <iostream> includes <ostream>, which in turn defines operator<<(ostream&, const char*), so everything appears to be in place.
You can't predict the behaviour of the compiler, for reasons noted above. (It should fail to compile, but may not.)
If compilation succeeds, then the behaviour is well-defined. You certainly can predict the behaviour of the program.
If it fails to compile, there is no program. In a compiled language, the program is the executable, not the source code. If you don't have an executable, you don't have a program, and you can't talk about behaviour of something that doesn't exist.
So I'd say your prof's statement is wrong. You can't predict the behaviour of the compiler when faced with this code, but that's distinct from the behaviour of the program. So if he's going to pick nits, he'd better make sure he's right. Or, of course, you might have misquoted him and the mistake is in your translation of what he said.
As others have noted, the code is illegitimate under C++11, although it was valid under earlier versions. Consequently, a compiler for C++11 is required to issue at least one diagnostic, but behavior of the compiler or the remainder of the build system is unspecified beyond that. Nothing in the Standard would forbid a compiler from exiting abruptly in response to an error, leaving a partially-written object file which a linker might think was valid, yielding a broken executable.
Although a good compiler should always ensure before it exits that any object file it is expected to have produced will be either valid, non-existent, or recognizable as invalid, such issues fall outside the jurisdiction of the Standard. While there have historically been (and may still be) some platforms where a failed compilation can result in legitimate-appearing executable files that crash in arbitrary fashion when loaded (and I've had to work with systems where link errors often had such behavior), I would not say that the consequences of syntax errors are generally unpredictable. On a good system, an attempted build will generally either produce an executable with a compiler's best effort at code generation, or won't produce an executable at all. Some systems will leave behind the old executable after a failed build, since in some cases being able to run the last successful build may be useful, but that can also lead to confusion.
My personal preference would be for disk-based systems to rename the output file, to allow for the rare occasions when that executable would be useful while avoiding the confusion that can result from mistakenly believing one is running new code; and for embedded-programming systems to allow a programmer to specify, for each project, a program that should be loaded if a valid executable is not available under the normal name [ideally something which safely indicates the lack of a usable program]. An embedded-systems tool-set would generally have no way of knowing what such a program should do, but in many cases someone writing "real" code for a system will have access to some hardware-test code that could easily be adapted to the purpose. I don't know that I've seen the renaming behavior, however, and I know that I haven't seen the indicated programming behavior.
I have this C++ code:
#include <stdlib.h>
int main(){
char *Teclas;
Teclas = calloc(1024,sizeof(char));
}
And the compiler is giving the following error:
error: invalid conversion from `void*' to `char*'
What does this error mean and how do I fix it?
The problem is that you're trying to compile C with a C++ compiler. As the error message says, this line:
Teclas = calloc(1024,sizeof(char));
tries to convert the untyped void* pointer returned by calloc into a typed char* pointer to assign to the variable of that type. Such a conversion is valid in C, but not C++.
The solution is to use a C compiler. It looks like you're using GCC, so just rename the source file to something.c, and build with gcc rather than g++.
If you really must use a compiler for the wrong language, and don't feel like rewriting this in idiomatic C++, then you'll need a cast to force it through the compiler:
Teclas = static_cast<char*>(calloc(1024,sizeof(char)));
or, if you want the code to remain valid C:
Teclas = (char*)calloc(1024,sizeof(char));
But don't do that: use the right compiler for the language. Unless this is the first stage in converting the program to C++; in which case, the next thing to do is get rid of these allocations and use std::string instead.
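For reference, a minimal sketch of what that idiomatic C++ replacement might look like (names are illustrative):
#include <string>
#include <vector>

int main()
{
    std::vector<char> teclas(1024, '\0'); // zero-filled like calloc, freed automatically
    std::string text;                     // usually the better fit for character data
    text.reserve(1024);
    return 0;
}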
calloc() returns a void*. You need to cast its value to whatever type Teclas is, which appears to be a char*. So Teclas = (char*)calloc(...).
void int main(int argc,char *argv[])
uhm... perhaps just int main(int argc, char *argv[])...
Apart from that: this looks like C code. Nothing in these lines suggests that you use C++. The error you are seeing is the result of you treating C code as if it were C++, whereas it isn't, because C is not C++, C++ is not C, and neither is a subset of the other.
Compile your C code with a C compiler.
The best solution is to just compile your code with a C compiler, as this is C code. You really shouldn't compile C code as if it were C++. However, to directly answer your question, you can force the C++ compiler to compile your C code (which is very bad!!!!).
In C++, whenever you use any function of the alloc family, you must cast the return value to the type of your lvalue. So, in this case:
Teclas = (char*)calloc(1024, sizeof(char));
The way in which I would do the above is like this:
Teclas = (char*)malloc(1024 * sizeof(*Teclas));
The benefits here: If you change the type of Teclas, you will still end up with the correct allocation size. Also, I just prefer using malloc in general.
Additionally, you have major issues throughout your code. For one thing, void int main(...)???? And you never initialize the contents of Teclas before you print it, but you free it and calloc it again for some reason. I think you meant to make that a do while loop, but there is no do.
Also, void KeyLogger(); is WRONG. That is how you declare the function, but since it is a declaration, it should be outside of main.
On the Android NDK (JNI), even with a typecast, old style or new style, the error still doesn't go away.
buffer = (char *) malloc(10);
xxxx=static_cast<char*>(calloc(1024,sizeof(char)));
To make the errors go away, an extra include path needs to be added under Paths and Symbols:
Project -> properties -> C/C++ general -> Path and symbols
On the Includes tab, under GNU C++, add the following path (with the appropriate gcc version: 4.6, 4.8, ...). Of course, on Windows, the path would be a Windows path...
{NDKROOT}/toolchains/arm-linux-androideabi-4.6/prebuilt/linux-x86_64/lib/gcc/arm-linux-androideabi/4.6/include
If you can make sure there is no other error in your code, you can add the g++ flag -fpermissive to turn the error into a warning, e.g.:
g++ -fpermissive yourcode.cpp
Can't a compiler warn (even better if it throws errors) when it notices a statement with undefined/unspecified/implementation-defined behaviour?
Probably, to flag a statement as an error, the standard would have to say so, but the compiler could at least warn the coder. Are there any technical difficulties in implementing such an option? Or is it simply impossible?
The reason I ask is that in statements like a[i] = ++i;, won't the compiler know that the code is referencing and modifying the same variable in one statement, before a sequence point is reached?
It all boils down to
Quality of Implementation: the more accurate and useful the warnings are, the better it is. A compiler that always printed: "This program may or may not invoke undefined behavior" for every program, and then compiled it, is pretty useless, but is standards-compliant. Thankfully, no one writes compilers such as these :-).
Ease of determination: a compiler may not be easily able to determine undefined behavior, unspecified behavior, or implementation-defined behavior. Let's say you have a call stack that's 5 levels deep, with a const char * argument being passed from the top-level, to the last function in the chain, and the last function calls printf() with that const char * as the first argument. Do you want the compiler to check that const char * to make sure it is correct? (Assuming that the first function uses a literal string for that value.) How about when the const char * is read from a file, but you know that the file will always contain valid format specifier for the values being printed?
Success rate: A compiler may be able to detect many constructs that may or may not be undefined, unspecified, etc.; but with a very low "success rate". In that case, the user doesn't want to see a lot of "may be undefined" messages—too many spurious warning messages may hide real warning messages, or prompt a user to compile at "low-warning" setting. That is bad.
For your particular example, gcc gives a warning about "may be undefined". It even warns for printf() format mismatch.
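For instance, a sketch of the format-mismatch diagnostic (the exact wording varies between gcc versions):
#include <stdio.h>

int main(void)
{
    /* gcc -Wall: warning: format '%d' expects argument of type 'int',
       but argument 2 has type 'char *' [-Wformat=] */
    printf("%d\n", "not an int");
    return 0;
}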
But if your hope is for a compiler that issues a diagnostic for all undefined/unspecified cases, it is not clear if that should/can work.
Let's say you have the following:
#include <stdio.h>
void add_to(int *a, int *b)
{
*a = ++*b;
}
int main(void)
{
int i = 42;
add_to(&i, &i); /* bad */
printf("%d\n", i);
return 0;
}
Should the compiler warn you about *a = ++*b; line?
As gf says in the comments, a compiler cannot check across translation units for undefined behavior. Classic example is declaring a variable as a pointer in one file, and defining it as an array in another, see comp.lang.c FAQ 6.1.
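A sketch of that classic mismatch; neither translation unit alone reveals the problem:
/* file1.c */
char name[] = "hello"; /* defined as an array here... */

/* file2.c */
extern char *name;     /* ...but declared as a pointer here: using 'name'
                          through this declaration is undefined behavior */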
Different compilers trap different conditions; most compilers have warning level options. GCC specifically has many, but -Wall -Werror will switch on most of the useful ones and coerce them into errors. Use /W4 /WX for similar protection in VC++.
In GCC you could use -ansi -pedantic, but -pedantic is what it says it is, and will throw up many irrelevant issues and make it hard to use much third-party code.
Either way, because compilers catch different errors, or produce different messages for the same error, it is therefore useful to use multiple compilers, not necessarily for deployment, but as a poor-man's static analysis. Another approach for C code is to attempt to compile it as C++; the stronger type checking of C++ generally results in better C code; but be sure that if you want C compilation to work, don't use the C++ compilation exclusively; you are likely to introduce C++ specific features. Again this need not be deployed as C++, but just used as an additional check.
Finally, compilers are generally built with a balance of performance and error checking; to check exhaustively would take time that many developers would not accept. For this reason static analysers exist, for C there is the traditional lint, and the open-source splint. C++ is more complex to statically analyse, and tools are often very expensive. One of the best I have used is QAC++ from Programming Research. I am not aware of any free or open source C++ analysers of any repute.
gcc does warn in that situation (at least with -Wall):
#include <stdio.h>
int main(int argc, char *argv[])
{
int a[5];
int i = 0;
a[i] = ++i;
printf("%d\n", a[0]);
return 0;
}
Gives:
$ make
gcc -Wall main.c -o app
main.c: In function ‘main’:
main.c:8: warning: operation on ‘i’ may be undefined
Edit:
A quick read of the man page shows that -Wsequence-point will do it, if you don't want -Wall for some reason.
On the contrary, compilers are not required to issue any sort of diagnostic for undefined behavior:
§1.4.1:
The set of diagnosable rules consists of all syntactic and semantic rules in this International Standard except for those rules containing an explicit notation that “no diagnostic is required” or which are described as resulting in “undefined behavior.”
Emphasis mine. While I agree it might be nice, compilers have enough problems trying to be standards-compliant, let alone teaching the programmer how to program.
GCC warns as much as it can when you do something outside the norms of the language while still being syntactically correct, but beyond a certain point one must simply know what one is doing.
You can call GCC with the -Wall flag to see more of that.
If your compiler won't warn of this, you can try a Linter.
Splint is free, but only checks C http://www.splint.org/
Gimpel Lint supports C++ but costs US$389 - maybe your company can be persuaded to buy a copy? http://www.gimpel.com/