No instance of function template remove_if matches argument list - c++

I am trying to remove whitespace from a string:
line.erase(remove_if(line.begin(), line.end(), isspace), line.end());
But Visual Studio 2010 (C++ Express) tells me
1 IntelliSense: no instance of function template "std::remove_if" matches the argument list d:\parsertry\parsertry\calc.cpp 18
Full Source
Why is that? A simple piece of code
int main() {
string line = "hello world 111 222";
line.erase(remove_if(line.begin(), line.end(), isspace), line.end());
cout << line << endl;
getchar();
return 0;
}
Verifies the function works?
Funny thing is despite that, it runs giving correct result.

Don't question Intellisense, sometimes it's better to just ignore it. The parser or the database got screwed up somehow, so it doesn't work correctly anymore. Usually, a restart will fix the problem.
If you really want to know if the code is ill-formed, well, just hit F7 to compile.

Your source code compiles without even a warning with Visual C++ 11.0 (the compiler that ships with Visual Studio 2012).
Intellisense uses its own rules and isn't always reliable.
That said, your use of isspace has Undefined Behavior for any character outside the original 7-bit ASCII range when char is signed (the default in Visual C++), because such characters have negative values. That means the heavily upvoted answer you took it from is just balderdash (which should not surprise). You need to cast the argument to (the C library's) isspace to unsigned char to avoid negative values and UB.
C99 §7.4/1 (from the N869 draft):
The header <ctype.h> declares several functions useful for testing and mapping
characters.
In all cases the argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value of the macro EOF. If the
argument has any other value, the behavior is undefined.
A simple way to wrap the C function is
bool isSpace( char const c )
{
typedef unsigned char UChar;
return !!::isspace( UChar( c ) );
}
Why the typedef?
It makes the code easier to adapt when
you already have such a typedef, which is not uncommon;
it makes the code more clear; and
it avoids a C syntax cast, thereby avoiding a false positive when searching for such via a regular expression or other pattern matching.
But, why the !! (double application of the negation operator)? Considering there’s an automatic implicit conversion from int to bool? And, if one absolutely feels that the conversion should be explicit, shouldn’t it be a static_cast, and not !!?
Well, the !! avoids a silly-warning from the Visual C++ compiler,
“warning C4800: 'int' : forcing value to bool 'true' or 'false' (performance warning)”
and a static_cast doesn’t stop that warning. It’s good practice to quench that warning, and since Visual C++ is the main C++ compiler on the most used system, namely Windows, better do this in all code meant to be portable.
Oh, OK, but, since the function must be wrapped anyway, then … why use the old C library isspace (single argument) function, when the <locale> header provides a far more flexible C++ (two arguments) isspace function?
Well, first and foremost, the old C isspace function is the one used in the question, so that’s the one discussed in this answer. I have focused on discussing just how to not do this incorrectly, that is, how to avoid Undefined Behavior. Discussing how to do it right brings it to a whole different level.
But regarding the in-practice, the C++ level function of the same name can be considered to be broken, since with g++ compilers until recently (and perhaps even with g++ 4.7.2, I haven't checked lately) only the C locale mechanism worked, and the C++ level one didn't, in Windows. It may have been fixed since g++ now supports wide streams, I don’t know. Anyway, the C library isspace function, in addition to being in-practice more portable and generally working in Windows, is also simpler and, I believe, more efficient (although for efficiency one should always MEASURE if it is deemed important!).
Thanks to James Kanze for asking (essentially) the questions above, in the comments.
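To tie the wrapper back to the original erase/remove_if call, here is a minimal sketch. The names isSpaceSafe and stripWhitespace are made up for illustration, and it uses static_cast rather than the typedef shown above; treat it as one possible arrangement, not the only correct one.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (name is an assumption): wraps the C library isspace.
inline bool isSpaceSafe(char c)
{
    // Convert to unsigned char first, so a negative char value
    // never reaches isspace.
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: returns a copy of text with all whitespace removed.
inline std::string stripWhitespace(std::string text)
{
    // isSpaceSafe is not overloaded, so remove_if can deduce its type.
    text.erase(std::remove_if(text.begin(), text.end(), isSpaceSafe),
               text.end());
    return text;
}
```

Because the predicate is an ordinary non-overloaded function, it also sidesteps the overload ambiguity between the <cctype> and <locale> versions of isspace.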

What is isspace? Depending on the headers you include and the compiler
you are using, it's likely that your code won't even compile. (I don't
know about IntelliSense, but it's possible that it's looking at all of
the standard headers, and sees the ambiguity.)
There are two isspace functions in the standard, and one is a
template. Passing a function template to a template argument of another
function template does not give the compiler nearly enough information
to be able to do template argument deduction: in order to resolve the
overload of isspace, it has to know the type expected by the
remove_if, which it only knows after template argument deduction has
succeeded. And to do template argument deduction on remove_if, it has
to know the types of the arguments, which means the type of isspace,
which it will only know once it has been able to resolve the overload on
it.
(I'm actually surprised that your little bit of code compiles: you
obviously include <iostream>, and typically, <iostream> will include
<locale>, which will bring in the function template isspace.)
Of course, the function template isspace must be called with two
arguments, so if it were ever chosen, the instantiation of remove_if
wouldn't compile (but the compiler does not try to instantiate
remove_if until it has chosen a function). And the isspace in
<ctype.h> will result in undefined behavior if passed a char, so you
can't use it. The usual solution is to create a set of predicate
objects for your tool box, and use them. Something like the following
should work if you're only concerned with char:
template <std::ctype<char>::mask m>
class Is : public std::unary_function<char, bool>
{
std::locale myLocale; // To ensure lifetime of following...
std::ctype<char> const* myCType;
public:
Is( std::locale const& loc = std::locale() )
: myLocale( loc )
, myCType( &std::use_facet<std::ctype<char> >( myLocale ) )
{
}
bool operator()( char ch ) const
{
return myCType->is( m, ch );
}
};
typedef Is<std::ctype_base::space> IsSpace;
It's trivial to add the additional typedef's so you get the complete
set, and I've found it useful to add an IsNot template as well. It's
simple, and it avoids all of the surrounding issues.
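For what it's worth, here is a sketch of how such a predicate might be used with the erase/remove_if idiom from the question. It is a simplified, non-template variant under my own assumptions: the names IsSpaceSketch and removeSpaces are hypothetical, and it leaves out std::unary_function, which was removed in C++17.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical, non-template variant of the predicate idea above.
class IsSpaceSketch
{
    std::locale loc_;                 // keeps the facet below alive
    std::ctype<char> const* ctype_;   // facet owned by loc_
public:
    explicit IsSpaceSketch(std::locale const& loc = std::locale())
        : loc_(loc)
        , ctype_(&std::use_facet<std::ctype<char> >(loc_))
    {
    }
    bool operator()(char ch) const
    {
        return ctype_->is(std::ctype_base::space, ch);
    }
};

// Hypothetical helper: returns a copy of text without whitespace.
inline std::string removeSpaces(std::string text)
{
    text.erase(std::remove_if(text.begin(), text.end(), IsSpaceSketch()),
               text.end());
    return text;
}
```

Since IsSpaceSketch is a class type rather than an overloaded function name, template argument deduction on remove_if has nothing ambiguous to resolve.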


Is there a difference between 'type const &value' and 'type const& value'?

The topic basically says it all. I'm following a tutorial where it says type const &value, but visual studio keeps automatically correcting it to type const& value.
Is there a difference?
Is there a convention?
The way C++ parses tokens means that these two declarations are treated identically, and there is no difference in the code that will be generated.
As to why MSVC keeps autocorrecting to the other, I really don't think it should, but my guess for why it does is that MSVC is helping you keep your code consistent (which is important!), and it is assuming that the way you want the code to look is in this format: type const& name.
This format is more recognizably C++, whereas the other method, type const &name is more C-like (despite the fact that C doesn't have references; but it does have pointers, and a similar declaration type const *name would be more obviously C as well), and MSVC wants to enforce the C++ style.
But again; there's no functional difference between them. Both are valid, both will compile, and both will have the same behavior.
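If you want the compiler to confirm this, a small compile-time check along these lines should do. The function names takesRefA and takesRefB are invented for the example.

```cpp
#include <type_traits>

// Two hypothetical declarations that differ only in where the '&' sits.
void takesRefA(int const &value);  // ampersand next to the name
void takesRefB(int const& value);  // ampersand next to the type

// Compiles only if both declarations have exactly the same function type.
static_assert(std::is_same<decltype(takesRefA), decltype(takesRefB)>::value,
              "both spellings declare the same function type");
```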

What makes this usage of pointers unpredictable?

I'm currently learning pointers and my professor provided this piece of code as an example:
//We cannot predict the behavior of this program!
#include <iostream>
using namespace std;
int main()
{
char * s = "My String";
char s2[] = {'a', 'b', 'c', '\0'};
cout << s2 << endl;
return 0;
}
He wrote in the comments that we can't predict the behavior of the program. What exactly makes it unpredictable though? I see nothing wrong with it.
The behaviour of the program is non-existent, because it is ill-formed.
char* s = "My String";
This is illegal. Prior to 2011, it had been deprecated for 12 years.
The correct line is:
const char* s = "My String";
Other than that, the program is fine. Your professor should drink less whiskey!
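As a rough sketch of the corrected code (the function name correctedExample is made up, and it writes to a string stream instead of cout so the result can be inspected), something like this is well-formed under every standard:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper around the professor's example, with the fix applied.
inline std::string correctedExample()
{
    const char* s = "My String";        // pointer to const: always well-formed
    char s2[] = {'a', 'b', 'c', '\0'};  // modifiable, null-terminated array

    std::ostringstream out;
    out << s << ' ' << s2;              // both decay to pointers to char
    return out.str();
}
```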
The answer is: it depends on what C++ standard you're compiling against. All the code is perfectly well-formed across all standards‡ with the exception of this line:
char * s = "My String";
Now, the string literal has type const char[10] and we're trying to initialize a non-const pointer to it. For all other types other than the char family of string literals, such an initialization was always illegal. For example:
const int arr[] = {1};
int *p = arr; // nope!
However, in pre-C++11, for string literals, there was an exception in §4.2/2:
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type “pointer to char”; [...]. In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ]
So in C++03, the code is perfectly fine (though deprecated), and has clear, predictable behavior.
In C++11, that block does not exist - there is no such exception for string literals converted to char*, and so the code is just as ill-formed as the int* example I just provided. The compiler is obligated to issue a diagnostic, and ideally in cases such as this that are clear violations of the C++ type system, we would expect a good compiler to not just be conforming in this regard (e.g. by issuing a warning) but to fail outright.
The code should ideally not compile - but does on both gcc and clang (I assume because there's probably lots of code out there that would be broken with little gain, despite this type system hole being deprecated for over a decade). The code is ill-formed, and thus it does not make sense to reason about what the behavior of the code might be. But considering this specific case and the history of it being previously allowed, I do not believe it to be an unreasonable stretch to interpret the resulting code as if it were an implicit const_cast, something like:
const int arr[] = {1};
int *p = const_cast<int*>(arr); // OK, technically
With that, the rest of the program is perfectly fine, as you never actually touch s again. Reading a created-const object via a non-const pointer is perfectly OK. Writing a created-const object via such a pointer is undefined behavior:
std::cout << *p; // fine, prints 1
*p = 5; // will compile, but undefined behavior, which
// certainly qualifies as "unpredictable"
As there is no modification via s anywhere in your code, the program is fine in C++03, should fail to compile in C++11 but does anyway - and given that the compilers allow it, there's still no undefined behavior in it†. With allowances that the compilers are still [incorrectly] interpreting the C++03 rules, I see nothing that would lead to "unpredictable" behavior. Write to s though, and all bets are off. In both C++03 and C++11.
†Though, again, by definition ill-formed code yields no expectation of reasonable behavior
‡Except not, see Matt McNabb's answer
Other answers have covered that this program is ill-formed in C++11 due to the assignment of a const char array to a char *.
However the program was ill-formed prior to C++11 also.
The operator<< overloads are in <ostream>. The requirement for iostream to include ostream was added in C++11.
Historically, most implementations had iostream include ostream anyway, perhaps for ease of implementation or perhaps in order to provide a better QoI.
But it would be conforming for iostream to only define the ostream class without defining the operator<< overloads.
The only slightly wrong thing that I see with this program is that you're not supposed to assign a string literal to a mutable char pointer, though this is often accepted as a compiler extension.
Otherwise, this program appears well-defined to me:
The rules that dictate how character arrays become character pointers when passed as parameters (such as with cout << s2) are well-defined.
The array is null-terminated, which is a condition for operator<< with a char* (or a const char*).
#include <iostream> includes <ostream>, which in turn defines operator<<(ostream&, const char*), so everything appears to be in place.
You can't predict the behaviour of the compiler, for reasons noted above. (It should fail to compile, but may not.)
If compilation succeeds, then the behaviour is well-defined. You certainly can predict the behaviour of the program.
If it fails to compile, there is no program. In a compiled language, the program is the executable, not the source code. If you don't have an executable, you don't have a program, and you can't talk about behaviour of something that doesn't exist.
So I'd say your prof's statement is wrong. You can't predict the behaviour of the compiler when faced with this code, but that's distinct from the behaviour of the program. So if he's going to pick nits, he'd better make sure he's right. Or, of course, you might have misquoted him and the mistake is in your translation of what he said.
As others have noted, the code is illegitimate under C++11, although it was valid under earlier versions. Consequently, a compiler for C++11 is required to issue at least one diagnostic, but behavior of the compiler or the remainder of the build system is unspecified beyond that. Nothing in the Standard would forbid a compiler from exiting abruptly in response to an error, leaving a partially-written object file which a linker might think was valid, yielding a broken executable.
Although a good compiler should always ensure before it exits that any object file it is expected to have produced will be either valid, non-existent, or recognizable as invalid, such issues fall outside the jurisdiction of the Standard. While there have historically been (and may still be) some platforms where a failed compilation can result in legitimate-appearing executable files that crash in arbitrary fashion when loaded (and I've had to work with systems where link errors often had such behavior), I would not say that the consequences of syntax errors are generally unpredictable. On a good system, an attempted build will generally either produce an executable with a compiler's best effort at code generation, or won't produce an executable at all. Some systems will leave behind the old executable after a failed build, since in some cases being able to run the last successful build may be useful, but that can also lead to confusion.
My personal preference would be for disk-based systems to rename the output file, to allow for the rare occasions when that executable would be useful while avoiding the confusion that can result from mistakenly believing one is running new code, and for embedded-programming systems to allow a programmer to specify for each project a program that should be loaded if a valid executable is not available under the normal name [ideally something which safely indicates the lack of a usable program]. An embedded-systems tool-set would generally have no way of knowing what such a program should do, but in many cases someone writing "real" code for a system will have access to some hardware-test code that could easily be adapted to the purpose. I don't know that I've seen the renaming behavior, however, and I know that I haven't seen the indicated programming behavior.

How cout is more typesafe than printf()

I have read this in many places, but I do not understand it. Why is it said that cout is more type safe than printf()? Is it just because you are not required to write %d %c %f, or does it have some deeper meaning?
Thanks in advance.
This is why:
printf("%s\n", 42); // this will clobber the stream
This has undefined behaviour: printf treats the int 42 as a pointer to a string and tries to read from that address, which typically crashes or prints garbage. The compiler cannot generally check that the format string in the first argument of printf corresponds to the types of the subsequent arguments. It could do this in the above case, because the string is hard-coded, and some compilers do [1]. But in general the format string may be determined at runtime, so the compiler cannot check its correctness.
[1] But these checks are special-cased to printf. If you wrote your own myprintf function with the same signature as printf, there would be no standard way to check for type safety, since the signature uses ellipsis ... which elides all type information inside the function.
The printf family functions are variadic functions, as all of them use ellipsis ... which means argument(s) of any type(s) can be passed to the function as far as ... is concerned. There is no restriction by the compiler, as there is no requirement on the types of the arguments. The compiler cannot impose any type-safety rule as the ellipsis ... allows ALL types. The function uses a format string to assume the argument type (even if there is a mismatch!!). The format string is read and interpreted at runtime, by which time the compiler cannot do anything if there is a mismatch because the code is already compiled. So in this way, this is not type-safe. By type-safe, we usually mean that the compiler is able to check the type consistency of the program, by imposing rules on the types of (unevaluated) expressions.
Note that if there is a mismatch (which the function cannot figure out!), the program enters into the undefined-behaviour zone, where the behavior of the program is not predictable and theoretically anything could happen.
You can extend the same logic to any variadic function, such as the scanf family.
The type system guarantees correctness with std::ostream but not with printf.
Konrad's answer is one example, but something like
printf("%ld\n", 7);
is also broken (assuming long and int are different sizes on your system). It may even work with one build target and fail with another. Trying to print typedefs like size_t has the same problem.
This is (somewhat) solved with the diagnostic provided by some compilers, but that doesn't help with the second sense:
both type systems (the run-time type system used in the format string, and the compile-time system used in your code) cannot be kept in sync automatically. For example, printf interacts badly with templates:
template <typename T> void print(T t) { printf("%d\n",t); }
You can't make this correct for all types T; the best you can do is static_cast<int>(t), so that it fails to compile if T is not convertible to int. Compare
template <typename T> void print(std::ostream& os, T t) { os << t << '\n'; }
which selects the correct overload of operator<< for any T that has one.
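To make the overload selection concrete, here is a small sketch built around the ostream version of print. The helper printed is a made-up name that simply captures the output in a string so it can be checked.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Same shape as the ostream-based print above: operator<< is chosen
// at compile time from the static type T.
template <typename T>
void print(std::ostream& os, T t) { os << t << '\n'; }

// Hypothetical helper: runs print into a string stream and returns the text.
template <typename T>
std::string printed(T t)
{
    std::ostringstream os;
    print(os, t);
    return os.str();
}
```

Calling printed(42), printed(7L) and printed("two") each pick a different operator<< overload with no format string involved, and a type without a suitable operator<< is rejected at compile time instead of misbehaving at run time.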
Generally, compilers can't check the arguments passed to printf: neither the argument count nor whether the arguments suit the format string. Some compilers do perform such checks, but that is a special case, and it may miss mismatches.
Example:
printf("%s %d\n", 1, "two", 3);
This will compile (unless the compiler's special-case check detects the mismatch), and at runtime printf will treat the first argument (1) as a string and the second ("two") as an integer. printf will not even notice that there is a third argument, nor will it notice if there aren't enough arguments!
Using cout, the compiler must choose a specific operator<< for each value you insert.
Example:
cout<<1<<"two"<<3<<endl;
The compiler must turn this into calls to the corresponding ostream& operator<<(int) and ostream& operator<<(ostream&, const char*) (and also ostream& operator<<(ostream& (*)(ostream&)) for endl).
cout also avoids the runtime interpretation of a format string, although whether that makes it faster in practice depends on the implementation.
From the C++ FAQ :
[15.1] Why should I use <iostream> instead of the traditional <cstdio>?
[...]
More type-safe: With <iostream>, the type of object being I/O'd is known statically by the compiler. In contrast, <cstdio> uses "%" fields to figure out the types dynamically.
[...]
For printf, the compiler cannot check that the format string in the first argument corresponds to the types of the other arguments. In general, that matching is only done at runtime.

Removing punctuation query

I've been updating a program I wrote almost two years ago, and I've come across a call to remove all punctuation and spaces from a string.
The call works alright, but I'm not sure that it's the most efficient way to do this.
The line of code is below:
tempMessage.erase(remove_if(tempMessage.begin(), tempMessage.end(), (int(*)(int))ispunct), tempMessage.end());
I've no recollection of where I came up with this or how it was put together, but I want to be able to understand this call fully.
I get that the std::string.erase gets rid of the first argument up until the second argument. I can also see how the remove_if defines the start and end points, but can anyone tell me where the third argument in the remove_if call is coming from?
I can't remember why the (int(*)(int)) is needed for the life of me.
While you are looking at the code, can anyone improve this, or make it more efficient?
Thanks
First, this doesn't work in general; it just seems to (and it
may work with some compilers). You cannot pass a char to the
one argument version of ispunct without incurring undefined
behavior.
As for the reason for the cast: the standard defines both
a single argument ispunct function and a two argument
ispunct function template. In order to correctly
instantiate the function template remove_if, the compiler needs
to know the exact type of ispunct. To know the exact type of
ispunct, the compiler needs to be able to do type deduction on
the function template. In order to do type deduction, the
compiler needs to know the type expected. There's a cycle in
the dependencies, which the explicit cast (or what looks like
an explicit cast) resolves.
Because using the one parameter version of ispunct results in
undefined behavior, and using the two parameter version won't
compile unless you provide the additional parameter (using
std::bind, for example), anyone doing any string processing in
C++ will have functional objects already written in his toolbox
to handle this, and would write something like:
tempMessage.erase(
std::remove_if( tempMessage.begin(), tempMessage.end(), IsPunct() ),
tempMessage.end() );
How you implement IsPunct depends on your needs with regards
to localization. The simplest version is just:
struct IsPunct
{
bool operator()( char ch ) const
{
return ::ispunct( static_cast<unsigned char>( ch ) );
}
};
The version using the ctype facet of locale is somewhat
more complicated (and you probably want it to keep a copy of the
locale, as well as a reference to the facet, just to be sure
that the referenced facet doesn't disappear).
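A possible sketch of a locale-aware variant follows. The names IsPunctLocale and stripPunct are hypothetical, and for brevity it calls the two-argument std::ispunct from <locale> on every character instead of caching a reference to the ctype facet as described above, so take it as a simplification.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical locale-aware predicate; stores its own copy of the locale.
class IsPunctLocale
{
    std::locale loc_;
public:
    explicit IsPunctLocale(std::locale const& loc = std::locale())
        : loc_(loc)
    {
    }
    bool operator()(char ch) const
    {
        // Two-argument function template from <locale>; it takes the
        // char directly, so no cast to unsigned char is involved.
        return std::ispunct(ch, loc_);
    }
};

// Hypothetical helper: returns a copy of text without punctuation.
inline std::string stripPunct(std::string text)
{
    text.erase(std::remove_if(text.begin(), text.end(), IsPunctLocale()),
               text.end());
    return text;
}
```

Looking the facet up per call is likely slower than caching it, so measure before using this in a hot path.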

Using void in functions without parameter?

In C++ using void in a function with no parameter, for example:
class WinMessage
{
public:
BOOL Translate(void);
};
is redundant, you might as well just write Translate();.
I myself generally include it, since it's a bit helpful when IDEs that support code completion display a void: it assures me that the function definitely takes no parameters.
My question is, Is adding void to parameter-less functions a good practice? Should it be encouraged in modern code?
In C++
void f(void);
is identical to:
void f();
The fact that the first style can still be legally written can be attributed to C.
n3290 § C.1.7 (C++ and ISO C compatibility) states:
Change: In C++, a function declared with an empty parameter list takes
no arguments.
In C, an empty parameter list means that the number and
type of the function arguments are unknown.
Example:
int f(); // means int f(void) in C++
// int f( unknown ) in C
In C, it makes sense to avoid that undesirable "unknown" meaning. In C++, it's superfluous.
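A quick way to convince yourself of the C++ side of this is a compile-time check. The function names withVoid and withoutVoid are invented for the example.

```cpp
#include <type_traits>

int withVoid(void);   // C-style spelling of "no parameters"
int withoutVoid();    // idiomatic C++ spelling

// In C++ both declarations have the function type int().
static_assert(std::is_same<decltype(withVoid), decltype(withoutVoid)>::value,
              "f(void) and f() declare the same type in C++");
```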
Short answer: in C++ it's a hangover from too much C programming. That puts it in the "don't do it unless you really have to" bracket for C++ in my view.
I see absolutely no reason for this. IDEs will just complete the function call with an empty argument list, and 4 characters less.
Personally I believe this is making the already verbose C++ even more verbose. There's no version of the language I'm aware of that requires the use of void here.
I think it will only help in backward compatibility with older C code, otherwise it is redundant.
I feel like no. Reasons:
A lot more code out there has the BOOL Translate() form, so others reading your code will be more comfortable and productive with it.
Having less on the screen (especially something redundant like this) means less thinking for somebody reading your code.
Sometimes people, who didn't program in C in 1988, ask "What does foo(void) mean?"
Just as a side note: another reason for not including the void is that software such as StarUML, which can read code and generate class diagrams, reads the void as a parameter. Even though this may be a flaw in the UML-generating software, it is still annoying to have to go back and remove the "void"s if you want clean diagrams.