I have problem understanding the std::cin.peek() and std::cin.get() - c++

I am so confused about the usage of std::cin.peek(), std::cin.get() I mean these versions returning an int.
As I've read on C++ primer we should never assign the return from these function to a char:
char c = std::cin.get(); // erroneous
But on Cppreference it does it in the example and in many websites and many programmers including me until I discovered the logic of not doing so.
https://en.cppreference.com/w/cpp/io/basic_istream/get
And I see also such usage checking for a new line:
while(std::cin.peek() != '\n' )
;// do something
In the above code I think it is OK because in fact there is no assignment from int to char but a comparison in which the newline character '\n' is promoted first to int then compared with int which I think is not evil.
If the code is OK then what is the point in using std::char_traits<char>::to_int_type() function?
So there is no way to assign any value returned from peek() and get() to a char?'
I've seen some recommended code like:
char c;
while(std::cin.peek() != std::char_traits<char>::to_int_type('\n')){
std::cin.get(c);
std::cout.put(c);
}
So what is the difference between implicit conversion of '\n' to int and using this trait function?

Related

Passing a .war file using a http POST request not working [duplicate]

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.

C++ toupper Syntax

I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.
for (int i = 0; i < string.length(); i++)
{
if (isalpha(string[i]))
{
if (islower(string[i]))
{
string[i] = toupper(string[i]);
}
}
}
Why do you have to list string[i] twice? Shouldn't this work?
toupper(string[i]); (I tried it, so I know it doesn't.)
toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:
// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
...
In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x.
To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:
for (size_t i = 0; i < string.length(); i++) {
char& c = string[i]; // reference to character inside string
c = std::toupper(c);
}
Using range-based for, the above can be written more briefly (and executed more efficiently) as:
for (auto& c: string)
c = std::toupper(c);
As from the documentation, the character is passed by value.
Because of that, the answer is no, it shouldn't.
The prototype of toupper is:
int toupper( int ch );
As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it will be definitely lost.
That's why in your example it is reassigned so that to replace the original one.
As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a'). You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.
What is redundant, on the other hand, is your checking for isalpha and islower. If the character is not a lower-case alphabetic character, toupper will leave it alone anyway so the logic reduces to this.
#include <cctype>
#include <iostream>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
for (auto s = text; *s != '\0'; ++s)
*s = std::toupper(*s);
std::cout << text << '\n';
}
You could further eliminate the raw loop by using an algorithm, if you find this prettier.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <utility>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
std::transform(std::cbegin(text), std::cend(text), std::begin(text),
[](auto c){ return std::toupper(c); });
std::cout << text << '\n';
}
toupper takes an int by value and returns the int value of the char of that uppercase character. Every time a function doesn't take a pointer or reference as a parameter the parameter will be passed by value which means that there is no possible way to see the changes from outside the function because the parameter will actually be a copy of the variable passed to the function, the way you catch the changes is by saving what the function returns. In this case, the character upper-cased.
Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.
So what, you think.
Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).
In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.
The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):
int x = toupper ('é');
So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.
See http://en.cppreference.com/w/cpp/string/byte/isalpha: "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."

Creating binary (custom length) string in C++ [duplicate]

If I want to construct a std::string with a line like:
std::string my_string("a\0b");
Where i want to have three characters in the resulting string (a, null, b), I only get one. What is the proper syntax?
Since C++14
we have been able to create literal std::string
#include <iostream>
#include <string>
int main()
{
using namespace std::string_literals;
std::string s = "pl-\0-op"s; // <- Notice the "s" at the end
// This is a std::string literal not
// a C-String literal.
std::cout << s << "\n";
}
Before C++14
The problem is the std::string constructor that takes a const char* assumes the input is a C-string. C-strings are \0 terminated and thus parsing stops when it reaches the \0 character.
To compensate for this, you need to use the constructor that builds the string from a char array (not a C-String). This takes two parameters - a pointer to the array and a length:
std::string x("pq\0rs"); // Two characters because input assumed to be C-String
std::string x("pq\0rs",5); // 5 Characters as the input is now a char array with 5 characters.
Note: C++ std::string is NOT \0-terminated (as suggested in other posts). However, you can extract a pointer to an internal buffer that contains a C-String with the method c_str().
Also check out Doug T's answer below about using a vector<char>.
Also check out RiaD for a C++14 solution.
If you are doing manipulation like you would with a c-style string (array of chars) consider using
std::vector<char>
You have more freedom to treat it like an array in the same manner you would treat a c-string. You can use copy() to copy into a string:
std::vector<char> vec(100)
strncpy(&vec[0], "blah blah blah", 100);
std::string vecAsStr( vec.begin(), vec.end());
and you can use it in many of the same places you can use c-strings
printf("%s" &vec[0])
vec[10] = '\0';
vec[11] = 'b';
Naturally, however, you suffer from the same problems as c-strings. You may forget your null terminal or write past the allocated space.
I have no idea why you'd want to do such a thing, but try this:
std::string my_string("a\0b", 3);
What new capabilities do user-defined literals add to C++? presents an elegant answer: Define
std::string operator "" _s(const char* str, size_t n)
{
return std::string(str, n);
}
then you can create your string this way:
std::string my_string("a\0b"_s);
or even so:
auto my_string = "a\0b"_s;
There's an "old style" way:
#define S(s) s, sizeof s - 1 // trailing NUL does not belong to the string
then you can define
std::string my_string(S("a\0b"));
The following will work...
std::string s;
s.push_back('a');
s.push_back('\0');
s.push_back('b');
You'll have to be careful with this. If you replace 'b' with any numeric character, you will silently create the wrong string using most methods. See: Rules for C++ string literals escape character.
For example, I dropped this innocent looking snippet in the middle of a program
// Create '\0' followed by '0' 40 times ;)
std::string str("\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", 80);
std::cerr << "Entering loop.\n";
for (char & c : str) {
std::cerr << c;
// 'Q' is way cooler than '\0' or '0'
c = 'Q';
}
std::cerr << "\n";
for (char & c : str) {
std::cerr << c;
}
std::cerr << "\n";
Here is what this program output for me:
Entering loop.
Entering loop.
vector::_M_emplace_ba
QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ
That was my first print statement twice, several non-printing characters, followed by a newline, followed by something in internal memory, which I just overwrote (and then printed, showing that it has been overwritten). Worst of all, even compiling this with thorough and verbose gcc warnings gave me no indication of something being wrong, and running the program through valgrind didn't complain about any improper memory access patterns. In other words, it's completely undetectable by modern tools.
You can get this same problem with the much simpler std::string("0", 100);, but the example above is a little trickier, and thus harder to see what's wrong.
Fortunately, C++11 gives us a good solution to the problem using initializer list syntax. This saves you from having to specify the number of characters (which, as I showed above, you can do incorrectly), and avoids combining escaped numbers. std::string str({'a', '\0', 'b'}) is safe for any string content, unlike versions that take an array of char and a size.
In C++14 you now may use literals
using namespace std::literals::string_literals;
std::string s = "a\0b"s;
std::cout << s.size(); // 3
Better to use std::vector<char> if this question isn't just for educational purposes.
anonym's answer is excellent, but there's a non-macro solution in C++98 as well:
template <size_t N>
std::string RawString(const char (&ch)[N])
{
return std::string(ch, N-1); // Again, exclude trailing `null`
}
With this function, RawString(/* literal */) will produce the same string as S(/* literal */):
std::string my_string_t(RawString("a\0b"));
std::string my_string_m(S("a\0b"));
std::cout << "Using template: " << my_string_t << std::endl;
std::cout << "Using macro: " << my_string_m << std::endl;
Additionally, there's an issue with the macro: the expression is not actually a std::string as written, and therefore can't be used e.g. for simple assignment-initialization:
std::string s = S("a\0b"); // ERROR!
...so it might be preferable to use:
#define std::string(s, sizeof s - 1)
Obviously you should only use one or the other solution in your project and call it whatever you think is appropriate.
I know it is a long time this question has been asked. But for anyone who is having a similar problem might be interested in the following code.
CComBSTR(20,"mystring1\0mystring2\0")
Almost all implementations of std::strings are null-terminated, so you probably shouldn't do this. Note that "a\0b" is actually four characters long because of the automatic null terminator (a, null, b, null). If you really want to do this and break std::string's contract, you can do:
std::string s("aab");
s.at(1) = '\0';
but if you do, all your friends will laugh at you, you will never find true happiness.

Uninitialized char

I'm reading over a C++ class for parsing CSV files in one of my programming books for class. I primarily write in C# for work and don't interact with C++ code very often. One of the functions, getline, uses an uninitialized char variable and I'm confused as to whether it's a typo or not.
// getline: get one line, grow as needed
int Csv::getline(string& str)
{
char c;
for (line = ""; fin.get(c) && !endofline(c); )
line += c;
split();
str = line;
return !fin.eof();
}
fin is an istream. The documentation I'm reading shows the get (char& c); function being passed a reference, but which char in the stream is returned? What's the initial value of c?
The initial value of c is undefined but it does not matter what the initial value of c is since the call to get will set the value. Since there is a sequence point after the left hand side of the || and && operators we know that all the side effects of get will have been effected and endofline will see the modified value of c.

Why can I assign/compare int and char in C

I have the code like this :
#include <stdio.h>
main()
{
int c;
c = getchar();
while (c != EOF) {
putchar(c);
c = getchar();
}
}
The C documentation says that the getchar() returns the int value. And in the above program we have assigned c type as an int. And most importantly EOF is a integer constant defined in the header function.
Now if the code changes to something like this:
#include <stdio.h>
main()
{
char c;
c = getchar();
while (c != EOF) {
putchar(c);
c = getchar();
}
}
This code also works! Wait a min, as per C documentation getchar() returnsint, but see in the above code I'm storing it in char. And C compiler doesn't throw any error. And also in while loop I have compared c which is an char with EOF which is an int and compiler doesn't throw any error and my program executes!
Why does the compiler doesn't throw any error in the above two cases?
Thanks in advance.
No. It simply means that the returned value which is an int, implicitly converts into char type. That is all.
The compiler may generate warning messages for such conversion, as sizeof(int) is greater than sizeof(char). For example, if you compile your code with -Wconversion option with GCC, it gives these warning messages:
c.c:5:7: warning: conversion to 'char' from 'int' may alter its value
c.c:8:8: warning: conversion to 'char' from 'int' may alter its value
That means, you should use int to avoid such warning messages.
I'm afraid that the term "dynamic programming language" is too vaguely defined to make a such a fine distinction in this case.
Though I'd argue that implicit converting to one numeric type to another is not a dynamic language feature, but just syntax sugar.
No. Lets look at wikipedia's definition
These behaviors could include extension of the program, by adding new
code, by extending objects and definitions, or by modifying the type
system, all during program execution. These behaviors can be emulated
in nearly any language of sufficient complexity, but dynamic languages
provide direct tools to make use of them.
What you have demonstrated is that a char and int in C/C++ are pretty much the same, and C/C++ automatically casts between the two. Nothing more. There's no modification of the type system here.
Lets rewrite your code to illustrate what's going on
int main(int argc, char** argv)
{
char c;
c = EOF; /* supposing getchar() returns eof */
return (c == EOF) ? 0 : 1;
}
What should the return value of this program be? EOF is not a char, but you cast it to a char. When you do a comparison, that cast happens again, and it gets squashed to the same value. Another way of rewriting this to make it clear what's going on is:
#include <stdio.h>
main()
{
int c;
c = getchar();
while ((char)c != (char)EOF) {
putchar((char)c);
c = getchar();
}
}
EOF is getting squashed; it doesn't matter how it's getting squashed, it could be squashed to the letter 'M', but since it gets squashed the same way every time, you still see it as EOF.
C will let you do what you want, but you have to accept the consequences. If you want to assign an int to a char then you can. Doesn't make it a dynamic language.
(As an aside, the title of this question should be something like "why does c let me assign an int to a char?" and just contain the final paragraph. But presumably that wouldn't attract enough attention. If it had attracted any upvotes then I'd edit the title, but since it isn't I'll leave it as an example of how not to ask a question.)
If you are not reading 7-bit ASCII data, it is possible that \xFF is valid data. If EOF is \xFFFFFFFF and you return the value of getchar() to a char then you can not distinguish EOF from \xFF as data.