What is the purpose of buffer = "\0" [duplicate] - c++

This question already has answers here:
What is a null-terminated string?
(7 answers)
Closed 9 years ago.
I remember reading the explanation on the use of this line of code, but I've read so many books on sockets for the past week that I can't find it anymore.
I do remember in the book, they wrote their code using =\0, then said it would be better to have it at 1
I tried searching it, but had no luck, this is a piece of the code I'm reading where it is used
nread = recv(newsock, buffer, 25, 0);
buffer[nread] = '\0';

It turns the received buffer into a NUL-terminated C-string, that you can use with strlen, strcpy, etc.
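I assume the code you show is for illustration only, not production code, because it does not check the return value of recv, which can be -1. In that case buffer[nread] writes to buffer[-1], which is out of bounds and corrupts memory. Also note that if recv fills all 25 bytes, buffer must hold at least 26 chars so there is room for the terminator.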

This is the C/C++ null terminator which indicates the end of content in a character array.
http://en.wikipedia.org/wiki/Null-terminated_string

It is to signify that the string ends at that byte. In this case the last one.
\0 is the null character.
So you don't get garbage like "This is my message.aG¤(Ag4h98av¤"G#¤". Just imagine there's a \0 at the end of that string.
When dealing with networking, you usually want to send data like integers, and the most common practice is to send it in binary and not plaintext. An integer may look like "$%\0n" for example: 4 bytes, but the third one is a \0. So you have to take into account that there can be a \0 in the middle of the data. Therefore you should not store the binary representation of data as a C string, but rather as a buffer with an explicit length, or in a stringstream.
Of course, maybe you don't want to print out the binary representation of it. But you have to keep it in mind. Maybe you want to print it out, who knows.

Related

C++ string copy() gives me extra characters at the end, why is that?

I am studying C++. A blog introduced the concept of the copy function. When I tried the same on my system, the result did not match what I expected. Please let me know what I did wrong in the code below.
#include <iostream>
main(){
std::string statement = "I like to work in Google";
char compName[6];
statement.copy(compName, 6, 18);
std::cout<<compName;
}
I expected Google but actual output is Googlex
I am using windows - (MinGW.org GCC-6.3.0-1)
You are confusing a sequence of characters, C style string, and std::string. Let's break them down:
A sequence of characters is just that, one character after another in some container (in your case a C style array). To a human being several characters may look like a string, but there is nothing in your code to make it such.
A C style string is an array of characters terminated by the symbol \0. It is a carry-over from C; as such, a compiler will assume, even if you don't tell it otherwise, that an array of characters may potentially be such a string.
C++ string (std::string) is a template class that stores strings. There is no need to worry how it does so internally. Although there are functions for interoperability with the first two categories, it is a completely different thing.
Now, let's figure out how a compiler sees your code:
char compName[6];
This creates an array of characters with enough space to store 6 symbols. You can write C style strings into it as long as they are 5 symbols or less, since you will need to also write '\0' at the end. Since in C++ C style arrays are unsafe, they will allow you to write more characters into them, but you cannot predict in advance where those extra characters will be written into memory (or even if your program will continue to execute). You can also potentially read more characters from the array... But you cannot even ask the question where that data will be coming from, unless you are simply playing around with your compiler. Never do that in your code.
statement.copy(compName, 6, 18);
This line writes 6 characters. It does not make it into a C style string, it is simply 6 characters in an array.
std::cout<<compName;
You are trying to output a C style string to the console... which you have not provided to the compiler. So operator<< receives a char [], assumes that you knew what you were doing, and works as if you gave it a C string. It displays one character after another until it reaches '\0'. When will it get such a character? I have no idea, since you never gave it one. But because C style arrays are unsafe, it will have no problem trying to read characters past the end of the array, reading some memory blocks and thinking that they are a continuation of your non-existent C style string.
Here you got "lucky": you only got a single byte that appeared as an 'x', then a byte with 0 written in it, and the output stopped. If you run your program at a different time, with a different compiler, or compiled with different optimisations, you might get completely different data displayed.
So what should you have done?
You can try this:
#include <iostream>
#include <string>
int main()
{
std::string statement = "I like to work in Google";
char compName[7]{};
statement.copy(compName, 6, 18);
std::cout<<compName;
return 0;
}
What did I change? I made the array able to hold 7 characters (leaving enough space for a C style string of 6 characters) and I provided an empty initialisation list {}, which fills the array with \0 characters. This means that when you replace the first 6 of them with your data, there will still be a terminating character at the very end.
Another approach would be to do this:
#include <iostream>
#include <string>
int main()
{
std::string statement = "I like to work in Google";
char compName[7];
auto length = statement.copy(compName, 6, 18);
compName[length] = '\0';
std::cout<<compName;
return 0;
}
Here I do not initialise the array, but I get the length of the data that was written there from the .copy method and then add the needed terminator in the correct position.
What approach is best depends on your particular application.
When a pointer to a character is passed to the stream insertion operator, the pointer is required to point to a null-terminated string.
compName does not contain the null terminator character. Therefore inserting (a pointer to an element of) it into a character stream violates the requirement above.
Please let me know what I did wrong here
You violate the requirement above. As a consequence, the behaviour of your program is undefined.
I expected Google but actual output is Googlex
This is because the behaviour of the program is undefined.
How to terminate it?
Firstly, make sure that there is room in the array for the null terminator character:
char compName[7];
Then, assign the null terminator character:
compName[6] = '\0';

Please explain me the need for appending '\0' while converting string to char

While using proc/mysql for C++, I take a string as user input and convert it to a char array with strcpy(c, s.c_str());, where c is the bind variable through which I add the value to the database table and s is the string (user input). It works fine, but my teacher is asking me to append '\0' at the end. I can't understand why I need to.
Your teacher is deluded.
c_str() in itself appends a zero [or rather, std::string reserves space for an extra character when creating the string, and makes sure this is zero at least at the point of c_str() returning - in C++11, it is guaranteed that there is an extra character space filled with zero at the end of the string, always].
You DO need a zero at the end of a string to mark the end of the string in a C-style string, such as those used by strcpy.
[As others have pointed out, you should also check that the string fits before copying, and I would suggest rejecting it if it won't fit, as truncating it will lead to other problems - as well as checking that there aren't any SQL-injection attacks and a multitude of other things required for "good practice in an SQL environment"]
While the teacher is deluded on the appending '\0' to the string, your code exhibits another very bad bug.
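You should never use strcpy in such a fashion. You should always use some routine that limits the number of characters copied, like strncpy(), or other alternatives, and provide it with the size of the receiving variable. Otherwise you are just asking for trouble. Note that strncpy() does not write a terminating '\0' when the source is at least as long as the limit, so in that case you still have to terminate the destination yourself.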
Just guessing, it's a protection against buffer overflow. If c is only N bytes long and s.c_str() returns a pointer to a N+k length string, you'd write k bytes after c, which is bad.
Now let's say (if you didn't SEGFAULT already) you pass this c NUL-terminated string to a C function: you have no guarantee that the \0 you wrote after c is still there. This C function will then read an undefined number of bytes after c, which is worse.
Anyway, use ::strncpy():
char c[64];
::strncpy(c, s.c_str(), sizeof(c));
c[sizeof(c)-1] = '\0';

Break an input string into a list using Arduino

I am working on a very basic REPL for Arduino. To get parameters, I need to split a String into parts, separating using spaces. I do not know how I would store the result. For example, pinmode 1 input would result in a list: "pinmode", 1, "input". The 1 would have to be an int. I have looked at other Stack Overflow answers, but they require a char input.
Don't use String. That's the reason all the other answers use char arrays, commonly called C strings (lower case "s"). String uses "dynamic memory" (i.e., the heap), which is bad on this small microcontroller. It also adds 1.6k to your program size.
The simplest thing to do is save each received character into a char array, until you get the newline character, '\n'. Be sure to add the NUL character at the end, and be sure your array is sized appropriately.
Then process the array using the C string library: strcmp, strtoul, strtok, isdigit, etc. Learning about these routines will really pay off, as it keeps your program small and fast. C strings are easy to print out, as well.
Again, stay away from String. It is tempting to beginners, because it is easy to understand. However, it has many subtle, complicated and unpredictable ways to make your embedded program fail. This is not a PC with lots of RAM and a swap file on a hard drive.

Now when we get a string from the user using gets(), where does the '\0' terminating character go?

Now when we declare a string, the last character is the null character, right?
(Please see the image of the code and its output that I have attached.)
As you can see in the attached image, I am getting the null character at the 7th position! What is happening?
According to the book I refer to (see the other attached image), a string always has an extra character at its end, called the null character, which adds to the size of the string.
But with the above code I am getting the null character at the 7th position, although according to the book I should get it at the 6th position.
Can someone please explain the output?
Any help is really appreciated!!
Thank You!
Do not use gets() - ever! It is entirely immaterial what gets() does, as it has no place in any reasonably written code! It was removed from the C standard in C11 and from the C++ standard in C++14. gets() happily overruns the buffer provided, as it doesn't even know the size of the storage provided. It has long been blamed as a classic source of buffer-overflow exploits.
In the code you linked to there is such a buffer overrun. Also note that sizeof determines the size of a variable. It does not consider its content in any shape or form: sizeof(str) will not change unless you change the type of str. If you want to determine the length of the string in that array you'll need to use strlen(str).
If you really need to read a string into a C array using FILE* functions, you should use fgets() which, in addition to the pointer to the storage and the stream (e.g. stdin for the default input stream), also takes the size of the array as a parameter. fgets() stores at most size - 1 characters and always null-terminates what it read; it returns a null pointer only if it could not read any characters at all (end of file or error).
You declare a char array that can hold up to 5 chars, however, dummy\0 is 6 characters long, resulting in buffer overflow.

What is the advantage of using gets(a) instead of cin.getline(a,20)?

We will have to define an array for storing the string either way.
char a[10];
And so suppose I want to store smcck in this array. What is the advantage of using gets(a)? My teacher said that the extra space in the array is wasted when we use cin.getline(a, 20), but that applies for gets(a) too right?
Also just an extra question, what exactly is stored in the empty "boxes"of an array?
gets() is a C function. It does not do bounds checking and is considered dangerous; it was kept all these years only for compatibility.
You can check the following link to clear your doubt :
http://www.gidnetwork.com/b-56.html
Don't mix C features with C++. Although most C features work in C++, doing so is not recommended. If you are working in C++ you should avoid gets() and use getline() instead.
Well, I don't think gets(a) is better, because it does not check the size of the string. If you try to read a long string with it, it may cause a buffer overflow. That means it will use all 10 spaces you allocated and then keep writing into space that belongs to other variables (which is likely to make your application crash).
cin.getline() receives an int as a parameter which tells it not to read more than the expected number of characters. If you allocate an array with only 10 positions and tell it to read 20 characters, it will cause the same problem I described for gets().
About the strings representation in memory, if you put "smcck" on an array
char v[10];
The word will take the first 5 positions (0 to 4), and position 5 will be taken by a null character (represented by '\0') that marks the end of the string. What comes after that in the array does not matter: those positions keep whatever values they held before. The null terminator marks where the string ends, so you can work with it safely.