Does strlen have any side effect? - c++

I know, or better I suppose, that strlen() is fairly simple function, but for some reason, when I want to use it in one my function, it will corrupt the program. I have no idea why it's happening, because the results from strlen are correct, but the program as a whole doesn't work if strlen is included. To be specific, this part of code is causing trouble:
int random1 = strlen(objectNameValue);
int random2 = strlen(PA_type_value);
int random3 = strlen(attributeValue);
printf("length of objectNameValue: %d, PA_type_value: %d, attributeValue: %d", random1, random2, random3);
My program is actually a simple LDAP server and I'm trying to send some data - those variables (attributeValue), for example, when I put this part of code to comments, it works perfectly. But if I leave it there, sent data are not correct, and I have really no idea how strlen can affect them. Funny is that strlen is actually printing correct results.
To specify a bit more, those three variables in strlen are all of type char[] and all have /0 at the end. Can anyone help?
So, after request in comments, here is a closer look:
Basically, in very minimal way, this is something I'm doing:
char message_out[1024];
bzero(message_out, 1024);
char objectNameValue[1024];
strcpy(objectNameValue,"cn=");
char PA_type_value[5] = {'u', 'i', 'd'};
strcat(Message_out, objectNameValue);
strcat(Message_out, PA_type_value);
But sometimes I'm not putting the fixed characters into string, but another string for example., and I need to find out how big that string is, because I need to count total lengths of the message_out. I always watch out for strings, to be big enough, to be initialized with zeroes.
Output is this:
Length of objectNameValue: 12, PA_type_value: 3, attributeValue: 8

The problem could be with how strlen finds the length. strlen iterates through all characters of an array of characters until it finds a terminal \0 character. If the array does not have this terminal character, the implementation could proceed out of the memory bounds of the array of characters you gave it and cause undefined behavior.

Related

C++ string copy() gives me extra characters at the end, why is that?

I am studying c++, In a blog they introduced the concept of copy function. When I tried the same in my system, the result is not matching to what I expected. Please let me know what I did wrong here in the below code.
#include <iostream>
main(){
std::string statement = "I like to work in Google";
char compName[6];
statement.copy(compName, 6, 18);
std::cout<<compName;
}
I expected Google but actual output is Googlex
I am using windows - (MinGW.org GCC-6.3.0-1)
You are confusing a sequence of characters, C style string, and std::string. Let's break them down:
A sequence of characters is just that, one character after another in some container (in your case a C style array). To a human being several characters may look like a string, but there is nothing in your code to make it such.
C style string is an array of characters terminated by a symbol \0. It is a carry over from C, as such a compiler will assume that if even if you don't tell it otherwise the array of characters may potentially be such a string.
C++ string (std::string) is a template class that stores strings. There is no need to worry how it does so internally. Although there are functions for interoperability with the first two categories, it is a completely different thing.
Now, let's figure out how a compiler sees your code:
char compName[6];
This creates an array of characters with enough space to store 6 symbols. You can write C style strings into it as long as they are 5 symbols or less, since you will need to also write '\0' at the end. Since in C++ C style arrays are unsafe, they will allow you to write more characters into them, but you cannot predict in advance where those extra characters will be written into memory (or even if your program will continue to execute). You can also potentially read more characters from the array... But you cannot even ask the question where that data will be coming from, unless you are simply playing around with your compiler. Never do that in your code.
statement.copy(compName, 6, 18);
This line writes 6 characters. It does not make it into a C style string, it is simply 6 characters in an array.
std::cout<<compName;
You are trying to output to the console a C style string... which you have not provided to a compiler. So a an operator<< receives a char [], and it assumes that you knew what you were doing and works as if you gave it C string. It displays one character after another until it reaches '\0'. When will it get such a character? I have no idea, since you never gave it one. But due to C style arrays being unsafe, it will have no problem trying to read characters past the end of an array, reading some memory blocks and thinking that they are a continuation of your non-existent C style sting.
Here you got "lucky" and you only got a single byte that appeared as an 'x', and then you got a byte with 0 written in it, and the output stopped. If you run your program at a different time, with a different compiler, or compiled with different optimisations you might get a completely different data displayed.
So what should you have done?
You can try this:
#include <iostream>
#include <string>
int main()
{
std::string statement = "I like to work in Google";
char compName[7]{};
statement.copy(compName, 6, 18);
std::cout<<compName;
return 0;
}
What did i change? I made an array able to hold 7 characters (leaving enough space for a C style string of 6 characters) and i have provided an empty initialisation list {}, which will fill the array with \0 characters. This means that when you will replace the first 6 of them with your data, there will be a terminating character in the very end.
Another approach would be to do this:
#include <iostream>
#include <string>
int main()
{
std::string statement = "I like to work in Google";
char compName[7];
auto length = statement.copy(compName, 6, 18);
compName[length] = '\0';
std::cout<<compName;
return 0;
}
Here i do not initialise the array, but i get the length of the data that is written there with a .copy method and then add the needed terminator in the correct position.
What approach is best depends on your particular application.
When inserting pointer to a character into the stream insertion operator, the pointer is required to point to null terminated string.
compName does not contain the null terminator character. Therefore inserting inserting (a pointer to an element of) it into a character stream violates the requirement above.
Please let me know what I did wrong here
You violate the requirement above. As a consequence, the behaviour of your program is undefined.
I expected Google but actual output is Googlex
This is because the behaviour of the program is undefined.
How to terminate it?
Firstly, make sure that there is room in the array for the null terminator character:
char compName[7];
Then, assign the null terminator character:
compName[6] = '\0';

Erase unwanted characters

We have created a char array with a fixed length. Now, we write a word or a sentence inside that array. However, the length of this word-sentence is shorter than the length of the char array, so when we print the message with printf function, a number of crap characters are also printed. We would like to erase all this characters, even if the length of the message written is variable.
Thank you!
C strings are terminated by a NUL byte ('\0'). If you don't have this terminator then printf doesn't know that your string has ended. The solution is to put a \0 after your word in the array.
Note: learn to use std::string which manages this for you.
Have you considered not using a fixed array size if that is not what you want? You could just use a char* instead and assign it's size dynamically.
If you want a fixed size for some reason the only solution I can come up with is to track in a separate variable how long the word is and then only print the n first characters. There is no way to determine which chars are valid and which are not in the array as far as I know.
Other than that, if your buffer is supposed to be just a byte array and not a (null terminated) string, then you can use fwrite instead of fprintf to dump the contents.
But in general, I agree with others that it could be better to use std::string.

Please explain me the need for appending '\0' while converting string to char

While using proc/mysql for c++ I have taken string as user input and converted into char via strcpy(c,s.c_str()); function, where c is the binding variable through which I'll add value in the database table and s is the string (user input), it is working fine but my teacher is asking me append '\0' at the end - I can't understand the reason why I need to?
Your teacher is deluded.
c_str() in itself appends a zero [or rather, std::string reserves space for an extra character when creating the string, and makes sure this is zero at least at the point of c_str() returning - in C++11, it is guaranteed that there is an extra character space filled with zero at the end of the string, always].
You DO need a zero at the end of a string to mark the end of the string in a C-style string, such as those used by strcpy.
[As others have pointed out, you should also check that the string fits before copying, and I would suggest reject if it won't fit, as truncating it will lead to other problems - as well as checking that there isn't any sql-injection attacks and a multitude of other things required for "good pracice in an SQL environment"]
While the teacher is deluded on the appending '\0' to the string, your code exhibits another very bad bug.
You should never use strcpy in such a fashion. You should always use some routine which controls the nubmer of characters copied, like strncpy(), or other alternatives, and provide it with the size of receiving variable. Otherwise you are just asking for troubles.
Just guessing, it's a protection against buffer overflow. If c is only N bytes long and s.c_str() returns a pointer to a N+k length string, you'd write k bytes after c, which is bad.
Now let's say (if you didn't SEGFAULT already) you pass this c NUL-terminated string to a C function, you have no guarantee that the \0 you wrote after c is still there. This C function will then read an undefined amount of bytes after c, which is badder worse.
Anyway, use ::strncpy():
char c[64];
::strncpy(c, s.c_str(), sizeof(c));
c[sizeof(c)-1] = '\0';

how to make a not null-terminated c string?

i am wondering :char *cs = .....;what will happen to strlen() and printf("%s",cs) if cs point to memory block which is huge but with no '\0' in it?
i write these lines:
char s2[3] = {'a','a','a'};
printf("str is %s,length is %d",s2,strlen(s2));
i get the result :"aaa","3",but i think this result is because that a '\0'(or a 0 byte) happens to reside in the location s2+3.
how to make a not null-terminated c string? strlen and other c string function relies heavily on the '\0' byte,what if there is no '\0',i just want know this rule deeper and better.
ps: my curiosity is aroused by studying the follw post on SO.
How to convert a const char * to std::string
and these word in that post :
"This is actually trickier than it looks, because you can't call strlen unless the string is actually nul terminated."
If it's not null-terminated, then it's not a C string, and you can't use functions like strlen - they will march off the end of the array, causing undefined behaviour. You'll need to keep track of the length some other way.
You can still print a non-terminated character array with printf, as long as you give the length:
printf("str is %.3s",s2);
printf("str is %.*s",s2_length,s2);
or, if you have access to the array itself, not a pointer:
printf("str is %.*s", (int)(sizeof s2), s2);
You've also tagged the question C++: in that language, you usually want to avoid all this error-prone malarkey and use std::string instead.
A "C string" is, by definition, null-terminated. The name comes from the C convention of having null-terminated strings. If you want something else, it's not a C string.
So if you have a string that is not null-terminated, you cannot use the C string manipulation routines on it. You can't use strlen, strcpy or strcat. Basically, any function that takes a char* but no separate length is not usable.
Then what can you do? If you have a string that is not null-terminated, you will have the length separately. (If you don't, you're screwed. You need some way to find the length, either by a terminator or by storing it separately.) What you can do is allocate a buffer of the appropriate size, copy the string over, and append a null. Or you can write your own set of string manipulation functions that work with pointer and length. In C++ you can use std::string's constructor that takes a char* and a length; that one doesn't need the terminator.
Your supposition is correct: your strlen is returning the correct value out of sheer luck, because there happens to be a zero on the stack right after your improperly terminated string. It probably helps that the string is 3 bytes, and the compiler is likely aligning stuff on the stack to 4-byte boundaries.
You cannot depend on this. C strings need NUL characters (zeroes) at the end to work correctly. C string handling is messy, and error-prone; there are libraries and APIs that help make it less so… but it's still easy to screw up. :)
In this particular case, your string could be initialized as one of these:
A: char s2[4] = { 'a','a','a', 0 }; // good if string MUST be 3 chars long
B: char *s2 = "aaa"; // if you don't need to modify the string after creation
C: char s2[]="aaa"; // if you DO need to modify the string afterwards
Also note that declarations B and C are 'safer' in the sense that if someone comes along later and changes the string declaration in a way that alters the length, B and C are still correct automatically, whereas A depends on the programmer remembering to change the array size and keeping the explicit null terminator at the end.
What happens is that strlen keeps going, reading memory values until it eventually gets to a null. it then assumes that is the terminator and returns the length that could be massively large. If you're using strlen in an environment that expects C-strings to be used, you could then copy this huge buffer of data into another one that is just not big enough - causing buffer overrun problems, or at best, you could copy a large amount of garbage data into your buffer.
Copying a non-null terminated C string into a std:string will do this. If you then decide that you know this string is only 3 characters long and discard the rest, you will still have a massively long std:string that contains the first 3 good characters and then a load of wastage. That's inefficient.
The moral is, if you're using the CRT functions to operator on C strings, they must be null-terminated. Its no different to any other API, you must follow the rules that API sets down for correct usage.
Of course, there is no reason you cannot use the CRT functions if you always use the specific-length versions (eg strncpy) but you will have to limit yourself to just those, always, and manually keep track of the correct lengths.
Convention states that a char array with a terminating \0 is a null terminated string. This means that all str*() functions expect to find a null-terminator at the end of the char-array. But that's it, it's convention only.
By convention also strings should contain printable characters.
If you create an array like you did char arr[3] = {'a', 'a', 'a'}; you have created a char array. Since it is not terminated by a \0 it is not called a string in C, although its contents can be printed to stdout.
The C standard does not define the term string until the section 7 - Library functions. The definition in C11 7.1.1p1 reads:
A string is a contiguous sequence of characters terminated by and including the first null character.
(emphasis mine)
If the definition of string is a sequence of characters terminated by a null character, a sequence of non-null characters not terminated by a null is not a string, period.
What you have done is undefined behavior.
You are trying to write to a memory location that is not yours.
Change it to
char s2[] = {'a','a','a','\0'};

strncpy char string issue when adding length

I'm having a problem with comparing 2 char strings that are both the same:
char string[50];
strncpy(string, "StringToCompare", 49);
if( !strcmp("StringToCompare", string) )
//do stuff
else
//the code runs into here even tho both strings are the same...this is what the problem is.
If I use:
strcpy(string, "StringToCompare");
instead of:
strncpy(string, "StringToCompare", 49);
it solves the problem, but I would rather insert the length of the string rather than it getting it itself.
What's going wrong here? How do I solve this problem?
You forgot to put a terminating NUL character to string, so maybe strcmp run over the end. Use this line of code:
string[49] = '\0';
to solve your problem.
You need to set the null terminator manually when using strncpy:
strncpy(string, "StringToCompare", 48);
string[49] = 0;
Lots of apparent guesses in the other answers, but a quick suggestion.
First of all, the code as written should work (and in fact, does work in Visual Studio 2010). The key is in the details of 'strncpy' -- it will not implicity add a null terminating character unless the source length is less than the destination length (which it is in this case). strcpy on the other hand does include the null terminator in all cases, suggesting that your compiler isn't properly handling the strncpy function.
So, if this isn't working on your compiler, you should likely initialize your temporary buffer like this:
char string[50] = {0}; // initializes all the characters to 0
// below should be 50, as that is the number of
// characters available in the string (not 49).
strncpy(string, "StringToCompare", 50);
However, I suspect this is likely just an example, and in the real world your source string is 49 (again, you should pass 50 to strncpy in this case) characters or longer, in which case the NULL terminator is NOT being copied into your temporary string.
I would echo the suggestions in the comments to use std::string if available. It takes care of all of this for you, so you can focus on your implementation rather than these trite details.
The byte count parameter in strncpy tells the function how many bytes to copy, not the length of the character buffer.
So in your case you are asking to copy 49 bytes from your constant string into the buffer, which I don't think is your intent!
However, it doesn't explain why you are getting the anomalous result. What compiler are you using? When I run this code under VS2005 I get the correct behavior.
Note that strncpy() has been deprecated in favor of strncpy_s, which does want the buffer length passed to it:
strncpy_s (string,sizeof(string),"StringToCompare",49)
strcopy and strncpy: in this situation they behave identically!!
So you didn't tell us the truth or the whole picture (eg: the string is at least 49 characters long)