strncpy char string issue when adding length - c++

I'm having a problem with comparing 2 char strings that are both the same:
char string[50];
strncpy(string, "StringToCompare", 49);
if( !strcmp("StringToCompare", string) )
//do stuff
else
//the code runs into here even though both strings are the same...this is what the problem is.
If I use:
strcpy(string, "StringToCompare");
instead of:
strncpy(string, "StringToCompare", 49);
it solves the problem, but I would rather pass the length explicitly than have the function work it out itself.
What's going wrong here? How do I solve this problem?

You forgot to put a terminating NUL character at the end of string, so strcmp may run past the end. Use this line of code:
string[49] = '\0';
to solve your problem.

You need to set the null terminator manually when using strncpy:
strncpy(string, "StringToCompare", 49); // copy at most 49 bytes, leaving room for the terminator
string[49] = 0;
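The pattern above can be wrapped in a small helper. This is only a sketch: the name copy_truncated is made up for illustration and is not a standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy src into dst (capacity dst_size bytes),
// truncating if needed, and always leave dst null-terminated.
void copy_truncated(char* dst, std::size_t dst_size, const char* src)
{
    assert(dst != nullptr && src != nullptr && dst_size > 0);
    std::strncpy(dst, src, dst_size - 1); // writes at most dst_size - 1 bytes
    dst[dst_size - 1] = '\0';             // strncpy does not terminate on truncation
}
```

With char string[50]; the call copy_truncated(string, sizeof string, "StringToCompare"); leaves a properly terminated string, so strcmp against the literal returns 0.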

Lots of apparent guesses in the other answers, but a quick suggestion.
First of all, the code as written should work (and in fact, does work in Visual Studio 2010). The key is in the details of strncpy: it does not add a null terminator unless the source is shorter than the count you pass (which it is in this case), in which case it pads the rest with zeros. strcpy, on the other hand, copies the null terminator in all cases, which suggests that your compiler isn't handling strncpy properly.
So, if this isn't working on your compiler, you should likely initialize your temporary buffer like this:
char string[50] = {0}; // initializes all the characters to 0
// below should be 50, as that is the number of
// characters available in the string (not 49).
strncpy(string, "StringToCompare", 50);
However, I suspect this is just an example, and in the real world your source string is 49 characters or longer (again, you should pass 50 to strncpy in this case), in which case the null terminator is NOT being copied into your temporary string.
I would echo the suggestions in the comments to use std::string if available. It takes care of all of this for you, so you can focus on your implementation rather than these trite details.

The byte count parameter in strncpy tells the function how many bytes to copy, not the length of the character buffer.
So in your case you are asking to copy 49 bytes from your constant string into the buffer, which I don't think is your intent!
However, it doesn't explain why you are getting the anomalous result. What compiler are you using? When I run this code under VS2005 I get the correct behavior.
Note that Microsoft's compilers flag strncpy() as deprecated in favor of strncpy_s, which does want the buffer length passed to it:
strncpy_s(string, sizeof(string), "StringToCompare", 49);

strcpy and strncpy: in this situation they behave identically!!
So you didn't show us the whole picture (e.g. the real string is at least 49 characters long).
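The claim that the posted code should work can be checked with a short sketch. The function names here are invented for illustration; the only library behavior relied on is that strncpy pads with zeros when the source is shorter than the count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true when every byte of buf in [from, size) is zero.
bool tail_is_zero(const char* buf, std::size_t from, std::size_t size)
{
    assert(buf != nullptr && from <= size);
    for (std::size_t i = from; i < size; ++i) {
        if (buf[i] != '\0') return false;
    }
    return true;
}

// Fills a 50-byte buffer with junk, then copies a short literal with a
// count of 49. Returns true if the result compares equal to the literal
// and the rest of the 49 bytes was zero-padded by strncpy.
bool short_source_is_terminated()
{
    char string[50];
    std::memset(string, 'x', sizeof string);   // simulate uninitialized contents
    std::strncpy(string, "StringToCompare", 49);
    const std::size_t len = std::strlen("StringToCompare");
    return std::strcmp(string, "StringToCompare") == 0
        && tail_is_zero(string, len, 49);
}
```

If short_source_is_terminated() returns true on your compiler, the failure in the real program most likely comes from a source string of 49 characters or more.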

Related

Does strlen have any side effect?

I know, or rather I suppose, that strlen() is a fairly simple function, but for some reason, when I use it in one of my functions, it corrupts the program. I have no idea why it's happening, because the results from strlen are correct, but the program as a whole doesn't work if strlen is included. To be specific, this part of the code is causing trouble:
int random1 = strlen(objectNameValue);
int random2 = strlen(PA_type_value);
int random3 = strlen(attributeValue);
printf("length of objectNameValue: %d, PA_type_value: %d, attributeValue: %d", random1, random2, random3);
My program is actually a simple LDAP server and I'm trying to send some data, including those variables (attributeValue, for example). When I comment out this part of the code, it works perfectly. But if I leave it in, the data sent are not correct, and I really have no idea how strlen can affect them. The funny thing is that strlen is actually printing correct results.
To be a bit more specific, the three variables passed to strlen are all of type char[] and all have \0 at the end. Can anyone help?
So, after a request in the comments, here is a closer look:
Basically, in a very minimal way, this is what I'm doing:
char message_out[1024];
bzero(message_out, 1024);
char objectNameValue[1024];
strcpy(objectNameValue,"cn=");
char PA_type_value[5] = {'u', 'i', 'd'};
strcat(message_out, objectNameValue);
strcat(message_out, PA_type_value);
But sometimes I'm not putting fixed characters into the string but another string, for example, and I need to find out how long that string is, because I need to compute the total length of message_out. I always make sure the buffers are big enough and initialized with zeroes.
Output is this:
Length of objectNameValue: 12, PA_type_value: 3, attributeValue: 8
The problem could be with how strlen finds the length. strlen iterates through all characters of an array of characters until it finds a terminal \0 character. If the array does not have this terminal character, the implementation could proceed out of the memory bounds of the array of characters you gave it and cause undefined behavior.
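One way to test whether an array really contains a terminator, without risking a read past its end, is to search only within the array's size. A minimal sketch, assuming you know the capacity of each buffer; bounded_length is a made-up name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: length of the string in buf, looking at no more than
// cap bytes. Returns cap when no '\0' is found, which signals "not terminated".
std::size_t bounded_length(const char* buf, std::size_t cap)
{
    assert(buf != nullptr);
    const void* end = std::memchr(buf, '\0', cap);
    if (end == nullptr) return cap;
    return static_cast<std::size_t>(static_cast<const char*>(end) - buf);
}
```

If bounded_length(x, sizeof x) == sizeof x for one of the three arrays, that array has no terminator and a plain strlen on it would be undefined behavior.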

Please explain me the need for appending '\0' while converting string to char

While using proc/mysql for C++, I take a string as user input and convert it to a char array with strcpy(c, s.c_str());, where c is the bind variable through which I add the value to the database table and s is the std::string holding the user input. It works fine, but my teacher is asking me to append '\0' at the end. I can't understand why I would need to.
Your teacher is deluded.
c_str() in itself appends a zero [or rather, std::string reserves space for an extra character when creating the string, and makes sure this is zero at least at the point of c_str() returning - in C++11, it is guaranteed that there is an extra character space filled with zero at the end of the string, always].
You DO need a zero at the end of a string to mark the end of the string in a C-style string, such as those used by strcpy.
[As others have pointed out, you should also check that the string fits before copying, and I would suggest rejecting it if it won't fit, as truncating it will lead to other problems. You should also check for SQL injection attacks and the multitude of other things required for "good practice in an SQL environment".]
While the teacher is deluded on the appending '\0' to the string, your code exhibits another very bad bug.
You should never use strcpy in such a fashion. You should always use some routine which controls the number of characters copied, like strncpy() or other alternatives, and provide it with the size of the receiving variable. Otherwise you are just asking for trouble.
Just guessing: it's a protection against buffer overflow. If c is only N bytes long and s.c_str() returns a pointer to a string of length N+k, you'd write k bytes past the end of c, which is bad.
Now let's say (if you didn't SEGFAULT already) you pass this c to a C function as a NUL-terminated string: you have no guarantee that the \0 you wrote after c is still there. The C function will then read an undefined number of bytes after c, which is worse.
Anyway, use ::strncpy():
char c[64];
::strncpy(c, s.c_str(), sizeof(c));
c[sizeof(c)-1] = '\0';
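If you would rather reject an over-long input than truncate it, as suggested earlier in this thread, the check can be done on the std::string before anything is copied. A sketch only; copy_if_fits is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy s into dst only when it fits, terminator included.
// Returns false and leaves dst as an empty string otherwise.
bool copy_if_fits(char* dst, std::size_t dst_size, const std::string& s)
{
    assert(dst != nullptr && dst_size > 0);
    if (s.size() >= dst_size) {
        dst[0] = '\0';
        return false;
    }
    std::memcpy(dst, s.c_str(), s.size() + 1); // + 1 copies the '\0' that c_str() provides
    return true;
}
```

Note that no extra '\0' has to be appended by hand: the terminator comes along with the copy.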

how to make a not null-terminated c string?

I am wondering: given char *cs = .....; what will happen to strlen() and printf("%s",cs) if cs points to a memory block which is huge but has no '\0' in it?
I wrote these lines:
char s2[3] = {'a','a','a'};
printf("str is %s,length is %zu",s2,strlen(s2));
I get the result "aaa" and "3", but I think that is only because a '\0' (a 0 byte) happens to reside at location s2+3.
How do I make a non-null-terminated C string? strlen and the other C string functions rely heavily on the '\0' byte; what if there is none? I just want to understand this rule more deeply.
PS: my curiosity was aroused by studying the following post on SO.
How to convert a const char * to std::string
and these word in that post :
"This is actually trickier than it looks, because you can't call strlen unless the string is actually nul terminated."
If it's not null-terminated, then it's not a C string, and you can't use functions like strlen - they will march off the end of the array, causing undefined behaviour. You'll need to keep track of the length some other way.
You can still print a non-terminated character array with printf, as long as you give the length:
printf("str is %.3s",s2);
printf("str is %.*s",s2_length,s2);
or, if you have access to the array itself, not a pointer:
printf("str is %.*s", (int)(sizeof s2), s2);
You've also tagged the question C++: in that language, you usually want to avoid all this error-prone malarkey and use std::string instead.
A "C string" is, by definition, null-terminated. The name comes from the C convention of having null-terminated strings. If you want something else, it's not a C string.
So if you have a string that is not null-terminated, you cannot use the C string manipulation routines on it. You can't use strlen, strcpy or strcat. Basically, any function that takes a char* but no separate length is not usable.
Then what can you do? If you have a string that is not null-terminated, you will have the length separately. (If you don't, you're screwed. You need some way to find the length, either by a terminator or by storing it separately.) What you can do is allocate a buffer of the appropriate size, copy the string over, and append a null. Or you can write your own set of string manipulation functions that work with pointer and length. In C++ you can use std::string's constructor that takes a char* and a length; that one doesn't need the terminator.
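The std::string option mentioned above looks roughly like this. The wrapper name from_unterminated is invented for illustration; the constructor taking a pointer and a count is standard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a std::string from a character array that has no terminator.
// The (pointer, count) constructor copies exactly count chars and never
// looks for a '\0'.
std::string from_unterminated(const char* data, std::size_t count)
{
    assert(data != nullptr);
    return std::string(data, count);
}
```

For the array in the question, from_unterminated(s2, sizeof s2) gives a std::string of size 3 without ever reading past the array.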
Your supposition is correct: your strlen is returning the correct value out of sheer luck, because there happens to be a zero on the stack right after your improperly terminated string. It probably helps that the string is 3 bytes, and the compiler is likely aligning stuff on the stack to 4-byte boundaries.
You cannot depend on this. C strings need NUL characters (zeroes) at the end to work correctly. C string handling is messy, and error-prone; there are libraries and APIs that help make it less so… but it's still easy to screw up. :)
In this particular case, your string could be initialized as one of these:
A: char s2[4] = { 'a','a','a', 0 }; // good if string MUST be 3 chars long
B: const char *s2 = "aaa"; // if you don't need to modify the string after creation
C: char s2[]="aaa"; // if you DO need to modify the string afterwards
Also note that declarations B and C are 'safer' in the sense that if someone comes along later and changes the string declaration in a way that alters the length, B and C are still correct automatically, whereas A depends on the programmer remembering to change the array size and keeping the explicit null terminator at the end.
What happens is that strlen keeps going, reading memory values until it eventually gets to a null. It then assumes that is the terminator and returns a length that could be massively large. If you're using strlen in an environment that expects C strings to be used, you could then copy this huge buffer of data into another one that is just not big enough, causing buffer overrun problems, or at best you could copy a large amount of garbage data into your buffer.
Copying a non-null-terminated C string into a std::string will do this. If you then decide that you know this string is only 3 characters long and discard the rest, you will still have a massively long std::string that contains the first 3 good characters and then a load of wastage. That's inefficient.
The moral is, if you're using the CRT functions to operate on C strings, they must be null-terminated. It's no different from any other API: you must follow the rules that the API sets down for correct usage.
Of course, there is no reason you cannot use the CRT functions if you always use the specific-length versions (eg strncpy) but you will have to limit yourself to just those, always, and manually keep track of the correct lengths.
Convention states that a char array with a terminating \0 is a null terminated string. This means that all str*() functions expect to find a null-terminator at the end of the char-array. But that's it, it's convention only.
By convention also strings should contain printable characters.
If you create an array like you did char arr[3] = {'a', 'a', 'a'}; you have created a char array. Since it is not terminated by a \0 it is not called a string in C, although its contents can be printed to stdout.
The C standard does not define the term string until the section 7 - Library functions. The definition in C11 7.1.1p1 reads:
A string is a contiguous sequence of characters terminated by and including the first null character.
(emphasis mine)
If the definition of string is a sequence of characters terminated by a null character, a sequence of non-null characters not terminated by a null is not a string, period.
What you have done is undefined behavior.
You are trying to read a memory location that is not yours.
Change it to
char s2[] = {'a','a','a','\0'};

single character c-style string full of junk

It's a shame I can't figure out such a basic thing about C++, but C-style strings are not acting as I would expect. For example, I create one like this:
char* cstr = new char[1];
It's initialized to: Íýýýýý««««««««îţ . As usual, I can set just the first char because the others don't really exist (or so I thought). While working with C-style strings, all this junk is ignored and everything works fine.
Now I mixed std::string with those C-style ones and what I get is a mess. With this code:
std::string str = "aaa";
str += cstr;
I end up with: aaaÍýýýýý««««««««îţ , but now those characters actually exist, as string.size() returns a length that includes this junk.
I can't find out why this is happening, but it must be connected with how the string is created, because something like char* cstr = "aaa" results in aaa without any additional junk, but trying to change a string initialized this way results in a memory access violation. Could someone explain this behavior to me, please? Thanks!
PS: My JavaScript Failed to load so if someone could format this post properly, I'd be glad!
Answer: Oh god! How could I forget on that... thanks to all for, well, immediate answer. Best one was from minitech so I'll mark this as answer as soon as my java script loads up :/
All C-style strings are null-terminated. So, a string initialized using new char[1] leaves you space for no characters. You can't set the first character to anything but \0, otherwise normal string operations will keep reading into memory until they find a zero. So use new char[2] instead.
When working with C-style strings you need to have a null terminator:
char* cstr = new char[2];
cstr[0] = 'X';
cstr[1] = '\0';
Having said all that, it is really bad code to do the above. Just use std::string unless you have a very good reason not to. It takes care of the memory allocations and deallocations for you.
C-style strings require a NUL ('\0') terminator; they don't have a length associated with them like C++ strings do. So your single-character string must be new char[2]; it will not be initialized; and you will need to make sure it's terminated with \0.
When you use new char[1], you request space for an array of characters. There is no request that said characters are initialized. Thus, the "junk" that you see is uninitialized memory. Before treating the array as a C-style string, you should do this:
cstr[0] = '\0';
C-style strings are null-terminated. So, to ignore any junk in memory, you need to place a null byte ('\0') in the string body. Otherwise, the library functions will look at all bytes from the start of your string until they meet a null byte in memory (which will be at some random position).
This also means that to have a C-style string of one character you actually need to allocate 2 bytes: one for the meaningful character and a second for '\0'.
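Putting the answers together, a one-character C-style string that can safely be appended to a std::string might be sketched like this. The function names are hypothetical, and std::unique_ptr is used here only so the allocation is released automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Allocates a C-style string holding exactly one character.
// Two bytes are needed: one for ch, one for the terminator.
std::unique_ptr<char[]> make_single_char_cstr(char ch)
{
    std::unique_ptr<char[]> cstr(new char[2]);
    cstr[0] = ch;
    cstr[1] = '\0';
    return cstr;
}

// Appends the one-character C string to a std::string, as in the question.
std::string append_single(std::string str, char ch)
{
    const std::unique_ptr<char[]> cstr = make_single_char_cstr(ch);
    const std::size_t before = str.size();
    str += cstr.get();
    assert(str.size() <= before + 1); // no junk past the terminator was appended
    return str;
}
```

In real code, str += ch; or str.push_back(ch); does the same job with no manual allocation at all.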

Why does MSVC++ consider "std::strcat" to be "unsafe"? (C++)

When I try to do things like this:
char* prefix = "Sector_Data\\sector";
char* s_num = "0";
std::strcat(prefix, s_num);
std::strcat(prefix, "\\");
and so on and so forth, I get a warning
warning C4996: 'strcat': This function or variable may be unsafe. Consider using strcat_s instead.
Why is strcat considered unsafe, and is there a way to get rid of this warning without using strcat_s?
Also, if the only way to get rid of the warning is to use strcat_s, how does it work (syntax-wise: apparently it does not take two arguments).
If you are using C++, why not avoid the whole mess and use std::string? The same example without any errors would look like this:
std::string prefix = "Sector_Data\\sector";
prefix += "0";
prefix += "\\";
No need to worry about buffer sizes and all that stuff. And if you have an API which takes a const char *, you can just use the .c_str() member:
some_c_api(prefix.c_str());
Because the buffer, prefix, could have less space than you are copying into it, causing a buffer overrun.
Therefore, a hacker could pass in a specially crafted string which overwrites the return address or other critical memory and start executing code in the context of your program.
strcat_s solves this by forcing you to pass in the size of the buffer into which you are copying the string. If the result would not fit, it does not overrun the buffer: it reports an error instead (in Microsoft's CRT, by invoking the invalid parameter handler).
google strcat_s to see precisely how to use it.
You can get rid of these warning by adding:
_CRT_SECURE_NO_WARNINGS
and
_SCL_SECURE_NO_WARNINGS
to your project's preprocessor definitions.
That's one of the string-manipulation functions in C/C++ that can lead to buffer overrun errors.
The problem is that the function doesn't know what the size of the buffers are. From the MSDN documentation:
The first argument, strDestination,
must be large enough to hold the
current strDestination and strSource
combined and a closing '\0';
otherwise, a buffer overrun can occur.
strcat_s takes an extra argument telling it the size of the buffer. This allows it to validate the sizes before doing the concat, and will prevent overruns. See http://msdn.microsoft.com/en-us/library/d45bbxx4.aspx
Because it has no means of checking whether the destination string (prefix in your case) will be written past its bounds. strcat essentially works by looping, copying the source string byte by byte into the destination. It stops when it sees a value of 0 (written '\0'), called the null terminator. Since C has no built-in bounds checking, and the destination is just a place in memory, strcat will keep going ad infinitum, even past the end of the destination buffer, or past the end of the source if the source has no null terminator.
The solutions above are platform-specific to your windows environment. If you want something platform independent, you have to wrangle with strncat:
char* strncat(char* dest, const char* src, size_t count);
This is another option when used intelligently. You can use count to specify the maximum number of characters to copy. To do this, you have to figure out how much space is available in dest (how much you allocated, minus strlen(dest), minus 1 for the terminator that strncat always appends) and pass that as count.
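The count calculation for strncat is easy to get wrong by one, so here is a sketch of it wrapped in a helper. append_truncated is a made-up name, and the helper assumes dest already holds a terminated string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to the string already in dest, where dest
// is an array of dest_size bytes. Truncates src if it does not fit.
void append_truncated(char* dest, std::size_t dest_size, const char* src)
{
    assert(dest != nullptr && src != nullptr);
    const std::size_t used = std::strlen(dest);    // precondition: dest is terminated
    assert(used < dest_size);
    const std::size_t room = dest_size - used - 1; // keep one byte for '\0'
    std::strncat(dest, src, room);                 // strncat always writes the '\0'
}
```

For the question's example this needs a real, writable buffer, e.g. char prefix[64] = "Sector_Data\\sector"; followed by append_truncated(prefix, sizeof prefix, "0");. A char* pointing at a string literal cannot be appended to at all.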
To turn the warning off, you can do this.
#pragma warning(disable:4996)
btw, I strongly recommend that you use strcat_s().
There are two problems with strcat. First, you have to do all your validation outside the function, doing work that is almost the same as the function:
if (strlen(pDest) + strlen(pSrc) < destSize)
You have to walk down the entire length of both strings just to make sure it will fit, before walking down their entire length AGAIN to do the copy. Because of this, many programmers will simply assume that it will fit and skip the test. Even worse, it may be that when the code is first written it is GUARANTEED to fit, but when someone adds another strcat, or changes a buffer size or constant somewhere else in the program, you now have issues.
The other problem is if pSrc and pDest overlap. Depending on your compiler, strcat may very well be a simple loop that checks one character at a time for a 0 in pSrc. If writing to pDest overwrites that 0, you will get into a loop that runs until your program crashes.
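The double walk described in this answer can be avoided by measuring each string once and reusing the lengths for the copy. A sketch under the assumption that the two buffers do not overlap; append_if_fits is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: append src to dest only when the whole result fits in
// destSize bytes. Returns false and leaves dest untouched otherwise.
// dest and src must not overlap.
bool append_if_fits(char* dest, std::size_t destSize, const char* src)
{
    assert(dest != nullptr && src != nullptr);
    const std::size_t destLen = std::strlen(dest);
    const std::size_t srcLen = std::strlen(src);
    if (destLen + srcLen >= destSize) return false; // no room for text plus '\0'
    std::memcpy(dest + destLen, src, srcLen + 1);   // reuse the lengths; copies '\0' too
    return true;
}
```

Because the size check lives inside the helper, a later change to a buffer size or constant elsewhere in the program turns into a false return value rather than an overrun.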