Why isn't strlen working for me? - c++

char p[4]={'h','g','y'};
cout<<strlen(p);
This code prints 3.
char p[3]={'h','g','y'};
cout<<strlen(p);
This prints 8.
char p[]={'h','g','y'};
cout<<strlen(p);
This again prints 8.
Please help me as I can't figure out why three different values are printed by changing the size of the array.

strlen starts at the given pointer and advances until it reaches the character '\0'. If your array has no '\0', strlen reads past the end of the array, which is undefined behavior; the result could be any number, depending on where a '\0' happens to be found.
Another way to reach the number you're looking for (in the case you've shown) is by using: int length = sizeof(p)/sizeof(*p);, which will give you the length of the array. However, that is not strictly the string length as defined by strlen.
As @John Dibling mentions, the reason that strlen gives the correct result on your first example is that you've allocated space for 4 characters, but only used 3; the remaining character is automatically initialized to 0, which is exactly the '\0' character that strlen looks for.
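To make the scanning behaviour concrete, here is a minimal sketch of what strlen conceptually does, next to the sizeof-based element count. The names count_until_nul and element_count_demo are made up for illustration; the real strlen is usually far more optimized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical re-implementation of strlen, for illustration only:
// walk forward from s until a '\0' byte is found.
std::size_t count_until_nul(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// Uses the 4-element array from the question's first example.
std::size_t element_count_demo() {
    char p[4] = {'h', 'g', 'y'};           // p[3] is zero-initialized
    assert(count_until_nul(p) == std::strlen(p));
    return sizeof(p) / sizeof(*p);         // array size, not string length
}
```

For that array the string length is 3 while the element count is 4, which is why the two numbers should not be confused.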

Only your first example has a null terminated array of characters - the other two examples have no null termination, so you can't use strlen() on them in a well-defined manner.
char p[4]={'h','g','y'}; // p[3] is implicitly initialized to '\0'
char p[3]={'h','g','y'}; // no room in p[] for a '\0' terminator
char p[]={'h','g','y'}; // p[] implicitly sized to 3 - also no room for '\0'
Note that in the last case, if you used a string literal to initialize the array, you would get a null terminator:
char p[]= "hgy"; // p[] has 4 elements, last one is '\0'
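If you want to check these sizes yourself without calling strlen on an unterminated array, something like the following sketch should work. The variable names are mine, not from the question, and the checks happen at compile time.

```cpp
#include <cstddef>

// The four declarations discussed above, with hypothetical names.
constexpr char p_fixed4[4] = {'h', 'g', 'y'};  // one spare element, set to '\0'
constexpr char p_fixed3[3] = {'h', 'g', 'y'};  // exactly full, no terminator
constexpr char p_auto[]    = {'h', 'g', 'y'};  // size deduced as 3, no terminator
constexpr char p_literal[] = "hgy";            // size deduced as 4, ends in '\0'

static_assert(sizeof(p_fixed4) == 4 && p_fixed4[3] == '\0', "implicit terminator");
static_assert(sizeof(p_fixed3) == 3, "no room for a terminator");
static_assert(sizeof(p_auto) == 3, "no room for a terminator");
static_assert(sizeof(p_literal) == 4 && p_literal[3] == '\0', "literal adds terminator");
```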

That will get you a random number. strlen requires that strings be terminated with a '\0' to work.

try this:
char p[4]={'h','g','y', '\0'};

strlen is a standard library function that works with strings (in C sense of the term). String is defined as an array of char values that ends with a \0 value. If you supply something that is not a string to strlen, the behavior is undefined: the code might crash, the code might produce meaningless results etc.
In your examples only the first one supplies strlen with a string, which is why it works as expected. In the second and the third case, what you supply is not a string (not terminated with \0), which is why the results expectedly make no sense.

Terminate your char buffer with '\0'.
char p[4]={'h','g','y', '\0'};

This is because strlen() expects to find a null-terminator for the string. In this case, you don't have it, so strlen() keeps counting until it finds a \0 or gives a memory access violation and your program dies. RIP!
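If you cannot be sure a buffer is terminated, one option is to search for the terminator only within the known size of the array. The sketch below uses std::memchr for that; bounded_length is a hypothetical helper name, not a standard function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the string length if a '\0' exists within
// the first `capacity` bytes of buf, or `capacity` if no terminator is found.
std::size_t bounded_length(const char* buf, std::size_t capacity) {
    const void* nul = std::memchr(buf, '\0', capacity);
    if (nul == nullptr) {
        return capacity;  // no terminator inside the buffer
    }
    return static_cast<std::size_t>(static_cast<const char*>(nul) - buf);
}
```

Called as bounded_length(p, sizeof(p)) on the arrays from the question, it never reads outside the array; a result equal to the capacity means no terminator was found.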

Related

How is char array stored in C++?

int main()
{
char c1[5]="abcde";
char c2[5]={'a','b','c','d','e'};
char *s1 = c1;
char *s2 = c2;
printf("%s",s1);
printf("%s",s2);
return 0;
}
In this code snippet, the char array C2 doesn't return any error but the char array C1 returns string too long. I know that C1 must require a size of 6 to store 5 characters as it stores the \0 (NULL char) in the last index. But I'm confused why C2 works just fine then?
Also, when C2 is printed using %s, the output is abcde# where # is a gibberish character. %s with printf prints all the characters starting from the given address till \0 is encountered. I don't understand why is it printing that extra character at the end?
You've created two unterminated strings. Make your arrays big enough to hold the null terminator and you'll avoid this undefined behaviour:
char c1[6] = "abcde";
char c2[6] = {'a','b','c','d','e','\0'};
Strictly speaking, the latter doesn't actually require the '\0'. This declaration is equivalent and will include the null terminator:
char c2[6] = {'a','b','c','d','e'};
I personally prefer the first form, but with the added convenience of being able to leave out the explicit length:
char c1[] = "abcde";
I know that C1 must require a size of 6 to store 5 characters as it stores the \0 (NULL char) in the last index. But I'm confused why C2 works just fine then?
The compiler does not complain about the initialization of c2 because initializing with {'a','b','c','d','e'} does not implicitly include a terminating null character.
In contrast, initializing with "abcde" does include a null character: The C standard defines a string literal to include a terminating null character, so char c1[5]="abcde"; nominally initializes a 5-element array with 6 values. The C standard does not require a warning or error in this case because C 2018 6.7.9 14 indicates that the null character may be neglected if the array does not have room for it. However, the compiler you are using (see footnote 1) has chosen to issue a warning message because this form of initialization often indicates an error: The programmer attempted to initialize an array with a string, but there is not room for the full string.
In C, arrays of characters and strings are different things: An array is a sequence of values, and an array of characters can contain any arbitrary values of those characters, including no zero value at the end and possible zero values in the middle. For example, if we have a buffer of bytes from a binary file, the bytes are just integer values to us; their meaning as characters that might be printed is irrelevant. A string is a sequence of characters that is terminated by a null character. It cannot have internal zero values because the first null character marks the end.
So, when you define an array of characters such as char c1[5], the compiler does not automatically know whether you intend to use it to hold strings or you intended to use it as an array of arbitrary values. When you initialize the array with a string, your compiler is essentially figuring you intend to use the array to hold strings, and it warns you if the string you use to initialize the array does not fit. When you initialize the array with a list of values, your compiler essentially figures you may be using it to hold arbitrary values, and it does not warn you that there could be a missing terminator.
Also, when C2 is printed using %s, the output is abcde# where # is a gibberish character.
Because c2 does not have a terminating character, attempting to print it runs off the end of the array, resulting in behavior not defined by the C standard. Commonly, printf continues reading memory beyond the array, printing whatever happens to be there until it reaches a null character.
Footnote
1 This assumes you are indeed using a C compiler to compile this source code. C++ has different rules and does not permit an array being initialized with a string literal to be too short to include the terminating null character.
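If you really do need to print a char array that has no terminator, you can give %s a precision so that printf-family functions read at most that many bytes. Here is a small sketch using snprintf so the result can be inspected; format_bounded is a hypothetical name, and the 64-byte scratch buffer is an arbitrary choice.

```cpp
#include <cstdio>
#include <string>

// Formats `length` bytes of an unterminated char array by passing the
// length as the precision of %s, so no '\0' is needed in `data`.
// Output longer than 63 characters would be truncated by the scratch buffer.
std::string format_bounded(const char* data, int length) {
    char out[64];
    std::snprintf(out, sizeof(out), "%.*s", length, data);
    return std::string(out);
}
```

The same "%.*s" format works directly with printf, for example printf("%.*s", 5, c2) for the 5-element c2 from the question.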

sprintf() is adding an extra variable

Why is this happening?:
char buf[256];
char date[8];
sprintf(date, "%d%02d%02d", Time.year(), Time.month(), Time.day());
snprintf(buf, sizeof(buf), "{\"team\":\"%s\"}", team.c_str());
Serial.println(date);
output:
20180202{"team":"IND"}
it should only be: 20180202
I don't know why {"team":"IND"} is getting added to the end of it.
Very likely you declared two arrays and they are lined up in a way that allowed for the buf to overwrite the null terminator of date and thus it's "concatenating" the two.
I can't write code to reproduce this because it's undefined behavior and thus not reliable. But I can tell you how you can avoid it,
snprintf(date, sizeof(date), "%d%02d%02d", Time.year(), Time.month(), Time.day());
snprintf(buf, sizeof(buf), "{\"team\":\"%s\"}", team.c_str());
With date still only 8 bytes, this would print an incorrect (truncated) value, but it would not write past the end of date.
Having said that, why are you using snprintf() when this appears to be C++? There are more suitable solutions for this kind of problem.
Strings in C are simply arrays with a special arrangement. If the string has n printable characters it should be stored in an array of size n + 1, so that you can add what is called a null terminator. It's a special value that indicates the end of the string.
Your second snprintf() is overwriting the null terminator of the date array, which makes the two strings appear concatenated.
You have reserved space to store exactly 8 chars:
char date[8];
To store the date properly 20180202 you need
char date[9];
because sprintf() also writes a terminating '\0' character into the buffer you pass, as required for a proper C-style string.
I'd suspect you declared your buffers like
char buffer[???];
char date[8];
Since these are most likely stored on your processor's stack, you need to read that backwards: buffer ends up immediately after date in memory. The output placed in buffer then overwrites that terminating '\0', and so appears immediately after date.
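Putting the advice from these answers together, a minimal sketch of the date formatting could look like this. The function format_date and its plain int parameters are stand-ins I made up for the question's Time.year(), Time.month() and Time.day() calls; the return value of snprintf is used to detect truncation.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the question's code: formats YYYYMMDD into a
// buffer that has room for 8 digits plus the terminating '\0'.
// Returns an empty string on an encoding error or if the text did not fit.
std::string format_date(int year, int month, int day) {
    char date[9];  // 8 characters + '\0'
    int written = std::snprintf(date, sizeof(date), "%d%02d%02d", year, month, day);
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof(date)) {
        return std::string();
    }
    return std::string(date);
}
```

With the values from the question, format_date(2018, 2, 2) yields 20180202 and nothing from a neighbouring buffer can leak into it.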

why char* passed to FUNCTION always with the len of the string

I am learning C/C++ recently, but I don't understand the difference between
int a(char* str,int len)
{
cout<<str<<len;
}
and
int a(char* str)
{
cout<<str<<strlen(str);
}
When you pass char* without a length, how would you know how many elements to process? char* means a pointer to a character. When you pass a pointer, you have no idea (and cannot find out) how much memory (if any) was allocated for the pointer.
That's why C-strings are null-terminated (they end with a '\0' character), so you can detect their length by iterating the pointer. Hence, if you want to use a pointer without giving the length of its allocated memory, you need to obey some conventions. But in general, e.g. when passing a buffer, you shouldn't expect any end-signalling character, so in this case you need to pass the length, otherwise you may end up reading/writing out of bounds.
For your particular example, you're fine with passing only a pointer provided you use your function only on C-strings, since strlen(str) uses this convention of counting until encountering a '\0'.
Buffer overflows are one of the most messy and nightmarish programming errors, and they can result in serious security issues. That's why you should try (whenever possible) to use std::string from the C++ standard library instead of C-style char* strings.
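As a rough side-by-side of the three approaches mentioned in this answer, consider the sketch below. All three function names are hypothetical and only meant to show where the length information comes from in each case.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// C-string version: relies on the '\0' convention to find the end.
std::size_t cstring_length(const char* str) {
    return std::strlen(str);
}

// Buffer version: the caller must say how many bytes are valid.
std::size_t count_byte(const char* buf, std::size_t len, char wanted) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (buf[i] == wanted) {
            ++hits;
        }
    }
    return hits;
}

// std::string version: the object carries its own length.
std::size_t string_length(const std::string& str) {
    return str.size();
}
```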
A C-string should always end with a termination character, which we call the null character. It is the character with value 0, written '\0' (not the digit character '0').
When we create a char array and initialize it with a string literal, the compiler automatically adds the '\0' at the end.
char c[] = "Hello";
This will create an array of char with six elements. Yes, six elements.
// c holds {'H', 'e', 'l', 'l', 'o', '\0'}
When you print c, the output routine reads until it finds that '\0'. What if someone replaces it?
c[5] = '!';
Then the system can't determine the end of the text. It will keep on reading memory (which does not belong to that variable, or maybe even to the program) until it hits a null char.
That is the main reason to pass the size (or the number of chars to read) to a function.
On the other hand, if you need to read some data from a stream, you can use a buffer. In that case, you should specify how many bytes to read, in that way you will not cause buffer overflows.
The answers above are to the point, so I'm going to discuss another perspective on the practice of passing the length along with a char *.
As others said, the data pointed to by a char * does not always end with \0, and strlen() only works when it does. There are use cases, for example binary encoding, where data is carried in a char buffer that is not terminated by \0. There are also use cases where only a certain number of bytes should be read or written; in that case it is necessary to check that the requested length lies within the total length of the data. So, as a general practice, the length is passed explicitly, and it can then be used in whatever way is needed.
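A small sketch of the binary-data case: the payload below is invented for illustration and contains a zero byte in the middle, so strlen and an explicit length disagree about how much data there is.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical 5-byte payload: a zero byte at index 1 and another at the end,
// so that calling strlen on it is still well-defined.
const char payload[5] = {'a', '\0', 'b', 'c', '\0'};

// strlen stops at the first zero byte and misses the rest of the data.
std::size_t length_by_strlen() {
    return std::strlen(payload);
}

// With an explicit length, every requested byte is kept, zeros included.
std::string copy_with_length(const char* data, std::size_t len) {
    return std::string(data, len);
}
```

Here strlen reports 1, while copying with the explicit length sizeof(payload) keeps all 5 bytes.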

Strlen returns unreasonable number

If I write:
char lili [3];
cout<<strlen(lili)<<endl;
then what is printed is : 11
but if I write:
char lili [3];
lili [3]='\0';
cout<<strlen(lili)<<endl;
then I get 3.
I don't understand why it returns 11 on the first part?
Isn't strlen supposed to return 3, since I allocated 3 chars for lili?
It is because strlen works with "C-style" null terminated strings. If you give it a plain pointer or uninitialised buffer as you did in your first example, it will keep marching through memory until it a) finds a \0, at which point it will return the length of that "string", or b) until it reaches a protected memory location and generates an error.
Given that you've tagged this C++, perhaps you should consider using std::array or better yet, std::string. Both provide length-returning functions (size()) and both have some additional range checking logic that will help prevent your code from wandering into uninitialised memory regions as you're doing here.
The strlen function searches for a byte set to \0. If you run it on an uninitialized array then the behavior is undefined.
You have to initialize your array first. Otherwise there is random data in it.
strlen is looking for a string termination sign and will count until it finds it.
strlen calculates the number of characters till it reaches '\0' (which denotes "end-of-string").
In C and C++, a char array decays to a char * when it is passed to a function, so strlen receives lili as a pointer to char and walks the memory it points to until it reaches a terminating '\0'. It just so happened that there was a 0 byte in memory 11 bytes past the start of your array. You could have got a much stranger result.
In fact, when you write lili[3] = '\0'
you access memory outside your array. The valid indices for 3-element array in C/C++ are 0-2.
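For comparison, here is a sketch of two well-defined ways to use the 3-element array from the question. The function names are made up; the point is that the array is initialized before strlen sees it and that the terminator stays within indices 0 to 2.

```cpp
#include <cstddef>
#include <cstring>

// Zero-initialized array: every element is '\0', so it holds an empty string.
std::size_t length_of_zeroed_array() {
    char lili[3] = {};
    return std::strlen(lili);
}

// Valid indices are 0..2, so at most 2 characters plus the terminator fit.
std::size_t length_after_filling() {
    char lili[3];
    lili[0] = 'a';
    lili[1] = 'b';
    lili[2] = '\0';  // last valid index holds the terminator
    return std::strlen(lili);
}
```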

Does the '\0' at the end of the string also get included here?

If I declare a string with 10 elements like this:
char s[10];
then does the '\0' at the end occupy the 10th place or the 11th? Basically, my question is: do we get 1 element less in the string?
And if I use the strlen() function to find this string's length, will the return value be inclusive of the null? I.e if the string is "boy", will the function give me 3 or 4?
There is no 11th place, ie, yes, that is one less element to use.
Don't put a string longer than 9 chars in there. strlen() does not include the null terminator.
Eg:
char s[]="hello";
s is an array of 6 chars, but the strlen() of s is 5.
Yes, the \0 character will occupy the 10th place, meaning the string can only contain 9 input characters. And no, strlen() does not count the null character.
If you declare an array like that, it doesn't automatically have a zero byte at the end. If you call any function that expects a string on it, the behavior is undefined; it may print garbage or crash. You need to initialize the array:
char s[] = "This is a test.";
If you do something like this:
char s[10] = "012345678";
printf("%zu\n", strlen(s));
It will obviously print 9. You can't put 10 characters in that array, since the null byte would get written out of the array bounds.
If you declare an array of 10 characters, you must keep in mind that one of them has to be used for the string terminator, so you can store a maximum of 9 characters.
strlen, instead, returns only the number of "logical" characters, i.e. it won't count the null terminator.
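The difference between capacity and length can be checked with a short sketch like this one (the function names are hypothetical):

```cpp
#include <cstddef>
#include <cstring>

// Capacity of a 10-element array holding a 9-character string.
std::size_t capacity_of_s() {
    char s[10] = "012345678";  // 9 characters + '\0' fill all 10 elements
    return sizeof(s);
}

// strlen counts the characters before the terminator, not the terminator itself.
std::size_t length_of_s() {
    char s[10] = "012345678";
    return std::strlen(s);
}

// The "boy" case from the question.
std::size_t length_of_boy() {
    return std::strlen("boy");
}
```

So the array has 10 elements, the string in it has length 9, and "boy" has length 3.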