C++ strlen() initialized char array

Quick question.
I couldn't find out why an initialized char array returns this value. I understand that the strlen() function only returns the number of characters inside an array, and not its size, but why does it return 61 if there are no characters in it?
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
const int MAX = 50;
char test[MAX];
int length = strlen(test);
cout << "The current \'character\' length of the test array is: " << length << endl;
// returns "61"
// why?
cin >> test; //input == 'nice'
length = strlen(test);
cout << "The new \'character\' length of the test array is: " << length << endl;
// returns 4 when 'nice' is entered.
// this I understand.
return 0;
}
This was driving me nuts during a project because I would be trying to use a loop to feed information into a character array but strlen() would always return an outrageous value until I initialized the array as:
char testArray[50] = "";
instead of
char testArray[50];
I got these results using Visual Studio 2015
Thanks!

I think the basic misunderstanding is that, unlike in some other languages, locally defined variables in C and C++ are not initialised with any value (not with an empty string, not with 0, not with some "undefined" marker) unless you explicitly initialise them.
Note that reading an uninitialised variable is actually "undefined behaviour"; it may lead to "funny" and non-deterministic results, may crash, or may even go unnoticed.
A very common behaviour of such programs (though clearly not guaranteed!) is that if you write
char test[50];
int length = strlen(test);
then test refers to 50 bytes of reserved memory that are filled with arbitrary values, not necessarily \0 characters. Hence, test will probably not be "empty" in the sense that its first character is a \0, as it would be with a truly empty string "". If you now call strlen(test) (which is actually UB, as said), strlen may just walk through this arbitrarily filled memory; it might find a \0 within the first 50 characters, or it might find the first \0 only well past the end of the 50 bytes.
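To make the contrast concrete, here is a minimal sketch of the well-defined case. The function names are made up for illustration; the point is only that = {} gives every element a known value before strlen looks at it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: length of a buffer whose elements were all
// set to '\0' at the point of declaration.
std::size_t length_when_initialized() {
    char test[50] = {};        // all 50 elements are '\0'
    return std::strlen(test);  // well-defined: the first byte is the terminator
}

// Hypothetical helper: length after a short string was copied in.
std::size_t length_after_copy() {
    char test[50] = {};
    std::strcpy(test, "nice"); // copies 4 characters plus the '\0'
    return std::strlen(test);
}
```

The first function returns 0 and the second returns 4, independent of whatever happened to be on the stack before.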

It's good that you have found your answer, but I think you should understand how this works.
char test[MAX];
In this line of code you have only declared an array of MAX chars. The array holds indeterminate (effectively random) values until you initialize it. The strlen function just walks through memory until it finds a 0 value. So, since the values in your array are random, the result of this function is random. Moreover, it can easily walk past the end of your array, which is UB.
char test[MAX] = "";
This code initializes the first element of the test array to 0 (the remaining elements are zeroed as well), so strlen finds the terminator immediately.
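If you have a buffer that holds real data but may lack a terminator, one option is to cap the scan at the buffer's capacity. This is only a sketch: bounded_length is a hypothetical helper, not a standard function, and it does not make reading a never-initialized array legitimate.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: length of the string in buf, never looking
// past the first cap bytes. Returns cap if no '\0' is found there.
std::size_t bounded_length(const char* buf, std::size_t cap) {
    const void* end = std::memchr(buf, '\0', cap);
    if (end == nullptr) {
        return cap;  // no terminator inside the buffer
    }
    return static_cast<std::size_t>(static_cast<const char*>(end) - buf);
}
```

Calling bounded_length(test, sizeof test) can then never report more than the size of the array.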

Related

Char array returns four times more data than expected

Before I continue, here's the code:
#include <iostream>
using namespace std;
int main() {
char array[] = {'a','b','c'};
cout << array << endl;
return 0;
}
My system:
VisualStudio 2019, default C++ settings
Using Debug build instead of release
When I run this code sample, I get something like this in my console output:
abcXXXXXXXXX
Those X's represent seemingly random characters. I know they're from existing values in memory at that address, but I don't understand why I'm getting 12 bytes back instead of the three from my array.
Now, I know that if I were doing this with ints, which are four bytes long, maybe this would make sense, but sizeof(array) returns three (i.e. three bytes long; I know the sizeof(array) / sizeof(array[0]) trick). And when I do try it with ints, I'm even more confused because I get some four-byte hex number instead (maybe a memory address?).
This may be some trivial question, I'm sorry, but I'm just trying to figure out why it behaves like this. No vectors please, I'm trying to stay as non-STL as possible here.
cout takes this char array and treats it as a null-terminated string.
Since the array does not end with the null character (i.e., char(0)), cout keeps printing until it encounters one.
To do so, it reads memory outside of the array you have allocated, and technically, anything could happen.
For example, there can be different data in that memory every time the function is called, or the memory access may even be illegal, depending on the address where array was allocated at the time the function was called.
So the behavior of your program is undefined (and, in practice, often non-deterministic).

Why is strlen(s) different from the size of s, and why does cout char display a character not a number?

I wrote a piece of code to count how many 'e' characters are in a bunch of words.
For example, if I type "I read the news", the counter for how many e's are present should be 3.
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
char s[255],n,i,nr=0;
cin.getline(s,255);
for(i=1; i<=strlen(s); i++)
{
if(s[i-1]=='e') nr++;
}
cout<<nr;
return 0;
}
I have 2 unclear things about characters in C++:
In the code above, if I replace strlen(s) with 255, my code just doesn't work. I can only type a word and the program stops. I have been taught at school that strlen(s) is the length for the string s, which in this case, as I declared it, is 255. So, why can't I just type 255, instead of strlen(s)?
If I run the program above normally, it doesn't show me a number, like it is supposed to do. It shows me a character (I believe it is from the ASCII table, but I'm not sure), like a heart or a diamond. It is supposed to print the number of e's from the words.
Can anybody please explain these to me?
strlen(s) gives you the length of the string held in the s variable, up to the first null character ('\0'). So if you input "hello", the length will be 5, even though s has a capacity of 255....
nr is displayed as a character because it's declared as a char. Either declare it as int, for example, or cast it to int when cout'ing, and you'll see a number.
strlen() counts the actual length of strings - the number of real characters up to the first \0 character (marking end of string).
So, if you input "Hello":
sizeof(s) == 255
strlen(s) == 5
For the second question: you declared nr as a char. std::cout treats a char as a single character and tries to print it as such. Declare your variable as int, or cast it before printing, to avoid this.
int nr = 42;
std::cout << nr;
//or
char charNr = 42;
std::cout << static_cast<int>(charNr);
Additional mistakes not mentioned by others, and notes:
You should always check whether the stream operation was successful before trying to use the result.
i is declared as char and cannot hold values greater than 127 on common platforms. In general, the maximum value for char can be obtained as either CHAR_MAX or std::numeric_limits<char>::max(). So, on common platforms, i <= 255 will always be true because 255 is greater than CHAR_MAX. Incrementing i once it has reached CHAR_MAX does not help either: the result does not fit in a char, so where char is signed it is implementation-defined before C++20 and wraps to CHAR_MIN from C++20 on, and the loop then indexes s with a negative value. I recommend declaring i at least as int (which is guaranteed to have sufficient range for this particular use case). If you want to be on the safe side, use something like std::ptrdiff_t (add #include <cstddef> at the start of your program), which is guaranteed to be large enough to hold any valid array size.
n is declared but never used. This by itself is harmless but may indicate a design issue. It can also lead to mistakes such as trying to use n instead of nr.
You probably want to output a newline ('\n') at the end, as your program's output may look odd otherwise.
Also note that calling a potentially expensive function such as strlen repeatedly (as in the loop condition) can have negative performance implications (strlen is typically an intrinsic function, though, and the compiler may be able to optimize most calls away).
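You do not need strlen anyway, and can use cin.gcount() instead. (Note that gcount() also counts the newline that getline extracted, so it can be one greater than strlen(s); the extra position holds the '\0' terminator.)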
Nothing wrong with return 0; except that it is redundant – this is a special case that only applies to the main function.
Here's an improved version of your program, without trying to change your code style overly much:
#include <iostream>
#include <cstring>
#include <cstddef>
using namespace std;
int main()
{
char s[255];
int nr=0;
if ( cin.getline(s,255) )
{ // only if reading was successful
for(int i=0; i<cin.gcount(); i++)
{
if(s[i]=='e') nr++;
}
cout<<nr<<'\n';
}
return 0;
}
For exposition, the following is a more concise and expressive version using std::string (for arbitrary length input), and a standard algorithm. (As an interviewer, I would set this, modulo minor stylistic differences, as the canonical answer i.e. worth full credit.)
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s;
if ( getline(cin, s) )
{
cout << std::count(begin(s), end(s), 'e') << '\n';
}
}
I have 2 unclear things about characters in C++: 1) In the code above,
if I replace the "strlen(s)" with 255, my code just doesn't work, I
can only type a word and the program stops, and I have been taught at
school that "strlen(s)" is the length for the string s, which in this
case, as I declared it, is 255. So, why can't I just type 255, instead
of strlen(s)?
That's right, but a string only goes up to the null terminator, even if more space is allocated. Consider this, for example:
char buf[32];
strcpy(buf, "Hello World!");
There's 32 chars worth of space, but my string is only 12 characters long. That's why strlen returns 12 in this example. It's because it doesn't know how long the buffer is, it only knows the address of the string and parses it until it finds the null terminator.
So if you use 255 as the bound, you go past what was written by cin and read the rest of the buffer, which in this case is uninitialized. That's undefined behavior; most likely it will read some rubbish values, and those might coincidentally include 'e' and thus give you a wrong result.
2) If you run the program above normally, it doesn't show you a number,
like it's supposed to do, it shows me a character (I believe it's from
the ASCII table but I'm not sure), like a heart or a diamond, but it
is supposed to print the number of e's from the words. So can anybody
please explain these to me?
You declared nr as char. While that can indeed hold an integer value, if you print it like this, it will be printed as a character. Declare it as int instead or cast it when you print it.
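Putting both points together, here is a small sketch of the counting logic with the loop bounded by strlen and an int counter. count_e is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts the 'e' characters in a null-terminated string.
// The loop stops at strlen(s), so it never reads the unused part of the buffer,
// and the counter is an int, so it prints as a number.
int count_e(const char* s) {
    int nr = 0;
    const std::size_t len = std::strlen(s);  // computed once, not per iteration
    for (std::size_t i = 0; i < len; ++i) {
        if (s[i] == 'e') {
            ++nr;
        }
    }
    return nr;
}
```

For the input "I read the news", std::cout << count_e(s) prints 3.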

Purpose of char a[0] in converting integer to string using itoa()

I have this code,
char a[0];
cout << "What is a? " << a << endl;
char *word = itoa(123,a,10);
string yr = string(word);
but I have trouble understanding the array a[0]. I tried to change its size to see if anything changes, but it seems to make no difference at all.
For example, even if I change a[0] to a[1], or any other integer, the output is still the same:
char a[1];
cout << "What is a? " << a << endl;
char *word = itoa(123,a,10);
string yr = string(word);
What is its purpose here?
Since the itoa function is non-standard, this discussion assumes the popular signature itoa(int, char*, int).
The second parameter is a buffer into which a null-terminated string representing the value is copied. It must provide enough space for the entire string: in your case, that is "123", which takes four characters including the terminating null. Your code passes a as the buffer, but a is too small to hold the entire "123" string. Hence, the call causes undefined behavior.
You need to make a large enough to fit the destination string. A buffer of size 12 is sufficient to hold the longest decimal number that itoa can produce when int is 32 bits wide (i.e. -2147483648: 11 characters plus the terminating null). Replace char a[0] with char a[12] in the declaration.
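As a sketch of the same idea with a standard function: the following swaps the non-standard itoa for std::snprintf, which also takes the buffer size and so cannot overrun it. int_to_text is a made-up name, and the size 12 assumes a 32-bit int as discussed above.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: decimal text of value, built in a fixed buffer.
std::string int_to_text(int value) {
    char buf[12];  // assumes 32-bit int: up to 10 digits, a sign, and '\0'
    std::snprintf(buf, sizeof buf, "%d", value);  // never writes past sizeof buf
    return std::string(buf);
}
```

If you only need a std::string anyway, std::to_string(123) does the same without any buffer of your own.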
What is its purpose here?
A zero-length array is an array with no elements in it.
You can't [legally] print or modify its contents, because it doesn't have any.
There are arcane reasons to want to use one, but speaking generally it has no purpose for you. It's not even allowed by the standard (although compilers tend to support it for those arcane reasons).
even if I change a[0] to a[1], or any other integer, the output still makes no difference
Well, if you have an array with n elements in it, and you write more than n elements' worth of data to it, that's a "buffer overrun" and has undefined behaviour. It could appear to work as you overwrite somebody else's memory, or your program could crash, or your dog could suddenly turn into a zombie and eat you alive. Best avoided tbh.

Able to Access Elements with Index Greater than Array Length

The following code seems to be running when it shouldn't. In this example:
#include <iostream>
using namespace std;
int main()
{
char data[1];
cout<<"Enter data: ";
cin>>data;
cout<<data[2]<<endl;
}
Entering a string with a length greater than 1 (e.g., "Hello"), will produce output as if the array were large enough to hold it (e.g., "l"). Should this not be throwing an error when it tried to store a value that was longer than the array or when it tried to retrieve a value with an index greater than the array length?
The following code seems to be running when it shouldn't.
It is not about "should" or "shouldn't". It is about "may" or "may not".
That is, your program may run, or it may not.
It is because your program invokes undefined behavior. Accessing an array element beyond the array-length invokes undefined behavior which means anything could happen.
The proper way to write your code is to use std::string as:
#include <iostream>
#include <string>
//using namespace std; DONT WRITE THIS HERE
int main()
{
std::string data;
std::cout<<"Enter data: ";
std::cin>>data; //read the entire input string, no matter how long it is!
std::cout<<data<<std::endl; //print the entire string
if ( data.size() > 2 ) //check if data has at least 3 characters
{
std::cout << data[2] << std::endl; //print 3rd character
}
}
It can crash when compiled with different options or on another machine, because the result of running that code is undefined according to the documentation.
It is not safe to be doing this. What it is doing is writing over the memory that happens to lie after the buffer. Afterwards, it is then reading it back out to you.
This only works because the cin and cout operations don't say: this is a pointer to one char, I will only write one char. Instead they assume that enough space has been allocated. cin >> data keeps storing characters until it reaches whitespace and then appends a null terminator \0; cout << data keeps reading until it hits that \0. (Since C++20, operator>> takes the array by reference and stops at its size.)
To fix this, you can replace this with:
std::string data;
C++ will let you make big memory mistakes.
Some 'rules' that will save you most of the time:
1. Don't use char[]. Use std::string instead.
2. Don't use pointers to pass or return arguments. Pass by reference, return by value.
3. Don't use arrays (e.g. int[]). Use vectors. You still have to check your own bounds.
With just those three you'll be writing somewhat "safe", non-C-like code.
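On the "check your own bounds" point: operator[] on a vector does no checking, but at() does. Here is a small sketch, where try_get is a hypothetical helper and not part of the standard library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies v[i] into out and returns true,
// or returns false (leaving out untouched) if i is out of range.
bool try_get(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);  // at() throws std::out_of_range instead of reading past the end
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

A plain if (i < v.size()) check before v[i] achieves the same without exceptions.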

Strange characters appear when using strcat function in C++

I am a newbie to C++ and learning from the MSDN C++ Beginner's Guide.
While trying the strcat function it works but I get three strange characters at the
beginning.
Here is my code
#include <iostream>
#include <cstdio>
#include <cstring>
using namespace std;
int main() {
char first_name[40],last_name[40],full_name[80],space[1];
space[0] = ' ';
cout << "Enter your first name: ";
gets(first_name);
cout << "Enter your last name: ";
gets(last_name);
strcat(full_name,first_name);
strcat(full_name,space);
strcat(full_name,last_name);
cout << "Your name is: " << full_name;
return 0;
}
And here is the output
Enter your first name: Taher
Enter your last name: Abouzeid
Your name is: Y}#Taher Abouzeid
I wonder why Y}# appear before my name ?
You aren't initializing full_name by setting its first character to '\0', so it contains garbage characters, and strcat appends your new data after those garbage characters.
The array that you are creating is full of random data. C++ allocates the space for the array but does not initialize it with known data. strcat attaches the new data at the end of the existing string (at the first '\0'); because the array has not been initialized and is full of random data, that first '\0' will not be at the first character.
This could be corrected by replacing
char first_name[40],last_name[40],full_name[80],space[1];
with
char first_name[40] = {0};
char last_name[40] = {0};
char full_name[80] = {0};
char space[2] = {0};
The = {0} sets the first element to '\0', which is the string terminator, and C++ automatically fills all unspecified elements with '\0' as well (provided that at least one element is specified).
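For illustration, a sketch of the concatenation with the destination explicitly emptied first and the capacity checked. join_name is a hypothetical helper written for this answer; the standard calls are only strlen and strcat.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes "first last" into out (capacity cap).
// Returns false and writes nothing if the result would not fit.
bool join_name(char* out, std::size_t cap, const char* first, const char* last) {
    // characters of both names + one space + one '\0'
    if (std::strlen(first) + 1 + std::strlen(last) + 1 > cap) {
        return false;
    }
    out[0] = '\0';          // start from an empty string so strcat appends at index 0
    std::strcat(out, first);
    std::strcat(out, " ");  // a string literal already carries its terminator
    std::strcat(out, last);
    return true;
}
```

With char full_name[80], join_name(full_name, sizeof full_name, "Taher", "Abouzeid") leaves "Taher Abouzeid" in the buffer with no leading garbage.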
The variable full_name isn't being initialized before being appended to.
Change this:
strcat(full_name,first_name);
to this:
strcpy(full_name,first_name);
You may not see any problem in your test, but your space string is also not null-terminated after you set its only character to ' '.
As others have said, you must initialize the data, but have you thought about learning the standard C++ library? It is often more intuitive, and probably more efficient.
With std::string (assuming first_name and last_name are std::string objects rather than char arrays) it would be:
string full_name=first_name+" "+last_name;
and you won't have to bother with terminating null characters. For a reference go to cplusplus
Oh and a full working example so you could understand better (from operator+=):
#include <iostream>
#include <string>
using namespace std;
int main ()
{
string name ("John");
string family ("Smith");
name += " K. "; // c-string
name += family; // string
name += '\n'; // character
cout << name;
return 0;
}
The problem is with your space text.
The strcat function requires a C-style string, which is zero or more characters followed by a null, terminating, character. So when allocating arrays for C-style strings, you need to allocate one extra character for the terminating null character.
So, your space array needs to be of length 2, one for the space character and one for the null character.
Since space is constant, you can use a string literal instead of an array:
const char space[] = " ";
Also, since you are a newbie, here are some tips:
1. Declare one variable per line.
This will be easier to modify and change variable types.
2. Either flush std::cout, use std::endl, or include a '\n'.
This will flush the buffers and display any remaining text.
3. Read the C++ language FAQ.
4. You can avoid C-style string problems by using std::string
5. Invest in Scott Meyers' Effective C++ and More Effective C++ books.
Strings are null-terminated in C and C++ (the strcat function is a legacy of C). This means that when you pass the address of uninitialized memory (newly declared local char[] variables live on the stack, and their content is not initialized), the string functions will interpret everything up to the first \0 (null) character as a string (and will go beyond the allocated size if no \0 occurs inside it).
This can lead to very obscure bugs, security issues (buffer overflow exploits) and very unreadable and unmaintainable code. Modern compilers have features that can help with the detection of such issues.
Here is a good summary of your options.