std::cout the outer dimension of a 2D char array - C++

Create a console application with the following code (renaming f to your entry point):
#include <iostream>
void f(){
char a[5][5];
std::cin>>a[0]>>a[1]>>a[2]>>a[3]>>a[4];
for (int y = 0; y<5; y++)std::cout<<a[y]<<'\n';
}
and input 5 lines of 5 characters such as :
abcde
abcde
abcde
abcde
abcde
I expected the output to be identical to the input or throw an error, but instead I got:
abcdeabcdeabcdeabcdeabcde
abcdeabcdeabcdeabcde
abcdeabcdeabcde
abcdeabcde
abcde
When investigated using the debugger, each a[y] value is equal to abcde and not the displayed output.
What on earth is going on here? Why is this happening, and is there a way to stop it?
Is it related to the
Stack around the variable 'a' was corrupted
Error that gets thrown after it std::couts?
I'm well aware of other ways to get the desired output using nested loops, but I'm wondering if there's a way to iterate only the outer dimension so it uses fewer characters - this is for a code golf challenge. It makes quite a difference:
for(int y=0;y<5;y++)std::cout<<a[y]<<'\n';
vs
for(int y=0;y<5;y++){for(int x=0;x<5;x++)std::cout<<a[y][x];std::cout<<'\n';}

The problem is caused by the fact that you are trying to store "abcde" in a char array with 5 elements. You need at least one more element in the array to hold the terminating null character.
As a consequence, your program has undefined behavior. We can try to make sense of the output but it's futile.
Use
char a[5][6]; // Anything greater than 5 will work for your input
If you don't want your code to be tied to a hard coded size, you can use std::string.
std::string a[5];
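A minimal sketch of the std::string variant, assuming the input is five whitespace-separated words. The helper names read_rows and join_rows are made up for illustration and take a stream so the idea can be tried without a console:

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads five whitespace-separated words.
// std::string grows as needed, so no slot for '\0' has to be budgeted by hand.
std::array<std::string, 5> read_rows(std::istream& in) {
    std::array<std::string, 5> rows;
    for (auto& row : rows) in >> row;
    return rows;
}

// Hypothetical helper: iterates only the outer dimension, one row per line.
std::string join_rows(const std::array<std::string, 5>& rows) {
    std::string out;
    for (const auto& row : rows) {
        out += row;
        out += '\n';
    }
    return out;
}
```

Calling std::cout << join_rows(read_rows(std::cin)); from your entry point should echo the five input lines unchanged.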

A C-string is a sequence of characters that ends with a null terminator. That means "abcde" is actually 6 characters long: the 5 you see plus the null terminator.
Since you only allocated enough space for the input without the null terminator, trying to put the string into the array writes off the end of the array, which is undefined behavior. What you need is
char a[5][6];
As that will have enough space for the 5 characters plus the null terminator.
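If you stay with char arrays, it may also be worth capping how much operator>> is allowed to write. The sketch below assumes the same five-word input; echo_rows is a hypothetical name, and std::setw is used so that each extraction stores at most 5 characters plus the terminator:

```cpp
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads five words into char[5][6] and echoes them.
std::string echo_rows(std::istream& in) {
    char a[5][6] = {};  // 5 visible characters + 1 terminator per row
    for (auto& row : a) {
        // setw(6) limits the extraction to 5 characters and a '\0',
        // so a longer word cannot run past the end of the row.
        in >> std::setw(sizeof row) >> row;
    }
    std::string out;
    for (const auto& row : a) {
        out += row;  // each row is a properly terminated C-string
        out += '\n';
    }
    return out;
}
```

With this guard, a word longer than 5 characters is split across rows instead of overflowing the array.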

Related

Why does string concatenation fail when resize is used in cpp? [duplicate]

I came across a scenario where string concatenation is failing in C++. But I don't see a reason for it to fail.
Code sample is as below:
int main()
{
std::string a;
std::string b = "bbbbbbb";
a.resize(10);
for (int i = 0; i <= 5; i++) {
a[i] = 'a';
}
a = a+b;
printf("\n%s\n", a.c_str());
}
It is outputting aaaaaa.
I was expecting it to output aaaaaabbbbb. If I change a.resize(10); to a.resize(5); I am getting the expected output.
Would be helpful if someone could help me in understanding the behaviour?
In addition to the off-by-one error, after concatenation, the contents of a in main are:
aaaaaa\0\0\0\0bbbbbbb
So: six 'a' bytes, then four zero bytes, then seven 'b' bytes. The string is seventeen bytes long.
printf, like other C functions, doesn't know about this size, and instead takes the length of the string to be until the first zero byte. In your case, that is "aaaaaa".
To print the entire string, use something like std::cout. If you're certain you want C-style output, note that the %.*s specifier only sets a maximum length and printf still stops at the first zero byte; fwrite(a.data(), 1, a.size(), stdout) writes every byte.
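The difference between the length std::string stores and the length a C function sees can be checked directly. This sketch mirrors the code in the question (six writes of 'a' into a string resized to 10, then a seven-character b); build_padded and show_zero_bytes are hypothetical helper names:

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: rebuilds the string from the question.
std::string build_padded() {
    std::string a;
    std::string b = "bbbbbbb";
    a.resize(10);                            // ten '\0' characters
    for (int i = 0; i <= 5; i++) a[i] = 'a'; // six writes, as in the question
    return a + b;
}

// Hypothetical helper: makes embedded zero bytes visible as the text \0.
std::string show_zero_bytes(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '\0') out += "\\0";
        else out += c;
    }
    return out;
}
```

Here build_padded().size() counts every byte, while std::strlen(build_padded().c_str()) stops at the first zero byte, which is the same rule printf follows for %s.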
std::string a;
a.resize(10);
gives you a string of size 10 whose characters are all value-initialized to '\0'.
You set the first 6 characters to something specific and append some more characters to the end. But characters 6-9 never get set to anything else.
They therefore remain zero, and printf, as a C-style function, treats the first null character as the end of the string. Therefore it stops printing.

Why do I get Hp printed instead of an empty line?

I have a very simple question, why is the output of this code the way it is?
I am using Dev-C++ 5.11 with TDM-GCC 4.9.2 64-bit
#include <iostream>
using namespace std;
int main()
{
char *ptr;
char Str[] = "abcdefg";
ptr = Str;
ptr += 8;
cout << ptr;
return 0;
}
I would expect the code to print an empty line.
For some reason, there seems to be a space character at position 7; you can detect that by changing ptr += 8; to ptr += 7;.
but what is weirder to me is that there are 3 more characters that can't be displayed unless you jump beyond the array limit by 2, which in this case means adding 8 to the pointer. The characters are: "H, (a weird filled square), p"
I would expect the code to print an empty line.
That expectation is misguided. The behaviour of the program is undefined.
For some reason, there seems to be a space character at position 7
There is not. There is a null terminator at position 7.
but what is weirder to me is that there are 3 more characters that can't be displayed unless you jump beyond the array limit by 2 ...
The behaviour of accessing an array outside of its bounds is undefined.
You cannot expect an empty line when you try to access memory beyond your array.
At position 7 you have the '\0'.
C strings are terminated by this character, and the printing function also uses it to know when it should stop printing.
At position 8 you are beyond this character, and the behavior of the program is undefined since the memory you are accessing might contain anything.
The characters that you are able to print are just a representation of the memory beyond the string. They might change between runs, or the program might crash.
Character 'a' is at position 0 and character 'g' is at position 6, followed by the terminator at position 7. You should not access memory outside of this region unless you are trying to hack something.
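One way to stay inside the region described above is to clamp the offset to the position of the terminator before printing. A minimal sketch, with tail_from as a hypothetical helper name:

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: returns the part of s that starts at offset,
// never stepping past the '\0' that ends the string.
std::string tail_from(const char* s, std::size_t offset) {
    const std::size_t len = std::strlen(s);  // index of the terminator
    if (offset > len) offset = len;          // clamp: pointing at '\0' gives ""
    return std::string(s + offset);
}
```

With char Str[] = "abcdefg";, offsets 7 and 8 both yield an empty string here, because the pointer is never moved beyond the terminator at index 7.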

Unexpected output of char arrays in c++

I'm pretty inexperienced in c++, and I wrote the following code to see how characters and strings work.
#include "stdio.h"
#include <iostream>
#include <string>
using namespace std;
int main()
{
char asdf[] = "hello";
char test[5] = {'h','e','l','l','o'};
cout << test;
}
I expected it to output "hello", but instead I got "hellohello", which is really puzzling to me. I did some experimenting:
If I change the asdf to another string of a different length, it outputs "hello" normally.
If I change the amount of characters in test it outputs "hello" normally.
I thought this only happened when the two were the same length, but when I change them both to "hell" it seems to output "hell" normally.
To make things more confusing, when I asked a friend to run this code on their computer, it outputted "hello" and then a random character.
I'm running a fresh install of code blocks on Ubuntu. Anyone have any idea what is going on here?
This is undefined behaviour.
Raw char* or char[] strings in C and C++ must be null-terminated. That is, the string needs to end with a '\0' character. Your test[5] does not do that, so the function printing the output continues after the last o, because it is still looking for the null terminator.
Due to how the strings are stored on the stack (the stack usually grows towards lower addresses), the next bytes it encounters are those of asdf[], to which you assigned "hello". This is what the memory layout looks like; the arrow indicates the direction in which memory addresses (think pointers) increase:
---->
+-------------------
|hellohello\0 ...
+-------------------
      \_ asdf
 \_ test
Now in C++ and C, string literals like "hello" are null-terminated implicitly, so the compiler writes a hidden '\0' after the end of the string. The output function continues to print the contents of asdf char-by-char until it reaches that hidden '\0' and then it stops.
If you were to remove the asdf, you might see a bit of garbage after the first hello and possibly a segmentation fault. But this is undefined behaviour, because you are reading out of the bounds of the test array. This also explains why it behaves differently on different systems: for example, some compilers may decide to lay out the variables in a different order on the stack, so that on your friend's system, test is actually lower on the stack (remember, lower on the stack means at a higher address):
---->
+-------------------
|hello\0hello ...
+-------------------
        \_ test
 \_ asdf
Now when you print the contents of test, it will print hello char-by-char, then continue reading the memory until a \0 is found. The contents of ... are highly specific to the architecture and runtime used, possibly even the phase of the moon and time of day (not entirely serious), so that on your friend's machine it prints a "random" character and then stops.
You can fix this by adding a '\0' or 0 to your test array (you will need to change the size to 6). However, using const char test[] = "hello"; is the sanest way to solve this.
You have to terminate your test array with a null character (value 0). What happens now is that in memory it is adjacent to your asdf string, so since test isn't terminated, the << will just continue until it meets the null character at the end of asdf.
In case you wonder: when asdf is initialized from the string literal, this null character is added automatically.
The reason for this is that C style strings need the null character to mark the end of the string.
As you have not put this into the array test, it will just keep printing characters until it finds one. In your case the array asdf happens to follow test in memory, but this cannot be guaranteed.
Instead change the code to this:
char test[] = {'h','e','l','l','o', 0};
cout prints all characters starting from the given address (test here, or &test[0] in equivalent notation) up to the point where it finds a null terminator. As you haven't put a null terminator into the test array, it will continue to print until it accidentally finds one in memory. Reading past the end of the array is undefined behavior, so what happens from that point on is unpredictable.
Last character should be '\0' to indicate end of string.
char test[6] = {'h','e','l','l','o','\0'};
Unless there is an overload of operator<< for a reference to an array of 5 chars, the array will "decay" to a pointer to char and be treated as a C style string by the operator. C style strings are by convention terminated with a 0 char, which your array is lacking. Therefore the operator continues outputting the bytes in memory, interpreting them as printable chars. It just so happens that on the stack, the two arrays were adjacent so that the operator ran into asdf's memory area, outputting those chars and finally encountering the implicit 0 char which is at the end of "hello". If you omit the other declaration, your program may crash, namely if the next 0 byte comes later than the memory boundary of your program.
It is undefined behavior to access memory outside an object (here: test) through a pointer to that object.
Character sequences need a null terminator (\0).
char asdf[] = "hello"; // OK: String literals have '\0' appended at the end
char test[5] = {'h','e','l','l','o'}; // Oops, not null terminated. UB
Corrected:
char test[6] = {'h','e','l','l','o','\0'}; // OK
//        ^                         ^^^^
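If the array has to stay at exactly 5 characters with no terminator, the length can be passed explicitly instead of relying on a '\0'. A small sketch of two ways to do that; the function names via_write and via_string are made up for illustration:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes exactly sizeof(test) bytes to a stream.
std::string via_write() {
    const char test[5] = {'h', 'e', 'l', 'l', 'o'};  // no terminator
    std::ostringstream out;
    out.write(test, sizeof test);  // length given explicitly, no scan for '\0'
    return out.str();
}

// Hypothetical helper: builds a std::string from a pointer and a length.
std::string via_string() {
    const char test[5] = {'h', 'e', 'l', 'l', 'o'};  // no terminator
    return std::string(test, sizeof test);           // copies exactly 5 chars
}
```

The same write call exists on std::cout, so std::cout.write(test, sizeof test); prints the five characters without reading past the array.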

Char pointer giving me some really strange characters

When I run the example code, the wordLength is 7 (hence the output 7). But my char array gets some really weird characters in the end of it.
wordLength = word.length();
cout << wordLength;
char * wordchar = new char[wordLength]; //new char[7]; ??
for (int i = 0; i < word.length(); i++) //0-6 = 7
{
wordchar[i] = 'a';
}
cout << wordchar;
The output: 7 aaaaaaa²²²²¦¦¦¦¦ÂD╩2¦♀
Desired output is: aaaaaaa... What is the garbage behind it?? And how did it end up there?
You should add \0 at the end of wordchar.
char * wordchar = new char[wordLength +1];
//add chars as you have done
wordchar[wordLength] = '\0';
The reason is that C-strings are null terminated.
C strings are terminated with a '\0' character that marks their end (in contrast, C++ std::string just stores the length separately).
In copying the characters to wordchar you didn't terminate the string, thus, when operator<< outputs wordchar, it goes on until it finds the first \0 character that happens to be after the memory location pointed to by wordchar, and in the process it prints all the garbage values that happen to be in memory in between.
To fix the problem, you should:
make the allocated string 1 char longer;
add the \0 character at the end.
Still, in C++ you'll normally just want to use std::string.
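Both options can be sketched side by side, assuming the goal is simply n copies of 'a'. The names filled_string and filled_buffer are hypothetical, and std::make_unique (C++14) stands in for the raw new[] of the question so the buffer is released automatically:

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: std::string tracks its own length.
std::string filled_string(std::size_t n) {
    return std::string(n, 'a');
}

// Hypothetical helper: manual buffer that is 1 char longer and terminated.
std::unique_ptr<char[]> filled_buffer(std::size_t n) {
    auto buf = std::make_unique<char[]>(n + 1);  // room for the '\0'
    for (std::size_t i = 0; i < n; ++i) buf[i] = 'a';
    buf[n] = '\0';  // explicit terminator at index n
    return buf;
}
```

Printing filled_buffer(7).get() with operator<< then stops after the seventh 'a' instead of continuing into unrelated memory.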
Use:
char * wordchar = new char[wordLength+1]; // 1 extra for null character
before the for loop and
wordchar[wordLength] = '\0';
after the for loop, because C strings are null terminated.
Without this it keeps on printing until it finds the first null character, printing all the garbage values along the way.
You omitted the trailing zero; that's the cause.
In C and C++, the whole ecosystem determines string length by assuming a trailing zero ('\0' or simply 0 numerically). This is different from, for example, Pascal strings, where the memory representation starts with a number that tells how many of the following characters make up the string.
So if you have string content that you want to store, you have to allocate one additional byte for the trailing zero. If you manipulate memory content, you always have to keep the trailing zero in mind and preserve it. Otherwise strcat and other string manipulation functions can run off the end and overwrite the following memory section. Without the trailing zero, strlen will also give a wrong result, since it counts until it encounters the first zero.
You are not the only one making this mistake; it often plays an important role in security vulnerabilities and their exploits. The exploit takes advantage of the side effect that a function runs past the end and manipulates things other than what was originally intended. This is a very important and dangerous part of C.
In C++ (as you tagged your question) you are better off using std::string and its member functions instead of C-style manipulation.

Confusion about zero-terminating character

I've always had a question about null-terminated strings in C++/C. For example, if you have a character array like so:
char a[10];
And then you wanted to read in characters like so:
for(int i = 0; i < 10; i++)
{
cin >> a[i];
}
And let's say you enter the following word as the input: questioner
Now my question is what happens to the '\0'? If I were to reverse the string, and make it print out
renoitseuq
Where does the null-terminating character go? I thought that good programming practice was to always leave one extra character for the zero-terminating character. But in this example, everything was printed correctly, so why care about the null-terminating character? Just curious. Thanks for your thoughts!
There are cases where you're given a null-terminator, and cases where you have to ask for one yourself.
const char* x = "bla";
is a null-terminated C-style string. It actually has 4 characters - the 3 + the null terminator.
Your string isn't null-terminated. In fact, treating it as a null-terminated string leads to undefined behavior. If you were to cout << it, you'd be attempting to read beyond the memory you're allowed to access, because the runtime will keep looking for a null-terminator and spit out characters until it reaches one. In your case, you were lucky there was one right at the end, but that's not a guarantee.
char a[10]; is just like any other array - un-initialized values, 10 characters - not 11 just because it's a char array. You wouldn't expect int b[10] to contain 10 values for you to play with and an extra 0 at the end just because, would you?
Well, reading that back, I don't see why you'd expect that from a C-string as well - it's not all intuitive.
You are reading 10 chars, not a string. I assume that you also output 10 chars in reverse, so the 0 char plays no role, because you don't use the array as a string but as an array of single chars.
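That reading of the question can be sketched as follows, assuming exactly ten non-whitespace characters arrive on the stream. read_and_reverse is a hypothetical name, and the array is handled purely by its size, never as a C-string:

```cpp
#include <algorithm>
#include <array>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads 10 single chars and returns them reversed.
std::string read_and_reverse(std::istream& in) {
    std::array<char, 10> a{};
    for (char& c : a) in >> c;         // ten separate char extractions
    std::reverse(a.begin(), a.end());  // works on the known size, no '\0' needed
    return std::string(a.data(), a.size());  // length passed explicitly
}
```

Because every step uses the known size of 10, no terminator is ever consulted, which is why the original program appeared to work without one.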
char a[10] is ten characters, any of which can be a '\0'.
If you put "questioner" in there none of them are.
To get that you'd need a[11] and fill it with "questioner" and then '\0'.
If you were reversing it, you'd get the position of the first '\0' in a[?], reverse up to that and then add a null terminator.
This is a classic banana skin in C. Unfortunately, it still manages to get under your foot at the most inopportune of moments, even if you are all too familiar with it.
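The reversal described above can be sketched like this, assuming the buffer really is null-terminated (for example char a[11] = "questioner";). reverse_cstring is a hypothetical name:

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// Hypothetical helper: reverses a null-terminated string in place.
void reverse_cstring(char* s) {
    // strlen finds the first '\0'; only the characters before it are swapped,
    // so the terminator stays at the end and does not need to be re-added.
    std::reverse(s, s + std::strlen(s));
}
```

Since only the characters in front of the terminator move, the result is still a valid C-string that can be passed to cout or printf.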