Not overwriting elements in arrays - c++

I am writing code for a program we are supposed to make in my computer science course I am taking, where we are learning C++. In this program, I'm trying to get a user to enter flavours of popcorn they may like, but without exceeding 10 (or 9) characters. When I tried including flavours with > 10 chars, the program would not result in an error or work properly, but would just store the chars exceeding 10 into the next element in the array. How can I stop the program from doing this?
#include <cstdio>
#include <iostream>
using namespace std;

#define POP_COUNT 5
#define POP_SIZE 10

int main() {
    char popcorn[POP_COUNT][POP_SIZE];
    fputs("Enter your 5 favourite popcorn flavours: \n", stdout);
    for (int i = 0; i < POP_COUNT; i++) {
        fgets(popcorn[i], POP_SIZE, stdin);
        popcorn[i][POP_SIZE - 1] = 0;
    }
    cout << popcorn[0] << endl;
    cout << popcorn[1] << endl;
    fputs("Your favourite flavours are: \n", stdout);
    for (int i = 0; i < POP_COUNT; i++) {
        fputs(popcorn[i], stdout);
    }
}

Your issue arises from the normal behavior of fgets(). It will not read more characters than a buffer of the size you specify can accommodate, including space for a terminator. Whatever it does not read on a given call remains waiting in the stream, and will be read on the next call.
To address this, you must first detect it. You can do that by checking whether a newline was stored in the destination buffer. If so, then you're good, but if not, then only a partial line has been read. Note, too, that you do not need to manually insert a terminator -- fgets() will always include a terminator on success.
If you just want to discard any extra characters, then in the second case you must do exactly that -- read and ignore additional characters until either you have read a newline, or you reach the end of the file.
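A minimal sketch of that detect-and-discard approach, assuming you are happy to truncate over-long flavours. The helper name read_line_truncated and its parameters are illustrative only and are not part of the question's code:

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: reads one line into buf (capacity `size`), strips the
// trailing newline, and throws away any characters that did not fit.
// Returns false if nothing could be read (end of file or error).
bool read_line_truncated(char* buf, int size, std::FILE* in) {
    assert(buf != nullptr && size > 0);
    if (std::fgets(buf, size, in) == nullptr) {
        return false;
    }
    char* newline = std::strchr(buf, '\n');
    if (newline != nullptr) {
        *newline = '\0';  // the whole line fit; drop the newline
    } else {
        // Partial line: consume the rest up to and including the newline,
        // or stop at end of file.
        int c;
        while ((c = std::fgetc(in)) != EOF && c != '\n') {
        }
    }
    return true;
}
```

In the question's loop this would replace the fgets call, e.g. read_line_truncated(popcorn[i], POP_SIZE, stdin), and the manual popcorn[i][POP_SIZE - 1] = 0; line could then be dropped.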

Related

Why does getline behave weirdly after 3 newlines?

I'll preface this by saying that I'm relatively new to posting questions and to C++ in general. My title doesn't specifically address the problem I'm dealing with, but I couldn't think of a better way to word it, so any suggestions for improving it are appreciated.
I am working on a relatively simple function which is supposed to get a string using getline, and read the spaces and/or newlines in the string so that it can output how many words have been entered. After reaching the character 'q' it's basically supposed to stop reading in characters.
void ReadStdIn2() {
    std::string userInput;
    const char *inputArray = userInput.c_str();
    int count = 0;
    getline(std::cin, userInput, 'q');
    for (int i = 0; i < strlen(inputArray); i++){
        if ((inputArray[i] == ' ') || (inputArray[i] == '\n')){
            count += 1;
        }
    }
    std::cout << count << std::endl;
}
I want to be able to enter multiple words, followed by newlines, and have the function accurately display my number of words. I can't figure out why but for some reason after entering 3 newlines my count goes right back to 0.
For example, if I enter:
hello
jim
tim
q
the function works just fine, and returns 3 just like I expect it to. But if I enter
hello
jim
tim
bill
q
the count goes right to 0. I'm assuming this has something to do with my if statement but I'm really lost as to what is wrong, especially since it works fine up until the 3rd newline. Any help is appreciated
The behaviour of the program is undefined. Reading input into std::string potentially causes its capacity to increase. This causes pointers into the string to become invalid. Pointers such as inputArray. You then later attempt to read through the invalid pointer.
P.S. calculating the length of the string with std::strlen in every iteration of the loop is not a good idea. It is possible to get the size without calculation by using userInput.size().
To fix both issues, simply don't use inputArray. You don't need it:
for (std::size_t i = 0; i < userInput.size(); i++){
    if ((userInput[i] == ' ') || (userInput[i] == '\n')){
        ...
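For reference, here is a sketch of how the whole function might look with that change. It assumes the input comes from any std::istream rather than std::cin directly; the name count_separators and the stream parameter are illustrative, not from the question. Like the original, it counts spaces and newlines rather than words as such:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical variant of ReadStdIn2: reads everything up to the first 'q'
// and counts the spaces and newlines in it. No pointer into the string is
// kept across the read, so nothing can be invalidated.
std::size_t count_separators(std::istream& in) {
    std::string userInput;
    std::getline(in, userInput, 'q');
    std::size_t count = 0;
    for (std::size_t i = 0; i < userInput.size(); i++) {
        if (userInput[i] == ' ' || userInput[i] == '\n') {
            count += 1;
        }
    }
    return count;
}
```

Calling count_separators(std::cin) would give the interactive behaviour from the question; a std::istringstream works for trying it out without typing.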

Trying to read a single character at a time into an array of indefinite size

I am a CS student working on a c++ project. We have been instructed to declare a struct and use it to read in an array of chars and keep a tally of how many letters are used in the string. We are not allowed to use a string; it MUST be an array of our declared struct.
The input must be as long as the user wants; the code has to be able to accept new lines of input and be terminated by '.'
I'm really struggling here. I don't even know where to begin. I've thrown together some code as best-guess for what to do, but it crashes after pressing "." then enter, and I don't know why.
//declare struct
struct data
{
    int tally = 0;
    char letter;
};
//size of string to read in at a time
const int SIZE_OF_CHUNK = 11;
int main()
{
    //input chunk of struct
    data input[SIZE_OF_CHUNK];
    int placemark,
        length;
    cout << "Enter sequence of characters, '.' to terminate:" << endl;
    do
    {
        for (int index = 0; (input[index].letter != '\0') && (input[index - 1].letter != '.'); index++)
        {
            cin >> input[index].letter;
            placemark++;
        }
        //I intend to put something here to handle if the code
        //needs to read in another chunk, but I want to fix the crashing
        //problem first
    }
    while (input[placemark].letter != '.');
    //print out what was read in, just to check
    for (int index = 0; input[index].letter != '\0'; index++)
    {
        cout << input[index].letter;
    }
    return 0;
}
I've tried looking up how to read in a single character but haven't found anything helpful to my circumstances so far. Any tips on what I'm doing wrong, or where I can find helpful resources, would be very much appreciated.
Are you sure you must use a declared struct?
If you just want to count the number of times a character has appeared, you don't need to store the character; you just need to store the number of times it appeared. So just unsigned lettersCount[26], and each index maps to a letter (i.e. index 0 means a, index 1 means b). Whenever a letter appears, just increase the count of that index.
You can map a letter to an index by making use of ASCII. Every letter is represented by a number that you can look up in an ASCII table. For example, the letter a is represented by the decimal value 97, b by 98, and so on. The values increase consecutively, which we can make use of: to map a lowercase letter to an index, compute value - 97, or equivalently value - 'a'. For example, if you read in the letter a, subtracting 97 gives 0, which is the index you want. After getting the index, a simple ++ increments the count for that letter.
Whether to treat uppercase and lowercase letters the same or differently is up to you; it should be fairly simple to work out once you understand the mapping above.
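A rough sketch of that counting idea, under a few assumptions of mine: the character set is ASCII (so a to z are contiguous), uppercase letters are folded into lowercase, and input is read one character at a time until '.' or end of input. The function name tally_letters is made up for illustration, and this deliberately ignores the assignment's struct requirement:

```cpp
#include <array>
#include <cctype>
#include <istream>
#include <sstream>

// Hypothetical helper: counts how often each letter a..z occurs before the
// terminating '.'. Index 0 is 'a', index 1 is 'b', and so on.
std::array<unsigned, 26> tally_letters(std::istream& in) {
    std::array<unsigned, 26> lettersCount{};  // all counts start at zero
    char ch;
    // get() reads every character, including spaces and newlines.
    while (in.get(ch) && ch != '.') {
        int lower = std::tolower(static_cast<unsigned char>(ch));
        if (lower >= 'a' && lower <= 'z') {
            ++lettersCount[lower - 'a'];  // map the letter to its index
        }
    }
    return lettersCount;
}
```

Because nothing but 26 counters is stored, the input can be as long as the user wants without any chunked buffer.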

Reading file byte by byte with ifstream::get

I wrote this binary reader after a tutorial on the internet. (I'm trying to find the link...)
The code reads the file byte by byte and the first 4 bytes are together the magic word. (Let's say MAGI!) My code looks like this:
std::ifstream in(fileName, std::ios::in | std::ios::binary);
char *magic = new char[4];
while( !in.eof() ){
    // read the first 4 bytes
    for (int i=0; i<4; i++){
        in.get(magic[i]);
    }
    // compare it with the magic word "MAGI"
    if (strcmp(magic, "MAGI") != 0){
        std::cerr << "Something is wrong with the magic word: "
                  << magic << ", couldn't read the file further! "
                  << std::endl;
        exit(1);
    }
    // read the rest ...
}
Now here comes the problem, when I open my file, I get this error output:
Something is wrong with the magic word: MAGI?, couldn't read the file further!
So there is always one (mostly random) character after the word MAGI, like in this example the character ?!
I do think that it has something to do with how a string in C++ is stored and compared with each other. Am I right and how can I avoid this?
PS: this implementation is included in another program and works totally fine ... weird.
strcmp assumes that both strings are nul-terminated (end with a nul-character). When you want to compare strings which are not terminated, like in this case, you need to use strncmp and tell it how many characters to compare (4 in this case).
if (strncmp(magic, "MAGI", 4) != 0){
When you try to use strcmp to compare not null-terminated char arrays, it can't tell how long the arrays are (you can't tell the length of an array in C/C++ just by looking at the array itself - you need to know the length it was allocated with. The standard library is not exempt from this limitation). So it reads any data which happens to be stored in memory after the char array until it hits a 0-byte.
By the way: note the comment on your question by Lightness Races in Orbit. It is unrelated to the issue you are having now, but it hints at a different bug which might cause you problems later on.
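As a sketch of the length-limited comparison, assuming the four bytes are read in one go with read() instead of the question's get() loop (the function name has_magic is purely illustrative):

```cpp
#include <cstring>
#include <istream>
#include <sstream>

// Hypothetical helper: returns true if the next 4 bytes of the stream are
// exactly "MAGI". The buffer is never treated as a terminated string.
bool has_magic(std::istream& in) {
    char magic[4];
    if (!in.read(magic, sizeof magic)) {
        return false;  // fewer than 4 bytes were available
    }
    // Compare exactly 4 characters; no terminator is needed in magic.
    return std::strncmp(magic, "MAGI", 4) == 0;
}
```

If you want to print the bytes in an error message, write them with std::cerr.write(magic, 4) rather than streaming magic directly, since operator<< on a char pointer also expects a terminator.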

Why does scanf appear to skip input?

I am confused about scanf's behaviour in the following program. scanf appears to input once, and then not input again, until a stream of characters is printed.
Below is a C program:
#include <stdio.h>
int main()
{
    int i, j=0;
    do
    {
        ++j;
        scanf("%d", &i);
        printf("\n\n%d %d\n\n", i, j);
    }
    while((i!=8) && (j<10));
    printf("\nJ = %d\n", j);
    return 0;
}
As long as I enter integers, the program works perfectly fine. But when a character is entered, it keeps printing the last entered value of i and never stops for scanf to take the next input (until j reaches 10 and the loop exits).
output::
1 <-----1st input
1 1
2 <---- 2nd input
2 2
a <---- character input
2 3
2 4
2 5
2 6
2 7
2 8
2 9
2 10
J = 10
The same thing happens in C++:
#include <iostream>
using namespace std;
int main()
{
    int i, j=0;
    do
    {
        ++j;
        cin>>i;
        cout<<i<<" "<<j<<"\n";
    }
    while((i!=8) && (j<10));
    cout<<"\nj = "<<j<<"\n";
}
output of c++ program ::
1 <-----1st input
1 1
2 <-----2nd input
2 2
a <------ character input
0 3
0 4
0 5
0 6
0 7
0 8
0 9
0 10
j = 10
The only change in C++ is that 0 is printed instead of the last value.
I know the program expects integer values here, but I want to know what happens when a character is entered in place of an integer.
What is the reason for the behaviour above?
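When you enter a, cin >> i fails to read it, because i is an int and a is not part of a valid integer. The extraction sets the stream's failbit and leaves a in the stream, so every later cin >> i fails immediately without waiting for input.
Why i prints 0 is a different story. Since C++11, a failed extraction stores 0 in i; before C++11, i was left unchanged, so it would print whatever it held before (or an indeterminate value if it was never assigned). scanf behaves like the older rule: on a matching failure it leaves i untouched, which is why the C program keeps printing the last value.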
The proper way to write it is this:
do
{
    ++j;
    if (!(cin>>i))
    {
        //handle error, maybe you want to break the loop here?
    }
    cout<<i<<" "<<j<<"\n";
}
while((i!=8) && (j<10));
Or simply this (if you want to exit loop if error occurs):
int i = 0, j = 0;
while((i!=8) && (j<10) && ( cin >> i) )
{
    ++j;
    cout<<i<<" "<<j<<"\n";
}
If scanf sees a character in the input stream that doesn't match the conversion specifier, it stops the conversion and leaves the offending character in the input stream.
There are a couple of ways to deal with this. One is to read everything as text (using scanf with a %s or %[ conversion specifier or fgets) and then use atoi or strtol to do the conversion (my preferred method).
Alternately, you can check the return value of scanf; it will indicate the number of successful conversions. So, if scanf("%d", &i); equals 0, then you know there's a bad character in the input stream. You can consume it with getchar() and try again.
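A small sketch of the check-and-consume approach. It uses fscanf and fgetc on a FILE* instead of scanf and getchar so the same code also works on files; passing stdin gives the behaviour described above. The function name next_int is hypothetical:

```cpp
#include <cstdio>

// Hypothetical helper: reads the next int from `in`, dropping any character
// that cannot start a number. Returns false if input ends first.
bool next_int(std::FILE* in, int& value) {
    for (;;) {
        int matched = std::fscanf(in, "%d", &value);
        if (matched == 1) {
            return true;   // one successful conversion
        }
        if (matched == EOF) {
            return false;  // end of input or read error
        }
        // matched == 0: a bad character is still in the stream; consume it.
        if (std::fgetc(in) == EOF) {
            return false;
        }
    }
}
```

This silently skips bad characters; in an interactive program you would probably print a message before retrying.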
You can never expect your users to enter valid things. The best practice is to read the input into a string and try to convert it to integer. If the input is not an integer, you can give an error message to the user.
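One way to do the conversion step, sketched with strtol. The helper name parse_int is made up, and the policy of rejecting trailing non-whitespace (so "12abc" is an error) is my assumption rather than something the answer specifies:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: converts a whole line of text to an int.
// Returns false if the line holds no number, extra junk, or a value that
// does not fit in an int.
bool parse_int(const std::string& line, int& out) {
    const char* begin = line.c_str();
    char* end = nullptr;
    errno = 0;
    long value = std::strtol(begin, &end, 10);
    if (end == begin) {
        return false;  // no digits were consumed
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return false;  // out of range for int
    }
    // Allow trailing whitespace such as the newline kept by fgets.
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r') {
        ++end;
    }
    if (*end != '\0') {
        return false;  // something other than a number followed
    }
    out = static_cast<int>(value);
    return true;
}
```

You would read each line with std::getline (or fgets) first and then call parse_int on it, printing an error message whenever it returns false.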
The problem is that when you enter input that is not of the expected type (specified by %d for scanf, and by the int type for cin>>i;), the input stream is not advanced. Both operations then try to extract the same type of data from the exact same incorrect input, and fail in the same way each time, so you are never asked for another input.
To ensure this does not happen, you need to check the return value of both operations (read the manual for how each reports errors). If an error does happen (as when you enter a character), you need to clear the error, consume the invalid input and try again. In C++, I find it better to read a whole line using std::getline() rather than extracting an int or even a std::string directly when getting input from the user interactively, so you don't get into the "infinite" loop you experienced.
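The clear / consume / retry sequence could look roughly like this in C++. The function name read_int_retry is illustrative, and discarding the rest of the offending line (rather than a single character) is a choice I made for the sketch:

```cpp
#include <istream>
#include <limits>
#include <sstream>

// Hypothetical helper: keeps trying until an int is read from `in`.
// Returns false if the input ends (or the stream breaks) before that.
bool read_int_retry(std::istream& in, int& value) {
    while (!(in >> value)) {
        if (in.eof() || in.bad()) {
            return false;  // nothing more to read
        }
        in.clear();  // reset failbit so the stream is usable again
        // Throw away the rest of the bad line, including its newline.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return true;
}
```

With std::cin as the stream, typing a and then 8 would reject the first line and accept the second instead of spinning through the loop.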
You are ignoring the return value. See what the manual says about scanf(3):
RETURN VALUE
These functions return the number of input items successfully matched and assigned, which can be fewer than provided for, or even zero in the event of an early matching failure.
It fails matching an integer.
You could check the return value of scanf to determine whether an integer has been parsed correctly (it should return 1). On failure, you have choices: either notify the user of the error and terminate, or recover by reading the next token with a scanf("%s" ...), perhaps with a warning.
For scanf, you need to check its return value to see if the conversion on the input worked. scanf will return the number of elements successfully scanned. If the conversion did not work, it will leave the input alone, and you can try to scan it differently, or just report an error. For example:
if (scanf("%d", &i) != 1) {
    char buf[512];
    fgets(buf, sizeof(buf), stdin);
    printf("error in line %d: got %s", j, buf);
    return 0;
}
In your program, since the input is left alone, your loop repeats trying to read the same input.
In C++, you check for failure using the fail method, but the input stream failure state is sticky. So it won't let you scan further without clearing the error state.
std::cin >> i;
if (std::cin.fail()) {
    std::string buf;
    std::cin.clear();
    std::getline(std::cin, buf);
    std::cout
        << "error in line " << j
        << ": got " << buf
        << std::endl;
    return 0;
}
In your program, since you never clear the failure state, the loop repeats using cin in a failure state, so it just reports failure without doing anything.
In both cases, you might find it easier or more reliable to work with the input if you would read in the input line first, and then attempt to parse the input line. In pseudocode:
while read_a_line succeeds
    parse_a_line
In C, the catch to reading a line is that if it is longer than your buffer, you will need to check for that and concatenate multiple fgets call results together to form the line. And, to parse a line, you can use sscanf, which is similar to scanf but works on strings.
if (sscanf(buf, "%d", &i) != 1) {
    printf("error in line %d: got %s", j, buf);
    return 0;
}
In C++, for quick low-level parsing of formatted input, I also prefer sscanf. But if you want to use the stream approach, you can convert the string buffer into an istringstream to scan the input.
std::getline(std::cin, buf);
if (std::cin.fail()) {
    break;
}
std::istringstream buf_in(buf);
buf_in >> i;
if (buf_in.fail()) {
    std::cout << "error in line " << j
              << ": got " << buf
              << std::endl;
    return 0;
}

Weird problem with string function

I'm having a weird problem with the following function, which returns a string with all the characters in it after a certain point:
string after(int after, string word) {
    char temp[word.size() - after];
    cout << word.size() - after << endl; //output here is as expected
    for(int a = 0; a < (word.size() - after); a++) {
        cout << word[a + after]; //and so is this
        temp[a] = word[a + after];
        cout << temp[a]; //and this
    }
    cout << endl << temp << endl; //but output here does not always match what I want
    string returnString = temp;
    return returnString;
}
The thing is, when the returned string is 7 chars or less, it works just as expected. When the returned string is 8 chars or more, then it starts spewing nonsense at the end of the expected output. For example, the lines
cout << after(1, "12345678") << endl;
cout << after(1, "123456789") << endl;
gives an output of:
7
22334455667788
2345678
2345678
8
2233445566778899
23456789�,�D~
23456789�,�D~
What can I do to fix this error, and are there any default C++ functions that can do this for me?
Use the std::string::substr library function.
std::string s = "12345678";
std::cout << s.substr(1) << '\n'; // => 2345678
s = "123456789";
std::cout << s.substr(1) << '\n'; // => 23456789
The behavior you're describing would be expected if you copy the characters into the string but forget to tack a null character at the end to terminate the string. Try adding a null character to the end after the loop, and make sure you allocate enough space (one more character) for the null character. Or, better, use the string constructor overload which accepts not just a char * but also a length.
Or, even better, use std::string::substr -- it will be easier and probably more efficient.
string after(int after, string word) {
    return word.substr(after);
}
BTW, you don't need an after method, since exactly what you want already exists on the string class.
Now, to answer your specific question about why this only showed up on the 8th and later characters, it's important to understand how "C" strings work. A "C" string is a sequence of bytes which is terminated by a null (0) character. Library functions (like the string constructor you use to copy temp into a string instance which takes a char *) will start reading from the first character (temp[0]) and will keep reading until the end, where "the end" is the first null character, not the size of the memory allocation. For example, if temp is 6 characters long but you fill up all 6 characters, then a library function reading that string to "the end" will read the first 6 characters and then keep going (past the end of the allocated memory!) until it finds a null character or the program crashes (e.g. due to trying to access an invalid memory location).
Sometimes you may get lucky: if temp was 6 characters long and the first byte in memory after the end of your allocation happened to be a zero, then everything would work fine. If, however, that byte happened to be non-zero, you'd see garbage characters. The contents are not truly random (often the same bytes will be there every time, since they're filled by operations like previous function calls which are consistent from run to run of your program), but if you're accessing uninitialized memory there's no way of knowing what you'll find there. In a bounds-checked environment (e.g. Java, C#, or std::string::at in C++), an attempt to read beyond the bounds of an allocation will throw an exception. But "C" strings don't know where their end is, leaving them vulnerable to problems like the one you saw, or more nefarious problems like buffer overflows.
Finally, a logical follow-up question you'd probably ask: why exactly 8 bytes? Since you're trying to access memory that you didn't allocate and didn't initialize, what's in that RAM is whatever the previous user of that RAM left there. On 32-bit and 64-bit machines, memory is generally allocated in 4- or 8-byte chunks. So it's likely that the previous user of that memory location stored 8 bytes of zeros there (e.g. one 64-bit integer zero). But the next location in memory had something different left there by the previous user. Hence your garbage characters.
Moral of the story: when using "C" strings, be very careful about your null terminators and buffer lengths!
Your string temp is not null-terminated. You require temp[a] = '\0'; at the end of the loop. You also need to allocate word.size() - after + 1 chars so as to accommodate the null character.
You're not null-terminating your char array. C-style strings (i.e., char arrays) need to have a null character (i.e., '\0') at the end so functions using them know when to stop.
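To make the two fixes concrete, here is a sketch with made-up function names (after_terminated and after_counted). The first keeps the copy loop but reserves room for and writes the terminator; the second uses the std::string constructor that takes a pointer and a length, so no terminator is needed. Returning an empty string when after is past the end is my assumption:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical fix 1: copy the tail into a buffer with one extra slot and
// null-terminate it before building the string.
std::string after_terminated(std::size_t after, const std::string& word) {
    if (after >= word.size()) {
        return std::string();
    }
    const std::size_t len = word.size() - after;
    std::vector<char> temp(len + 1);  // +1 for the '\0'
    for (std::size_t a = 0; a < len; ++a) {
        temp[a] = word[a + after];
    }
    temp[len] = '\0';
    return std::string(temp.data());
}

// Hypothetical fix 2: tell the constructor how many characters to take.
std::string after_counted(std::size_t after, const std::string& word) {
    if (after >= word.size()) {
        return std::string();
    }
    return std::string(word.data() + after, word.size() - after);
}
```

A std::vector is used instead of the question's char temp[...] because variable-length arrays are a compiler extension in C++, not part of the standard.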
I think this is basically your after() function, modulo some fudging of indexes:
string after(int after, string word) {
    return word.substr(after);
}