Code adding \n and 1, and I don't know why - c++

This is a bit complicated, but basically I'm making a program and one of my functions is acting strangely. The function is fed an array of characters; the first time, it is:
new_sensor_node SN42 42 3.57 5.0 7.
Right now the function just prints each individual "token" (each run of characters separated by spaces), then a space, then the number of characters in the token. But for some reason the last token is always printed strangely, and one extra character is counted.
Here's the function:
int parseCommandLine(char cline[], char *tklist[]){
int i;
int length;
int count = 0; //counts number of tokens
int toklength = 0; //counts the length of each token
length = strlen(cline);
for (i=0; i < length; i++) { //go to first character of each token
if (((cline[i] != ' ' && cline[i-1]==' ') || i == 0)&& cline[i]!= '"') {
while ((cline[i]!=' ')&& (cline[i] != '\0')){
toklength++;
cout << cline[i];
i++;
}
cout << " " << toklength << "\n\n";
cout << "\n";
toklength = 0;
count ++;
}
if (cline[i] == '"') {
do {
i++;
} while (cline[i]!='"');
count++;
}
}
//cout << count << "\n";
return 0;
}
And here's the output (for that first array):
new_sensor_node 15
SN42 4
42 2
3.57 4
5.0 3
7.
3
Any thoughts on what could be causing this? I suspect it might have to do with how I'm dealing with the null character

It's very likely that the input string actually contains a newline at the end. Depending on how you read the input, the newline may or may not be stored. For example, the fgets function reads the newline and leaves it in the buffer.
Since you don't tokenize or modify the input string but just print it character by character, the newline is printed as part of the last token and counted in its length, which matches your output.
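A minimal sketch of removing the newline before parsing, assuming the line was read with fgets into a char buffer. The helper name stripTrailingNewline is hypothetical and not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: removes any trailing '\n' or '\r' characters
// from a C string in place and returns the new length.
std::size_t stripTrailingNewline(char* line) {
    assert(line != nullptr);
    std::size_t len = std::strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r')) {
        line[--len] = '\0';  // overwrite the line ending with a terminator
    }
    return len;
}
```

Calling this on the buffer right after fgets, and before parseCommandLine, should make the last token print and count like the others.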

Related

How to solve this problem trying to iterate a string?

I'm trying to invert the case of some strings, and I did it, but I have some extra characters in my return, is it a memory problem? Or because of the length?
char* invertirCase(char* str){
int size = 0;
char* iterator = str;
while (*iterator != '\0') {
size++;
iterator++;
}
char* retorno = new char[size];
for (int i = 0; i < size; i++) {
//if capital letter:
if (str[i] < 96 && str[i] > 64) {
retorno[i] = str[i] + 32;
}
// if lower case:
else if (str[i] > 96 && str[i] < 123) {
retorno[i] = str[i] - 32;
}
//if its not a letter
else {
retorno[i] = str[i];
}
}
return retorno;
}
For example, if I try to use this function with the value "Write in C" it should return "wRITE IN c", but instead it returns "wRITE IN cýýýýÝݱ7ŽÓÝ" and I don't understand where those extra characters are coming from.
PS: I know I could use a length function, but this is from school, so I can't do that in this case.
Add 1 to the size of the char array:
char* retorno = new char[size+1];
Then write a null terminator at the end before returning retorno:
retorno[size] = '\0';
Your output string is not null-terminated
When you iterate through the input string, you increment size until you reach null. That means the null is not copied to the output string. After you exit the loop, you should increment size once more to capture the end.
As an aside, it's probably a good idea to constrain size to some maximum (while(*iterator != '\0' && size < MAXSIZE)) in case someone passes a non-terminated string into your function. If you hit the max size condition, you'd need to explicitly add the null at the end of your output.
Your string should be null terminated, which is what you are looking for when you compute the initial size of the string. When you create the new string, you should allocate size+1 chars of space, then set retorno[size] to a null terminating character (i.e. '\0'). When you print a char* using printf or cout (or similar mechanisms), it keeps printing characters until it finds the null terminating character, which is why you are getting the garbage values after your expected output.
On another note, C++ has helpful functions like std::islower / std::isupper and std::tolower / std::toupper.
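Putting these answers together, a corrected version might look like the sketch below. It keeps the manual length count from the question, allocates size+1 characters, and writes the terminator. Taking const char* instead of char* is my assumption (it lets string literals be passed), and the use of the <cctype> functions follows the suggestion above rather than the original code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Sketch of a corrected invertirCase. The caller owns the returned
// buffer and must release it with delete[].
char* invertirCase(const char* str) {
    assert(str != nullptr);

    // Count characters up to, but not including, the terminator.
    std::size_t size = 0;
    while (str[size] != '\0') {
        ++size;
    }

    char* retorno = new char[size + 1];  // +1 leaves room for '\0'
    for (std::size_t i = 0; i < size; ++i) {
        // The <cctype> functions expect a value representable as unsigned char.
        unsigned char c = static_cast<unsigned char>(str[i]);
        if (std::isupper(c)) {
            retorno[i] = static_cast<char>(std::tolower(c));
        } else if (std::islower(c)) {
            retorno[i] = static_cast<char>(std::toupper(c));
        } else {
            retorno[i] = str[i];
        }
    }
    retorno[size] = '\0';  // terminate the result so printing stops here
    return retorno;
}
```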
From what I can tell, there could be 2 things going on here:
Like everyone here mentioned, the absence of a null terminating character ('\0') at the end of your char array could be causing this.
It could be the way you are printing results of your retorno character string outside of your invertirCase() function.
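I tested out your function in C++14, C++17 and C++20 and it returned the correct result each time, both with the null terminating character at the end of the retorno char array and without it. Without the terminator the behaviour is undefined, though, so a clean result in that case is luck rather than proof that the code is correct.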
Try printing your result inside of your function before returning it, to identify if this is being caused inside of your function or outside of it. Like so:
char* invertirCase(char* str){
// [... truncated code here]
for (int i = 0; i < size; i++) {
// [... truncated code here]
}
cout << " **** TESTING INSIDE FUNCTION ****" << endl;
cout << "-- Testing index iteration" << endl;
for (int i = 0; i < size; i++) {
cout << retorno[i];
}
cout << endl;
cout << "-- Testing iterator iteration" << endl;
for (char* iterator = retorno; *iterator != '\0'; iterator++) {
cout << *iterator;
}
cout << endl;
cout << "-- Testing advanced for loop" << endl;
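// a range-based for cannot iterate over a raw char*, so wrap the first size characters in a std::string (needs <string>)
for (char character : std::string(retorno, size)) {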
cout << character;
}
cout << " **** END TESTING ****" << endl;
cout << endl;
return retorno;
}
This way you could possibly identify both if the problem occurs inside of your function or if the problem is occurring because of the way you may be printing your result as well.

Checking for characters in string in a do...while

do
{
e=0;
cout << "Enter input of base " << base << ": ";
cin >> input;
for ( i=0; i<input.length(); i++)
{
if (input[i]=='A')
value=10;
else if (input[i]=='B')
value=11;
else if (input[i]=='C')
value=12;
else if (input[i]=='D')
value=13;
else if (input[i]=='E')
value=14;
else if (input[i]=='F')
value=15;
else
value=input[i];
if(value>=base)
{
cout << "Invalid input data for your input base!!!" << endl << endl;
e=1;
}
}
}while (e==1);
Whenever the user keys in, say, 101101 with base 2, it outputs "Invalid" 6 times. What's the error?
I tried to use npos, find(), but they didn't work!
Here:
value=input[i];
You seem to be assuming that the value of the character '0' is 0 and the value of character '1' is 1 (and similarly for other digits). Your assumption is wrong for the character encoding that your system uses. In fact, '0' cannot possibly be represented by the value 0, because that is reserved for the null terminator character that designates the end of a character string.
Thanks, but what should I do next?
Subtracting value of a character from another gives you the distance between the representations of those characters (subtracting character from itself gives you 0). Number digit characters are guaranteed to be sequential (0 is immediately before 1 is immediately before 2 ...). Given these axioms, it's easy to prove that subtracting the value of '0' from a character gives you the value that you're looking for.
Replace:
else
value=input[i];
With:
else
value=input[i] - '0';
Because in ASCII the digit characters start at 48, so '1' is 49, not 1:
'0' = 0x30 = 48
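The same idea can be sketched as a pair of small functions, one mapping a character to its digit value and one checking a whole string against a base. Both names (digitValue, isValidForBase) are hypothetical, and the letter branch assumes 'A' through 'F' are contiguous, which holds in ASCII.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: numeric value of one digit character,
// or -1 if the character is not in 0-9 or A-F.
int digitValue(char c) {
    if (c >= '0' && c <= '9') {
        return c - '0';        // digit characters are guaranteed sequential
    }
    if (c >= 'A' && c <= 'F') {
        return c - 'A' + 10;   // assumes contiguous letters, as in ASCII
    }
    return -1;
}

// Returns true if every character of input is a valid digit for base.
bool isValidForBase(const std::string& input, int base) {
    assert(base >= 2 && base <= 16);
    if (input.empty()) {
        return false;
    }
    for (char c : input) {
        int value = digitValue(c);
        if (value < 0 || value >= base) {
            return false;      // stop at the first bad character
        }
    }
    return true;
}
```

Returning at the first invalid character also means the error message can be printed once per input rather than once per character.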

Reading white spaces crashes Parser. Why?

I'm ultimately trying to code a shell, so I need to be able to parse commands. I'm trying to convert every word and special symbol into tokens, while ignoring whitespaces. It works when characters separating tokens are | & < > however as soon as I use a single whitespace character, the program crashes. Why is that?
I'm a student, and I realize the way I came up with separating the tokens is rather crude. My apologies.
#include <iostream>
#include <stdio.h>
#include <string>
#include <cctype>
using namespace std;
#define MAX_TOKENS 10
int main()
{
//input command for shell
string str;
string tokens[MAX_TOKENS] = { "0", "0", "0", "0", "0", "0", "0", "0", "0", "0" };
int token_index = 0;
int start_token = 0;
cout << "Welcome to the Shell: Please enter valid command: " << endl << endl;
cin >> str;
for (unsigned int index = 0; index < str.length(); index++)
{
//if were at end of the string, store the last token
if (index == (str.length() - 1)) tokens[token_index++] = str.substr(start_token, index - start_token + 1);
//if char is a whitespace store the token
else if (isspace(str.at(index)) && (index - start_token > 0))
{
tokens[token_index++] = str.substr(start_token, index - start_token);
start_token = index + 1;
}
//if next char is a special char - store the existing token, and save the special char
else if (str[index] == '|' || str[index] == '<' || str[index] == '>' || str[index] == '&')
{
//stores the token before our special character
if ((index - start_token != 0)) //this if stops special character from storing twice
{
//stores word before Special character
tokens[token_index++] = str.substr(start_token, index - start_token);
}
//stores the current special character
tokens[token_index++] = str[index];
if (isspace(str.at(index + 1))) start_token = index + 2;
else start_token = index + 1;
}
}
cout << endl << "Your tokens are: " << endl;
for (int i = 0; i < token_index; i++)
{
cout << i << " = " << tokens[i] << endl;
}
return 0;
}
A few things:
Check that token_index is less than MAX_TOKENS before using it again after each increment, otherwise you have a buffer overflow. If you change tokens to be a std::vector then you can use the at() syntax as a safety net for that.
The expression index - start_token has type unsigned int so it can never be less than 0. Instead you should be doing index > start_token as your test.
str.at(index) throws an exception if index is out of range. However you never catch the exception; depending on your compiler, this may just look like the program crashing. Wrap main()'s code in a try...catch(std::exception &) block.
Finally, this is a long shot but I will mention it for completeness. isspace and the other is... functions require an argument that is either EOF or representable as unsigned char. They were designed so that an implementation could use an array lookup, so passing a plain char with a negative value causes undefined behaviour. This requirement dates from C89 and still applies in current C and C++. To eliminate this as a possibility from your code, use isspace( (unsigned char)str.at(index) ), or even better, use the C++ locale interface.
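As a rough illustration of these points, here is a sketch of a tokenizer that stores tokens in a std::vector, avoids unsigned index arithmetic altogether, and casts to unsigned char before calling isspace. The function name tokenize is hypothetical, and this is a restructuring of the question's approach rather than a patch to it.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical sketch: splits a command line into words and the
// special characters | < > &, ignoring whitespace.
std::vector<std::string> tokenize(const std::string& str) {
    std::vector<std::string> tokens;  // grows as needed, no MAX_TOKENS limit
    std::string current;              // word being accumulated

    for (char ch : str) {
        unsigned char uc = static_cast<unsigned char>(ch);
        if (std::isspace(uc)) {
            // Whitespace ends the current word, if there is one.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
        } else if (ch == '|' || ch == '<' || ch == '>' || ch == '&') {
            // A special character ends the current word and is a token itself.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
            tokens.push_back(std::string(1, ch));
        } else {
            current += ch;
        }
    }
    if (!current.empty()) {
        tokens.push_back(current);    // store the final word
    }
    return tokens;
}
```

Note that cin >> str stops at the first whitespace character, so the whole command would need to be read with std::getline(std::cin, str) for the whitespace handling to matter.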

While loop not seeing or finding terminating null character

I am trying to iterate through a char array with a while loop, using '\0' as the terminating condition, but my problem is that it's not finding the '\0' until index position 481. The array is declared as 200 long and I can't see what I am doing wrong! Before anyone asks, I cannot use strings or any form of string functions for this. Can anyone help?
#include <iostream>
using namespace std;
int main()
{
char fullString[200]={'\0'}; // Declare char string of 200, containing null characters
int alphaCount = 0;
int charCount = 0;
int wordCount = 0;
cin.getline(fullString,200); //
cout << "\n\n" << fullString;
cout << "\n\n\n";
int i=0;
int i2 = 0;
while(fullString[i]!='\0'){ //iterate through array until NULL character is found
cout << "\n\nIndex pos : " << fullString[i]; //Output char at 'i' position
while(fullString[i2]!= ' '){ //while 'i' is not equal to SPACE, iterate4 through array
if(isalpha(fullString[i2])){
alphaCount++; // count if alpha character at 'i'
}
charCount++; // count all chars at 'i'
i2++;
}
if(charCount == alphaCount){ // if charCount and alphaCount are equal, word is valid
wordCount++;
}
charCount = 0; // zero charCount and alphaCount
alphaCount = 0;
i=i2;// Assign the position of 'i2' to 'i'
while(fullString[i] == 32){ //if spaces are present, iterate past them
i++;
cout << "\n\ntest1";
}
i2 = i; // assign value of 'i' to 'i2' which is the next position of a character in the array
if(fullString[i] == '\0')
{
cout << "\n\nNull Character " << endl;
cout << "found at pos: " << i << endl;
}
}
cout << "\n\ni" << i;
cout << "\n\nWord" << wordCount;
return 0;
}
As others have pointed out, your problem is with the inner loop. You test for a space character but not for the null character, so it iterates past the end of the last word because there is no space character after the last word.
This is easily fixed by changing your while condition from this:
while(fullString[i2]!= ' ')
... to this:
while(fullString[i2] && fullString[i2]!= ' ')
This will change your inner while loop to first test for non-NULL, and then test for non-space.
I'm not correcting the rest of your code because I presume this is a class project (it looks like one) so I'm limiting my answer to the scope of your question.
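For readers who do want to see the whole loop with the terminator check in place, here is one possible sketch. The function name countAlphaWords is hypothetical, and it uses only isalpha, as the question's code does.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper: counts space-separated words that consist
// only of alphabetic characters.
int countAlphaWords(const char* text) {
    assert(text != nullptr);
    int wordCount = 0;
    int i = 0;
    while (text[i] != '\0') {
        // Skip any run of spaces before the next word.
        while (text[i] == ' ') {
            ++i;
        }
        if (text[i] == '\0') {
            break;  // only trailing spaces were left
        }
        // Scan one word; stop at a space OR at the terminator.
        bool allAlpha = true;
        while (text[i] != '\0' && text[i] != ' ') {
            if (!std::isalpha(static_cast<unsigned char>(text[i]))) {
                allAlpha = false;
            }
            ++i;
        }
        if (allAlpha) {
            ++wordCount;
        }
    }
    return wordCount;
}
```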
You do not check in the inner loop
while(fullString[i2]!= ' '){ //while 'i' is not equal to SPACE, iterate4 through array
if(isalpha(fullString[i2])){
alphaCount++; // count if alpha character at 'i'
}
charCount++; // count all chars at 'i'
i2++;
}
...
i=i2;// Assign the position of 'i2' to 'i'
whether the next character is equal to '\0'
It's because the inner loops don't check for the termination, they just continue looping even past the end of the string.
By the way, if you want to count the number of words, spaces and non-space characters, there are easier ways in C++. See e.g. std::count and std::count_if for the spaces and characters. For example:
std::string input = "Some string\twith multiple\nspaces in it.";
int num_spaces = std::count_if(std::begin(input), std::end(input),
[](unsigned char ch){ return std::isspace(ch) != 0; });
For counting words, you can use std::istringstream, std::vector, std::copy, std::istream_iterator and std::back_inserter:
std::istringstream iss(input);
std::vector<std::string> words;
std::copy(std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>(),
std::back_inserter(words));
After the code above, the size of the words vector is the number of words.
If you use e.g. std::copy_if then you can use the above code for the other cases as well (but std::count_if is better for single character classes).
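The two fragments above can be wrapped into self-contained functions with the headers they need. This is only a sketch; the names countWhitespace and countWords are hypothetical, and std::distance is used here in place of copying into a vector when only the count is wanted.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Counts whitespace characters (spaces, tabs, newlines, ...).
std::size_t countWhitespace(const std::string& input) {
    return static_cast<std::size_t>(std::count_if(
        input.begin(), input.end(),
        [](unsigned char ch) { return std::isspace(ch) != 0; }));
}

// Counts whitespace-separated words by walking the stream iterators.
std::size_t countWords(const std::string& input) {
    std::istringstream iss(input);
    return static_cast<std::size_t>(std::distance(
        std::istream_iterator<std::string>(iss),
        std::istream_iterator<std::string>()));
}
```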

Not reading a string properly

I am practising user input handling. My goal is to have the user enter a line of integers separated by spaces (" "), read them as integers, store them and work on them later. I stumbled upon an interesting problem (at least in my opinion): the way I am doing it, it never reads the last number entered by the user. I will post the entire program here (since there are some extra libraries that are included).
I have left some comments in the program
#include <iostream>
#include <string>
#include <vector>
#include <stdlib.h>
using namespace std;
int main()
{
//this vector will store the integers
vector<int> a;
// this will store the user input
string inp;
getline(cin, inp);
// this string will temporarily store the digits
string tmp;
//be sure that the reading part is okay
cout << inp << endl;
//until you meet something different than a digit, read char by char and add to string
for(int i = 0; i < inp.length(); i++)
{
if(isdigit(inp[i]))
{
tmp +=inp[i];
}
else
{
// when it is not a character, turn to integer, empty string
int value = atoi(tmp.c_str());
a.push_back(value);
tmp = "";
}
}
// paste the entire vector of integers
for(int i = 0; i < a.size(); i++)
{
cout << a[i] << endl;
}
return 0;
}
Replace this line
for(int i = 0; i <inp.length(); i++)
by
for(int i = 0; i <= inp.length(); i++)
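The problem with your code: with the input 25 30 46, inp.length() is 8, so the loop ends after i = 7, at which point tmp holds "46". The else branch never runs again, so 46 is never pushed into the vector. With i <= inp.length(), the loop runs once more with i = 8; inp[8] is the terminating '\0', which is not a digit, so the else branch pushes the last number.
Please note: since C++11, reading inp[inp.length()] is well defined and yields '\0'. With older standard libraries it is undefined behaviour on a non-const string, and some implementations (such as Microsoft Visual C++) may report an assertion error: string subscript out of range.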
If the very end of the line is a digit, you don't hit the else on the last iteration, and that last number never gets pushed into the vector.
The simplest solution would be to replicate the non-digit logic after the loop:
if (!tmp.empty()) // If tmp has content, we need to put it in the vector.
{
int value = atoi(tmp.c_str());
a.push_back(value);
tmp = "";
}
Although I'm sure you can think of a nicer way of structuring it.
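One possible way to structure it, sketched here with a small lambda so the conversion logic is written only once. The function name extractInts and the lambda name flush are hypothetical. Unlike the original loop, this version skips empty runs, so two separators in a row do not push a 0.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: collects every run of digits in inp as an int.
std::vector<int> extractInts(const std::string& inp) {
    std::vector<int> values;
    std::string tmp;  // digits of the number currently being read

    // Converts and stores the pending digits, if any.
    auto flush = [&]() {
        if (!tmp.empty()) {
            values.push_back(std::atoi(tmp.c_str()));
            tmp.clear();
        }
    };

    for (char ch : inp) {
        if (std::isdigit(static_cast<unsigned char>(ch))) {
            tmp += ch;
        } else {
            flush();  // a non-digit ends the current number
        }
    }
    flush();          // the line may end with a digit
    return values;
}
```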
Here's a version I came up with using std::stringstream, that also avoids atoi:
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
std::vector<int> ints;
std::string line;
std::getline (std::cin, line);
std::cout << "Read \"" << line << "\"\n";
std::stringstream ss(line);
int remaining = line.size();
while (remaining)
{
if(std::isdigit(ss.peek())) // Read straight into an int
{
int tmp;
ss >> tmp;
ints.push_back(tmp);
}
else
{
ss.get(); // Eat useless characters
}
remaining = line.size()-ss.tellg();
}
for (auto i : ints)
std::cout << i << '\n';
return 0;
}
Running:
$ ./a.out <<< "12 34 56"
Read "12 34 56"
12
34
56
Note, this is specifically made to work with any old gibberish between the numbers:
$ ./a.out <<< "12-abc34-56"
Read "12-abc34-56"
12
34
56
If there will only be whitespace, this is even easier, as reading ints from a stringstream will ignore that automatically. In which case you just need:
int tmp;
while (ss >> tmp)
{
ints.push_back(tmp);
}
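As a self-contained sketch of that whitespace-only case (the function name parseInts is hypothetical):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated integers from a line.
// Extraction stops at the first token that is not an integer.
std::vector<int> parseInts(const std::string& line) {
    std::istringstream ss(line);
    std::vector<int> ints;
    int tmp;
    while (ss >> tmp) {   // operator>> skips leading whitespace itself
        ints.push_back(tmp);
    }
    return ints;
}
```

Because the loop ends on the first failed extraction, input such as "12 x 34" yields only 12; the peek-based version above is the one to use when other characters may appear between the numbers.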
Your program needs a string that ends with a non-digit character to work correctly. Try the string "1 12 14587 15 " (with a trailing space). In your algorithm, when the last space is missing, the program stores the number in the tmp string but never saves it into the vector. To correct that, add a final push_back just after your first loop.
You update a with a new value only when a non-digit is found. Thus, if the string ends with digits, tmp will contain them but you will never reach the else branch that performs the push_back. You can fix this by adding the following code after the for loop:
if(!tmp.empty()){
// when it is not a character, turn to integer, empty string
int value = atoi(tmp.c_str());
a.push_back(value);
tmp = "";
}
Before starting the loop, add a space to the string to be sure to push the last number: inp.push_back(' ')
Your loop finishes right after the last digit is read, so the last number is never converted to an integer. Just add some code after the original for loop.
for(int i = 0; i < inp.length(); i++)
{
/* ...... */
}
// add this to read the last digit
if(tmp.length() > 0){
int value = atoi(tmp.c_str());
a.push_back(value);
tmp = "";
}
You never push back your last value. For instance, consider this input
40 36
Then while you are reading, you push back at the first space. But you never push 36 since there are no more characters.
After the end of your for() loop you can try this:
if(!tmp.empty()) {
a.push_back(atoi(tmp.c_str())); // a holds ints, so convert tmp before pushing
}
When the last digit of the last number is stored in tmp, after that the loop ends because you have read the last character of the entire string. When the loop ends tmp still contains the last number.
1) You can convert and add the last number to vector after the loop. The last number still available in tmp.
2) Or you can explicitly add non-digit character in the end of the string before the loop.
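You omit the last number of the input. Change your code to reflect this (note that this version converts and stores each digit as soon as it is read, so it only suits single-digit numbers):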
//this vector will store the integers
vector<int> a;
// this will store the user input
string inp;
getline(cin, inp);
// this string will temporarily store the digits
string tmp;
//be sure that the reading part is okay
cout << inp << endl;
//until you meet something different than a digit, read char by char and add to string
for(int i = 0; i < inp.length(); i++)
{
if(isdigit(inp[i]))
{
tmp =inp[i];
int value = atoi(tmp.c_str());
a.push_back(value);
}
else
{
tmp = "";
}
}
// paste the entire vector of integers
for(int i = 0; i < a.size(); i++)
{
cout << a[i] << endl;
}
return 0;
or replace in loop:
for(int i = 0; i <inp.length(); i++)
by
for(int i = 0; i <= inp.length(); i++)