Am I incorrectly using atoi? - c++

I was having some trouble with my parsing function, so I added some cout statements to show the values of certain variables at runtime, and I believe that atoi is converting characters incorrectly.
Here's a short snippet of my code that's acting strangely:
c = data_file.get();
if (data_index == 50)
cout << "50 digit 0 = '" << c << "' number = " << atoi(&c) << endl;
the output for this statement is:
50 digit 0 = '5' number = 52
I'm calling this code within a loop, and what's strange is that it correctly converts the first 47 characters. Then on the 48th character it adds a 0 after the integer, on the 49th it adds a 1, on the 50th (seen here) it adds a 2, and so on up to the 57th character, where it adds a 9. After that it converts correctly again all the way to the 239th character.
Is this strange or what?
Just to clarify a little more, I'll post the whole function. This function gets passed a pointer to an empty double array (ping_data):
int parse_ping_data(double* ping_data)
{
ifstream data_file(DATA_FILE);
int pulled_digits [4];
int add_data;
int loop_count;
int data_index = 0;
for (char c = data_file.get(); !data_file.eof(); c = data_file.get())
{
if (c == 't' && data_file.get() == 'i' && data_file.get() == 'm' && data_file.get() == 'e' && data_file.get() == '=')
{
loop_count = 0;
c = data_file.get();
if (data_index == 50)
cout << "50 digit 0 = '" << c << "' number = " << atoi(&c) << endl;
pulled_digits[loop_count] = atoi(&c);
while ((c = data_file.get()) != 'm')
{
loop_count++;
if (data_index == 50)
cout << "50 digit " << loop_count << " = '" << c << "' number = " << atoi(&c) << endl;
pulled_digits[loop_count] = atoi(&c);
}
add_data = 0;
for (int i = 0; i <= loop_count; i++)
add_data += pulled_digits[loop_count - i] * (int)pow(10.0,i);
if (data_index == 50)
cout << "50 index = " << add_data << endl;
ping_data[data_index] = add_data;
data_index++;
if (data_index >= MAX_PING_DATA)
{
cout << "Error parsing data. Exceeded maximum allocated memory for ping data." << endl;
return MAX_PING_DATA;
}
}
}
data_file.close();
return data_index;
}

atoi takes a string, i.e. a null-terminated array of chars, not a pointer to a single char, so this is incorrect and will give you unpredictable results:
char c;
//...
/* ... */ atoi(&c) /* ... */
Also, atoi doesn't provide any way to detect errors, so prefer strtol and similar functions.
E.g.
char *endptr;
char c[2] = {0}; // initialize c to all zeros
c[0] = data_file.get(); // c[1] stays as the null terminator
long l = strtol(c, &endptr, 10);
if (endptr == c)
{
    // an error occurred: no digits were converted
}
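One plausible reading of the 52 in the question, offered only as an assumption because reading past a single char is undefined behavior: atoi kept reading whatever byte happened to follow c in memory, and a byte with the value 50 is the ASCII code for '2'. The sketch below reproduces the same result in a well-defined way by placing both bytes in one properly terminated array. The helper name is made up for illustration, and ASCII is assumed.

```cpp
#include <cstdlib>

// Hypothetical helper: builds a null-terminated buffer whose second byte is a
// raw integer value (not necessarily a digit character) and parses it.
int parse_digit_followed_by_byte(char digit, int next_byte)
{
    char buffer[3] = { digit, static_cast<char>(next_byte), '\0' };
    return std::atoi(buffer);
}
```

Under ASCII, parse_digit_followed_by_byte('5', 50) yields 52, while a following byte of 0 yields 5. That matches the pattern described in the question, where extra digits appeared only while the counter was between 48 and 57.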

atoi expects a null-terminated string as an input. What you are supplying is not a null-terminated string.
Having said that, it is always worth adding that it is very difficult (if possible at all) to use atoi properly. atoi offers no error control and no overflow control. The only proper way to convert a string representation to a number in the C standard library is to use the functions from the strto... group.
Actually, if you need to convert just a single digit character, using atoi or any other string conversion function is overkill. As has already been suggested, all you need is to subtract the value of '0' from your digit character to get the corresponding numerical value. The language specification guarantees that this is a portable solution.
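A minimal sketch of the subtraction approach, assuming the digits arrive as text such as "52ms". Both function names are invented for this example, and overflow is deliberately not handled.

```cpp
#include <string>

// Returns the numeric value of a decimal digit character, or -1 if c is not
// a digit. The standard guarantees '0'..'9' are contiguous and increasing.
int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

// Accumulates the leading digits of text into an integer and stops at the
// first non-digit (for example the 'm' in "52ms"). Overflow is not checked.
int parse_leading_digits(const std::string& text)
{
    int value = 0;
    for (char c : text) {
        int d = digit_value(c);
        if (d < 0)
            break;
        value = value * 10 + d;
    }
    return value;
}
```

Accumulating with value * 10 + d also removes the need for the pulled_digits array and the pow call in the original function.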

Never mind, it was simply that I needed to convert the character into a string terminated by \0. I changed it to this code:
char buffer [2];
buffer[1] = '\0';
buffer[0] = data_file.get();
if (data_index == 50)
cout << "50 digit 0 = '" << buffer[0] << "' number = " << atoi(buffer) << endl;
and it worked.

Related

How to solve this problem trying to iterate a string?

I'm trying to invert the case of some strings, and I did it, but I have some extra characters in my return, is it a memory problem? Or because of the length?
char* invertirCase(char* str){
int size = 0;
char* iterator = str;
while (*iterator != '\0') {
size++;
iterator++;
}
char* retorno = new char[size];
for (int i = 0; i < size; i++) {
//if capital letter:
if (str[i] < 96 && str[i] > 64) {
retorno[i] = str[i] + 32;
}
// if lower case:
else if (str[i] > 96 && str[i] < 123) {
retorno[i] = str[i] - 32;
}
//if its not a letter
else {
retorno[i] = str[i];
}
}
return retorno;
}
For example, if I try to use this function with the value "Write in C" it should return "wRITE IN c", but instead it returns "wRITE IN cýýýýÝݱ7ŽÓÝ" and I don't understand where those extra characters are coming from.
PS: I know I could use a length function, but this is from school, so I can't do that in this case.
Add 1 to the size of the char array:
char* retorno = new char[size+1];
Add a null terminator before returning retorno:
retorno[size] = '\0';
Your output string is not null-terminated
When you iterate through the input string, you increment size until you reach null. That means the null is not copied to the output string. After you exit the loop, you should increment size once more to capture the end.
As an aside, it's probably a good idea to constrain size to some maximum (while(*iterator != '\0' && size < MAXSIZE)) in case someone passes a non-terminated string into your function. If you hit the max size condition, you'd need to explicitly add the null at the end of your output.
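A rough sketch of such a bounded scan. MAXSIZE is not defined anywhere in the question, so the value below is an arbitrary assumption, and bounded_length is a hypothetical name. Note that this version tests the bound before reading the character, so it never touches memory past the limit.

```cpp
#include <cstddef>

// Assumed limit; pick whatever maximum makes sense for your input.
const std::size_t MAXSIZE = 1024;

// Counts characters up to the terminator, but never looks at more than
// max_size characters.
std::size_t bounded_length(const char* str, std::size_t max_size = MAXSIZE)
{
    std::size_t size = 0;
    while (size < max_size && str[size] != '\0')
        ++size;
    return size;
}
```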
Your string should be null terminated, which is what you are looking for when you compute the initial size of the string. When you create the new string, you should allocate size+1 chars of space, and then retorno[size] should be set to the null terminating character (i.e. '\0'). When you print a char* using printf or cout (or similar mechanisms), it keeps printing characters until it finds the null terminating character, which is why you are getting the garbage values after your expected output.
On another note, C++ has helpful functions like std::islower / std::isupper and std::tolower / std::toupper.
From what I can tell, there could be 2 things going on here:
Like everyone here mentioned, the absence of a null terminating character ('\0') at the end of your char array could be causing this.
It could be the way you are printing results of your retorno character string outside of your invertirCase() function.
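I tested out your function in C++14, C++17 and C++20 and it returned the correct result each time, both with the null terminating character at the end of the retorno char array and without it (although without the terminator that is only luck, because printing the array then reads past its end).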
Try printing your result inside of your function before returning it, to identify if this is being caused inside of your function or outside of it. Like so:
char* invertirCase(char* str){
// [... truncated code here]
for (int i = 0; i < size; i++) {
// [... truncated code here]
}
cout << " **** TESTING INSIDE FUNCTION ****" << endl;
cout << "-- Testing index iteration" << endl;
for (int i = 0; i < size; i++) {
cout << retorno[i];
}
cout << endl;
cout << "-- Testing iterator iteration" << endl;
for (char* iterator = retorno; *iterator != '\0'; iterator++) {
cout << *iterator;
}
cout << endl;
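cout << "-- Testing range-based for loop" << endl;
// a range-based for cannot iterate a raw char*, so wrap the array in a std::string first
for (char character : string(retorno, size)) {
cout << character;
}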
cout << " **** END TESTING ****" << endl;
cout << endl;
return retorno;
}
This way you can tell whether the problem occurs inside your function or comes from the way you print the result afterwards.

How can the symbols be bigger than each other? Or do I not understand something? Could you explain what this code does?

Is it ok to compare symbols with each other?
#include <iostream>
using namespace std;// For Example, Why if "k = 4" it outputs "r o" ? //
int main() {
char word[] = "programming";
int k;
cin >> k;
for (int i = 0; i < k; i++)
if (word[i] > word[i + 1]) {
cout << word[i] << endl;
}
}
The char data type is an integral type, meaning the underlying value is stored as an integer. On most systems, the integer stored in a char variable is interpreted as an ASCII character.
ASCII specifies a way to map English characters (and a few other symbols) to numbers between 0 and 127. That is, each English character (and each of those symbols) has a corresponding number between 0 and 127. This number is formally called a code point.
For example, the code point for the English character a is 97. Similarly, the code point for the English character H is 72. You can find the code points for all the characters in any ASCII table.
The important thing to note here is that the underlying value of a char variable is stored as an integer. Let's take some examples to clarify this:
char var1 = 'a'; // here var1 is stored as the integer 97
char var2 = 'H'; // here var2 is stored as the integer 72
In the above snippet, var1 is stored as the integer 97 because the code point for the English character a is 97. Similarly, var2 is stored as the integer 72 because the English character H corresponds to the code point 72.
Now let's come back to your original question, in particular what happens when k = 4.
For k = 4, the for loop will be executed 4 times.
Iteration 0: Here i = 0
The if block basically translates to:
if (word[0] > word[0 + 1]) {
cout << word[0] << endl;
}
which is:
if ('p' > 'r') {
cout << 'p' << endl;
}
which is(using the ascii table):
if (112 > 114) {
cout << 'p' << endl;
}
since the condition inside if is false, the body of the if block will not be executed and you'll get no output.
Iteration 1: Here i = 1
The if block basically translates to:
if (word[1] > word[1 + 1]) {
cout << word[1] << endl;
}
which is:
if ('r' > 'o') {
cout << 'r' << endl;
}
which is(using the ascii table):
if (114 > 111) {
cout << 'r' << endl;
}
since the condition inside if is true, the body of the if block will be executed and you'll get r as output(which is followed by a newline).
Iteration 2: Here i = 2
The if block basically translates to:
if (word[2] > word[2 + 1]) {
cout << word[2] << endl;
}
which is:
if ('o' > 'g') {
cout << 'o' << endl;
}
which is(using the ascii table):
if (111 > 103) {
cout << 'o' << endl;
}
since the condition inside if is true, the body of the if block will be executed and you'll get o as output(which is followed by a newline).
Iteration 3: Here i = 3
The if block basically translates to:
if (word[3] > word[3 + 1]) {
cout << word[3] << endl;
}
which is:
if ('g' > 'r') {
cout << 'g' << endl;
}
which is(using the ascii table):
if (103 > 114) {
cout << 'g' << endl;
}
since the condition inside if is false, the body of the if block will not be executed and you'll get no output.
Hence you get the output:
r
o
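The walk-through above can be condensed into a small function that collects the printed characters instead of writing them to cout. The name descending_chars is invented for this sketch. Unlike the original loop, it stops before comparing the last letter with the terminating '\0'.

```cpp
#include <cstddef>
#include <string>

// Collects each of the first k characters that compares greater than the
// character immediately after it. char values are compared as integers.
std::string descending_chars(const std::string& word, std::size_t k)
{
    std::string result;
    for (std::size_t i = 0; i < k && i + 1 < word.size(); ++i) {
        if (word[i] > word[i + 1])
            result += word[i];
    }
    return result;
}
```

With "programming" and k = 4 this collects 'r' and 'o', the same two characters the original program prints.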
Is it ok to compare symbols with each other?
Yes, it is OK.
Why if "k = 4" it outputs "r o" ?
Character types are integers. Each integer value of the character type is mapped to a symbol [1]. This mapping is called a character set or character encoding.
In the character encoding of the system where you ran the program, the character that maps to the symbol 'r' has a greater value than the character that maps to 'o', and 'o' in turn has a greater value than 'g'. The other two pairs compared when k = 4 ('p' with 'r', and 'g' with 'r') are in increasing order, so nothing is printed for them.
[1] This is a simplification. There are special non-printable characters such as the null terminator which aren't symbols as such. Furthermore, the mapping isn't so simple in the case of variable-length encodings (Unicode).

String size changes after appending a char

I wanted to test how the size of a string changes when a char is appended to it; the results are in the comments below.
I know that a string ends with the null character, but why are the results like this?
#include <iostream>
#include <string>
using namespace std;
int main(){
string a = "" + 'a'; //3
string b = "" + '1'; //2
string c = "a" + 'a'; //2
string d = "1" + '1'; //3
string e = "\0" + 'a'; //20
string f = "\0" + '1'; //1
string g = "a" + '\0'; //1
string h = "1" + '\0'; //1
string i = "" + '\0'; //0
string j = "" + '\0'; //0
cout << a.size() << endl;
cout << b.size() << endl;
cout << c.size() << endl;
cout << d.size() << endl;
cout << e.size() << endl;
cout << f.size() << endl;
cout << g.size() << endl;
cout << h.size() << endl;
cout << i.size() << endl;
cout << j.size() << endl;
return 0;
}
Your code is not doing what you think.
String literals decay to const char *, and char is an integer type. If you try to sum them, the compiler finds that the simplest way to make sense of that stuff is to convert chars to ints, so the result is performing pointer arithmetic over the string literals - e.g. ""+'a' goes to the 97th character in memory after the beginning of the string literal "" (if 'a' is represented by 97 on your platform).
This results in garbage being passed to the string constructor, which stores in the string being constructed whatever it finds at these memory locations until it finds a \0 terminator. Hence the "strange" results you get (which aren't reproducible, since the exact memory layout of the string table depends on the compiler).
Of course all this is undefined behavior as far as the standard is concerned (you are accessing char arrays outside their bounds, apart from the cases where you add \0).
To make your code do what you mean, at least one of the operands must be of type string:
string c = string("a") + 'a';
or
string c = "a" + string("a");
so the compiler will see the relevant overloads of operator+ that involve std::string.
Most of your initializers have undefined behaviour. Consider, for example:
string a = "" + 'a';
You are adding a char to a char pointer. This advances the pointer by the ASCII value of the char, and uses the resulting (undefined) C string to initialize a.
To fix, change the above to:
string a = string("") + 'a';
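As a small sketch of the well-defined alternatives described in these answers, the helpers below (names chosen for this example only) force one operand to be a std::string, so the std::string overloads of operator+ are used rather than pointer arithmetic.

```cpp
#include <string>

// Appends a single char to a copy of text. Because text is a std::string,
// operator+(const std::string&, char) is selected.
std::string append_char(const std::string& text, char ch)
{
    return text + ch;
}

// Builds a one-character string using the (count, char) constructor.
std::string from_char(char ch)
{
    return std::string(1, ch);
}
```

Note that appending '\0' this way really does add a character: a std::string tracks its own length, so append_char("a", '\0') has size 2.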

'Try This' exercise on Programming Principles and Practice Using C++, For iteration

I'm studying in this book (self study) and I'd really appreciate if you could help me with a little 'try this' exercise.
This is the code I wrote:
#include "../../../std_lib_facilities.h"
int main()
{
for (char i ='a'; i <='z'; ++i) {
int x = i;
cout << i << '\t' << x << '\n';
}
keep_window_open();
return 0;
}
The next step, according to the book, is: "[...] then modify your program to also write out a table of the integer values for uppercase letters and digits"
Is there a function to do that, or do I simply have to repeat the loop starting from A?
Thanks
Yes, repeat the loop from 'A' to 'Z' and '0' to '9'.
Assuming your book has covered functions (which it may not have), you might refactor your for loop into its own function perhaps called displayCharactersInTable which takes as arguments the first character and last character. Those would replace the use of 'a' and 'z' in the loop. Thus your main function would look like:
...
displayCharactersInTable('a', 'z');
displayCharactersInTable('A', 'Z');
displayCharactersInTable('0', '9');
...
#include <iostream>
const char lc_alphabet[] = "abcdefghijklmnopqrstuvwxyz";
const char uc_alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int main() {
    // sizeof counts the terminating '\0', so stop one element early
    for (const char *cur = lc_alphabet; cur < lc_alphabet + sizeof(lc_alphabet) - 1; ++cur)
        std::cout << *cur << '\t' << (int)*cur << '\n';
    for (const char *cur = uc_alphabet; cur < uc_alphabet + sizeof(uc_alphabet) - 1; ++cur)
        std::cout << *cur << '\t' << (int)*cur << '\n';
    return 0;
}
This code does not assume that character representations are contiguous (or even increasing alphabetically), so it will work for all character encodings.
int b = 97; // the decimal ASCII code for 'a'
int a = 65; // the decimal ASCII code for 'A'
for(int i = 0; i < 26; ++i)
{
    // print 'A' + i, then a, then 'a' + i, then b
    cout << char('A' + i) << '\t' << a << '\t' << char('a' + i) << '\t' << b << '\n';
    ++a; // increment a by 1
    ++b; // increment b by 1
}

C++ Storing a list of addresses to array to parse raw non terminated text?

I'm just starting out programming but I've had a lot of ideas about how to make my life easier when parsing files by making a program that maps addresses of data when read into memory from a file.
Note: I cut down the wall text here's the problem in a nutshell
How does one parse an array of chars with no null terminator, where every word begins with an uppercase letter so that capitals can be used as delimiters?
Basically I want to parse a text file that is just 'WordWordWord', send each word to its own separate string variable, and then be able to write each word to a text file with a newline added.
I wanted to do some more advanced stuff but I was asked to cut the wall of text so that will do for now :)
//pointers and other values like file opening were declared
int len = (int) strlen( words2 );
cout << "\nSize of Words2 is : " << len << " bytes\n";
// Loops through array if uppercase then...
for (int i = 0; i < len; i++)
{
if (isupper(words2[i]))
{
// Output the contents of words2
cout << "\n Words2 is upper : " << words2[i] << "\n";
b1 = &words2[i];
//output the address of b1 and the intvalue of words2[var]
cout << "\nChar address is " << &b1 << " word address is " << (int) words2[i] << "\n";
cout << "\nChar string is " << b1 << " address +1 "<< &b1+1 <<"\n and string is " << b1+1 << "\n";
}
cout << "\nItem I is : i " << i << " and words2 is " << words2[i] << "\n";
}
fin.clear();
fin.close();
fout.close();
Easy. Use Boost.Tokenizer, with char_separator("", "ABCDEFGHIJKLMNOPQRSTUVWXYZ"). "" is the set of dropped separators, and A-Z is the set of kept separators. (If you'd used A-Z as dropped separators, you'd get ord ord ord because you'd drop the W.)
Since you also
wanted to do some more advanced stuff
I would have a look at Boost.Regex from the get-go. This is a good library for doing textual manipulations.
vector<char *> parsedStrings;
const char * words = "HelloHelloHello";
const size_t length = strlen(words);
const char * wordStart = nullptr; // start of the word currently being scanned
for (size_t i = 0; i <= length; i++)
{
    /* Parses word if current char is uppercase or
    if it's the end of the input and an uppercase char was previously matched */
    if (isupper((unsigned char)words[i]) || ((i == length) && (wordStart != nullptr)))
    {
        // Current char is first uppercase char matched, so don't parse word
        if (wordStart == nullptr)
        {
            wordStart = words + i;
            continue;
        }
        // Pointers are kept as pointers: casting them to int truncates on 64-bit systems
        size_t newStringLength = (words + i) - wordStart;
        char * newString = new char[newStringLength + 1];
        // Copy each char from previous uppercase char up to current char
        for (size_t j = 0; j < newStringLength; j++)
        {
            // Read the char through the pointer, then advance to the next char
            newString[j] = *wordStart++;
        }
        newString[newStringLength] = '\0'; // add null-terminator to string
        parsedStrings.push_back(newString); // caller must delete[] each entry
    }
}
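For comparison, here is a sketch of the same idea using std::string, which avoids manual allocation and pointer bookkeeping. The function name split_on_uppercase is made up for this example. It differs from the loop above in one respect: any lowercase characters before the first capital are kept as their own leading word rather than dropped.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits text into words, starting a new word at every uppercase letter.
std::vector<std::string> split_on_uppercase(const std::string& text)
{
    std::vector<std::string> words;
    for (char c : text) {
        bool upper = std::isupper(static_cast<unsigned char>(c)) != 0;
        // Start a new word on a capital, or when there is no word yet.
        if (upper || words.empty())
            words.emplace_back();
        words.back() += c;
    }
    return words;
}
```

Each element of the returned vector can then be written to an output file followed by '\n', which covers the "one word per line" requirement from the question.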