What is difference between my atoi() calls?

I have a big number stored in a string and try to extract a single digit. But what are the differences between those calls?
#include <iostream>
#include <string>
#include <cstdlib> // atoi
int main(){
std::string bigNumber = "93485720394857230";
char tmp = bigNumber.at(5);
int digit = atoi(&tmp);
int digit2 = atoi(&bigNumber.at(5));
int digit3 = atoi(&bigNumber.at(12));
std::cout << "digit: " << digit << std::endl;
std::cout << "digit2: " << digit2 << std::endl;
std::cout << "digit3: " << digit3 << std::endl;
}
This will produce the following output.
digit: 7
digit2: 2147483647
digit3: 57230
The first one is the desired result. The second one seems to me to be a random number, which I cannot find in the string. The third one is the end of the string, but not just a single digit as I expected, but up from the 12th index to the end of the string. Can somebody explain the different outputs to me?
EDIT: Would this be an acceptable solution?
char tmp[2] = {bigNumber.at(5), '\0'};
int digit = atoi(tmp);
std::cout << "digit: " << digit << std::endl;

It is all more or less explicable.
int main(){
std::string bigNumber = "93485720394857230";
The next two lines copy the single character '7' into a char variable and pass its address to atoi. atoi expects its argument to point to a valid zero-terminated string, but &tmp points to one lone character, so the behaviour of this call is undefined: whatever happens to follow tmp in memory is read as the rest of the string. You got 7 only because the next byte happened not to be a digit. To be exact, you would have to create a null-terminated string and pass that in.*
char tmp = bigNumber.at(5);
int digit = atoi(&tmp);
This line gets a pointer to the character at position 5 in the string. It points into the original big number string above, so the string parameter to atoi looks like the string "720394857230". atoi overflows trying to turn this into an integer, since no 32-bit integer can hold it.
int digit2 = atoi(&bigNumber.at(5));
This line gets a pointer into the string at position 12. The parameter to atoi is the string
"57230". This is converted into the integer 57230 correctly.
int digit3 = atoi(&bigNumber.at(12));
...
}
Since you are using C++, there are nicer methods to convert strings of characters into integers. One that I am partial to is the Boost lexical_cast library. You would use it like this:
char tmp = bigNumber.at(5);
// convert the character to a one-character string, then to an integer
int digit = boost::lexical_cast<int>(std::string(1, tmp));
// this copies the rest of the target string from position 5 and then attempts conversion
// if the conversion fails, then a bad_lexical_cast is thrown
int digit2 = boost::lexical_cast<int>(bigNumber.substr(5));
* Strictly, atoi will scan through the numeric characters until a non-numeric one is found. It is clearly undefined when it would find one and what it will do when reading over invalid memory locations.
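For the single-digit case in the question, neither atoi nor Boost is strictly needed. A minimal sketch, assuming the string holds only decimal digits; the helper name digitAt is hypothetical and not part of any library:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the decimal digit stored at index pos.
// std::string::at throws std::out_of_range if pos is past the end.
int digitAt(const std::string& s, std::size_t pos) {
    const char c = s.at(pos);
    if (c < '0' || c > '9') {
        throw std::invalid_argument("not a decimal digit");
    }
    // The characters '0'..'9' are contiguous, so the difference is 0..9.
    return c - '0';
}
```

Called as digitAt(bigNumber, 5), this reads exactly one character, so no terminator is involved at all.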

I know why the 2nd number is displayed.
From the atoi reference.
If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.
2147483647 is INT_MAX
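If you would rather detect the out-of-range case than get a clamped value back, std::stoi (C++11) reports it with an exception. A small sketch, assuming a platform where int is 32 bits; the helper name fitsInInt is made up for illustration:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: true if text starts with a number that fits in an int.
bool fitsInInt(const std::string& text) {
    try {
        (void)std::stoi(text);  // throws instead of returning a clamped value
        return true;
    } catch (const std::out_of_range&) {
        return false;           // the value does not fit in an int
    } catch (const std::invalid_argument&) {
        return false;           // no digits could be converted at all
    }
}
```

With the strings from the question, "720394857230" is rejected while "57230" is accepted.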

bigNumber.at() doesn't return a new string with a single character but a reference to a character inside the string, so &bigNumber.at(5) is the address of that character. So the second call is actually:
atoi("720394857230")
which causes the internal algorithm to overflow.
Also, the first call is very dangerous since it depends on the (random) value in memory at (&tmp)+1.
You have to allocate a string with two characters, assign the single character from bigNumber.at() to the first and \0 to the second and then call atoi() with the address of the temporary string.
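The steps above can be sketched as follows. This is essentially what the EDIT in the question does; the function name singleDigitWithAtoi is hypothetical:

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: builds a two-character, zero-terminated buffer
// so that atoi sees exactly one digit followed by the terminator.
int singleDigitWithAtoi(const std::string& s, std::size_t pos) {
    const char buf[2] = { s.at(pos), '\0' };
    return std::atoi(buf);
}
```

Because the buffer is terminated explicitly, atoi never reads past the single character.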

The argument to atoi should be a zero-terminated string.

Function at gives a reference to a char in the string, so taking its address gives a pointer into the string. Function atoi converts a whole string to int, not only one char.

Related

Are std::string with null-character possible?

I initialized a C++ string with a string literal and replaced a char with NULL.
When printed with cout <<, the full string is printed and the NULL char prints as a blank.
When printed as c_str, the printing stops at the NULL char as expected.
I'm a little confused. Does this behavior come from cout or from string?
int main(){
std::string a("ab0cd");
a[2] = '\0'; // '\0' is null char
std::cout << a << std::endl; // abcd
std::cout << a.c_str() << std::endl; // ab
}
Test it online.
I'm not sure whether the environment is related, anyway, I work with VSCode in Windows 10

First you can narrow down your program to the following:
#include <iostream>
#include <string>
int main(){
std::string a("ab0cd");
a[2] = '\0'; // replace '0' with '\0' (same result as NULL, just cleaner)
std::cout << a << "->" << a.c_str();
}
This prints
abcd->ab
That's because the length of a std::string is known. So it will print all of its characters and not stop when encountering the null-character. The null-character '\0' (which is equivalent to the value of NULL [both have a value of 0, with different types]), is not printable, so you see only 4 characters. (But this depends on the terminal you use, some might print a placeholder instead)
A const char* represents (usually) a null-terminated string. So when printing a const char* its length is not known and characters are printed until a null-character is encountered.
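To see the difference without depending on how a terminal renders '\0', one can write into a std::ostringstream and count what arrives. A minimal sketch; the two helper names are hypothetical:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: number of characters written by the std::string overload.
std::size_t charsWrittenAsString(const std::string& s) {
    std::ostringstream out;
    out << s;          // writes s.size() characters, embedded '\0' included
    return out.str().size();
}

// Hypothetical helper: number of characters written by the const char* overload.
std::size_t charsWrittenAsCString(const std::string& s) {
    std::ostringstream out;
    out << s.c_str();  // stops at the first '\0'
    return out.str().size();
}
```

For the string from the question (five characters with '\0' at index 2), the first function reports 5 and the second reports 2.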

Contrary to what you seem to think, a C++ string does not use a null character to mark its end.
The difference in behavior comes from the << operator overloads.
This code:
cout << a.c_str(); // a.c_str() is const char*
uses the << overload for const char* that comes with cout. It prints a C-style char array and stops at the first null char (the char array must be null terminated).
This code:
cout << a; // a is string
uses the << overload that comes with string. It prints a string object, which knows its length internally and accepts null chars.

The end of a string is not marked by 0 (NULL) as with a plain char*. Its size is kept internally in its member data, since it is actually a user-defined type (an instantiated object) as opposed to a primitive type, so
int main(){
std::string a("abc0d");
a[3] = 0; // same as '\0', the null char
a.resize(2);
std::cout << a << std::endl; // ab
std::cout << a.c_str() << std::endl; // ab
}
Sorry for changing your code to make the point clearer. Watch as it results in
ab
ab
good learning: http://www.cplusplus.com/reference/string/string/find/index.html

Null terminator \1?

I am doing some coding for a beginner C++ class I am taking. In the class, we have to take code submitted by another student and fix a bug they created. The code is as follows:
#include <iostream>
using namespace std;
int countChars(char *, char); // Function prototype
int main()
{
const int SIZE = 51; // Array size
char userString[SIZE]; // To hold a string
char letter; // The character to count
// Get a string from the user.
cout << "Enter a string (up to 50 characters): ";
cin.getline(userString, SIZE);
// Get a character to count occurrences of within the string.
cout << "Enter a character and I will tell you how many\n";
cout << "times it appears in the string: ";
cin >> letter;
// Display the number of times the character appears.
cout << letter << " appears ";
cout << countChars(userString, letter) << " times.\n";
return 0;
}
int countChars(char *strPtr, char ch)
{
int times = 0; // Number of times ch appears in the string
// Step through the string counting occurrences of ch.
while (*strPtr != '\0')// ***** There was a one placed inside the null operator, however, this is not a syntax error, but rather just incorrect.
{
if (*strPtr == ch) // If the current character equals ch...
times++; // ... increment the counter
strPtr++; // Go to the next char in the string.
}
return times;
}
The student changed the function so that the null terminator was written as \10, which caused neither a compile-time nor a run-time error. After playing with it, I found that it could also be \1 and still work. How is this possible? I am a complete noob, so I apologize if this is a stupid question, but I assumed that this was a boolean operator and 1 was true and 0 was false. The question is why \10 and \1 work as the null terminator. Thank you in advance!

'\0' means "the character having the integer representation 0." Similarly, '\10' means "the character having the integer representation octal 10", which is 8 in decimal, because the digits after the backslash are read as octal. That's why it's not a compilation error, only a logical error.
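Since the digits after the backslash are read as octal, the values can be checked at compile time. A small sketch (C++11 or later); codeOf is a hypothetical helper, and the character names in the messages assume ASCII:

```cpp
// Hypothetical helper: the numeric code of a character.
constexpr int codeOf(char c) { return static_cast<int>(c); }

static_assert(codeOf('\0') == 0, "octal 0 is the NUL character");
static_assert(codeOf('\1') == 1, "octal 1 is Start of Heading");
static_assert(codeOf('\10') == 8, "octal 10 is decimal 8, Backspace");
static_assert(codeOf('\102') == 66, "octal 102 is decimal 66, 'B' in ASCII");
```

Only the first of these is zero, so only '\0' matches the terminator that getline stores at the end of userString.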

Only "\0" is called the NULL terminator. Even though "\1" or "\10" or even "\103" compile, they are different characters and none of them terminates a string. I'll explain why.
In "\0" the 0 is the OCT value of the character in the ascii table. The character whose OCT value is 0 is NUL, a non-printing control character with the numeric value zero. C-style strings use it to mark where the string ends, which is why it is called the NULL terminator.
Now, why do "\10" and "\1" compile? Because these refer to OCT values 10 and 1, which map to characters in the ascii table, namely Backspace and Start of Heading. Similarly, if you pick an OCT value that maps to a letter, punctuation mark or digit, for example "\102", which is B, it will output B.
If you try an invalid number, like "\8", then compilers typically warn about an unknown escape sequence and simply output 8 on the screen, since 8 is not an octal digit.
See this code, which shows the different kinds of escape sequences:
#include <iostream>
using namespace std;
int main(void)
{
cout << "\0" << endl; // NUL: the string is empty, so only the newline is output
cout << "\8" << endl; // 8 is not an octal digit; compilers typically only warn, and 8 is output
cout << "\102"; // Valid; octal 102 is B. Any valid OCT value maps to a character.
}
EDIT: After doing some research, I noticed that an octal escape sequence takes one to three octal digits. Therefore the NULL terminator could also be written as "\000", but the shorter forms mean the same thing. In other words, "\1" is interpreted by the compiler as "\001". Just some extra information.
I found this information at: C++ Character Literals

Storing data in char array causing corruption around variable

I am working on a C++ project and I am having an issue.
Below is my code
tempfingerprint = libssh2_hostkey_hash(session, LIBSSH2_HOSTKEY_TYPE_RSA);
char temp[48];
memset(temp, 0, sizeof(temp));
for (i = 0; i < 16; i++)
{
//fingerprintstream << (unsigned char)tempfingerprint[i] << ":";
if (temp[0] == 0)
{
sprintf(temp, "%02X:", (unsigned char)tempfingerprint[i]);
}
else
{
//sprintf(temp, "%s:%02X", temp, (unsigned char)tempfingerprint[i]);
char characters[3];
memset(characters, 0, sizeof(characters));
//If less than 16, then add the colon (:) to the end otherwise don't bother as we're at the end of the fingerprint
sprintf(characters, "%02X:", (unsigned char)tempfingerprint[i]);
strcat(temp, characters);
}
}
//Remove the end colon as its not needed. 48 Will already be null terminated, so the previous will contain the last colon
temp[47] = 0;
return string(temp);
When I run my app, I get the following error from visual studio
Run-Time-Check Failure #2 - Stack around the variable 'temp' was corrupted.
I've run the same code on Linux through Valgrind and no errors were shown, so I'm not sure what the problem is with Windows.

Here's an approach based on what Paul McKenzie is talking about (though he might implement it differently) and on what it looks like you were trying to do with the stream:
#include <iostream>
#include <sstream>
#include <iomanip> // output format modifiers
using namespace std;
int main()
{
stringstream fingerprintstream;
// set up the stream to print uppercase hex with 0 padding if required
fingerprintstream << hex << uppercase << setfill('0');
// print out the first value without a ':'
fingerprintstream << setw(2) << 0;
for (int i = 1; i < 16; i++) // starting at 1 because first has already been handled.
{
// print out the rest prepending the ':'
fingerprintstream << ":" << setw(2) << i;
}
// print results
std::cout << fingerprintstream.str();
return 0;
}
Output:
00:01:02:03:04:05:06:07:08:09:0A:0B:0C:0D:0E:0F
Just realized what I think OP ran up against with the garbage output. When you output a number, << will use the appropriate conversion to get text, but if you output a character << prints the character. So fingerprintstream << (unsigned char)tempfingerprint[i]; takes the binary value at tempfingerprint[i] and, thanks to the cast, tries to render it as a character. Rather than "97", you will get (assuming ASCII) "a". A large amount of what you try to print will give nonsense characters.
Example: If I change
fingerprintstream << ":" << setw(2) << i;
to
fingerprintstream << ":" << setw(2) << (unsigned char)i;
the output becomes
0?:0?:0?:0?:0?:0?:0?:0?:0?:0?:0 :0
:0?:0?:0
:0?:0?
Note the tab and the line feeds.
I need to know the definition of tempfingerprint to be sure, but you can probably solve the garbage output problem by removing the cast.
Based on new information, tempfingerprint is const char *, so tempfingerprint[i] is a char and will be printed as a character.
We want a number, so we have to force the sucker to be an integer.
static_cast<unsigned int>(tempfingerprint[i]&0xFF)
the &0xFF masks out everything but the last byte, eliminating sign extension of negative numbers into huge positive numbers when displayed unsigned.
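Put together, the mask and the cast might look like the sketch below. The helper name byteToHex is hypothetical; it formats one char as two uppercase hex digits:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats one byte as two uppercase hex digits.
std::string byteToHex(char c) {
    std::ostringstream out;
    // c & 0xFF keeps only the low byte, so a negative char does not
    // sign-extend; the cast makes << print a number, not a character.
    out << std::hex << std::uppercase << std::setfill('0')
        << std::setw(2) << static_cast<unsigned int>(c & 0xFF);
    return out.str();
}
```

For example, the character 'a' comes out as 61 (assuming ASCII) instead of being printed as the letter.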

There are, as far as I see, two issues in the code which lead to exceeding array boundaries:
First, with char temp[48] you reserve exactly 48 characters for storing results. However, after strcat(temp, characters) has been called for the 16th value, and each piece consists of two digits plus the colon, temp holds 16*3 digits/colons plus one terminating '\0' character, i.e. 49 characters (not 48). Note that strcat automatically appends a string-terminating character.
Second, you define char characters[3], which reserves space for two digits and the colon but not for the terminating '\0' character. Hence, sprintf(characters, "%02X:", ...) will exceed the bounds of characters, as sprintf also appends the string terminator.
So, if you do not want to rewrite your code in general, changing your definitions to char temp[49] and char characters[4] will solve the problem.
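If a small rewrite is acceptable, letting std::string own the result removes the need to size temp by hand. A sketch under assumptions: formatFingerprint is a hypothetical name, and the input is taken to be a plain array of bytes rather than the libssh2 return value:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: renders count bytes as "XX:XX:...:XX".
std::string formatFingerprint(const unsigned char* bytes, std::size_t count) {
    std::string result;
    for (std::size_t i = 0; i < count; ++i) {
        char piece[4];  // two hex digits + ':' + terminating '\0'
        std::snprintf(piece, sizeof(piece), "%02X:",
                      static_cast<unsigned int>(bytes[i]));
        result += piece;
    }
    if (!result.empty()) {
        result.pop_back();  // drop the trailing ':'
    }
    return result;
}
```

snprintf is given the buffer size, so even a wrong format string could not write past piece.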

pointer arithmetic on arrays

When I run the code below my output is not what I expect.
My way of understanding it is that ptr points to the address of the first element of the Str array. I think ptr + 5 should lead to the + 5th element which is f. So the output should only display f and not both fg.
Why is it showing fg? Does it have to do with how cout displays an array?
#include <iostream>
using namespace std;
int main()
{
char *ptr;
char Str[] = "abcdefg";
ptr = Str;
ptr += 5;
cout << ptr;
return 0;
}
Expected output: f
Actual output: fg

When you declare:
char Str[] = "abcdefg";
The string abcdefg is stored with an implicit extra character \0, which marks the end of the string.
So, when you cout a char*, the output is the character that the char* points to plus all the characters in the consecutive memory locations after it, until a \0 character is encountered. Since the \0 character comes after g in your example, two characters are printed.
In case you only want to print the current character, dereference the pointer:
cout << *ptr;
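The same distinction can be shown without printing anything. A minimal sketch with two hypothetical helpers; both assume that offset lies within the string:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: everything from str + offset up to the '\0',
// which is what cout << ptr would print.
std::string tailFrom(const char* str, std::size_t offset) {
    const char* ptr = str + offset;
    return std::string(ptr);  // copies characters until the terminator
}

// Hypothetical helper: only the character at str + offset,
// which is what cout << *ptr would print.
char charAt(const char* str, std::size_t offset) {
    return *(str + offset);
}
```

With "abcdefg" and an offset of 5, the first gives fg and the second gives f.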

Why is it showing fg?
The reason why std::cout << char* prints the string till the end instead of a single char is that std::cout treats a char * as a pointer to the first character of a C-style string and prints it as such. [1]
Your array:
char Str[] = "abcdefg";
gets an implicit '\0' at the end and is treated as a C-style string.
Does it have to do with how std::cout displays an array?
This has to do with how std::cout handles C-style strings. To test this, change the array type to int and see the difference: a pointer to int is not treated as a string, so std::cout prints the pointer's address rather than the elements.
[1] This is because in C there is no string type; strings are manipulated through pointers to char, with the pointer indicating the beginning and the termination character '\0' indicating the end.

convert a string to integer in c++ without losing leading zeros

Hello, I have a problem with converting a string of numbers to an integer.
The problem is that when I use atoi() to convert the string to an integer, I lose the leading zeros.
Can you please tell me a way to do that without losing the leading zeros?
#include <cstdlib> // atoi, exit
#include <fstream>
#include <iostream>
#include <iomanip>
#include <string>
using namespace std;
struct Book{
int id;
string title;
};
struct Author{
string firstName;
string lastName;
};
Author authorInfo[200];
Book bookInfo[200];
void load ( void )
{
int count = 0;
string temp;
ifstream fin;
fin.open("myfile.txt");
if (!fin.is_open())
{
cout << "Unable to open myfile.txt file\n";
exit(1);
}
while (fin.good())
{
getline(fin, temp, '#');
bookInfo[count].id = atoi(temp.c_str());
getline(fin, bookInfo[count].title, '#');
getline(fin, authorInfo[count].firstName, '#');
getline(fin, authorInfo[count].lastName, '#');
count++;
}
fin.close();
}

Ok, so I don't think you actually WANT to store the leading zeros. I think you want to DISPLAY a consistent number of digits in the output.
So, for example, to display a fixed size id with 5 digits [note that an id of 100000 will still display in 6 digits - all it does here is make sure it's always at least 5 digits, and fill it with '0' if the number is not big enough], we could do:
std::cout << std::setw(5) << std::setfill('0') << id << ...
Alternatively, as suggested in other answers, if you don't need to use the ID as an integer, you could just store it as a string. Unless you are going to do math on it, all that changes is that it takes up a tiny bit more memory per book.
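If the padded form is needed as a string rather than printed directly, the same manipulators work on a std::ostringstream. A sketch assuming non-negative ids; formatId is a hypothetical name:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: pads id with leading zeros to at least width digits.
// Assumes id is non-negative; a minus sign would end up after the padding.
std::string formatId(int id, int width) {
    std::ostringstream out;
    out << std::setw(width) << std::setfill('0') << id;
    return out.str();
}
```

As noted above, the width is a minimum: an id with more digits than the width is not truncated.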

An integer does not have leading zeroes. Or perhaps, more correctly, it has between zero and an infinite number of them. The numbers 42, 042 and 000000042 (other than in the source code where a leading 0 indicates a different base) are all forty-two.
If you want to keep the leading zeroes, either leave it as a string or store more information somewhere as to how big the original string was. Something like this would be a good start:
#include <iostream>
#include <iomanip>
#include <cstring>
#include <cstdio>
#include <cstdlib>
int main (void) {
// Test data.
const char *sval = "0042";
// Get original size.
int size = strlen (sval);
// Convert to int (without leading 0s).
// strtol would be better for detecting bad numbers.
int ival = atoi (sval);
// Output details.
std::cout << sval << " has size of " << size << ".\n";
std::cout << "Integer value is " << ival << ".\n";
std::cout << "Recovered value is " << std::setw(size)
<< std::setfill('0') << ival << ".\n";
return 0;
}
which outputs:
0042 has size of 4.
Integer value is 42.
Recovered value is 0042.

A = strlen(string) returns the number of characters in your string (that is, the number of digits including leading zeros)
B = floor(log10(atoi(string))) + 1 returns the number of digits in your number (for a number greater than 0; log10(0) is undefined, so treat 0 as having one digit)
A - B => number of leading zeros.
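The A - B idea can be sketched as below. Note the swap: digits are counted with integer division instead of log10, which sidesteps floating point and the zero case. The function name is hypothetical, and the input is assumed to be a non-empty string of decimal digits whose value fits in an int:

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: number of leading zeros in a decimal digit string.
int countLeadingZeros(const char* text) {
    const int totalChars = static_cast<int>(std::strlen(text));  // A
    int value = std::atoi(text);
    int digits = 1;  // B: even the value 0 is written with one digit
    while (value >= 10) {
        value /= 10;
        ++digits;
    }
    return totalChars - digits;  // A - B
}
```

For "0042" this gives 2, and for "000" it also gives 2, since the last zero counts as the number itself.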
Now you can format those as you prefer.

There's no such thing as "leading zeros" in a number. "Leading zeros" is a property of a specific notation, like decimal ASCII representation of a number. Once you convert that notation to a conceptually abstract numerical representation, such metric as "number of leading zeros" is no longer applicable (at least in decimal terms). It is lost without a trace.
A number is a number. It doesn't have any "zeros", leading or otherwise.
The only thing you can do is remember how many leading zeros you had in the original notation (or how wide the field was), and then later, when you convert the number back to a decimal ASCII representation, re-create the proper number of leading zeros using that stored information.
BTW, in your case, when the input number represents a book ID with some pre-determined formatting (like leading zeros), you might consider a different approach: don't convert your book ID to int. Keep it as a string. It is not like you are going to have to perform arithmetic operations on book IDs, is it? Most likely all you'll need is relational and equality comparisons, which can be performed on strings.

I have encountered this type of problem last month!
I think you can use the Format() method provided by Class CString:
CString::Format() formats and stores a series of characters and values in the CString. Each optional argument (if any) is converted and output according to the corresponding format specification in pszFormat or from the string resource identified by nFormatID.
For example:
CString m_NodeName;
m_NodeName.Format(_T("%.4d"),Recv[2]*100+Recv[3]);
// %.4d means the argument will be formatted as an integer,
// 4 digits wide, with unused digits filled with leading zeroes
For the detail you can find here:
http://msdn.microsoft.com/zh-cn/library/18he3sk6(v=vs.100).aspx

If you need the leading zeros, then int is not the correct data type to use. In your case you may be better off just storing the original string.

There is no way of storing an int with leading 0s.
What you may want to do instead, is have a class do it for you:
class intWithLeadingZeros {
int number;
int numberOfLeadingZeros;
intWithLeadingZeros( string val )
{
// trivial code to break down string into zeros and number
}
string toString() {
// trivial code that concatenates leading 0s and number
}
};
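One way the two "trivial" parts might be filled in is sketched below (C++11). It is only an illustration of the idea: the value() accessor is an addition that the outline above does not have, and the input is assumed to be a non-empty string of decimal digits that fits in an int.

```cpp
#include <cstddef>
#include <string>

class intWithLeadingZeros {
public:
    explicit intWithLeadingZeros(const std::string& val)
        : number(0), numberOfLeadingZeros(0)
    {
        // Count leading '0' characters, but keep at least one digit
        // so that "000" becomes two leading zeros plus the number 0.
        while (numberOfLeadingZeros + 1 < val.size() &&
               val[numberOfLeadingZeros] == '0') {
            ++numberOfLeadingZeros;
        }
        number = std::stoi(val.substr(numberOfLeadingZeros));
    }

    // Hypothetical accessor, not part of the original outline.
    int value() const { return number; }

    std::string toString() const {
        return std::string(numberOfLeadingZeros, '0') + std::to_string(number);
    }

private:
    int number;
    std::size_t numberOfLeadingZeros;
};
```

Constructed from "0042", the object holds the integer 42 and converts back to "0042".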
