C++ casting a single element of a string


I recently tried to convert an element of a string (built from digits only) to int, and in order to avoid bizarre results (like 49 instead of 1) I had to use 'stringstream' (without even knowing what it is or how it works) instead of int variable = static_cast<int>(my_string_name[digit_position]);.
According to what I've read about these 'streams' the type is irrelevant when you use them so I guess that was the case. The question is: What type did I actually want to convert FROM and why didn't it work? I thought that a string element is a char but apparently not if the conversion didn't work.

I thought that a string element is a char but apparently not if the conversion didn't work.
Yes, it is a char. However, the value of this char is not how it is rendered on the screen. This actually depends on the encoding of the string. For example, if your string is encoded in ASCII, you can see that the "bizarre" 49 value that you got must have been represented as a '1' character. Building on that, you can subtract the ASCII code of the character '0' to get the numeric values of these characters. However, be very careful: this greatly depends on your encoding. Your string may use multi-byte characters, in which case you can't even index them this naively.

I believe that what you are looking for is not a cast, but the std::stoi (http://en.cppreference.com/w/cpp/string/basic_string/stol) function, which converts a string representation of an integer to int.

char values are already numeric, so such a static_cast won't help you much here. What you actually need is:
int variable = my_string_name[digit_position] - '0';
In ASCII the digits '0' - '9' are encoded as the decimal numbers 48 - 57. So to get the numerical value that is represented by the digit we need to subtract the first digit's value of 48.
To keep the code portable to other character encoding tables (e.g. EBCDIC), the character literal '0' is subtracted instead of the number 48.
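To make the difference between the cast and the subtraction concrete, here is a minimal sketch. The helper names digit_value and char_code are hypothetical and not part of any library; digit_value returns -1 for anything that is not a decimal digit, which is just one possible convention.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: numeric value of a decimal digit character, or -1.
int digit_value(char c)
{
    if (c < '0' || c > '9')
        return -1;          // not a decimal digit
    return c - '0';         // digit codes are consecutive, so this yields 0..9
}

// Returns the character code itself (49 for '1' in ASCII), not the digit.
int char_code(const std::string& s, std::size_t pos)
{
    return static_cast<int>(s[pos]);
}
```

With std::string s("1"), char_code(s, 0) gives the code of the character '1', while digit_value(s[0]) gives the number 1.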

Assume you have a string: std::string myString("1");. You could access the first element which is type char and you convert it to type int via static_cast. However, your expectation of what will happen is incorrect.
Upon looking at the ASCII table for number 1 you will see it has a base 10 integer value of 49, which is why the cast gives the output you're seeing. Instead you could use a conversion function like atoi (e.g., atoi(&myString[0]);).

Your "bizarre results" aren't actually that bizarre. When accessing a single character of a std::string like so: my_string_name[digit_position], you're getting a char. Putting that char in an int stores the character code of '1' in the int. Looking at http://www.asciitable.com/ we can see that 49 is the representation for '1'. To get the actual number you should do int number = my_string_name[digit_position] - '0';

You are working with the ASCII char code for a number, not an actual number. However, the ASCII char codes for the digits are laid out in order:
'0'
'1'
'2'
'3'
'4'
'5'
'6'
'7'
'8'
'9'
So if you are converting a single character and your character is in base-10 then you can use:
int variable = my_string_name[digit_position] - '0';
But this depends upon the use of character codes which are in sequence. Specifically, if there were any other characters interspersed here this wouldn't work at all (I don't know any character code mapping that doesn't have these listed sequentially.)
But to guarantee the conversion you should use stoi:
int variable = std::stoi(my_string_name.substr(digit_position, 1));
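The stoi variant can fail at run time, so a sketch with the error paths spelled out may be useful. The wrapper name digit_at and the -1 error value are assumptions made for this example; std::stoi and std::string::substr are the standard library functions and throw the exceptions caught below.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts the single character at pos with std::stoi.
// Returns -1 if the character is not a digit or pos is out of range.
int digit_at(const std::string& s, std::size_t pos)
{
    try {
        return std::stoi(s.substr(pos, 1));   // substr throws std::out_of_range if pos > size()
    } catch (const std::invalid_argument&) {
        return -1;                            // the character was not a digit
    } catch (const std::out_of_range&) {
        return -1;                            // pos was past the end of the string
    }
}
```

For the string "4096", digit_at(s, 2) yields 9, while an index past the end or a non-digit character yields -1.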

Related

What does str[i] - 'a' mean? [duplicate]

If we were to be iterating through a string str, and inside the for loop was a line like str[i] - 'a'... what does that exactly mean? str[i] would be returning a character from the string str, and then we would be subtracting 'a' from it? I'm just confused by that.

Assuming ASCII (or any encoding in which lowercase letters are in a compact sequence) it is the number (starting at 0) of the lowercase letter in str[i] (the ith position in str).
This is very bad coding unless you are absolutely sure the character is a lowercase letter, and the encoding is not e.g. LATIN-1 or some other 8-bit encoding in which lowercase letters like 'ñ', 'á' or even 'ß' are strewn about the high space (where the first bit is one).
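A typical use of str[i] - 'a' is indexing a 26-element table by letter. The sketch below assumes an encoding such as ASCII in which 'a'..'z' are contiguous, and it guards against every other character, which addresses the concern raised above. The function name count_lowercase is made up for the example.

```cpp
#include <array>
#include <string>

// Counts occurrences of each lowercase letter 'a'..'z'.
// Assumes an encoding such as ASCII in which 'a'..'z' are contiguous.
std::array<int, 26> count_lowercase(const std::string& str)
{
    std::array<int, 26> counts{};           // zero-initialised
    for (char c : str) {
        if (c >= 'a' && c <= 'z')           // guard: ignore everything else
            ++counts[c - 'a'];              // 'a' -> 0, 'b' -> 1, ..., 'z' -> 25
    }
    return counts;
}
```

For "banana!" the table holds 3 at index 0 ('a'), 1 at index 1 ('b') and 2 at index 13 ('n'); the '!' is skipped by the guard.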

The bracket operator [] of the std::string type returns a reference to a char.
Since (signed) char can be natively interpreted by the compiler as an 8-bit signed integer in the range -128 to 127, all the usual mathematical operations can be applied to it.
An example of common use AFAIK: subtracting or operating with chars on a (text) string can be used for adding salt to some cryptography algorithm.

C++ Turning Character types into int type

So I read and was taught that subtracting '0' from my given character turns it into an int, however my Visual Studio isn't recognizing that here, saying a value of type "const char*" cannot be used to initialize an entity of type int in C++ programming here.
bigint::bigint(const char* number) : bigint() {
int number1 = number - '0'; // error code
for (int i = 0; number1 != 0 ; ++i)
{
digits[i] = number1 % 10;
number1 /= 10;
digits[i] = number1;
}
}
The goal of the first half is to simply turn the given number into a type int. The second half is outputting that number backwards with no leading zeroes. Please note this function is a part of the class declared in a header file here:
class bigint {
public:
static const int MAX_DIGITS = 50;
private:
int digits[MAX_DIGITS];
public:
// constructors
bigint();
bigint(int number);
bigint(const char * number);
}
Is there any way to convert the char parameter to an int so I can then output an int? Without using the std library or strlen, since I know there is a way to use the '0' char but I can't seem to be doing it right.

You can turn a single character in the range '0'..'9' into a single digit 0..9 by subtracting '0', but you cannot turn a string of characters into a number by subtracting '0'. You need a parsing function like std::stoi() to do the conversion work character-by-character.
But that's not what you need here. If you convert the string to a number, you then have to take the number apart. The string is already in pieces, so:
bigint::bigint(const char* number) : bigint() {
int i = 0; // next free slot in digits
while (*number && i < MAX_DIGITS) // keep looping until we hit the string's null terminator or run out of slots
{
digits[i++] = *number - '0'; // store the digit for the current character
number++; // advance the string to the next character
}
}
There could be some extra work involved in a more advanced version, such as sizing digits appropriately to fit the number of digits in number. Currently we have no way to know how many slots are actually in use in digits, and this will lead to problems later when the program has to figure out where to stop reading digits.
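One way to address the bookkeeping problem mentioned above is to store the number of used slots next to the array. The sketch below is not the asker's class; DigitBuffer, its used member and parse_digits are hypothetical names, and storing the least significant digit first is an assumption chosen because the question wants the number reversed. It avoids strlen, as the question requires, and it does not strip leading zeroes.

```cpp
// Hypothetical stand-in for the digits array of the bigint class in the question.
struct DigitBuffer {
    static constexpr int MAX_DIGITS = 50;
    int digits[MAX_DIGITS] = {};   // least significant digit first
    int used = 0;                  // number of slots actually in use
};

// Fills buf from a null-terminated string of decimal digits.
// Returns false on a non-digit character or when there are too many digits;
// in that case buf.used is left unchanged.
bool parse_digits(const char* number, DigitBuffer& buf)
{
    int len = 0;
    while (number[len] != '\0')                 // find the terminator without strlen
        ++len;
    if (len == 0 || len > DigitBuffer::MAX_DIGITS)
        return false;
    for (int i = 0; i < len; ++i) {
        char c = number[len - 1 - i];           // walk from the last character backwards
        if (c < '0' || c > '9')
            return false;
        buf.digits[i] = c - '0';
    }
    buf.used = len;
    return true;
}
```

After parse_digits("416", buf), buf.used is 3 and buf.digits starts with 6, 1, 4, so later code knows exactly where to stop reading.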

I don't know what your understanding is, so I will go over everything I see in the code snippet.
First, what you're passing to the function is a pointer to a char, with const keyword making the char immutable or "read only" if you prefer.
A char is actually an 8-bit sized integer (see footnote 1). It can store a numerical value in binary form, which can also be interpreted as a character.
Fundamental types - cppreference.com
The standard also expects char to be a "type for character representation". It could be represented in ASCII code, but it could be something else like EBCDIC maybe, I'm not sure. For future reference just remember that ASCII is not guaranteed, although you're likely to never use a system where it's not ASCII (if I'm correct). But it's not so much that char is somehow enforcing encoding - it's the functions that you pass those chars and char pointers to, that interpret their content as characters in ASCII encoding, while on some obscure or legacy platforms they could actually interpret them as characters in some less common encoding. The standard however demands that the encoding used has this property: codes for characters '0' to '9' are consecutive, and thus '9' - '0' means: subtract code of '0' from code of '9'. The result is 9, because the code for '9' is 9 positions from the code for '0'. In ASCII the ranges 'a'-'z' and 'A'-'Z' have this quality as well, in case you need that (the standard does not guarantee it, and EBCDIC has gaps between the letters), but it's a little bit trickier if your input is in a base higher than 10, like the popular base 16 called hexadecimal.
A pointer stores an address, so the most basic functionality for it is to "point" to a variable. But it can be used in various ways, one of which, very frequent in C, is to store address of the beginning of an array of variables of the same type. Those could be chars. We could interpret such an array as a line of text, or a string (a concept, not to be confused with C++ specific string class).
Since a pointer does not contain information on length or end of such an array, we need to get that information across to the function we pass the pointer to. Sometimes we can just provide the length, sometimes we provide the end pointer. When dealing with "lines of text" or c-style strings, we use (and c standard library functions expect) what is called a null-terminated string. In such a string, the first char after the last one used for a line is a null, which is, to simplify, basically a 0. A 0, but not a '0'.
So what you're passing to the function, and what you interpret as, say 416, is actually a pointer to a place in memory where '4' is encoded and stored as a number, followed by '1' and then '6', taking up three bytes. And depending on how you obtained this line of text, '6' is probably followed by a NULL, that is - a zero.
NULL - cppreference.com
Conversion of such a string to a number first requires a data type able to hold it. In case of 416 it could be anything from short upwards. If you wanted to do that on your own, you would need to iterate over entire line of text and add the numbers multiplied by proper powers of 10, take care of signedness too and maybe check if there are any edge cases. You could however use a standard function like int atoi (const char * str);
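The "do that on your own" approach described above can be sketched as follows. Instead of multiplying each digit by an explicit power of 10, the running value is multiplied by 10 before each new digit is added, which amounts to the same thing. The name parse_int and the bool return convention are assumptions for this example, and the range check assumes long long is wider than int.

```cpp
#include <climits>

// Hypothetical helper: parses an optionally signed decimal C string into out.
// Returns false on empty input, a non-digit character, or a value outside int.
bool parse_int(const char* s, int& out)
{
    bool negative = false;
    if (*s == '+' || *s == '-') {
        negative = (*s == '-');
        ++s;
    }
    if (*s == '\0')
        return false;                       // no digits at all
    long long value = 0;
    for (; *s != '\0'; ++s) {
        if (*s < '0' || *s > '9')
            return false;
        value = value * 10 + (*s - '0');    // shift one decimal place, add the digit
        if (value > static_cast<long long>(INT_MAX) + 1)
            return false;                   // already out of range for int
    }
    if (negative)
        value = -value;
    if (value > INT_MAX || value < INT_MIN)
        return false;
    out = static_cast<int>(value);
    return true;
}
```

For the string "416" the loop computes ((4 * 10) + 1) * 10 + 6, which is 416.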
atoi - cplusplus.com
Now, that would be nice of course, but you're trying to work with "bigints". However you define them, it means your class' purpose is to deal with numbers too big to be stored in built-in types. So there is no way you can convert them just like that.
What you're trying to write right now seems to be a constructor that creates a bigint out of a number represented as a c style string. How shall I put it... you want to store your bigint internally as an array of its digits in base 10 (a good choice for code simplicity, readability and maintainability, as well as interoperation with base 10 textual representation, but it doesn't make efficient use of memory and processing power) and your input is also an array of digits in base 10, except internally you're storing numbers as numbers, while your input is encoded characters. You need to:
sanitize the input (you need criteria for what kind of input is acceptable, e.g. whether there can be any leading or trailing whitespace, whether the number can be followed by non-numerical characters to be discarded, how to represent signedness, whether + for positive numbers is optional or forbidden etc.), and throw an exception if the input is invalid.
convert whatever standard you enforce for your input into whatever uniform standard you employ internally, e.g. strip leading whitespace, remove the + sign if it's optional and you don't use it internally etc.
when you know which positions in your internal array correspond with which positions in the input string, you can iterate over it and copy every number, decoding it first from ASCII.
A side note - I can't be sure as to what exactly it is that you expect your input to be, because it's only likely that it is a textual representation - as it could just as easily be an array of unencoded chars. Of course it's obviously the former, which I know because of your post, but the function prototype (the line with return type and argument types) does not assure anyone about that. Just another thing to be aware of.
Hope this answer helped you understand what is happening there.
PS. I cannot emphasize strongly enough that the biggest problem with your code is that even if this line worked:
int number1 = number - '0'; // error code
You'd be trying to store a number on the order of 10^50 into a variable capable of holding on the order of 10^9.
The crucial part in this problem, which I have a vague feeling you may have found on spoj.com is that you're handling BIGints. Integers too big to be stored in a trivial manner.
1) The standard does not actually require for char to be this size directly, but indirectly it requires for it to be at least 8 bits, possibly more on obscure platforms. And yes, I think there were some platforms where it was indeed over 8 bits. Same thing with pointers that may behave strange on obscure architectures.

subtracting characters of numbers

I am trying to understand what is going on with the code:
cout << '5' - '3';
Is what I am printing an int? Why does it automatically change them to ints when I use the subtraction operator?

In C++ character literals just denote integer values.
A basic literal like '5' denotes a char integer value, which with almost all extant character encodings is 48 + 5 (because the character 0 is represented as value 48, and the C++ standard guarantees that the digit values are consecutive, although there's no such guarantee for letters).
Then, when you use them in an arithmetic expression, or even just write +'5', the char values are promoted to int. Or less imprecisely, the “usual arithmetic conversions” kick in, and convert up to the nearest type that is int or *higher that can represent all char values. This change of type affects how e.g. cout will present the value.
* Since a char is a single byte by definition, and since int can't be less than one byte, and since in practice all bits of an int are value representation bits, it's at best only in the most pedantic formal that a char can be converted up to a higher type than int. If that possibility exists in the formal, then it's pure language lawyer stuff.
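The promotion can be checked at compile time. The sketch below assumes a mainstream platform where int can represent every char value (the footnote above covers the pedantic exceptions); digit_difference is a hypothetical helper added only to have something to call.

```cpp
#include <type_traits>

// The operands are char, but the result of the arithmetic is int.
static_assert(std::is_same<decltype('5'), char>::value,
              "a character literal has type char in C++");
static_assert(std::is_same<decltype('5' - '3'), int>::value,
              "char operands are promoted to int before subtracting");
static_assert(std::is_same<decltype(+'5'), int>::value,
              "unary plus promotes as well");

// Returns the difference of two characters as an int.
int digit_difference(char a, char b)
{
    return a - b;   // both operands are promoted to int first
}
```

This is why cout << '5' - '3'; selects the int overload of operator<< and prints a number rather than a character.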

What you're doing here is subtracting the value for ASCII character '3' from the value for ASCII character '5'. So '5' - '3' is equivalent to 53 - 51 which results in 2.

Every character in C programming is given an integer value to represent it. On ASCII systems that integer value is known as the ASCII value of the character. For example, the ASCII value of 'a' is 97, so if you store the character 'a' in a char variable, the value 97 is stored.
Subtracting '3' from '5' means subtracting their ASCII values, so cout << '5' - '3'; is equivalent to cout << 53 - 51;.
This is a subtraction of two integers, so it prints the integer 2.

Printing char by integer qualifier

I am trying to execute the below program.
#include "stdio.h"
#include "string.h"
void main()
{
char c='\08';
printf("%d",c);
}
I'm getting the output as 56 . But for any numbers other than 8 , the output is the number itself , but for 8 the answer is 56.
Can somebody explain ?

An escape sequence that consists of a backslash followed by digits represents an octal number; octal is the base-8 number system and uses the digits 0 to 7. So \08 is an invalid representation of an octal number because 8 ∉ [0, 7]; hence you're getting implementation-defined behavior.
Your compiler probably treats '\08' as a multi-character constant, with '\0' as one character and '8' as another, and interprets '\08' as '\0' + '8', which makes it '8'. If you look at the ASCII table, you'll see that the decimal value of '8' is 56.
Thanks to @DarkDust, @GrijeshChauhan and @EricPostpischil.

The value '\08' is considered to be a multi-character constant, consisting of \0 (which evaluates to the number 0) and the ASCII character 8 (which evaluates to decimal 56). How it's interpreted is implementation defined. The C99 standard says:
An integer character constant has type int. The value of an integer
character constant containing a single character that maps to a
single-byte execution character is the numerical value of the
representation of the mapped character interpreted as an integer. The
value of an integer character constant containing more than one
character (e.g., 'ab'), or containing a character or escape sequence
that does not map to a single-byte execution character, is
implementation-defined. If an integer character constant contains a
single character or escape sequence, its value is the one that results
when an object with type char whose value is that of the single
character or escape sequence is converted to type int.
So if you would assign '\08' to something bigger than a char, like int or long, it would even be valid. But since you assign it to a char you're "chopping off" some part. Which part is probably also implementation/machine dependent. In your case it happens to give you the value of the 8 (the ASCII character which evaluates to the number 56).
Both GCC and Clang do warn about this problem with "warning: multi-character character constant".

A backslash followed by digits is used to represent octal numbers in C/C++. Octal digits run from 0 to 7, so \08 is a multi-character constant consisting of \0 and 8. The compiler interprets \08 as \0 + 8, which makes it '8', whose ASCII value is 56. That's why you are getting 56 as output.

As other answers have said, this kind of escape sequence represents a character by its octal (base 8) code. This means that you have to write '\010' for 8, '\011' for 9, etc.
There are other ways to write your assignment:
char c = 8;
char c = '\x8'; // hexadecimal (base 16) numbers
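The equivalences between these spellings can be verified at compile time. The static_asserts below only restate what the answers say; code_of_digit_eight is a hypothetical helper showing that the digit character '8' is a different value from the number 8.

```cpp
// Octal and hexadecimal escapes that all denote the number 8 or 9.
static_assert('\010' == 8, "octal 10 is decimal 8");
static_assert('\011' == 9, "octal 11 is decimal 9");
static_assert('\x8' == 8, "hexadecimal 8 is decimal 8");
static_assert('\0' == 0, "the null character has value 0");

// Returns the code of the digit character '8' (56 in ASCII), not the number 8.
int code_of_digit_eight()
{
    return '8';
}
```

On an ASCII system code_of_digit_eight() returns 56, which is the value the question's program printed.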

Non-Integer numbers in an String and using atoi

If there are non-number characters in a string and you call atoi [I'm assuming wtoi will do the same], how will atoi treat the string?
Let's say, for example, I have the following strings:
"20234543"
"232B"
"B"
I'm sure that 1 will return the integer 20234543. What I'm curious about is whether 2 will return 232. [That's what I need to solve my problem]. Also 3 should not return a value. Are these beliefs false? Also... if 2 does act as I believe, how does it handle the e character at the end of a string? [That's typically used in exponential notation]

You can test this sort of thing yourself. I copied the code from the Cplusplus reference site. It looks like your intuition about the first two examples is correct, but the third example returns 0. 'E' and 'e' are also treated just like 'B' is in the second example.
So the rules are
On success, the function returns the converted integral number as an int value.
If no valid conversion could be performed, a zero value is returned.
If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.

According to the standard, "The functions atof, atoi, atol, and atoll need not affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined." (7.20.1, Numeric conversion functions in C99).
So, technically, anything could happen. Even for the first case, since INT_MAX is guaranteed to be at least 32767, and since 20234543 is greater than that, it could fail as well.
For better error checking, use strtol:
const char *s = "232B";
char *eptr;
long value = strtol(s, &eptr, 10); /* 10 is the base */
/* now, value is 232, eptr points to "B" */
s = "20234543";
value = strtol(s, &eptr, 10);
s = "123456789012345";
value = strtol(s, &eptr, 10);
/* If there was no overflow, value will contain 123456789012345,
otherwise, value will contain LONG_MAX and errno will be ERANGE */
If you need to parse numbers with "e" in them (exponential notation), then you should use strtod. Of course, such numbers are floating-point, and strtod returns double. If you want to make an integer out of it, you can do a conversion after checking for the correct range.
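The strtol checks above can be gathered into one function. This is a sketch under stated assumptions: the name parse_long_strict and the decision to reject trailing characters are choices made for the example (the question's "232B" case may instead want the prefix accepted). Leading whitespace is still accepted, because strtol skips it.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical wrapper around strtol.
// Returns true only if the whole string is a decimal number that fits in long.
bool parse_long_strict(const char* s, long& out)
{
    char* end = nullptr;
    errno = 0;                               // clear any stale error before the call
    long value = std::strtol(s, &end, 10);   // 10 is the base
    if (end == s)
        return false;                        // no digits were consumed, e.g. "B"
    if (*end != '\0')
        return false;                        // trailing characters, e.g. "B" in "232B"
    if (errno == ERANGE)
        return false;                        // value was clamped to LONG_MAX or LONG_MIN
    out = value;
    return true;
}
```

To accept "232B" as 232, drop the *end check and inspect end yourself to see where parsing stopped.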

atoi reads digits from the buffer until it can't any more. It stops when it encounters any character that isn't a digit, except whitespace (which it skips) or a '+' or a '-' before it has seen any digits (which it uses to select the appropriate sign for the result). It returns 0 if it saw no digits.
So to answer your specific questions: 1 returns 20234543. 2 returns 232. 3 returns 0. The character 'e' is not whitespace, a digit, '+' or '-' so atoi stops and returns if it encounters that character.
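The scanning rules described in this answer can be written out directly. The function below is a simplified illustration with a made-up name, simple_atoi; it is not the library implementation, and it deliberately ignores overflow, which the real atoi leaves undefined anyway.

```cpp
#include <cctype>

// Simplified illustration of the scanning rules described above.
// Overflow is deliberately not handled here.
int simple_atoi(const char* s)
{
    while (std::isspace(static_cast<unsigned char>(*s)))
        ++s;                                   // skip leading whitespace
    bool negative = false;
    if (*s == '+' || *s == '-') {
        negative = (*s == '-');                // one optional sign before any digit
        ++s;
    }
    int value = 0;
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (*s - '0');       // accumulate digits
        ++s;
    }                                          // stops at the first non-digit, e.g. 'B' or 'e'
    return negative ? -value : value;          // 0 if no digits were seen
}
```

For the three strings in the question this yields 20234543, 232 and 0, assuming int is at least 32 bits wide.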
See also here.

If atoi encounters a non-number character, it returns the number formed up until that point.

I tried using atoi() in a project, but it wouldn't work if there were any non-digit characters in the mix and they came before the digit characters - it'll return zero. It seems to not mind if they come after the digits, for whatever reason.
Here's a pretty bare bones string to int converter I wrote up that doesn't seem to have that problem (bare bones in that it doesn't work with negative numbers and it doesn't incorporate any error handling, but it might be helpful in specific instances). Hopefully it might be helpful.
#include <string>

int stringToInt(const std::string& newIntString)
{
unsigned int dataElement = 0;
for (char ch : newIntString)
{
if (ch >= '0' && ch <= '9') // skip anything that is not a decimal digit
{
// shift the value one decimal place and append the new digit
dataElement = dataElement * 10 + static_cast<unsigned int>(ch - '0');
}
}
return static_cast<int>(dataElement);
}

I ran into this behaviour of atoi while learning to program, when I wrote a program that computes the factorial of an integer passed as a command-line parameter.
atoi returns 0 if the value is not numeric at all, and "3asdf" returns 3. As we all know, C passes command-line parameters as an array of char pointers.
I was told that the book "Linux Hater's Handbook" discusses why programmers don't really like atoi: it is rather foolish in that it gives you no way to check the validity of the input.
Someone asked me why I don't bother to use the strtol function from stdlib.h, and he gave me an example for my recursive factorial function. However, I don't care whether the factorial result exceeds the range of the int type (because the base number is too large). It results in negative values in my program.
I solved my problem with atoi by first checking whether the user's input parameter is truly numeric, and only then calculating the factorial.
The check uses the isdigit() function from ctype.h as follows:
int checkInput(char *str[]) {
for (size_t x = 0; x < strlen(*str); ++x) // walk over every character of the first string
{
if (!isdigit((unsigned char)(*str)[x])) return 1; // (*str)[x], not *str[x]: index into the string itself
}
return 0; // every character is a digit
}
A friend on another Linux programming forum told me that with strtol I could handle out-of-range values, or even parse the input as unsigned long so that -0 and other negative values are not accepted.
The important part of my code above is the check for a non-numeric character: the function returns the failure result (1) as soon as it finds a character in the string (or char array in C) that is not a digit.

Writing simple code and looking to see what it does is magical and illuminating.
On point #3, it won't return "nothing." It can't. It'll return something, but that something won't be useful to you.
http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/
On success, the function returns the converted integral number as an int value.
If no valid conversion could be performed, a zero value is returned.
If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.
