I need to convert a hex literal character to its value. Consider the following:
char hex1 = 'f'; // hex1 equals 102, as 'f' is ASCII 102.
char hexvalue = converter(hex1); // I need on hexvalue 0x0F, or 1111 binary
What would be the most straightforward converter function here?
Thanks for helping.
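A straightforward converter function would use a lookup string:
#include <cctype>
#include <string>
unsigned int Convert_Char_Digit_To_Hex(char digit)
{
static const std::string char_to_hex = "0123456789ABCDEF";
// Upper-case the input so that 'f' and 'F' both match.
const char upper = static_cast<char>(std::toupper(static_cast<unsigned char>(digit)));
const std::string::size_type posn = char_to_hex.find(upper);
if (posn != std::string::npos)
{
return static_cast<unsigned int>(posn);
}
return 0; // Error if here.
}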
But why write your own when you can use existing functions to convert from textual representation to internal representation?
See also strtol, strtoul, std::istringstream, sscanf.
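As a rough sketch of the "use existing functions" route, a single character can be handed to strtoul by placing it in a small null-terminated buffer. The helper name hex_digit_value and the -1 error value are assumptions made for this illustration, not part of any library:

```cpp
#include <cstdlib>

// Hypothetical helper: returns the value (0..15) of one hex digit,
// or -1 if the character is not a hex digit.
int hex_digit_value(char digit)
{
    // strtoul needs a null-terminated string, so wrap the character.
    const char text[2] = { digit, '\0' };
    char* end = nullptr;
    const unsigned long value = std::strtoul(text, &end, 16);
    // If nothing was consumed, the character was not a hex digit.
    return (end == text) ? -1 : static_cast<int>(value);
}
```

With this, hex_digit_value('f') and hex_digit_value('F') both give 15, and a character such as 'g' gives -1.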
Edit 1: Comparisons
Another alternative is to use comparisons and math:
unsigned int Hex_Char_Digit_To_Int(char digit)
{
unsigned int value = 0U;
digit = toupper(digit);
if ((digit >= '0') and (digit <= '9'))
{
value = digit - '0';
}
else
{
if ((digit >= 'A') and (digit <= 'F'))
{
value = digit - 'A' + 10;
}
}
return value;
}
Related
Please help me identify the error in this program. It looks correct to me and I have checked it, but it gives wrong answers.
In this program I check explicitly for A, B, C, D, E and F and assign their respective values.
[Edited]: Also, this question relates to how a digit character is converted to the actual integer number.
#include<iostream>
#include<cmath>
#include<bits/stdc++.h>
using namespace std;
void convert(string num)
{
long int last_digit;
int s=num.length();
int i;
long long int result=0;
reverse(num.begin(),num.end());
for(i=0;i<s;i++)
{
if(num[i]=='a' || num[i]=='A')
{
last_digit=10;
result+=last_digit*pow(16,i);
}
else if(num[i]=='b'|| num[i]=='B')
{
last_digit=11;
result+=last_digit*pow(16,i);
}
else if(num[i]=='c' || num[i]=='C')
{
last_digit=12;
result+=last_digit*pow(16,i);
}
else if(num[i]=='d'|| num[i]=='D' )
{
last_digit=13;
result+=last_digit*pow(16,i);
}
else if(num[i]=='e'|| num[i]=='E' )
{
last_digit=14;
result+=last_digit*pow(16,i);
}
else if(num[i]=='f' || num[i]=='F')
{
last_digit=15;
result+=last_digit*pow(16,i);
}
else {
last_digit=num[i];
result+=last_digit*pow(16,i);
}
}
cout<<result;
}
int main()
{
string hexa;
cout<<"Enter the hexadecimal number:";
getline(cin,hexa);
convert(hexa);
}
Your code is very convoluted and wrong.
You probably want this:
void convert(string num)
{
long int last_digit;
int s = num.length();
int i;
long long int result = 0;
for (i = 0; i < s; i++)
{
result <<= 4; // multiply by 16, using pow is overkill
auto digit = toupper(num[i]); // convert to upper case
if (digit >= 'A' && digit <= 'F')
last_digit = digit - 'A' + 10; // digit is in range 'A'..'F'
else
last_digit = digit - '0'; // digit is (hopefully) in range '0'..'9'
result += last_digit;
}
cout << result;
}
But this is still not very good:
the function should return a long long int instead of printing the result
a few other things can be done more elegantly
So a better version would be this:
#include <iostream>
#include <string>
using namespace std;
long long int convert(const string & num) // always pass objects as const & if possible
{
long long int result = 0;
for (const auto & ch : num) // use range based for loops whenever possible
{
result <<= 4;
auto digit = toupper(ch);
long int last_digit; // declare local variables in the inner most scope
if (digit >= 'A' && digit <= 'F')
last_digit = digit - 'A' + 10;
else
last_digit = digit - '0';
result += last_digit;
}
return result;
}
int main()
{
string hexa;
cout << "Enter the hexadecimal number:";
getline(cin, hexa);
cout << convert(hexa);
}
There is still room for more improvements as the code above assumes that the string to convert contains only hexadecimal characters. Ideally a check for invalid characters should be done somehow. I leave this as an exercise.
The line last_digit = digit - 'A' + 10; assumes that the codes for letters A to F are contiguous, which in theory might not be the case. But the probability that you'll ever encounter an encoding scheme where this is not the case is close to zero though. The vast majority of computer systems in use today use the ASCII encoding scheme, some use EBCDIC, but in both of these encoding schemes the character codes for letters A to F are contiguous. I'm not aware of any other encoding scheme in use today.
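One possible way to approach the exercise of rejecting invalid characters is sketched below. The name convert_checked and the bool-plus-output-parameter interface are assumptions chosen for this example, and overflow of long long for very long inputs is still not checked:

```cpp
#include <cctype>
#include <string>

// Hypothetical variant of convert(): reports failure instead of
// silently accepting characters that are not hexadecimal digits.
bool convert_checked(const std::string& num, long long& result)
{
    if (num.empty())
        return false;                   // nothing to convert
    long long value = 0;
    for (const char ch : num)
    {
        const int digit = std::toupper(static_cast<unsigned char>(ch));
        int last_digit;
        if (digit >= '0' && digit <= '9')
            last_digit = digit - '0';
        else if (digit >= 'A' && digit <= 'F')
            last_digit = digit - 'A' + 10;
        else
            return false;               // invalid character
        value = value * 16 + last_digit;
    }
    result = value;                     // only written on success
    return true;
}
```

The caller checks the return value before using result, so an input such as 1G is reported as an error rather than converted to a meaningless number.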
Your problem is in the else case, in which you convert num[i] from char to its ASCII equivalent. Thus, for instance, if you try to convert A0, the 0 is converted into 48 rather than 0.
To correct this, you should instead convert num[i] into its equivalent integer (not its ASCII code).
To do so, replace :
else {
last_digit=num[i];
result+=last_digit*pow(16,i);
}
with
else {
last_digit = num[i]-'0';
result+=last_digit*pow(16,i);
}
In the new line, last_digit = num[i]-'0'; is equivalent to last_digit = (int)num[i]-(int)'0';, which subtracts the character code of '0' from the character code of the digit in num[i].
It works because the C++ standard guarantees that the codes of the 10 decimal digits are contiguous and in increasing order (official reference: iso-cpp, stated in chapter 2.3, paragraph 3).
Thus, if you take the representation (for instance the ASCII code) of any one-digit number num[i] and subtract the representation code of '0' (which is 48 in ASCII), you directly obtain the number itself as an integer value.
An example of execution after the correction would give:
A0
160
F5
245
A small code review:
You are repeating yourself with many result+=last_digit*pow(16,i); statements. You could do it only once at the end of the loop. But that's another matter.
You are complicating the problem more than you need to (std::pow is also kinda slow). std::stoul can take a numerical base and automatically convert to an integer for you:
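#include <string>
#include <iostream>
int main()
{
std::size_t char_count{0u};
std::string hexa{};
std::getline(std::cin, hexa);
// With base 16, std::stoul accepts an optional "0x" prefix, so none has to be added.
// It throws std::invalid_argument if no conversion could be performed.
unsigned long value_uint = std::stoul(hexa, &char_count, 16);
std::cout << value_uint;
}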
I have a text file with a string which I encoded.
Let's say it is: aaahhhhiii kkkjjhh ikl wwwwwweeeett
Here is the code for encoding, which works perfectly fine:
void Encode(std::string &inputstring, std::string &outputstring)
{
for (int i = 0; i < inputstring.length(); i++) {
int count = 1;
while (inputstring[i] == inputstring[i+1]) {
count++;
i++;
}
if(count <= 1) {
outputstring += inputstring[i];
} else {
outputstring += std::to_string(count);
outputstring += inputstring[i];
}
}
}
Output is as expected: 3a4h3i 3k2j2h ikl 6w4e2t
Now, I'd like to decompress the output - back to original.
And I have been struggling with this for a couple of days now.
My idea so far:
void Decompress(std::string &compressed, std::string &original)
{
char currentChar = 0;
auto n = compressed.length();
for(int i = 0; i < n; i++) {
currentChar = compressed[i++];
if(compressed[i] <= 1) {
original += compressed[i];
} else if (isalpha(currentChar)) {
//
} else {
//
int number = isnumber(currentChar).....
original += number;
}
}
}
I know my Decompress function seems a bit messy, but I am pretty lost with this one.
Sorry for that.
Maybe there is someone out there at stackoverflow who would like to help a lost and beginner soul.
Thanks for any help, I appreciate it.
Assuming input strings cannot contain digits (your encoding cannot cover these, as e.g. both the strings "3a" and "aaa" would result in the encoded string "3a"; how would you ever decode that again?), you can decompress as follows:
unsigned int num = 0;
for(auto c : compressed)
{
if(std::isdigit(static_cast<unsigned char>(c)))
{
num = num * 10 + c - '0';
}
else
{
num += num == 0; // assume you haven't read a digit yet!
while(num > 0) // count down without wrapping the unsigned counter below zero
{
original += c;
--num;
}
}
}
Untested code, though...
Characters in a string actually are only numerical values, though. You can consider char (or signed char, unsigned char) as ordinary 8-bit integers as well. And you can store a numerical value in such a byte, too. Usually, you do run length encoding exactly that way: Count up to 255 equal characters, store the count in a single byte and the character in another byte. One single "a" would then be encoded as 0x01 0x61 (the latter being the ASCII value of a), "aa" would get 0x02 0x61, and so on. If you have to store more than 255 equal characters you store two pairs: 0xff 0x61, 0x07 0x61 for a string containing 262 times the character a... Decoding then gets trivial: you read characters pairwise, first byte you interpret as number, second one as character – rest being trivial. And you nicely cover digits that way as well.
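The byte-pair scheme described above could be sketched roughly as follows. The function names rle_encode and rle_decode are made up for this illustration, and the code assumes 8-bit bytes with the count stored as a raw byte rather than as text:

```cpp
#include <cstddef>
#include <string>

// Encode as (count, character) byte pairs; each count is in 1..255.
std::string rle_encode(const std::string& input)
{
    std::string output;
    std::size_t i = 0;
    while (i < input.size())
    {
        const char c = input[i];
        std::size_t run = 0;
        // Count equal characters, but start a new pair after 255 of them.
        while (i < input.size() && input[i] == c && run < 255)
        {
            ++run;
            ++i;
        }
        output += static_cast<char>(static_cast<unsigned char>(run));
        output += c;
    }
    return output;
}

// Decode by reading pairs: first byte is the count, second the character.
std::string rle_decode(const std::string& encoded)
{
    std::string output;
    for (std::size_t i = 0; i + 1 < encoded.size(); i += 2)
    {
        const std::size_t count = static_cast<unsigned char>(encoded[i]);
        output.append(count, encoded[i + 1]);
    }
    return output;
}
```

Because the count is never interpreted as text, digits in the input survive a round trip, and a run of 262 characters comes out as the two pairs 0xff and 0x07 mentioned above.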
#include <cctype>
#include <iostream>
#include <string>
void Encode(std::string& inputstring, std::string& outputstring)
{
for (unsigned int i = 0; i < inputstring.length(); i++) {
int count = 1;
while (inputstring[i] == inputstring[i + 1]) {
count++;
i++;
}
if (count <= 1) {
outputstring += inputstring[i];
}
else {
outputstring += std::to_string(count);
outputstring += inputstring[i];
}
}
}
bool alpha_or_space(const char c)
{
return isalpha(c) || c == ' ';
}
void Decompress(std::string& compressed, std::string& original)
{
size_t i = 0;
size_t repeat;
while (i < compressed.length())
{
// normal alpha characters
while (alpha_or_space(compressed[i]))
original.push_back(compressed[i++]);
// repeat number
repeat = 0;
while (isdigit(compressed[i]))
repeat = 10 * repeat + (compressed[i++] - '0');
// unroll repeated characters
auto char_to_unroll = compressed[i++];
while (repeat--)
original.push_back(char_to_unroll);
}
}
int main()
{
std::string deco, outp, inp = "aaahhhhiii kkkjjhh ikl wwwwwweeeett";
Encode(inp, outp);
Decompress(outp, deco);
std::cout << inp << std::endl << outp << std::endl<< deco;
return 0;
}
The decompression can't possibly work in an unambiguous way because you didn't define a sentinel character; i.e. given the compressed stream it's impossible to determine whether a number is an original single number or represents the repeat count of an RLE command. I would suggest using '0' as the sentinel char. While encoding, if you see '0' you just output 010. Any other char X will translate to 0NX where N is the repeat byte counter. If you go over 255, just output a new RLE repeat command.
I have the IR code in form of hexadecimal stored in string (without 0x prefix) which has to be transmited via sendNEC() from IRremote.h. What's the easiest way to convert string in form like "FFFFFF" to 0xFFFFFF?
If you convert every character of the string to its numeric hex value, you then only need to weight each digit by the corresponding power of 16. The conversion returns a number, which can be represented in several ways (hex in your case):
std::string w("ABCD");
unsigned int res = 0;
for (int i = w.length()-1; i >= 0; i--)
{
unsigned int t = parseCharToHex(w[w.length() - 1 - i]);
std::cout << std::hex << t << std::endl;
res += pow(16, i) * t;
}
std::cout << "res: " << std::dec << res << std::endl;
The function parseCharToHex:
unsigned int parseCharToHex(const char charX)
{
if ('0' <= charX && charX <= '9') return charX - '0';
if ('a' <= charX && charX <= 'f') return 10 + charX - 'a';
if ('A' <= charX && charX <= 'F') return 10 + charX - 'A';
return 0; // not a hex digit; without this return the behaviour is undefined
}
The pow function is required; see the Arduino documentation.
This is dirty code which works well under severe limitations:
no error checking is needed
the string format is exactly known and is F...F'\0'
it is assumed that the codes for '0' to '9' and for 'A' to 'F' are consecutive and increasing
The trick is to use the character codes for the calculations:
const char * in = "FFFFFF"; // example input; must point to a null-terminated string
uint64_t out=0;
int counter;
for(counter=0; in[counter]; counter++){
if(in[counter]>='0'&&in[counter]<='9') {
out*=0x10;
out+=in[counter]-'0';
} else {
//assuming that character is from 'A' to 'F'
out*=0x10;
out+=in[counter]-'A'+10;
}
}
What's the fastest way to convert a string represented by (const char*, size_t) to an int?
The string is not null-terminated.
Both these ways involve a string copy (and more) which I'd like to avoid.
And yes, this function is called a few million times a second. :p
int to_int0(const char* c, size_t sz)
{
return atoi(std::string(c, sz).c_str());
}
int to_int1(const char* c, size_t sz)
{
return boost::lexical_cast<int>(std::string(c, sz));
}
Given a counted string like this, you may be able to gain a little speed by doing the conversion yourself. Depending on how robust the code needs to be, this may be fairly difficult though. For the moment, let's assume the easiest case -- that we're sure the string is valid, containing only digits, (no negative numbers for now) and the number it represents is always within the range of an int. For that case:
int to_int2(char const *c, size_t sz) {
int retval = 0;
for (size_t i=0; i<sz; i++) {
retval *= 10;
retval += c[i] -'0';
}
return retval;
}
From there, you can get about as complex as you want -- handling leading/trailing whitespace, '-' (but doing so correctly for the maximally negative number in 2's complement isn't always trivial [edit: see Nawaz's answer for one solution to this]), digit grouping, etc.
Another slow version, for uint32:
void str2uint_aux(unsigned& number, unsigned& overflowCtrl, const char*& ch)
{
unsigned digit = *ch - '0';
++ch;
number = number * 10 + digit;
unsigned overflow = (digit + (256 - 10)) >> 8;
// if digit < 10 then overflow == 0
overflowCtrl += overflow;
}
unsigned str2uint(const char* s, size_t n)
{
unsigned number = 0;
unsigned overflowCtrl = 0;
// for VC++10 the Duff's device is faster than loop
switch (n)
{
default:
throw std::invalid_argument(__FUNCTION__ " : `n' too big");
case 10: str2uint_aux(number, overflowCtrl, s);
case 9: str2uint_aux(number, overflowCtrl, s);
case 8: str2uint_aux(number, overflowCtrl, s);
case 7: str2uint_aux(number, overflowCtrl, s);
case 6: str2uint_aux(number, overflowCtrl, s);
case 5: str2uint_aux(number, overflowCtrl, s);
case 4: str2uint_aux(number, overflowCtrl, s);
case 3: str2uint_aux(number, overflowCtrl, s);
case 2: str2uint_aux(number, overflowCtrl, s);
case 1: str2uint_aux(number, overflowCtrl, s);
}
// here we can check that all chars were digits
if (overflowCtrl != 0)
throw std::invalid_argument(__FUNCTION__ " : `s' is not a number");
return number;
}
Why is it slow? Because it processes chars one by one. If we had a guarantee that we can access bytes up to s+16, we could use vectorization for *ch - '0' and digit + 246.
Like in this code:
uint32_t digitsPack = *(uint32_t*)s - '0000';
overflowCtrl |= digitsPack | (digitsPack + 0x06060606); // if one byte is not in range [0;10), high nibble will be non-zero
// & binds more loosely than +, so each digit extraction needs its own parentheses.
// The order below assumes the first character lands in the most significant byte (big-endian load).
number = number * 10 + ((digitsPack >> 24) & 0xFF);
number = number * 10 + ((digitsPack >> 16) & 0xFF);
number = number * 10 + ((digitsPack >> 8) & 0xFF);
number = number * 10 + (digitsPack & 0xFF);
s += 4;
Small update for range checking:
the first snippet has redundant shift (or mov) on every iteration, so it should be
unsigned digit = *s - '0';
overflowCtrl |= (digit + 256 - 10);
...
if (overflowCtrl >> 8 != 0) throw ...
Fastest:
int to_int(char const *s, size_t count)
{
int result = 0;
size_t i = 0 ;
if ( s[0] == '+' || s[0] == '-' )
++i;
while(i < count)
{
if ( s[i] >= '0' && s[i] <= '9' )
{
//see Jerry's comments for explanation why I do this
int value = (s[0] == '-') ? ('0' - s[i] ) : (s[i]-'0');
result = result * 10 + value;
}
else
throw std::invalid_argument("invalid input string");
i++;
}
return result;
}
Since in the above code, the comparison (s[0] == '-') is done in every iteration, we can avoid this by calculating result as negative number in the loop, and then return result if s[0] is indeed '-', otherwise return -result (which makes it a positive number, as it should be):
int to_int(char const *s, size_t count)
{
size_t i = 0 ;
if ( s[0] == '+' || s[0] == '-' )
++i;
int result = 0;
while(i < count)
{
if ( s[i] >= '0' && s[i] <= '9' )
{
result = result * 10 - (s[i] - '0'); //assume negative number
}
else
throw std::invalid_argument("invalid input string");
i++;
}
return s[0] == '-' ? result : -result; //-result is positive!
}
That is an improvement!
In C++11, you could however use any function from std::stoi family. There is also std::to_string family.
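If C++17 is available, std::from_chars from <charconv> works directly on a character range, so it needs neither a copy nor a null terminator. The sketch below is only one way to wrap it; the name to_int_from_chars and the choice to throw on any error are assumptions made for this example. Note that std::from_chars accepts a leading '-' but not a leading '+':

```cpp
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <system_error>

// Hypothetical wrapper around std::from_chars (requires C++17).
int to_int_from_chars(const char* c, std::size_t sz)
{
    int value = 0;
    const auto res = std::from_chars(c, c + sz, value);
    // Fail on parse errors, on overflow, and on trailing non-digit characters.
    if (res.ec != std::errc() || res.ptr != c + sz)
        throw std::invalid_argument("invalid input string");
    return value;
}
```

Only the first sz characters are examined, so the function can be called on a slice in the middle of a larger buffer.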
llvm::StringRef s(c,sz);
int n;
s.getAsInteger(10,n);
return n;
http://llvm.org/docs/doxygen/html/classllvm_1_1StringRef.html
You'll have to either write custom routine or use 3rd party library if you're dead set on avoiding string copy.
You probably don't want to write atoi from scratch (it is still possible to make a bug here), so I'd advise to grab existing atoi from public domain or BSD-licensed code and modify it. For example, you can get existing atoi from FreeBSD cvs tree.
If you run the function that often, I bet you parse the same number many times. My suggestion is to BCD encode the string into a static char buffer (you know it's not going to be very long, since atoi only can handle +-2G) when there's less than X digits (X=8 for 32 bit lookup, X=16 for 64 bit lookup) then place a cache in a hash map.
When you're done with the first version, you can probably find nice optimizations, such as skipping the BCD encoding entirely and just using X characters in the string (when length of string <= X) for lookup in the hash table. If the string is longer, you fallback to atoi.
Edit: ... or fallback instead of atoi to Jerry Coffin's solution, which is as fast as they come.
Given a string, say " a 19 b c d 20", how do I test whether there is a number at a particular position in the string (not just the character '1' but the whole number '19' or '20')?
char s[80];
strcpy(s,"a 19 b c d 20");
int i=0;
int num=0;
int digit=0;
for (i =0;i<strlen(s);i++){
if ((s[i] <= '9') && (s[i] >= '0')){ //how do i test for the whole integer value not just a digit
//if number then convert to integer
digit = s[i]-48;
num = num*10+digit;
}
if (s[i] == ' '){
break; //is this correct here? do nothing
}
if (s[i] == 'a'){
//copy into a temp char
}
}
These are C solutions:
Are you just trying to parse the numbers out of the string? Then you can just walk the string using strtol().
long num = 0;
char *endptr = NULL;
while (*s) {
num = strtol(s, &endptr, 10);
if (endptr == s) { // Not a number here, move on.
s++;
continue;
}
// Found a number and it is in num. Move to next location.
s = endptr;
// Do something with num.
}
If you have a specific location and number to check for you can still do something similar.
For example: Is '19' at position 10?
int pos = 10;
int value = 19;
if (pos >= strlen(s))
return false;
if (value == strtol(s + pos, &endptr, 10) && endptr != s + pos)
return true;
return false;
Are you trying to parse out the numbers without using any library routines?
Note: I haven't tested this...
int num=0;
int sign=1;
while (*s) {
// This could be done with an if, too.
switch (*s) {
case '-':
sign = -1;
case '+':
s++;
if (*s < '0' || *s > '9') {
sign = 1;
break;
}
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
// Parse number, start with zero.
num = 0;
do {
num = (num * 10) + (*s - '0');
s++;
} while (*s >= '0' && *s <= '9');
num *= sign;
// Restore sign, just in case
sign = 1;
// Do something with num.
break;
default:
// Not a number
s++;
}
}
It seems like you want to parse the string and extract all the numbers from it; if so, here's a more "C++" way to do it:
string s = "a 19 b c d 20"; // your char array will work fine here too
istringstream buffer(s);
string token;
int num;
while (!buffer.eof())
{
buffer >> num; // Try to read a number
if (!buffer.fail()) { // if it doesn't work, failbit is set
cout << num << endl; // It's a number, do what you want here
} else {
buffer.clear(); // wasn't a number, clear the failbit
buffer >> token; // pull out the non-numeric token
}
}
This should print out the following:
19
20
The stream extraction operator pulls out space-delimited tokens automatically, so you're saved from having to do any messy character-level operations or manual integer conversion. You'll need to #include <sstream> for the stringstream class.
You can use atoi().
After your if, you need to switch to a while loop to collect the subsequent digits until you hit a non-digit.
BUT, more importantly, have you clearly defined your requirements? Will you allow whitespace between the digits? What if there are two numbers, like abc123def456gh?
It's not very clear what you are looking for. Assuming you want to extract all the digits from a string and then form one whole number from the digits found, you can try the following:
int i;
unsigned long num=0; // to hold the whole number.
for (i = 0; s[i] != '\0'; i++) {
// if the ith char is a digit, append it to the number
if (isdigit((unsigned char)s[i])) {
num = num * 10 + (s[i] - '0');
}
}
It is assumed that all the digits in your string, when concatenated to form the whole number, will not overflow the unsigned long data type.
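If each run of digits should instead be kept as a separate number (so that abc123def456gh yields 123 and 456), a minimal sketch could look like this. The function name extract_numbers is an assumption for this example, and overflow is not checked here either:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects every run of consecutive digits as one number.
std::vector<unsigned long> extract_numbers(const std::string& s)
{
    std::vector<unsigned long> numbers;
    std::size_t i = 0;
    while (i < s.size())
    {
        if (std::isdigit(static_cast<unsigned char>(s[i])))
        {
            unsigned long num = 0;
            // Consume the whole run of digits before storing the number.
            while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i])))
            {
                num = num * 10 + static_cast<unsigned long>(s[i] - '0');
                ++i;
            }
            numbers.push_back(num);
        }
        else
        {
            ++i; // not a digit, skip it
        }
    }
    return numbers;
}
```

Both loops test the index against s.size() before reading a character, so the scan never runs past the end of the string.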
There's no way to test for a whole number. Writing a lexer, as you've done is one way to go. Another would be to try and use the C standard library's strtoul function (or some similar function depending on whether the string has floating point numbers etc).
Your code needs to allow for whitespaces and you can use the C library's isdigit to test if the current character is a digit or not:
vector<int> parse(string const& s) {
vector<int> vi;
for (size_t i = 0; i < s.length();) {
while (::isspace((unsigned char)s[ i ])) i++;
if (::isdigit((unsigned char)s[ i ])) {
int num = 0; // start at zero; the loop below adds every digit, including the first
while (::isdigit((unsigned char)s[ i ])) {
num = num * 10 + (s[ i ] - '0');
++i;
}
vi.push_back(num);
}
....
Another approach will be to use boost::lexical_cast:
vector<string> tokenize(string const& input) {
vector<string> tokens;
size_t off = 0, start = 0;
while ((off = input.find(' ', start)) != string::npos) {
tokens.push_back(input.substr(start, off-start));
start = off + 1;
}
tokens.push_back(input.substr(start)); // keep the token after the last space
return tokens;
}
vector<int> getint(vector<string> const& tokens) {
vector<int> vi;
for (vector<string>::const_iterator b = tokens.begin(), e = tokens.end(); b != e; ++b) {
try
{
vi.push_back(lexical_cast<int>(*b)); // store the converted number
}
catch(bad_lexical_cast &) {} // not a number, skip this token
}
return vi;
}