I am writing a small program to convert the hex representation of a string; it is a kata to improve my skills.
This is what I have come up with:
std::vector<int> decimal( std::string const & s )
{
auto getint = [](char const k){
switch(k){
case 'f':
return 15;
case 'e':
return 14;
case 'd':
return 13;
case 'c':
return 12;
case 'b':
return 11;
case 'a':
return 10;
case '9':
return 9;
case '8':
return 8;
case '7':
return 7;
case '6':
return 6;
case '5':
return 5;
case '4':
return 4;
case '3':
return 3;
case '2':
return 2;
case '1':
return 1;
case '0':
return 0;
}
return -1; // not a valid hex digit
};
std::vector<int> result;
for( auto const & k : s )
{
result.push_back(getint(k));
}
return result;
}
I was wondering if there is another way to do this. I have considered using something like a std::map as well, but I am uncertain which one might be faster. If there is another way to do this, please add it.
Please keep in mind that I am doing this as a code-kata to improve my skills, and learn.
Thanks in advance!
To start with, you can probably simplify your logic like so:
auto getint = [](char const k){
if(k >= 'a' && k <= 'f') return (k - 'a' + 10);
else if(k >= 'A' && k <= 'F') return (k - 'A' + 10);
else if(k >= '0' && k <= '9') return (k - '0');
else return -1;
};
Beyond that, there may exist a Standard Library function that does exactly this, which you might prefer depending on your specific needs.
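For example, if you want the whole string interpreted as one number rather than digit by digit, std::stoi accepts a base argument (a minimal sketch; error handling via the exceptions std::stoi throws is omitted):

#include <iostream>
#include <string>

int main()
{
    std::string s = "1f";
    int value = std::stoi(s, nullptr, 16); // parse as base 16
    std::cout << value << '\n';            // prints 31
}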
For the decimal digits it's very easy to convert a character to its digit, as the C++ specification requires the digit characters '0' through '9' to be consecutive in every encoding. That means you can convert a character to a number by just subtracting '0', e.g. k - '0'. There is no such requirement for the letters, though; in the most common encoding (ASCII) they happen to be consecutive as well, but that should not be counted on if you want to be portable.
You could also do it using e.g. std::transform and std::back_inserter, so no need for your own loop. Perhaps something like
std::transform(std::begin(s), std::end(s), std::back_inserter(result), getint);
In the getint function you could use std::isxdigit and std::isdigit to check whether the character is a valid hexadecimal or decimal digit, respectively. You should probably also use std::tolower in case the hexadecimal digits are upper-case.
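A sketch of getint along those lines (like the k - '0' trick above, the letter arithmetic assumes an ASCII-like encoding where 'a' through 'f' are consecutive):

#include <cctype>

auto getint = [](char const k) {
    unsigned char const c = static_cast<unsigned char>(k); // avoid UB for negative char
    if (!std::isxdigit(c)) return -1;                      // not a hex digit at all
    if (std::isdigit(c)) return c - '0';                   // '0'..'9'
    return std::tolower(c) - 'a' + 10;                     // 'a'..'f', either case
};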
You can use strtol or strtoll to do most of the heavy lifting of converting a base-16 string to an integer value.
Then convert back to a regular string using a stringstream object.
// parse hex string with strtol
long value = ::strtol(s.c_str(), nullptr, 16); //not shown - checking for errors. Read the manual page for more info
// convert value back to base-10 string
std::stringstream st;
st << value;
std::string result = st.str();
return result;
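If C++11 is available, std::to_string produces the same base-10 string without the stringstream:

std::string result = std::to_string(value); // replaces the stringstream above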
Related
I'm working on a program that converts from Roman to decimal. I have to validate two things: first, that the characters entered are M, D, C, L, X, V, or I, in other words valid for processing.
Second, I have to make sure that bigger character values go first, and if not, print an error message and have the user try again (this is the part where I am stuck).
For instance, if I wanted to input 9 and I typed IX, it should display an error message because it is not in additive form; it should be VIIII. How can I code this so it compares characters to know whether bigger letter values come first, and so on?
I keep getting incorrect validation.
Is there a way to assign a value to the letters in the string? I'm thinking of comparing them as int values, which I know how to do, and from there validating the input format.
void RomanNum::setRomanNumber() //get input and calculate decimal equivalent
{
//I 1, V 5, X 10, L 50, C 100, D 500, M 1000
int value = 0;
string input;
char current, next;
enum validationData { M, D, C, L, X, V, I };
bool validationCharacters = true;
//bool validationAdditiveForm = true;
getline(cin, input, '\n');
for (int i = 0; i < input.length(); i++) //calculate each Roman letter at a time
{
current = input[i];
next = current + 1;
if (current >= validationData(next))
{
switch (input[i])
{
case 'M':
value += 1000;
break;
case 'D':
value += 500;
break;
case 'C':
value += 100;
break;
case 'L':
value += 50;
break;
case 'X':
value += 10;
break;
case 'V':
value += 5;
break;
case 'I':
value += 1;
break;
default:
validationCharacters = false;
break;
}
}
else
{
cout << "\nInvalid order. Bigger values go first\n";
}
}
}
I would recommend a std::map<char, int> to hold the mapping between letters and values.
With the map, you can then convert the input string (a sequence of characters) to a sequence of values (std::vector<int>). From there on, it's just a single check to see if the vector is sorted, and a single function call to add up all values. (I'll leave finding the right function as homework)
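For reference, here is a minimal sketch of that idea (spoiler for the homework: the check is std::is_sorted with std::greater, the sum is std::accumulate; the helper name romanToDecimal is made up):

#include <algorithm>
#include <functional>
#include <map>
#include <numeric>
#include <string>
#include <vector>

int romanToDecimal(const std::string& input, bool& valid)
{
    static const std::map<char, int> values{
        {'I', 1}, {'V', 5}, {'X', 10}, {'L', 50},
        {'C', 100}, {'D', 500}, {'M', 1000}};
    std::vector<int> v;
    for (char c : input) {
        auto it = values.find(c);
        if (it == values.end()) { valid = false; return 0; } // invalid character
        v.push_back(it->second);
    }
    // Additive form: values must never increase from left to right.
    valid = std::is_sorted(v.begin(), v.end(), std::greater<int>());
    return std::accumulate(v.begin(), v.end(), 0);
}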
I have char byte[0] = '1' (hex 0x31) and byte[1] = 'C' (hex 0x43).
I am using one more buffer, char hex_buff[], and I want hex_buff[0] to hold the combined hex content 0x1C (i.e. the combination of byte[0] and byte[1]).
I was using the code below, but I realized that it is only valid for the hex values 0-9:
char s_nibble1 = (byte[0]<< 4)& 0xf0;
char s_nibble2 = byte[1]& 0x0f;
hex_buff[0] = s_nibble1 | s_nibble2;// here i want to have 0x1C instead of 0x13
What keeps you from using strtol()?
char bytes[] = "1C";
char buff[1];
buff[0] = strtol(bytes, NULL, 16); /* Sets buff[0] to 0x1c aka 28. */
To add to this, as per chux's comment: strtol() only operates on 0-terminated character arrays, which is not necessarily the case in the OP's question.
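If the two characters are not followed by a terminator, copying them into a small zero-terminated buffer first sidesteps that (a sketch, assuming byte holds exactly two hex characters as in the question):

char tmp[3] = { byte[0], byte[1], '\0' }; // make a proper C string
hex_buff[0] = (char) strtol(tmp, NULL, 16);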
A possible way to do it, without dependencies on other character-manipulation functions:
char hex2byte(char *hs)
{
char b = 0;
char nibbles[2];
int i;
for (i = 0; i < 2; i++) {
if ((hs[i] >= '0') && (hs[i] <= '9'))
nibbles[i] = hs[i]-'0';
else if ((hs[i] >= 'A') && (hs[i] <= 'F'))
nibbles[i] = (hs[i]-'A')+10;
else if ((hs[i] >= 'a') && (hs[i] <= 'f'))
nibbles[i] = (hs[i]-'a')+10;
else
return 0;
}
b = (nibbles[0] << 4) | nibbles[1];
return b;
}
For example: hex2byte("a1") returns the byte 0xa1.
In your case, you should call the function as: hex_buff[0] = hex2byte(byte).
You are trying to get the nibble by masking out bits of the character code rather than subtracting the actual value. This is not going to work because the range is disconnected: there is a gap between ['0'..'9'] and ['A'..'F'] in the encoding, so masking fails for the letters.
You can fix this by adding a small helper function, and using it twice in your code:
int hexDigit(char c) {
c = toupper(c); // Allow mixed-case letters
switch(c) {
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9': return c-'0';
case 'A':
case 'B':
case 'C':
case 'D':
case 'E':
case 'F': return c-'A'+10;
default: break; // not a hex digit; fall out to the error return
}
return -1;
}
Now you can code your conversion like this:
int val = (hexDigit(byte[0]) << 4) | hexDigit(byte[1]);
It looks like you are trying to convert ASCII hex into internal representation.
There are many ways to do this, but the one I use most often for each nibble is:
int nibval(unsigned short x)
{
if (('0' <= x) && ('9' >= x))
{
return x - '0';
}
if (('a' <= x) && ('f' >= x))
{
return x - ('a' - 10);
}
if (('A' <= x) && ('F' >= x))
{
return x - ('A' - 10);
}
// Invalid input
return -1;
}
This uses an unsigned short parameter so that it will work for single-byte characters as well as wchar_t characters.
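Hypothetical usage for the OP's two characters, mirroring the expression in the previous answer:

int val = (nibval(byte[0]) << 4) | nibval(byte[1]); // 0x1C for byte = {'1','C'}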
I'm currently reverse engineering a network protocol, and I wrote a small decryption tool.
I used to define the bytes of the packet in an unsigned character array, like so:
unsigned char buff[] = "\x00\xFF\x0A" etc.
In order not to recompile the program multiple times per packet, I made a small GUI tool that gets the bytes in \xFF notation from a string. I did this the following way:
int length = int(stencString.length());
unsigned char *buff = new unsigned char[length+1];
memcpy(buff, stencString.c_str(), length+1);
When I call my function, it gives me a proper decryption when I hardcode the buffer using the prior method, but it gives me garbage followed by the rest of my string when I memcpy from the string to the array. The creepy part? They both have the same print output!
Here's how I'm using it:
http://pastie.org/private/kndfbaqgvmjiuwlounss9g
Here's kdxalgo.h (c) Luigi Auriemma:
http://pastie.org/private/7dzemmwyyqtngiamlxy8tw
Can someone point me in the right direction?
Thanks!
See what happens when you use the following for the hardcoded version of buff.
unsigned char buff[] =
"\\xd3\\x8c\\x38\\x6b\\x82\\x4c\\xe1\\x1e"
"\\x6b\\x7a\\xff\\x4c\\x9d\\x73\\xbe\\xab"
"\\x38\\xc7\\xc5\\xb8\\x71\\x8f\\xd5\\xbb"
"\\xfa\\xb9\\xf3\\x7a\\x43\\xdd\\x12\\x41"
"\\x4b\\x01\\xa2\\x59\\x74\\x60\\x1e\\xe0"
"\\x6d\\x68\\x26\\xfa\\x0a\\x63\\xa3\\x88";
I have a suspicion that it will produce the same output as you entering the following: \xd3\x8c\x38\x6b\x82\x4c\xe1\x1e\x6b\x7a\xff\x4c\x9d\x73\xbe\xab\x38\xc7\xc5\xb8\x71\x8f\xd5\xbb\xfa\xb9\xf3\x7a\x43\xdd\x12\x41\x4b\x01\xa2\x59\x74\x60\x1e\xe0\x6d\x68\x26\xfa\x0a\x63\xa3\x88.
The compiler automatically takes "\xd3" and converts it into the expected underlying binary representation. You need to have a method of converting the characters backslash, x, d, 3 into the same binary representation.
If you are certain that you will receive properly formatted input, then the answer isn't too hard:
unsigned char c2h(char ch)
{
switch (ch)
{
case '0': return 0;
case '1': return 1;
case '2': return 2;
case '3': return 3;
case '4': return 4;
case '5': return 5;
case '6': return 6;
case '7': return 7;
case '8': return 8;
case '9': return 9;
case 'a': return 10;
case 'b': return 11;
case 'c': return 12;
case 'd': return 13;
case 'e': return 14;
case 'f': return 15;
}
}
std::string handle_hex(const std::string& str)
{
std::string result;
for (size_t index = 0; index < str.length(); index += 4) // each "\xHH" escape is 4 chars
{
// str[index + 0] is '\\' and str[index + 1] is 'x'
unsigned char ch = c2h(str[index+2]) * 16 + c2h(str[index+3]);
result.push_back((char)ch);
}
return result;
}
Again, this assumes perfect formatting, so there is no error handling. I know I'll lose some points for this answer because it's not the best way of doing this, but I want to make the algorithm as easy to understand as possible.
The problem, as Jeffery points out, is that the compiler processes the \xd3 and generates a character with that value, but when you read into a string \xd3 you are actually reading 4 characters: \, x, d and 3.
You will need to read the string and then parse it into valid contents. For a simple approach, you can change the format so that the input is a space-separated sequence of characters encoded as 0xd3 (this is really simple to parse):
std::string buffer;
std::string input( "0xd3 0x8c 0x38" ); // this would be read
std::istringstream in( input );
in >> std::hex;
std::copy( std::istream_iterator<int>( in ),
std::istream_iterator<int>(),
std::back_inserter( buffer ) );
Of course, there is no need to change the format, you can process it. For that you will only need to read one character at a time. When you encounter a \ then read the next character, if it is x then read the next two characters (say ch1 and ch2) and transform them into an integer value:
int value_of_hex( char ch ) {
if (ch >= '0' && ch <= '9')
return ch-'0';
if (tolower(ch) >= 'a' && tolower(ch) <= 'f')
return 10 + tolower(ch) - 'a';
// error
throw std::runtime_error( "Invalid input" );
}
value = value_of_hex( ch1 )*16 + value_of_hex( ch2 );
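A sketch of the full character-at-a-time loop described above, built on value_of_hex (the name decode_escapes is made up; a backslash that does not start a complete \xHH sequence is copied through unchanged, and bad hex digits propagate the exception from value_of_hex):

std::string decode_escapes(const std::string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 3 < in.size() && in[i + 1] == 'x') {
            // "\xHH": convert the two hex digits to one byte
            int value = value_of_hex(in[i + 2]) * 16 + value_of_hex(in[i + 3]);
            out.push_back(static_cast<char>(value));
            i += 3; // skip past "xHH"
        } else {
            out.push_back(in[i]); // ordinary character, copy as-is
        }
    }
    return out;
}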
Assume one has a vector or array of N elements (N can be very large) containing the octal representation of a non-negative integer. How do I get the decimal representation of the number from this array? The code has to be really fast.
EDIT: the array A of N elements contains the octal representation of a non-negative integer K, i.e. each element of A belongs to the interval [0, 7] (both ends included).
Example: A[0] = 2; A[1] = 6; A[2] = 3
Now a naive calculation would be 2*8^0 + 6*8^1 + 3*8^2 = 2 + 48 + 192 = 242.
I tried this, but it does not seem to work for large inputs > 6K:
//vector<int> A is the input
using namespace std;
vector<int>::iterator it = A.begin();
unsigned int k = 0;
unsigned int x = 0;
while(it < A.end()){
x = x | (*it<<3*k);
k++;
it++;
}
I am also having problems converting a hexadecimal string to its decimal representation. Is this the correct way to do this in C++?
//Assume S to be your input string containing a hex representation
//like F23
std::stringstream ss;
ss << std::hex << S;
ss >> x;
Arbitrary-precision octal to decimal conversion is rather annoying because there is no way to localize the computation. In other words, a change in the most significant digit of the octal number can change even the least significant digit of the decimal representation.
That said, I think I would convert the octal number to, say, a base-1000000000 number and then print that (this is instead a trivial problem: each base-1000000000 digit maps trivially to 9 base-10 digits).
The conversion to base-1000000000 is simple because you only need to support incrementing and multiplying by two (just consider the input as binary with three bits for each octal digit).
EDIT
I tried to implement it in C++ and this is the resulting code
#include <stdio.h>
#include <vector>
int main(int argc, const char *argv[]) {
// Base 100,000,000 accumulator
// Initialized with one digit = 0
std::vector<unsigned> big(1);
const unsigned DIGIT = 100000000;
for (int c=getchar(); c >= '0' && c <= '7'; c=getchar()) {
// Multiply accumulator by 8 and add in incoming digit
int carry = c - '0';
for (int i=0,n=big.size(); i<n; i++) {
unsigned x = big[i] * 8 + carry;
carry = x / DIGIT;
big[i] = x - carry * DIGIT;
}
if (carry) big.push_back(carry);
}
// Output result in decimal
printf("%i", big.back());
for (int i=int(big.size())-2; i>=0; i--) {
printf("%08i", big[i]);
}
putchar('\n');
return 0;
}
On my PC the time to convert an 80,000 digit octal number to decimal (resulting in 72246 digits) is about 1.2 seconds. Doing the same using python eval/str the time is about 3 seconds. The number used was "01234567" * 10000.
The code above uses 100,000,000 as base so that it can process one digit (3 bits) at a time with 32-bit arithmetic not overflowing with the intermediate results. I tried also using 64 bit integers or the 53 bit integer part of a double but the code was running always slower than in this case (one reason is probably the division in the inner loop that can be converted to a multiplication in the 32 bit case).
This is still a simple O(n^2) implementation that would take ages to convert a 10,000,000-digits octal number.
This is what I came up with:
#include <iostream>

template<typename Iter>
int read_octal(Iter begin, Iter end)
{
int x = 0;
int f = 1;
for (; begin != end; ++begin)
{
x += *begin * f;
f *= 8;
}
return x;
}
int main()
{
int test[] = {2, 6, 3};
int result = read_octal(test + 0, test + 3);
std::cout << result << '\n';
}
I tried this but it does not seem to work for large inputs > 6K
What exactly do you mean by 6k? An int usually has 32 bits, and an octal digit has 3 bits. Thus, you cannot have more than 10 elements in your range, otherwise x will overflow.
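If the inputs can be longer than that, one cheap stopgap before going to real arbitrary precision (as in the other answer) is a wider accumulator plus an explicit length check; this hypothetical read_octal64 variant fails loudly instead of silently overflowing:

#include <iterator>
#include <stdexcept>

template<typename Iter>
unsigned long long read_octal64(Iter begin, Iter end)
{
    // 21 octal digits use 63 bits, the most that always fits in 64.
    if (std::distance(begin, end) > 21)
        throw std::overflow_error("octal value too large for 64 bits");
    unsigned long long x = 0, f = 1;
    for (; begin != end; ++begin) { // least significant digit first, as in read_octal
        x += static_cast<unsigned long long>(*begin) * f;
        f *= 8;
    }
    return x;
}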
I am also having problems converting a hexadecimal string to its decimal representation.
Well, you could always write a function to parse a string in hex format yourself:
int parse_hex(const char* p)
{
int x = 0;
for (; *p; ++p)
{
x = x * 16 + digit_value(*p);
}
return x;
}
With the most portable version of digit_value being:
int digit_value(char c)
{
switch (c)
{
case '0': return 0;
case '1': return 1;
case '2': return 2;
case '3': return 3;
case '4': return 4;
case '5': return 5;
case '6': return 6;
case '7': return 7;
case '8': return 8;
case '9': return 9;
case 'A':
case 'a': return 10;
case 'B':
case 'b': return 11;
case 'C':
case 'c': return 12;
case 'D':
case 'd': return 13;
case 'E':
case 'e': return 14;
case 'F':
case 'f': return 15;
}
return -1; // not a hex digit
}
Given a string, say "a 19 b c d 20", how do I test whether at a particular position in the string there is a number? (Not just the character '1', but the whole number 19 or 20.)
char s[80];
strcpy(s,"a 19 b c d 20");
int i=0;
int num=0;
int digit=0;
for (i =0;i<strlen(s);i++){
if ((s[i] <= '9') && (s[i] >= '0')){ //how do i test for the whole integer value not just a digit
//if number then convert to integer
digit = s[i]-48;
num = num*10+digit;
}
if (s[i] == ' '){
break; //is this correct here? do nothing
}
if (s[i] == 'a'){
//copy into a temp char
}
}
These are C solutions:
Are you just trying to parse the numbers out of the string? Then you can just walk the string using strtol().
long num = 0;
char *endptr = NULL;
while (*s) {
num = strtol(s, &endptr, 10);
if (endptr == s) { // Not a number here, move on.
s++;
continue;
}
// Found a number and it is in num. Move to next location.
s = endptr;
// Do something with num.
}
If you have a specific location and number to check for you can still do something similar.
For example: Is '19' at position 10?
int pos = 10;
int value = 19;
if (pos >= strlen(s))
return false;
if (value == strtol(s + pos, &endptr, 10) && endptr != s + pos)
return true;
return false;
Are you trying to parse out the numbers without using any library routines?
Note: I haven't tested this...
int num=0;
int sign=1;
while (*s) {
// This could be done with an if, too.
switch (*s) {
case '-':
sign = -1;
case '+':
s++;
if (*s < '0' || *s > '9') {
sign = 1;
break;
}
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
// Parse number, start with zero.
num = 0;
do {
num = (num * 10) + (*s - '0');
s++;
} while (*s >= '0' && *s <= '9');
num *= sign;
// Restore sign, just in case
sign = 1;
// Do something with num.
break;
default:
// Not a number
s++;
}
}
It seems like you want to parse the string and extract all the numbers from it; if so, here's a more "C++" way to do it:
string s = "a 19 b c d 20"; // your char array will work fine here too
istringstream buffer(s);
string token;
int num;
while (!buffer.eof())
{
buffer >> num; // Try to read a number
if (!buffer.fail()) { // if it doesn't work, failbit is set
cout << num << endl; // It's a number, do what you want here
} else {
buffer.clear(); // wasn't a number, clear the failbit
buffer >> token; // pull out the non-numeric token
}
}
This should print out the following:
19
20
The stream extraction operator pulls out space-delimited tokens automatically, so you're saved from having to do any messy character-level operations or manual integer conversion. You'll need to #include <sstream> for the stringstream class.
You can use atoi().
After your if you need to switch to a while to collect subsequent digits until you hit a non-digit.
BUT, more importantly, have you clearly defined your requirements? Will you allow whitespace between the digits? What if there are two numbers, like abc123def456gh?
It's not very clear what you are looking for. Assuming you want to extract all the digits from a string and then form a whole number from the found digits, you can try the following:
int i;
unsigned long num=0; // to hold the whole number.
int digit;
for (i = 0; s[i] != '\0'; i++) {
// see if the ith char is a digit..if yes extract consecutive digits
while(isdigit(s[i])) {
num = num * 10 + (s[i] - '0');
i++;
}
}
It is assumed that all the digits in your string, when concatenated to form the whole number, will not overflow the unsigned long data type.
There's no way to test for a whole number directly. Writing a lexer, as you've done, is one way to go. Another is to use the C standard library's strtoul function (or a similar function, depending on whether the string has floating-point numbers, etc.).
Your code needs to allow for whitespace, and you can use the C library's isdigit to test whether the current character is a digit:
vector<int> parse(string const& s) {
vector<int> vi;
for (size_t i = 0; i < s.length();) {
while (i < s.length() && ::isspace((unsigned char)s[ i ])) i++;
if (i < s.length() && ::isdigit((unsigned char)s[ i ])) {
int num = 0; // start from zero; the loop below consumes every digit exactly once
while (i < s.length() && ::isdigit((unsigned char)s[ i ])) {
num = num * 10 + (s[ i ] - '0');
++i;
}
vi.push_back(num);
}
....
Another approach would be to use boost::lexical_cast:
vector<string> tokenize(string const& input) {
vector<string> tokens;
size_t off = 0, start = 0;
while ((off = input.find(' ', start)) != string::npos) {
tokens.push_back(input.substr(start, off-start));
start = off + 1;
}
tokens.push_back(input.substr(start)); // don't forget the token after the last space
return tokens;
}
vector<int> getint(vector<string> tokens) {
vector<int> vi;
for (vector<string>::const_iterator b = tokens.begin(), e = tokens.end(); b != e; ++b) {
try
{
vi.push_back(lexical_cast<int>(*b));
}
catch(bad_lexical_cast &) {}
}
return vi;
}
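A hypothetical usage of the two helpers together (assuming the usual includes and using-declarations the snippets above already rely on):

int main()
{
    vector<int> numbers = getint(tokenize("a 19 b c d 20"));
    for (size_t i = 0; i < numbers.size(); ++i)
        cout << numbers[i] << '\n'; // prints 19, then 20
}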