I have a problem with assigning a new char value to an array. I don't know why I get the sign "<" even when n is 12. My program should reduce the expression in char* tab = "93+" to one value, in this case 12.
char* tab = "93+";
int b = sizeof (tab);
char* tmp = new char[b] ;
tmp [b-1] = '\0';
if (isdigit(tab[i]) && isdigit(tab[i+1])) {
    int n;
    if (tab[i+2] == '+' || tab[i+2] == '-' || tab[i+2] == '*') {
        switch (tab[i+2]) {
            case '+':
                n = (tab[i]-'0') + (tab[i+1]-'0');
                break;
            case '-':
                n = (tab[i]-'0') - (tab[i+1]-'0');
                break;
            case '*':
                n = (tab[i]-'0') * (tab[i+1]-'0');
                break;
        }
        tmp[i] = n+'0'; // I should have 12 but I get <
    }
    else if (tab[i+2] != '+' || tab[i+2] != '-' || tab[i+2] != '*') {
        goto LAB;
    }
}
The problem is in this line:
tmp[i] = n+'0'; // I should have 12 but I get <
n is 12, but 12 + '0' != '12', since '12' isn't a single character. You're storing into tmp[i] the char whose ASCII value is 12 more than '0', which is '<'.
I believe declaring (and treating) tmp as an int would be better for your purposes.
Also note that sizeof (tab) is the same as sizeof (char *), not sizeof ("93+"), so you'll likely always get b == 4 (on 32-bit machines).
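To see this concretely, here's a quick sketch (the exact pointer size depends on the platform):

const char* tab = "93+";
sizeof(tab);    // size of a char*, e.g. 4 or 8 -- not the string length
sizeof("93+");  // 4: three characters plus the terminating '\0'
strlen(tab);    // 3: the length you probably wanted (needs <cstring>)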
You indeed should get '<'. Here is why: tmp is an array of chars. You calculated n to be 12. This is correct. You then added '0' which is 48. 48 + 12 = 60. So you store 60 in tmp[i].
ASCII 60 is '<'.
You could use an int tmp, and not add the '0', and you would then get 12 in tmp[i].
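And if you need the textual form "12" rather than a single character, a minimal sketch (snprintf is just one option, not the only one) would be:

int n = 12;   // the computed value
char buf[16];
snprintf(buf, sizeof buf, "%d", n); // buf now holds the two characters "12" (needs <cstdio>)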
I have char byte[0] = '1' (0x31) and byte[1] = 'C' (0x43).
I am using one more buffer, char hex_buff[0]. I want it to hold the combined hex content, hex_buff[0] = 0x1C (i.e. the combination of byte[0] and byte[1]).
I was using the code below, but I realized it is only valid for the hex values 0-9:
char s_nibble1 = (byte[0]<< 4)& 0xf0;
char s_nibble2 = byte[1]& 0x0f;
hex_buff[0] = s_nibble1 | s_nibble2;// here i want to have 0x1C instead of 0x13
What keeps you from using strtol()?
char bytes[] = "1C";
char buff[1];
buff[0] = strtol(bytes, NULL, 16); /* Sets buff[0] to 0x1c aka 28. */
To add to this, as per chux's comment: strtol() only operates on 0-terminated character arrays, which need not be the case in the OP's question.
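If the input buffer is not 0-terminated, one workaround (a minimal sketch, not the only option) is to copy the two characters into a small terminated buffer first:

char bytes[2] = { '1', 'C' };                 /* not 0-terminated */
char tmp[3]   = { bytes[0], bytes[1], '\0' }; /* terminated copy */
char buff[1];
buff[0] = strtol(tmp, NULL, 16);              /* 0x1c, as before */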
A possible way to do it, without dependencies with other character manipulation functions:
char hex2byte(char *hs)
{
    char b = 0;
    char nibbles[2];
    int i;

    for (i = 0; i < 2; i++) {
        if ((hs[i] >= '0') && (hs[i] <= '9'))
            nibbles[i] = hs[i] - '0';
        else if ((hs[i] >= 'A') && (hs[i] <= 'F'))
            nibbles[i] = (hs[i] - 'A') + 10;
        else if ((hs[i] >= 'a') && (hs[i] <= 'f'))
            nibbles[i] = (hs[i] - 'a') + 10;
        else
            return 0; /* not a hex digit */
    }
    b = (nibbles[0] << 4) | nibbles[1];
    return b;
}
For example: hex2byte("a1") returns the byte 0xa1.
In your case, you should call the function as: hex_buff[0] = hex2byte(byte).
You are trying to get the nibble by masking out the bits of character code, rather than subtracting the actual value. This is not going to work, because the range is disconnected: there is a gap between [0..9] and [A-F] in the encoding, so masking is going to fail.
You can fix this by adding a small helper function, and using it twice in your code:
int hexDigit(char c) {
    c = toupper(c); // Allow mixed-case letters
    switch (c) {
        case '0':
        case '1':
        case '2':
        case '3':
        case '4':
        case '5':
        case '6':
        case '7':
        case '8':
        case '9': return c - '0';
        case 'A':
        case 'B':
        case 'C':
        case 'D':
        case 'E':
        case 'F': return c - 'A' + 10;
        default:  break; // Report an error below
    }
    return -1;
}
Now you can code your conversion like this:
int val = (hexDigit(byte[0]) << 4) | hexDigit(byte[1]);
It looks like you are trying to convert ASCII hex into internal representation.
There are many ways to do this, but the one I use most often for each nibble is:
int nibval(unsigned short x)
{
    if (('0' <= x) && ('9' >= x))
    {
        return x - '0';
    }
    if (('a' <= x) && ('f' >= x))
    {
        return x - ('a' - 10);
    }
    if (('A' <= x) && ('F' >= x))
    {
        return x - ('A' - 10);
    }
    // Invalid input
    return -1;
}
This uses an unsigned short parameter so that it will work for single-byte characters as well as wchar_t characters.
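For a two-character hex string like the OP's, a typical (illustrative) way to combine the two nibbles into a byte would be:

int b = (nibval(byte[0]) << 4) | nibval(byte[1]); // e.g. '1','C' -> 0x1C, assuming both are valid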
I have a string which will consist of exactly a number between 1-30 and one of the chars 'R', 'T' or 'M'. Let me illustrate it by some examples.
string a="15T","1R","12M","24T","24M" ... // they are all valid for my string
Now I need a hash function which gives me a unique hash value for every input string. Since my input comes from a finite set, I think this is possible.
Is there anyone who can tell me what kind of hash function I could define?
By the way, I'll create my hash table using a vector, so I guess size is not an important issue, but I'll define 10000 as an upper bound. That is, I assume I cannot have more than 10000 such strings.
Thanks in advance.
Just have a large enough integer type and put the (at most three) characters into the integer:
std::size_t hash(const char* s) {
    std::size_t result = 0;
    while (*s) {
        result <<= 8;
        result |= *s++;
    }
    return result;
}
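Hypothetical usage, showing how the bytes land in the integer:

std::size_t h = hash("15T"); // ('1' << 16) | ('5' << 8) | 'T' == 0x313554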
You could define an algebraic function:
result = string[0] * 0x010000
+ string[1] * 0x000100
+ string[2];
Basically, each character fits into a uint8_t, which has a range of 256. So each column is a power of 256.
Yes, there are big gaps, but this ensures a unique hash.
You could compress the gaps by using various "powers" for the different character columns.
Given "15T":
result = (string[0] - '0') * 30  // 30 == 10 digits * 3 letters in the columns to the right
       + (string[1] - '0') * 3;  // 3 == number of letter choices in the last column
switch (string[2])
{
case 'T' : result += 0; break;
case 'M' : result += 1; break;
case 'R' : result += 2; break;
}
It's a number / counting system where each column has a different number of digits.
Something along the lines of:
unsigned myhash(const char * str)
{
    int n = 0;
    // Parse the number part
    for ( ; *str >= '0' && *str <= '9'; ++str)
        n = n * 10 + (*str - '0');

    int c = *str == 'R' ? 0 :
            *str == 'T' ? 1 :
            *str == 'M' ? 2 :
            3;

    // Check for invalid strings
    if (c == 3 || n <= 0 || n > 30 || *(++str) != 0)
    {
        // Some error or anything
        // (Or replace the if condition with an assert)
        throw std::runtime_error("Invalid string");
    }

    // Since 0 <= c < 3 and 0 <= (n-1) < 30,
    // there are only 90 possible values
    return c * 30 + (n - 1);
}
In my experience, whenever you have to deal with something like this it is often better to do the opposite: work with integers, and have a function to perform the opposite conversion when necessary.
You can rebuild the original string with:
int n = hash % 30 + 1;
int c = hash / 30; // 0 is 'R', 1 is 'T', 2 is 'M'
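A minimal sketch of that reverse conversion as a function (the name unhash is illustrative; std::to_string requires C++11):

std::string unhash(int hash) {
    int n = hash % 30 + 1;
    const char letters[] = { 'R', 'T', 'M' }; // matches the 0/1/2 mapping above
    return std::to_string(n) + letters[hash / 30];
}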
I'm currently reverse engineering a network protocol, and I wrote a small decryption tool.
I used to define the bytes of the packet in an unsigned character array, like so:
unsigned char buff[] = "\x00\xFF\x0A" etc.
In order not to recompile the program multiple times per packet, I made a small GUI tool that takes the bytes in \xFF notation from a string. I did this the following way:
int length = int(stencString.length());
unsigned char *buff = new unsigned char[length+1];
memcpy(buff, stencString.c_str(), length+1);
When I call my function, the hardcoded version gives me a proper decryption, but the memcpy version gives me garbage followed by the rest of my string. The creepy part? They both have the same print output!
Here's how I'm using it:
http://pastie.org/private/kndfbaqgvmjiuwlounss9g
Here's kdxalgo.h (c) Luigi Auriemma:
http://pastie.org/private/7dzemmwyyqtngiamlxy8tw
Can someone point me in the right direction?
Thanks!
See what happens when you use the following for the hardcoded version of buff.
unsigned char buff[] =
"\\xd3\\x8c\\x38\\x6b\\x82\\x4c\\xe1\\x1e"
"\\x6b\\x7a\\xff\\x4c\\x9d\\x73\\xbe\\xab"
"\\x38\\xc7\\xc5\\xb8\\x71\\x8f\\xd5\\xbb"
"\\xfa\\xb9\\xf3\\x7a\\x43\\xdd\\x12\\x41"
"\\x4b\\x01\\xa2\\x59\\x74\\x60\\x1e\\xe0"
"\\x6d\\x68\\x26\\xfa\\x0a\\x63\\xa3\\x88";
I have a suspicion that it will produce the same output as when you enter the following: \xd3\x8c\x38\x6b\x82\x4c\xe1\x1e\x6b\x7a\xff\x4c\x9d\x73\xbe\xab\x38\xc7\xc5\xb8\x71\x8f\xd5\xbb\xfa\xb9\xf3\x7a\x43\xdd\x12\x41\x4b\x01\xa2\x59\x74\x60\x1e\xe0\x6d\x68\x26\xfa\x0a\x63\xa3\x88.
The compiler automatically takes "\xd3" and converts it into the expected underlying binary representation. You need a method of converting the four characters backslash, x, d, 3 into the same binary representation.
If you are certain that you will receive properly formatted input, then the answer isn't too hard:
unsigned char c2h(char ch)
{
    switch (ch)
    {
        case '0': return 0;
        case '1': return 1;
        case '2': return 2;
        case '3': return 3;
        case '4': return 4;
        case '5': return 5;
        case '6': return 6;
        case '7': return 7;
        case '8': return 8;
        case '9': return 9;
        case 'a': return 10;
        case 'b': return 11;
        case 'c': return 12;
        case 'd': return 13;
        case 'e': return 14;
        case 'f': return 15;
    }
    return 0; // unreachable for well-formed input
}
std::string handle_hex(const std::string& str)
{
    std::string result;
    for (size_t index = 0; index < str.length(); index += 4) // each "\xNN" token is 4 chars
    {
        // str[index + 0] is '\\' and str[index + 1] is 'x'
        unsigned char ch = c2h(str[index + 2]) * 16 + c2h(str[index + 3]);
        result.push_back((char)ch); // std::string has no single-char append(), so use push_back
    }
    return result;
}
Again, this assumes perfect formatting, so there is no error handling. I know that I'll lose some points for this answer because it's not the best way of doing this, but I want to make the algorithm as easy to understand as possible.
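Hypothetical usage (the doubled backslashes in the source produce the literal characters \, x, d, 3 in the string):

std::string raw = handle_hex("\\xd3\\x8c\\x38"); // raw now holds the three bytes 0xd3 0x8c 0x38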
The problem, as Jeffery points out, is that the compiler processes the \xd3 and generates a character with that value, but when you read into a string \xd3 you are actually reading 4 characters: \, x, d and 3.
You will need to read the string, and then parse it into valid contents. For a simple approach, you can change the format so that the input is a space separated sequence of characters encoded as 0xd3 (as this is really simple to parse):
std::string buffer;
std::string input( "0xd3 0x8c 0x38" ); // this would be read
std::istringstream in( input );
in >> std::hex;
std::copy( std::istream_iterator<int>( in ),
           std::istream_iterator<int>(),
           std::back_inserter( buffer ) );
Of course, there is no need to change the format, you can process it. For that you will only need to read one character at a time. When you encounter a \ then read the next character, if it is x then read the next two characters (say ch1 and ch2) and transform them into an integer value:
int value_of_hex( char ch ) {
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    if (tolower(ch) >= 'a' && tolower(ch) <= 'f')
        return 10 + tolower(ch) - 'a'; // lowercase both sides of the subtraction
    // error
    throw std::runtime_error( "Invalid input" );
}
value = value_of_hex( ch1 )*16 + value_of_hex( ch2 );
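A minimal sketch tying those pieces together (the function name decode_escapes is illustrative; invalid digits are reported by value_of_hex):

std::string decode_escapes( const std::string& text ) {
    std::string out;
    for ( size_t i = 0; i < text.size(); ++i ) {
        if ( text[i] == '\\' && i + 3 < text.size() && text[i+1] == 'x' ) {
            out.push_back( (char)(value_of_hex( text[i+2] ) * 16
                                  + value_of_hex( text[i+3] )) );
            i += 3; // consumed the whole "\xNN" token
        } else {
            out.push_back( text[i] ); // pass other characters through
        }
    }
    return out;
}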
What's the fastest way to convert a string represented by (const char*, size_t) to an int?
The string is not null-terminated.
Both these ways involve a string copy (and more) which I'd like to avoid.
And yes, this function is called a few million times a second. :p
int to_int0(const char* c, size_t sz)
{
    return atoi(std::string(c, sz).c_str());
}

int to_int1(const char* c, size_t sz)
{
    return boost::lexical_cast<int>(std::string(c, sz));
}
Given a counted string like this, you may be able to gain a little speed by doing the conversion yourself. Depending on how robust the code needs to be, this may be fairly difficult, though. For the moment, let's assume the easiest case: we're sure the string is valid, contains only digits (no negative numbers for now), and the number it represents is always within the range of an int. For that case:
int to_int2(char const *c, size_t sz) {
    int retval = 0;
    for (size_t i = 0; i < sz; i++) {
        retval *= 10;
        retval += c[i] - '0';
    }
    return retval;
}
From there, you can get about as complex as you want -- handling leading/trailing whitespace, '-' (but doing so correctly for the maximally negative number in 2's complement isn't always trivial [edit: see Nawaz's answer for one solution to this]), digit grouping, etc.
Another slow version, for uint32:
void str2uint_aux(unsigned& number, unsigned& overflowCtrl, const char*& ch)
{
    unsigned digit = *ch - '0';
    ++ch;

    number = number * 10 + digit;

    unsigned overflow = (digit + (256 - 10)) >> 8;
    // if digit < 10 then overflow == 0
    overflowCtrl += overflow;
}
unsigned str2uint(const char* s, size_t n)
{
    unsigned number = 0;
    unsigned overflowCtrl = 0;

    // for VC++10 the Duff's device is faster than a loop
    switch (n)
    {
        default:
            throw std::invalid_argument(__FUNCTION__ " : `n' too big");
        case 10: str2uint_aux(number, overflowCtrl, s);
        case 9:  str2uint_aux(number, overflowCtrl, s);
        case 8:  str2uint_aux(number, overflowCtrl, s);
        case 7:  str2uint_aux(number, overflowCtrl, s);
        case 6:  str2uint_aux(number, overflowCtrl, s);
        case 5:  str2uint_aux(number, overflowCtrl, s);
        case 4:  str2uint_aux(number, overflowCtrl, s);
        case 3:  str2uint_aux(number, overflowCtrl, s);
        case 2:  str2uint_aux(number, overflowCtrl, s);
        case 1:  str2uint_aux(number, overflowCtrl, s);
    }

    // here we can check that all chars were digits
    if (overflowCtrl != 0)
        throw std::invalid_argument(__FUNCTION__ " : `s' is not a number");

    return number;
}
Why is it slow? Because it processes chars one by one. If we had a guarantee that we could access the bytes up to s+16, we could use vectorization for *ch - '0' and digit + 246.
Like in this code:
uint32_t digitsPack = *(uint32_t*)s - '0000';
overflowCtrl |= digitsPack | (digitsPack + 0x06060606); // if one byte is not in range [0;10), its high nibble will be non-zero
number = number * 10 + ((digitsPack >> 24) & 0xFF); // parenthesized: & binds looser than +
number = number * 10 + ((digitsPack >> 16) & 0xFF);
number = number * 10 + ((digitsPack >> 8) & 0xFF);
number = number * 10 + (digitsPack & 0xFF);
s += 4;
Small update for range checking:
the first snippet has a redundant shift (or mov) on every iteration, so it should be
unsigned digit = *s - '0';
overflowCtrl |= (digit + 256 - 10);
...
if (overflowCtrl >> 8 != 0) throw ...
Fastest:
int to_int(char const *s, size_t count)
{
    int result = 0;
    size_t i = 0;
    if ( s[0] == '+' || s[0] == '-' )
        ++i;
    while (i < count)
    {
        if ( s[i] >= '0' && s[i] <= '9' )
        {
            // see Jerry's comments for an explanation of why it's done this way
            int value = (s[0] == '-') ? ('0' - s[i]) : (s[i] - '0');
            result = result * 10 + value;
        }
        else
            throw std::invalid_argument("invalid input string");
        i++;
    }
    return result;
}
Since the comparison (s[0] == '-') in the above code is done on every iteration, we can avoid it by accumulating result as a negative number inside the loop, and then returning result if s[0] is indeed '-', otherwise -result (which makes it a positive number, as it should be):
int to_int(char const *s, size_t count)
{
    size_t i = 0;
    if ( s[0] == '+' || s[0] == '-' )
        ++i;
    int result = 0;
    while (i < count)
    {
        if ( s[i] >= '0' && s[i] <= '9' )
        {
            result = result * 10 - (s[i] - '0'); // assume a negative number
        }
        else
            throw std::invalid_argument("invalid input string");
        i++;
    }
    return s[0] == '-' ? result : -result; // -result is positive!
}
That is an improvement!
In C++11, however, you could use a function from the std::stoi family. There is also the std::to_string family.
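For later readers: C++17 added std::from_chars in <charconv>, which operates directly on a (pointer, length) range with no copy, no locale and no allocation. A minimal sketch:

#include <charconv>

int to_int_fc(const char* c, size_t sz)
{
    int value = 0;
    std::from_chars(c, c + sz, value); // check the returned error code in real code
    return value;
}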
llvm::StringRef s(c,sz);
int n;
s.getAsInteger(10,n);
return n;
http://llvm.org/docs/doxygen/html/classllvm_1_1StringRef.html
You'll have to either write a custom routine or use a 3rd-party library if you're dead set on avoiding a string copy.
You probably don't want to write atoi from scratch (it is still possible to introduce a bug here), so I'd advise grabbing an existing atoi from public-domain or BSD-licensed code and modifying it. For example, you can get an existing atoi from the FreeBSD cvs tree.
If you run the function that often, I bet you parse the same number many times. My suggestion is to BCD-encode the string into a static char buffer (you know it's not going to be very long, since atoi can only handle +-2G) when there are fewer than X digits (X=8 for 32-bit lookup, X=16 for 64-bit lookup), then place a cache in a hash map.
When you're done with the first version, you can probably find nice optimizations, such as skipping the BCD encoding entirely and just using X characters of the string (when the length of the string <= X) for the lookup in the hash table. If the string is longer, you fall back to atoi.
Edit: ... or fall back not to atoi but to Jerry Coffin's solution, which is as fast as they come.
I am trying to practice C++, and while doing so I ran into a problem in my code. I dynamically create a character array and then, for each array index, I want to fill that element with an integer. I tried casting the integer to a character, but that didn't seem to work: after printing the array elements, nothing comes out. I would appreciate any help; I'm pretty new to this, thanks.
char *createBoard()
{
    char *theGameBoard = new char[8];
    for (int i = 0; i < 8; i++)
        theGameBoard[i] = (char)i; // doesn't work
    return theGameBoard;
}
Here is how I ended up doing it.
char *createBoard()
{
    char *theGameBoard = new char[8];
    theGameBoard[0] = '0';
    theGameBoard[1] = '1';
    theGameBoard[2] = '2';
    theGameBoard[3] = '3';
    theGameBoard[4] = '4';
    theGameBoard[5] = '5';
    theGameBoard[6] = '6';
    theGameBoard[7] = '7';
    theGameBoard[8] = '8';
    return theGameBoard;
}
Basically, your two sections of code are not quite equivalent.
When you set theGameBoard[0] = '0' you are essentially setting it to the value 48 (the ASCII code for the character '0'). So setting theGameBoard[0] = (char)i is not quite the same thing if i = 0. You need to add the offset of '0' in the ASCII table (which is 48) so that theGameBoard[0] is actually 0 + the offset of character '0'.
Here's how you do it:
char *createBoard()
{
    char *theGameBoard = new char[8];
    for (int i = 0; i < 8; i++)
        theGameBoard[i] = '0' + (char)i; // you want to set each array cell
                                         // to an ASCII number (so use '0' as an offset)
    return theGameBoard;
}
Also, like @Daniel said: make sure to free the memory you allocate in this function once you are done using the returned variable. Like so:
int main()
{
    char* gameBoard = createBoard();
    // you can now use the gameBoard variable here
    // ...
    // when you are done with it,
    // make sure to call delete on it
    delete[] gameBoard;
    // exit the program here...
    return 0;
}
Your second function has an off-by-one bug. You allocate an array of length 8, but you copy 9 values into it. (0, 1, 2, 3, 4, 5, 6, 7, 8).
If I were doing this I would use stringstream. It might be heavyweight for this, but it is the C++ way of doing things.
for (int i = 0; i < 8; ++i) {
    stringstream sstream;
    sstream << i;
    sstream >> theGameBoard[i];
}
When you are done using the game board array you need to delete it with this command:
delete[] theGameBoard;
In your character array you must store the ASCII values of the digits.
For example:
ASCII value of '0' is 48 ( not 0 )
ASCII value of '1' is 49 ( not 1 )
...
In C++ (and almost every other language) you can get the ASCII value of a character by putting it in single quotes ('0' == 48, '1' == 49, '2' == 50, ...).
Your array must have values
theGameBoard[0] = '0'
theGameBoard[1] = '1' or theGameBoard[1] = '0' + 1
...
Code that fills your array:
for (int k = 0; k < 8; ++k)
    theGameBoard[k] = '0' + k;