Convert char array to integer value

I'm having a bit of trouble figuring out the correct calculation of WORD, DWORD etc.
I have a knot in my brain, probably from sitting on this problem for too long.
I'm reading a PE-section header. So far everything is ok.
Here is a sample output from a random .exe file:
File is 286 Kbytes large
PE-Signature [# 0x108]
0x00000100: ........ ........ 504500
Collect Information (PE file header):
[WORD] Mashinae Type :014C
[WORD] Number of Sections :0006
[DWORD] TimeStamp :5C6ECB00
[DWORD] Pointer to symbol table:00000000
[DWORD] Number of Symbols :00000000
[WORD] Size of optional header:00E0
Now, as you can see, the size of the "optional" header is 0x00E0, so I was trying to buffer that for later.
(Because it would make things faster to just read the complete header.)
Where I'm having problems is the point where I have to convert the little-endian values to an actual integer.
I need to read the value from the back (so the second byte [ 00 ] is actually the first value to be read).
The second value, however, needs to be shifted in some way (because of the significance of the bytes), and this is where I am struggling. I guess the solution is not that hard, I just ran out of wisdom lol.
Here is my draft for a function that should return the integer value:
//get a specific value and save it for later use
int getValue(char* memory, int start, int end)
{
if (end <= start)
return 0;
unsigned int retVal = 0;
//now just add up array fields
for (int i = end; i >= start; i--)
{
fprintf(stdout, "\n%02hhx", memory[i]);
retVal &= (memory[i] << 8 * (i- start));
}
fprintf(stdout, "\n\n\n%d",retVal);
return retVal;
}
In other words, I need to parse an array of hex values (or chars) into an actual integer, while respecting the significance of the bytes.
Also:
[Pointer to symbol table] and [Number of Symbols] seem to always be 0. I'm guessing this is because the binary is stripped of symbols, but I'm not sure, since I am more of an expert on Linux binary analysis. Is my assumption correct?

I really hope that this helps you. From what I understood so far, this will take the characters within the start to end range, treat each one as a hex digit, and place them in an integer:
// here I am converting the chars from hex to int
int getBitPattern(char ch)
{
if (ch >= '0' && ch <= '9')
{
return ch - '0';
}
else if (ch >= 'A' && ch <= 'F')
{
return ch - 'A' + 10;
}
else
{
// this is in case of invalid input
return -1;
}
}
int getValue(const char* memory, int start, int end)
{
if (end <= start)
return 0;
unsigned int retVal = 0;
//now just add up array fields
for (int i = end, j = 0; i >= start; i--, ++j)
{
fprintf(stdout, "\n%02hhx", memory[i]);
// bitshift in order to insert the next set of 4 bits into their correct spot
retVal |= (getBitPattern(memory[i]) << (4*j));
}
fprintf(stdout, "\n\n\n%d", retVal);
return retVal;
}

boyanhristov96 helped a lot by pointing out that the OR operator should be used instead of AND, and it was his/her effort that led to this solution.
A cast to (unsigned char) also had to be made before shifting.
Without it, on platforms where char is signed, a byte with its high bit set (such as 0xE0) is treated as a negative value and sign-extended when it is promoted to int, resulting in the value
0xFFFFE000 (4294959104)
instead of the desired 0x0000E000 (57344)
We have to left-shift by 8 for every step, because each byte is 8 bits (two hex digits) wide, like in
0x00FF00 << 8 ; // after operation is 0xFF0000
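The effect of the missing cast can be reproduced in isolation. The following is only a sketch: the two function names are made up for illustration, and signed char is used explicitly so that the result does not depend on whether plain char is signed on a given platform.

```cpp
#include <cstdint>

// Hypothetical helper: widens the byte the way a signed char is promoted.
std::uint32_t shiftWithoutCast(signed char byte) {
    // Promotion to int keeps the sign, so -32 (bit pattern 0xE0) becomes 0xFFFFFFE0.
    std::uint32_t widened = static_cast<std::uint32_t>(static_cast<int>(byte));
    return widened << 8;
}

// Hypothetical helper: reinterprets the byte as unsigned before widening.
std::uint32_t shiftWithCast(signed char byte) {
    // The cast maps -32 to 224 (0xE0), so no sign bits are copied upwards.
    std::uint32_t widened = static_cast<unsigned char>(byte);
    return widened << 8;
}
```

Called with -32 (the bit pattern 0xE0), the first function yields 0xFFFFE000 and the second 0x0000E000, matching the two values quoted above.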
The final function also uses an OR, here it is:
//now with or operation and cast
int getValue(const char* memory, int start, int end)
{
if (end <= start)
return 0;
unsigned int retVal = 0;
//now just add up array fields
for (int i = end, j = end-start; i >= start; i--, --j)
{
fprintf(stdout, "\n%02hhx", memory[i]);
retVal |= ((unsigned char)(memory[i]) << (8 * j));
}
fprintf(stdout, "\n\n\n%u", retVal);
return retVal;
}
Many thanks for helping
EDIT 16.12.2019:
Coming back here with an updated version of the function.
It was necessary to rewrite it for 2 reasons:
1) The offset to the PE header depends on the target binary, so we have to get this value first (at location 0x3c). Then, use a pointer to move from value to value from there.
2) The calculations were muddled. I corrected them, and now it should work as intended. The third parameter is the length in bytes, e.g. DWORD = 4 bytes.
Here you go:
//read `size` little-endian bytes beginning at `start`, then advance `start`
int getValuePNTR(const char* memory, int &start, int size)
{
    DWORD retVal = 0;
    //memory[start] is the least significant byte, so byte j is shifted left by 8 * j bits
    for (int j = size - 1; j >= 0; --j)
    {
        unsigned char byte = (unsigned char)memory[start + j];
        fprintf(stdout, "\ncycle: %d, memory: [%02x]", j, byte);
        //zero bytes need no special case: OR-ing in 0 leaves retVal unchanged,
        //whereas an extra shift would corrupt values such as 01 00 02 00
        retVal |= (DWORD)byte << (8 * j);
    }
    //get the next field after this one
    start += size;
    return retVal;
}
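For comparison, here is a minimal standalone sketch of the same little-endian assembly without the Windows DWORD type or the debug output. The function name and the use of an unsigned char buffer are assumptions made for this example, not part of the original code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the original post): assembles `size` bytes
// (1 to 4) starting at `offset` into one value, least significant byte first.
std::uint32_t readLittleEndian(const unsigned char* data,
                               std::size_t offset,
                               std::size_t size) {
    std::uint32_t value = 0;
    for (std::size_t j = 0; j < size; ++j) {
        // Byte j carries bits 8*j .. 8*j+7 of the result.
        value |= static_cast<std::uint32_t>(data[offset + j]) << (8 * j);
    }
    return value;
}
```

With the header bytes E0 00 this returns 0x00E0, and with 00 CB 6E 5C it returns 0x5C6ECB00, the values shown in the sample output of the question. Because the buffer is unsigned, no cast is needed before the shift.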

Related

How to replace a char in string with another char fast(I think test didn't want common way)

I was asked this question in a tech test.
They asked how to change ' ' to '_' in a string.
I think they didn't want the common answer, like this (I am fairly sure of this):
void replaceChar(char originalStr[], size_t strLength, char originalChar, char newChar)
{
for(size_t i = 0 ; i < strLength ; i++)
{
if(originalStr[i] == originalChar)
{
originalStr[i] = newChar ;
}
}
}
So I answered like this: use a WORD. (Actually I didn't write code; they just wanted an explanation of how to do it.)
My idea is to compare each 8 bytes (on a 64-bit OS) of the string with an 8-byte mask.
If they are equal, replace 8 bytes at a time.
When the CPU reads data smaller than a WORD, it has to do an extra operation to clear the remaining bits.
That is slow, so I tried to use a WORD when comparing chars.
void replaceChar(char originalStr[], size_t strLength, char originalChar, char newChar)
{
    size_t mask = 0;
    size_t replaced = 0;
    for(size_t i = 0 ; i < sizeof(size_t) ; i++)
    {
        // place one copy of the character in every byte of the word
        mask |= (size_t)(unsigned char)originalChar << (8 * i);
        replaced |= (size_t)(unsigned char)newChar << (8 * i);
    }
    for(size_t i = 0 ; i < strLength ; i++)
    {
        // if 8 byte data equal with 8 byte data filled with originalChar
        // replace 8 byte data with 8 byte data filled with newChar
        if(i % sizeof(size_t) == 0 &&
           strLength - i > sizeof(size_t) &&
           *(size_t*)(originalStr + i) == mask)
        {
            *(size_t*)(originalStr + i) = replaced;
            // the loop's own i++ supplies the last step of the jump
            i += sizeof(size_t) - 1;
            continue;
        }
        if(originalStr[i] == originalChar)
        {
            originalStr[i] = newChar ;
        }
    }
}
Is there any faster way?

Do not try to optimize code when you do not know where its bottleneck is. Try to write clear, readable code.
This function declaration and definition
void replaceChar(char originalStr[], size_t strLength, char originalChar, char newChar)
{
for(size_t i = 0 ; i < strLength ; i++)
{
if(originalStr[i] == originalChar)
{
originalStr[i] = newChar ;
}
}
}
does not make sense because it duplicates the behavior of the standard algorithm std::replace.
Moreover, for such a simple, basic, general-purpose function you are using identifier names that are too long.
If you need to write a similar function specifically for C strings, it can look, for example, like the one in the demonstrative program below:
#include <iostream>
#include <cstring>
char * replaceChar( char s[], char from, char to )
{
for ( char *p = s; ( p = strchr( p, from ) ) != nullptr; ++p )
{
*p = to;
}
return s;
}
int main()
{
char s[] = "Hello C strings!";
std::cout << replaceChar( s, ' ', '_' ) << '\n';
return 0;
}
The program output is
Hello_C_strings!
As for your second function, it is unreadable. Using the continue statement in the body of a for loop makes it difficult to follow its logic.
As a character array is not necessarily aligned to the size of size_t, the function is not as fast as you think.
If you need a highly optimized function, then you should write it directly in assembler.

The first step on the road to being fast is being correct. The problem with the original proposal is that sizeof(s) should be a cached value of strlen(s). Then the obvious problem is that this approach scans the string twice: first to find the terminating character, and then to find the character to be replaced.
This should be addressed by a data structure with a known length, or a data structure with enough guaranteed excess data so that multiple bytes can be processed at once without undefined behaviour.
Once this is solved (the OP has been edited to fix this), the problem with the proposed approach of scanning 8 bytes' worth of data for ALL the bytes being the same is that the generic case does not have 8 successive matching characters, but maybe only 7. In all those cases one would need to scan the same area twice (on top of scanning for the string's terminating character).
If the string length is not known, the best thing is to use a low-level method:
while (*ptr != 0) {
if (*ptr == search_char) {
*ptr = replace_char;
}
++ptr;
}
If the string length is known, it's best to use the library function std::replace, or its low-level counterpart:
for (auto i = 0; i < size; ++i) {
if (str[i] == search_char) {
str[i] = replace_char;
}
}
Any decent compiler is able to autovectorize this, although the compiler might generate a larger variety of kernels than intended (one kernel for small sizes, one for intermediate and one to process in chunks of 32 or 64 bytes).
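As a small illustration of the library route mentioned above, the sketch below applies std::replace to a std::string. The wrapper name replaceSpaces is made up for this example; std::replace itself is the standard algorithm from <algorithm>.

```cpp
#include <algorithm>
#include <string>

// Hypothetical wrapper: returns a copy of `text` with every ' ' replaced by '_'.
std::string replaceSpaces(std::string text) {
    // std::replace walks the range once and overwrites each matching element.
    std::replace(text.begin(), text.end(), ' ', '_');
    return text;
}
```

For the input "Hello C strings!" this gives "Hello_C_strings!". The same call works on a raw buffer of known length with std::replace(str, str + size, ' ', '_').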

Run-length decompression using C++

I have a text file with a string which I encoded.
Let's say it is: aaahhhhiii kkkjjhh ikl wwwwwweeeett
Here the code for encoding, which works perfectly fine:
void Encode(std::string &inputstring, std::string &outputstring)
{
for (int i = 0; i < inputstring.length(); i++) {
int count = 1;
while (inputstring[i] == inputstring[i+1]) {
count++;
i++;
}
if(count <= 1) {
outputstring += inputstring[i];
} else {
outputstring += std::to_string(count);
outputstring += inputstring[i];
}
}
}
Output is as expected: 3a4h3i 3k2j2h ikl 6w4e2t
Now, I'd like to decompress the output, back to the original.
And I have been struggling with this for a couple of days now.
My idea so far:
void Decompress(std::string &compressed, std::string &original)
{
char currentChar = 0;
auto n = compressed.length();
for(int i = 0; i < n; i++) {
currentChar = compressed[i++];
if(compressed[i] <= 1) {
original += compressed[i];
} else if (isalpha(currentChar)) {
//
} else {
//
int number = isnumber(currentChar).....
original += number;
}
}
}
I know my Decompress function seems a bit messy, but I am pretty lost with this one.
Sorry for that.
Maybe there is someone out there at stackoverflow who would like to help a lost and beginner soul.
Thanks for any help, I appreciate it.

Assuming input strings cannot contain digits (your encoding cannot handle them, as e.g. both the strings "3a" and "aaa" would result in the encoded string "3a", so how would you ever decode that again?), you can decompress as follows:
unsigned int num = 0;
for(auto c : compressed)
{
if(std::isdigit(static_cast<unsigned char>(c)))
{
num = num * 10 + c - '0';
}
else
{
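num += num == 0; // no digit was read, so the character occurs once
// count down to exactly 0 so that num is clean for the next run
// (while(num--) would leave the unsigned counter wrapped around)
for(; num != 0; --num)
{
original += c;
}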
}
}
Untested code, though...
Characters in a string actually are only numerical values, though. You can consider char (or signed char, unsigned char) as ordinary 8-bit integers as well. And you can store a numerical value in such a byte, too. Usually, you do run length encoding exactly that way: Count up to 255 equal characters, store the count in a single byte and the character in another byte. One single "a" would then be encoded as 0x01 0x61 (the latter being the ASCII value of a), "aa" would get 0x02 0x61, and so on. If you have to store more than 255 equal characters you store two pairs: 0xff 0x61, 0x07 0x61 for a string containing 262 times the character a... Decoding then gets trivial: you read characters pairwise, first byte you interpret as number, second one as character – rest being trivial. And you nicely cover digits that way as well.
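The byte-pair scheme described above could look roughly like the following. This is a sketch under the stated assumptions only: the function names are invented for illustration, counts are stored as raw bytes inside a std::string, and runs longer than 255 are split into several pairs.

```cpp
#include <cstddef>
#include <string>

// Hypothetical encoder: emits (count byte, character byte) pairs.
std::string encodePairs(const std::string& input) {
    std::string out;
    std::size_t i = 0;
    while (i < input.size()) {
        const char c = input[i];
        std::size_t run = 1;
        // Count equal characters, capping each run at 255 so it fits one byte.
        while (i + run < input.size() && input[i + run] == c && run < 255) {
            ++run;
        }
        out += static_cast<char>(static_cast<unsigned char>(run));
        out += c;
        i += run;
    }
    return out;
}

// Hypothetical decoder: reads the pairs back and expands them.
std::string decodePairs(const std::string& encoded) {
    std::string out;
    for (std::size_t i = 0; i + 1 < encoded.size(); i += 2) {
        // First byte is the repeat count, second byte is the character.
        const std::size_t count = static_cast<unsigned char>(encoded[i]);
        out.append(count, encoded[i + 1]);
    }
    return out;
}
```

Because the count never shares a byte with the character, digits in the input survive a round trip, and 262 repetitions of 'a' become the two pairs 0xff 0x61, 0x07 0x61 as in the description.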

#include <cctype>
#include <iostream>
#include <string>
void Encode(std::string& inputstring, std::string& outputstring)
{
for (unsigned int i = 0; i < inputstring.length(); i++) {
int count = 1;
while (inputstring[i] == inputstring[i + 1]) {
count++;
i++;
}
if (count <= 1) {
outputstring += inputstring[i];
}
else {
outputstring += std::to_string(count);
outputstring += inputstring[i];
}
}
}
bool alpha_or_space(const char c)
{
return isalpha(c) || c == ' ';
}
void Decompress(std::string& compressed, std::string& original)
{
size_t i = 0;
size_t repeat;
while (i < compressed.length())
{
// normal alpha characters
while (alpha_or_space(compressed[i]))
original.push_back(compressed[i++]);
// repeat number
repeat = 0;
while (isdigit(compressed[i]))
repeat = 10 * repeat + (compressed[i++] - '0');
// unroll repeated characters
auto char_to_unroll = compressed[i++];
while (repeat--)
original.push_back(char_to_unroll);
}
}
int main()
{
std::string deco, outp, inp = "aaahhhhiii kkkjjhh ikl wwwwwweeeett";
Encode(inp, outp);
Decompress(outp, deco);
std::cout << inp << std::endl << outp << std::endl<< deco;
return 0;
}

The decompression can't possibly work in an unambiguous way because you didn't define a sentinel character; i.e. given the compressed stream, it's impossible to determine whether a digit is a literal digit from the original text or the count of an RLE repeat command. I would suggest using '0' as the sentinel char. While encoding, if you see '0' you just output 010. Any other char X will translate to 0NX, where N is the repeat byte counter. If you go over 255, just output a new RLE repeat command.

Printing Binary Number Backward

I need to print a binary number backward without explicitly converting to binary or using an array (i.e. if the binary number is 10, it should print as 01). Here is the code I've done for printing the number forward. I'm fairly certain that I just need to tell the code to run through the loop starting at the other end in order to have the number render backward. However, I have no idea how to go about doing that, or if that's even correct.
Bonus question -- can someone walk me through what this code is really doing? It's modified from one we were given in class, and I don't fully understand what it actually does.
NOTE: the test case I have been using is 50.
#include <stdio.h>
char str [sizeof(int)];
const int maxbit = 5;
char* IntToBinary (int n, char * BackwardBinaryString) {
int i;
for(i = 0; i <= maxbit; i++) {
if(n & 1 << i) {
BackwardBinaryString[maxbit - i] = '1';
}
else {
BackwardBinaryString[maxbit - i] = '0';
}
}
BackwardBinaryString[maxbit + 1] = '\0';
return BackwardBinaryString;
}
int main () {
int base10input;
scanf("%d", &base10input);
printf("The backwards binary representation is: %s\n", IntToBinary(base10input, str));
return 0;
}

Unfortunately, your code is wrong in these respects.
sizeof(int) returns the number of bytes an int takes, but we need the number of bits, as we store each bit in a char, so we need to multiply it by 8 (more precisely, by CHAR_BIT).
Your char array str has a size of 4 (with a 4-byte int), which means only str[0] to str[3] are valid. However, you modified str[4], str[5] and str[6], which are out of bounds, and such undefined behavior can end in disaster.
What you should do first is create an array that holds at least sizeof(int) * 8 + 1 chars (sizeof(int) * 8 for the binary representation, one for the null terminator). Then start your conversion.
I also suggest that str should not be a global variable. It would be better as a local variable of the main function.
Your code should be modified like this. I've explained what it does in the comments.
#include <stdio.h>
#define INTBITS (sizeof(int) * 8) // bits an integer takes
char* IntToBinary(int n, char* backwardBinaryString) {
// convert in reverse order (str[INTBITS - 1] to str[0])
// remember that array subscript starts from 0
for (int i = 0; i < (int)INTBITS; i++) {
// (n & (1u << i)) checks whether the i-th bit of n is 0 or 1
// if it is 1, the value of this expression will be nonzero (true)
// 1u keeps the shift well defined when i reaches the sign bit
if (n & (1u << i)) {
backwardBinaryString[INTBITS - 1 - i] = '1';
}
else {
backwardBinaryString[INTBITS - 1 - i] = '0';
}
// replacing the if-else with a conditional operator like this
// will make the code shorter and easier to read
// backwardBinaryString[INTBITS - 1 - i] = (n & (1u << i)) ? '1' : '0';
}
// add the null-terminator at the end of str (str[INTBITS + 1 - 1])
backwardBinaryString[INTBITS] = '\0';
return backwardBinaryString;
}
int main() {
char str[INTBITS + 1];
int base10input;
scanf("%d", &base10input);
printf("The backwards binary representation is: %s\n", IntToBinary(base10input, str));
return 0;
}

That code is far more elaborate than it needs to be. Since the requirement is to print the bits, there's no need to store them. Just print each one when it's generated. And that, in turn, means that you don't need to use i to keep track of which bit you're generating (this assumes n is not negative; for a negative n the right shift would typically never reach 0):
if (n == 0)
    std::cout << '0';
else
    while (n != 0) {
        // parenthesize the conditional: << binds more tightly than ?:
        std::cout << ((n & 1) ? '1' : '0');
        n >>= 1;
    }
std::cout << '\n';
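The same loop can be wrapped in a function that returns the reversed bits instead of printing them, which makes it easy to check. The function name is hypothetical, and the parameter is unsigned so that the right shift always terminates.

```cpp
#include <string>

// Hypothetical helper: returns the binary digits of n, least significant bit first.
std::string reversedBits(unsigned int n) {
    if (n == 0) {
        return "0";
    }
    std::string bits;
    while (n != 0) {
        // The lowest bit is emitted first, which is what reverses the number.
        bits += (n & 1u) ? '1' : '0';
        n >>= 1;
    }
    return bits;
}
```

For the test case 50 (binary 110010) this returns "010011", and for 2 (binary 10) it returns "01" as asked for in the question.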

MAC address parsing

I have a MAC address like "6F:e:5B:7C:b:a" that I want to parse and insert the implicit zeros before the :e:, :b:, :a.
I cannot use Boost at the moment but I have a rough solution. The solution splits on ':'. Then I count the characters between and if there is only one I insert a zero at the front.
I was wondering if anyone had a faster approach?

For the quick and dirty:
if (sscanf(text, "%x:%x:%x:%x:%x:%x",
&mac[0], &mac[1], &mac[2], &mac[3], &mac[4], &mac[5]) != 6) {
// handle error
}
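Note that %x is lenient: it also accepts a sign, a leading 0x prefix, or more than two digits, so this does not verify that each field is a plain one- or two-digit hex number. The mac array must hold unsigned int values for %x. The usual precautions for sscanf() apply.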

First of all, you could use a small helper function that converts a hex digit character to its integer value quite fast:
unsigned char hex_to_int(const char c)
{
if( c >= 'a' && c <= 'f'){
return c - 'a' + 10;
}
if( c >= 'A' && c <= 'F'){
return c - 'A' + 10;
}
if( c >= '0' && c <= '9'){
return c - '0';
}
return 0;
}
Then you can create a loop that iterates over the string:
unsigned char mac[6];      /* Resulting mac */
int i;                     /* Iteration number */
const char *buffer = text; /* Text input; text is assumed to point at the MAC string */
for( i = 0; i < 6; ++i){
    mac[i] = 0;
    /*
     * Next separator or end of string
     * You may also want to limit this loop to just 2 iterations
     */
    while( ((*buffer) != '\0') && ((*buffer) != ':')){
        mac[i] <<= 4;
        mac[i] |= hex_to_int( *buffer);
        ++buffer;
    }
    /* Step over the separator between groups; stop if a group is missing */
    if( i < 5){
        if( (*buffer) != ':'){
            break;
        }
        ++buffer;
    }
}
if( (i != 6) || (*buffer != '\0')){
    // Error in parsing, failed to get to the 6th iteration
    // or having trailing characters at the end of MAC
}
This code does very little error checking (invalid characters are silently read as 0), but it's probably the fastest solution you'll be getting.
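If the goal is only the padded text form rather than the six byte values, the zeros can be inserted without converting anything. The sketch below is one possible way to do that with std::string alone; the function name is made up, and it does not validate that the groups are hex digits or that there are exactly six of them.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: pads every one-character group with a leading '0'.
std::string padMacGroups(const std::string& mac) {
    std::string out;
    std::size_t start = 0;
    while (true) {
        const std::size_t colon = mac.find(':', start);
        const std::size_t end = (colon == std::string::npos) ? mac.size() : colon;
        if (end - start == 1) {
            out += '0';  // single digit: insert the implicit zero
        }
        out.append(mac, start, end - start);
        if (colon == std::string::npos) {
            break;  // last group handled
        }
        out += ':';
        start = colon + 1;
    }
    return out;
}
```

With the address from the question, padMacGroups("6F:e:5B:7C:b:a") gives "6F:0e:5B:7C:0b:0a". It makes a single pass over the input and needs no Boost.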

c++ check all array values at once

What I want to do is check an array of bools to see if 3 or more of them have been set to true. The only way I can think of to do this is using an if statement for each possible combination, of which there are lots because there are ten bools. Does anybody have any suggestions on how best to do this?

This would be the easiest way:
std::count(bool_array, std::end(bool_array), true) >= 3
Only problem is it keeps counting even after it has found 3. If that is a problem, then I would use sharptooth's method.
side note
I've decided to fashion an algorithm in the style of std::all_of/any_of/none_of for my personal library, perhaps you will find it useful:
template<typename InIt, typename P>
bool n_or_more_of(InIt first, InIt last, P p, unsigned n)
{
while (n && first != last)
{
if (p(*first)) --n;
++first;
}
return n == 0;
}
For your purpose, you would use it like this:
n_or_more_of(bool_array, std::end(bool_array), [](bool b) { return b; }, 3);

The much easier way would be to loop through the array:
int numberOfSet = 0;
for( int i = 0; i < sizeOfArray; i++ ) {
if( array[i] ) {
numberOfSet++;
//early cut-off so that you don't loop further without need
// whether you need it depends on how typical it is to have
// long arrays that have three or more elements set in the beginning
if( numberOfSet >= 3 ) {
break;
}
}
}
bool result = numberOfSet >= 3;

Whenever you set an array element to TRUE, you can increment a global counter (and decrement it whenever you reset one to FALSE). This will be the simplest way. At any point in your code, the global counter will tell you the number of TRUE elements in the array.
Another thing: if you are keeping up to 32 bool values, you can use a single int variable. int is 32 bits (in Win32), so you can store 32 bools.
unsigned char x = 0; // 00000000 // char is 8 bits
// TO SET TRUE
x = x | (1 << 4); // 00010000
x = x | (1 << 7); // 10010000
// TO SET FALSE
x = x & ~(1 << 4); // 10010000 & 11101111 => 10000000
// TO CHECK True/False
if( x & (1 << 4) )

If it's an array, what you do is loop over it and count the number of trues. But I'm afraid you mean a bitpattern of some kind, right?

Why not just count the number of trues and then do something if the number is 3 or higher:
int sum = 0;
for (int i = 0; i < length; i++){
if (arr[i]){
sum++;
}
}
if (sum >= 3){
// do something...
}

You can loop through and build a bit-mask representation of the array, then you can compare against up to CHAR_BIT * sizeof (unsigned long) flags in parallel:
unsigned long mask = 0;
for (std::vector<bool>::const_iterator it = flags.begin(), end_it = flags.end();
it != end_it;
++it)
{
if (*it)
mask |= (1UL << (it - flags.begin()));
}
if (mask & (0xaa3)) // or whatever mask you want to check
{
}
This assumes that you're looking for patterns, not just want to count the number of true flags in the array.

Just loop through the array counting the number of bools set to true.
/**
 * @param arr The array of booleans to check.
 * @param n How many must be true for this function to return true.
 * @param len The length of arr.
 */
bool hasNTrue(bool *arr, int n, int len) {
    int boolCounter = 0; // must start at 0; reading an uninitialized counter is undefined behavior
    for(int i=0; i<len; i++) {
        if (arr[i]) boolCounter++;
    }
    return boolCounter>=n;
}
Then call it like so
hasNTrue(myArray, 3, myArrayLength);

Store the bools as bits in an integer. Then apply one of the bit twiddling hacks.
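One standard way to get the bit count without hand-written bit tricks is std::bitset, whose count() member returns the number of set bits. The sketch below swaps that in for the bit twiddling hacks; the function name and the fixed size of ten flags (taken from the question) are assumptions made for this example.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical helper: packs ten flags into a std::bitset and counts the set bits.
bool atLeastThreeSet(const bool (&flags)[10]) {
    std::bitset<10> bits;
    for (std::size_t i = 0; i < 10; ++i) {
        bits[i] = flags[i];
    }
    // count() returns how many bits are 1.
    return bits.count() >= 3;
}
```

If the flags are kept in a std::bitset from the start, the packing loop disappears and the check is just bits.count() >= 3.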
