Convert Address to long - variable results in value? - c++

Can anybody explain this behaviour to me, please?
static short nDoSomething(const char* pcMsg, ...)
{
    va_list pvArgument;
    long lTest;
    void* pvTest = NULL;

    va_start(pvArgument, pcMsg);
    pvTest = va_arg(pvArgument, void*);
    lTest = (long) pvTest;
    va_end(pvArgument);
    return 0;
}
If I call this function from main like this:
int main(int argc, char* argv[])
{
    char acTest1[20];
    nDoSomething("TestMessage", 1234567L, acTest1);
    return 0;
}
I thought that lTest would end up holding an address (the value of pvTest), but in fact it contains 1234567 ...
How is this possible?

Your code contains undefined behavior; the standard requires that the type extracted using va_arg correspond to the type passed (modulo cv-qualifiers, perhaps): you passed a long and read a void*, so anything the compiler does is correct. In practice, most compilers generate code which does no type checking. If, on your machine, long and void* have the same size (and the machine has linear addressing), you will probably end up with whatever you passed as the long. If the sizes of the two are different, but the machine is little-endian and you pass a small enough value, you might end up with the same value as well. But this is not at all guaranteed.

You are just lucky here.
va_start(pvArgument, pcMsg);
prepares for va_arg(pvArgument, T) to extract the next variadic argument following pcMsg, on the presumption that it is of type T. The next argument after pcMsg is in fact the long int 1234567; but you wrongly extract it as a void* and then cast it to long into lTest. You are just lucky that a void* on your system is the same size as a long.
(Or maybe I mean oddly unlucky.)
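
For reference, here is a minimal sketch of the corrected extraction, assuming the same call from main above; each va_arg must name the type that was actually passed:

#include <cstdarg>

static short nDoSomething(const char* pcMsg, ...)
{
    va_list pvArgument;
    va_start(pvArgument, pcMsg);
    long lTest = va_arg(pvArgument, long);    // the first variadic argument was a long
    char* pcTest = va_arg(pvArgument, char*); // the second was a char array, which decays to char*
    va_end(pvArgument);
    return 0;
}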

Related

Taking an index out of const char* argument

I have the following code:
int some_array[256] = { ... };

int do_stuff(const char* str)
{
    int index = *str;
    return some_array[index];
}
Apparently the above code causes a bug on some platforms, because *str can in fact be negative. So I thought of two possible solutions:
1. Casting the value on assignment (unsigned int index = (unsigned char)*str;).
2. Passing const unsigned char* instead.
Edit: The rest of this question did not get a treatment, so I moved it to a new thread.
The signedness of char is indeed platform-dependent, but what you do know is that there are as many values of char as there are of unsigned char, and the conversion is injective. So you can absolutely cast the value to associate a lookup index with each character:
unsigned char idx = *str;
return arr[idx];
You should of course make sure that arr has at least UCHAR_MAX + 1 elements. (This may cause hilarious edge cases when sizeof(unsigned long long int) == 1, which is fortunately rare.)
Characters are allowed to be signed or unsigned, depending on the platform. An assumption of unsigned range is what causes your bug.
Your do_stuff code does not treat const char* as a string representation. It uses it as a sequence of byte-sized indexes into a look-up table. Therefore, there is nothing wrong with forcing unsigned char type on the characters of your string inside do_stuff (i.e. use your solution #1). This keeps re-interpretation of char as an index localized to the implementation of do_stuff function.
Of course, this assumes that other parts of your code do treat str as a C string.
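
As a minimal sketch of solution #1 applied to the function above (assuming some_array has at least UCHAR_MAX + 1 elements, which 256 satisfies on the usual 8-bit-char platforms):

int do_stuff(const char* str)
{
    // Going through unsigned char guarantees a non-negative index.
    unsigned char index = (unsigned char)*str;
    return some_array[index];
}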

Compiler Error: Invalid Conversion from int* to unsigned int* [-fpermissive]

I'm having the strangest issue today. I was working with an example online, and to my lack of surprise, it didn't work (they pretty much never do). I went about fixing it myself, but I seem to be stuck on this error:
Error: Invalid Conversion from int* to unsigned int* [-fpermissive]
I understand this: I'm providing an int* where it wants an unsigned int*. However, I don't actually know why the int* is being generated.
Here's the snippet code throwing the problem:
unsigned char md_value[EVP_MAX_MD_SIZE];
int md_length;
EVP_DigestFinal_ex(md_ctx, md_value, &md_length);
The third argument of that function call, &md_length, is causing the problem. Looking at the documentation for that function (from OpenSSL, if it matters), it expects the argument to be of type unsigned int*, which makes sense, because it wants an address (or at least that's how the example I'm working with is using it).
Funny thing is, I thought that the & operator returned an unsigned int*, as returning an int* doesn't make sense, seeing as computers don't have negative addresses in their memory.
Here's the example I'm following along with, if you wish to take a look: https://www.openssl.org/docs/crypto/EVP_DigestInit.html
Below is the Source Code, should you want to try it out yourself. I doubt you'll actually need to read it to solve this problem, but having it here couldn't hurt.
Source Code:
//* Description *//
// Title: Example
// Author: Boom Blockhead
// Example Program to Test OpenSSL

//* Libraries *//
#include <stdio.h>
#include <cstring>
#include <openssl/evp.h>

//* Namespaces *//
using namespace std;

//* Main Method *//
// Runs the Program and Interprets the Command Line
int main(int argc, char* argv[])
{
    // Initialize the Messages
    char msg1[] = "Test Message\n";
    char msg2[] = "Hello World\n";

    // Validate Argument Count
    if(argc != 2)
    {
        printf("Usage: %s digestname\n", argv[0]);
        exit(1);
    }

    // Determine Message Digest by Name
    OpenSSL_add_all_digests();
    const EVP_MD* md = EVP_get_digestbyname(argv[1]);

    // Make sure a Message Digest Type was Found
    if(md == 0)
    {
        printf("Unknown Message Digest %s\n", argv[1]);
        exit(1);
    }

    // Create the Message Digest Context
    EVP_MD_CTX* md_ctx = EVP_MD_CTX_create();

    // Setup the Message Digest Type
    EVP_DigestInit_ex(md_ctx, md, NULL);

    // Add the Messages to be Digested
    EVP_DigestUpdate(md_ctx, msg1, strlen(msg1));
    EVP_DigestUpdate(md_ctx, msg2, strlen(msg2));

    // Digest the Message
    unsigned char md_value[EVP_MAX_MD_SIZE];
    int md_length;
    EVP_DigestFinal_ex(md_ctx, md_value, &md_length); // <--- ERROR

    // Destroy the Message Digest Context
    EVP_MD_CTX_destroy(md_ctx);

    // Print the Digested Text
    printf("Digest is: ");
    for(int i = 0; i < md_length; i++)
        printf("%02x", md_value[i]);
    printf("\n");

    // Clean up the Message Digest
    EVP_cleanup();

    // Exit the Program
    exit(0);
    return 0;
}
Also, putting in an explicit cast to (unsigned int*) seems to just make things worse, as I then get the following errors:
example.cpp:(.text+0x91): undefined reference to `OpenSSL_add_all_digests`
...
example.cpp:(.text+0x1ec): undefined reference to `EVP_cleanup`
Basically, it complains about all of the OpenSSL functions.
Lastly (again, this is just to give you guys everything you could possibly need), I'm not sending in any funny arguments to the compiler. I'm just using:
gcc example.cpp -o example.out
Since there's an error, the example.out never actually gets created.
Funny thing is, I thought that the & operator returned an unsigned int*, as returning an int* doesn't make sense, seeing as computers don't have negative addresses in their memory.
That's not "funny"; it's just wrong.
The address-of operator applied to an object of type T gives you a pointer of type T*.
Period.
Whether T is an unsigned or signed type doesn't come into it, and has nothing to do with any philosophical debate about whether computers may have negative addresses. In fact, pointers are generally signed because, if they weren't, you'd soon get stuck when you try to take the difference between two addresses, or walk backwards in the address space.
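(If you're curious, the signed flavor shows up in pointer subtraction: the difference of two pointers has the signed type std::ptrdiff_t, from <cstddef>.)

int a[10];
std::ptrdiff_t d = &a[2] - &a[7]; // -5: pointer differences are signed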
But that has nothing to do with the use of the term unsigned in your code.
as returning an int* doesn't make sense, seeing as computers don't have negative addresses in their memory.
You are misunderstanding what the type name means.
unsigned int* is a pointer to an unsigned int. unsigned does not refer to the pointer value.
So the solution is to change your int to unsigned int.
Funny thing is, I thought that the & operator returned an unsigned int*, as returning an int* doesn't make sense, seeing as computers don't have negative addresses in their memory
It's not the signed-ness of the memory address number (the pointer's underlying value), it's the signed-ness of the datatype stored within the memory addresses, in this case integers.
Change your md_length to an unsigned int as per the spec and it should be OK.
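
In code, the fix is a single declaration (a sketch against the snippet from the question):

unsigned char md_value[EVP_MAX_MD_SIZE];
unsigned int md_length; // was: int md_length;
EVP_DigestFinal_ex(md_ctx, md_value, &md_length); // now matches the unsigned int* parameter

(This also explains why the cast seemed to "make things worse": once the cast lets compilation succeed, the build gets as far as linking, where the missing OpenSSL library causes those undefined references. Linking against libcrypto, e.g. g++ example.cpp -o example.out -lcrypto, addresses that.)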

Why do they want an 'unsigned char*' and not just a normal string or 'char*'

EDIT: After taking advice I have rearranged the parameters & types. But the application crashes when I call the digest() function now. Any ideas what's going wrong?
const std::string message = "to be encrypted";
unsigned char* hashMessage;
SHA256::getInstance()->digest( message, hashMessage ); // crash occurs here, what am I doing wrong?
printf("AFTER: n"); //, hashMessage); // line never reached
I am using an open source implementation of the SHA256 algorithm in C++. My problem is understanding how to pass an unsigned char* version of my string so it can be hashed.
This is the function that takes an unsigned char* version of my string:
void SHA256::digest(const std::string &buf, unsigned char *dig) {
    init();
    update(reinterpret_cast<const unsigned char *>(buf.c_str()),
           static_cast<unsigned int>(buf.length()));
    final();
    digest(dig);
}
How can I convert my string (which I want hashed) to an unsigned char*?
The following code I have made causes a runtime error when I go to print out the string contents:
const std::string hashOutput;
char message[] = "to be encrypted";
printf("BEFORE: %s bb\n", hashOutput.c_str());
SHA256::getInstance()->digest( hashOutput, reinterpret_cast<unsigned char *>(message) );
printf("AFTER: %s\n", hashOutput.c_str()); // CRASH occurs here
PS: I have been looking at many implementations of SHA256 & they all take an unsigned char* as the message to be hashed. Why do they do that? Why not a char* or a string instead?
You have the parameters around the wrong way: buf is the input (the data to be hashed) and dig is the output digest (the hash).
Furthermore, a hash is binary data. You will have to convert said binary data into some string representation prior to printing it to screen. Normally, people choose to use a hexadecimal string for this.
The reason that unsigned char is used is that it has guaranteed behaviour under bitwise operations, shifts, and overflow.
char (when it corresponds to signed char) does not give any of these guarantees, and so is far less usable for operations intended to act directly on the underlying bits in a string.
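
As a sketch of the hexadecimal conversion mentioned above (nothing library-specific here; the length is whatever your digest size is, e.g. SHA256_DIGEST_SIZE):

#include <cstddef>
#include <cstdio>

// Print each byte of a binary digest as two hexadecimal characters.
void printHex(const unsigned char* dig, std::size_t len)
{
    for (std::size_t i = 0; i < len; i++)
        printf("%02x", dig[i]);
    printf("\n");
}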
The answer to the question: "why does it crash?" is "you got lucky!". Your code has undefined behaviour. In short, you are writing through a pointer hashMessage that has never been initialised to point to any memory. A short investigation of the source code for the library that you are using reveals that it requires the digest pointer to point to a block of valid memory that is at least SHA256_DIGEST_SIZE chars long.
To fix this problem, all that you need to do is to make sure that the pointer that you pass in as the digest argument (hashMessage) is properly initialised, and points to a block of memory of sufficient size. In code:
const std::string message("to be encrypted");
unsigned char hashMessage[SHA256_DIGEST_SIZE];
SHA256::getInstance()->digest( message, hashMessage );
//hashMessage should now contain the hash of message.
I don't know how a SHA256 hash is produced, but maybe it involves some sort of arithmetic that needs to be done on an unsigned data type.
Why does it matter? Get a char* from your string object by calling the c_str() method, then cast to unsigned char*.
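That is, a one-line sketch (myString standing in for your std::string):

const unsigned char* p = reinterpret_cast<const unsigned char*>(myString.c_str());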

What can I do with an unsigned char* when I needed a string?

Suppose that I have an unsigned char*; let's call it some_data:
unsigned char* some_data;
And some_data has URL-like data in it, for example:
"aasdASDASsdfasdfasdf&Foo=cow&asdfasasdfadsfdsafasd"
I have a function that can grab the value of 'foo' as follows:
// looks for the value of 'foo'
bool grabFooValue(const std::string& p_string, std::string& p_foo_value)
{
size_t start = p_string.find("Foo="), end;
if(start == std::string::npos)
return false;
start += 4;
end = p_string.find_first_of("& ", start);
p_foo_value = p_string.substr(start, end - start);
return true;
}
The trouble is that I need a string to pass to this function, or at least a char* (which can be converted to a string no problem).
I can solve this problem by casting:
reinterpret_cast<char *>(some_data)
And then pass it to the function all okie-dokie
...
Until I used valgrind and found out that this can lead to a subtle memory error.
Conditional jump or move depends on uninitialised value(s) __GI_strlen
From what I gathered, it has to do with the reinterpret cast messing up the null indicating the end of the string. Thus when C++ tries to figure out the length of the string, things get screwy.
Given that I can't change the fact that some_data is represented by an unsigned char*, is there a way to go about using my grabFooValue function without having these subtle problems?
I'd prefer to keep the value-finding function that I already have, unless there is clearly a better way to rip the foo-value out of this (sometimes large) unsigned char*.
And despite some_data's varying, and sometimes large, size, I can assume that the value of 'foo' will be somewhere early on, so my thought was to try and get a char* of the first X characters of the unsigned char*. This could potentially get rid of the string-length issue by letting me set where the char* ends.
I tried using a combination of strncpy and casting but so far no dice. Any thoughts?
You need to know the length of the data your unsigned char * points to, since it isn't 0-terminated.
Then, use e.g.:
std::string s((char *) some_data, (char *) some_data + len);
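
A usage sketch, where some_len stands in for however you learn the buffer's length (the question doesn't say where that comes from):

std::string s((char *) some_data, (char *) some_data + some_len);
std::string foo_value;
if (grabFooValue(s, foo_value))
    printf("%s\n", foo_value.c_str()); // prints "cow" for the sample data above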

How to convert char* to unsigned short in C++

I have a char* name which is a string representation of the short I want, such as "15", and I need to output this as an unsigned short unitId to a binary file. This cast must also be cross-platform compatible.
Is this the correct cast: unitId = unsigned short(temp);
Please note that I am at a beginner level in understanding binary.
I assume that your char* name contains a string representation of the short that you want, i.e. "15".
Do not cast a char* directly to a non-pointer type. Casts in C don't actually change the data at all (with a few exceptions); they just inform the compiler that you want to treat a value of one type as another type. If you cast a char* to an unsigned short, you'll be taking the value of the pointer itself (which has nothing to do with the contents) and chopping off everything that doesn't fit into a short. This is absolutely not what you want.
Instead use the std::strtoul function, which parses a string and gives you back the equivalent number:
unsigned short number = (unsigned short) strtoul(name, NULL, 0);
(You still need to use a cast, because strtoul returns an unsigned long. This cast is between two different integer types, however, and so is valid. The worst that can happen is that the number inside name is too big to fit into a short, a situation that you can check for elsewhere.)
#include <boost/lexical_cast.hpp>
unitId = boost::lexical_cast<unsigned short>(temp);
To convert a string to binary in C++ you can use stringstream.
#include <sstream>
. . .
int somefunction()
{
    unsigned short num;
    const char *name = "123"; // const: string literals aren't writable
    std::stringstream ss(name);
    ss >> num;
    if (ss.fail() == false)
    {
        // You can write out the binary value of num. Since you mention
        // cross platform in your question, be sure to enforce a byte order.
    }
    return 0;
}
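
For instance, a sketch of enforcing big-endian byte order by hand (file is an assumed open FILE*, and the value is assumed to fit in 16 bits):

unsigned char bytes[2];
bytes[0] = (unsigned char)(num >> 8);   // high byte first
bytes[1] = (unsigned char)(num & 0xFF); // low byte second
fwrite(bytes, 1, 2, file);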
That cast will give you (a truncated) integer version of the pointer, assuming temp is also a char*. This is almost certainly not what you want (and the syntax is wrong too).
Take a look at the function atoi; it may be what you need, e.g. unitId = (unsigned short)(atoi(temp));
Note that this assumes that (a) temp is pointing to a string of digits and (b) the digits represent a number that can fit into an unsigned short.
Is the pointer name the id, or the string of chars pointed to by name? That is, if name contains "1234", do you need to output 1234 to the file? I will assume this is the case, since the other case, which you would do with unitId = unsigned short(name), is certainly wrong.
What you want then is the strtoul() function.
char *endp;
unitId = (unsigned short)strtoul(name, &endp, 0);
if (endp == name) {
    /* The conversion failed. The string pointed to by name does not look like a number. */
}
Be careful about writing binary values to a file; the result of doing the obvious thing may work now but will likely not be portable.
If you have a string (char* in C) representation of a number you must use the appropriate function to convert that string to the numeric value it represents.
There are several functions for doing this. They are documented here:
http://www.cplusplus.com/reference/clibrary/cstdlib