I have a structure defined as
struct xyz {
char *str;
int x;
int y;
};
I receive it as an input parameter to my executable (program2) from another program (program1), which calls execve on program2 with this structure as the argument.
I wish to know, can I do the typecast of this input parameter as (struct xyz*)argv[1];, or I have to convert it to string format before sending it?
You can't pass arbitrary data to a command in that way. You'll have to serialize it to a string, or perform some IPC (e.g. through pipe/socket).
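The reason for this is that command-line arguments are null-terminated strings. Your char* member is only a pointer, which means nothing in another process, and the string it points to ends in \0. Even ignoring that, any non-negative int less than 16843009 (0x01010101) has at least one zero byte in it, so it would be cut off when copied as a string.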
argv[1] is a string. You can't cast a string to a struct. You need to create your own wrapper functions to serialize and parse, and then watch out for all sorts of issues -- endian-ness issues and encoding issues (wide chars with Unicode, etc.)
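As a rough sketch of the "serialize to strings" route, one option is to pass each field as its own argument and parse them back on the other side. The names Xyz, toArgs and fromArgs are made up for illustration, and Xyz uses std::string instead of the original char* so that it owns its text:

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for struct xyz; std::string replaces char* (assumption).
struct Xyz {
    std::string str;
    int x;
    int y;
};

// Sender side: turn each field into one argument string.
// These strings would become argv[1], argv[2], argv[3] of the child process.
std::vector<std::string> toArgs(const Xyz& v) {
    return {v.str, std::to_string(v.x), std::to_string(v.y)};
}

// Receiver side: args points at the three field strings (e.g. argv + 1).
// std::stoi throws std::invalid_argument if a field is not a number.
Xyz fromArgs(const char* const* args) {
    Xyz v;
    v.str = args[0];
    v.x = std::stoi(args[1]);
    v.y = std::stoi(args[2]);
    return v;
}
```

Because everything travels as decimal text, endianness and embedded zero bytes stop being a problem, at the cost of a parsing step in program2.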
[EDIT] I want to write a uint64_t into a char* array in network byte order so I can send it as a UDP datagram with sendto. A uint64_t has 8 bytes, so I convert them as follows:
void strcat_number(uint64_t v, char* datagram) {
uint64_t net_order = htobe64(v);
for (uint8_t i=0; i<8 ;++i) {
strcat(datagram, (const char*)((uint8_t*)&net_order)[i]);
}
}
which gives me
warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
strcat(datagram, (const char*)((uint8_t*)&net_order)[i]);
How can I get rid of this warning, or is there a simpler or clearer way to convert this number?
((uint8_t*)&net_order)
this is a pointer to net_order cast to a uint8_t pointer
((uint8_t*)&net_order)[i]
this is the i-th byte of the underlying representation of net_order.
(const char*)((uint8_t*)&net_order)[i]
this is the same as above, but forcibly cast to a const char *. The byte's value is reinterpreted as an address, which gives an invalid pointer, and that is what the compiler is warning you about. The conversion itself is only implementation-defined, but dereferencing the result (which strcat does) is undefined behavior and will almost surely result in a crash.
Notice that, even if you somehow managed to make this kludge work, strcat is still the wrong function, as it deals with NUL-terminated strings, while here you are trying to put binary data inside your buffer, and binary data can naturally contain embedded NULs. strcat will append at the first NUL (and stop at the first NUL in the second parameter) instead of at the "real" end.
If you are building a buffer of binary data you have to use straight memcpy, and most importantly you cannot use string-related functions that rely on the final NUL to know where the string ends, but you have to keep track explicitly of how many bytes you used (i.e. the current position in the datagram).
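A minimal sketch of that approach, assuming the caller guarantees the buffer has room for 8 more bytes. The name append_u64_be is made up, and the big-endian bytes are produced with shifts rather than htobe64 so the snippet does not depend on a platform-specific header:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: writes v in network (big-endian) byte order at
// buf + offset and returns the new offset. The caller must ensure that
// at least 8 bytes are available starting at buf + offset.
std::size_t append_u64_be(unsigned char* buf, std::size_t offset, std::uint64_t v) {
    unsigned char bytes[8];
    for (int i = 0; i < 8; ++i) {
        // Most significant byte first.
        bytes[i] = static_cast<unsigned char>(v >> (56 - 8 * i));
    }
    std::memcpy(buf + offset, bytes, sizeof bytes);
    return offset + sizeof bytes;
}
```

The returned offset is the "current position in the datagram", and its final value is the length you would hand to sendto.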
I am trying to print the value returned by NtQueryValueKey, which is UCHAR Data[1]. I have tried printf, cout, and string(Data, DataLength); the first two print only 1 character and the last one throws an exception. If I change the data type to WCHAR Data[1] and use wstring(Data), it is accepted without complaint, and wprintf also prints the value normally.
Edit: I meant NtQueryValueKey using the KEY_VALUE_PARTIAL_INFORMATION, I am using VS 2015 btw...
You must have mixed something up. You did not specify which value from the KEY_INFORMATION_CLASS enumeration you are passing as the second parameter to select the data type, but a quick look at MSDN shows that all of the structures contain WCHAR Name[1]; or something similar as the last member (which I guess is the one you are interested in). Can you elaborate and provide a link or other documentation that states you actually need to use UCHAR?
WCHAR is an alias for wchar_t. std::wstring operates with wchar_t elements. A WCHAR[] can decay to a wchar_t*, and thus can be assigned directly to a std::wstring.
UCHAR is an alias for unsigned char. std::string operates with char elements instead. A UCHAR[]/UCHAR* cannot be assigned directly to a std::string without a type-cast to char*, as char and unsigned char are distinct data types.
unsigned char is commonly used to represent 8bit bytes (it is the same data type used for BYTE).
NtQueryKey() returns strings as UTF-16LE encoded bytes using WCHAR[] character arrays, not UCHAR[] byte arrays. So your code is declaring things wrong if you are using UCHAR[] to begin with. But even so, you can use UCHAR if you pay attention to the encoding and byte length, and use appropriate type-casts.
Any associated Length value reported by NtQueryKey() is expressed in bytes, not characters. sizeof(UCHAR) is 1 and sizeof(WCHAR) is 2. So every 2 UCHARs represents 1 WCHAR. And the strings are not null-terminated, so you have to take the Length into account when printing or converting.
In Latin-based languages, most commonly used Unicode characters will be <= U+00FF, and thus every other UCHAR in UTF-16LE will usually be 0. That is interpreted as a null terminator when UTF-16 is printed with printf() or std::cout. You need to use wprintf() or std::wcout instead.
Converting Data to a std::string is a valid operation and should not be raising an exception:
std::string((char*)Data, DataLength)
Provided that:
Data is a valid pointer.
DataLength is an accurate byte count.
This can only raise an exception if:
Data is not pointing at valid memory.
the value of DataLength is more than the actual number of bytes allocated for Data.
available memory is too low to allocate std::string's internal buffer.
memory is corrupted.
Assigning Data by itself to a std::wstring without taking DataLength into account is not a valid operation because the strings are not null-terminated. You must specify the length:
std::wstring(Data, DataLength / sizeof(WCHAR))
If Data is UCHAR then use a type-cast:
std::wstring((WCHAR*)Data, DataLength / sizeof(WCHAR))
When printing Data directly with wprintf(), you must pass DataLength as an input parameter:
wprintf(L"%.*ls", (int)(DataLength / sizeof(WCHAR)), Data);
When printing Data directly with std::wcout, you should use write() instead of operator<< so you can pass DataLength as an input parameter:
std::wcout.write(Data, DataLength / sizeof(WCHAR));
If Data is UCHAR then use a type-cast:
std::wcout.write((WCHAR*)Data, DataLength / sizeof(WCHAR));
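If you would rather not depend on pointer casts at all, the bytes can also be decoded by hand. The sketch below is not Windows-specific: it uses char16_t and std::u16string as a stand-in for WCHAR and std::wstring (on Windows both are 16 bits wide), and the function name is made up:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode byteLen bytes of UTF-16LE data into UTF-16
// code units. No null terminator is required; a trailing odd byte is ignored.
std::u16string utf16le_bytes_to_string(const unsigned char* data, std::size_t byteLen) {
    std::u16string out;
    out.reserve(byteLen / 2);
    for (std::size_t i = 0; i + 1 < byteLen; i += 2) {
        // Low byte first, then high byte (little-endian).
        out.push_back(static_cast<char16_t>(data[i] | (data[i + 1] << 8)));
    }
    return out;
}
```

Note how the loop is driven entirely by the byte count, which is the same rule as dividing DataLength by sizeof(WCHAR) in the snippets above.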
There is a function which sends data to the server:
int send(
_In_ SOCKET s,
_In_ const char *buf,
_In_ int len,
_In_ int flags
);
Providing length seems to me a little bit weird. I need to write a function, sending a line to the server and wrapping this one such that we don't have to provide length explicitly. I'm a Java-developer and in Java we could just invoke String::length() method, but now we're not in Java. How can I do that, unless providing length as a template parameter? For instance:
void sendLine(SOCKET s, const char *buf)
{
}
Is it possible to implement such a function?
Use std::string:
void sendLine(SOCKET s, const std::string& buf) {
send (s, buf.c_str(), buf.size()+1, 0); //+1 will also transmit terminating \0.
}
On a side note: your wrapper function ignores the return value and doesn't take any flags.
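The return value matters because send may accept fewer bytes than you asked for, in which case the rest has to be sent again. Here is a rough sketch of such a loop. To keep it independent of Winsock, the actual call is passed in as a callable; sendAll and sendSome are made-up names, and in real code sendSome would be a small lambda around send(s, buf, len, flags):

```cpp
#include <cstddef>
#include <string>

// sendSome is any callable shaped like int(const char* buf, int len) that
// returns the number of bytes accepted, or a value <= 0 on failure
// (an assumption standing in for a wrapped send call).
template <typename SendFn>
bool sendAll(SendFn&& sendSome, const std::string& line) {
    std::size_t sent = 0;
    while (sent < line.size()) {
        int n = sendSome(line.data() + sent, static_cast<int>(line.size() - sent));
        if (n <= 0) {
            return false;  // error or connection closed; caller decides what to do
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

With Winsock this might be called as sendAll([&](const char* p, int n) { return send(s, p, n, 0); }, line).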
You can retrieve the length of a C string with the strlen(const char*) function.
Make sure all the strings are null-terminated, and keep in mind that strlen does not count the terminator: if you want to transmit it as well, the length grows by 1.
Edit: My answer originally only mentioned std::string. I've now also added std::vector<char> to account for situations where send is not used for strictly textual data.
First of all, you absolutely need a C++ book. You are looking for either the std::string class or for std::vector<char>, both of which are fundamental elements of the language.
Your question is a bit like asking, in Java, how to avoid char[] because you never heard of java.lang.String, or how to avoid arrays in general because you never heard of java.util.ArrayList.
For the first part of this answer, let's assume you are dealing with just text output here, i.e. with output where a char is really meant to be a text character. That's the std::string use case.
Providing length seems to me a little bit weird.
That's the way strings work in C. A C string is really a pointer to a memory location where characters are stored. Normally, C strings are null-terminated. This means that the last character stored for the string is '\0'. It means "the string stops here, and if you move further, you enter illegal territory".
Here is a C-style example:
#include <string.h>
#include <stdio.h>
void f(char const* s)
{
size_t l = strlen(s); // l = 3
printf("%s", s); // prints "foo"; never pass untrusted data as the format string
}
int main()
{
char* test = new char[4]; // avoid new[] in real programs
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
test[3] = '\0';
f(test);
delete[] test;
}
strlen just counts all characters at the specified position in memory until it finds '\0'. printf just writes all characters at the specified position in memory until it finds '\0'.
So far, so good. Now what happens if someone forgets about the null terminator?
char* test = new char[3]; // don't do this at home, please
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
f(test); // uh-oh, there is no null terminator...
The result will be undefined behaviour. strlen will keep looking for '\0'. So will printf. The functions will try to read memory they are not supposed to. The program is allowed to do anything, including crashing. The evil thing is that most likely, nothing will happen for a while because a '\0' just happens to be stored there in memory, until one day you are not so lucky anymore.
That's why C functions are sometimes made safer by requiring you to explicitly specify the number of characters. Your send is such a function. It works fine even without null-terminated strings.
So much for C strings. And now please don't use them in your C++ code. Use std::string. It is designed to be compatible with C functions by providing the c_str() member function, which returns a null-terminated char const * pointing to the contents of the string, and it of course has a size() member function to tell you the number of characters without the null-terminated character (e.g. for a std::string representing the word "foo", size() would be 3, not 4, and 3 is also what a C function like yours would probably expect, but you have to look at the documentation of the function to find out whether it needs the number of visible characters or number of elements in memory).
In fact, with std::string you can just forget about the whole null-termination business. Everything is nicely automated. std::string is exactly as easy and safe to use as java.lang.String.
Your sendLine should thus become:
void sendLine(SOCKET s, std::string const& line)
{
send(s, line.c_str(), static_cast<int>(line.size()), 0); // last argument: flags
}
(Passing a std::string by const& is the normal way of passing big objects in C++. It's just for performance, but it's such a widely-used convention that your code would look strange if you just passed std::string.)
How can I do that, unless providing length as a template parameter?
This is a misunderstanding of how templates work. With a template, the length would have to be known at compile time. That's certainly not what you intended.
Now, for the second part of the answer, perhaps you aren't really dealing with text here. It's unlikely, as the name "sendLine" in your example sounds very much like text, but perhaps you are dealing with raw data, and a char in your output does not represent a text character but just a value to be interpreted as something completely different, such as the contents of an image file.
In that case, std::string is a poor choice. Your output could contain '\0' characters that do not have the meaning of "data ends here", but which are part of the normal contents. In other words, you don't really have strings anymore, you have a range of char elements in which '\0' has no special meaning.
For this situation, C++ offers the std::vector template, which you can use as std::vector<char>. It is also designed to be usable with C functions by providing a member function that returns a char pointer. Here's an example:
void sendLine(SOCKET s, std::vector<char> const& data)
{
send(s, &data[0], static_cast<int>(data.size()), 0); // data must not be empty
}
(The unusual &data[0] syntax means "pointer to the first element of the encapsulated data". C++11 has a nicer-to-read way of doing this, data.data(), but &data[0] also works in older versions of C++.)
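To make the difference concrete, here is a small sketch of what happens when raw data with an embedded '\0' is handed to a function that expects a C string. The helper name is made up, and it uses the C++11 data() member:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: how many bytes a function that relies on the
// terminating '\0' would see in this buffer.
// Precondition: data contains at least one '\0'; otherwise strlen would
// read past the end of the vector.
std::size_t bytesSeenAsCString(const std::vector<char>& data) {
    return std::strlen(data.data());
}
```

For a vector holding the five bytes 'a', 'b', '\0', 'c', 'd', size() reports 5 while the helper reports 2, which is why the length passed to send should come from size().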
Things to keep in mind:
std::string is like String in Java.
std::vector is like ArrayList in Java.
std::string is for a range of char with the meaning of text, std::vector<char> is for a range of char with the meaning of raw data.
std::string and std::vector are designed to work together with C APIs.
Do not use new[] in C++.
Understand the null termination of C strings.
Can someone explain and help me out here please.
Let's say I have a function like this, where lpData holds a pointer to the data I want.
void foo(LPVOID lpData) {
}
What is the proper way to retrieve this? This works, but I get weird characters at the end
void foo(LPVOID lpData) {
LPVOID *lpDataP = (LPVOID *)lpData;
char *charData = (char*)lpDataP;
//i log charData....
}
I would prefer to use strings, but I don't understand how to retrieve the data; I just get a null pointer error when I try to use string. lpData holds a pointer, right? (But my function parameter is lpData, not *lpData.) So it isn't working? Am I doing this all wrong?
string *datastring = reinterpret_cast<std::string *>(lpData);
is what I'm trying.
This works but i get weird characters at the end
That means that your string isn't null-terminated—that is, it doesn't have a NUL byte (0) marking the end of the string.
C strings have to be null-terminated.* When you log a C string (char *), it keeps logging characters until it finds a NUL. If there wasn't one on the end of the string, it'll keep going through random memory until it finds one (or until you hit a page fault and crash). This is bad. And there's no way to fix it; once you lose the length, there's no way to get it back.
However, an unterminated string along with its length can be useful. Many functions can take the length alongside the char *, as an extra argument (e.g., the string constructor) or otherwise (e.g., width specifiers in printf format strings).
So, if you take the length, and only call functions that also take the length—or just make a null-terminated copy and use that—you're fine. So:
void foo(LPVOID lpData, int cchData) {
string sData(static_cast<const char *>(lpData), cchData);
// now do stuff with sData
}
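The printf route mentioned above can be sketched like this. The precision in "%.*s" caps how many bytes are read, so the data does not need a terminator. formatBounded is a made-up name, and the 64-byte output buffer is an arbitrary choice (longer input would be truncated by snprintf):

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats at most `length` bytes of `data` between
// brackets. `data` does not have to be null-terminated.
std::string formatBounded(const char* data, int length) {
    char out[64];
    std::snprintf(out, sizeof out, "[%.*s]", length, data);
    return out;
}
```

The same "%.*s" specifier works directly in a logging call, as long as the length travels with the pointer.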
Meanwhile, casting from LPVOID (aka void *, aka pointer-to-anything) to LPVOID * (aka void **, aka pointer to pointer-to-anything) to then cast to char * (pointer-to-characters) is wrong (and should be giving you a compiler warning in the second cast; if you're getting warnings and ignoring them, don't do that!). Also, it's generally better to use modern casts instead of C-style casts, and it's always better to be const-correct when there's no down side; it just makes things more explicit to the reader and safer in the face of future maintenance.
Finally:
string *datastring = reinterpret_cast<std::string *>(lpData);
This is almost certainly wrong.** The LPVOID is just pointing at a bunch of characters. You're saying you want to interpret those characters as if they were a string object. But a string object is some header information (maybe a length and capacity, etc.) plus a pointer to a bunch of characters. Treating one as the other is going to lead to garbage or crashes.***
* Yes, you're using C++, not C, but a char * is a "C string".
** If you actually have a string object that you've kept alive somewhere, and you stashed a pointer to that object in an LPVOID and have now retrieved it (e.g., with SetWindowLongPtr/GetWindowLongPtr), then a cast from LPVOID to string * would make sense. But I doubt that's what you're doing. (If you are, then you don't need the reinterpret_cast. The whole point of void * is that it's not interpreted, so there's nothing to reinterpret from. Just use static_cast.)
*** Or, worst of all, it may appear to work, but then lead to hard-to-follow crashes or corruption. Some standard C++ libraries use a special allocator to put the header right before the characters and return a pointer to the first character, so that a string can be used anywhere a char * can. Inside the string class, every method has to fudge the this pointer backward; for example, instead of just saying m_length it has to do something like reinterpret_cast<_string_header *>(this)[-1].m_length. But the other way around doesn't work: if you just have a bunch of characters, not a string object, that fudge is going to read whatever bytes happened to be allocated right before the characters and try to interpret them as an integer, so you may end up thinking you have a string of length 0, or 182423742341241243.
There are at least two ways:
void foo(LPVOID lpData)
{
char *charData = (char*)lpData;
//i log charData....
}
or
void foo(LPVOID lpData)
{
char *charData = static_cast<char*>(lpData);
//i log charData....
}
I came across the statement:
outbal.write( (char*) &acc , sizeof( struct status ) );
outbal is an object of ofstream and status is a type.
Therefore:
struct status {
// code
};
status acc;
Talking about the second line I don't understand the first argument, which is (char*) &acc. What are we doing and how?
(char*)&acc is the address of the variable acc, cast to a char pointer so as to be compatible with ostream::write(). It is that variable that is being written to the outbal stream, for a length of sizeof(struct status).
ostream::write takes a memory address and a length and will output that memory to the specified stream. In other words, you're simply outputting the entire memory contents of the acc variable.
Your code is similar to:
struct xyz {int a; float b; void *c;};
ofstream os("myfile.dat");
struct xyz abc; // 'struct' not technically needed in C++
os.write ( (char *)&abc, sizeof (struct xyz));
//         <<-memory addr->> <<-----size----->>
You are taking the address of acc and casting it to char*, which is what the ostream::write member function expects.
In short, you are writing the in-memory representation of the struct as-is to a stream.
(char*) &acc takes the address of the struct acc (i.e. a pointer to acc) and then casts it into a pointer to a char.
That's just taking the address of acc and casting it to a pointer to char.
Most likely that .write() method is meant to just blindly write a given number of bytes out as-is. char is a convenient type to use for that, since it is by definition exactly one byte in size. So you pass it a pointer to the data you want written out, telling it, "Pretend this is a pointer to char".
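For a self-contained illustration, here is a sketch that writes a struct's bytes and reads them back. It uses string streams instead of a file so it can run anywhere, and the Status fields are invented, since the original struct body is not shown:

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for struct status; the real fields are unknown.
struct Status {
    int id;
    double balance;
};

// Writes the in-memory representation of acc, byte for byte.
std::string dumpBytes(const Status& acc) {
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(&acc), sizeof acc);
    return out.str();
}

// Reads the same number of bytes back into a fresh object.
Status loadBytes(const std::string& bytes) {
    Status acc{};
    std::istringstream in(bytes);
    in.read(reinterpret_cast<char*>(&acc), sizeof acc);
    return acc;
}
```

This only round-trips safely for simple structs without pointers or std::string members, and only between programs built with the same compiler settings and byte order, because padding and endianness are written out as-is.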