Convert Char array to uint8_t vector - c++

For a project I need to send encoded messages, but I can only pass a vector of uint8_t to be sent. I have a char array (containing numbers and strings that I converted to hexadecimal) and a pointer to that array. I encode the message, which is an object, into the buffer; then I have to send it, decode it, and so on.
char buffer[1024];
char *p = buffer;
size_t bufferSize = sizeof(buffer);
Encode(msg, p, bufferSize);
std::vector<uint8_t> encodedmsg; // here I need to put the message from the buffer
Send(encodedmsg.data(), encodedmsg.size()); // only takes uint8_t data
Here is the prototype of Send:
uint32_t Send(const uint8_t * buffer, const std::size_t bufferSize);
I have already looked at some questions, but none of them needed to put the data in a vector or convert it to uint8_t.
I thought about memcpy, reinterpret_cast, or maybe a for loop, but I don't really know how to do it without any loss.
Thanks,

Actually, your code suggests that the Send() function takes a pointer to uint8_t, not a std::vector<uint8_t>.
And since char and uint8_t have the same size, you could just do:
Send(reinterpret_cast<uint8_t*>(p), bufferSize);
But if you want to do everything "right" you could do this:
encodedmsg.resize(bufferSize);
std::transform(p, p + bufferSize, encodedmsg.begin(), [](char v) {return static_cast<uint8_t>(v);});
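As a variation on the two options above, the vector can also be filled directly from the char buffer with the iterator-range constructor. This is only a sketch: the helper name toBytes and the parameter used (the number of bytes Encode actually wrote, which the question does not show) are assumptions, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy `used` bytes of a char buffer into a byte vector.
// `used` is assumed to be the encoded length, not necessarily sizeof(buffer).
std::vector<std::uint8_t> toBytes(const char* data, std::size_t used) {
    assert(data != nullptr || used == 0);
    // Reading char storage through an unsigned 8-bit pointer keeps every bit pattern.
    const std::uint8_t* first = reinterpret_cast<const std::uint8_t*>(data);
    return std::vector<std::uint8_t>(first, first + used);
}
```

With this, the call from the question would look like `auto encodedmsg = toBytes(buffer, used); Send(encodedmsg.data(), encodedmsg.size());`. Passing the full bufferSize also works, but it sends the unused tail of the buffer as well.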

Related

How to create streambuf with array of unsigned char in C++

I want to create a std::istream object with a stream buffer that can read raw byte data from an array of unsigned char. I searched and found this link.
However, it creates the stream buffer from an array of char:
struct membuf : std::streambuf
{
membuf(char* begin, char* end) {
this->setg(begin, begin, end);
}
};
I thought about a type cast, but I don't want to modify the original data. So how can it be done using unsigned char?
With std::istream you cannot use unsigned char directly, because std::istream is a typedef for std::basic_istream<char>. You can cast your buffer pointers to char*:
this->setg(reinterpret_cast<char*>(begin), reinterpret_cast<char*>(begin), reinterpret_cast<char*>(end));
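Note that converting values greater than CHAR_MAX to char is implementation-defined (of course, this only matters if you actually use these values as char).
Or you can try std::basic_istream<unsigned char> (I have not tried it, though). Be aware that the standard library is not required to provide std::char_traits<unsigned char>, so this may need a custom traits class.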
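To make the cast approach concrete, here is a minimal sketch of the membuf from the question, changed to accept unsigned char pointers. The casts only reinterpret the pointers; the underlying bytes are not modified. The constructor signature is an assumption based on the snippet in the question.

```cpp
#include <cassert>
#include <istream>
#include <streambuf>

// Read-only stream buffer over an existing array of unsigned char.
struct membuf : std::streambuf {
    membuf(unsigned char* begin, unsigned char* end) {
        assert(begin <= end);
        char* b = reinterpret_cast<char*>(begin);
        char* e = reinterpret_cast<char*>(end);
        // Get area: start, current position, one past the end.
        setg(b, b, e);
    }
};
```

A std::istream built on top of it (`membuf buf(data, data + size); std::istream in(&buf);`) returns each byte from get() as a value in the range 0 to 255, because the char traits convert through unsigned char.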

Convert char* to uint8_t

I transfer messages through a CAN protocol.
To do so, the CAN message needs data of type uint8_t, so I need to convert my char* to uint8_t. Based on my research on this site, I produced this code:
char* bufferSlidePressure = ui->canDataModifiableTableWidget->item(6,3)->text().toUtf8().data();//My char*
/* Conversion */
uint8_t slidePressure [8];
sscanf(bufferSlidePressure,"%c",
&slidePressure[0]);
As you can see, my char* must fit in slidePressure[0].
My problem is that even though I get no errors during compilation, the data in slidePressure is totally incorrect. Indeed, when I test it with a char* = 0, I get unknown characters... So I think the problem must come from the conversion.
My data can be Bool, Uchar, Ushort and float.
Thanks for your help.
Is your string an integer? E.g. char* bufferSlidePressure = "123";?
If so, I would simply do:
uint8_t slidePressure = (uint8_t)atoi(bufferSlidePressure);
Or, if you need to put it in an array:
slidePressure[0] = (uint8_t)atoi(bufferSlidePressure);
Edit: Following your comment, if your data could be anything, I guess you would have to copy it into the buffer of the new data type. E.g. something like:
/* in case you'd expect a float*/
float slidePressure;
memcpy(&slidePressure, bufferSlidePressure, sizeof(float));
/* in case you'd expect a bool*/
bool isSlidePressure;
memcpy(&isSlidePressure, bufferSlidePressure, sizeof(bool));
/*same thing for uint8_t, etc */
/* in case you'd expect char buffer, just a byte to byte copy */
char * slidePressure = new char[ size ]; // or a stack buffer
memcpy(slidePressure, (const char*)bufferSlidePressure, size ); // no sizeof, since sizeof(char)=1
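The memcpy idea above only makes sense when the buffer really holds the raw bytes of the target type (not its text form). A small sketch of that round trip for a float follows; the function names are hypothetical, and it assumes sender and receiver share the same float representation and byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy the object representation of a float into a byte array of sizeof(float) bytes.
void floatToBytes(float value, std::uint8_t* out) {
    assert(out != nullptr);
    std::memcpy(out, &value, sizeof value);
}

// Rebuild the float from bytes; memcpy avoids alignment and aliasing problems.
float floatFromBytes(const std::uint8_t* bytes) {
    assert(bytes != nullptr);
    float value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```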
uint8_t is 8 bits of memory, and can store values from 0 to 255
char is probably 8 bits of memory
char * is probably 32 or 64 bits of memory containing the address of a different place in memory in which there is a char
First, make sure you don't try to put the memory address (the char *) into the uint8 - put what it points to in:
char from;
char * pfrom = &from;
uint8_t to;
to = *pfrom;
Then work out what you are really trying to do ... because this isn't quite making sense. For example, a float is probably 32 or 64 bits of memory. If you think there is a float somewhere in your char * data you have a lot of explaining to do before we can help :/
char * is a pointer, not a single character. It is possible that it points to the character you want.
uint8_t is unsigned but on most systems will be the same size as a char and you can simply cast the value.
You may need to manage the memory and lifetime of what your function returns. This could be done with vector< unsigned char> as the return type of your function rather than char *, especially if toUtf8() has to create the memory for the data.
Your question is totally ambiguous.
ui->canDataModifiableTableWidget->item(6,3)->text().toUtf8().data();
That is a lot of cascading calls. We have no idea what any of them do and whether they are yours or not. It looks dangerous.
A safer example, the C++ way:
const char* bufferSlidePressure = "123";
std::string buffer(bufferSlidePressure);
std::stringstream stream;
stream << buffer;
int n = 0;
// convert to int
if (!(stream >> n)){
    // could not convert
}
Also, if Boost is available:
int n = boost::lexical_cast<int>(buffer);
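If the text is a decimal number that has to fit in one uint8_t, a range check is worth adding, since atoi returns 0 on failure and cannot report out-of-range input. The following is a sketch using std::strtol; the function name parseUint8 and the convention of returning -1 on failure are assumptions made for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Parse a decimal string into the range 0..255.
// Returns the value on success, or -1 if the text is empty, has trailing
// characters, or does not fit in a uint8_t.
int parseUint8(const char* text) {
    assert(text != nullptr);
    if (*text == '\0') return -1;
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0 || value > 255) return -1;
    return static_cast<int>(value);
}
```

A caller would check the result before narrowing, for example `int v = parseUint8(text); if (v >= 0) slidePressure[0] = static_cast<uint8_t>(v);`.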

how to parse unsigned char array to numerical data

The setup of my question is as follows:
I have a source sending a UDP Packet to my receiving computer
Receiving computer takes the UDP packet and receives it into unsigned char *message.
I can print the packet out byte-wise using
for(int i = 0; i < sizeof(message); i++) {
    printf("0x%02x \n", message[i]);
}
And this is where I am! Now I'd like to start parsing these bytes I received from the network as shorts, ints, longs, and strings.
I've written a series of functions like:
short unsignedShortToInt(char c[]) {
    short i = 0;
    i |= c[1] & 0xff;
    i <<= 8;
    i |= c[0] & 0xff;
    return i;
}
to parse the bytes and shift them into ints, longs, and shorts. I can use sprintf() to create strings from byte arrays.
My question is: what's the best way to get the substrings from my massive UDP packet? The packet is over 100 characters in length, so I'd like an easy way to pass message[0:6] or message[20:22] to these various utility functions.
Possible options:
I can use strcpy() to create a temporary array for each function call, but that seems a bit messy.
I can turn the entire packet into a string and use std::string::substr. This seems nice, but I'm concerned that converting the unsigned chars into signed chars (part of the string conversion process) might cause some errors (maybe this concern is unwarranted?).
Maybe another way?
So I ask you, stackoverflow, to recommend a clean, concise way to do this task!
thanks!
Why not use proper serialization, e.g. MsgPack?
You'll need a scheme for differentiating messages. You could, for example, make them self-describing, something like:
struct my_message {
string protocol;
string data;
};
and dispatch decoding based on the protocol.
You'll most probably be better off using a tested serialization library than finding out that your system is vulnerable to buffer overflow attacks and malfunction.
I think you have two problems to solve here. First, you need to make sure the integer data are properly aligned in memory after extracting them from the character buffer. Next, you need to ensure the correct byte order of the integer data after their extraction.
The alignment problem can be solved with a union containing the integral data type super-imposed upon a character array of the correct size. The network byte order problem can be solved using the standard ntohs() and ntohl() functions. This will only work if the sending software also used the standard byte-order produced by the inverse of these functions.
See: http://www.beej.us/guide/bgnet/output/html/multipage/htonsman.html
Here are a couple of UNTESTED functions you may find useful. I think they should just about do what you are after.
#include <netinet/in.h> // ntohs, ntohl
#include <cstddef>      // size_t
#include <cstdint>      // uint16_t, uint32_t
#include <cstring>      // std::strlen
#include <string>
/**
 * General routine to extract aligned integral types
 * from the UDP packet.
 *
 * @param data Pointer into the UDP packet data
 * @param type Integral type to extract
 *
 * @return data pointer advanced to next position after extracted integral.
 */
template<typename Type>
unsigned char const* extract(unsigned char const* data, Type& type)
{
    // This union will ensure the integral data type is correctly aligned
    union tx_t
    {
        unsigned char cdata[sizeof(Type)];
        Type tdata;
    } tx;
    for(size_t i(0); i < sizeof(Type); ++i)
        tx.cdata[i] = data[i];
    type = tx.tdata;
    return data + sizeof(Type);
}
/**
 * If strings are null terminated in the buffer then this could be used to extract them.
 *
 * @param data Pointer into the UDP packet data
 * @param s std::string type to extract
 *
 * @return data pointer advanced to next position after extracted std::string.
 */
unsigned char const* extract(unsigned char const* data, std::string& s)
{
    s.assign((char const*)data, std::strlen((char const*)data));
    return data + s.size() + 1; // + 1 to step over the null terminator
}
/**
 * Function to parse entire UDP packet
 *
 * @param data The entire UDP packet data
 */
void read_data(unsigned char const* const data)
{
    uint16_t i1;
    std::string s1;
    uint32_t i2;
    std::string s2;
    unsigned char const* p = data;
    p = extract(p, i1); // p contains next position to read
    i1 = ntohs(i1);
    p = extract(p, s1);
    p = extract(p, i2);
    i2 = ntohl(i2);
    p = extract(p, s2);
}
Hope that helps.
EDIT:
I have edited the example to include strings. It very much depends on how the strings are stored in the stream. This example assumes the strings are null-terminated c-strings.
EDIT2:
Whoops, changed the code to accept unsigned chars as per the question.
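A portable variation on the same idea, sketched here, is to assemble each integer from individual bytes with shifts, so that neither alignment nor the host byte order matters, and to slice out substrings with the std::string(pointer, count) constructor. The function names are hypothetical, and the sketch assumes the sender writes integers in network (big-endian) order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Read a 16-bit big-endian value starting at p.
std::uint16_t readU16BE(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Read a 32-bit big-endian value starting at p.
std::uint32_t readU32BE(const unsigned char* p) {
    assert(p != nullptr);
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Copy `count` raw bytes starting at `offset` into a std::string.
// The caller is responsible for staying inside the packet.
std::string readBytes(const unsigned char* p, std::size_t offset, std::size_t count) {
    assert(p != nullptr);
    return std::string(reinterpret_cast<const char*>(p + offset), count);
}
```

Because the functions take a pointer, something like message[20:22] becomes `readU16BE(message + 20)`, with no temporary array and no conversion of the whole packet to a string.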
If the packet is only about 100 characters long, just create a char buffer[100], plus a queue of them so you don't miss processing any of the messages.
Next, you can index that buffer as you described; if you know the structure of the message, then you know the index points.
Then you can use a union of the types, i.e.
union myType{
    char buf[4];
    int x;
};
giving you the value as an int from the chars, if that's what you need.

Casting an unsigned int + a string to an unsigned char vector

I'm working with the NetLink socket library ( https://sourceforge.net/apps/wordpress/netlinksockets/ ), and I want to send some binary data over the network in a format that I specify.
The format I have planned is pretty simple and is as follows:
Bytes 0 and 1: an opcode of the type uint16_t (i.e., an unsigned integer always 2 bytes long)
Bytes 2 onward: any other data necessary, such as a string, an integer, a combination of each, etc.. the other party will interpret this data according to the opcode. For example, if the opcode is 0 which represents "log in", this data will consist of one byte integer telling you how long the username is, followed by a string containing the username, followed by a string containing the password. For opcode 1, "send a chat message", the entire data here could be just a string for the chat message.
Here's what the library gives me to work with for sending data, though:
void send(const string& data);
void send(const char* data);
void rawSend(const vector<unsigned char>* data);
I'm assuming I want to use rawSend() for this.. but rawSend() takes unsigned chars, not a void* pointer to memory? Isn't there going to be some loss of data here if I try to cast certain types of data to an array of unsigned chars? Please correct me if I'm wrong.. but if I'm right, does this mean I should be looking at another library that has support for real binary data transfer?
Assuming this library does serve my purposes, how exactly would I cast and concatenate my various data types into one std::vector? What I've tried is something like this:
#define OPCODE_LOGINREQUEST 0
std::vector<unsigned char>* loginRequestData = new std::vector<unsigned char>();
uint16_t opcode = OPCODE_LOGINREQUEST;
loginRequestData->push_back(opcode);
// and at this point (not shown), I would push_back() the individual characters of the strings of the username and password.. after one byte worth of integer telling you how many characters long the username is (so you know when the username stops and the password begins)
socket->rawSend(loginRequestData);
Ran into some exceptions, though, on the other end when I tried to interpret the data. Am I approaching the casting all wrong? Am I going to lose data by casting to unsigned chars?
Thanks in advance.
I like how they make you create a vector (which must use the heap and thus execute in unpredictable time) instead of just falling back to the C standard (const void* buffer, size_t len) tuple, which is compatible with everything and can't be beat for performance. Oh, well.
You could try this:
void send_message(uint16_t opcode, const void* rawData, size_t rawDataSize)
{
    vector<unsigned char> buffer;
    buffer.reserve(sizeof(uint16_t) + rawDataSize);
#if BIG_ENDIAN_OPCODE
    buffer.push_back(opcode >> 8);
    buffer.push_back(opcode & 0xFF);
#elif LITTLE_ENDIAN_OPCODE
    buffer.push_back(opcode & 0xFF);
    buffer.push_back(opcode >> 8);
#else
    // Native order opcode
    buffer.insert(buffer.end(), reinterpret_cast<const unsigned char*>(&opcode),
                  reinterpret_cast<const unsigned char*>(&opcode) + sizeof(uint16_t));
#endif
    const unsigned char* base(reinterpret_cast<const unsigned char*>(rawData));
    buffer.insert(buffer.end(), base, base + rawDataSize);
    socket->rawSend(&buffer); // Why isn't this API using a reference?!
}
This uses insert which should optimize better than a hand-written loop with push_back(). It also won't leak the buffer if rawSend tosses an exception.
NOTE: Byte order must match for the platforms on both ends of this connection. If it does not, you'll need to either pick one byte order and stick with it (Internet standards usually do this, and you use the htonl and htons functions) or you need to detect byte order ("native" or "backwards" from the receiver's POV) and fix it if "backwards".
I would use something like this:
#include <algorithm>   // std::min
#include <cstdint>
#include <string>
#include <vector>
#include <arpa/inet.h> // htons
#define OPCODE_LOGINREQUEST 0
#define OPCODE_MESSAGE 1
void addRaw(std::vector<unsigned char> &v, const void *data, const size_t len)
{
    const unsigned char *ptr = static_cast<const unsigned char*>(data);
    v.insert(v.end(), ptr, ptr + len);
}
void addUint8(std::vector<unsigned char> &v, uint8_t val)
{
    v.push_back(val);
}
void addUint16(std::vector<unsigned char> &v, uint16_t val)
{
    val = htons(val);
    addRaw(v, &val, sizeof(uint16_t));
}
void addStringLen(std::vector<unsigned char> &v, const std::string &val)
{
    // std::min needs both arguments to have the same type
    uint8_t len = static_cast<uint8_t>(std::min<size_t>(val.length(), 255));
    addUint8(v, len);
    addRaw(v, val.c_str(), len);
}
void addStringRaw(std::vector<unsigned char> &v, const std::string &val)
{
    addRaw(v, val.c_str(), val.length());
}
void sendLogin(const std::string &user, const std::string &pass)
{
    // reserve() only preallocates; constructing the vector with a size
    // would put that many zero bytes in front of the appended data
    std::vector<unsigned char> data;
    data.reserve(
        sizeof(uint16_t) +
        sizeof(uint8_t) + std::min<size_t>(user.length(), 255) +
        sizeof(uint8_t) + std::min<size_t>(pass.length(), 255)
    );
    addUint16(data, OPCODE_LOGINREQUEST);
    addStringLen(data, user);
    addStringLen(data, pass);
    socket->rawSend(&data);
}
void sendMsg(const std::string &msg)
{
    std::vector<unsigned char> data;
    data.reserve(sizeof(uint16_t) + msg.length());
    addUint16(data, OPCODE_MESSAGE);
    addStringRaw(data, msg);
    socket->rawSend(&data);
}
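To see what such a login packet looks like on the wire, here is a self-contained sketch that builds the byte vector without touching a socket. It follows the format described in the question (2-byte opcode, then length-prefixed strings), but the helper names are hypothetical, the opcode is written big-endian by hand instead of via htons, and strings longer than 255 bytes are treated as a precondition violation rather than truncated.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Append a 16-bit value in big-endian (network) byte order.
void appendU16BE(std::vector<unsigned char>& out, std::uint16_t v) {
    out.push_back(static_cast<unsigned char>(v >> 8));
    out.push_back(static_cast<unsigned char>(v & 0xFF));
}

// Append one length byte followed by the string's bytes.
void appendLenPrefixed(std::vector<unsigned char>& out, const std::string& text) {
    assert(text.size() <= 255); // assumption: longer strings are rejected upstream
    out.push_back(static_cast<unsigned char>(text.size()));
    out.insert(out.end(), text.begin(), text.end());
}

// Build the "log in" packet: opcode 0, then user and password.
std::vector<unsigned char> buildLogin(const std::string& user, const std::string& pass) {
    std::vector<unsigned char> out;
    out.reserve(2 + 1 + user.size() + 1 + pass.size());
    appendU16BE(out, 0);
    appendLenPrefixed(out, user);
    appendLenPrefixed(out, pass);
    return out;
}
```

For user "bob" and password "pw" this yields nine bytes: the two opcode bytes, 3, 'b', 'o', 'b', 2, 'p', 'w'. The result could then be handed to the library as `socket->rawSend(&packet)`.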
std::vector<unsigned char>* loginRequestData = new std::vector<unsigned char>();
uint16_t opcode = OPCODE_LOGINREQUEST;
loginRequestData->push_back(opcode);
If unsigned char is 8 bits long (which it is on most systems), you will be losing the higher 8 bits of opcode every time you push. You should be getting a warning for this.
The decision for rawSend to take a vector is quite odd; a general library would work at a different level of abstraction. I can only guess that it is this way because rawSend makes a copy of the passed data and guarantees its lifetime until the operation has completed. If not, then it is just a poor design choice; add to that the fact that it takes the argument by pointer... You should see this data as a container of raw memory. There are some quirks to get right, but here is how you would be expected to work with POD types in this scenario:
data->insert( data->end(), reinterpret_cast< char const* >( &opcode ), reinterpret_cast< char const* >( &opcode ) + sizeof( opcode ) );
This will work:
#define OPCODE_LOGINREQUEST 0
std::vector<unsigned char>* loginRequestData = new std::vector<unsigned char>();
uint16_t opcode = OPCODE_LOGINREQUEST;
unsigned char *opcode_data = (unsigned char *)&opcode;
for(int i = 0; i < sizeof(opcode); i++)
loginRequestData->push_back(opcode_data[i]);
socket->rawSend(loginRequestData);
This will also work for any POD type.
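The byte-by-byte loop can be wrapped in a small template so the same code serves any trivially copyable type, with a matching reader for the receiving side. This is a sketch only: appendPod and readPod are hypothetical names, and the bytes are written in the host's native order, so both ends must agree on byte order and type sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Append the raw bytes of `value` to the vector.
template <typename T>
void appendPod(std::vector<unsigned char>& out, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

// Read a T back from the vector, starting at `offset`.
template <typename T>
T readPod(const std::vector<unsigned char>& in, std::size_t offset) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    assert(offset + sizeof(T) <= in.size());
    T value;
    // memcpy sidesteps alignment problems at arbitrary offsets.
    std::memcpy(&value, in.data() + offset, sizeof(T));
    return value;
}
```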
Yeah, go with rawSend, since send probably expects a null terminator.
You don't lose anything by casting to char instead of void*. Memory is memory. Types are never stored in memory in C++ except for RTTI info. You can recover your data by casting to the type indicated by your opcode.
If you can decide the format of all your sends at compile time, I recommend using structs to represent them. I've done this before professionally, and this is simply the best way to clearly store the formats for a wide variety of messages. And it's super easy to unpack on the other side; just cast the raw buffer into the struct based on the opcode!
struct MessageType1 {
uint16_t opcode;
int myData1;
int myData2;
};
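MessageType1 msg;
std::vector<char> vec;
// both iterators must be char pointers; &msg alone has type MessageType1*
const char* begin = reinterpret_cast<const char*>(&msg);
const char* end = begin + sizeof(msg);
vec.insert( vec.end(), begin, end );
send(vec);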
The struct approach is the best, neatest way to send and receive, but the layout is fixed at compile time.
If the format of the messages is not decided until runtime, use a char array:
char buffer[2048];
*((uint16_t*)buffer) = opcode;
// now memcpy into it
// or placement-new to construct objects in the buffer memory
int usedBufferSpace = 24; //or whatever
std::vector<char> vec;
const char* end = buffer + usedBufferSpace;
vec.insert( vec.end(), buffer, end );
send(&vec); // send the vector that was just filled, not the raw array

Convert a string to an unsigned char []

I currently have a Packet set up like so:
struct Packet {
unsigned short sequenceNumber;
unsigned short length;
unsigned char control;
unsigned char ack;
unsigned short crc;
unsigned char data[];
Packet copy(const Packet& aPacket) {
sequenceNumber = aPacket.sequenceNumber;
length = aPacket.length;
control= aPacket.control;
ack = aPacket.ack;
crc = aPacket.crc;
memcpy (data, aPacket.data, aPacket.length);
}
};
This packet gets converted into a string for encryption and then needs to be taken from its decrypted string form back to a Packet. I am able to do this fine for all of the variables except for the unsigned char data[]. I have tried the following with no success:
string data = thePack.substr(pos, thePack.length()-pos);
unsigned char * cData = new unsigned char[data.length()];
strcpy((char *)cData, data.c_str());
memcpy(p.data, cData, data.length());
where data is the string representation of the data to be copied into the unsigned char [] and p is the Packet.
This gives the following from valgrind:
==16851== Invalid write of size 1
==16851== at 0x4A082E7: strcpy (mc_replace_strmem.c:303)
Even though it cites strcpy as the source, it compiles and runs fine with just the memcpy line commented out.
I have also tried replacing memcpy with strcpy, with the same result. Any ideas? I feel that it might be because data has not been initialized and therefore has no memory allocated to it, but I thought memcpy would take care of this.
You haven't specified the size of the data array.
unsigned char data[];
This is legal in C99 and accepted by many C++ compilers as an extension, but rather difficult to use. The data array will follow the rest of the Packet structure in memory, but the compiler doesn't know how much space to allocate for it. So you have to allocate the extra space yourself:
size_t datalen = thePack.length()-pos;
void* pbuffer = malloc( sizeof (Packet) + datalen + 1 );
Packet* p = new (pbuffer) Packet;
memcpy(p->data, &thePack[pos], datalen);
p->data[datalen] = 0;
What won't work is letting the compiler decide how big a Packet should be, either using new Packet or a local variable Packet p;. That will end up with no space reserved for data. And no, memcpy doesn't allocate memory.
A much cleaner solution would be to use a std::vector for your variable-sized data array.
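A sketch of what the std::vector variant might look like is below. The struct mirrors the fields from the question, but the fixed-width types, the dropped length field (the vector tracks its own size), and the helper name setPayload are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Packet {
    std::uint16_t sequenceNumber = 0;
    std::uint8_t control = 0;
    std::uint8_t ack = 0;
    std::uint16_t crc = 0;
    std::vector<unsigned char> data; // owns its storage; data.size() replaces length
};

// Copy everything from position `pos` of the decrypted string into the payload.
// Embedded zero bytes are preserved because no C-string function is involved.
void setPayload(Packet& p, const std::string& text, std::size_t pos) {
    assert(pos <= text.size());
    p.data.assign(text.begin() + static_cast<std::ptrdiff_t>(pos), text.end());
}
```

With this layout the default copy constructor and assignment operator already do the right thing, so the hand-written copy() member is no longer needed.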
The char[] you're allocating is one character too small -- you must leave room for the NULL byte at the end:
unsigned char * cData = new unsigned char[data.length() + 1];
Use the strcpy version to copy the string, so the NULL byte gets copied correctly. Although it might run OK without that +1, there's no guarantee, and sometimes it might crash.