I will copy the data I received with recv() and the maximum size is MAX_PACKET for in/out buffers.
Is it safe to copy it with the fixed size MAX_PACKET? Is it necessary for me to set the right size of the buffer when I use memcpy ?
recv(sock,recvbuf,MAX_PACKET,0);
memcpy(inbuffer,recvbuf,MAX_PACKET);
you need to declare inbuffer at least as many bytes as the size of MAX_PACKET
char * inbuffer = new char[MAX_PACKET];
and place before each recv()
memset(inbuffer,0,MAX_PACKET);
to zero out the buffer so you don't mistakenly see the tail end of a previous packet in the scenario you received two packets where the 2nd is shorter than the 1st.
If your incoming packet has no unique termination byte (e.g. '\r'), you also need to capture the return value:
int recvbytes;
recvbytes = recv(...);
since recv returns the number of bytes received on the wire.
Safe yes but wasteful. Also don't forget to not use the data returned in inbuffer beyond the actual received data from recvbuf.
Not unless inBuffer is at least as large as MAX_PACKET.
In addition to what Oezbek said, recv won't 0 terminate either unless it actually receives a 0.
It is necessary for you to set the right size of the buffer when you use memcpy, because recv will put into recvbuf only as much data as is currently available, up to your specified size of MAX_PACKET. That holds even if the other end is doing send(sock,sendbuf,MAX_PACKET,0); (and MAX_PACKET has the same value in both places).
The only scenario in which your code would make sense would be if you're using UDP (SOCK_DGRAM) and all your datagrams are of size MAX_PACKET.
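To make the point about the return value concrete, here is a minimal sketch that copies only the bytes recv() actually delivered. It assumes POSIX sockets; the helper name recv_and_copy and the value of MAX_PACKET are made up for illustration and are not from the question.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <cstddef>
#include <cstring>

const std::size_t MAX_PACKET = 1024; // assumed value, for illustration only

// Hypothetical helper: receives at most MAX_PACKET bytes and copies only the
// bytes that actually arrived into inbuffer.
// Returns the recv() result unchanged: >0 is the number of bytes copied,
// 0 means the peer closed the connection, -1 means an error (see errno).
ssize_t recv_and_copy(int sock, char *inbuffer, std::size_t inbuffer_size)
{
    char recvbuf[MAX_PACKET];
    // Never ask for more than either buffer can hold.
    std::size_t want = inbuffer_size < MAX_PACKET ? inbuffer_size : MAX_PACKET;
    ssize_t n = recv(sock, recvbuf, want, 0);
    if (n > 0) {
        // Copy n bytes, not MAX_PACKET: the rest of recvbuf is uninitialized.
        std::memcpy(inbuffer, recvbuf, static_cast<std::size_t>(n));
    }
    return n;
}
```

The caller should then treat only the first n bytes of inbuffer as valid data.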
an alternative approach
std::string inbuffer; // member variable to the class
.......
int ret = recv(sock,recvbuff,MAX_PACKET,0);
if(ret > 0) // 0 means the connection was closed, -1 means an error
{
inbuffer.insert(inbuffer.end(),recvbuff, recvbuff + ret);
...... // use the data.
}
This avoids calling memcpy directly; the standard library should do the smart thing and call memcpy/memmove for you :)
in the man pages of GNU/Linux the read function is described with following synopsis:
ssize_t read(int fd, void *buf, size_t count);
I would like to use this function to read data from a socket or a serial port. If the count is greater than one, the pointer supplied in the function argument will point to the last byte that was read from the port in the memory so pointer decrement is necessary for bringing the pointer to the first byte of data. This is dangerous because using it in a language like C++ with its dynamic memory allocation of containers based on their size and space needs could corrupt data at the point of return from read() function. I thought of using a C-style array instead of a pointer. Is this the correct approach? If not, what is the correct way to do this? The programming language I'm using is C++.
EDIT:
The code that caused the described situation is as follows:
QSerialPort class was used to configure and open the port with following parameters:
Baudrate of 115200
8 data bits
No parity
One stop bit
No flow control
As far as this question is concerned, the read is performed exactly like this:
A std::vector containing a number of structs defined this way:
struct DataMember
{
QString name;
size_t count;
char *buff;
};
then within a while loop until the end of the mentioned std::vector is reached, a read() is performed based on count member variable of the said struct and the data is stored in the same struct's buff:
ssize_t nbytes = read(port->handle(), v.at(i).buff, v.at(i).count);
and then the data is printed on the console. In my test case, as long as the data is one byte the value printed is correct, but for more than one byte the value displayed is the last value that was read from the port plus some garbage values. I don't know why this is happening. Note that the correct result is obtained when the char *buff is changed to char buff[count].
If the count is greater than one, the pointer supplied in the function argument will point to the last byte that was read from the port in the memory
No. The pointer is passed to the read() method by value, so it is therefore completely and utterly impossible for the value to be any different after the call than it was before, regardless of the count.
so pointer decrement is necessary for bringing the pointer to the first byte of data.
The pointer already points to the first byte of data. No decrement is necessary.
This is dangerous because using it in a language like C++ with its dynamic memory allocation of containers based on their size and space needs could corrupt data at the point of return from read() function.
This is all nonsense based on an impossibility.
You are mistaken about all this.
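A small sketch that illustrates the point: because read() receives the pointer by value, the caller's pointer still refers to the first byte of the buffer afterwards. It assumes a POSIX system and uses a pipe in place of the serial port; the function name is made up for this example.

```cpp
#include <sys/types.h> // ssize_t
#include <unistd.h>    // pipe, read, write, close

// Writes three bytes into a pipe, reads them back through p, and reports
// whether p still points to the start of the buffer after read() returns.
bool pointer_unchanged_after_read()
{
    int fds[2];
    if (pipe(fds) != 0) {
        return false;
    }
    const char msg[3] = {'a', 'b', 'c'};
    ssize_t written = write(fds[1], msg, sizeof msg);

    char storage[8] = {};
    char *p = storage; // p points to the first byte of storage
    ssize_t n = read(fds[0], p, sizeof storage);

    close(fds[0]);
    close(fds[1]);

    // p was passed by value, so it is unchanged; the data starts at p[0].
    return written == 3 && n == 3
        && p == storage
        && p[0] == 'a' && p[1] == 'b' && p[2] == 'c';
}
```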
In my test case as long as the data is one byte the value printed is correct but for more than one byte the value displayed is the last value that was read from the port plus some garbage values.
From the read(2) manpage:
On success, the number of bytes read is returned (zero indicates end of file),
and the file position is advanced by this number. It is not an error if this number is
smaller than the number of bytes requested; this may happen for example because fewer
bytes are actually available right now (maybe because we were close to end-of-file, or
because we are reading from a pipe, or from a terminal), or because read() was interrupted
by a signal. On error, -1 is returned, and errno is set appropriately. In this case it
is left unspecified whether the file position (if any) changes.
In the case of pipes, sockets and character devices (that includes serial ports) and a blocking file descriptor (default) read will, in practice, not wait for the full count. In your case read() blocks until a byte comes in on the serial port and returns. That is why in the output the first byte is correct and the rest is garbage (uninitialized memory). You have to add a loop around the read() that repeats until count bytes have been read if you need the full count.
I don't know why this is happening.
But I know. char * is just a pointer, but that pointer needs to be initialized to something before you can use it. Without doing so you're invoking undefined behavior and everything might happen.
Instead of the size_t count; and char *buff elements you should just use a std::vector<char>. Before making the read call, resize it to the number of bytes you want to read, then take the address of the first element of that vector and pass that to read:
struct fnord {
std::string name;
std::vector<char> data;
};
and use it like this; note that using read requires some additional work to properly deal with signal and error conditions.
#include <cerrno>    // errno, EINTR, EAGAIN, EWOULDBLOCK
#include <unistd.h>  // read
size_t readsomething(int fd, size_t count, fnord &f)
{
    // make room for count bytes; resize (not reserve) so the elements exist
    f.data.resize(count);
    size_t rbytes = 0;
    do {
        ssize_t rv = read(fd, f.data.data() + rbytes, count - rbytes);
        if( !rv ) {
            // End of File / Stream
            break;
        }
        if( 0 > rv ) {
            if( EINTR == errno ) {
                // signal interrupted read... restart
                continue;
            }
            if( EAGAIN == errno
             || EWOULDBLOCK == errno ) {
                // file / socket is in nonblocking mode and
                // no more data is available.
                break;
            }
            // some critical error happened. Deal with it!
            break;
        }
        rbytes += rv;
    } while(rbytes < count);
    // drop the part of the buffer that was never filled
    f.data.resize(rbytes);
    return rbytes;
}
Looking at your first paragraph of gibberish:
If the count is greater than one, the pointer supplied in the function argument will point to the last byte that was read from the port in the memory
What makes you think so? This is not how it works. Most likely you passed some invalid pointer that wasn't properly initialized. Anything can happen.
so pointer decrement is necessary for bringing the pointer to the first byte of data.
Nope. That's not how it works.
This is dangerous because using it in a language like C++ with its dynamic memory allocation of containers based on their size and space needs could corrupt data at the point of return from read() function.
Nope. That's not how it works!
C and C++ are explicit languages. Everything happens in plain sight and nothing happens without you (the programmer) explicitly requesting it. No memory is allocated without you requesting this to happen. It can be an explicit new, some RAII, automatic storage or the use of a container. But nothing happens "out of the blue" in C and C++. There's no built-in garbage collection [1] in C or C++. Objects don't move around in memory or resize without you explicitly coding something into your program that makes this happen.
[1]: There are GC libraries you can use, but those never will stomp onto anything that can be reached by code that's executing. Essentially garbage collector libraries for C and C++ are memory leak detectors, which will free memory that can no longer be reached by normal program flow.
I am trying to write a wrapper around the Windows file functions; one of them should read num_bytes bytes of data from the file and return it. For some reason I fail to allocate the memory properly, but I just can't find the reason why:
PBYTE Read(int num_bytes, HANDLE hFile){
PBYTE bBuffer;
DWORD new_size = sizeof(BYTE)*num_bytes;
//after the allocation the debugger already displays a 16 char wide placeholder
bBuffer = (PBYTE)malloc(new_size);
OVERLAPPED o = { 0 };
o.Offset = 0;
BOOL bReadDone = ReadFile(hFile, (LPVOID)bBuffer, sizeof(BYTE)*num_bytes, NULL, &o);
return bBuffer;
}
Data gets copied, but the allocated buffer is always too wide and contains extra weird filler characters. Can somebody please explain what I am doing wrong?
"what am I doing wrong?"
sizeof(BYTE) is 1 so you can remove it everywhere and eliminate the redundant new_size variable.
You tagged your question C++ but used malloc to allocate the buffer. Your design makes the caller responsible for freeing the buffer, which is a poor design approach, and even more so when using malloc/free in a C++ program. A good C++ solution to this quandary would be to return a std::vector.
It is vital that you provide the lpNumberOfBytesRead parameter to ReadFile. Without it you don't know how many bytes were read. And if you don't know how many bytes were read you can't tell the difference between "extra weird filler characters" and unused memory at the end of the buffer. If the data is characters then character-oriented output routines (and debugger tools) don't know the difference either, since there is no null terminator at the end of the data that was actually read. You could use the value returned in lpNumberOfBytesRead to put in a null terminator so you and the debugger don't read beyond the real data.
I am working on a packet translation patch.
My DLL is injected into a Chinese game, hooks recv, listens for packets and translates the strings received in Chinese.
I was coding and coding and coding... until I ran into this: how am I supposed to write more data into buf than the length of the received packet?
int __stdcall Hooked_recv(SOCKET s, char *buf, int len, int flags)
{
h_recv.PreHook();
int ret_val = recv(s, buf, len, flags);
//ret_val is the number of bytes received. Ok, I can increase it, but...
//what to do with buf? Sure I can write there as much as no access violation appears.
//but I need a safe way.
//I guess if I do buf = new char[NEW_SIZE] then caller will fail to read buf because of pointer changed?
//what could I do to make received packet longer?
//I no want to reverse exe and increase buffer in hex editor. at least for now.
h_recv.PostHook();
return ret_val;
}
Just fill the buffer as much as you can. If you have any leftover, save it for the next call to your hooked receive function (put that first, if that fills it, repeat saving the new leftover). You will need to use a buffer, that's unavoidable.
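The leftover handling described above could be sketched roughly like this. The class name PendingOutput and its methods are hypothetical, and the translation step itself is left out: the hook would push the translated bytes, then drain as much as fits into the caller's buf and return that count.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: holds translated bytes that did not fit into the
// caller's buffer on a previous call to the hooked recv.
class PendingOutput {
public:
    // Queue newly translated bytes behind whatever is still pending.
    void push(const std::string &bytes) { pending_.append(bytes); }

    // Copy as much pending data as fits into buf (capacity len) and keep
    // the remainder for the next call. Returns the number of bytes written.
    int drain(char *buf, int len)
    {
        if (len <= 0) {
            return 0;
        }
        std::size_t n = std::min(pending_.size(), static_cast<std::size_t>(len));
        std::memcpy(buf, pending_.data(), n);
        pending_.erase(0, n);
        return static_cast<int>(n);
    }

    bool empty() const { return pending_.empty(); }

private:
    std::string pending_;
};
```

In the hook, one instance per socket would be needed, and a call that finds pending data would drain that first before receiving anything new.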
I have the two following functions for sending and receiving packets.
void send(std::string protocol)
{
char *request=new char[protocol.size()+1];
request[protocol.size()] = 0;
memcpy(request,protocol.c_str(),protocol.size());
request_length = std::strlen(request);
boost::asio::write(s, boost::asio::buffer(request, request_length));
}
void receive()
{
char reply[max_length];
size_t reply_length = boost::asio::read(s, boost::asio::buffer(reply, request_length));
std::cout << "Reply is: ";
std::cout.write(reply, reply_length);
std::cout << "\n";
}
The questions pertain to this part boost::asio::buffer(reply, request_length) where the request length is the length of a string which was initially setup when the packet was sent. How do I check the size of the buffer without knowing request_length? Another question is how do I prevent buffer overflow?
To get the size of a buffer, the boost::asio::buffer_size() function can be used. However, in your example, this will most likely be of little use to you.
As explained in the buffer overview, Boost.Asio uses buffer classes to represent buffers. These classes provide an abstraction and protect Boost.Asio operations against buffer overruns. Although the result of boost::asio::buffer() is passed to operations, the meta-data, such as the size of the buffer or its underlying type, is not transmitted. Also, these buffers do not own the memory, so it is the application's responsibility to ensure the underlying memory remains valid throughout the duration of the buffer abstraction's lifetime.
The boost::asio::buffer() function provides a convenient way to create the buffer classes, where the size of the buffer is deduced from the type when possible. When Boost.Asio is able to deduce the buffer length, Boost.Asio operations will not cause a buffer overflow when using the resulting buffer type. However, if the application code specifies the size of the buffer to boost::asio::buffer(), then it is the application's responsibility to ensure that the size is not larger than the underlying memory.
When reading data, a buffer is required. The fundamental question becomes how does one know how much memory to allocate, if Boost.Asio does not transmit the size. There are a few solutions to this problem:
Query the socket for how much data is available via socket::available(), then allocate the buffer accordingly.
std::vector<char> data(socket_.available());
boost::asio::read(socket_, boost::asio::buffer(data));
Use a class that Boost.Asio can grow in memory, such as boost::asio::streambuf. Some operations, such as boost::asio::read() accept streambuf objects as their buffer and will allocate memory as is required for the operation. However, a completion condition should be provided; otherwise, the operation will continue until the buffer is full.
boost::asio::streambuf data;
boost::asio::read(socket_, data,
boost::asio::transfer_at_least(socket_.available()));
As Öö Tiib suggests, incorporate length as part of the communication protocol. Check the Boost.Asio examples for examples of communication protocols. Focus on the protocol, not necessarily on the Boost.Asio API.
In a fixed size protocol, both the data producer and consumer use the same size message. As the reader knows the size of the message, the reader can allocate a buffer in advance.
In a variable length protocol, the messages are often divided into two parts: a header and a body. The header is normally fixed size, and can contain various meta-information, such as the length of the body. This allows a reader to read a header into a fixed size buffer, extract the body length, allocate a buffer for the body, then read the body.
// Read fixed header.
std::vector<char> data(fixed_header_size);
boost::asio::read(socket_, boost::asio::buffer(data));
protocol::header header(data);
network_to_local(header); // Handle endianness.
// Read body.
data.resize(header.body_length());
boost::asio::read(socket_, boost::asio::buffer(data));
protocol::body body(data);
network_to_local(body); // Handle endianness.
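Independent of Boost.Asio, the header-plus-body idea can be sketched with plain byte vectors. This is only an illustration under an assumed wire format (a 4-byte big-endian body length followed by the body); the names kHeaderSize, encode_frame and try_decode_frame are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

const std::size_t kHeaderSize = 4; // assumed: 4-byte big-endian body length

// Builds one frame: fixed-size header carrying the body length, then the body.
std::vector<unsigned char> encode_frame(const std::string &body)
{
    const std::uint32_t len = static_cast<std::uint32_t>(body.size());
    std::vector<unsigned char> frame;
    frame.reserve(kHeaderSize + body.size());
    frame.push_back(static_cast<unsigned char>((len >> 24) & 0xFF));
    frame.push_back(static_cast<unsigned char>((len >> 16) & 0xFF));
    frame.push_back(static_cast<unsigned char>((len >> 8) & 0xFF));
    frame.push_back(static_cast<unsigned char>(len & 0xFF));
    frame.insert(frame.end(), body.begin(), body.end());
    return frame;
}

// Tries to extract one frame from the front of buf. On success the body is
// stored, the frame is removed from buf and true is returned. Returns false
// (leaving buf untouched) if more bytes are needed.
bool try_decode_frame(std::vector<unsigned char> &buf, std::string &body)
{
    if (buf.size() < kHeaderSize) {
        return false; // header incomplete
    }
    const std::uint32_t len = (static_cast<std::uint32_t>(buf[0]) << 24)
                            | (static_cast<std::uint32_t>(buf[1]) << 16)
                            | (static_cast<std::uint32_t>(buf[2]) << 8)
                            |  static_cast<std::uint32_t>(buf[3]);
    if (buf.size() - kHeaderSize < len) {
        return false; // body incomplete
    }
    auto body_begin = buf.begin() + static_cast<std::ptrdiff_t>(kHeaderSize);
    auto body_end = body_begin + static_cast<std::ptrdiff_t>(len);
    body.assign(body_begin, body_end);
    buf.erase(buf.begin(), body_end);
    return true;
}
```

The receiver appends whatever each read returns to buf and calls try_decode_frame until it reports false; a real protocol would also want an upper bound on the accepted body length.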
Typically a communication protocol uses either fixed-length messages or messages that contain a header that tells the length of the message.
The Boost.Asio online documentation contains a large set of examples and tutorials, so you should perhaps start there. Wikipedia is a good source for explanations of data transmission terminology; the Boost.Asio documentation does not cover it.
I think your question is confusing, but this might help:
void receive() {
enum { max_length = 1024 };
char reply[max_length];
size_t reply_length;
std::cout << "Reply is: ";
while ( (reply_length = boost::asio::read(s, boost::asio::buffer(reply, max_length))) > 0) {
std::cout.write(reply, reply_length);
}
std::cout << "\n";
}
My C++ project has a buffer which could be any size and is filled by Bluetooth. The format of the incoming messages is like 0x43 0x0B 0x00 0x06 0xA2 0x03 0x03 0x00 0x01 0x01 0x0A 0x0B 0x0B 0xE6 0x0D, which starts with 0x43 and ends with 0x0D. So each time the buffer is filled, its contents can be arranged differently with respect to the above message format.
static const int BufferSize = 1024;
byte buffer[BufferSize];
What is the best way to parse the incoming messages in this buffer?
Since I come from Java and .NET, what is the best way to turn each extracted message into an object? Could a class be the solution?
I have created a separate class for parsing the buffer like below, am I heading in the right direction?
#include "parsingClass.h"
class A
{
    parsingClass ps;
public:
    // a call has to live inside a member function, not in the class body
    void parse() { ps.parse(buffer, BufferSize); }
};
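class ReturnMessage{
    char *message;
public:
    // buffer holds the 0x43 start byte followed by the payload (the 0x0D
    // end byte is not stored). Returns the payload as a new null-terminated
    // array that the caller must delete[], or nullptr if nothing was read.
    char *getMessage(const unsigned char *buffer, int count){
        if(count < 1) return nullptr;
        message = new char[count];
        for(int i = 1; i < count; i++){
            message[i-1] = buffer[i];
        }
        message[count-1] = '\0';
        return message;
    }
};
class ParserToMessage{
    static const unsigned int BufferSize = 1024;
    unsigned char buffer[BufferSize];
    unsigned int counter = 0;
public:
    char *parse_buffer()
    {
        ReturnMessage rm;
        unsigned char buffByte;
        counter = 0; // start a fresh message on every call
        buffByte = blueToothGetByte(); //fictional getchar() kind of function for bluetooth
        if(buffByte == 0x43){
            buffer[counter++] = buffByte;
            //continue until you find 0x0D or the buffer is full
            while(counter < BufferSize && (buffByte = blueToothGetByte()) != 0x0D){
                buffer[counter++] = buffByte;
            }
        }
        return rm.getMessage(buffer,counter);
    }
};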
Can you have the parser as a method of a 'ProtocolUnit' class? The method could take a buffer pointer/length as a parameter and return an int that indicates how many bytes it consumed from the buffer before it correctly assembled a complete protocol unit, or -1 if it needs more bytes from the next buffer.
Once you have a complete ProtocolUnit, you can do what you wish with it, (eg. queue it off to some processing thread), and create a new one for the remaining bytes/next buffer.
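A rough sketch of such a ProtocolUnit, under the assumption that 0x43 and 0x0D never occur inside the payload (the question does not say whether they can). The member names parse, complete and bytes are chosen for this example only.

```cpp
#include <cstddef>
#include <vector>

class ProtocolUnit {
public:
    // Feeds len bytes to the unit. Returns the number of bytes consumed once
    // a complete unit (0x43 ... 0x0D) has been assembled, or -1 if all bytes
    // were consumed and more are needed from the next buffer.
    int parse(const unsigned char *data, int len)
    {
        for (int i = 0; i < len; ++i) {
            const unsigned char b = data[i];
            if (!started_) {
                if (b != kStart) {
                    continue; // skip anything before the start byte
                }
                started_ = true;
            }
            bytes_.push_back(b);
            if (b == kEnd) {
                complete_ = true;
                return i + 1; // bytes after this belong to the next unit
            }
        }
        return -1;
    }

    bool complete() const { return complete_; }
    const std::vector<unsigned char> &bytes() const { return bytes_; }

private:
    static const unsigned char kStart = 0x43;
    static const unsigned char kEnd = 0x0D;
    bool started_ = false;
    bool complete_ = false;
    std::vector<unsigned char> bytes_;
};
```

When parse returns a non-negative count, the caller hands the unit off and feeds the remaining bytes (from that offset on) to a fresh ProtocolUnit.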
My C++ project has a buffer which could be any size
The first thing I notice is that you have hard-coded the buffer size. You are in danger of buffer overflow if an attempt is made to read data bigger than the size you have specified into the buffer.
If possible keep the buffer size dynamic and create the byte array according to the size of the data to be received into the buffer. Try and inform the object where your byte array lives of the incoming buffer size, before you create the byte array.
int nBufferSize = GetBufferSize();
UCHAR* szByteArray = new UCHAR[nBufferSize];
What is the best way to parse the incoming messages in this buffer?
You are on the right lines, in that you have created and are using a parser class. I would suggest using memcpy to copy the individual data items one at a time, from the buffer to a variable of your choice. Not knowing the wider context of your intention at this point, I cannot add much to that.
Since I come from Java and .NET, what is the best way to turn
each extracted message into an object? Could a class be the solution?
Depending on the complexity of the data you are reading from the buffer and what your plans are, you could use a class or a struct. If you do not need to create an object with this data, which provides services to other objects, you could use a struct. Structs are great when your need isn't so complex, whereby a full class might be overkill.
I have created a separate class for parsing the buffer like below, am
I heading in the right direction?
I think so.
I hope that helps for starters!
The question "how should I parse this" depends largely on how you want to parse the data. Two things are missing from your question:
Exactly how do you receive the data? You mention Bluetooth but what is the programming medium? Are you reading from a socket? Do you have some other kind of API? Do you receive it byte at a time or in blocks?
What are the rules for dealing with the data you are receiving? Most data is delimited in some way or of fixed field length. In your case, you mention that it can be of any length but unless you explain how you want to parse it, I can't help.
One suggestion I would make is to change the type of your buffer to use std::vector :
std::vector<unsigned char> buffer(normalSize);
You should choose normalSize to be something around the most frequently observed size of your incoming message. A vector will grow as you push items onto it so, unlike the array you created, you won't need to worry about buffer overrun if you get a large message. However, if you do go above normalSize under the covers the vector will reallocate enough memory to cope with your extended requirements. This can be expensive so you don't want to do it too often.
You use a vector in pretty much the same way as your array. One key difference is that you can simply push elements onto the end of the vector, rather than having to keep a running pointer. So imagine you received a single int at a time from the Bluetooth source; your code might look something like this:
// Clear out the previous contents of the buffer.
buffer.clear();
int elem(0);
// Find the start of your message. Throw away elements
// that we don't need.
while ( 0x43 != ( elem = getNextBluetoothInt() ) );
// Push elements of the message into the buffer until
// we hit the end.
while ( 0x0D != elem )
{
    buffer.push_back( elem );
    elem = getNextBluetoothInt(); // fetch the next element, otherwise the loop never ends
}
buffer.push_back( elem ); // Remember to add on the last one.
The key benefit is that the vector will automatically resize itself, without you having to do it, no matter whether the number of characters pushed on is 10 or 10,000.
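Put together as a self-contained function, the approach might look like the sketch below. Two things here are assumptions and not part of the answer: the byte source is passed in as a callable that returns -1 when it has nothing more to deliver, and the vector uses reserve() instead of the sizing constructor, since constructing a vector with a size creates that many zero elements while reserve() only sets aside capacity.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Reads one message (0x43 ... 0x0D) from nextByte, a hypothetical stand-in
// for the Bluetooth source that returns -1 when no more data is available.
// Returns an empty vector if no complete message could be read.
std::vector<unsigned char> readMessage(const std::function<int()> &nextByte,
                                       std::size_t normalSize = 32)
{
    std::vector<unsigned char> buffer;
    buffer.reserve(normalSize); // capacity only; size() stays 0

    // Find the start of the message, throwing away elements we don't need.
    int elem = nextByte();
    while (elem != 0x43) {
        if (elem < 0) {
            return buffer; // source exhausted before a start byte
        }
        elem = nextByte();
    }
    // Push elements until we hit the end byte; the vector grows as needed.
    while (elem != 0x0D) {
        if (elem < 0) {
            buffer.clear(); // incomplete message
            return buffer;
        }
        buffer.push_back(static_cast<unsigned char>(elem));
        elem = nextByte();
    }
    buffer.push_back(0x0D); // remember to add on the last one
    return buffer;
}
```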