std::vector buffer throwing bad_alloc in TCP socket code - c++

I am trying to send and receive a string using a TCP socket. I found some code online and modified it. Here is my sendString and receiveString code:
static inline void sendString(int socket, std::string s) {
size_t size = s.size();
size_t size_size = sizeof(size_t); // We make our buffer:
std::vector<char> buffer(size + size_size); // Put the size at the front:
char* size_begin = reinterpret_cast<char*>(&size);
std::copy(size_begin, size_begin + size_size, &(buffer[0])); // Copy the string data:
std::copy(s.begin(), s.end(), &(buffer[size_size])); // And finally send it:
send(socket, &buffer, size + size_size, 0);
}
std::string receiveString(int socket) {
size_t size_size = sizeof(size_t);
size_t size; // We read the size:
recv(socket, (char*)&size, size_size, 0);
std::vector<char> buffer(size); /** XXX: BAD ALLOC*/
recv(socket, &buffer[0], size, 0);
return std::string(buffer.begin(), buffer.end());
}
When I try to have my client send an actual string, the server side throws a std::bad_alloc in receiveString where indicated by the comment. Why did similar code work in sendString but not in receiveString? What is causing the std::bad_alloc? Also, would my code work for sending and receiving a string over a TCP socket?
Thanks!

In sendString(), you are not passing the prepared vector content to send() correctly. You need to change &buffer to either &(buffer[0]) or buffer.data() instead.
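To make the difference concrete, here is a minimal sketch comparing the address of the vector object with the address of its first element. The helper names (pointsAtElements and the two wrappers) are made up for illustration and are not part of the original code.

```cpp
#include <vector>

// Hypothetical helper: reports whether a pointer refers to the vector's
// element storage rather than to the vector object itself.
bool pointsAtElements(const std::vector<char>& v, const void* p) {
    return p == static_cast<const void*>(v.data());
}

// &buffer is the address of the std::vector object (its internal
// bookkeeping), not of the characters it manages.
bool addressOfVectorIsItsData() {
    std::vector<char> buffer(16, 'x');
    return pointsAtElements(buffer, &buffer);
}

// buffer.data() and &buffer[0] both point at the first stored character.
bool dataPointsAtElements() {
    std::vector<char> buffer(16, 'x');
    return pointsAtElements(buffer, buffer.data()) &&
           pointsAtElements(buffer, &buffer[0]);
}
```

With &buffer, send() transmits the vector's bookkeeping and whatever follows it in memory, so the receiver most likely interprets unrelated bytes as the length and then tries to allocate a huge buffer.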
That being said, the vectors are completely unnecessary in sendString() and receiveString(). Just call send()/recv() multiple times: you can send/receive the size and the string separately, and let the socket handle the buffering of bytes for you.
For that matter, send() and recv() are not guaranteed to actually send/receive the requested buffer in one go. You have to pay attention to their return values, calling them in loops until all bytes have actually been sent/received.
Also, you are not taking into account that different platforms have different sizes and byte orders (endianness) for multi-byte integers. So you need to handle that better, too.
Try something more like this:
static inline void sendRaw(int socket, const void *buffer, size_t bufsize) {
const char *ptr = static_cast<const char*>(buffer);
while (bufsize > 0) {
int numSent = send(socket, ptr, bufsize, 0);
if (numSent < 0)
throw std::runtime_error("send failed");
ptr += numSent;
bufsize -= numSent;
}
}
static inline void sendUint32(int socket, uint32_t value) {
value = htonl(value);
sendRaw(socket, &value, sizeof(value));
}
static inline void sendString(int socket, const std::string &s) {
size_t size = s.size();
if (size > std::numeric_limits<uint32_t>::max())
throw std::runtime_error("string is too long in length");
sendUint32(socket, static_cast<uint32_t>(size));
sendRaw(socket, s.c_str(), size);
}
static inline void recvRaw(int socket, void *buffer, size_t bufsize) {
char *ptr = static_cast<char*>(buffer);
while (bufsize > 0) {
int numRecv = recv(socket, ptr, bufsize, 0);
if (numRecv < 0) throw std::runtime_error("recv failed");
if (numRecv == 0) throw std::runtime_error("peer disconnected");
ptr += numRecv;
bufsize -= numRecv;
}
}
static inline uint32_t recvUint32(int socket) {
uint32_t value;
recvRaw(socket, &value, sizeof(value));
return ntohl(value);
}
std::string receiveString(int socket) {
uint32_t size = recvUint32(socket);
std::string s;
if (size > 0) {
s.resize(size);
recvRaw(socket, &s[0], size);
}
return s;
}
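The wire format used above (a 4-byte length in network byte order followed by the raw characters) can also be checked without any sockets. The following is only a sketch of that framing on an in-memory buffer; encodeFrame and decodeFrame are hypothetical names, and the byte order is produced with shifts instead of htonl()/ntohl().

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: 4-byte big-endian length prefix followed by the payload.
std::vector<unsigned char> encodeFrame(const std::string& s) {
    if (s.size() > 0xFFFFFFFFu)
        throw std::length_error("string too long for a 32-bit length prefix");
    const std::uint32_t n = static_cast<std::uint32_t>(s.size());
    std::vector<unsigned char> out;
    out.reserve(4 + s.size());
    out.push_back(static_cast<unsigned char>((n >> 24) & 0xFF));
    out.push_back(static_cast<unsigned char>((n >> 16) & 0xFF));
    out.push_back(static_cast<unsigned char>((n >> 8) & 0xFF));
    out.push_back(static_cast<unsigned char>(n & 0xFF));
    out.insert(out.end(), s.begin(), s.end());
    return out;
}

// Hypothetical helper: the inverse of encodeFrame, with bounds checks.
std::string decodeFrame(const std::vector<unsigned char>& in) {
    if (in.size() < 4)
        throw std::runtime_error("truncated header");
    const std::uint32_t n = (static_cast<std::uint32_t>(in[0]) << 24) |
                            (static_cast<std::uint32_t>(in[1]) << 16) |
                            (static_cast<std::uint32_t>(in[2]) << 8) |
                            static_cast<std::uint32_t>(in[3]);
    if (in.size() - 4 < n)
        throw std::runtime_error("truncated payload");
    return std::string(in.begin() + 4, in.begin() + 4 + n);
}
```

Because the prefix is always four bytes in a fixed order, both peers agree on the length regardless of their native size_t width or endianness.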

std::bad_alloc is thrown when the system can't allocate the requested memory. Most likely, the requested size is too big.
My crystal ball tells me that you may be seeing an endianness issue. I would convert host-to-network byte order when sending, and network-to-host when receiving.
UPDATE:
As was pointed out in multiple comments, if your call to recv() fails, size will contain uninitialized garbage. You need to do two things to avoid that: initialize size to 0 AND check whether recv() succeeded.
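A minimal sketch of that advice for POSIX sockets follows. The function name recvLength and the 1 MiB cap kMaxMessageSize are assumptions made for illustration; choose whatever limit your protocol actually allows.

```cpp
#include <arpa/inet.h>   // ntohl
#include <sys/socket.h>  // recv
#include <sys/types.h>   // ssize_t
#include <cstddef>
#include <cstdint>

// Assumed application-level cap on a single message.
const std::uint32_t kMaxMessageSize = 1u << 20;

// Hypothetical helper: reads a 4-byte network-order length from fd.
// Returns false on error, on disconnect, or if the length is implausible;
// `out` is only written on success.
bool recvLength(int fd, std::uint32_t& out) {
    std::uint32_t raw = 0;                        // never left uninitialized
    char* p = reinterpret_cast<char*>(&raw);
    std::size_t left = sizeof(raw);
    while (left > 0) {
        const ssize_t n = recv(fd, p, left, 0);
        if (n <= 0)
            return false;                         // error or peer closed
        p += n;
        left -= static_cast<std::size_t>(n);
    }
    const std::uint32_t len = ntohl(raw);
    if (len > kMaxMessageSize)
        return false;                             // refuse absurd sizes
    out = len;
    return true;
}
```

Rejecting oversized lengths before allocating means a corrupted or hostile header produces an ordinary error instead of std::bad_alloc.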

Related

Send big string into socket

I'm new to C++ and ran into this problem. I'm trying to send a big string over a socket. I've seen similar questions on Stack Overflow but could not find a real answer. For example:
Sending a long String over a Socket C++
Send a string with sockets in C++ (Winsock TCP/IP)
C++ sending string over socket
Most of them rely on the assumption that send will send all of the data in one call, or they use char * instead of std::string.
Here is a little code written in C:
int SendAll(SOCKET client_socket, const void *data, int data_size)
{
const char *data_ptr = (const char*) data;
int bytes_sent;
while (data_size > 0)
{
bytes_sent = send(client_socket, data_ptr, data_size, 0);
if (bytes_sent == SOCKET_ERROR)
return -1;
data_ptr += bytes_sent;
data_size -= bytes_sent;
}
return 1;
}
Now imagine that instead of const void *data we have std::string data. The question is: how can I advance a pointer into the data, as in data_ptr += bytes_sent;, with a std::string?
One way I came up with is to retrieve the raw pointer of the std::string, save it in a const char * variable, and then use that variable in the same way (var += bytes_sent). But as I'm new to C++, I don't know if this is the "C++ way" of doing it. Is this the best solution to the problem, or is there a better one? Thanks.
Yes, that is the best way.
You have to obtain a pointer to the data anyway, to use send, so just adjust the pointer as you see fit.
Something like:
int SendAll(SOCKET client_socket, const std::string& str)
{
const char* data_ptr = str.data();
std::size_t data_size = str.size();
int bytes_sent;
while (data_size > 0)
{
bytes_sent = send(client_socket, data_ptr, data_size, 0);
if (bytes_sent == SOCKET_ERROR)
return -1;
data_ptr += bytes_sent;
data_size -= bytes_sent;
}
return 1;
}
This is perfectly fine and idiomatic.
If you want to keep both versions of the function, just forward the string's buffer to your existing overload:
int SendAll(SOCKET client_socket, const std::string& str)
{
return SendAll(
client_socket,
reinterpret_cast<const void*>(str.data()),
str.size()
);
}
ssize_t send(int sockfd, const void *buf, size_t len, int flags);
This is the signature of send. It requires a pointer to the buffer. Although a C++ API would probably prefer a pair of iterators rather than a pointer and a size, that is not really possible here, since a pointer to the actual buffer is required. So there's nothing you can do about it, really. You can just use the string's data() member function to get a pointer to the start of the buffer, and work with that. This should be perfectly fine.
As suggested by Some programmer dude in the comments, you could add a simple overload that facilitates this:
int SendAll(SOCKET client_socket, std::string const& str) {
return SendAll(client_socket, reinterpret_cast<const void*>(str.data()), str.size());
}
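To watch the pointer arithmetic work on a std::string without opening a socket, here is a sketch in which a callable stands in for send(). The names sendAll (lower-case, distinct from the SendAll overloads above) and ChunkySink are hypothetical; the fake sink accepts at most 3 bytes per call to simulate partial sends.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical generic loop: `sendSome` stands in for send() and must return
// the number of bytes it accepted, or a value <= 0 on failure.
template <class SendFn>
bool sendAll(const std::string& str, SendFn&& sendSome) {
    const char* ptr = str.data();       // raw pointer into the string's buffer
    std::size_t left = str.size();
    while (left > 0) {
        const long sent = sendSome(ptr, left);
        if (sent <= 0)
            return false;               // treat "no progress" as failure
        ptr += sent;                    // same arithmetic as with const void*
        left -= static_cast<std::size_t>(sent);
    }
    return true;
}

// Fake transport for illustration: takes at most 3 bytes per call.
struct ChunkySink {
    std::string received;
    int calls = 0;
    long operator()(const char* p, std::size_t n) {
        const std::size_t take = std::min<std::size_t>(n, 3);
        received.append(p, take);
        ++calls;
        return static_cast<long>(take);
    }
};
```

Sending the 11-byte string "hello world" through a ChunkySink takes four calls (3 + 3 + 3 + 2 bytes), and the sink ends up holding the complete string.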

Reliable way to place char directly after array

I'm using following code to read from socket:
char buf[4097];
int ret = read(fd, buf, sizeof(buf) - 1);
buf[ret] = 0x0;
std::cout << buf << "\n";
However, I don't like the need for 4097 and sizeof(buf) - 1 in there. It's the kind of thing that's easy to forget. So I wonder, is there some nice way to force the compiler to put 0x0 on the stack directly after the array?
What I would love is something like
char buf[4096];
char _ = 0x0;
int ret = read(fd, buf, sizeof(buf));
buf[ret] = 0x0;
std::cout << buf << "\n";
but I have no idea how to force the compiler not to put anything in between (afaik #pragma pack works only on structures, not on the stack).
I'd keep things simple:
ssize_t read_and_put_0(int fd, void *buf, size_t count)
{
ssize_t ret = read(fd, buf, count - 1);
if (ret != -1) // Or `if (ret >= 0)`
((char *)buf)[ret] = 0x0;
return ret;
}
// ...
char buf[4097];
read_and_put_0(fd, buf, sizeof buf);
I don't like the need for 4097 and sizeof(buf) - 1 in there
Simplicity is beautiful:
constexpr std::size_t size = 4096;
char buf[size + 1];
int ret = read(fd, buf, size);
buf[ret] = 0x0;
You specify exactly the size that you need, with no need for manual adding. And there's no need for either sizeof or subtracting 1.
Remembering the + 1 for terminator is easier in my opinion than remembering to declare a separate character object - which can't be forced to be directly after the array anyway.
That said, there are less error prone ways to read a text file than read.
The relative location in memory of the values of distinct variables is unspecified. Indeed, some variables might not reside in memory at all. If you want to ensure the relative layout of data in memory, then use a struct or class. For example:
struct {
char buf[4096];
char term;
} tbuf = { { 0 }, 0 };
int ret = read(fd, tbuf.buf, sizeof(tbuf.buf));
if (ret >= 0 && ret < sizeof(tbuf.buf)) {
tbuf.buf[ret] = '\0';
}
The members of the struct are guaranteed to be laid out in memory in the same order that they are declared, so you can be confident that the fail-safe terminator tbuf.term will follow tbuf.buf. You cannot, however, be confident that there is no padding between. Furthermore, this is just a failsafe. You still need to write the null terminator, as shown, in case there is a short read.
Additionally, even though the representation of tbuf is certain to be larger than its buf member by at least one byte, it still produces UB to access tbuf.buf outside its bounds. Overall, then, I don't think you gain much, if anything, by this.
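The layout claims above can be checked at compile time. This is a small sketch using offsetof; the struct name TerminatedBuf and the function paddingBeforeTerm are made up for illustration.

```cpp
#include <cstddef>

struct TerminatedBuf {   // hypothetical named version of the struct above
    char buf[4096];
    char term;
};

// Declaration order is preserved, so `term` cannot start before `buf` ends.
static_assert(offsetof(TerminatedBuf, term) >= sizeof(TerminatedBuf::buf),
              "term must follow buf");

// Number of padding bytes between the array and the terminator. Common ABIs
// give 0 for two char members, but the standard does not promise that.
constexpr std::size_t paddingBeforeTerm() {
    return offsetof(TerminatedBuf, term) - sizeof(TerminatedBuf::buf);
}
```

Even when the padding happens to be zero, indexing tbuf.buf at 4096 to reach term is still out of bounds for the array, which is the undefined behavior mentioned above.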
An alternative to HolyBlackCat's answer that doesn't require passing the size argument, as long as you have the array itself and not a pointer to some array:
template <size_t N> ssize_t read_and_put_0(int fd, char (&buf)[N]) {
ssize_t ret = read(fd, buf, N - 1);
if(ret != -1) // Or `if (ret >= 0)`
buf[ret] = 0x0;
return ret;
}
char buf[4097];
read_and_put_0(fd, buf);
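If the goal is simply to print or store what was read, another option is to skip the terminator entirely by reading into a std::string and shrinking it to the number of bytes received. This is only a sketch for POSIX read(); the name read_chunk and its default size are assumptions.

```cpp
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // read
#include <cstddef>
#include <string>

// Hypothetical helper: performs one read() of up to maxBytes and returns
// exactly the bytes received. Returns an empty string on error or EOF.
std::string read_chunk(int fd, std::size_t maxBytes = 4096) {
    std::string out(maxBytes, '\0');
    const ssize_t ret = read(fd, &out[0], out.size());
    if (ret <= 0) {
        out.clear();
        return out;
    }
    out.resize(static_cast<std::size_t>(ret));  // the string tracks its own length
    return out;
}
```

Since std::string stores its length, there is no 4097 and no sizeof(buf) - 1 to remember, and std::cout << read_chunk(fd) prints only the bytes that were actually read.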

converting char* to boost::array for sockets use

I'd like to use the method "read_some()" of boost::asio::ip::tcp::socket to fill a buffer represented as a char*.
Here is my method implementation so far:
template<class charType>
int receive_some(charType* buffer, int size)
{
int total_received = 0;
boost::array<charType, size> buf;
while (1)
{
int received = 0;
boost::system::error_code error;
received = _socket.read_some(boost::asio::buffer(buf), error);
if (error == boost::asio::error::eof)
{
break;
}
std::cout.write(buf.data(), received);
total_received += received;
}
return total_received;
}
My problem is I don't see how to convert my charType* buffer into boost::array buf. It seems expensive to iterate over the elements of my boost::array at the end of the process just to fill-in the buffer object...
Any idea ?
template<class charType>
int receive_some(charType* buffer, int size)
{
int total_received = 0;
while (1)
{
int received = 0;
boost::system::error_code error;
received = _socket.read_some(boost::asio::buffer(buffer, size), error);
if (error == boost::asio::error::eof)
{
break;
}
std::cout.write(buffer, received);
total_received += received;
}
return total_received;
}
The boost::asio::buffer function has a lot of overloads that let you create an asio buffer from different types of sources.
It's worth noting that size has to be the number of bytes to read into buffer and not the number of charType elements.
Bonus tip: As the comments pointed out, that template is suspicious. The best you could do with it is write directly into wide strings, but that probably belongs somewhere other than a read_some function (or arguably nowhere at all). In a network function you deal with bytes, not characters, so you'd better take a plain char* or even void* as the type of the buffer parameter.
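The bytes-versus-elements distinction is easy to get wrong when charType is wider than one byte. A tiny sketch, with the helper name byteCount made up for illustration, of the conversion you would apply before passing a size to a (pointer, size-in-bytes) style API:

```cpp
#include <cstddef>

// Hypothetical helper: converts an element count into the byte count that a
// (pointer, size-in-bytes) buffer API expects.
template <class T>
constexpr std::size_t byteCount(std::size_t elements) {
    return elements * sizeof(T);
}
```

For char the two numbers coincide, but for wchar_t a buffer of 128 elements occupies 128 * sizeof(wchar_t) bytes, so passing the element count would use only part of the buffer.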

Sending a c++ object with vector over socket

For quite some time now I have been trying to find a good way to serialize or send a state object over a TCP socket. My problem is that I am not able to use any third-party libraries like Boost.
My state object contains multiple objects. Most importantly, it has some member objects and a vector of objects, but no pointers (i.e. probably no deep copying needed, unless the vector requires it).
My question: since I can't use Boost or any other libraries, what is the best way to send an object containing other objects over a socket?
I have been thinking that I could probably write a copy constructor and send the result to a stream, but I am not quite sure about the consequences of doing this.
Define (de-)serialization functions for your data types.
For example, if you have something like:
class MyClass
{
public:
int field_a;
int field_b;
std::string string;
...
};
typedef std::vector<MyClass> MyVector;
You can define the following:
void write(int fd, const MyClass &arg)
{
// either convert arg to byte array and write it, or write field by field
// here we write it field by field
write_int(fd, arg.field_a);
write_int(fd, arg.field_b);
write_string(fd, arg.string);
}
void write(int fd, const MyVector &arg)
{
size_t size = arg.size();
::write(fd, &size, sizeof(size)); // beware: machine-dependent code
for (MyVector::const_iterator it = arg.begin(); it != arg.end(); it++)
{
write(fd, *it); // calls the MyClass overload above
}
}
Helper functions:
void write_int(int fd, int arg)
{
write(fd, &arg, sizeof(int));
}
void write_string(int fd, const std::string &str)
{
size_t len = str.length();
write(fd, &len, sizeof(len)); // string length goes first
write(fd, str.data(), len); // write string data
}
And reading:
MyClass readMC(int fd)
{
// read MyClass data from stream, parse it
int f1, f2;
std::string str;
read_int(fd, f1);
read_int(fd, f2);
read_string(fd, str);
return MyClass(f1, f2, str);
}
void read(int fd, MyVector &arg)
{
size_t size;
size_t i;
read(fd, &size, sizeof(size)); // read number of elements;
arg.reserve(size);
for (i = 0; i < size; i++)
{
arg.push_back(readMC(fd));
}
}
Helper functions:
void read_int(int fd, int &res)
{
read(fd, &res, sizeof(res));
}
void read_string(int fd, std::string &string)
{
size_t len;
char *buf;
read(fd, &len, sizeof(len));
buf = new char[len];
read(fd, buf, len);
string.assign(buf, len);
delete []buf;
}
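The functions above write size_t and int in the machine's native representation, which the "machine-dependent code" comment warns about. Below is a sketch of the same field-by-field idea with fixed-width, big-endian integers, written against std::ostream/std::istream so it can be tried with a std::stringstream before touching a socket. Record, putU32, getU32, serialize, and deserialize are hypothetical names, not part of the answer above.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Record {               // hypothetical stand-in for MyClass
    std::int32_t a;
    std::int32_t b;
    std::string name;
};

// Writes a 32-bit value as four bytes, most significant first.
void putU32(std::ostream& os, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        os.put(static_cast<char>((v >> shift) & 0xFF));
}

// Reads four bytes written by putU32; throws if the input ends early.
std::uint32_t getU32(std::istream& is) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        char c;
        if (!is.get(c))
            throw std::runtime_error("truncated input");
        v = (v << 8) | static_cast<unsigned char>(c);
    }
    return v;
}

void serialize(std::ostream& os, const std::vector<Record>& v) {
    putU32(os, static_cast<std::uint32_t>(v.size()));   // element count first
    for (const Record& r : v) {
        putU32(os, static_cast<std::uint32_t>(r.a));
        putU32(os, static_cast<std::uint32_t>(r.b));
        putU32(os, static_cast<std::uint32_t>(r.name.size()));
        os.write(r.name.data(), static_cast<std::streamsize>(r.name.size()));
    }
}

std::vector<Record> deserialize(std::istream& is) {
    std::vector<Record> v;
    const std::uint32_t count = getU32(is);
    for (std::uint32_t i = 0; i < count; ++i) {
        Record r;
        r.a = static_cast<std::int32_t>(getU32(is));
        r.b = static_cast<std::int32_t>(getU32(is));
        const std::uint32_t len = getU32(is);  // a real protocol should cap this
        r.name.resize(len);
        if (len > 0 && !is.read(&r.name[0], static_cast<std::streamsize>(len)))
            throw std::runtime_error("truncated string");
        v.push_back(r);
    }
    return v;
}
```

The bytes produced by serialize() (for example via std::ostringstream::str()) can then be sent with a send-all loop, and the receiver can feed the received bytes to deserialize() through a std::istringstream.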

How to implement readlink to find the path

Using the readlink function used as a solution to How do I find the location of the executable in C?, how would I get the path into a char array? Also, what do the variables buf and bufsize represent and how do I initialize them?
EDIT: I am trying to get the path of the currently running program, just like the question linked above. The answer to that question said to use readlink("/proc/self/exe"). I do not know how to implement that in my program. I tried:
char buf[1024];
string var = readlink("/proc/self/exe", buf, bufsize);
This is obviously incorrect.
See "Use the readlink() function properly" for the correct use of the readlink function.
If you have your path in a std::string, you could do something like this:
#include <unistd.h>
#include <limits.h>
std::string do_readlink(std::string const& path) {
char buff[PATH_MAX];
ssize_t len = ::readlink(path.c_str(), buff, sizeof(buff)-1);
if (len != -1) {
buff[len] = '\0';
return std::string(buff);
}
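/* handle error condition; the empty string is only a placeholder so that
   control never flows off the end of a non-void function */
return std::string();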
}
If you're only after a fixed path:
std::string get_selfpath() {
char buff[PATH_MAX];
ssize_t len = ::readlink("/proc/self/exe", buff, sizeof(buff)-1);
if (len != -1) {
buff[len] = '\0';
return std::string(buff);
}
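/* handle error condition; the empty string is only a placeholder so that
   control never flows off the end of a non-void function */
return std::string();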
}
To use it:
int main()
{
std::string selfpath = get_selfpath();
std::cout << selfpath << std::endl;
return 0;
}
The accepted answer is almost correct, except that you can't rely on PATH_MAX, because it is "not guaranteed to be defined per POSIX if the system does not have such limit" (from the readlink(2) manpage).
Also, when it's defined it doesn't always represent the "true" limit. (See http://insanecoding.blogspot.fr/2007/11/pathmax-simply-isnt.html )
The readlink manpage also gives a way to do that for a symlink:
Using a statically sized buffer might not provide enough room for the
symbolic link contents. The required size for the buffer can be
obtained from the stat.st_size value returned by a call to lstat(2) on
the link. However, the number of bytes written by readlink() and
readlinkat() should be checked to make sure that the size of the symbolic
link did not increase between the calls.
However, in the case of /proc/self/exe, as for most /proc files, stat.st_size would be 0. The only remaining solution I see is to keep growing the buffer until the result fits.
I suggest using vector<char> for this purpose, as follows:
std::string get_selfpath()
{
std::vector<char> buf(400);
ssize_t len;
do
{
buf.resize(buf.size() + 100);
len = ::readlink("/proc/self/exe", &(buf[0]), buf.size());
} while (buf.size() == len);
if (len > 0)
{
buf[len] = '\0';
return (std::string(&(buf[0])));
}
/* handle error */
return "";
}
Let's look at what the manpage says:
readlink() places the contents of the symbolic link path in the buffer
buf, which has size bufsiz. readlink does not append a NUL character to
buf.
OK. Should be simple enough. Given your buffer of 1024 chars:
char buf[1024];
/* The manpage says it won't null terminate. Let's zero the buffer. */
memset(buf, 0, sizeof(buf));
/* Note we use sizeof(buf)-1 since we may need an extra char for NUL. */
if (readlink("/proc/self/exe", buf, sizeof(buf)-1) < 0)
{
/* There was an error... Perhaps the path does not exist
* or the buffer is not big enough. errno has the details. */
perror("readlink");
return -1;
}
char *
readlink_malloc (const char *filename)
{
int size = 100;
char *buffer = NULL;
while (1)
{
buffer = (char *) xrealloc (buffer, size);
int nchars = readlink (filename, buffer, size);
if (nchars < 0)
{
free (buffer);
return NULL;
}
if (nchars < size)
return buffer;
size *= 2;
}
}
Taken from: http://www.delorie.com/gnu/docs/glibc/libc_279.html
#include <stdlib.h>
#include <unistd.h>
static char *exename(void)
{
char *buf;
char *newbuf;
size_t cap;
ssize_t len;
buf = NULL;
for (cap = 64; cap <= 16384; cap *= 2) {
newbuf = realloc(buf, cap);
if (newbuf == NULL) {
break;
}
buf = newbuf;
len = readlink("/proc/self/exe", buf, cap);
if (len < 0) {
break;
}
if ((size_t)len < cap) {
buf[len] = 0;
return buf;
}
}
free(buf);
return NULL;
}
#include <stdio.h>
int main(void)
{
char *e = exename();
printf("%s\n", e ? e : "unknown");
free(e);
return 0;
}
This uses the traditional "when you don't know the right buffer size, reallocate increasing powers of two" trick. We assume that allocating less than 64 bytes for a pathname is not worth the effort. We also assume that an executable pathname as long as 16384 (2**14) bytes has to indicate some kind of anomaly in how the program was installed, and it's not useful to know the pathname as we'll soon encounter bigger problems to worry about.
There is no need to bother with constants like PATH_MAX. Reserving so much memory is overkill for almost all pathnames, and as noted in another answer, it's not guaranteed to be the actual upper limit anyway. For this application, we can pick a common-sense upper limit such as 16384. Even for applications with no common-sense upper limit, reallocating increasing powers of two is a good approach. You only need log n calls for a n-byte result, and the amount of memory capacity you waste is proportional to the length of the result. It also avoids race conditions where the length of the string changes between the realloc() and the readlink().
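For C++ callers, the same doubling strategy can be sketched with std::vector and std::string, which removes the manual realloc()/free(). The function name read_link is made up for illustration; the 64-byte starting size and 16384-byte cap simply mirror the C version above.

```cpp
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // readlink
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the target of the symbolic link at `path`,
// or an empty string on error or if the target exceeds the cap.
std::string read_link(const std::string& path) {
    for (std::size_t cap = 64; cap <= 16384; cap *= 2) {
        std::vector<char> buf(cap);
        const ssize_t len = ::readlink(path.c_str(), buf.data(), buf.size());
        if (len < 0)
            return std::string();
        if (static_cast<std::size_t>(len) < cap)
            return std::string(buf.data(), static_cast<std::size_t>(len));
        // len == cap: the result may have been truncated, so retry with
        // a buffer twice as large.
    }
    return std::string();
}
```

Calling read_link("/proc/self/exe") on Linux should then yield the path of the running executable. Constructing the std::string from the pointer and the returned length avoids relying on a NUL terminator that readlink() never writes.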