Safe use of a function that writes data after a pointer - c++

I have a function foo(void* buffer, size_t len) that calculates a hash from the data at buffer (of size len) and appends it at the end of buffer.
Usually I have a vector that I would pass as foo(&myVec[0], myVec.size()).
How would I safely use this function with a vector? Resize it before passing it?
void foo(void* buffer, size_t len)
{
    if (buffer == NULL)
    {
        printf("Error\n");
        return;
    }
    std::vector<unsigned char> hash(128);
    gethash(buffer, len, &hash[0]);
    // the hash is written directly after the first len bytes of buffer
    unsigned char *data = ((unsigned char*) buffer) + len;
    memcpy(data, &hash[0], hash.size());
}

Assuming the vector is a vector<char>, you could avoid all the messiness and do:
void foo(std::vector<char>& buffer)
{
    std::vector<unsigned char> hash(128);
    gethash(buffer.data(), buffer.size(), hash.data());
    const size_t oldSize = buffer.size();   // offset where the hash will start
    buffer.resize(oldSize + hash.size());   // may reallocate, so call data() afterwards
    memcpy(buffer.data() + oldSize, hash.data(), hash.size());
}
This is still a bit "messy", but a lot less than the code you posted.
As suggested in the comments, something like:
buffer.insert(buffer.end(), hash.begin(), hash.end());
is probably better than the last three lines I wrote.

You can't do it!
If buffer points to memory of a given len, you can't allocate more memory at exactly that place.
There are two ways to deal with it:
The buffer already comes with enough room for both the data and the hash. I would prefer a struct for this solution, though!
or
Allocate new memory, copy the data and the hash value into it, and return a pointer to the new memory. Don't forget to free this memory later, and the input memory as well.
The second solution can be done with a vector:
void foo( vector<char> &vec )
{
    ...
    gethash(&vec[0], len, &hash[0]);
    ...
    vec.resize(...); // reallocate and copy data if needed
    memcpy // which I do not want to use with a vector :-)
}
Resizing a vector beyond its capacity allocates new memory, copies the data from the old memory to the new memory, and frees the old buffer. The vector may already hold (much) more memory than its size suggests, in which case no reallocation happens. You only need to care about this behaviour once speed becomes a criterion. You can also reserve enough capacity up front so that no automatic reallocation and copy takes place.
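To illustrate the point about reserving capacity up front, here is a minimal sketch. The names make_buffer, grows_without_realloc and kHashSize are made up for this example, and the 128 is only taken from the hash(128) in the question. It shows that a resize() that stays within the reserved capacity does not reallocate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed hash size, taken from the hash(128) in the question.
const std::size_t kHashSize = 128;

// Builds a zero-filled payload buffer with spare capacity for a trailing hash.
std::vector<unsigned char> make_buffer(std::size_t payload_size)
{
    std::vector<unsigned char> v;
    v.reserve(payload_size + kHashSize); // changes capacity only, size stays 0
    v.resize(payload_size);              // payload bytes, zero-initialised
    assert(v.capacity() >= payload_size + kHashSize);
    return v;
}

// Grows v by `extra` bytes and reports whether the capacity stayed the same,
// i.e. whether the growth happened without a reallocation.
bool grows_without_realloc(std::vector<unsigned char>& v, std::size_t extra)
{
    const std::size_t capacity_before = v.capacity();
    v.resize(v.size() + extra);
    return v.capacity() == capacity_before;
}
```

Keep in mind that reserve() only avoids the reallocation. The vector still has to be resized (or inserted into) before anything may be written past its current size.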

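You can do this before calling foo:
int main() {
    std::vector<int> buffer;
    size_t sz = buffer.size();                          // number of payload elements
    size_t tsz = sizeof(decltype(buffer)::value_type);  // bytes per element
    buffer.resize(128 / tsz + buffer.size());           // add room for the 128-byte hash
    foo(buffer.data(), sz * tsz);                       // len is the payload size in bytes
    return 0;
}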
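If the pointer-based foo() cannot be changed, the resize-first idea can be wrapped in a small helper so that callers cannot forget it. This is only a sketch: the foo() below is a stand-in with a placeholder checksum instead of a real hash, and append_hash and kHashSize are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

const std::size_t kHashSize = 128; // assumed, matches hash(128) in the question

// Stand-in for the question's foo(): writes kHashSize bytes directly after
// the first len bytes of buffer. The "hash" is a placeholder byte sum.
void foo(void* buffer, std::size_t len)
{
    if (buffer == nullptr) return;
    unsigned char* bytes = static_cast<unsigned char*>(buffer);
    unsigned char sum = 0;
    for (std::size_t i = 0; i < len; ++i)
        sum = static_cast<unsigned char>(sum + bytes[i]);
    std::memset(bytes + len, sum, kHashSize);
}

// Grows the vector first, so that foo() only writes into memory the vector owns.
void append_hash(std::vector<unsigned char>& data)
{
    const std::size_t payload = data.size();
    data.resize(payload + kHashSize); // may reallocate, so take data() afterwards
    foo(data.data(), payload);
    assert(data.size() == payload + kHashSize);
}
```

The important detail is the order: data() is only called after resize(), because a reallocation would invalidate any pointer taken earlier.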

You can do it like this:
void foo(vector<unsigned char>& data)
{
    if (data.empty())
        return;
    vector<unsigned char> result(128);
    gethash(&data[0], data.size(), &result[0]);
    data.insert(data.end(), result.begin(), result.end());
}

Related

copy std::vector<unsigned char> address to unsigned char[] array [duplicate]

I have an std::vector. I want to copy the contents of the vector into a char* buffer of a certain size.
Is there a safe way to do this?
Can I do this?
memcpy(buffer, _v.begin(), buffer_size);
or this?
std::copy(_v.begin(), _v.end(), buffer); // throws a warning (unsafe)
or this?
for (int i = 0; i < _v.size(); i++)
{
*buffer = _v[i];
buffer++;
}
Thanks..
std::copy(_v.begin(), _v.end(), buffer);
This is the preferred way to do this in C++. It is safe to copy this way if buffer is large enough.
If you just need a char*, then you can do this:
char *buffer = &v[0]; // v is guaranteed to be a contiguous block of memory
// use buffer
Note that changing the data pointed to by buffer also changes the vector's content!
Or, if you need a copy, allocate v.size() bytes and use std::copy:
char *buffer = new char[v.size()];
std::copy(v.begin(), v.end(), buffer);
Don't forget to delete [] buffer; after you're done, or you'll leak memory.
But why invite a problem that requires you to manage the memory yourself, especially when you can do better, such as:
auto copy = v; // a simpler way to make a copy
// and then use copy as the new buffer.
// no need to manually delete anything. :-)
Hope that helps.
The safest way to copy a vector<char> into a char * buffer is to copy it to another vector, and then use that vector's internal buffer:
std::vector<char> copy = _v;
char * buffer = &copy[0];
Of course, you can also access _v's buffer directly if you don't actually need to copy the data. Also, beware that the pointer will be invalidated if the vector is resized.
If you need to copy it into a particular buffer, then you'll need to know that the buffer is large enough before copying; there are no bounds checks on arrays. Once you've checked the size, your second method is best. (The first only works if vector::iterator is a pointer, which isn't guaranteed; although you could change the second argument to &_v[0] to make it work. The third does the same thing, but is more complicated, and probably should be fixed so it doesn't modify buffer).
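The "check the size, then use std::copy" advice could look roughly like the sketch below. copy_to_buffer is a made-up helper name, and returning false when the data does not fit is just one possible policy (truncating or throwing would be others).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Copies all of v into buffer only if it fits.
// Returns false and leaves buffer untouched when buffer_size is too small.
bool copy_to_buffer(const std::vector<char>& v, char* buffer, std::size_t buffer_size)
{
    assert(buffer != nullptr);                // caller contract
    if (v.size() > buffer_size) return false; // would overrun the destination
    std::copy(v.begin(), v.end(), buffer);
    return true;
}
```

Because the size is checked before the copy, the destination is never partially overwritten when the data does not fit.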
Well, you want to assign to *buffer for case 3, but that should work. The first one almost certainly won't work.
EDIT: I stand corrected regarding #2.
static std::vector<unsigned char> read_binary_file (const std::string& filename)
{
    // binary mode is only for switching off newline translation
    std::ifstream file(filename, std::ios::binary);
    file.unsetf(std::ios::skipws);   // otherwise istream_iterator skips whitespace bytes
    std::streampos file_size;
    file.seekg(0, std::ios::end);
    file_size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<unsigned char> vec;
    vec.reserve(file_size);          // reserve only: constructing vec(file_size) would leave file_size extra zero bytes after the inserted data
    vec.insert(vec.begin(),
               std::istream_iterator<unsigned char>(file),
               std::istream_iterator<unsigned char>());
    return vec;
}
and then:
auto vec = read_binary_file(filename);
auto src = new char[vec.size()];
std::copy(vec.begin(), vec.end(), src);
but remember to delete [] src later.
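As an alternative to istream_iterator plus unsetf(skipws), the same read can be sketched with std::istreambuf_iterator, which works on raw characters and never skips whitespace. This swaps in a different technique from the answer above. read_all and read_all_from_string are hypothetical names, and the string-based helper exists only to show usage without touching the file system.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads every remaining byte of the stream into a vector.
std::vector<unsigned char> read_all(std::istream& in)
{
    std::vector<unsigned char> bytes;
    bytes.assign(std::istreambuf_iterator<char>(in),
                 std::istreambuf_iterator<char>());
    return bytes;
}

// Usage with an in-memory stream. A std::ifstream opened with
// std::ios::binary can be passed to read_all() in the same way.
std::vector<unsigned char> read_all_from_string(const std::string& s)
{
    std::istringstream in(s);
    std::vector<unsigned char> bytes = read_all(in);
    assert(bytes.size() == s.size()); // whitespace and NUL bytes are preserved
    return bytes;
}
```

Since the result is already a vector, the extra new char[] copy is only needed if some API insists on owning a raw array.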

Function that dynamically construct a byte array and return length

I need to create an encoder function in a class
bool encodeMsg(unsigned char* buffer, unsigned short& len);
This class has some fixed length members and some variable length vectors (of different structures).
I have to encode a Byte stream based on some sequence of these member variables.
Here is a scaled-down version:
class test
{
public:
    test();
    ~test();
    bool encodeMsg(unsigned char* buffer, unsigned short& len);
    bool decodeMsg(const unsigned char* buffer, unsigned short len);
private:
    unsigned char a; // 0x12
    unsigned char b; // 0x34
    unsigned char c; // 0x56
};
What I want is 0x123456 in my buffer when I encode.
Questions:
How should I allocate memory? The size is not known before calling this function.
Is there a way to map class object memory which basically gives what I want?
I know this is a very basic question, but I want to know the optimal and conventional method to do it.
How should I allocate memory? The size is not known before calling this function.
Given your current code, the caller should allocate the memory:
unsigned char buffer[3];
unsigned short len = sizeof buffer;
my_test_object.encodeMsg(buffer, len);
Is there a way to map class object memory which basically gives what I want?
That's very vague. If you use a (possibly compiler-specific) #pragma or attribute to ensure the character values occupy 3 contiguous bytes in memory, and as long as you don't add any virtual functions to the class, you can implement encodeMsg() using:
memcpy(buffer, (unsigned char*)this + offsetof(test, a), 3);
But, what's the point? At best, I can't imagine that memcpy ever being faster than the "nice" way to write it:
buffer[0] = a;
buffer[1] = b;
buffer[2] = c;
If you actually mean something more akin to:
test* p = reinterpret_cast<test*>(buffer);
*p = *this;
That will have undefined behaviour, and may write up to sizeof(test) bytes into the buffer, which can be more than 3 if the compiler pads the class, and that could cause some client code buffer overruns, remove an already-set NUL terminator etc. Hackish and dangerous.
Taking a step back, if you have to ask these sorts of questions you should be worrying about adopting good programming practice - only once you're a master of this kind of thing should you be worrying about what's optimal. For developing good habits, you might want to look at the boost serialisation library and get comfortable with it first.
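For the caller-allocates approach, one common convention (assumed here, the question does not specify it) is to treat len as in/out: the caller passes the buffer capacity and gets back the number of bytes written. A minimal sketch with a hypothetical free-standing Msg struct instead of the class:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical message type mirroring the three one-byte fields in the question.
struct Msg {
    unsigned char a, b, c;
};

// On entry len is the buffer capacity; on success it becomes the encoded size.
// Returns false (and leaves buffer and len untouched) if the buffer is too small.
bool encodeMsg(const Msg& m, unsigned char* buffer, unsigned short& len)
{
    const unsigned short needed = 3;
    if (buffer == nullptr || len < needed) return false;
    buffer[0] = m.a;
    buffer[1] = m.b;
    buffer[2] = m.c;
    len = needed;
    return true;
}

// Example call: a 3-byte buffer is exactly large enough.
void example()
{
    Msg m = {0x12, 0x34, 0x56};
    unsigned char buffer[3];
    unsigned short len = sizeof buffer;
    bool ok = encodeMsg(m, buffer, len);
    assert(ok && len == 3 && buffer[0] == 0x12);
    (void)ok; // silence unused-variable warnings when NDEBUG is defined
}
```

With variable-length members, the same function can compute needed from the vector sizes before writing anything, so a too-small buffer is rejected up front.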
If you can change the interface of your encodeMsg() function, you could store the byte stream in a vector.
bool test::encodeMsg(std::vector<unsigned char>& buffer)
{
    // if speed is important you can fill the buffer some other way
    buffer.push_back(a);
    buffer.push_back(b);
    buffer.push_back(c);
    return true;
}
If encodeMsg() can't fail (does not need to return bool), you can create and return the vector in it like this:
std::vector<unsigned char> test::encodeMsg()
{
    std::vector<unsigned char> buffer;
    // if speed is important you can fill the buffer some other way
    buffer.push_back(a);
    buffer.push_back(b);
    buffer.push_back(c);
    return buffer;
}
The C++ way would be to use streams. Just implement the insertion operator << for encoding like this (since a, b and c are private, both operators have to be declared as friends of test):
std::ostream& operator<<(std::ostream& os, const test& t)
{
    os << t.a;
    os << t.b;
    os << t.c;
    return os;
}
Same with the extraction operator >> for decoding:
std::istream& operator>>(std::istream& is, test& t)
{
    is >> t.a;
    is >> t.b;
    is >> t.c;
    return is;
}
This moves memory management to the stream and the caller. If you need a special encoding for the types, then derive your codec from istream and ostream and use those.
The memory and the size can be retrieved from the stream when using a stringstream like this
test t;
std::ostringstream strm;
strm << t;
std::string result = strm.str();
auto size = result.length(); // size
auto array = result.data(); // the byte array
For classes that are trivially copyable (std::is_trivially_copyable<test>::value == true), encoding and decoding is actually straightforward, assuming you have already allocated the memory for buffer:
bool encodeMsg(unsigned char* buffer, unsigned short& len) {
    auto* ptr = reinterpret_cast<unsigned char*>(this);
    len = sizeof(test);
    memcpy(buffer, ptr, len);
    return true;
}
bool decodeMsg(const unsigned char* buffer) {
    auto* ptr = reinterpret_cast<unsigned char*>(this);
    memcpy(ptr, buffer, sizeof(test));
    return true;
}
or shorter
bool encodeMsg(unsigned char* buffer, unsigned short& len) {
    len = sizeof(test);
    memcpy(buffer, (unsigned char*)this, len);
    return true;
}
bool decodeMsg(const unsigned char* buffer) {
    memcpy((unsigned char*)this, buffer, sizeof(test));
    return true;
}
Note that sizeof(test) includes any padding the compiler adds, so you may end up copying more than 3 bytes.
As far as interpreting something directly as a byte array goes: casting a pointer from test* to unsigned char* and accessing the object through it is legal, but not the other way round. So what you could write is:
unsigned char* encodeMsg(unsigned short& len) {
    len = sizeof(test);
    return reinterpret_cast<unsigned char*>(this);
}
bool decodeMsg(const unsigned char* buffer) {
    auto* ptr = reinterpret_cast<unsigned char*>(this);
    memcpy(ptr, buffer, sizeof(test));
    return true;
}
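A self-contained round trip of the memcpy approach might look like the sketch below. Packet is a hypothetical stand-in for test with public members, and the capacity and length parameters are additions for safety that the answer above does not have. The static_assert makes the trivially-copyable requirement explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the question's class, members public for brevity.
struct Packet {
    unsigned char a, b, c;
};

static_assert(std::is_trivially_copyable<Packet>::value,
              "memcpy-based encoding needs a trivially copyable type");

// Copies the object representation into buffer; fails if it does not fit.
bool encode(const Packet& p, unsigned char* buffer, std::size_t capacity, std::size_t& len)
{
    assert(buffer != nullptr);
    if (capacity < sizeof(Packet)) return false;
    std::memcpy(buffer, &p, sizeof(Packet));
    len = sizeof(Packet);
    return true;
}

// Copies the bytes back into an existing object; fails on short input.
bool decode(Packet& p, const unsigned char* buffer, std::size_t len)
{
    assert(buffer != nullptr);
    if (len < sizeof(Packet)) return false;
    std::memcpy(&p, buffer, sizeof(Packet));
    return true;
}
```

This copies the in-memory representation as is, so padding and byte order of wider members end up in the output. That is fine within one program, but it is not a portable wire format.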

Contents of an untyped object copied into vector<unsigned char>

I'm trying to write the contents of an untyped object that holds the bytes of an image into a vector of unsigned char. Sadly, I cannot get it to work. Maybe someone could point me in the right direction?
Here is what I have at the moment:
vector<unsigned char> SQLiteDB::BlobData(int clmNum){
    // I get the data of the image
    const void* data = sqlite3_column_blob(pSQLiteConn->pRes, clmNum);
    vector<unsigned char> bytes;
    // return the size of the image in bytes
    int size = getBytes(clNum);
    unsigned char b[size];
    memcpy(b, data, size);
    for(int j=0;j<size;j++){
        bytes.push_back(b[j]);
    }
    return bytes;
}
If I try to trace the contents of the bytes vector, it's all empty.
So the question is, how can I get the data into the vector?
You should use the vector's constructor that takes a couple of iterators:
const unsigned char* data = static_cast<const unsigned char*>(sqlite3_column_blob(pSQLiteConn->pRes, clmNum));
vector<unsigned char> bytes(data, data + getBytes(clNum));
Write directly into the vector; there is no need for additional useless copies:
bytes.resize(size);
memcpy(bytes.data(), data, size);
Instead of a copy, this has a zero-initialisation, so using the constructor as shown in the other answer or vector::insert is better.
const unsigned char* data = static_cast<const unsigned char*>(sqlite3_column_blob(pSQLiteConn->pRes, clmNum));
bytes.insert(bytes.end(), data, data + getBytes(clNum));
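One detail worth guarding against: sqlite3_column_blob() returns NULL for a zero-length BLOB, so the pointer arithmetic should be skipped in that case. Below is a sketch of the iterator-range constructor with that check, kept free of SQLite so it can be tried on its own. to_bytes is a hypothetical helper; in real code the pointer would come from sqlite3_column_blob() and the size from sqlite3_column_bytes() or the question's getBytes().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies size bytes starting at data into a new vector.
// A null pointer or a non-positive size yields an empty vector.
std::vector<unsigned char> to_bytes(const void* data, int size)
{
    if (data == nullptr || size <= 0) return std::vector<unsigned char>();
    const unsigned char* first = static_cast<const unsigned char*>(data);
    return std::vector<unsigned char>(first, first + size);
}

// Example calls with an in-memory array instead of a database column.
void example()
{
    const unsigned char raw[] = {0xFF, 0xD8, 0xFF};
    assert(to_bytes(raw, 3).size() == 3);
    assert(to_bytes(nullptr, 0).empty());
}
```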

Nice representation of byte array and its size

How would you represent a byte array and its size nicely? I'd like to store (in main memory or within a file) raw byte arrays (unsigned chars) in which the first 2 or 4 bytes represent the size. But operations on such an array do not look good:
void func(unsigned char *bytearray)
{
int size;
memcpy(&size, bytearray, sizeof(int));
//rest of operation when we know bytearray size
}
How can I avoid that? I think about a simple structure:
struct bytearray
{
int size;
unsigned char *data;
};
bytearray *b = reinterpret_cast<bytearray*>(new unsigned char[10]);
b->data = reinterpret_cast<unsigned char*>(&(b->size) + 1);
And I've got access to the size and data parts of bytearray. But it still looks ugly. Could you recommend another approach?
Unless you have some overwhelming reason to do otherwise, just do the idiomatic thing and use std::vector<unsigned char>.
You're effectively re-inventing the "Pascal string". However
b->data = reinterpret_cast<unsigned char*>(&(b->size) + 1);
won't work at all, because the pointer points to itself, and the pointer will get overwritten.
You should be able to use an array with unspecified size for the last element of a structure (this is a C99 flexible array member, which many C++ compilers accept as an extension, but it is not standard C++):
struct bytearray
{
    int size;
    unsigned char data[];
};
bytearray *b = reinterpret_cast<bytearray*>(::operator new(sizeof (bytearray) + 10));
b->size = 10;
//...
::operator delete(b);
Unlike std::vector, this actually stores the size and data together, so you can, for example, write it to a file in one operation. And memory locality is better.
Still, the fact that std::vector is already tested and many useful algorithms are implemented for you makes it very attractive.
I would use std::vector<unsigned char> to manage the memory, and write a conversion function to create some iovec like structure for you at the time that you need such a thing.
iovec make_iovec (std::vector<unsigned char> &v) {
    iovec iv = { v.data(), v.size() };   // data() is also valid for an empty vector
    return iv;
}
Using iovec, if you need to write both the length and data in a single system call, you can use the writev call to accomplish it.
ssize_t write_vector(int fd, std::vector<unsigned char> &v) {
    uint32_t len = htonl(v.size());      // length prefix in network byte order
    iovec iv[2] = { { &len, sizeof(uint32_t) }, make_iovec(v) };
    return writev(fd, iv, 2);
}
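If the length prefix really has to sit in the same contiguous buffer as the data (as in the question), a std::vector can still own the memory. The following is only a sketch: pack and unpack are hypothetical names, and the 4-byte big-endian prefix is an assumed layout, written byte by byte so it does not depend on the host's byte order or on int being 4 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns [4-byte big-endian length][payload] as one contiguous buffer.
std::vector<unsigned char> pack(const std::vector<unsigned char>& payload)
{
    assert(payload.size() <= 0xFFFFFFFFu); // length must fit in 32 bits
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::vector<unsigned char> out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<unsigned char>((n >> 24) & 0xFF));
    out.push_back(static_cast<unsigned char>((n >> 16) & 0xFF));
    out.push_back(static_cast<unsigned char>((n >> 8) & 0xFF));
    out.push_back(static_cast<unsigned char>(n & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Extracts the payload from a packed buffer; returns false if it is truncated.
bool unpack(const std::vector<unsigned char>& packed, std::vector<unsigned char>& payload)
{
    if (packed.size() < 4) return false;
    const std::uint32_t n = (std::uint32_t(packed[0]) << 24)
                          | (std::uint32_t(packed[1]) << 16)
                          | (std::uint32_t(packed[2]) << 8)
                          |  std::uint32_t(packed[3]);
    if (packed.size() - 4 < n) return false; // fewer bytes than the prefix claims
    payload.assign(packed.begin() + 4,
                   packed.begin() + 4 + static_cast<std::ptrdiff_t>(n));
    return true;
}
```

The packed vector can then be written to a file or socket in one call via data() and size(), which covers the "store size and data together" requirement without a hand-rolled struct.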
