How do you read from a memory buffer in C++?

I am fairly new at C++ and am trying to understand how memory manipulation works. I am used to Java and Python and haven't really been exposed to this.
I am working on a project that has the following structure that doesn't quite make sense to me.
typedef struct
{
size_t size;
char *data;
} data_buffer;
This structure basically acts as a buffer, with a pointer to the data stored within the buffer and the size of the buffer to allow the program to know how large the buffer is when reading from it.
An example of how the program uses the buffer:
data_buffer buffer = {0};
//Manipulate data here so it contains pertinent information
CFile oFile;
oFile.Write(buffer.data, buffer.size);
The program mostly uses 3rd party code to read the data found within the buffer, so I am having trouble finding an example of how this is done. My main question is how do I read the contents of the buffer, given only a pointer to a character and a size? However, I would also like to understand how this actually works. From what I understand, memory is written to, with a pointer to where it starts and the size of the memory, so I should be able to just iterate through the memory locations, grabbing each character from memory and tagging it onto whatever structure I choose to use, like a CString or a string. Yet, I don't understand how to iterate through memory. Can someone help me understand this better? Thanks.

There is no reason you cannot use a std::string or CString to manipulate that data. (Use higher level constructs when they are available to you.)
To get the data into a std::string, use the constructor or assignment operator:
std::string s( buffer.data, buffer.size );
You can even stick it in a std::stringstream so you can treat the data buffer like a file:
std::istringstream ss( s );
int n;
ss >> n;
Things work similarly for the MFC string class.
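Putting those two steps together, a minimal sketch might look like the following. The helper names buffer_to_string and read_leading_int are made up for illustration and are not part of the project in the question.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Same layout as the struct in the question.
struct data_buffer {
    std::size_t size;
    char *data;
};

// Copies exactly buffer.size bytes into a std::string.
// Embedded zero bytes are preserved because the length is passed explicitly.
std::string buffer_to_string(const data_buffer &buffer) {
    if (buffer.data == NULL || buffer.size == 0) {
        return std::string();
    }
    return std::string(buffer.data, buffer.size);
}

// Hypothetical helper: treats the buffer as text and parses a leading integer.
// Returns false if the buffer does not start with a number.
bool read_leading_int(const data_buffer &buffer, int &out) {
    std::istringstream ss(buffer_to_string(buffer));
    ss >> out;
    return !ss.fail();
}
```

Note that the string is a copy: changing it afterwards does not change the original buffer.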
To get data from a string into the buffer, you'll need to copy it over. Ideally, you'll be able to allocate the buffer's memory yourself. Assuming you have data written into a stringstream:
std::ostringstream ss;
ss << name << "," << employee_number;
You can then allocate the space you need using the function that creates the data_buffer object:
function_that_creates_a_data_buffer( buffer, ss.str().size() );
If there is no such function (there ought to be!) you must malloc() or new it yourself, as appropriate:
buffer.size = ss.str().size();
buffer.data = (char*)malloc( buffer.size );
Now just copy it:
ss.str().copy( buffer.data, buffer.size );
If your buffer needs a null-terminator (I have so far assumed it doesn't), make sure to add one to the size you allocate and set the last character to zero.
buffer.size = ss.str().size() + 1;
buffer.data = new char[ buffer.size ];
ss.str().copy( buffer.data, buffer.size );
buffer.data[ buffer.size-1 ] = 0;
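The allocate-then-copy steps above could be wrapped in a pair of helpers, roughly as sketched here. make_buffer and release_buffer are hypothetical names, and the sketch assumes new[]/delete[] is acceptable; if the surrounding project frees buffers with free(), the allocation would have to use malloc() instead.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

struct data_buffer {
    std::size_t size;
    char *data;
};

// Hypothetical helper: allocates buffer.data with new[] and copies the text in.
// Pass with_terminator = true if the consumer expects a C string.
data_buffer make_buffer(const std::string &text, bool with_terminator) {
    data_buffer buffer;
    buffer.size = text.size() + (with_terminator ? 1 : 0);
    // Always allocate at least one byte so data is never a zero-length array.
    buffer.data = new char[buffer.size > 0 ? buffer.size : 1];
    std::memcpy(buffer.data, text.data(), text.size());
    if (with_terminator) {
        buffer.data[buffer.size - 1] = '\0';
    }
    return buffer;
}

// Must match the allocation above: new[] pairs with delete[].
void release_buffer(data_buffer &buffer) {
    delete[] buffer.data;
    buffer.data = NULL;
    buffer.size = 0;
}
```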
Make sure to look at the documentation for the various classes you will use.
Hope this helps.

A variable of type char* is actually a pointer to memory. Your struct contains data which is of type char* so it is a pointer to memory. (I suggest writing char* data instead of char *data, to help keep this clear.)
So you can use it as a starting point to look at your data. You can use another pointer to walk over the buffer.
char* bufferInspectorPointer;
bufferInspectorPointer = buffer.data;
bufferInspectorPointer will now point to the first byte of the buffer's data and
*bufferInspectorPointer
will return the contents of the byte.
bufferInspectorPointer++
will advance the pointer to the next byte in the buffer.
You can do arithmetic with pointers in C++, so
bufferInspectorPointer - buffer.data
will tell you how many bytes you have covered. You can compare it to buffer.size to see how far you have left to go.
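As a rough illustration of the pointer walk described above, here is a sketch of a complete loop. The function names are invented for the example; the struct matches the one in the question.

```cpp
#include <cstddef>
#include <string>

struct data_buffer {
    std::size_t size;
    char *data;
};

// Walks the buffer one byte at a time with a second pointer
// and appends each byte to a std::string.
std::string walk_buffer(const data_buffer &buffer) {
    std::string result;
    const char *end = buffer.data + buffer.size;  // one past the last valid byte
    for (const char *p = buffer.data; p != end; ++p) {
        result += *p;  // *p reads the byte p currently points at
    }
    return result;
}

// The same walk written with an index instead of a moving pointer.
std::size_t count_byte(const data_buffer &buffer, char wanted) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < buffer.size; ++i) {
        if (buffer.data[i] == wanted) {
            ++hits;
        }
    }
    return hits;
}
```

Both forms stop after exactly buffer.size bytes; reading at or beyond buffer.data + buffer.size is undefined behavior.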

Since you tagged this as C++ I'd recommend using algorithms. You can get your iterators by using buffer.data as start and buffer.data + buffer.size as end. So to copy the memory into a std::string you'd do something like so:
std::string str(buffer.data, buffer.data + buffer.size);
Or perhaps to append onto a string:
str.reserve(str.size() + buffer.size);
std::copy(buffer.data, buffer.data + buffer.size, std::back_inserter(str));
Of course you can always choose a different end so long as it's not past buffer.data + buffer.size.
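To illustrate choosing a different end, the sketch below uses std::find to stop at a delimiter. first_field is a hypothetical helper, and the comma-separated layout in the usage is only an assumption about what the buffer might hold.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

struct data_buffer {
    std::size_t size;
    char *data;
};

// Returns the bytes before the first occurrence of delimiter,
// or the whole buffer if the delimiter is absent.
std::string first_field(const data_buffer &buffer, char delimiter) {
    const char *begin = buffer.data;
    const char *end = buffer.data + buffer.size;
    // std::find returns end when nothing matches, so stop never passes end.
    const char *stop = std::find(begin, end, delimiter);
    return std::string(begin, stop);
}
```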

They are using a char array so that you can access each byte of the data buffer, since sizeof(char) is 1 by definition.
Reading the contents of the data buffer depends on the application. If you know how the internal data is encoded, you can write an unpacking function that selects chunks of the char array and converts (or typecasts) them to the target variables.
For example, let's say the data buffer is actually a list of 4-byte integers.
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char const* argv[])
{
//how the data buffer was probably filled
int *a = (int *)malloc(10*sizeof(int));
int i;
for(i=0;i<10;i++) {
a[i] = i;
}
char *data = (char *)a;
//how we could read from the data buffer
int *b = (int *)malloc(10*sizeof(int));
char *p = data;
for(i=0;i<10;i++) {
b[i] = *(int *)p; //read a whole int starting at p, not just the single char *p
printf("got value %d\n",b[i]);
p += sizeof(int);
}
free(a);
free(b);
return 0;
}
Note: That being said, since this is C++, it would be much safer if we could avoid using char pointers and work with strings or vectors. Other answers have explored other options of how to handle such buffers properly in C++.
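A variant of the same unpacking idea that avoids pointer casts is sketched below. It assumes the buffer holds native-endian ints packed back to back, which is an assumption about the encoding rather than something the question states; unpack_ints is a made-up name.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Decodes a byte buffer assumed to hold native-endian ints packed back to back.
// Trailing bytes that do not make up a whole int are ignored.
std::vector<int> unpack_ints(const char *data, std::size_t size) {
    std::vector<int> values(size / sizeof(int));
    if (!values.empty()) {
        // memcpy works even when data is not suitably aligned for int,
        // which a cast to int* does not guarantee.
        std::memcpy(&values[0], data, values.size() * sizeof(int));
    }
    return values;
}
```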

Related

Copy vector<char> into char*

I'm just studying C and C++ programming.
I've searched and can't seem to find an answer that has a decent response. Of course using <string> is much easier but for this task I am REQUIRED to use only clib <string.h> functions; I'm also not allowed to use C++11 functions.
I have the 2 variables below, and want to move the contents of buffer into c.
vector<char> buffer;
char* c = "";
How can I do this easily?
I have this so far but it obviously doesn't work, otherwise I wouldn't be here.
for (int b = 0; b < buffer.size(); b++)
{
c += &buffer[b];
}
The simplest way I can think of is;
std::vector<char> buffer;
// some code that places data into buffer
char *c = new char[buffer.size()];
std::copy(buffer.begin(), buffer.end(), c);
// use c
delete [] c;
std::copy() is available in the standard header <algorithm>.
This assumes the code that places data into buffer explicitly takes care of inserting any trailing characters with value zero ('\0') into the buffer. Without that, subsequent usage of c cannot assume the presence of the '\0' terminator.
If you want to ensure a trailing '\0' is present in c even if buffer does not contain one, then one approach is;
std::vector<char> buffer;
// some code that places data into buffer
char *c = new char[buffer.size() + 1]; // additional room for a trailing '\0'
std::copy(buffer.begin(), buffer.end(), c);
c[buffer.size()] = '\0';
// use c
delete [] c;
One could also be sneaky and use another vector;
std::vector<char> container;
// some code that places data into container
std::vector<char> v(container); // v is a copy of container
v.push_back('\0'); // if we need to ensure a trailing '\0'
char *c = &v[0];
// use c like a normal array of char
As long as the code that uses c does not do anything that will resize v, the usage of c in this case is exactly equivalent to the preceding examples. This has an advantage that v will be released when it passes out of scope (no need to remember to delete anything) but a potential disadvantage that c cannot be used after that point (since it will be a dangling pointer).
First, allocate space for the data by assigning c = new char[buffer.size()];
Then use memcpy to copy the data: memcpy(c, buffer.data(), buffer.size())
Your for loop would work in place of memcpy, too.
Also note that if vector<char> stays in place all the time when you use char*, and you are allowed to change the content of the vector, you could simply use the data behind the vector with a simple assignment, like this:
char *c = buffer.data();
I'm noticing some weird behavior: when I create my char* of the given size, it comes out bigger, with some random "hereýýýý««««««««" values after my word.
It looks like you do need a null-terminated C string after all. In this case you need to allocate one extra character at the end, and set it to zero:
char *c = new char[buffer.size()+1];
memcpy(c, buffer.data(), buffer.size());
c[buffer.size()] = 0;
You can do it in this way:
vector<char> buffer;
//I am assuming that buffer has some data
char *c = new char[buffer.size()+1];
size_t i; //declared outside the loop so it is still in scope for the terminator
for( i=0; i<buffer.size(); i++ )
c[i] = buffer[i];
c[i] = '\0';
buffer.clear();
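For the stated constraints (only <string.h> functions, no C++11), the copy could be packaged as a small function along these lines. copy_to_cstring is a hypothetical name; &buffer[0] is used instead of buffer.data() because vector::data() was added in C++11.

```cpp
#include <string.h>
#include <vector>

// Returns a new[]-allocated, null-terminated copy of the vector's contents.
// The caller owns the result and must release it with delete[].
char *copy_to_cstring(const std::vector<char> &buffer) {
    char *c = new char[buffer.size() + 1];  // +1 for the trailing '\0'
    if (!buffer.empty()) {
        memcpy(c, &buffer[0], buffer.size());
    }
    c[buffer.size()] = '\0';
    return c;
}
```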

How to make an array to store char arrays of variable size?

I need an array to store char arrays of variable size. I could use vectors or anything else, but unfortunately this is for a MPI project and I am forced to use an array so I can send it using MPI::COMM_WORLD.Send(...) function.
My idea comes from this link.
This is a simplified example of the problem I have:
char* arrayStorage[3]; //I want to store 3 char arrays of variable size!
int index = 0;
char array_1[RANDOM_SIZE] = {.....};
char array_2[RANDOM_SIZE] = {.....};
char array_3[RANDOM_SIZE] = {.....};
arrayStorage[index] = array_1;
index++;
arrayStorage[index] = array_2;
index++;
arrayStorage[index] = array_3;
index++;
I have also seen people talking about malloc and stuff like that, but I don't know much about pointers. If I use malloc, I have to call free, and I don't know where, so I am avoiding that for now.
This code obviously doesn't work, array_1, array_2, array_3 are all OK, but when I try to access them I get garbage. The problem seems to be inside the index variable. Maybe I shouldn't be doing index++, perhaps I should be doing index += RANDOM_SIZE, but that also fails.
How can I store variable size char arrays in an array?
Use malloc and free (or new and delete in C++). You can do it with vectors too - as vectors can be treated as arrays.
const char *str = "hello world";
// need the +1 for null character
arrayStorage[0] = (char *)malloc (strlen(str) + 1);
strcpy(arrayStorage[0], str);
...
free(arrayStorage[0]);
with new/delete
arrayStorage[0] = new char[strlen(str)+1];
strcpy(arrayStorage[0], str);
...
delete[] arrayStorage[0]; // memory from new[] must be released with delete[]
Using vector and std::string is the correct C++ way, for lots of reasons, including not leaking memory and proper handling of exceptions.
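As a sketch of the vector and std::string route: the strings can live in a std::vector<std::string>, and, because send-style functions generally want one contiguous block, they can be flattened on demand. The pack function and its length list are illustrative assumptions; the MPI call itself is deliberately left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Packs variable-length strings into one contiguous block and records
// each length so the receiver could split the block apart again.
std::vector<char> pack(const std::vector<std::string> &items,
                       std::vector<int> &lengths) {
    std::vector<char> flat;
    lengths.clear();
    for (std::size_t i = 0; i < items.size(); ++i) {
        flat.insert(flat.end(), items[i].begin(), items[i].end());
        lengths.push_back(static_cast<int>(items[i].size()));
    }
    return flat;
}
```

When flat is non-empty, &flat[0] together with flat.size() gives the raw pointer and byte count that a C-style send function would take, and no manual free or delete is needed.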

Copy unsigned char * to unsigned char*

I need to save packet state for a while.
So I read the packet data, which is represented as unsigned char*, and then I create a record with this data and save the record in a list for a while.
Which is the better way to represent the packet in the record: as char* or as char[]?
How do I copy the read data (unsigned char) to both options:
to unsigned char[] and to unsigned char*?
I need to copy the data because each packet is read into the same char*, so I have to copy the data before saving it for later.
If the packet data is binary I'd prefer using std::vector to store the data, as opposed to one of the C strXXX functions, to avoid issues with a potential NULL character existing in the data stream. Most strXXX functions look for NULL characters and truncate their operation. Since the data is not a string, I'd also avoid std::string for this task.
std::vector<unsigned char> v( buf, buf + datalen );
The vector constructor will copy all the data from buf[0] to buf[datalen - 1] and will deallocate the memory when the vector goes out of scope. You can get a pointer to the underlying buffer using v.data() or &v[0].
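Applied to the "save it in a list for a while" part of the question, that might look like the sketch below. PacketRecord and save_packet are hypothetical names; a real record would presumably carry more fields.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical record type: it owns its own copy of the packet bytes.
struct PacketRecord {
    std::vector<unsigned char> bytes;
};

// Copies datalen bytes out of buf, so buf can be reused for the next read.
void save_packet(std::list<PacketRecord> &records,
                 const unsigned char *buf, std::size_t datalen) {
    PacketRecord record;
    record.bytes.assign(buf, buf + datalen);  // zero bytes are copied like any other
    records.push_back(record);
}
```

Each record releases its memory when it is erased from the list, so there is no separate length field or delete[] to keep track of.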
So, it sounds like you need to save the data from multiple packets in a list until some point in the future.
If it was me, I'd use std::string or std::vector normally because that removes allocation issues and is generally plenty fast.
If you do intend to use char* or char[], then you'd want to use char*. Declaring a variable like "char buf[1024];" allocates it on the stack, which means that when that function returns it goes away. To save it in a list, you'd need to dynamically allocate it, so you would do something like "char *buf = new char[packet.size];" and then copy the data and store the pointer and the length of the data in your list (or, as I said before, use std::string which avoids keeping the length separately).
How do you copy the data?
Probably memcpy. The strcpy function would have problems with data which can have nul characters in it, which is common in networking situations. So, something like:
char *buf = new char[packet_length];
memcpy(buf, packet_data, packet_length);
// Put buf and packet_length into a structure in your list.

Avoid overwriting on array

Is there any way to avoid overwriting an array? In my implementation I have to write data to a buffer/array of fixed size, say buff[100], and I reuse buff[100] whenever I want to output data. The next time I write to buff[100], the new data should be appended rather than overwrite what is already there.
Maintain an index into the array. When the length of the data you want to write plus the index is greater than or equal to 100, write out the buffer and the data. Otherwise, shove the data into the buffer at that offset and add the length of the data to the index.
For example, assuming that the following variables are in scope:
#define BUFFER_LENGTH 100
char buffer[BUFFER_LENGTH];
int buffer_index;
int output_fd;
You could have a function like this:
void write_buffered(char *data, int data_length)
{
if (data_length + buffer_index >= BUFFER_LENGTH) {
write(output_fd, buffer, buffer_index);
write(output_fd, data, data_length);
buffer_index = 0;
return;
}
memcpy(&buffer[buffer_index], data, data_length);
buffer_index += data_length;
}
This is written C-style because I know C better than C++, but the basic principles are sound. Obviously, avoid the use of global variables and alter the write() calls to whatever call you are already using.
Since you mention C++, why don't you use a std::vector or similar container? It would be much simpler and less error-prone.

Dereferencing Variable Size Arrays in Structs

Structs seem like a useful way to parse a binary blob of data (ie a file or network packet). This is fine and dandy until you have variable size arrays in the blob. For instance:
struct nodeheader{
int flags;
int data_size;
char data[];
};
This allows me to find the last data character:
nodeheader b;
cout << b.data[b.data_size-1];
Problem being, I want to have multiple variable length arrays:
struct nodeheader{
int friend_size;
int data_size;
char data[];
char friend[];
};
I'm not manually allocating these structures. I have a file like so:
char file_data[1024];
nodeheader* node = &(file_data[10]);
As I'm trying to parse a binary file (more specifically a class file). I've written an implementation in Java (which was my class assignment); now I'm doing a personal version in C++ and was hoping to get away without having to write 100 lines of code. Any ideas?
Thanks,
Stefan
You cannot have multiple variable sized arrays. How should the compiler at compile time know where friend[] is located? The location of friend depends on the size of data[] and the size of data is unknown at compile time.
This is a very dangerous construct, and I'd advise against it. You can only include a variable-length array in a struct when it is the LAST element, and when you do so, you have to make sure you allocate enough memory, e.g.:
nodeheader *nh = (nodeheader *)malloc(sizeof(nodeheader) + max_data_size);
What you want to do is just use regular dynamically allocated arrays:
struct nodeheader
{
char *data;
size_t data_size;
char *friend_data; // renamed: 'friend' is a reserved word in C++
size_t friend_size;
};
nodeheader AllocNodeHeader(size_t data_size, size_t friend_size)
{
nodeheader nh;
nh.data = (char *)malloc(data_size); // check for NULL return
nh.data_size = data_size;
nh.friend_data = (char *)malloc(friend_size); // check for NULL return
nh.friend_size = friend_size;
return nh;
}
void FreeNodeHeader(nodeheader *nh)
{
free(nh->data);
nh->data = NULL;
free(nh->friend_data);
nh->friend_data = NULL;
}
You can't - at least not in the simple way that you're attempting. The unsized array at the end of a structure is basically an offset to the end of the structure, with no built-in way to find the end.
All the fields are converted to numeric offsets at compile time, so they need to be calculable at that time.
The answers so far are seriously over-complicating a simple problem. Mecki is right about why it can't be done the way you are trying to do it, however you can do it very similarly:
struct nodeheader
{
int friend_size;
int data_size;
};
struct nodefile
{
nodeheader *header;
char *data;
char *friend_data; // renamed: 'friend' is a reserved word in C++
};
char file_data[1024];
// .. file in file_data ..
nodefile file;
file.header = (nodeheader *)&file_data[0];
file.data = (char *)&file.header[1];
file.friend_data = &file.data[file.header->data_size];
For what you are doing you need an encoder/decoder for the format. The decoder takes the raw data and fills out your structure (in your case allocating space for the copy of each section of the data), and the encoder writes raw binary.
(Was 'Use std::vector')
Edit:
On reading feedback, I suppose I should expand my answer. You can effectively fit two variable length arrays in your structure as follows, and the storage will be freed for you automatically when file_data goes out of scope:
struct nodeheader {
std::vector<unsigned char> data;
std::vector<unsigned char> friend_buf; // 'friend' is a keyword!
// etc...
};
nodeheader file_data;
Now file_data.data.size(), etc. gives you the length, and &file_data.data[0] gives you a raw pointer to the data if you need it.
You'll have to fill file data from the file piecemeal - read the length of each buffer, call resize() on the destination vector, then read in the data. (There are ways to do this slightly more efficiently. In the context of disk file I/O, I'm assuming it doesn't matter).
Incidentally OP's technique is incorrect even for his 'fine and dandy' cases, e.g. with only one VLA at the end.
char file_data[1024];
nodeheader* node = &(file_data[10]);
There's no guarantee that file_data is properly aligned for the nodeheader type. Prefer to obtain file_data by malloc() - which guarantees to return a pointer aligned for any type - or else (better) declare the buffer to be of the correct type in the first place:
struct biggestnodeheader {
int flags;
int data_size;
char data[ENOUGH_SPACE_FOR_LARGEST_HEADER_I_EVER_NEED];
};
biggestnodeheader file_data;
// etc...
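Another way around the alignment problem, if the bytes have to stay in a plain char array, is to copy the fixed-size fields out with memcpy instead of casting the pointer. The sketch below assumes the header fields are stored in native byte order; fixed_header and read_header are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical struct holding only the fixed-size fields of the header.
struct fixed_header {
    int flags;
    int data_size;
};

// Copies the header out of the byte buffer rather than pointing into it,
// so the offset does not need to be suitably aligned for int.
bool read_header(const char *bytes, std::size_t available,
                 std::size_t offset, fixed_header &out) {
    if (offset > available || available - offset < sizeof(fixed_header)) {
        return false;  // not enough bytes left for a whole header
    }
    std::memcpy(&out, bytes + offset, sizeof(fixed_header));
    return true;
}
```

The variable-length data then starts at offset + sizeof(fixed_header) and can be read as plain chars, which have no alignment requirement.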