Is there any way to avoid overwriting the array? In my implementation I have to write data to a buffer/array of fixed size, say buff[100], and I reuse buff[100] whenever I want to output data. The next time I write to buff[100], it should append to the existing data rather than overwrite it.
Maintain an index into the array. When the length of the data you want to write plus the index is greater than or equal to 100, write out the buffer and the data. Otherwise, shove the data into the buffer at that offset and add the length of the data to the index.
For example, assuming that the following variables are in scope:
#define BUFFER_LENGTH 100
char buffer[BUFFER_LENGTH];
int buffer_index;
int output_fd;
You could have a function like this:
void write_buffered(char *data, int data_length)
{
    if (data_length + buffer_index >= BUFFER_LENGTH) {
        write(output_fd, buffer, buffer_index);
        write(output_fd, data, data_length);
        buffer_index = 0;
        return;
    }
    memcpy(&buffer[buffer_index], data, data_length);
    buffer_index += data_length;
}
This is written C-style because I know C better than C++, but the basic principles are sound. Obviously, avoid the use of global variables and alter the write() calls to whatever call you are already using.
Since you mention C++, why don't you use a std::vector or similar container? It would be much simpler and less error-prone.
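For instance, here is a minimal sketch of the append-then-flush idea with std::vector (untested; the names and the FILE* target are illustrative, not taken from your code):

#include <cstddef>
#include <cstdio>
#include <vector>

// Growable buffer: appending never overwrites earlier data.
std::vector<char> out_buffer;

void append_data(const char *data, std::size_t length)
{
    out_buffer.insert(out_buffer.end(), data, data + length);
}

void flush_buffer(std::FILE *fp)
{
    if (!out_buffer.empty()) {
        std::fwrite(&out_buffer[0], 1, out_buffer.size(), fp);
        out_buffer.clear();   // ready for the next round of output
    }
}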
I'm trying to do the simplest thing here. I want to create a method that will take in a byte (char) array, inflate it using miniz's tinfl_decompress method, and then return a byte array containing the inflated data.
First things first: the arrays given will never be bigger than 100 kB, and the vast majority will be smaller than 50 kB. Hence, I don't think I need to use any kind of buffer for it. Anyway, this is what I've got:
std::vector<unsigned char> unzip(std::vector<unsigned char> data)
{
    unsigned char *outBuffer = new unsigned char[1024 * 1024];
    tinfl_decompressor inflator;
    tinfl_status status;
    tinfl_init(&inflator);
    size_t inBytes = data.size() - 9;
    size_t outBytes = 1024 * 1024;
    status = tinfl_decompress(&inflator, (const mz_uint8 *)&data[9], &inBytes, outBuffer, (mz_uint8 *)outBuffer, &outBytes, 0);
    return ???
}
I know the output I want begins at the memory location outBuffer points to, but I don't know how long it is (I do happen to know it will be less than 1 MB), so I cannot pack it into a vector and send it on its way. I had hoped that outBytes would hold the size of the output, but it is set to 1 after the decompression. I know that the decompression didn't fail, since the status returned is TINFL_STATUS_DONE (0).
Is this even the right way of doing it? This is a method that will be called a lot in my program, so I want something that is as fast as possible.
How do I get the vector out of it? Should I use a different data type? An array (the [] type)? The decompressed data will be read sequentially only once, after which it will be discarded.
EDIT:
It seems that the file I was trying to decompress was not of the proper format; it was zip, and this takes zlib.
Caveat: Totally untested code.
It should go something like this. Exchange
unsigned char *outBuffer = new unsigned char[1024 * 1024];
for
std::vector<unsigned char> outBuffer(1024 * 1024);
to get a vector. Then call tinfl_decompress using the data() member function to get the vector's underlying buffer. It should look something like this:
status = tinfl_decompress(&inflator,
(const mz_uint8 *)&data[9],
&inBytes,
(mz_uint8 *)outBuffer.data(),
(mz_uint8 *)outBuffer.data(),
&outBytes,
0);
And then resize the vector down to the number of bytes actually stored in it, for convenience later.
outBuffer.resize(outBytes);
Note the vector's allocation will NOT shrink; it will still have a capacity of 1 MiB. If this is a problem, an additional call to std::vector::shrink_to_fit is required.
Finally
return outBuffer;
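Putting the pieces together, the whole function might look something like this (still untested, and it keeps the question's assumptions of a 9-byte header to skip and a 1 MiB output bound):

#include <vector>
#include "miniz.h"   // assumed location of tinfl_* and mz_uint8

std::vector<unsigned char> unzip(const std::vector<unsigned char> &data)
{
    std::vector<unsigned char> outBuffer(1024 * 1024);   // 1 MiB upper bound, per the question

    tinfl_decompressor inflator;
    tinfl_init(&inflator);

    size_t inBytes = data.size() - 9;                    // skip the 9-byte header, as in the question
    size_t outBytes = outBuffer.size();

    tinfl_status status = tinfl_decompress(&inflator,
                                           (const mz_uint8 *)&data[9],
                                           &inBytes,
                                           (mz_uint8 *)outBuffer.data(),
                                           (mz_uint8 *)outBuffer.data(),
                                           &outBytes,
                                           0);
    if (status != TINFL_STATUS_DONE)
        outBytes = 0;            // decompression failed; hand back an empty vector

    outBuffer.resize(outBytes);  // shrink the size; capacity stays at 1 MiB
    return outBuffer;
}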
I am fairly new at C++ and am trying to understand how memory manipulation works. I am used to Java and Python and haven't really been exposed to this.
I am working on a project that has the following structure that doesn't quite make sense to me.
typedef struct
{
    size_t size;
    char *data;
} data_buffer;
This structure basically acts as a buffer, with a pointer to the data stored within the buffer and the size of the buffer to allow the program to know how large the buffer is when reading from it.
An example of how the program uses the buffer:
data_buffer buffer = {0};
//Manipulate data here so it contains pertinent information
CFile oFile;
oFile.Write(buffer.data, buffer.size);
The program mostly uses 3rd party code to read the data found within the buffer, so I am having trouble finding an example of how this is done. My main question is: how do I read the contents of the buffer, given only a pointer to a character and a size?
However, I would also like to understand how this actually works. From what I understand, memory is written to, with a pointer to where it starts and the size of the memory, so I should be able to just iterate through the memory locations, grabbing each character from memory and tagging it onto whatever structure I choose to use, like a CString or a string. Yet, I don't understand how to iterate through memory. Can someone help me understand this better? Thanks.
There is no reason you cannot use a std::string or CString to manipulate that data. (Use higher level constructs when they are available to you.)
To get the data into a std::string, use the constructor or assignment operator:
std::string s( buffer.data, buffer.size );
You can even stick it in a std::stringstream so you can treat the data buffer like a file:
std::istringstream ss( s );
int n;
ss >> n;
Things work similarly for the MFC string class.
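For example, CString also has a constructor that takes an explicit length, which copes with non-text bytes as well (a sketch, assuming a narrow-character buffer):

CString s( buffer.data, (int)buffer.size );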
To get the data from a string, you'll need to copy it over. Ideally, you'll be able to allocate the data's memory. Assuming you have data written into a stringstream:
std::ostringstream ss;
ss << name << "," << employee_number;
You can then allocate the space you need using the function that creates the data_buffer object:
function_that_creates_a_data_buffer( buffer, ss.str().size() );
If there is no such function (there ought to be!) you must malloc() or new it yourself, as appropriate:
buffer.size = ss.str().size();
buffer.data = (char*)malloc( buffer.size );
Now just copy it:
ss.str().copy( buffer.data, buffer.size );
If your buffer needs a null-terminator (I have so far assumed it doesn't), make sure to add one to the size you allocate and set the last character to zero.
buffer.size = ss.str().size() + 1;
buffer.data = new char[ buffer.size ];
ss.str().copy( buffer.data, buffer.size );
buffer.data[ buffer.size-1 ] = 0;
Make sure to look at the documentation for the various classes you will use.
Hope this helps.
A variable of type char* is actually a pointer to memory. Your struct contains data which is of type char* so it is a pointer to memory. (I suggest writing char* data instead of char *data, to help keep this clear.)
So you can use it as a starting point to look at your data. You can use another pointer to walk over the buffer.
char* bufferInspectorPointer;
bufferInspectorPointer = buffer.data;
bufferInspectorPointer will now point to the first byte of the buffer's data and
*bufferInspectorPointer
will return the contents of the byte.
bufferInspectorPointer++
will advance the pointer to the next byte in the buffer.
You can do arithmetic with pointers in C++, so
bufferInspectorPointer - buffer.data
will tell you how many bytes you have covered. You can compare it to buffer.size to see how far you have left to go.
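Putting that together, a small sketch that walks the whole buffer byte by byte could look like this:

const char* p = buffer.data;
const char* end = buffer.data + buffer.size;
while (p != end) {
    char byte = *p;   // the current byte of the buffer
    // ... do something with byte, e.g. append it to a std::string ...
    ++p;              // advance to the next byte
}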
Since you tagged this as C++ I'd recommend using algorithms. You can get your iterators by using buffer.data as start and buffer.data + buffer.size as end. So to copy the memory into a std::string you'd do something like so:
std::string str(buffer.data, buffer.data + buffer.size);
Or perhaps to append onto a string:
str.reserve(str.size() + buffer.size);
std::copy(buffer.data, buffer.data + buffer.size, std::back_inserter(str));
Of course you can always choose a different end, so long as it's not past buffer.data + buffer.size.
They are using a char pointer so that you can access each byte of the data buffer, since a char is 1 byte.
Reading the contents of the data buffer depends on the application. If you know how the internal data is encoded, you can write an unpacking function which selects chunks of the char array and converts/typecasts them to the target variables.
E.g., let's say the data buffer is actually a list of 4-byte integers.
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char const* argv[])
{
    //how the data buffer was probably filled
    int *a = (int *)malloc(10*sizeof(int));
    int i;
    for(i=0;i<10;i++) {
        a[i] = i;
    }
    char *data = (char *)a;

    //how we could read from the data buffer
    int *b = (int *)malloc(10*sizeof(int));
    char *p = data;
    for(i=0;i<10;i++) {
        b[i] = *(int *)p;   /* reinterpret the 4 bytes at p as an int */
        printf("got value %d\n",b[i]);
        p += sizeof(int);
    }

    free(a);
    free(b);
    return 0;
}
Note: That being said, since this is C++, it would be much safer if we could avoid using char pointers and work with strings or vectors. Other answers have explored other options of how to handle such buffers properly in C++.
I'm reading multiple reports from a HID device into an unsigned char array, then trying to copy the data to a std::vector. I'm also writing the data out to a file for hex analysis, and its content appears to be correct when I view it. However, the std::vector doesn't appear to contain the correct data when I dump it to the console.
This is the code:
typedef vector<unsigned char> buffer_t;
buffer_t sendCommand (hid_device *devh, const unsigned char cmd[], int reports) {
    unsigned char outbuf[0x40];
    buffer_t retbuf(0x40 * reports);
    hid_write(devh, cmd, 0x41);
    int i = 0;
    FILE *file = fopen("test.out", "w+b");
    while (i++ < reports) {
        hid_read(devh, outbuf, 0x40);
        fwrite(outbuf, 1, sizeof(outbuf), file);
        retbuf.push_back(*outbuf);
    }
    fclose(file);
    cout << &retbuf[0];
    return retbuf;
}
I have a feeling I'm way off the mark here; I'm fairly new to C/C++, and I've been stuck with this for a while now. Can anyone tell me what I'm doing wrong, or point me in a better direction?
You want to add multiple unsigned char objects to your vector, but push_back only adds one.
So, replace retbuf.push_back(*outbuf); with either:
for (size_t i = 0; i < sizeof(outbuf); ++i) {
    retbuf.push_back(outbuf[i]);
}
or
std::copy(outbuf, outbuf+sizeof(outbuf), std::back_inserter(retbuf));
or
retbuf.insert(retbuf.end(), outbuf, outbuf+sizeof(outbuf));
which all do the same thing.
You create your vector with a certain size:
buffer_t retbuf(0x40 * reports);
but push_back increases the size of the vector by adding an element at the end. You should create it empty:
buffer_t retbuf;
Optionally, you could arrange for the vector to have enough space allocated, ready for the elements you're going to add:
retbuf.reserve(0x40 * reports);
This is purely a performance issue, but sometimes it's a significant issue for large vectors, or vectors of types that (unlike unsigned char) are expensive to copy/move when the vector runs out of internal space and has to allocate more.
A note on style: you repeat the literal value 0x40 a few times, and also use sizeof(outbuf). It's often best to define a constant, and use the name throughout:
const int report_size = 0x40;
This is partly in case the number changes in future, but also it's about the readability of your code -- if someone sees 0x40 they may or may not immediately understand why that is the correct value. If someone sees report_size then they don't know what value that actually is until they look it up, but they do know why you're using that value.
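Putting these suggestions together, a corrected version of the function might look roughly like this (untested; it keeps the hidapi calls from the question, drops the debug file output, and assumes the usual hidapi header location):

#include <vector>
#include <hidapi/hidapi.h>   // header path may differ on your platform

typedef std::vector<unsigned char> buffer_t;

buffer_t sendCommand(hid_device *devh, const unsigned char cmd[], int reports) {
    const int report_size = 0x40;             // one named constant instead of repeated 0x40s

    unsigned char outbuf[report_size];
    buffer_t retbuf;                           // start empty...
    retbuf.reserve(report_size * reports);     // ...but preallocate the space we expect to need

    hid_write(devh, cmd, report_size + 1);     // 0x41 as in the original: report ID byte + data

    for (int i = 0; i < reports; ++i) {
        hid_read(devh, outbuf, report_size);
        // Append the whole report, not just its first byte.
        retbuf.insert(retbuf.end(), outbuf, outbuf + report_size);
    }
    return retbuf;
}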
The problem is in this line: buffer_t retbuf(0x40 * reports); It means that you create a vector with 0x40 * reports elements, filled with the default value for unsigned char (zero). Then push_back() just adds new elements to the end of the vector and doesn't affect the existing elements.
You need to rewrite it this way:
buffer_t retbuf; // Empty vector
retbuf.reserve(0x40 * reports); // Preallocate memory for known element count
This way push_back() will work as expected and add elements to the empty vector from the beginning.
And of course you should push_back() all elements of outbuf, not only the first one (*outbuf).
To put multiple values into the vector at once you can use std::vector's assign function (note that assign replaces the vector's current contents). For example:
std::vector<char> vec1;
char array[3] = {'a', 'b', 'c'};
vec1.assign(array, array + 3);
I am currently working on a project where I had to do this.
Your vector is of a type unsigned char, which means every element of it is of this type. Your outbuf is an array of unsigned chars.
The push_back() only appends one item to the end of the vector, so push_back(*outbuf) will only add the first element of the outbuf to the vector, not all of them.
To put all the data into the vector, you will need to push_back them one-by-one, or use std::copy.
Note that since outbuf is a char array, *outbuf is the first element of the array, because the array decays to a pointer to its first element.
I think you probably wanted to do:
typedef vector<string> buffer_t; // alternatively vector<unsigned char*>
...
retbuf.push_back(outbuf);
...
Or
typedef vector<unsigned char> buffer_t;
...
for (size_t i = 0; i < sizeof(outbuf); i++)
    retbuf.push_back(outbuf[i]);
...
I need to be able to set the size of an array based on the number of bytes in a file.
For example, I want to do this:
// Obtain the file size.
fseek (fp, 0, SEEK_END);
size_t file_size = ftell(fp);
rewind(fp);
// Create the buffer to hold the file contents.
char buff[file_size];
However, I get a compile time error saying that the size of the buffer has to be a constant.
How can I accomplish this?
Use a vector.
std::vector<char> buff(file_size);
The entire vector is filled with '\0' first, automatically. But the performance "loss" might not be noticeable. It's certainly safer and more comfortable. Then access it like a usual array. You may even pass a pointer to the data to legacy C functions:
legacy(&buff[0]); // valid!
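For example, a minimal sketch of reading the whole file into the vector (assuming fp is the FILE* from the question, opened in binary mode):

std::vector<char> buff(file_size);
if (!buff.empty()) {
    size_t bytes_read = fread(&buff[0], 1, buff.size(), fp);
    buff.resize(bytes_read);   // in case the read came up short
}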
You should use a std::vector and not an array.
Real arrays require you to specify their size so that the compiler can create some space for them -- this is why the compiler complains when you don't supply a constant integer. Dynamic arrays are represented by a pointer to the base of the array -- and you have to retrieve the memory for the dynamic array yourself. You may then use the pointer with subscript notation. e.g.,
int *x;
x = (int *) malloc( sizeof(int) *
                    getAmountOfArrayElements() /* non-const result */
                  );
x[5] = 10;
This leads to two types of problems:
Buffer over/under flows: you might subscript-index past either end of the array.
You might forget to release the memory.
Vector provides a nice little interface to hide these problems from you -- if used correctly.
Replace
char buff[file_size];
with
char *buff = new char[file_size];
and once you are done using buff, you can free the memory using:
delete[] buff;
There are two points in your question I'd like to cover.
The actual question: how do you create the array? Johannes answered this: use a std::vector and construct it with the required size.
Your error message: when you declare an array of some type, you must declare it with a constant size. So, for example,
const int FileSize = 1000;
// stuff
char buffer[FileSize];
is perfectly legitimate.
On the other hand, what you did, attempting to declare an array with variable size, and then not allocating with new, generates an error.
The problem is that buff needs to be created on the heap (instead of the stack). The compiler wants to know the exact size in order to create it on the stack.
char* buff = new char[file_size];
Structs seem like a useful way to parse a binary blob of data (ie a file or network packet). This is fine and dandy until you have variable size arrays in the blob. For instance:
struct nodeheader{
    int flags;
    int data_size;
    char data[];
};
This allows me to find the last data character:
nodeheader b;
cout << b.data[b.data_size-1];
Problem being, I want to have multiple variable length arrays:
struct nodeheader{
    int friend_size;
    int data_size;
    char data[];
    char friend[];
};
I'm not manually allocating these structures. I have a file like so:
char file_data[1024];
nodeheader* node = &(file_data[10]);
I'm trying to parse a binary file (more specifically a class file). I've written an implementation in Java (which was my class assignment); now I'm doing a personal version in C++ and was hoping to get away without having to write 100 lines of code. Any ideas?
Thanks,
Stefan
You cannot have multiple variable-sized arrays. How should the compiler know at compile time where friend[] is located? The location of friend[] depends on the size of data[], and the size of data[] is unknown at compile time.
This is a very dangerous construct, and I'd advise against it. You can only include a variable-length array in a struct when it is the LAST element, and when you do so, you have to make sure you allocate enough memory, e.g.:
nodeheader *nh = (nodeheader *)malloc(sizeof(nodeheader) + max_data_size);
What you want to do is just use regular dynamically allocated arrays:
struct nodeheader
{
    char *data;
    size_t data_size;
    char *friend_data;   // can't be called 'friend' -- that's a C++ keyword
    size_t friend_size;
};

nodeheader AllocNodeHeader(size_t data_size, size_t friend_size)
{
    nodeheader nh;
    nh.data = (char *)malloc(data_size);          // check for NULL return
    nh.data_size = data_size;
    nh.friend_data = (char *)malloc(friend_size); // check for NULL return
    nh.friend_size = friend_size;
    return nh;
}

void FreeNodeHeader(nodeheader *nh)
{
    free(nh->data);
    nh->data = NULL;
    free(nh->friend_data);
    nh->friend_data = NULL;
}
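A short usage sketch (illustrative only; the sizes and source pointers would come from whatever parses your file):

#include <string.h>

void example(const char *src_data, size_t data_size,
             const char *src_friend, size_t friend_size)
{
    nodeheader nh = AllocNodeHeader(data_size, friend_size);
    memcpy(nh.data, src_data, nh.data_size);               // copy the first payload in
    memcpy(nh.friend_data, src_friend, nh.friend_size);    // copy the second payload in
    // ... use nh.data / nh.friend_data ...
    FreeNodeHeader(&nh);
}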
You can't - at least not in the simple way that you're attempting. The unsized array at the end of a structure is basically an offset to the end of the structure, with no built-in way to find the end.
All the fields are converted to numeric offsets at compile time, so they need to be calculable at that time.
The answers so far are seriously over-complicating a simple problem. Mecki is right about why it can't be done the way you are trying to do it, however you can do it very similarly:
struct nodeheader
{
    int friend_size;
    int data_size;
};

struct nodefile
{
    nodeheader *header;
    char *data;
    char *friend_data;   // 'friend' is a C++ keyword, so use another name
};

char file_data[1024];
// .. file in file_data ..

nodefile file;
file.header = (nodeheader *)&file_data[0];
file.data = (char *)&file.header[1];
file.friend_data = &file.data[file.header->data_size];
For what you are doing you need an encoder/decoder for the format. The decoder takes the raw data and fills out your structure (in your case allocating space for the copy of each section of the data), and the encoder writes the raw binary back out.
(Was 'Use std::vector')
Edit:
On reading feedback, I suppose I should expand my answer. You can effectively fit two variable length arrays in your structure as follows, and the storage will be freed for you automatically when file_data goes out of scope:
struct nodeheader {
    std::vector<unsigned char> data;
    std::vector<unsigned char> friend_buf; // 'friend' is a keyword!
    // etc...
};
nodeheader file_data;
Now file_data.data.size(), etc. gives you the length, and &file_data.data[0] gives you a raw pointer to the data if you need it.
You'll have to fill file_data from the file piecemeal: read the length of each buffer, call resize() on the destination vector, then read in the data, as sketched below. (There are ways to do this slightly more efficiently; in the context of disk file I/O, I'm assuming it doesn't matter.)
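Here's a sketch of that piecemeal reading, assuming the on-disk layout matches the struct in the question (friend_size, then data_size, then the two payloads) and an already-open binary std::ifstream:

#include <fstream>
#include <vector>

bool read_nodeheader(std::ifstream &in, nodeheader &out)
{
    int friend_size = 0, data_size = 0;
    // Read the two length fields first (binary, native endianness assumed).
    if (!in.read(reinterpret_cast<char *>(&friend_size), sizeof(friend_size))) return false;
    if (!in.read(reinterpret_cast<char *>(&data_size), sizeof(data_size))) return false;

    // Size the vectors, then read each payload straight into them.
    out.data.resize(data_size);
    out.friend_buf.resize(friend_size);
    if (data_size && !in.read(reinterpret_cast<char *>(&out.data[0]), data_size)) return false;
    if (friend_size && !in.read(reinterpret_cast<char *>(&out.friend_buf[0]), friend_size)) return false;
    return true;
}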
Incidentally, the OP's technique is incorrect even for his 'fine and dandy' cases, e.g. with only one variable-length array at the end.
char file_data[1024];
nodeheader* node = &(file_data[10]);
There's no guarantee that file_data is properly aligned for the nodeheader type. Prefer to obtain file_data by malloc() - which guarantees to return a pointer aligned for any type - or else (better) declare the buffer to be of the correct type in the first place:
struct biggestnodeheader {
    int flags;
    int data_size;
    char data[ENOUGH_SPACE_FOR_LARGEST_HEADER_I_EVER_NEED];
};
biggestnodeheader file_data;
// etc...