How to copy data into certain parts of a byte array

I want to create a byte array out of an unknown struct and add a number additionally in the front of this byte array. How do I do this?
I currently have this code:
template <class T>
void CopterConnection::infoToByteArray(char *&bit_data, size_t *msglen,
T data) {
// Determine which kind of element is in the array, will change in the final code
char typeID = -1;
*msglen = sizeof(data);
*msglen += 1; // take in account of typeID
// Create the pointer to the byte representation of the struct
bit_data = new char[*msglen];
// copy the information from the struct into the byte array
memcpy(bit_data, &data+1, *msglen-1);
bit_data[1] = typeID;
}
But this is not working. I guess I am using memcpy wrong. I want to copy the unknown struct T into the positions bit_data[1] to bit_data[*end*]. What is the best way to achieve this?

One possible problem and one definitive problem:
The possible problem is that array indexing starts at zero. So you should copy to bit_data + 1 to skip over the first byte, and then of course use bit_data[0] to set the type id.
The definitive problem is that &data + 1 is equal to (&data)[1], and that will be out of bounds and lead to undefined behavior. You should just copy from &data.
Putting it all together, the last two lines should be
memcpy(bit_data + 1, &data, *msglen-1);
bit_data[0] = typeID;
There is another possible problem, which depends on what you're doing with the data in bit_data and what T is. If T is not a POD type then you simply can not expect a bitwise copy (what memcpy does) to work very well.
Also if T is a class or structure with members that are pointers then you can't save those to disk or transfer to another computer or even to another process on the same computer.
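The corrected layout (type ID at index 0, struct bytes from index 1 onward) can be sketched with a std::vector<char> in place of the raw new[], which also makes the separate msglen out-parameter unnecessary. This is only a minimal sketch under assumptions: the free-function form of infoToByteArray, the helper payloadFromByteArray and the Position struct are hypothetical names chosen for illustration, and the static_assert encodes the "bitwise copy only works for simple types" caveat from above.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical free-function variant: returns the type ID followed by the
// raw bytes of data.
template <class T>
std::vector<char> infoToByteArray(const T& data, char typeID) {
    // A bitwise copy is only meaningful for trivially copyable types.
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be copied with memcpy");
    std::vector<char> bytes(sizeof(T) + 1);            // 1 extra byte for the type ID
    bytes[0] = typeID;                                 // index 0 holds the type ID
    std::memcpy(bytes.data() + 1, &data, sizeof(T));   // payload starts at index 1
    return bytes;
}

// Hypothetical inverse: recovers the struct from the bytes after the type ID.
template <class T>
T payloadFromByteArray(const std::vector<char>& bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be copied with memcpy");
    assert(bytes.size() == sizeof(T) + 1);
    T data;
    std::memcpy(&data, bytes.data() + 1, sizeof(T));
    return data;
}

// Hypothetical payload type used only for the example.
struct Position {
    float x;
    float y;
};
```

Calling infoToByteArray(Position{1.5f, -2.0f}, 7) would yield sizeof(Position) + 1 bytes, and the vector releases its memory automatically, so the caller has no delete[] to remember.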

There are a few bugs in there, in addition to the fact you are messing around with new.
In the memcpy line itself you use &data + 1 as the source, which here leads to undefined behaviour. The expression adds sizeof(data) bytes to the address of data, so it points just past the object, somewhere on the stack. "One past the end" is a valid pointer as far as pointer arithmetic goes, but nothing you read from it is valid, nor is anything after it.
bit_data[1] is the 2nd character in your buffer.

Related

Length of pointer to pointer

I searched and I didn't find anything that is like my situation. I have a float** and well I know that it is a special type of pointer because it is an array of elements that have a float* that points to another zone of memory. So I write a simple code to detect the length of this matrix, to be more precise the length of the float + elements inside float**; But it results in a segmentation fault.
Here there is my code:
int Loader:: Length(float** length)
{
int count=0;
while(*length[count]!='\0'){
count++;
}
std::cout<<count<<std::endl;
return count;
}
Sorry for my english and sorry for the stupid question. Thanks to all.

I have a float** and well I know that it is a special type of pointer
Not really. A double pointer is just a special case of a single pointer. It is still a pointer to T, and that T happens to be float*.
because it is an array of elements
No! It is not an array. It may point to the first element of an array.
that have a float* that points to another zone of memory.
So, more precisely, a float** may point to the first element of an array full of float*. Where those individual float*s point to is another story.
So I write a simple code to detect the length of this matrix,
You cannot. When all you have is a pointer to the beginning of an array, then the size information is already lost.
That is, unless you have a convention for the last element, like C-style strings or string literals with their '\0' terminator. Which brings us to the next point...
int Loader:: Length(float** length)
{
int count=0;
while(*length[count]!='\0'){
Here's the culprit. Not all arrays are terminated by '\0'. In fact, it's not typical at all for arbitrary arrays to contain a zero terminator.
So unless the array whose first element length points to actually contains an element that compares equal to '\0', your loop will go past the end of the array and try to read from there. At that very moment, undefined behaviour is invoked and your program can do anything, including crashing at random.
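If a terminator convention is wanted, whoever builds the array has to establish it explicitly. A minimal sketch, assuming the array of row pointers is deliberately ended with a nullptr entry (the function name countRows is hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// Counts the row pointers in an array that the caller has explicitly
// terminated with nullptr. Without that sentinel the loop would run
// past the end of the array.
std::size_t countRows(float* const* rows) {
    assert(rows != nullptr);
    std::size_t count = 0;
    while (rows[count] != nullptr) {
        ++count;
    }
    return count;
}
```

Note that the test compares the row pointer itself against nullptr; it never dereferences a row, so the floats inside the rows may hold any value, including zero.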
The best solution to your problem is to use std::vector, because a std::vector always knows its own size. So make it std::vector<float*>. Or better yet, a std::vector<std::vector<float>>.
In fact, if it's really a matrix, then make it std::vector<float>, store all contents contiguously, additionally store the matrix' width somewhere, and always calculate the offset for X/Y.
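The contiguous layout can be sketched as follows. The class name Matrix and its members are assumptions made for illustration; the only point is the row-major offset calculation y * width + x.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix stored in one contiguous std::vector<float>.
class Matrix {
public:
    Matrix(std::size_t width, std::size_t height)
        : width_(width), height_(height), cells_(width * height, 0.0f) {}

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

    // Element at column x, row y: the offset is y * width + x.
    float& at(std::size_t x, std::size_t y) {
        assert(x < width_ && y < height_);
        return cells_[y * width_ + x];
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<float> cells_;
};
```

Because the dimensions travel with the data, no length-detection loop is needed at all.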

The problem is caused by your expectation that all arrays operate in the same way as literal character strings, i.e. that they are automatically terminated by a 0 value. Neither C++ nor C works that way.
If you want the array length you need to do one of the following:
Pass the length along with the array everywhere.
Use a std::vector, std::deque or std::array instead of an array, and get the length from that.

proper way to use GNU/Linux read() function

in the man pages of GNU/Linux the read function is described with following synopsis:
ssize_t read(int fd, void *buf, size_t count);
I would like to use this function to read data from a socket or a serial port. If the count is greater than one, the pointer supplied in the function argument will point to the last byte that was read from the port in the memory so pointer decrement is necessary for bringing the pointer to the first byte of data. This is dangerous because using it in a language like C++ with it's dynamic memory allocation of containers based on their size and space needs could corrupt data at the point of return from read() function. I thought of using a C-style array instead of a pointer. Is this the correct approach? If not, what is the correct way to do this? The programming language I'm using is C++.
EDIT:
The code that caused the described situation is as follows:
QSerialPort class was used to configure and open the port with following parameters:
Baudrate of 115200
8 data bits
No parity
One stop bit
No flow control
and for the reading part as long as the stackoverflow is concerned the read is performed exactly like this:
A std::vector containing a number of structs defined this way:
struct DataMember
{
QString name;
size_t count;
char *buff;
}
then within a while loop until the end of the mentioned std::vector is reached, a read() is performed based on count member variable of the said struct and the data is stored in the same struct's buff:
ssize_t nbytes = read(port->handle(), v.at(i).buff, v.at(i).count);
and then the data is printed on the console. In my test case as long as the data is one byte the value printed is correct but for more than one byte the value displayed is the last value that was read from the port plus some garbage values. I don't know why is this happening. Note that the correct result is obtained when the char *buff is changed to char buff[count].

If the count is greater than one, the pointer supplied in the function argument will point to the last byte that was read from the port in the memory
No. The pointer is passed to the read() method by value, so it is therefore completely and utterly impossible for the value to be any different after the call than it was before, regardless of the count.
so pointer decrement is necessary for bringing the pointer to the first byte of data.
The pointer already points to the first byte of data. No decrement is necessary.
This is dangerous because using it in a language like C++ with it's dynamic memory allocation of containers based on their size and space needs could corrupt data at the point of return from read() function.
This is all nonsense based on an impossibility.
You are mistaken about all this.
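A small experiment makes the point concrete. The sketch below is not the asker's serial-port code: it uses a pipe as a stand-in data source, and the function name readBack is hypothetical. It assumes the message is short enough to fit in the pipe buffer, so a single write() and a single read() are enough here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#include <unistd.h>

// Sends msg through a pipe and reads it back with read().
std::string readBack(const std::string& msg) {
    int fds[2];
    if (pipe(fds) != 0) {
        return std::string();
    }
    ssize_t written = write(fds[1], msg.data(), msg.size());
    close(fds[1]);

    std::vector<char> storage(msg.size() + 1);  // +1 avoids a zero-sized buffer
    char* p = storage.data();
    ssize_t got = read(fds[0], p, msg.size());
    close(fds[0]);

    // read() received a copy of p, so p still points at the first byte.
    assert(p == storage.data());
    if (written < 0 || got < 0) {
        return std::string();
    }
    return std::string(p, static_cast<std::size_t>(got));
}
```

The bytes come back in order starting at p[0], with no pointer adjustment needed after the call.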

In my test case as long as the data is one byte the value printed is correct but for more than one byte the value displayed is the last value that was read from the port plus some garbage values.
From the read(2) manpage:
On success, the number of bytes read is returned (zero indicates end of file),
and the file position is advanced by this number. It is not an error if this number is
smaller than the number of bytes requested; this may happen for example because fewer
bytes are actually available right now (maybe because we were close to end-of-file, or
because we are reading from a pipe, or from a terminal), or because read() was interrupted
by a signal. On error, -1 is returned, and errno is set appropriately. In this case it
is left unspecified whether the file position (if any) changes.
In the case of pipes, sockets and character devices (that includes serial ports) and a blocking file descriptor (default) read will, in practice, not wait for the full count. In your case read() blocks until a byte comes in on the serial port and returns. That is why in the output the first byte is correct and the rest is garbage (uninitialized memory). You have to add a loop around the read() that repeats until count bytes have been read if you need the full count.

I don't know why is this happening.
But I know. char * is just a pointer, but that pointer needs to be initialized to something before you can use it. Without doing so you're invoking undefined behavior and everything might happen.
Instead of the size_t count; and char *buff members you should just use a std::vector<char>. Before making the read call, resize it to the number of bytes you want to read, then take the address of the first element of that vector and pass that to read:
struct fnord {
std::string name;
std::vector<char> data;
};
and use it like this; note that using read requires some additional work to properly deal with signal and error conditions.
size_t readsomething(int fd, size_t count, fnord &f)
{
// resize (not reserve) so that the elements actually exist
f.data.resize(count);
size_t rbytes = 0;
ssize_t rv;
do {
rv = read(fd, &f.data[rbytes], count - rbytes);
if( !rv ) {
// End of File / Stream
break;
}
if( 0 > rv ) {
if( EINTR == errno ) {
// signal interrupted read... restart
continue;
}
if( EAGAIN == errno
|| EWOULDBLOCK == errno ) {
// file / socket is in nonblocking mode and
// no more data is available.
break;
}
// some critical error happened. Deal with it!
break;
}
rbytes += rv;
} while(rbytes < count);
return rbytes;
}
Looking at your first paragraph of gibberish:
If the count is greater than one, the pointer supplied in the function argument will point to the last byte that was read from the port in the memory
What makes you think so? This is not how it works. Most likely you passed some invalid pointer that wasn't properly initialized. Anything can happen.
so pointer decrement is necessary for bringing the pointer to the first byte of data.
Nope. That's not how it works.
This is dangerous because using it in a language like C++ with it's dynamic memory allocation of containers based on their size and space needs could corrupt data at the point of return from read() function.
Nope. That's not how it works!
C and C++ are explicit languages. Everything happens in plain sight and nothing happens without you (the programmer) explicitly requesting it. No memory is allocated without you requesting this to happen. It can either be an explicit new, some RAII, automatic storage or the use of a container. But nothing happens "out of the blue" in C and C++. There's no built-in garbage collection[1] in C or C++. Objects don't move around in memory or resize without you explicitly coding something into your program that makes this happen.
[1]: There are GC libraries you can use, but those never will stomp onto anything that can be reached by code that's executing. Essentially garbage collector libraries for C and C++ are memory leak detectors, which will free memory that can no longer be reached by normal program flow.

Storing dynamic length data 'inside' structure

Problem statement : User provides some data which I have to store inside a structure. This data which I receive come in a data structure which allows user to dynamically add data to it.
Requirement: I need a way to store this data 'inside' the structure, contiguously.
eg. Suppose user can pass me strings which I have to store. So I wrote something like this :
void pushData( string userData )
{
struct
{
string junk;
} data;
data.junk = userData;
}
Problem : When I do this kind of storage, actual data is not really stored 'inside' the structure because string is not POD. Similar problem comes when I receive vector or list.
Then I could do something like this :
void pushData( string userData )
{
struct
{
char junk[100];
} data;
// Copy userdata into array junk
}
This store the data 'inside' the structure, but then, I can't put an upper limit on the size of string user can provide.
Can someone suggest some approach ?
P.S. : I read something about serializability, but couldnt really make out clearly if it could be helpful in my case. If it is the way to go forward, can someone give idea how to proceed with it ?
Edit :
No this is not homework.
I have written an implementation which can pass this kind of structure over message queues. It works fine with PODs, but I need to extend it to pass on dynamic data as well.
This is how message queue takes data:
i. Give it a pointer and tell the size till which it should read and transfer data.
ii. For plain old data types, data is store inside the structure, I can easily pass on the pointer of this structure to message queue to other processes.
iii. But in case of vector/string/list etc, actual data is not inside the structure and thus if I pass on the pointer of this structure, message queue will not really pass on the actual data, but rather the pointers which would be stored inside this structure.
You can see this and this. I am trying to achieve something similar.

void pushData( string userData )
{
struct Data
{
char junk[1];
};
Data* data = static_cast<Data*>(malloc(sizeof(Data) + userData.size())); // C++ needs the cast
memcpy(data->junk, userData.data(), userData.size());
data->junk[userData.size()] = '\0'; // assuming you want null termination
}
Here we use an array of length 1, but we allocate the struct using malloc so it can actually have any size we want.

You ostensibly have some rather artificial constraints, but to answer the question: for a single struct to contain a variable amount of data is not possible... the closest you can come is to have the final member be say char [1], put such a struct at the start of a variably-sized heap region, and use the fact that array indexing is not checked to access memory beyond that character. To learn about this technique, see http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html (or the answer John Zwinck just posted)
Another approach is e.g. template <size_t N> struct X { char data_[N]; };, but each instantiation will be a separate struct type, and you can't pre-instantiate every size you might want at run-time (given you've said you don't want an upper bound). Even if you could, writing code that handles different instantiations as the data grows would be nightmarish, as would the code bloat caused.
Having a structure in one place with a string member with data in another place is almost always preferable to the hackery above.
Taking a hopefully-not-so-wild guess, I assume your interest is in serialising the object based on starting address and size, in some generic binary block read/write...? If so, that's still problematic even if your goal were satisfied, as you need to find out the current data size from somewhere. Writing struct-specific serialisation routines that incorporates the variable-length data on the heap is much more promising.
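A struct-specific serialisation routine of the kind suggested above can be sketched as follows. Everything here is an assumption made for illustration: the Message struct, the function names, and the wire layout (a 4-byte id, a 4-byte length, then the string bytes, all in the native byte order of the machine). It is meant for passing a flat block between processes on the same machine, not as a portable format.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical message with one fixed-size and one variable-length field.
struct Message {
    std::int32_t id;
    std::string text;
};

// Assumed layout: [id: 4 bytes][text length: 4 bytes][text bytes].
std::vector<char> serialize(const Message& m) {
    const std::uint32_t len = static_cast<std::uint32_t>(m.text.size());
    std::vector<char> out(sizeof(m.id) + sizeof(len) + len);
    char* p = out.data();
    std::memcpy(p, &m.id, sizeof(m.id));
    p += sizeof(m.id);
    std::memcpy(p, &len, sizeof(len));
    p += sizeof(len);
    std::memcpy(p, m.text.data(), len);
    return out;
}

// Rebuilds the message from a block produced by serialize().
Message deserialize(const std::vector<char>& in) {
    Message m;
    std::uint32_t len = 0;
    assert(in.size() >= sizeof(m.id) + sizeof(len));
    const char* p = in.data();
    std::memcpy(&m.id, p, sizeof(m.id));
    p += sizeof(m.id);
    std::memcpy(&len, p, sizeof(len));
    p += sizeof(len);
    assert(in.size() == sizeof(m.id) + sizeof(len) + len);
    m.text.assign(p, len);
    return m;
}
```

The resulting vector is one contiguous block whose data() pointer and size() can be handed to an interface that takes "a pointer and a size", which is what the message queue in the question expects.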

Simple solution: estimate the maximum size of the data (for example 1000) and use a fixed-size buffer. This avoids freeing and re-allocating memory of a new size, and the memory fragmentation that comes with it, when pushData is called many times.
#define MAX_SIZE 1000
void pushData( string userData )
{
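struct Data
{
char junk[MAX_SIZE];
} data;
// truncate input that does not fit, leaving room for the terminator
size_t n = userData.size() < MAX_SIZE - 1 ? userData.size() : MAX_SIZE - 1;
memcpy(data.junk, userData.data(), n);
data.junk[n] = '\0'; // assuming you want null termination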
}

As mentioned by John Zwinck....you can use dynamic memory allocation to solve your problem.
void pushData( string userData )
{
struct Data
{
char *junk;
};
struct Data *d = (struct Data *)calloc(1, sizeof(struct Data));
d->junk = (char *)malloc(userData.size() + 1); // +1 for the terminator
strcpy(d->junk, userData.c_str());
}

How to convert u_char* to char[] in C

I am working with snmp and the requests->requestvb->val.string function returns me a u_char* and I am trying to store that into a char[255].
u_char newValue = *(requests->requestvb->val.string)
char myArray[255];
I have tried a few approaches to copy the contents of newValue into myArray but everything seems to segfault. What am I doing wrong?
I have tried
memcpy(myArray, newValue);
Another attempt strncopy(myArray, newValue, sizeof(myArray));
What am I doing wrong?

Your newValue is a single u_char (one character), and for all intents and purposes, your myArray is of type char*.
First off, I'm going to assume that you're using memcpy correctly, and that you're passing in 3 parameters instead of 2, where the 3rd parameter is the same as the one you use in strncpy.
When you try using strncpy or memcpy, you're going beyond the one character "limit" in newValue when attempting to copy everything to myArray.
The fix should be quite simple:
u_char* newValue = requests->requestvb->val.string;
Once you've done that, this should work. Of course, that's assuming that the data being copied actually fits into the 255 bytes of myArray :)
As a side note (and this should go without saying), please make sure that your myArray has a null terminating character at the end if you ever plan on printing it. Not having one after performing copy operations, and then trying to print is a very common mistake and can also lead to seg faults.
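The bounded copy with guaranteed termination can be sketched as below. The helper name copyToCharArray is hypothetical, u_char is assumed to be the usual typedef for unsigned char, and the source length srcLen is assumed to be known by the caller; nothing here is taken from the SNMP library itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies at most dstSize - 1 bytes from src into dst and always
// null-terminates dst. Returns the number of bytes copied.
std::size_t copyToCharArray(char* dst, std::size_t dstSize,
                            const unsigned char* src, std::size_t srcLen) {
    assert(dst != nullptr && src != nullptr && dstSize > 0);
    const std::size_t n = srcLen < dstSize - 1 ? srcLen : dstSize - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';  // safe to print afterwards
    return n;
}
```

With char myArray[255], a call would look like copyToCharArray(myArray, sizeof(myArray), newValue, len), where newValue is the u_char* from the fix above and len is the caller-supplied length.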

How do I fit a variable sized char array in a struct?

I don't understand how the reallocation of memory for a struct allows me to insert a larger char array into my struct.
Struct definition:
typedef struct props
{
char northTexture[1];
char southTexture[1];
char eastTexture[1];
char westTexture[1];
char floorTexture[1];
char ceilingTexture[1];
} PROPDATA;
example:
void function SetNorthTexture( PROPDATA* propData, char* northTexture )
{
if( strlen( northTexture ) != strlen( propData->northTexture ) )
{
PROPDATA* propPtr = (PROPDATA*)realloc( propData, sizeof( PROPDATA ) +
sizeof( northTexture ) );
if( propPtr != NULL )
{
strcpy( propData->northTexture, northTexture );
}
}
else
{
strcpy( propData->northTexture, northTexture );
}
}
I have tested something similar to this and it appears to work, I just don't understand how it does work. Now I expect some people are thinking "just use a char*" but I can't for whatever reason. The string has to be stored in the struct itself.
My confusion comes from the fact that I haven't resized my struct for any specific purpose. I haven't somehow indicated that I want the extra space to be allocated to the north texture char array in that example. I imagine the extra bit of memory I allocated is used for actually storing the string, and somehow when I call strcpy, it realises there is not enough space...
Any explanations on how this works (or how this is flawed even) would be great.

Is this C or C++? The code you've posted is C, but if it's actually C++ (as the tag implies) then use std::string. If it's C, then there are two options.
If (as you say) you must store the strings in the structure itself, then you can't resize them. C structures simply don't allow that. That "array of size 1" trick is sometimes used to bolt a single variable-length field onto the end of a structure, but can't be used anywhere else because each field has a fixed offset within the structure. The best you can do is decide on a maximum size, and make each an array of that size.
Otherwise, store each string as a char*, and resize with realloc.
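The "decide on a maximum size" option can be sketched like this. The bound kMaxTextureName, the reduced PropData struct with only two fields, and the helper setTexture are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed upper bound on a texture name, including the terminator.
const std::size_t kMaxTextureName = 64;

// Every string lives inside the struct, at a fixed offset.
struct PropData {
    char northTexture[kMaxTextureName];
    char southTexture[kMaxTextureName];
};

// Copies name into the field if it fits. Returns false and leaves the
// field unchanged when the name is too long.
bool setTexture(char (&field)[kMaxTextureName], const char* name) {
    assert(name != nullptr);
    const std::size_t len = std::strlen(name);
    if (len >= kMaxTextureName) {
        return false;
    }
    std::memcpy(field, name, len + 1);  // len + 1 includes the terminator
    return true;
}
```

Since sizeof(PropData) never changes, no realloc is involved and writing one field cannot spill into the next.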

This answer is not to promote the practice described below, but to explain things. There are good reasons not to use malloc, and the suggestions in other answers to use std::string are valid.
I think You have come across the trick used, for example, by Microsoft to avoid the cost of a pointer dereference. In the case of Unsized Arrays in Structures (please check the link) it relies on a non-standard extension to the language. You can use a trick like that even without the extension, but only for the struct member that is positioned at its end in memory. Usually the last member in the structure declaration is also the last in memory, but check this question to know more about it. For the trick to work, You also have to make sure the compiler won't add padding bytes at the end of the structure.
The general idea is like this: Suppose You have a structure with an array at the end like
struct MyStruct
{
int someIntField;
char someStr[1];
};
When allocating on the heap, You would normally say something like this
MyStruct* msp = (MyStruct*)malloc(sizeof(MyStruct));
However, if You allocate more space than Your struct actually occupies, You can reference the bytes that are laid out in memory right behind the struct with "out of bounds" access to the array elements. Assuming some typical sizes for the int and the char, and lack of padding bytes at the end, if You write this:
MyStruct* msp = (MyStruct*)malloc(sizeof(MyStruct) + someMoreBytes);
The memory layout should look like:
| msp | msp+1 | msp+2 | msp+3 |   msp+4    | msp+5 | msp+6 | ... |
| <-     someIntField      -> | someStr[0] | <- someMoreBytes  -> |
In that case, You can reference the byte at the address msp+6 like this:
msp->someStr[2];

strcpy is not that intelligent, and it is not really working.
The call to realloc() allocates enough space for the string - so it doesn't actually crash but when you strcpy the string to propData->northTexture you may be overwriting anything following northTexture in propData - propData->southTexture, propData->westTexture etc.
For example, if you called SetNorthTexture(prop, "texture");
and printed out the different textures then you would probably find that:
northTexture is "texture"
southTexture is "exture"
eastTexture is "xture" etc (assuming that the arrays are byte aligned).
Assuming you don't want to statically allocate char arrays big enough to hold the largest strings, and if you absolutely must have the strings in the structure then you can store the strings one after the other at the end of the structure. Obviously you will need to dynamically malloc your structure to have enough space to hold all the strings + offsets to their locations.
This is very messy and inefficient as you need to shuffle things around if strings are added, deleted or changed.

My confusion comes from the fact that
I haven't resized my struct for any
specific purpose.
In low level languages like C there is some kind of distinction between structs (or types in general) and actual memory. Allocation basically consists of two steps:
Allocation of raw memory buffer of right size
Telling the compiler that this piece of raw bytes should be treated as a structure
When you do realloc, you do not change the structure, but you change the buffer it is stored in, so you can use extra space beyond structure.
Note that, although your program will not crash, it's not correct. When you put text into northTexture, you will overwrite other structure fields.

NOTE: This has no char array example but it is the same principle. It is just a guess of mine of what are you trying to achieve.
My opinion is that you have seen somewhere something like this:
typedef struct tagBITMAPINFO {
BITMAPINFOHEADER bmiHeader;
RGBQUAD bmiColors[1];
} BITMAPINFO, *PBITMAPINFO;
What you are trying to obtain can happen only when the array is at the end of the struct (and only one array).
For example you allocate sizeof(BITMAPINFO)+15*sizeof(RGBQUAD) when you need to store 16 RGBQUAD structures (1 from the structure and 15 extra).
PBITMAPINFO info = (PBITMAPINFO)malloc(sizeof(BITMAPINFO)+15*sizeof(RGBQUAD));
You can access all the RGBQUAD structures like they are inside the BITMAPINFO structure:
info->bmiColors[0]
info->bmiColors[1]
...
info->bmiColors[15]
You can do something similar to an array declared as char bufStr[1] at the end of a struct.
Hope it helps.

One approach to keeping a struct and all its strings together in a single allocated memory block is something like this:
struct foo {
ptrdiff_t s1, s2, s3, s4;
size_t bufsize;
char buf[1];
} bar;
Allocate sizeof(struct foo)+total_string_size bytes and store the offsets to each string in the s1, s2, etc. members and bar.buf+bar.s1 is then a pointer to the first string, bar.buf+bar.s2 a pointer to the second string, etc.
You can use pointers rather than offsets if you know you won't need to realloc the struct.
Whether it makes sense to do something like this at all is debatable. One benefit is that it may help fight memory fragmentation or malloc/free overhead when you have a huge number of tiny data objects (especially in threaded environments). It also reduces error handling cleanup complexity if you have a single malloc failure to check for. There may be cache benefits to ensuring data locality. And it's possible (if you use offsets rather than pointers) to store the object on disk without any serialization (keeping in mind that your files are then machine/compiler-specific).
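The offset idea can be sketched as below, with one deliberate change: instead of a trailing char buf[1] member, this version places the strings in the bytes directly after the header, so that no code indexes past a one-element array. The names Packed, stringArea and makePacked are hypothetical, and the struct is cut down to two strings to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Header of the block. The string bytes follow it in the same allocation.
struct Packed {
    std::ptrdiff_t s1, s2;  // offsets of each string within the string area
    std::size_t bufsize;    // total number of bytes in the string area
};

// Start of the string area: the first byte after the header.
char* stringArea(Packed* p) {
    return reinterpret_cast<char*>(p) + sizeof(Packed);
}

// Builds header and both strings in a single malloc'ed block.
// Returns nullptr on allocation failure; release the block with std::free.
Packed* makePacked(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    const std::size_t la = std::strlen(a) + 1;  // sizes include terminators
    const std::size_t lb = std::strlen(b) + 1;
    void* raw = std::malloc(sizeof(Packed) + la + lb);
    if (raw == nullptr) {
        return nullptr;
    }
    Packed* p = new (raw) Packed;
    p->s1 = 0;
    p->s2 = static_cast<std::ptrdiff_t>(la);
    p->bufsize = la + lb;
    std::memcpy(stringArea(p) + p->s1, a, la);
    std::memcpy(stringArea(p) + p->s2, b, lb);
    return p;
}
```

Because only offsets are stored, the whole block of sizeof(Packed) + bufsize bytes stays valid after being copied or moved by realloc, which is the reason given above for preferring offsets to pointers.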
