I have a class that wraps C functions for reading and writing data using file descriptors.
I'm currently stuck at the read method.
I want to create a read method that wraps the C function ssize_t read(int fd, void *buf, size_t count);
The function above uses void *buf as an output and returns the number of bytes written in the buffer.
I want to have a method read that would return a variable size object that would contain that data or nullptr if no data was read.
What is the best way to do that?
EDIT: I already have a char array[4096] that I use to read data. I just want to return them and also give the caller the ability to know the length of the data that I return.
The char array[4096] is a member of the class that wraps the C read. I use it to store the data temporarily before returning it to the caller. Every time I call the wrapper read, the char array will be overwritten, by design. An upper layer will be responsible for concatenating the data and constructing messages. This upper layer is the one that needs to know how much data has arrived.
The size of the char array[4096] is arbitrary. It could be much smaller, but then more calls would be needed.
The object that contains the member char array will always be global.
I use C++17
Should I use std::vector or std::queue ?
The general answer here is: Don't use mutable global state. It breaks reentrancy and threading. And don't compound the issue by trying to return views of mutable global state, which makes even sequential calls a problem.
Just allocate a per-call buffer and use that; if you want to allow the caller to provide a buffer, that's also acceptable. Examples would look like:
#include <string>
#include <string_view>
#include <unistd.h> // ::read, ssize_t

// Some class assumed to have an fd member for reading via the C API
class Reader
{
    // Define member attributes, e.g. fd
    int fd;
public:
    std::string_view read(std::string& buf) {
        // Qualify with :: so the C function is called, not this member function
        ssize_t numread = ::read(fd, buf.data(), buf.size());
        // Error checking if applicable, presumably handling negative return values
        // by raising exception
        return std::string_view(buf.data(), numread); // Guaranteed copy-elision
    }
    std::string read(size_t max_read) {
        std::string buf(max_read, '\0'); // Allocate appropriately sized buffer
        auto view = read(buf); // Delegate to view-based API
        buf.resize(view.size()); // Resize to match amount actually read
        return buf; // Likely (but not guaranteed) NRVO based copy-elision
    }
};
std::string and std::string_view could be replaced with std::vector and std::span of some type in C++20 if you preferred (std::span would allow receiving a std::span instead of std::string& in C++20, making the code more generic).
This provides the caller with multiple options:
1. Call read with an existing pre-sized std::string (maybe change to std::span for C++20) that the caller can reuse over and over
2. Call read with an explicit size and get a freshly allocated std::string with few, if any, copies involved (NRVO will avoid copying the std::string being returned in most cases, though if the underlying read reads very little, the resize call might reallocate the underlying storage and trigger a copy of whatever real data exists)
For maximum efficiency, many callers calling this repeatedly would choose #1 (they'd just create a local std::string of a given size, pass it in by reference, then use the returned std::string_view to limit how much of the buffer they actually work with), but for simple one-off uses, option #2 is convenient.
EDIT: I already have a char array[4096] that I use to read data. I just want to return them and also give the caller the ability to know the length of the data that I return.
Right, so the key information is that you don't want to copy that (or at least you don't want to force an additional copy).
Current preferred return type is std::span, but that's C++20 and you're still on 17.
Second preference is std::string_view. It'll work fine for binary data but may confuse people who expect it to be printable, not contain null terminators and so on.
Otherwise you can obviously return some struct or tuple with pointer & length (and possibly errno, which is otherwise discarded).
Returning something that might be nullptr is pretty much the least preferred option. Don't do it. It's actually harder to use correctly than the original C interface.
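For what it's worth, here is a minimal sketch of the "struct with view and errno" option, assuming C++17 and POSIX. The names ReadResult and FdReader are made up for illustration, and the 4096-byte member buffer mirrors the one described in the question. The returned view points into that member buffer, so it is only valid until the next call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string_view>
#include <unistd.h>   // ::read, ssize_t (POSIX)

// Hypothetical result type: a non-owning view of the bytes just read, plus
// the errno value captured if ::read failed (0 otherwise).
struct ReadResult {
    std::string_view data;  // empty on end-of-file or on error
    int error = 0;
};

// Hypothetical wrapper that reuses one internal buffer, as in the question.
class FdReader {
public:
    explicit FdReader(int fd) : fd_(fd) {}

    // The returned view is only valid until the next call to read().
    ReadResult read() {
        const ssize_t n = ::read(fd_, buffer_, sizeof buffer_);
        if (n < 0) {
            return {std::string_view{}, errno};
        }
        return {std::string_view(buffer_, static_cast<std::size_t>(n)), 0};
    }

private:
    int fd_;
    char buffer_[4096];
};
```

The caller checks error first, then uses data.size() to learn how many bytes arrived; an empty view with error == 0 means end-of-file.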
You could use function overloading:
void read(int fileDescriptor, short int & variable)
{
    static_cast<void>(::read(fileDescriptor, &variable, sizeof(variable)));
}
void read(int fileDescriptor, int & variable)
{
    static_cast<void>(::read(fileDescriptor, &variable, sizeof(variable)));
}
You may want to also look into using templates.
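A template version might look roughly like the sketch below (assuming C++17 and POSIX; read_value is a made-up name). Unlike the overloads above, it reports failure instead of discarding the return value, and it rejects types that cannot safely be filled byte by byte.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <unistd.h>   // ::read, ssize_t (POSIX)

// Hypothetical helper: tries to read exactly sizeof(T) bytes from fd into
// value. Returns false on error, end-of-file, or a short read; in the
// short-read case value may have been partially overwritten.
template <typename T>
bool read_value(int fd, T& value) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "read_value only supports trivially copyable types");
    const ssize_t n = ::read(fd, &value, sizeof(T));
    return n == static_cast<ssize_t>(sizeof(T));
}
```

A call such as read_value(fd, some_int) then works for int, short, plain structs and so on without writing one overload per type.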
I am trying to create a LabVIEW DLL and call it from a C++ program but I am facing a problem of data passing.
A scientific camera I recently bought comes with a LabVIEW SDK, and nothing else. The example program provided with the SDK is mainly a while loop around two functions, ReadData and DecodeData.
ReadData collects data from USB (using VISA read), the data obtained in one call contains several complete data blocks and an incomplete incoming block.
DecodeData is called multiple times to process all the complete blocks (it removes the processed data from the buffer). When all the complete blocks have been processed, the remaining data (the beginning of the incoming block) is passed to ReadData which will concatenate its new data at the end of the buffer.
Full example code:
Details of ReadData:
Details of DecodeData:
In the example program, written in LabVIEW, everything works fine. The problem is when I export these functions in a DLL. The memory buffers, inputs and outputs of both functions, are char arrays. After ReadData, my C++ program correctly obtains a buffer containing data, including null bytes.
The problem is when I inject this buffer in DecodeData, it seems that LabVIEW only takes into account the bytes before the first null byte... I guess that the char[] input is just processed as a null-terminated string and the rest of the data is just discarded.
I tried to add data converters ("string to byte array" at outputs and "byte array to string" at inputs) but the conversion function also discards the data after the first null character.
I could modify the .vi from the sdk to only handle byte arrays and not strings, but it uses lots of character processing functions and I would prefer leaving it as is.
How can I pass the data buffer from C++ to the LabVIEW DLL without losing part of my data?
Edit: here is the C++ code.
The header exported with the LabVIEW DLL:
int32_t __cdecl CORE_S_Read_data_from_USB(char VISARefIn[],
Enum1 blockToProcessPrevCycle, uint32_t bytesToProcessPrevCycle,
uint8_t inBytesRead[], uint32_t *BytesReceived, LVBoolean *DataReception,
uint8_t outBytesRead[], Enum1 *blockToProcess, uint32_t *bytesToProcess,
int32_t longueur, int32_t longueur2);
void __cdecl CORE_S_Decode_data(uint8_t inBytesRead[],
LVBoolean LUXELL256TypeB, uint32_t bytesToProcess, Enum1 blockToProcess,
Cluster2 *PrevHeader, LVBoolean *FrameCompleto,
uint32_t *bytesToProcessNextCycle, Enum1 *blockToProcessNextCycle,
Cluster2 *HeaderOut, uint8_t outBytesRead[], Int16Array *InfraredImage,
Cluster2 *Header, int32_t longueur, int32_t longueur2, int32_t longueur3);
Usage in my C++ source:
while (...)
{
// Append new data in uiBytesRead
ret = CORE_S_Read_data_from_USB(VISARef, blockToProcess, bytesToProcess, uiBytesRead, &BytesReceived,
&DataReception, uiBytesRead, &blockToProcess, &bytesToProcess, BUFFER_SIZE, BUFFER_SIZE);
if (DataReception == 0)
continue;
bool FrameCompleto = true;
while (FrameCompleto)
{
// Removes one frame of uiBytesRead per call
CORE_S_Decode_data(uiBytesRead, LUXELL256TypeB, 0, blockToProcess, &Header, &FrameCompleto, &bytesToProcess, &blockToProcess, &Header,
uiBytesRead, &InfraredImage, &Header, BUFFER_SIZE, BUFFER_SIZE, BUFFER_SIZE);
}
}
It is a little tricky to answer in this specific case but assuming that the problem is that NULL values in the buffer data are causing issues then it might be worth looking at the option to use String Handle Pointers for the String-Type controls and indicators of the VIs you are exporting.
This option can be selected during the "Define VI Prototype" stage of configuring the DLL Build
LabVIEW manages String Types internally as an integer of the string's length and an unsigned char array so it shouldn't matter what characters are used. For interfacing with external code, LabVIEW's extcode.h header defines an LStrHandle as follows:
typedef struct {
int32 cnt; /* number of bytes that follow */
uChar str[1]; /* cnt bytes */
} LStr, *LStrPtr, **LStrHandle;
So a String Handle Pointer is of type LStrHandle*.
extcode.h provides the macros LHStrBuf(LStrHandle) and LHStrLen(LStrHandle) which can ease dereferencing for the String Handle Pointer when you want to read or update the string content and length. Also, note a NULL handle can be used to represent an empty string so don't assume that the handle will be valid without checking.
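To illustrate why a length-prefixed layout keeps embedded null bytes while a C string does not, here is a small stand-alone sketch. It does not use extcode.h or the LabVIEW memory manager at all; make_length_prefixed and stored_length are hypothetical helpers that only mimic the "32-bit count followed by the bytes" layout described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Builds a block laid out like the LStr described above: a 32-bit byte count
// followed directly by the bytes. Stand-in only, not the LabVIEW API.
std::vector<unsigned char> make_length_prefixed(const unsigned char* data,
                                                std::size_t length) {
    const std::int32_t cnt = static_cast<std::int32_t>(length);
    std::vector<unsigned char> block(sizeof cnt + length);
    std::memcpy(block.data(), &cnt, sizeof cnt);
    if (length > 0) {
        std::memcpy(block.data() + sizeof cnt, data, length);
    }
    return block;
}

// Reads the count back from the front of the block.
std::int32_t stored_length(const std::vector<unsigned char>& block) {
    std::int32_t cnt = 0;
    std::memcpy(&cnt, block.data(), sizeof cnt);
    return cnt;
}
```

For a buffer such as {0x43, 0x00, 0x06, 0x00, 0x0D}, stored_length reports all 5 bytes, whereas strlen on the same bytes stops at the first 0x00 and reports 1, which matches the truncation seen in the question.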
When creating or resizing String Handle Pointers to pass to a function, it is worth noting that an LStr has exactly the same in-memory representation as a LabVIEW-array so the function NumericArrayResize with typeCode uB can create/resize a large enough buffer to store the string and the length-integer.
An example of creating a new String Handle Pointer for a string of length required_string_length is achieved by passing NumericArrayResize a handle pointer where the handle is NULL.
// start with a NULL handle so NumericArrayResize allocates a new one
LStrHandle new_string_handle = NULL;
LStrHandle* new_string_handle_pointer = &new_string_handle;
err = NumericArrayResize(uB, 1, (UHandle *)new_string_handle_pointer, required_string_length);
// new_string_handle_pointer will now reference the new LStrHandle
When updating the String value in a String Handle remember to write the string's characters to the uChar array and to update the size integer. From a performance view, it might not be worth shrinking a String Handle when updating it to a shorter string but you will need to resize it if you know the string you are writing to it will be longer than what it can hold.
You should clean up any handle that is passed to you from LabVIEW or a LabVIEW-based DLL so once you have finished with it, call DSDisposeHandle on the handle that the handle-pointer references.
For more information on LabVIEW's memory manager function please read this guide.
I'm using IOKit framework to communicate with my driver using IOConnectCallMethod from the user-space client and IOExternalMethodDispatch on the driver side.
So far I was able to send fixed length commands, and now I wish to send a varied size array of chars (i.e. fullpath).
However, it seems that the command lengths on the driver and client sides are coupled, which means that checkStructureInputSize from IOExternalMethodDispatch in the driver must be equal to inputStructCnt from IOConnectCallMethod on the client side.
Here are the struct contents on both sides :
DRIVER :
struct IOExternalMethodDispatch
{
IOExternalMethodAction function;
uint32_t checkScalarInputCount;
uint32_t checkStructureInputSize;
uint32_t checkScalarOutputCount;
uint32_t checkStructureOutputSize;
};
CLIENT:
kern_return_t IOConnectCallMethod(
mach_port_t connection, // In
uint32_t selector, // In
const uint64_t *input, // In
uint32_t inputCnt, // In
const void *inputStruct, // In
size_t inputStructCnt, // In
uint64_t *output, // Out
uint32_t *outputCnt, // In/Out
void *outputStruct, // Out
size_t *outputStructCnt) // In/Out
Here's my failed attempt to use a varied size command :
std::vector<char> rawData; //vector of chars
// filling the vector with filePath ...
kr = IOConnectCallMethod(_connection, kCommandIndex , 0, 0, rawData.data(), rawData.size(), 0, 0, 0, 0);
And from the driver command handler side, I'm calling IOUserClient::ExternalMethod with IOExternalMethodArguments *arguments and IOExternalMethodDispatch *dispatch but this requires the exact length of data I'm passing from the client which is dynamic.
This doesn't work unless I set the dispatch function with the exact length of data it should expect.
Any idea how to resolve this or perhaps there's a different API I should use in this case ?
As you have already discovered, the answer for accepting variable-length "struct" inputs and outputs is to specify the special kIOUCVariableStructureSize value for input or output struct size in the IOExternalMethodDispatch.
This will allow the method dispatch to succeed and call out to your method implementation. A nasty pitfall however is that structure inputs and outputs are not necessarily provided via the structureInput and structureOutput pointer fields in the IOExternalMethodArguments structure. In the struct definition (IOKit/IOUserClient.h), notice:
struct IOExternalMethodArguments
{
…
const void * structureInput;
uint32_t structureInputSize;
IOMemoryDescriptor * structureInputDescriptor;
…
void * structureOutput;
uint32_t structureOutputSize;
IOMemoryDescriptor * structureOutputDescriptor;
…
};
Depending on the actual size, the memory region might be referenced by structureInput or structureInputDescriptor (and structureOutput or structureOutputDescriptor) - the crossover point has typically been 8192 bytes, or 2 memory pages. Anything smaller will come in as a pointer, anything larger will be referenced by a memory descriptor. Don't count on a specific crossover point though, that's an implementation detail and could in principle change.
How you handle this situation depends on what you need to do with the input or output data. Usually though, you'll want to read it directly in your kext - so if it comes in as a memory descriptor, you need to map it into the kernel task's address space first. Something like this:
static IOReturn my_external_method_impl(OSObject* target, void* reference, IOExternalMethodArguments* arguments)
{
IOMemoryMap* map = nullptr;
const void* input;
size_t input_size;
if (arguments->structureInputDescriptor != nullptr)
{
map = arguments->structureInputDescriptor->createMappingInTask(kernel_task, 0, kIOMapAnywhere | kIOMapReadOnly);
if (map == nullptr)
{
// insert error handling here
return …;
}
input = reinterpret_cast<const void*>(map->getAddress());
input_size = map->getLength();
}
else
{
input = arguments->structureInput;
input_size = arguments->structureInputSize;
}
// …
// do stuff with input here
// …
OSSafeReleaseNULL(map); // make sure we unmap on all function return paths!
return …;
}
The output descriptor can be treated similarly, except without the kIOMapReadOnly option of course!
CAUTION: SUBTLE SECURITY RISK:
Interpreting user data in the kernel is generally a security-sensitive task. Until recently, the structure input mechanism was particularly vulnerable - because the input struct is memory-mapped from user space to kernel space, another userspace thread can still modify that memory while the kernel is reading it. You need to craft your kernel code very carefully to avoid introducing a vulnerability to malicious user clients. For example, bounds-checking a userspace-supplied value in mapped memory and then re-reading it under the assumption that it's still within the valid range is wrong.
The most straightforward way to avoid this is to make a copy of the memory once and then only use the copied version of the data. To take this approach, you don't even need to memory-map the descriptor: you can use the readBytes() member function. For large amounts of data, you might not want to do this for efficiency though.
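The copy-once idea can be illustrated outside the kernel with a plain C++ sketch. This is not kext code: in a driver the single copy would come from readBytes() as mentioned above. RequestHeader and snapshot_payload are hypothetical names, and the "length field followed by payload" layout is an assumption made only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical request layout: a length field followed by that many bytes.
struct RequestHeader {
    std::uint32_t payload_length;
};

// Copies the shared region exactly once, then validates and uses only the
// private copy, so a concurrent writer cannot change a value between the
// bounds check and its use.
std::optional<std::vector<unsigned char>> snapshot_payload(const void* shared,
                                                           std::size_t shared_size) {
    if (shared_size < sizeof(RequestHeader)) {
        return std::nullopt;
    }
    std::vector<unsigned char> local(shared_size);
    std::memcpy(local.data(), shared, shared_size);  // the only read of shared memory

    RequestHeader header;
    std::memcpy(&header, local.data(), sizeof header);
    if (header.payload_length > local.size() - sizeof header) {
        return std::nullopt;  // claimed length does not fit in what was supplied
    }
    const auto first = local.begin() + static_cast<std::ptrdiff_t>(sizeof header);
    return std::vector<unsigned char>(
        first, first + static_cast<std::ptrdiff_t>(header.payload_length));
}
```

The wrong pattern would be to check payload_length in the shared memory and then read it again from the same place when copying the payload.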
Recently (during the 10.12.x cycle) Apple changed the structureInputDescriptor so it's created with the kIOMemoryMapCopyOnWrite option. (Which as far as I can tell was created specifically for this purpose.) The upshot of this is that if userspace modifies the memory range, it doesn't modify the kernel mapping but transparently creates copies of the pages it writes to. Relying on this assumes your user's system is fully patched up though. Even on a fully patched system, the structureOutputDescriptor suffers from the same issue, so treat it as write-only from the kernel's point of view. Never read back any data you wrote there. (Copy-on-write mapping makes no sense for the output struct.)
After going through the relevant manual again, I've found the relevant paragraph :
The checkScalarInputCount, checkStructureInputSize, checkScalarOutputCount, and checkStructureOutputSize fields allow for sanity-checking of the argument list before passing it along to the target object. The scalar counts should be set to the number of scalar (64-bit) values the target's method expects to read or write. The structure sizes should be set to the size of any structures the target's method expects to read or write. For either of the struct size fields, if the size of the struct can't be determined at compile time, specify kIOUCVariableStructureSize instead of the actual size.
So all I had to do to avoid the size verification was to set the field checkStructureInputSize to the value kIOUCVariableStructureSize in IOExternalMethodDispatch, and the command was passed to the driver properly.
I want to create a byte array out of an unknown struct and add a number additionally in the front of this byte array. How do I do this?
I currently have this code:
template <class T>
void CopterConnection::infoToByteArray(char *&bit_data, size_t *msglen,
T data) {
// Determine which kind of element is in the array, will change in the final code
char typeID = -1;
*msglen = sizeof(data);
*msglen += 1; // take in account of typeID
// Create the pointer to the byte representation of the struct
bit_data = new char[*msglen];
// copy the information from the struct into the byte array
memcpy(bit_data, &data+1, *msglen-1);
bit_data[1] = typeID;
}
But this is not working. I guess I am using memcpy wrong. I want to copy the unknown struct T into the positions bit_data[1] to bit_data[*end*]. What is the best way to achieve this?
One possible problem and one definitive problem:
The possible problem is that array indexing starts at zero. So you should copy to bit_data + 1 to skip over the first byte, and then of course use bit_data[0] to set the type id.
The definitive problem is that &data + 1 is equal to (&data)[1], and that will be out of bounds and lead to undefined behavior. You should just copy from &data.
Putting it all together, the last two lines should be
memcpy(bit_data + 1, &data, *msglen-1);
bit_data[0] = typeID;
There is another possible problem, which depends on what you're doing with the data in bit_data and what T is. If T is not a POD type then you simply can not expect a bitwise copy (what memcpy does) to work very well.
Also if T is a class or structure with members that are pointers then you can't save those to disk or transfer to another computer or even to another process on the same computer.
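As a sketch of how the whole function could look with both fixes applied, here is a variant that returns a std::vector<char> instead of handing out a new[]-allocated pointer. The name info_to_byte_array and the type_id parameter are illustrative, not taken from the original code. The static_assert checks for trivially copyable types, which is the property memcpy actually needs (every POD type qualifies).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Returns a byte array holding type_id in byte 0 followed by the raw bytes
// of data in bytes 1..sizeof(T).
template <class T>
std::vector<char> info_to_byte_array(const T& data, char type_id) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "a bitwise copy is only meaningful for trivially copyable types");
    std::vector<char> bytes(1 + sizeof(T));
    bytes[0] = type_id;                               // type id goes in front
    std::memcpy(bytes.data() + 1, &data, sizeof(T));  // copy from &data, not &data + 1
    return bytes;
}
```

The message length is then simply bytes.size(), so no separate msglen out-parameter is needed.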
There are a few bugs in there, in addition to the fact you are messing around with new.
In the memcpy line you use &data + 1 as the source, which is undefined behaviour here. Adding 1 advances the address by sizeof(data) bytes, so it points somewhere on the stack just past data. "One past the end" is a valid pointer for pointer arithmetic, but you may not read from it, nor from anything after it.
bit_data[1] is the 2nd character in your buffer.
My C++ project has a buffer which could be any size and is filled by Bluetooth. An incoming message looks like 0x43 0x0B 0x00 0x06 0xA2 0x03 0x03 0x00 0x01 0x01 0x0A 0x0B 0x0B 0xE6 0x0D: it starts with 0x43 and ends with 0x0D. So each time the buffer is filled, its contents can be arranged differently relative to this message format.
static const int BufferSize = 1024;
byte buffer[BufferSize];
What is the best way to parse the incoming messages in this buffer?
Since I have come from Java and .NET, What is the best way to make each extracted message as an object? Class could be solution?
I have created a separate class for parsing the buffer like below, am I in a right direction?
#include<parsingClass.h>
class A
{
parsingClass ps;
public:
parsingClass.parse(buffer, BufferSize);
}
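class ReturnMessage{
    char *message;
public:
    // Copies everything after the 0x43 start byte into a newly allocated,
    // null-terminated array. The caller owns the result and must delete[] it.
    // The payload may itself contain 0x00 bytes, so keep the length as well
    // if you need the whole message.
    char *getMessage(const unsigned char *buffer, int count){
        int length = count > 0 ? count - 1 : 0;
        message = new char[length + 1];
        for(int i = 0; i < length; i++){
            message[i] = buffer[i + 1];
        }
        message[length] = '\0';
        return message;
    }
};
class ParserToMessage{
    static const int BufferSize = 1024;
    unsigned char buffer[BufferSize];
    unsigned int counter = 0;
public:
    char *parse_buffer()
    {
        ReturnMessage rm;
        unsigned char buffByte;
        counter = 0; // start a new message
        buffByte = blueToothGetByte(); //fictional getchar() kind of function for bluetooth
        if(buffByte == 0x43){
            buffer[counter++] = buffByte;
            //continue until you find 0x0D or the buffer is full
            while(counter < BufferSize && (buffByte = blueToothGetByte()) != 0x0D){
                buffer[counter++] = buffByte;
            }
        }
        return rm.getMessage(buffer,counter);
    }
};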
Can you have the parser as a method of a 'ProtocolUnit' class? The method could take a buffer pointer/length as a parameter and return an int that indicates how many bytes it consumed from the buffer before it correctly assembled a complete protocol unit, or -1 if it needs more bytes from the next buffer.
Once you have a complete ProtocolUnit, you can do what you wish with it, (eg. queue it off to some processing thread), and create a new one for the remaining bytes/next buffer.
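A rough sketch of that idea follows. The ProtocolUnit interface shown here is my guess at what is meant, not an established API. The framing bytes come from the question (0x43 starts a message, 0x0D ends it), and the sketch assumes 0x0D never occurs inside a message; if the protocol carries a length field, use that to find the end instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class: accumulates one message across several buffers.
class ProtocolUnit {
public:
    // Feeds bytes into the unit. Returns the number of bytes consumed from
    // data once the unit is complete, or -1 if all of data was consumed and
    // more bytes are needed from the next buffer.
    int parse(const unsigned char* data, int length) {
        for (int i = 0; i < length; ++i) {
            const unsigned char byte = data[i];
            if (!started_) {
                if (byte != 0x43) {
                    continue;  // skip anything before the start byte
                }
                started_ = true;
            }
            bytes_.push_back(byte);
            if (byte == 0x0D) {
                complete_ = true;
                return i + 1;
            }
        }
        return -1;
    }

    bool complete() const { return complete_; }
    const std::vector<unsigned char>& bytes() const { return bytes_; }

private:
    std::vector<unsigned char> bytes_;
    bool started_ = false;
    bool complete_ = false;
};
```

When parse returns a count smaller than the buffer length, the caller creates a fresh ProtocolUnit and feeds it the remaining bytes starting at that offset.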
My C++ project has a buffer which could be any size
The first thing I notice is that you have hard-coded the buffer size. You are in danger of buffer overflow if an attempt is made to read data bigger than the size you have specified into the buffer.
If possible keep the buffer size dynamic and create the byte array according to the size of the data to be received into the buffer. Try and inform the object where your byte array lives of the incoming buffer size, before you create the byte array.
int nBufferSize = GetBufferSize();
UCHAR* szByteArray = new UCHAR[nBufferSize];
What is the best way to parse the incoming messages in this buffer?
You are on the right lines, in that you have created and are using a parser class. I would suggest using memcpy to copy the individual data items one at a time, from the buffer to a variable of your choice. Not knowing the wider context of your intention at this point, I cannot add much to that.
Since I have come from Java and .NET, What is the best way to make each extracted message as an object? Class could be solution?
Depending on the complexity of the data you are reading from the buffer and what your plans are, you could use a class or a struct. If you do not need to create an object with this data, which provides services to other objects, you could use a struct. Structs are great when your need isn't so complex, whereby a full class might be overkill.
I have created a separate class for parsing the buffer like below, am I in a right direction?
I think so.
I hope that helps for starters!
The question "how should I parse this" depends largely on how you want to parse the data. Two things are missing from your question:
Exactly how do you receive the data? You mention Bluetooth but what is the programming medium? Are you reading from a socket? Do you have some other kind of API? Do you receive it byte at a time or in blocks?
What are the rules for dealing with the data you are receiving? Most data is delimited in some way or of fixed field length. In your case, you mention that it can be of any length but unless you explain how you want to parse it, I can't help.
One suggestion I would make is to change the type of your buffer to use std::vector :
std::vector<unsigned char> buffer(normalSize);
You should choose normalSize to be something around the most frequently observed size of your incoming message. A vector will grow as you push items onto it so, unlike the array you created, you won't need to worry about buffer overrun if you get a large message. However, if you do go above normalSize, under the covers the vector will reallocate enough memory to cope with your extended requirements. This can be expensive so you don't want to do it too often.
You use a vector in pretty much the same way as your array. One key difference is that you can simply push elements onto the end of the vector, rather than having to keep a running pointer. So imagine you received a single int at a time from the Bluetooth source; your code might look something like this:
// Clear out the previous contents of the buffer.
buffer.clear();
int elem(0);
// Find the start of your message. Throw away elements
// that we don't need.
while ( 0x43 != ( elem = getNextBluetoothInt() ) );
// Push elements of the message into the buffer until
// we hit the end.
while ( 0x0D != elem )
{
    buffer.push_back( elem );
    elem = getNextBluetoothInt(); // fetch the next element, otherwise the loop never ends
}
buffer.push_back( elem ); // Remember to add on the last one.
The key benefit is that the vector will automatically resize itself, without you having to do anything, whether the number of elements pushed on is 10 or 10,000.