std::fstream with multiple buffers? - c++

You can specify a single buffer for your file stream like this:
char buf[BUFFER_SIZE];
std::ofstream file("file", std::ios_base::binary | std::ios_base::out);
if (file.is_open())
{
    file.rdbuf()->pubsetbuf(buf, BUFFER_SIZE);
    file << "abcd";
}
What I want to do now is use more than just one buffer:
char* buf[] = { new char[BUFFER_SIZE], new char[BUFFER_SIZE], new char[BUFFER_SIZE], };
Is it possible without creating a custom derivation of std::streambuf?
EDIT:
I think I need to explain what I want to do in more detail. Please consider the following situation:
- The file(s) I want to read won't fit into memory
- The file will be accessed by some kind of binary jump search
So, if you split the file into logical pages of a specific size, I would like to provide multiple buffers, each representing a specific page. This would improve performance when a file location is read whose page is already held in a buffer.

I gather from the comment that you want to do a kind of scatter-gather I/O. I'm pretty sure there's no support for that in the C++ standard I/O streams library, so you'll have to roll your own.
If you want to do this efficiently, you can use OS support for scatter-gather. E.g., POSIX/Unix-like systems have writev for this purpose.
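For example, a minimal writev sketch (POSIX only; file names and error handling kept to a minimum):
#include <sys/uio.h>   // writev
#include <fcntl.h>
#include <unistd.h>

int main()
{
    int fd = open("file", O_WRONLY | O_CREAT, 0644);
    if (fd < 0) return 1;

    char header[16]  = "header";
    char payload[32] = "payload";

    struct iovec iov[2];
    iov[0].iov_base = header;  iov[0].iov_len = sizeof header;
    iov[1].iov_base = payload; iov[1].iov_len = sizeof payload;

    // one system call gathers both buffers into a single write
    ssize_t written = writev(fd, iov, 2);

    close(fd);
    return written < 0 ? 1 : 0;
}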

There's nothing like this provided by the Standard. However, depending on your platform, you can use memory-mapped files, which provide equivalent functionality. Both Windows and Linux provide them.
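For instance, a minimal POSIX sketch of the idea (Windows would use CreateFileMapping/MapViewOfFile instead; boost::iostreams::mapped_file wraps both):
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    int fd = open("file", O_RDONLY);
    if (fd < 0) return 1;

    struct stat st;
    fstat(fd, &st);

    // map the whole file; the OS demand-pages and caches it for us
    void* p = mmap(nullptr, static_cast<size_t>(st.st_size), PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { close(fd); return 1; }

    const char* data = static_cast<const char*>(p);
    // ... binary jump search over data[0 .. st.st_size) ...
    (void)data;

    munmap(p, static_cast<size_t>(st.st_size));
    close(fd);
}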

I will take a look at boost::iostreams::mapped_file, but I think my requirement is much simpler. I've created a custom class derived from basic_filebuf.
template<typename char_type>
class basic_filemultibuf : public std::basic_filebuf<char_type/*, std::char_traits<char_type>*/>
{
public:
    typedef std::basic_filebuf<char_type>   base_type;
    typedef typename base_type::int_type    int_type;
    typedef typename base_type::traits_type traits_type;
private:
    char_type**     m_buffers;
    std::ptrdiff_t  m_buffer_count,
                    m_current_buffer;
    std::streamsize m_buffer_size;
protected:
    virtual int_type overflow(int_type meta = traits_type::eof())
    {
        if (this->m_buffer_count > 0)
        {
            // rotate to the next buffer (round-robin) before the base class flushes
            if (this->m_current_buffer == this->m_buffer_count)
                this->m_current_buffer = 0;
            base_type::setbuf(this->m_buffers[this->m_current_buffer++], this->m_buffer_size);
        }
        return base_type::overflow(meta);
    }
public:
    // note: copying a basic_filebuf and constructing one from a FILE*
    // are non-standard extensions (MSVC); this mirrors my original code
    basic_filemultibuf(base_type const& other)
        : base_type(other),
          m_buffers(NULL),
          m_buffer_count(0),
          m_current_buffer(-1),
          m_buffer_size(0)
    {
    }
    basic_filemultibuf(basic_filemultibuf const& other)
        : base_type(other),
          m_buffers(other.m_buffers),
          m_buffer_count(other.m_buffer_count),
          m_current_buffer(other.m_current_buffer),
          m_buffer_size(other.m_buffer_size)
    {
    }
    basic_filemultibuf(FILE* f = NULL)
        : basic_filemultibuf(base_type(f))
    {
    }
    basic_filemultibuf* pubsetbuf(char_type** buffers, std::ptrdiff_t buffer_count, std::streamsize buffer_size)
    {
        if ((this->m_buffers = buffers) != NULL)
        {
            this->m_buffer_count   = buffer_count;
            this->m_buffer_size    = buffer_size;
            this->m_current_buffer = 0;
        }
        else
        {
            this->m_buffer_count   = 0;
            this->m_buffer_size    = 0;
            this->m_current_buffer = -1;
        }
        // drop the current buffer; the next overflow() installs the first one
        base_type::setbuf(NULL, 0);
        return this;
    }
};
Example usage:
typedef basic_filemultibuf<char> filemultibuf;

std::fstream file("file", std::ios_base::binary | std::ios_base::in | std::ios_base::out);

char** buffers = new char*[2];
for (int i = 0; i < 2; ++i)
    buffers[i] = new char[4096];

filemultibuf multibuf(*file.rdbuf());
multibuf.pubsetbuf(buffers, 2, 4096);
file.rdbuf(&multibuf); // the public rdbuf() setter; set_rdbuf() is protected
//
// do awesome stuff with file ...
//
for (int i = 0; i < 2; ++i)
    delete[] buffers[i];
delete[] buffers;
That's pretty much it. The only thing I would really like to do is offer this functionality for other streambufs, because the use of multiple buffers should not be restricted to filebuf. But it seems to me that isn't possible without rewriting the file-specific functions.
What do you think about that?

Related

Byte offset greater than Byte Length in BufferView

I'm trying to read data from scene.bin files using the Microsoft::glTF SDK (TinyGLTF is not an option). When I try to read the MeshPrimitive attribute called TEXCOORD_0, I get a situation where the BufferView byteOffset is greater than the byteLength. Therefore, I don't know how to properly read the given data, and my program crashes.
I tried reading the data using IStreamReader, which is part of the SDK and required when reading .bin files with it. I calculate the data offset by adding accessor.byteOffset + bufferView.byteOffset, which is > byteLength.
struct BuffersAccessors {
    Microsoft::glTF::Accessor accessor;
    Microsoft::glTF::BufferView view;
    Microsoft::glTF::Buffer buffer;

    void operator=(BuffersAccessors accessors);
};

template<typename T> struct BufferInfo {
    BuffersAccessors buffersAccessors;
    std::vector<T> bufferData;

    BufferInfo<T>();
    BufferInfo<T>(BuffersAccessors buffersAccessors, std::vector<T> bufferData);

    void operator=(const BufferInfo<T> &info) {
        buffersAccessors = info.buffersAccessors;
        bufferData = info.bufferData;
    }
};
template<typename T>
std::vector<T> readBufferData(Microsoft::glTF::Document document, BufferInfo<T> bufferInfo, std::filesystem::path path) {
    std::vector<T> stream;

    if (bufferInfo.buffersAccessors.buffer.uri.length() > 0 || bufferInfo.buffersAccessors.buffer.byteLength > 0) {
        Microsoft::glTF::Buffer buffer = bufferInfo.buffersAccessors.buffer;
        path += bufferInfo.buffersAccessors.buffer.uri;
        path = std::filesystem::absolute(path);
        buffer.uri = path.string();

        std::shared_ptr<StreamReader> streamReader = std::make_shared<StreamReader>(path);
        Microsoft::glTF::GLTFResourceReader reader(streamReader);

        stream = reader.ReadBinaryData<T>(buffer, bufferInfo.buffersAccessors.view);
    }

    return stream;
}
template<typename T>
BufferInfo<T> getFullBufferData(Microsoft::glTF::Document document, std::string accessorKey, std::filesystem::path path) {
    BufferInfo<T> bufferInfo{};
    BuffersAccessors mainPart = getBufferAccessorFromDocument(document, accessorKey);
    bufferInfo.buffersAccessors = mainPart;

    std::vector<T> bufferData = vkglTF::readBufferData<T>(document, bufferInfo, path);
    const size_t bufferDataOffset = mainPart.accessor.byteOffset + mainPart.view.byteOffset; // how to properly calculate the offset?
    bufferData.erase(bufferData.begin(), bufferData.begin() + bufferDataOffset);

    bufferInfo.bufferData = bufferData;
    return bufferInfo;
}
I expect data in formats like uint8 and uint16 but my program crashes when trying to do bufferData.erase(..).
Edit: This happens while reading WEIGHTS_0 too.
I think the most likely error with your code is the mixing of byte offsets and vector element indices. Have you tried dividing bufferDataOffset by sizeof(T)?
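For example (untested, using the names from your getFullBufferData):
// accessor/view offsets are in bytes, but the vector holds elements of T
const size_t byteOffset = mainPart.accessor.byteOffset + mainPart.view.byteOffset;
bufferData.erase(bufferData.begin(), bufferData.begin() + byteOffset / sizeof(T));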
Second, if you only want to read an accessor's data then try using the ReadBinaryData overload that accepts an Accessor parameter instead. That way the glTF SDK will handle all of the offset calculations for you.
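Something along these lines (I'm assuming the document.accessors.Get lookup and the ReadBinaryData(Document, Accessor) overload from the SDK headers):
const Microsoft::glTF::Accessor& accessor = document.accessors.Get(accessorKey);
Microsoft::glTF::GLTFResourceReader reader(streamReader);
// the SDK applies accessor.byteOffset and bufferView.byteOffset itself
std::vector<T> data = reader.ReadBinaryData<T>(document, accessor);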
There is no documentation but the deserialize sample demonstrates the basic code structure recommended when using the glTF SDK.

C/C++: getting large data (like a 180-element array) from another class on an STM32

I'm working on a 32-bit ARM Cortex-M4 (the processor in the Pixhawk) and have written two classes, each running as its own thread in the Pixhawk codebase.
The first is LidarScanner, which deals with incoming serial data and generates an "obstacle situation". The second is Algorithm, which handles the "obstacle situation" and applies some planning strategy. My current solution is the reference-parameter function LidarScanner::updateObstacle(uint8_t (&array)[181]), which updates the "obstacle situation", a 181-element array.
LidarScanner.cpp:
class LidarScanner{
private:
    struct{
        bool available = false;
        int AngleArr[181];
        int RangeArr[181];
        bool isObstacle[181] = {}; // 1: unsafe; 0: safe
    }scan;
    ......
public:
    LidarScanner();

    // main function
    void update()
    {
        while(hal.uartE->available()) // incoming serial data is available
        {
            decode_data(); // decode serial data into three kinds of data: Range, Angle and Period_flag
            if(complete_scan()) // determine if one complete lidar scanner period has arrived
            {
                scan.available = false;
                checkObstacle(); // check the obstacle situation and store safety in isObstacle[181]
                scan.available = true;
            }
        }
    }

    // for another API to call
    void updateObstacle(uint8_t (&array)[181])
    {
        for(int i=0; i<181; i++) // note: i<=181 would read one element past the end
        {
            array[i]=scan.isObstacle[i];
        }
    }

    // for another API to call
    bool ScanAvailable() const { return scan.available; }
    ......
};
Algorithm.cpp:
class Algorithm{
private:
    uint8_t Obstacle_Value[181] = {};
    class LidarScanner& _lidarscanner;
    ......
public:
    Algorithm(class LidarScanner& _lidarscanner);

    // main function
    void update()
    {
        if (hal.uartE->available() && _lidarscanner.ScanAvailable())
        {
            // update the obstacle situation in the Algorithm phase and do more planning strategy
            _lidarscanner.updateObstacle(Obstacle_Value);
        }
    }
    ......
};
Usually it works fine, but I want to improve performance, so I'd like to know the most efficient way to do this. Thanks!
The most efficient way to copy data is to use the DMA.
DMAx_Channelx->CNDTR = size;              // number of items to transfer
DMAx_Channelx->CPAR  = (uint32_t)&source;
DMAx_Channelx->CMAR  = (uint32_t)&destination;
DMAx_Channelx->CCR   = (0<<DMA_CCR_MSIZE_Pos) | (0<<DMA_CCR_PSIZE_Pos)
                     | DMA_CCR_MINC | DMA_CCR_PINC | DMA_CCR_MEM2MEM
                     | DMA_CCR_EN;        // EN must be set or the transfer never starts
while(!(DMAx->ISR & DMA_ISR_TCIFx));      // busy-wait for the transfer-complete flag
See ST's application note AN4031, "Using the DMA controller".

When using memory BIOs with OpenSSL, how can you find the 'needed size' for the input BIO?

Here's some sample code which shows how I'm using OpenSSL:
BIO *CreateMemoryBIO() {
    if (BIO *bio = BIO_new(BIO_s_mem())) {
        BIO_set_mem_eof_return(bio, -1);
        return bio;
    }
    throw std::runtime_error("Could not create memory BIO");
}

m_readBIO  = CreateMemoryBIO();
m_writeBIO = CreateMemoryBIO();
SSL_set_bio(m_ssl, m_readBIO, m_writeBIO);
Now, if I do an SSL_read and get SSL_ERROR_WANT_READ, is there any way for me to find out how much it tried to read internally (in other words, how much do I need to write with BIO_write to m_readBIO before SSL_read would be satisfied)?
A good lower bound would work for me as well; my issue is that I need to report how much data to read to the layer above me, and it will not return control to me until it has read that much data (and I don't want to degenerate into 1-byte reads).
I'm aware that SSL_read and SSL_write may both alternately read and write due to handshaking and such, but I'm interested in the 'current' read being done internally.
If it's not possible with the standard BIO_s_mem, I assume it could be done if I wrote my own BIO that 'remembered' the size of the last read request which failed, so any pointers to documentation on writing custom BIOs (which, to my knowledge, is supported by OpenSSL) would also be appreciated.
Thanks to CristiFati for suggesting BIO_set_callback; it seems to work. If you want to make your comment into an answer, I'll accept it, but I want to put the details here for posterity.
Inside my 'SSLSocket' class:
in the constructor:
BIO_set_callback(m_readBIO, &BIOCallback);
BIO_set_callback_arg(m_readBIO, reinterpret_cast<char*>(this));
long SSLSocket::BIOCallback(
    BIO *in_bio,
    int in_operation,
    const char* in_arg1,
    int in_arg2,
    long in_arg3,
    long in_returnValue)
{
    // in_bio isn't provided for BIO_CB_FREE.
    if (BIO_CB_FREE == in_operation)
    {
        return in_returnValue;
    }
    assert(in_arg1);
    return reinterpret_cast<SSLSocket*>(BIO_get_callback_arg(in_bio))->DoBIOCallback(
        in_bio,
        in_operation,
        in_arg1,
        in_arg2,
        in_arg3,
        in_returnValue);
}
long SSLSocket::DoBIOCallback(
    BIO *in_bio,
    int in_operation,
    const char* in_arg1,
    int in_arg2,
    long in_arg3,
    long in_returnValue)
{
    UNUSED(in_arg3);

    // We only care about the return callback for BIO_read()
    if ((BIO_CB_READ | BIO_CB_RETURN) == in_operation)
    {
        const int shouldRetry = BIO_should_retry(in_bio);
        const int bytesRequested = in_arg2;
        assert(bytesRequested > 0);

        if ((in_returnValue <= 0) && shouldRetry)
        {
            // nothing was read: the whole request is still outstanding
            m_needBytes = bytesRequested;
        }
        else if ((in_returnValue > 0) && (in_returnValue < bytesRequested) && shouldRetry)
        {
            // partial read: remember the remainder
            m_needBytes = bytesRequested - in_returnValue;
        }
        else
        {
            m_needBytes = 0;
        }
    }

    return in_returnValue;
}
Then I use m_needBytes to decide how much to write in BIO_write().
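For completeness, here's roughly how I use it afterwards (receive() is a stand-in for whatever transport layer sits below, not a real API):
// Read exactly what the last SSL_read asked for, then hand it to OpenSSL.
std::vector<char> buf(m_needBytes);
const int received = receive(buf.data(), static_cast<int>(buf.size())); // hypothetical transport call
if (received > 0)
{
    BIO_write(m_readBIO, buf.data(), received);
    // now retry SSL_read()
}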

Keeping the downloaded torrent in memory rather than file libtorrent

Working with Rasterbar libtorrent, I don't want the downloaded data to sit on my hard drive; I'd rather have a pipe, variable, or something "soft" so I can redirect it elsewhere (MySQL, or even the trash if it's not what I want). Is there any way of doing this, preferably from the Python bindings, or failing that in C++, using libtorrent?
EDIT: I'd like to point out that this is a libtorrent question, not a Linux or Python file-handling question. I need to tell libtorrent to save the file to my Python pipe or variable instead of saving it traditionally in a normal file.
You can do this by implementing your own storage class to use with libtorrent. Unfortunately this is not possible to do in Python, but you can do it in C++. The documentation for it is a bit scarce and can be found here.
Here's a simple example of how to do this by storing all the data in RAM:
struct temp_storage : storage_interface
{
    temp_storage(file_storage const& fs) : m_files(fs) {}

    virtual bool initialize(bool allocate_files) { return false; }
    virtual bool has_any_file() { return false; }

    virtual int read(char* buf, int slot, int offset, int size)
    {
        std::map<int, std::vector<char> >::const_iterator i = m_file_data.find(slot);
        if (i == m_file_data.end()) return 0;
        int available = i->second.size() - offset;
        if (available <= 0) return 0;
        if (available > size) available = size;
        memcpy(buf, &i->second[offset], available);
        return available;
    }

    virtual int write(const char* buf, int slot, int offset, int size)
    {
        std::vector<char>& data = m_file_data[slot];
        if (data.size() < offset + size) data.resize(offset + size);
        std::memcpy(&data[offset], buf, size);
        return size;
    }

    virtual bool rename_file(int file, std::string const& new_name) { assert(false); return false; }
    virtual bool move_storage(std::string const& save_path) { return false; }
    virtual bool verify_resume_data(lazy_entry const& rd, error_code& error) { return false; }
    virtual bool write_resume_data(entry& rd) const { return false; }
    virtual bool move_slot(int src_slot, int dst_slot) { assert(false); return false; }
    virtual bool swap_slots(int slot1, int slot2) { assert(false); return false; }
    virtual bool swap_slots3(int slot1, int slot2, int slot3) { assert(false); return false; }
    virtual size_type physical_offset(int slot, int offset) { return slot * m_files.piece_length() + offset; }

    virtual sha1_hash hash_for_slot(int slot, partial_hash& ph, int piece_size)
    {
        int left = piece_size - ph.offset;
        TORRENT_ASSERT(left >= 0);
        if (left > 0)
        {
            std::vector<char>& data = m_file_data[slot];
            // if there are padding files, those blocks will be considered
            // completed even though they haven't been written to the storage.
            // in this case, just extend the piece buffer to its full size
            // and fill it with zeroes.
            if (data.size() < piece_size) data.resize(piece_size, 0);
            ph.h.update(&data[ph.offset], left);
        }
        return ph.h.final();
    }

    virtual bool release_files() { return false; }
    virtual bool delete_files() { return false; }

    std::map<int, std::vector<char> > m_file_data;
    file_storage m_files;
};
You'd also need a constructor function to pass in through the add_torrent_params struct when adding a torrent:
storage_interface* temp_storage_constructor(
    file_storage const& fs, file_storage const* mapped
    , std::string const& path, file_pool& fp
    , std::vector<boost::uint8_t> const& prio)
{
    return new temp_storage(fs);
}
From this point it should be fairly straightforward to store it in a MySQL database or any other back-end.
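For example (sketched against the same pre-1.0 storage_interface API as above; here ses is assumed to be your libtorrent::session and ti the torrent_info):
libtorrent::add_torrent_params p;
p.ti = ti;
p.save_path = ".";                     // required, but temp_storage never touches it
p.storage = &temp_storage_constructor; // plug in the RAM-backed storage
libtorrent::torrent_handle h = ses.add_torrent(p);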
If you're on Linux, you could torrent into a tmpfs mount; this will avoid writing to disk. That said, this obviously means you're storing large files in RAM; make sure you have enough memory to deal with this.
Note also that most Linux distributions have a tmpfs mount at /dev/shm, so you could simply point libtorrent to a file there.
I've implemented a torrent client in Go just for this purpose. I wanted to be able to handle and control the data directly, for use in writing torrentfs, and to have storage backends to S3 and various databases.
It would be trivial to plug in an in-memory storage backend to this client.
Try giving the library a cStringIO "file handle" instead of a real file handle. That works for most Python libraries.

Linux to Windows C++ byte array

I have to replicate the following Java functionality in C++ to get data from Linux to Windows. Is Winsock2 the best way to go?
Also, any reference code to suggest?
TIA,
B
import java.nio.ByteBuffer;

public class MessageXdr {
    private ByteBuffer buffer;
    private int size;

    // maximum message body size
    private static final int T_MAX_CORPS_MSG = 16384;

    public MessageXdr() {
        buffer = ByteBuffer.allocate(4 * T_MAX_CORPS_MSG);
        size = 0;
    }

    public MessageXdr(byte[] array) {
        ByteBuffer tmpBuffer = ByteBuffer.wrap(array);
        buffer = tmpBuffer.asReadOnlyBuffer();
        size = array.length;
    }

    public int getSize() {
        return size;
    }

    public int getPosition() {
        return buffer.position();
    }

    public byte[] getArray() {
        return buffer.array();
    }

    public void resetBuffer() {
        size = 0;
        buffer.rewind();
    }

    public int readInt() {
        return buffer.getInt();
    }

    public long readUnsignedInt() {
        ByteBuffer tmp = ByteBuffer.allocate(8);
        tmp.putInt(0);
        tmp.putInt(buffer.getInt());
        return tmp.getLong(0);
    }

    public float readFloat() {
        return buffer.getFloat();
    }

    public void writeInt(int v) {
        buffer.putInt(v);
        size += 4;
    }

    public void writeFloat(float v) {
        buffer.putFloat(v);
        size += 4;
    }
}
If you are allowed to use the MFC classes (CSocket), it might be closer to the code you have in Java.
http://msdn.microsoft.com/en-us/library/wxzt95kb(VS.80).aspx
Otherwise, Winsock2 is fine (the MFC classes just use that in their implementation).
I haven't worked with it yet, but when it comes to marshalling more complex data structures, I would look into Boost for the serialization part.
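For instance, a minimal Boost.Serialization sketch (note that Boost's raw binary archives are not portable across platforms, so for Linux-to-Windows you'd want a text or portable archive):
#include <boost/archive/text_oarchive.hpp>
#include <fstream>

struct Message {
    int id;
    float value;

    template<class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/)
    {
        ar & id & value; // one member list serves both load and save
    }
};

int main()
{
    std::ofstream out("msg.txt");
    boost::archive::text_oarchive oa(out);
    Message m = { 42, 3.14f };
    oa << m; // text archives are portable between Linux and Windows
}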
For the actual data transmission, Winsock2 is the basic socket API on Windows; all other APIs are built on it (well, I don't know about Windows 7). But again, looking into Boost could give you something platform-independent that you don't have to figure out twice. From my experience, though, sockets are complex beasts, so you will have to figure out a lot anyway...
And avoid CSocket from MFC; it's the worst implementation ever. (Even if some say they fixed some of its misbehaviours, it's just not worth it.)
Strict byte arrays don't need any translation from Linux to Windows or other systems. If you are dealing with integers and floats, however, you need to agree on a byte order; the Java code above uses ByteBuffer's default, which is big-endian.
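For example, a hand-rolled reader for one big-endian int could look like this (readInt here is illustrative, not from any library):
#include <cstdint>
#include <cstring>
#ifdef _WIN32
#include <winsock2.h>   // ntohl
#else
#include <arpa/inet.h>  // ntohl
#endif

// Read a 32-bit big-endian integer from buf at pos, advancing pos.
int32_t readInt(const unsigned char* buf, std::size_t& pos)
{
    std::uint32_t raw;
    std::memcpy(&raw, buf + pos, sizeof raw);     // memcpy avoids alignment/aliasing trouble
    pos += sizeof raw;
    return static_cast<std::int32_t>(ntohl(raw)); // network (big-endian) to host order
}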
Personally I would use Poco::BinaryWriter and Poco::BinaryReader
http://pocoproject.org/docs/Poco.BinaryWriter.html
using namespace Poco;
using namespace std;

std::ofstream myFile("path", ios::out | ios::binary);
BinaryWriter writer(myFile, BinaryWriter::BIG_ENDIAN_BYTE_ORDER);
writer << 10.0f;
writer << 10000;
// etc. etc.
myFile.close();
Now, to read:
std::ifstream myFile("path", ios::in | ios::binary);
BinaryReader reader(myFile, BinaryReader::BIG_ENDIAN_BYTE_ORDER);

int intVariable;
float floatVariable;
reader >> floatVariable;
reader >> intVariable;
// etc. etc.
myFile.close();