Can protobuf read partially? - c++

I want to save my terrain data to a file and load only parts of it, because it is too big to hold in memory as a whole. I don't even know whether protobuf is suitable for this purpose.
For example, I would have a structure like this (it might be grammatically invalid; I only know the basics):
message Quad {
required int32 x = 1;
required int32 z = 2;
repeated int32 y = 3;
}
The x and z values are available in my program, and I would like to use them to find the Quad object with the same x and z in the file, in order to obtain its y values. However, I can't just parse the file with ParseFromIstream(), because I think it loads the whole file into memory, and in my case the file is too big.
So, can protobuf load one object, hand it to me to check, and give me the next one if it is not the one I want?
Actually, I could just ask: does ParseFromIstream() load the whole file into memory?

While some libraries do allow you to read files partially, the technique recommended by Google is to simply have the file consist of multiple messages:
https://developers.google.com/protocol-buffers/docs/techniques
Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if
you are dealing in messages larger than a megabyte each, it may be time to consider an
alternate strategy.
That said, Protocol Buffers are great for handling individual messages within a large data
set. Usually, large data sets are really just a collection of small pieces, where each small
piece may be a structured piece of data.
So you could just write a long sequence of Quad messages to the file, delimited by the lengths of the messages. If you need to seek randomly to specific Quads, you may want to add some kind of an index.

This depends on which implementation you are using. Some have "read as a sequence" APIs. For example, assuming you stored it as a "repeated Quad", then with protobuf-net that would be:
int x = ..., z = ...;
var found = Serializer.DeserializeItems<Quad>(source)
    .Where(q => q.x == x && q.z == z);
The point being: it yields a spooling (not loaded all at once) and short-circuiting sequence.
I don't know the C++ API specifically, but I would hope it has something similar. In the worst case, you could parse the varint length headers yourself and prepare a length-capped stream for each message.
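For the worst case, here is a rough sketch in plain C++ of reading varint length headers and pulling messages one at a time, stopping at the first match. It uses only the standard library; findFirst and the other helper names are invented for this example, and each payload is treated as opaque bytes that you would pass to ParseFromString(). As far as I know, the C++ library's CodedInputStream offers ReadVarint32 and PushLimit for the same job.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Encode `value` as a base-128 varint, the encoding protobuf uses for lengths.
void writeVarint(std::ostream& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.put(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.put(static_cast<char>(value));
}

// Decode one varint; returns false on end of stream or malformed input.
bool readVarint(std::istream& in, std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        const int byte = in.get();
        if (byte == std::char_traits<char>::eof()) return false;
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Read length-delimited messages one at a time and stop at the first match.
// Only one message is held in memory at any moment.
bool findFirst(std::istream& in,
               const std::function<bool(const std::string&)>& matches,
               std::string& found) {
    std::uint64_t length = 0;
    while (readVarint(in, length)) {
        std::string message(static_cast<std::size_t>(length), '\0');
        if (length > 0 &&
            !in.read(&message[0], static_cast<std::streamsize>(length)))
            return false;  // truncated file
        if (matches(message)) {
            found = message;
            return true;
        }
    }
    return false;
}
```

The predicate is where you would parse the bytes into a Quad and compare x and z; anything after the match is never read.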

How to copy every N-th byte(s) of a C array

I am writing a bit of code in C++ where I want to play a .wav file and perform an FFT (with fftw) on it as it comes in (and eventually display that FFT on screen with ncurses). This is mainly a "for giggles/to see if I can" project, so I have no restrictions on what I can or can't use, aside from wanting to keep the result fairly lightweight and cross-platform (I'm doing this on Linux for the moment). I'm also trying to do this "right" and not just hack it together.
I'm using SDL2_audio to achieve the playback, which is working fine. The callback is called at some interval requesting N bytes (seems to be desiredSamples*nChannels). My idea is that at the same time I'm copying the memory from my input buffer to SDL I might as well also copy it in to fftw3's input array to run an FFT on it. Then I can just set ncurses to refresh at whatever rate I'd like separate from the audio callback frequency and it'll just pull the most recent data from the output array.
The catch is that the input file stores the channels interleaved, i.e. "(LR) (LR) (LR) ...". SDL expects this layout, but I need a way to extract just one channel to send to FFTW.
The audio callback format from SDL looks like so:
void myAudioCallback(void* userdata, Uint8* stream, int len) {
SDL_memset(stream, 0, len); // sizeof(stream) would only be the size of the pointer
SDL_memcpy(stream, audio_pos, len);
audio_pos += len;
}
where userdata is (currently) unused, stream is the array that SDL wants filled, and len is the length of stream (i.e. the number of bytes SDL is looking for).
As far as I know there's no way to get memcpy to just copy every other sample (read: Copy N bytes, skip M, copy N, etc). My current best idea is a brute-force for loop a la...
// pseudocode
for (int i=0; i<len/2; i++) {
fftw_in[i] = audio_pos + 2*i*sizeof(sample)
}
or even more brute force by just reading the file a second time and only taking every other byte or something.
Is there another way to go about accomplishing this, or is one of these my best option? It feels kind of kludgey to go from a nice one-line memcpy that sends the data to SDL to some sort of weird loop that sends it to fftw.
The OP's solution can be simplified (for copying bytes):
// pseudocode
const char* s = audio_pos;
for (int d = 0; s < audio_pos + len; d++, s += 2*sizeof(sample)) {
fftw_in[d] = *s;
}
If I knew what fftw_in is, I would memcpy blocks of sizeof(*fftw_in).
Please check the assembly generated by @S.M.'s solution.
If the code is not vectorized, I would use intrinsics (depending on your hardware support) like _mm_mask_blend_epi8.
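For reference, a plain loop that pulls one channel out of interleaved audio might look like the sketch below. It assumes signed 16-bit native-endian samples and a double-precision FFT input, neither of which is stated in the question, so check the format in your SDL_AudioSpec first. The function name extractChannel is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy one channel out of interleaved signed 16-bit PCM into doubles.
// ASSUMPTION: samples are native-endian int16_t; adjust for other formats.
std::vector<double> extractChannel(const std::uint8_t* bytes, std::size_t len,
                                   std::size_t channels, std::size_t channel) {
    if (channels == 0 || channel >= channels) return {};
    const std::size_t frameBytes = channels * sizeof(std::int16_t);
    const std::size_t frames = len / frameBytes;  // ignore a trailing partial frame
    std::vector<double> out(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        std::int16_t sample;
        // memcpy avoids alignment and aliasing problems with the byte buffer.
        std::memcpy(&sample, bytes + i * frameBytes + channel * sizeof(sample),
                    sizeof(sample));
        out[i] = static_cast<double>(sample);
    }
    return out;
}
```

A loop like this is simple enough that compilers often vectorize it on their own, so it is worth measuring before reaching for intrinsics.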

zlib's compress function is not doing anything. Why?

before = new unsigned char[mSizeNeeded*4];
uLong value = compressBound(mSizeNeeded*4);
after = new unsigned char[value];
compress(after, &value, before, mSizeNeeded*4);
fwrite(&after, 1, value, file);
'before' has a bunch of audio data stored into it and I am trying to compress it and store it into 'after'. I then write it into a file. The file is the same size as the original file, it also contains the same data that was in before (as far as I can tell).
Compress also returns OK so I know that the compression is not failing.
Okay, so it looks like my only problem is somewhere in the compression (I think). I am able to run compress and then I can uncompress and get the correct data out. Also, it is writing into the file and fwrite returns 561152 but the count (value) is 684964. So it looks like something is wrong with fwrite. I looked more carefully and the after data is different than the before data.
561152 is the same size as the original audio data in a .wav file that I have (stripped of the .wav headers of course).
Based on your original text:
fwrite (&before, ...
I am trying to compress it and store it into 'after'. I then write it into a file.
I think not. You are writing the original data to the file, you should probably be writing after instead.
The other thing you should get in the habit of doing is checking return values from functions that you care about. In other words, compress() will tell you if a problem occurs yet you seem to be totally ignoring the possibility.
Similarly, fwrite() also uses its return value to indicate whether it was successful. Since you haven't included the code showing how the file is set up, this is also a distinct possibility. In particular, fwrite is under no obligation to write your entire block to the file in one hit (the device may be full, etc.); that's why it has a return value, so you can detect and adjust for that situation. Note also that after is already a pointer to the buffer, so it should be passed as after; &after is the address of the pointer variable itself. Often, a better option than:
fwrite (after, 1, value, file);
is:
fwrite (after, value, 1, file);
since the latter always returns one for a fully successful write and something else for a failure of some description.
That would be my first step in establishing where the problem lies.
On top of that, there are numerous other (generally-applicable) methods you can use to track down the issue, such as:
outputting all variables after they change or are set (like the return values of functions, after, before, value and so on).
deleting the output file before running your program, to ensure it's created afresh.
running the code through a debugger so you can see what's happening under the covers.
clearing after to all zero bytes (or a known pattern) to ensure you don't get stale data in there.
And, as a final approach (given that the zlib source code is freely available), you can also modify (or debug into) it so that you can clearly see what's going on under the covers.
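As a small illustration of checking the write, here is a sketch of a helper (writeAll is a made-up name) that passes the buffer pointer itself and treats anything other than a complete write as a failure. It deliberately leaves zlib out and only covers the fwrite side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Write `size` bytes starting at `data` and report whether all were written.
// Pass the buffer pointer itself (e.g. `after`), not the address of the
// pointer variable (`&after`), which points at the pointer's own storage.
bool writeAll(std::FILE* file, const unsigned char* data, std::size_t size) {
    if (size == 0) return true;  // nothing to write
    // With element size `size` and count 1, fwrite returns 1 only when the
    // whole block was written.
    return std::fwrite(data, size, 1, file) == 1;
}
```

In the question's code this would be called as writeAll(file, after, value) once compress() has returned Z_OK and updated value.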

Fortran unformatted output with each MPI process writing part of an array

In my parallel program, there is a big matrix. Each process computes and stores a part of it. The program then writes the matrix to a file by letting each process write its own part of the matrix in the correct order. The output file is in "unformatted" form. But when I tried to read the file in a serial code (with the correct size of the big matrix allocated), I got an error which I don't understand.
My question is: in an MPI program, how do you get a binary file as the serial version output for a big matrix which is stored by different processes?
Here is my attempt:
if(ThisProcs == RootProcs) then
open(unit = file_restart%unit, file = file_restart%file, form = 'unformatted')
write(file_restart%unit)psi
close(file_restart%unit)
endif
#ifdef USEMPI
call mpi_barrier(mpi_comm_world,MPIerr)
#endif
do i = 1, NProcs - 1
if(ThisProcs == i) then
open(unit = file_restart%unit, file = file_restart%file, form = 'unformatted', status = 'old', position = 'append')
write(file_restart%unit)psi
close(file_restart%unit)
endif
#ifdef USEMPI
call mpi_barrier(mpi_comm_world,MPIerr)
#endif
enddo
Psi is the big matrix, it is allocated as:
Psi(N_lattice_points, NPsiStart:NPsiEnd)
But when I tried to load the file in a serial code:
open(2,file=File1,form="unformatted")
read(2)psi
I got this error (I am using MSVS 2012 with Intel Fortran 2013):
forrtl: severe (67): input statement requires too much data, unit 2
How can I fix the parallel part to make the binary file readable for the serial code? Of course one can combine them into one big matrix in the MPI program, but is there an easier way?
Edit 1
The two answers are really nice. I'll use access = "stream" to solve my problem. And I just figured I can use inquire to check whether the file is "sequential" or "stream".
This isn't a problem specific to MPI, but would also happen in a serial program which took the same approach of writing out chunks piecemeal.
Ignore the opening and closing for each process and look at the overall connection and transfer statements. Your connection is an unformatted file using sequential access. It's unformatted because you explicitly asked for that, and sequential because you didn't ask for anything else.
Sequential file access is based on records. Each of your write statements transfers out a record consisting of a chunk of the matrix. Conversely, your input statement attempts to read from a single record.
Your problem is that while you try to read the entire matrix from the first record of the file that record doesn't contain the whole matrix. It doesn't contain anything like the correct amount of data. End result: "input statement requires too much data".
So, you need to either read in the data based on the same record structure, or move away from record files.
The latter is simple: use stream access.
open(unit = file_restart%unit, file = file_restart%file, &
form = 'unformatted', access='stream')
Alternatively, read with a similar loop structure:
do i=1, NPROCS
! read statement with a slice
end do
This of course requires understanding the correct slicing.
Alternatively, one can consider using MPI-IO for output, which is very similar to using stream output. Read this back in with stream access. You can find out more about this concept elsewhere on SO.
Fortran unformatted sequential writes in record files are not quite completely raw data. Each write will have data before and after the record in a processor dependent form. The size of your reads cannot exceed the record size of your writes. This means if psi is written in two writes, you will need to read it back in two reads, you cannot read it in at once.
Perhaps the most straightforward option is to use stream access instead of sequential. A stream file is indexed by bytes (generally) and does not contain record start and end information. Using this access method you can split the write across several statements but read everything at once. Stream access is a feature of Fortran 2003.
If you stick with sequential access, you'll need to know how many MPI ranks wrote the file and loop over properly sized records to read the data as it was written. You could make the user specify the number of ranks or store that as the first record in the file and read that first to determine how to read the rest of the data.
If you are writing MPI, why not use MPI-IO? Each process calls MPI_File_set_view to set a subarray view of the file, and then all processes write their data collectively with MPI_FILE_WRITE_ALL. This approach is likely to scale really well on big machines (though your approach will be fine up to maybe 100 processors).
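To make the record structure concrete, here is a sketch (in C++, since it only needs byte-level file access) that walks a sequential unformatted file record by record. It assumes each record is framed by a leading and a trailing 4-byte native-endian length marker. That layout is common but compiler dependent, so treat it as an assumption, not something the standard guarantees. The name readRecords is made up.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read every record of a sequential unformatted file and collect the payloads.
// ASSUMPTION: records are framed by leading and trailing 4-byte native-endian
// length markers; the Fortran standard leaves this layout to the compiler.
bool readRecords(std::istream& in, std::vector<std::string>& records) {
    std::uint32_t head = 0;
    while (in.read(reinterpret_cast<char*>(&head), sizeof(head))) {
        std::string payload(head, '\0');
        if (head > 0 && !in.read(&payload[0], head)) return false;
        std::uint32_t tail = 0;
        if (!in.read(reinterpret_cast<char*>(&tail), sizeof(tail)) || tail != head)
            return false;  // framing does not match the assumed layout
        records.push_back(payload);
    }
    // Success only if the loop ended cleanly at a record boundary.
    return in.eof() && in.gcount() == 0;
}
```

A file written by NProcs separate write statements shows up here as NProcs records, each holding one slice of psi, which is why a single read(2) psi asks for more data than the first record contains.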

Data structure for quick access to glyph textures via char

I am attempting to create an edit box that allows users to input text. I've been working on this for some time now and have tossed around different ideas. Ultimately, the one I think that would offer the best performance is to load all the characters from the .ttf (I'm using SDL to manage events, windows, text, and images for openGL) onto their own surface, and then render those surfaces onto textures one time. Then each frame, I can just bind an appropriate texture in the appropriate location.
However, now I'm thinking about how to access these glyphs. My limited background would suggest something like this:
struct CharTexture {
    char glyph;
    GLuint TextureID;
    int Width;
    int Height;
    CharTexture* Next;
};
//Code
CharTexture* FindGlyph(char Foo) {
    CharTexture* Poo = _FirstOne;
    while( Poo != NULL ) {
        if( Foo == Poo->glyph ) {
            return Poo;
        }
        Poo = Poo->Next;
    }
    return NULL;
}
I know that will work. However, it seems very wasteful to iterate the entire list each time. My scripting experience has taught me some lua and they have tables in lua that allow for unordered indices of all sorts of types. How could I mimic it in C++ such that instead of this iteration, I could do something like:
CharTexture* FindGlyph(char Foo) {
return PooPointers[Foo]; //somehow use the character as a key to get pointer to glyph without iteration
}
I was thinking I could try converting to the numerical value, but I don't know how to convert char to UTF8 values and if I could use those as keys. I could convert to ascii but would that handle all the characters I would want to be able to type? I am trying to get this application to run on mac and windows and am not sure about the machine specifics. I've read about the differences of the different format (ascii v unicode v utf8 v utf16 etc)... I understand it has to do with bit width and endianness but I understand relatively little about the interface differences between platforms and implications of said endianness on my code.
Thank you
What you probably want is
std::map<char,CharTexture*> PooPointers;
Using the array access operator (operator[]) still performs a search in the map behind the scenes, but it takes time logarithmic in the size of the map rather than a linear scan.
What g-makulik has said is probably right; the map may be what you're after. To expand on the reply, maps are automatically sorted based on the key (char in this case), so lookups by character are quick (logarithmic in the number of entries) using
CharTexture* pCharTexture = PooPointers[Foo];
This suits a sparse data structure where you don't predefine the texture for each character.
Note that running the code above where an entry doesn't exist will create a default entry (a null pointer) in the map.
Depending on your general needs you could also use a simple vector if generalized sorting isn't important or if you know that you'll always have a fixed number of characters. You could fill the vector with predefined data for each possible character.
It all depends on your memory requirements.
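The two options could be sketched roughly as follows: a dense 256-slot table indexed by the byte value, and a sparse hash map keyed by code point. GlyphTable and findGlyph are invented names, textureId is an unsigned int standing in for GLuint, and nothing here deals with decoding UTF-8 input into code points.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <unordered_map>

struct CharTexture {
    unsigned int textureId;  // stand-in for GLuint
    int width;
    int height;
};

// Dense table: one slot per possible byte value, constant-time lookup.
class GlyphTable {
public:
    void add(char c, const CharTexture& texture) {
        slots_[index(c)] = texture;
        present_[index(c)] = true;
    }
    // Returns nullptr when no glyph has been added for `c`.
    const CharTexture* find(char c) const {
        return present_[index(c)] ? &slots_[index(c)] : nullptr;
    }

private:
    // char may be signed, so convert through unsigned char before indexing.
    static std::size_t index(char c) { return static_cast<unsigned char>(c); }
    std::array<CharTexture, 256> slots_{};
    std::array<bool, 256> present_{};
};

// Sparse alternative keyed by Unicode code point.
const CharTexture* findGlyph(
    const std::unordered_map<char32_t, CharTexture>& glyphs, char32_t codePoint) {
    const auto it = glyphs.find(codePoint);  // find() never inserts an entry
    return it == glyphs.end() ? nullptr : &it->second;
}
```

Using find() instead of operator[] avoids silently creating empty entries for characters that were never loaded.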

Protocol Buffers - Reading header (nested message) common across all messages

I am currently evaluating Protocol Buffers for use in a project (no code written as of yet). One of the things I'm unclear on is how you would read part of an encoded message, for example say I have a common header:
message Header {
required uint16 msg_type = 1;
required uint16 length = 2;
}
And say I deliver multiple different messages to a queue. How would the consumer work out how much data to read per message and what message type it should be constructed as?
There should be no need for a Header message here; the most common approach is to follow the "streaming" advice from here. Within that, you could either treat it as a sequence of identical union type messages, or (my preference) when writing, instead of just writing a length-prefix before each, include a varint that indicates the message type, then the length (as a varint). The number that indicates the message type is some arbitrary map you invent (so 1 = Foo, 2 = Bar, 3 = Blap, etc.). If you left-shift the message-type by 3 bits then "or" 2 (the wire type for length-delimited fields), then it will also be a well-formed protobuf stream itself, 100% identical to a repeated YourUnionType.
Basically, this is exactly the same as this answer, but instead of being field 1 each time, the number varies per message-type. Most implementations have a reader/writer API that make it possible to read and write raw varints, and to length-restrict the reader API. Some implementations have helper mechanisms to support streams of heterogeneous messages directly (basically, doing all the above for you).
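A rough sketch of that framing using only the standard library: each message is written as a varint tag of (type << 3) | 2, a varint length, and then the payload. The helper names are invented here, and the payload strings stand in for serialized Foo, Bar or Blap messages.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Base-128 varint encoding, as used on the protobuf wire.
void writeVarint(std::ostream& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.put(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.put(static_cast<char>(value));
}

bool readVarint(std::istream& in, std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        const int byte = in.get();
        if (byte == std::char_traits<char>::eof()) return false;
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Frame one message: tag = (type << 3) | 2, then length, then payload.
void writeFramed(std::ostream& out, std::uint32_t messageType,
                 const std::string& payload) {
    writeVarint(out, (static_cast<std::uint64_t>(messageType) << 3) | 2u);
    writeVarint(out, payload.size());
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

// Read one framed message; the caller dispatches on `messageType`.
bool readFramed(std::istream& in, std::uint32_t& messageType,
                std::string& payload) {
    std::uint64_t tag = 0, length = 0;
    if (!readVarint(in, tag) || (tag & 7u) != 2u) return false;
    if (!readVarint(in, length)) return false;
    messageType = static_cast<std::uint32_t>(tag >> 3);
    payload.assign(static_cast<std::size_t>(length), '\0');
    return length == 0 ||
           static_cast<bool>(in.read(&payload[0],
                                     static_cast<std::streamsize>(length)));
}
```

The consumer reads the type and length first, then hands exactly that many bytes to the parser for the matching message class.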
In a recent project, I used Protocol Buffers like this:
We had one 'container' message that included all the actual messages as optional members:
message ContainerMessage {
optional Message1 message_1 = 1;
optional Message2 message_2 = 2;
//...
optional MessageN message_N = N;
}
Inside an application, you could just use ContainerMessage as a discriminated union of the real Messages.
Between applications, we serialized/deserialized the ContainerMessage and sent the serialized content, prefixed with a simple header containing the length of the serialized content.
That will depend on the protocol you are using.
Note that e.g. a lot of protocols go via serial interfaces, where you might have extra lines telling when a message starts and stops.
Often, messages will have their length at a fixed offset after the message start.
In other cases, you might need to parse the message element by element to find out how much of the message is left. So a string embedded in the message may be of fixed length, or have the length at the beginning, or might have \0 as end marker.
Mostly, when you store messages in a queue for further processing, you will want to add some more information to make your life easier - like when you just have an extra signal telling you when the message stops, you might store the message internally with its length.