So, I'm currently working on a project where Protocol Buffers is used extensively, mainly as a way to store complex objects in a key-value database.
Would a migration to FlatBuffers provide a considerable benefit in terms of performance?
More generally, is there ever a good reason to use Protocol Buffers instead of FlatBuffers?
Protocol buffers are optimized for space consumption on the wire, so for archival and storage, they are very efficient. However, the complex encoding is expensive to parse, and so they are computationally expensive, and the C++ API makes heavy use of dynamic allocations. Flat buffers, on the other hand, are optimized for efficient parsing and in-memory representation (e.g. offering zero-copy views of the data in some cases).
It depends on your use case which of those aspects is more important to you.
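To make the contrast concrete, here is a rough sketch of what accessing a field looks like in each system. It assumes generated code from a hypothetical Point{x, y} schema, with the Protocol Buffers output in a pb namespace and the FlatBuffers output in an fb namespace; none of these names come from the question.

```cpp
#include <cstdint>
#include <string>
// Generated headers from two hypothetical Point{x, y} schemas, assumed to
// live in namespaces pb (protoc output) and fb (flatc output):
#include "point.pb.h"
#include "point_generated.h"
#include "flatbuffers/flatbuffers.h"

void demo(const std::string& pbBytes, const uint8_t* fbBytes)
{
    // Protocol Buffers: a full parse/unpack step into a separate object,
    // copying fields out of the wire format (often with heap allocations).
    pb::Point msg;
    msg.ParseFromString(pbBytes);
    float x = msg.x();

    // FlatBuffers: no parse step. GetRoot just reinterprets the buffer in
    // place, and the accessors read directly from it (zero-copy).
    const fb::Point* p = flatbuffers::GetRoot<fb::Point>(fbBytes);
    float y = p->y();
    (void)x; (void)y;
}
```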
Quoting from the FlatBuffers page:

Why not use Protocol Buffers, or .. ?

Protocol Buffers is indeed relatively similar to FlatBuffers, with the primary difference being that FlatBuffers does not need a parsing/unpacking step to a secondary representation before you can access data, often coupled with per-object memory allocation. The code is an order of magnitude bigger, too. Protocol Buffers has neither optional text import/export nor schema language features like unions.
I'm coding a little HTTP 1.1 web server in C++98 (the C++ version mandated by my school), and I haven't decided yet which data type I'm going to use to perform the request parsing, or how.
Since I'll be receiving read-only data (by read-only I mean that I don't have to modify the buffer) from a user agent, would it make sense to use std::string to store the incoming data?
HTTP syntax is very straightforward and can be parsed with a finite state machine. Iterating over a const char * seems enough and doesn't make any allocations; I can use the buffer that recv gives me (a rough sketch of what I mean is below).
On the other hand, I could use std::string facilities like find and substr to parse the request, but that would lead to memory allocations.
My server doesn't need to be as efficient as nginx, but I'm still worried about the performance of my application.
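Here is a minimal, untested sketch of the state-machine idea (all names are mine, just for illustration): it scans the request line "METHOD SP URI SP VERSION CRLF" in place and returns pointers into the recv buffer, with no allocations.

```cpp
#include <cstddef>

// Hypothetical helper: walk the raw recv() buffer with a tiny state
// machine. The out-parameters are just pointers into the caller's
// buffer, so nothing is copied or allocated.
bool parseRequestLine(const char* buf, std::size_t len,
                      const char** methodEnd, const char** uriEnd)
{
    enum State { METHOD, URI, VERSION, DONE };
    State state = METHOD;
    *methodEnd = *uriEnd = 0;

    for (std::size_t i = 0; i < len; ++i) {
        char c = buf[i];
        switch (state) {
        case METHOD:  // consume until the space after the method token
            if (c == ' ') { *methodEnd = buf + i; state = URI; }
            break;
        case URI:     // consume until the space after the request target
            if (c == ' ') { *uriEnd = buf + i; state = VERSION; }
            break;
        case VERSION: // consume until the terminating CRLF
            if (c == '\r' && i + 1 < len && buf[i + 1] == '\n') state = DONE;
            break;
        case DONE:
            return true;
        }
    }
    return state == DONE;
}
```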
I'm eager to know your thoughts.
Definitely. It's a school project, not a high-performance production server (in which case you'd be using a more modern C++ variant).
The biggest performance problem you'd typically have with std::string is not parsing but string building: a + b + c + d + e can be rather inefficient. These are details, really: just start by writing a correct implementation, and then see which parts are too slow in testing. There are very few projects, even in commercial software development, where that's not the right approach.
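To make the string-building point concrete, here is a small C++98 sketch (function names are hypothetical) contrasting naive concatenation, which can allocate a fresh temporary for each +, with a single reserve() followed by appends:

```cpp
#include <string>

// Naive chaining: each + may create a new temporary string, so building
// a response this way can allocate several times per call.
std::string buildNaive(const std::string& status, const std::string& headers,
                       const std::string& body)
{
    return "HTTP/1.1 " + status + "\r\n" + headers + "\r\n" + body;
}

// Reserving the final size once and appending avoids the temporaries.
std::string buildReserved(const std::string& status, const std::string& headers,
                          const std::string& body)
{
    std::string out;
    out.reserve(9 + status.size() + 2 + headers.size() + 2 + body.size());
    out += "HTTP/1.1 ";
    out += status;
    out += "\r\n";
    out += headers;
    out += "\r\n";
    out += body;
    return out;
}
```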
I'm developing a server for a game. As you know, in games many data structures need to be mutable, but Clojure's data structures are immutable. Is there a good way to deal with this? Should I use Clojure for it?
Mutating data structures allows you to squeeze the last ounces of performance out of your code, but given that you're writing a server, network latency probably has a much greater impact than memory allocations. Clojure should be suitable, certainly as a starting point.
While Clojure's data structures are immutable, application state can be managed via atoms, refs, core.async loop state, and data pipelines. A Clojure application is hardly static just because its data structures are.
The biggest risk you face right now is figuring out what to build, and Clojure's live development model will accelerate the learning loop. You can redefine functions while the server is running and see their effects immediately.
I suggest you prototype your server in Clojure and then, if performance gains need to be made, profile the code. If necessary, you can introduce transients and volatiles, and port performance-critical sections to Java.
John Carmack recently tweeted:
"Another outdated habit is making separate buffer objects for vertexes
and indexes."
I am afraid I don't fully understand what he means. Is he implying that packing all 3D data including vertex, index, and joint data into a single vertex buffer is optimal compared to separate buffers for each? And, if so, would such a technique apply only to OpenGL or could a Vulkan renderer benefit as well?
I think he means there's no particular need to put them in different buffer objects. You probably don't want to interleave them at fine granularity, but putting e.g. all the indices for a mesh at the beginning of a buffer and then all the vertex data for the mesh after them is not going to be any worse than using separate buffer objects. Use offsets to point the binding points at the correct locations in the buffer (see the sketch below).
Whether it's actually better to put them in one buffer I don't know: if it is, it's probably for ancillary reasons, such as fewer, larger memory allocations tending to be a little more efficient, or you (or the driver) being able to do one large copy instead of two smaller ones when copies are necessary.
Edit: I'd expect this all to apply to both GL and Vulkan.
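A rough OpenGL sketch of that single-buffer layout (the Vertex struct and all identifiers are illustrative, and it assumes a desktop GL 3.x+ context with loaded entry points; note that some APIs, e.g. WebGL 1, forbid binding one buffer to both targets, which is the compatibility caveat mentioned below):

```cpp
#include <GL/gl.h>

struct Vertex { GLfloat pos[3]; };  // illustrative vertex layout

// Upload indices and vertices into ONE buffer object: indices at
// offset 0, vertex data immediately after them.
GLuint uploadMesh(const GLushort* indices, GLsizeiptr indexCount,
                  const Vertex* vertices, GLsizeiptr vertexCount)
{
    GLsizeiptr indexBytes  = indexCount  * sizeof(GLushort);
    GLsizeiptr vertexBytes = vertexCount * sizeof(Vertex);

    GLuint buf;
    glGenBuffers(1, &buf);
    glBindBuffer(GL_ARRAY_BUFFER, buf);
    glBufferData(GL_ARRAY_BUFFER, indexBytes + vertexBytes, NULL, GL_STATIC_DRAW);
    glBufferSubData(GL_ARRAY_BUFFER, 0,          indexBytes,  indices);
    glBufferSubData(GL_ARRAY_BUFFER, indexBytes, vertexBytes, vertices);

    // The SAME buffer object serves both binding points; offsets do the rest.
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, buf);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex),
                          (const GLvoid*)indexBytes);  // vertex data starts after the indices
    glEnableVertexAttribArray(0);
    return buf;
}

// Later: glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT,
//                       (const GLvoid*)0);  // index data sits at offset 0
```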
John Carmack has replied with an answer regarding his original tweet:
"The performance should be the same, just less management. I wouldn't bother changing old code."
...
"It isn't a problem, it just isn't necessary on modern hardware. No big efficiency to packing them, just a modest management overhead."
So perhaps it isn't an outdated habit at all, especially since it goes against the intended use case of most APIs and in some cases can break compatibility, as noted by Nico.
I have been trying to understand how a buffer is constructed. As I understand it, a buffer is a hardware construct, as logic gates are (please correct me if I am wrong). So I was wondering: is a buffer a location/block fixed by the hardware manufacturer, or can it be any location reserved by the software/OS? I mean any buffer, i.e. a data buffer, a cache buffer, etc.
Apologies if my question is a bit vague. I am just trying to understand how a buffer is implemented, and at what level.
A buffer is simply a temporary storage facility for passing data between subsystems. The nature of that buffer (and definition of subsystems) depends on how and where it is used.
Hardware (such as a CPU) may implement a memory cache, which is a type of buffer. Being in hardware, its size is pretty much fixed, but the actual size depends on the hardware design.
(Generically) In software, a buffer is typically a chunk of memory reserved by the application that is used to temporarily store data generated by a producer and passed to a consumer for processing. It can be of static (fixed) size or expanded/contracted dynamically; that really depends on the application's needs and is decided by the developer/designer. A common fixed-size example is sketched below.
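As an illustration of the software case, here is a minimal fixed-size ring buffer in C++ (all names are hypothetical): a producer pushes bytes in, and a consumer pops them out later.

```cpp
#include <cstddef>

// Minimal fixed-size ring buffer passing bytes from a producer to a
// consumer. Real implementations add locking, blocking, resizing, etc.
class RingBuffer {
public:
    RingBuffer() : head_(0), tail_(0), count_(0) {}

    bool push(char c) {                     // producer side
        if (count_ == SIZE) return false;   // buffer full
        data_[head_] = c;
        head_ = (head_ + 1) % SIZE;
        ++count_;
        return true;
    }

    bool pop(char& c) {                     // consumer side
        if (count_ == 0) return false;      // buffer empty
        c = data_[tail_];
        tail_ = (tail_ + 1) % SIZE;
        --count_;
        return true;
    }

private:
    static const std::size_t SIZE = 4096;   // fixed capacity
    char data_[SIZE];
    std::size_t head_, tail_, count_;
};
```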
A buffer is typically used for passing data between software and hardware. The most familiar being I/O. Because I/O is typically slow, data is usually buffered in some way to allow the software to continue running without having to wait for the I/O subsystem to finish.
I need a C++ library that can store and retrieve time series on demand to stream to client front-ends. I will be storing each component in structure-of-arrays format. I am currently using MySQL for correctness, but the DB access is starting to get ridiculously slow, and I am trying to migrate away from it. I could build such a library myself, but that is not my business goal and it would take quite a bit of implementation work to get right. I am looking for an existing solution that meets the following requirements:
O(1) lookup scheme
Excellent compression, each component is separated, so there should be plenty of redundancy that can be removed
Scalable to terabytes
(optional: Audit tracking)
Most important: transactional support. There is going to be BIG data, and I can't have a bad run corrupting an entire dataset, which would create an unnecessary burden for backups and downtime during restores.
Also check out TempoDB: http://tempo-db.com. I'm a co-founder, and we built the service to solve this problem. We don't have a C++ client yet, but we could work with you to develop one.
Take a look at OpenTSDB; it was developed at StumbleUpon by Benoit Sigoure:
http://opentsdb.net/
TeaFiles provide simple and efficient time series storage in flat files, enriched with item metadata and a description. They might be a building block of the system you aim for. Free open-source libraries currently exist for C++ (github.com/discretelogics/TeaFiles), C#, and Python.
I am a founder of discretelogics, and we coined this file format to overcome the limitations of flat-file time series storage while preserving its unrivaled speed.
Take a look at HDF5. It has a quick lookup scheme; it has C, C++, and Python interfaces; it supports compression; and it can get pretty big. It maintains metadata but doesn't do auditing, and you'll need a wrapper to handle multi-user access.
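As a rough sketch of what storing one component of your structure-of-arrays with HDF5's C++ API might look like (the file name, dataset name, and chunk/compression settings are all illustrative):

```cpp
#include "H5Cpp.h"

int main()
{
    const hsize_t n = 100000;
    double* prices = new double[n];          // one column of the structure of arrays
    for (hsize_t i = 0; i < n; ++i) prices[i] = 100.0 + 0.01 * i;

    H5::H5File file("series.h5", H5F_ACC_TRUNC);

    hsize_t dims[1] = { n };
    H5::DataSpace space(1, dims);

    // Chunking enables partial reads; deflate provides the compression.
    H5::DSetCreatPropList props;
    hsize_t chunk[1] = { 4096 };
    props.setChunk(1, chunk);
    props.setDeflate(6);

    H5::DataSet ds = file.createDataSet("/price", H5::PredType::NATIVE_DOUBLE,
                                        space, props);
    ds.write(prices, H5::PredType::NATIVE_DOUBLE);

    delete[] prices;
    return 0;
}
```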