C++ fill an empty buffer with a single value

I apologize in advance if I am using the incorrect terminology; I'm new to the C++ language. I have a class with a constructor that creates an empty buffer using malloc:
LPD6803PWM::LPD6803PWM(uint16_t leds, uint8_t dout, uint8_t cout) {
    numLEDs = leds;
    pixels = (uint16_t *) malloc(numLEDs);
    dataPin = dout;
    clockPin = cout;
}
My understanding is that this creates an empty buffer with the length of whatever I pass as numLEDs, which is essentially a dynamically created array, correct? I'm using malloc because this code runs on an Arduino that has very limited memory, and I want to avoid overflows. From what I have read, this is the best way to declare an array if you don't know what size it will be and you want to avoid overflow errors.
My question is: once this array has been created, is there a faster way than a traditional for loop to fill it with a single value? Very often I will want to do this, and even microseconds make a difference in this application. I know that the standard library's array classes have a fill method, but what about an array declared this way?

My question is: once this array has been created, is there a faster way than a traditional for loop to fill it with a single value?
The C standard library provides memset() and related functions for filling a buffer. There's also calloc(), which allocates a buffer just like malloc(), but fills the buffer with 0 at the same time.
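For example, a minimal sketch using the question's pixels buffer (the zero fill and the calloc variant are just for illustration):

#include <stdlib.h>  // malloc, calloc, free
#include <string.h>  // memset

// calloc() allocates and zero-fills in one step:
pixels = (uint16_t *) calloc(numLEDs, sizeof(uint16_t));

// memset() zero-fills an existing buffer; note that the size is in BYTES:
memset(pixels, 0, numLEDs * sizeof(uint16_t));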
Very often I will want to do this, and even microseconds make a difference in this application.
In that case you might consider ways to avoid repeatedly allocating the array, which could take more time than filling an existing one. Also, the easiest way to make your code go faster is to run it on faster hardware. Arduino is a great platform, but a Raspberry Pi Zero costs less ($5, if you can find one), has a LOT more memory, and has a clock speed that's 64x faster than a typical Arduino (1 GHz vs. 16 MHz). Computing is often a tradeoff between good, cheap, and fast, but in this case you get all three.

You can still use std::fill (or std::fill_n); most standard library implementations (e.g. gcc's libstdc++ and Clang's libc++) will delegate to memset when the iterator is a pointer and the element is a byte-sized type. Trust in the standard library writers!
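For instance, a sketch against the question's pixels buffer (the fill value is arbitrary):

#include <algorithm>  // std::fill, std::fill_n

// Assigns whole uint16_t elements, so any value works, not just byte patterns:
std::fill_n(pixels, numLEDs, (uint16_t) 0x7FFF);
// equivalently: std::fill(pixels, pixels + numLEDs, (uint16_t) 0x7FFF);

Unlike memset, this works correctly for any uint16_t value, since it assigns whole elements rather than individual bytes.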

You can use memset. But you have to be careful about the value you want to set. And you won't be much faster than using a for loop. The computer needs to set all these values somehow! memset may set larger contiguous memory spans and therefore be faster, but a smart compiler may do the same for a for loop.
If you're really concerned about microseconds you need to do some profiling.

Well, you can use memset, from string.h (<cstring> in C++):
memset(array, 0, size_of_array_in_bytes);
Note, however, that memset works byte by byte, i.e., it sets the first byte to the value you pass as the second parameter, then the second byte, and so on, which means you must be careful with multi-byte element types like uint16_t.
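A quick illustration of that caveat:

uint16_t buf[4];
memset(buf, 1, sizeof(buf));  // sets every BYTE to 0x01
// each element is now 0x0101 (257), not 1; only values whose bytes are all
// identical (such as 0 or 0xFF) fill a uint16_t array the way you'd expect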
Just a note:
malloc takes its size in bytes, not elements, so you should multiply its parameter by sizeof(uint16_t).
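That is, the allocation in the constructor should presumably read:

pixels = (uint16_t *) malloc(numLEDs * sizeof(uint16_t));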

Related

char* buffer = new vs char buffer[] in C++

1. char* buffer = new char[size]
2. char buffer[size]
I'm new to C++ and I see most places creating buffers using the first example. I know in the first method, the data in that part of memory can be passed on until manually deleted using delete[]. While using the second method, the buffer would have a lifetime depending on the scope. If I only plan on the buffer lasting through a particular function and I don't plan on passing it to anything else, does it matter which method I use?
char* buffer = new char[size]
This is portable, but should be avoided. Until you really know what you're doing, using new directly is almost always a mistake (and when you do know what you're doing, it's still a mistake, but you'll know that without being told).
char buffer[size]
This depends on how you've defined size. If it's a constant (and fairly small), then this is all right. If it's not a constant, then any properly functioning compiler is required to reject it (but some common ones accept it anyway).
If it's constant, but "large", the compiler will accept the code, but it's likely to fail when you try to execute it. In this case, anything over a million is normally too large, and anything more than a few hundred thousand or so becomes suspect.
There is one exception to that though: if this is defined outside any function (i.e., as a global variable), then it can safely be much larger than a local variable can be. At the same time, I feel obliged to point out that I consider global variables something that should normally be avoided as a rule (and I'm far from being alone in holding that opinion).
Also note that these two are (more or less) mutually exclusive: if size is a constant, you generally want to avoid dynamic allocation, but it has to be a constant to just define an array (again, with a properly functioning compiler).
Unless size is fairly small constant, most of the time you should avoid both of these. What you most likely want is either:
std::string buffer;
or:
std::vector<char> buffer(size);
or possibly:
std::array<char, size> buffer;
The first two of these can allocate space for the buffer dynamically, but generally keep the allocation "hidden", so you don't normally need to deal with it directly. The std::array is pretty much like char buffer[size] (e.g., it has a fixed size, and is really only suitable for fairly small sizes), but it enforces that the size be a constant, and gives you roughly the same interface as vector (minus anything that would change the number of elements, since that's a constant with std::array).
The main difference is that the first variant is dynamically allocated and the second one is not. You need dynamic allocation when you do not know at compile time how much memory you will need, that is, when size is not a constant but is somehow calculated at runtime depending on external input.
It is good practice¹ to use containers that handle dynamic memory internally and thus ensure that you do not have to delete manually, which is often a source of bugs and memory leaks.
A common, dynamic container for all kinds of data is
std::vector<char> (don't forget to #include <vector>)
However, if you handle text, use the class std::string, which also handles its memory internally. Raw char* arrays are a remnant of old C.
¹That good practice has one main exception: when you use not primitive data types but your own classes that store massive amounts of data. The reason is that std::vector<> performs copy operations when resized (and those are more expensive the larger the data is).
However, once you have come that far in your C++ projects, you should know about "smart pointers", which are the safe solution for those special cases.
By the way, with &the_vector[0] (address of the first element in the vector) you can get a pointer that behaves pretty much like the char array and thus can be used for older functions that do not accept vectors directly.
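For example (legacy_print is a made-up stand-in for such an older function):

#include <cstdio>
#include <vector>

void legacy_print(const char *data, int n) {  // C-style interface
    for (int i = 0; i < n; ++i) std::putchar(data[i]);
}

int main() {
    std::vector<char> buffer = {'h', 'i', '\n'};
    legacy_print(&buffer[0], (int) buffer.size());     // works pre-C++11 too
    legacy_print(buffer.data(), (int) buffer.size());  // data(), since C++11
}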

Memory allocation of C++ vector<bool>

The vector<bool> class in the C++ STL is optimized for memory, allocating one bit per stored bool rather than one byte. Every time I output sizeof(x) for a vector<bool> x, the result is 40 bytes for the vector structure itself. sizeof(x.at(0)) always returns 16 bytes, which must be allocated memory for many bool values, not just the one at position zero. How many elements do the 16 bytes cover? Exactly 128? What if my vector has more or fewer elements?
I would like to measure the size of the vector and all of its contents. How would I do that accurately? Is there a C++ library available for viewing allocated memory per variable?
I don't think there's any standard way to do this. The only information a vector<bool> implementation gives you about how it works is the reference member type, but there's no reason to assume that this has any congruence with how the data are actually stored internally; it's just that you get a reference back when you dereference an iterator into the container.
So you've got the size of the container itself, and that's fine, but to get the amount of memory taken up by the data, you're going to have to inspect your implementation's standard library source code and derive a solution from that. Though, honestly, this seems like a strange thing to want in the first place.
Actually, using vector<bool> is kind of a strange thing to want in the first place. All of the above is essentially why its use is frowned upon nowadays: it's almost entirely incompatible with conventions set by other standard containers… or even those set by other vector specialisations.
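For example, here is the kind of incompatibility in question (the commented-out line does not compile):

#include <vector>

int main() {
    std::vector<int>  vi(8);
    std::vector<bool> vb(8);

    int *pi = &vi[0];     // fine: vi[0] is a real int lvalue
    // bool *pb = &vb[0]; // won't compile: vb[0] is a proxy object, not a
                          // bool&, because individual bits have no address
    (void) pi;
}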

Variable Length Array Performance Implications (C/C++)

I'm writing a fairly straightforward function that sends an array over to a file descriptor. However, in order to send the data, I need to append a one byte header.
Here is a simplified version of what I'm doing and it seems to work:
void SendData(uint8_t* buffer, size_t length) {
    uint8_t buffer_to_send[length + 1];
    buffer_to_send[0] = MY_SPECIAL_BYTE;
    memcpy(buffer_to_send + 1, buffer, length);
    // more code to send the buffer_to_send goes here...
}
Like I said, the code seems to work fine. However, I've recently gotten into the habit of following the Google C++ style guide, since my current project has no style guide of its own (I'm actually the only software engineer on the project, and I wanted to use something that's used in industry). I ran Google's cpplint.py and it flagged the line where I create buffer_to_send, warning against using variable-length arrays. Specifically, here's what Google's C++ style guide has to say about variable-length arrays:
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Variable-Length_Arrays_and_alloca__
Based on their comments, it appears I may have found the root cause of seemingly random crashes in my code (which occur very infrequently, but are nonetheless annoying). However, I'm a bit torn as to how to fix it.
Here are my proposed solutions:
Make buffer_to_send essentially a fixed length array of a constant length. The problem that I can think of here is that I have to make the buffer as big as the theoretically largest buffer I'd want to send. In the average case, the buffers are much smaller, and I'd be wasting about 0.5KB doing so each time the function is called. Note that the program must run on an embedded system, and while I'm not necessarily counting each byte, I'd like to use as little memory as possible.
Use new and delete or malloc/free to dynamically allocate the buffer. The issue here is that the function is called frequently and there would be some overhead in terms of constantly asking the OS for memory and then releasing it.
Use two successive calls to write() in order to pass the data to the file descriptor. That is, the first write would pass only the one byte, and the next would send the rest of the buffer. While seemingly straightforward, I would need to research the code a bit more (note that I got this code handed down from a previous engineer who has since left the company I work for) in order to guarantee that the two successive writes occur atomically. Also, if this requires locking, then it essentially becomes more complex and has more performance impact than case #2.
Note that I cannot make the buffer_to_send a member variable or scope it outside the function since there are (potentially) multiple calls to the function at any given time from various threads.
Please let me know your opinion and what my preferred approach should be. Thanks for your time.
You can fold the two successive calls to write() in your option 3 into a single call using writev().
http://pubs.opengroup.org/onlinepubs/009696799/functions/writev.html
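A sketch of that approach, assuming fd is your descriptor and MY_SPECIAL_BYTE is the header value from the question:

#include <sys/types.h>
#include <sys/uio.h>  // writev, struct iovec
#include <stdint.h>

ssize_t SendData(int fd, uint8_t *buffer, size_t length) {
    uint8_t header = MY_SPECIAL_BYTE;
    struct iovec iov[2];
    iov[0].iov_base = &header;  // the one-byte header...
    iov[0].iov_len  = 1;
    iov[1].iov_base = buffer;   // ...gathered with the payload
    iov[1].iov_len  = length;
    return writev(fd, iov, 2);  // one call, so the parts can't be interleaved
}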
I would choose option 1. If you know the maximum length of your data, then allocate that much space (plus one byte) on the stack using a fixed-size array. This is no worse than the variable-length array you have shown, because you must always have enough space left on the stack anyway; otherwise you simply couldn't handle your maximum length (at worst, your code would randomly crash on larger buffer sizes). At the time this function is called, nothing else will be using that further space on your stack, so it is safe to allocate a fixed-size array.
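A sketch of option 1, with MAX_PAYLOAD standing in for whatever your actual worst case is:

#include <string.h>
#include <stdint.h>

#define MAX_PAYLOAD 512  // assumed worst-case size; pick yours

void SendData(uint8_t *buffer, size_t length) {
    uint8_t buffer_to_send[MAX_PAYLOAD + 1];  // fixed size: no VLA, no heap
    if (length > MAX_PAYLOAD)
        return;  // or report the error in whatever way fits your code
    buffer_to_send[0] = MY_SPECIAL_BYTE;
    memcpy(buffer_to_send + 1, buffer, length);
    // more code to send length + 1 bytes of buffer_to_send goes here...
}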

Efficiently collect data from multiple 1-D arrays in to a single 1-D array

I've got a prewritten function in C that fills an 1-D array with data, e.g.
int myFunction(myData **arr, ...);

myData *arr;
int arraySize;
arraySize = myFunction(&arr, ...);
I would like to call the function n times in a row with slightly different parameters (n is dependent on user input), and I need all the data collected in a single C array afterwards. The size of the returned array is not always fixed. Oh, and myFunction does the memory allocation internally. I want to do this in a memory-efficient way, but using realloc in each iteration does not sound like a good idea.
I do have all the C++ functionality available (the project is in C++, just using a C library), but using std::vector is no good because the collected data is later sent in to a function with a definition similar to:
void otherFunction(myData *data, int numData, ...);
Any ideas? Only things I can think of are realloc or using a std::vector and copying the data into an array afterwards, and those don't sound too promising.
Using realloc() in each iteration sounds like a very fine idea to me, for two reasons:
"does not sound like a good idea" is what people usually say when they have not established a performance requirement for their software, and they have not tested their software against the performance requirement to see if there is any need to improve it.
Instead of reallocating a new block each time, the realloc method will simply keep expanding your memory block, which will presumably be at the top of the memory heap, so it won't waste any time traversing memory block lists or copying data around. This holds true provided that whatever memory is allocated by myFunction() gets freed before it returns. You can verify it by looking at the pointer returned by realloc() and seeing that it always (or almost always (*1)) is the exact same pointer as the one you gave it to reallocate.
EDIT: (*1) Some C++ runtimes implement two heaps, one for small allocations and one for large allocations, so if your block gets allocated in the heap for small blocks and then grows large, there is a possibility that it will be moved once to the heap for large blocks. So don't expect the pointer to always be the same; just most of the time.
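Sketched out, the approach looks like this (myData, myFunction, n and otherFunction are from the question; whether the caller must free the chunk returned by myFunction is an assumption you'd need to verify):

myData *all = NULL;
size_t total = 0;
for (int i = 0; i < n; ++i) {
    myData *chunk = NULL;
    int chunkSize = myFunction(&chunk, /* ...params for iteration i... */);
    myData *grown = (myData *) realloc(all, (total + chunkSize) * sizeof(myData));
    if (grown == NULL)
        break;  // handle allocation failure appropriately
    all = grown;
    memcpy(all + total, chunk, chunkSize * sizeof(myData));
    total += chunkSize;
    free(chunk);  // assuming the chunk is the caller's to free
}
otherFunction(all, (int) total /* , ... */);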
Just copy all of the data into an std::vector. You can call otherFunction on a vector v with
otherFunction(&v[0], v.size(), ...)
or
otherFunction(v.data(), v.size(), ...)
As for your efficiency requirement: it looks to me like you're optimizing prematurely. First try this option, then measure how fast it is, and only look for other solutions if it's really too slow.
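For instance, a sketch (myData, myFunction, n and otherFunction are from the question; freeing the chunk assumes the library allocates it with malloc and hands ownership to the caller):

std::vector<myData> collected;
for (int i = 0; i < n; ++i) {
    myData *chunk = NULL;
    int chunkSize = myFunction(&chunk, /* ...params for iteration i... */);
    collected.insert(collected.end(), chunk, chunk + chunkSize);  // append a copy
    free(chunk);
}
otherFunction(collected.data(), (int) collected.size() /* , ... */);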
If you know that you are going to call the function N times, and returned arrays are always M long, then why don't you just allocate one array M*N initially? Or if you don't know one of M or N, then set a worst case maximum. Or are M and N both dependent on user-input?
Then, change how you call your user-input-getting function, such that the array pointer you pass it is actually an offset into that large array, so that it stores the data in the right location. Then, next iteration, offset further, and call again.
I think the best solution would be to write your own 1-D array class with the methods you need; depending on how you write the class, you can get exactly the behavior you want.

Using read() directly into a C++ std:vector

I'm wrapping up user space linux socket functionality in some C++ for an embedded system (yes, this is probably reinventing the wheel again).
I want to offer a read and write implementation using a vector.
Doing the write is pretty easy: I can just pass &myvec[0] and avoid unnecessary copying. I'd like to do the same for reads, reading directly into a vector rather than reading into a char buffer and then copying it all into a newly created vector.
Now, I know how much data I want to read, and I can allocate appropriately (vec.reserve()). I can also read into &myvec[0], though this is probably a VERY BAD IDEA. Obviously doing this doesn't allow myvec.size() to return anything sensible. Is there any way of doing this that:
Doesn't completely feel yucky from a safety/C++ perspective
Doesn't involve two copies of the data block - once from kernel to user space and once from a C char * style buffer into a C++ vector.
Use resize() instead of reserve(). This will set the vector's size correctly, and after that, &myvec[0] is, as usual, guaranteed to point to a contiguous block of memory.
Edit: Using &myvec[0] as a pointer to the underlying array for both reading and writing is safe and guaranteed to work by the C++ standard. Here's what Herb Sutter has to say:
So why do people continually ask whether the elements of a std::vector (or std::array) are stored contiguously? The most likely reason is that they want to know if they can cough up pointers to the internals to share the data, either to read or to write, with other code that deals in C arrays. That’s a valid use, and one important enough to guarantee in the standard.
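A minimal sketch of the pattern (sock_fd and expected_size are assumed to come from your surrounding code):

#include <unistd.h>  // read()
#include <vector>

std::vector<char> myvec(expected_size);  // sized, not just reserved
ssize_t n = read(sock_fd, &myvec[0], myvec.size());
if (n >= 0)
    myvec.resize(n);  // shrink so size() reflects the bytes actually read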
I'll just add a short clarification, because the answer was already given: resize() with an argument greater than the current size will add elements to the collection and value-initialize them. If you create
std::vector<unsigned char> v;
and then resize
v.resize(someSize);
all the unsigned chars will be initialized to 0. By the way, you can do the same with a constructor:
std::vector<unsigned char> v(someSize);
So theoretically it may be a little bit slower than a raw array, but if the alternative is to copy the array anyway, it's better.
reserve() only prepares the memory, so that no reallocation is needed when new elements are added to the collection, but you can't access that memory yet.
You still have to obtain the number of elements written to your vector yourself; the vector won't know anything about it.
Assuming it's a POD struct, call resize rather than reserve. You can define an empty default constructor if you really don't want the data zeroed out before you fill the vector.
It's somewhat low level, but the semantics of construction of POD structs is purposely murky. If memmove is allowed to copy-construct them, I don't see why a socket-read shouldn't.
EDIT: ah, bytes, not a struct. Well, you can use the same trick, and define a struct with just a char and a default constructor which neglects to initialize it… if I'm guessing correctly that you care, and that's why you wanted to call reserve instead of resize in the first place.
If you want the vector to reflect the amount of data read, call resize() twice. Once before the read, to give yourself space to read into. Once again after the read, to set the size of the vector to the number of bytes actually read. reserve() is no good, since calling reserve doesn't give you permission to access the memory allocated for the capacity.
The first resize() will zero the elements of the vector, but this is unlikely to create much of a performance overhead. If it does then you could try Potatoswatter's suggestion, or you could give up on the size of the vector reflecting the size of the data read, and instead just resize() it once, then re-use it exactly as you would an allocated buffer in C.
Performance-wise, if you're reading from a socket in user mode, most likely you can easily handle data as fast as it comes in. Maybe not if you're connecting to another machine on a gigabit LAN, or if your machine is frequently running 100% CPU or 100% memory bandwidth. A bit of extra copying or memsetting is no big deal if you are eventually going to block on a read call anyway.
Like you, I'd want to avoid the extra copy in user-space, but not for performance reasons, just because if I don't do it, I don't have to write the code for it...