Memset an int (16 bit) array to short's max value

Memset an int (16 bit) array to short's max value - c++

Can't seem to find the answer to this anywhere,
How do I memset an array to the maximum value of the array's type?
I would have thought memset(ZBUFFER,0xFFFF,size) would work where ZBUFFER is a 16bit integer array. Instead I get -1s throughout.
Also, the idea is to have this work as fast as possible (it's a zbuffer that needs to initialize every frame) so if there is a better way (and still as fast or faster), let me know.
edit:
as clarification, I do need a signed int array.

In C++, you would use std::fill, and std::numeric_limits.
#include <algorithm>
#include <iterator>
#include <limits>
template <typename IT>
void FillWithMax( IT first, IT last )
{
typedef typename std::iterator_traits<IT>::value_type T;
T const maxval = std::numeric_limits<T>::max();
std::fill( first, last, maxval );
}
size_t const size=32;
short ZBUFFER[size];
FillWithMax( ZBUFFER, &ZBUFFER[0]+size );
This will work with any type.
In C, you'd better keep off memset that sets the value of bytes. To initialize an array of other types than char (ev. unsigned), you have to resort to a manual for loop.

-1 and 0xFFFF are the same thing in a 16 bit integer using a two's complement representation. You are only getting -1 because either you have declared your array as short instead of unsigned short. Or because you are converting the values to signed when you output them.
BTW your assumption that you can set something except bytes using memset is wrong. memset(ZBUFFER, 0xFF, size) would have done the same thing.

In C++ you can fill an array with some value with the std::fill algorithm.
std::fill(ZBUFFER, ZBUFFER+size, std::numeric_limits<short>::max());
This is neither faster nor slower than your current approach. It does have the benefit of working, though.

Don't attribute speed to language. That's for implementations of C. There are C compilers that produce fast, optimal machine code and C compilers that produce slow, inoptimal machine code. Likewise for C++. A "fast, optimal" implementation might be able to optimise code that seems slow. Hence, it doesn't make sense to call one solution faster than another. I'll talk about the correctness, and then I'll talk about performance, however insignificant it is. It'd be a better idea to profile your code, to be sure that this is in fact the bottleneck, but let's continue.
Let us consider the most sensible option, first: A loop that copies int values. It is clear just by reading the code that the loop will correctly assign SHRT_MAX to each int item. You can see a testcase of this loop below, which will attempt to use the largest possible array allocatable by malloc at the time.
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void) {
size_t size = SIZE_MAX;
volatile int *array = malloc(size);
/* Allocate largest array */
while (array == NULL && size > 0) {
size >>= 1;
array = malloc(size);
}
printf("Copying into %zu bytes\n", size);
for (size_t n = 0; n < size / sizeof *array; n++) {
array[n] = SHRT_MAX;
}
puts("Done!");
return 0;
}
I ran this on my system, compiled with various optimisations enabled (-O3 -march=core2 -funroll-loops). Here's the output:
Copying into 1073741823 bytes
Done!
Process returned 0 (0x0) execution time : 1.094 s
Press any key to continue.
Note the "execution time"... That's pretty fast! If anything, the bottleneck here is the cache locality of such a large array, which is why a good programmer will try to design systems that don't use so much memory... Well, then let us consider the memset option. Here's a quote from the memset manual:
The memset() function copies c (converted to an unsigned char) into
each of the first n bytes of the object pointed to by s.
Hence, it'll convert 0xFFFF to an unsigned char (and potentially truncate that value), then assign the converted value to the first size bytes. This results in incorrect behaviour. I don't like relying upon the value SHRT_MAX to be represented as a sequence of bytes storing the value (unsigned char) 0xFFFF, because that's relying upon coincidence. In other words, the main problem here is that memset isn't suitable for your task. Don't use it. Having said that, here's a test, derived from the test above, which will be used to test the speed of memset:
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void) {
size_t size = SIZE_MAX;
volatile int *array = malloc(size);
/* Allocate largest array */
while (array == NULL && size > 0) {
size >>= 1;
array = malloc(size);
}
printf("Copying into %zu bytes\n", size);
memset(array, 0xFFFF, size);
puts("Done!");
return 0;
}
A trivial byte-copying memset loop will iterate sizeof (int) times more than the loop in my first example. Considering that my implementation uses a fairly optimal memset, here's the output:
Copying into 1073741823 bytes
Done!
Process returned 0 (0x0) execution time : 1.060 s
Press any key to continue.
These tests are likely to vary, however significantly. I only ran them once each to get a rough idea. Hopefully you've come to the same conclusion that I have: Common compilers are pretty good at optimising simple loops, and it's not worth postulating about micro-optimisations here.
In summary:
Don't use memset to fill ints with values (with an exception for the value 0), because it's not suitable.
Don't postulate about optimisations prior to running tests. Don't run tests until you have a working solution. By working solution I mean "A program that solves an actual problem". Once you have that, use your profiler to identify more significant opportunities to optimise!

This is because of two's complement. You have to change your array type to unsigned short, to get the max value, or use 0x7FFF.

for (int i = 0; i < SIZE / sizeof(short); ++i) {
ZBUFFER[i] = SHRT_MAX;
}
Note this does not initialize the last couple bytes, if (SIZE % sizeof(short))

In C, you can do it like Adrian Panasiuk said, and you can also unroll the copy loop. Unrolling means copying larger chunks at a time. The extreme end of loop unrolling is copying the whole frame over with a zero frame, like this:
init()
{
for (int i = 0; i < sizeof(ZBUFFER) / sizeof(ZBUFFER[0]; ++i) {
empty_ZBUFFER[i] = SHRT_MAX;
}
}
actual clearing:
memcpy(ZBUFFER, empty_ZBUFFER, SIZE);
(You can experiment with different sizes of the empty ZBUFFER, from four bytes and up, and then have a loop around the memcpy.)
As always, test your findings, if a) it's worth optimizing this part of the program and b) what difference the different initializing techniques makes. It will depend on a lot of factors. For the last few per cents of performance, you may have to resort to assembler code.

#include <algorithm>
#include <limits>
std::fill_n(ZBUFFER, size, std::numeric_limits<FOO>::max())
where FOO is the type of ZBUFFER's elements.

When you say "memset" do you actually have to use that function? That is only a byte-by-byte assign so it won't work with signed arrays.
If you want to set each value to the maximum you would use something like:
std::fill( ZBUFFER, ZBUFFER+len, std::numeric_limits<short>::max() )
when len is the number of elements (not the size in bytes of your array)

Related

What does size_t mean in c++? [duplicate]

I'm just wondering should I use std::size_t for loops and stuff instead of int?
For instance:
#include <cstdint>
int main()
{
for (std::size_t i = 0; i < 10; ++i) {
// std::size_t OK here? Or should I use, say, unsigned int instead?
}
}
In general, what is the best practice regarding when to use std::size_t?

A good rule of thumb is for anything that you need to compare in the loop condition against something that is naturally a std::size_t itself.
std::size_t is the type of any sizeof expression and as is guaranteed to be able to express the maximum size of any object (including any array) in C++. By extension it is also guaranteed to be big enough for any array index so it is a natural type for a loop by index over an array.
If you are just counting up to a number then it may be more natural to use either the type of the variable that holds that number or an int or unsigned int (if large enough) as these should be a natural size for the machine.

size_t is the result type of the sizeof operator.
Use size_t for variables that model size or index in an array. size_t conveys semantics: you immediately know it represents a size in bytes or an index, rather than just another integer.
Also, using size_t to represent a size in bytes helps making the code portable.

The size_t type is meant to specify the size of something so it's natural to use it, for example, getting the length of a string and then processing each character:
for (size_t i = 0, max = strlen (str); i < max; i++)
doSomethingWith (str[i]);
You do have to watch out for boundary conditions of course, since it's an unsigned type. The boundary at the top end is not usually that important since the maximum is usually large (though it is possible to get there). Most people just use an int for that sort of thing because they rarely have structures or arrays that get big enough to exceed the capacity of that int.
But watch out for things like:
for (size_t i = strlen (str) - 1; i >= 0; i--)
which will cause an infinite loop due to the wrapping behaviour of unsigned values (although I've seen compilers warn against this). This can also be alleviated by the (slightly harder to understand but at least immune to wrapping problems):
for (size_t i = strlen (str); i-- > 0; )
By shifting the decrement into a post-check side-effect of the continuation condition, this does the check for continuation on the value before decrement, but still uses the decremented value inside the loop (which is why the loop runs from len .. 1 rather than len-1 .. 0).

By definition, size_t is the result of the sizeof operator. size_t was created to refer to sizes.
The number of times you do something (10, in your example) is not about sizes, so why use size_t? int, or unsigned int, should be ok.
Of course it is also relevant what you do with i inside the loop. If you pass it to a function which takes an unsigned int, for example, pick unsigned int.
In any case, I recommend to avoid implicit type conversions. Make all type conversions explicit.

short answer:
Almost never. Use signed version ptrdiff_t or non-standard ssize_t. Use function std::ssize instead of std::size.
long answer:
Whenever you need to have a vector of char bigger that 2gb on a 32 bit system. In every other use case, using a signed type is much safer than using an unsigned type.
example:
std::vector<A> data;
[...]
// calculate the index that should be used;
size_t i = calc_index(param1, param2);
// doing calculations close to the underflow of an integer is already dangerous
// do some bounds checking
if( i - 1 < 0 ) {
// always false, because 0-1 on unsigned creates an underflow
return LEFT_BORDER;
} else if( i >= data.size() - 1 ) {
// if i already had an underflow, this becomes true
return RIGHT_BORDER;
}
// now you have a bug that is very hard to track, because you never
// get an exception or anything anymore, to detect that you actually
// return the false border case.
return calc_something(data[i-1], data[i], data[i+1]);
The signed equivalent of size_t is ptrdiff_t, not int. But using int is still much better in most cases than size_t. ptrdiff_t is long on 32 and 64 bit systems.
This means that you always have to convert to and from size_t whenever you interact with a std::containers, which not very beautiful. But on a going native conference the authors of c++ mentioned that designing std::vector with an unsigned size_t was a mistake.
If your compiler gives you warnings on implicit conversions from ptrdiff_t to size_t, you can make it explicit with constructor syntax:
calc_something(data[size_t(i-1)], data[size_t(i)], data[size_t(i+1)]);
if just want to iterate a collection, without bounds cheking, use range based for:
for(const auto& d : data) {
[...]
}
here some words from Bjarne Stroustrup (C++ author) at going native
For some people this signed/unsigned design error in the STL is reason enough, to not use the std::vector, but instead an own implementation.

size_t is a very readable way to specify the size dimension of an item - length of a string, amount of bytes a pointer takes, etc.
It's also portable across platforms - you'll find that 64bit and 32bit both behave nicely with system functions and size_t - something that unsigned int might not do (e.g. when should you use unsigned long

Use std::size_t for indexing/counting C-style arrays.
For STL containers, you'll have (for example) vector<int>::size_type, which should be used for indexing and counting vector elements.
In practice, they are usually both unsigned ints, but it isn't guaranteed, especially when using custom allocators.

Soon most computers will be 64-bit architectures with 64-bit OS:es running programs operating on containers of billions of elements. Then you must use size_t instead of int as loop index, otherwise your index will wrap around at the 2^32:th element, on both 32- and 64-bit systems.
Prepare for the future!

size_t is returned by various libraries to indicate that the size of that container is non-zero. You use it when you get once back :0
However, in the your example above looping on a size_t is a potential bug. Consider the following:
for (size_t i = thing.size(); i >= 0; --i) {
// this will never terminate because size_t is a typedef for
// unsigned int which can not be negative by definition
// therefore i will always be >= 0
printf("the never ending story. la la la la");
}
the use of unsigned integers has the potential to create these types of subtle issues. Therefore imho I prefer to use size_t only when I interact with containers/types that require it.

When using size_t be careful with the following expression
size_t i = containner.find("mytoken");
size_t x = 99;
if (i-x>-1 && i+x < containner.size()) {
cout << containner[i-x] << " " << containner[i+x] << endl;
}
You will get false in the if expression regardless of what value you have for x.
It took me several days to realize this (the code is so simple that I did not do unit test), although it only take a few minutes to figure the source of the problem. Not sure it is better to do a cast or use zero.
if ((int)(i-x) > -1 or (i-x) >= 0)
Both ways should work. Here is my test run
size_t i = 5;
cerr << "i-7=" << i-7 << " (int)(i-7)=" << (int)(i-7) << endl;
The output: i-7=18446744073709551614 (int)(i-7)=-2
I would like other's comments.

It is often better not to use size_t in a loop. For example,
vector<int> a = {1,2,3,4};
for (size_t i=0; i<a.size(); i++) {
std::cout << a[i] << std::endl;
}
size_t n = a.size();
for (size_t i=n-1; i>=0; i--) {
std::cout << a[i] << std::endl;
}
The first loop is ok. But for the second loop:
When i=0, the result of i-- will be ULLONG_MAX (assuming size_t = unsigned long long), which is not what you want in a loop.
Moreover, if a is empty then n=0 and n-1=ULLONG_MAX which is not good either.

size_t is an unsigned type that can hold maximum integer value for your architecture, so it is protected from integer overflows due to sign (signed int 0x7FFFFFFF incremented by 1 will give you -1) or short size (unsigned short int 0xFFFF incremented by 1 will give you 0).
It is mainly used in array indexing/loops/address arithmetic and so on. Functions like memset() and alike accept size_t only, because theoretically you may have a block of memory of size 2^32-1 (on 32bit platform).
For such simple loops don't bother and use just int.

I have been struggling myself with understanding what and when to use it. But size_t is just an unsigned integral data type which is defined in various header files such as <stddef.h>, <stdio.h>, <stdlib.h>, <string.h>, <time.h>, <wchar.h> etc.
It is used to represent the size of objects in bytes hence it's used as the return type by the sizeof operator. The maximum permissible size is dependent on the compiler; if the compiler is 32 bit then it is simply a typedef (alias) for unsigned int but if the compiler is 64 bit then it would be a typedef for unsigned long long. The size_t data type is never negative(excluding ssize_t)
Therefore many C library functions like malloc, memcpy and strlen declare their arguments and return type as size_t.
/ Declaration of various standard library functions.
// Here argument of 'n' refers to maximum blocks that can be
// allocated which is guaranteed to be non-negative.
void *malloc(size_t n);
// While copying 'n' bytes from 's2' to 's1'
// n must be non-negative integer.
void *memcpy(void *s1, void const *s2, size_t n);
// the size of any string or `std::vector<char> st;` will always be at least 0.
size_t strlen(char const *s);
size_t or any unsigned type might be seen used as loop variable as loop variables are typically greater than or equal to 0.

size_t is an unsigned integral type, that can represent the largest integer on you system.
Only use it if you need very large arrays,matrices etc.
Some functions return an size_t and your compiler will warn you if you try to do comparisons.
Avoid that by using a the appropriate signed/unsigned datatype or simply typecast for a fast hack.

size_t is unsigned int. so whenever you want unsigned int you can use it.
I use it when i want to specify size of the array , counter ect...
void * operator new (size_t size); is a good use of it.

Understanding the use of memset in CUDA device code

I have a linear int array arr, which is on CUDA global memory. I want to set sub-arrays of arr to defined values. The sub-array start indexes are given by the starts array, while the length of each sub-array is given in counts array.
What I want to do is to set the value of sub-array i starting from starts[i] and continuing upto counts[i] to the value starts[i]. That is, the operation is:
arr[starts[i]: starts[i]+counts[i]] = starts[i]
I thought of using memset() in the kernel for setting the values. However, it is not getting correctly written ( the array elements are being assigned some random values). The code I am using is:
#include <stdlib.h>
__global__ void kern(int* starts,int* counts, int* arr,int* numels)
{
unsigned int idx = threadIdx.x + blockIdx.x*blockDim.x;
if (idx>=numels[0])
return;
const int val = starts[idx];
memset(&arr[val], val, sizeof(arr[0])*counts[idx]) ;
__syncthreads();
}
Please note that numels[0] contains the number of elements in starts array.
I have checked the code with cuda-memcheck() but didn't get any errors. I am using PyCUDA, if it's relevant. I am probably misunderstanding the usage of memset here, as I am learning CUDA.
Can you please suggest a way to correct this? Or other efficient way of doint this operation.
P.S: I know that thrust::fill() can probably do this well, but since I am learning CUDA, I would like to know how to do this without using external libraries.

The memset and memcpy implementations in CUDA device code emit simple, serial, byte values operations (and note that memset can't set anything other than byte values, which might be contributing to the problem you see if the values you are trying to set are larger than 8 bits).
You could replace the memset call with something like this:
const int val = starts[idx];
//memset(&arr[val], val, sizeof(arr[0])*counts[idx]) ;
for(int i = 0; i < counts[idx]; i++)
arr[val + i] = val;
The performance of that code will probably be better than the built-in memset.
Note also that the __syncthreads() call at the end of your kernel is both unnecessary, and a potential source of deadlock and should be removed. See here for more information.

Initializing entire array with memset

I have initialised the entire array with value 1 but the output is showing some garbage value. But this program works correctly if i use 0 or -1 in place of 1. So are there some restrictions on what type of values can be initialised using memset.
int main(){
int a[100];
memset(a,1,sizeof(a));
cout<<a[5]<<endl;
return 0;
}

memset, as the other say, sets every byte of the array at the specified value.
The reason this works with 0 and -1 is because both use the same repeating pattern on arbitrary sizes:
(int) -1 is 0xffffffff
(char) -1 is 0xff
so filling a memory region with 0xff will effectively fill the array with -1.
However, if you're filling it with 1, you are setting every byte to 0x01; hence, it would be the same as setting every integer to the value 0x01010101, which is very unlikely what you want.

Memset fills bytes, from cppreference:
Converts the value ch to unsigned char and copies it into each of the first count characters of the object pointed to by dest.
Your int takes several bytes, e.g. a 32bit int will be filled with 1,1,1,1 (in base 256, endianess doesn't matter in this case), which you then falsly interpreted as a "garbage" value.

The other answers have explained std::memset already. But it's best to avoid such low level features and program at a higher level. So just use the Standard Library and its C++11 std::array
#include <array>
std::array<int, 100> a;
a.fill(1);
Or if you prefer C-style arrays, still use the Standard Library with the std::fill algorithm as indicated by #BoPersson
#include <algorithm>
#include <iterator>
int a[100];
std::fill(std::begin(a), std::end(a), 1);
In most implementations, both versions will call std::memset if it is safe to do so.

memset is an operation that sets bits.
If you want to set a value use a for-loop.
Consider a 4-bit-integer:
Its value is 1 when the bits are 0001 but memset sets it to 1111

Copying array of ints vs pointers to bools

I'm working on a program that requires an array to be copied many thousands/millions of times. Right now I have two ways of representing the data in the array:
An array of ints:
int someArray[8][8];
where someArray[a][b] can have a value of 0, 1, or 2, or
An array of pointers to booleans:
bool * someArray[8][8];
where someArray[a][b] can be 0 (null pointer), otherwise *someArray[a][b] can be true (corresponding to 1), or false (corresponding to 2).
Which array would be copied faster (and yes, if I made the pointers to booleans array, I would have to declare new bools every time I copy the array)?

Which would copy faster is beside the point, The overhead of allocating and freeing entries, and dereferencing the pointer to retrieve each value, for your bool* approach will swamp the cost of copying.
If you just have 3 possible values, use an array of char and that will copy 4 times faster than int. OK, that's not a scientifically proven statement but the array will be 4 times smaller.

Actually, both look more or less the same in terms of copying - an array of 32-bit ints vs an array of 32-bit pointers. If you compile as 64-bit, then the pointer would probably be bigger.
BTW, if you store pointers, you probably don't want to have a SEPARATE instance of "bool" for every field of that array, do you? That would be certainly much slower.
If you want a fast copy, reduce the size as much as possible, Either:
use char instead of int, or
devise a custom class with bit manipulations for this array. If you represent one value as two bits - a "null" bit and "value-if-not-null" bit, then you'd need 128 bits = 4 ints for this whole array of 64 values. This would certainly be copied very fast! But the access to any individual bit would be a bit more complex - just a few cycles more.
OK, you made me curious :) I rolled up something like this:
struct BitArray {
public:
static const int DIMENSION = 8;
enum BitValue {
BitNull = -1,
BitTrue = 1,
BitFalse = 0
};
BitArray() {for (int i=0; i<DIMENSION; ++i) data[i] = 0;}
BitValue get(int x, int y) {
int k = x+y*DIMENSION; // [0 .. 64)
int n = k/16; // [0 .. 4)
unsigned bit1 = 1 << ((k%16)*2);
unsigned bit2 = 1 << ((k%16)*2+1);
int isnull = data[n] & bit1;
int value = data[n] & bit2;
return static_cast<BitValue>( (!!isnull)*-1 + (!isnull)*!!value );
}
void set(int x, int y, BitValue value) {
int k = x+y*DIMENSION; // [0 .. 64)
int n = k/16; // [0 .. 4)
unsigned bit1 = 1 << ((k%16)*2);
unsigned bit2 = 1 << ((k%16)*2+1);
char v = static_cast<char>(value);
// set nullbit to 1 if v== -1, else 0
if (v == -1) {
data[n] |= bit1;
} else {
data[n] &= ~bit1;
}
// set valuebit to 1 if v== 1, else 0
if (v == 1) {
data[n] |= bit2;
} else {
data[n] &= ~bit2;
}
}
private:
unsigned data[DIMENSION*DIMENSION/16];
};
The size of this object for an 8x8 array is 16 bytes, which is a nice improvement compared to 64 bytes with the solution of char array[8][8] and 256 bytes of int array[8][8].
This is probably as low as one can go here without delving into greater magic.

I would say you need to redesign your program. Converting between int x[8][8] and bool *b[8][8] "millions" of times cannot be "right" however your definition of "right" is lax.

The answer to your question will be linked to the size of the data types. Typically bool is one byte while int is not. A pointer varies in length depending on the architecture, but these days is usually 32- or 64-bits.
Not taking caching or other processor-specific optimizations into consideration, the data type that is larger will take longer to copy.
Given that you have three possible states (0, 1, 2) and 64 entries you can represent your entire structure in 128 bits. Using some utility routines and two unsigned 64-bit integers you can efficiently copy your array around very quickly.

I am not 100% sure, but I think they will take roughly the same time, though I prefer using stack allocation (since dynamic allocation might take some time looking for a free space).
Consider using short type instead of int since you do not need a wide range of numbers.
I think it might be better to use one dimension array if you really want maximum speed since using the for loops in the wrong order which the compiler use for storing multidimensional arrays (raw major or column major) could cause performance penalty!

Without knowing too much about how you use the arrays, this is a possible solution:
typedef char Array[8][8];
Array someArray, otherArray;
memcpy(someArray, otherArray, sizeof(Array));
These arrays are only 64 bytes and should copy fairly fast. You can change the data type to int but that means copying at least 256 bytes.

"copying" this array with the pointers would require a deep copy, since otherwise changing the copy will affect the original, which is probably not what you want. This is going to slow things down immensely due to the memory allocation overhead.
You can get around this by using boost::optional to represent "optional" quantities - which is the only reason you're adding the level of indirection here. There are very few situations in modern C++ where a raw pointer is really the best thing to be using :) However, since you only need a char to store the values {0, 1, 2} anyway, that will probably be better in terms of space. I am pretty sure that sizeof(boost::optional<bool>) > 1, though I haven't tested it. I would be impressed if they specialized for this :)
You could even bit-pack an array of 2-bit quantities, or use two bit-packed boolean arrays (one "mask" and then another set of actual true-false values) - using std::bitset for example. That will certainly save space and reduce copying time, although it would probably increase access time (assuming you really do need to access one value at a time).

Bit field vs Bitset

I want to store bits in an array (like structure). So I can follow either of the following two approaches
Approach number 1 (AN 1)
struct BIT
{
int data : 1
};
int main()
{
BIT a[100];
return 0;
}
Approach number 2 (AN 2)
int main()
{
std::bitset<100> BITS;
return 0;
}
Why would someone prefer AN 2 over AN 1?

Because approach nr. 2 actually uses 100 bits of storage, plus some very minor (constant) overhead, while nr. 1 typically uses four bytes of storage per Bit structure. In general, a struct is at least one byte large per the C++ standard.
#include <bitset>
#include <iostream>
struct Bit { int data : 1; };
int main()
{
Bit a[100];
std::bitset<100> b;
std::cout << sizeof(a) << "\n";
std::cout << sizeof(b) << "\n";
}
prints
400
16
Apart from this, bitset wraps your bit array in a nice object representation with many useful operations.

A good choice depends on how you're going to use the bits.
std::bitset<N> is of fixed size. Visual C++ 10.0 is non-conforming wrt. to constructors; in general you have to provide a workaround. This was, ironically, due to what Microsoft thought was a bug-fix -- they introduced a constructor taking int argument, as I recall.
std::vector<bool> is optimized in much the same way as std::bitset. Cost: indexing doesn't directly provide a reference (there are no references to individual bits in C++), but instead returns a proxy object -- which isn't something you notice until you try to use it as a reference. Advantage: minimal storage, and the vector can be resized as required.
Simply using e.g. unsigned is also an option, if you're going to deal with a small number of bits (in practice, 32 or less, although the formal guarantee is just 16 bits).
Finally, ALL UPPERCASE identifiers are by convention (except Microsoft) reserved for macros, in order to reduce the probability of name collisions. It's therefore a good idea to not use ALL UPPERCASE identifiers for anything else than macros. And to always use ALL UPPERCASE identifiers for macros (this also makes it easier to recognize them).
Cheers & hth.,

bitset has more operations

Approach number 1 will most likely be compiled as an array of 4-byte integers, and one bit of each will be used to store your data. Theoretically a smart compiler could optimize this, but I wouldn't count on it.
Is there a reason you don't want to use std::bitset?

To quote cplusplus.com's page on bitset, "The class is very similar to a regular array, but optimizing for space allocation". If your ints are 4 bytes, a bitset uses 32 times less space.
Even doing bool bits[100], as sbi suggested, is still worse than bitset, because most implementations have >= 1-byte bools.
If, for reasons of intellectual curiosity only, you wanted to implement your own bitset, you could do so using bit masks:
typedef struct {
unsigned char bytes[100];
} MyBitset;
bool getBit(MyBitset *bitset, int index)
{
int whichByte = index / 8;
return bitset->bytes[whichByte] && (1 << (index = % 8));
}
bool setBit(MyBitset *bitset, int index, bool newVal)
{
int whichByte = index / 8;
if (newVal)
{
bitset->bytes[whichByte] |= (1 << (index = % 8));
}
else
{
bitset->bytes[whichByte] &= ~(1 << (index = % 8));
}
}
(Sorry for using a struct instead of a class by the way. I'm thinking in straight C because I'm in the middle of a low-level assignment for school. Obviously two huge benefits of using a class are operator overloading and the ability to have a variable-sized array.)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Memset an int (16 bit) array to short's max value - c++

In C++ you can fill an array with some value with the std::fill algorithm. std::fill(ZBUFFER, ZBUFFER+size, std::numeric_limits<short>::max()); This is neither faster nor slower than your current approach. It does have the benefit of working, though.

This is because of two's complement. You have to change your array type to unsigned short, to get the max value, or use 0x7FFF.

for (int i = 0; i < SIZE / sizeof(short); ++i) { ZBUFFER[i] = SHRT_MAX; } Note this does not initialize the last couple bytes, if (SIZE % sizeof(short))

#include <algorithm> #include <limits> std::fill_n(ZBUFFER, size, std::numeric_limits<FOO>::max()) where FOO is the type of ZBUFFER's elements.

Related

What does size_t mean in c++? [duplicate]

Understanding the use of memset in CUDA device code

Initializing entire array with memset

Copying array of ints vs pointers to bools

Bit field vs Bitset

Categories

Resources