How do I cast a slice of unsigned integers to signed integers of the same size?

My best current effort is
// `value` is a `&[u8]`
let v = unsafe { slice::from_raw_parts(value.as_ptr() as *const i8, value.len()) };
It seems overkill to need unsafe for this. I'd like this to be zero cost.

Your code is already as zero-cost as it can be. It has to use unsafe code underneath, because the compiler cannot guarantee that arbitrary slice conversions done through slice::from_raw_parts or mem::transmute are safe. Any "safe" function suitable here would enclose that same unsafe code while ensuring that the item types are compatible with the conversion (i.e. same size and memory alignment).
You may be able to find multiple crates with proper testing and maintenance for this conversion. The crate safe-transmute implements it, with additional bounds and alignment checks (disclaimer: I'm one of the collaborators).
use safe_transmute::transmute_many_pedantic;
let value: &[u8] = &[0x00, 0x01, 0x12, 0x24, 0x00];
let words: &[i8] = transmute_many_pedantic(value)?;
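If you prefer not to take on a dependency, the kind of "safe" wrapper described above can also be hand-rolled for this specific pair of types; a minimal sketch, sound because u8 and i8 have identical size and alignment and every byte pattern is a valid i8:

fn u8_slice_as_i8(value: &[u8]) -> &[i8] {
    // SAFETY: u8 and i8 have the same size and alignment, and every
    // possible byte is a valid i8, so this reinterpretation is sound.
    unsafe { std::slice::from_raw_parts(value.as_ptr() as *const i8, value.len()) }
}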

Related

C++: Disadvantages of using std::uint_fastn_t

So I stumbled upon When should I use the C++ fixed-width integer types and how do they impact performance? and Should I use cstdint? where the advantages and disadvantages of fixed width integer types as defined in <cstdint> are listed.
I kinda do like to encode the intended range of a variable into its type, but I also don't want to force the CPU to do extra operations just to use a uint16_t instead of a plain int when I am not strictly required to have a variable holding exactly 16 bits.
I also read about types like std::uint_fast16_t and so on. From my understanding, using this type should guarantee that I can store a 16-bit number in that variable, but I should never have to pay any runtime penalty for using it, since on every architecture where e.g. uint32_t would be faster, that would be used automatically for me.
This leaves me with the question: Aside from the case that I really need a variable of exact bit width, are there any disadvantages of using std::uint_fast16_t instead of say unsigned int?
EDIT: This is of course assuming that memory consumption is not an issue. If it was, I would use std::uint_least16_t instead.
are there any disadvantages of using std::uint_fast16_t instead of say unsigned int?
One disadvantage: an uncertain result type due to the usual arithmetic promotions. Does uint_fast16_t convert to a signed or an unsigned type?
uint_fast16_t fa = 1;
unsigned un = 1;
int i = 0;
fa some_operator i --> may result in an `int` or a `uint_fast16_t`
un some_operator i --> result is `unsigned`
The ambiguity may negatively affect more complicated equations and behavior on overflow.
IMO, uint_fast16_t is only useful in narrow, controlled code, not as a general performance improvement. Be careful of Is premature optimization really the root of all evil?.
Many factors affect this conclusion, yet generally, even for performance, it is usually best to go for clarity and use plain unsigned.
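To make the promotion hazard concrete, a minimal sketch (the behavior deliberately depends on how wide the platform makes uint_fast16_t, which is exactly the problem):

#include <cstdint>
#include <iostream>
#include <type_traits>

int main() {
    std::uint_fast16_t fa = 1;
    int i = -2;
    // If uint_fast16_t is 16 bits (narrower than int), fa promotes to int
    // and fa + i is -1. If it is 32 or 64 bits, i converts to the unsigned
    // type instead and fa + i is a huge positive value.
    auto sum = fa + i;
    std::cout << std::boolalpha
              << "signed result: " << std::is_signed_v<decltype(sum)>
              << ", value: " << sum << '\n';
}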

Why is std::ssize being forced to a minimum size for its signed size type?

In C++20, std::ssize is being introduced to obtain the signed size of a container for generic code. (And the reason for its addition is explained here.)
Somewhat peculiarly, the definition given there (combining with common_type and ptrdiff_t) has the effect of forcing the return value to be "either ptrdiff_t or the signed form of the container's size() return value, whichever is larger".
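For reference, that definition is essentially the following (paraphrased from the C++20 working draft):

#include <cstddef>
#include <type_traits>

template <class C>
constexpr auto ssize(const C& c)
    -> std::common_type_t<std::ptrdiff_t,
                          std::make_signed_t<decltype(c.size())>>
{
    using R = std::common_type_t<std::ptrdiff_t,
                                 std::make_signed_t<decltype(c.size())>>;
    return static_cast<R>(c.size());
}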
P1227R1 indirectly offers a justification for this ("it would be a disaster for std::ssize() to turn a size of 60,000 into a size of -5,536").
This seems to me like an odd way to try to "fix" that, however.
Containers which intentionally define a uint16_t size and are known to never exceed 32,767 elements will still be forced to use a larger type than required.
The same thing would occur for containers using a uint8_t size with up to 127 elements.
In desktop environments, you probably don't care; but this might be important for embedded or otherwise resource-constrained environments, especially if the resulting type is used for something more persistent than a stack variable.
Containers which use the default size_t size on 32-bit platforms but which nevertheless do contain between 2B and 4B items will hit exactly the same problem as above.
If there still exist platforms for which ptrdiff_t is smaller than 32 bits, they will hit the same problem as well.
Wouldn't it be better to just use the signed type as-is (without extending its size) and to assert that a conversion error has not occurred (e.g. that the result is not negative)?
Am I missing something?
To expand on that last suggestion a bit (inspired by Nicol Bolas' answer): if it were implemented the way that I suggested, then this code would Just Work™:
void DoSomething(int16_t i, T const& item);

for (int16_t i = 0, len = std::ssize(rng); i < len; ++i)
{
    DoSomething(i, rng[i]);
}
With the current implementation, however, this produces warnings and/or errors unless static_casts are explicitly added to narrow the result of ssize, or unless int i is used instead and then narrowed in the function call (and the range indexing); neither of those seems like an improvement.
Containers which intentionally define a uint16_t size and are known to never exceed 32,767 elements will still be forced to use a larger type than required.
It's not like the container stores its size as this type; the conversion happens when the value is accessed.
As for embedded systems, embedded systems programmers already know about C++'s propensity to increase the size of small types. So if they expect a type to be an int16_t, they're going to spell that out in the code, because otherwise C++ might just promote it to an int.
Furthermore, there is no standard way to ask about what size a range is "known to never exceed". decltype(size(range)) is something you can ask for; sized ranges are not required to provide a max_size function. Without such an ability, the safest assumption is that a range whose size type is uint16_t can assume any size within that range. So the signed size should be big enough to store that entire range as a signed value.
Your suggestion is basically that any ssize call is potentially unsafe, since half of any size range cannot be validly stored in the return type of ssize.
Containers which use the default size_t size on 32-bit platforms but which nevertheless do contain between 2B and 4B items will hit exactly the same problem as above.
Assuming that it is valid for ptrdiff_t to not be a signed 64-bit integer on such platforms, there isn't really a valid solution to that problem. So yes, there will be cases where ssize is potentially unsafe.
ssize currently is potentially unsafe in cases where it is not possible to be safe. Your proposal would make ssize potentially unsafe in all cases.
That's not an improvement.
And no, merely asserting/contract checking is not a viable solution. The point of ssize is to make for(int i = 0; i < std::ssize(rng); ++i) work without the compiler complaining about signed/unsigned mismatch. To get an assert because of a conversion failure that didn't need to happen (and BTW, cannot be corrected without using std::size, which we are trying to avoid), one which is ultimately irrelevant to your algorithm? That's a terrible idea.
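To make that point concrete, a minimal sketch of the two loops side by side (compiled as C++20 with the usual warnings enabled):

#include <iterator>
#include <vector>

void visit(const std::vector<int>& rng) {
    for (int i = 0; i < rng.size(); ++i) { }       // warns: signed/unsigned comparison
    for (int i = 0; i < std::ssize(rng); ++i) { }  // both sides signed: no warning
}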
if it were implemented the way that I suggested, then this code would Just Work™:
Let us ignore the question of how often it is that a user would write this code.
The reason your compiler will expect/require you to use a cast there is because you are asking for an inherently dangerous operation: you are potentially losing data. Your code only "Just Works™" if the current size fits into an int16_t; that makes the conversion statically dangerous. This is not something that should implicitly take place, so the compiler suggests/requires you to explicitly ask for it. And users looking at that code get a big, fat eyesore reminding them that a dangerous thing is being done.
That is all to the good.
See, if your suggested implementation were how ssize behaved, then that means we must treat every use of ssize as just as inherently dangerous as the compiler treats your attempted implicit conversion. But unlike static_cast, ssize is small and easily missed.
Dangerous operations should be called out as such. Since ssize is small and difficult to notice by design, it therefore should be as safe as possible. Ideally, it should be as safe as size, but failing that, it should be unsafe only to the extent that it is impossible to make it safe.
Users should not look on ssize usage as something dubious or disconcerting; they should not fear to use it.

How do I cast from one slice type to another? [duplicate]

I have a [u8; 16384] and a u16. How would I "temporarily transmute" the array so I can set the two u8s at once, the first to the least significant byte and the second to the most significant byte?
The obvious, safe and portable way is to just use math.
fn set_u16_le(a: &mut [u8], v: u16) {
    a[0] = v as u8;
    a[1] = (v >> 8) as u8;
}
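A quick usage sketch (the offset 6 is arbitrary, purely for illustration):

fn main() {
    let mut buf = [0u8; 16384];
    set_u16_le(&mut buf[6..8], 0xBEEF);
    assert_eq!((buf[6], buf[7]), (0xEF, 0xBE)); // least significant byte first
}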
If you want a higher-level interface, there's the byteorder crate which is designed to do this.
You should definitely not use transmute to turn a [u8] into a [u16], because that doesn't guarantee anything about the byte order.
slice::align_to and slice::align_to_mut are stable as of Rust 1.30. These functions handle the alignment concerns that sellibitze brings up.
The big- and little- endian problems are still yours to worry about. You may be able to use methods like u16::to_le to help with that. I don't have access to a big-endian computer to test with, however.
fn example(blob: &mut [u8; 16], value: u16) {
    // I copied this example from Stack Overflow without providing
    // rationale why my specific case is safe.
    let (head, body, tail) = unsafe { blob.align_to_mut::<u16>() };
    // This example simply does not handle the case where the input data
    // is misaligned such that there are bytes that cannot be correctly
    // reinterpreted as u16.
    assert!(head.is_empty());
    assert!(tail.is_empty());
    body[0] = value;
}

fn main() {
    let mut data = [0; 16];
    example(&mut data, 500);
    println!("{:?}", data);
}
As DK suggests, you probably shouldn't really use unsafe code to reinterpret the memory... but you can if you want to.
If you really want to go that route, you should be aware of a couple of gotchas:
You could have an alignment problem. If you just take a &mut [u8] from somewhere and convert it to a &mut [u16], it could refer to some memory region that is not properly aligned to be accessed as a u16. Depending on what computer you run this code on, such an unaligned memory access might be illegal. In this case, the program would probably abort somehow. For example, the CPU could generate some kind of signal which the operating system responds to in order to kill the process.
It'll be non-portable. Even without the alignment issue, you'll get different results on different machines (little- versus big-endian machines).
If you can switch it around (creating a u16 array and temporarily dealing with it on a byte level), you would solve the potential memory alignment problem:
/// warning: The resulting byte view is system-specific
unsafe fn raw_byte_access(s16: &mut [u16]) -> &mut [u8] {
    use std::slice;
    slice::from_raw_parts_mut(s16.as_mut_ptr() as *mut u8, s16.len() * 2)
}
On a big-endian machine, this function will not do what you want; you want a little-endian byte order. You can only use this as an optimization for little-endian machines and need to stick with a solution like DK's for big- or mixed-endian machines.
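If what you ultimately want is "little-endian bytes regardless of host byte order", the standard integer methods can also express that without raw pointers; a minimal sketch (u16::to_le_bytes is stable since Rust 1.32):

fn set_u16_le_portable(a: &mut [u8], v: u16) {
    // to_le_bytes produces the little-endian byte order on any host.
    a[..2].copy_from_slice(&v.to_le_bytes());
}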

Best way to convert 8 boolean to one byte?

I want to pack 8 booleans into one byte and then save it to a file (this must be done for a very large amount of data). I've used the following code, but I'm not sure it is the best one (in terms of speed and space):
int bits[] = {1, 0, 0, 0, 0, 1, 1, 1};
char a = '\0';
for (int i = 0; i < 8; i++) {
    a = a << 1;
    a += bits[i];
}
// and then save "a"
Can anyone give me better code (more speed)?
If you don't mind using SSE intrinsics, then _mm_movemask_epi8 is an excellent fit. It uses 16 bytes, but you can just set the others to zero.
For example (not tested)
#include <tmmintrin.h> // SSSE3 (pulls in SSE2); needed for _mm_shuffle_epi8

// Load the 8 bytes (each 0 or 1) into the low half of an XMM register.
__m128i values = _mm_loadl_epi64((__m128i*)array);
// Reverse the 8 bytes so the first element lands in the most significant
// bit of the mask; lanes with index 0xFF are zeroed by the shuffle.
__m128i order = _mm_set_epi8(0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
                             0, 1, 2, 3, 4, 5, 6, 7);
values = _mm_shuffle_epi8(values, order);
// Shift each 0/1 up to its byte's top bit, then gather all the top bits.
int result = _mm_movemask_epi8(_mm_slli_epi32(values, 7));
This assumes the array is an array of chars. If you can't make that happen, it takes some more loads and packs and it becomes a bit annoying.
Regarding
“Can anyone give me better code (more speed)?”
you should measure. Most of the impact on the speed of serializing to a file is I/O speed. What you do with the bits will likely have an unmeasurably small impact, but if it has any, it is likely mostly influenced by your original representation of the sequence of booleans.
Now regarding the given code
int bits[] = {1, 0, 0, 0, 0, 1, 1, 1};
char a = '\0';
for (int i = 0; i < 8; i++) {
    a = a << 1;
    a += bits[i];
}
// and then save "a"
Use unsigned char as byte type, just on principle.
Use bitlevel OR, the | operator, again just on principle.
Use prefix ++, yes, also that just on principle.
The “on principle” for the first point is because in practice your code will not run on any machine with sign-and-magnitude or one's complement representation of signed integers, where char is signed. But I think it's generally a good idea to express in the code exactly what one intends doing, instead of rewriting it as something slightly different. And the intention here is to deal with bits, an unsigned byte.
The “on principle” for the bitlevel OR is because for this particular case there's no practical difference between bitlevel OR and addition. But in general it's a good idea to write in code what one means to express. And then it's no good to write a bitlevel OR as an addition: it might even trip you up, bite you in the a**, in some other context.
The “on principle” for the prefix ++ is because in practice the compiler will optimize prefix and postfix ++ for a basic type, when the expression result isn't used, to the very same machine code. But again it's generally better to write what one intends to express. Asking for an original value (the postfix ++) is just misleading a reader of the code when you're not ever using that original value – and as with the bitlevel OR expressed as addition, the pure increment expressed as postfix ++ might trip you up, bite you in the a**, in some other context, e.g. with iterators.
The general approach of explicitly coding up shifting and ORing appears to me to be fine because std::bitset does not support initialization from a sequence of booleans (only initialization from a text string), so it doesn't save you any work. But generally it's a good idea to check the standard library, whether it supports whatever one wants to do. It might even happen that someone else chimes in here with some standard library based approach that I didn't think of! ;-)
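Putting those three points together, a sketch of the loop rewritten "on principle" might look like this:

int bits[] = {1, 0, 0, 0, 0, 1, 1, 1};
unsigned char a = 0;
for (int i = 0; i < 8; ++i) {
    a = (a << 1) | bits[i];  // shift left, then OR in the next bit
}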
Replace the += operator by |=, which is the bit-wise operation (and actually what you want to do here).
Use unsigned char for your truth values, if possible.
Unless you want to hand-unroll your loops and/or use SIMD intrinsics, that would be the most compiler-optimizable solution, I guess.
There's another trick: structs can have bit-fields, and you can put them in a union to (mis)use them as integers.
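For completeness, a minimal sketch of that bit-field/union trick. Two loud caveats: the allocation order of bit-fields within a byte is implementation-defined, and reading a union member other than the one last written is technically undefined behavior in C++ (though widely supported in practice); save below is a hypothetical stand-in for your file-writing routine.

union PackedBits {
    struct {
        unsigned char b0 : 1, b1 : 1, b2 : 1, b3 : 1,
                      b4 : 1, b5 : 1, b6 : 1, b7 : 1;
    } bits;
    unsigned char byte;
};

// PackedBits p{};
// p.bits.b0 = 1; p.bits.b7 = 1;
// save(p.byte); // write the packed byte out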
By the way: your code is buggy. You shift first, then write; and you use addition on a signed char, which will definitely go wrong for the 7th and 8th bits (given that you erroneously shift too early; if you did it properly, only the 8th bit would cause a hazard).

Can I use data types like bool to compress data while improving readability?

My official question will be: "Is there a clean way to use data types to 'encode and compress' data, rather than using messy bit masking?" The hope would be to save space by compressing, and I would like to use native data types, structures, and arrays to improve readability over bit masking. I am proficient in bit masking from my assembly background, but I am learning C++ and OOP. We can store so much information in a 32-bit register by using individual bits, and I feel that I am trying to get back to that low-level environment while keeping the readability of C++ code.
I am attempting to save some space because I am working with huge resource requirements. I am still learning how C++ treats the bool data type. I realize that memory is allocated in byte-sized chunks, not individual bits, and I believe a bool usually uses one byte and is masked somehow. In my head I could pack 8 bool values into one byte.
If I allocate an array of 2 bool elements in C++ (e.g. with malloc), does it allocate two bytes or just one?
Example: We will use DNA, since it can be encoded into two bits to represent A, C, G, and T. If I make a struct with two bool members, called DNA_Base, and then make an array of 7 of those:
struct DNA_Base { bool Bit_1; bool Bit_2; };
DNA_Base DNA_Sequence[7] = {false};
cout << sizeof(DNA_Base) << sizeof(DNA_Sequence) << endl;
// Yields a 2 and a 14.
// I would like this to say 1 and 2.
In my example I would also show the case where the DNA sequence can be 20 bases long, which would require 40 bits to encode; GATTACA would take up a maximum of 2 bytes. I suppose an alternative question would have been "How do I make C++ do the bit masking for me in a more readable way?", or should I just make my own data type and implement the bit masking using classes and operator overloading?
Not exactly what you want, but you can use bit-fields:
struct DNA_Base
{
    unsigned char Bit_1 : 1;
    unsigned char Bit_2 : 1;
};

DNA_Base DNA_Sequence[7];
So sizeof(DNA_Base) == 1 and sizeof(DNA_Sequence) == 7
You then want to pack several bases into one byte, to avoid losing space to padding, something like:
struct DNA_Base_4
{
    unsigned char base1 : 2; // may have value 0, 1, 2, or 3
    unsigned char base2 : 2;
    unsigned char base3 : 2;
    unsigned char base4 : 2;
};
So sizeof(DNA_Base_4) == 1
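A usage sketch, assuming the arbitrary encoding A=0, C=1, G=2, T=3, packing GATTACA (7 bases) into two bytes:

DNA_Base_4 seq[2] = {};
seq[0].base1 = 2; seq[0].base2 = 0; seq[0].base3 = 3; seq[0].base4 = 3; // GATT
seq[1].base1 = 0; seq[1].base2 = 1; seq[1].base3 = 0;                   // ACA
static_assert(sizeof seq == 2, "7 bases fit in 2 bytes");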
std::bitset is another alternative, but you have to do the interpretation job yourself.
An array of bools will take N elements × sizeof(bool) bytes.
If your goal is to save space in registers, don't bother, because it is actually more efficient to use a word size for the processor in question than to use a single byte, and the compiler will prefer to use a word anyway, so in a struct/class the bool will usually be expanded to a 32-bit or 64-bit native word.
Now, if you like to save room on disk, or in RAM, due to needing to store LOTS of bools, go ahead, but it isn't going to save room in all cases unless you actually pack the structure, and on some architectures packing can also have performance impact because the CPU will have to perform unaligned or byte-by-byte access.
A bitmask (or bitfield), on the other hand, is performant and efficient and as dense as possible, and uses a single bitwise operation. I would look at one of the abstract data types that provide bit fields.
The standard library has bitset http://www.cplusplus.com/reference/bitset/bitset/ which can be as long as you want.
Boost also has something I'm sure.
Unless you are on a 4 bit machine, the final result will be using bit arithmetic. Whether you do it explicitly, have the compiler do it via bit fields, or use a bit container, there will be bit manipulation.
I suggest the following:
Use existing compression libraries.
Use the method that is most readable or best understood by people other than yourself.
Use the method that is most productive (talking about development time).
Use the method with which you will inject the fewest defects.
Edit 1:
Write each method up as a separate function.
Tell the compiler to generate the assembly language for each function.
Compare the assembly language of each function to each other.
My belief is that they will be very similar, enough that wasting time discussing them is not worthwhile.
You can't operate on bits directly, but you can treat the smallest unit available to you as a store for multiple values, and define
enum class DNAx4 : uint8_t {
    AAAA = 0x00, AAAC = 0x01, AAAG = 0x02, AAAT = 0x03,
    // .... and the rest of them
    TTTA = 0xFC, TTTC = 0xFD, TTTG = 0xFE, TTTT = 0xFF
};
I'd actually go further, and create a structure DNAx16 or DNAx32 to efficiently use the native word size on your machine.
You can then define functions on the data type, which will have to use the underlying bit representation, but at least it allows you to encapsulate this and build higher level operations from these primitives.
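As an illustration of such a primitive (the names here are hypothetical; the indexing assumes the first base occupies the two most significant bits, consistent with the enumeration above):

#include <cstdint>

enum class Base : uint8_t { A = 0, C = 1, G = 2, T = 3 };

// Extract base i (0..3) from one packed byte; base 0 sits in the two
// most significant bits under the DNAx4 scheme above.
constexpr Base base_at(uint8_t packed, unsigned i) {
    return static_cast<Base>((packed >> (2 * (3 - i))) & 0x3);
}

static_assert(base_at(0x01, 3) == Base::C, "AAAC: the 4th base is C");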