Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 3 years ago.
Improve this question
I am not a C++ programmer, but this algorithm appeared in an operating manual for a machine I'm using and I'm struggling to make sense of it. I'd like someone to explain some terms within it, or possibly the flow of the entire code, given that I don't have time to learn C in the course of the project.
Waveform files for the machine in question are made up of a number of tags in curly brackets. The checksum is calculated using the WAVEFORM tag, {WAVEFORM-length: #data}.
The "data" consists of a number of bytes represented as hexadecimal numbers. "length" is the number of bytes in the "data", while "start" apparently points to the first byte in the "data".
I've managed to work out some of the terms, but I'm particularly unsure about my interpretation of ((UINT32 *)start)[i]
UINT32 checksum(void *start, UINT32 length)
{
UINT32 i, result = 0xA50F74FF;
for(i=0; i < length/4; i++)
result = result ^ ((UINT32 *)start)[i];
return(result);
}
So from what I can tell, the code does the following:
Take the address of the first byte in the "data" and the length of the "data"
Create a variable called result, which is an unsigned integer A50F74FF
For each byte in the first 25% of the data string, raise "result" to that power (presumably modulo 2^32)
Return result as the value checksum
Am I correct here or have I misread the algorithm on one of the steps? I feel like I can't be correct, because basing a checksum on only part of the data wouldn't spot errors in the later parts of the data.
For each byte in the first 25% of the data string, raise "result" to that power (presumably modulo 2^32)
This is wrong. ^ is the bitwise XOR operation. It does not raise to a power.
Also, about "of the data string". The algorithm iterates the pointed data as if it is an array of UINT32. In fact, if start doesn't point to (an element of) an array of UINT32, then the behaviour of the program is undefined1. It would be much better to declare the argument to be UINT32* in the first place, and not use the explicit cast.
Also, about "For each byte in the first 25% of the data string", the algorithm appears to go through (nearly2) all bytes from start to start + length. length is presumably measured in bytes, and UINT32 is presumably a type that consists of 4 bytes. Thus an array of UINT32 objects of N bytes contains N/4 elements UINT32 of objects. Note that this assumes that the byte is 8 bits wide which is probably an assumption that the manual can make, but keep in mind that it is not an assumption portable to all systems.
1 UB as far as the C++ language is concerned. But, if it's shown in the operating manual of a machine, then perhaps the special compiler for the particular hardware specifies defined behaviour for this. That said, it is also quite possible for the author of the manual to have made a mistake.
2 If length is not divisible by 4, then the remaining 1-3 bytes are not used.
So the pseudocode for this function is roughly like this:
function checksum(DATA)
RESULT = 0xA50F74FF;
for each DWORD in DATA do
RESULT = RESULT xor DWORD
return RESULT
where DWORD is a four-byte integer value.
The function is actually going though (almost) all of the data (not 25%) but it's doing it in 4-byte increments that's why the length which is in bytes is divided by 4.
Related
I'm writing alignment-dependent code, and quite surprised that there's no standard function testing if a given pointer is aligned properly.
It seems that most code on the internet use (long)ptr or reinterpret_cast<uintptr_t>(ptr) to test the alignment and I also used them, but I wonder if using the casted pointer to integral type is standard-conformant.
Is there any system that fires the assertion here?
char ch[2];
assert(reinterpret_cast<char*>(reinterpret_cast<uintptr_t>(&ch[0]) + 1)
== &ch[1]);
To answer the question in the title: No.
Counter example: On the old Pr1me mini-computer, a normal pointer was two 16-bit words. First word was 12-bit segment number, 2 ring bits, and a flag bit (can't remember the 16th bit). Second word was a 16-bit word offset within a segment. A char* (and hence void*) needed a third word. If the flag bit was set, the third word was either 0 or 8 (being the bit offset within the addressed word). A uintptr_t for such a machine would need to be uint48_t or uint64_t. Either way, adding 1 to such an integer would not advance to the next character in memory.
A capability addressed machine is also likely to have pointers which are much larger than the address space, and there is no particular reason why the least significant part of the corresponding integer should be part of the "address" rather than part of the extra info.
In practise of course, nobody is writing C++ for a Pr1me, and capability addressed machines seem not to have appeared either. It will work on all real systems - but the standard doesn't guarantee it.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
Bit fields in a structure can be used to save some bytes of memory, I have heard. How can we use this particular bytes for any purposes?
typedef struct
{
char A : 1;
int B : 1;
} Struct1;
The char value itself with a width of one bit is not particularly useful. In fact char values as bit fields are not standard. It's an extension that Microsoft added. On the other hand the B field can be used as on / off value since it can hold the values 1 or 0
Struct1 s;
s.B = 0;
if (s.B) {
...
}
This particular example doesn't really demonstrate the savings offered by bit fields particularly well. Need a more complex strut for that. Consider the following
typedef struct {
int Value1;
int Value2;
} S1;
On most platforms S1 will have a size of 8 (each int field being 4 bytes long). Imagine though that Value1 and Value2 will always have values between 0 and 10. This could be stored in 4 bits but we're using 32 bits meaning. Using bit fields we could reduce the waste significantly here
typedef struct {
int Value1 : 4;
int Value2 : 4;
} S1;
Now the size of S1 is likely 1 byte and can still hold all of the necessary values
In embedded systems, the bit fields in a structure can be used to represent bit fields or a hardware device.
Other uses for bit fields are in protocols (messages). One byte (or 4 bytes) to represent the presences or absences of many things would occupy a lot a space and wasted transmission time. So in 1 byte you could represent 8 Boolean conditions rather than using 8 bytes or 8 words to do so.
The bit fields in a structure are usually used as convenience. The same operations to extract, set or test bit fields can be performed using the arithmetic bit operators (such as AND).
Saving memory means, that you need less memory. This can increase the program performance due to less swapping or cache misses.
In your example the two parts A and B would be stored in a byte (or whatever the compiler decides to use), instead of two. A better example is, if you want to store the occupied seats in an opera house with 1000 seats. If you would store them as a boolean which is often stored in a byte per boolean, they could be stored in 128 bytes, because per seat only one bit is needed.
The downside is the performance loss. Accessing the bits need some additional shifting or xor-ing. Thus, it's a trade-off memory for computation time.
This is maybe a dumb question but,
I would like that N processors write all a different byte count in the same file with a different offset to make the data contignously.
I would like to use MPI_File_write_all(file,data,count,type,status) (individual file pointers, collective, blocking) function.
The first question can each processor specify a different value for the count parameter?
I could not find anything mentioned in the MPI 3.0 reference. (My intention is that it is not possible?)
What I found out so far is the following two problems:
When I want to write a large amount of MPI_BYTES the integer (32bit) count in the MPI_File_write... functions is to little and gives overflow of course!
I do not (cannot)/want to use a derived data type in MPI because as mentioned above all processor write a different byte count and the type is MPI_BYTES
Thanks for any help on this topic!
You've rolled up a few questions here
Absolutely, proceses can specify different or even zero amounts of data to the collective MPI_File_write_all routine. Not only can the count parameter differ, there's no reason the datatype parameter needs to be the same either.
Problem #1: If you want to write more than an int worth of MPI_BYTE data, you'll have to create a new datatpye. For example, let's say you wanted to write 9 billion bytes. Create a contig type of size 1 billion, then write 9 of those. (if the amount of data you want to write is not evenly divisible, you might need an hindexed or struct type).
problem #2: It's not at all a problem to have every MPI process create its own datatype or count of datatype.
What I understood about char type from a few questions asked here is that it is always 1 byte in C++, but number of bits can vary from system to system.
sizeof() operator uses char as a unit so sizeof(char) is always 1 in bytes of C++.(which takes number of bits of smallest unit of address of local machine) If when using file functions of fstream() in binary mode, we directly read and write from/to an address of any variable in RAM, the size of variable as smallest unit of data written to file should be in size of the value read from RAM and for one read from file it is vice-versa. Then can we say that data may not be written 8 by 8 in bits if something like this is tried:
ofstream file;
file.open("blabla.bin",ios::out|ios::binary);
char a[]="asdfghjkkll";
file.seekp(0);
file.write((char*)a,sizeof(a)-1);
file.close();
Unless char is always used in bytes existing standard 8 bits, what happens if a heap of data is written to file in a 16 bit machine and is read in a 32 bit machine? Or should I use OS-dependent text mode? If not, and I misunderstood what is truth?
Edit : I have corrected my mistake.
Thanks for warning.
Edit2: My system is 64 bit but I get number of bits of char type as 8.What is wrong? Is the way I get the result of 8false?
I got a 00000... by shifting a char variable more than possible size of it with bitwise operators.After guaranteeing that all bits of the variable is zero, I got a 111... by inverting it. And shifted until it become zero.If we shift it its size time, we get a zero, so we can get number of bits from indice of the loop terminated below.
char zero,test;
zero<<=64; //hoping that system is not more than 64 bit(most likely)
test=~zero; //we have a 111...
int i;
for(i=0; test!=zero; i++)
test=test<<1;
Value of variable of i after the loop is number of bits in char type.According to this, the result is 8.
My last question is:
Are filesystem byte and char type different data types because how computer adresses pointers in file stream is different from standart char type which is at least 8 bits?
So, exactly what is going on the background?
Edit3: Why these minuses? What is my mistake? Isn't the question clear enough? Maybe my question is stupid but why there is no any response related to my question?
A language standard can't really specify what the filesystem does - it can only specify how the language interacts with it. The C and C++ standards also don't address anything to do with interoperability or communication between different implementations. In other words, there isn't a general answer to this question except to say that:
the VAST majority of systems use 8-bit bytes
the C and C++ standard require that char is at least 8 bits
it is very likely that greater-than-8-bit systems have mechanisms in place to somehow utilize (or at least transcode) 8-bit files.
Is there some reasonably fast code out there which can help me quickly search a large bitmap (a few megabytes) for runs of contiguous zero or one bits?
By "reasonably fast" I mean something that can take advantage of the machine word size and compare entire words at once, instead of doing bit-by-bit analysis which is horrifically slow (such as one does with vector<bool>).
It's very useful for e.g. searching the bitmap of a volume for free space (for defragmentation, etc.).
Windows has an RTL_BITMAP data structure one can use along with its APIs.
But I needed the code for this sometime ago, and so I wrote it here (warning, it's a little ugly):
https://gist.github.com/3206128
I have only partially tested it, so it might still have bugs (especially on reverse). But a recent version (only slightly different from this one) seemed to be usable for me, so it's worth a try.
The fundamental operation for the entire thing is being able to -- quickly -- find the length of a run of bits:
long long GetRunLength(
const void *const pBitmap, unsigned long long nBitmapBits,
long long startInclusive, long long endExclusive,
const bool reverse, /*out*/ bool *pBit);
Everything else should be easy to build upon this, given its versatility.
I tried to include some SSE code, but it didn't noticeably improve the performance. However, in general, the code is many times faster than doing bit-by-bit analysis, so I think it might be useful.
It should be easy to test if you can get a hold of vector<bool>'s buffer somehow -- and if you're on Visual C++, then there's a function I included which does that for you. If you find bugs, feel free to let me know.
I can't figure how to do well directly on memory words, so I've made up a quick solution which is working on bytes; for convenience, let's sketch the algorithm for counting contiguous ones:
Construct two tables of size 256 where you will write for each number between 0 and 255, the number of trailing 1's at the beginning and at the end of the byte. For example, for the number 167 (10100111 in binary), put 1 in the first table and 3 in the second table. Let's call the first table BBeg and the second table BEnd. Then, for each byte b, two cases: if it is 255, add 8 to your current sum of your current contiguous set of ones, and you are in a region of ones. Else, you end a region with BBeg[b] bits and begin a new one with BEnd[b] bits.
Depending on what information you want, you can adapt this algorithm (this is a reason why I don't put here any code, I don't know what output you want).
A flaw is that it does not count (small) contiguous set of ones inside one byte ...
Beside this algorithm, a friend tells me that if it is for disk compression, just look for bytes different from 0 (empty disk area) and 255 (full disk area). It is a quick heuristic to build a map of what blocks you have to compress. Maybe it is beyond the scope of this topic ...
Sounds like this might be useful:
http://www.aggregate.org/MAGIC/#Population%20Count%20%28Ones%20Count%29
and
http://www.aggregate.org/MAGIC/#Leading%20Zero%20Count
You don't say if you wanted to do some sort of RLE or to simply count in-bytes zeros and one bits (like 0b1001 should return 1x1 2x0 1x1).
A look up table plus SWAR algorithm for fast check might gives you that information easily.
A bit like this:
byte lut[0x10000] = { /* see below */ };
for (uint * word = words; word < words + bitmapSize; word++) {
if (word == 0 || word == (uint)-1) // Fast bailout
{
// Do what you want if all 0 or all 1
}
byte hiVal = lut[*word >> 16], loVal = lut[*word & 0xFFFF];
// Do what you want with hiVal and loVal
The LUT will have to be constructed depending on your intended algorithm. If you want to count the number of contiguous 0 and 1 in the word, you'll built it like this:
for (int i = 0; i < sizeof(lut); i++)
lut[i] = countContiguousZero(i); // Or countContiguousOne(i)
// The implementation of countContiguousZero can be slow, you don't care
// The result of the function should return the largest number of contiguous zero (0 to 15, using the 4 low bits of the byte, and might return the position of the run in the 4 high bits of the byte
// Since you've already dismissed word = 0, you don't need the 16 contiguous zero case.