I have the following tutorial question regarding bit manipulation and permissions that I don't quite understand. Specifically (b): as I understand it, the >> 3 operand is a bit shift of 3 places, but wouldn't that result in all zeros for a protection bit pattern of that type... |0000400| >> 3 = 00000000|4 ...?
// File types
#define S_IFDIR (0040000) // Directory
#define S_IFREG (0100000) // Regular file
#define S_IFLNK (0120000) // Symbolic link
// Protection bits
#define S_IRUSR (0000400) // Read by owner
#define S_IWUSR (0000200) // Write by owner
#define S_IXUSR (0000100) // Execute by owner
#define S_IRGRP (S_IRUSR >> 3) // Read by group
#define S_IWGRP (S_IWUSR >> 3) // Write by group
#define S_IXGRP (S_IXUSR >> 3) // Execute by group
#define S_IROTH (S_IRGRP >> 3) // Read by others
#define S_IWOTH (S_IWGRP >> 3) // Write by others
#define S_IXOTH (S_IXGRP >> 3) // Execute by others
Note that above constants are in octal, not decimal or hexadecimal.
For each of the following scenarios, give an octal representation of a bit-string that would capture the relevant privileges:
a. a regular file that is only readable and writeable to its owner
b. a regular file that is writeable to its owner, but readable by owner/group/anyone
c. a regular file that is only executable to owner/group/anyone
d. a directory that only the owner can read, create files in, or enter
e. a directory that only the owner can create files in, but anyone can read/enter
f. a directory that owner/group/anyone can read, create files in, or enter
answers:
a. 0100600
b. 0100644
c. 0100111
d. 0040700
e. 0040755
f. 0040777
Specifically (b): as I understand it, the >> 3 operand is a bit shift of 3 places, but wouldn't that result in all zeros for a protection bit pattern of that type... |0000400| >> 3 = 00000000|4 ...?
>> 3 is a bit shift of 3 bits, not 3 digits. In octal representation, each digit corresponds to 3 bits (8 = 2³), so bit-shifting by 3 bits corresponds to shifting by exactly 1 digit of the octal representation.
Therefore, 0000400 >> 3 = 0000040 in octal.
I recently needed to convert the MNIST data set to images and labels. It is binary, and the structure is described in the previous link. I did a little research, and as I'm a fan of C++ I read about binary I/O in C++; after that I found this link on Stack Overflow. That code works well, but there are no comments and no explanation of the algorithm, so I got confused, and that raised some questions which I need a professional C++ programmer to answer.
1. What is the algorithm to convert the data set in C++ with the help of ifstream?
I know how to read a file as binary with file.read and move to the next record, but in C we would define a struct and move it through the file, whereas I can't see any struct in the C++ program, for example to read this:
[offset] [type] [value] [description]
0000 32 bit integer 0x00000803(2051) magic number
0004 32 bit integer 60000 number of images
0008 32 bit integer 28 number of rows
0012 32 bit integer 28 number of columns
0016 unsigned byte ?? pixel
How can we go to a specific offset, for example 0004, read for example a 32-bit integer, and put it into an integer variable?
2. What is the function reverseInt doing? (It is obviously not simply reversing an integer.)
int ReverseInt(int i)
{
    unsigned char ch1, ch2, ch3, ch4;
    ch1 = i & 255;         // lowest byte
    ch2 = (i >> 8) & 255;
    ch3 = (i >> 16) & 255;
    ch4 = (i >> 24) & 255; // highest byte
    // Reassemble the bytes in the opposite order.
    return ((int)ch1 << 24) + ((int)ch2 << 16) + ((int)ch3 << 8) + ch4;
}
I did a little debugging with cout, and when it reversed, for example, 270991360 it returned 10000, and I cannot find any relation between the two numbers. I understand that it shifts the number by multiples of 8 and ANDs with 255, but why?
PS:
1. I already have the converted MNIST images, but I want to understand the algorithm.
2. I've already unzipped the gz files, so the file is pure binary.
1. What is the algorithm to convert the data set in C++ with the help of ifstream?
This function reads a file (t10k-images-idx3-ubyte.gz) as follows:
Read a magic number and adjust endianness
Read number of images and adjust endianness
Read number rows and adjust endianness
Read number of columns and adjust endianness
Read all the given images × rows × columns pixel bytes (but discard them).
The function uses a plain int and always swaps endianness; that means it targets a very specific architecture and is not portable.
How can we go to a specific offset, for example 0004, read for example a 32-bit integer, and put it into an integer variable?
ifstream provides a function to seek to a given position:
file.seekg( posInBytes, std::ios_base::beg);
At the given position, you could read the 32-bit integer:
int32_t val;
file.read ((char*)&val,sizeof(int32_t));
2. What is the function reverseInt doing?
This function reverses the order of the bytes of an int value:
Considering a 32-bit integer laid out as aaaaaaaabbbbbbbbccccccccdddddddd, it returns the integer ddddddddccccccccbbbbbbbbaaaaaaaa.
This is useful for normalizing endianness; however, it is probably not very portable, as int might not be 32 bits (it could be, e.g., 16 or 64 bits).
If we open a file for reading, we may define one or more state flags,
for example: ios::out, as well as ios::out | ios::app
I have read about the bitwise OR and how it "merges" two bit sets,
for example: 1010 | 0111 = 1111
That being said, I do not understand how it works "behind the scenes" when we use a method like ifstream.open(filename, stateflagA | stateflagB | stateflagC), and so on.
Can someone elaborate more on the inner workings of these state flags and their memory representation?
EDIT:
To give more emphasis on what i am trying to understand (if it helps),
I would assume that the open method could receive one or more state flags as separate arguments in its signature rather than delimited by a bitwise OR. So I want to understand how the bitwise OR works on these state flags to produce a different final state when combining several flags, and how, as a result, it allows me to use only one argument for either a single state flag or a set of them.
ie:
ifstream.open(filename, stateflagA | stateflagB | stateflagC)
and NOT
ifstream.open(filename, stateflagA , stateflagB , stateflagC)
Bit flags are represented in the same exact way all integral values are represented. What makes them "flags" is your program's interpretation of their values.
Bit flags are used for compact representation of small sets of values. Each value is assigned a bit index. All integer numbers with the bit at that index set to 1 are interpreted as sets that include the corresponding member.
Consider a small example: let's say we need to represent a set of three colors - red, green, and blue. We assign red an index of 0, green an index of 1, and blue an index of 2. This corresponds to the following representation:
BINARY DECIMAL COLOR
------ ------- -----
001 1 Red
010 2 Green
100 4 Blue
Note that each flag is a power of two. That's the property of binary numbers that have a single bit set to 1. Here is how it would look in C++:
enum Color {
    Red   = 1 << 0,
    Green = 1 << 1,
    Blue  = 1 << 2
};
1 << n is the standard way of constructing an integer with a single bit at position n set to 1.
With this representation in hand we can construct sets that have any combination of these colors:
BINARY DECIMAL COLOR
------ ------- -----
001 1 Red
010 2 Green
011 3 Red+Green
100 4 Blue
101 5 Blue+Red
110 6 Blue+Green
111 7 Blue+Green+Red
Here is when bit operations come into play: we can use them to construct sets and check membership in a single operation.
For example, we can construct a set of Red and Blue with an | like this:
Color purple = Red | Blue;
Behind the scenes, all this does is assign 5 to purple, because 4 | 1 is 5. But since your program interprets 5 as a set of two colors, the meaning of that 5 is not the same as that of an integer 5 that represents, say, the number of things in a bag.
You can check if a set has a particular member by applying & to it:
if (purple & Red) {
// evaluates to true: purple contains Red
}
if (purple & Green) {
// evaluates to false: purple does not contain Green
}
The flags used by the I/O library work in the same way. Some of the flags are combined to produce bit masks. These work like individual flags, but instead of letting you test membership, they let you find a set intersection in a single bit operation:
Color yellow = Blue | Green;
Color purple = Red | Blue;
Color common = yellow & purple; // common == Blue
If we take the GNU libstdc++ implementation and look at how these are actually implemented, we find:
enum _Ios_Openmode
{
    _S_app              = 1L << 0,
    _S_ate              = 1L << 1,
    _S_bin              = 1L << 2,
    _S_in               = 1L << 3,
    _S_out              = 1L << 4,
    _S_trunc            = 1L << 5,
    _S_ios_openmode_end = 1L << 16
};
These values are then used like this:
typedef _Ios_Openmode openmode;
static const openmode app = _S_app;
/// Open and seek to end immediately after opening.
static const openmode ate = _S_ate;
/// Perform input and output in binary mode (as opposed to text mode).
/// This is probably not what you think it is; see
/// http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt11ch27s02.html
static const openmode binary = _S_bin;
/// Open for input. Default for #c ifstream and fstream.
static const openmode in = _S_in;
/// Open for output. Default for #c ofstream and fstream.
static const openmode out = _S_out;
/// Truncate an existing stream when opening. Default for #c ofstream.
static const openmode trunc = _S_trunc;
Since the values are chosen as 1 << n, they occupy exactly one "bit" each, which allows us to combine them using | (or) - as well as other similar operations.
So app in binary is 0000 0001 and bin is 0000 0100, so if we pass app | bin as the mode for opening the file, we get 0000 0101. The internals of the implementation of fstream can then use
if (mode & bin) ... do stuff for binary file ...
and
if (mode & app) ... do stuff for appending to the file ...
Other C++ library implementations may choose a different set of bit values for each flag, but will use a similar system.
"Behind the scenes", in the memory of the computer, every piece of information is ultimately coded as a group of bits. Your CPU is wired to perform basic binary algebra operations (AND, OR, XOR, NOT) on such elementary information.
The C++ operators |, & and ^ just give direct access to these CPU operations on any integral type. For flag management it's wise to use an unsigned integral type such as unsigned int or unsigned char.
An express overview:
the trick is that every flag corresponds to a fixed bit. This is usually done with power-of-2 constants (ex: 1, 2, 4, 8, which are binary coded as 0001, 0010, 0100 and 1000).
constants are named because it's clearer than using literals (ex: const unsigned FlagA=1, FlagB=2, FlagC=4;)
binary AND x & y keeps a 1 only in the bits that are 1 in both x and y. So it is used to reset flags by ANDing with a value where the flag's bit is 0: x & ~FlagB clears flag B and leaves all the other flags untouched (while x & FlagB would instead keep only flag B).
binary OR x | y sets a 1 in any bit that is 1 in either x or y. So it's used to set flags. Example: x | FlagB sets flag B.
a binary AND is also a quick way to check whether a flag is set: (x & FlagB) is nonzero if and only if flag B was set.
EDIT: About your specific question on the ifstream::open() parameters: it's a design choice, for convenience. As you can see, there are 6 flags that influence the way the file is handled (some of them used very rarely). So instead of making you provide each of the 6 flags every time, the standard decided that you would provide them combined into a single openmode. A variable number of arguments would not have been a good alternative, as the called function would have to know how many arguments you provided.
I don't understand what this code is doing at all, could someone please explain it?
long input; //just here to show the type, assume it has a value stored
unsigned int output( input >> 4 & 0x0F );
Thanks
Bit-shifts the input 4 bits to the right, then masks off all but the lower 4 bits.
Take this example 16 bit number: (the dots are just for visual separation)
1001.1111.1101.1001 >> 4 = 0000.1001.1111.1101
0000.1001.1111.1101 & 0x0F = 1101 (or 0000.0000.0000.1101 to be more explicit)
& is the bitwise AND operator. "& 0x0F" is sometimes done to zero out the upper bits of a value, keeping only the lowest (rightmost) 4 bits.
0x0f = 00001111. So a bitwise & operation of 0x0f with any other bit pattern will retain only the rightmost 4 bits, clearing the left 4 bits.
If the input has a value of 01010001, after doing &0x0F, we'll get 00000001 - which is a pattern we get after clearing the left 4 bits.
Just as another example, this is a code I've used in a project:
Byte verflag = (Byte)(bIsAck & 0x0f) | ((version << 4) & 0xf0);. Here I'm combining two values into a single Byte value to save space, because it's being used in a packet header structure. bIsAck is a BOOL and version is a Byte whose value is very small, so both values can be contained in a single Byte variable.
The high nibble of the resulting variable contains the value of version and the low nibble contains the value of bIsAck. I can retrieve the values into separate variables at the receiving end by shifting right 4 bits to take the value of version.
Hope this is somewhere near to what you asked for.
That is doing a bitwise right shift of the contents of "input" by 4 bits, then a bitwise AND of the result with 0x0F (binary 1111).
What it does depends on the contents of "input"; here the snippet declares it as a long, so the shift and the AND are plain integer operations.
Google for "c++ bitwise operations" for more details on what's going on under the hood.
Additionally, look at C++ operator precedence because the C/C++ precedence is not exactly the same as in many other languages.
How would I go about accessing the individual bits inside a C++ type - a char, or any other C++ type, for example?
If you want access bit N:
Get: (INPUT >> N) & 1;
Set: INPUT |= 1 << N;
Unset: INPUT &= ~(1 << N);
Toggle: INPUT ^= 1 << N;
You would use the binary operators | (or), & (and) and ^ (xor) to set them. To set the third bit of variable a, you would type, for instance:
a = a | 0x4
// c++ 14
a = a | 0b0100
Note that 4’s binary representation is 0100
That is very easy.
Let's say you need to access the individual bits of an integer.
Create a mask like this:
int mask = 1;
Now, ANDing your number with this mask gives the value of the bit at position zero.
In order to access the bit at the i-th position (indexes start from zero), just AND with (mask << i) and check whether the result is nonzero.
If you want to look at the nth bit in a number you can use: number&(1<<n).
Essentially the (1<<n), which is basically 2^n (because you shift the 1 bit in ...0001 left n times, and each left shift means multiply by 2), creates a number that is 0 everywhere except for a 1 at the nth position.
You then & that with number. The result is either 0 everywhere (the bit was 0) or has a 1 at the nth position (the bit was 1) - essentially an integer which is either zero or nonzero.
Example:
bit 1 in 4 (counting from zero), 4&(1<<1)
  0100
& 0010
 ____
  0000 = 0
Therefore bit 1 in 4 is a 0
It will also work with chars, because they are also numbers in C and C++.
I've read about [ostream] << hex << 0x[hex value], but I have some questions about it
(1) I defined my file stream, output, to be a hex output file stream, using output.open("BWhite.bmp",ios::binary);, since I did that, does that make the hex parameter in the output<< operation redundant?
(2)
If I have an integer value I wanted to store in the file, and I used this:
int i = 0;
output << i;
would i be stored in little endian or big endian? Will the endianness change based on which computer the program is executed or compiled on?
Does the size of this value depend on the computer it's run on? Would I need to use the hex parameter?
(3) Is there a way to output raw hex digits to a file? If I want the file to have the hex digit 43, what should I use?
output << hex << 0x43 outputs ASCII '4', then ASCII '3' (and output << 0x43 without the hex manipulator outputs the decimal text "67").
The purpose of outputting these hex digits is to make the header for a .bmp file.
The formatted output operator << is for just that: formatted output. It's for strings.
As such, the std::hex stream manipulator tells streams to output numbers as strings formatted as hex.
If you want to output raw binary data, use the unformatted output functions only, e.g. basic_ostream::put and basic_ostream::write.
You could output an int like this:
int n = 42;
output.write(reinterpret_cast<const char*>(&n), sizeof(n));
The endianness of this output will depend on the architecture. If you wish to have more control, I suggest the following:
int32_t n = 42;
char data[4];
data[0] = static_cast<char>(n & 0xFF);
data[1] = static_cast<char>((n >> 8) & 0xFF);
data[2] = static_cast<char>((n >> 16) & 0xFF);
data[3] = static_cast<char>((n >> 24) & 0xFF);
output.write(data, 4);
This sample will output a 32 bit integer as little-endian regardless of the endianness of the platform. Be careful converting that back if char is signed, though.
You say
"Is there a way to output raw hex digits to a file? If I want the file to have the hex digit 43, what should I use? "
"Raw hex digits" will depend on the interpretation you do on a collection of bits. Consider the following:
Binary : 0 1 0 0 1 0 1 0
Hex : 4 A
Octal : 1 1 2
Decimal : 7 4
ASCII : J
All the above represents the same numeric quantity, but we interpret it differently.
So you simply need to store the data in binary format, that is, the exact bit pattern that represents the number.
EDIT1
When you open a file in text mode and write a number into it with <<, say 74 (as in the example above), it will be stored as the two ASCII characters '7' and '4'. To avoid this, open the file in binary mode (ios::binary) and write the raw bytes with write(). Check http://courses.cs.vt.edu/~cs2604/fall00/binio.html#write
The purpose of outputting these hex digits is to make the header for a .bmp file.
You seem to have a large misconception of how files work.
The stream operator << generates text (human-readable output). The .bmp file format is a binary format that is not human readable (well, it is, but it's not nice, and I would not read it without tools).
What you really want to do is generate binary output and place it the file:
char x = 0x43;
output.write(&x, sizeof(x));
This will write one byte of data with the hex value 0x43 to the output stream. This is the binary representation you want.
would i be stored in little endian or big endian? Will the endianness change based on which computer the program is executed or compiled on?
Neither; you are again outputting text (not binary data).
int i = 0;
output.write(reinterpret_cast<char*>(&i), sizeof(i)); // Writes the binary representation of i
Here you do need to worry about the endianness (and size) of the integer value, and this will vary depending on the hardware that you run your application on. For the value 0 there is not much to worry about regarding endianness, but you should still worry about the size of the integer.
I would stick some asserts into my code to validate the architecture is OK for the code. Then let people worry about if their architecture does not match the requirements:
int test = 0x12345678;
assert((sizeof(test) * CHAR_BIT == 32) && "BMP uses 32 bit ints");
assert((((char*)&test)[0] == 0x78) && "BMP uses little endian");
There is a family of functions that will help you with endianess and size.
http://www.gnu.org/s/hello/manual/libc/Byte-Order.html
Function: uint32_t htonl (uint32_t hostlong)
This function converts the uint32_t integer hostlong from host byte order to network byte order.
// Writing to a file
uint32_t hostValue = 0x12345678;
uint32_t network = htonl(hostValue);
output.write(reinterpret_cast<const char*>(&network), sizeof(network));

// Reading from a file
uint32_t network;
input.read(reinterpret_cast<char*>(&network), sizeof(network));
uint32_t hostValue = ntohl(network); // convert back to the platform-specific value

// Unfortunately the BMP format was designed with Intel in mind,
// and thus all its integers are little-endian, while network byte
// order (as used by htonl() and family) is big-endian.
// So this may not be of much use to you.
One last thing. When you open a file in binary format (output.open("BWhite.bmp", ios::binary)), it does nothing to the stream apart from how the end-of-line sequence is treated. In binary mode the output is not modified: what you put in the stream is exactly what is written to the file. If you leave the stream in text mode, then '\n' characters are converted to the end-of-line sequence (an OS-specific set of characters that marks the end of a line). Since you are writing a binary file, you definitely do not want any interference with the bytes you write, so binary is the correct mode; but it does not affect any other operation that you perform on the stream.