Is it possible to left-align a printed value in Fortran, instead of the default right alignment?
For example, I have formatters like this
"(A3,10A12)"
and
"(A1, I12, 9F12.6)"
I would like the strings and numbers printed with these formatters to be left aligned, instead of right aligned.
Isn't right-justified the default?
If you first write a number to a string, you could use the intrinsics adjustL or adjustR to get either adjustment, then output the string.
Related
Okay, this will be a very beginner question, though I can't seem to find a good resource on this topic.
What I want is simple: take a string (or char*) and convert it to a binary file that I can store somewhere on my system.
Then, at a later date, I want to be able to read that binary file and convert it back to a string (or char*).
Now...
Whenever I search for this I often get to the concept of Serialisation, which is basically what I want.
There's a problem though: most often "Boost-Serialisation" is recommended, which (IMO) is quite heavy for just converting simple text to binary and simple binary back to text. (OK, I know it isn't THAT easy, but you get the idea.)
There has got to be an easier way to handle this. I hope you can help me find it. :D
Thank you very much in advance for your answers.
How to convert Text to Binary (and Reverse)
There's nothing to do. Text is already data, and the in-memory representation of all data in any modern computer is always binary.
You need to know what you mean. If you just mean "write it to a file" (in any representation), then just do that:
std::string my_text;
std::ofstream ofs("myfile.bin", std::ios::binary);
ofs.write(my_text.data(), my_text.size());
If you need some specific representation (different character sets, encodings or even (archive) file formats) you might need to do that conversion.
Oh, lest I forget, to read-back:
std::ifstream ifs("myfile.bin", std::ios::binary);
std::string my_text(std::istreambuf_iterator<char>(ifs), {});
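Put together, a complete round trip might look like this (a minimal sketch; the file name myfile.bin and the test string are just placeholders):
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int main() {
    const std::string original = "Hello, file!";

    // Write the raw bytes of the string to disk.
    {
        std::ofstream ofs("myfile.bin", std::ios::binary);
        ofs.write(original.data(), original.size());
    }   // ofs is closed (and flushed) here

    // Read every byte back into a new string.
    std::ifstream ifs("myfile.bin", std::ios::binary);
    std::string restored(std::istreambuf_iterator<char>(ifs), {});

    std::cout << (restored == original ? "round trip OK" : "mismatch") << '\n';
}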
You just have to use bit manipulation tricks. C and C++ both have operators that allow you to run integer values through logic gates. So for example:
x = 3 & 1;
will set x to 1, because an AND operation takes each bit from the left-hand side of the & and the corresponding bit from the right-hand side and ANDs them together.
You can also do bit shifting. Where you shift the bits over by some number. For example:
y = 1 << 2;
will shift all the bits in the integer 1 to the left by two places, and the new rightmost bits will be set to zero (so y becomes 4).
So the way to do this is: for every byte in the string, AND it with 128; if the result is zero, the left-most bit is zero (and you print "0"), otherwise it is one (so you print "1"). Then shift the byte to the left by one and do the operation again. Do that eight times and you've converted one byte to binary.
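A minimal sketch of that loop (the helper name to_bits is made up for the example, not part of the answer above):
#include <iostream>
#include <string>

// Convert each byte of the text to eight '0'/'1' characters,
// most significant bit first, as described above.
std::string to_bits(const std::string& text) {
    std::string bits;
    for (unsigned char byte : text) {
        for (int i = 0; i < 8; ++i) {
            bits += (byte & 128) ? '1' : '0'; // test the left-most bit
            byte <<= 1;                       // move the next bit into position
        }
    }
    return bits;
}

int main() {
    std::cout << to_bits("Hi") << '\n'; // prints 0100100001101001
}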
You could use ofstream/ifstream. They work similarly to cout/cin, except they read and write files instead of the console. Maybe this link is helpful: https://www.cplusplus.com/doc/tutorial/files/
I am using C++ under 64-bit Linux, and the compiler (g++) is also 64-bit. When I print the address of some variable, for example an integer, I expect it to print a 64-bit value, but in fact it prints a 48-bit value.
int i;
cout << &i << endl;
output: 0x7fff44a09a7c
I am wondering where the other two bytes are. Looking forward to your help.
Thanks.
The printing of addresses in most C++ implementations suppresses leading zeroes to make things more readable. Stuff like 0x00000000000013fd does not really add value.
If you are wondering why you will normally not see values wider than 48 bits in userspace: the current AMD64 architecture is simply defined to have 48 bits of virtual address space (as can be seen with, e.g., cat /proc/cpuinfo on Linux).
They are there - they haven't gone anywhere - it's just the formatting in the stream. It skips leading zeros (check out fill and width properties of stream).
EDIT: on second thoughts, I don't think there is a nice way of changing the formatting for the default operator<< for pointers. The fill and width attributes can be changed if you are streaming out using the std::hex manipulator.
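For example, here is a sketch of that idea: convert the pointer to an integer and format it yourself (uintptr_t and the 16-digit width assume a 64-bit target):
#include <cstdint>
#include <iomanip>
#include <iostream>

int main() {
    int i = 0;
    // Print all 64 bits of the address, zero-padded, instead of relying
    // on the default operator<< for pointers.
    std::cout << "0x" << std::hex << std::setfill('0') << std::setw(16)
              << reinterpret_cast<std::uintptr_t>(&i) << '\n';
}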
For fun you could use the C output and see if it's more like what you're after:
printf("0x%p");
I'm reading up on the write method of basic_ostream objects and this is what I found on cppreference:
basic_ostream& write( const char_type* s, std::streamsize count );
Behaves as an UnformattedOutputFunction. After constructing and checking the sentry object, outputs the characters from successive locations in the character array whose first element is pointed to by s. Characters are inserted into the output sequence until one of the following occurs:
exactly count characters are inserted
inserting into the output sequence fails (in which case setstate(badbit) is called)
So I get that it writes a chunk of characters from a buffer into the stream, and the number of characters is the number of bytes specified by count. But there are a few things of which I'm not sure. These are my questions:
Should I use write only when I want to specify how many bytes I want to write to a stream? Because normally when you print a char array it will print the entire array until it reaches the null byte, but when you use write you can specify how many characters you want written.
char greeting[] = "Hello World";
std::cout << greeting; // prints the entire string
std::cout.write(greeting, 5); // prints "Hello"
But maybe I'm misinterpreting something with this one.
And I often see this in code samples that use write:
stream.write(reinterpret_cast<char*>(buffer), sizeof(buffer));
Why is the reinterpret_cast to char* being used? When should I know to do something like that when writing to a stream?
If anyone can help me with these two questions it would be greatly appreciated.
•Should I use write only when I want to specify how many bytes I want to write to a stream?
Yes - you should use write when there's a specific number of bytes of data arranged contiguously in memory that you'd like written to the stream in order. But sometimes you might want a specific number of bytes and need to get them another way, such as by formatting a double's ASCII representation to have specific width and precision.
Other times you might use <<, but that has to be user-defined for non-builtin types, and when it is defined - normally for better, but it may be worse for your purposes - it prints whatever the class designer chose, including potentially data that's linked from the object via pointers or references, static data of interest, and/or values calculated on the fly. It may change the data representation: say, converting binary doubles to ASCII representations, or ensuring a network byte order regardless of the host's endianness. It may also omit some of the object's data, such as cache entries, counters used to manage but not logically part of the data, array elements that aren't populated, etc.
Why is the reinterpret_cast to char* being used? When should I know to do something like that when writing to a stream?
The write() function signature expects a const char* argument, so this conversion is being done. You'll need to use a cast whenever you can't otherwise get a char* to the data.
The cast reflects the way write() treats the data starting at the first byte of the object as 8-bit values, without any consideration of the actual pre-cast type of the data. This ties in with being able to do things like, say, a write() of the last byte of a float and the first 3 bytes of a double appearing next in the same structure - all the data boundaries and interpretation are lost after the reinterpret_cast<>.
(You've actually got to be more careful about this when doing a read() of bytes from an input stream: if you read data that constituted a double when written into memory that's not aligned appropriately for a double, and then try to use it as a double, you may get a SIGBUS or similar alignment exception from your CPU, or degraded performance, depending on your system.)
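As an illustration of that byte-level view, here is a minimal sketch of writing a struct's raw bytes with write() (the Record type and the file name record.bin are made up for the example):
#include <fstream>

struct Record {
    double x;
    int    id;
};

int main() {
    Record r{3.14, 42};
    std::ofstream ofs("record.bin", std::ios::binary);
    // write() copies sizeof(Record) raw bytes starting at &r; any padding,
    // endianness and alignment concerns are left entirely to the caller.
    ofs.write(reinterpret_cast<const char*>(&r), sizeof r);
}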
basic_ostream::write and its counterpart basic_istream::read are used to perform unformatted I/O on a data stream. Typically this is raw binary data, which may or may not contain printable ASCII characters.
The main difference between read/write and the formatted operations like <<, >>, getline etc. is that the former make no assumptions about the data being worked on -- you have full control over which bytes get read from and written to the stream -- whereas the latter may skip over whitespace, discard or ignore it, etc.
To answer your second question, the reinterpret_cast<char *> is there to satisfy the function signature and to work with the buffer a byte at a time. Don't let the type char fool you. The reason char is used is that it's the smallest builtin primitive type provided by the language. Perhaps a better name would be something like uint8 to indicate it's really an unsigned byte type.
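To mirror that, reading such raw bytes back with read() might look like this (again using the hypothetical Record and record.bin from the sketch above):
#include <fstream>
#include <iostream>

struct Record {
    double x;
    int    id;
};

int main() {
    Record r{};
    std::ifstream ifs("record.bin", std::ios::binary);
    // read() copies sizeof(Record) raw bytes into r -- no parsing,
    // no whitespace skipping, no formatting of any kind.
    ifs.read(reinterpret_cast<char*>(&r), sizeof r);
    std::cout << r.x << ' ' << r.id << '\n';
}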
I was writing a piece of code where I use sizeof("somestring") as a parameter of a function. Then I noticed the function was not returning the expected value, so I went to look at the corresponding asm code, and I found an unpleasant surprise. Does anyone have an explanation for this (see the picture)?
I know there are 1000+ different ways of doing this, I already implemented another one of them, but I do want to know the reason behind this behaviour.
For the curious, this is Visual Studio 2008 SP1.
The value 5 is correct. The constant includes the zero terminator byte. The display of 4 in the watch window is the one that does not appear to be correct.
String literals are of type "array of n const char" ([lex.string], ¶8), where n is the number of chars of which the string is composed. Since the string is null-terminated, sizeof will return the number of "normal" characters plus 1; the watch window is wrong, it's probably a bug (as @Gene Bushuyev said, it's probably interpreting it as a pointer instead of as a literal, i.e. an array).
The fact that the value 5 is embedded into the code is normal, being sizeof a compile-time operator.
There is a C-string terminator '\0' at the end of every C-string, so "pdfa" is actually the following char array: {'p', 'd', 'f', 'a', '\0'}, but the '\0' will not be printed. Use strlen("pdfa") instead.
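A quick way to see the difference (a minimal sketch):
#include <cstring>
#include <iostream>

int main() {
    std::cout << sizeof("pdfa") << '\n';      // 5: four characters plus the '\0'
    std::cout << std::strlen("pdfa") << '\n'; // 4: the terminator is not counted
}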
Remember that C strings contain an ending zero \0. Five is the correct value.
Well, 5 is the correct value of sizeof("PDFA"). 4 characters + trailing zero.
Also, keep in mind that "The result does not necessarily correspond to the size calculated by adding the storage requirements of the individual members. The /Zp compiler option and the pack pragma affect alignment boundaries for members."
Speaking of the Watch window, I think it simply shows you the size of the pointer (const char*) itself. Try to recompile the program in 64-bit mode and check what the Watch window shows then. If I am right, you will see 8.
The reason that Things Go Wrong™ here is that you have chosen too low a level of abstraction: memcmp.
One level up you have strcmp and wcscmp.
And one level up from that you have std::string and std::wstring.
The "speed" (hah!) of your chosen lowest level possible abstraction is offset by
Incorrect result.
Inefficiency due to lack of type knowledge (wide or narrow string, your code doesn't know).
Inefficiency due to lack of data knowledge (uppercase or lowercase).
Instead of wasting time on fixing the problems of the inefficient lowest level code, and wasting time on figuring out baffling details of low level tools, use a higher and safer level of abstraction.
Just for the record, sizeof( "abcd" ) is 5. The watch window is probably, as Hans Passant remarked, displaying the size of a pointer. However, I disagree with Hans that the debugger generally has no way to know the size of an array: for a debug build it can know anything and everything about the original source, including the verbatim original source if needed (and it is displaying that verbatim original source, in context). So, that 4 is IMHO a bug one way or the other. Either a bug in the debugger code, or a bug in its design.
sizeof is an operator that evaluates to a size_t, usually an unsigned int on 32-bit platforms. That is why you see it as 4 in the debugger. The result of sizeof is also an rvalue, so you cannot set a watchpoint on the memory. If you could, the location would contain 5: the size of your string plus the terminator.
I have some existing code that I've used to write out an image to a bitmap file. One of the lines of code looks like this:
bfh.bfType='MB';
I think I probably copied that from somewhere. One of the other devs says to me "that doesn't look right, isn't it supposed to be 'BM'?" Anyway it does seem to work ok, but on code review it gets refactored to this:
bfh.bfType=*(WORD*)"BM";
A google search indicates that most of the time, the first line seems to be used, while some of the time people will do this:
bfh.bfType=0x4D42;
So what is the difference? How can they all give the correct result? What does the multi-byte character constant mean anyway? Are they the same really?
All three are (probably) equivalent, but for different reasons.
bfh.bfType=0x4D42;
This is the simplest to understand: it just loads bfType with a number that happens to represent ASCII 'M' in bits 8-15 and ASCII 'B' in bits 0-7. If you write this to a stream in little-endian format, then the stream will contain 'B', 'M'.
bfh.bfType='MB';
This is essentially equivalent to the first statement -- it's just a different way of expressing an integer constant. Exactly what it does with it depends on the compiler, but it will probably generate a value according to the endianness of the machine you compile on. If you compile and execute on a machine of the same endianness, then when you write the value out to the stream you should get 'B', 'M'.
bfh.bfType=*(WORD*)"BM";
Here, the "BM" causes the compiler to create a block of data that looks like 'B', 'M', '\0' and get a char* pointing to it. This is then cast to WORD* so that when it's dereferenced it will read the memory as a WORD. Hence it reads the 'B', 'M' into bfType in whatever endian-ness the machine has. Writing it out using the same endian-ness will obviously put 'B', 'M' on your stream. So long as you only use bfType to write out to the stream this is the most portable version. However, if you're doing any comparisons/etc with bfType then it's probably best to pick an endian-ness for it and convert as necessary when reading or writing the value.
I did not find the API, but according to http://cboard.cprogramming.com/showthread.php?t=24453, bfType is a field of the bitmap header. A value of "BM" would most likely mean "bitmap".
0x4D42 is a hexadecimal value (0x4D for 'M' and 0x42 for 'B'). In the little-endian way of writing (least significant byte first), that would be the same as "BM" (not "MB"). If it also works with "MB" then probably some default value is taken.
Addendum to tehvan's post:
From Wikipedia's entry on BMP:
File header
Note that the first two bytes of the BMP file format (thus the BMP header) are stored in big-endian order. This is the magic number 'BM'. All of the other integer values are stored in little-endian format (i.e. least-significant byte first).
So it looks like the refactored code is correct according to the specification.
Have you tried opening the file with 'MB' as the magic number with a few different photo-editors?