I am trying to read a formatted 2D array from a file on disk into a variable. I have the write operation, which is rather simple, but am stuck on reading the same file. Could someone point me to a sample/writeup on how to do this? The net seems saturated, but I can't find a useful article.
By the way, the reason for the formatted file is to keep it human readable as it contains configuration options.
I've actually found that the physical documentation that comes with the compiler is generally the most readable and most informative for Fortran compilers. Of course, that's not an option if you're using g95 or something like that.
Here's a pretty good page describing most of the technical specs of the read statement. Particularly, see the section on "Format Edit Descriptors" - very handy.
On a side note, if you have the exact write format string, you can usually drop that into a read format string, but if you're writing with WRITE(*,*) or something like that, you probably won't have a valid write format statement to use.
Finally, if you're dumping this out to ASCII so people can read it, and you don't have to worry about backward compatibility, consider dumping everything out as fixed-length fields, as they are by far the easiest things to read back in.
Sorry I can't think of better online resources, but Fortran is woefully underdocumented on the web. I remember once checking to see if g95 had Fortran reference docs, but they mostly only have docs on their specific compiler settings. Good luck, though!
I wasn't able to recover a similar thread, but I'm surprised nobody asked something so elementary before.
I would like to convert a couple of (quite long, so long I don't want to do it manually) LaTeX notes into something I can post in a forum which supports TeX code between the [tex]...[/tex] BBCode delimiters.
Hence I would like to find an automated way to replace, say,
$e^{i\pi}$
with
[tex]e^{i\pi}[/tex]
and vice versa (easier); possibly something I can write once and for all and execute each time I need it. The best of all would be a solution which also converts \section{...}, \subsection{...} and other environments, but this isn't mandatory, since the only issue with these documents is that they contain tons of math.
My impression is that a professional tool like, say, PanDoc, is too much a "nuke the fly" approach (not to mention I'm not able to use it)... I'm able to use a couple of features of the sublime-text editor, so it would be wonderful if you want to help me referring to it. In any case, keep in mind that I feel kinda yahoo about regex-stuff and suchlike (I've always seen them like a sorcery, or better, I was too dumb to learn them), so please be verbose. :)
LaTeX is a Turing-complete programming language, so a simple regex won't do what you want in general. That said, Andrew Stacey specializes in compiling LaTeX code to various formats, e.g. today's post on G+. I bet he has a program that would parse your latex and emit bbcode.
This might be a simple question for most people out there, but I'm stuck on it.
I was wondering: when most bank software, or let's say any commercial software, is closed at the end of the day and then re-opened the next, how do those programs remember everything from the previous day? I hope I've made myself clear. Thanks in advance for your guidance.
Best.
This is not black magic.
The answer is by saving its data. You do this by putting it in a database, or writing data files.
The trick is to write your programs in a way that makes it easy to guarantee that you've restored the state you thought you saved.
A common approach is to use serialization. This means that you are able to take your giant data structure and recursively call a 'Save' function on it and its contained objects. This is very intuitive if you are taking advantage of object inheritance and polymorphism. Of course, you also write a 'Load' function to do the reverse.
You write your data in such a way that it can be read back in. For example, if you wanted to write a string, you might first write its length and then its characters. That way, when you read it you know how many bytes to allocate.
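To make the length-then-characters idea concrete, here is a minimal sketch in C++. The function names (save_string, load_string, round_trip) are made up for illustration, and it assumes a 32-bit length written in the machine's native byte order, so a file written this way is not portable between machines with different endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Write a string as a 32-bit length followed by its raw bytes.
// Assumption: native byte order, strings shorter than 4 GiB.
void save_string(std::ostream& out, const std::string& s) {
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(s.data(), len);
}

// Read the length first, so we know how many bytes to allocate,
// then read exactly that many characters.
std::string load_string(std::istream& in) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len))
        throw std::runtime_error("truncated length");
    std::string s(len, '\0');
    if (len > 0 && !in.read(&s[0], len))
        throw std::runtime_error("truncated string data");
    return s;
}

// Usage sketch: save to an in-memory stream and load it back.
std::string round_trip(const std::string& s) {
    std::stringstream buf;
    save_string(buf, s);
    assert(buf.good());  // writing to memory should not fail
    return load_string(buf);
}
```

With a real file you would pass a std::ofstream / std::ifstream opened with std::ios::binary instead of the string stream. A 'Save' function for a larger object would just call save_string (and similar helpers) on each member in a fixed order, and 'Load' would read them back in the same order.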
The above approach is pretty standard if you are writing binary file formats. In fact, it's the philosophy behind chunk-based formats such as AVI.
For text-based formats, you might choose to serialize your data in popular formats like XML or JSON. But you are only restricted by your imagination.
I'd like my data to be available every time my program restarts. I am curious which is the best way to store it to a file and then read it back into the program. I have been reading some stuff on the internet, and the big question is: XML or binary format? I'm still learning C++; I haven't mastered it. The program's objects are of type string int int ... Which way do you recommend, and why?
One more thing: does anyone know a good tutorial for writing to binary or to XML?
Sorry for the missing code, but I wanted to hear the opinions of programmers more advanced than me. :P
In addition to Matthais' comment:
I think the most obvious format is the correct one in your case, and that is just plain text.
Just serialize your data in plain text (often delimited by spaces). The benefit of plain text is that it is human readable, human modifiable, easy to process using streams (>> tokenisation or Boost Tokenizer), flexible, and much more lightweight than XML.
For example you might want to store
struct {
std::string name;
int age;
double height;
};
you would just write:
John 21 5.4
Bill 31 4.9
or whatever you have. This isn't always convenient, though: for example, a name could contain two words:
John Smith
The tokeniser would split on spaces and try to parse Smith as an int, but that is an easy problem to fix by wrapping the field in delimiters such as "".
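As a rough sketch of the quoted-name fix, assuming a C++14 compiler: std::quoted from <iomanip> wraps the name in "" on output and reads a whole quoted field back as one token on input. The struct name Person and the helper names write_person / read_person are invented here for the example.

```cpp
#include <cassert>
#include <iomanip>  // std::quoted (C++14)
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record type matching the struct above.
struct Person {
    std::string name;
    int age;
    double height;
};

// Writes one record per line, e.g.: "John Smith" 21 5.4
void write_person(std::ostream& out, const Person& p) {
    out << std::quoted(p.name) << ' ' << p.age << ' ' << p.height << '\n';
}

// Reads one record; returns false at end of input or on a malformed line.
bool read_person(std::istream& in, Person& p) {
    return static_cast<bool>(in >> std::quoted(p.name) >> p.age >> p.height);
}
```

Usage would be a loop like `while (read_person(file, p)) people.push_back(p);` on a std::ifstream. One caveat: heights are written with the stream's default precision, so set a higher precision on the stream if you need the doubles to survive the round trip exactly.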
I do not agree. There are plenty of options available. Here are two more:
1) You could look at a file format called JSON, which has its own website (some of us don't). It describes itself as a lightweight data-interchange format.
2) There is a file format called CSV. Its usage was already discussed on stackoverflow here
Do you need robust behavior even when your process is killed prematurely (e.g. due to a power cut, hardware failure or a serious bug within your code itself)?
If so, consider an "embedded" database such as SQLite or MS SQL Server Compact (etc.). The transactional nature of these systems should ensure you can't end up with corrupted data that would later prevent your program from starting correctly.
Also, some file systems support transactions (e.g. transactional NTFS in Windows Vista or later).
I am trying to create a program that will write a series of 10-30 letters/numbers to a disk in raw format (not to a file that the OS will read). Perhaps to make my attempt clearer, if you were to open the disk in a hex editor, you would see the 10-30 letters/numbers but a file manager such as Windows Explorer would not see it (because the data is not a file).
My goal is to be able to "sign" a disk with a series of characters and to be able to read and write that "signature" in my program. I understand NTFS signs its partitions with a NTFS flag as do other file systems and I have to be careful to not write my signature to any of those critical parts.
Are there any libraries in C++/C that could help me write at a low level to a disk and how will I know a safe sector to start writing my signature to? To narrow this down, it only needs to be able to write to NTFS, FAT, FAT32, FAT16 and exFAT file systems and run on Windows. Any links or references are greatly appreciated!
Edit: After some research, USB drives allow only 1 partition without applying hacking tricks that would unfold further problems for the user. This rules out the "partition idea" unfortunately.
First, as the commenters said, you should look at why you're trying to do this, and see if it's really a good idea. Most apps which try to circumvent the normal UI the user has for using his/her computer are "bad", in various ways.
That said, you could try finding a well-known file which will always be on the system and has some slack in the block size for the disk, and write to the slack. I don't think most filesystems would care about extra data in the slack, and it would probably even be copied if the file happens to be relocated (more efficient to copy the whole block at the disk level).
Just a thought; dunno how feasible it would be, but seems like it could work in theory.
Though I think this is generally a pretty poor idea, the obvious way to do it would be to mark a cluster as "bad", then use it for your own purposes.
Problems with that:
Marking it as bad is non-trivial: on NTFS, bad clusters are stored in a file named something like $BadClus, but it's not accessible to user code (and I'm not even sure it's accessible to a device driver either).
There are various programs to scan for (and attempt to repair) bad clusters/sectors. Since we don't even believe this one is really bad, almost any of these that works at all will find that it's good and put it back into use.
Most of the reasons people think of doing things like this (like tying a particular software installation to a particular computer) are pretty silly anyway.
You'd have to scan through all the "bad" sectors to see if any of them contained your signature.
This is very dangerous. However, zero-fill programs do the same thing, so you can google how to wipe your hard drive with zeros in C++.
The hard part is finding a place you KNOW is unused and won't be used.
I have a huge set of log lines and I need to parse each line (so efficiency
is very important).
Each log line is of the form
cust_name time_start time_end (IP or URL )*
So: customer name, time, time, and a possibly empty list of IP addresses or URLs. If there is only one IP or URL in the last list, there is no separator. If there
is more than one, they are separated by semicolons.
I need a way to parse this line and read it into a data structure. time_start or
time_end could be either system time or GMT. cust_name could also have multiple strings
separated by spaces.
I can do this by reading character by character and essentially writing my own parser.
Is there a better way to do this ?
Maybe Boost RegExp lib will help you.
http://www.boost.org/doc/libs/1_38_0/libs/regex/doc/html/index.html
I've had success with Boost Tokenizer for this sort of thing. It helps you break an input stream into tokens with custom separators between the tokens.
Using regular expressions (boost::regex is a nice implementation for C++) you can easily separate the different parts of your string (cust_name, time_start, ...) and find all those URLs/IPs.
The second step is more detailed parsing of those groups if needed. Dates, for example, can be parsed using the Boost.Date_Time library (writing a custom parser if the string format isn't standard).
Why do you want to do this in C++? It sounds like an obvious job for something like perl.
Consider using a Regular Expressions library...
Custom input demands a custom parser. Or pray that we live in an ideal world where errors don't exist. Especially if you want efficiency. Posting some code may help.
For such a simple grammar you can use split; take a look at http://www.boost.org/doc/libs/1_38_0/doc/html/string_algo/usage.html#id4002194
UPDATE changed answer drastically!
I have a huge set of log lines and I need to parse each line (so efficiency is very important).
Just be aware that C++ won't help much in terms of efficiency in this situation. Don't be fooled into thinking that just because you have fast parsing code in C++, your program will have high performance!
The efficiency you really need here is not the performance at the "machine code" level of the parsing code, but at the overall algorithm level.
Think about what you're trying to do.
You have a huge text file, and you want to convert each line to a data structure.
Storing a huge data structure in memory is very inefficient, no matter what language you're using!
What you need to do is "fetch" one line at a time, convert it to a data structure, and deal with it, then, and only after you're done with the data structure, you go and fetch the next line and convert it to a data structure, deal with it, and repeat.
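The fetch-convert-handle loop described above could look roughly like this in C++. Record, parse_line and process_log are placeholder names for this sketch, and parse_line just keeps the raw text since the real fields depend on your log format.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical per-line record; a real one would hold the parsed fields.
struct Record {
    std::string raw;
};

// Placeholder conversion step: a real parser would split the line here.
Record parse_line(const std::string& line) {
    return Record{line};
}

// Fetch one line, convert it, hand it to the caller, then move on.
// Only one Record is alive at a time, so memory use stays flat no
// matter how large the input is. Returns the number of lines handled.
template <typename Handler>
std::size_t process_log(std::istream& in, Handler handle) {
    std::size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // skip blank lines
        Record rec = parse_line(line);
        handle(rec);
        ++count;
    }
    return count;
}
```

You would call it with a std::ifstream and a lambda (or any callable) that does whatever "deal with it" means for you, e.g. updating a running total rather than appending every record to a container.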
If you do that, you've already solved the major bottleneck.
For parsing the line of text, it seems the format of your data is quite simple; check out a similar question that I asked a while ago: C++ string parsing (python style)
In your case, I suppose you could use a string stream, and use the >> operator to read the next "thing" in the line.
see this answer for example code.
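A minimal sketch of the string-stream approach, under some loud assumptions that are not in the question: cust_name is a single token with no spaces, both times are plain integers (e.g. epoch seconds), and the IP/URL list contains no whitespace. LogEntry and parse_log_line are names invented for this example; multi-word customer names or textual timestamps would need extra handling.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical structure for one log line.
struct LogEntry {
    std::string cust_name;           // assumed: a single token
    long time_start = 0;             // assumed: integer timestamps
    long time_end = 0;
    std::vector<std::string> hosts;  // IPs or URLs, possibly empty
};

// Parses "cust_name time_start time_end [item;item;...]".
// Returns false (leaving entry untouched) if the fixed fields are malformed.
bool parse_log_line(const std::string& line, LogEntry& entry) {
    std::istringstream in(line);
    LogEntry e;
    if (!(in >> e.cust_name >> e.time_start >> e.time_end))
        return false;

    std::string list;
    if (in >> list) {  // the trailing list is optional
        std::istringstream items(list);
        std::string item;
        while (std::getline(items, item, ';'))  // split on semicolons
            if (!item.empty()) e.hosts.push_back(item);
    }
    entry = e;
    return true;
}
```

For example, parsing `acme 100 200 10.0.0.1;http://example.com` fills in two hosts, while `acme 100 200` gives an empty list.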
Alternatively, (I didn't want to delete this part!!)
If you could write this in python it will be much simpler. I don't know your situation (it seems you're stuck with C++), but still
Look at this presentation for doing these kinds of task efficiently using python generator expressions: http://www.dabeaz.com/generators/Generators.pdf
It's a worthwhile read.
At slide 31 he deals with what seems to be something very similar to what you're trying to do.
It'll at least give you some inspiration.
It also demonstrates quite strongly that performance is gained not by the particular string-parsing code, but by the overall algorithm.
You could try to use a simple lex/yacc or flex/bison grammar to parse this kind of input.
The parser you need sounds really simple. Take a look at this. Any compiled language should be able to parse it at very high speed. Then it's an issue of what data structure you build & save.