Groovy read file backwards [duplicate] - regex

This question already has answers here:
Tail a file in Groovy
(2 answers)
Closed 6 years ago.
Any groovy way to read a file backwards? Took a look at Reader class, but nothing there seems to help. My use case is mostly finding the last line of a file that matches a condition (regex, contains a string etc.).
Later Edit:
I think this question is not really a duplicate of the tail one. I see tail as more of a 'live' processing of a file. My problem is more into processing big log files (size in tens of GB), so loading whole file into memory is not an option. The file content is static (not updated during processing).
For example, each time an object is updated we log a line saying which user did it, and at some later point we need to find the last user that generated that update.
Thanks

This worked for me:
String filePath = '/path/to/file'
File file = new File(filePath)
String sample = 'searchSample'
file.text.split('\n').reverse().find {it.contains(sample)}
UPD
Also, maybe ReversedLinesFileReader from Apache Commons IO will be helpful for you.

This doesn't read the file backwards, but it does process the lines backwards, which I believe lines up with the intended use case of finding the "last line in the file that matches a condition."
import java.util.stream.Collectors
new File('myfile.csv').newReader()
.lines()
.collect(Collectors.toList())
.reverse()
.find { line -> line ==~ myRegex }
This requires Java 8 as it uses the Stream API.

Related

C++ append to existing file [duplicate]

This question already has answers here:
How to write to middle of a file in C++?
(3 answers)
Closed 6 years ago.
I'm trying to take data from multiple files and append them into one file using fstream, however whenever I try to output to an existing file using
std::ofstream Out("mushroom.csv", std::ofstream::app);
it outputs to the end of the file. I want it to append to the end of each existing line instead. For example, if this is the previous file:
1,2,3,4,5,6,7
8,9,10,11,12,13
I want it to become:
1,2,3,4,5,6,7,a,b,c
8,9,10,11,12,13,c,d,e
You can't. Files don't really have lines, they just store a bunch of characters/binary data. When you have
1,2,3,4,5
6,7,8,9,0
It only looks that way because there is an invisible newline character in there that tells whatever displays the file to start a new line. The actual data in the file is
1,2,3,4,5\n6,7,8,9,0
So you can see that the end of the file is after the 0, and to get to the position after the 5 you would need to seek into the middle of the file. Writing there would overwrite the characters that follow rather than insert before them.
The way you can get around this is to read each line of the file into some container and then add your data to the end of each line. Then you would write the whole thing back to the file, replacing the original contents.
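As a rough illustration of the read, modify, rewrite approach described above, here is a minimal sketch. The function name appendToLines and the idea of passing one suffix per line are assumptions made for the example, not something from the question. It loads all lines into memory, so it only suits files that fit comfortably in RAM.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: appends suffixes[i] to the i-th line of the file at
// `path`, then rewrites the whole file. Lines without a matching suffix are
// left unchanged. Returns false if the file cannot be read or written.
bool appendToLines(const std::string& path, const std::vector<std::string>& suffixes) {
    std::ifstream in(path);
    if (!in) return false;

    // Read every line into a container (the newline itself is dropped).
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); ) {
        lines.push_back(line);
    }
    in.close();

    // Add the new data to the end of each line.
    for (std::size_t i = 0; i < lines.size() && i < suffixes.size(); ++i) {
        lines[i] += suffixes[i];
    }

    // Write everything back, replacing the original contents.
    std::ofstream out(path, std::ios::trunc);
    if (!out) return false;
    for (const std::string& line : lines) {
        out << line << '\n';
    }
    return static_cast<bool>(out);
}
```

For the file in the question, something like appendToLines("mushroom.csv", {",a,b,c", ",c,d,e"}) would be the call.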

C++: Get line by the line number [duplicate]

This question already has answers here:
Moving the file cursor up lines?
(3 answers)
C++ Get Total File Line Number
(7 answers)
Closed 8 years ago.
Is there a fast way to get a line from a text file by the line number? If I wanted only line 20, is there anything that will allow me to do something like get line 20? I know getline(in, line) reads in each line one at a time, but I'd rather not call getline 20 times to get the 20th line.
Thanks!
No, there is no fast and magical method.
Background
Text file records are variable length. Each text line may vary in the number of characters. Fixed records are easy since their length is known.
To find the Nth record, you have to find the beginnings or endings of the text records. This is often performed by searching for a newline character. Still tedious.
Converting to Random Access
If the data is requested many times, a map or dictionary from the record line number to its position in the file would be handy. Use the line number to retrieve the file position, then set the file pointer to that position.
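A minimal sketch of such an index, assuming a plain vector of offsets is enough (the names buildLineIndex and readLine are made up for this example). The first function does the one-time scan; the second seeks straight to the requested line. The file is opened in binary mode so the offsets are plain byte positions, which means a Windows-style file will leave a trailing '\r' on each returned line.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Scans the file once and records the offset at which each line starts.
std::vector<std::streampos> buildLineIndex(const std::string& path) {
    std::vector<std::streampos> offsets;
    std::ifstream in(path, std::ios::binary);
    std::string line;
    while (true) {
        const std::streampos pos = in.tellg();  // position before reading the line
        if (!std::getline(in, line)) break;
        offsets.push_back(pos);
    }
    return offsets;
}

// Returns line `lineNumber` (1-based), or an empty string if it is out of range.
std::string readLine(const std::string& path,
                     const std::vector<std::streampos>& index,
                     std::size_t lineNumber) {
    if (lineNumber == 0 || lineNumber > index.size()) return {};
    std::ifstream in(path, std::ios::binary);
    in.seekg(index[lineNumber - 1]);  // jump directly to the start of the line
    std::string line;
    std::getline(in, line);
    return line;
}
```

The index is only valid as long as the file does not change; after any edit it has to be rebuilt.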
Memory mapped file
If there is enough memory, the file could be read and stored in memory.
However, one still has to search for the newlines and count them to find line X.
Summary
There is no fast method to find the start of a text line in a file, the first time. In any case, the text must be searched for the newlines and the newlines counted.
There are methods to speed up the process, but those involve reading the file one or more times. The mapping of line numbers to file positions is fast but requires an initial scan. Loading the file into memory (memory mapping) requires reading the file into memory (first read) then searching the memory; also, the OS may only load portions of file that are requested and not the entire file.
No, you have to use a loop that will advance to the next line twenty times.
The reason it is not possible to do what you want is the way the file is structured: It's a sequence of bytes, and a new line is just another byte (or a sequence of two bytes, by the Windows convention).

c++ write to the beginning of current line of file [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I created an ofstream file.
How can I write to the beginning of current line on my file?
For example: I write:
a b c d e f
and now I want to add to the beginning the number of my letters (6) like this:
6 a b c d e f
You have to read the whole file into a byte array.
Then you write your "prefix", followed by the byte array, to a tmp file.
Finally you have to delete the original file and rename the tmp file.
If you want to write at the beginning of an arbitrary line, then you should read the whole file into an array of byte arrays (one per line), prepend your prefix to the line you want to edit, and finally overwrite the original file.
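Here is one way the tmp-file variant might look. This is a sketch under a few assumptions: the name prependToFile and the ".tmp" suffix are invented for the example, and instead of holding the file in a byte array it streams the old contents straight into the tmp file, which is a small departure from the steps above but avoids loading everything into memory.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes `prefix` followed by the original contents to a
// temporary file, then replaces the original file with it.
bool prependToFile(const std::string& path, const std::string& prefix) {
    const std::string tmpPath = path + ".tmp";  // assumed temp-file name
    {
        std::ifstream in(path, std::ios::binary);
        if (!in) return false;
        std::ofstream out(tmpPath, std::ios::binary | std::ios::trunc);
        if (!out) return false;

        out << prefix;
        // Copy the old contents after the prefix. Streaming an empty buffer
        // would set failbit, so skip the copy when the file is empty.
        if (in.peek() != std::ifstream::traits_type::eof()) {
            out << in.rdbuf();
        }
        if (!out) return false;
    }  // both streams are closed here, before the files are touched

    std::remove(path.c_str());  // delete the original
    return std::rename(tmpPath.c_str(), path.c_str()) == 0;
}
```

Note that this prepends to the file as a whole, which matches the question only when the line being edited is the first one.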
HINT:-
If it is a text file then the best solution would be to flush the old contents into a temporary location, write what you need and append the old contents
Files are pretty static and don't support adding characters anywhere except at the end. If you need to add characters elsewhere, you need to rewrite the file. Also, files don't really have a concept of lines.
What you could do is record the position in the file at the beginning of the line (using file.tellp()), write a couple of placeholders (e.g., spaces), and then the rest of the line. Once the line is complete, you'd reposition the write position (using file.seekp()) and overwrite some of the placeholders.
Personally, I wouldn't do anything like that! Instead, I would format the line into a std::ostringstream and, once it is complete, write the line start information followed by the formatted line (obtained from the std::ostringstream using str()). Well, ideally I'd write the information in one sequence directly to the file if it is readily available.
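A small sketch of the std::ostringstream approach, using the letter-count example from the question. The function name writeCountedLine is made up here. The line body is built in memory first, so the count is known by the time anything is written to the file.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Formats the items into a buffer first, then writes the count followed by
// the buffered text, so nothing in the file ever has to be revisited.
void writeCountedLine(std::ostream& file, const std::vector<std::string>& items) {
    std::ostringstream body;
    std::size_t count = 0;
    for (const std::string& item : items) {
        body << ' ' << item;
        ++count;
    }
    file << count << body.str() << '\n';
}
```

Because it takes a std::ostream&, the same function works with the ofstream from the question or with any other output stream.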
Files are essentially a stream of bytes that start at a specific location. The only way to insert new data in the front (or in the middle) of a file is to move the data that is after it. Since you are expecting to rewrite the first line, that would mean you would need to read the entire file, prepend your new data, and write out the entire (new) file over the existing one. You can do this with a single std::fstream object, but you will need to reset the file cursor to the beginning after you read the file. It would be more clear to read the file in using an std::ifstream object and then to overwrite the file with an std::ofstream object.
I have on my code:
file << args;
--> here I want to add a new argument to the beginning of this line. (This argument holds information about args, but I must write args first, and only afterwards do I have the information for the argument.)
file << endl;

Efficiently read the last row of a csv file

Is there an efficient C or C++ way to read the last row of a CSV file? The naive approach involves reading in the entire file and then going to the end. Is there a quicker way this can be done (particularly if the CSV files are large)?
What you can do is guess the line length, then jump to 2-3 lines before the end of the file and read the remaining lines. The last line you read is the last line of the file, as long as you read at least one line before it (otherwise, you start again with a bigger offset).
I posted some sample code for doing a similar thing (reading last N lines) in this answer (in PHP, but serves as an illustration)
For implementations in a variety of languages, see
C++ : c++ fastest way to read only last line of text file?
Python : Efficiently finding the last line in a text file
Perl : How can I read lines from the end of file in Perl?
C# : Get last 10 lines of very large text file > 10GB c#
PHP : how to read only 5 last line of the txt file
Java: Read last n lines of a HUGE file
Ruby: Reading the last n lines of a file in Ruby?
Objective-C : How to read data from NSFileHandle line by line?
You can try working backwards. Read some size block of bytes from the end of the file, and look for the newline. If there is no newline in that block, then read the previous block, and so on.
Note that if a row is large relative to the size of the file, this may result in worse performance, because most file caching schemes assume someone reads forward in the file.
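The backwards block scan might look something like the following sketch. The name readLastLine and the default block size of 4096 are assumptions for the example. It treats trailing newline characters as line terminators rather than as an empty last row, and it keeps prepending blocks to a string, which is fine for ordinary rows but not ideal if a single row is enormous.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Returns the last non-empty line of the file, reading fixed-size blocks
// from the end until a newline before the last line is found.
std::string readLastLine(const std::string& path, std::size_t blockSize = 4096) {
    if (blockSize == 0) blockSize = 1;
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};

    in.seekg(0, std::ios::end);
    std::streamoff pos = in.tellg();  // file size; also the current scan position
    if (pos <= 0) return {};

    std::string tail;                 // bytes collected so far from the end
    std::vector<char> block(blockSize);
    while (pos > 0) {
        const std::streamoff want =
            std::min<std::streamoff>(pos, static_cast<std::streamoff>(blockSize));
        pos -= want;
        in.seekg(pos, std::ios::beg);
        in.read(block.data(), want);
        tail.insert(0, block.data(), static_cast<std::size_t>(want));

        // Skip line terminators at the very end of the file.
        const std::size_t end = tail.find_last_not_of("\r\n");
        if (end == std::string::npos) continue;  // only newlines so far
        const std::size_t nl = tail.find_last_of('\n', end);
        if (nl != std::string::npos) {
            return tail.substr(nl + 1, end - nl);
        }
    }
    // Reached the start of the file: the whole content is a single line.
    const std::size_t end = tail.find_last_not_of("\r\n");
    return end == std::string::npos ? std::string() : tail.substr(0, end + 1);
}
```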
You can use Perl module File::ReadBackwards.
Your problem falls into the same domain as searching for a string within a file. As you rightly point out, it's not always a great idea to read the entire file into memory and then search for your string. But you can always do the next best thing: memory map your file. Then use your string searching functions to search backwards from the end of the mapped data for your newline.
It's an extremely efficient mechanism with minimal memory footprint and optimum disk I/O.
Read with what and on what? On a Unix system, if you want the last line, it is as simple as
tail -n1 file.csv
If you want this approach from within your C++ app, you can do something like
system("tail -n1 file.csv")
if you want a quick and dirty way to accomplish this task.

Incorporating text files in applications?

Is there any way I can incorporate a pretty large text file (about 700 KB) into the program itself, so I don't have to ship the text file alongside the application in its directory? This is the first time I'm trying to do something like this, and I have no idea where to start.
Help is greatly appreciated (:
Depending on the platform that you are on, you will more than likely be able to embed the file in a resource container of some kind.
If you are programming on the Windows platform, then you might want to look into resource files. You can find a basic intro here:
http://msdn.microsoft.com/en-us/library/y3sk7e6b.aspx
With more detailed information here:
http://msdn.microsoft.com/en-us/library/zabda143.aspx
Have a look at the xxd command and its -include option. You will get a buffer and a length variable in a C formatted file.
If you can figure out how to use a resource file, that would be the preferred method.
It wouldn't be hard to turn a text file into a file that can be compiled directly by your compiler. This might only work for small files - your compiler might have a limit on the size of a single string. If so, a tiny syntax change would make it an array of smaller strings that would work just fine.
You need to convert your file by adding a line at the top, enclosing each line within quotes, putting a newline character at the end of each line, escaping any quotes or backslashes in the text, and adding a semicolon at the end. You can write a program to do this, or it can easily be done in most editors.
This is my example document:
"Four score and seven years ago,"
can be found in the file c:\quotes\GettysburgAddress.txt
Convert it to:
static const char Text[] =
"This is my example document:\n"
"\"Four score and seven years ago,\"\n"
"can be found in the file c:\\quotes\\GettysburgAddress.txt\n"
;
This produces a variable Text which contains a single string with the entire contents of your file. It works because consecutive strings with nothing but whitespace between get concatenated into a single string.
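If you would rather write a program than do the conversion in an editor, a minimal sketch could look like this. The function name toCStringLiteral is invented for the example. It only escapes quotes and backslashes, as described above, so tabs, other control characters, and '\r' from Windows line endings pass through untouched, and it adds a \n to the last line even if the original file did not end with one.

```cpp
#include <sstream>
#include <string>

// Converts `text` into C++ source that defines `static const char <name>[]`,
// one quoted string per input line, relying on adjacent-literal concatenation.
std::string toCStringLiteral(const std::string& text, const std::string& name) {
    std::istringstream in(text);
    std::ostringstream out;
    out << "static const char " << name << "[] =\n";

    bool any = false;
    for (std::string line; std::getline(in, line); ) {
        out << '"';
        for (char c : line) {
            if (c == '"' || c == '\\') out << '\\';  // escape quotes and backslashes
            out << c;
        }
        out << "\\n\"\n";  // a literal \n inside the quotes, then a real newline
        any = true;
    }
    if (!any) out << "\"\"\n";  // empty input still needs an initializer
    out << ";\n";
    return out.str();
}
```

Read the text file into a string, pass it through this with the name Text, and write the result to a header you include in the program.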